In the broad context of the strategic collaboration between Selvita and the Institute of Pharmacology of the Polish Academy of Sciences (IPPAS) in the field of central nervous system (CNS) disorders, we would like to share our experience with GPCRs, the most explored CNS targets.
Figure 1: Schematic representation of a GPCR. Note that all receptors belonging to this class consist of seven transmembrane helices. The G protein (its alpha subunit) is attached at the C-terminal end of the receptor. ECL stands for extracellular loop and ICL for intracellular loop. The figure was generated on www.gpcrdb.org
G protein-coupled receptors (GPCRs) form a large superfamily of signaling proteins (~800 receptors) involved in so many physiological processes that merely listing them would take up most of this post and make it unreadable. These functions make GPCRs an attractive class of therapeutic targets for various diseases: approximately 30% of marketed drugs target GPCRs. So far, 59 members of the superfamily have been drugged with small molecules for the treatment of allergy, pain, high blood pressure, depression, schizophrenia and bipolar disorder. The 2012 Nobel Prize in Chemistry, awarded for the discovery and the functional and structural characterization of G protein-coupled receptors, underlined the importance of the GPCR research field.
The usual drug discovery cycle for GPCRs has been strongly supported by various computational and modelling tools. Having worked first for IPPAS and now as part of the Selvita computer-aided drug discovery (CADD) group, I have acquired almost 10 years of experience with GPCRs. Due to the nature of this protein family, ligand-based discovery has always played a very important role in the hunt for novel ligands. Predicting the protein structure from its homologs has been the most explored playground in the GPCR world. It allowed us to benefit from structure-based methods, which started to play an increasingly significant role in the GPCR universe after 2007.
Pharmacophore modelling based on known active compounds is a well-established procedure in CADD. The concept of a pharmacophore relies on the spatial arrangement of the elements (hydrogen bond donors and acceptors, aromatic rings, positively ionized atoms, etc.) that are significant for interactions with the protein (Figure 2). The resulting pharmacophore model, made of colorful balls, looks charming indeed and, more importantly, can perform outstandingly well. After screening the scientific literature on the application of this methodology to GPCRs, I decided to tune it a bit to improve its performance and, at the same time, explore its full potential. My innovative approach relies on developing a combination of ligand-based pharmacophore models. In this way, we can go beyond a single pharmacophore, which by definition cannot cover the whole chemical space explored by the ligands of a GPCR, especially when thousands of known actives have been published (e.g. the ChEMBL database reports more than 5000 actives for the serotonin 5-HT1A and dopamine D2 receptors). Applying a set of models instead of a single one is certainly much more time-consuming, but the reduction in the number of false positives (inactives classified as actives) compensates for it.
Figure 2. Exemplary pharmacophore hypothesis for aminotetralins (popular ligands of serotonergic receptors) along with the matrix of distances (in angstroms) between features. The feature abbreviations used are: H for hydrophobic group, P for positively charged group, R for aromatic ring.
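The idea of screening a compound against a set of pharmacophore hypotheses rather than a single one can be sketched in a few lines of Python. This is only a minimal illustration, not the workflow used in our projects: the feature coordinates, the 1.0 Å tolerance, the function names and the simple vote count used to combine the models are all assumptions. Each hypothesis is reduced to typed features and the matrix of distances between them, as in Figure 2.

```python
from itertools import permutations
from math import dist

# A hypothesis is a list of (feature_type, (x, y, z)) entries.
# H = hydrophobic group, P = positively charged group, R = aromatic ring.
# All coordinates below are invented for illustration only.
MODEL_A = [("P", (0.0, 0.0, 0.0)), ("R", (5.5, 0.0, 0.0)), ("H", (5.5, 3.0, 0.0))]
MODEL_B = [("P", (0.0, 0.0, 0.0)), ("R", (7.0, 0.0, 0.0))]


def distance_matrix(features):
    """Pairwise distances (in angstroms) between feature centres."""
    points = [xyz for _, xyz in features]
    return [[dist(a, b) for b in points] for a in points]


def matches(model, compound, tolerance=1.0):
    """True if some assignment of compound features to model features has
    the same feature types and all pairwise distances within the tolerance."""
    model_d = distance_matrix(model)
    n = len(model)
    for subset in permutations(compound, n):
        if any(s[0] != m[0] for s, m in zip(subset, model)):
            continue  # feature types do not line up
        subset_d = distance_matrix(subset)
        if all(abs(subset_d[i][j] - model_d[i][j]) <= tolerance
               for i in range(n) for j in range(i + 1, n)):
            return True
    return False


def screen(models, compound, min_hits=1):
    """A compound passes if it matches at least min_hits hypotheses."""
    hits = sum(matches(model, compound) for model in models)
    return hits >= min_hits


# One hypothetical conformation of a compound, described by its features.
compound = [("P", (1.0, 1.0, 0.0)), ("R", (7.8, 1.0, 0.0)), ("H", (3.0, 4.0, 1.0))]
print(matches(MODEL_A, compound), matches(MODEL_B, compound))
print(screen([MODEL_A, MODEL_B], compound))
```

Here the compound fits MODEL_B but not MODEL_A, so it is recovered only because more than one hypothesis is used. Matching on distances alone cannot tell mirror-image arrangements apart, which is one of the simplifications of this sketch.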
Staying for a while with simplified chemical structures, an idea that can of course be extended to simplified protein structures, I would like to touch upon fingerprints and artificial intelligence (a new term for machine learning). Have you ever thought about the fact that something obvious to a human can be completely incomprehensible to a machine? Apart from things such as eating over the keyboard, computers cannot understand chemical structures, which to them are only fancy and difficult graphs. Therefore, a binary representation of the chemical structure (a fingerprint) is widely used in CADD: it encodes the compound’s structural features into a bitstring, where “1” and “0” mean the presence or absence of a particular pattern, respectively (Figure 3). Fingerprints are typically used for similarity searches, because these reduce to a simple comparison of two vectors. Within the GPCR field, however, they are often used as input for machine learning methods, which then serve as the first filter in a virtual screening cascade, predict the metabolic stability of GPCR ligands, identify the most important structural features of GPCR ligands, or even help analyze docking results. Fingerprints in tandem with machine learning are a great multipurpose tool for in silico studies of GPCR ligands. I expect this methodology to gain even more importance at Selvita CADD thanks to Ardigen’s constantly growing expertise in developing models that use artificial intelligence.
Figure 3. Exemplary fingerprints.
The presence of “1” or “0” corresponds to the presence or absence of a particular pattern, respectively. Note that in the first fingerprint one bit encodes more than one motif; this phenomenon is called bit collision.
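To make the bitstring idea concrete, here is a toy hashed fingerprint with a Tanimoto similarity comparison and a check for bit collisions. It is a sketch under stated assumptions: the structural patterns are plain text labels invented for illustration, and crc32 is used only as a convenient deterministic hash. Real projects would derive the patterns from the molecular graph with a cheminformatics toolkit.

```python
from zlib import crc32


def fingerprint(patterns, n_bits=16):
    """Hash each structural pattern (a text label) to one bit position."""
    bits = [0] * n_bits
    for pattern in patterns:
        bits[crc32(pattern.encode()) % n_bits] = 1
    return bits


def tanimoto(fp_a, fp_b):
    """Shared on-bits divided by the on-bits present in either fingerprint."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0


def collisions(patterns, n_bits=16):
    """Bit positions that are set by more than one pattern."""
    by_bit = {}
    for pattern in patterns:
        by_bit.setdefault(crc32(pattern.encode()) % n_bits, []).append(pattern)
    return {bit: names for bit, names in by_bit.items() if len(names) > 1}


# Hypothetical pattern lists for two ligands.
ligand_a = ["aromatic ring", "basic amine", "hydroxyl", "ethyl linker"]
ligand_b = ["aromatic ring", "basic amine", "amide"]

print(tanimoto(fingerprint(ligand_a), fingerprint(ligand_b)))
# Four patterns squeezed into two bits: at least one collision is guaranteed.
print(collisions(ligand_a, n_bits=2))
```

The shorter the bitstring, the more patterns share a bit, which is why the fingerprint length is a trade-off between memory and resolution.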
I would like to close my story by briefly mentioning homology modelling. Selvita has a good track record of building homology models with dedicated state-of-the-art software, and I would like to leave myself the opportunity to come back to this subject in one of my future posts. Until 2007 the only GPCR with an available crystal structure was rhodopsin. Homology modelling therefore became the method of choice in the GPCR field. The first non-rhodopsin structure solved was that of the β2-adrenergic receptor. Since then the number of available GPCR crystal structures has been constantly increasing, with 62 unique protein targets and a total of 313 crystal structures available now. Using a crystal structure is akin to picking low-hanging fruit, but this fruit can be rotten: GPCR crystal structures are usually of low resolution, and a single crystal structure is only a snapshot of the whole conformational space of the binding pocket. Recently published studies show that the models performing best in discrimination tests are built not only on the closest templates but on more distant ones as well. Furthermore, proteins are flexible structures and, as in the case of pharmacophore modelling, a single model cannot fully describe the dynamics of the receptor. Using an ensemble of models therefore gives better results than using only one. This approach was recently applied in a successful search for novel ligands of the 5-HT1A receptor (Figure 4). Recently, researchers from the University of Copenhagen have created a web service (www.gpcrdb.org) that allows the generation of GPCR homology models.
Figure 4. Binding pose of buspirone in the ensemble of the best models of 5-HT1AR.
The compound is rendered in ball-and-stick representation. Only residues situated less than 4 Å from the partial agonist are shown.
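One simple way to use an ensemble of receptor models is to score every compound against each model and keep the best result. The sketch below assumes that lower scores are better, as in many docking programs, and that taking the minimum is an acceptable way to combine models; both are assumptions rather than the protocol of the study mentioned above. Compound names, model names and scores are made-up placeholders.

```python
def ensemble_score(scores_per_model):
    """Best (lowest) score of one compound across all receptor models."""
    return min(scores_per_model.values())


def rank_by_ensemble(scores):
    """Compounds ordered from best to worst ensemble score."""
    return sorted(scores, key=lambda name: ensemble_score(scores[name]))


def rank_by_single_model(scores, model):
    """Compounds ordered by their score in one chosen model only."""
    return sorted(scores, key=lambda name: scores[name][model])


# Placeholder scores (arbitrary units) for three compounds in three models.
scores = {
    "cmpd_1": {"model_1": -7.2, "model_2": -9.1, "model_3": -6.8},
    "cmpd_2": {"model_1": -8.0, "model_2": -7.5, "model_3": -7.9},
    "cmpd_3": {"model_1": -5.1, "model_2": -5.4, "model_3": -6.0},
}

print(rank_by_single_model(scores, "model_1"))  # ['cmpd_2', 'cmpd_1', 'cmpd_3']
print(rank_by_ensemble(scores))                 # ['cmpd_1', 'cmpd_2', 'cmpd_3']
```

In this toy case cmpd_1 fits only one conformation of the pocket well, so a ranking based on model_1 alone would place it second, while the ensemble ranks it first.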
Protein structures, whether homology models or crystal structures, ensembles or single ones, are used in docking studies, whose results have to be analyzed carefully. And this is the point where the story comes full circle, because in recent studies the docking results were analyzed with Structural Interaction Fingerprints (SIFt), which encode the interactions between ligand and protein as a bitstring.
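The encoding of ligand-protein interactions as a bitstring can be illustrated as follows. This is a simplified sketch, not the published SIFt definition: the four interaction types, the residue labels and the contacts assigned to each docking pose are assumptions chosen for illustration. Every binding-site residue contributes one block of bits, one bit per interaction type.

```python
# Simplified set of interaction types; one bit per type for each residue.
INTERACTION_TYPES = ("any_contact", "hydrophobic", "hbond", "charged")


def interaction_fingerprint(residues, contacts):
    """Concatenate one block of bits per residue, in a fixed residue order."""
    bits = []
    for residue in residues:
        observed = contacts.get(residue, set())
        bits.extend(1 if kind in observed else 0 for kind in INTERACTION_TYPES)
    return bits


# Hypothetical binding-site residues and contacts seen in two docking poses.
residues = ["Asp1", "Phe2", "Ser3"]
pose_a = {
    "Asp1": {"any_contact", "charged", "hbond"},
    "Phe2": {"any_contact", "hydrophobic"},
}
pose_b = {
    "Asp1": {"any_contact", "charged"},
    "Ser3": {"any_contact", "hbond"},
}

fp_a = interaction_fingerprint(residues, pose_a)
fp_b = interaction_fingerprint(residues, pose_b)
print("".join(map(str, fp_a)))  # 101111000000
print("".join(map(str, fp_b)))  # 100100001010

# Number of interactions the two poses have in common.
print(sum(a & b for a, b in zip(fp_a, fp_b)))  # 2
```

Because every pose is described by a vector of the same length, poses can be compared, clustered or passed to a machine learning model in the same way as ligand fingerprints.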
Fascinating as the possibilities offered by computer modelling are, please remember that the methods described above not only open new perspectives in the search for new GPCR ligands but can also be extended to any other biological target. Experts in Selvita’s CADD group are aware of the limitations of each of these methodologies; we therefore do not treat them as black boxes but use them in a way that maximizes the impact of CADD on the progress of drug discovery projects.
Dawid Warszycki, PhD
Specialist II, Computational Chemistry
To contact the author please email firstname.lastname@example.org