High resolution deep mutational scanning of the melanocortin-4 receptor enables target characterization for drug discovery

Conor J Howard; Nathan S Abell; Beatriz A Osuna; Eric M Jones; Leon Y Chan; Henry Chan; Dean R Artis; Jonathan B Asfaha; Joshua S Bloom; Aaron R Cooper; Andrew Liao; Eden Mahdavi; Nabil Mohammed; Alan L Su; Giselle A Uribe; Sriram Kosuri; Diane E Dickel; Nathan B Lubock

doi:10.7554/eLife.104725.1

eLife Assessment

The authors use deep mutational scanning to assess the effect of all possible protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. They develop new, more precise approaches, enabling them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor. In this important work, the authors provide convincing evidence that variants impact signaling through MC4R in different ways, that some defective variants are amenable to a corrector drug and that deep mutational scanning data could guide compound optimization.

https://doi.org/10.7554/eLife.104725.1.sa3

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield

landmark
fundamental
important
valuable
useful

Strength of evidence

convincing: Appropriate and validated methodology in line with current state-of-the-art

exceptional
compelling
convincing
solid
incomplete
inadequate

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate). Learn more about eLife assessments

Abstract

Deep Mutational Scanning (DMS) is an emerging method to systematically test the functional consequences of thousands of sequence changes to a protein target in a single experiment. Because of its utility in interpreting both human variant effects and protein structure-function relationships, it holds substantial promise to improve drug discovery and clinical development. However, applications in this domain require improved experimental and analytical methods. To address this need, we report novel DMS methods to precisely and quantitatively interrogate disease-relevant mechanisms, protein-ligand interactions, and assess predicted response to drug treatment. Using these methods, we performed a DMS of the melanocortin-4 receptor (MC4R), a G protein-coupled receptor (GPCR) implicated in obesity and an active target of drug development efforts. We assessed the effects of >6,600 single amino acid substitutions on MC4R’s function across 18 distinct experimental conditions, resulting in >20 million unique measurements. From this, we identified variants that have unique effects on MC4R-mediated Gα_s- and Gα_q-signaling pathways, which could be used to design drugs that selectively bias MC4R’s activity. We also identified pathogenic variants that are likely amenable to a corrector therapy. Finally, we functionally characterized structural relationships that distinguish the binding of peptide versus small molecule ligands, which could guide compound optimization. Collectively, these results demonstrate that DMS is a powerful method to empower drug discovery and development.

Introduction

Deep Mutational Scanning (DMS) employs cutting-edge synthetic biology or genome editing methods, DNA synthesis, and sequencing to systematically assess the effect of every possible single amino acid substitution on the function of a protein target (Araya and Fowler, 2011; Fowler and Fields, 2014; Starita et al., 2017). Researchers have leveraged DMS to gauge a variety of protein functions or their cellular consequences, including viability (Findlay et al., 2014), protein abundance (Faure et al., 2022; Matreyek et al., 2018), transcriptional signaling (Jones et al., 2020), and inter- and intramolecular interactions (Braberg et al., 2022; Faure et al., 2022). DMS assays are increasingly used for human variant interpretation (Weile and Roth, 2018) and to elucidate the relationship between protein structure and function, including in the evaluation (Brandes et al., 2023) and finetuning of protein language models (Lafita et al., 2024).

While DMS has significantly advanced our understanding of protein function, its potential in drug discovery and development has yet to be fully realized. Applications in this realm require methods that are more disease-relevant, sensitive, and quantitative to measure the subtle effects of both sequence variants and experimental conditions (e.g., drug treatments). For example, DMS assays often measure effects like viability, which are several biological layers removed from the specific molecular mechanisms modulated by drugs. Additionally, the current signal-to-noise ratios of many assays and challenges around uncertainty quantification make drawing quantitative conclusions from DMS data difficult. Moving from categorical (e.g., benign vs. pathogenic) classifications toward precise quantitative measurements of human variants could improve predictions of safety and efficacy at early stages of drug development programs (Plenge et al., 2013). It would also expand the use of matching patients’ medications to their specific genetic variants (i.e., theratyping), which could lead to better patient outcomes (McDonald et al., 2024). Finally, DMS data contain the functional consequences of thousands of different biochemical perturbations. Sufficiently sensitive assays against functional readouts and improved analysis methods would readily complement structure-based approaches by elucidating the functional consequences of ligand binding. Building on this, it should be possible to identify novel protein-ligand interactions that could be reverse-engineered to increase compound potency, further expanding the potential impact of DMS on drug discovery.

In a previous work, we described DMS methods to measure the effects of thousands of single amino acid variants on the function of the beta-2 adrenergic receptor (β2AR) (Jones et al., 2020), a member of the G-protein-coupled receptor (GPCR) class that is the most commonly targeted protein family in drug development (Sriram and Insel, 2018). Here, we build upon this previous work to demonstrate more sensitive and robust DMS methods for drug discovery and development, focusing on the melanocortin-4 receptor (MC4R). MC4R is a GPCR, and human variants that result in partial or complete loss of its function cause the most common form of inherited obesity [OMIM #618406] (Farooqi et al., 2003; Hinney et al., 2003; Vaisse et al., 2000). Variants that increase MC4R activity are protective against obesity (Lotta et al., 2019; Paisdzior et al., 2020), and numerous small molecule and peptide agonists of MC4R have been tested as potential therapeutics (Chen et al., 2015; Clément et al., 2020; Collet et al., 2017; Greenfield et al., 2009; Hinney et al., 2022; Huang and Tao, 2014; Kievit et al., 2013; Sweeney et al., 2023).

In this study, we developed substantially improved experimental and analytical methods for DMS that are capable of detecting subtle quantitative effects of variants and differences between experimental conditions with a high degree of statistical rigor. We then tested the effects of nearly all possible single amino acid variants of MC4R (6,633 of 6,640) on two distinct GPCR signaling functions under a variety of treatment conditions. From this, we generated a high-resolution map of how MC4R’s structure relates to function, and we accurately classified the quantitative effects of human variants. Additionally, we identified amino acid changes that differentially impact (i.e., bias) MC4R’s different GPCR signaling functions, pinpointed human variants that are amenable to a specific class of therapy, and elucidated the functional impact of protein-ligand interactions between MC4R and both peptide and small molecule ligands. This demonstrates the utility of DMS for various drug discovery and development applications, and the methods described herein should be broadly applicable to GPCRs and other drug target classes that function in transcriptional signaling pathways.

Results

Development of highly quantitative deep mutational scanning methods

Assays for disease-relevant mechanisms

A critical tool in drug discovery programs is a very sensitive set of assays that directly interrogate the function(s) of a protein target against which to test potential therapies (Hughes et al., 2011). This is essential for ensuring that compounds are modulating a specific mechanism underlying disease and not having undesired “off-target” effects. Stimulation of MC4R with its agonist, alpha melanocyte-stimulating hormone (α-MSH), results in signaling through multiple canonical GPCR pathways, including Gα_s-coupled cyclic adenosine monophosphate signaling (hereafter referred to as Gs) and Gα_q-coupled calcium signaling (hereafter referred to as Gq) (Tao, 2010). Therefore, we first developed multiplexed reporter assays for these two critical MC4R G-protein signaling functions (Fig. 1, Supplementary Fig. 1), building off of our earlier work performing a deep mutational scan of β2AR (Jones et al., 2020). Both reporters were designed for use in human HEK293T cells and to be compatible with our previously described DMS library construction methods (Jones et al., 2020). Briefly, these methods harness high-throughput DNA synthesis to construct every possible single amino acid variant, and each variant is then linked to a transcriptional reporter containing an oligonucleotide sequence barcode unique to that variant. Reporter constructs are then integrated into cells using a site-specific recombination-based landing pad system and drug selection to ensure that each cell contains a single variant-barcode combination. Activation of the receptor turns on a response element for the signaling pathway, leading to the expression of the barcoded reporter, which is then quantified using RNA sequencing.

Highly quantitative Deep Mutational Scanning (DMS) methods for drug discovery applications.
**Top:** DMS experimental and analysis methods improvements reported in this paper. Schematic made with BioRender.com/g94q960. **Bottom:** Questions commonly encountered in drug discovery and development that can be addressed by improved DMS methodology.

The MC4R Gs assay was adapted and further optimized from the cAMP response element-based reporter we previously used for β2AR (Supplementary Fig. 1A,C). For Gq signaling, an analogous reporter using an NFAT response element alone to activate reporter gene expression was not suitable due to weak signal-to-noise. To solve this problem, we incorporated a “relay” system to amplify the reporter signal using a synthetic transcription factor composed of Gal4 fused to the VP64-p65-Rta (VPR) transcriptional activator (Chavez et al., 2015) (Fig. 1, Supplementary Fig. 1B, see Methods for more details). The resulting Gal4-VPR transcription factor was placed under the control of the NFAT response element, and the reporter gene was placed under the control of a UAS element that responds to Gal4 binding. This assay design resulted in robust reporter expression upon stimulation of MC4R (Supplementary Fig. 1C). Together, these assays provide sensitive functional readouts for two important MC4R signaling activities implicated in obesity and other phenotypes (Fatima et al., 2022; Sweeney et al., 2023), and we used them to build DMS libraries to assess the effects of all possible single amino acid substitutions in MC4R.

Analysis model for statistically robust comparisons

To explore many hypotheses that arise in drug discovery applications, it is valuable to assay a DMS library using experimental replication and under a variety of conditions, such as different drugs and/or pathways of interest. However, most popular methods for DMS analysis do not leverage DNA barcodes or other experimental replicate information, nor do they support hypothesis testing between conditions (Faure et al., 2020; Rubin et al., 2017). Consequently, we developed an alternative modeling framework to enable this (Fig. 1, Supplementary Fig. 2A, see Methods: Negative Binomial Regression Analysis Pipeline for additional details). Briefly, borrowing from approaches for inferring differential expression from RNA-seq data (Ahlmann-Eltze and Huber, 2021; Love et al., 2014; McCarthy et al., 2012), we applied a mixed effect negative binomial generalized linear model (GLM) to raw barcode counts directly. The model contains a random effect across barcodes to share barcode information between replicates and conditions, and incorporates sample-specific offsets to account for technical covariates like sequencing depth, as is common for RNA-seq (Robinson and Oshlack, 2010). For each variant, we estimate the mean shift in barcode count and associated standard error for each treatment condition, relative to wild-type. Using per-condition summary statistics, we either directly test whether each variant barcode mean is significantly different from wild-type (zero) in each treatment, or we can define more complex linear contrasts on variant effects across multiple treatments.

Increasing power through barcoding

As described above, our assay design and analysis framework harness DNA barcodes that are uniquely associated with a particular variant and provide multiple independent measures of a variant’s effect. In our previous DMS of β₂AR, the median number of barcodes independently linked to each variant was ∼10 (Jones et al., 2020), and we reasoned that increasing this number would increase the power to detect functional effects. To this end, we optimized and scaled our library cloning and cellular integration protocols to target ∼30 barcodes per variant in building DMS libraries for MC4R. As expected, this increased the power to detect variant effects. For example, the separation between the activity of alleles that are clearly deleterious (i.e., a stop codon at any position in the protein) and all other alleles (i.e., wild-type or missense variants) was drastically increased in the MC4R Gs assay relative to the same assay for β₂AR (Supplementary Fig. 2B). To further test the effect of the number of barcodes for a given variant, we computationally down-sampled the barcodes for representative positions of MC4R and ran the resulting data through our analysis pipeline. As expected, by increasing the number of barcodes per variant, the magnitude of the standard error of the estimated variant effect decreases in a manner consistent with increased sample size (Supplementary Fig. 2C). This confirms that increasing the number of barcodes per variant enables the quantification of subtle differences within a standard hypothesis testing framework, and provides an experimental parameter that one can vary to improve the power to detect the effects of sequence variants of particular interest.

Comprehensive deep mutational scanning of MC4R

With these methods in hand, we carried out a comprehensive assessment of the effects of all single amino acid substitutions (including nonsense variants) on MC4R’s Gs and Gq signaling activities under a variety of experimental conditions (Fig. 2a, Supplementary Table 1, Supplementary Figs. 3-4). We selected experimental conditions that would inform aspects of drug discovery and development programs, such as elucidating protein structure-function relationships, identifying regions of the protein that bias activity towards or away from a specific function, classifying the effects of human variants in the presence and absence of potential therapies, and uncovering functional differences in protein-ligand interactions. In total, we tested 18 unique conditions, each performed in quadruplicate, including: basal activity (i.e., no stimulation) of MC4R, stimulation of MC4R with a range of doses of the native peptide agonist alpha-melanocyte-stimulating hormone (α-MSH), stimulation with a range of doses of a small molecule agonist (THIQ) (Sebhat et al., 2002), treatment with a small molecule corrector (Ipsen 17) (Poitout et al., 2007; Wang et al., 2014), and library composition normalization controls (forskolin, see Methods). Our resulting DMS assays had extraordinary variant coverage, with 99.9% (6,633/6,640) of all possible single amino acid substitutions present in all experimental conditions. Each variant was represented by an average of 56 and 28 barcodes for the Gs and Gq signaling pathways, respectively. Between both assays, this translates to more than 557,000 uniquely engineered human cells, each containing a distinctive variant-reporter-barcode combination. When factoring in the number of experimental conditions (18 unique), replicates (four per condition), amino acid variants tested (99.9% of 6,640 possible), and the mean barcodes per variant (56 and 28 for Gs and Gq, respectively), this equates to >21,500,000 measurements across all datasets.

Effect of 6,633 missense and nonsense variants on MC4R signaling functions.
A. Heatmaps showing the functional effects (z-scores) for nearly all possible amino acid substitutions (6,633 of 6,640) on MC4R activity for two GPCR signaling functions (Gs and Gq) under a variety of conditions. Heatmaps showing results (both z-score and log2[fold change of variant activity over wild-type]) for all experimental conditions are shown in Supplementary Figs. 3-4. The results of the Gs assay with low α-MSH stimulation are highlighted on the left (and in Panels **B-F**). TM: transmembrane domain; GoF: gain-of-function; LoF: loss-of-function; WT: wild-type activity B. A modified snake plot (adapted from https://gpcrdb.org/) showing the sensitivity of each MC4R residue to mutation (defined as the mean log2[fold change variant activity over wild-type] divided by sqrt(sum(standard error^2)) after excluding nonsense variants). C. Z-scores for each variant (point), broken out by variant type and clinical (ClinVar) classification. Blue indicates statistically significant LoF, brown is significant GoF (significance threshold: FDR < 1%). VUS: variant of uncertain significance. D. Functional effect (Log2[fold change of variant activity over wild-type], x-axis) for all human variants relative to the frequency of the allele in the in the human population (y-axis, gnomAD global population (Chen et al., 2024)). E. DMS results for human MC4R variants (y-axis) relative to previous functional classifications in the literature (x-axis, from (Huang et al., 2017)) F. DMS results (z-scores, x-axis) compared to change in α-MSH potency (relative to WT) of 25 MC4R variants made to the orthosteric binding site for α-MSH (y-axis, cAMP accumulation assay results from (Zhang et al., 2021)). G. Fractions of MC4R variants that result in LoF, GoF, or WT activity for eight unique experimental conditions.

Multiple lines of evidence support the high quality and utility of these data for classification of variant effects (Fig. 2A-F, Supplementary Figs. 3-5). Focusing on one dataset as a representative example (Gs signaling using a low dose of α-MSH stimulation), variants that introduce stop codons or fall within transmembrane domains and buried surfaces disproportionately lead to significant loss of MC4R function (Fig. 2A-C). The results also correlate well with expectations from human genetics data and variant effect prediction algorithms (Fig. 2C-D, Supplementary Fig. 5). For example, the majority of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al., 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Loss-of-function variants by our DMS assay are also typically predicted to be deleterious by commonly used variant effect predictors like AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023) (Supplementary Fig. 5).

Because of the sensitivity of our reporter system and the statistical power gained by testing dozens of unique barcodes per variant, we anticipated that these assays would capture subtle quantitative, rather than just qualitative, effects on MC4R function. To assess this, we benchmarked our results against previous quantitative characterizations of MC4R variants from the literature (Fig. 2E-F). For example, many MC4R variants that have been observed in the human population have been previously tested for their effects on MC4R function, and a review (Huang et al., 2017) summarizing this work systematically classified >70 variants according to whether they result in “full”, “partial”, or “no/mild” loss of MC4R activity. Our results are consistent with these classifications: variants classified previously as “full” loss-of-function typically have very low MC4R activity in our assay, “partial” variants have intermediate effects, and “no/mild” effect variants have near-normal activity (Fig. 2E, median log2[fold change of variant activity over wild-type activity] of −1.4, −0.8, and −0.3, respectively, for each group). Finally, our results show a high degree of correlation (Pearson correlation = 0.84, R² = 0.71) with quantitative effect measurements reported for 25 variants individually introduced into the orthosteric site of MC4R (Zhang et al., 2021). Collectively, this demonstrates that our high quality MC4R DMS data accurately and quantitatively assess the effects of variants on MC4R’s function.

Systematic human variant interpretation of MC4R

Looking across multiple data sets gives a comprehensive picture of variant effects, as the various experimental conditions tested have disparate strengths and weaknesses at detecting loss-of-function (LoF) versus gain-of-function (GoF) activities (Fig. 2G). For example, unstimulated conditions (e.g., zero α-MSH) uncover variants that lead to constitutive activation of MC4R, but they have less ability to detect loss-of-function variants. In contrast, conditions with agonist stimulation are much better enabled to identify loss of MC4R function. Considering individual α-MSH stimulation conditions (zero, low, medium, and high) for both the Gs and Gq assays, each condition identifies 6.6 - 39.3% of variants as loss-of-function and 0.02 - 1.1% as gain-of-function (Fig. 2G). Collectively across all α-MSH stimulation conditions, 3,370 variants (50.8%) are loss-of-function in at least one condition, 347 variants (5.2%) are gain-of-function in at least one condition, and 2,996 (45.2%) always show wild-type activity. Interestingly, 80 (1.2%) variants are classified as both loss- and gain-of-function, depending upon the condition.

To aid in clinical variant interpretation, we provide detailed functional effect classifications for 220 human variants reported in ClinVar (Landrum et al., 2014) or from published patient sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Rodríguez Rondón et al., 2024; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003) in Supplementary Table 2. In total, 130 of these reported human variants (59.1%) are LoF in at least one condition, consistent with being pathogenic for obesity-related phenotypes. This includes 83.9% (26/31) of the variants that are reported in ClinVar as pathogenic or likely pathogenic, 53.3% (32/60) of those that are unclassified or have conflicting interpretation, and 0% (0/3) of those classified as benign or likely benign. A small number of reported human variants (V103I, H158R, I251L) result in significant increases in Gs and/or Gq signaling, consistent with having a protective effect for obesity, and are generally classified as benign in ClinVar or as having wild-type activity in the literature (Hinney et al., 2006). These results highlight the utility of systematic deep scans across multiple experimental conditions for facilitating human variant interpretation.

Variants that bias MC4R function

MC4R signals through multiple G protein pathways (Breit et al., 2011; Ju et al., 2018; Paisdzior et al., 2020; Sharma et al., 2019), and evidence from human variant interpretation studies (Lotta et al., 2019; Metzger et al., 2024; Rodríguez Rondón et al., 2024), mouse modeling (Li et al., 2016), and previous drug programs (Clément et al., 2018; Sharma et al., 2019) suggest that biasing MC4R’s activity toward or away from specific pathways could be therapeutically valuable for the treatment of obesity. To gain a better understanding of how MC4R structure relates to its various functions, we systematically searched for variants that differentially impact Gs versus Gq signaling. We applied Principal Component Analysis (PCA) across eight total DMS datasets (four each for Gs and Gq: with zero, low, medium, and high α-MSH stimulation; see Supplementary Table 1 for details). The first two principal components explained 66% and 12% of the variance, respectively (Fig. 3A, Supplementary Fig. 6A,B). Through inspection, we found that Principal Component 1 (PC1) separates variants that impact both signaling functions, with variants that are loss-of-function for both Gs and Gq having higher PC1 values (Supplementary Fig. 6A,B). PC2 separates variants that affect Gs and Gq signaling differently (Fig. 3A-D, Supplementary Fig. 6A,B). Variants with higher PC2 values typically exhibit Gq bias by having greater than wild-type levels of Gq signaling activity, while retaining wild-type levels of Gs activity (Fig. 3C,D; Supplementary Fig. 6C,D). In contrast, variants with more negative PC2 values are Gs-biased variants that typically have wild-type levels of Gs signaling and reduced Gq signaling upon agonist stimulation (Fig. 3C).

Systematic identification of variants that have biased effects on MC4R signaling.
A. Principal component analysis of eight MC4R DMS conditions (Gs and Gq signaling each at four α-MSH stimulation doses: zero, low, medium, high, see Supplementary Table 1). Each point indicates a unique variant, with those exhibiting extreme Gq or Gs bias colored in purple and green, respectively. PC, principal component. Schematics made with BioRender.com/d70i288. B. Ribbon structure of MC4R showing the maximum absolute PC2 value for each residue position. **C. Top:** Closeup of MC4R structure shown in bottom left box in B, highlighting positions where variants result in Gs (green) or Gq (purple) bias. **Bottom:** MC4R signaling activity (log2[fold change of variant activity relative to wild-type]) for three selected variants across α-MSH doses. Error bars are +/- 2 standard errors. Med: medium; WT: wild-type **D. Left:** Closeup of MC4R structure shown in bottom right box in B, highlighting positions where variants result in Gs or Gq bias as in C. **Right:** MC4R signaling activity for select variants as in C.

Overall, more variants substantially increase Gq signaling than Gs signaling (Fig. 3A). Gs is the primary G protein coupling for MC4R (Podyma et al., 2018; Tao, 2010), and our data suggest that there is little room for further improving MC4R’s already robust Gs signaling activity. Interestingly, biased variants are positionally diverse. For example, the 14 variants that display the most extreme Gq bias (Fig. 3A) are found at 12 different residue positions, with position 79 unique in having several variants that result in Gq bias (Supplementary Fig. 6D). Many of the variants with extreme Gq or Gs bias are located within the regions of MC4R that interact with G proteins, with some scattered throughout the transmembrane domains and far fewer in the vicinity of the protein’s orthosteric site (Fig. 3B-D).

Structural insights into biased signaling

Comparing these results with existing structural information could provide additional detailed insights into MC4R’s signaling functions. It is thought that ligand binding in GPCR orthosteric sites is communicated to the intra-cellular G-protein binding domain through a series of conserved residues or “microswitches” (Hauser et al., 2021; Zhou et al., 2019). Structural studies comparing the inactive and active state structures have confirmed that MC4R shares a similar signaling cascade (Yu et al., 2020; Zhang et al., 2021). Upon ligand binding, W258 (W258^6×48 in https://gpcrdb.org/ nomenclature; (Isberg et al., 2015)) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133^3×36 and I137^3×40, of the conserved PIF motif (MIF in melanocortin receptors). This causes F254^6×44 in the PIF motif to rearrange, which in turn disrupts the packing of three different interactions: 1) L140^3×43 and I143^3×46, 2) I251^6×41 and L247^6×37, and 3) R147^3×50 and N240^6×30. These, amongst other rearrangements, culminate in the receptor being able to bind a G-protein. This interaction with the G-protein is primarily mediated through R147^3×50 in the conserved DRY motif, T150^3×53, Y157^34×53, H158^34×54 and R305^7×56 (Zhang et al., 2021).

Strikingly, a number of mutations at residues throughout this signaling cascade had extremely positive PC2 values, implicating them as Gq-biasing mutations. Within the core of the cascade, we identified I137^3×40L, F254^6×44P, and L140^3×43I as Gq-biased (Fig. 3A; Supplemental Fig. 6C). We also identified a number of Gq-biasing mutations within the G-protein binding pocket, specifically T150^3×53G and H158^34×54R (Fig. 3A,C). Interestingly, the H158^34×54R variant is found in the human population (Hinney et al., 2006; Wade et al., 2021) and has previously been shown to preferentially signal through the Gq pathway (Paisdzior et al., 2020). H158^34×54 is also co-located near two other mutations (K164^3×55L, F152^3×55R), in intracellular loop 2 (ICL2) and near the ends of the third and fourth transmembrane domains (TM3 and TM4), that display bias (Fig. 3C). Interestingly, K164^3×55L exhibits a Gs bias in that it drives loss of function through Gq.

Our data also point to a number of potentially novel interactions. For example, M79^2×39 packs against residues H387^G.H5.19 and Q390^G.H5.22 of the Gs alpha subunit (G_sα) (Flock et al., 2015). This position has multiple different variants that result in Gq bias, including M79^2×39R, M79^2×39S, and M79^2×39G (Fig. 3A, Supplemental Fig. 6D). The most extreme bias signal in our PCA analysis came from I223^5×69L (Fig. 3A,D), which interfaces with the C-terminus of G_sα (Flock et al., 2015), near position L394. Further down towards the intracellular side of TM5, we also identified V228R (Fig. 3A,D), which interfaces proximal to E323^G.hgh4.13 in G_sα (Flock et al., 2015). Collectively, the combination of DMS data and structural information is a fruitful avenue for generating protein structure-function hypotheses. These results highlight the power of DMS to identify regions of MC4R that could be harnessed for designing drugs that precisely modulate specific cellular signaling functions.

Systematic prediction of treatment response

Many variants of MC4R disrupt signaling by causing protein misfolding, which ultimately inhibits proper localization of MC4R to the cell membrane (Huang et al., 2017). Correctors are a class of small molecule drugs that facilitate protein folding. Corrector therapies have been developed for phenotypes such as cystic fibrosis (Boyle and De Boeck, 2013) and Fabry disease (Germain et al., 2016), and they have been proposed as a strategy for treating MC4R-associated obesity (Huang et al., 2017; Huang and Tao, 2014; Wang et al., 2014). One feature of corrector therapies is that they are typically only effective, and therefore FDA-approved, for a subset of patients harboring specific sequence variants (Boyle and De Boeck, 2013; Weaver et al., 2022). Identifying variants that respond to corrector therapy is typically done by rigorously testing the effect of a compound on a single variant at a time (Weaver et al., 2022), and DMS offers an attractive avenue to systematically test the treatment response of thousands of patient variants in a single assay.

To this end, we tested whether Ipsen 17, a small molecule tool compound that has been shown to correct MC4R misfolding (Poitout et al., 2007; Wang et al., 2014), is able to restore the Gs signaling function of the MC4R variants in our DMS library. Out of all 6,633 tested variants, 290 (4.4%) showed disrupted Gs signaling in the absence of treatment that was partially or fully rescued by the addition of Ipsen 17 (Supplementary Fig. 7, see Methods for details of statistical analysis). This includes a number of variants that have been classified as pathogenic in ClinVar (Landrum et al., 2014) or otherwise found in patient sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003) (Fig. 4 shows results for selected variants reported in the human population). Other reported patient variants showed no functional improvement in response to corrector therapy (Fig. 4). Collectively, these data support that performing DMS in the presence of a small molecule corrector can be used to systematically predict which patients are likely to benefit from such treatment modalities.

Corrector therapy rescues the activity of a subset of human MC4R variants.
Gs signaling activity of 21 selected MC4R variant alleles (of 6,633 tested) with (red) and without (gray) Ipsen 17, a small molecule corrector that has been shown to restore the activity of misfolded MC4R. Bars represent the activity of the variant allele normalized to that of WT MC4R in the no corrector condition, and error bars are +/- two standard errors.

Mapping protein-ligand interactions

DMS experiments can be used to define “drug-resistant” variants within MC4R that disrupt the activity of different types of ligands, providing functional insight into protein-ligand interactions that are key for understanding the mechanisms underlying agonism. Such functional information would be a valuable addition to structural methods and has the potential to streamline the lengthy and iterative cycle of compound optimization in drug discovery. Substantial work has been done to characterize how peptide agonists interact structurally with MC4R, but similar work on small-molecule agonists with comparable activity and selectivity remains relatively limited (Gonçalves et al., 2018; Heyder et al., 2021; Sharma et al., 2019; Yu et al., 2020; Zhang et al., 2021). To characterize the functional interaction landscapes of different ligand types and to better understand what distinguishes peptide and small-molecule MC4R pharmacophores, we performed DMS assays of MC4R using both native peptide agonist stimulation (α-MSH) and small molecule agonist stimulation (THIQ). Bayesian meta-regression analysis of the lowest dose concentrations of α-MSH and THIQ for the Gs signaling assay revealed a set of variants that uniquely disrupt activation by one ligand but not the other (at FDR < 5%, Fig. 5A). These variants cluster exclusively within the orthosteric binding pocket (Fig. 5B-E) and at positions of known binding interactions of each molecule (Zhang et al., 2021). Notably, there are many more variants that uniquely disrupt activation by α-MSH (Fig. 5A-D). For example, a majority of amino acid substitutions at position I104^2×6 disrupt activation of MC4R by α-MSH, but none lead to significant reduction in MC4R activity upon stimulation by THIQ (Fig. 5C). Multiple positions within the orthosteric binding pocket displayed this pattern (Fig. 5B-E), which is consistent with how the peptide agonist utilizes a larger network of interactions to increase binding affinity.

Systematic identification of functional protein-ligand interactions.
A. Bayesian meta-regression of the α-MSH and THIQ datasets at low agonist concentration for Gs reveals variants that differentially affect MC4R activation by each agonist. Statistically significant effects are colored by which agonist condition they impaired (α-MSH, blue; THIQ, orange, FDR < 0.05). Variants at positions 48, 104, and 129 are labeled. B. Side-views of MC4R structure in surface (left) and ribbon (right) view show clustering of variants from A that specifically abrogate α-MSH but not THIQ activation (blue) in the extracellular orthosteric binding site. C. Effect of all possible variants at residues 48, 104, and 129 at low concentration of α-MSH (blue) and THIQ (orange). Variants that disproportionately affect activation by only one agonist are boxed by respective color. D. Top-down surface views of MC4R (PDB: 7f58) with bound α-MSH (left, blue; PDB: 7f53) or THIQ (right, orange; PDB: 7f58). Positions are colored by whether variants at that position uniquely perturb activation by α-MSH (blue) or THIQ (orange). E. Zoomed view of the MC4R binding pocket with α-MSH bound (left) or THIQ bound (right). Structures are colored as in (D). Residues that form the HFRW motif of α-MSH and functional groups R1, R2, and R3 of THIQ are labeled in bold. Residues 48, 104, and 129 are shown in stick form.

Interestingly, this comparison also highlights how the same residue of MC4R can be critical for interfacing with multiple ligands but points to substantive differences in the precise physical interaction between each ligand and that position of the target. For example, positions P48^1×36 and I129^3×32 harbor variants that can have differentially deleterious effects under the two ligand conditions (Fig. 5C-E). Many amino acid substitutions at P48^1×36 (including to V, I, F, Y, Q, and R) disrupt stimulation of MC4R by α-MSH but not THIQ. However, changing this position to the negatively charged P48^1×36D variant has the opposite effect, uniquely ablating activation by THIQ. The HFRW motif (His6-Phe7-Arg8-Trp9) of α-MSH represents a conserved pharmacophore critical for activation of MC4R by peptides, and the tri-branched THIQ molecule (R1-R2-R3) mimics the HFRW conformational architecture (Fig. 5E) (Gonçalves et al., 2018; Hruby et al., 1987; Zhang et al., 2021). Our observation that α-MSH stimulation is more sensitive to variants at position P48^1×36 is consistent with how the His6 of α-MSH forms more interactions with this hydrophobic pocket of MC4R, while the analogous R3 group of THIQ is more flexible and forms non-specific interactions in this region (Fig. 5C–E) (Gonçalves et al., 2018; Hruby et al., 1987; Zhang et al., 2021). At position I129^3×32, mutation to any polar uncharged variant (S, T, N, or Q) alters THIQ activation, while I129^3×23V uniquely inhibits α-MSH activation (Fig. 5C–E). The Phe7 of α-MSH and the analogous R2 group of THIQ form key interactions with Ca²⁺ and the core hydrophobic pocket formed by I129^3×32 of MC4R (Fig. 5E) (Zhang et al., 2021). The enrichment of variants at I129^3×32 that uniquely disrupt MC4R activation by THIQ points to a stronger dependency of the small-molecule on interactions with this residue. In summary, by assessing the functional consequences of all mutations within the MC4R orthosteric site, we not only confirm known binding interactions but also reveal interactions that distinguish peptide and small-molecule activation. These relationships provide additional functional insight into the structural mechanism of MC4R ligand binding that could be harnessed for drug design.

Discussion

Deep Mutational Scanning holds substantial potential for improving many phases of drug discovery and development, but realizing this potential requires the ability to draw highly quantitative and disease-relevant conclusions from DMS data, an outstanding challenge for the field. Here, we demonstrated the value of improved DMS assays and analysis methods in addressing these challenges by applying our approaches to a medically relevant GPCR, MC4R. First, we were able to quantify differences in Gs-mediated cAMP and Gq-mediated calcium signaling for MC4R in response to its native ligand, α-MSH. Understanding these subtle differences in molecular signaling phenotypes is crucial for driving precision treatments. For example, Gq signaling bias is one of the purported mechanisms for the better side-effect profile of the FDA approved MC4R agonist, Setmelanotide (Sharma et al., 2019). Additionally, we systematically assessed the responses of MC4R variants implicated in human obesity to a small-molecule corrector, Ipsen 17. Whereas traditional approaches would have required the separate experimental assessment of every variant, DMS tests every conceivable single amino acid variant in one experiment. This enables researchers to understand how a potential treatment will work for broader swaths of the population before drugs reach the clinic, which could ultimately lead to more effective clinical trial designs.

Our data also contribute to the growing understanding of the complicated mechanism by which MC4R (and other GPCRs) translate ligand binding on the extracellular side of the membrane, through an allosteric network of residues, to G-protein activation on the intracellular side (Howard et al., 2024). We demonstrate the utility of DMS for understanding how proteins bind other molecules. In particular, we identified mutations that uniquely disrupt the binding of two agonists, one a peptide (α-MSH) and one a small molecule (THIQ). This is crucial information for understanding the mechanism by which these molecules bind MC4R, which could lead to hypotheses for further optimization of more potent compounds. DMS data contain a substantial amount of latent information about protein structure, and can help identify novel pockets to target with new compounds (Weng et al., 2024). As suggested above, we also envision using DMS data to help optimize molecules by identifying mutations that uniquely disrupt or potentiate sets of compounds in a chemical series.

Importantly, the experimental and analysis frameworks we describe are widely applicable to GPCRs and more broadly to other target classes. GPCRs are known to signal through a variety of pathways, and there are existing transcriptional reporters for most G-protein mediated signaling pathways that could be harnessed to gain a holistic understanding of the downstream consequences of GPCR activity (Hauser et al., 2022). Furthermore, there are a substantial number of existing transcriptional reporters for a wide variety of cellular signaling processes mediated by transcription factors, kinases, and other major targets of drug discovery efforts, which could extend these methods to other target classes.

Finally, there has been incredible progress over the last few years in the development of machine learning-based models to predict the structural and functional consequences of mutations on proteins. That said, these models have struggled to address many real-world applications in the treatment of human disease (McDonald et al., 2024). One potential explanation to this gap in performance could be due to the fact that many DMS datasets (our previous work included) used to train these models typically assay a limited set of experimental conditions, often assay cellular effects that are only indirectly related to disease etiology, suffer from low signal to noise ratios, have no quantification of uncertainty in their measurements, and are subject to a heterogeneous set of subtle experimental caveats. Detailed experimental characterization of protein function (such as the >21,500,000 different measurements in this work alone) and efforts such as ProteinGym (Notin et al., 2023) to benchmark large-scale functional data will continue to be critical for developing and evaluating large scale models that can be applied to a wider variety of drug discovery and development applications. As both the experimental and computational methods continue to improve, we foresee DMS having a profound impact on the drug discovery process.

Materials and methods

Reporter assays and cell line development

The Gs (cAMP-luciferase) reporter assay, diagramed in Supplementary Fig. 1A, was adapted from an assay previously used to assess the function of other GPCRs (Jones et al., 2020, 2019).

The Gq assay relay reporter system (diagrammed in Supplementary Fig. 1B) is described in detail in a patent stemming from this work (Chan et al., 2024). Briefly, it was constructed as follows: A piggybac transposon plasmid was constructed using Gibson assembly harboring a genetic cassette that expresses a synthetic transcription factor, Gal4_DBD-VPR under the control of an NFAT response element. Human embryonic kidney cells (HEK293T) were then cotransfected with this plasmid along with a piggybac transposase expression vector, and cells were selected for puromycin resistance. These cells were then isocloned by dilution plating, and colonies were selected based on single cell clones. Five clones were then carried forward based on cell morphology and growth rates. A second series of Bxb1-based landing pad vectors were constructed containing a library of 20 Gq-coupled GPCRs under the control of the dox-inducible promoter, along with a second genetic cassette containing a DNA-barcoded luciferase gene under the control of the Gal4_UAS enhancer. This plasmid library was integrated into each of the five isoclonal Gq-relay cell lines. The best performing Gq-relay cell line was selected on signal-to-noise criteria across a panel of agonists for the 20 Gq-coupled GPCRs. Use of this relay system in combination specifically with an MC4R transgene resulted in α-MSH-dose dependent expression of the reporter gene (Supplementary Fig. 1C).

Building the DMS libraries

Generation and genomic integration of the plasmid libraries containing all possible single amino acid substitutions of MC4R used a further optimized version of an earlier method (Jones et al., 2020). In brief, variant segments of MC4R cDNA were amplified from DNA microarrays and cloned into base vectors through a multi-step process to yield pooled libraries of MC4R variants with fully intact and barcoded reporter gene cassettes. Fully assembled plasmid libraries were then co-transfected with a plasmid encoding Bxb1 recombinase into HEK293T cells containing a landing pad at the H11 safe harbor locus to achieve single copy integration per cell. In a deviation from the previously published method, two independent replicates of each sub-library were cloned and pooled together post-cellular integration in order to maximize library coverage and the number of barcodes per variant.

Variant-barcode mapping

As described above, barcodes were randomly appended to variants during amplification of MC4R segments from variant oligo pools and then ligated into library base vectors. After this first step of plasmid library cloning, variant segments were amplified along with the neighboring barcode and sequenced with 2×150 paired-end reads (see Supplementary Table 3 for amplification and sequencing primers) on an Illumina NextSeq 550 instrument using a 300-cycle Mid Output kit. Illumina 2×150 BCL files were demultiplexed with bcl2fastq2 into R1 and R2 FASTQ files, which were merged into single fragments using Flash2 requiring a minimum 5 bp of overlap (Quan et al., 2019). The first 21 bp of each fragment corresponding to the barcode sequence were extracted into the read name, and the remaining fragment was adapter-trimmed using umi_tools and cutadapt, respectively (Martin, 2011; Smith et al., 2017). The remaining fragments were mapped against a custom reference composed of the designed oligonucleotide library using STAR with default parameters except requiring that alignments be strictly unique to be reported (Dobin et al., 2013). Taking each alignment as an oligo-barcode pair, the read counts per unique oligo-barcode pair were computed for each replicate and joined by barcode. Finally, the resulting maps were filtered to require each oligo-barcode pair to pass three requirements in both replicates: correct barcode length, total read depth > 10, and purity > 0.75. The purity of an oligo-barcode pair was defined as the read count of that pair divided by the total number of reads containing that barcode. Post-processing after STAR was performed using samtools for BAM manipulation and custom R code (Li et al., 2009).

Running DMS assays

The DMS assay protocol was optimized from a previous method (Jones et al., 2020). HEK293T single-copy variant cell libraries were seeded at a density of ∼17×10⁶ cells per 150 mm tissue-culture treated dish in DMEM + 10% fetal bovine serum (FBS). Four dishes were seeded for each experimental condition, with each dish being treated as an independent biological replicate (4 replicates per condition). Twenty-four hours after seeding, media was exchanged with DMEM + 0.5% FBS +/- 10 ng/mL Doxycycline. For chaperone experiments, all conditions were additionally replicated +/- 1 μM Ipsen 17. 24 hrs after Doxycycline induction, media was exchanged with Opti-MEM + DMSO, Forskolin, or MC4R agonist (α-MSH or THIQ). Forskolin bypasses MC4R to constitutively activate cAMP signaling, so this condition was used as a variant-independent measurement of library composition. For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation. Six hours after agonist stimulation, cells were harvested by scraping in 4 mL lysis buffer (RLT buffer (Qiagen) + 143 mM β-ME). Lysis was performed by passing the cell slurry 6x through a sterile 18G needle and then spinning through QIAshredder (Qiagen) columns. RNA was extracted from 1 ml of the homogenized lysate with the RNeasy Plus Mini kit (Qiagen), including optional on-column DNAse digestion, and eluted into 100 µL water. Eight reverse transcriptase reactions per sample were performed with the SuperScript IV kit (Thermo Fisher), as described previously (Jones et al., 2020) (primers listed in Supplementary Table 3). cDNA from each sample was treated with 1 µL RNase A (100 µg/ml, Thermo Fisher) and 3.2 µL RNase H (5,000 U/mL, NEB) at 37°C for 30 min. RNase-treated samples were concentrated to ∼55 µL by spinning through Amicon Ultra 10 kDa concentrators (EMD Millipore) for ∼8 min. To determine the necessary cycle numbers for equivalent amplification of each sample library, qPCR reactions were performed on 1 µL cDNA (diluted 1:8 in water) with Q5 polymerase (NEB), SYBR Green (Thermo Fisher), and library amplification primers (Supplementary Table 3). Final amplification cycles for each sample were chosen by adding three cycles to the Cq values generated from each respective qPCR reaction. Illumina sequencing libraries were prepared by amplifying 50 µL of each cDNA sample with sequencing adapters (500 nM each library amplification primer, Supplementary Table 3) using the NEBNext Q5 High Fidelity 2x PCR Kit (NEB) under the following cycling conditions: 98°C for 30 s, X cycles of 98°C for 8 s, 65°C for 20 s, and 72°C for 10 s, followed by an extension of 72°C for 2 min. 3 µL of each DNA library sample was run on a 4% E-Gel (Thermo Fisher) and densitometry was performed with Fiji to account for differences in library yields. Samples were mixed at equal amounts into a single pool and then purified into 200 µL IDTE (Qiagen) with AxyPrep Magnetic beads (Fisher Scientific). The purified library was quantified with the DeNovix High-Sensitivity Fluorescence kit and prepared for sequencing with a 10% PhiX spike-in. Final library mixture was sequenced using custom read and index primers (Supplementary Table 3) on an Illumina NextSeq 550 with the High Output 75 cycle kit.

Sequence processing for barcode expression

Illumina 1×26 BCL files were demultiplexed with bcl2fastq2 and processed to remove the last 5 bp using basic bash commands. The resulting sequences were counted for each sample, and the resulting barcodes were joined with the appropriate oligo-barcode map. The resulting barcodes were joined with sample and MC4R variant metadata and returned for regression analysis. All processing after demultiplexing was performed with custom R code.

Negative binomial regression analysis pipeline

In many DMS methods, barcodes are summed within protein-coding variants to generate a single variant-level count per sample. This unnecessarily sacrifices available power obtained by repeatedly measuring the same molecular process in distinct cells. An alternative is to model barcode counts directly. However, to apply standard models, missing data must either be removed, which can remove a majority of detected barcodes, or somehow imputed. Additionally, many existing methods either use a log transformation of read counts to obtain approximate normality, or apply Poisson regression and related assumptions for inference (Faure et al., 2020; Rubin et al., 2017). However, there is strong prior reason to believe that DMS counts are overdispersed in our data, since our synthetic reporter system is read out via RNA-seq (Robinson and Smyth, 2007).

Instead, we developed and applied a mixed effects negative binomial general linear model (GLM) to resolve these challenges. These models have been widely deployed to model count data, and in particular identify differential expression, in bulk and single-cell RNA-seq analysis. We implement maximum likelihood estimation for this model using glmmTMB, which can accommodate the potentially large scale of multiplex count data (Brooks et al., 2017). For each position, we consider all variants located at that position along with all wild-type variants in the same protein subregion and apply the following model:

For the ith condition, the jth variant, the kth barcode, and the mth sample. Consequently, the first two terms in the last equation above correspond to a global mean term for each condition and a term for the variant-specific deviation from wild-type in each condition. The last two terms are the random effect for barcode k, and the sample-specific technical offset for sample m. The definition of the offset is often context specific, and here we use the log of the sum of barcode counts derived from stops, reasoning they should be constant across conditions and replicates. We fit this model for each MC4R position independently and extract coefficients for the additive shift in the mean of each variant relative to wild-type. Using the per-condition summary statistics, we obtained Wald test statistics by dividing the effect size by the standard error and computed p-values against the normal distribution. P-values were adjusted for multiple testing using the Benjamini-Hochberg method and thresholded to 1% or 5% FDR where indicated.

To define more complex null hypotheses like chaperone rescue, we extracted marginal means for each variant under each treatment using the emmeans package (Lenth, 2024). We define the chaperone rescue contrast as the additive shift of each variant in each treatment condition to the wild-type mean specifically in the untreated condition. Since this quantity is a linear contrast across marginal means, we computed the associated contrast estimates and standard errors, and tested them for significance using the same approach as the per-condition summary statistics.

Comparison to human genetics data and variant effect predictors

Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al., 2014) on January 5, 2024. Human population frequency data for MC4R missense variants were obtained from gnomAD (Chen et al., 2024) on January 8, 2024. A comprehensive list of 220 MC4R variants that are of potential clinical relevance (Supplementary Table 2) was mined from ClinVar (Landrum et al., 2014), along with papers or review articles describing variants identified in human sequencing studies (Brouwers et al., 2021; Farooqi et al., 2003; Hinney et al., 2013, 2006; Huang et al., 2017; Rodríguez Rondón et al., 2024; Stutzmann et al., 2008; Wade et al., 2021; Yeo et al., 2003). Effect predictions for MC4R missense variants were obtained from the public releases of AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023).

Identification of functionally biased variants

We used Principal Component Analysis to identify biased variants. Specifically, we used the test statistic (log2 fold change divided by the standard error) for the DMSO and all α-MSH conditions in both the Gq and Gs pathways to create a matrix with conditions (defined as Drug_Concentration) as columns, and each individual mutant (defined as Position_Substitution) as rows. We then passed this matrix to R’s prcomp function with default parameters, and used the first two principal components to visualize the results. Visual inspection of the loadings via a biplot, along with plotting of various variant types (Supplementary Fig. 6A,B) revealed that PC1 separates variants based on their overall effect on MC4R function, and PC2 separates variants based on differential response through the Gq and Gs pathways. We set a simple cutoff of +- 7.5 on PC2 to highlight particularly interesting variants.

Identifying variants that respond to corrector treatment

We identified variants whose activity increased upon treatment with Ipsen 17 using a slight modification of the summary statistics from the general-purpose model described above. For each variant and condition, we compute the log2-scaled marginal mean (averaging across barcodes and replicates) and its associated standard error. Then, for each variant we compute the following two summary statistics, and test whether their estimates are significantly different from zero. First, we define a variants’ defect as the marginal mean of that variant under DMSO treatment minus the marginal mean of WT under DMSO treatment. This quantifies the existence and severity of the variant’s intrinsic effect on MC4R activity. Second, we define that variant’s rescue as the marginal mean of that variant under Ipsen 17 treatment minus the marginal mean of that variant under DMSO activity. This quantifies the magnitude of the (typically) increase in MC4R activity upon Ipsen 17 treatment for each variant, relative to its DMSO-only baseline. After computing the indicated estimate and errors, we perform significance testing as in the general model, where we define Wald statistics as the estimate divided by the propagated error, compute p-values from the normal distribution, and adjust for multiple testing using the Benjamini-Hochberg procedure.

Identifying critical variants for protein-ligand interactions

We identified sets of mutations that specifically inhibited or potentiated MC4R activation in the presence of ligands. We applied the general DMS model (see Negative binomial regression analysis pipeline) and extracted log2-scaled fold changes and standard errors for each variant relative to WT, within each ligand-treated condition. To account for systematic differences between ligands across all variants, we applied Bayesian meta-regression via the brms R package (Bürkner, 2017) and regressed the α-MSH summary statistics against THIQ, while including the errors in both quantities via the se() and me() brms functions. Finally, to infer significant α-MSH- or THIQ-specific effects, we extracted the residual of each variant relative to the meta-regression best-fit line and tested whether this residual was significantly non-zero based on the posterior sampling performed with brms.

Structural modeling

Molecular visualization of variant effects on MC4R was performed with UCSF ChimeraX (Pettersen et al., 2021). For visualization of functionally biased extremes in MC4R (Fig. 3B), the maximum absolute PC2 value for each position was calculated and the define attribute function of ChimeraX was used to color the structure of α-MSH-bound MC4R (PDB: 7F53) by these relative values, ranging from white (no bias) to pink (extreme bias). For all other structural panels related to variant bias, positions of interest are colored binarily by green or purple to indicate Gs-bias or Gq-bias, respectively. For visualizing variant effects on protein-ligand interactions (Fig. 5), positions with significant variants identified by meta-regression were colored by whether variants at a given position perturb activation uniquely by α-MSH (blue) or THIQ (orange). Where α-MSH and and THIQ structures are shown, the respective α-MSH-bound (PDB: 7F53) and THIQ-bound (PDB: 7F58) cryo-EM models were used for visualization. For the depictions in Fig. 5B, α-MSH-bound MC4R was used.

Figure generation

Cartoon diagrams of reporter assay designs in Fig. 1 and Supplementary Fig. 1A,B were created with BioRender.com.

Sequence data and software availability

Raw sequencing data are available from SRA under project accession number PRJNA1161152. All code used for analysis and figure generation are available at https://github.com/octantbio/mc4r-dms.

Competing Interests

CJH, NSA, BAO, EMJ, LYC, HC, JBA, JSB, ARC, SK, DED, and NBL are current employees of Octant, Inc. and/or hold shares or options in the company. EMJ, HC, AC, and SK are listed inventors on patents related to this work. All other authors declare no competing financial interests.

Acknowledgements

We would like to thank Jakob Sture Madsen for his intellectual contributions to the project and his helpful comments on the paper.

Supplemental Information

Supplemental Figures

Methods development detail.
**A, B.** Detailed cartoons of reporters for cAMP (Gs) signaling (A) and calcium (Gq) signaling (B). C. Luciferase assays used to determine agonist (α-MSH and THIQ) dose ranges for the DMS experiments. For each reporter (Gs, Gq), we fit a 3-parameter log-logistic with the bottom of the curve fixed to the DMSO mean. We shared the slope and Emax parameter between α-MSH and THIQ, and let the EC50 parameter vary between the drugs.
Panels A and B made with BioRender.com/y06g086.

Analysis method and barcoding detail.
A. Schematic for analysis method. DMS data can be represented as a tensor, with dimensions for Position in the protein, Amino Acid substitution, and Condition + Replicate. Within the Condition + Replicate dimension, there are multiple drugs (i.e., α-MSH and THIQ) and multiple signaling pathways (i.e., Gs and Gq). Furthermore, each amino acid substitution has multiple barcodes associated with it. Our negative binomial mixed-effect model takes all of this structure into account to produce summary statistics that are the basis of our comparisons. B. Distribution of Z-statistics for stop codons versus all other variant effects from published β2AR (Jones et al., 2020) and MC4R (this study) DMS assays assessing Gs signaling activity. C. Downsampling simulation from the MC4R data showing the effect of number of barcodes per variant on the estimation of log2 fold change relative to wild type. Here we show two representative positions in the α-MSH low condition, randomly downsampled a fixed number of barcodes associated with that variant. We repeated that process for 5 independent samplings, and ran the resulting data through our model resulting in parameter estimates. Increasing the number of barcodes per variant decreases the magnitude of the error bars (+/- 2 standard errors).

Results (z-scores) of MC4R Deep Mutational Scans across all agonist stimulation conditions.
Shown are z-scores indicating the effect of each variant (y-axis) at each position (x-axis) relative to wild-type. GoF: gain-of-function; LoF: loss-of-function, WT: wild-type.

Results (log2[fold change]) of MC4R Deep Mutational Scans across all agonist stimulation conditions.
Shown are the log₂(fold change) values of each variant (y-axis) at each position (x-axis) relative to wild-type. GoF: gain-of-function; LoF: loss-of-function, WT: wild-type.

MC4R Deep Mutational Scans are consistent with variant effect predictions.
**A., B.** Shown are z-scores for each MC4R variant (x-axis) from the Gs-α-MSH-low (forskolin normalized) DMS condition relative to the AlphaMissense (A) or popEve (B) score of the variant. GoF: gain-of-function; LoF: loss-of-function, WT: wild-type.

Variants with differential effects on Gs and Gq signaling.
A. PCA of z-scores from all Gs and Gq assays using no, low, medium, and high α-MSH. Each point represents a variant, with stop codon variants colored in blue and supporting that PC1 is driven by variants that are LoF in both Gs and Gq assays versus those that retain activity in one or both assays. B. Same PCA as A but with arrows showing how each of the incorporated datasets is contributing to the components. This supports that PC2 is driven by differences in Gs and Gq signaling. **C,D.** Additional examples of variants that have differential effects on Gs and Gq signal, as in Fig. 3C,D. **Left:** Closeup of MC4R structure showing positions of interest in purple. **Right/Bottom:** MC4R signaling activity (log2[fold change of variant activity relative to wild-type]) for selected variants across α-MSH doses. Error bars are +/- 2 standard errors. Med: medium; WT: wild-type

Identification of variants that respond to corrector treatment.
Scatter plot showing activity of each MC4R variant in Gs signaling assay upon stimulation with 1 μM α-MSH (>EC99) in the absence of corrector (x-axis) versus upon treatment with 1 μM Ipsen 17 followed by stimulation with 1 μM α-MSH (>EC99) (y-axis). Each point represents a variant, with those in red classified as both 1) loss-of-function in the absence of corrector; and 2) functionally rescued in the Ipsen 17 treatment condition (see Methods for details).

Variants that differentially impact activation by peptide versus small molecule agonists.
A. PCA of z-scores from all Gs signaling assays with no, low, medium, and high concentrations of α-MSH or THIQ. Each point represents a variant colored by whether it uniquely abrogates signaling in the α-MSH (blue) or THIQ (orange) condition(s). B. Top-down ribbon view of MC4R with bound α-MSH (left, blue; PDB: 7f53) or THIQ (right, orange; PDB: 7f58). MC4R residues are colored by whether variants at a given position disrupt activation uniquely by α-MSH (blue), THIQ (orange), both (green), or neither (gray). Residues 48, 104, and 129 are shown in stick form.

Supplemental Tables

Supplementary Tables. Supplementary Table 1. Experimental conditions across all DMS assays Supplementary Table 2. Functional effects of >200 human MC4R variants Supplementary Table 3. Oligo sequences used in this study

References

1. Ahlmann-Eltze C
2. Huber W
2021glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count dataBioinformatics 36:5701–5702https://doi.org/10.1093/bioinformatics/btaa1009
1. Araya CL
2. Fowler DM
2011Deep mutational scanning: assessing protein function on a massive scaleTrends Biotechnol 29:435–442https://doi.org/10.1016/j.tibtech.2011.04.003
1. Boyle MP
2. De Boeck K.
2013A new era in the treatment of cystic fibrosis: correction of the underlying CFTR defectLancet Respir Med 1:158–163https://doi.org/10.1016/S2213-2600(12)70057-7
1. Braberg H
2. Echeverria I
3. Kaake RM
4. Sali A
5. Krogan NJ
2022From systems to structure - using genetic data to model protein structuresNat Rev Genet 23:342–354https://doi.org/10.1038/s41576-021-00441-w
1. Brandes N
2. Goldman G
3. Wang CH
4. Ye CJ
5. Ntranos V
2023Genome-wide prediction of disease variant effects with a deep protein language modelNat Genet 55:1512–1522https://doi.org/10.1038/s41588-023-01465-0
1. Breit A
2. Büch TRH
3. Boekhoff I
4. Solinski HJ
5. Damm E
6. Gudermann T
2011Alternative G protein coupling and biased agonism: new insights into melanocortin-4 receptor signallingMol Cell Endocrinol 331:232–240https://doi.org/10.1016/j.mce.2010.07.007
1. Brooks ME
2. Kristensen K
3. van Benthem KJ
4. Magnusson A
5. Berg CW
6. Nielsen A
7. Skaug HJ
8. Mächler M
9. Bolker BM.
2017glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed ModelingR J 9:378–400https://doi.org/10.32614/RJ-2017-066
1. Brouwers B
2. de Oliveira EM
3. Marti-Solano M
4. Monteiro FBF
5. Laurin S-A
6. Keogh JM
7. Henning E
8. Bounds R
9. Daly CA
10. Houston S
11. Ayinampudi V
12. Wasiluk N
13. Clarke D
14. Plouffe B
15. Bouvier M
16. Babu MM
17. Farooqi IS
18. Mokrosiński J.
2021Human MC4R variants affect endocytosis, trafficking and dimerization revealing multiple cellular mechanisms involved in weight regulationCell Rep 34https://doi.org/10.1016/j.celrep.2021.108862
1. Bürkner P-C
2017. brms: An R Package for Bayesian Multilevel Models Using StanJ Stat Softw 80:1–28https://doi.org/10.18637/jss.v080.i01
1. Chan H
2. Cooper A
3. Gasperini MJ
2024Systems and Methods for Measuring Cell Signaling Protein ActivityUS Patent
1. Chavez A
2. Scheiman J
3. Vora S
4. Pruitt BW
5. Tuttle M
6. Iyer E P R
7. Lin S
8. Kiani S
9. Guzman CD
10. Wiegand DJ
11. Ter-Ovanesyan D
12. Braff JL
13. Davidsohn N
14. Housden BE
15. Perrimon N
16. Weiss R
17. Aach J
18. Collins JJ
19. Church GM
2015Highly efficient Cas9-mediated transcriptional programmingNat Methods 12:326–328https://doi.org/10.1038/nmeth.3312
1. Cheng J
2. Novati G
3. Pan J
4. Bycroft C
5. Žemgulytė A
6. Applebaum T
7. Pritzel A
8. Wong LH
9. Zielinski M
10. Sargeant T
11. Schneider RG
12. Senior AW
13. Jumper J
14. Hassabis D
15. Kohli P
16. Avsec Ž
2023Accurate proteome-wide missense variant effect prediction with AlphaMissenseScience 381https://doi.org/10.1126/science.adg7492
1. Chen KY
2. Muniyappa R
3. Abel BS
4. Mullins KP
5. Staker P
6. Brychta RJ
7. Zhao X
8. Ring M
9. Psota TL
10. Cone RD
11. Panaro BL
12. Gottesdiener KM
13. Van der Ploeg LHT
14. Reitman ML
15. Skarulis MC
2015RM-493, a melanocortin-4 receptor (MC4R) agonist, increases resting energy expenditure in obese individualsJ Clin Endocrinol Metab 100:1639–1645https://doi.org/10.1210/jc.2014-4024
1. Chen S
2. Francioli LC
3. Goodrich JK
4. Collins RL
5. Kanai M
6. Wang Q
7. Alföldi J
8. Watts NA
9. Vittal C
10. Gauthier LD
11. Poterba T
12. Wilson MW
13. Tarasova Y
14. Phu W
15. Grant R
16. Yohannes MT
17. Koenig Z
18. Farjoun Y
19. Banks E
20. Donnelly S
21. Gabriel S
22. Gupta N
23. Ferriera S
24. Tolonen C
25. Novod S
26. Bergelson L
27. Roazen D
28. Ruano-Rubio V
29. Covarrubias M
30. Llanwarne C
31. Petrillo N
32. Wade G
33. Jeandet T
34. Munshi R
35. Tibbetts K
36. Genome Aggregation Database Consortium
37. O’Donnell-Luria A
38. Solomonson M
39. Seed C
40. Martin AR
41. Talkowski ME
42. Rehm HL
43. Daly MJ
44. Tiao G
45. Neale BM
46. MacArthur DG
47. Karczewski KJ
2024A genomic mutational constraint map using variation in 76,156 human genomesNature 625:92–100https://doi.org/10.1038/s41586-023-06045-0
1. Clément K
2. Biebermann H
3. Farooqi IS
4. Van der Ploeg L
5. Wolters B
6. Poitou C
7. Puder L
8. Fiedorek F
9. Gottesdiener K
10. Kleinau G
11. Heyder N
12. Scheerer P
13. Blume-Peytavi U
14. Jahnke I
15. Sharma S
16. Mokrosinski J
17. Wiegand S
18. Müller A
19. Weiß K
20. Mai K
21. Spranger J
22. Grüters A
23. Blankenstein O
24. Krude H
25. Kühnen P.
2018MC4R agonism promotes durable weight loss in patients with leptin receptor deficiencyNat Med 24:551–555https://doi.org/10.1038/s41591-018-0015-9
1. Clément K
2. van den Akker E
3. Argente J
4. Bahm A
5. Chung WK
6. Connors H
7. De Waele K
8. Farooqi IS
9. Gonneau-Lejeune J
10. Gordon G
11. Kohlsdorf K
12. Poitou C
13. Puder L
14. Swain J
15. Stewart M
16. Yuan G
17. Wabitsch M
18. Kühnen P
2020Setmelanotide POMC and LEPR Phase 3 Trial Investigators. 2020. Efficacy and safety of setmelanotide, an MC4R agonist, in individuals with severe obesity due to LEPR or POMC deficiency: single-arm, open-label, multicentre, phase 3 trialsLancet Diabetes Endocrinol 8:960–970https://doi.org/10.1016/S2213-8587(20)30364-8
1. Collet T-H
2. Dubern B
3. Mokrosinski J
4. Connors H
5. Keogh JM
6. de Oliveira E Mendes
7. Henning E
8. Poitou-Bernert C
9. Oppert J-M
10. Tounian P
11. Marchelli F
12. Alili R
13. Le Beyec J
14. Pépin D
15. Lacorte J-M
16. Gottesdiener A
17. Bounds R
18. Sharma S
19. Folster C
20. Henderson B
21. O’Rahilly S
22. Stoner E
23. Gottesdiener K
24. Panaro BL
25. Cone RD
26. Clément K
27. Farooqi IS
28. Van der Ploeg LHT
2017Evaluation of a melanocortin-4 receptor (MC4R) agonist (Setmelanotide) in MC4R deficiencyMol Metab 6:1321–1329https://doi.org/10.1016/j.molmet.2017.06.015
1. Dobin A
2. Davis CA
3. Schlesinger F
4. Drenkow J
5. Zaleski C
6. Jha S
7. Batut P
8. Chaisson M
9. Gingeras TR
2013STAR: ultrafast universal RNA-seq alignerBioinformatics 29:15–21https://doi.org/10.1093/bioinformatics/bts635
1. Farooqi IS
2. Keogh JM
3. Yeo GSH
4. Lank EJ
5. Cheetham T
6. O’Rahilly S
2003Clinical spectrum of obesity and mutations in the melanocortin 4 receptor geneN Engl J Med 348:1085–1095https://doi.org/10.1056/NEJMoa022050
1. Fatima MT
2. Ahmed I
3. Fakhro KA
4. Akil ASA-S
2022Melanocortin-4 receptor complexity in energy homeostasis,obesity and drug development strategiesDiabetes Obes Metab 24:583–598https://doi.org/10.1111/dom.14618
1. Faure AJ
2. Domingo J
3. Schmiedel JM
4. Hidalgo-Carcedo C
5. Diss G
6. Lehner B
2022Mapping the energetic and allosteric landscapes of protein binding domainsNature 604:175–183https://doi.org/10.1038/s41586-022-04586-4
1. Faure AJ
2. Schmiedel JM
3. Baeza-Centurion P
4. Lehner B
2020DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologiesGenome Biol 21https://doi.org/10.1186/s13059-020-02091-3
1. Findlay GM
2. Boyle EA
3. Hause RJ
4. Klein JC
5. Shendure J
2014Saturation editing of genomic regions by multiplex homology-directed repairNature 513:120–123https://doi.org/10.1038/nature13695
1. Flock T
2. Ravarani CNJ
3. Sun D
4. Venkatakrishnan AJ
5. Kayikci M
6. Tate CG
7. Veprintsev DB
8. Babu MM
2015Universal allosteric mechanism for Gα activation by GPCRsNature 524:173–179https://doi.org/10.1038/nature14663
1. Fowler DM
2. Fields S
2014Deep mutational scanning: a new style of protein scienceNat Methods 11:801–807https://doi.org/10.1038/nmeth.3027
1. Germain DP
2. Hughes DA
3. Nicholls K
4. Bichet DG
5. Giugliani R
6. Wilcox WR
7. Feliciani C
8. Shankar SP
9. Ezgu F
10. Amartino H
11. Bratkovic D
12. Feldt-Rasmussen U
13. Nedd K
14. Sharaf El Din U
15. Lourenco CM
16. Banikazemi M
17. Charrow J
18. Dasouki M
19. Finegold D
20. Giraldo P
21. Goker-Alpan O
22. Longo N
23. Scott CR
24. Torra R
25. Tuffaha A
26. Jovanovic A
27. Waldek S
28. Packman S
29. Ludington E
30. Viereck C
31. Kirk J
32. Yu J
33. Benjamin ER
34. Johnson F
35. Lockhart DJ
36. Skuban N
37. Castelli J
38. Barth J
39. Barlow C
40. Schiffmann R.
2016Treatment of Fabry’s Disease with the Pharmacologic Chaperone MigalastatN Engl J Med 375:545–555https://doi.org/10.1056/NEJMoa1510198
1. Gonçalves JPL
2. Palmer D
3. Meldal M
2018MC4R Agonists: Structural Overview on Antiobesity TherapeuticsTrends Pharmacol Sci 39:402–423https://doi.org/10.1016/j.tips.2018.01.004
1. Greenfield JR
2. Miller JW
3. Keogh JM
4. Henning E
5. Satterwhite JH
6. Cameron GS
7. Astruc B
8. Mayer JP
9. Brage S
10. See TC
11. Lomas DJ
12. O’Rahilly S
13. Farooqi IS
2009Modulation of blood pressure by central melanocortinergic pathwaysN Engl J Med 360:44–52https://doi.org/10.1056/NEJMoa0803085
1. Hauser AS
2. Avet C
3. Normand C
4. Mancini A
5. Inoue A
6. Bouvier M
7. Gloriam DE
2022Common coupling map advances GPCR-G protein selectivityElife 11https://doi.org/10.7554/eLife.74107
1. Hauser AS
2. Kooistra AJ
3. Munk C
4. Heydenreich FM
5. Veprintsev DB
6. Bouvier M
7. Babu MM
8. Gloriam DE
2021GPCR activation mechanisms across classes and macro/microscalesNat Struct Mol Biol 28:879–888https://doi.org/10.1038/s41594-021-00674-7
1. Heyder NA
2. Kleinau G
3. Speck D
4. Schmidt A
5. Paisdzior S
6. Szczepek M
7. Bauer B
8. Koch A
9. Gallandi M
10. Kwiatkowski D
11. Bürger J
12. Mielke T
13. Beck-Sickinger AG
14. Hildebrand PW
15. Spahn CMT
16. Hilger D
17. Schacherl M
18. Biebermann H
19. Hilal T
20. Kühnen P
21. Kobilka BK
22. Scheerer P
2021Structures of active melanocortin-4 receptor-Gs-protein complexes with NDP-α-MSH and setmelanotideCell Res 31:1176–1189https://doi.org/10.1038/s41422-021-00569-8
1. Hinney A
2. Bettecken T
3. Tarnow P
4. Brumm H
5. Reichwald K
6. Lichtner P
7. Scherag A
8. Nguyen TT
9. Schlumberger P
10. Rief W
11. Vollmert C
12. Illig T
13. Wichmann H-E
14. Schäfer H
15. Platzer M
16. Biebermann H
17. Meitinger T
18. Hebebrand J
2006Prevalence, spectrum, and functional characterization of melanocortin-4 receptor gene mutations in a representative population-based sample and obese adults from GermanyJ Clin Endocrinol Metab 91:1761–1769https://doi.org/10.1210/jc.2005-2056
1. Hinney A
2. Hohmann S
3. Geller F
4. Vogel C
5. Hess C
6. Wermter A-K
7. Brokamp B
8. Goldschmidt H
9. Siegfried W
10. Remschmidt H
11. Schäfer H
12. Gudermann T
13. Hebebrand J
2003Melanocortin-4 receptor gene: case-control study and transmission disequilibrium test confirm that functionally relevant mutations are compatible with a major gene effect for extreme obesityJ Clin Endocrinol Metab 88:4258–4267https://doi.org/10.1210/jc.2003-030233
1. Hinney A
2. Körner A
3. Fischer-Posovszky P
2022The promise of new anti-obesity therapies arising from knowledge of genetic obesity traitsNat Rev Endocrinol :1–15https://doi.org/10.1038/s41574-022-00716-0
1. Hinney A
2. Volckmar A-L
3. Knoll N
2013Melanocortin-4 receptor in energy homeostasis and obesity pathogenesisProg Mol Biol Transl Sci 114:147–191https://doi.org/10.1016/B978-0-12-386933-3.00005-4
1. Howard MK
2. Hoppe N
3. Huang X-P
4. Macdonald CB
5. Mehrota E
6. Grimes PR
7. Zahm A
8. Trinidad DD
9. English J
10. Coyote-Maestas W
11. Manglik A
2024Molecular basis of proton-sensing by G protein-coupled receptorsbioRxiv https://doi.org/10.1101/2024.04.17.590000
1. Hruby VJ
2. Wilkes BC
3. Hadley ME
4. Al-Obeidi F
5. Sawyer TK
6. Staples DJ
7. de Vaux AE
8. Dym O
9. Castrucci AM
10. Hintz MF
1987. alpha-Melanotropin: the minimal active sequence in the frog skin bioassayJ Med Chem 30:2126–2130https://doi.org/10.1021/jm00394a033
1. Huang H
2. Tao Y-X
2014A small molecule agonist THIQ as a novel pharmacoperone for intracellularly retained melanocortin-4 receptor mutantsInt J Biol Sci 10:817–824https://doi.org/10.7150/ijbs.9625
1. Huang H
2. Wang W
3. Tao Y-X
2017Pharmacological chaperones for the misfolded melanocortin-4 receptor associated with human obesityBiochim Biophys Acta Mol Basis Dis 1863:2496–2507https://doi.org/10.1016/j.bbadis.2017.03.001
1. Hughes JP
2. Rees S
3. Kalindjian SB
4. Philpott KL
2011Principles of early drug discoveryBr J Pharmacol 162:1239–1249https://doi.org/10.1111/j.1476-5381.2010.01127.x
1. Isberg V
2. de Graaf C
3. Bortolato A
4. Cherezov V
5. Katritch V
6. Marshall FH
7. Mordalski S
8. Pin J-P
9. Stevens RC
10. Vriend G
11. Gloriam DE
2015Generic GPCR residue numbers - aligning topology maps while minding the gapsTrends Pharmacol Sci 36:22–31https://doi.org/10.1016/j.tips.2014.11.001
1. Jones EM
2. Jajoo R
3. Cancilla D
4. Lubock NB
5. Wang J
6. Satyadi M
7. Chong R
8. de March C
9. Bloom JS
10. Matsunami H
11. Kosuri S.
2019A Scalable, Multiplexed Assay for Decoding GPCR-Ligand Interactions with RNA SequencingCell Syst 8:254–260https://doi.org/10.1016/j.cels.2019.02.009
1. Jones EM
2. Lubock NB
3. Venkatakrishnan AJ
4. Wang J
5. Tseng AM
6. Paggi JM
7. Latorraca NR
8. Cancilla D
9. Satyadi M
10. Davis JE
11. Babu MM
12. Dror RO
13. Kosuri S
2020Structural and functional characterization of G protein-coupled receptors with deep mutational scanningElife 9https://doi.org/10.7554/eLife.54895
1. Ju SH
2. Cho G-B
3. Sohn J-W
2018Understanding melanocortin-4 receptor control of neuronal circuits: Toward novel therapeutics for obesity syndromePharmacol Res 129:10–19https://doi.org/10.1016/j.phrs.2018.01.004
1. Kievit P
2. Halem H
3. Marks DL
4. Dong JZ
5. Glavas MM
6. Sinnayah P
7. Pranger L
8. Cowley MA
9. Grove KL
10. Culler MD
2013Chronic treatment with a melanocortin-4 receptor agonist causes weight loss, reduces insulin resistance, and improves cardiovascular function in diet-induced obese rhesus macaquesDiabetes 62:490–497https://doi.org/10.2337/db12-0598
1. Lafita A
2. Gonzalez F
3. Hossam M
4. Smyth P
5. Deasy J
6. Allyn-Feuer A
7. Seaton D
8. Young S
2024Fine-tuning protein Language Models with Deep Mutational Scanning improves variant effect predictionarXiv
1. Landrum MJ
2. Lee JM
3. Riley GR
4. Jang W
5. Rubinstein WS
6. Church DM
7. Maglott DR
2014ClinVar: public archive of relationships among sequence variation and human phenotypeNucleic Acids Res 42:D980–5https://doi.org/10.1093/nar/gkt1113
1. Lenth RV
2024emmeans: Estimated Marginal Means, aka Least-Squares MeansCRAN
1. Li H
2. Handsaker B
3. Wysoker A
4. Fennell T
5. Ruan J
6. Homer N
7. Marth G
8. Abecasis G
9. Durbin R
10. 1000 Genome Project Data Processing Subgroup
2009The Sequence Alignment/Map format and SAMtoolsBioinformatics 25:2078–2079https://doi.org/10.1093/bioinformatics/btp352
1. Li Y-Q
2. Shrestha Y
3. Pandey M
4. Chen M
5. Kablan A
6. Gavrilova O
7. Offermanns S
8. Weinstein LS
2016G(q/11)α and G(s)α mediate distinct physiological responses to central melanocortinsJ Clin Invest 126:40–49https://doi.org/10.1172/JCI76348
1. Lotta LA
2. Mokrosiński J
3. de Oliveira E Mendes
4. Li C
5. Sharp SJ
6. Luan J ’an
7. Brouwers B
8. Ayinampudi V
9. Bowker N
10. Kerrison N
11. Kaimakis V
12. Hoult D
13. Stewart ID
14. Wheeler E
15. Day FR
16. Perry JRB
17. Langenberg C
18. Wareham NJ
19. Farooqi IS.
2019Human Gain-of-Function MC4R Variants Show Signaling Bias and Protect against ObesityCell 177:597–607https://doi.org/10.1016/j.cell.2019.03.044
1. Love MI
2. Huber W
3. Anders S
2014Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2Genome Biol 15https://doi.org/10.1186/s13059-014-0550-8
1. Martin M
2011Cutadapt removes adapter sequences from high-throughput sequencing readsEMBnet.journal 17:10–12https://doi.org/10.14806/ej.17.1.200
1. Matreyek KA
2. Starita LM
3. Stephany JJ
4. Martin B
5. Chiasson MA
6. Gray VE
7. Kircher M
8. Khechaduri A
9. Dines JN
10. Hause RJ
11. Bhatia S
12. Evans WE
13. Relling MV
14. Yang W
15. Shendure J
16. Fowler DM
2018Multiplex assessment of protein variant abundance by massively parallel sequencingNat Genet 50:874–882https://doi.org/10.1038/s41588-018-0122-z
1. McCarthy DJ
2. Chen Y
3. Smyth GK
2012Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variationNucleic Acids Res 40:4288–4297https://doi.org/10.1093/nar/gks042
1. McDonald EF
2. Oliver KE
3. Schlebach JP
4. Meiler J
5. Plate L
2024Benchmarking AlphaMissense pathogenicity predictions against cystic fibrosis variantsPLoS One 19https://doi.org/10.1371/journal.pone.0297560
1. Metzger PJ
2. Zhang A
3. Carlson BA
4. Sun H
5. Cui Z
6. Li Y
7. Jahnke MT
8. Layton DR
9. Gupta MB
10. Liu N
11. Kostenis E
12. Gavrilova O
13. Chen M
14. Weinstein LS
2024A human obesity-associated MC4R mutation with defective Gq/11α signaling leads to hyperphagia in miceJ Clin Invest 134https://doi.org/10.1172/JCI165418
1. Notin P
2. Kollasch AW
3. Ritter D
4. van Niekerk L
5. Paul S
6. Spinner H
7. Rollins N
8. Shaw A
9. Weitzman R
10. Frazer J
11. Dias M
12. Franceschi D
13. Orenbuch R
14. Gal Y
15. Marks DS.
2023ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness PredictionbioRxiv https://doi.org/10.1101/2023.12.07.570727
1. Orenbuch R
2. Kollasch AW
3. Spinner HD
4. Shearer CA
5. Hopf TA
6. Franceschi D
7. Dias M
8. Frazer J
9. Marks DS
2023Deep generative modeling of the human proteome reveals over a hundred novel genes involved in rare genetic disordersmedRxiv https://doi.org/10.1101/2023.11.27.23299062
1. Paisdzior S
2. Dimitriou IM
3. Schöpe PC
4. Annibale P
5. Scheerer P
6. Krude H
7. Lohse MJ
8. Biebermann H
9. Kühnen P
2020Differential Signaling Profiles of MC4R Mutations with Three Different LigandsInt J Mol Sci 21https://doi.org/10.3390/ijms21041224
1. Pettersen EF
2. Goddard TD
3. Huang CC
4. Meng EC
5. Couch GS
6. Croll TI
7. Morris JH
8. Ferrin TE
2021UCSF ChimeraX: Structure visualization for researchers, educators, and developersProtein Sci 30:70–82https://doi.org/10.1002/pro.3943
1. Plenge RM
2. Scolnick EM
3. Altshuler D
2013Validating therapeutic targets through human geneticsNat Rev Drug Discov 12:581–594https://doi.org/10.1038/nrd4051
1. Podyma B
2. Sun H
3. Wilson EA
4. Carlson B
5. Pritikin E
6. Gavrilova O
7. Weinstein LS
8. Chen M
2018The stimulatory G protein Gsα is required in melanocortin 4 receptor-expressing cells for normal energy balance, thermogenesis, and glucose metabolismJ Biol Chem 293:10993–11005https://doi.org/10.1074/jbc.RA118.003450
1. Poitout L
2. Brault V
3. Sackur C
4. Bernetière S
5. Camara J
6. Plas P
7. Roubert P
2007Identification of a novel series of benzimidazoles as potent and selective antagonists of the human melanocortin-4 receptorBioorg Med Chem Lett 17:4464–4470https://doi.org/10.1016/j.bmcl.2007.06.010
1. Quan J
2. Langelier C
3. Kuchta A
4. Batson J
5. Teyssier N
6. Lyden A
7. Caldera S
8. McGeever A
9. Dimitrov B
10. King R
11. Wilheim J
12. Murphy M
13. Ares LP
14. Travisano KA
15. Sit R
16. Amato R
17. Mumbengegwi DR
18. Smith JL
19. Bennett A
20. Gosling R
21. Mourani PM
22. Calfee CS
23. Neff NF
24. Chow ED
25. Kim PS
26. Greenhouse B
27. DeRisi JL
28. Crawford ED
2019FLASH: a next-generation CRISPR diagnostic for multiplexed detection of antimicrobial resistance sequencesNucleic Acids Res 47https://doi.org/10.1093/nar/gkz418
1. Robinson MD
2. Oshlack A
2010A scaling normalization method for differential expression analysis of RNA-seq dataGenome Biol 11https://doi.org/10.1186/gb-2010-11-3-r25
1. Robinson MD
2. Smyth GK
2007Moderated statistical tests for assessing differences in tag abundanceBioinformatics 23:2881–2887https://doi.org/10.1093/bioinformatics/btm453
1. Rondón AV Rodríguez
2. Welling MS
3. van den Akker ELT
4. van Rossum EFC
5. Boon EMJ
6. van Haelst MM
7. Delhanty PJD
8. Visser JA.
2024MC4R variants modulate α-MSH and setmelanotide induced cellular signaling at multiple levelsJ Clin Endocrinol Metab https://doi.org/10.1210/clinem/dgae210
1. Rubin AF
2. Gelman H
3. Lucas N
4. Bajjalieh SM
5. Papenfuss AT
6. Speed TP
7. Fowler DM
2017A statistical framework for analyzing deep mutational scanning dataGenome Biol 18https://doi.org/10.1186/s13059-017-1272-5
1. Sebhat IK
2. Martin WJ
3. Ye Z
4. Barakat K
5. Mosley RT
6. Johnston DBR
7. Bakshi R
8. Palucki B
9. Weinberg DH
10. MacNeil T
11. Kalyani RN
12. Tang R
13. Stearns RA
14. Miller RR
15. Tamvakopoulos C
16. Strack AM
17. McGowan E
18. Cashen DE
19. Drisko JE
20. Hom GJ
21. Howard AD
22. MacIntyre DE
23. van der Ploeg LHT
24. Patchett AA
25. Nargund RP.
2002Design and pharmacology of N-[(3R)-1,2,3,4-tetrahydroisoquinolinium-3-ylcarbonyl]-(1R)-1-(4-chlorobenzyl)-2-[4-cyclohexyl-4-(1H-1,2,4-triazol-1-ylmethyl)piperidin-1-yl]-2-oxoethylamine (1), a potent, selective, melanocortin subtype-4 receptor agonistJ Med Chem 45:4589–4593https://doi.org/10.1021/jm025539h
1. Sharma S
2. Garfield AS
3. Shah B
4. Kleyn P
5. Ichetovkin I
6. Moeller IH
7. Mowrey WR
8. Van der Ploeg LHT.
2019Current Mechanistic and Pharmacodynamic Understanding of Melanocortin-4 Receptor ActivationMolecules 24https://doi.org/10.3390/molecules24101892
1. Smith T
2. Heger A
3. Sudbery I
2017UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracyGenome Res 27:491–499https://doi.org/10.1101/gr.209601.116
1. Sriram K
2. Insel PA
2018G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs?Mol Pharmacol 93:251–258https://doi.org/10.1124/mol.117.111062
1. Starita LM
2. Ahituv N
3. Dunham MJ
4. Kitzman JO
5. Roth FP
6. Seelig G
7. Shendure J
8. Fowler DM
2017Variant Interpretation: Functional Assays to the RescueAm J Hum Genet 101:315–325https://doi.org/10.1016/j.ajhg.2017.07.014
1. Stutzmann F
2. Tan K
3. Vatin V
4. Dina C
5. Jouret B
6. Tichet J
7. Balkau B
8. Potoczna N
9. Horber F
10. O’Rahilly S
11. Farooqi IS
12. Froguel P
13. Meyre D
2008Prevalence of melanocortin-4 receptor deficiency in Europeans and their age-dependent penetrance in multigenerational pedigreesDiabetes 57:2511–2518https://doi.org/10.2337/db08-0153
1. Sweeney P
2. Gimenez LE
3. Hernandez CC
4. Cone RD
2023Targeting the central melanocortin system for the treatment of metabolic disordersNat Rev Endocrinol 19:507–519https://doi.org/10.1038/s41574-023-00855-y
1. Tao Y-X
2010The melanocortin-4 receptor: physiology, pharmacology, and pathophysiologyEndocr Rev 31:506–543https://doi.org/10.1210/er.2009-0037
1. Vaisse C
2. Clement K
3. Durand E
4. Hercberg S
5. Guy-Grand B
6. Froguel P
2000Melanocortin-4 receptor mutations are a frequent and heterogeneous cause of morbid obesityJ Clin Invest 106:253–262https://doi.org/10.1172/JCI9238
1. Wade KH
2. Lam BYH
3. Melvin A
4. Pan W
5. Corbin LJ
6. Hughes DA
7. Rainbow K
8. Chen J-H
9. Duckett K
10. Liu X
11. Mokrosiński J
12. Mörseburg A
13. Neaves S
14. Williamson A
15. Zhang C
16. Farooqi IS
17. Yeo GSH
18. Timpson NJ
19. O’Rahilly S
2021Loss-of-function mutations in the melanocortin 4 receptor in a UK birth cohortNat Med 27:1088–1096https://doi.org/10.1038/s41591-021-01349-y
1. Wang X-H
2. Wang H-M
3. Zhao B-L
4. Yu P
5. Fan Z-C
2014Rescue of defective MC4R cell-surface expression and signaling by a novel pharmacoperone Ipsen 17J Mol Endocrinol 53:17–29https://doi.org/10.1530/JME-14-0005
1. Weaver JL
2. Wu W
3. Hyland PL
4. Lim R
5. Smpokou P
6. Pacanowski M
2022Expanding Approved Patient Populations for Rare Disease Treatment Using In Vitro DataClin Pharmacol Ther 112:58–61https://doi.org/10.1002/cpt.2414
1. Weile J
2. Roth FP
2018Multiplexed assays of variant effects contribute to a growing genotype-phenotype atlasHum Genet 137:665–678https://doi.org/10.1007/s00439-018-1916-x
1. Weng C
2. Faure AJ
3. Escobedo A
4. Lehner B
2024The energetic and allosteric landscape for KRAS inhibitionNature 626:643–652https://doi.org/10.1038/s41586-023-06954-0
1. Yeo GSH
2. Lank EJ
3. Farooqi IS
4. Keogh J
5. Challis BG
6. O’Rahilly S
2003Mutations in the human melanocortin-4 receptor gene associated with severe familial obesity disrupts receptor function through multiple molecular mechanismsHum Mol Genet 12:561–574https://doi.org/10.1093/hmg/ddg057
1. Yu J
2. Gimenez LE
3. Hernandez CC
4. Wu Y
5. Wein AH
6. Han GW
7. McClary K
8. Mittal SR
9. Burdsall K
10. Stauch B
11. Wu L
12. Stevens SN
13. Peisley A
14. Williams SY
15. Chen V
16. Millhauser GL
17. Zhao S
18. Cone RD
19. Stevens RC
2020Determination of the melanocortin-4 receptor structure identifies Ca2+ as a cofactor for ligand bindingScience 368:428–433https://doi.org/10.1126/science.aaz8995
1. Zhang H
2. Chen L-N
3. Yang D
4. Mao C
5. Shen Q
6. Feng W
7. Shen D-D
8. Dai A
9. Xie S
10. Zhou Y
11. Qin J
12. Sun J-P
13. Scharf DH
14. Hou T
15. Zhou T
16. Wang M-W
17. Zhang Y
2021Structural insights into ligand recognition and activation of the melanocortin-4 receptorCell Res 31:1163–1175https://doi.org/10.1038/s41422-021-00552-3
1. Zhou Q
2. Yang D
3. Wu M
4. Guo Y
5. Guo W
6. Zhong L
7. Cai X
8. Dai A
9. Jang W
10. Shakhnovich EI
11. Liu Z-J
12. Stevens RC
13. Lambert NA
14. Babu MM
15. Wang M-W
16. Zhao S
2019Common activation mechanism of class A GPCRsElife 8https://doi.org/10.7554/eLife.50279

Article and author information

Author information

Conor J Howard
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0001-5375-6248
- These authors contributed equally to this work.
Nathan S Abell
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0003-1098-5395
- These authors contributed equally to this work.
Beatriz A Osuna
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0003-2604-6173
Eric M Jones
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0002-6648-1965
Leon Y Chan
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0002-0189-4689
Henry Chan
Octant, Inc., Emeryville, USA
ORCID iD: 0009-0008-5927-3587
Dean R Artis
Octant, Inc., Emeryville, USA
Jonathan B Asfaha
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0003-1160-5104
Joshua S Bloom
Department of Human Genetics and Department of Computational Medicine, University of California, Los Angeles, Los Angeles, USA, Howard Hughes Medical Institute, Chevy Chase, USA
ORCID iD: 0000-0002-7241-1648
Aaron R Cooper
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0003-4588-2513
Andrew Liao
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0002-4548-9175
Eden Mahdavi
Octant, Inc., Emeryville, USA
Nabil Mohammed
Octant, Inc., Emeryville, USA
ORCID iD: 0009-0003-6446-6379
Alan L Su
Octant, Inc., Emeryville, USA
ORCID iD: 0009-0005-3406-9270
Giselle A Uribe
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0003-1523-6916
Sriram Kosuri
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0002-4661-0600
Diane E Dickel
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0001-5497-6824
- For correspondence: diane@octant.bio
- These authors contributed equally to this work.
Nathan B Lubock
Octant, Inc., Emeryville, USA
ORCID iD: 0000-0001-8064-2465
- For correspondence: nate@octant.bio
- These authors contributed equally to this work.

Author Notes

Current affiliation: Tacit Therapeutics, Inc., South San Francisco, USA

Current affiliation: Annexon Biosciences, Brisbane, USA

Current affiliation: Arsenal Biosciences, South San Francisco, USA

Current affiliation: Department of Chemical and Systems Biology, Stanford University, Stanford, USA

Version history

Preprint posted: October 12, 2024
Sent for peer review: November 4, 2024
Reviewed Preprint version 1: December 19, 2024

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

views: 257
downloads: 8
citations: 0

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Significance of findings

Strength of evidence

Abstract

Introduction

Results

Development of highly quantitative deep mutational scanning methods

Assays for disease-relevant mechanisms

Highly quantitative Deep Mutational Scanning (DMS) methods for drug discovery applications.

Analysis model for statistically robust comparisons

Increasing power through barcoding

Comprehensive deep mutational scanning of MC4R

Effect of 6,633 missense and nonsense variants on MC4R signaling functions.

Systematic human variant interpretation of MC4R

Variants that bias MC4R function

Systematic identification of variants that have biased effects on MC4R signaling.

Structural insights into biased signaling

Systematic prediction of treatment response

Corrector therapy rescues the activity of a subset of human MC4R variants.

Mapping protein-ligand interactions

Systematic identification of functional protein-ligand interactions.

Discussion

Materials and methods

Reporter assays and cell line development

Building the DMS libraries

Variant-barcode mapping

Running DMS assays

Sequence processing for barcode expression

Negative binomial regression analysis pipeline

Comparison to human genetics data and variant effect predictors

Identification of functionally biased variants

Identifying variants that respond to corrector treatment

Identifying critical variants for protein-ligand interactions

Structural modeling

Figure generation

Sequence data and software availability

Competing Interests

Acknowledgements

Supplemental Information

Supplemental Figures

Methods development detail.

Analysis method and barcoding detail.

Results (z-scores) of MC4R Deep Mutational Scans across all agonist stimulation conditions.

Results (log2[fold change]) of MC4R Deep Mutational Scans across all agonist stimulation conditions.

MC4R Deep Mutational Scans are consistent with variant effect predictions.

Variants with differential effects on Gs and Gq signaling.

Identification of variants that respond to corrector treatment.

Variants that differentially impact activation by peptide versus small molecule agonists.

Supplemental Tables

References

Article and author information

Author information

Conor J Howard#

Nathan S Abell#

Beatriz A Osuna

Eric M Jones

Leon Y Chan

Henry Chan

Dean R Artis

Jonathan B Asfaha

Joshua S Bloom

Aaron R Cooper

Andrew Liao

Eden Mahdavi

Nabil Mohammed

Alan L Su

Giselle A Uribe

Sriram Kosuri

Diane E Dickel*

Nathan B Lubock*

Author Notes

Version history

Copyright

Metrics

Conor J Howard

Nathan S Abell

Diane E Dickel

Nathan B Lubock