An integrative genomic analysis of the Longshanks selection experiment for longer limbs in mice

Abstract
eLife digest
Introduction
Results
Discussion
Materials and methods
Appendix 1
Data availability
References
Article and author information
Metrics

Abstract

Evolutionary studies are often limited by missing data that are critical to understanding the history of selection. Selection experiments, which reproduce rapid evolution under controlled conditions, are excellent tools to study how genomes evolve under selection. Here we present a genomic dissection of the Longshanks selection experiment, in which mice were selectively bred over 20 generations for longer tibiae relative to body mass, resulting in 13% longer tibiae in two replicates. We synthesized evolutionary theory, genome sequences and molecular genetics to understand the selection response and found that it involved both polygenic adaptation and discrete loci of major effect, with the strongest loci tending to be selected in parallel between replicates. We show that selection may favor de-repression of bone growth through inactivating two limb enhancers of an inhibitor, Nkx3-2. Our integrative genomic analyses thus show that it is possible to connect individual base-pair changes to the overall selection response.

https://doi.org/10.7554/eLife.42014.001

eLife digest

Humans have been making use of artificial selection for thousands of years. Much of what we eat, for example, from beef to poultry to cereals, comes from a collection of organisms with genomes that have been completely reshaped by the actions of generations of farmers and breeders. Yet, despite decades of research in evolutionary biology, it remains difficult to predict what will happen to an organism’s genes when selective pressure is applied.

Traits that at first seem simple often arise from layers upon layers of complexity. It can take hundreds if not thousands of tiny changes to many genes, plus just the right alterations to a few key ones, to have a desired effect on a single trait. Also, if you consider that often the genomes of the starting population are unknown and that many traits are under simultaneous selection in wild populations, it becomes clear why many questions remain unanswered.

Castro, Yancoskie, et al. have analyzed an on-going laboratory experiment dubbed “the Longshanks experiment” to explore how an animal’s genome changes under strong selection. Over five years, two independent populations of mice were selectively bred to have longer legs. In each generation, the mice were measured and those with the longest tibia – a bone in the shin – relative to their body mass were allowed to breed. Genetic data were also recorded. Now, Castro, Yancoskie, et al. have analyzed the genetic data up to the first 17 generations in the Longshanks experiment to find out what kind of genes may be relevant to the 13% increase in leg length seen in the mice so far.

This analysis uncovered many genes, possibly thousands, all acting in concert to increase tibia length. But the gene with the largest effect by far was a key developmental gene called Nkx3-2. Mutations in this gene cause a disease called spondylo-megaepiphyseal-metaphyseal dysplasia in people, which can lead to long limbs and a short trunk. Although inactivating this gene completely in mice is lethal, among the founding group of mice in the Longshanks experiment was a rare copy of Nkx3-2. This copy of the gene worked perfectly in all tissues with the exception of the legs, where a genetic switch that controls it was left in the “off” state. Mice inheriting this short stretch of DNA ended up having longer tibia. In effect, these mice held the winning ticket in the genetic lottery that was the Longshanks experiment.

Even in highly controlled experiments that record a great deal of information about the organisms involved, predicting how the genome will change and which genes will be involved is not a straightforward question. Finding out how the genome may change in response to selection is important because scientists can build better models to help with breeding farm animals or crops, or with predicting the consequences of climate change. As a result, experiments such as these may have broad applications in conservation, genomic medicine and agriculture.

https://doi.org/10.7554/eLife.42014.002

Introduction

Understanding how populations adapt to a changing environment is an urgent challenge of global significance. The problem is especially acute for mammal populations, which are often small and fragmented due to widespread habitat loss. Such populations often show increased inbreeding, leading to the loss of genetic diversity (Hoffmann and Sgrò, 2011). Because beneficial alleles in mammals typically come from standing genetic variation rather than new mutations like in microbes, this loss of diversity would ultimately impose a limit on the ability of small populations to adapt. Nonetheless, mammals respond readily to selection in many traits, both in nature and in the laboratory (Darwin, 1859; Gingerich, 2001; Garland and Rose, 2009; Keightley et al., 2001). In quantitative genetics, such traits are interpreted as the overall effect from a large set of loci, each with an infinitesimally small (and undetectable) effect (‘infinitesimal model’). Broadly speaking, the infinitesimal model has performed remarkably well across a wide range of selection experiments, and the model is the basis for commercial breeding (Walsh and Lynch, 2018). However, it remains unclear what type of genomic change is associated with rapid response to selection, especially in small populations where allele frequency changes can be dominated by random genetic drift.

While a large body of theory exists to describe the birth, rise and eventual fixation of adaptive variants under diverse selection scenarios (Maynard Smith and Haigh, 1974; Barton, 1995; Otto and Barton, 2001; Weissman and Barton, 2012; Crow and Kimura, 1965; Hill and Robertson, 1966), few empirical datasets capture sufficient detail on the founding conditions and selection regime to allow full reconstruction of the selection response. This is particularly problematic in nature, where historical samples, environmental measurements and replicates are often missing. Selection experiments, which reproduce rapid evolution under controlled conditions, are therefore excellent tools to understand response to selection—and by extension—adaptive evolution in nature (Garland and Rose, 2009).

Here we describe an integrative, multi-faceted investigation into an artificial selection experiment, called Longshanks, in which mice were selected for increased tibia length relative to body mass (Marchini et al., 2014). The mammalian limb is an ideal model to study the dynamics of complex traits under selection: it is both morphologically complex and functionally diverse, reflecting its adaptive value; and limb development has been studied extensively in mammals, birds and fishes as a genetic and evolutionary paradigm (Petit et al., 2017). The Longshanks selection experiment thus offers the opportunity to study selection response not only from a quantitative and population genetics perspective, but also from a developmental (Marchini and Rolian, 2018) and genomic perspective.

By design, the Longshanks experiment preserves a nearly complete archive of the phenotype (trait measurements) and genotype (via tissue samples) in the pedigree. Previously, Marchini et al. investigated how selection was able to overcome correlation between tibia length and body mass and produced independent changes in tibia length during the first 14 generations of the Longshanks experiment (Marchini et al., 2014). Importantly, that study focused on the phenotypes and inferred genetic correlations indirectly using the pedigree. The current genomic analysis was initiated when the on-going experiment reached generation 17 and extends the previous study by integrating both phenotypic and genetic aspects of the Longshanks experiment. By sequencing the initial and final genomes, the current analysis benefits from direct and highly resolved genetic information. Here, with essentially complete information, we wish to answer a number of important questions regarding the factors that determine and constrain rapid adaptation: Are the observed changes in gene frequency due to selection or random drift? Does rapid selection response of a complex trait proceed through innumerable loci of infinitesimally small effect, or through a few loci of large effect? What type of signature of selection may be associated with this process? Finally, when the same trait changes occur independently, do these depend on changes in the same gene(s) or the same pathways (parallelism)?

Results

Longshanks selection for longer tibiae

At the start of the Longshanks experiment, we established three base populations with 14 pairs each by sampling from a genetically diverse, commercial mouse stock (Hsd:ICR, also known as CD-1; derived from mixed breeding of classical laboratory mice [Yalcin et al., 2010]). In two replicate ‘Longshanks’ lines (LS1 and LS2), we bred mice by pairing 16 males and females (and excluding sibling pairs) with the longest tibia relative to the cube root of body mass for each sex. This corresponds to 15–20% of all offspring (only details essential to understanding our analysis are summarized here. See Marchini et al., 2014 for a detailed description of the breeding scheme). We kept a third Control line (Ctrl) using an identical breeding scheme, except that breeders were selected at random. In LS1 and LS2, we observed a strong and significant response to selection in tibia length (0.29 and 0.26 Haldane or standard deviations (s.d.) per generation, from a selection differential of 0.73 s.d. in LS1 and 0.62 s.d. in LS2). Over 20 generations, selection for longer relative tibia length produced increases of 5.27 and 4.81 s.d. in LS1 and LS2, respectively (or 12.7% and 13.1% in tibia length), with a modest decrease in body mass (−1.5% in LS1 and −3.7% in LS2; Student’s t-test, p<2 × 10⁻⁴ and p<1 × 10⁻⁸, respectively; Figure 1B and C; Figure 1—figure supplement 1; n.b. this relationship was in part biased by the F1 generation, which were fed a different diet and phenotyped three weeks later than later generations, see Marchini et al., 2014 for details). By contrast, Ctrl showed no directional change in tibia length or body mass (Figure 1C; Student’s t-test, p>0.05). This approximately 5 s.d. change in 20 generations is rapid compared to typical rates observed in nature (Hendry and Kinnison, 1999, but see Grant and Grant, 2002) but is in line with responses seen in selection experiments (Gingerich, 2001; Keightley et al., 2001; Falconer and Mackay, 1996; Pitchers et al., 2014).

Figure 1 with 2 supplements see all

Download asset Open asset

Selection for Longshanks mice produced rapid increase in tibia length.

(**A and B**) Tibia length varies as a quantitative trait among outbred mice derived from the Hsd:ICR (also known as CD-1) commercial stock. Selective breeding for mice with the longest tibiae relative to body mass within families has produced a strong selection response in tibia length over 20 generations in Longshanks mice (13%, blue arrow, LS1). (C) Both LS1 and LS2 produced replicated rapid increase in tibia length (blue and red; line and shading show mean ±s.d.) compared to random-bred Controls (gray). Arrowheads along the x-axis mark sequenced generations F0 and F17. See Figure 1—figure supplement 1 for body mass data. Lower panel: Representative tibiae from the Ctrl, LS1 and LS2 after 20 generations of selection. (D) Analysis of sequence diversity data (linked variants or haplotypes: lines; variants: dots) may detect signatures of selection, such as selective sweeps (F17 in LS1 and LS2) that result from selection favoring a particular variant (dots), compared to neutral or background patterns (Ctrl). Alternatively, selection may elicit a polygenic response, which may involve minor shifts in allele frequency at many loci and therefore may leave a very different selection signature from the one shown here.

https://doi.org/10.7554/eLife.42014.003

Simulating selection response: infinitesimal model with linkage

The rapid but generally smooth increase in tibia length in Longshanks is typically interpreted as evidence for a highly dispersed genetic architecture with no individually important loci contributing to the selection response. This is classically described under quantitative genetics as the infinitesimal model. Crucially, the appropriate null hypothesis for the genomic response here should capture “polygenic adaptation” rather than a neutral model. We therefore developed a simulation that faithfully recapitulates the artificial selection experiment by integrating the trait measurements, selection regime, pedigree and genetic diversity of the Longshanks selection experiment, in order to generate an accurate expectation for the genomic response. Using the actual pedigree and trait measurements, we mapped fitness onto tibia length $T$ and cube-root body mass $B$ as a single composite trait $l n ({T B}^{ϕ})$ . We estimated $ϕ$ from actual data as −0.57, such that the ranking of breeders closely matched the actual composite ranking used to select breeders in the selection experiment, based on $T$ and $B$ separately (Marchini et al., 2014) (Figure 1—figure supplement 2A). We assumed a maximally polygenic genetic architecture using an “infinitesimal model with linkage” (abbreviated here as H_INF), under which the trait is controlled by very many loci, each of infinitesimally small effect (see Appendix for details). Results from simulations seeded with actual genotypes or haplotypes showed that overall, the predicted increase in inbreeding closely matched the observed data (Figure 1—figure supplement 2B). We tested models with varying selection intensity and initial linkage disequilibrium (LD), and for each, ran 100 simulated replicates to determine the significance of changes in allele frequency (Figure 1—figure supplement 2C-E). This flexible quantitative genetics framework allowed us to explore possible changes in genetic diversity over 17 generations of breeding under strong selection.

In simulations, we followed blocks of genomes as they were passed down the pedigree. In order to compare with observations, we seeded the initial genomes with single nucleotide polymorphisms (SNPs) in the same number and initial frequencies as the data. We observed much more variation between chromosomes in overall inbreeding (Figure 1—figure supplement 2B) and in the distribution of allele frequencies (Figure 2—figure supplement 1B) than expected from simulations in which the ancestral SNPs were initially in linkage equilibrium. This can be explained by linkage disequilibrium (LD) between the ancestral SNPs, which greatly increases random variation. Therefore, we based our significance threshold tests on simulations that were seeded with SNPs drawn with LD consistent with the initial haplotypes (Figure 1—figure supplement 2C and E; see Appendix).

Because our simulations assume infinitesimal effects of loci, allele frequency shifts exceeding this stringent threshold would suggest that discrete loci contribute significantly to the selection response. An excess of such loci in either a single LS replicate or in parallel would thus imply a mixed genetic architecture of a few large-effect loci amid an infinitesimal background.

Sequencing the Longshanks mice reveals genomic signatures of selection

To detect the genomic changes in the actual Longshanks experiment, we sequenced all individuals of the founder (F0) and 17^th generation (F17) to an average of 2.91-fold coverage (range: 0.73–20.6×; n = 169 with <10% missing F0 individuals; Supplementary file 1). Across the three lines, we found similar levels of diversity, with an average of 6.7 million (M) segregating SNPs (approximately 0.025%, or 1 SNP per four kbp; Supplementary file 2; Figure 2—figure supplement 1A and Figure 2—figure supplement 2). We checked the founder populations to confirm negligible divergence between the three founder populations (across-line F_ST on the order of 1 × 10⁻⁴), which increased to 0.18 at F17 (Supplementary file 2). This is consistent with random sampling from an outbred breeding stock. By F17, the number of segregating SNPs dropped to around 5.8 M (Supplementary file 2). This 13% drop in diversity (0.9M SNPs genome-wide) is predicted by drift. Our simulations confirmed this and moreover, showed that selection contributed negligibly to the drop in diversity (Appendix, Figure 1—figure supplement 2B, D).

We conclude that despite the strong selection on the LS lines, there was little perturbation to genome-wide diversity. Indeed, the changes in diversity in 17 generations were remarkably similar in all three lines, despite Ctrl not having experienced selection on relative tibia length (Figure 2—figure supplement 1A). Hence, and consistent with our simulation results (Figure 1—figure supplement 2B,D), changes in global genome diversity had little power to distinguish selection from neutral drift despite the strong phenotypic selection response.

We next asked whether specific loci reveal more definitive differences between the LS replicates and Ctrl (and from infinitesimal predictions). We calculated ∆z², the square of an arcsine transformed allele frequency difference between F0 and F17; this has an expected variance of 1/2N_e per generation, independent of starting frequency, and ranges from 0 to π². We averaged ∆z² within 10 kbp windows (see Methods for details), and found 169 windows belonging to eight clusters (i.e., loci) that had significant shifts in allele frequency in LS1 and/or LS2 (corresponding to 9 and 164 clustered windows respectively at p≤0.05 under H_{INF, max LD}; ∆z² ≥0.33 π²; genome-wide ∆z² = 0.02 ± 0.03 π²; Figure 2; Figure 1—figure supplement 2D, Figure 2—figure supplement 2, Figure 2—figure supplement 3; see Methods for details) and 8 windows in three clusters in Ctrl (genome-wide ∆z² = 0.01 ± 0.02 π²). The eight loci in Longshanks each overlapped between 2 to 179 genes and together contained 11 candidate genes with known roles in bone, cartilage and/or limb development (e.g., Nkx3-2 and Sox9; Table 1; Figure 2—figure supplement 3, Figure 2—figure supplement 4). Four out of the eight loci contain genes with a ‘short tibia’ or ‘short limb’ knockout phenotype (Table 1; p≤0.032 from 1000 permutations, see Methods for details). Of the broader set of genes at these loci with any limb knockout phenotypes, only fibrillin 2 (Fbn2) is polymorphic for SNPs coding for different amino acids, suggesting that for the majority of loci with large shifts in allele frequency, gene regulation was likely important in the selection response (Figure 2—figure supplement 4; Supplementary file 3; see Appendix for further analyses on enrichment in gene functions, protein-coding vs. cis-acting changes and clustering with loci affecting human height).

Figure 2 with 4 supplements see all

Download asset Open asset

Widespread genomic response to selection for increased tibia length.

Allele frequency shifts between generations F0 and F17 in LS1, LS2 and Ctrl lines are shown as ∆z² profiles across the genome (plotted here as fraction of its range from 0 to π²). The Ctrl ∆z² profile (gray) confirmed our expectation from theory and simulation that drift, inbreeding and genetic linkage could combine to generate large ∆z² shifts even without selection. Nonetheless the LS1 (blue) and LS2 (red) profiles show a greater number of strong and parallel shifts than Ctrl. These selective sweeps provide support for the contribution of discrete loci to selection response (arrowheads, blue: LS1; red: LS2; purple: parallel; see also Figure 1—figure supplement 2E, Figure 2—figure supplement 2, Figure 2—figure supplement 3) beyond a polygenic background, which may explain a majority of the selection response and yet leave little discernible selection signature. Candidate genes are highlighted (Table 1). An additional *a priori* candidate limb regulator *Gli3* is indicated with a black arrowhead.

https://doi.org/10.7554/eLife.42014.006

Table 1

Major loci likely contributing to the selection response.

These eight loci show significant allele frequency shifts in ∆z² and are ordered according to their estimated selection coefficients according to Haldane (1932). Shown for each locus are the full hitchhiking spans, peak location and their size covering the core windows, the overlapping TAD and the number of genes found in it. The two top-ranked loci show shifts in parallel in both LS1 and LS2, with the remaining six showing line-specific response (LS1: 1; LS2: 5). Candidate genes found within the TAD with limb, cartilage, or bone developmental knockout phenotype functions are shown, with asterisks (*) marking those with a ‘short tibia’ knockout phenotype (see also Figure 2—figure supplement 3 and Supplementary file 3 for full table).

https://doi.org/10.7554/eLife.42014.011

Rk	Chr	Span (Mbp)	Peak	Core (kbp)	TAD (kbp)	Genes	∆q
Rk	Chr	Span (Mbp)	Peak	Core (kbp)	TAD (kbp)	Genes	LS1	LS2	Ctrl	Type	Candidate genes
1	5	38.95–45.13	41.77	900	720	3	0.69	0.86	−0.14	Parallel	Nkx3-2
2	10	77.47–87.69	81.07	5360	6520	175	0.79	0.88	−0.04	Parallel	Sbno5, Aes, Adamtsl5, Chst11, Cry1, Prdm4*
3	18	53.63–63.50	58.18	220	520	4	0.05	0.78	−0.06	LS2-specific	-
4	13	35.59–55.21	48.65	70	2600	22	0.24	0.80	−0.03	LS2-specific	Id4
5	1	53.16–57.13	55.27	10	720	4	0.65	0.01	−0.23	LS1-specific	-
6	15	31.92–44.43	41.54	10	680	3	−0.23	0.66	0.02	LS2-specific	Rspo2*
7	6	118.65–125.25	120.30	130	1360	12	−0.03	0.79	−0.15	LS2-specific	Wnt5b*
8	11	111.10–115.06	113.42	10	2120	2	−0.14	0.66	−0.15	LS2-specific	Sox9*

Rk, Rank.

Chr, Chromosome.
Core, Span of 10 kbp windows above H_{INF, max LD} p≤0.05 significance threshold.

TAD, Merged span of topologically associating domains (TAD) overlapping the core span. TADs mark segments along a chromosome that share a common regulatory mechanism. Data from Dixon et al. (2012).
Candidate genes, Genes within the TAD span showing ‘short tibia’, ‘short limbs’, ‘abnormal osteoblast morphology’ or ‘abnormal cartilage morphology’ knockout phenotypes are listed, with * marking those with ‘short tibia’.

Taken together, two major observations stand out from our genomic survey. One, a polygenic, infinitesimal selection model with strong LD among marker SNPs performed better than moderate LD or no LD (Figure 1—figure supplement 2E); and two, we nevertheless find more discrete loci in LS1 and LS2 than in Ctrl, beyond the significance threshold set by the infinitesimal model (Figure 2; Figure 2—figure supplement 2). Thus, we conclude that although the genetic basis of the selection response in the Longshanks experiment may be largely polygenic, evidence strongly suggests discrete loci with major effect, even when each line is considered separately.

We next tested the repeatability of the selection response at the gene/locus level using the two LS replicates. If the founding populations shared the same selectively favored variants, we may observe parallelism or co-incident selective sweeps, as long as selection could overcome random drift. Indeed, the ∆z² profiles of LS1 and LS2 were more similar to each other than to Ctrl (Figure 2 and 3A; Figure 3—figure supplement 1; Pearson’s correlation in ∆z² from 10 kbp windows: LS1–LS2: 0.21 vs. LS1–Ctrl: 0.06 and LS2–Ctrl: 0.05). Whereas previous genomic studies with multiple natural or artificial selection replicates focused mainly on detecting parallel loci (Burke et al., 2010; Jones et al., 2012; Chan et al., 2012; Kelly and Hughes, 2018), here we have the possibility to quantify parallelism and determine the selection value of a given locus. Six out of eight significant loci at the H_{INF, max LD} threshold were line-specific, even though all eight selected alleles were present in the F0 generation in both lines. This prevalence of line-specific loci was consistent under different significance thresholds. However, the two remaining loci that ranked first and second by selection coefficient were parallel, both with s > 0.3 (Figure 3B; note that as outliers, the selection coefficient may be substantially overestimated, but their rank order should remain the same), supporting the idea that the probability of parallelism can be high among those loci with the greatest selection advantage (Orr, 2005).

Finding just two parallel loci out of 8 discrete loci may appear to be low, given the genetic similarity in the founding generation and the identical selection applied to both Longshanks replicates. However, one should bear in mind the very many genetic paths to increasing tibia length under an infinitesimal model, and that the effect of drift is expected to be very strong in these small populations. In larger populations, the shift in the balance from drift to selection should result in selection being able to favor increasingly subtle variants and thus produce a greater proportion of parallel loci. However, we expect the trend of parallelism being enriched among the top loci to hold.

In contrast to the subtle differences within each line in changes in global diversity over 17 generations (Figure 2 and Figure 2—figure supplement 2), we found the signature of parallelism to be significantly enriched in the comparison between the selected replicates (χ² test, LS1–LS2: p≤1 × 10⁻¹⁰), as opposed to comparisons between each selected line and Ctrl (LS1–Ctrl: p>0.01 and LS2–Ctrl: p>0.2, both non-significant after correcting for multiple testing), or between simulated replicates (Figure 3—figure supplement 1; see Appendix for details). Because the parallel selected loci between LS1 and LS2 have the highest selection coefficients and parallelism is not generally expected in our populations, these loci provide the strongest evidence for the role of discrete major loci. As such, the top-ranked parallel locus is the prime candidate for molecular dissection (see Figure 4—figure supplement 1 and Appendix ‘Molecular dissection of Gli3’ for an additional a priori candidate locus with known limb function).

Figure 3 with 1 supplement see all

Download asset Open asset

Selection response in the Longshanks lines was largely line-specific, but the strongest signals occurred in parallel.

(A) Allele frequencies showed greater shifts in LS2 (red) than in Ctrl (gray; left panel; diamonds: peak windows; dots: other 10 kbp windows; see Figure 3—figure supplement 1 for Ctrl vs. LS1 and Appendix for details). Changes in the two lines were not correlated with each other. In contrast, there were many more parallel changes in a comparison between LS1 (blue) vs. LS2 (red; middle panel; adjacent windows appear as clusters due to hitchhiking). The overall distribution closely matches simulated results under the infinitesimal model with maximal linkage disequilibrium (*H_{INF, max LD}*; right heatmap summarizes the percentage seen in 100 simulated replicates), with most of the windows showing little to no shift (red hues near 0; see also Figure 3—figure supplement 1 for an example replicate). Tick marks along the axes show genome-wide maximum ∆z² shifts in each of 100 replicate simulations in LS1 (x-axis, blue) and LS2 (y-axis, red), from which we derived line-specific thresholds at the p≤0.05 significance level. While the frequency shifts from simulations matched the bulk of the observed data well, no simulation recovered the strong parallel shifts observed between LS1 and LS2 (compare middle to right panel, points along the diagonal). (B) Genome-wide ranking based on estimated selection coefficients s among the candidate discrete loci at p≤0.05 under *H_{INF, max LD}*. While six out of eight total loci showed significant shifts in only LS1 or LS2, the two loci with the highest selection coefficients were likely selected in parallel in both LS1 and LS2 (also see middle panel in A).

https://doi.org/10.7554/eLife.42014.012

Molecular dissection of the Nkx3-2 locus highlights cis-acting changes

Between the two major parallel loci, we chose the locus on chromosome 5 (Chr5) at 41–42 Mbp for functional validation because it showed the strongest estimated selection coefficient, its signature of selection was clear, and crucially for functional characterization, it contains only three genes, including Nkx3-2 (also known as Bapx1), a known regulator of bone maturation (Figure 2 and 4A) (Provot et al., 2006). At this locus, the pattern of variation resembles a selective sweep spanning 1 Mbp (Figure 4A). Comparison between F0 and F17 individuals revealed no recombinant in this entire region (Figure 5—figure supplement 1A, top panel), precluding fine-mapping using recombinants. We then analyzed the genes in this region to identify the likely target(s) of selection. First, we determined that no coding changes existed for either Rab28 or Nkx3-2, the two genes located within the topologically associating domain (TAD, which mark chromosome segments with shared gene regulatory logic) (Dixon et al., 2012). We then performed in situ hybridization and detected robust expression of Nkx3-2 and Rab28 in the developing fore- and hind limb buds of Ctrl, LS1 and LS2 E12.5, in a domain broadly overlapping the presumptive zeugopod, the region including the tibia (Figure 4—figure supplement 2B). A third gene, Bod1l, straddled the TAD boundary with its promoter located in the neighboring TAD, making its regulation by sequences in the selected locus unlikely. Consistent with this, Bod1l showed only weak or undetectable expression in the developing limb bud (Figure 4—figure supplement 2A). We next combined ENCODE chromatin profiles and our own ATAC-Seq data to identify limb enhancers in the focal TAD. Here we found three novel enhancer candidates (N1, N2 and N3) carrying three, one and three SNPs respectively, all of which showed significant allele frequency shifts in LS1 and LS2 (Figure 4B and C; Figure 5—figure supplement 1A). Chromosome conformation capture assays showed that the N1 and N3 sequences formed long-range looping contacts with the Nkx3-2 promoter—a hallmark of enhancers—despite as much as 600 kbp of intervening sequence (Figure 4B). We next used transgenic reporter assays to determine whether these sequences could drive expression in the limbs. Here, we were not only interested in whether the sequence encoded enhancer activity, but specifically whether the SNPs would affect the activity (Figure 4C and D). An examination of the predicted transcription factor binding sites showed that both the N1 and N3 enhancers contain multiple SNPs with consistent directional impact on the putative enhancer activity (Figure 4C). In contrast, the N2 enhancer contains only a single SNP and is predicted to have inconsistent effect on its activity. We therefore excluded the N2 enhancer from further testing. We found that the F0 alleles of the N1 and N3 enhancers (three SNPs each in about one kbp) drove robust and consistent lacZ expression in the developing limb buds (N1 and N3) as well as in expanded trunk domains (N3) at E12.5 (Figure 4E). In contrast, transgenic reporters carrying the selected F17 alleles of the N1 and N3 enhancers drove consistently weak, nearly undetectable lacZ expression (Figure 4E). Thus, switching from the F0 to the F17 enhancer alleles led to a nearly complete loss in activity (‘loss-of-function’) at developmental stage E12.5. This is consistent with the role of Nkx3-2 as a repressor in long bone maturation (Provot et al., 2006). It should be noted that even though our selective regime favored an increase in the target phenotype (tibia length), at the molecular level we expect advantageous loss- and gain-of-function variants to be equally likely favored by selection. In fact, in an additional functional validation example at the Gli3 locus, we found a gain-of-function enhancer variant that may have been favored at that locus (see Figure 4—figure supplement 1 and Appendix ‘Molecular dissection of Gli3’).

Figure 4 with 2 supplements see all

Download asset Open asset

Strong parallel selection response at the bone maturation repressor *Nkx3-2* locus was associated with decreased activity of two enhancers.

(A) ∆z² in this region of chromosome five showed strong parallel differentiation spanning 1 Mbp in both Longshanks but not in the Control line. This 1 Mbp region contains three genes: *Nkx3-2*, *Rab28* and *Bod1l* (whose promoter lies outside the TAD boundary, shown as gray boxes). Although an originally rare allele in all lines, this region swept almost to fixation by generation 17 in LS2 (Figure 5—figure supplement 1A). (B) Chromatin profiles [ATAC-Seq, red, (Buenrostro et al., 2013); ENCODE histone modifications, purple] from E14.5 developing limb buds revealed five putative limb enhancers (gray and red shading) in the TAD, three of which contained SNPs showing significant frequency shifts. Chromosome conformation capture assays (4C-Seq) from E14.5 limb buds from the N1, N2 and N3 enhancer viewpoints (bi-directional arrows) showed significant long-range looping between the enhancers and sequences around the *Nkx3-2* promoter (heat-map from gray to red showing increasing contacts; Promoters are shown with black arrows and blue vertical shading). (C) Selected alleles at 7 SNPs found within the N1, N2, and N3 enhancers increased ~0.75 in frequency in both LS1 and LS2. Selected alleles at three of these sites are predicted to lead to loss (red inhibition circles) of transcription factor binding sites in the *Nkx3-2* pathway (including a SNP in N3 causing loss of two adjoining *Nkx3-2* binding sites) and thus reduce enhancer activity in N1 and N3. (**D, E**) Transient transgenic reporter assays of the N1 and N3 enhancers showed that the F0 alleles drove robust and consistent expression at centers of future cartilage condensation (N1) and broader domains of *Nkx3-2* expression (N3) in E12.5 fore- and hind limb buds (FL, HL; ti: tibia). Fractions indicate the number of embryos showing similar *lacZ* staining out of all transgenic embryos. Substituting the F17 enhancer allele (i.e., replacing three positions each in N1 and N3) led to little observable limb bud expression in both the N1/F17 and N3/F17 embryos, suggesting that selection response for longer tibia involved de-repression of bone maturation through a loss-of-function regulatory allele of *Nkx3-2* at this locus. Scale bar: 1 mm for both magnifications.

https://doi.org/10.7554/eLife.42014.014

At the Nkx3-2 locus, we hypothesize that the F17 allele causes de-repression of bone and/or cartilage formation by reducing enhancer activity and Nkx3-2 expression. Crucially, the F0 N1 enhancer showed activity that presages future long bone cartilage condensation in the limb (Figure 4E). That is, the observed expression pattern recalls previous results that suggest that undetected early expression of Nkx3-2 may mark the boundaries and size of limb bone precursors, including the tibia (Sivakamasundari et al., 2012). Conversely, over-expression of Nkx3-2 has been shown to cause shortened tibia (even loss) in mice (Bren-Mattison et al., 2011; Tribioli and Lufkin, 2006). In humans, homozygous frameshift mutations in NKX3-2 cause the rare disorder spondylo-megaepiphyseal-metaphyseal dysplasia (SMMD; OMIM: 613330), which is characterized by short-trunk, long-limbed dwarfism and bow-leggedness (Hellemans et al., 2009). The affected bones in SMMD patients broadly correspond to the expression domains of the two novel N1 (limbs) and N3 (limbs and trunk) enhancers. Instead of wholesale loss of Nkx3-2 expression, which would have been lethal in mice (Akazawa et al., 2000) or likely cause major defects similar to SMMD patients (Hellemans et al., 2009), our in situ hybridization data did not reveal qualitative differences in Nkx3-2 expression domains between Ctrl or LS embryos (Figure 4—figure supplement 2B). Taken together, our results recapitulate the key features of a cis-acting mode of adaptation: Nkx3-2 is a broadly expressed pleiotropic transcription factor that is lethal when knocked out (Akazawa et al., 2000). We found no amino acid changes between the F0 and F17 alleles that could impact protein function. Rather, selection favored changes in tissue-specific expression by modular enhancers. By combining population genetics, functional genomics and developmental genetic techniques, we were able to dissect a megabase-long locus and present data supporting the identification of up to six candidate quantitative trait nucleotides (QTNs). In mice, this represents a rare example of genetic dissection of a trait to the base-pair level.

Linking molecular mechanisms to evolutionary consequence

We next aimed to determine the evolutionary relevance of the Nkx3-2 enhancer variants at the molecular and the population levels. At the strongly expressed N3/F0 ‘trunk and limb’ enhancer, we note that the SNPs in the F17 selected allele lead to disrupted Nkx3-1 and Nkx3-2 binding sites (Figure 4C and 5A; UNIPROBE database [Berger et al., 2008]). This suggests that the selected SNPs may disrupt an auto-feedback loop to decrease Nkx3-2 activity in the limb bud and trunk domains (Figure 5A). Using a GFP transgenic reporter assay in stickleback fish embryos, we found that the mouse N1/F0 enhancer allele was capable of driving expression in the distal cells but not in the fin rays of the developing fins (Figure 5A). This pattern recapitulates fin expression of nkx3.2 in fish, which gives rise to endochondral radials (homologous to ulna/tibia in mice) (Crotwell and Mabee, 2007). Our results suggest that strong selection may have favored the weaker N1/F17 and N3/F17 enhancer alleles in the context of the Longshanks selection regime despite the deep functional conservation of the F0 variants.

Figure 5 with 1 supplement see all

Download asset Open asset

Linking base-pair changes to rapid morphological evolution.

(A) At the *Nkx3-2* locus, we identified two long-range enhancers, N1 and N3 (circles), located 600 and 230 kbp away, respectively. During development, they drive partially overlapping expression domains in limbs (N1 and N3) and trunk (N3), which are body regions that may correlate positively (tibia length) and possibly negatively (trunk with body mass) with the Longshanks selection regime. For both enhancers, the selected F17 alleles carry loss-of-function variants (gray crosses). Two out of three SNPs in the N3 F17 enhancer are predicted to disrupt an auto-feedback loop, likely reducing *Nkx3-2* expression in the trunk and limb regions. Conversely, the enhancer function of the strong N1 F0 allele is evolutionarily conserved in fishes, demonstrated by its ability to drive consistent GFP expression (green) in the pectoral fins (pf, outlined) and branchial arches (white arrowhead, left) in transgenic stickleback embryos at 11 days post-fertilization. The N1 enhancer can recapitulate *nkx3.2* expression in distal cells specifically in the endochondral radial domain in developing fins (black arrowheads, right). Scale bar: 250 µm for both magnifications. (B) Allele frequency of the selected allele (minor allele at F0, q) at N3 over 20 generations (blue: LS1; red: LS2; gray broken line: Ctrl; results from N1 were nearly identical due to tight linkage). Observed frequencies from genotyped generations in the Ctrl line are marked with filled circles. Dashed lines indicate missing Ctrl generations. Open circles at generations 0 and 17 indicate allele frequencies from whole genome sequencing. The allele frequency fluctuated in Ctrl due to random drift but followed a generally linear increase in the selected lines from around 0.17 to 0.85 (LS1) and 0.98 (LS2) by generation 17. Shaded contours mark expected allelic trajectories under varying selection coefficients starting from 0.17 (red horizontal line; the average starting allele frequency between LS1 and LS2 founders). The gray shaded region marks the 95% confidence interval under random drift.

https://doi.org/10.7554/eLife.42014.017

Using theory and simulations, we went beyond qualitative molecular dissection to quantitatively estimate the selection coefficient at the Nkx3-2 locus and its contribution to the total selection response in the Longshanks mice. We retraced the selective sweep of the Nkx3-2 N1 and N3 alleles through targeted genotyping in 1569 mice across all 20 generations. The selected allele steadily increased from around 0.17 to 0.85 in LS1 and 0.98 in LS2 but fluctuated around 0.25 in Ctrl (Figure 5B). We estimated that such a change of around 0.7 in allele frequency would correspond to a selection coefficient s of ~0.24 ± 0.12 at this locus (Figure 5—figure supplement 1B; see Appendix section on ‘Estimating selection coefficient of the top-ranking locus, Nkx3-2, from changes in allele frequency’). By extending our simulation framework to allow for a major locus against an infinitesimal background, we find that the Nkx3-2 locus would contribute 9.4% of the total selection response (limits 3.6–15.5%; see Appendix section ‘Estimating selection coefficient’ for details) in order to produce a shift of 0.7 in allele frequency over 17 generations. To avoid inflation stemming from estimating from outliers, we also independently estimated the contribution of the Nkx3-2 locus using a linear mixed animal model based on the full genotyped series mentioned above (see Appendix section ‘Estimating selection coefficient, animal model’ for details). Using this alternative approach, we estimated that each selected allele increases tibia length by 0.36% (N = 1569, 95% conf. int.: 0.07–0.64%, p=0.0171). Multiplying the effect with the increase in the allele frequency suggests that the Nkx3-2 locus alone would account for approximately 4% of the overall 12.9% increase in tibia length. This lower estimate of around 4% is nonetheless within the bounds of the estimate from simulations. Together, both approaches indicate that the Nkx3-2 locus contributes substantially to the selection response.

Discussion

A defining task of our time is to understand the factors that determine and constrain how small populations respond to sudden environmental changes. Here, we analyze the replicated and controlled Longshanks experiment to characterize the genomic changes that occur as small experimental populations respond to selection.

An important conclusion from the Longshanks experiment is that selection response can be steady and robust even in extremely bottlenecked populations. That is, we found that tibia length increased readily and repeatedly in response to selection even with as few as 14–16 breeding pairs per generation. The sustained response was possible because the lines were founded with enough standing variation, and generation 17 was still only a fraction of the way to the expected limit for the selection response at ~2N_e generations (Robertson, 1960), estimated here to be around 90 (see legend for Figure 1—figure supplement 2B; Appendix on ‘Estimating the selection coefficient’). Although other selective breeding studies using a similar base population of mice encountered selection limits at around generation 20–25 (possibly due to countervailing selection rather than loss of genetic variance) for high voluntary wheel running behavior (Careau et al., 2013) and for nest-building behavior (Bult and Lynch, 2000), here all evidence suggests that the Longshanks mice should continue to show increases in tibia length for many more generations.

The estimated N_e of 46 in the Longshanks experiment, while small, is comparable to those in natural populations like the Soay sheep (McRae et al., 2005), Darwin’s finches (Grant and Grant, 1992) or Tasmanian devils (Epstein et al., 2016) (this last study documents a rapid and parallel evolutionary response to transmissible tumors). These populations span a wide range of time in sustained bottlenecking, from the most recent in Tasmanian devils, to likely many millions of years in Darwin’s finches. Accordingly, we also expect very different dynamics during short- vs. long-term selection response: for a short bout of selection, such as the 20 generations analyzed in this study, selection response depends overwhelmingly on standing genetic variation, with little to no contribution from de novo mutations (Hill, 1982; Weber and Diggins, 1990). Over the long term, however, de novo mutations would contribute increasingly to selection response. In the Longshanks experiment, we observe a robust early response to selection (Figure 1B and Figure 1—figure supplement 1), and a gradual decrease in sequence diversity, consistent with the effect of drift (Figure 1—figure supplement 2B and Figure 2—figure supplement 1A, Supplementary file 2). There has long been broad empirical support for adaptation from standing genetic variation in nature (Jones et al., 2012; Epstein et al., 2016; Hancock et al., 2011) and breeding (Sheng et al., 2015). At least in the short-run, our result demonstrating robust selection response in the Longshanks experiment provides grounds for some optimism regarding the ability of populations to respond rapidly to changes in their environment.

By combining pedigree records with sequencing of founder individuals, our data had sufficient detail to allow precise modeling of trait response, with predicted shifts in allele frequency distribution that closely matched our results (e.g. Figure 1—figure supplement 2D). Furthermore, we functionally validated loci that showed allele frequency shifts outside the model’s predictions and found key enhancers of major effect. Connecting trait changes to allele frequency changes at specific loci has been a longstanding objective in selection experiments, with a number of notable early attempts (e.g., Keightley et al., 1996). To date, we know of only a few studies that attempt to explicitly link traits with changes in allele frequencies (Kessner and Novembre, 2015; Rice and Townsend, 2012; Chen et al., 2019; Nuzhdin et al., 1999) and none have systematically tested the underlying architecture against an infinitesimal background. Here, our results imply a mixed genetic architecture with a few discrete loci of large effect amid an infinitesimal background. It remains to be seen whether other evolve-and-resequence (E&R) studies, with different selection pressures and population parameters, may reveal similar results.

To put our finding of a mixed genetic architecture into perspective, it is worth noting that the infinitesimal model is still the most predictive model by far in practical quantitative genetics, for diverse domesticated species from cattle to crops, despite its intrinsically unrealistic assumptions (Hill et al., 2008; Lynch and Walsh, 1998; Hill and Zhang, 2012). In general, current genomic data for many traits is consistent with a very large number of loci, each with a small effect. From a practical point of view, however, the use of an infinitesimal model does not preclude the presence or indeed the importance of a few major effect loci. Rather, it simply assumes that they are rare enough to allow reasonable model fit (Walsh and Lynch, 2018, page 878). Here, we note that it is actually not clear how one might parameterize a generally applicable predictive oligogenic model with more than a single major effect locus. In this study, while we consider the most likely genetic architecture underlying selection response for tibia length to be a small number of major effect loci together with a polygenic background, we cannot reject other alternative models that could also account for the observed response, such as an effectively infinitesimal model with linkage, as well as models with a few major trait loci.

Among other classical examples of complex traits, such as height or body weight, that may have been subjected to selection, we observe a range of genetic architectures in ways often tightly connected to their population size and/or selection history. Height in humans is often cited as the classical complex trait under possible selection of unknown (and much debated) intensity (see Turchin et al., 2012; Berg and Coop, 2014; Barton et al., 2019). It shows high heritability and a highly dispersed genetic architecture (with the top-ranked locus accounting for only 0.8% of the variation explained in cosmopolitan European populations) (Weedon et al., 2007; Wood et al., 2014). In contrast, as few as 4 to 6 loci account for 83% and 50% of the variation in height in horses and dogs, respectively (Makvandi-Nejad et al., 2012; Rimbault et al., 2013). In both horses and dogs, selection has been strong and sustained, and breed-specific populations tend to be small. Interestingly, and in line with our experiment, the major allele at the IGF1 locus stems from a standing genetic variant, despite many factors that may theoretically favor large-effect de novo mutations (Sutter et al., 2007). In chickens, modern breeding practice and selection from large populations yielded a highly polygenic genetic architecture for body weight, with some of the best empirical evidence for epistasis (Carlborg et al., 2006; Wahlberg et al., 2009; Rubin et al., 2010; Pettersson et al., 2013). Similarly, results from many selection experiments in Drosophila suggest that the genetic architecture underlying selection response may involve many genes (Jha et al., 2015; Reeves and Tautz, 2017; Orozco-terWengel et al., 2012; Turner et al., 2011). By contrast, the extreme tail of the effect size distribution (as inferred from ∆z²) from the Longshanks experiment appears to account for a substantial part of the selection response, presumably due to the combined effects of relatively low diversity in commercial mouse stocks and the small founding populations. But unlike these previous QTL studies or selection experiments, in which either the genetic architecture of a trait or the selection value were estimated separately, sometimes from only few parental individuals or lines, E&R studies sample a much broader pool of alleles and continually compete them against each other. Thus, our approach allowed simultaneous inference of genetic architecture and distribution of effect sizes, is more likely to be representative of the population at large, and is more akin to genome-wide association studies (GWAS), except that here we can also directly connect a trait to its selective value and capture the trajectory of any given allele.

Parallel evolution is often seen as a hallmark for detecting selection (Chan et al., 2012; Schluter et al., 2004; Chan et al., 2010; Martin and Orgogozo, 2013). We investigated the factors that contribute to parallelism in allele frequency shifts over 17 generations by contrasting the two Longshanks replicates against the Control line. However, we observed little parallelism between selected lines and Ctrl, or between simulated replicates under selection, even though the simulated haplotypes were sampled directly from actual founders. This underscores that parallelism depends on both shared selection pressure (absent in Ctrl) and the availability of large-effect loci that confer a substantial selection advantage (absent under the infinitesimal model; Figure 3; Figure 3—figure supplement 1). With increasing population size, selection would be better able to detect variants with more subtle effects. This would in turn lower the threshold beyond which the selective advantage of an allele would become deterministic, that is, exhibit parallelism.

Through in-depth dissection of the Nkx3-2 locus, our data show in fine detail how the selective value of standing variants depends strongly on the selection regime: the originally common F0 variant of the N1 enhancer shows deep functional conservation and can evidently recapitulate fin nkx3.2 expression in fishes (Figure 5A). Yet, in the Longshanks experiment selection strongly favored the weaker allele (Figure 5B). In fact, our molecular dissection of two loci show that both gain-of-function (Gli3) and loss-of-function (Nkx3-2) variants could be favored by selection (Figure 4E and 5A; Figure 4—figure supplement 1D). Through synthesis of multiple lines of evidence, our work uncovered the key role of Nkx3-2, which was not an obvious candidate gene like Gli3 due to the lack of abnormal limb phenotype in Nkx3-2 knockout mice. To our surprise, the same loss of NKX3-2 function in human SMMD patients manifests in opposite ways in different bone types as short trunk and long limbs (Hellemans et al., 2009). This matches the expression domains of our N1 (limb) and N3 (limb and trunk) enhancers (Figure 5A). Evidently, in the absence of lethal coding mutations, the F17 haplotype was doubly beneficial at both enhancers for the limb and potentially also trunk target tissues under the novel selection regime in the Longshanks selection experiment. We estimate that these enhancer variants, along with any other tightly linked beneficial SNPs, segregate as a single locus, which in turn contributes ~10% of the overall selection response.

Despite our efforts to uncover the mechanism underlying the selective advantage of the Nkx3-2 locus, much remains unknown. For example, it remains unclear how such a major allele could segregate in the general mouse stock (and as the reference C57BL/6J allele, no less). It could be that this allele has the same effect in the general mouse population but is conditionally neutral under non-selective breeding and simply escaped notice. However, our preliminary exploration in a panel of C57BL/6-by-DBA/2 (‘BXD’) mice suggested otherwise: mapping of tibia length or mineral density did not reveal this locus as a major QTL determining tibia length (unpublished data kindly provided by Weikuan Gu), suggesting that this allele’s effect on tibia length may depend on the genetic background. Alternatively, the broader C57BL/6 allele could be linked to a compensatory mutation that became uncoupled among the founders of the Longshanks lines. Finally, although we do observe the specific N1 and N3 SNP positions as variable across the rodent and indeed the broader mammalian lineages, further work is needed to determine their effect, if any, on limb development.

Conclusion

Using the Longshanks selection experiment and synthesizing theory, empirical data and molecular genetics, we show that it is possible to identify some of the individual SNPs that have contributed to the response to selection on morphology. In particular, discrete, large-effect loci are revealed by their parallel response. Further work should focus on dissecting the mechanisms behind the dynamics of selective sweeps and/or polygenic adaptation by re-sequencing the entire selection pedigree, testing how the selection response depends on the genetic architecture, and the extent to which linkage places a fundamental limit on our inference of selection. Improved understanding in these areas may have broad implications for conservation, rapid adaptation to climate change and quantitative genetics in medicine, agriculture and in nature.

Share this article

Cite this article

Selection for Longshanks mice produced rapid increase in tibia length.

Widespread genomic response to selection for increased tibia length.

Major loci likely contributing to the selection response.

Selection response in the Longshanks lines was largely line-specific, but the strongest signals occurred in parallel.

Strong parallel selection response at the bone maturation repressor Nkx3-2 locus was associated with decreased activity of two enhancers.

Linking base-pair changes to rapid morphological evolution.

Author details

João PL Castro

Contribution

Contributed equally with

Competing interests

Michelle N Yancoskie

Contribution

Contributed equally with

Competing interests

Marta Marchini

Contribution

Competing interests

Stefanie Belohlavy

Contribution

Competing interests

Layla Hiramatsu

Contribution

Competing interests

Marek Kučka

Contribution

Competing interests

William H Beluch

Contribution

Competing interests

Ronald Naumann

Contribution

Competing interests

Isabella Skuplik

Contribution

Competing interests

John Cobb

Contribution

Competing interests

Nicholas H Barton

Contribution

Competing interests

Campbell Rolian

Contribution

Contributed equally with

For correspondence

Competing interests

Yingguang Frank Chan

Contribution

Contributed equally with

For correspondence

Competing interests

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism

Further reading