Local and distal elements contribute to generation of GAP1 CNV alleles.
(A) Schematic of the Saccharomyces cerevisiae GAP1 locus on chromosome XI (coordinates 513332-518060) with LTRs, ARS elements, and tRNA genes labeled. ODIRA is a DNA replication-error-based CNV mechanism. Here, we classify a clone as ODIRA if it has an inverted sequence at one or more breakpoints. ODIRA typically forms tandem triplications that have an inverted middle copy and contain an ARS. Long terminal repeat non-allelic homologous recombination (LTR NAHR) is a mechanism we define by having both CNV breakpoints at LTR sites. Sometimes we detect a hybrid sequence between two LTR sequences, a result of recombination between the two LTRs. Non-allelic homologous recombination (NAHR) is defined by having at least one CNV breakpoint not at an LTR site, i.e., at other homologous sequences in the genome. Sometimes we detect a hybrid sequence between the two homologous sequences. The transposon-mediated mechanisms we observed involve at least one intermediate novel LTR retrotransposon insertion. The newly deposited LTR sequence recombines with another LTR sequence, either pre-existing or introduced by a second de novo retrotransposition, to form the resulting CNV. Complex CNV is defined by a clone having more than two breakpoints on chromosome XI, indicative of more than one amplification event.
(B) Violin plot of CNV length in each genome-sequenced clone, n = 177. Strain has a significant effect on CNV length, Kruskal-Wallis test, p = 3.0 × 10⁻⁴.
(C) Barplot of inferred CNV mechanisms, described in A, for each CNV clone isolated from evolving populations. Mechanisms were inferred from a combination of read depth, split read, and discordant read analysis (see Methods). Strain is significantly associated with CNV mechanism, Fisher's exact test, p = 5.0 × 10⁻⁴. There is a significant increase in ODIRA prevalence from WT to LTRΔ, chi-squared test, p = 0.02469. There is a significant decrease in ODIRA prevalence from WT to ARSΔ and ALLΔ, chi-squared test, p = 0.002861 and 0.002196, respectively. There is a significant decrease in LTR NAHR prevalence from WT to LTRΔ, chi-squared test, p = 0.03083.
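For illustration only, the mechanism definitions in A can be written as a small rule-based classifier. The sketch below is not the pipeline used in this study. The `Breakpoint` fields, the `classify_cnv` function, the order in which the rules are tested, and the fallback label "Unclassified" are all assumptions introduced here; the legend does not state how a clone meeting several definitions at once is resolved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Breakpoint:
    """Hypothetical per-breakpoint annotation (field names are assumptions)."""
    at_ltr: bool = False             # breakpoint falls at an LTR site
    at_other_homology: bool = False  # breakpoint falls at a non-LTR homologous sequence
    inverted: bool = False           # an inverted sequence was detected at the breakpoint


def classify_cnv(breakpoints: List[Breakpoint],
                 novel_ltr_insertion: bool = False) -> str:
    """Apply the panel A definitions in an assumed order of precedence."""
    # Complex CNV: more than two breakpoints on chromosome XI.
    if len(breakpoints) > 2:
        return "Complex CNV"
    # ODIRA: inverted sequence at one or more breakpoints.
    if any(bp.inverted for bp in breakpoints):
        return "ODIRA"
    # Transposon-mediated: at least one intermediate novel LTR insertion.
    if novel_ltr_insertion:
        return "Transposon-mediated"
    # LTR NAHR: both breakpoints at LTR sites.
    if len(breakpoints) == 2 and all(bp.at_ltr for bp in breakpoints):
        return "LTR NAHR"
    # NAHR: at least one breakpoint at a non-LTR homologous sequence.
    if any(bp.at_other_homology and not bp.at_ltr for bp in breakpoints):
        return "NAHR"
    # Fallback label, not a category from the figure.
    return "Unclassified"


example = classify_cnv([Breakpoint(at_ltr=True), Breakpoint(at_ltr=True)])
print(example)
```

Testing Complex CNV first and ODIRA second is one reasonable reading of the definitions; a different precedence would change the label only for clones that satisfy more than one rule.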
(D) Top: Schematic of S. cerevisiae chromosome XI, with LTRs, ARS elements, and tRNA genes annotated (LTR, blue; ARS, purple; tRNA, orange; GAP1 ORF, white rectangle). Using a combination of read depth, split read, and discordant read analysis, we defined the extent of the amplified region, the precise CNV breakpoints, and GAP1 copy number. GAP1 copy numbers were estimated using read depth relative to the average read depth of chromosome XI. We define the upstream and downstream breakpoints as distances in kilobases from the start codon of the GAP1 ORF (vertical dotted line). Bottom: Dumbbell plots represent the amplified region (>1 copy) relative to the WT architecture reference genome. The ends of the dumbbells mark the approximate CNV breakpoints. The clones shown were selected to represent the observed diversity of amplifications.
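The two quantities described in D, copy number from relative read depth and breakpoint position relative to the GAP1 start codon, can be sketched as follows. This is a minimal illustration, not the analysis code of this study. The function names, the use of a simple mean, the sign convention (negative values upstream of the start codon), and the example coordinates are assumptions made here.

```python
from statistics import mean
from typing import Sequence


def estimate_copy_number(region_depths: Sequence[float],
                         chromosome_depths: Sequence[float]) -> float:
    """Read depth over a region divided by the average depth of the chromosome."""
    return mean(region_depths) / mean(chromosome_depths)


def breakpoint_offset_kb(breakpoint_pos: int, start_codon_pos: int) -> float:
    """Distance in kilobases from the start codon (assumed: negative = upstream)."""
    return (breakpoint_pos - start_codon_pos) / 1000


# Arbitrary example values, chosen only to show the arithmetic.
copy_number = estimate_copy_number([30, 30, 30], [10, 10, 10, 10])
upstream_kb = breakpoint_offset_kb(95_000, 100_000)
downstream_kb = breakpoint_offset_kb(105_000, 100_000)
cnv_length_kb = downstream_kb - upstream_kb
print(copy_number, upstream_kb, downstream_kb, cnv_length_kb)
```

Under this sign convention, CNV length is the downstream offset minus the upstream offset.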
(E) Scatterplots of CNV length for all genome-sequenced clones, n = 177. As in D, we define the upstream and downstream breakpoints as distances in kilobases from the start codon of the GAP1 ORF (vertical dotted line in the dumbbell plot). CNV mechanisms are defined in Figure 4A.