Whole genomes from the extinct Xerces Blue butterfly can help identify declining insect species

  1. Toni de-Dios
  2. Claudia Fontsere
  3. Pere Renom
  4. Josefin Stiller
  5. Laia Llovera
  6. Marcela Uliano-Silva
  7. Alejandro Sánchez-Gracia
  8. Charlotte Wright
  9. Esther Lizano
  10. Berta Caballero
  11. Arcadi Navarro
  12. Sergi Civit
  13. Robert K Robbins
  14. Mark Blaxter
  15. Tomàs Marquès (corresponding author)
  16. Roger Vila (corresponding author)
  17. Carles Lalueza-Fox (corresponding author)
  1. Institute of Evolutionary Biology, Spain
  2. Institute of Genomics, University of Tartu, Estonia
  3. Section for Evolutionary Genomics, The Globe Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
  4. Centre for Biodiversity Genomics, University of Copenhagen, Denmark
  5. Wellcome Sanger Institute, United Kingdom
  6. Department of Genetics, Microbiology and Statistics-Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Spain
  7. Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Spain
  8. Museu de Ciències Naturals de Barcelona, Spain
  9. Catalan Institution of Research and Advanced Studies (ICREA), Spain
  10. Department of Entomology, National Museum of Natural History, Smithsonian Institution, United States
  11. CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), Spain
5 figures, 5 tables and 2 additional files

Figures

Figure 1
Phylogenetic placement of the Xerces Blue.

(a) Maximum likelihood tree from whole mitochondrial genomes of Xerces Blue, Silvery Blue, and Green-Underside Blue. Node labels are bootstrap support values. (b) Time-calibrated phylogeny from Bayesian inference using mitochondrial protein-coding genes of Xerces Blue and related butterflies. Node values show median age estimates from dating analysis with a molecular clock (above nodes) or from fixing the age of the root (below nodes). Bars are 95% HPD intervals for node ages. All posterior probabilities were 1, except for one node annotated in black.

Figure 2
Plot of PC1 and PC2 from the principal component analysis (PCA).

The PCA was generated with nuclear DNA data (N = 6,682,591 single nucleotide polymorphisms, SNPs) from 11 historical butterfly specimens (4 G. xerces and 7 G. lygdamus), a modern G. lygdamus from Canada (RVcoll10-B005), and a modern G. alexis reference genome. The first PC (explaining 52.81% of the variance) clearly separates both historical species from the reference, and the second PC (explaining 6.09% of the variance) separates G. xerces from G. lygdamus, supporting that they are separate lineages.

Figure 3 with 1 supplement
Pairwise sequentially Markovian coalescent (PSMC) plot of one Xerces Blue (Glaucopsyche xerces) specimen (L05) and one Silvery Blue (Glaucopsyche lygdamus) specimen.

The two historical samples are those with the highest average coverage. Each individual PSMC plot was bootstrapped 100 times (lighter lines). A generation time of one year and a mutation rate of µ = 1.9 × 10⁻⁹ were used. The peak of the Marine Isotopic Stage 7 interglacial is marked in yellow.
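The scaling behind the axes of a PSMC plot can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the standard PSMC conversion (N0 = θ0 / (4µs), time in generations = 2·N0·t, Ne = N0·λ), assumes the mutation rate is per site per generation, and assumes the default PSMC bin size of 100 bp. The function name and the input values are hypothetical.

```python
def psmc_to_real_units(theta0, scaled_times, scaled_lambdas,
                       mu=1.9e-9, generation_years=1.0, bin_size=100):
    """Convert PSMC scaled output into years and effective population sizes.

    theta0          -- scaled mutation rate per bin reported by PSMC
    scaled_times    -- time interval boundaries t_k in units of 2*N0 generations
    scaled_lambdas  -- relative population sizes lambda_k
    mu              -- mutation rate (assumed per site per generation)
    bin_size        -- bases per PSMC bin (assumed default of 100)
    """
    # Reference effective population size implied by theta0.
    n0 = theta0 / (4.0 * mu * bin_size)
    # Time in years: generations (2 * N0 * t_k) times years per generation.
    years = [2.0 * n0 * t * generation_years for t in scaled_times]
    # Effective population size in each time interval.
    ne = [n0 * lam for lam in scaled_lambdas]
    return n0, years, ne


# Illustrative inputs only; these are not values from the study.
n0, years, ne = psmc_to_real_units(
    theta0=0.0019,
    scaled_times=[0.0, 0.5, 1.0],
    scaled_lambdas=[1.0, 2.0, 0.5],
)
for y, n in zip(years, ne):
    print(f"{y:.0f} years ago: Ne = {n:.0f}")
```

With a generation time of one year, as used here, years and generations coincide, so the choice of µ alone sets the scale of both axes.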

Figure 3—figure supplement 1
Pairwise sequentially Markovian coalescent (PSMC) plots of Xerces Blue and Silvery Blue.

(a) PSMC of Xerces Blue L05 down-sampled to half (red) and 75% of coverage (green). (b) PSMC of down-sampled Xerces Blue L05 after correction. Lower coverage results in underestimation of heterozygous sites and thus in lower historical effective population sizes. This can be corrected by assuming a false negative rate (FNR) and visually adjusting the curves using the psmc_plot.py program from the PSMC package. (c) PSMC of Xerces Blue L03, L05, L07, and L09 corrected assuming FNR. (d) PSMC of historical Silvery Blue L02, L04, L06, L08, L11, L12, and L13 and modern Silvery Blue from Canada (RVcoll10-B005) corrected assuming FNR. Despite differences in coverage, individuals from each species follow the same trajectory.

Figure 4 with 1 supplement
Runs of homozygosity (RoH) in the genomes of Xerces Blue and Silvery Blue (modern and historical).

(a) Percentage of the autosomal genome in RoH by size bin: very short RoH (<100 kb), short RoH (100–500 kb), intermediate RoH (500 kb to 1 Mb), and long RoH (1–5 Mb). Short RoH reflect linkage disequilibrium (LD) patterns, intermediate RoH describe background inbreeding due to genetic drift, and long RoH appear in cases of very recent inbreeding due to consanguinity. Error bars show the standard deviation. (b) Distribution of RoH in the autosomal genome of a Xerces Blue specimen, L05. (c) Distribution of RoH in the autosomal genome of a Silvery Blue specimen, L13.
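The size-bin summary in panel (a) can be illustrated with a short sketch. It assumes RoH lengths in base pairs are already available (for example from an RoH caller) and that bin boundaries are half-open, with the lower bound included; the caption does not state how boundaries are handled. The function name, constants, and example values are hypothetical.

```python
# Size bins from the caption, in base pairs. Half-open intervals
# [low, high) are an assumption of this sketch.
ROH_BINS = [
    ("very short (<100 kb)", 0, 100_000),
    ("short (100-500 kb)", 100_000, 500_000),
    ("intermediate (500 kb-1 Mb)", 500_000, 1_000_000),
    ("long (1-5 Mb)", 1_000_000, 5_000_000),
]


def roh_percent_by_bin(roh_lengths_bp, autosomal_length_bp):
    """Return the percentage of the autosomal genome covered by RoH per bin."""
    totals = {label: 0 for label, _, _ in ROH_BINS}
    for length in roh_lengths_bp:
        for label, low, high in ROH_BINS:
            if low <= length < high:
                totals[label] += length
                break  # each RoH belongs to exactly one bin
        # RoH of 5 Mb or longer fall outside all bins and are ignored here.
    return {
        label: 100.0 * total / autosomal_length_bp
        for label, total in totals.items()
    }


# Illustrative values only; not data from the study.
example = roh_percent_by_bin(
    [50_000, 200_000, 300_000, 750_000, 2_000_000],
    autosomal_length_bp=100_000_000,
)
for label, pct in example.items():
    print(f"{label}: {pct:.2f}% of the autosomal genome")
```

Summing lengths rather than counting segments matches the caption's "percentage of the autosomal genome", so a few long RoH can outweigh many short ones.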

Figure 4—figure supplement 1
Runs of homozygosity (RoH) in the genomes of Xerces Blue, Silvery Blue, and Green-Underside Blue (modern and historical).

(a) Heterozygosity estimates for low-coverage data from Silvery Blue (historical and modern), Xerces Blue, and Green-Underside Blue (modern) butterflies. (b) RoH for each individual classified by length.

Figure 5
Functional effect prediction on the fixed amino acid-changing alleles observed in Xerces Blue and Silvery Blue.

(a) Genome-wide Ka/Ks ratio comparison. (b) High-to-moderate effect variant comparison in homozygous sites. (c) High-to-moderate effect variant comparison in heterozygous sites. (d) Presence of high-to-moderate effect variants in regions of the genome in runs of homozygosity (RoH). Error bars show the standard deviation.

Tables

Table 1
List of historical specimens analysed in this study.
| Genome # | Species | Subspp. | State | Locality | Date | Collection |
| --- | --- | --- | --- | --- | --- | --- |
| USNMENT101413 | G. xerces | | California | San Francisco | NA | Barnes |
| USNMENT101402 | G. xerces | | California | San Francisco | 16/4/1923 | Barnes |
| USNMENT101441 | G. xerces | | California | San Francisco | NA | Barnes |
| USNMENT101406 | G. xerces | | California | San Francisco | NA | Barnes |
| USNMENT101434 | G. xerces | | California | San Francisco | 16/4/1923 | Barnes |
| USNMENT00181297 | G. lygdamus | incognitus | California | Marin County | NA | Barnes |
| USNMENT00181298 | G. lygdamus | incognitus | California | Fairfax | 27/5/1932 | WMD Field |
| USNMENT00181299 | G. lygdamus | incognitus | California | Oakland | 14/4/1948 | Graham Heid |
| USNMENT00181300 | G. lygdamus | incognitus | California | San Jose | 27/3/1964 | Opler |
| USNMENT00181301 | G. lygdamus | incognitus | California | Haywood City | 1/5/1931 | WMD Field |
| USNMENT00181302 | G. lygdamus | incognitus | California | Santa Cruz | 1/4/1932 | JW Tilden/Field |
| USNMENT00181303 | G. lygdamus | incognitus | California | Santa Cruz | 8/4/1927 | GW Rawson |
Table 2
Mapping statistics of the analysed historical specimens.

Mapping statistics of the four historical G. xerces (L003, L005, L007, and L009) and the seven historical G. lygdamus (L002, L004, L006, L008, L011, L012, and L013) specimens mapped against the G. alexis reference genome. Average depth is displayed for the covered regions of each individual.

| Sample identifier | Generated reads | Q25 unique mapped reads | Breadth of coverage (%) | Average depth covered regions |
| --- | --- | --- | --- | --- |
| L002 | 300,294,248 | 23,337,751 | 37.27 | 5.105 |
| L003 | 405,198,060 | 32,547,820 | 36.86 | 6.78 |
| L004 | 357,165,438 | 28,722,185 | 38.77 | 6.55 |
| L005 | 776,312,378 | 56,459,037 | 45.7 | 12.42 |
| L006 | 359,520,168 | 28,498,720 | 40.07 | 6.18 |
| L007 | 348,916,870 | 26,758,356 | 34.79 | 6.21 |
| L008 | 508,120,156 | 32,107,192 | 42.08 | 7.422 |
| L009 | 322,955,384 | 39,312,617 | 40.6 | 8.02 |
| L011 | 236,886,534 | 24,165,282 | 38.6 | 5.40 |
| L012 | 328,359,669 | 18,683,738 | 33.37 | 4.29 |
| L013 | 385,635,644 | 52,612,937 | 47.2 | 12.3 |
Table 3
Coordinates of the analysed colouration genes.

Genomic coordinates in the G. alexis reference genome of wing colouration genes described in other butterfly species.

| Chromosome | Start | End | Gene |
| --- | --- | --- | --- |
| FR990043.1 | 5,387,706 | 5,403,599 | Wnt1 |
| FR990043.1 | 5,417,902 | 5,423,677 | Wnt6 |
| FR990043.1 | 5,519,353 | 5,539,737 | Wnt10b |
| FR990043.1 | 5,553,666 | 5,554,753 | Wnt10a |
| FR990043.1 | 26,972,856 | 26,974,487 | WntA |
| FR990046.1 | 2,343,467 | 2,357,667 | Wnt7b |
| FR990046.1 | 6,255,275 | 6,271,623 | Wnt5b |
| FR990046.1 | 19,475,636 | 19,486,554 | Wnt9 |
| FR990050.1 | 16,200,978 | 16,212,495 | Wnt11 |
| FR990054.1 | 20,633,400 | 20,655,261 | Cortex |
| FR990059.1 | 20,254,460 | 20,255,275 | Optix |
Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Biological sample (Glaucopsyche xerces; female) | L003 | This paper | SAMEA114094142 | See Materials and methods |
| Biological sample (G. xerces; male) | L005 | This paper | SAMEA114094143 | See Materials and methods |
| Biological sample (G. xerces; male) | L007 | This paper | SAMEA114094144 | See Materials and methods |
| Biological sample (G. xerces; female) | L009 | This paper | SAMEA114094145 | See Materials and methods |
| Biological sample (Glaucopsyche lygdamus; male) | L002 | This paper | SAMEA114094134 | See Materials and methods |
| Biological sample (G. lygdamus; male) | L004 | This paper | SAMEA114094135 | See Materials and methods |
| Biological sample (G. lygdamus; male) | L006 | This paper | SAMEA114094136 | See Materials and methods |
| Biological sample (G. lygdamus; male) | L008 | This paper | SAMEA114094137 | See Materials and methods |
| Biological sample (G. lygdamus; male) | L011 | This paper | SAMEA114094138 | See Materials and methods |
| Biological sample (G. lygdamus; female) | L012 | This paper | SAMEA114094139 | See Materials and methods |
| Biological sample (G. lygdamus; male) | L013 | This paper | SAMEA114094140 | See Materials and methods |
| Biological sample (G. lygdamus; male) | RVcoll10-B005 | This paper | SAMEA114094141 | See Materials and methods |
| Biological sample (Glaucopsyche alexis; male) | G. alexis | Hinojosa Galisteo et al., 2021 | ilGlaAlex1.1; GCA_905404095.1 | |
| Biological sample (Aricia agestis) | A. agestis | Hayward et al., 2023 | LR990279.1 | |
| Biological sample (Aricia artaxerxes) | A. artaxerxes | Ebdon et al., 2022 | OW569311.1 | |
| Biological sample (Celastrina argiolus) | C. argiolus | Hayward et al., 2021 | LR994603.1 | |
| Biological sample (Cyaniris semiargus; male) | C. semiargus | Lohse et al., 2023 | LR994570.1 | |
| Biological sample (G. alexis; male) | G. alexis | Hinojosa Galisteo et al., 2021 | FR990065.1 | |
| Biological sample (G. xerces) | G. xerces | Grewe et al., 2021 | MW677564.1 | |
| Biological sample (Lysandra bellargus; female) | L. bellargus | Lohse et al., 2022 | HG995365.1 | |
| Biological sample (Lysandra coridon; male) | L. coridon | Vila et al., 2023 | HG992145.1 | |
| Biological sample (Plebejus argus) | P. argus | Zhou et al., 2020 | MN974526.1 | |
| Biological sample (Plebejus melissa) | P. melissa | Ellis et al., 2021 | DWQ001000057.1 | |
| Biological sample (Plebejus anna) | P. anna | Ellis et al., 2021 | DWTA01000073.1 | |
| Biological sample (Polyommatus icarus; male) | P. icarus | https://www.darwintreeoflife.org/ | OW569343.1 | |
| Biological sample (Shijimiaeoides divina) | S. divina | Jeong et al., 2017 | NC_029763.1 | |
| Biological sample (Zizina emelina) | Z. emelina | Liu et al., 2020 | MN013031.1 | |
| Software, algorithm | BUSCO | Manni et al., 2021 | v.5.1.2 | |
| Software, algorithm | AdapterRemoval | Schubert et al., 2016 | v.2.2.2 | |
| Software, algorithm | BWA – backtrack | Li and Durbin, 2009 | v.0.7.1 | |
| Software, algorithm | BWA – mem | Li, 2013 | v.0.7.1 | |
| Software, algorithm | Qualimap2 | Okonechnikov et al., 2016 | v.2.2.2 | |
| Software, algorithm | pmdtools | Skoglund et al., 2014 | v.0.50 | |
| Software, algorithm | MapDamage2 | Jónsson et al., 2013 | v.2.7.12 | |
| Software, algorithm | Bedtools | Quinlan and Hall, 2010 | v.2.27.1 | |
| Software, algorithm | snpAD | Prüfer, 2018 | v.0.3.2 | |
| Software, algorithm | GATK | McKenna et al., 2010 | v.3.5–3.7 | |
| Software, algorithm | vcftools | Danecek et al., 2011 | v.0.1.12b–0.1.14b | |
| Software, algorithm | angsd | Korneliussen et al., 2014 | v.0.916 | |
| Software, algorithm | bcftools | Danecek et al., 2021 | v.1.9 | |
| Software, algorithm | Mitofinder | Allio et al., 2020 | v.1.4 | |
| Software, algorithm | MACSE | Ranwez et al., 2018 | v.2.05 | |
| Software, algorithm | MAFFT | Katoh and Standley, 2013 | v.7.490 | |
| Software, algorithm | IQ-TREE2 | Minh et al., 2020 | v.2.1.3 | |
| Software, algorithm | ModelFinder | Kalyaanamoorthy et al., 2017 | Available in IQ-TREE2 | |
| Software, algorithm | UFBoot2 | Hoang et al., 2018 | Available in IQ-TREE2 | |
| Software, algorithm | BEAST2 | Bouckaert et al., 2019 | v.2.6.3 | |
| Software, algorithm | bModelTest | Bouckaert and Drummond, 2017 | v.1.2.1 | |
| Software, algorithm | Tracer | Rambaut et al., 2018 | v.1.7.2 | |
| Software, algorithm | PSMC | Li and Durbin, 2011 | v.0.6.5 | |
| Software, algorithm | PCAngsd | Meisner and Albrechtsen, 2018 | v.20180209 | |
| Software, algorithm | Bcftools-roh | Narasimhan et al., 2016 | v.1.9 | |
| Software, algorithm | SNPeff | Cingolani et al., 2012 | v.4.3 | |
| Software, algorithm | Picard | Broad Institute, 2015 | v.2.0.1 | |
| Software, algorithm | Samtools | Li et al., 2009 | v.1.6 | |
| Software, algorithm | BamUtil | Jun et al., 2015 | v.1.0.13 | |
| Software, algorithm | BLAST | Altschul et al., 1990 | v.2.2.2 | |
| Software, algorithm | BBMap | Bushnell, 2014 | v.38.18 | |
| Software, algorithm | Prinseq | Schmieder and Edwards, 2011 | v.0.20.4 | |
| Software, algorithm | Kraken2 | Wood et al., 2019 | v.2.1.1 | |
| Software, algorithm | R | R Core Team, 2019 | v.3.6.3–4.1.0 | |
| Software, algorithm | Ggplot2 | Wickham, 2016 | v.3.0.0 | |
Table 4
Wolbachia DNA reads assigned using Kraken2.
| Specimen | Wolbachia genus reads | Wolbachia spp. reads |
L0021905
L0031313
L0042135
L0053118
L0062429
L0071522
L00841421
L0092366
L0111849
L0121689
L01352324

Additional files

Supplementary file 1

Additional information on the genomics analyses.

(a) DNA metric. (b) Mappability of the reference genomes. (c) Mitochondrial DNA genomes used in the phylogenetic analysis. (d) Heterozygosity, homozygosity, and runs of homozygosity (RoH). (e) Genes in uncovered genomic regions in Xerces Blue.

https://cdn.elifesciences.org/articles/87928/elife-87928-supp1-v2.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/87928/elife-87928-mdarchecklist1-v2.docx


Toni de-Dios, Claudia Fontsere, Pere Renom, Josefin Stiller, Laia Llovera, Marcela Uliano-Silva, Alejandro Sánchez-Gracia, Charlotte Wright, Esther Lizano, Berta Caballero, Arcadi Navarro, Sergi Civit, Robert K Robbins, Mark Blaxter, Tomàs Marquès, Roger Vila, Carles Lalueza-Fox (2024)
Whole genomes from the extinct Xerces Blue butterfly can help identify declining insect species
eLife 12:RP87928.
https://doi.org/10.7554/eLife.87928.3