Research Article

Insights into early animal evolution from the genome of the xenacoelomorph worm Xenoturbella bocki

Center for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, United Kingdom
worm~lab, Institute of Zoology, University of Cologne, Germany
Department of Biosciences, Durham University, United Kingdom
Université Côte D'Azur, CNRS, Inserm, iBV, France
The Arctic University Museum of Norway, UiT – The Arctic University of Norway, Norway
Collège Doctoral, Sorbonne Université, France
Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Norway
Institute of Clinical Medicine, Medical Faculty, University of Oslo, Norway
Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Germany
University of Greifswald, Institute for Mathematics and Computer Science, Germany
University of Greifswald, Center for Functional Genomics of Microbes, Germany
Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, United Kingdom
Institut Pasteur, Université de Paris, CNRS UMR3525, Unité Régulation Spatiale des Génomes, France
The University of Chicago, Division of Biological Sciences, United States
Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), Sorbonne Universite, France
Department of Biochemistry, University of Oxford, United Kingdom

Aug 7, 2024

Open access

Abstract

The evolutionary origins of Bilateria remain enigmatic. One of the more enduring proposals highlights similarities between a cnidarian-like planula larva and simple acoel-like flatworms. This idea is based in part on the view of the Xenacoelomorpha as an outgroup to all other bilaterians which are themselves designated the Nephrozoa (protostomes and deuterostomes). Genome data can provide important comparative data and help understand the evolution and biology of enigmatic species better. Here, we assemble and analyze the genome of the simple, marine xenacoelomorph Xenoturbella bocki, a key species for our understanding of early bilaterian evolution. Our highly contiguous genome assembly of X. bocki has a size of ~111 Mbp in 18 chromosome-like scaffolds, with repeat content and intron, exon, and intergenic space comparable to other bilaterian invertebrates. We find X. bocki to have a similar number of genes to other bilaterians and to have retained ancestral metazoan synteny. Key bilaterian signaling pathways are also largely complete and most bilaterian miRNAs are present. Overall, we conclude that X. bocki has a complex genome typical of bilaterians, which does not reflect the apparent simplicity of its body plan that has been so important to proposals that the Xenacoelomorpha are the simple sister group of the rest of the Bilateria.

Editor's evaluation

The authors provide a high-quality genome of the xenacoelomorph worm Xenoturbella bocki and discuss its structure and evolution. Understanding the genomic structure of this group provides important insights into bilaterian evolution. The authors make a solid case that the data they present is consistent with Xenacoelomorpha being a secondarily simplified member of Deuterostomia rather than a primitively simple sister group to all other bilaterians.

https://doi.org/10.7554/eLife.94948.sa0

eLife digest

Xenoturbella bocki is a small marine worm predominantly found on the seafloor of fjords along the west coast of Sweden. This simple organism’s unusual evolutionary history has long intrigued zoologists as it is not clear how it is related to other animal groups. The worm may belong to one of the earliest branches of the animal kingdom, which would explain its simple body. On the other hand, it could be related to a more complex group, the deuterostomes, which includes a wide range of animals, from mammals and birds to sea urchins and starfish.

Understanding X. bocki’s evolution could provide valuable insights into how bilaterians evolved as a whole. Unlike its close relatives, the acoelomorphs, X. bocki evolves more slowly, which makes it simpler to study its genome. As a result, it serves as a starting point for investigating the evolutionary processes and genetics underpinning the broader group of bilaterians.

To better understand the evolution of X. bocki’s simple body, Schiffer et al. asked whether its genome is simpler or differs in other ways from that of more complex bilaterian organisms. Sequencing the entire X. bocki genome revealed that it has a similar number of genes to that of other animals and includes the genes required for complex biochemical pathways. Reconstructing the worm’s chromosomes – the structures that house genetic information – showed that the X. bocki genes are also distributed in a manner similar to those in other animals.

The findings suggest that, despite its simple body plan, X. bocki has a complex genome that is typical of bilaterians. This challenges the idea that X. bocki belongs to a more primitive, simplified sister group to Bilateria and provides a starting point for further studies of how this simple worm evolved.

Introduction

Xenoturbella bocki (Figure 1) is a morphologically simple marine worm first described from specimens collected from muddy sediments in the Gullmarsfjord on the west coast of Sweden. There are now six described species of Xenoturbella – the only genus in the higher-level taxon of Xenoturbellida (Telford, 2008). X. bocki was initially included as a species within the Platyhelminthes (Westblad, 1949), but molecular phylogenetic studies have shown that Xenoturbellida is the sister group of the Acoelomorpha, a second clade of morphologically simple worms also originally considered Platyhelminthes: Xenoturbellida and Acoelomorpha constitute their own phylum, the Xenacoelomorpha (Philippe et al., 2019; Cannon et al., 2016). In addition to multiple phylogenetic studies that support the monophyly of the phylum, Xenacoelomorpha is convincingly supported by classical analysis in the field of evolution of development, for example, their sharing unique amino acid signatures in their Caudal genes (Philippe et al., 2019) and a Hox4/5/6 gene (Ueki et al., 2019). Here we analyze our data in this phylogenetic framework of a monophyletic taxon.

Figure 1

Schematic drawings of *X. bocki* showing the simple body organization of the marine vermiform animal.

ant, anterior; post, posterior; lf, lateral furrow; rf, ring furrow; m, mouth opening.

The simplicity of xenacoelomorph species compared to other bilaterians is a central feature of discussions over their evolution. While Xenacoelomorpha are clearly monophyletic, their phylogenetic position within the Metazoa has been controversial for a quarter of a century. There are two broadly discussed scenarios: a majority of studies have supported a position for Xenacoelomorpha as the sister group of all other Bilateria (the Protostomia and Deuterostomia, collectively named Nephrozoa) (Jimenez-Guri et al., 2006; Ryan et al., 2006; Jékely, 2013); work we have contributed to (Telford, 2008; Philippe et al., 2019; Philippe et al., 2011; Bourlat et al., 2006) has instead placed Xenacoelomorpha within the Bilateria as the sister group of the Ambulacraria (Hemichordata and Echinodermata) to form a clade called the Xenambulacraria (Philippe et al., 2011).

X. bocki has neither organized gonads nor a centralized nervous system. It has a blind gut, no body cavities, and lacks nephrocytes (Nakano, 2015). If Xenacoelomorpha is the sister group to Nephrozoa, these character absences can be parsimoniously interpreted as representing the primitive state of the Bilateria. According to advocates of the Nephrozoa hypothesis, these and other characters absent in Xenacoelomorpha must have evolved in the lineage leading to Nephrozoa after the divergence of Xenacoelomorpha. More generally, there has been a tendency to interpret Xenacoelomorpha (especially Acoelomorpha) as living approximations of Urbilateria (Hejnol et al., 2009; Hejnol and Martindale, 2008).

An alternative explanation for the simple body plan of xenacoelomorphs is that it is derived from that of more complex urbilaterian ancestors through loss of morphological characters. The loss or remodeling of morphological complexity is a common feature of evolution in many animal groups and is typically associated with unusual modes of living (Martynov et al., 2020; Westheide, 1987), in particular the adoption of a sessile (sea squirts, barnacles) or parasitic (neodermatan flatworms, orthonectids) lifestyle, extreme miniaturization (e.g., tardigrades, orthonectids), or even neoteny (e.g., flightless hexapods).

The biology of Xenoturbella is difficult to study in vivo – they are hard to collect and mostly inactive in culture: knowledge of their embryology is restricted to one descriptive paper of a handful of embryos (Nakano et al., 2013). One route to better understanding the biology of this key taxon in the phylogeny of the animals is to read and study their genome.

In the past, some genomic features gleaned from analysis of various Xenacoelomorpha have been used to test these evolutionary hypotheses. For example, the common ancestor of the protostomes and deuterostomes has been reconstructed with approximately eight Hox genes but only four have been found in the Acoelomorpha (Nemertoderma) and five in Xenoturbella. This has been interpreted as a primary absence with the full complement of eight proposed to have appeared subsequent to the divergence of Xenacoelomorpha and Nephrozoa. Similarly, analysis of the microRNAs (miRNAs) of an acoelomorph, Symsagittifera roscoffensis, found that many bilaterian miRNAs were absent from its genome (Sempere et al., 2006). Some of the missing bilaterian miRNAs, however, were subsequently observed in Xenoturbella (Philippe et al., 2011).

The few xenacoelomorph genomes available to date are from the acoel Hofstenia miamia (Gehrke et al., 2019) – like other Acoelomorpha it shows accelerated sequence evolution relative to Xenoturbella (Philippe et al., 2019) – and from two closely related species Praesagittifera naikaiensis (Arimoto et al., 2019) and Symsagittifera roscoffensis (Martinez et al., 2023). The analyses of gene content of Hofstenia showed similar numbers of genes and gene families to other bilaterians (Gehrke et al., 2019), while an analysis of the neuropeptide content concluded that most bilaterian neuropeptides were present in Xenacoelomorpha (Moroz et al., 2021).

In order to infer the characteristics of the ancestral xenacoelomorph genome, and to complement the data from the Acoelomorpha, we describe a highly scaffolded genome of the slowly evolving xenacoelomorph X. bocki. Our data allow us to contribute knowledge of genomic traits, such as gene content and genome structure, in Xenacoelomorpha and in Xenoturbella in particular, and to help reconstruct the genome structure and composition of the ancestral xenacoelomorph. Our data suggest that, while Xenoturbella is generally described as having a very simple body (interpreted by many as primitively simple), its genome is of a similar complexity to many other bilaterians, perhaps lending support to the idea that the simplicity of X. bocki is derived.

Results

Assembly of a draft genome of X. bocki

We collected X. bocki specimens (Figure 1) from the bottom of the Gullmarsfjord close to the biological field station in Kristineberg (Sweden). These adult specimens were starved for several days in tubes with artificial sea water, and then sacrificed in lysis buffer. We extracted high molecular weight (HMW) DNA from single individuals for each of the different sequencing steps below.

We assembled a high-quality draft genome of X. bocki using one short read Illumina library and one TruSeq Synthetic Long Reads (TSLR) Illumina library. We used a workflow based on a primary assembly with SPAdes (‘Materials and methods’; Bankevich et al., 2012). The primary assembly had an N50 of 8.5 kb over 37,880 contigs with a maximum length of 206,709 bp. After using the redundans pipeline (Pryszcz and Gabaldón, 2016), this increased to an N50 of ~62 kb over 23,094 contigs and scaffolds spanning ~121 Mb, and a longest scaffold of 960,978 bp (Table 1).

Table 1

Improvement of assembly and scaffolding metrics.

Assembly step	# seqs	# reals	# Ns	Max length	N50
Redundans contigs	37,880	113,212,556	383,327	206,709	8544
Redundans scaffolds	24,538	117,405,089	3,021,351	952,321	52,073
Pre instaGRAAL	23,094	117,396,873	3,534,582	960,978	61,989
Final scaffolds	27,939	107,712,917	3,328,069	8,757,424	2,730,651

Assessed with the jcvi toolbox: https://github.com/tanghaibao/jcvi (Tang, 2010).
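The N50 values in Table 1 follow the usual definition: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches half of the assembly span. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions, and this is not the jcvi implementation nor the data reported here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length of
    the sequence at which the cumulative length first reaches at least half
    of the total assembly span.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp), for illustration only.
toy_lengths = [8, 5, 4, 3, 2, 2, 1]
toy_n50 = n50(toy_lengths)
```

Because the statistic depends on which sequences are included, any length cutoff (such as the >500 bp filter used for the final scaffolds) should be applied to `lengths` before calling the function.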

The final genome was obtained with Hi-C scaffolding using the program instaGRAAL (Baudry et al., 2020). The scaffolded genome has a span of 111 Mbp (117 Mbp including small fragments unincorporated into the Hi-C assembly) and an N50 of 2.7 Mbp (for contigs >500 bp). The assembly contains 18 megabase-scale scaffolds encompassing 72 Mbp (62%) of the genomic sequence, with 43% GC content. The original assembly indicated a repeat content of about 25% after a RepeatModeler-based RepeatMasker annotation (‘Materials and methods’). As often seen in non-model organisms, about 2/3 of the repeats are not classified.

We used BRAKER (Hoff et al., 2019; Hoff et al., 2016) with extensive RNA-seq data and additional single-cell UTR-enriched transcriptome sequencing data to predict 15,154 gene models. A total of 9575 gene models (63%) are found on the 18 large scaffolds (which represent 62% of the total sequence). A total of 13,298 of our predicted genes (88%) have RNA-seq support. Although this gene number is at the low end of bilaterian gene counts, we note that our RNA-seq libraries were all taken from adult animals and thus may not represent the true complexity of the gene complement. We consider our predicted gene number to be a lower bound estimate for the true gene content.

The predicted X. bocki genes have a median coding length of 873 nt and a mean length of 1330 nt. Median exon length is 132 nt (mean 212 nt) and median intron length is 131 nt (mean 394 nt). Genes have a median of 4 exons and a mean of 8.5 exons. A total of 2532 genes have a single exon, of which 1381 are supported as having a single exon by RNA-seq (TPM > 1). A comparison of the exon, intron, and intergenic sequence content in Xenoturbella with those described in other animal genomes (Francis and Wörheide, 2017) shows that X. bocki falls within the range of other similarly sized metazoan genomes (Figure 2) for all these measures.

Figure 2

A comparison of total length of exons, introns, and intergenic space in the *X. bocki* genome with other metazoans (data from Francis and Wörheide, 2017).

*X. bocki* does not appear to be an outlier in any of these comparisons.

The genome of a co-sequenced Chlamydia species

We recovered the genome of a marine Chlamydia species from Illumina data obtained from one X. bocki specimen and from Oxford Nanopore data from a second specimen, supporting previous microscopic analyses and single-gene PCRs suggesting that X. bocki is host to a species in the bacterial genus Chlamydia. The bacterial genome was found as five contigs spanning 1,906,303 bp (N50 of 1,237,287 bp), which were assembled into two large scaffolds. Using PROKKA (Seemann, 2014), we predicted 1738 genes in this bacterial genome, with 3 ribosomal RNAs, 35 transfer RNAs, and 1 transfer-messenger RNA. The genome is 97.5% complete for bacterial BUSCO (Simão et al., 2015) genes, missing only one of the 40 core genes.

Marine chlamydiae are not closely related to the group of human pathogens (Dharamshi et al., 2020), and we were not able to align the genome of the Chlamydia-related symbiont from X. bocki to the reference strain Chlamydia trachomatis F/SW4, nor to Chlamydophila pneumoniae TW-183. To investigate the phylogenetic position of the species co-occurring with Xenoturbella, we aligned the 16S rRNA gene from the X. bocki-hosted Chlamydia with orthologs from related species including sequences of genes amplified from DNA/RNA extracted from deep-sea sediments. The X. bocki-hosted Chlamydia belong to a group designated as Simkaniaceae in Dharamshi et al., 2020, with the sister taxon in our phylogenetic tree being the Chlamydia species previously found in X. westbladi (X. westbladi is almost certainly a synonym of X. bocki) (Rouse et al., 2016; Figure 3).

Figure 3

*X. bocki* harbors a marine *Chlamydiae* species as potential symbiont.

In the phylogenetic analysis of 16S rDNA (ML: GTR + F + R7; bootstrap values included) the bacteria in our *X. bocki* isolate (arrow) are sister to a previous isolate from *X. westbladi*. *X. westbladi* is most likely a misidentification of *X. bocki*.

To investigate whether the X. bocki-hosted Chlamydia might contribute to the metabolic pathways of its host, we compared the completeness of metabolic pathways in KEGG for the X. bocki genome alone and for the X. bocki genome in combination with the bacteria. We found only slightly higher completeness in a small number of pathways involved in carbohydrate metabolism, carbon fixation, and amino acid metabolism (see supplementary material), suggesting that the relationship is likely to be commensal or parasitic rather than a true symbiosis.

A second large fraction of bacterial reads, annotated as Gammaproteobacteria, were identified and filtered out during the data processing steps. These bacteria were also previously reported as potential symbionts of X. bocki (Kjeldsen et al., 2010). However, these sequences were not sufficiently well covered to reconstruct a genome, and we did not investigate them further.

HGT into the X. bocki genome is low

Given the close association with bacteria, we were curious to see whether the X. bocki genome contains an elevated number of horizontally acquired genes. We did not find this to be the case. We were able to detect 56 potential horizontal gene transfer (HGT) events. Phylogenies generated using closest blast hits for each HGT candidate unveiled one of the 56 genes to be of chlamydial origin and thus likely originating from a bacterial contig. A number of HGT candidates appear to be of proteobacterial origin, coding for a functionally diverse set of proteins. In summary, 0.35% of the X. bocki genes we have identified might be horizontally acquired. See supplementary online material for alignments and gene trees.

A phylogenetic gene presence/absence matrix supports Xenambulacraria

The general completeness of the X. bocki gene set allowed us to use the presence and absence of genes identified in our genomes as a source of information to find the best supported phylogenetic position of the Xenacoelomorpha. We conducted two separate phylogenetic analyses of gene presence/absence data: one including the fast-evolving Acoelomorpha and one without. In both analyses, the best tree grouped Xenoturbella with the Ambulacraria (Figure 4a). The analysis including acoels, however, placed the acoels as the sister group to Nephrozoa separate from Xenoturbella (Figure 4b). There are two explanations for this finding. The first would be that the Xenacoelomorphs are paraphyletic; that Xenoturbella is the sister group of the Ambulacraria and Acoelomorpha the sister group of Nephrozoa. Because many other studies have shown the monophyly of Xenacoelomorpha to be robust (Philippe et al., 2019; Cannon et al., 2016; Rouse et al., 2016; Srivastava et al., 2014; Philippe et al., 2011; Bourlat et al., 2006; Ueki et al., 2019), we do not think this a plausible explanation. The second explanation of this observation is that it is the result of systematic error caused by a high rate of gene loss or by orthologs being incorrectly scored as missing due to higher rates of sequence evolution in acoelomorphs (Natsidis et al., 2021). Under this second scenario, we consider it more likely that, of the two clades, it is the Acoelomorpha not Xenoturbella that are wrongly placed and that the position of Xenoturbella represents the more likely position of the entire phylum of Xenacoelomorpha. We note that under both scenarios the focus of our work, Xenoturbella, is the sister group of the Ambulacraria though the implied error suggests that using gene presence/absence may not be the ideal way to solve difficult phylogenetic problems.

Figure 4

A phylogeny based on the presence and absence of genes calculated with OMA.

Both analyses (a) and (b) confirm Xenambulacraria, that is, Xenoturbellida in a group with echinoderms and hemichordates. Inclusion of the acoel flatworms places these as sister to all other Bilateria (b). This placement appears to be an artifact due to the very fast evolution in this taxon, in particular as good evidence exists for uniting Xenoturbellida and Acoela (Philippe et al., 2019; Cannon et al., 2016; Rouse et al., 2016; Srivastava et al., 2014; Philippe et al., 2011; Bourlat et al., 2006; Ueki et al., 2019).

The X. bocki molecular toolkit is typical of bilaterians

One of our principal aims was to ask whether the Xenoturbella genome lacks characteristics otherwise present in the Bilateria. We found that for the Metazoa gene set in BUSCO (v5) the X. bocki proteome translated from our gene predictions is 82.5% complete and ~90% complete when partial hits are included (82 and 93%, respectively, for the Eukaryote gene set). This estimate is even higher in the acoel H. miamia, which was originally reported to be 90% (Gehrke et al., 2019), but in our re-analysis was 95.71%. In comparison, the morphologically highly simplified and fast-evolving annelid Intoshia linei (Schiffer et al., 2018) has a genome of fewer than 10,000 genes (Mikhailov et al., 2016) and in our analysis is only ~64% complete for the BUSCO (v5) Metazoa set. The model nematode Caenorhabditis elegans is ~79% complete for the same set. Despite the morphological simplicity of both Xenoturbella and Hofstenia, these Xenacoelomorpha are missing few core genes compared to other bilaterian lineages that we perceive to have undergone a high degree of morphological evolutionary change (such as the evolution of miniaturization, parasitism, sessility, etc.).

Using our phylogenomic matrix of gene presence/absence (see above), we identified all orthologs that could be detected both in Bilateria (in any bilaterian lineage) and in any non-bilaterian; ignoring HGT and other rare events, these genes must have existed in Urbilateria (and, of less interest to us, in Urmetazoa). The absence of any of these bilaterian genes in any lineage of Bilateria must therefore be explained by loss of the gene. All individual bilaterian genomes were missing many of these orthologs but Xenacoelomorphs and some other bilaterians lacked more of these than did other taxa. The average numbers of these genes present in bilaterians = 7577; Xenoturbella = 5459; Hofstenia = 5438; Praesagittifera = 4280; Drosophila = 4844 and Caenorhabditis = 4323.
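The inference described above can be sketched as a simple set operation: an orthogroup detected in at least one bilaterian and at least one non-bilaterian is treated as ancestral to Bilateria, and its absence from any bilaterian species is then scored as a loss. The sketch below assumes a hypothetical input format (a mapping from orthogroup to the set of species in which it was detected) and uses invented species labels and presence patterns; it is not the pipeline used for the analysis.

```python
def inferred_ancestral(presence, bilaterians):
    """Return orthogroups found in >=1 bilaterian and >=1 non-bilaterian.

    presence: dict mapping orthogroup ID -> set of species with a member.
    bilaterians: set of species labels regarded as bilaterian.
    """
    ancestral = set()
    for orthogroup, species in presence.items():
        in_bilateria = bool(species & bilaterians)
        in_outgroup = bool(species - bilaterians)
        if in_bilateria and in_outgroup:
            ancestral.add(orthogroup)
    return ancestral


def retained_counts(presence, bilaterians):
    """Count, per bilaterian species, how many ancestral orthogroups it retains."""
    ancestral = inferred_ancestral(presence, bilaterians)
    return {
        sp: sum(1 for og in ancestral if sp in presence[og])
        for sp in sorted(bilaterians)
    }


# Invented toy matrix: labels and patterns are purely illustrative.
toy_bilaterians = {"bilaterian_A", "bilaterian_B", "bilaterian_C"}
toy_presence = {
    "OG1": {"outgroup_X", "bilaterian_A", "bilaterian_B"},
    "OG2": {"outgroup_X", "bilaterian_B"},
    "OG3": {"bilaterian_A", "bilaterian_B"},   # bilaterian-only, not counted
    "OG4": {"outgroup_X"},                      # outgroup-only, not counted
    "OG5": {"outgroup_Y", "bilaterian_A", "bilaterian_C"},
}
toy_counts = retained_counts(toy_presence, toy_bilaterians)
```

As noted in the text, this logic ignores HGT and other rare events, and an ortholog that is present but too divergent to be detected will be miscounted as a loss.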

To better profile the Xenoturbella and xenacoelomorph molecular toolkits, we used OrthoFinder to conduct orthology searches in a comparison of 155 metazoan and outgroup species, including the transcriptomes of the sister species Xenoturbella profunda and a draft genome of the acoel Paratomella rubra we had available, as well as the Hofstenia and Praesagittifera proteomes (Supplementary file 1). For each species, we counted, in each of the three Xenacoelomorphs, the number of orthogroups for which a gene was present. The proportion of orthogroups containing an X. bocki and X. profunda protein (87.4 and 89.2%) are broadly similar to the proportions seen in other well-characterized genomes, for example, S. purpuratus proteins (93.8%) or N. vectensis proteins (84.3%) (Figure 5). In this analysis, the fast-evolving nematode C. elegans appears as an outlier, with only ~64% of its proteins in orthogroups and ~35% unassigned. Both Xenoturbella species have an intermediate number of unassigned genes of ~11–12%. Similarly, the proportion of species-specific genes (~14% of all genes) corresponds closely to what is seen in most other species (with the exception of the parasitic annelid I. linei, Figure 5).

Figure 5

In our orthology screen, *X. bocki* shows similar percentages of genes in orthogroups, unassigned genes, and species-specific orthogroups as other well-annotated genomes.

Idiosyncrasies of Xenoturbella

In order to identify sets of orthologs specific to the two Xenoturbella species, we used the kinfin software (Laetsch and Blaxter, 2017a) and found 867 such groups in the OrthoFinder clustering. We profiled these genes based on Pfam domains and GO terms derived from InterProScan. While these Xenoturbella specific proteins fall into diverse classes, we did see a considerable number of C-type lectin, Immunoglobulin-like, PAN, and Kringle domain containing Pfam annotations. Along with the cysteine-rich secretory protein family and the G-protein-coupled receptor activity GO terms, these genes and families of genes may be interesting for future studies into the biology of Xenoturbella in its native environment.

Gene families and signaling pathways are retained in X. bocki

In our orthology clustering, we did not see an inflation of Xenoturbella-specific groups in comparison to other taxa, but also no conspicuous absence of major gene families (Figure 6). Family numbers of transcription factors like Zinc-fingers or homeobox-containing genes, as well as, for example, NACHT-domain encoding genes seem to be neither drastically inflated nor contracted in comparison to other species in our InterProScan-based analysis.

Figure 6

The heatmaps show a comparative measure of relative completeness of signaling pathways based on KEGG and assessed with GenomeMaple or abundance of genes in a given gene family based on InterProScan annotations.

(a) Cell signaling pathways in *X. bocki* are functionally complete, but in comparison to other species contain fewer genes. The overall completeness is not significantly different from that of, for example, the nematode *C. elegans* (inset, t-test). (b) The number of family members per species in major gene families (based on Pfam domains), like transcription factors, fluctuates in evolution. The *X. bocki* genome does not appear to contain markedly fewer or more genes in any of the analyzed families. Due to the comparative nature of the assay, no ‘true’ scale can be given: darker colors indicate higher comparative completeness. Schematic cladograms are drawn by the authors.

To catalogue the completeness of cell signaling pathways, we screened the X. bocki proteome against KEGG pathway maps using GenomeMaple (Takami et al., 2016). The X. bocki gene set is largely complete in regard to the core proteins of these pathways, while an array of effector proteins is absent (Figure 6). We compared X. bocki with other metazoan species, as well as with a unicellular choanoflagellate and a yeast: the X. bocki molecular toolkit has significantly lower KEGG completeness than those of morphologically complex animals such as the sea urchin and amphioxus (t-test; Figure 6). Xenoturbella is, however, not significantly less complete than other bilaterians considered to have low morphological complexity and which have been shown to have reduced gene content, such as C. elegans, the annelid parasite I. linei, or the acoel H. miamia (Figure 6).
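To illustrate the kind of comparison behind the t-tests mentioned above, the sketch below computes a Welch (unequal-variance) t statistic and its approximate degrees of freedom for two sets of per-pathway completeness scores. The text does not specify which t-test variant was used, so the choice of Welch's test is an assumption, as are the function name `welch_t` and the completeness values, which are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    a, b: sequences of numeric scores (each with at least two values and
    non-zero variance in at least one sample).
    """
    # Squared standard errors of the two sample means.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented per-pathway completeness fractions for two hypothetical species.
species_one = [0.90, 0.80, 1.00, 0.70, 0.85]
species_two = [0.95, 0.90, 1.00, 0.85, 0.90]
t_stat, dof = welch_t(species_one, species_two)
```

A p-value would then be obtained from the t distribution with `dof` degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch dependency-free.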

Clustered homeobox genes in the X. bocki genome

Acoelomorph flatworms possess three unlinked HOX genes, orthologs of anterior (Hox1), central (Hox4/5 or Hox5), and posterior Hox (HoxP). In contrast, previous analysis of X. bocki transcriptomes identified one anterior, three central, and one posterior Hox genes. We identified clear evidence of a syntenic Hox cluster with four Hox genes (centHox1, postHox, centHox3, and antHox1) in the X. bocki genome (Figure 7). There was also evidence of a fragmented annotation of centHox2, split between the four gene Hox cluster and a separate scaffold (Figure 7). In summary, this suggests that all five Hox genes form a Hox cluster in the X. bocki genome, but that there are possible unresolved assembly errors disrupting the current annotation. We also identified other homeobox genes on the Hox cluster scaffold, including Evx (Figure 7).

Figure 7

*X. bocki* has five HOX genes, which are located in relatively close proximity on one of our chromosome-size scaffolds.

Similar clusters exist for the ParaHox and ‘pharyngeal’ genes. Numbers between genes are distance (below) and number of genes between (below). Colors indicate gene families. Red box marks the position of a partial Hox gene. The ‘?’ gene has an unresolved homeodomain identity.

Along with the Hox genes, we surveyed other homeobox genes that are typically clustered in Bilateria. The canonical bilaterian ParaHox cluster contains three genes Cdx, Xlox (=Pdx), and Gsx. We identified Cdx and a new Gsx annotation on the same scaffold, as well as a previously reported Gsx paralog on a separate scaffold. This indicates partial retention of the ParaHox cluster in X. bocki along with a duplication of Gsx. On both of these ParaHox-containing scaffolds, we observed other homeobox genes.

Hemichordates and chordates have a conserved cluster of genes involved in patterning their pharyngeal pores – the so-called ‘pharyngeal cluster’. The homeobox genes of this cluster (Msxlx, Nk2-1/2/4/8) were present on a single X. bocki scaffold. Another pharyngeal cluster transcription factor, the Forkhead containing Foxa, and ‘bystander’ genes from that cluster including Egln, Mipol1, and Slc25a21 are found in the same genomic region. Different subparts of the cluster are found in non-bilaterians and protostomes, and the cluster may well be plesiomorphic for the Bilateria rather than a deuterostome synapomorphy (Kapli et al., 2021).

The X. bocki neuropeptide complement is larger than previously thought

A catalog of acoelomorph neuropeptides was previously described using transcriptome data (Thiel et al., 2018a). We have discovered 12 additional neuropeptide genes and 39 new neuropeptide receptors in X. bocki adding 6 bilaterian peptidergic systems to the Xenoturbella catalog (NPY-F; MCH/Asta-C; TRH; ETH; CCHa/Nmn-B; Np-S/CCAP), and 6 additional bilaterian systems to the Xenacoelomorpha catalog (Corazonin; Kiss/GPR54; GPR83; 7B2; Trunk/PTTH; NUCB2), making a total of 31 peptidergic systems (Figure 8).

Figure 8

The *X. bocki* genome contains genes for most bilaterian-specific peptidergic systems and a prokineticin gene containing a signature sequence shared with Ambulacraria.

(a) Sequence alignment of cnidarian Colipase-like proteins, ecdysozoan Astakine-like proteins, and spiralian, chordate, and xenacoelomorph Prokineticin-like proteins shows conserved cysteine positions (highlighted by red triangles), as well as clade-specific signature sequences, among which is a “K/R-RFP-K/R” sequence shared only by ambulacrarians and *X. bocki*. The signatures previously reported for Ecdysozoa and Chordata, as well as the new signatures we found in Spiralia and Cnidaria, are absent from ambulacrarian and *X. bocki* prokineticin ligand sequences. Sequences are available in source data 1 of Figure 8; alignment files are available at https://doi.org/10.5281/zenodo.6962271. (b) Peptidergic systems found in Xenoturbella (X), Nemertodermatida (N) and Acoelomorpha (A). Novel findings are highlighted in the top right inset. Color of schemes and inset cladogram nodes on grey background depicts the evolutionary origin of peptidergic systems in accordance with our analysis: bilaterian, protostomian, chordate, xenacoelomorph + ambulacrarian last common ancestors respectively. 7B2, Neuroendocrine protein 7B2; AKH, adipokinetic hormone; Asta-A, Allatostatin-A; Asta-C, Allatostatin-C; AVP, arginine vasopressin; AVT, Arginine vasotocin; CCAP, crustacean cardioactive peptide; CCHa, CCHamide peptide; CCK, cholecystokinin; CRF, Corticotropin-releasing factor; DH31, diuretic hormone 31; DH44, diuretic hormone 44; EH, eclosion hormone; GlycH A5, Glycoprotein Hormone alpha5; GlycH B2, Glycoprotein Hormone beta2; GnRH, Gonadotropin Releasing Hormone; GPR54, G Protein-Coupled Receptor 54; GPR83, G Protein-Coupled Receptor 83; ILP, Insulin-like peptide; Kiss, Kisspeptin; MCH, melanin concentrating hormone; Nmn-B, Neuromedin B; Np-S, Neuropeptide S; NP-Y/F, Neuropeptide Y/F; NucB2, nucleobindin 2; PDF, Pigment-dispersing factor; PEN, neuroendocrine peptide PEN; PTTH, Prothoracicotropic hormone; RYa, RYamide peptide; t-FMRFa, trochozoan-FMRFamide peptide.

Figure 8—source data 1. Xenoturbella bocki neuropeptide sequences: https://cdn.elifesciences.org/articles/94948/elife-94948-fig8-data1-v2.docx
Figure 8—source data 2. Xenoturbella bocki neuropeptide receptor sequences: https://cdn.elifesciences.org/articles/94948/elife-94948-fig8-data2-v2.docx

Among the ligand genes, we identified six new repeat-containing sequences. One of these, the LRIGamide-peptide, had been identified in Nemertodermatida and Acoela and its loss in Xenoturbella had been proposed (Thiel et al., 2018a). We also identified the first 7B2 neuropeptide and NucB2/Nesfatin genes in Xenacoelomorpha. Finally, we identified three new X. bocki insulin-like peptides, one of them sharing sequence similarity and an atypical cysteine pattern with the Ambulacrarian octinsulin, constituting a potential synapomorphy of Xenambulacraria (see https://doi.org/10.5281/zenodo.6962271).

Our searches also revealed components of the arthropod moulting pathway (PTTH/trunk, NP-S/CCAP, and Bursicon receptors), which have recently been shown to be of ancient origin (de Oliveira et al., 2019). We further identified multiple paralogs of the Tachykinin, RYa/Luquin, tFMRFa, Corazonin, Achatin, CCK, and Prokineticin receptor families. Two complete X. bocki Prokineticin ligands were also found in our survey (Figure 8).

Chordate Prokineticin ligands possess a conserved N-terminal ‘AVIT’ sequence required for receptor activation (Negri and Ferrara, 2018). This sequence is absent in arthropod Astakine, which instead possesses two signature sequences within its Prokineticin domain (Ericsson and Söderhäll, 2018). To investigate Prokineticin ligands in Xenacoelomorpha, we compared the sequences of their Prokineticin ligands with those of other bilaterians (Figure 8). Our alignment reveals clade-specific signatures already reported in Ecdysozoa and Chordata sequences, but also two new signatures specific to Lophotrochozoa and Cnidaria sequences, as well as a very specific ‘K/R-RFP-K/R’ signature shared only by ambulacrarian and X. bocki sequences. The shared Ambulacrarian/Xenacoelomorpha signature is found at the same position as the Chordate sequence involved in receptor activation, adjacent to the N-terminus of the Prokineticin domain (Figure 8).

The X. bocki genome contains most of the bilaterian miRNAs reported missing from acoels

microRNAs have previously been used to investigate the phylogenetic position of the acoels and Xenoturbella. The acoel S. roscoffensis lacks some protostome and bilaterian miRNAs, and this lack was interpreted as supporting the position of acoels as sister group to the Nephrozoa. Based on shallow 454 microRNA sequencing (and sparse genomic traces) of Xenoturbella, some of the bilaterian miRNAs missing from acoels were found – 16 of the 32 expected metazoan (1 miRNA) and bilaterian (31 miRNAs) microRNA families – of which six could be identified in genome traces (Philippe et al., 2011).

By deep sequencing two independent small RNA samples, we have now identified the majority of the missing metazoan and bilaterian microRNAs and identified them in the genome assembly (Figure 9). Altogether, we found 23 out of 31 bilaterian microRNA families (35 genes including duplicates); the single known Metazoan microRNA family (MIR-10) in two copies; the Deuterostome-specific MIR-103; and 7 Xenoturbella-specific microRNAs, giving a total of 46 microRNA genes. None of the protostome-specific miRNAs were found. We could not confirm in the RNA sequences or new assembly a previously identified, and supposedly xenambulacrarian-specific, MIR-2012 ortholog.

Figure 9

The revised microRNA complement of *X. bocki* has a near-complete set of metazoan, bilaterian, and deuterostome families and genes.

Presence (color) and absence (black) of microRNA families (columns) and paralog numbers (values and heatmap coloring), organized in node-specific blocks, in a range of representative protostome and deuterostome species compared with *Xenoturbella* (species from MirGeneDB 2.1; Fromm et al., 2022). The bottom row depicts the complement reported by Philippe et al., 2011 (blue numbers on black depict miRNAs detected as reads but lacking genomic evidence). The red ‘x’ in the pink box highlights the lack of evidence for an Ambulacraria-specific microRNA in *X. bocki*.

The X. bocki genome retains ancestral metazoan linkage groups

The availability of chromosome-scale genomes has made it possible to reconstruct 24 ancestral linkage units broadly preserved in bilaterians (Simakov et al., 2020). In fast-evolving genomes, such as those of nematodes, tunicates, or platyhelminths, these ancestral linkage groups (ALGs) are often dispersed and/or extensively fused (supplementary material). We were interested in testing whether the general conservation of the gene content in X. bocki is reflected in its genome structure.

We compared the genome of Xenoturbella to several other metazoan genomes and found that it has retained most of these ancestral bilaterian units: 12 chromosomes in the X. bocki genome derive from a single ALG, 5 chromosomes are made of the fusion of 2 ALGs, and 1 Xenoturbella chromosome is a fusion of 3 ALGs, as highlighted with the comparison of ortholog content with amphioxus, the sea urchin, and the sea scallop (Figure 10 and supplementary material).
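The classification of chromosomes by their ALG composition can be illustrated with a short sketch. This is not the pipeline used in the study: the gene, scaffold, and ALG labels in the toy input and the 20% minimum-share threshold are hypothetical values chosen only for illustration. The final lines simply tally the counts reported in the text.

```python
from collections import Counter, defaultdict

def alg_composition(gene_to_scaffold, gene_to_alg, min_share=0.2):
    """Per scaffold, list the ALGs contributing at least min_share of its ALG-assigned genes."""
    counts = defaultdict(Counter)
    for gene, scaffold in gene_to_scaffold.items():
        alg = gene_to_alg.get(gene)
        if alg is not None:  # ignore genes without an ALG assignment
            counts[scaffold][alg] += 1
    composition = {}
    for scaffold, alg_counts in counts.items():
        total = sum(alg_counts.values())
        composition[scaffold] = sorted(
            alg for alg, n in alg_counts.items() if n / total >= min_share
        )
    return composition

# Toy input (hypothetical genes, scaffolds, and ALG labels).
gene_to_scaffold = {
    "g1": "chr1", "g2": "chr1", "g3": "chr1",
    "g4": "chr2", "g5": "chr2", "g6": "chr2", "g7": "chr2",
}
gene_to_alg = {
    "g1": "A1", "g2": "A1", "g3": "A1",
    "g4": "Q", "g5": "Q", "g6": "E", "g7": "E",
}

composition = alg_composition(gene_to_scaffold, gene_to_alg)
# Number of scaffolds built from 1, 2, 3, ... ALGs in the toy data.
fusion_classes = Counter(len(algs) for algs in composition.values())

# Counts reported in the text: 12 single-ALG, 5 two-ALG, 1 three-ALG chromosomes.
reported_classes = {1: 12, 2: 5, 3: 1}
reported_chromosomes = sum(reported_classes.values())
```

In the toy data, `chr1` is assigned to a single ALG and `chr2` to a fusion of two; the counts reported in the text sum to the 18 chromosome-scale scaffolds of the assembly.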

Figure 10 with 2 supplements

A comparison of scaffolds in the *X. bocki* genome with other Metazoa.

17 of the 18 large scaffolds in the *X. bocki* genome are linked via synteny to distinct chromosomal scaffolds in these species.

One ALG that has been lost in chordates but not in ambulacrarians or mollusks (ALG R in sea urchin and sea scallop) is detectable in X. bocki (Figure 10), whereas X. bocki does not show the fusions that are characteristic of lophotrochozoans.

We also attempted to detect some pre-bilaterian arrangement of ancestral linkage: for instance, Simakov et al., 2022 predicted that several pre-bilaterian linkage groups successively fused in the bilaterian lineage to give ALGs A1, Q, and E. These ALGs are all represented as single units in X. bocki in common with other Bilateria. None of the inferred pre-bilaterian chromosomal arrangements that could have provided support for the Nephrozoa hypothesis were found in X. bocki, although of course this does not rule out Nephrozoa.

One X. bocki chromosomal fragment appears aberrant

The smallest of the 18 large scaffolds in the X. bocki genome did not show strong 1:1 clustering with any scaffold/chromosome of the bilaterian species we compared it to. To exclude potential contamination in the assembly as a source for this contig, we examined the orthogroups to which the genes from this scaffold belong. We found that X. profunda (Rouse et al., 2016), for which a transcriptome is available, was the species that most often occurred in the same orthogroup with genes from this scaffold (41 shared orthogroups), suggesting the scaffold is not a contaminant.
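The orthogroup check described above amounts to counting, for each other species, how many orthogroups it shares with genes from the scaffold in question. A minimal sketch follows, assuming orthogroups are available as lists of (species, gene) pairs; the function name, species labels, and gene identifiers are hypothetical and the toy data do not reproduce the study's results.

```python
from collections import Counter

def cooccurring_species(scaffold_genes, orthogroups, focal_species):
    """Count, per species, the orthogroups it shares with genes from the scaffold."""
    shared = Counter()
    for members in orthogroups.values():
        # members is a list of (species, gene_id) tuples
        has_scaffold_gene = any(
            sp == focal_species and gene in scaffold_genes for sp, gene in members
        )
        if not has_scaffold_gene:
            continue
        # count each other species at most once per orthogroup
        shared.update({sp for sp, _ in members if sp != focal_species})
    return shared

# Toy input (hypothetical identifiers).
orthogroups = {
    "OG1": [("X_bocki", "xb1"), ("X_profunda", "xp1"), ("sponge", "sp1")],
    "OG2": [("X_bocki", "xb2"), ("X_profunda", "xp2")],
    "OG3": [("X_bocki", "xb9"), ("bacterium", "b1")],  # xb9 is not on the scaffold
}
scaffold_genes = {"xb1", "xb2"}

shared = cooccurring_species(scaffold_genes, orthogroups, "X_bocki")
top_species, top_count = shared.most_common(1)[0]
```

If the most frequent partner is a close relative rather than a bacterium or other distant taxon, as the text reports for X. profunda, contamination becomes an unlikely explanation for the scaffold.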

We did observe links between the aberrant scaffold and several scaffolds from the genome of the sponge Ephydatia muelleri, but could not detect distinct synteny relationships to a single scaffold in other species. In line with this, genes on the scaffold show a different age structure compared to other scaffolds, with more old (pre-bilaterian) genes and more Xenoturbella-specific genes (Figure 11; supported by Ks statistics, supplementary material). This aberrant scaffold also had significantly lower levels of methylation than the rest of the genome.

Figure 11

Phylostratigraphic age distribution of genes on all major scaffolds in the *X. bocki* genome.

One scaffold (c1896), which showed no synteny to a distinct chromosomal scaffold in the other metazoan species, also had a divergent gene age structure in comparison to other *X. bocki* scaffolds.

Discussion

The phylogenetic positions of Xenoturbella and the Acoelomorpha have been controversial since the first molecular data from these species appeared over 25 years ago. Today we understand that they constitute a monophyletic group of morphologically simple worms (Telford, 2008; Philippe et al., 2011; Hejnol, 2015), but there remains a disagreement over whether they represent a secondarily simplified sister group of the Ambulacraria or a primitively simple sister group to all other Bilateria. Here we wanted to analyze the genome of X. bocki to glean insights into their biology from a new perspective.

Previous analyses of the content of genomes, especially of Acoela, have found a small number of Hox genes and of microRNAs in acoels, and this has been interpreted as representing an intermediate stage on the path to the ~8 Hox genes and 30-odd microRNAs of the Nephrozoa. A strong version of the Nephrozoa idea would go further than these examples and anticipate, for example, a genome-wide paucity of bilaterian genes, GRNs, and biochemical pathways and/or an arrangement of chromosomal segments intermediate between those of the Eumetazoa and the Nephrozoa.

One criticism of the results from analyses of acoel genomes is that the Acoelomorpha have evolved rapidly (their long branches in phylogenetic trees showing high rates of sequence change). This rapid evolution might plausibly be expected to correlate with other aspects of rapid genome evolution such as higher rates of gene loss and chromosomal rearrangements, leading to significant differences from other Bilateria. The more normal rates of sequence evolution observed in Xenoturbella therefore recommend it as a more appropriate xenacoelomorph to study with fewer apomorphic characters expected.

We have sequenced, assembled, and analyzed a draft genome of X. bocki. To help with annotation of the genome, we also sequenced miRNAs and small RNAs and used bisulfite sequencing, Hi-C, and Oxford Nanopore Technologies sequencing. We compared the gene content of the Xenoturbella genome to species across the Metazoa and its genome structure to several other high-quality draft animal genomes.

We found the X. bocki genome to be fairly compact, but not unusually reduced in size compared to many other bilaterians. It appears to contain a similar number of genes (~15,000) to other animals, for example, the model organisms Drosophila melanogaster (>14,000) and C. elegans (~20,000). The BUSCO completeness, as well as a high level of representation of X. bocki proteins in the orthogroups of our 155 species orthology screen, indicates that we have annotated a near-complete gene set. Surprisingly, there are fewer genes than in the acoel Hofstenia (>22,000; BUSCO_v5 score ~95%). That said, of the genes found in Urbilateria (orthogroups in our presence/absence analysis containing a member from both a bilaterian and an outgroup), Xenoturbella and Hofstenia have very similar numbers (5459 and 5438, respectively). Gene, intron, and exon lengths also all fall within the range seen in many other invertebrate species (Francis and Wörheide, 2017). It thus seems that basic genomic features in Xenoturbella are not anomalous among Bilateria. Unlike some extremely simplified animals, such as orthonectids, we observe no extreme reduction in gene content.
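The rule used above to define genes present in Urbilateria (an orthogroup counts if it contains at least one bilaterian and at least one outgroup member) can be written as a simple filter. The sketch below assumes each orthogroup is represented as a set of species labels; the function names and the toy data are hypothetical and are not taken from the study's orthology screen.

```python
def urbilaterian_orthogroups(orthogroups, bilaterians, outgroups):
    """Return IDs of orthogroups with at least one bilaterian and one outgroup member."""
    return {
        og_id
        for og_id, species in orthogroups.items()
        if species & bilaterians and species & outgroups
    }

def count_retained(species, orthogroups, ancestral_ids):
    """Count ancestral orthogroups in which the given species is present."""
    return sum(1 for og_id in ancestral_ids if species in orthogroups[og_id])

# Toy input (hypothetical orthogroups and species labels).
bilaterians = {"Xenoturbella", "Hofstenia", "Drosophila"}
outgroups = {"cnidarian", "sponge"}
orthogroups = {
    "OG1": {"Xenoturbella", "Hofstenia", "cnidarian"},
    "OG2": {"Xenoturbella", "Drosophila"},          # bilaterian only: excluded
    "OG3": {"Hofstenia", "Drosophila", "sponge"},
    "OG4": {"cnidarian", "sponge"},                 # outgroup only: excluded
}

ancestral = urbilaterian_orthogroups(orthogroups, bilaterians, outgroups)
retained = {sp: count_retained(sp, orthogroups, ancestral)
            for sp in ("Xenoturbella", "Hofstenia")}
```

Applied to the real orthology table, counts of this kind are what the text compares between Xenoturbella (5459) and Hofstenia (5438).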

All classes of homeodomain transcription factors have previously been reported to exist in Xenacoelomorpha (Brauchle et al., 2018). We have identified five HOX-genes in X. bocki and at least four, and probably all five of these are on one chromosomal scaffold within 187 Kbp. X. bocki also has the ParaHox genes Gsx and Cdx; while Xlox/pdx is not found, it is present in Cnidarians and must therefore have been lost (Jimenez-Guri et al., 2006). If block duplication models of Hox and ParaHox evolutionary relationships are correct, the presence of a complete set of ParaHox genes implies the existence of their Hox paralogs in the ancestor of Xenacoelomorphs, suggesting the xenacoelomorph ancestor also possessed a Hox 3 ortholog. If anthozoans also have an ortholog of bilaterian Hox 2 (Ryan et al., 2006), this must also have been lost from Xenacoelomorphs. The minimal number of Hox genes in the xenacoelomorph stem lineage was therefore probably 7 (AntHox1, lost Hox2, lost Hox 3, CentHox 1, CentHox 2, CentHox 3, and postHoxP).

Based on early sequencing technology and without a reference genome available, it was thought that Acoelomorpha lack many bilaterian microRNAs. Using deep sequencing of small RNAs and our high-quality genome, we have shown that Xenoturbella shows a near-complete bilaterian set of miRNAs including the single deuterostome-specific miRNA family (MIR-103) (Figure 9). The low number of differential family losses of Xenoturbella (8 of 31 bilaterian miRNA families) inferred is the same as the number lost in the flatworm Schmidtea, and substantially lower than the number lost in the rotifer Brachionus (which has lost 14 bilaterian families). It is worth mentioning that X. bocki shares the absence of one miRNA family (MIR-216) with all Ambulacrarians, although if Deuterostomia are paraphyletic this could be interpretable as a primitive state (Kapli et al., 2021).
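The family-loss comparison is simple arithmetic on numbers given in the text, sketched here for clarity. All values come from the Results and the paragraph above; only the variable names are ours.

```python
BILATERIAN_FAMILIES = 31   # expected bilaterian microRNA families (from the text)
FOUND_IN_X_BOCKI = 23      # bilaterian families detected in X. bocki (from the text)

# Number of bilaterian families inferred as lost in each lineage.
losses = {
    "Xenoturbella": BILATERIAN_FAMILIES - FOUND_IN_X_BOCKI,
    "Schmidtea": 8,        # stated in the text to equal the Xenoturbella value
    "Brachionus": 14,      # stated in the text
}

# Fraction of the bilaterian families retained in each lineage.
retained_fraction = {
    species: (BILATERIAN_FAMILIES - lost) / BILATERIAN_FAMILIES
    for species, lost in losses.items()
}
```

On these numbers Xenoturbella retains roughly three quarters of the bilaterian families, compared with just over half in Brachionus.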

The last decade has seen a re-evaluation of our understanding of the evolution of the neuropeptide signaling genes (Jékely, 2013; Mirabeau and Joly, 2013). The peptidergic systems are thought to have undergone a diversification that produced approximately 30 systems in the bilaterian common ancestor (Jékely, 2013; Mirabeau and Joly, 2013). Our study identified 31 neuropeptide systems in X. bocki, and for all of these either the ligand, receptor, or both are also present in both protostomes and deuterostomes, indicating conservation across Bilateria. It is likely that more ligands (which are short and variable) remain to be found with better detection methods. It appears that the Xenoturbella genome contains a nearly complete bilaterian neuropeptide complement with no signs of simplification but rather signs of expansions of certain gene families. Our analyses also reveal a potential synapomorphy linking Xenacoelomorpha with Ambulacraria (Figure 8 and https://doi.org/10.5281/zenodo.6962271).

We have used the predicted presence and absence of genes across a selection of metazoan genomes as characters for phylogenetic analyses. Our trees reconfirm the findings of recent phylogenomic gene alignment studies in linking Xenoturbella to the Ambulacraria. We also used these data to test different bilaterians for their propensity to lose otherwise conserved genes (or for our inability to identify orthologs; Natsidis et al., 2021). While the degree of gene loss appears similar between Xenoturbella and acoels, the phylogenetic analysis shows longer branches leading to the acoels, most likely due to faster evolution, gain of lineage-specific genes, and some degree of gene loss in the branch leading to the Acoelomorpha. Recent work has shown the tendency of rapidly evolving genes (in particular those belonging to rapidly evolving species) to be missed by orthology detection software (Natsidis et al., 2021; Weisman et al., 2020).
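Coding gene presence and absence as phylogenetic characters means turning orthogroup membership into a binary matrix with one row per species and one column per orthogroup. The following is a minimal sketch under that assumption; the species labels, orthogroup IDs, and the PHYLIP-style output layout are illustrative choices, not a description of the files used in the study.

```python
def presence_absence_matrix(orthogroups, species_order):
    """Encode each species as a string of 1 (present) and 0 (absent), one column per orthogroup."""
    og_ids = sorted(orthogroups)
    matrix = {
        sp: "".join("1" if sp in orthogroups[og] else "0" for og in og_ids)
        for sp in species_order
    }
    return og_ids, matrix

def to_phylip_like(matrix):
    """Format the matrix as a simple PHYLIP-style text block (taxon count, character count, rows)."""
    n_taxa = len(matrix)
    n_chars = len(next(iter(matrix.values())))
    rows = [f"{sp} {row}" for sp, row in matrix.items()]
    return "\n".join([f"{n_taxa} {n_chars}"] + rows)

# Toy input (hypothetical orthogroups and species labels).
orthogroups = {
    "OG1": {"Xenoturbella", "sea_urchin"},
    "OG2": {"Xenoturbella", "Hofstenia"},
    "OG3": {"sea_urchin"},
}
species_order = ["Xenoturbella", "Hofstenia", "sea_urchin"]

og_ids, matrix = presence_absence_matrix(orthogroups, species_order)
phylip_text = to_phylip_like(matrix)
```

A matrix of this kind can then be passed to phylogenetic software that supports binary characters; as the text notes, an absence may reflect either true gene loss or a failure to detect a fast-evolving ortholog.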

This pattern of conservation of evolutionarily old parts of the Metazoan genome is further reinforced by the retention in Xenoturbella of linkage groups present from sponges to vertebrates. It is interesting to note that X. bocki does not follow the pattern seen in other morphologically simplified animals such as nematodes and platyhelminths, which have lost and/or fused these ALGs. We interpret this to be a signal of comparably slower genomic evolution in Xenoturbella in comparison to some other bilaterian lineages. The fragmented genome sequence of Hofstenia prevents us from asking whether the ancient linkage groups have also been preserved in the Acoelomorpha.

One of the chromosome-scale scaffolds in our assembly showed a different methylation and age signal, with both older and younger genes, and no clear relationship to metazoan linkage groups. By analyzing orthogroups of genes on this scaffold for their phylogenetic signal and finding X. bocki genes to cluster with those of X. profunda, we concluded that the scaffold most likely does not represent a contamination. It remains unclear whether this scaffold is a fast-evolving chromosome or a chromosomal fragment or arm. Very fast evolution on a chromosomal arm has, for example, been shown in the zebrafish (Howe et al., 2016).

Apart from DNA from X. bocki, we also obtained a highly contiguous genome of a species related to marine Chlamydia species (known from microscopy to exist in X. bocki); a symbiotic relationship between Xenoturbella and the bacterium has been thought possible (Pillonel et al., 2018; Robertson et al., 2024). The large gene number and the completeness of genetic pathways we found in the chlamydial genome do not support an endosymbiotic relationship.

Overall, we have shown that, while Xenoturbella has lost some genes (in addition to the reduced number of Hox genes previously noted, we observe a reduction of some signaling pathways to the core components), in general the X. bocki genome is not strikingly simpler than many other bilaterian genomes. We do not find support for a strong version of the Nephrozoa hypothesis that would predict many missing bilaterian genes. Bilaterian Hox genes and microRNAs absent from Acoelomorpha are found in Xenoturbella, eliminating the impact of two character types that were previously cited in support of Nephrozoa. The Xenoturbella genome has also largely retained the ALGs found in other bilaterians and does not represent a structure intermediate between Eumetazoan and bilaterian ground states. Overall, while we can rule out a strong version of the Nephrozoa hypothesis with many Bilaterian characteristics missing in xenacoelomorphs, our analysis of the Xenoturbella genome cannot distinguish between a weak version of Nephrozoa and the Xenambulacraria topology.

Author details

Philipp H Schiffer (for correspondence)
Paschalis Natsidis
Daniel J Leite
Helen E Robertson
François Lapraz
Ferdinand Marlétaz
Bastian Fromm
Liam Baudry
Fraser Simpson
Eirik Høye
Anne C Zakrzewski
Paschalia Kapli
Katharina J Hoff
Steven Müller
Martial Marbouty
Heather Marlow
Richard R Copley
Romain Koszul
Peter Sarkies
Maximilian J Telford (for correspondence)
