Mosaic cis-regulatory evolution drives transcriptional partitioning of HERVH endogenous retrovirus in the human embryo
Abstract
The human endogenous retrovirus type-H (HERVH) family is expressed in the preimplantation embryo. A subset of these elements is specifically transcribed in pluripotent stem cells, where they appear to exert regulatory activities promoting self-renewal and pluripotency. How HERVH elements achieve such transcriptional specificity remains poorly understood. To uncover the sequence features underlying HERVH transcriptional activity, we performed a phyloregulatory analysis of the long terminal repeats (LTR7) of the HERVH family, which harbor its promoter, using a wealth of regulatory genomics data. We found that the family includes at least 8 previously unrecognized subfamilies that have been active at different timepoints in primate evolution and display distinct expression patterns during human embryonic development. Notably, nearly all HERVH elements transcribed in ESCs belong to one of the youngest subfamilies, which we dubbed LTR7up. LTR7 sequence evolution was driven by a mixture of mutational processes, including point mutations, duplications, and multiple recombination events between subfamilies, that led to transcription factor binding motif modules characteristic of each subfamily. Using a reporter assay, we show that one such motif, a predicted SOX2/3 binding site unique to LTR7up, is essential for robust promoter activity in induced pluripotent stem cells. Together, these findings illuminate the mechanisms by which HERVH diversified its expression pattern during evolution to colonize distinct cellular niches within the human embryo.
Data availability
Scripts, data tables, and notes for figures 1-4, 6a and supplemental figures 1-1, 2-1, 3-1, 4-1, 5-1, 6-2 by TAC and JDC - https://github.com/LumpLord/Mosaic-cis-regulatory-evolution-drives-transcriptional-partitioning-of-HERVH-endogenous-retrovirus.
Scripts and data tables by MS for figures 5, 6c and supplemental figures 6-1, 6-3, 5-2 - https://github.com/Manu-1512/LTR7-up
- Transcription factor binding dynamics during human ES cell differentiation. NCBI Gene Expression Omnibus, GSE61475.
- 3D Chromosome Regulatory Landscape of Human Pluripotent Cells. NCBI Gene Expression Omnibus, GSE69647.
- ChIP-exo of human KRAB-ZNFs transduced in HEK 293T cells and KAP1 in hES H1 cells. NCBI Gene Expression Omnibus, GSE78099.
- Repeat elements study in pluripotent stem cells. NCBI Gene Expression Omnibus, GSE54726.
- Principles of Signalling Pathway Modulation for Enhancing Human Naïve Pluripotency Induction [ChIP-seq]. NCBI Gene Expression Omnibus, GSE125553.
- Tracing pluripotency of human early embryos and embryonic stem cells by single cell RNA-seq. NCBI Gene Expression Omnibus, GSE36552.
- Single-Cell RNA-seq Defines the Three Cell Lineages of the Human Blastocyst. NCBI Gene Expression Omnibus, GSE66507.
Article and author information
Funding
National Institutes of Health (GM112972)
- Cédric Feschotte
National Institutes of Health (HG009391)
- Cédric Feschotte
National Institutes of Health (GM122550)
- Cédric Feschotte
Cornell Center for Vertebrate Genomics
- Thomas Carter
Howard Hughes Medical Institute
- John L Rinn
National Institutes of Health (GM099117)
- John L Rinn
Cornell Presidential Fellow Program
- Manvendra Singh
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2022, Carter et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.