Uncovering perturbations in human hematopoiesis associated with healthy aging and myeloid malignancies at single-cell resolution

  1. Marina Ainciburu
  2. Teresa Ezponda
  3. Nerea Berastegui
  4. Ana Alfonso-Pierola
  5. Amaia Vilas-Zornoza
  6. Patxi San Martin-Uriz
  7. Diego Alignani
  8. Jose Lamo-Espinosa
  9. Mikel San-Julian
  10. Tamara Jiménez-Solas
  11. Felix Lopez
  12. Sandra Muntion
  13. Fermin Sanchez-Guijo
  14. Antonieta Molero
  15. Julia Montoro
  16. Guillermo Serrano
  17. Aintzane Diaz-Mazkiaran
  18. Miren Lasaga
  19. David Gomez-Cabrero
  20. Maria Diez-Campelo
  21. David Valcarcel
  22. Mikel Hernaez
  23. Juan P Romero (corresponding author)
  24. Felipe Prosper (corresponding author)

  1. Area de Hemato-Oncología, Centro de Investigación Médica Aplicada, Universidad de Navarra, Instituto de investigación sanitaria de Navarra (IDISNA), Spain
  2. Centro de Investigación Biomédica en Red de Cáncer, Spain
  3. Clinica Universidad de Navarra, Spain
  4. Flow Cytometry Core, Universidad de Navarra, Spain
  5. Hospital Universitario de Salamanca, Spain
  6. Red de Investigación Cooperativa en Terapia Celular TerCel, ISCIII, Spain
  7. Department of Hematology, Vall d'Hebron Hospital Universitari, Spain
  8. Computational Biology Program, Universidad de Navarra, Spain
  9. Translational Bioinformatics Unit, NavarraBiomed, Spain
  10. Biological & Environmental Sciences & Engineering Division, King Abdullah University of Science and Technology, Saudi Arabia
9 figures and 7 additional files

Figures

Figure 1 with 4 supplements
Transcriptional profiling of CD34+ cells from young and elderly healthy donors.

(A) CD34+ cells were obtained from bone marrow aspirates of young (n=5) and elderly (n=3) donors and subjected to single-cell RNA sequencing. (B) UMAP plot with young cells colored according to unsupervised clustering results (left) and elderly cells labeled using an in-house cell classifier (right). (C) Dot plot of cluster markers (adjusted p-value <0.05) for the different cellular subpopulations identified. Dot size represents the percentage of cells that express each marker, and color represents scaled expression values. (D) Bar plots showing the proportion of cells assigned to each cellular subpopulation for each donor independently. (E) Dot plot of enriched terms after performing gene set enrichment analysis (GSEA) for each identified cluster. Dot color represents the enriched group, size indicates the absolute normalized enrichment score (NES), and transparency indicates the adjusted p-value.
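The per-donor proportions described for panel D can be illustrated with a minimal sketch. It assumes each cell is available as a (donor, subpopulation) pair; the function name `proportions_per_donor` and the toy values are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def proportions_per_donor(cells):
    """Return {donor: {subpopulation: fraction}} from (donor, subpopulation) pairs."""
    counts = defaultdict(Counter)
    for donor, label in cells:
        counts[donor][label] += 1
    result = {}
    for donor, label_counts in counts.items():
        total = sum(label_counts.values())
        # Fractions are computed within each donor, so each donor sums to 1.
        result[donor] = {label: n / total for label, n in label_counts.items()}
    return result


# Hypothetical toy input (one entry per cell), not data from the study.
toy_cells = [
    ("young1", "HSC"), ("young1", "HSC"), ("young1", "GMP"), ("young1", "MEP"),
    ("elderly1", "HSC"), ("elderly1", "HSC"), ("elderly1", "HSC"), ("elderly1", "MEP"),
]
toy_props = proportions_per_donor(toy_cells)
```

Computing the fractions donor by donor, rather than on pooled cells, keeps donors with many cells from dominating the comparison.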

Figure 1—figure supplement 1
Evaluation of GLMnet classification method.

(A) UMAP plots of cells from the three elderly donors, colored by the probability of belonging to a specific identity, as computed by each of the binary classification models. (B) CD34+ cells from the Granja et al. dataset. (Left) Cells colored by the original classification. (Middle) Cells colored by the GLMnet classification, using young donor data and identities as reference. (Right) Cells colored according to the cellular identities predicted by Seurat. (C) Heatmap showing the proportion of cells predicted within each of the ground truth groups. Each column sums to 100%.
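The sketch below illustrates two ideas from this caption: turning per-identity probabilities from a set of binary models into a single label, and building a table in which each ground-truth column sums to 100%. Taking the identity with the highest probability is an assumption made here for illustration; the caption does not state the assignment rule. The function names and toy values are hypothetical.

```python
def assign_identity(probabilities):
    """Pick the identity whose binary model gives the highest probability.

    Assumption: a simple arg-max rule; the actual rule may differ.
    """
    return max(probabilities, key=probabilities.get)


def column_percentages(truth, predicted):
    """Return {truth_label: {predicted_label: percent}}.

    Percentages are computed within each ground-truth group,
    so every ground-truth column sums to 100.
    """
    table = {}
    for t, p in zip(truth, predicted):
        column = table.setdefault(t, {})
        column[p] = column.get(p, 0) + 1
    return {
        t: {p: 100.0 * n / sum(column.values()) for p, n in column.items()}
        for t, column in table.items()
    }


# Hypothetical toy labels, not data from the study.
toy_truth = ["HSC", "HSC", "HSC", "HSC", "MEP", "MEP"]
toy_pred = ["HSC", "HSC", "HSC", "MEP", "MEP", "MEP"]
toy_table = column_percentages(toy_truth, toy_pred)
toy_label = assign_identity({"HSC": 0.9, "MEP": 0.2, "GMP": 0.1})
```

Normalizing within ground-truth groups makes the diagonal of the heatmap readable as per-class recall.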

Figure 1—figure supplement 2
Classification of CD34+ cells in individual young and elderly donors.

UMAP plots with cells colored by cellular subpopulation, separated by donor. (Top) Cells from young donors labeled by unsupervised clustering and manual annotation. (Bottom) Cells from elderly donors classified with GLMnet.

Figure 1—figure supplement 3
CD34+ progenitor proportions by fluorescence-activated cell sorting (FACS).

Plot depicting the percentage of hematopoietic stem cells (HSCs), granulocyte-monocyte progenitors (GMPs), and megakaryocyte-erythroid progenitors (MEPs) within the total CD34+ population of healthy young and elderly individuals. Each point represents an individual, and the mean ± standard deviation (SD) is shown for each group. **p-value <0.01 (t test).

Figure 1—figure supplement 4
Differentially expressed genes upon aging.

Violin plots showing normalized expression of genes involved in differentially enriched pathways. Expression levels are divided by cell subpopulation and age (young cells colored in red and elderly cells in blue). (A) Genes upregulated in elderly subpopulations. (B) Genes upregulated in young subpopulations. *Adjusted p-value <0.05, **adjusted p-value <0.01, ***adjusted p-value <0.001.

Figure 2 with 1 supplement
Trajectory inference of the hematopoietic lineages at single-cell resolution.

(A) UMAP plots showing the results of applying the Palantir algorithm to young and elderly cells. For both datasets, a hematopoietic stem cell (HSC) was set as the initial state, based on UMAP coordinates. Final states were specified only for the elderly dataset, as the UMAP nearest neighbors of the six young final points. Cells are colored by pseudotime and (B) differentiation potential. (C) Branch probabilities for each of the six differentiation paths retrieved. (D) Scatter plot of pseudotime vs. branch probabilities for the monocytic trajectory obtained using Palantir for young and elderly donors. Color represents the cellular subpopulation. (E) Heatmap of gene expression trends for dynamic genes along the monocytic trajectory in young and elderly donors. The columns are arranged according to pseudotime values, and the rows are grouped according to gene clustering results. A summary of enriched terms for the gene clusters in young donors is shown. (F) Expression trends of genes involved in the monocytic trajectory, compared between young and elderly cells (NS = not significant, *adjusted p-value <0.05, **adjusted p-value <0.01, ***adjusted p-value <0.001).

Figure 2—figure supplement 1
Trajectory inference with STREAM reveals the main hematopoietic differentiation branches.

(A) STREAM plot obtained using cells from young healthy donors. Color denotes cellular subpopulations. The x-axis displays inferred pseudotime values. (B) Expression of known cell-type markers for the different hematopoietic lineages projected in the STREAM plot. Color represents normalized expression values. (C) STREAM plot of elderly differentiation trajectories projected in the young reference. Color indicates the proportion of cells belonging to each condition under study. (D) STREAM plot of elderly differentiation trajectories projected in the young reference. Color represents the cell-type identity (gray color represents the proportion of young cells). (E) Scatter plot of pseudotime values recovered with Palantir (x-axis) and STREAM (y-axis). Points are colored by cell type. (F) Violin plots colored by condition and representing the pseudotime and differentiation potential per cell type. (Bottom) Branch probability for the differentiation route from hematopoietic stem cells (HSCs) to monocytes. Wilcoxon two-sample test, *adjusted p-value <0.05, **adjusted p-value <0.01, ***adjusted p-value <0.001, NS = not significant.

Figure 3 with 1 supplement
Gene regulatory network reconstruction of hematopoietic cellular populations.

(A) (Left) Heatmap showing the proportion of cells per cluster that have an activated state for different regulons in young cells. (Right) UMAP plots with normalized expression and AUC values for specific transcription factors. (B) Gene regulatory network of the identified regulons for the hematopoietic system in young donors. Regulons were trimmed to include only targets with an importance score higher than the third quartile in each regulon. Node shape denotes gene-type identity, and color denotes cell population. Any target that can be assigned to multiple transcription factors is colored in gray. (C) (Left) Heatmap showing the proportion of cells per cluster that have an activated state for different regulons in elderly cells. (Right) UMAP plots with normalized expression and AUC values for specific transcription factors. (D) Gene regulatory network of the identified regulons for the hematopoietic system in elderly donors. Regulons were trimmed to include only targets with an importance score higher than the third quartile in each regulon. Node shape denotes gene-type identity, and color denotes cell population. Any target that can be assigned to multiple transcription factors is colored in gray. (E) Bar plot with enriched gene ontology categories after over-representation analysis. Categories are grouped per cell type, and color denotes the enriched group. Bar length represents statistical significance of the enrichment, as -log10 p-value.
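The regulon trimming described in panels B and D (keeping only targets whose importance score exceeds the third quartile of their regulon) can be sketched as follows. This is an illustration under stated assumptions: the quartile is computed with Python's `statistics.quantiles` using the "inclusive" method, which may differ from the quantile definition used in the study, and the function name, gene names, and scores are hypothetical.

```python
import statistics


def trim_regulon(targets):
    """Keep targets whose importance is strictly above the regulon's third quartile.

    targets: dict mapping target gene -> importance score for one regulon.
    Assumption: 'inclusive' quantile method; other definitions shift the cutoff slightly.
    """
    if len(targets) < 2:
        # A quartile is not meaningful for fewer than two targets; keep as-is.
        return dict(targets)
    q3 = statistics.quantiles(targets.values(), n=4, method="inclusive")[2]
    return {gene: score for gene, score in targets.items() if score > q3}


# Hypothetical regulon with eight targets and made-up importance scores.
toy_regulon = {f"gene{i}": float(i) for i in range(1, 9)}
toy_trimmed = trim_regulon(toy_regulon)
```

Because the cutoff is computed within each regulon, large and small regulons are each reduced to roughly their top quarter of targets.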

Figure 3—figure supplement 1
Extraction of cell subpopulation-specific regulons from gene regulatory networks.

Regulons ranked by their regulon specificity score (RSS), computed with pySCENIC for each subpopulation. Names of the five regulons with the most specific activity per subpopulation are shown. (A) Young regulons. (B) Elderly regulons.
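For readers unfamiliar with the specificity score, the sketch below shows one common formulation: one minus the Jensen-Shannon distance between normalized per-cell regulon activity and a normalized indicator of membership in a subpopulation. It is a standalone illustration, not the pySCENIC implementation; the base-2 logarithm (which bounds the distance between 0 and 1), the function names, and the toy values are assumptions.

```python
import math


def _kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regulon_specificity(auc, in_group):
    """Return 1 - Jensen-Shannon distance between activity and group membership.

    auc: non-negative regulon activity per cell (not all zero).
    in_group: 1 if the cell belongs to the subpopulation, else 0.
    """
    p = [a / sum(auc) for a in auc]
    q = [g / sum(in_group) for g in in_group]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)
    return 1.0 - math.sqrt(max(jsd, 0.0))


def top_regulons(scores, n=5):
    """Names of the n regulons with the highest specificity score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical activities for four cells; the first two belong to the subpopulation.
membership = [1, 1, 0, 0]
toy_scores = {
    "TF_A": regulon_specificity([1.0, 1.0, 0.0, 0.0], membership),  # active only inside
    "TF_B": regulon_specificity([0.0, 0.0, 1.0, 1.0], membership),  # active only outside
    "TF_C": regulon_specificity([1.0, 1.0, 1.0, 1.0], membership),  # active everywhere
}
toy_ranking = top_regulons(toy_scores, n=2)
```

A regulon active only in the subpopulation scores 1, one active only elsewhere scores 0, and broadly active regulons fall in between.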

Figure 4 with 1 supplement
Computational analysis of pathological conditions, including myelodysplastic syndromes (MDS) and acute myeloid leukemia.

(A) UMAP plot of CD34+ cells from patients with MDS (n=4). Cells are colored according to identity, as assessed using a previously described cell-type classification method. (B) Bar plots showing the proportion of cells assigned to each cellular subpopulation for each donor independently. Color denotes the cellular subpopulation. (C) Gene set enrichment analysis (GSEA) results after performing differential expression between MDS and elderly donors. Dot color represents enrichment direction, transparency represents statistical significance, and size represents the absolute NES. (D) Expression trends of genes involved in the erythroid trajectory, compared between healthy and pathological cells (NS = not significant, *adjusted p-value <0.05, **adjusted p-value <0.01, ***adjusted p-value <0.001). (E) Heatmap showing the proportion of cells per cluster that had an activated state for different regulons in the four samples of patients with MDS among AML cells.

Figure 4—figure supplement 1
Computational analysis of pathological samples.

(A) Gene set enrichment analysis (GSEA) results after performing differential expression between MDS and young donors. Dot color represents enrichment direction, transparency represents statistical significance, and size represents the absolute NES. (B) UMAP with cells colored by Palantir probabilities for the erythroid trajectory. (C) Heatmap of gene expression trends for dynamic genes along the erythroid trajectory in young, elderly, and MDS donors. (D) Gene regulatory network of the identified regulons for MDS donors. Regulons were trimmed to include only targets with an importance score higher than the third quartile in each regulon. Node shape denotes gene-type identity, and color denotes cell population. Any target that can be assigned to multiple transcription factors is colored in gray. Important genes are labeled in red.

Author response image 1
MYC activity in HSPC from young and elderly donors.

Violin plots showing the activity of the MYC regulon obtained with SCENIC (top), and the expression of MYC target gene sets V1 (middle) and V2 (bottom), summarized in a score calculated with AUCell.

Author response image 2
Seurat classification scores.

Box plot describing the distribution of Seurat scores for cells classified as MEPs by Seurat and as HSCs by GLMnet.

Author response image 3
Expression of marker genes in the HSC compartment.

Dot plot depicting the normalized scaled expression of canonical marker genes in HSCs from the 5 young and 3 elderly healthy donors. Marker genes are colored by the cell population they characterize. Dot color represents expression levels, and dot size represents the percentage of cells that express a gene.

Author response image 4
HSC subclustering.

(A) UMAP visualization of HSC from young (left) and elderly (right) donors subjected to re-integration and unsupervised clustering. Cells are colored by clusters. (B) Bar plot showing the proportion of cells from each donor assigned to the different clusters. (C) UMAP plots for young (left) and elderly (right) HSC colored by the normalized expression of CDK6 (top left) and by the summarized expression of multiple gene signatures, quantified as scores calculated using the software AUCell.

Author response image 5
Activity of HSC-specific regulons.

Violin plots showing activity scores for the top 5 HSC-specific regulons generated by SCENIC in HSC from both young (left) and elderly (right) donors, separated by sub-clusters. Color indicates sub-cluster.

Additional files

Supplementary file 1

Parameters used for single-cell RNA sequencing (scRNA-seq) analysis.

https://cdn.elifesciences.org/articles/79363/elife-79363-supp1-v2.xlsx

Supplementary file 2

Cell-type specific markers for each of the studied conditions (adjusted p-value <0.01 and logFC >0.1).

https://cdn.elifesciences.org/articles/79363/elife-79363-supp2-v2.xlsx

Supplementary file 3

Cell-type proportion test between young donors, elderly donors, and conditions.

https://cdn.elifesciences.org/articles/79363/elife-79363-supp3-v2.xlsx

Supplementary file 4

Differential expression analysis results between conditions, per cell subpopulation.

https://cdn.elifesciences.org/articles/79363/elife-79363-supp4-v2.xlsx

Supplementary file 5

Clinical information from the donors and patients used in this study.

https://cdn.elifesciences.org/articles/79363/elife-79363-supp5-v2.xlsx

Supplementary file 6

Ranking of specific regulons per cell subpopulation and condition.

https://cdn.elifesciences.org/articles/79363/elife-79363-supp6-v2.xlsx

MDAR checklist

https://cdn.elifesciences.org/articles/79363/elife-79363-mdarchecklist1-v2.pdf

Cite this article

Ainciburu M, Ezponda T, Berastegui N, Alfonso-Pierola A, Vilas-Zornoza A, San Martin-Uriz P, Alignani D, Lamo-Espinosa J, San-Julian M, Jiménez-Solas T, Lopez F, Muntion S, Sanchez-Guijo F, Molero A, Montoro J, Serrano G, Diaz-Mazkiaran A, Lasaga M, Gomez-Cabrero D, Diez-Campelo M, Valcarcel D, Hernaez M, Romero JP, Prosper F (2023) Uncovering perturbations in human hematopoiesis associated with healthy aging and myeloid malignancies at single-cell resolution. eLife 12:e79363. https://doi.org/10.7554/eLife.79363