Unsupervised detection of fragment length signatures of circulating tumor DNA using non-negative matrix factorization

  1. Gabriel Renaud
  2. Maibritt Nørgaard
  3. Johan Lindberg
  4. Henrik Grönberg
  5. Bram De Laere
  6. Jørgen Bjerggaard Jensen
  7. Michael Borre
  8. Claus Lindbjerg Andersen
  9. Karina Dalsgaard Sørensen
  10. Lasse Maretty (corresponding author)
  11. Søren Besenbacher (corresponding author)
  1. Technical University of Denmark, Denmark
  2. Aarhus University, Denmark
  3. Karolinska Institute, Sweden
  4. Regional Hospital of West Jutland, Denmark
  5. Aarhus University Hospital, Denmark

Abstract

Sequencing of cell-free DNA (cfDNA) is currently being used to detect cancer by searching for both mutational and non-mutational alterations. Recent work has shown that the length distribution of cfDNA fragments from a cancer patient can inform tumor load and type. Here, we propose non-negative matrix factorization (NMF) of fragment length distributions as a novel and completely unsupervised method for studying fragment length patterns in cfDNA. Using shallow whole-genome sequencing (sWGS) of cfDNA from a cohort of patients with metastatic castration-resistant prostate cancer (mCRPC), we demonstrate that NMF accurately infers the true tumor fragment length distribution as an NMF component, and that the sample weights of this component correlate with circulating tumor DNA (ctDNA) levels (r=0.75). We further demonstrate that using several NMF components enables accurate cancer detection on data from various early-stage cancers (AUC=0.96). Finally, we show that NMF, when applied across genomic regions, can be used to discover fragment length signatures associated with open chromatin.
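The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the two source distributions (Gaussian peaks at 167 bp and 145 bp), the 40 samples, the fragment count per sample, and the helper names `nmf` and `peak` are all assumptions made for illustration, and plain multiplicative updates stand in for whatever NMF solver the study used. Each row of the input matrix is one sample's normalized fragment length histogram. NMF factorizes it into component length distributions (H) and per-sample weights (W), and the weight of the shorter component is compared with the simulated tumor fraction.

```python
import numpy as np

def nmf(V, k, n_iter=2000, seed=0, eps=1e-12):
    """Approximate V (samples x lengths) as W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        # Standard multiplicative updates for the squared-error objective
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    # Rescale so each row of H sums to 1 (a length distribution);
    # the inverse scaling goes into W, leaving W @ H unchanged.
    scale = H.sum(axis=1)
    return W * scale, H / scale[:, None]

# --- Synthetic data (all values below are illustrative assumptions) ---
lengths = np.arange(50, 351)  # fragment lengths in bp

def peak(mu, sd):
    """Discretized, normalized Gaussian over `lengths`."""
    p = np.exp(-0.5 * ((lengths - mu) / sd) ** 2)
    return p / p.sum()

healthy, tumor = peak(167, 15), peak(145, 20)

rng = np.random.default_rng(1)
true_frac = rng.uniform(0.0, 0.6, size=40)  # simulated tumor fraction per sample
mix = np.outer(1 - true_frac, healthy) + np.outer(true_frac, tumor)

# Sample fragment counts for each sample, then normalize rows to proportions
counts = np.array([rng.multinomial(1_000_000, p / p.sum()) for p in mix])
V = counts / counts.sum(axis=1, keepdims=True)

# --- Factorize and compare with the simulated tumor fraction ---
W, H = nmf(V, k=2)
mean_len = H @ lengths               # mean fragment length of each component
t = int(np.argmin(mean_len))         # treat the shorter component as tumor-like
share = W[:, t] / W.sum(axis=1)      # relative weight of that component
r = np.corrcoef(share, true_frac)[0, 1]
print(f"Pearson r between component weight and simulated fraction: {r:.2f}")
```

Because NMF solutions are not unique, the recovered components need not match the simulated source distributions exactly. In this toy setup the component weight is expected to track the simulated fraction, but the numbers carry no information about the real cohort.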

Data availability

Danish law requires ethical approval of any specific research aim and imposes restrictions on sharing of personal data. This means that the prostate cancer data used in this article cannot be uploaded to international databases. External researchers (academic or commercial) interested in analysing the prostate dataset (including any derivatives of it) will need to contact the Data Access Committee by email at kdso@clin.au.dk. The Data Access Committee consists of co-authors Karina Dalsgaard Sørensen and Michael Borre, and Ole Halfdan Larsen (Department Head Consultant, Department of Clinical Medicine, Aarhus University). Under Danish law, the authors may share the (pseudonymised) data only after prior approval from The Danish National Committee on Health Research Ethics (or similar) for the specific new research goal. The author (based in Denmark) has to submit the application for ethical approval, with the external researcher(s) as named collaborator(s). In addition to ethical approval, a Collaboration Agreement and a Data Processing Agreement are required, both of which must be approved by the legal office of the institution of the author (data owner) and the legal office of the institution of the external researcher (data processor). Raw fragment length distributions along with ctDNA% estimates are available in Supplementary File 1.

The following previously published data sets were used

Article and author information

Author details

  1. Gabriel Renaud

    Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  2. Maibritt Nørgaard

    Department of Molecular Medicine, Aarhus University, Aarhus N, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  3. Johan Lindberg

    Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
    Competing interests
    The authors declare that no competing interests exist.
  4. Henrik Grönberg

    Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
    Competing interests
    The authors declare that no competing interests exist.
  5. Bram De Laere

    Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
    Competing interests
    The authors declare that no competing interests exist.
  6. Jørgen Bjerggaard Jensen

    Department of Urology, Regional Hospital of West Jutland, Holstebro, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  7. Michael Borre

    Department of Urology, Aarhus University Hospital, Aarhus, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  8. Claus Lindbjerg Andersen

    Department of Molecular Medicine, Aarhus University, Aarhus N, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  9. Karina Dalsgaard Sørensen

    Department of Molecular Medicine, Aarhus University, Aarhus N, Denmark
    Competing interests
    The authors declare that no competing interests exist.
  10. Lasse Maretty

    Department of Molecular Medicine, Aarhus University, Aarhus N, Denmark
    For correspondence
    lasse.maretty@clin.au.dk
    Competing interests
    The authors declare that no competing interests exist.
  11. Søren Besenbacher

    Department of Molecular Medicine, Aarhus University, Aarhus N, Denmark
    For correspondence
    besenbacher@clin.au.dk
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-1455-1738

Funding

The Independent Research Fund Denmark (Sapere Aude Research Leader)

  • Søren Besenbacher

The Danish Cancer Society

  • Karina Dalsgaard Sørensen

The Central Denmark Region Health Fund

  • Karina Dalsgaard Sørensen

Aarhus Universitet (Graduate School of Health)

  • Maibritt Nørgaard

Direktør Emil C. Hertz og Hustru Inger Hertz Fond

  • Karina Dalsgaard Sørensen

KV Fonden

  • Karina Dalsgaard Sørensen

Raimond og Dagmar Ringgård-Bohns Fond

  • Karina Dalsgaard Sørensen

Beckett Fonden

  • Karina Dalsgaard Sørensen

Snedkermester Sophus Jacobsen og Hustru Astrid Jacobsens Fond

  • Karina Dalsgaard Sørensen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: The prostate study was approved by The National Committee on Health Research Ethics (#1901101) and notified to The Danish Data Protection Agency (#1-16-02-366-15). All patients provided written informed consent.

Copyright

© 2022, Renaud et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Gabriel Renaud, Maibritt Nørgaard, Johan Lindberg, Henrik Grönberg, Bram De Laere, Jørgen Bjerggaard Jensen, Michael Borre, Claus Lindbjerg Andersen, Karina Dalsgaard Sørensen, Lasse Maretty, Søren Besenbacher (2022)
Unsupervised detection of fragment length signatures of circulating tumor DNA using non-negative matrix factorization
eLife 11:e71569.
https://doi.org/10.7554/eLife.71569
