Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture

  1. Haiko Schurz (corresponding author)
  2. Vivek Naranbhai
  3. Tom A Yates
  4. James Gilchrist
  5. Tom Parks
  6. Peter J Dodd
  7. Marlo Möller
  8. Eileen G Hoal
  9. Andrew P Morris
  10. Adrian VS Hill
  11. International Tuberculosis Host Genetics Consortium
  1. DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, South Africa
  2. Wellcome Centre for Human Genetics, University of Oxford, United Kingdom
  3. Massachusetts General Hospital, United States
  4. Dana-Farber Cancer Institute, United States
  5. Centre for the AIDS Programme of Research in South Africa, South Africa
  6. Harvard Medical School, United States
  7. Division of Infection and Immunity, Faculty of Medical Sciences, University College London, United Kingdom
  8. Department of Paediatrics, University of Oxford, United Kingdom
  9. Department of Infectious Diseases Imperial College London, United Kingdom
  10. School of Health and Related Research, University of Sheffield, United Kingdom
  11. Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, United Kingdom
  12. Jenner Institute, University of Oxford, United Kingdom
4 figures, 2 tables and 3 additional files

Figures

Figure 1 with 5 supplements
Manhattan plot of p-values (more than three studies) from the MR-MEGA analysis of all 12 datasets with genomic control reveals one significant association in the HLA-II region of chromosome 6 (rs28383206).

Image produced using R scripts provided by MR-MEGA (Mägi et al., 2017), and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 1—figure supplement 1
QQ-plot (left) from the MR-MEGA analysis of all 12 datasets with genomic control correction, including two principal components (PCs) as covariates.

The lambda value of 1.00 suggests no significant inflation or deflation of p-values. Lambda values for the input data are shown on the right. The source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.
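As a rough illustration of how a lambda value like the one reported here can be obtained, the sketch below converts each p-value to a 1-df chi-square statistic and divides the observed median by the median expected under the null (about 0.455). The function names are hypothetical, and this is a generic textbook calculation, not the MR-MEGA implementation.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def p_to_chi2(p):
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile, so z <= 0
    return z * z


def genomic_inflation(p_values):
    """Genomic inflation factor: observed median chi-square / expected median."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


def gc_correct(chi2_stats, lam):
    """Genomic control: deflate statistics by lambda, but only when lambda > 1."""
    scale = max(lam, 1.0)
    return [c / scale for c in chi2_stats]
```

A lambda close to 1, as in the QQ-plot, means the observed median test statistic matches the null expectation, so the correction leaves the statistics essentially unchanged.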

Figure 1—figure supplement 2
Proportion of variants that had a significant change in association p-value (based on chi-square difference test) following the inclusion of the force of infection p-value for different p-value bins.
Figure 1—figure supplement 2—source data 1

Proportions of variants for each p-value bin that had a significant change in association p-value, based on chi-square test for significant difference.

https://cdn.elifesciences.org/articles/84394/elife-84394-fig1-figsupp2-data1-v1.xlsx
Figure 1—figure supplement 3
Forest plots for the suggestive chromosome 9 peaks, rs4576509 (left) and rs6477824 (right) for the trans-ethnic MR-MEGA analysis including all 12 cohorts.
Figure 1—figure supplement 3—source data 1

Odds ratios and 95% confidence intervals for each source data file used to plot the forest plot of the two suggestive associations rs4576509 and rs6477824.

https://cdn.elifesciences.org/articles/84394/elife-84394-fig1-figsupp3-data1-v1.xlsx
Figure 1—figure supplement 4
Forest plots for the suggestive chromosome 11 peak, rs12362545, for the trans-ethnic MR-MEGA analysis including all 12 cohorts.
Figure 1—figure supplement 4—source data 1

Odds ratios and 95% confidence intervals for each source data file used to plot the forest plot of the suggestive association rs12362545.

https://cdn.elifesciences.org/articles/84394/elife-84394-fig1-figsupp4-data1-v1.xlsx
Figure 1—figure supplement 5
Forest plots for the suggestive chromosome 16 peak, rs35787595, for the trans-ethnic MR-MEGA analysis including all 12 cohorts.
Figure 1—figure supplement 5—source data 1

Odds ratios and 95% confidence intervals for each source data file used to plot the forest plot of the suggestive association rs35787595.

https://cdn.elifesciences.org/articles/84394/elife-84394-fig1-figsupp5-data1-v1.xlsx
Figure 2
Regional association plot for the chromosome 6 HLA-II rs28383206 association in the multi-ancestry analysis revealing a significant peak in the HLA-II region.

Image produced using the online LocusZoom database with linkage disequilibrium (LD) mapping set to ‘all’ and p-values>0.01 removed (Boughton et al., 2021), and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 3 with 1 supplement
HLA conditioning analysis.

(A) Forest plot (odds ratio and 95% confidence interval) of the significant chromosome 6 association (rs28383206) for tuberculosis (TB) susceptibility in the multi-ancestry analysis, implemented using MR-MEGA with genomic control correction (GCC). Of the 12 studies included, 8 contained this variant. Studies that did not contain the variant are included in the plot but do not have results associated with them. (B) Forest plot for HLA DQA1*02:01 for the eight studies included in the HLA association analysis. Other studies included were obtained from literature searches of previous studies where HLA imputation and association studies were performed (Sveinbjornsson et al., 2016; Li et al., 2021; Zheng et al., 2018). For source data, see Figure 3—source data 1.
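The odds ratios and 95% confidence intervals shown in forest plots of this kind are typically derived from each study's log-odds effect estimate and its standard error. The sketch below shows that conversion; the function name and the example inputs are hypothetical and are not taken from the source data.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Return (OR, lower, upper) from a log-odds estimate and its standard error.

    z is the standard normal quantile for the interval (1.959964 for 95%).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical per-study estimate, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.15, se=0.05)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the odds ratio, which is why forest plots are usually drawn with a logarithmic x-axis.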

Figure 3—figure supplement 1
Results for the HLA class I and II meta-analysis of all studies overall (unconditioned) (top) and conditioned on the lead SNP for the six studies in which the lead SNP was present at minor allele frequency (MAF) > 2.5% (bottom).
Figure 3—figure supplement 1—source data 1

Association statistics used to plot the p-value distribution for the fixed-effects meta-analysis for each HLA locus for both the conditioned and unconditioned analyses.

https://cdn.elifesciences.org/articles/84394/elife-84394-fig3-figsupp1-data1-v1.xlsx
Figure 4 with 4 supplements
Principal component analysis (PCA) plot of all 12 studies based on the MR-MEGA mean pairwise genome-wide allele frequency differences.

Image produced using the R plot function. For source data, see Figure 4—source data 1.

Figure 4—figure supplement 1
Manhattan plot of all p-values (≥2 studies) for the European subgroup analysis.

Fixed effects (FE) model with genomic control correction (GCC) implemented in GWAMA, and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 4—figure supplement 2
Manhattan plot of all p-values (≥2 studies) for the Asian subgroup analysis.

Fixed effects (FE) model with genomic control correction (GCC) implemented in GWAMA, and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 4—figure supplement 3
Manhattan plot of all p-values (≥2 studies) for the African subgroup analysis.

Fixed effects (FE) model with genomic control correction (GCC) implemented in GWAMA, and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 4—figure supplement 4
QQ-plots for the region-specific fixed effects (FE) meta-analysis using genomic control correction (GCC) and implemented in GWAMA.

Source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.
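For readers unfamiliar with the fixed effects (FE) model used in the ancestry-specific analyses, the following is a minimal sketch of a generic inverse-variance-weighted meta-analysis of one variant across studies. It is not GWAMA's code; the function name and inputs are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis for one variant.

    betas: per-study effect estimates (log odds ratios)
    ses:   per-study standard errors
    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return beta, se, p
```

Studies with smaller standard errors (usually the larger cohorts) receive more weight, so the pooled estimate is dominated by the best-powered datasets within each ancestry group.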

Tables

Table 1
Summary of ITHGC TB-GWAS datasets.
Dataset | Population | Cases/controls | TB prevalence per 100,000 pa | Estimated proportion of controls ever exposed to Mtb (±SD)* | #SNPs | Genotyping platform | Reference
China 1 | Asian | 483/587 | 89 | 0.302 (0.101) | 7,710,153 | Affymetrix Genome-Wide Human SNP Array 6.0 | thye@bni-hamburg.de (unpublished)
China 2 | Asian | 1290/1145 | 89 | 0.302 (0.101) | 9,769,029 | Illumina Human OmniZhonghua-8 chips | magdakellis@gmail.com (unpublished)
China 3 | Asian | 972/1537 | 89 | 0.302 (0.101) | 9,726,450 | Illumina Human OmniZhonghua-8 chips | Qi et al., 2017
Thailand | Asian | 433/295 | 236 | 0.404 (0.112) | 6,723,358 | Illumina Human610-Quad | Mahasirimongkol et al., 2012
Japan | Asian | 751/3199 | 23 | 0.142 (0.125) | 9,051,051 | Illumina HumanHap550 | Mahasirimongkol et al., 2012
Russia | European | 5914/6022 | 109 | 0.191 (0.093) | 10,878,777 | Affymetrix Genome-Wide Human SNP Array 6.0 | Curtis et al., 2015
Estonia | European | 239/7047 | 13 | 0.116 (0.093) | 10,611,556 | Illumina 370K | andres.metspalu@ut.ee (unpublished)
Germany | European | 586/333 | 7.8 | 0.067 (0.081) | 10,602,193 | Illumina Omni2.5+exome | thye@bni-hamburg.de (unpublished)
Gambia | African | 1316/1382 | 126 | 0.280 (0.089) | 18,634,017 | Affymetrix GeneChip 500K | The Wellcome Trust Case Control Consortium, 2007
Ghana | African | 1359/1952 | 282 | 0.539 (0.198) | 19,029,214 | Affymetrix Genome-Wide Human SNP Array 6.0 | Thye et al., 2010
RSA(A) | African | 19/577 | 717 | 0.436 (0.127) | 9,227,330 | Affymetrix 500k | Daya et al., 2014b
RSA(M) | African | 410/405 | 717 | 0.436 (0.127) | 11,371,838 | Illumina MEGA array | Schurz et al., 2018
  1. GWAS, genome-wide association studies; ITHGC, International Tuberculosis Host Genetics Consortium; Mtb, Mycobacterium tuberculosis; TB, tuberculosis.

  2. * Estimated proportion of control individuals ever infected with Mtb by age 35–44 in 2010, based on data from Houben & Dodd.

  3. Raw genotyping data available.

  4. RSA(A/M): South African admixed population (RSA) Affymetrix (A) and MEGA (M) array data.

Table 2
Significant and suggestive associations (p-value ≤1e–5) for the multi-ancestry analysis including data from all 12 datasets implementing MR-MEGA analysis with GCC.
Marker name | Chromosome | Position | Gene | Location | CADD score | EA | NEA | EAF | Sample size | Datasets | p-Value
rs28383206 | 6 | 32575167 | HLA-DRB1 | Intergenic | 7.6 | G | A | 0.168 | 25,059 | 8 | 8.26e–09
  1. GCC, genomic control correction; EA, effect allele; EAF, effect allele frequency; NEA, noneffect allele.

Additional files

Supplementary file 1

Extended and supplementary tables.

(a) Summary of ITHGC TB-GWAS datasets. (b) Polygenic heritability estimates at different TB prevalence rates. (c) Suggestive associations (p-value ≤1e–5) for the multi-ancestry analysis including data from all 12 datasets implementing MR-MEGA analysis with GCC. (d) Results for the concordance in direction of effect analysis for all p-value thresholds and reference populations (for SNP selection). (e) Suggestive associations for the European and Asian ancestry-specific FE analysis (with GCC).

https://cdn.elifesciences.org/articles/84394/elife-84394-supp1-v1.docx
MDAR checklist
https://cdn.elifesciences.org/articles/84394/elife-84394-mdarchecklist1-v1.pdf
Source data 1

Collection of data and results that are not used for figures or discussed in depth in the article but may still be valuable for other researchers working on similar topics.

https://cdn.elifesciences.org/articles/84394/elife-84394-data1-v1.xlsx

Cite this article

  1. Haiko Schurz
  2. Vivek Naranbhai
  3. Tom A Yates
  4. James Gilchrist
  5. Tom Parks
  6. Peter J Dodd
  7. Marlo Möller
  8. Eileen G Hoal
  9. Andrew P Morris
  10. Adrian VS Hill
  11. International Tuberculosis Host Genetics Consortium
(2024)
Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture
eLife 13:e84394.
https://doi.org/10.7554/eLife.84394