Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture

  1. Haiko Schurz (corresponding author)
  2. Vivek Naranbhai
  3. Tom A Yates
  4. James J Gilchrist
  5. Tom Parks
  6. Peter J Dodd
  7. Marlo Möller
  8. Eileen G Hoal
  9. Andrew P Morris
  10. Adrian VS Hill
  11. International Tuberculosis Host Genetics Consortium
  1. DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, South Africa
  2. Wellcome Centre for Human Genetics, University of Oxford, United Kingdom
  3. Massachusetts General Hospital, United States
  4. Dana-Farber Cancer Institute, United States
  5. Centre for the AIDS Programme of Research in South Africa, South Africa
  6. Harvard Medical School, United States
  7. Division of Infection and Immunity, Faculty of Medical Sciences, University College London, United Kingdom
  8. Department of Paediatrics, University of Oxford, United Kingdom
  9. Department of Infectious Diseases Imperial College London, United Kingdom
  10. School of Health and Related Research, University of Sheffield, United Kingdom
  11. Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, United Kingdom
  12. Jenner Institute, University of Oxford, United Kingdom

Abstract

The heritability of susceptibility to tuberculosis (TB) disease has been well recognized. Over 100 genes have been studied as candidates for TB susceptibility, and several variants were identified by genome-wide association studies (GWAS), but few replicate. We established the International Tuberculosis Host Genetics Consortium to perform a multi-ancestry meta-analysis of GWAS, including 14,153 cases and 19,536 controls of African, Asian, and European ancestry. Our analyses demonstrate a substantial degree of heritability (pooled polygenic h2 = 26.3%, 95% CI 23.7–29.0%) for susceptibility to TB that is shared across ancestries, highlighting an important host genetic influence on disease. We identified one global host genetic correlate for TB at genome-wide significance (p<5e–8) in the human leukocyte antigen (HLA)-II region (rs28383206, p-value=5.2e–9) but failed to replicate variants previously associated with TB susceptibility. These data demonstrate the complex shared genetic architecture of susceptibility to TB and the importance of large-scale GWAS analysis across multiple ancestries experiencing different levels of infection pressure.

Editor's evaluation

This article describes an important multi-ancestry meta-analysis of genome-wide association studies of susceptibility to tuberculosis. It demonstrates substantial heritability from common genetic variants, although this varies across studies. The main finding of the article is a variant in the HLA region that affects tuberculosis risk, for which the evidence is solid. The results and methods will be of interest to infectious disease researchers and human genetics researchers. The article highlights both the promise and challenges of performing multi-ancestry genetic association studies of infectious disease risk.

https://doi.org/10.7554/eLife.84394.sa0

Introduction

Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb) and related species, remains a leading cause of death globally. Around one-quarter of the global population is estimated to show immunological evidence of prior exposure to Mtb (Houben and Dodd, 2016), and in 2019 an estimated 10 million people developed the disease, resulting in 1.4 million deaths (WHO, 2020). This disease burden could be substantially reduced with action to address the social determinants of disease and equitable scale-up of existing interventions. However, tools to prevent, diagnose, and treat TB could be improved if a better understanding of the underpinning pathophysiology could help identify those at greatest risk of the disease.

The role of host genetic factors in TB susceptibility has long been of significant interest. Over 100 candidate genes have been studied, but few associations have proven reproducible (Naranbhai, 2016). This failure to replicate may be a result of the modest size of many TB genome-wide association studies (GWAS), variability in phenotyping between studies, the impact of population-specific effects, the challenge of complex population structure in some high-burden settings (e.g., admixed individuals), and, possibly, pathogen variation (Correa-Macedo et al., 2019; Daya et al., 2014a; Luo et al., 2019; Möller and Kinnear, 2020; Müller et al., 2021; Omae et al., 2017; Schurz et al., 2018). Seventeen GWAS have been reported but only two loci replicate between studies (Daya et al., 2014a; Schurz et al., 2018; Chimusa et al., 2014; The Wellcome Trust Case Control Consortium, 2007; Curtis et al., 2015; Mahasirimongkol et al., 2012; Qi et al., 2017; Thye et al., 2010; Thye et al., 2012; Quistrebert et al., 2021; Sveinbjornsson et al., 2016; Hong et al., 2017; Li et al., 2021; Luo et al., 2019; Zheng et al., 2018; Grant et al., 2016; Png et al., 2012). The WT1 locus, identified in cohorts from Ghana and Gambia, replicated in South Africa and Russia. The ASAP1 locus identified in Russia was replicated through reanalysis of prior studies (Correa-Macedo et al., 2019; Möller and Kinnear, 2020).

To address these challenges, we established the International Tuberculosis Host Genetics Consortium (ITHGC) to study the host genetics of disease through collaborative and equitable data sharing (Naranbhai, 2016). The ITHGC includes 12 case–control GWAS from nine countries in Europe, Africa, and Asia (total of 14,153 pulmonary TB cases and 19,536 healthy controls). Inclusion of multiple ancestral groups in a multi-ancestry meta-analysis has the advantage of maximizing power and enhancing fine-mapping resolution to identify true globally associated variants that influence TB susceptibility across population groups.

Here we present the first analyses of the ITHGC dataset exploring host genetic correlates of TB susceptibility using a multi-ancestry meta-analysis approach, including fine-mapping of human leukocyte antigen (HLA) loci and estimation of genetic heritability.

Results

Study overview

In total, 12 GWAS from three major ancestral groups (European, African, and Asian) were included in this study (Table 1; a more detailed table outlining the selection of cases and controls is provided in Supplementary file 1a). All individual datasets were imputed and aligned to the same reference allele before association testing, using an additive genetic model, to obtain odds ratios (OR) and p-values to be used in the meta-analysis. For each individual study (for which we had raw genotyping data), the polygenic heritability was estimated, and HLA alleles were imputed for fine-mapping of the HLA regions.

Table 1
Summary of ITHGC TB-GWAS datasets.
Dataset | Population | Cases/controls | TB prevalence per 100,000 pa | Estimated proportion of controls ever exposed to Mtb (±SD)* | #SNPs | Genotyping platform | Reference
China 1 | Asian | 483/587 | 89 | 0.302 (0.101) | 7,710,153 | Affymetrix Genome-Wide Human SNP Array 6.0 | thye@bni-hamburg.de (unpublished)
China 2 | Asian | 1290/1145 | 89 | 0.302 (0.101) | 9,769,029 | Illumina Human OmniZhonghua-8 chips | magdakellis@gmail.com (unpublished)
China 3 | Asian | 972/1537 | 89 | 0.302 (0.101) | 9,726,450 | Illumina Human OmniZhonghua-8 chips | Qi et al., 2017
Thailand | Asian | 433/295 | 236 | 0.404 (0.112) | 6,723,358 | Illumina Human610-Quad | Mahasirimongkol et al., 2012
Japan | Asian | 751/3199 | 23 | 0.142 (0.125) | 9,051,051 | Illumina HumanHap550 | Mahasirimongkol et al., 2012
Russia | European | 5914/6022 | 109 | 0.191 (0.093) | 10,878,777 | Affymetrix Genome-Wide Human SNP Array 6.0 | Curtis et al., 2015
Estonia | European | 239/7047 | 13 | 0.116 (0.093) | 10,611,556 | Illumina 370K | andres.metspalu@ut.ee (unpublished)
Germany | European | 586/333 | 7.8 | 0.067 (0.081) | 10,602,193 | Illumina Omni2.5+exome | thye@bni-hamburg.de (unpublished)
Gambia | African | 1316/1382 | 126 | 0.280 (0.089) | 18,634,017 | Affymetrix GeneChip 500K | The Wellcome Trust Case Control Consortium, 2007
Ghana | African | 1359/1952 | 282 | 0.539 (0.198) | 19,029,214 | Affymetrix Genome-Wide Human SNP Array 6.0 | Thye et al., 2010
RSA(A) | African | 19/577 | 717 | 0.436 (0.127) | 9,227,330 | Affymetrix 500k | Daya et al., 2014b
RSA(M) | African | 410/405 | 717 | 0.436 (0.127) | 11,371,838 | Illumina MEGA array | Schurz et al., 2018
  1. GWAS, genome-wide association studies; ITHGC, International Tuberculosis Host Genetics Consortium; Mtb, Mycobacterium tuberculosis; TB, tuberculosis.

  2. * Estimated proportion of control individuals ever infected with Mtb by age 35–44 in 2010, based on data from Houben & Dodd.

  3. Raw genotyping data available.

  4. RSA(A/M): South African admixed population (RSA) Affymetrix (A) and MEGA (M) array data.

The summary statistics from the individual GWAS of each dataset were used to conduct a combined, multi-ancestry meta-analysis using MR-MEGA and ancestry-specific (European, African, and Asian) fixed effects (FE) meta-analyses using GWAMA. Finally, the impact of infection pressure on the multi-ancestry meta-regression was assessed and the concordance in direction of effect for the reference allele between studies was investigated.
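As a rough illustration of the fixed-effects step described above, the sketch below pools per-study log odds ratios using inverse-variance weights. It is a minimal sketch of the general technique and not GWAMA's own code; the function name `fixed_effects_meta` and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def fixed_effects_meta(log_ors, ses):
    """Inverse-variance weighted fixed-effects pooling of per-study log odds ratios.

    log_ors: per-study effect estimates (natural log of the odds ratio)
    ses:     per-study standard errors of those estimates
    Returns (pooled log OR, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return pooled, pooled_se, p_value


# Hypothetical example: two studies reporting the same effect for one variant.
pooled, pooled_se, p_value = fixed_effects_meta([-0.1, -0.1], [0.05, 0.05])
print(math.exp(pooled), pooled_se, p_value)
```

Pooling on the log odds ratio scale keeps the estimate symmetric around the null; the pooled odds ratio is recovered by exponentiating the result.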

Polygenic heritability estimates suggest a genetic contribution to TB disease susceptibility

Twin studies estimate the narrow-sense heritability of susceptibility to TB at up to 80% (Diehl and Von, 1936; Kallmann and Reisner, 1943; Comstock, 1978), but there are few modern estimates. Using raw (unimputed) genotyping data, and assuming population prevalence of disease in each study population equivalent to the reported WHO prevalence rates for that country (WHO, 2020), we estimated polygenic heritability of susceptibility to TB in 10 contributing studies which ranged from 5 to 36% (average of 26.3%, Supplementary file 1b). Comparisons of the heritability estimates between studies from different geographical locations do not take into consideration the differences in environmental pressures between the included studies, and as such these estimates of heritability are only interpretable if the distribution of nongenetic determinants of TB is held constant (Pearce, 2011). Furthermore, variations in phenotype definition can have an impact on heritability estimates (Supplementary file 1a). This is supported by previous research in which significant differences in polygenic heritability estimates were identified between subjects with latent TB infection (LTBI), active TB, and subjects classified as resistors (McHenry et al., 2021a). As this study includes data with varying methods of classifying TB cases and healthy controls (Supplementary file 1a), there is potential for a degree of heterogeneity and misclassification (between cases and controls) that can have an impact on the heritability estimates. Recent history has seen the near elimination of TB in several countries in association with economic development and public health action. However, while improvement of socioeconomic standing and environment has a stronger impact than host genetics, these crude estimates of polygenic heritability do indicate that TB susceptibility is, in part, heritable. Future, more rigorous investigations that account for population stratification are required to obtain more accurate heritability estimates, narrow down the level of heritable risk, and pinpoint the genomic loci involved.
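To show how an assumed population prevalence can enter a case–control heritability estimate, the sketch below applies the widely used liability-threshold transformation from the observed scale to the liability scale. That this matches the exact steps of the authors' pipeline is an assumption; the function name and the example values are hypothetical.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, prevalence, case_fraction):
    """Convert observed-scale heritability to the liability scale.

    prevalence:    assumed population prevalence of disease (K)
    case_fraction: proportion of cases in the study sample (P)
    Uses h2_liab = h2_obs * [K(1-K)]^2 / (z^2 * P(1-P)), where z is the
    standard normal density at the liability threshold for prevalence K.
    """
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - prevalence)
    z = nd.pdf(threshold)
    k_term = (prevalence * (1.0 - prevalence)) ** 2
    return h2_observed * k_term / (z ** 2 * case_fraction * (1.0 - case_fraction))


# Hypothetical inputs: observed-scale estimate 0.30, prevalence 0.5%, 45% cases.
print(observed_to_liability(0.30, 0.005, 0.45))
```

Because the assumed prevalence sits inside the scaling factor, two studies with the same observed-scale estimate can yield different liability-scale values, which is one reason cross-country comparisons need care.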

Multi-ancestry meta-analysis identifies susceptibility loci for TB

For the primary multi-ancestry meta-analysis, MR-MEGA was used as it allows for differences in allelic effects of variants on disease risk between GWAS. Principal components (PCs), derived from a matrix of similarities in allele frequencies between GWAS, were plotted and revealed distinct separation between the three main ancestral groups included in the study (Figure 4). To account for this, the first two PCs were included as covariates in MR-MEGA as they sufficiently accounted for the allele frequency differences between the study populations, as assessed via a QQ-plot and associated lambda inflation value (Figure 1—figure supplement 1, lambda = 1.00). In total, 26,620,804 variants with a minor allele frequency (MAF) > 1% and present in at least three studies were included in the analysis, of which 3,184,478 were present in all 12 datasets.
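The lambda inflation value mentioned above is commonly computed as the median observed test statistic divided by the median expected under the null. The sketch below assumes 1-degree-of-freedom chi-square statistics derived from two-sided p-values; MR-MEGA's own computation may differ, and the function name is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231195724


def genomic_inflation(p_values):
    """Estimate lambda from two-sided p-values (each strictly between 0 and 1]."""
    nd = NormalDist()
    # A two-sided p-value maps to a squared z-score, i.e. a 1-df chi-square.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical p-values spread evenly over (0, 1), as expected under the null.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_inflation(null_like))
```

A value close to 1.00 indicates that the bulk of the test statistics follow the null distribution, which is the pattern the QQ-plot is used to check.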

A significant association peak on chromosome 6 was identified in the HLA class II region (Figure 1). One variant (rs28383206, OR = 0.89, CI = 0.84–0.94, p-value=8.26 × 10–9) within this peak was associated with susceptibility to TB at genome-wide significance (p<5.0e–8, Figures 1–3, Table 2). Both the residual heterogeneity (p-value=0.012) and ancestry-correlated heterogeneity (p-value=5.28e–6) are significant (p-value<0.05) for the associated variant. However, the evidence of ancestry-correlated heterogeneity is much stronger than for residual heterogeneity, indicating that genetic ancestry contributes more to differences in effect sizes between GWAS than does study design (e.g., phenotyping differences and potential case–control misclassification). The association peak encompasses many HLA-II genes, including HLA-DRB1/5 (major histocompatibility complex, class II, DR beta 1/5), HLA-DQA1 (major histocompatibility complex, class II, DQ alpha 1), and HLA-DQB3 (major histocompatibility complex, class II, DQ beta 3, Figures 1 and 2). While not reaching genome-wide significance, the HLA class I locus is also indirectly tagged through the association with rs2621322, in the TAP2 (transporter 2, ATP binding cassette subfamily B member) gene, which encodes a transporter protein that restores surface expression of MHC class I molecules and has previously been implicated in TB susceptibility (Thu et al., 2016). HLA-A, DQA1, DQB1, DRB1, and TAP2 genes have previously been linked to TB susceptibility through TB candidate gene and GWAS analysis (Thu et al., 2016; Kinnear et al., 2017; Stein et al., 2017; Sveinbjornsson et al., 2016; Zhang et al., 2021). The HLA-II locus encodes several proteins crucial in antigen presentation, including HLA-DR, HLA-DQ, and HLA-DP, which are widely implicated in susceptibility to infection and autoimmunity (Kelly and Trowsdale, 2019; Shiina et al., 2009).

Figure 1 with 5 supplements
Manhattan plot of p-values (more than three studies) from the MR-MEGA analysis of all 12 datasets with genomic control reveals one significant association in the HLA-II region of chromosome 6 (rs28383206).

Image produced using R scripts provided by MR-MEGA (Mägi et al., 2017), and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 2
Regional association plot for the chromosome 6 HLA-II rs28383206 association in the multi-ancestry analysis revealing a significant peak in the HLA-II region.

Image produced using the online LocusZoom database with linkage disequilibrium (LD) mapping set to ‘all’ and p-values>0.01 removed (Boughton et al., 2021), and source data file has been uploaded to https://doi.org/10.5061/dryad.6wwpzgn2s.

Figure 3 with 1 supplement
HLA conditioning analysis.

(A) Forest plot (odds ratio and 95% confidence interval) of the significant chromosome 6 association (rs28383206) for tuberculosis (TB) susceptibility in the multi-ancestry analysis, implemented using MR-MEGA with genomic control correction (GCC). Of the 12 studies included, 8 contained this variant. Studies that did not contain the variant are included in the plot but do not have results associated with them. (B) Forest plot for HLA DQA1*02:01 for the eight studies included in the HLA association analysis. Other studies included were obtained from literature searches of previous studies where HLA imputation and association studies were performed (Sveinbjornsson et al., 2016; Li et al., 2021; Zheng et al., 2018). For source data, see Figure 3—source data 1.

Table 2
Significant and suggestive associations (p-value ≤1e–5) for the multi-ancestry analysis including data from all 12 datasets implementing MR-MEGA analysis with GCC.
Marker name | Chromosome | Position | Gene | Location | CADD score | EA | NEA | EAF | Sample size | Datasets | p-Value
rs28383206 | 6 | 32575167 | HLA-DRB1 | Intergenic | 7.6 | G | A | 0.168 | 25,059 | 8 | 8.26e–09
  1. GCC, genomic control correction; EA, effect allele; EAF, effect allele frequency; NEA, noneffect allele.

HLA-II

Given the strong association peak in the HLA-II locus (Figures 1 and 2), we imputed HLA-II alleles to fine-map this association. HLA alleles were imputed using the HIBAG R package that utilizes both genotyping array and population-specific reference panels to obtain the most accurate imputations for each individual dataset. Association testing was then conducted using an additive genetic model for each individual dataset before meta-analyzing the results (Source data 1, sheets 11–15).

Notwithstanding inconsistency across populations, the strongest signal in the combined global analyses is at DQA1*02:01, revealing a protective effect (OR = 0.88, 95% CI = 0.82–0.93, p-value=1.3e–5, Figure 3B). The signal remains apparent in the six populations with the lead SNP at MAF > 2.5% and individual-level data available (p-value=0.0003). After conditioning on the lead SNP (rs28383206) in this subset, there is no residual significant association at DQA1*02:01 (p-value=0.44, Figure 3—figure supplement 1), suggesting that the classical allele is tagging the rs28383206 association. This observation is consistent with previous observations of HLA analysis in Icelandic (DQA1*02:01: OR = 0.82, p-value=7.39e–4) and Han Chinese populations (DQA1*02:01: OR = 0.82, p-value=7.39e–4), but showed opposite direction of effect in another Chinese population (DQA1*02:01: OR = 1.28, p-value=0.0193, Figure 3B; Sveinbjornsson et al., 2016; Li et al., 2021; Zheng et al., 2018).

The significant HLA associations overlap with the association peak observed in the multi-ancestry meta-analysis (Figure 2) but show more consistency in the direction of effects between the input studies compared to the lead SNPs identified in the association peak. This suggests that the rs28383206 association in the meta-analysis is tagging an HLA allele, where the different linkage disequilibrium (LD) patterns from the included ancestral populations result in the differences in effect sizes between populations at the rs28383206 association.

This variation in significant associations is, in part, attributable to the observed variation in HLA allele frequencies across all the included studies and may also reflect differential tagging of at least one unknown causal variant across populations (Source data 1, sheets 16–22).

The variable role of classical HLA alleles in different populations could be partially due to unique infectious pressures that each geographical region faces and could also explain why different strains of Mtb are more or less prevalent in different regions as they adapted to the HLA profile of the population within this region. Sequencing efforts of global mycobacterial isolates find hyperconservation of class II epitopes, suggesting pathogen advantage achieved through limiting HLA-II recognition and highlighting the potential complex interplay between pathogen and host evolution in modifying class II presentation in TB infection (Comas et al., 2010). Previous work has shown evidence of interaction between genetic variants of the host and specific strains of Mtb in Ghanaian, Ugandan, South African, and Asian populations (Möller and Kinnear, 2020; Müller et al., 2021; Correa-Macedo et al., 2019; Salie et al., 2014; Luo et al., 2015; Wampande et al., 2019; Micheni et al., 2021; McHenry et al., 2021b; McHenry et al., 2020). These interactions provide further evidence that Mtb may have undergone substantial genetic evolution, in concert with host migration and evolution of different populations (Comas et al., 2013; Coscolla and Gagneux, 2014). Some studies suggest that HLA-II epitopes may have undergone regional mutations that modify HLA-II binding, and we speculate that the heterogeneity observed in HLA-II associations between regions may, at least in part, be accounted for by different pressures exerted by varying strains of Mtb (Copin et al., 2016).

Impact of infection pressure on meta-regression

To further understand the heterogeneity across populations, we attempted to account for variation in levels of prior exposure that could serve to mask host effects given that not all controls will have been exposed to Mtb. In low transmission settings, more susceptible but unexposed individuals would be included as controls, who, had they been exposed to Mtb, might have progressed to TB disease. Overall, including each cohort’s estimated prevalence of prior exposure had a significant impact on the residual heterogeneity and association statistics of 5% of the variants included in the meta-analysis (419,460/8,355,367), which, at a significance level of p-value<0.05, is what would be expected purely by chance. Separating the results into bins according to p-values revealed that the bins where the covariate had the biggest impact were for p-values in the range of 1e–3 to 1e–5 (Figure 1—figure supplement 2), while significant and suggestive associations reported in this study did not show any significant changes in residual heterogeneity. While the proportion of variants significantly impacted when correcting for infection pressures is low and has the biggest impact on variants with larger p-values, there was still an overall reduction in the chi-square value for the residual heterogeneity (mean chi-square value reduced by 10). This suggests that, while accounting for potential lifetime infections does explain some of the observed residual heterogeneity, it is most likely not the main driving force for these residuals.
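The comparison with chance in the paragraph above is simple arithmetic on the two reported counts, reproduced here as a quick check. The 0.05 level is the significance threshold stated in the text; the variable names are illustrative only.

```python
# Counts reported in the text above.
n_impacted = 419_460
n_tested = 8_355_367

# Proportion of variants whose statistics changed significantly.
fraction = n_impacted / n_tested

# Number expected to pass p < 0.05 by chance alone if no variant were affected.
expected_by_chance = 0.05 * n_tested

print(f"{fraction:.4f}")          # about 0.0502
print(round(expected_by_chance))  # about 417768
```

The observed count exceeds the chance expectation by well under 1% of that expectation, which is why the text treats the 5% figure as consistent with chance.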

When considering the impact of force of infection, it is important to consider not only the proportion of controls ever exposed but also the impact of recurrent exposure. There is some evidence to suggest that genetic barriers to progression to TB may be overcome if the infectious dose is high (Fox et al., 1929). Repeated exposure may be observed where TB prevalence is high, as in South Africa, and could contribute to the overall lower effect sizes observed in the GWAS enrolling RSA people. Inclusion of potential lifetime infections in meta-regression could help adjust for these effects and prove useful for not only TB, but meta-analysis of infectious diseases in general, and should be further explored.

Other suggestive loci that did not reach significance

There were four loci with suggestive associations and strong peaks on the Manhattan plot (Figure 1) that did not reach significance but should still be considered as potential variants of interest (Supplementary file 1c). One chr9 peak (rs4576509, p-value=7.40e–07) was intergenic (Figure 1—figure supplement 3) while the second (rs6477824, p-value=2.99e–07) is located in the 5′-UTR region of the zinc finger protein 483 (ZNF483) gene (Figure 1—figure supplement 3), previously associated with age at menarche (Demerath et al., 2013; Elks et al., 2010). The chromosome 11 peak (rs12362545, p-value=1.24e–06) is located in the PPFIA binding protein 2 (PPFIBP2) gene (Figure 1—figure supplement 4), which plays a role in axon guidance and neuronal synapse development and has previously been implicated in cancer development (Colas et al., 2011; Wu et al., 2018). The final peak (rs35787595, p-value=5.41e–06), on chromosome 16 (Figure 1—figure supplement 5), is located in the region of the craniofacial development protein 1 (CFDP1) gene, which is involved in chromatin organization (Messina et al., 2017). These genes have not been previously linked to TB susceptibility and a potential role is unclear; further validation of these variants is therefore needed before any conclusions on their impact on TB susceptibility can be drawn.

Ancestry-specific meta-analysis

Concordance in the direction of effects of the risk allele between the ancestry-specific meta-analyses was examined to determine whether significant enrichment (above the expected 50%) exists at different p-value thresholds. Significant enrichment in the concordance of direction of effect was only observed when using the European ancestry as reference compared to the African meta-analysis results for SNPs with p-values>0.001 and <0.01 (p-value=0.0061, Supplementary file 1d). The lack of enrichment between the ancestries suggests significant ancestry-specific associations, which could be further compounded by the differences in local infection pressures. Due to the lack of concordance and the separation of the ancestral populations in the principal component analysis (PCA) plot (Figure 4), ancestry-specific meta-analysis was done.
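One simple way to test whether concordance in direction of effect exceeds the 50% expected by chance is a one-sided exact binomial test, sketched below. Whether the authors used this exact test is an assumption; the function name and the example counts are hypothetical.

```python
from math import comb


def concordance_enrichment_p(n_concordant, n_total):
    """One-sided exact binomial p-value under a 50% chance of concordance.

    Returns the probability of observing at least n_concordant SNPs with the
    same direction of effect out of n_total when each agrees with probability 0.5.
    """
    upper_tail = sum(comb(n_total, k) for k in range(n_concordant, n_total + 1))
    return upper_tail / 2 ** n_total


# Hypothetical example: 60 of 100 independent SNPs share direction of effect.
print(concordance_enrichment_p(60, 100))
```

The test assumes the SNPs are independent, so in practice variants would usually be pruned for linkage disequilibrium before counting.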

Figure 4 with 4 supplements
Principal component analysis (PCA) plot of all 12 studies based on the MR-MEGA mean pairwise genome-wide allele frequency differences.

Image produced using the R plot function. For source data, see Figure 4—source data 1.

The PCA plot (Figure 4) for the 12 studies (based on mean pairwise genome-wide allele frequency differences calculated by MR-MEGA) illustrates distinct separation between the three major population groups (Asia, Europe, and Africa). The separation observed between the African studies (Gambia/Ghana and RSA) is due to the high level of admixture in the RSA population. The RSA population is a five-way admixed South African population with genetic contributions from Bantu-speaking African, KhoeSan, European, and South and South East Asian populations, which explains the observed shift in the PCA plot (Daya et al., 2013; Figure 4).

QQ-plots for the ancestry-specific analysis show no significant inflation or deflation. After removing associations without any clear peaks on the Manhattan plots (associations driven by a single study), we found no significant associations for the ancestry-specific analysis. However, suggestive peaks that did not reach genome-wide significance were identified in the European and Asian ancestry-specific analyses (Figure 4—figure supplements 1 and 2, Supplementary file 1e). Potential causes for the lack of associations and suggestive peaks in the African analysis (Figure 4—figure supplement 3) are the increased genetic diversity within Africa, the inclusion of admixed samples (RSA), and the smaller sample size compared to the other ancestry-specific meta-analysis. While power can be increased through inclusion of greater genetic diversity, between-subpopulation differences in allele frequency can introduce confounding. Confounding by genetic background can result in both spurious associations and the masking of true associations. Such confounding may explain why the results observed elsewhere may not replicate in admixed samples. Removing the admixed data and analyzing only the Gambian and Ghanaian datasets also did not produce any significant results, although, clearly, the sample size was smaller.

For the European analysis (Figure 4—figure supplement 1), suggestive peaks were identified on chromosomes 6 (rs28383206, p-value=7.06e–08), 8 (rs3935174, p-value=1.00e–06), and 11 (rs12362545, p-value=1.06e–07, Supplementary file 1e), while the Asian (Figure 4—figure supplement 2) analysis identified suggestive peaks on chromosome 6 (rs146049519, p-value=1.06e–06) and 8 (rs62495207, p-value=5.10e–06, Supplementary file 1e).

The suggestive peaks on chromosomes 6 and 11 in the European subgroup analysis overlap with the suggestive peaks of the multi-ancestry meta-analysis (Figure 1, Figure 4—figure supplement 4, Supplementary file 1e), but the suggestive peak on chromosome 8 is unique to this population (Figure 4—figure supplement 1, Supplementary file 1e). The strongest signal for this peak (rs3935174, OR = 0.87, p-value=1.00e–6) is located in the ArfGAP with SH3 domain, ankyrin repeat, and PH domain 1 (ASAP1) region, which encodes an ADP-ribosylation factor (ARF) GTPase-activating protein and is potentially involved in the regulation of membrane trafficking and cytoskeleton remodeling (Brown et al., 1998). Variants in ASAP1 (rs4733781 and rs10956514) have previously been linked to TB susceptibility in a TB-GWAS analysis of the same Russian population included here (Curtis et al., 2015). While these ASAP1 variants were present in all 12 studies and had consistent direction of effects, they presented with a strong signal in the European ancestry-specific analysis only (African and Asian p-values all ≥ 0.1). These differences in association were not driven by allele frequency differences as they are similar between the included study populations. A possible explanation for the association being observed only in the European meta-analysis is that the association is driven by the Russian dataset. rs4733781 has a strong signal in the Russian dataset (p-value=2.96e–7), but very weak signals in all other populations included in the analysis (p-value>0.01) and is in LD with rs3935174 (r2 = 0.6935 and D’ = 0.8791) identified in our analysis. rs4733781 also did not replicate in a previous GWAS from Iceland (Sveinbjornsson et al., 2016), further suggesting that this association is not specific to European populations, but rather driven by the large Russian dataset included in this study.
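For readers less familiar with the two LD measures quoted above, the sketch below computes r2 and D' for a pair of biallelic variants from haplotype and allele frequencies. The function name and the input frequencies are hypothetical and are not taken from the study data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A at the first variant
          and allele B at the second
    p_a:  frequency of allele A; p_b: frequency of allele B
    Both variants must be polymorphic (frequencies strictly between 0 and 1).
    """
    d = p_ab - p_a * p_b
    # D' rescales D by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies for illustration.
print(ld_stats(p_ab=0.25, p_a=0.30, p_b=0.40))
```

D' can be high even when r2 is moderate, because r2 also depends on how closely the allele frequencies of the two variants match.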

The suggestive peak on chromosome 8 in the Asian subgroup analysis lies in an intergenic region (Figure 4—figure supplement 2, Supplementary file 1e) and the link to TB susceptibility is unclear. Finally, the suggestive region on chromosome 6 overlaps with the significant peak from the multi-ancestry analysis (Figure 1 and Figure 4—figure supplement 2) and is located in the major histocompatibility complex, class II, DR beta 1 (HLA-DRB1), as discussed above (Figure 4—figure supplement 2, Supplementary file 1e).

Prior associations

To determine whether associations from previously published TB-GWAS, TB candidate SNPs, and SNPs within candidate gene studies replicate in this meta-analysis, we extracted all significant and suggestive associations from prior analyses and compared these to our multi-ancestry and ancestry-specific meta-analysis results (Luo et al., 2019; Schurz et al., 2018; Chimusa et al., 2014; The Wellcome Trust Case Control Consortium, 2007; Curtis et al., 2015; Mahasirimongkol et al., 2012; Qi et al., 2017; Thye et al., 2010; Thye et al., 2012; Quistrebert et al., 2021; Hong et al., 2017; Zheng et al., 2018; Grant et al., 2016; Png et al., 2012; Daya et al., 2014b). In total, 44 SNPs and 36 genes were identified from the GWAS catalog, of which 33 SNPs and all candidate genes were present in our data (Source data 1, sheet 2). We also extracted the association statistics for a further 90 previously identified candidate genes from our multi-ancestry and population-specific meta-analysis results (Source data 1, sheet 2; Naranbhai, 2016).

Using a Bonferroni-corrected p-value of 0.0015 for the number of SNPs tested (33) as the significance threshold for replication, two candidate SNPs (rs4733781: p-value=3.22e–5; rs10956514: p-value=0.000118; Source data 1, sheets 3 and 4) replicated in the multi-ancestry meta-analysis, both located in the ASAP1 gene region (Curtis et al., 2015; Chen et al., 2019; Wang et al., 2018). However, as discussed in the previous section, these associations are driven by the Russian dataset, which is the same data used by Curtis et al., 2015, where these associations were originally discovered. Because the Russian dataset included in our analysis is the only one presenting a strong signal for these variants, and they did not replicate in any other population, there is no independent evidence for these candidate SNPs.
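The replication threshold quoted above can be reproduced directly. The sketch assumes a family-wise alpha of 0.05, which is implied by, but not stated alongside, the reported value of 0.0015; the p-values are the two given in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumed alpha of 0.05 divided by the 33 SNPs tested.
threshold = bonferroni_threshold(0.05, 33)

# p-values reported in the text for the two ASAP1 variants.
reported = {"rs4733781": 3.22e-5, "rs10956514": 1.18e-4}
replicated = {snp: p < threshold for snp, p in reported.items()}

print(round(threshold, 4))  # 0.0015
print(replicated)
```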

For the Asian ancestry-specific analysis, the replicated variant was rs41553512, located in the HLA-DRB5 gene (p-value=3.53e–05). HLA-DRB5 is located within the HLA-II region identified in the multi-ancestry meta-analysis (Figure 1) and was previously identified by Qi et al., 2017 in a Han Chinese population. The African ancestry-specific analysis did not replicate previous associations, with the lowest p-value at rs6786408 in the FOXP1 gene (p-value=0.023). While this variant was previously identified in a North African cohort, the fact that it does not replicate here could be because of the genetic diversity within Africa and specifically the variability introduced by the five-way admixed South African population.

Discussion

This large-scale, multi-ethnic meta-analysis of genetic susceptibility to TB, involving 14,153 cases and 19,536 controls, identified one risk locus achieving genome-wide significance, and further investigation of this region revealed significant classical HLA allele associations. This association is noteworthy given that the same allele also shows association in other studies (Kinnear et al., 2017; Stein et al., 2017).

Based on the significant association, rs28383206, in the HLA region identified in this multi-ancestry analysis (Figure 3A), HLA-specific imputation and association testing were done to fine-map the region and identify potential HLA alleles driving this association. HLA DQA1*02:01 had the strongest signal in the meta-analysis across the eight included studies (Figure 3B), but this signal disappeared when conditioning on the significant SNP (rs28383206). HLA DQA1*02:01 has previously been identified in an Icelandic and two Chinese populations, but the direction of effect was not consistent (Sveinbjornsson et al., 2016; Li et al., 2021; Zheng, 2018). Despite these inconsistencies, the association between Mtb and HLA class II should be explored in more detail in future studies. A study investigating the outcomes of Mtb exposure in individuals of African ancestry identified protective effects of HLA class II alleles for individuals resistant to TB, highlighting the importance of HLA class II in susceptibility to TB (Dawkins et al., 2022). HLA class II is a key determinant of the immune response in TB, and Mtb has mechanisms to directly interfere with MHC class II antigen presentation (Sia and Rengarajan, 2019). This is supported by studies in mice, in which animals lacking the MHC class II genes died quickly when exposed to Mtb, and faster than animals lacking the MHC class I genes (Sia and Rengarajan, 2019).

The p-values of residual heterogeneity in genetic effects between the studies in the multi-ancestry meta-analysis show no significant inflation between the studies. This suggests that the differences in study characteristics (phenotype definition, infection pressure, Mtb strain) are not the main contributor to the lack of significant associations. However, they certainly have an impact, which is further compounded by ancestry-correlated heterogeneity and other factors (e.g., socioeconomic standing). The ancestry-correlated heterogeneity p-values are generally lower than the residual heterogeneity p-values, suggesting that genetic ancestry has a stronger impact on the differences in effect sizes between the studies. This is supported by the fact that previous TB genetic association studies have identified significant effects of ancestry on TB susceptibility (Chimusa et al., 2014; Daya et al., 2014b). However, the effects of genetic ancestry can be confounded by other factors not accounted for in this analysis, such as the differences in socioeconomic factors (including the differences in housing, employment, poverty, and access to healthcare), phenotype definitions, and differences in infection pressure between the included study populations (Hargreaves et al., 2011; Duarte et al., 2018; Lönnroth et al., 2009). Specifically, the lack of consistency and specificity in TB diagnosis between the included studies introduces heterogeneity and the potential for misclassification of cases and controls, which can reduce the power to detect significant associations (Supplementary file 1a). While this is a limitation of this study, the fact that the residual heterogeneity is outweighed by the ancestry-correlated heterogeneity suggests that the phenotype definitions are not the main driver behind the lack of significant associations.
For the ancestry-specific analysis, fewer studies result in there being less input heterogeneity to account for, but the reduced sample size was not sufficient to detect any ancestry-specific genome-wide associations. This is particularly evident for the African ancestry-specific meta-analysis where the large degree of heterogeneity, which could be a result of the high genetic diversity within Africa, in combination with differences in socioeconomic factors compared to other populations included in this study, resulted in no observable suggestive association peaks (Campbell and Tishkoff, 2008; Peprah et al., 2015). Furthermore, the suggestive associations (Supplementary file 1c and e) reported in this study should be interpreted with care, and further validation is required before any conclusions can be drawn on the impact that they could have on TB susceptibility.

Polygenic heritability estimates revealed genetic contributions to TB susceptibility for all studies, but the level of this contribution varied greatly (5–36%), suggesting that other factors are contributing to both the lack of significant associations detected in this meta-analysis and the variation observed for the polygenic heritability estimates. These factors likely include environmental, socioeconomic, and varying levels of infection pressures, as well as genetic ancestry-specific effects between the included study populations. An individual from South Africa will face a much higher force of infection than individuals in Europe, and making the assumption that environmental circumstances are equal will significantly skew these crude heritability estimates (Pearce, 2011). This argument is sustained by the fact that increasing disease prevalence (higher infection pressure) increased the level of genetic contribution to TB susceptibility up to a certain point, presumably accounted for by increasingly informative control samples, after which further increasing the infection pressure will not further impact genetic susceptibility.

To determine the impact that force of infection has on the level of genetic contribution to TB susceptibility, we modeled values for the proportion of people ever infected with Mtb to include in the multi-ancestry meta-analysis and correct for the different force of infection faced by individuals in each country. Inclusion of this covariate, however, only resulted in a significant difference for 5% of the analyzed variants, which is what would be expected by chance alone; as such, we cannot conclude that a significant portion of the observed residual heterogeneity is explained by it. Limited metadata forced us to make several assumptions about the ages of study participants and the dates on which they were enrolled. With more precise metadata, or Mtb infection test results in controls, the potential impact of lifetime infection could be better quantified and may contribute to elucidating genetic TB susceptibility. Multi-ancestry meta-analysis of other infectious diseases could also potentially benefit from the inclusion of force of infection covariates. It would also be important to determine whether there is a level of exposure beyond which host genetic barriers to infection are overcome (Simmons et al., 2018).

A single significant association was identified in this multi-ancestry meta-analysis, which is few when compared to other meta-analyses of similar size. Factors contributing to this include the difficulty in analyzing multi-ancestry data, the outdated arrays and lack of suitable reference panels for the included study populations, and heterogeneity in case and control definitions between the studies. The issue of heterogeneity in definitions is especially pronounced for this study as it included unpublished data with limited information, which does not indicate how cases were confirmed and controls were collected. The complexity of TB and the generally small genetic effects suggest that larger sample sizes or alternative methods of investigation are needed. Utilizing GWAS arrays that better capture diverse populations in combination with imputation making use of larger and more diverse reference panels would allow for larger and more consistent datasets for future meta-analysis. Remapping specific areas of interest such as the HLA, ASAP1, or TLR using long-read sequencing would be invaluable. Increased amounts of genetic data will also allow for more accurate TB heritability analysis and permit analysis of polygenic risk scores and exploration of host–pathogen interactions.

In conclusion, this large-scale multi-ancestry TB GWAS meta-analysis revealed significant associations and shared genetic TB susceptibility architecture across multiple populations from different genetic backgrounds. The analysis shows the value of collaboration and data sharing to solve difficult problems and elucidate what determines susceptibility to complex diseases such as TB. We hope that this publication will encourage others to make their data available for future large-scale meta-analyses.

Methods

Data

This analysis includes 12 of the 17 published (and unpublished, Table 1, Supplementary file 1) GWAS of TB (with HIV-negative cohorts) prior to 2022 (Schurz et al., 2018; Chimusa et al., 2014; The Wellcome Trust Case Control Consortium, 2007; Curtis et al., 2015; Mahasirimongkol et al., 2012; Qi et al., 2017; Thye et al., 2010; Thye et al., 2012; Daya et al., 2014b). For unpublished works, we contacted researchers that were funded for genetic TB research and acquired data-sharing agreements to obtain summary statistics (or raw data) along with any metadata that was available. It excludes data from Iceland and Vietnam (Quistrebert et al., 2021) as they declined to share data. It excludes data from China, Korea, Peru, and Japan (Luo et al., 2019; Hong et al., 2017; Li et al., 2021; Zheng, 2018; Sveinbjornsson et al., 2016) as data-sharing agreements could not be finalized in time for this analysis. The Indonesian and Moroccan data were too sparsely genotyped and not suitable for reliable imputation. In addition, the Moroccan data was family-based and thus also not suitable for this meta-analysis as this would introduce confounding effects from the inclusion of related individuals (Grant et al., 2016; Png et al., 2012). Finally, cases and controls are also available within large-scale biobanks, for example, UK Biobank, which could also be leveraged in future iterations of this analysis (Munafò et al., 2018).

Included individuals were genotyped on a variety of genotyping arrays (Table 1, Supplementary file 1). Raw genotyping data was available for eight datasets; for the remainder, association testing summary statistics were obtained for use in the meta-analysis (Table 1, Supplementary file 1). Quality control (QC) of raw genotyping data (Table 1, Supplementary file 1) was done using PLINK (v1.9), followed by pre-phasing using SHAPEIT and imputation with IMPUTE2 with the 1000 Genomes Phase 3 reference panel (Chang et al., 2015; Delaneau et al., 2013; Howie et al., 2009; Sudmant et al., 2015). QC and imputation were done as described previously (Schurz et al., 2018; Schurz et al., 2019); briefly, we used a MAF filter of 0.025 and an individual and SNP missingness filter of 0.1. The Hardy–Weinberg equilibrium threshold was set at a Bonferroni-corrected p-value according to the number of SNPs tested (0.05/number of SNPs), and samples where sex could not be determined from genotyping were also removed. Imputed data was filtered at a quality score of 0.3, prior to individual and genotype filtration steps. Prior to QC and imputation, allele orientation was corrected using Genotype Harmoniser version 1.4.15, and the genome build of all datasets was checked for consistency (GRCh37) and updated if necessary using the liftOver software from the UCSC genome browser (Deelen et al., 2014; Kent et al., 2002). The four datasets with only summary statistics available (Table 1, Supplementary file 1) were imputed and QC’d during the original investigations, but the marker names and allele orientation were checked for concordance between the summary statistics and the rest of the consortium’s imputed data.
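As a rough illustration of how the variant-level thresholds above combine, the following sketch applies them to precomputed per-variant statistics. It is not the PLINK workflow used in the study: the `Variant` class, the `passes_qc` function, the example values, and the choice of inclusive boundaries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Precomputed per-variant QC statistics (hypothetical container)."""
    snp_id: str
    maf: float          # minor allele frequency
    missingness: float  # fraction of samples with a missing genotype
    hwe_p: float        # Hardy-Weinberg equilibrium test p-value


def passes_qc(variant, n_snps_tested, maf_min=0.025, missing_max=0.1, alpha=0.05):
    """Apply the MAF, missingness, and Bonferroni-corrected HWE thresholds."""
    hwe_threshold = alpha / n_snps_tested  # 0.05 / number of SNPs, as in the text
    return (
        variant.maf >= maf_min
        and variant.missingness <= missing_max
        and variant.hwe_p >= hwe_threshold
    )


# Hypothetical variants, for illustration only
variants = [
    Variant("snp_ok", maf=0.20, missingness=0.01, hwe_p=0.40),
    Variant("snp_rare", maf=0.01, missingness=0.01, hwe_p=0.40),
    Variant("snp_missing", maf=0.20, missingness=0.25, hwe_p=0.40),
    Variant("snp_hwe", maf=0.20, missingness=0.01, hwe_p=1e-12),
]

kept = [v.snp_id for v in variants if passes_qc(v, n_snps_tested=len(variants))]
print(kept)
```

In practice the HWE threshold depends on the number of SNPs in each dataset, so the same variant could pass in one study and fail in another.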

Polygenic heritability analysis

To assess the level of genetic contribution to TB susceptibility, we estimated polygenic heritability on the individual studies for which raw genotyping data was available (Table 1, Supplementary file 1). Polygenic heritability estimates were calculated using GCTA (v1.93.2), a genomic risk prediction tool (Yang et al., 2011). The genetic relationship matrix was calculated for each autosomal chromosome. Raw genotype data was pruned for SNPs in LD using a 50 SNP window, sliding by 10 SNPs at a time and removing all variants with LD > 0.5. Cryptic relatedness was removed by sample filtering (--grm-cutoff 0.025), and causal loci were assumed to have a similar distribution of allele frequencies to the genotyped SNPs (--grm-adj 0). Principal components were then calculated (--pca 20) to include as covariates prior to estimating heritability. Heritability estimates were transformed onto the liability scale using the GCTA software to account for the difference in the proportion of cases in the data compared to the population prevalence (Yang et al., 2011). The average heritability estimate was calculated by taking the mean of all estimates, and the confidence intervals were estimated based on the standard error across all studies and the number of studies included.
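One plausible reading of the pooling step described above is a simple mean with a normal-approximation confidence interval built from the standard error of the mean. The sketch below follows that reading. The exact formula used in the study is not given in the text, so this is an assumption, and the per-study values are hypothetical placeholders, not results from this article.

```python
import math
import statistics


def pooled_mean_with_ci(estimates, z=1.96):
    """Mean of per-study estimates with an approximate 95% CI.

    Assumes the CI is mean +/- z * (sample SD / sqrt(number of studies)).
    """
    n = len(estimates)
    mean = statistics.fmean(estimates)
    sem = statistics.stdev(estimates) / math.sqrt(n)
    return mean, mean - z * sem, mean + z * sem


# Hypothetical liability-scale heritability estimates for eight studies
h2_by_study = [0.05, 0.18, 0.22, 0.27, 0.30, 0.31, 0.33, 0.36]

mean_h2, lower, upper = pooled_mean_with_ci(h2_by_study)
print(f"pooled h2 = {mean_h2:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

An unweighted mean treats every study equally regardless of sample size; weighting by the precision of each estimate would be an alternative design.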

Meta-analysis

All variants with MAF > 1% and polymorphic in at least three studies (from at least two different ancestries) were included in the primary analysis. In the GWAS summary statistics of each dataset, variants with infinite confidence intervals were removed prior to the meta-analysis. A multi-ancestry meta-analysis plus separate ancestry-specific analyses for Africa, Asia, and Europe were performed. MR-MEGA (Meta-Regression of Multi-Ethnic Genetic Association, v0.20), a meta-analysis tool that maximizes power and enhances fine-mapping when combining data across different ethnicities, was used for the multi-ancestry meta-analysis (Mägi et al., 2017). To account for the expected heterogeneity in allelic effects between populations, MR-MEGA implements a multi-ancestry meta-regression that includes covariates to represent genetic ancestry, obtained from multidimensional scaling of mean pairwise genome-wide allele frequency differences. Genomic control correction (GCC) was implemented during the MR-MEGA analysis for the individual input data (if lambda was >1.05) and output statistics, and the first two PCs, calculated from the genome-wide allele frequency differences, were included as covariates in the regression. QQ-plots of p-values and associated lambda values were used to assess the quality of results prior to downstream investigation.
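The genomic control step mentioned above can be illustrated with the standard definition of the inflation factor, lambda, as the median observed 1-df chi-square statistic divided by its expected median under the null. The sketch below is a generic illustration and does not reproduce MR-MEGA's internal implementation. The function names and the evenly spaced p-values are hypothetical.

```python
from statistics import NormalDist, median

MEDIAN_CHI2_1DF = 0.4549364  # median of a chi-square distribution with 1 df
GC_TRIGGER = 1.05            # correction applied only above this lambda, per the text


def p_to_chi2(p):
    """Convert a two-sided p-value into a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def lambda_gc(chi2_values):
    """Genomic inflation factor: observed median over expected median."""
    return median(chi2_values) / MEDIAN_CHI2_1DF


def genomic_control(chi2_values, trigger=GC_TRIGGER):
    """Divide statistics by lambda when lambda exceeds the trigger."""
    lam = lambda_gc(chi2_values)
    if lam > trigger:
        return [c / lam for c in chi2_values], lam
    return list(chi2_values), lam


# Hypothetical, evenly spaced p-values standing in for a well-calibrated study
null_chi2 = [p_to_chi2((i + 0.5) / 1000) for i in range(1000)]
# The same statistics inflated by 20%, standing in for an inflated study
inflated_chi2 = [1.2 * c for c in null_chi2]

corrected, lam_before = genomic_control(inflated_chi2)
lam_after = lambda_gc(corrected)
print(f"lambda before = {lam_before:.3f}, after = {lam_after:.3f}")
```

A lambda close to 1 leaves the statistics untouched, which is why a QQ-plot and the lambda value are inspected together.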

For the ancestry-specific analyses, the studies were grouped by the major ancestral groups (Table 1, Supplementary file 1) and all variants with a MAF of > 1% that were observed in at least two studies were included in the meta-analysis. We performed traditional fixed-effects meta-analyses in GWAMA (v2.2.2), implementing GCC and assessed the results using QQ-plots (Mägi and Morris, 2010). The genome-wide significance threshold for all association testing was set at p-value=5 × 10-8 (Panagiotou et al., 2012).
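For readers unfamiliar with fixed-effects meta-analysis, the sketch below shows the standard inverse-variance weighted estimator for a single variant. It is a generic textbook formulation, not GWAMA's code, and the effect sizes and standard errors are hypothetical.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # significance threshold stated in the text


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects estimate for one variant.

    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


# Hypothetical log odds ratios and standard errors from three studies
betas = [0.12, 0.08, 0.15]
standard_errors = [0.04, 0.05, 0.06]

beta, se, p = fixed_effects_meta(betas, standard_errors)
print(f"beta = {beta:.4f}, se = {se:.4f}, p = {p:.3g}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

Studies with smaller standard errors receive more weight, so the pooled estimate sits closest to the most precise study.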

HLA imputation

To fine-map HLA alleles over the HLA locus, we imputed HLA class I and II variants for all eight studies for which raw data was available (Table 1 and Supplementary file 1). HLA imputation for the HLA class I regions A, B, and C as well as the HLA class II regions DPB1, DRB1, DQB1, and DQA1 was done using the R package HIBAG (version 1.5), implemented in the R free software environment (version 4.0.5) using the predict() command for imputation (R Development Core Team, 2013; Zheng, 2018; Zheng et al., 2014).

The reference datasets for HLA imputation are specific to both genotyping panel and population, and HIBAG has a database of reference data for many genotyping arrays. Each reference panel is also available for either Asian, European, or African populations or a mixture of the three (https://hibag.s3.amazonaws.com/hlares_index.html#estimates). For each dataset included for imputation, the reference panel was chosen to match the genotyping array used for the data, and the reference population was chosen to match the data as closely as possible. Asian and European reference panels were used for the Asian and European populations and African references were used for the Gambia and Ghana datasets, while mixed reference panels were used for the admixed RSA population.

Following imputation, the hlaAssocTest command of the HIBAG package was used to implement an additive association test for the HLA alleles across the different regions, limited to alleles at MAF > 2.5%. Analyses were adjusted for the first four PCs with and without the rs28383206 genotype in the model. Association testing results for the eight included studies were then combined in a fixed-effects meta-analysis using Metasoft software (Han and Eskin, 2011). Ancestry-specific meta-analysis grouped according to the major population groups (Table 1, Supplementary file 1) was also done using the same method.

Estimation of infection pressure

To generate a covariate capturing the likely cumulative exposure to Mtb for included controls, the results of Houben and Dodd, 2016 were adapted to produce a distance matrix to feed into the meta-analysis. The approach in that article fits a Gaussian process model of infection risk history to local data. To represent uncertainty in derived results, a sample of 200 estimated histories of the annual risk of TB infection in each country was used to calculate the expected fraction of control participants ever infected with Mtb, assuming that controls were uniformly aged between 35 and 44 years in 2010, which approximates the period during which controls were recruited for most of the studies. The true age of the controls was not known for all of the datasets, but as quite a substantial skew to the age distribution would be required to have an impact on the results, we believe our choice here is justified. The resulting estimates of potential lifetime infections for each source population were then included as a covariate in the MR-MEGA multi-ancestry meta-regression. To determine the impact of the covariate, a chi-square difference test was implemented, on a SNP-by-SNP basis, on the residual and association testing statistics of two sets of meta-analysis output statistics, one including and the other excluding the potential lifetime infections covariate (Satorra and Bentler, 2001). The aim was to determine whether inclusion of potential lifetime infections in the regression explained some of the residual heterogeneity.
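The per-SNP comparison described above can be sketched as a plain chi-square difference test between two nested models. This is a simplified illustration that rests on two assumptions: the models differ by exactly one degree of freedom (the added covariate), and no scaling correction is applied, whereas the cited Satorra and Bentler test includes one. All statistics below are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def difference_test(chi2_without, chi2_with):
    """p-value for the drop in residual heterogeneity after adding one covariate.

    Assumes nested models differing by 1 df; negative differences are clipped to 0.
    """
    return chi2_sf_1df(chi2_without - chi2_with)


# Hypothetical residual heterogeneity statistics per SNP:
# (model without the covariate, model with the covariate)
heterogeneity_chi2 = [(4.2, 4.1), (9.8, 3.1), (2.5, 2.4), (6.0, 5.9)]

p_values = [difference_test(a, b) for a, b in heterogeneity_chi2]
fraction_significant = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"fraction of SNPs with p < 0.05: {fraction_significant:.2f}")
```

When the covariate explains nothing, roughly 5% of SNPs would still be expected to fall below p=0.05 by chance, which is the benchmark against which the observed fraction has to be judged.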

Concordance of direction of effect

To determine the degree to which direction of effect is shared for SNPs between the ancestry-specific meta-analyses, we followed the methodology of Mahajan et al., 2014. First, we identified all variants present in all 12 included datasets. Among these SNPs, we then identified an independent subset of variants in the European ancestry-specific meta-analysis showing nominal evidence of association (p-value≤0.001) and separated by at least 500 kb. The identified SNPs were then extracted from the Asian and African ancestry-specific meta-analysis results to calculate the number of SNPs that had the same direction of effect as in the European analysis. To determine whether significant excess in concordance of effect direction was present, a one-sided binomial test was implemented with the expected concordance set at 50%. This analysis was then repeated for other p-value thresholds (0.001<p≤0.01; 0.01<p≤0.5; and 0.5<p≤1), and also using the African and Asian meta-analysis results as reference.
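The concordance test above amounts to counting SNPs whose effect has the same sign in two analyses and asking whether that count exceeds what a fair coin would give. A minimal sketch follows. The function names and effect sizes are hypothetical, and the selection of independent SNPs (p-value threshold and 500 kb spacing) is assumed to have been done beforehand.

```python
from math import comb


def count_concordant(reference_betas, other_betas):
    """Number of SNPs whose effect has the same sign in both analyses."""
    return sum(
        (ref > 0) == (other > 0)
        for ref, other in zip(reference_betas, other_betas)
    )


def one_sided_binomial_p(n_concordant, n_total, expected=0.5):
    """P(X >= n_concordant) for X ~ Binomial(n_total, expected)."""
    return sum(
        comb(n_total, k) * expected**k * (1.0 - expected) ** (n_total - k)
        for k in range(n_concordant, n_total + 1)
    )


# Hypothetical effect sizes for six independent SNPs
european_betas = [0.12, -0.08, 0.05, 0.20, -0.15, 0.03]
asian_betas = [0.04, -0.02, -0.01, 0.10, -0.09, 0.06]

n_concordant = count_concordant(european_betas, asian_betas)
p_value = one_sided_binomial_p(n_concordant, len(european_betas))
print(f"{n_concordant}/{len(european_betas)} concordant, one-sided p = {p_value:.4f}")
```

With only a handful of SNPs even high concordance is not significant, so the test gains power mainly from the number of independent variants in each p-value bin.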

Data availability

Summary statistics of all meta-analyses will be made available on Dryad (https://doi.org/10.5061/dryad.6wwpzgn2s). The summary statistics and raw data (where available) of the individual data files cannot be made available, but enquiries or requests for this data can be made through the corresponding authors or the authors directly responsible for the data, listed in Table 1. As the ITHGC consortium has strict data transfer and sharing agreements with the original authors/owners of the data, we cannot ethically share the source data files in any way, whether anonymized, de-identified, or in any other form. All data that is not restricted by these data transfer and ethical agreements has been either uploaded to the online repository (https://doi.org/10.5061/dryad.6wwpzgn2s) or submitted along with this document. Interested researchers who want to apply for access to the original raw and individual GWAS datasets, or any other data currently restricted, can contact the corresponding author of this manuscript to be put in touch with the original data owners/authors, or can contact the original data owners/authors directly through the corresponding authors listed in Table 1. Once the original authors/owners of the data have been contacted, discussions can be had to share the data using the appropriate and ethically approved methods, which could include data transfer agreements or similar application processes.

The following previously published data sets were used
    1. Schurz H
    2. Naranbhai V
    3. Yates TA
    4. Gilchrist J
    5. Parks T
    6. Dodd P
    7. Möller M
    8. Hoal EG
    9. Morris A
    10. Hill AV
    (2022) Dryad Digital Repository
    Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture.
    https://doi.org/10.5061/dryad.6wwpzgn2s

References

  1. Book
    1. Diehl K
    2. Von O
    (1936)
    Der Erbeinfluss Bei Der Tuberkulose
    Gustav Fischer.
    1. Elks CE
    2. Perry JRB
    3. Sulem P
    4. Chasman DI
    5. Franceschini N
    6. He C
    7. Lunetta KL
    8. Visser JA
    9. Byrne EM
    10. Cousminer DL
    11. Gudbjartsson DF
    12. Esko T
    13. Feenstra B
    14. Hottenga J-J
    15. Koller DL
    16. Kutalik Z
    17. Lin P
    18. Mangino M
    19. Marongiu M
    20. McArdle PF
    21. Smith AV
    22. Stolk L
    23. van Wingerden SH
    24. Zhao JH
    25. Albrecht E
    26. Corre T
    27. Ingelsson E
    28. Hayward C
    29. Magnusson PKE
    30. Smith EN
    31. Ulivi S
    32. Warrington NM
    33. Zgaga L
    34. Alavere H
    35. Amin N
    36. Aspelund T
    37. Bandinelli S
    38. Barroso I
    39. Berenson GS
    40. Bergmann S
    41. Blackburn H
    42. Boerwinkle E
    43. Buring JE
    44. Busonero F
    45. Campbell H
    46. Chanock SJ
    47. Chen W
    48. Cornelis MC
    49. Couper D
    50. Coviello AD
    51. d’Adamo P
    52. de Faire U
    53. de Geus EJC
    54. Deloukas P
    55. Döring A
    56. Smith GD
    57. Easton DF
    58. Eiriksdottir G
    59. Emilsson V
    60. Eriksson J
    61. Ferrucci L
    62. Folsom AR
    63. Foroud T
    64. Garcia M
    65. Gasparini P
    66. Geller F
    67. Gieger C
    68. GIANT Consortium
    69. Gudnason V
    70. Hall P
    71. Hankinson SE
    72. Ferreli L
    73. Heath AC
    74. Hernandez DG
    75. Hofman A
    76. Hu FB
    77. Illig T
    78. Järvelin M-R
    79. Johnson AD
    80. Karasik D
    81. Khaw K-T
    82. Kiel DP
    83. Kilpeläinen TO
    84. Kolcic I
    85. Kraft P
    86. Launer LJ
    87. Laven JSE
    88. Li S
    89. Liu J
    90. Levy D
    91. Martin NG
    92. McArdle WL
    93. Melbye M
    94. Mooser V
    95. Murray JC
    96. Murray SS
    97. Nalls MA
    98. Navarro P
    99. Nelis M
    100. Ness AR
    101. Northstone K
    102. Oostra BA
    103. Peacock M
    104. Palmer LJ
    105. Palotie A
    106. Paré G
    107. Parker AN
    108. Pedersen NL
    109. Peltonen L
    110. Pennell CE
    111. Pharoah P
    112. Polasek O
    113. Plump AS
    114. Pouta A
    115. Porcu E
    116. Rafnar T
    117. Rice JP
    118. Ring SM
    119. Rivadeneira F
    120. Rudan I
    121. Sala C
    122. Salomaa V
    123. Sanna S
    124. Schlessinger D
    125. Schork NJ
    126. Scuteri A
    127. Segrè AV
    128. Shuldiner AR
    129. Soranzo N
    130. Sovio U
    131. Srinivasan SR
    132. Strachan DP
    133. Tammesoo M-L
    134. Tikkanen E
    135. Toniolo D
    136. Tsui K
    137. Tryggvadottir L
    138. Tyrer J
    139. Uda M
    140. van Dam RM
    141. van Meurs JBJ
    142. Vollenweider P
    143. Waeber G
    144. Wareham NJ
    145. Waterworth DM
    146. Weedon MN
    147. Wichmann HE
    148. Willemsen G
    149. Wilson JF
    150. Wright AF
    151. Young L
    152. Zhai G
    153. Zhuang WV
    154. Bierut LJ
    155. Boomsma DI
    156. Boyd HA
    157. Crisponi L
    158. Demerath EW
    159. van Duijn CM
    160. Econs MJ
    161. Harris TB
    162. Hunter DJ
    163. Loos RJF
    164. Metspalu A
    165. Montgomery GW
    166. Ridker PM
    167. Spector TD
    168. Streeten EA
    169. Stefansson K
    170. Thorsteinsdottir U
    171. Uitterlinden AG
    172. Widen E
    173. Murabito JM
    174. Ong KK
    175. Murray A
    (2010) Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies
    Nature Genetics 42:1077–1085.
    https://doi.org/10.1038/ng.714
    1. Kallmann FJ
    2. Reisner D
    (1943)
    Twin studies on the significance of genetic factors in tuberculosis
    American Review of Tuberculosis 47:547–549.
    1. Mahajan A
    2. Go MJ
    3. Zhang W
    4. Below JE
    5. Gaulton KJ
    6. Ferreira T
    7. Horikoshi M
    8. Johnson AD
    9. Ng MCY
    10. Prokopenko I
    11. Saleheen D
    12. Wang X
    13. Zeggini E
    14. Abecasis GR
    15. Adair LS
    16. Almgren P
    17. Atalay M
    18. Aung T
    19. Baldassarre D
    20. Balkau B
    21. Bao Y
    22. Barnett AH
    23. Barroso I
    24. Basit A
    25. Been LF
    26. Beilby J
    27. Bell GI
    28. Benediktsson R
    29. Bergman RN
    30. Boehm BO
    31. Boerwinkle E
    32. Bonnycastle LL
    33. Burtt N
    34. Cai Q
    35. Campbell H
    36. Carey J
    37. Cauchi S
    38. Caulfield M
    39. Chan JCN
    40. Chang LC
    41. Chang TJ
    42. Chang YC
    43. Charpentier G
    44. Chen CH
    45. Chen H
    46. Chen YT
    47. Chia KS
    48. Chidambaram M
    49. Chines PS
    50. Cho NH
    51. Cho YM
    52. Chuang LM
    53. Collins FS
    54. Cornelis MC
    55. Couper DJ
    56. Crenshaw AT
    57. van Dam RM
    58. Danesh J
    59. Das D
    60. de Faire U
    61. Dedoussis G
    62. Deloukas P
    63. Dimas AS
    64. Dina C
    65. Doney AS
    66. Donnelly PJ
    67. Dorkhan M
    68. van Duijn C
    69. Dupuis J
    70. Edkins S
    71. Elliott P
    72. Emilsson V
    73. Erbel R
    74. Eriksson JG
    75. Escobedo J
    76. Esko T
    77. Eury E
    78. Florez JC
    79. Fontanillas P
    80. Forouhi NG
    81. Forsen T
    82. Fox C
    83. Fraser RM
    84. Frayling TM
    85. Froguel P
    86. Frossard P
    87. Gao Y
    88. Gertow K
    89. Gieger C
    90. Gigante B
    91. Grallert H
    92. Grant GB
    93. Groop LC
    94. Groves CJ
    95. Grundberg E
    96. Guiducci C
    97. Hamsten A
    98. Han BG
    99. Hara K
    100. Hassanali N
    101. Hattersley AT
    102. Hayward C
    103. Hedman AK
    104. Herder C
    105. Hofman A
    106. Holmen OL
    107. Hovingh K
    108. Hreidarsson AB
    109. Hu C
    110. Hu FB
    111. Hui J
    112. Humphries SE
    113. Hunt SE
    114. Hunter DJ
    115. Hveem K
    116. Hydrie ZI
    117. Ikegami H
    118. Illig T
    119. Ingelsson E
    120. Islam M
    121. Isomaa B
    122. Jackson AU
    123. Jafar T
    124. James A
    125. Jia W
    126. Jöckel KH
    127. Jonsson A
    128. Jowett JBM
    129. Kadowaki T
    130. Kang HM
    131. Kanoni S
    132. Kao WHL
    133. Kathiresan S
    134. Kato N
    135. Katulanda P
    136. Keinanen-Kiukaanniemi KM
    137. Kelly AM
    138. Khan H
    139. Khaw KT
    140. Khor CC
    141. Kim HL
    142. Kim S
    143. Kim YJ
    144. Kinnunen L
    145. Klopp N
    146. Kong A
    147. Korpi-Hyövälti E
    148. Kowlessur S
    149. Kraft P
    150. Kravic J
    151. Kristensen MM
    152. Krithika S
    153. Kumar A
    154. Kumate J
    155. Kuusisto J
    156. Kwak SH
    157. Laakso M
    158. Lagou V
    159. Lakka TA
    160. Langenberg C
    161. Langford C
    162. Lawrence R
    163. Leander K
    164. Lee JM
    165. Lee NR
    166. Li M
    167. Li X
    168. Li Y
    169. Liang J
    170. Liju S
    171. Lim WY
    172. Lind L
    173. Lindgren CM
    174. Lindholm E
    175. Liu CT
    176. Liu JJ
    177. Lobbens S
    178. Long J
    179. Loos RJF
    180. Lu W
    181. Luan J
    182. Lyssenko V
    183. Ma RCW
    184. Maeda S
    185. Mägi R
    186. Männisto S
    187. Matthews DR
    188. Meigs JB
    189. Melander O
    190. Metspalu A
    191. Meyer J
    192. Mirza G
    193. Mihailov E
    194. Moebus S
    195. Mohan V
    196. Mohlke KL
    197. Morris AD
    198. Mühleisen TW
    199. Müller-Nurasyid M
    200. Musk B
    201. Nakamura J
    202. Nakashima E
    203. Navarro P
    204. Ng PK
    205. Nica AC
    206. Nilsson PM
    207. Njølstad I
    208. Nöthen MM
    209. Ohnaka K
    210. Ong TH
    211. Owen KR
    212. Palmer CNA
    213. Pankow JS
    214. Park KS
    215. Parkin M
    216. Pechlivanis S
    217. Pedersen NL
    218. Peltonen L
    219. Perry JRB
    220. Peters A
    221. Pinidiyapathirage JM
    222. Platou CG
    223. Potter S
    224. Price JF
    225. Qi L
    226. Radha V
    227. Rallidis L
    228. Rasheed A
    229. Rathman W
    230. Rauramaa R
    231. Raychaudhuri S
    232. Rayner NW
    233. Rees SD
    234. Rehnberg E
    235. Ripatti S
    236. Robertson N
    237. Roden M
    238. Rossin EJ
    239. Rudan I
    240. Rybin D
    241. Saaristo TE
    242. Salomaa V
    243. Saltevo J
    244. Samuel M
    245. Sanghera DK
    246. Saramies J
    247. Scott J
    248. Scott LJ
    249. Scott RA
    250. Segrè AV
    251. Sehmi J
    252. Sennblad B
    253. Shah N
    254. Shah S
    255. Shera AS
    256. Shu XO
    257. Shuldiner AR
    258. Sigurđsson G
    259. Sijbrands E
    260. Silveira A
    261. Sim X
    262. Sivapalaratnam S
    263. Small KS
    264. So WY
    265. Stančáková A
    266. Stefansson K
    267. Steinbach G
    268. Steinthorsdottir V
    269. Stirrups K
    270. Strawbridge RJ
    271. Stringham HM
    272. Sun Q
    273. Suo C
    274. Syvänen AC
    275. Takayanagi R
    276. Takeuchi F
    277. Tay WT
    278. Teslovich TM
    279. Thorand B
    280. Thorleifsson G
    281. Thorsteinsdottir U
    282. Tikkanen E
    283. Trakalo J
    284. Tremoli E
    285. Trip MD
    286. Tsai FJ
    287. Tuomi T
    288. Tuomilehto J
    289. Uitterlinden AG
    290. Valladares-Salgado A
    291. Vedantam S
    292. Veglia F
    293. Voight BF
    294. Wang C
    295. Wareham NJ
    296. Wennauer R
    297. Wickremasinghe AR
    298. Wilsgaard T
    299. Wilson JF
    300. Wiltshire S
    301. Winckler W
    302. Wong TY
    303. Wood AR
    304. Wu JY
    305. Wu Y
    306. Yamamoto K
    307. Yamauchi T
    308. Yang M
    309. Yengo L
    310. Yokota M
    311. Young R
    312. Zabaneh D
    313. Zhang F
    314. Zhang R
    315. Zheng W
    316. Zimmet PZ
    317. Altshuler D
    318. Bowden DW
    319. Cho YS
    320. Cox NJ
    321. Cruz M
    322. Hanis CL
    323. Kooner J
    324. Lee JY
    325. Seielstad M
    326. Teo YY
    327. Boehnke M
    328. Parra EJ
    329. Chambers JC
    330. Tai ES
    331. McCarthy MI
    332. Morris AP
    (2014) Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility
    Nature Genetics 46:234–244.
    https://doi.org/10.1038/ng.2897
  2. Software
    1. R Development Core Team
    (2013) R: A language and environment for statistical computing
    R Foundation for Statistical Computing, Vienna, Austria.
  3. Book
    1. Zheng X
    (2018) Imputation-based HLA typing with SNPs in GWAS studies
    In: Boegel S, editors. HLA Typing: Methods and Protocols. Springer. pp. 163–176.
    https://doi.org/10.1007/978-1-4939-8546-3

Article and author information

Author details

  1. Haiko Schurz

    DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    Contribution
    Conceptualization, Data curation, Investigation, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    haikoschurz@gmail.com
    Competing interests
    No competing interests declared
    Additional information
    co-first authors
    ORCID: 0000-0002-0009-3409
  2. Vivek Naranbhai

    1. Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    2. Massachusetts General Hospital, Boston, United States
    3. Dana-Farber Cancer Institute, Boston, United States
    4. Centre for the AIDS Programme of Research in South Africa, Durban, South Africa
    5. Harvard Medical School, Boston, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    Additional information
    co-first authors
  3. Tom A Yates

    Division of Infection and Immunity, Faculty of Medical Sciences, University College London, London, United Kingdom
    Contribution
    Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6081-1767
  4. James J Gilchrist

    1. Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    2. Department of Paediatrics, University of Oxford, Oxford, United Kingdom
    Contribution
    Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2045-6788
  5. Tom Parks

    1. Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    2. Department of Infectious Diseases, Imperial College London, London, United Kingdom
    Contribution
    Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Peter J Dodd

    School of Health and Related Research, University of Sheffield, Sheffield, United Kingdom
    Contribution
    Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Marlo Möller

    DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    Contribution
    Resources, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-0805-6741
  8. Eileen G Hoal

    DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    Contribution
    Resources, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  9. Andrew P Morris

    Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, United Kingdom
    Contribution
    Software, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6805-6014
  10. Adrian VS Hill

    1. Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    2. Jenner Institute, University of Oxford, Oxford, United Kingdom
    Contribution
    Resources, Supervision, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  11. International Tuberculosis Host Genetics Consortium

    Contribution
    Conceptualization, Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
    1. Haiko Schurz, DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    2. Vivek Naranbhai, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    3. Tom A Yates, Division of Infection and Immunity, Faculty of Medical Sciences, University College London, London, United Kingdom
    4. James J Gilchrist, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    5. Tom Parks, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    6. Peter J Dodd, School of Health and Related Research, University of Sheffield, Sheffield, United Kingdom
    7. Marlo Möller, DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    8. Eileen G Hoal, DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
    9. Andrew P Morris, Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, United Kingdom
    10. Adrian VS Hill, Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
    11. Reinout van Crevel, Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, Netherlands
    12. Arjan van Laarhoven, Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, Netherlands
    13. Tom HM Ottenhoff, Head Lab Dept of Infectious Diseases; Head Group Immunology and Immunogenetics of Bacterial Infectious Diseases Leiden University Medical Center, Leiden, Netherlands
    14. Andres Metspalu, Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
    15. Reedik Magi, Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
    16. Christian G Meyer, Institute of Tropical Medicine, Eberhard-Karls University Tübingen, Tübingen, Germany
    17. Magda Ellis, Tuberculosis Research Group, Centenary Institute, Sydney, Australia
    18. Thorsten Thye, School of Health and Related Research, University of Sheffield, Sheffield, United Kingdom
    19. Surakameth Mahasirimongkol, Department of Medical Sciences, Ministry of Public Health, Nonthaburi, Thailand
    20. Ekawat Pasomsub, Virology Laboratory, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
    21. Katsushi Tokunaga, Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, Japan
    22. Yosuke Omae, Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, Japan
    23. Hideki Yanai, Fukujuji Hospital and Research Institute of Tuberculosis, Japan Anti-Tuberculosis Association, Kiyose, Japan
    24. Taisei Mushiroda, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
    25. Michiaki Kubo, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
    26. Atsushi Takahashi, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
    27. Yoichiro Kamatani, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
    28. Bachti Alisjahbana, Faculty of Medicine, Universitas Padjadjaran - Hasan Sadikin Hospital, Bandung, Indonesia
    29. Wei Liu, Department of Plastic and Reconstructive Surgery, Shanghai Key Laboratory of Tissue Engineering, Shanghai Ninth People’s Hospital, Shanghai Jiao Tong University – School of Medicine, Shanghai, China
    30. A-dong Sheng, National Clinical Research Center for Respiratory Diseases, National Key Discipline of Pediatrics, Capital Medical University, Beijing, China
    31. Yurong Yang, Ningxia Medical University, Ningxia Hui Autonomous Region, Ningxia, China

Funding

National Institute for Health Research (Academic Clinical Lectureship)

  • James J Gilchrist

Versus Arthritis (21754)

  • Andrew P Morris

Medical Research Council (MR/P022081/1)

  • Peter J Dodd

National Institute for Health Research (NIHR Clinical Lecturer)

  • Tom A Yates

National Institute for Health Research (CL-2020-21-001)

  • Tom Parks

Wellcome (https://doi.org/10.35802/222098)

  • Tom Parks

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

Computation used the Oxford Biomedical Research Computing (BMRC) facility, a joint development between the Wellcome Centre for Human Genetics and the Big Data Institute supported by Health Data Research UK and the NIHR Oxford Biomedical Research Centre. Financial support was provided by the Wellcome Trust Core Award Grant Number 203141/Z/16/Z. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. This work was partly supported by a Grant-in-Aid for Scientific Research (B) (KAKENHI 21406006) from the Japan Society for the Promotion of Science (JSPS). The collection of clinical information and samples in Thailand was supported in part by JSPS KAKENHI 17256005 and later by a research grant (H21-aids-12) from the Ministry of Health, Labor and Welfare (MHLW). We would like to thank all the subjects and the members of the Rotary Club of Osaka-Midosuji District 2660 Rotary International in Japan who donated their DNA for this work. We thank all members of BioBank Japan, Institute of Medical Science, The University of Tokyo, and of RIKEN Center for Genomic Medicine for their contribution to the completion of our study. This work was conducted as a part of the BioBank Japan Project that was supported by the Ministry of Education, Culture, Sports, Science and Technology of the Japanese government. For the Thai samples, we thank all of the staff and collaborators of the TB/HIV Research Project, Thailand, a research project between the Research Institute of Tuberculosis, the Japan Anti-Tuberculosis Association, and the Thai Ministry of Public Health, for collecting clinical data and DNA samples. We thank the German Consortium 'TB or not TB Network' (https://www.tbornottb.de/), which was responsible for collecting the German TB samples.
We acknowledge the support of the DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa. This research was funded in whole, or in part, by the Wellcome Trust. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. JJG is funded by an NIHR Academic Clinical Lectureship. APM acknowledges support from Versus Arthritis (grant reference 21754). PJD was supported by a fellowship from the UK Medical Research Council (MR/P022081/1); this UK-funded award is part of the EDCTP2 program supported by the European Union. ME was supported by an NHMRC fellowship (552496). The research was supported by the NHMRC grant 1025166. AvL and RvC are supported by the National Institute of Allergy and Infectious Diseases at NIH [R01 AI136921]. TAY is an NIHR Clinical Lecturer supported by the National Institute for Health Research. TP acknowledges funding from the National Institute for Health Research (CL-2020-21-001) and the Wellcome Trust (222098/Z/20/Z). The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research, or the Department of Health and Social Care. AM and RM are funded by the EU project no. 2014-2020.4.01.15-0012 'Gentransmed'. BA is supported by the 'Scientific Programme Indonesia Netherlands' (SPIN) under the Royal Academy of Arts and Sciences (KNAW), the Netherlands.

Ethics

A research collaboration agreement was signed by all contributors. Ethics approval for the meta-analysis presented here was granted by the Health Research Ethics Committee of Stellenbosch University (project registration number S17/01/013). In addition, all institutions involved in the ITHGC have ethics approval for their respective studies:
China 1 and 2: The study protocol was approved by the Ethics Committee of the Beijing Chest Hospital, the 309 Hospital of the PLA, Shijiazhuang Fifth Hospital, the China PLA General Hospital, the Tongliao TB institute and the Center for Diseases Control and Prevention in Jalainuoer.
China 3: Ethics approval was granted by the Ethics Committees of the Beijing Children's Hospital, the Beijing Geriatric Hospital, the Tuberculosis Hospital in Shaanxi Province, the Beijing Institute of Genomics, Chinese Academy of Sciences and the Center for Disease Control and Prevention of Jiangsu Province.
Thailand: Ethics approval was granted by the Ethics Review Committee of the Ministry of Public Health in Thailand.
Japan: Ethics approval was granted by the Institutional Review Board of the Center for Genomic Medicine, RIKEN.
Russia: Blood samples from all participants were collected and studied with written informed consent according to the Declaration of Helsinki and with approvals from the local ethics committees in Russia (St. Petersburg and Samara) and the UK (Human Biological Resource Ethics Committee of the University of Cambridge and the National Research Ethics Service, Cambridgeshire 1 REC, 10/H0304/71).
Estonia: The Estonian Bioethics and Human Research Council (EBIN) approved the Estonian Genome Center study reported in this manuscript.
Germany: The study protocol was approved by the ethics committee (EC) of the University of Luebeck, Germany (reference 07-125), and was adopted by other ethics committees covering all 18 participating centres (EC of the medical faculty of the University of Goettingen; EC of the Medical Council of Hessen, Frankfurt/Main; EC of the Medical Council Hamburg; EC of the Medical Council Lower Saxony, Hannover; EC of the Medical Faculty Carl Gustav Carus, Technical University of Dresden; EC of the Medical Council Berlin; EC of the Medical Council Bavaria, Munich; EC of the Medical Faculty, Friedrich-Alexander-University Erlangen-Nuremberg; EC of the Medical Faculty of the University of Regensburg; EC of the University of Witten/Herdecke).
Gambia: Ethics approval was granted by the Medical Research Council (MRC) and the Gambian government joint ethical committee.
Ghana: Ethics approval was granted by the Committee on Human Research, Publications and Ethics, School of Medical Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana, and the Ethics Committee of the Ghana Health Service, Accra, Ghana.
RSA A and RSA M: Ethics approval was granted by the Health Research Ethics Committee of Stellenbosch University (project registration numbers S17/01/013, NO6/07/132 and 95/072).

Copyright

© 2024, Schurz et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Haiko Schurz
  2. Vivek Naranbhai
  3. Tom A Yates
  4. James J Gilchrist
  5. Tom Parks
  6. Peter J Dodd
  7. Marlo Möller
  8. Eileen G Hoal
  9. Andrew P Morris
  10. Adrian VS Hill
  11. International Tuberculosis Host Genetics Consortium
(2024)
Multi-ancestry meta-analysis of host genetic susceptibility to tuberculosis identifies shared genetic architecture
eLife 13:e84394.
https://doi.org/10.7554/eLife.84394
