Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Bavesh Kana, University of the Witwatersrand, Johannesburg, South Africa
- Senior Editor: Bavesh Kana, University of the Witwatersrand, Johannesburg, South Africa
Reviewer #1 (Public Review):
Summary:
Osnes et al. describe a large and impressive study of the population and transmission dynamics of Neisseria gonorrhoeae using a global dataset of 9,732 genomes. This included dense sampling from Norway and the state of Victoria, Australia. Understanding the transmission dynamics of this increasingly drug-resistant pathogen is crucial for designing optimal public health interventions. This study provides useful insights into the differing transmission dynamics between two well-sampled populations.
The authors have also developed novel techniques to address the size and complexity of the dataset, including an approach to account for recombination when building large phylogenies. While the authors have made significant efforts to account for sampling biases in the data, it is not clear that this has been sufficient to address the problem. The use of non-standard analysis techniques also requires further validation.
Strengths:
The size of the dataset and the comparisons between densely sampled regions are major strengths of the manuscript. While sampling biases may limit the generalizability of the results, as acknowledged by the authors, the characterization of local and inter-country transmission will help to inform future studies into N. gonorrhoeae.
Weaknesses:
Sampling bias:
The authors have gone to considerable efforts to acknowledge and account for biases in the sampling between different locations. Despite this, comparisons are still frequently made in the manuscript between populations with very different sampling profiles, which are likely to dominate the import, export and local transmission signals.
To determine the sensitivity of their results to sampling, the authors randomly took subsamples of each population at varying sizes. While this would address issues with the overall number of genomes being considered, it is not obvious that it would account for biases in sampling including the differing dates over which each population was sampled.
Randomly subsampling tips of the tree is unlikely to change the overall population structure of each dataset much. For example, subsampling a single outbreak would result in highly similar genomes each time. Subsampling clades would provide a better indication of how sensitive the results are to particular clusters within each population. Simulations would also help to determine under what conditions the inferred asymptotes for import and export fractions are likely to hold.
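The contrast between subsampling tips and subsampling clades can be sketched as follows. This is a minimal illustration, not the authors' procedure: the mapping `tip_to_clade`, the toy clade labels, and both function names are assumptions introduced here, and in practice the clade assignments would have to come from the phylogeny.

```python
import random
from collections import defaultdict


def subsample_tips(tip_to_clade, n, rng):
    """Draw n tips uniformly at random, ignoring clade membership."""
    return rng.sample(sorted(tip_to_clade), n)


def subsample_clades(tip_to_clade, k, rng):
    """Draw k whole clades at random and return every tip they contain."""
    clades = defaultdict(list)
    for tip, clade in tip_to_clade.items():
        clades[clade].append(tip)
    chosen = rng.sample(sorted(clades), k)
    return [tip for clade in chosen for tip in sorted(clades[clade])]


# Hypothetical toy population: one outbreak of 90 tips plus 10 singleton clades.
tip_to_clade = {f"outbreak_{i}": "A" for i in range(90)}
tip_to_clade.update({f"single_{i}": f"S{i}" for i in range(10)})

rng = random.Random(1)
tips = subsample_tips(tip_to_clade, 20, rng)
# Share of the tip-level subsample that comes from the single large outbreak.
outbreak_share = sum(tip_to_clade[t] == "A" for t in tips) / len(tips)

clade_tips = subsample_clades(tip_to_clade, 3, rng)
clades_drawn = {tip_to_clade[t] for t in clade_tips}
```

In a population dominated by one outbreak, tip-level draws tend to return mostly that outbreak each time, whereas clade-level draws include or exclude it as a unit, so repeated runs show how much a result depends on particular clusters.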
The text states that Europe and the USA have 'older' transmission lineages than Norway and Victoria. Norway is also found to export more lineages than Victoria, which is likely to be heavily influenced by biases in the distribution of the 'rest of world' samples. Although the impact of sampling bias is acknowledged by the authors in cases such as these, it would be better to avoid making direct comparisons in the first place.
Recombination detection and filtering:
The authors introduce a novel pipeline for masking recombination before building phylogenetic trees, based on randomly subsetting the dataset and running the Gubbins algorithm. While I appreciate it is challenging to account for recombination in a dataset of this size, further verification needs to be done to demonstrate the effectiveness of this approach.
In particular, this approach resulted in ~10% of sites being filtered out from a diverse set of genomes. This is considerably less than a previous publication that considered ~400 diverse gonococcal genomes, where just under 50% of sites were removed using the Gubbins algorithm (Sánchez-Busó et al., 2019).
One reason for this is that the new approach requires recombination events called by Gubbins to meet additional filtering requirements before they are masked from the alignment. This may exclude rarer recombination events, which could subsequently impact the length of branches in the final phylogeny.
Transmission clustering:
The use of LineageHomology and ancestral state reconstruction to determine transmission clusters may be susceptible to biases in sampling between locations. As noted by the authors, locations with sparse sampling, such as the USA, are likely to have older ancestral nodes that are exclusive to that location. Biases in the sampling of countries that transmit to and from each location will also heavily impact the size of the inferred clusters.
This could potentially explain the occurrence of larger 'mixed outbreaks' in Victoria when compared to Norway, as these clusters may be older and driven by a lack of observed isolates in the 'rest of world' subset.
While it would not solve the problem entirely, a SNP-based cut-off, as used in the original study of Victorian isolates by Williamson et al., is likely to be less heavily biased.
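As an illustration of what a SNP-based cut-off involves, the sketch below groups isolates by single-linkage clustering on pairwise SNP distance. It is a generic example under stated assumptions: the sequences are toy data, the function names are hypothetical, and the threshold of 1 SNP is chosen only for the example and is not the value used by Williamson et al.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count differing sites, skipping positions with a gap or N in either sequence."""
    skip = {"-", "N"}
    return sum(1 for x, y in zip(a, b) if x != y and x not in skip and y not in skip)


def single_linkage_clusters(seqs, max_snps):
    """Join any two isolates within max_snps of each other (union-find)."""
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical aligned sequences; the threshold is illustrative only.
seqs = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACGTACAA",
    "iso4": "TTTTTTTT",
}
clusters = single_linkage_clusters(seqs, max_snps=1)
# → [['iso1', 'iso2', 'iso3'], ['iso4']]
```

Because clusters are defined by distances between sampled isolates rather than by the reconstructed location of ancestral nodes, cluster membership does not depend on which other countries happen to be represented in the tree, although chaining under single linkage and uneven sampling dates can still affect cluster size.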
Import and export estimates:
Using LineageHomology to define import and export estimates may have a similar problem with sampling biases. This is acknowledged by the authors and nicely described in Supplementary Figure 6. The authors make a comparison with the analysis of SARS-CoV-2 genomes by du Plessis et al. (2021). However, in the analysis of SARS-CoV-2, the sampling times were far more consistent than those observed in the gonococcal dataset. To address this, the authors could compare their results to an analysis restricted to samples observed in a similar time period. This could most easily be achieved by cutting the inferred phylogeny at a particular date and re-running the LineageHomology analyses.
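A simpler variant of the time-restricted comparison is to filter the sample metadata to the period in which every location was sampled before re-running the analysis. The sketch below shows only that filtering step; it does not cut the phylogeny and does not call LineageHomology. The tuple layout, tip names, locations, and dates are hypothetical.

```python
from datetime import date


def common_window(samples):
    """Return (start, end) of the period covered by every location, or None."""
    ranges = {}
    for _, location, sampled in samples:
        lo, hi = ranges.get(location, (sampled, sampled))
        ranges[location] = (min(lo, sampled), max(hi, sampled))
    start = max(lo for lo, _ in ranges.values())
    end = min(hi for _, hi in ranges.values())
    return (start, end) if start <= end else None


def restrict(samples, window):
    """Keep only samples collected inside the window (inclusive)."""
    start, end = window
    return [s for s in samples if start <= s[2] <= end]


# Hypothetical (tip, location, sampling date) records.
samples = [
    ("n1", "Norway", date(2016, 3, 1)),
    ("n2", "Norway", date(2017, 6, 1)),
    ("n3", "Norway", date(2018, 6, 1)),
    ("v1", "Victoria", date(2017, 1, 1)),
    ("v2", "Victoria", date(2017, 9, 1)),
    ("r1", "Rest of world", date(2005, 5, 1)),
    ("r2", "Rest of world", date(2017, 5, 1)),
    ("r3", "Rest of world", date(2019, 2, 1)),
]

window = common_window(samples)
kept = [tip for tip, _, _ in restrict(samples, window)]
```

Import and export fractions estimated on the restricted set could then be compared with those from the full dataset to see how far the differences between locations are driven by sampling period.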
Reviewer #2 (Public Review):
Summary:
This work analyses the historical spread and evolution, termed 'population dynamics', of a human bacterial pathogen, Neisseria gonorrhoeae, the cause of the sexually transmitted infection, gonorrhoea. N. gonorrhoeae is classified as a high priority pathogen by the World Health Organisation, due to infections numbering in the tens of millions annually, with high levels of antibiotic resistance and no vaccine available, meaning treating and preventing infections is becoming increasingly difficult. To implement interventions effectively, important resistant lineages and their transmission routes must be identified on a national and international level.
In this work, Osnes et al. use genomic data, coupled with geographic, temporal and demographic metadata, to analyse the global population dynamics of N. gonorrhoeae using 9,732 genomes. The study also includes a granular analysis of transmission between and within four regions of different sizes with high levels of data coverage: USA, Europe, Norway, and Victoria state in Australia.
The authors built a phylogenetic tree including all genomes using a novel computationally efficient method for removing genome regions resulting from recombination, which would otherwise result in incorrect branch lengths and tree topology. Using the tree, the authors show that the effective population size of N. gonorrhoeae, describing population size and diversity, decreased in the period from 2010 to present day, and was not entirely an artefact of sampling bias. The authors then stratified the tree based on isolates that contained alleles that are associated with resistance to antibiotics commonly used to treat gonorrhoea. The authors found resistance was associated with particular lineages, of which most, but not all, underwent shrinking in effective population size in the last decade.
Using the tree, the authors then inferred likely importation, exportation, and local transmission events, finding notable differences in the contribution of imports to local incidence between locations, as well as the likelihood of exportation. As inference of these events relies on sampling density, the authors used a novel method for identifying whether sampling was representative of the population diversity of a given location. Using this approach, they found that the densely sampled regions, Norway and Victoria, were likely representative of the local N. gonorrhoeae population diversity, whilst the larger, less densely sampled regions, Europe and USA, were not. Finally, they investigated the contribution of specific transmission networks to the spread of gonorrhoea, finding that the frequency of males within a transmission network may play a role in the rate of N. gonorrhoeae transmission in Norway, but not Victoria.
This work introduces several novel approaches to the analysis of pathogen population dynamics, and highlights notable differences in N. gonorrhoeae transmission between and within distinct geographic locations.
Strengths:
• The authors have collated a large global collection of N. gonorrhoeae genomes with associated metadata, and in some cases generated assemblies themselves. A dataset of this size and detail is a valuable asset to the public health community, enabling analysis of both national and international population dynamics.
• The stratification of the phylogenetic tree by antimicrobial resistance gene alleles enables the study of how antibiotic usage has shaped global and regional N. gonorrhoeae populations. Analysis of changes in the effective population size of clades harbouring resistance alleles is particularly impactful, as this can be used to show how changes in treatment patterns affect the growth or decline of drug-resistant pathogen populations. This analysis also enables the determination of the frequency of multiple resistance alleles being present in single isolates, important for determining the scale of multidrug resistance within the N. gonorrhoeae global population.
• The use of ancestral trait reconstruction to quantify importation, exportation and local transmission is an important contribution to public health efforts tackling N. gonorrhoeae spread. Understanding the differences in transmission networks within and between different geographic locations provides public health researchers with crucial information to model and implement effective targeted interventions on regional and international scales.
Weaknesses:
• The method used to generate the phylogenetic tree and mask regions of recombination is likely flawed. The authors repeatedly down-sampled the whole population to 500 genomes, using Gubbins to identify regions that have recombined and therefore would not follow the clonal history of the N. gonorrhoeae population. This small sample size will result in the same ancient internal nodes being sampled repeatedly, whilst more recent internal nodes will not. Therefore, more recent recombination events would not be identified by this method and were therefore likely included in the whole genome alignment used to build the tree. Furthermore, Gubbins was designed to identify recombination between closely related genomes, not across a whole species, where the background mutation rate will be too high to differentiate between recombined regions and the clonal frame. Both of these factors will mean that the amount of the genome predicted to have recombined will likely be underestimated, resulting in inflated branch lengths and incorrect tree topology. This effect is potentially the cause of the observed drop in N. gonorrhoeae effective population size between 2010 and the present day in Figure 2, which does not align with gonorrhoea incidence, and the elevated estimated mutation rate of 7.41 × 10⁻⁶ substitutions per site per year, which is higher than previous estimates based on N. gonorrhoeae global populations. The result of underestimation of recombined regions will be two-fold. Inclusion of recombined regions in the alignment will result in inflated branch lengths, which will impact all estimates of effective population size in the study. Furthermore, tree topology may be incorrect, which will impact ancestral trait reconstruction and result in incorrect inference of import, export and local transmission events in Figures 3, 4 and 5.
Additionally, the clade-specific resistance gene analyses will be affected in Figure 2, as certain isolates may be incorrectly included or excluded within stratified clades. Therefore, the conclusions made about the changes in effective population size for the global population, and individual clades, as well as the differences in transmission dynamics between locations, are likely to be incorrect.
• The method used to identify sampling bias, shown in Figure 4, is a novel and interesting take on the problem. However, it is not clear whether the effect being measured is the presence of sampling bias or an artefact of differences in N. gonorrhoeae diversity between locations. The results in Figure 4 do align with what is known about the population datasets; the data from Norway and Victoria is more comprehensive than that of the USA and Europe due to the difference in size of the respective human populations, meaning the likelihood of sampling bias will be lower in the smaller population. However, with increased human population size, we would also expect a greater amount of pathogen diversity, due to increased within-region transmission and greater numbers of importation events. Supporting this, we see in Figure 3 that the transmission lineages in the USA and Europe are estimated to have emerged earlier than Norway or Victoria, indicative of a greater amount of standing population diversity. Therefore, the reason why convergence is observed when up-sampling from smaller populations may be because a vast majority of isolates will sit within a small part of the tree, whilst from a larger, more diverse population, isolates will be placed all across the tree and so convergence will never be observed. In effect, it is unknown whether increasing the sample size of the USA and Europe to be truly representative of their respective N. gonorrhoeae populations would ever result in convergence between the two methods of up-sampling. Testing this method using simulations could be used to determine whether it is sensitive to sampling bias, or population diversity.
• In Figure 5, a significant difference in transmission lineage size was only found between male-dominated and mixed lineages in Norway and not Victoria. Therefore, the conclusion that sex distribution within transmission networks affects the size of transmission lineages is not supported by the data, and could also be due to geographical and other demographic differences between the datasets which were not accounted for.
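The concern in the first weakness above, that down-sampling to 500 genomes rarely captures recent internal nodes, can be made concrete with a rough calculation. The sketch assumes uniform sampling without replacement from the 9,732 genomes and treats a recent node as detectable only when both members of a closely related pair are drawn together. The number of repeated subsamples is not stated in this review, so `n_repeats` is left as a parameter, and both function names are hypothetical.

```python
def pair_cosample_prob(n_total, n_sub):
    """Probability that two specific genomes both land in one uniform subsample."""
    return (n_sub / n_total) * ((n_sub - 1) / (n_total - 1))


def prob_seen_at_least_once(p, n_repeats):
    """Probability the pair is co-sampled in at least one of n_repeats independent draws."""
    return 1.0 - (1.0 - p) ** n_repeats


# Dataset size and subsample size as given in the reviews.
p = pair_cosample_prob(9732, 500)  # about 0.0026 per subsample
```

Under these assumptions a given sister pair appears together in roughly one subsample in 380, so a recombination event on the branch leading to such a pair would only be visible to Gubbins if the number of repeats were large. Deep internal nodes, by contrast, are represented in almost every subsample, which is the asymmetry the reviewer describes.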
Reviewer #3 (Public Review):
Summary:
Here, Osnes et al. examine the population dynamics of Neisseria gonorrhoeae. They develop new methodologies to deal with the issue of recombination, as well as using ancestral state reconstruction approaches to quantify the number of import and export transmission events occurring in different regions in the world. Overall, they provide a framework for understanding intercontinental transmission that could be applied to other microbial pathogens.
Strengths:
A major strength of this study is the incredibly large number of genomes analysed, which span a wide temporal range with significant geographical diversity. The use of ancestral state reconstruction to quantitatively determine the number of import and export events of N. gonorrhoeae in densely sampled Norway and Victoria, Australia, is an interesting application of this well-known method and could be applied to other bacterial species that have been well-sampled.
Weaknesses:
The methods development to deal with the issue of recombination in their dataset to ensure that the recombination signal does not affect their dating estimates and effective population size analysis is thorough but has likely not been able to remove all bias. Additionally, the authors discuss the utility of using the identified transmission lineages in this study to better type N. gonorrhoeae as there are issues with traditional typing, such as MLST, due to the highly recombinogenic nature of this species. However, no method seems to be provided to enable future researchers to easily assign their genomes to the transmission lineages identified in this study.