Strong isolation by distance and evidence of population microstructure reflect ongoing Plasmodium falciparum transmission in Zanzibar

  1. Sean V Connelly (corresponding author)
  2. Nicholas F Brazeau
  3. Mwinyi Msellem
  4. Billy E Ngasala
  5. Ozkan Aydemir
  6. Varun Goel
  7. Karamoko Niaré
  8. David J Giesbrecht
  9. Zachary R Popkin-Hall
  10. Chris Hennelly
  11. Zackary Park
  12. Ann M Moormann
  13. John M Ong'echa
  14. Robert Verity
  15. Safia Mohammed
  16. Shija J Shija
  17. Lwidiko E Mhamilawa
  18. Ulrika Morris
  19. Andreas Mårtensson
  20. Jessica T Lin
  21. Anders Björkman
  22. Jonathan J Juliano
  23. Jeffrey A Bailey
  1. MD-PhD Program, University of North Carolina at Chapel Hill, United States
  2. Research Division, Ministry of Health, United Republic of Tanzania
  3. Department of Parasitology and Medical Entomology, Muhimbili University of Health and Allied Sciences, United Republic of Tanzania
  4. Global Health and Migration Unit, Department of Women's and Children's Health, Uppsala University, Sweden
  5. Department of Medicine, University of Massachusetts Chan Medical School, United States
  6. Carolina Population Center, University of North Carolina at Chapel Hill, United States
  7. Department of Pathology and Laboratory Medicine, Brown University, United States
  8. Institute for Global Health and Infectious Diseases, School of Medicine, University of North Carolina at Chapel Hill, United States
  9. Division of Infectious Diseases, Department of Medicine, School of Medicine, University of North Carolina at Chapel Hill, United States
  10. Center for Global Health Research, Kenya Medical Research Institute, Kenya
  11. MRC Centre for Global Infectious Disease Analysis, Imperial College London, United Kingdom
  12. Zanzibar Malaria Elimination Program (ZAMEP), United Republic of Tanzania
  13. Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Sweden
  14. Department of Global Public Health, Karolinska Institute, Sweden
  15. Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, United States
  16. Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, United States
5 figures, 2 tables and 1 additional file

Figures

Figure 1 with 4 supplements
Parasites between Zanzibar and coastal mainland Tanzania are highly related but microstructure within Zanzibar is apparent.

(A) Principal component analysis (PCA) comparing parasites from symptomatic vs. asymptomatic patients from coastal Tanzania and Zanzibar. Clusters with an identity by descent (IBD) value of > 0.90 were limited to a single representative infection to prevent local structure of highly related isolates within shehias from driving clustering. (B) A discriminant analysis of principal components (DAPC) was performed utilizing isolates with unique pseudohaplotypes, pruning highly related isolates to a single representative infection. Districts were included with at least five isolates remaining to have sufficient samples for the DAPC. For plotting the inset map, the district coordinates (e.g., mainland, Kati, etc.) were calculated from the averages of the shehia centroids within each district.
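The pruning step described in (A) can be sketched as single-linkage clustering on the pairwise IBD graph, keeping one isolate per cluster. This is a minimal illustration, not the authors' pipeline: the function name `prune_related`, the input layout, and the choice of the first-listed isolate as the representative are all assumptions.

```python
from collections import defaultdict


def prune_related(samples, pairwise_ibd, threshold=0.90):
    """Keep one representative per cluster of samples linked by IBD > threshold.

    samples: list of sample IDs (order decides which member is kept).
    pairwise_ibd: dict mapping (sample_a, sample_b) -> IBD estimate.
    """
    parent = {s: s for s in samples}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair whose IBD exceeds the threshold.
    for (a, b), ibd in pairwise_ibd.items():
        if ibd > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = defaultdict(list)
    for s in samples:
        clusters[find(s)].append(s)

    # Assumption: the first-listed member stands in for its cluster.
    return [members[0] for members in clusters.values()]


if __name__ == "__main__":
    ids = ["s1", "s2", "s3", "s4"]
    ibd = {("s1", "s2"): 0.95, ("s2", "s3"): 0.92, ("s3", "s4"): 0.10}
    print(prune_related(ids, ibd))
```

Single linkage means that two isolates end up in the same cluster if they are connected through a chain of pairs above the threshold, even if their own pairwise IBD is lower.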

Figure 1—figure supplement 1
Sampling locations in Zanzibar (shehia) and mainland (Bagamoyo district) Tanzania.

The centroids of the sampling locations are shown as blue rectangles. The ferry terminal in Zanzibar town is shown as a red rectangle. In Zanzibar, samples were collected throughout Unguja and in northern Pemba. In mainland Tanzania, samples were collected from Bagamoyo district.

Figure 1—figure supplement 2
Principal component analysis (PCA) utilizing samples across Africa shows clustering based on geographic location.

Samples from Ahero, Kenya (n = 147), a random 20% of samples from five regions across Africa (Verity et al., 2020) (n = 275) and from this study (n = 282) were subsetted to 756 common loci. Within-sample allele frequency (WSAF) was calculated, with an imputation step to replace missing values with the median WSAF, to perform PCA.
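As a rough illustration of the WSAF and imputation step, the sketch below computes WSAF as alternate-allele reads divided by total depth and fills missing values with a median. The caption does not say whether the median is taken per locus or overall; the per-locus choice here, the function names, and the 0.0 fallback for loci with no coverage in any sample are assumptions.

```python
from statistics import median


def wsaf_matrix(alt_counts, depths):
    """Rows are samples, columns are loci; None marks loci without coverage."""
    return [
        [alt / depth if depth > 0 else None for alt, depth in zip(alt_row, depth_row)]
        for alt_row, depth_row in zip(alt_counts, depths)
    ]


def impute_locus_median(wsaf):
    """Replace missing WSAF values with the median observed WSAF at that locus."""
    n_loci = len(wsaf[0])
    medians = []
    for j in range(n_loci):
        observed = [row[j] for row in wsaf if row[j] is not None]
        # Assumption: a locus missing in every sample is filled with 0.0.
        medians.append(median(observed) if observed else 0.0)
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in wsaf
    ]


if __name__ == "__main__":
    alt = [[0, 5], [10, 0], [5, 2]]
    depth = [[10, 10], [10, 0], [10, 4]]
    print(impute_locus_median(wsaf_matrix(alt, depth)))
```

The imputed matrix can then be centered and passed to any standard PCA routine.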

Figure 1—figure supplement 3
Molecular inversion probe (MIP) performance shows coverage of loci.

Panel (A) shows the log-transformed read depth for genome-wide single-nucleotide polymorphisms (SNPs) for samples (columns) and loci (rows). The log-transformed unique molecular identifier (UMI) count ranges from 0 to 9.87. Panel (B) shows the mean UMI coverage for the analyzed drug resistance mutations with a nonparametric bootstrap 95% CI.

Figure 1—figure supplement 4
Principal component analysis (PCA) with highly related samples shows population stratification radiating from coastal mainland to Zanzibar.

PCA of 282 total samples was performed using within-sample allele frequency (WSAF) (A), and discriminant analysis of principal components (DAPC) was performed after retaining samples with unique pseudohaplotypes in districts that had five or more samples present (B). In contrast to Figure 1, all isolates were used in this analysis; highly related isolates were not pruned to a single representative infection.

Figure 2 with 1 supplement
Coastal Tanzania and Zanzibari parasites have more highly related pairs within their given region than between regions.

K-means clustering was performed on the geographic coordinates of all shehias in the sample population to generate five clusters (colored boxes). All shehias were included so that pairwise identity by descent (IBD) could be compared across Zanzibar. K-means cluster assignments were converted into interpretable geographic names: Pemba, Unguja North (Unguja_N), Unguja Central (Unguja_C), Unguja South (Unguja_S), and mainland Tanzania (Mainland). Pairwise comparisons of within-cluster IBD (column 1 of IBD distribution plots) and between-cluster IBD (columns 2–5 of IBD distribution plots) were done for all clusters. All IBD values > 0 were plotted for each comparison. In general, within-cluster comparisons included more pairs with high IBD.

Figure 2—figure supplement 1
Diagnostic plot showing total within-cluster sum of squares versus number of clusters for the determination of optimal K.

Mainland samples were considered an independent cluster. We selected a K of 4 for determining clusters in Zanzibar based on the inflection point above.
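A diagnostic curve of this kind can be produced by running K-means for a range of K and recording the total within-cluster sum of squares (WSS) each time. The sketch below is a plain Lloyd's-algorithm illustration under stated assumptions: it treats coordinates as planar, uses a fixed random seed, and the name `kmeans_wss` is hypothetical. It is not the software used for the figure.

```python
import random


def _sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_wss(points, k, n_iter=100, seed=0):
    """Run Lloyd's k-means and return the total within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep empty groups in place.
        new_centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(_sq_dist(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (5.0, 5.0), (5.0, 7.0)]
    for k in range(1, 5):
        print(k, kmeans_wss(pts, k))
```

Plotting WSS against K and looking for the point where the curve flattens gives the inflection-based choice of K described here.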

Figure 3 with 1 supplement
Isolation by distance is shown between all Zanzibari parasites (A), only Unguja parasites (B), and only Pemba parasites (C).

Samples were analyzed based on geographic location: all of Zanzibar (N = 136) (A), Unguja (N = 105) (B), or Pemba (N = 31) (C). Great-circle (GC) distances between pairs of parasite isolates were calculated from shehia centroid coordinates. These distances were binned at 4 km increments out to 12 km. Identity by descent (IBD) beyond 12 km is shown in Figure 3—figure supplement 1. The maximum GC distance was 135 km for all of Zanzibar, 58 km on Unguja, and 12 km on Pemba. The mean IBD and 95% CI are plotted for each bin.
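The distance and binning steps can be illustrated with the haversine formula and a simple per-bin average. This is a sketch under assumptions: the caption does not state the distance formula or how bin edges were handled, so the haversine choice, the closed final bin, the Earth radius constant, and both function names are illustrative only.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_ibd_by_bin(pairs, bin_km=4.0, max_km=12.0):
    """Average IBD per distance bin.

    pairs: iterable of (distance_km, ibd) for each pair of isolates.
    Returns {bin_start_km: mean IBD}; pairs beyond max_km are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for dist, ibd in pairs:
        if dist > max_km:
            continue
        # Assumption: bins are [0, 4), [4, 8), [8, 12], so 12 km joins the last bin.
        start = min(int(dist // bin_km) * bin_km, max_km - bin_km)
        totals[start][0] += ibd
        totals[start][1] += 1
    return {start: total / count for start, (total, count) in sorted(totals.items())}


if __name__ == "__main__":
    demo = [(0.0, 0.4), (3.9, 0.2), (5.0, 0.1), (12.0, 0.05), (20.0, 0.9)]
    print(mean_ibd_by_bin(demo))
```

Pairs within the same shehia share a centroid and therefore fall at distance 0 in the first bin.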

Figure 3—figure supplement 1
Isolation by distance in Zanzibar isolates (A) and only Unguja isolates (B).

Samples were filtered based on location and great-circle distances were calculated. These distances were binned at 10 km increments. The mean IBD and 95% CI are plotted for each bin.

Figure 4 with 4 supplements
Highly related pairs span long distances across Zanzibar.

Sample pairs were filtered to have identity by descent (IBD) estimates of ≥ 0.25. Within-shehia pairwise IBD estimates are shown within Unguja (A) and Pemba (B) as single points, with dark green representing the greatest degree of IBD. Shehias labeled with black dots do not have within-shehia IBD estimates of ≥ 0.25. Between-shehia IBD reflects pairs of parasites with IBD ≥ 0.25, with the color of the connecting arc representing the degree of IBD and yellow representing maximal connectivity. Panel (C) shows the network of highly related pairs (IBD ≥ 0.25) within and between the six northern Pemba shehias (note: Micheweni is a shehia in Micheweni district). Samples (nodes) are colored by shehia and IBD estimates (edges) are represented on a continuous scale, with increasing width and yellow shading indicating higher IBD.

Figure 4—figure supplement 1
Network analysis of within shehia comparisons with an identity by descent (IBD) of ≥0.25 in Unguja.

Pairwise IBD comparisons of ≥ 0.25 within different shehias were used. If a shehia is not represented, it does not have a pairwise comparison meeting the IBD threshold. Shehias that did contain a pair with an IBD of ≥ 0.25 are plotted (A). Network analysis of related pairs (IBD ≥ 0.25) is plotted in (B). Line width increases with the magnitude of IBD between pairs.

Figure 4—figure supplement 2
Network analysis of sample pairs with an identity by descent (IBD) of ≥0.25 for coastal mainland Tanzania.

The network of highly related (IBD ≥ 0.25) pairs within coastal mainland Tanzania is plotted. Line width increases with the magnitude of IBD between pairs.

Figure 4—figure supplement 3
Sample pairs with an identity by descent (IBD) of ≥0.125 between Zanzibar and mainland Tanzania.

Relatively few sample pairs showed moderate levels of IBD (between 0.125 and 0.20) between the coastal mainland and Zanzibar.

Figure 4—figure supplement 4
Sample pairs with an identity by descent (IBD) of ≥0.125 between Unguja and Pemba.

Relatively few sample pairs showed moderate levels of IBD (between 0.125 and 0.20) between Unguja and Pemba.

Figure 5
Complexity of infection (COI) and Fws metrics show a higher COI and lower Fws in asymptomatic than in symptomatic infections in both mainland Tanzania and Zanzibar isolates.

COI (A) was estimated using the REAL McCOIL’s categorical method (Chang et al., 2017). The mean COI was greater for asymptomatic than for symptomatic infections in all regions; MAIN-A: 2.5 (2.1–2.9), MAIN-S: 1.7 (1.6–1.9), p<0.05, Wilcoxon–Mann–Whitney test and ZAN-A: 2.2 (1.7–2.8), ZAN-S: 1.7 (1.5–1.9), p=0.05, Wilcoxon–Mann–Whitney test. Fws (B) was estimated using the formula 1 − Hw/Hp, where Hw is the within-sample heterozygosity and Hp is the heterozygosity across the population. Mean Fws was lower in asymptomatic than in symptomatic samples; MAIN-A: 0.67 (0.6–0.7), MAIN-S: 0.85 (0.8–0.9), p<0.05, Wilcoxon–Mann–Whitney test and ZAN-A: 0.73 (0.6–0.8), ZAN-S: 0.84 (0.8–0.9), p=0.05, Wilcoxon–Mann–Whitney test. A nonparametric bootstrap was applied to calculate the mean and 95% CI from the COI and Fws values.
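For orientation, the Fws calculation and the bootstrap summary can be sketched as follows. The sketch assumes the conventional form Fws = 1 − Hw/Hp, biallelic loci with heterozygosity 2p(1 − p) averaged across loci, and a percentile bootstrap with a fixed seed. These choices and all function names are assumptions, not a description of the authors' code.

```python
import random
from statistics import mean


def heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fws(sample_wsaf, population_af):
    """Fws = 1 - Hw/Hp, with heterozygosity averaged over loci (simplified)."""
    hw = mean(heterozygosity(p) for p in sample_wsaf)
    hp = mean(heterozygosity(p) for p in population_af)
    return 1.0 - hw / hp


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap: return (mean, lower, upper) for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(values), lower, upper


if __name__ == "__main__":
    # A sample with only fixed alleles has Hw = 0 and therefore Fws = 1.
    print(fws([0.0, 1.0, 0.0], [0.5, 0.5, 0.5]))
    print(bootstrap_mean_ci([1, 2, 3, 4, 5]))
```

Under this form, values near 1 indicate a sample with little within-host diversity relative to the population, which matches the direction of the COI comparison.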

Tables

Table 1
Blood samples from Zanzibar and coastal Tanzania.
Description | Location (district) | Dates | Clinical status* | Sample size | Age range (yr) | # in genome-wide analysis | # in drug resistance analysis
Community cross-sectional surveys | Zanzibar (multiple) | 2016 | A | 70 | 2–70 | 21 | 52
In vivo efficacy study of artesunate-amodiaquine (ASAQ) with single low-dose primaquine (SLDP) in pediatric uncomplicated malaria patients | Zanzibar (multiple) | 2017 | S | 143 | 2–60 | 117 | 134
Study of transmission of Plasmodium falciparum to colony-reared mosquitoes | Mainland Tanzania (Bagamoyo) | 2018 | A | 40 | 7–16 | 34 | 0
Parasite clearance study of artemether-lumefantrine (AL) | Mainland Tanzania (Bagamoyo) | 2018 | S | 138 | 2–11 | 110 | 123
* Asymptomatic (A) or symptomatic (S).

Table 2
Drug resistance polymorphism prevalence in Zanzibar and coastal mainland Tanzania.
Mutation | Zanzibar: mutant allele prevalence* | Zanzibar: 95% CI | Zanzibar: # genotyped samples | Mainland Tanzania: mutant allele prevalence* | Mainland Tanzania: 95% CI | Mainland Tanzania: # genotyped samples
Pfcrt-M74I | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106
Pfcrt-N75E | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106
Pfcrt-K76T | 0.054 | 0.026–0.098 | 184 | 0.000 | 0–0.034 | 106
Pfdhfr-A16V | 0.000 | 0–0.021 | 173 | 0.000 | 0–0.032 | 112
Pfdhfr-N51I | 0.977 | 0.943–0.994 | 177 | 0.964 | 0.911–0.99 | 112
Pfdhfr-C59R | 0.971 | 0.934–0.991 | 174 | 0.945 | 0.884–0.98 | 109
Pfdhfr-S108N | 1.000 | 0.98–1 | 179 | 1.000 | 0.965–1 | 104
Pfdhfr-S108T | 0.000 | 0–0.02 | 179 | 0.000 | 0–0.035 | 104
Pfdhfr-I164L | 0.000 | 0–0.02 | 184 | 0.000 | 0–0.037 | 98
Pfdhps-A437G | 1.000 | 0.98–1 | 182 | 1.000 | 0.968–1 | 115
Pfdhps-K540E | 0.955 | 0.913–0.98 | 178 | 0.964 | 0.91–0.99 | 111
Pfdhps-A581G | 0.044 | 0.019–0.085 | 181 | 0.107 | 0.058–0.175 | 122
Pfk13-K189N | 0.023 | 0.006–0.058 | 174 | 0.000 | 0–0.04 | 90
Pfk13-K189T | 0.078 | 0.042–0.13 | 166 | 0.095 | 0.042–0.179 | 84
Pfmdr1-N86Y | 0.011 | 0.001–0.04 | 180 | 0.008 | 0–0.044 | 124
Pfmdr1-Y184F | 0.644 | 0.57–0.714 | 180 | 0.530 | 0.435–0.624 | 115
Pfmdr1-D1246Y | 0.011 | 0.001–0.039 | 184 | 0.019 | 0.002–0.067 | 105
Pfmdr2-I492V | 0.430 | 0.357–0.506 | 179 | 0.407 | 0.302–0.518 | 86
* Prevalence was calculated as described in the ‘Methods’.

95% CIs for these polymorphisms were calculated using the Clopper–Pearson method.

The number of genotyped samples per locus is also shown for each polymorphism.
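The exact binomial interval named in the table notes can be reproduced approximately without external libraries by inverting the binomial CDF numerically. The sketch below is one possible implementation under assumptions: it takes an integer count of mutant calls `k` out of `n` genotyped samples, which may differ from how prevalence was derived in the Methods, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # CDF still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes out of n."""
    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Zero mutant calls among 179 genotyped samples.
    print(exact_binomial_ci(0, 179))
```

When no mutant alleles are observed, the upper bound has the closed form 1 − (α/2)^(1/n), which is why rows with a prevalence of 0 still carry a nonzero upper limit that shrinks as the number of genotyped samples grows.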

Cite this article

Sean V Connelly, Nicholas F Brazeau, Mwinyi Msellem, Billy E Ngasala, Ozkan Aydemir, Varun Goel, Karamoko Niaré, David J Giesbrecht, Zachary R Popkin-Hall, Chris Hennelly, Zackary Park, Ann M Moormann, John M Ong'echa, Robert Verity, Safia Mohammed, Shija J Shija, Lwidiko E Mhamilawa, Ulrika Morris, Andreas Mårtensson, Jessica T Lin, Anders Björkman, Jonathan J Juliano, Jeffrey A Bailey (2024)
Strong isolation by distance and evidence of population microstructure reflect ongoing Plasmodium falciparum transmission in Zanzibar
eLife 12:RP90173.
https://doi.org/10.7554/eLife.90173.3