SIMMER employs similarity algorithms to accurately identify human gut microbiome species and enzymes capable of known chemical transformations
Abstract
Bacteria within the gut microbiota can metabolize a wide array of human drugs, foods, and toxins, but the enzymes responsible for these chemical events remain largely uncharacterized because current experimental approaches are time-consuming. Previous attempts to computationally predict which bacterial species and enzymes are responsible for chemical transformations in the gut environment have had low accuracy, owing to minimal chemical representation and sequence similarity search schemes. Here, we present an in silico approach that employs chemical and protein Similarity algorithms that Identify MicrobioMe Enzymatic Reactions (SIMMER). We show that SIMMER, unlike previous methods, accurately predicts the species and enzymes responsible for a queried reaction. We demonstrate SIMMER use cases in the context of drug metabolism by predicting previously uncharacterized enzymes for 88 drug transformations known to occur in the human gut. We validate these predictions on external datasets and provide an in vitro validation of SIMMER's predictions for metabolism of methotrexate, an anti-arthritic drug. After demonstrating its utility and accuracy, we made SIMMER available as both a command-line and web tool, with flexible input and output options for determining chemical transformations within the human gut. We present SIMMER as a computational addition to the microbiome researcher's toolbox, enabling researchers to form informed hypotheses before embarking on the lengthy laboratory experiments required to characterize novel bacterial enzymes that can alter human ingested compounds.
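To make the idea of a chemical similarity search concrete, the following is a minimal sketch of ranking known reactions by their similarity to a queried reaction. It assumes a Tanimoto coefficient computed over sets of structural features. The abstract does not specify how SIMMER represents or scores reactions, so the feature sets, reaction names, and function names below are hypothetical and illustrate only the general technique, not SIMMER's implementation.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) coefficient: shared features / total distinct features."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_reactions(
    query: FrozenSet[str], known: Dict[str, FrozenSet[str]]
) -> List[Tuple[str, float]]:
    """Return (reaction name, similarity) pairs, most similar first."""
    scored = [(name, tanimoto(query, features)) for name, features in known.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: each known reaction is reduced to a set of feature labels.
KNOWN_REACTIONS: Dict[str, FrozenSet[str]] = {
    "amide_hydrolysis": frozenset({"amide", "carboxylate", "amine"}),
    "nitro_reduction": frozenset({"nitro", "aryl", "amine"}),
    "azo_reduction": frozenset({"azo", "aryl", "amine"}),
}

# Hypothetical query reaction, described with the same feature vocabulary.
QUERY: FrozenSet[str] = frozenset({"amide", "carboxylate", "aryl", "amine"})

ranking = rank_reactions(QUERY, KNOWN_REACTIONS)

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}\t{score:.2f}")
```

In a pipeline of this general shape, the enzymes annotated for the top-ranked known reactions would then seed a protein similarity search against gut microbiome genomes; that second step is not shown here.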
Data availability
Data generated and analyzed during this study are provided in Figures 2-10 source data files, Table 1 source data file, supplemental files, and at https://github.com/aebustion/SIMMER. Accession numbers of previously published datasets are provided in the Materials and Methods section. SIMMER code can either be run at the SIMMER website (https://simmer.pollard.gladstone.org/) or downloaded directly from the GitHub repository linked above.
Article and author information
Funding
PhRMA Foundation (Predoctoral Fellowship)
- Annamarie E Bustion
ARCS Foundation (Graduate Student Scholarship)
- Annamarie E Bustion
UCSF Benioff Center for Microbiome Medicine (Trainee Pilot Award)
- Annamarie E Bustion
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2023, Bustion et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 2,055 views
- 175 downloads
- 11 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.