PREDICT Study Design with Patient Diagnostic Criteria and Histopathology

a. Study overview depicting clinical and cellular measurements from 13 functional gastrointestinal disorder (FGID) patients and 14 pediatric Crohn’s disease (pediCD) patients. Terminal ileum biopsies were isolated at a treatment-naïve diagnostic visit, and pediCD patients were followed up to determine their anti-TNF response and categorized as not on anti-TNF (NOA), Full Response (FR), or Partial Response (PR) (see Methods). Two panels of flow cytometry allowed for relative frequency quantification of 32 cell types and subsets, and 10X 3’ v2 single-cell RNA-sequencing (scRNA-seq) captured 254,911 total cells including 118 FGID and 305 pediCD end clusters through ARBOL (see Methods).

b. Demographic data, weight, height, and BMI for cohort (see Table 1 and Supplemental Figure 1); Significance testing by Mann-Whitney U-test.

c. Clinical inflammatory laboratory values for cohort (see Table 1 and Supplemental Figure 1); Significance testing by Mann-Whitney U-test.

d. Representative histopathology of FGID (top) and pediCD (bottom) at 10x (scale bar = 100um) and 40x (scale bar = 20um) magnification.

e. Representative treatment history and clinical inflammatory parameters used for determination of NOA (p027), FR (p011) and PR (p018) status (see Methods, Table 1, and Supplemental Figure 1; ADA: adalimumab, INF: infliximab; MTX: methotrexate; Pred: prednisone; mSCD: modified specific carbohydrate diet; EEN: exclusive enteral nutrition).

Demographics and clinical characteristics of patients analyzed on PREDICT. *

indicates significance detected between Crohn’s Disease (CD) and Functional Gastrointestinal Disorder (FGID).

Demographics and clinical characteristics of Crohn’s Disease cohorts.*

indicates significance detected between

Flow Cytometry of Ileal Biopsies Does Not Reveal Significant Changes in Cell Composition in FGID vs. pediCD or across the pediCD Treatment Response Spectrum

a. Representative flow cytometry end gates for selected cell subsets (left: epithelial and hematopoietic; middle: naïve and effector T cells; right: pDCs and antigen-presenting cells) from single-cell dissociated samples from one terminal ileum biopsy for pediCD patients (see Supplemental Figure 2 for full gating strategy; numbers represent percentage of cells in each gate).

b. Fractional composition of selected cell subsets of CD45+ cells from 13 FGID and 14 pediCD patients (error bars are s.e.m).

c. Fractional composition of selected cell subsets of CD45+ cells from 4 NOA, 5 FR and 5 PR patients.

d. Fractional composition of dendritic, pDC, central memory (CM) and effector memory (EM) CD4+ and CD8+ cells from 13 FGID vs 14 pediCD patients. Dendritic cells and pDC plotted as percentage of CD45+ cells. CM/EM CD4+ and CD8+ cells plotted as percentage of total CD4+ and CD8+ cells, respectively. p < 0.05 by Mann-Whitney for pediCD versus FGID and 1-way ANOVA for pediCD cohorts).

e. Fractional composition of dendritic cells, pDCs, central memory (CM) and effector memory (EM) CD4+ and CD8+ cells from 4 NOA, 5FR, and 5 PR patients. Graphs plotted as in d.

A Comprehensive Cell Atlas of Terminal Ileum in Non-Inflammatory FGID

a. tSNE of 99,488 single-cells isolated from terminal ileal biopsies of 13 FGID patients. Colors represent major cell type groups determined via Louvain clustering with resolution set by optimized silhouette score.

b. tSNE as in a with individual patients plotted. For specific proportions please see Supplemental Figure 4.

c. tSNE of each major cell type which was used as input into iterative tiered clustering (ITC).

d. End clusters determined by ARBOL of complete FGID data set were hierarchically clustered on the median expression of 4,445 pairwise differentially expressed genes, using complete linkage and distance calculated with Pearson correlation, between each end cell cluster. Simpson’s Index of Diversity represented as 1-Simpson’s where 1 (black) indicates equivalent richness of all patients in that cluster, and 0 (white) indicates a completely patient-specific subset. Numbers represent the number of cells in that cluster. Names of subsets are determined by Disease.CellType.GeneA.GeneB as in Methods.

e. Dot plot of 2 defining genes for each cell type. Dot size represents fraction of cells expressing the gene, and color intensity represents binned count-based expression level (log(scaled UMI+1)) amongst expressing cells. All cluster defining genes are provided in Supplemental Tables 5 to 7.

A Comprehensive Cell Atlas of Terminal Ileum in pediCD

a. tSNE of 124,054 single-cells isolated from terminal ileal biopsies of 14 pediCD patients. Colors represent major cell type groups determined via Louvain clustering with resolution set by optimized silhouette score.

b. tSNE as in a with individual patients plotted. For specific proportions please see Supplemental Figure 4.

c. tSNE of each major cell type which was used as input into iterative tiered clustering (ITC).

d. End clusters determined by ARBOL of complete pediCD data set were hierarchically clustered on the median expression of 1,844 pairwise differentially expressed genes, using complete linkage and distance calculated with Pearson correlation, between each end cell cluster. Simpson’s Index of Diversity represented as 1-Simpson’s where 1 (black) indicates equivalent richness of all patients in that cluster, and 0 (white) indicates a completely patient-specific subset. Numbers represent the number of cells in that cluster. Names of subsets are determined by Disease.CellType.GeneA.GeneB as in Methods.

e. Dot plot of 2 defining genes for each cell type. Dot size represents fraction of cells expressing the gene, and color intensity represents binned count-based expression level (log(scaled UMI+1)) amongst expressing cells. All cluster defining genes are provided in Supplemental Tables 8 to 10.

A Collective Cell Vector in pediCD Reveals Predictive Axes of Disease Trajectory and Treatment Response

a. Spearman rank correlation heatmap of principal components calculated from the frequencies of each end cluster per main cell type together with clinical metadata. Correlation is represented by both the intensity and size of the box and those which are FDR < 0.05 have a bounding box.

b. A PCA plot of each patient’s end cell cluster composition as determined by ARBOL for T/NK/ILC, from the pediCD dataset where the end cell clusters frequency is calculated amongst all cell types and colors represent anti-TNF response. (inset highlights the specific correlation between PC2 of the T, Myeloid, Epithelial cell frequency analysis with anti-TNF response). We term this principal component “pediCD-TIME”.

c. Cell cluster frequencies of the parent cell type found to be significant by Mann-Whitney U test between selected clusters (see Supplemental Figure 7 for all graphs; Supplemental Table 15).

d. Heatmap showing cell frequencies per patient of most positive and negative cell subsets of PC2 from PCA performed on T/NK/ILC, myeloid and epithelial cell subsets (Supplemental Table 16). Cell subsets are sorted by PC2 score, and patients were sorted by anti-TNF response. Heatmap is not normalized and displaying the log counts-per million of each cell subset normalized per cell type. *Patient p022’s response category changed from FR to PR after database lock in December of 2020. No other patient’s categorization has changed.

ARBOL analysis of joint FGID and pediCD T/NK/ILCs, Epithelial and Myeloid cells identifies high-diversity clusters that drive disease severity signature

a. A principal component analysis (PCA) plot of each patient’s end cell cluster composition as determined by ARBOL for T/NK/ILC, Myeloid, and Epithelial cells from FGID and pediCD patients where the end cell clusters frequency is calculated amongst total cells and colors represent disease state and treatment response.

b. The Spearman rank correlation between PC1 of the joint T/NK/ILC, Myeloid, and Epithelial cell frequency analysis with anti-TNF response.

c. A UMAP plot of 7,366 activated T/NK/ILC cells from FGID (colored by disease) and pediCD (colored by treatment response) data sets.

d. Hierarchical clustering of activated T/NK/ILC cells from FGID and pediCD data set with input clusters determined based on results of ARBOL, and performed on the median expression of all genes, using complete linkage and distance calculated with Pearson correlation, between each end cell cluster. Simpson’s Index of Diversity represented as 1-Simpson’s where 1 (black) indicates equivalent richness of all patients in that cluster, and 0 (white) indicates a completely patient-specific subset. Numbers represent the number of cells in that cluster. Names of subsets are determined by Disease.CellType.GeneA.GeneB as in Methods, and names in red indicate strongest PR-PC1-negative loadings. Pie charts represent the fractional composition of FGID or pediCD cells (note the initial proportions for reference).

e. Top JOINT-PC1-negative clusters are both (left) high diversity and (right) almost exclusively found in pediCD patients.

Histological and scRNA-seq analyses indicate CD4+ T cell infiltration into the epithelium of pediCD samples

a. Representative (left) multiplex immunofluorescence of B cells (CD20), T cells (CD3), cytotoxic T cells (CD8), regulatory T cells (FOXP3), myeloid cells (CD68) and epithelium (panCK) which was then (right) analyzed using an automated classifier mask to divide sections into epithelium, lamina propria and glass areas for cellular quantification.

b. Representative CD and FGID sections stained with indicated markers, inset focuses on T cells within epithelium. Quantification of 8 pediCD and 10 FGID participants is represented as a row z-scored hierarchically-clustered heatmap for total number of indicated cells per mm2 of whole section, epithelium, or lamina propria.

c. Cells per mm2 of epithelium for CD3+ cells, CD3+CD8- cells (CD4 T cells), and CD3+CD8- FOXP3- (effector CD4 T cells) for pediCD and FGID participants.

d. Spearman rank correlation heatmap of the counts-per-million for each of the top 25 clusters defining pediCD-TIME positive (NOA-associated) and pediCD-TIME negative (PR-associated) vectors. Correlation is represented by both the intensity and size of the box and those which are FDR < 0.05 have a bounding box

Anti-TNF treatment shifts the pediCD cellular ecosystem towards adult treatment-refractory disease

a. GSEA analysis showing the ranks of 92 PREDICT markers (markers of top 25 cell states associated with disease severity and treatment outcomes) in bulk RNA sequencing of illeal or colonic mucosa of two other treatment-naïve cohorts (pediatric RISK cohort, n = 69, adult E−MTAB−7604 cohort, n = 43) comparing pediCD patients who did or did not respond to anti-TNF therapy. P-value is estimated based on an adaptive multi-level split Monte-Carlo scheme.

b. UMAP plots of 275,544 cells from PREDICT ileal biopsies (Baseline: 107,432 cells; Repeat: 47,796 cells) or GIMATS anti-TNF treated ileal resections (GIMATS Uninflamed: 61,965; GIMATS Inflamed 58,351) integrated using Harmony (batch = 10x version).

c. A UMAP plot from cells in (b) colored by major cell type.

d. A PCA plot (left) or UMAP plot (right) of each patient’s end cell cluster composition as determined by ARBOL with Harmony integration (batch = 10x version) for T, B, Myeloid, Fibroblast, Plasma from the treatment-naïve PREDICT pediCD samples (n=13), repeat PREDICT biopsies (n=2 NOA, 6 on-anti-TNF), and adult on-treatment GIMATS biopsies (n=22) where the end cell clusters frequency is calculated amongst total cells and colors represent severity and cohorts are shapes.

e. A UMAP trajectory of the CPM table underlying (d).

f. The top 50 varying (yellow) cell clusters along the pseudotime trajectory as determined by Moran’s I test (q-value < 0.000029).

g. Cells per million of significantly varying T/NK/ILC clusters where individual points represent patient biopsies colored by severity and cohorts as shapes (as in d)

Clinical trajectory and treatments for all pediCD patients

a. Representative treatment history and clinical inflammatory parameters used for determination of NOA, FR and PR status for all pediCD patients (see Methods, Table 1, and Figure 1; ADA: adalimumab, INF: infliximab; MES: mesalamine MTX: methotrexate; Pred: prednisone; mSCD: modified specific carbohydrate diet; EEN: exclusive enteral nutrition)

Representative gating strategies for flow cytometry

a. Representative gating strategy for Panel 1 focused on T cells and myeloid cells, for antibodies please see Supplemental Table 1.

b. Representative gating strategy for Panel 2 focused on non-classical T cells and innate lymphoid cells (NB: Lineage = CD14, CD20, CD11c, CD11b, CD56), for antibodies please see Supplemental Table 1.

Comparison of quality control measures reveals similar sequencing depths and gene capture between FGID and pediCD

a. Quality control measures for scRNA-seq of ileal biopsies of 27 patients (13 FGID, 14 pediCD) included in the study. Top two graphs denote total genes (nFeature) and UMIs (nCount) after normalization with SCTransform. Lower graphs denote total genes (nFeature), UMIs (nCount) and mitochondrial read percentage (mt.percentage) of pre-processed 10X 3’ v2 single-cell RNA-sequenced samples. Cutoffs for individual samples are represented in Supplemental Table 2.

b. Quality control measures as in a split by cell type.

c. Comparison of total genes captured (nFeature, left) and total UMIs (nCount, right) between FGID (blue) and pediCD (red) split by cell type.

Traditional clustering with SCTransform normalization reveals similarities across cell types in FGID and pediCD

a. UMAP representing one round of clustering of 197,281 single-cells across FGID and pediCD samples. Traditional clustering performed on both diseases together. Colors represent major cell types determined by one round of clustering with Seurat RunUMAP parameters (PCs = 1:50, n.neighbors = 50, min.dist = 1). Cell types were assigned based on significantly upregulated marker genes (Wilcoxon; p.adj<0.05) obtained from comparison of specific cell type versus all other cell types.

b. UMAP as in a colored to highlight FGID (blue) and pediCD (red) cells.

c. UMAP as in a colored by Tier 1 ITC clusters performed separately for FGID and pediCD (NB: this was used for Figure 3 and Figure 4).

d. Comparison of cell cluster frequencies between FGID (blue) and pediCD (red) with cells clustered (left) separately or (right) together. Patient contributions denoted by circles (FGID) and triangles (pediCD).

e. Differentially expressed genes across cell type in FGID vs pediCD determined to be significant by Wilcoxon test (logFC>0.25, FDR<0.001).

f. Volcano plots for Myeloid, Epithelial, T-cell clusters denoting differentially expressed genes in FGID vs. pediCD. Those in pink are significant by Wilcoxon test; Supplemental Table 4 for full gene lists.

g. UMAPs of jointly clustered pediCD and FGID Myeloid cells.

Schematic for iterative tiered clustering and random forest classifier approach

a. Flowchart depicting iterative tiered clustering (ITC) used for generating FGID and pediCD cellular atlases. After sequencing, cells underwent quality control and a cell by gene expression matrix was derived from the 27 ileal samples. Dimensionality reduction and graph-based clustering were performed using the standard Seurat workflow to annotate cell types. Resulting clusters were then iteratively processed through the same pipeline unless end conditions were met. Each cluster was checked for three end conditions which included: only one cluster remaining, two clusters remaining with no more than 5 up and down regulated genes as determined by Wilcoxon test (logFC > 1.5, FDR < 0.001), and/or less than 100 cells in the cluster. Iterative clustering stopped if any of the three conditions are met. Unlike traditional clustering, in ITC principal component and clustering resolution parameters are chosen automatically. Stop conditions are built in as parameters to the ITC pipeline, allowing customization to the dataset. From the results of ITC, we then build the cell cluster taxonomical tree through hierarchical clustering (Figures 3 and 4).

b. Cell and cluster numbers after various processing steps tabulated

c. Random forest classifier approach for integrating FGID and pediCD datasets. FGID and pediCD datasets were used as training datasets to create random forest predictors used in downstream sub-clustering of cell types and subsets. The opposing dataset was then tested by each algorithm independently to determine correspondence and bias as depicted in Figure 6 and Supplemental Figure 8.

d. A visual representation of the parameter scanning method outputs used that illustrate resolution vs. clusters, numbers of clusters vs. silhouette, and resolution vs. silhouette. The resolution parameter is optimized to maximize the silhouette score.

e. ARBOL was run independently in two distinct Google Cloud Platform Terra sessions by two distinct computationalists on the nasal polyp dataset from Ordovas-Montanes et al., returning a 1:1 correlation of clustering results (Ordovas-Montanes et al., 2018, p.).

Representative marker genes for myeloid and T cells.

a. Dot plot of curated genes related to myeloid biology. Dot size represents fraction of cells expressing the gene, and color intensity represents binned count-based expression level (log(scaled UMI+1)) amongst expressing cells. All cluster defining genes are provided in Supplemental Table 7 and Supplemental Table 10. Dot size is only plotted if more than 5% of cells are expressing the transcript. Names are descriptive names generated from inspection of ITC output which were then converted to standardized naming scheme as in Methods.

b. Dot plot of marker genes related to T/NK/ILC lymphoid biology as in a.

Cell types associated with pediCD severity after PCA analysis

a. Volcano plots for T/NK/ILC clusters between NOA, FR and PR, where named clusters are significant by Fisher’s exact test and those in pink are significant by Mann-Whitney U test.

b. Cell cluster frequencies of the parent cell type found to be significant by Mann-Whitney U test between NOA and FR/PR.

c. Cell cluster frequencies of the parent cell type between NOA and FR (as above).

d. Cell cluster frequencies of the parent cell type between NOA and PR (as above).

e. Cell cluster frequencies of the parent cell type between FR and PR (as above).

Illustrative example of ARBOL on proliferating T cells and comparison with differential expression and topic modeling

a. t-SNE representation of ARBOL clusters at the overarching cell type (top), tier 2 clustering of T/NK/ILC (middle), and end-cluster levels (bottom). ARBOL sub-clusters at each level based on gene modules with appreciable biological relevance, as shown by heatmaps of genes (rows) expressed in sub-clusters at both the tier 2 and end-cluster levels. Red box in tier 2 heatmap (middle) denotes ARBOL sub-clustering identification of proliferating T and NK clusters, of which several sub-clusters (bottom) are important for severity. Severity associated T/NK/ILC end-clusters (bottom) are found in multiple tier 2 clusters (T1C0.T2C11 + T1C0.T2C7).

b. Violin plots of select genes important to ARBOL end-clusters and volcano plots of all genes at the level of proliferating T cells (T1C0.T2C11). MAST differential expression at this level between severity conditions finds IFNG significantly upregulated in PR vs. NOA, but not significant in other tests, nor are FOXP3, GZMA, or IL22 in any comparison.

c. Topic modeling performed on proliferating T cells (T1C0.T2C11) finds topics unique to ARBOL end-clusters, providing another avenue for identification of gene modules found by high-resolution ARBOL sub-clustering.

d. Topics found in (c) and principal components used by ARBOL for sub-clustering T1C0.T2C11 display covariation with severity conditions.

e. Topics found in (c) and their relationship to specific anti-TNF disease outcomes.

Leave-one-out cross-validation (LOOCV) to determine robustness of cluster contributions to PC loadings in pediCD severity vector

a. Top 15 positive and negative pediCD-TIME associated ARBOL cell clusters. Loadings of PC2 (pediCD-TIME) from PCA of CPM per total cell number per patient calculated with full dataset in red and grey bar showing mean of LOOCV. Error bars show standard deviation around the mean.

b. As in a, but CPM calculated per cell type (e.g. T.MKI67.FOXP3 /(n T/NK/ILC) * 100000 per patient). PC1 of per-cell-type CPM matrix is associated with severity (R=0.72) and is more robust to LOOCV.

c. Full dataset PCA used in a (as in Figure 5).

d. Full dataset PCA used in b.

Random Forest (RF) Classifier Applied to Myeloid Cellular Taxonomies Identifies Correspondence between FGID and pediCD

a. Correspondence between cell subsets from FGID-to-pediCD and pediCD-to-FGID.

Top left heatmaps: RF probabilities for each cell averaged over subset to gain probability of each FGID matching onto each pediCD subset (left), and pediCD onto FGID (right).

Bubble plot (center): size = sum(probability matrices) for confidence of predictions, marker color = diff(probability matrices) to show which direction the RF model is more confident on, e.g. more likely for FGID subset to belong to pediCD subset or pediCD subset to belong to FGID subset. Markers are filtered to show the top 10th percentile of correspondence.

Dendrograms: separated-tiered clustering on prediction probabilities of FGID (blue) and pediCD (red) using complete linkage with correlation distance metric, clusters are cut at height 0.7 (range 0-1).

Heatmap: 1-Gini-Simpson index based on patient diversity, mono-patient clusters (white), full representation (black). Right 3 columns show row-normalized of frequency of NOA, FR, PR representation in each CD cell subset. Significant differences (Mann-Whitney, alpha=0.05) are marked, triangle NOA vs. PR and circle NOA vs. FR.

b. Distribution of Gini-Simpson’s index of patient diversity in FGID (top) and pediCD (bottom) for myeloid cell clusters.

c. Sankey plot comparing joined traditional single-level clustering (left) to disease-separated iterative tiered clustering (right). Each line follows each cell as it moves between in the two cluster sets (back bar split based on cluster identity).

d. Gini-Simpson index on representation of traditional clusters in each of the separated tiered clusters (i.e., from how many of the higher-level clusters does the deep clustering pull). Calculated separately for FGID (blue) and pediCD (red).

e. Similar to d but showing the total counts of how many traditional clusters are represented in a single tiered cluster per disease.

f. UMAP of combined Myeloid cells: red shows example end clusters from ITC that are split across the traditional-clustering joint-disease UMAP.

Random Forest classification applied to T cell subsets and integration using STACAS

a. Correspondence between cell subsets from FGID-to-pediCD and pediCD-to-FGID.

Top left heatmaps: RF probabilities for each cell averaged over subset to gain probability of each FGID matching onto each pediCD subset (left), and pediCD onto FGID (right).

Bubble plot (center): size = sum(probability matrices) for confidence of predictions, marker color = diff(probability matrices) to show which direction the RF model is more confident on, e.g. more likely for FGID subset to belong to pediCD subset or pediCD subset to belong to FGID subset. Markers are filtered to show the top 10th percentile of correspondence.

Dendrograms: separated-tiered clustering on prediction probabilities of FGID (blue) and pediCD (red) using complete linkage with correlation distance metric, clusters are cut at height 0.7 (range 0-1).

Heatmap: 1-Gini-Simpson index based on patient diversity, mono-patient clusters (white), full representation (black). Right 3 columns show row-normalized of frequency of NOA, FR, PR representation in each pediCD cell subset. Significant differences (Mann-Whitney, alpha=0.05) are marked, triangle NOA vs. PR and circle NOA vs. FR.

b. T cells from the main FGID (n = 29,640 cells) and pediCD (n = 38,031) datasets were integrated using identification of mutual nearest neighbors in a reduced space (reciprocal PCA method) with the STACAS package. UMAP plots show distribution of cells coming from FGID (blue) and pediCD (red) datasets and 11 clusters obtained using Louvain algorithm. Sankey plot shows the contribution of ARBOL clusters to each Louvain cluster in the integrated dataset.

Up-sampling improves Random Forest model test recall, accuracy, and precision.

Points represent the test recall (top left), precision (bottom left), or accuracy (top right) for random forest models. Error bars show the mean and standard deviation of the 5 test scores for each model under two methods of 5-fold cross validation (CV), traditional CV (red) and up-sampled CV (blue). X-axis shows 20 RF models (10 trained on FGID, 10 trained on CD), each trained on a major cell type to predict which ARBOL end-cluster of that major type a provided cell transcriptome belongs to. Recall, accuracy, and precision are calculated on the remaining 20% of the dataset in each of the 5-fold CV. SciKit-Learn’s precision and recall score function https://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html was used with the “weighted” option to account for multi-class label imbalance. Up-sampled 5-fold cross validation was performed (blue) by up-sampling per label to the largest label size. For up-sampled CV, recall, precision, and accuracy are calculated as normal on the remaining 20% of data. Across major cell types, the accuracy of the RF models improved on average 9.8% when using the up-sampled 5-fold CV in the CD disease condition and improved 6.5% on average across major cell types in the FGID disease condition. Precision increased by 13% in CD and 8.7% in FGID respectively. Recall increased by 9.8% in CD and 6.5% in FGID respectively.

Distinct Distributions of Macrophages Across the pediCD Treatment Response Spectrum Relative to FGID

a. UMAP representation of macrophages (27 patients; 10,134 cells) from FG and pediCD datasets, run across 50 principal components based on 539 genes significantly upregulated (Wilcoxon; p.adj<0.05) in macrophages versus all other cell types and not significantly differentially expressed between FG and pediCD sets. UMAP parameters [min-dist=0.1, N-neighbors=ceiling(sqrt(Ncells)/2)]. Labels are set at median of IQR for each subset. Clinical metadata showing future response to anti-TNF treatment: FGID (grey), NOA (blue), FR (yellow), PR (red).

b. Same UMAP as in a colored to isolate single subsets. Subsets chosen based on significant Mann-Whitney tests (Figure 5; Supplemental Figure 7), (black) cells from subset, (grey) rest of macrophages.

c. Same UMAP as in a separated into FGID and pediCD.

d. Same UMAP as in a split into each treatment response group. Shaded area captures 80% most densely populated regions of plot area calculated using 2d KDE estimate from MASS R package.

e. Permutation test results using Hellinger distance to measure if 2 conditions are sampled from the same distribution (0 = complete overlap, 1 = no overlap). Hellinger distance is computed with sqrt(1 - sum(sqrt(kde1*kde2))) with a KDE estimation for each condition group calculated across 1000 points uniformly distributed across plot area, with bandwidth selected using ks::Hpi() function. Black distribution shows test statistic varying min-dist parameter with 11 evenly spaced values between 0.01 and 1. Vertical line shows test statistic using UMAP parameters [min-dist=0.1, Neighbors=ceiling(sqrt(Ncells)), Npcs=50, nDim=2]. Grey distribution shows results of 11,000 permutations to treatment response group varied across same min-dist umap parameters between 0.01 and 1. All tests are significant beyond a 0.001 threshold.

f. Clockwise from left: UMAP of macrophages with color intensity displaying amount of TNF expression based on ((log(scaledUMI+1)). Plot (top) showing fraction of macrophages expressing TNF with colored dots showing fraction of TNF+ cells within each treatment response group and grey violins showing results of 10,000 permutations of treatment response labels. Violin plot (bottom) of ((log(scaledUMI+1)) TNF expression split on treatment response group.

g. Diversity of macrophage clusters in FGID and pediCD: (top) each dot represents a cell subset, y-axis shows how many patients are included within the subset. (bottom) each dot represents a subset, with y position showing (1-Gini-Simpson’s Diversity Index), Subsets below red dashed line set at 0.1 diversity were excluded.

Distinct Distributions of Lymphocytes Across the pediCD Treatment Response Spectrum Relative to FGID

a. UMAP representation of T/NK/ILCs (27 patients; 67,579 cells) from FG and CD datasets, run across 50 principal components based on 345 genes significantly upregulated (Wilcoxon; p.adj<0.05) in lymphocytes versus all other cell types and not significantly differentially expressed between FG and pediCD sets. UMAP parameters [min-dist=0.1, N-neighbors=ceiling(sqrt(Ncells)/2)] Labels are set at median of IQR for each subset. Clinical metadata showing future response to anti-TNF treatment: FGID (grey), NOA (blue), FR (yellow), PR (red).

b. Same UMAP as in a colored to isolate single subsets. Subsets chosen based on significant Mann-Whitney tests (Figure 5), (black) cells from subset, (grey) rest of lymphocytes.

c. Same UMAP as in a separated into FG and pediCD.

d. Same UMAP as in a split into each treatment response group. Shaded area captures 80% most densely populated regions of plot area calculated using 2d KDE estimate from MASS R package.

e. Permutation test results using Hellinger distance to measure if 2 conditions are sampled from the same distribution (0 = complete overlap, 1 = no overlap). Hellinger distance is computed with sqrt(1 - sum(sqrt(kde1*kde2))) with a KDE estimation for each condition group calculated across 1000 points uniformly distributed across plot area, with bandwidth selected using ks::Hpi() function. Black distribution shows test statistic varying min-dist parameter with 11 evenly spaced values between 0.01 and 1. Vertical line shows test statistic using UMAP parameters [min-dist=0.1, Neighbors=ceiling(sqrt(Ncells)), Npcs=50, nDim=2]. Grey distribution shows results of 11,000 permutations to treatment response group varied across same min-dist umap parameters between 0.01 and 1. All tests are significant beyond a 0.001 threshold.

f. Violin plot (left) of ((log(scaledUMI+1)) MKI67 expression split on treatment response group. UMAP (right) of lymphocytes with color intensity displaying MKI67 expression based on ((log(scaledUMI+1)) (right).

g. Diversity of lymphocyte clusters in FGID and CD: (top) each dot represents a cell subset, y-axis shows how many patients are included within the subset. (bottom) each dot represents a subset, with y position showing (1-Gini-Simpson’s Diversity Index), Subsets below red dashed line set at 0.1 diversity were excluded.

Comparison of Diversity Indices and Entropy of separately- or jointly-clustered T/NK/ILCs from pediCD and FGID patients

a. Distribution of Gini-Simpson’s index of patient diversity in pediCD, FGID, and jointly-clustered T/NK/ILC’s.

b. Sankey plot comparing pediCD ARBOL atlas clusters (left) to joint pediCD and FGID ARBOL (right). Each line follows each cell as it moves between in the two cluster sets (back bar split based on cluster identity). Alluvials of 10 cells or less were filtered out.

c. Log base 2 Shannon entropies of curated atlas T/NK/ILC cell states in the joint FGID-CD atlas, the harmony-integrated GIMATS + longitudinal PREDICT CD-FGID atlas, and simulated entropies with 75% label preservation, 50% preservation, and random shuffling.

Bulk RNA-seq data from PREDICT cohort is insufficient to recover a disease severity signature, but the 92-gene signature derived from scRNA-seq PREDICT cohort is enriched in severe patients.

a. PCA of Bulk RNA-seq from CD and FGID (gray) biopsies show dominance of sample-level variance including FGID (n = 26) and without FGID samples (n = 13).

b. Differential expression between conditions fails to find many significant (FDR < 0.05, red) DE genes, except in FGID (NM) vs. FR. NOA vs FR shows change in KRT17, suggesting bulk RNA seq confounded by variance in epithelial cell dissociation.

c. 92 genes are enriched in DE between conditions in bulk RNA-seq data by GSEA.