scRNA-seq analysis of C. gigas circulating hemocytes reveals 7 transcriptomic cell clusters

(A) Schematic of the scRNA-seq 10X Genomics Chromium microfluidic technology and bioinformatics processing workflow used. Dissociated hemocytes were collected from a pathogen-free oyster and encapsulated in droplets for processing. After sequencing, the data were processed bioinformatically. (B) Uniform Manifold Approximation and Projection (UMAP) plot for dimensional reduction of the data set and summary of cells and the number of Differentially Expressed Genes (DEGs) in each cluster. The table shows the characteristics (number of cells, percentage of total cells and number of Differentially Expressed Genes in each cluster) of the seven clusters identified. (C) Heatmap showing the top 10 overexpressed genes in each cell per cluster as determined by FindAllMarkers() function in Seurat, corresponding to clusters in UMAP plots from Fig. 1B, ranked by log2FC. (D) Dot plot representing the ten most enriched DEGs per cluster based on average expression (avg_log2FC). The color gradient of the dot represents the expression level, while the size represents the percentage of cells expressing each gene per cluster.

Top 10 overexpressed genes identified in each transcriptomic cluster.

The first column indicates the gene number according to the annotation. ‘log2FC’ represents the log2 fold change of the gene in the cluster compared to all other cells. ‘Pct1’ is the percentage of cells expressing the gene in the cluster and ‘Pct2’ is the fraction of cells expressing the gene in all other clusters. The description is the annotation of the expressed gen.

KEGG and Gene Ontology analysis of the gene signature in each cluster.

(A) A synthetic representation of the KEGG pathway analysis is shown. Colored columns represent the 7 transcriptomic clusters. Each row is a KEGG pathway, the colored dot represents the −log(p-value) and the dot size represents the number (count) of enriched genes in each pathway category. The fold enrichment is shown on the x-axis. Panels (B), (C) and (D) show the results of Gene Ontology terms (GO-terms) for Biological Processes (BP), Cellular Components (CC) and Molecular Functions (MF) respectively, obtained with the overexpressed genes of each cluster (Absolute value Log2FC > 0.25 and significant p-value < 0.001) using RBGOA analysis (p-value <= 0.001) for three different ontology universes. Each panel corresponds to one ontology universe, and the analysis highlights over- and under-enriched terms. The number in the heatmap and the scale indicate the proportion of significant positive GO-terms. Positive values indicate over-enrichment and negative values indicate under-enrichment of the respective BP, CC and MF ontologies.

C. gigas naive hemocyte formula and Percoll gradient hemolymph fractionation

(A) Morphology, percentages and characteristics of the 7 cell types identified by MCDH staining. H : Hyalinocyte, ML : Macrophage Like, BBL : Basophilic Blast Like cell, ABL : Acidophilic Blast Like cell, SGC : Small Granule Cell, BGC : Big Granule Cell, VC : Vesicular Cell. Scale bar : 5 µm. Hyalinocytes (54 %) and blast cells (ABL & BBL) (35 %) were predominant in the first fraction. The second fraction showed an increase in blasts (ABL 26 %, BBL 36 %) and a decrease in hyalinocytes (21 %). Fraction 3 had a mixed content with a decrease in hyalinocytes and blasts (15 %), and a majority of blasts and macrophage-like cells. Fraction 4 had an increase in granular cells (SGC 22 %, BGC 10 %, VC 15 %) and a decrease in blasts (ABL 8 %, BBL 9 %). Fraction 5 showed an increase in small granule cells (36 %) and vesicular cells (28 %), and a decrease in big granule cells (5%). Fraction 6 had fewer hyalinocytes, macrophage-like cells, and blast cells compared to small granule cells and vesicular cells. The last fraction consisted mainly of small granule cells (81 %). (B) Sorting of hemocytes on a discontinuous Percoll gradient. 7 fractions were identified along the gradient at the top of each density cushion (from d=1.0647 at #1 to d= 1.1049 at #7). (C) Representation of the average values (from 5 different fractionation experiments) of the different hemocyte types in the seven percoll gradient fractions compared to the average hemolymph composition of a naive oyster (Total). VC : Vesicular Cells, BGC : Big Granule Cells, SGC : Small Granule Cells, ABL : Acidophilic Blast Like cells, BBL : Basophilic Blast Like cells, ML : Macrophage Like cells and H : Hyalinocytes respectively. (Supp. Fig. S4 for statistics).

Table of the 14 marker genes specific to the different transcriptomic clusters.

Gene number, average Log2FC, pct1/pct2 ratio (percentage of cells expressing this transcript in the cluster divided by the percentage of all other cells expressing this transcript) and cluster number are reported. The description is taken from our annotation and the marker name is derived from the description.

Characterization of molecular markers specific to the different hemocyte morphotypes.

(A) Relative expression level of the 14 markers in the various fractions after gradient density sorting. The graphs show the relative expression of genes compared to their expression in total hemocytes in the various fractions (red dotted line). Standard deviations were calculated based on four independent experiments. (B) Average percentage of each hemocyte type in the 7 Percoll gradient fractions used to quantify marker gene expression by qPCR. (C) Correlation matrix between the relative gene expression of each marker gene in each fraction and the percentage of each hemocyte type in the same fractions. Values and color scale represent the Pearson correlation coefficient (r) ranging from −1 (inverse correlation) to +1 (full correlation). H : Hyalinocyte, ML : Macrophage Like, BBL : Basophilic Blast Like cell, ABL : Acidophilic Blast Like cell, SGC : Small Granule Cell, BGC : Big Granule Cell, VC : Vesicular Cell. LACC24 : Laccase 24, CLEC : C-type lectin domain-containing protein, EGFL : EGF-like domain-containing protein 8, LEVAR : Putative regulator of levamisole receptor-1, TGC : TGc domain-containing protein, XBOX : X-box binding protein-like protein, MLDP : ML domain containing protein, HMGB1 : High mobility group protein B1, CUBN : Cubilin, GAL : Galectin, CAV : Caveolin, NAT1 : Natterin-1, MOX : DBH-like monooxygenase protein 1, GPROT : G protein receptor F1-2 domain-containing protein.

Phagocytosis, Reactive Oxygen Species production capacity and copper storage of hemocytes.

(A) Images of small granule cells (SGC) and macrophage-like (ML) cells with phagocytosed zymosan particles (panels a & c - red stars) and Vibrio tasmaniensis LMG20012T bacteria (panels b & d - red stars) from whole hemolymph sample. Scale bar : 5 µm. (B) Quantification of the phagocytic activity of zymosan particles by each cell type. The graph shows the result of 3 independent experiments. (C) Luminescence recording to detect the production of Reactive Oxygen Species (ROS). In orange, a biphasic curve was obtained on naive oyster hemolymph after zymosan addition at t = 0 min. In blue, the control condition corresponds to hemocytes without zymosan addition. (D) Graph showing the intensity of ROS production in each Percoll fraction. Normalized burst intensity was calculated from the luminescence peak obtained from each fraction. In blue, no drug was added to the experiment, in orange, ROS production was impaired by the addition of apocynin. (E) NBT (NitroBlueTetrazolium) staining of hemocytes exposed to zymosan particles. Hemocytes morphology after MCDH staining: Macrophage Like (a), Basophilic (b) and Acidophilic (c) Blast cells. NBT staining of the different types of hemocytes (d-f). Red stars show zymosan and bacteria particles. Black arrows indicate Macrophage-Like cells. Scale bar : 10 µm (F) Quantification of NBT-positive cells present in the total hemolymph of oysters exposed to zymosan. (H) UMAP plots showing cells expressing NADPH oxidase found in the scRNA-seq dataset and their expression level. (G) Labeling of intracellular copper stores in C.gigas hemocytes. MCDH (upper panels) and rhodanine (lower panels) staining of oyster hemocytes to reveal copper accumulation. Scale bar : 10µm. For panels (B), (D) and (F) the alphabetic characters displayed above the data points in each plot represent statistically significant differences among the groups, as determined by Tukey’s test following ANOVA. Groups denoted by different letters differ significantly at the p < 0.05 level of statistical significance. H : Hyalinocytes, ABL : Acidophilic Blast-Like cells, BBL : Basophilic Blast-Like cells, ML : Macrophage-Like cells, SGC : Small Granule Cells, VC : Vesicular Cells and BGC : Big Granule Cells.

Hemocyte expression profiles of some antimicrobial peptides.

(A) (B) and (C) Relative Gene Expression in the 7 Percoll hemocyte fractions of Big-Defensin1 & 2 (Cg-BigDefs), BPI (Cg-BPI) and hemocyte defensin (Cg-Defh), respectively, in comparison to the gene expression level in unfractionated hemolymph. (D) Correlation matrix between the relative gene expression of BigDefensin1 & 2, BPI and hemocyte defensin gene in each fraction and the percentage of each hemocyte type in each fraction (H : Hyalinocytes, ABL : Acidophilic Blast Like, BBL : Basophilic Blast Like, SGC : Small Granule Cell, ML : Macrophage Like, BGC : Big Granule Cell, VC : Vesicular Cell. Values and color scale represent the Pearson correlation coefficient (r) ranging from −1 (inverse correlation) to +1 (full correlation).

Pseudotime ordering of cells revealed 6 potential differentiation pathways of hemocytes.

(A) UMAP plot of scRNA-seq analysis showing the 7 transcriptomic clusters used for pseudotime analysis. 4 clusters were identified cytologically (SGC for small granule cells - cluster 3, H for hyalinocytes - cluster 2, ML for Macrophage Like - cluster 1 and VC for vesicular cells - cluster 7), cl.4, cl.5, and cl.6 represent clusters 4, 5, and 6, respectively. (B) Graphical representation (UMAP projection) of the Monocle 3 pseudo-time order of the clustered cells. Cluster 4 (cl.4) was used as the origin for the pseudotime analysis. (C) (D) (E) (F) (G) and (H) show the gene expression level of selected marker genes obtained from the monocle3 trajectory analysis at the beginning and end of the modelized differentiation pathways (in red on the UMAP plot) from cluster 4 to hyalinocytes, to cluster 5 cells, to cluster 6 cells, to Vesicular Cells (VC), to Macrophage-Like cells (ML) and to Small Granule Cells (SGC) respectively. The color scale represents the normalized expression level of each gene. (I) Dot plot showing the average expression and the percentage of cells expressing identified transcripts encoding for transcription factors in the scRNA-seq dataset. Average expression is expressed in Log2FC.

Proposed hemocyte ontology in Crassostrea gigas based on the transcriptomic, cytological and functional results obtained.

Cells are colored according to the same color code as the transcriptomic clusters. Cluster numbers and cell types are indicated. To the left of the cells are the overexpressed transcription factors and to the right are the identified marker genes in each cluster. Functional characteristics of hyalinocytes, macrophage-like cells and small granule cells are marked in red. (AMP : AntiMicrobial Peptide, Burst : ROS production, Phago : phagocytosis)

STARsolo summary metrics report.

Metrics dashboard obtained after the STARsolo step, describing the quality of the sequencing and the various characteristics of the cells detected after aligning the reads to the C. gigas genome from the Roslin Institute.

Table presenting the result of KEGG analysis performed using DAVID Bioinformatics Resources.

GO term enrichment analysis was conducted on specifically overexpressed genes in each cluster obtained after scRNA-seq processing (genes with Log2FC > 0.25 and significant p-value < 0.001) to highlight the most relevant GO terms associated with a given gene list. The visualization of the different pathways can be obtained from the KEGG website using the KEGG prefix and KEGG number (https://www.genome.jp/kegg/pathway.html)

Sequences of primers used in this study.

Name of the transcript targeted by the primer pair, name of the primer used in this study and nucleotide sequence of each primer.

Hemocyte composition of the 7 Percoll fractions used for qPCR analysis.

For each cell type in each fraction, the table presents the average percentage, standard deviation, standard error, minimum, maximum, and median count values.

Transcription factors identified in the scRNA-seq dataset of Crassostrea gigas hemocytes.

Transcription factors identified in the scRNA-seq dataset and their homology with human proteins, as indicated by the blast alignment results. Each entry includes the gene identifier, the protein it represents in C.gigas, and its human counterpart. The bit score and coverage indicate the strength and extent of the alignment, respectively.

ScRNA-seq quality control metrics.

Distribution of A) unique molecular identifiers (UMIs), B) genes and C) percentage of mitochondrial genes detected per cell. Each dot represents one cell. D) Plot of the percentage of mitochondrial genes versus the number of genes detected in each cell. The red box represents the cells selected for further analysis (number of genes detected between 750 and 4000 and with a percentage of mitochondrial genes less than 5%) E) The FindVariableFeatures() function was used to identify features with high cell-to-cell variation in the dataset to highlight the biological signal in the single cell dataset. F) Table summarizing some quality control metrics. The table shows the thresholds to remove poor quality cells (doublets or empty droplets). The number of cells, UMIs and genes before and after filtering are shown. G) Uniform Manifold Approximation and Projection (UMAP) plot of the cells with the number of expressed genes. H) UMAP plot of the number of reads per cell. I) UMAP plot of the percentage of mitochondrial genes in each cell. J) UMAP graph of the percentage of ribosomal protein transcripts in each cell.

Results of the C. gigas genome re-annotation.

A) The number and percentage of Coding DNA Sequences (CDS) with valid annotation after each annotation step are shown. A BLAST query was performed against the TrEMBL/Uniprot database and InterproScan annotation against Pfam, PrositeSiteProfiles, CDD, TIGRFAM, PRINTS, SMART, SUPERFAMILY and Hamap databases. Blast and InterProScan results were compiled and processed using Blast2Go. A first mapping step was used to enrich the Blast result with GO-terms, and the annotation step was used to optimize and validate the GO-terms annotations. B) C) and D) show the distribution of the various categories of GO-terms across the three primary domains of Gene Ontology : Molecular Function, Cellular Component and Biological Process, respectively.

Whisker plot showing the distribution of counts of different hemocyte populations.

Hemolymph was collected from eight different oysters and hemocytes were plated on a slide using cytospin centrifugation. The proportion of each hemocyte type in these hemolymphs was calculated after MCDH staining. H : Hyalinocytes, ML : Macrophage-Like cells, BBL : Basophilic Blast-Like cells, ABL : Acidophilic Blast-Like cells, SGC : Small Granule Cells, BGC : Big Granule Cells, and VC : Vesicular Cells. The calculation of the proportion of each cell type in these hemolymphs reveals a heterogeneous composition of hemocytes between individuals as well as between cell types.

Statistical significance of enrichment in different hemocyte types in Percoll gradient fractions.

(A) a,b,c,d,e,f, and g : MCDH staining of cells from the 7 fractions isolated from the percoll gradient in (B). Scale bar : 10 µm. Fraction 1 to 7 : Quantification of the different types of hemocytes found in each of the 7 fractions from 5 independent fractionation experiments. (B) Statistical significance of enrichment of different hemocyte types in Percoll gradient fractions. Results are from six independent experiments. Statistical significance is indicated by letters, as different letters indicate a significant difference between enrichments of cell types within the Percoll density gradient fractions (ANOVA, Tukey’s test, p-value <0.05). Hyalinocytes (H) were significantly enriched in the first fraction compared to the other fractions and compared to unsorted hemocytes. However, they were significantly depleted in fractions 4, 5, 6 and 7 compared to unsorted hemocytes. Macrophage-like cells (ML) were significantly enriched in fractions 3 and 4 compared to fractions 1, 6 and 7. They were depleted in fraction 1 compared to unsorted hemocytes. Acidophilic blasts (ABL) were significantly depleted in fractions 4, 5, 6, and 7 compared to unsorted hemocytes. Basophilic blasts (BBL) were significantly enriched in fractions 2 and 3 compared to fractions 4, 5, 6, and 7 and in fraction 1 compared to fractions 6 and 7. Compared to unsorted hemocytes, basophilic blasts (BBL) were significantly enriched in fraction 2 and depleted in fractions 6 to 7. Small granule cells (SGC) were significantly depleted in fractions 1, 2, and 3 compared to fractions 4, 5, 6, and 7, and also significantly depleted in fractions 4 and 5 compared to fractions 6 and 7. In addition, small granule cells (SGC) were significantly enriched in fractions 5, 6, and 7 compared to unsorted hemocytes. The distribution of the big granule cells (BGC) showed a significant enrichment in fraction 4, compared to fractions 1 and 2, but no significant changes were observed with unsorted hemocytes. Vesicular cells (VC) were enriched in fraction 5 compared to fractions 1, 2, 3, 6, 7 and unsorted hemocytes in fractions 1, 2, and 3 compared to fraction 5 and enriched in fraction 5 compared to fraction 7 and unsorted hemocytes.

Cluster specificity and expression level of the 14 selected cluster markers

A) Violin graph showing the average expression level (Log2FC) of the 14 selected marker transcripts specific to the different scRNA-seq clusters. B) Identification of cells expressing the selected markers on the UMAP plot. Positive cells are colored purple according to the Log2FC value. LACC24 and CLEC are specific of cluster 1, EGFL and LEVAR of cluster 2, TGC and XBOX of cluster 3, MLDP and HMGB1 of cluster 4, CUBN and GAL of cluster 5, CAV and NAT1 of cluster 6 and MOX and GPROT of cluster 7. LACC24 : Laccase 24, CLEC : C-type lectin domain-containing protein, EGFL : EGF-like domain-containing protein 8, LEVAR : Putative regulator of levamisole receptor-1, TGC : TGc domain-containing protein, XBOX : X-box binding protein-like protein, MLDP : ML domain-containing protein, HMGB1 : High mobility group protein B1, CUBN : Cubilin, GAL : Galectin, CAV : Caveolin, NAT1 : Natterin-1, MOX : DBH-like monooxygenase protein 1, GPROT : G protein receptor F1-2 domain-containing protein.

Measurement of the ability of C.gigas hemocytes to phagocytose Vibrio tasmaniensis LMG20012T.

Oyster hemocytes were challenged with non-pathogenic Vibrio tasmaniensis LMG20012T and phagocytosis was measured by observing intracellular bacteria after MCDH staining. (A) MCDH staining of hemocytes after phagocytosis assay. Scale bar : 10µm. (B) Bar plot showing the proportion of each cell type and the proportion of phagocytic cells. H : Hyalinocytes, ABL : Acidophilic Blast-Like cells, BBL : Basophilic Blast-Like cells, ML : Macrophage-Like cells, SGC : Small Granule Cells, VC : Vesicular Cells and BGC : Big Granule Cells.

NBT (NitroBlueTetrazolium) staining of oyster hemolymph exposed to zymosan particles.

(A) Hemocyte morphology after MCDH staining : Macrophage Like (a), Small Granule Cells (b), Hyalinocyte (c), Basophilic (d) and Acidophilic (e) Blast cells, Big Granule Cells (f) and Vesicular Cells (g). NBT staining of the different hemocyte types (h-n). Red stars show zymosan and bacteria particles. Black arrows identify Macrophage-Like cells. Scale bar : 10 µm. (B) Quantification of the phagocytic activity of each cell type for zymosan particles from 3 independent experiments. Results of quantification of the phagocytic activity of each cell type and number of zymosan particles per cell type. The graph shows the result of 3 independent experiments.

Uniform Manifold Approximation and Projection (UMAP) plots of cells expressing Macrophage-Like markers.

Cluster numbers are indicated on each cluster. Each point in the UMAP plot represents a single hemocyte, and the clustering of these points reveals the distinct transcriptional profiles of macrophage-like specific markers within the hemocyte population.

Labeling of intracellular copper stores in C.gigas hemocytes.

MCDH (upper panels) and rhodanine (lower panels) staining of oyster hemocytes to reveal copper accumulation. Cells were first processed for copper staining and then stained according to MCDH protocol. H : Hyalinocytes, ABL : Acidophilic Blast-Like cells, BBL : Basophilic Blast-Like cells, ML : Macrophage-Like cells, SGC : Small Granule Cells, VC : Vesicular Cells and BGC : Big Granule Cells. Bar : 10µm.

Results of scRNA-seq trajectory analysis using Monocle3.

Analysis was performed using cluster 4 as the zero pseudotime. (A) Lineage from immature cells to hyalinocytes, (B) to cluster 5 cells, (C) to cluster 6 cells, (D) to vesicular cells, (E) to macrophage-like cells and (F) to small granule cells. For each lineage, cell trajectories are shown in red and heat maps of pseudo time-dependent genes are shown. Blue indicates low expression, and red indicates high expression. Pseudotime flows from right to left.

Uniform Manifold Approximation and Projection (UMAP) plots of cells expressing transcription factors.

28 UMAP representation for the transcription factors identified in the scRNA-seq dataset. Each UMAP plot shows cells expressing the transcription factor in purple. Log2FC expression level is also reported.

Observation of autofluorescence of vesicular cells in hemolymph.

Freshly punctured total hemolymph was cytospun, directly observed under a microscope using a DAPI filter set (DAPI Blue ex : 350/50 nm, DC : 400 nm and em : 460/50 nm) and then processed for MCDH staining. Arrows indicate autofluorescent cells. Scale bar : 10 µm