Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Editors
- Reviewing Editor: Jungmin Choi, Korea University, Seoul, Republic of Korea
- Senior Editor: Murim Choi, Seoul National University, Seoul, Republic of Korea
Reviewer #1 (Public Review):
Summary:
Hartman and Satija's manuscript constitutes a significant contribution to the field of imaging-based spatial transcriptomics (ST) through its comprehensive comparative analysis of six multiplexed in situ gene expression profiling technologies. Their findings provide invaluable insights into the practical considerations and performance of these methods, offering robust evidence for researchers seeking optimal ST technologies. However, given that several similar preprints appeared simultaneously, readers should exercise caution when comparing findings across these studies. The authors should therefore revise their manuscript to ensure consistency across all ST technologies compared and, where possible, take the findings of the other preprints into account.
Strengths:
(1) The manuscript offers a comprehensive and systematic comparison of six in situ gene expression profiling technologies, including both commercially available and academically developed methods, which is the most extensive study in this field.
(2) Novel metrics have been proposed by the authors to mitigate molecular artifacts and off-target signals, enhancing the accuracy of sensitivity and specificity comparisons across datasets. By emphasizing the importance of evaluating both sensitivity and specificity, the study addresses the challenge of comparing standard metrics, such as the number of unique molecules detected per cell, given variations in panel composition and off-target molecular artifacts. This emphasis is directly connected to their work on cell segmentation methods to improve specificity.
(3) Building on these analyses, the authors illustrate how molecular false positives can distort spatially aware differential expression analysis, underscoring the need for caution when interpreting downstream results.
(4) Offering guidance for the selection, processing, and interpretation of in situ spatial technologies, the study equips researchers in the field with valuable insights.
Weaknesses:
(1) Although focusing on mouse brain datasets facilitates a consistent comparison across technologies, it confines the study to a single biological context. Discussing the limitations of this approach and advocating for future studies in diverse tissue types would enrich the manuscript, especially for clinical FFPE applications.
(2) Providing more explicit details on the criteria used to select datasets for each technology would ensure a fair and unbiased comparison. Otherwise, the study risks reading like a hall of fame of champion datasets assembled to advertise a particular commercial product.
(3) The discussion would be enriched by addressing the origins of non-specific signals and molecular artifacts, along with the challenges of cell segmentation across different tissue types and cell morphologies. Note that all of these datasets were obtained from thin mouse brain slices, which, although only ~10-20 µm thick, are still three-dimensional. Cells may therefore partially overlap along the z-axis, potentially leading to transcript mixing. In addition, many cells are likely transected by the sectioning, so their measured transcriptomes are inherently partial, which makes direct comparison to scRNA-seq unfair. These aspects should be addressed to ensure a fair comparison.
(4) Expanding on the potential implications of the findings for developing new computational methods to address non-specific biases in downstream analyses would augment the manuscript's impact and relevance.
Reviewer #2 (Public Review):
Summary:
In the manuscript, Hartman et al. present a detailed comparison of six distinct multiplexed in situ gene expression profiling technologies, including both academic and commercial systems.
The central concept of the study is to evaluate publicly accessible mouse brain datasets provided by the platforms' developers, where optimal performance in showcasing each technology is expected. The authors stress the difficulty of making comparisons with standard metrics, e.g., the count of total molecules per cell, given the differences in gene panel sizes across platforms. To enable a fairer comparison, the authors devised a specificity metric called "MECR", the average co-expression rate of mutually exclusive gene pairs in the sample. They found that this rate depends largely on the choice of cell segmentation method, and therefore reanalyzed five of these datasets (excluding STARmap PLUS, due to the lack of molecule location information) with an independent cell segmentation algorithm (i.e., Baysor). Based on this reanalysis, the authors explicitly suggest the best-performing platform at the end of the manuscript.
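To make the metric concrete, here is a minimal sketch of how such a rate could be computed from a cell-by-gene count matrix; the function name, the one-molecule detection threshold, and the conditioning on cells detecting at least one gene of the pair are illustrative assumptions, not necessarily the exact definition used in the manuscript.

```python
import numpy as np

def mecr(counts, gene_names, exclusive_pairs):
    """Average co-expression rate over mutually exclusive gene pairs.

    counts          -- (n_cells, n_genes) molecule-count matrix
    gene_names      -- column labels of `counts`
    exclusive_pairs -- (gene_a, gene_b) tuples that a true single cell
                       should never co-express (markers of distinct types)
    """
    col = {g: i for i, g in enumerate(gene_names)}
    rates = []
    for a, b in exclusive_pairs:
        has_a = counts[:, col[a]] > 0   # cells detecting gene a (assumed threshold: >=1 molecule)
        has_b = counts[:, col[b]] > 0   # cells detecting gene b
        either = np.count_nonzero(has_a | has_b)
        both = np.count_nonzero(has_a & has_b)
        if either:                      # skip pairs detected in no cell
            rates.append(both / either)
    return float(np.mean(rates))

# Toy usage with a hypothetical mutually exclusive marker pair.
counts = np.array([[5, 0], [0, 3], [2, 1], [0, 0]])
print(mecr(counts, ["Plp1", "Slc17a7"], [("Plp1", "Slc17a7")]))  # 0.333...
```

Under this definition, a lower MECR indicates higher specificity: mutually exclusive markers end up in the same segmented cell less often.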
Strengths:
I consider the paper a valuable contribution to the community, for the following two reasons:
(1) As the authors mention, I fully agree that the spatial transcriptomics community needs better metrics for cross-technology comparison than traditional ones such as molecule counts per cell. In that regard, I believe the introduction of the new MECR metric is quite valuable.
(2) This work highlights how results differ depending on the cell segmentation method used for each platform, suggesting that multiple segmentation algorithms should be tried to arrive at reliable results. I believe this is an urgent warning that should be disseminated throughout the community as soon as possible.
Weaknesses:
I disagree with the conclusion of the manuscript where the authors compare the technologies and suggest the best-performing ones, because of the following major points:
(1) As the authors mention, MECR is a measure of "specificity", not "sensitivity". Sensitivity was instead compared using mean counts per cell (Figure 3e). However, I strongly disagree with using mean counts per cell as a measure of sensitivity when the comparison involves different gene panels: counts per cell can depend heavily on the choice of genes, especially due to optical crowding.
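One way to partially mitigate the panel-composition confound raised here would be to restrict the per-cell count comparison to the intersection of the platforms' gene panels; a minimal sketch under that assumption follows (the function name and interface are illustrative, not from the manuscript).

```python
import numpy as np

def mean_counts_shared_panel(counts, gene_names, shared_genes):
    """Mean total molecule counts per cell, restricted to genes present
    on every platform's panel, so counts are compared on common ground
    rather than across panels of different size and composition."""
    shared = set(shared_genes)
    keep = [i for i, g in enumerate(gene_names) if g in shared]
    return float(counts[:, keep].sum(axis=1).mean())
```

Note that this only equalizes panel content; differences in probe design and in optical crowding for the shared genes would still confound a sensitivity comparison.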
(2) The authors compared sensitivity based on Baysor cell segmentation, but Baysor itself uses spatial gene expression for segmentation, which depends on the sensitivity of the platform. Comparing sensitivity with an algorithm whose output already depends on that sensitivity is therefore circular.