Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Fred Rieke, University of Washington, Seattle, United States of America
- Senior Editor: Joshua Gold, University of Pennsylvania, Philadelphia, United States of America
Reviewer #1 (Public Review):
Summary:
This paper introduces an efficient approach to infer properties of receptive-field subunits from the ensemble of spike-triggered stimuli. This is an important general problem in sensory coding. The results introduced in the paper make a solid contribution to both how subunits can be identified and how subunits of different types are coordinated in space.
Strengths:
A primary strength of the paper is the development of approaches that substantially speed up non-negative matrix factorization and, by doing so, create an opportunity for a more systematic exploration of how the procedure depends on various control parameters. The improved procedure is well documented and the direct comparisons with previous procedures are helpful. The improved efficiency enabled several improvements in the procedure - notably tests of good procedures for initializing NNMF and tests of the dependence of the results on the sparsity regularization parameter.
A second strength of the paper is the exploration of the spatial relationship between different subunits. This, to my knowledge, is new and is an interesting direction. There are some concerns about this analysis (see weaknesses below), but if this analysis can be strengthened it will provide new information that will be important both functionally and developmentally.
Weaknesses:
A primary concern is that choices made about parameters for several aspects of the analysis appear to be made subjectively. Much of this centers around how much of the structure in the extracted subunits is imposed by the procedure itself, and how much reflects the underlying neural circuitry. Some specific issues related to this concern are:
- Sparsity: the use of the autocorrelation function to differentiate real vs spurious subunits should be documented and validated. For example, can the authors split data in half and show that the real subunits are stable?
- Choice of regularization: the impact of the regularization parameter on subunit properties is nicely documented. However, the choice of an appropriate regularization parameter seems somewhat arbitrary. Lines 253-256 are an example of this problem: the sentence sounds circular, as if the sparsity factor was turned up until the authors obtained what they expected to obtain. Could the choice of this parameter significantly impact the properties of the extracted subunits? How sensitive are the subunit properties to that parameter? Some additional control analyses are needed to validate the parameter choice (see the crossvalidation comment below).
- Crossvalidation was not used to identify the regularization constraint value because the weight matrix from NNMF does not generalize beyond the data it was fit to. Could the authors instead hold the components matrix fixed and recompute the weight matrix, and use that approach for cross-validation (especially since it is really the components matrix that needs validating)?
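The cross-validation suggestion in the last point can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes the weights are unconstrained (as in a semi-NMF), so that with the components matrix held fixed the held-out weights reduce to an ordinary least-squares fit. The function name `heldout_error` and the synthetic data are hypothetical.

```python
import numpy as np

def heldout_error(components, held_out):
    """Relative reconstruction error of held-out rows with components held fixed.

    components: (k, n_pixels) matrix fit on the training data only.
    held_out:   (n_spikes, n_pixels) spike-triggered stimuli not used for fitting.
    Assumes unconstrained weights, so they follow from least squares.
    """
    # Solve components.T @ W.T ~= held_out.T for the held-out weights W.
    weights_t, *_ = np.linalg.lstsq(components.T, held_out.T, rcond=None)
    residual = held_out - weights_t.T @ components
    return np.linalg.norm(residual) / np.linalg.norm(held_out)

# Synthetic demonstration (not real data): 3 non-negative components, 20 pixels.
rng = np.random.default_rng(0)
true_components = rng.random((3, 20))
weights = rng.normal(size=(200, 3))
data = weights @ true_components + 0.01 * rng.normal(size=(200, 20))
train, test = data[:100], data[100:]

# In practice `fitted` would come from running the factorization on `train` only;
# here the generating components stand in for a good fit.
fitted = true_components
unrelated = rng.random((3, 20))
err_fitted = heldout_error(fitted, test)
err_unrelated = heldout_error(unrelated, test)
```

Scanning the regularization strength and comparing the held-out error across values would be one way to use such a score. Note that a reconstruction error of this kind may not penalize overly dense components, so it would likely need to be paired with a stability or sparsity criterion.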
The paper would benefit from a more complete comparison with known anatomy. For example, can the authors estimate the number of cones within each subunit? This is well-constrained both anatomically (at least in macaque) and, especially for midget ganglion cell subunits, functionally. In macaque, most midget bipolar cells get input from single cones, so the number of extracted subunits should be close to the number of cones. This would be a useful point of comparison for the current work.
Is the analysis of the spatial relationship between different subunit mosaics robust to the incompleteness of those mosaics? The argument on lines 496-503 should be backed up by more analysis. For example, if subunits are removed from regions where the mosaic is pretty complete, do the authors change the spatial dependence? Alternatively, could they use synthetic mosaics with properties like those measured to check the sensitivity to missing cells?
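The synthetic-mosaic check proposed above could look something like the sketch below. Everything here is an assumption for illustration: the jittered square lattice, the random-deletion model of incompleteness, and the cross-mosaic nearest-neighbor distance as the summary statistic are stand-ins and are not taken from the manuscript.

```python
import numpy as np

def jittered_grid(n_side, spacing, jitter, rng, offset=(0.0, 0.0)):
    """Square lattice with Gaussian positional jitter (a stand-in for a mosaic)."""
    xs, ys = np.meshgrid(np.arange(n_side), np.arange(n_side))
    points = np.column_stack([xs.ravel(), ys.ravel()]) * spacing + np.asarray(offset)
    return points + rng.normal(scale=jitter, size=points.shape)

def thin(points, keep_fraction, rng):
    """Randomly keep a fraction of the points to mimic an incomplete mosaic."""
    keep = rng.random(len(points)) < keep_fraction
    return points[keep]

def cross_nearest_neighbor(a, b):
    """Distance from each point in a to its nearest point in b."""
    diffs = a[:, None, :] - b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)

rng = np.random.default_rng(1)
mosaic_a = jittered_grid(10, spacing=1.0, jitter=0.1, rng=rng)
# Second mosaic shifted by half a lattice spacing, i.e. anti-aligned with the first.
mosaic_b = jittered_grid(10, spacing=1.0, jitter=0.1, rng=rng, offset=(0.5, 0.5))
mosaic_b_thinned = thin(mosaic_b, keep_fraction=0.7, rng=rng)

d_full = cross_nearest_neighbor(mosaic_a, mosaic_b)
d_thinned = cross_nearest_neighbor(mosaic_a, mosaic_b_thinned)
```

Comparing the distributions of `d_full` and `d_thinned` over a range of `keep_fraction` values would indicate how quickly a coordination statistic degrades as cells go missing.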
NNMF relies on accounting for each spike-triggered stimulus with a linear combination of components. Would nonlinearities - e.g. those in the bipolar cell outputs - substantially change the results?
Does the approach work for cells that receive input from multiple bipolar types? Some ganglion cells, e.g. in mice, receive input from multiple bipolar types, each accounting for a sizable percentage of the total input. There is similar anatomical work indicating that parasol cells may receive input from multiple diffuse bipolar types. It is not clear whether the current approach works in cases where the subunits of a single ganglion cell overlap. Some discussion of this would be useful.
Reviewer #2 (Public Review):
Summary:
Identifying spatial subunits within the receptive field of retinal ganglion cells can help study spatial nonlinearities and upstream computations performed by the bipolar cells. The authors significantly accelerate the implementation of the previously proposed Spike Triggered semi-non-negative Matrix Factorization (STNMF) method to identify the subunits. The authors also propose a few method improvements - better initialization; new stability-based criteria for selecting the regularization strength, and hyperparameter selection across cell types.
The authors then apply this new method to RGC populations in both the salamander retina and the primate (marmoset) retina. The authors document the subunit sizes, numbers, and overlap across cell types. The main neuroscience finding is that the receptive fields of ON and OFF parasol cells are anti-coordinated, whereas their corresponding subunits are not.
Overall, the authors claim that a faster and more accurate method makes scale-up to large neuronal populations feasible.
Strengths:
- The paper is well-written, easy to read and the figures are clear. The limitations are also made clear.
- The scientific findings are novel and seem to be well supported.
- The claimed speed-up of the method is potentially important for practical applications to large populations. Each innovation of the method is well-supported.
- This is a serious effort to improve the method and document the subunits in primate retina.
Weaknesses:
- The description of the method is confusing. Currently, the new method is described in the context of changes from existing methods. As someone who is not familiar with previous methods, it is very confusing to follow the details.
- I think it will help a lot with clarity to have a concise flowchart/pseudocode to summarize the algorithm and separate it from a description of the main changes from previous methods.
- Separate pseudocodes can be provided for the main method, initialization, regularization parameter selection using consensus, and identifying the regularization parameter across cell types.
- While the new method clearly shows a drastic improvement compared to the previous method on a laptop, would it be possible to get the same improvement on the previous method if it was implemented with GPU (as is standard for most AI/ML algorithms)?
- For the calculation of subunits across multiple cells, can you run multiple parallel jobs on the same computer? This may make some innovations unnecessary (like setting the same regularization strength across multiple cells).
- There are two main innovations in this paper: the fast and approximate method, and analysis of subunit mosaics for primate RGCs. It would be helpful to include an analysis of the primate RGC subunits using the older, slower, but more exact method and show that the major scientific results can be reproduced. This would validate the new method in an end-to-end manner. While this may take a while to run, it may be helpful in the supplement.
- It would be important to understand the data-efficiency of the method. The approximate method may deviate more from the exact method when the amount of data is limited.
- Would it be possible to have a few steps of the exact method at the end to ensure that the solution truly optimizes the objective function?
- Does the number of estimated subunits change with the number of observed spikes? If so, the estimates of subunit number/size must be interpreted with caution.
Reviewer #3 (Public Review):
Summary:
This work addresses the problem of determining the subunit composition of receptive fields of retinal ganglion cells (RGCs). RGCs process stimuli through non-linear transforms that largely (although not entirely) reflect the individual contributions of their input bipolar cells, which themselves process visual stimuli nonlinearly. Thus, using the correct system identification methods might correctly model the RGCs while revealing details of the underlying circuit, including the function of the presynaptic components. It is now well established that a model of the form of an LNLN cascade can potentially capture this bipolar-RGC circuit, although the devil is in the details. The authors present an improved method of non-negative matrix factorization (NMF) - which is one approach to this system identification problem - that can speed things up by a factor of 100, and in doing so infer plausible mosaics of the bipolar cell types supporting the identified RGC types that are recorded from.
As written, the focus of this paper seems almost entirely methodological, supporting the sped-up version of NMF, called STNMF. The >100x speedup potentially makes a lot more measurements available, since it enables much more comprehensive scans across model meta-parameters, although it has its own complications that must also be methodologically addressed. The results presented are largely a demonstration and validation of the potential power of this approach using example recordings in the peripheral marmoset retina. I do not think the results themselves are meant to be evaluated as definitive, since they are often based on examples and are largely confirmatory of what is already known.
Strengths:
I have very few concerns about the paper methodologically: these methods are well laid out and demonstrated (at least up to the level of my expertise and interest), including validation with established literature.
I am also enthusiastic about some of the potential results in the retina outlined (but not fully fleshed out) in the later sections of the paper.
Weaknesses:
My main critique is to question the conceptual advance in this paper: what did we learn, and what is the targeted audience of interest? Establishing this is particularly pressing for this manuscript since NMF has already been established and expounded on as a useful approach in this context (including by the author most recently in 2017), so any of the scientific results are already achievable with enough computer power using existing approaches. As currently cast, the conceptual advances here are purely methodological and relate to the utility of speeding up the approach. Also, they do not appear to generalize to other problems outside of the narrow range to which the method is currently applied.
Thus, two paths to improving the manuscript would be either:
(1) target readers interested in the retina by fully fleshing out the current results and add more to make this into a paper about the retina rather than about the STNMF method, or
(2) demonstrate that the methods might be useful outside of the very narrow set of conditions specific to identifying nonlinear bipolar cell subunits in peripheral retina under white noise stimulation.
In its current state, the Discussion addressing limitations and generality seems to suggest applicability past this narrow condition, which I do not think is the case, but I would be happy to be convinced otherwise.
As for fleshing out the scientific results: in the current manuscript, they are presented to validate the approach and are largely confirmatory of what we already know about the retina (which allows for this validation). Also, many of the results are measurements based on examples, and in some cases are not accumulated beyond a single recording. Finally, it is not clear to what extent these results depend on the specific recordings in the peripheral marmoset retina: what about more central retina, or other species?
For demonstrating the utility of the methodology: here are some of the main limitations to generalizing past this specific case:
(1) the necessity of linear or near-linear processing in previous layers;
(2) lack of any negative components;
(3) lack of ability to account for other influences on spiking than the positive contributions of LN subunits;
(4) necessity of white noise stimulation that is specifically sized for a uniform subunit size.
Together, I believe this precludes potential applications to other areas in the brain: further back in the visual system will require non-linear transforms as well as the convergence of positive and negative inputs. Other sensory systems like the auditory system are even more non-linear well before getting to even mid-level pre-cortical structures and also combine positive and negative influences. Given the importance of inhibition in the retina (including what is thought to be an important role of amacrine cells in shaping RGC responses), it is not clear how general this approach is in the retina, although the specific results shown are believable. How could this approach generalize, realistically? Could applications to other types of data be demonstrated, and could the approach plausibly get around these fundamental limitations? How?