Peer review process
Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.
Read more about eLife’s peer review process.
Editors
- Reviewing Editor: Richard Naud, University of Ottawa, Ottawa, Canada
- Senior Editor: Panayiota Poirazi, FORTH Institute of Molecular Biology and Biotechnology, Heraklion, Greece
Reviewer #1 (Public review):
Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, which leads to the experimentally observed phenomenon of feature competition. The authors also examine how various (hyper)parameters, such as adaptation timescale, the excitatory-to-inhibitory cell ratio, regularization strength, and background current, affect the model. These findings add biological realism to a specific implementation of efficient coding. They show that efficient coding explains, or at least is consistent with, multiple experimentally observed properties of excitatory and inhibitory neurons.
As discussed in the first round of reviews, the model's ability to replicate biological observations such as the 4:1 ratio of excitatory vs. inhibitory neurons hinges on somewhat arbitrary hyperparameter choices. Although this may limit the model's explanatory power, the authors have made significant efforts to explore how these parameters influence their model. It is an empirical question whether the uncovered relationships between, e.g., metabolic cost and the fraction of excitatory neurons are biologically relevant.
The revised manuscript is also more transparent about the model's limitations, such as the lack of excitatory-excitatory connectivity. Further improvements could come from explicitly acknowledging additional discrepancies with biological data, such as the widely reported weak stimulus tuning of inhibitory neurons in the primary sensory cortex of untrained animals.
Reviewer #2 (Public review):
Summary:
In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength.
Strengths:
While many network implementations of efficient coding have been developed, such normative models are often abstract and lack sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models. In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important. Lastly, though several of the observations have been reported and studied before, this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.
Weaknesses:
This work is the latest among a line of research papers studying the properties of efficient spiking networks. Many of the characteristics and findings here have been discussed before, thereby limiting the new insights that this work can provide. Thus, the conclusions of this work should be considered and understood in the context of those previous works, as the authors state. Furthermore, the number of assumptions and free parameters in the model, though necessary to bring the model closer to biophysical reality, makes it more difficult to understand and to draw clear conclusions from. As the authors state, many of the optimality claims depend on these free parameters, such as the dimensionality of the input signal (M=3), the relative weighting of encoding error and metabolic cost, and several others. This raises the possibility that it is not the case that the set of biophysical properties measured in the brain is accounted for by efficient coding, but rather that theories of efficient coding are flexible enough to be consistent with this regime. With this in mind, some of the conclusions made in the text may be overstated and should be considered in this light.
Conclusions, Impact, and additional context:
Notions of optimality are important for normative theories, but they are often studied in simple models with as few free parameters as possible. Biophysically detailed and mechanistic models, on the other hand, will often have many free parameters by their very nature, thereby muddying the connection to optimality. This tradeoff is an important concern in neuroscientific models. Previous efficient spiking models have often been criticized for their lack of biophysically-plausible characteristics, such as large synaptic weights, dense connectivity, and instantaneous communication. This work is an important contribution in showing that such networks can be modified to be much closer to biophysical reality without losing their essential properties. Though the model presented does suffer from complexity issues which raise questions about its connections to "optimal" efficient coding, the extensive study of various parameter dependencies offers a good characterization of the model and puts its conclusions in context.
Reviewer #3 (Public review):
Summary:
In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit their energy use at the same time, and how can this be done in a more biologically realistic way than previous work?
They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons are tasked with representing some lower-dimensional input signal, and the I neurons with representing the signal encoded by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state and can accurately represent the network inputs.
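For readers who want to see the logic of this greedy spiking rule concretely, the following is a minimal single-population sketch in illustrative notation; it is not the manuscript's exact E-I formulation, which uses separate losses for the two populations and Dale's-law-compliant connectivity:

```latex
% Schematic loss: squared readout error plus a per-spike metabolic cost,
% where \hat{x}(t) = \sum_k w_k r_k(t) is the readout from filtered spike trains.
L(t) = \lVert x(t) - \hat{x}(t) \rVert^2 + \beta \sum_k r_k(t)

% Neuron k fires only if adding its spike (which increments the readout by w_k)
% lowers the loss:
\lVert x - \hat{x} - w_k \rVert^2 + \beta (r_k + 1) \;<\; \lVert x - \hat{x} \rVert^2 + \beta r_k

% Expanding the squares yields a leaky integrate-and-fire-like condition,
% with "membrane potential" V_k and firing threshold T_k:
V_k \equiv w_k^{\top}\,(x - \hat{x}) \;>\; \frac{\lVert w_k \rVert^2 + \beta}{2} \equiv T_k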
They then investigate in depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and show the networks can operate in a biologically realistic regime.
Strengths:
* The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field
* They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly
* They put sensible constraints on their networks, while still maintaining the good properties these networks should have
Weaknesses:
* One of the core goals of the paper is to make a more biophysically realistic network than previous work using similar optimization principles. One of the important things they consider is a split into E and I neurons. While this works fine, and they consider the coding consequences of this, it is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. This would be out of scope for the current paper however.
* The theoretical advances in the paper are not all novel by themselves, as most of them (in particular the split into E and I neurons and the use of biophysical constants) had been achieved in previous models. However, the authors discuss these links thoroughly and do more in-depth follow-up experiments with the resulting model.
Assessment and context:
Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic and incorporating aspects of energy efficiency. For computational neuroscientists this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers the model provides a clearer link between efficient-coding spiking networks and known experimental constraints, and provides a few predictions.
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This study offers a useful treatment of how populations of excitatory and inhibitory neurons integrate principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.
We are genuinely grateful to the Editors and Reviewers for taking time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. We decided to do our very best to implement all suggestions, as detailed in the point-by-point rebuttal letter below. We feel that our paper has improved considerably as a result.
Public Reviews:
Reviewer #1 (Public Review):
Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.
Thanks for these insights and for this summary of our work.
Major comments:
(1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.
We now describe in more detail the ratio of the numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how the optimal ratio of E vs I neuron numbers depends in our model on the relative weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig. 7E). We revised the text on page 12 describing Fig. 7E.
To allow readers to easily form a clear idea of how the weighting of the error vs the cost may influence the optimal network configuration, we now present how optimal parameters depend on this weighting in a systematic way, by always including this type of analysis when studying all other model parameters (time constants of single E and I neurons, noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity). These results are shown in Supplementary Fig. S4A-D and H, and we comment briefly on each of them in the Results sections (pages 9, 10, 11 and 12) that analyze each of these parameters.
Following this Reviewer’s comment, we have now included a joint analysis of network performance relative to the ratio of E-I neuron numbers and the ratio of mean I-I to E-I connectivity (Fig. 7J). We found a positive correlation between the optimal values of these two ratios. This implies that a lower ratio of E-I neuron numbers, such as the 2:1 ratio in human cortex mentioned by the reviewer, predicts a lower optimal ratio of I-I to E-I connectivity and thus weaker inhibition in the network. We made sure that this finding is suitably described in the revision (page 13).
(2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.
We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity implementing lateral inhibition similar to that proposed in the recent studies mentioned by the Reviewer. We apologize if this was not clear enough in the previous version, and we streamlined the presentation to make it clearer in the revision. We nevertheless think it useful to report the effects of perturbations within this network, because these results give information about how lateral inhibition works in our network. Thus, we kept presenting them in the revised version, although we de-emphasized and simplified their presentation. We now give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding (pages 4 and 6). We also describe better (page 8) what the specific results of our simulated perturbation experiments add to the existing literature.
(3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.
We improved the Limitations paragraph in Discussion, and also anticipated caveats in tandem with results when needed, as suggested.
We now mention the assumption of equal time constants between the targets and readouts in the Abstract.
We have now added an analysis of the network performance and dynamics as a function of the time constant of the target (τx) to Supplementary Fig. S5C-E. These results are briefly discussed in the text on page 13. The only measure sensitive to τx is the encoding error of E neurons, with a minimum at τx = 9 ms, while the encoding error of I neurons and the metabolic cost show no dependency. Firing rates, variability of spiking, as well as the average and instantaneous balance, show no dependency on τx. We note that τx = τ, with τ = 1/λ the time constant of the population readout (Eq. 9), is an assumption we use when deriving the model from the efficiency objective (Eqs. 18 to 23). In our new and preliminary work (Koren, Emanuel, Panzeri, bioRxiv 2024), we derived a more general class of models where this assumption is relaxed, which gives a network with E-E connectivity that adapts to the time constant of the stimulus. Thus, the reviewer is correct in the intuition that the network requires E-E connectivity to better integrate target signals with a time constant different from the membrane time constant. We now better emphasize this limitation in the Discussion (page 16).
(4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future – but most of the “predictions” from the model are actually findings that broadly match earlier experimental results, making them “postdictions”.
This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.
We now comment on every result from the model as either matching earlier experimental results, or being a prediction for experiments.
In Section “Assumptions and emergent properties of the efficient E-I network derived from first principles”, we report (page 4) that neural networks have connectivity structure that relates to tuning similarity of neurons (postdiction).
In Section “Encoding performance and neural dynamics in an optimally efficient E-I network” we report (page 5) that in a network with optimal parameters, I neurons have higher firing rate than E neurons (postdiction), that single neurons show temporally correlated synaptic currents (postdiction) and that the distribution of firing rates across neurons is log-normal (postdiction).
In Section “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” we report (page 6) that the activity perturbation of E neurons induces lateral inhibition on other E neurons, and that the strength of lateral inhibition depends on tuning similarity (postdiction). We show that activity perturbation of E neurons induces lateral excitation in I neurons (prediction). We moreover show that the specific effects of the perturbation of neural activity rely on structured E-I-E connectivity (prediction for experiments, but similar result in Sadeh and Clopath, 2020). We show strong voltage correlations but weak spike-timing correlations in our network (prediction for experiments, but similar result in Boerlin et al. 2013).
In Section “The effect of structured connectivity on coding efficiency and neural dynamics”, we report (page 7) that our model predicts a number of differences between networks with structured and unstructured (random) connectivity. In particular, structured networks differ from unstructured ones by showing better encoding performance, lower metabolic cost, weaker variance over time in the membrane potential of each neuron, lower firing rates and weaker average and instantaneous balance of synaptic currents.
In Section “Weak or no spike-triggered adaptation optimizes network efficiency”, we report (page 9) that our model predicts better encoding performance in networks with adaptation compared to facilitation. Our results suggest that adaptation should be stronger in E compared to I (PV+) neurons (postdiction). In the same section, we report (page 10) that our results suggest that the instantaneous balance is a better predictor of model efficiency than average balance (prediction).
In Section “Non-specific currents regulate network coding properties”, we report (page 10) that our model predicts that more than half of the distance between the resting potential and firing threshold is taken by external currents that are unrelated to feedforward processing (postdiction). We also report (page 11) that our model predicts that moderate levels of uncorrelated (additive) noise are beneficial for efficiency (prediction for experiments, but similar results in Chalk et al., 2016, Koren et al., 2017, Timcheck et al. 2022).
In Section “Optimal ratio of E-I neuron numbers and of mean I-I to E-I synaptic efficacy coincide with biophysical measurements”, we predict the optimal ratio of E to I neuron numbers to be 4:1 (postdiction) and the optimal ratio of mean I-I to E-I connectivity to be 3:1 (postdiction). Further, we report (page 13) that our results predict that a decrease in the ratio of E-I neuron numbers is accompanied by a decrease in the ratio of mean I-I to E-I connectivity.
Finally, in Section “Dependence of efficient coding and neural dynamics on the stimulus statistics”, we report (page 13) that our model predicts that the efficiency of the network has almost no dependence on the time scale of the stimulus (prediction).
Reviewer #2 (Public Review):
Summary:
In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.
Strengths:
While many network implementations of efficient coding have been developed, such normative models are often abstract and lack sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.
In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some longstanding puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.
Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.
Thanks for these insights and for the kind words of appreciation of the strengths of our work.
Weaknesses:
Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.
We indeed built our work on these important previous studies, and we apologize if this was not clear enough. We thus improved the text to make sure that credit to previous studies is more precisely and more clearly given (see detailed reply for the list of changes made).
To facilitate the understanding on how we built on previous work, we expanded the comparison of our results with the results of Boerlin et al. (2013) about voltage correlations and uncorrelated spiking (page 7), comparison with the derivation of physical units of Boerlin et al. (2013) (page 3), discussion of how results on the ratio of the number of E to I neurons relate to Calaim et al (2022) and Barrett et al. (2016) (page 16), and comment on the previous work by Gutierrez and Deneve about adaptation (page 8).
Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following.
Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that, intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error, and that previous work has suggested that additional costs bias the output.
Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022).
Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network performs a similar computation to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.
With regard to the concern that our previous analyses considered optimal parameter sets determined with a sweep of a single parameter at a time, we have addressed this issue in two ways. First, we presented (Figures 6I and 7J and text on pages 11 and 13) results of joint sweeps of pairs of parameters whose joint variations are expected to influence optimality in a way that cannot be understood by varying one parameter at a time. These new analyses complement the joint parameter sweep of the time constants of single E and I neurons (τrE and τrI) that was already presented in Fig. 5A (former Fig. 4A). Second, we conducted, within a reasonable/realistic range of possible variations of each individual parameter, a Monte-Carlo random joint sampling (10,000 simulations with 20 trials each) of all 6 model parameters that we explored in the paper. We present these new results in Fig. 2 and discuss them on pages 5-6.
The Reviewer is correct in stating that the error (RMSE) exhibits a counterintuitive minimum as a function of the metabolic constant, despite the fact that, intuitively, for a vanishing metabolic constant the network is solely minimizing the coding error (Fig. 6B). In our understanding, this counterintuitive finding is due to the presence of noise in the membrane potential dynamics. In the presence of noise, a non-vanishing metabolic constant is needed to suppress “inefficient” spikes that are purely induced by noise, do not contribute to coding, and increase the error. This gives rise to a form of “stochastic resonance”, where the noise improves detection of the signal coming from the feedforward currents. We note that the metabolic constant and the noise variance both appear in the non-specific external current (Eq. 29f in Methods), and, thus, a covariation in their optimal values is expected. Indeed, we find that the optimal metabolic constant monotonically increases as a function of the noise variance, with stronger regularization (larger β) required to compensate for larger variability (larger σ) (Fig. 6I). Finally, we note that a moderate level of noise (which, in turn, induces a non-trivial minimum of the coding error as a function of β) is optimal for the network. The beneficial effect of moderate levels of noise on performance in networks with efficient coding has been shown in different contexts in previous work (Chalk et al. 2016, Koren and Deneve, 2017). The intuition is that the noise prevents excessive synchronization of the network and insufficient single-neuron variability, both of which decrease performance. The points above are now explained in the revised text on page 11.
The Reviewer is also correct in stating that the network exhibits optimal performance for intermediate values of the number of I neurons and the number of encoded features. In our understanding, the optimal number of encoded features of M=3 arises simply because all the other parameters were optimized for that value of M. The purpose of those analyses was not to state that a network optimally encodes only a given number of features, but to show how a network whose parameters are optimized for a given M performs reasonably well when M is varied. We clarify this on page 13 of Results and in the Discussion on page 16. In the same Discussion paragraph we also refer to the results of Calaim et al. mentioned by the Reviewer.
To address the concern about the comparison of efficiency between the E-I and the 1CT model, we took advantage of the Reviewer’s suggestions to consider this issue more deeply. In the revision, we now compare the efficiency of the 1CT model with the E population of the E-I model (Fig. 8H). This new comparison changes the conclusion about which model is more efficient, as it shows that the 1CT model is slightly more efficient than the E-I model. Nevertheless, the performance of the E-I model is more robust to small variations of the optimal parameters, e.g., it exhibits biologically plausible firing rates for non-optimal values of the metabolic constant. See also the reply to point 3 of the Public Review of Reviewer 2 for more detail. We added these results and the ensuing caveats for the interpretation of this comparison on page 14, and also revised the title of the last subsection of Results.
Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply correspond to the values that should come out of the derivation.
We thank the reviewer for bringing about these important questions.
In the first submission, we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how robust the optimal parameters would be to changes in the relative weighting of encoding error and metabolic cost. We specified this in Results (page 5) and we keep presenting the encoding and metabolic terms separately in the revision.
However, we agree that it is important to present an explicit quantification of how the optimal parameters may depend on gL. In the first submission, we showed the analysis for all possible weightings in the case of the two parameters for which we found this analysis most relevant, namely the ratio of neuron numbers (Fig. 7E; Fig. 6E in the first submission) and the optimal number of input features M (see the last paragraph on page 13 and Fig. 8D). We now show this analysis also for the rest of the studied model parameters in Supplementary Fig. S4 (A-D and H). This is discussed on pages 9, 10, 11 and 12.
With regard to the concern that the scaling of synaptic weights should not be controlled separately for each connection type in the network, we agree, and we would like to clarify that we did not control such scaling separately. Apologies if this was not clear enough. From the optimal analytical solution, we obtained that the connectivity scales with the standard deviation of the decoding weights (σwE and σwI) of the pre- and postsynaptic populations (Methods, Eq. 32). We studied the network properties as a function of the ratio of average I-I to E-I connectivity (Fig. 7F-I; Supplementary Fig. S4D-H), which is equivalent to the ratio of standard deviations σwI/σwE (see Methods, Eq. 35). We clarified this in the text on page 12.
Next, it is correct that our synaptic weights are an order of magnitude smaller than the metabolic constant. We analysed a simpler version of the network that has coding and dynamics identical to our full model (Methods, Eq. 25) but without the external currents. We found that the optimal parameters determining the firing threshold in such a simpler network were biologically implausible (see Supplementary Text 2 and Supplementary Table S1). We considered, as another simple solution, rescaling the synaptic efficacy so as to obtain a biologically plausible threshold. However, that gave an implausible mean synaptic efficacy (see Supplementary Text 2). Thus, to be able to define a network with biologically plausible firing threshold and mean synaptic efficacy, we introduced the non-specific external current. After introducing such a current, we were able to shift the firing threshold to biologically plausible values while keeping realistic values of the mean synaptic efficacy. Biologically plausible values for the firing threshold are around 15 to 20 mV above the resting potential (Constantinople and Bruno, 2013), which is the range we have in our model. A plausible value for the average synaptic strength is between a fraction of a millivolt and a couple of millivolts (Constantinople & Bruno, 2013, Campagnola et al. 2022), which also corresponds to the values that our synaptic weights take. The above results are briefly explained in the revised text on page 4.
Finally, to study the optimality of the network when changing multiple parameters at a time, we added a new analysis with Monte-Carlo random joint sampling (10,000 parameter sets with 20 trials for each set) of all 6 model parameters that we explored in the paper. We compared (Fig. 2) the results of each of these simulations with those obtained with the optimal parameters found by varying one or two parameters at a time (reported in Table 1 and used throughout the paper). We found (Fig. 2) that the optimal configuration in Table 1 was never improved upon by any of the other simulations we performed, and that the three random configurations that came closest to the optimal one of Table 1 had stronger noise intensity but also stronger metabolic cost than the configuration in Table 1. The second, third and fourth best configurations had longer time constants of both E and I single neurons (adaptation time constants). The ratios of E-I neuron numbers and of I-I to E-I connectivity in the second, third and fourth best configurations were either jointly increased or jointly decreased with respect to our configuration. These results are reported in Fig. 2 and in Tables 2-3, and they are discussed in Results (page 5).
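For illustration, a minimal sketch of this kind of random joint search follows; the parameter names, ranges, and the stand-in simulation function are placeholders rather than the values or code used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plausible ranges for the six jointly sampled parameters (placeholders).
param_ranges = {
    "tau_rE": (5.0, 40.0),          # E single-neuron time constant (ms)
    "tau_rI": (5.0, 40.0),          # I single-neuron time constant (ms)
    "sigma": (0.0, 5.0),            # noise intensity
    "beta": (2.0, 30.0),            # metabolic constant
    "ratio_EI_numbers": (1.0, 8.0), # ratio of E to I neuron numbers
    "ratio_II_EI": (1.0, 6.0),      # ratio of mean I-I to E-I connectivity
}

def simulate_trial(params, rng):
    """Stand-in for one 1-second simulation of the E-I spiking network.
    The real function would return (encoding error, metabolic cost);
    here it returns random values so the sketch runs end to end."""
    return rng.random(), rng.random()

n_sets, n_trials, g_L = 10_000, 20, 0.7   # g_L weights error vs. cost
results = []
for _ in range(n_sets):
    params = {name: rng.uniform(lo, hi) for name, (lo, hi) in param_ranges.items()}
    errors, costs = zip(*(simulate_trial(params, rng) for _ in range(n_trials)))
    loss = g_L * np.mean(errors) + (1.0 - g_L) * np.mean(costs)
    results.append((loss, params))

# Rank configurations by the trial-averaged loss and keep the best few.
best_configurations = sorted(results, key=lambda item: item[0])[:4]
```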
Reviewer #3 (Public Review):
Summary:
In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?
They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons are tasked with representing some lower-dimensional input signal, and the I neurons with representing the signal encoded by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state and can accurately represent the network inputs.
They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.
Strengths:
(1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.
(2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.
(3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.
Thanks for this summary and for these kind words of appreciation of the strengths of our work.
Weaknesses:
(1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.
We improved the text to make sure that credit to previous studies is more precisely and more clearly given (see rebuttal to the specific suggestions of Reviewer 2 for a full list).
We apologize if this was not clear enough in the previous version.
With regard to the specific point raised here about the E-I split, we revised the text on page 2. With regard to the realistic units, we revised the text on page 3. Finally, we commented on the relation between our results and the results of the study by Barrett et al. (2016) on page 16.
(2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.
We indeed removed non-Dalian connections because Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. We clarified this in revision (page 4).
(3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.
We carefully considered these possibilities and decided to compare only the E population of the E-I model with the 1-CT model. In Fig. 8G (7C of the first submission), E neurons have a slightly higher error and cost compared to the 1CT network. In the revision, we compared the loss of the E neurons of the E-I model with the loss of the 1-CT model. Using this comparison, we found that the 1CT network has a lower loss and is more efficient than the E neurons of the E-I model. We revised Figure 8H and the text on page 14 to address this point.
(4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.
We tried to make the presentation of the model more accessible to a non-computational audience in the revised paper. We carefully edited the text throughout to make it as accessible as possible.
Assessment and context:
Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient-coding spiking networks and known experimental constraints and provides a few predictions.
Thanks for these kind words. We revised the paper to make sure that these points emerge more clearly and in a more accessible way from the revised paper.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Referring to the major comments:
(1) Be upfront about particular modelling choices and why you made them; avoid talk of a "striking/surprising", etc. ability to explain data when this actually requires otherwise-arbitrary choices and auxiliary assumptions. Ideally, this nuance is already clear from the abstract.
We removed all the "striking/surprising" and similar expressions from the text.
We added to the Abstract the assumption of equal time constants of the stimulus and of the membrane of E and I neurons and the assumption of the independence of encoded stimulus features.
In the revision, we performed additional analyses (joint parameter sweeps, Monte-Carlo joint sampling of all 6 model parameters) providing additional evidence that the network parameters in Table 1 capture the optimal solution reasonably well. These are reported in Figs. 2, 6I and 7J and in Results (pages 5, 11 and 13). See the rebuttal to the weaknesses in the public review of Reviewer 2 for details.
(2) Make even more of an effort to acknowledge prior work on the importance of structured E-I and I-E connectivity.
We have revised the text (page 4) to better place our results within previous work on structured E-I and I-E connectivity.
(3) Be clear about the model's limitations and mention them throughout the text. This will allow readers to interpret your results appropriately.
We now comment more on model's limitations, in particular the simplifying assumption about the network's computation (page 16), the lack of E-E connectivity (page 3), the absence of long-term adaptation (page 10), and the simplification of only having one type of inhibitory neurons (page 16).
(4) Present your "predictions" for what they are: aspects of the model that can be made consistent with the existing data after some fitting. Except in the few cases where you make actual predictions, which deserve to be highlighted.
We followed the suggestion of the reviewer and distinguished cases where the model is consistent with the data (postdictions) from actual predictions, where empirical measurements are not available or not conclusive. We compiled a list of predictions and postdictions in response to point 4 of Reviewer 1. In the revision, we now comment on every property of the model as either reproducing a known property of biological networks (postdiction) or being a prediction. We improved the text in Results on pages 4, 5, 6, 7, 9, 10, 11, 12 and 13 to accommodate these requests.
Minor comments and recommendations
It's a sizable list, but most can be addressed with some text edits.
(1) The image captions should give more details about the simulations and analyses, particularly regarding sample sizes and statistical tests. In Figure 5, for example, it is unclear if the lines represent averages over multiple signals and, if so, how many. It's probably not a single realization, but if it is, this might explain the otherwise puzzling optimal number of three stimuli. Box plots visualize the distribution across simulation trials, but it's not clear how many. In Figure 7d, a star suggests statistical significance, but the caption does not mention the test or its results; the y-axis should also have larger limits.
All statistical results were computed on 100 or 200 simulation trials, depending on the figure, with a trial duration of 1 second of simulated time. To compute the statistical results in Fig. 1, we used 10 trials of 10 seconds each. Each trial consisted of M independent realizations of Ornstein-Uhlenbeck (OU) processes as stimuli, independent noise in the membrane potential, and an independent draw of tuning parameters, so that the results generalize over specific realizations of these random variables. Realizations of the OU processes were independent across stimulus dimensions and across trials. We added this information in the caption of each figure.
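As an illustrative sketch of how such a trial's stimulus can be generated (an Euler-Maruyama discretization of M independent OU processes; the numerical values are placeholders rather than the paper's exact settings):

```python
import numpy as np

def ou_stimulus(M=3, duration=1.0, dt=1e-4, tau=0.010, sigma=1.0, rng=None):
    """Generate M independent Ornstein-Uhlenbeck processes, one per stimulus
    dimension. `tau` may be a scalar or a length-M array, which allows a
    different timescale in each dimension. Parameter values are placeholders."""
    rng = rng if rng is not None else np.random.default_rng()
    tau = np.broadcast_to(np.asarray(tau, dtype=float), (M,))
    n_steps = int(round(duration / dt))
    s = np.zeros((n_steps, M))
    for t in range(1, n_steps):
        drift = -s[t - 1] / tau * dt
        diffusion = sigma * np.sqrt(2.0 * dt / tau) * rng.standard_normal(M)
        s[t] = s[t - 1] + drift + diffusion
    return s

# One 1-second trial with an independent realization in each of M=3 dimensions:
stimulus = ou_stimulus(M=3, duration=1.0, rng=np.random.default_rng(1))
```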
The optimal number of M=3 stimuli is the result of measuring the performance of the network in 100 simulation trials (for each parameter value), thus following the same procedure as for all other parameters. Boxplots in Fig. 8G-H were also generated from results computed in 100 simulation trials, which we have now specified in the caption of the figure, together with the statistical test used for assessing significance (two-tailed t-test). We also enlarged the limits of Fig. 8H (7D in the previous version).
(2) The Oldenburg paper (reference 62) finds suppression of all but nearby neurons in response to two-photon stimulation of small neural ensembles (instead of single neurons, as in Chettih & Harvey). This isn't perfectly consistent with the model's results, even though the Oldenburg experiments seem more relevant given the model's small size and strong connectivity/high connection probability between similarly tuned neurons. What might explain the potential mismatch?
We sincerely apologize for not having been precise enough on this point when comparing our model against Chettih & Harvey and Oldenburg et al. We corrected the sentence (page 6) to remove the claim that our model reproduces both.
We speculate that the discrepancy between perturbing our model and the Oldenburg data may arise from the lack of E-E connectivity in our model. Synaptic connections between E neurons with similar selectivity could create enhancement instead of suppression between neuronal pairs with very similar tuning. We added a sentence about this in the section with perturbation experiments, “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” (page 7), where we discuss this limitation of our model. We feel that this example shows the utility of deriving some perturbation results from our model, as not all networks with some degree of lateral inhibition will show the same perturbation results. Comparing perturbations of our model with experimental perturbation results thus has value for better appreciating the strengths and limitations of our approach.
(3) "Previous studies optogenetically stimulated E neurons but did not determine whether the recorded neurons were excitatory or inhibitory " (p. 11). I believe Oldenburg et al. did specifically image excitatory neurons.
The reviewer is correct about Oldenburg et al. imaging specifically excitatory neurons. We have revised this part of the Discussion (page 15).
(4) The authors write that efficiency is particularly achieved where adaptation is stronger in E compared to I neurons (p. 7; Figure 4). Although this would be consistent with experimental data (the I neurons in the model seem akin to fast-spiking Pv+ cells), I struggle to see it in the figure. Instead, it seems like there are roughly two regimes. If either of the neuronal timescales is faster than the stimulus timescale, the optimisation fails. If both are at least as slow, optimisation succeeds.
We agree with the reviewer that the adaptation properties of our inhibitory neurons are compatible with Pv+ cells. What is essential for determining the dynamical regime of the network is not so much the relation to the time constant of the stimulus (τx), but rather the relation between the time constant of the population readout (τ, which is also the membrane time constant) and the time constant of the single neuron (τry for y = E and y = I; see Eqs. 23, 25 and 29e). The relation between τ and τry determines whether single neurons generate spike-triggered adaptation (τry > τ) or spike-triggered facilitation (τry < τ; see Table 4). In regimes with facilitation in either E or I neurons (or both), the network performance strongly deteriorates compared to regimes with adaptation (Fig. 5A).
Beyond adaptation leading to better performance, we also found different effects of adaptation in E and I neurons. We acknowledge that the difference between these effects was difficult to see in Fig. 4B of the first submission. We have now replotted the results from the previous Fig. 4B to focus on the adaptation regime only (since Fig. 5A already establishes that this is the regime with better performance). We also added figures showing the differential effect of adaptation in E and I cell types on the firing rate and on the average loss (Fig. 5C-D). Fig. 5B and C (top plots) show that with adaptation in E neurons, the error and the loss increase more slowly than with adaptation in I neurons. Moreover, the firing rate in both cell types decreases with adaptation in E neurons, while this is not the case with adaptation in I neurons (Fig. 5D). These results are added to the figure panels specified above and discussed in the text on page 9.
To clarify the relation between neuronal and stimulus timescales, we also added an analysis of network performance as a function of the time constant of the stimulus τx (Supplementary Fig. S5C-E). We found that the model's performance is optimal when the time constant of the stimulus is close to the membrane time constant τ. This result is expected, because the equality of these time constants was imposed in our analytical derivation of the model (τx = τ). We see a similar decrease in performance for values of τx that are either faster or slower than the membrane time constant (Supplementary Fig. S5C, top). These results are added to the figure panels specified above and discussed in the text on page 13.
(5) A key functional property of cortical interneurons is their lower stimulus selectivity. Does the model replicate this feature?
We think that whether I neurons are less selective than E neurons is still an open question. A number of recent empirical studies reported that the selectivity of I neurons is comparable to the selectivity of E neurons (see, e.g., Kuan et al. Nature 2024, Runyan et al. Neuron 2010, Najafi et al. Neuron 2020). In our model, the optimal solution prescribes a precise structure in the recurrent connectivity (see Eq. 24 and Fig. 1C(ii)), and this structured connectivity endows I neurons with stimulus selectivity. To show this, we added plots of example tuning curves and the distribution of the selectivity index across E and I neurons (Fig. 8E-F) and described these new results in Results (page 14). Tuning curves in our network were similar to those computed in a previous work that addressed stimulus tuning in efficient spiking networks (Barrett et al. 2016). We evaluated tuning curves using M=3 constant stimulus features, varying one of the features while keeping the other two fixed. We provided details on how the tuning curves and the selectivity index were computed in a new Methods subsection (“Tuning curves and selectivity index”) on page 50.
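As a rough illustration of this kind of analysis (the paper's exact definitions are in the new Methods subsection; the contrast-style index and the toy tuning function below are assumptions made only for this sketch):

```python
import numpy as np

def tuning_curve(rate_fn, varied_values, fixed_features=(0.0, 0.0)):
    """Vary one of the M=3 constant stimulus features over `varied_values`
    while keeping the other two fixed; `rate_fn` stands in for running the
    network and measuring a neuron's mean firing rate for that stimulus."""
    return np.array([rate_fn(np.array([v, *fixed_features])) for v in varied_values])

def selectivity_index(rates):
    """Assumed contrast-style index: 0 for a flat tuning curve, approaching 1
    when the neuron responds only near its preferred feature value."""
    r_max, r_min = rates.max(), rates.min()
    return (r_max - r_min) / (r_max + r_min + 1e-12)

# Toy example: a neuron with Gaussian tuning to the first stimulus feature.
toy_rate = lambda s: 10.0 * np.exp(-((s[0] - 0.5) ** 2) / 0.1)
rates = tuning_curve(toy_rate, np.linspace(-1.0, 1.0, 21))
print(selectivity_index(rates))
```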
(6) The final panels of Figure 4 are presented as an approach to test the efficiency of biological networks. The authors seem to measure the instantaneous (and time-averaged) E-I balance while varying the adaptation parameter and then correlate this with the loss. If that is indeed the approach (it's difficult to tell), this doesn't seem to suggest a tractable experiment. Also, the conclusion is somewhat obvious: the tighter the single neuron balance, the fewer unnecessary spikes are fired. I recommend that the authors clearly explain their analysis and how they envision its application to biological data.
We indeed measured the instantaneous (and time-averaged) E-I balance while varying the adaptation parameters and then correlated this with the loss. We did not want to imply that the latter panels of Figure 4 are a means to test the efficiency of biological networks, or that we are suggesting new and possibly unfeasible experiments. We see it as a way to better understand conceptually how spike-triggered adaptation helps the network’s coding efficiency, by tightening the E-I balance in a way that reduces the number of unnecessary spikes. We apologize if the previous text was confusing in this respect. We have now removed the initial paragraph of the former Results subsection (including the subsection title) and added new text about the different effects of adaptation in E and I neurons on page 9. We also thoroughly revised Figure 5.
(7) The external stimuli are repeatedly said to vary (or be tracked) across "multiple time scales", which might inadvertently be interpreted as (i) a single stimulus containing multiple timescales or (ii) simultaneously presented stimuli containing different timescales. These scenarios are potential targets for efficient coding through neuronal adaptation (reference 21 in the manuscript and Pozzorini et al. Nat. Neuro. 2013), but they are not addressed in the current model. I recommend the authors clarify their statements regarding timescales (and if they're up for it, acknowledge this as a limitation).
We thank the reviewer for bringing up this interesting point. To address the second point raised by the reviewer (simultaneously presented stimuli containing multiple timescales), we performed new analyses to test the model with simultaneously presented stimuli that have different timescales. We found that the model encodes such stimuli efficiently. We tested the case of a 3-dimensional stimulus where each dimension is an Ornstein-Uhlenbeck process with a different time constant. More precisely, we kept the time constant in the first dimension fixed (at 10 ms), and varied the time constants in the second and third dimensions such that the time constant in the third dimension was double that in the second. We plotted the encoding error in every stimulus dimension for E and I neurons (Fig. 8B, left plot) as well as the encoding error and the metabolic cost averaged across stimulus dimensions (Fig. 8B, right plot). The results are briefly described in the text on page 13.
Regarding case (i) (a single stimulus containing multiple timescales), we considered two possibilities. One possibility is that the timescales of the stimulus are separable; in this case a single stimulus containing several timescales can be decomposed into several stimuli with a single timescale each. As we assign a new set of weights to each dimension of the decomposed stimulus, this case is similar to case (ii) that we already addressed. The other possibility is that the timescales of the stimulus cannot be separated. This case is not covered by the present analysis and we listed it among the limitations of the model. We revised the text (page 13) around the question of multiple timescales and included the citation of Pozzorini et al. (2013).
(8) It is claimed that the model uses a mixed code to represent signals, citing reference 47 (Rigotti et al., Nature 2013). But whereas the model seems to use linear mixed selectivity, the Rigotti reference highlights the virtues of nonlinear mixed selectivity. In my understanding, a linearly mixed code does not enjoy the same benefits since it’s mathematically equivalent to a non-mixed code (simply rotate the readout matrix). I recommend that the authors clarify the type of selectivity used by their model and how it relates to the paper(s) they cite.
The reviewer is correct that our selectivity is a linear mixing of input variables, and differs from the selectivity in Rigotti et al. (2013) which is non-linear. We revised the sentence on page 4 to clarify better that the mixed selectivity we consider is linear and we removed Rigotti’s citation.
(9) Reference 46 is cited as evidence that leaky integration of sensory features is a relevant computation for sensory areas. I don’t think this is quite what the reference shows. Instead, it finds certain morphological and electrophysiological differences between single pyramidal neurons in the primary visual cortex compared to the prefrontal cortex. Reference 46 then goes on to speculate that these differences are relevant to sensory computation. This may seem like a quibble, but given the centrality of the objective function in normative theories, I think it's important to clarify why a particular objective is chosen.
We agree that our reference to Amatrudo et al. was not the best choice and that the previous text was confusing, and we have tried to improve its clarity. We looked at the previous theoretical efficient coding papers introducing this leaky integration and could not find, in that previous theoretical work, a justification of this assumption based on experimental papers. However, there is evidence that neurons in sensory structures and in cortical association areas respond to time-varying sensory evidence by summing stimuli over time, with a weight that decreases steadily going back in time from the moment of firing, suggesting that neurons integrate time-varying sensory features. In many cases these integration kernels decay approximately exponentially going back in time, and several models that successfully explain perceptual readouts of neural activity assume leaky integration. This suggests that the mathematical approximation of leaky integration of sensory evidence, though possibly simplistic, is reasonable. We revised the text in this respect (page 2).
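For concreteness, the assumption of leaky integration can be written in a generic form with an exponential kernel (the symbols below are illustrative and are not copied from the manuscript's equations):

```latex
\frac{d\hat{x}_k(t)}{dt} \;=\; -\,\frac{\hat{x}_k(t)}{\tau} \;+\; s_k(t)
\qquad\Longleftrightarrow\qquad
\hat{x}_k(t) \;=\; \int_{0}^{\infty} e^{-u/\tau}\, s_k(t-u)\, du ,
```

so the exponential kernel e^{-u/τ} implements exactly a weighting of past evidence that decays going back in time from the moment of readout.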
(10) The definition of the objective function uses beta as a tuning parameter, but later parts of the text and figures refer to a parameter g_L which might only be introduced in the convex combination of Eq. 40a.
This is correct. Parameter optimization was performed on a weighted sum of the average encoding error and the average metabolic cost, as given by Eq. 39a (Eq. 40a in the first submission), with the weighting g_L trading off the error against the cost, and not with the beta that is part of the objective in Eq. 10. The convex combination in Eq. 39a allowed us to find a set of optimal parameters that lies within biologically realistic ranges, including realistic values for the firing threshold. The average encoding error and the metabolic cost (the two terms on the right-hand side of Eq. 39a, without the weighting g_L) are of the same order in our network (see Fig. 8G for the E-I model, where these values are plotted separately for the optimal network). Weighting the cost with the optimal beta, which is in the range of ~10, would have yielded a network that optimizes almost exclusively the metabolic cost and would have biased the results towards solutions with poor encoding accuracy.
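Schematically (a hedged sketch consistent with the description above, not a verbatim copy of Eq. 39a), the quantity minimized during parameter optimization is a convex combination of the trial-averaged encoding error and metabolic cost,

```latex
\mathcal{L}(g_L) \;=\; g_L\,\overline{\mathrm{RMSE}} \;+\; (1-g_L)\,\overline{\mathrm{MC}},
\qquad 0 \le g_L \le 1,
```

so that values of g_L close to 1 emphasize encoding accuracy and values close to 0 emphasize the metabolic cost.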
To document more fully how the choice of weighting of the error with the cost (gL) affects the optimal parameters, we now added new analysis (Fig. 8D and Supplementary Fig. S4 A-D and H) showing optimal parameters as a function of this weighting. We commented on these results in the text on pages 9-11 and 12. For further details, please see also the reply to point 1 or Reviewer 1.
(11) Figure 1J: "In E neurons, the distribution of inhibitory and of net synaptic inputs overlap". In my understanding, they are in fact identical, and this is by construction. It might help the reader to state this.
We apologize for an unclear statement. In E neurons, net synaptic current is the sum of the feedforward current and of recurrent inhibition (Eq. 29c and Eq. 42). With our choice of tuning parameters that are symmetric around zero and with stimulus features that have vanishing mean, the mean of the feedforward current is close to zero. Because of this, the mean of the net current is negative and is close to the mean of the inhibitory current. We have clarified this in the text (page 5).
(12) A few typos:
- p1. "Minimizes the encoding accuracy" should be "maximizes..."
- p1: "as well the progress" should be something like "as well as the progress"
- p.11: "In recorded neurons where excitatory or inhibitory.", "where" should be "were".
- Fig3: missing parentheses (B)
- Fig4B: the 200 ticks on the y-scale are cut off.
- Panel Fig. 5a: "stimulus" should be "stimuli".
- Ref 24 "Efficient andadaptive sensory codes" is missing a space.
- p. 26: "requires" should be "required".
- On several occasions, the article "the" is missing.
We thank the reviewer for kindly pointing out the typos that we now corrected.
Reviewer #2 (Recommendations For The Authors):
I would like to give the authors more details about the two main weaknesses discussed above, so that they may address specific points in the paper. First, there is the relation to previous work. Several published articles have presented very similar results to those discussed here, including references 5, 26, 28, 32, 33, 42, 43, 48, and an additional reference not cited by the authors (Calaim et al. 2022 eLife e73276). This includes:
(1) Derivation of an E-I efficient spiking network, which is found in refs. 28, 42, 43, and 48. This is not reflected in the text: e.g., "These previous implementations, however, had neurons that did not respect Dale's law" (Introduction, pg. 1); "Unlike previous approaches (28, 48), we hypothesize that E and I neurons have distinct normative objectives...". The authors should discuss how their derivation compares to these.
We have now fully clarified on page 3 that our model builds on the seminal previous works that introduced E-I networks with efficient coding (Supplementary text in Boerlin et al. 2013, Chalk et al. 2016, Barrett et al. 2016).
(2) Inclusion of a slow adaptation current: I believe this also appears in a previous paper (Gutierrez & Deneve 2019, ref. 33) in almost the exact same form, and is again not reflected in the text: "The strength of the current is proportional to the difference in inverse time constants ... and is thus absent in previous studies assuming that these time constants are equal (... ref. 33). Again, the authors should compare their derivation to this previous work.
We thank the reviewer for pointing this out. We sincerely apologize if our previous version did not recognize sufficiently clearly that the previous work of Gutierrez and Deneve (eLife 2019; ref 33) introduced first the slow adaptation current that is similar to spike-triggered adaptation in our model. We have made sure that the revised text recognizes it more clearly. We also explained better what we changed or added with respect to this previous work (see revised text on page 8).
The work by Gutierrez and Deneve (2019) emphasizes the interplay between a single-neuron property (an adapting current in single neurons) and a network property (network-level coding through structured recurrent connections). They use a network that does not distinguish E and I neurons. Our contribution instead focuses on adaptation in an E-I network. To improve the presentation following the reviewer's comment, we now better emphasize the differential effect of adaptation in E and in I neurons in the revision (Fig. 5B-D). Moreover, Gutierrez and Deneve studied the effect of adaptation on slower timescales (1 or 2 seconds), while we study adaptation on a finer timescale of tens of milliseconds. The revised text detailing these points is on page 8.
(3) Background currents and physical units: Pg. 26: "these models did not contain any synaptic current unrelated to feedforward and recurrent processing" and "Moreover previous models on efficient coding did not thoroughly consider physical units of variables" - this was briefly described in ref. 28 (Boerlin et al. 2013), in which the voltage and threshold are transformed by adding a common constant, and additional aspects of physical units are discussed.
It is correct that Boerlin et al (2013) suggested adding a common constant to introduce physical units. We now revised the text to make clearer the relation between our results and the results of Boerlin et al. (2013) (page 3). In our paper, we built on Boerlin et al. (2013) and assigned physical units to computational variables that define the model's objective (the targets, the estimates, the metabolic constant, etc.). We assigned units to computational variables in such a way that physical variables (such as membrane potential, transmembrane currents, firing thresholds and resets) have the correct physical units. We have now clarified how we derived physical units in the section of Results where we introduce the biophysical model (page 3) and specified how this derivation relates to the results in Boerlin et al. (2013).
(4) Voltage correlations, spike correlations, and instantaneous E/I balance: this was already pointed out in Boerlin et al. 2013 (ref 28; from that paper: "Despite these strong correlations of the membrane potentials, the neurons fire rarely and asynchronously") and others including ref. 32. The authors mention this briefly in the Discussion, but it should be more prominent that this work presents a more thorough study of this well-known characteristic of the network.
We agree that it is important to comment on how our results relate to those of Boerlin et al. (2013). It is correct that in Boerlin et al. (2013) neurons have strong correlations in the membrane potentials but fire asynchronously, similarly to what we observe in our model. However, asynchronous dynamics in Boerlin et al. (2013) strongly depends on the assumption of instantaneous synaptic transmission and on time discretization with a “one spike per time bin” rule in the numerical implementation. This rule enforces that at most one spike is fired in each time bin, thus actively preventing any synchronization across neurons. If this rule is removed, their network synchronizes, unless the metabolic constant is strong enough to control such synchronization and bring the network back to an asynchronous regime (see ref. 36). Our implementation does not contain any specific rule that would prevent synchronization across neurons. We now cite the paper by Boerlin and colleagues and briefly summarize this discussion when describing the result of Fig. 3D on page 7.
(5) Perturbations and parameters sweep: I found one previous paper on efficient spiking networks (Calaim et al. 2022) which the authors did not cite, but appears to be highly relevant to the work presented here. Though the authors perform different perturbations from this previous study, they should ideally discuss how their findings relate to this one. Furthermore, this previous study performs extensive sweeps over various network parameters, which the authors might discuss here, when relevant. For example, on pg. 8, the authors write “We predict that, if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout” – this was already shown in Calaim et al. 2022 Figure 5, and the authors should mention if their results are consistent.
We apologize for not being aware of Calaim et al. (2022) when we submitted the first version of our paper. This important study is now cited in the revised version. As suggested, we have now performed sweeps of multiple parameters inspired by the work of Calaim et al. This new analysis is described extensively in the reply to the Weaknesses in the Public Review of Reviewer 2; it is shown in Figs. 2, 6I and 7J and described on pages 5, 11 and 13.
The Reviewer is also correct that the compensation mechanism that applies when changing the ratio of E-I neuron numbers is similar to the one described in Barrett et al. (2016) and related to our claim “if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout”. We have now added (page 11) that this prediction is consistent with the finding of Barrett et al. (2016).
With regard to the dependence of optimal coding properties on the number of neurons, we have tried to better describe the similarities and differences between our work, that of Calaim et al. (2022), and that of Barrett et al. (2016), which reports highly relevant results. These additional considerations are summarized in a paragraph of the Discussion (page 16).
(6) Overall, the authors should distinguish which of their results are novel, which ones are consistent with previous work on efficient spiking networks, and which ones are consistent in general with network implementations of efficient and sparse coding. In many of the above cases, this manuscript goes into much more depth and study of each of the network characteristics, which is interesting and commendable, but this should be made clear. In clarifying the points listed above, I hope that the authors can better contextualize their work in relation to previous studies, and highlight what are the unique characteristics of the model presented here.
We made a number of clarifications of the text to provide better contextualization of our model within existing literature and to credit more precisely previous publications. This includes commenting on previous studies that introduced separate objective functions of E and I neurons (page 2), spike-triggered adaptation (page 8), physical units (page 3), and changes in the number of neurons in the network (page 16).
Next, there are the claims of optimal parameters. As explained on pg. 35 (criterion for determining optimal model parameters), it appears to me that they simply vary each parameter one at a time around the optimal value. This argument appears somewhat circular, as they would need to know the optimal parameters before starting this sweep. In general, I find these optimality considerations to be the most interesting and novel part of the paper, but the simulations are relatively limited, so I would ask the authors to either back them up with more extensive parameter sweeps that consider covariations in different parameters simultaneously (as in Calaim et al. 2022). Furthermore, the authors should make sure that they are not breaking any of the required relationships between parameters necessary for the optimization of the loss function. Again, some of the results (such as coding error not being minimized with zero metabolic cost) suggests that there might be issues here.
We thank the reviewer for this insightful suggestion. We have now added a joint sweep of all relevant model parameters using a Monte-Carlo parameter search with 10,000 iterations. We randomly drew parameter configurations from predetermined parameter ranges that are detailed in the newly added Table 2; parameters were sampled from a uniform distribution. We varied all six model parameters studied in the paper (metabolic constant, noise intensity, time constants of single E and I neurons, ratio of E to I neuron numbers, and ratio of the mean I-I to E-I connectivity). We present these results in the new Figure 2. We did not find any set of parameters with lower loss than the parameters in Table 1 when the weighting of the error with the cost was in the range 0.4 < g_L < 0.81 (Fig. 2C). While our large but finite Monte-Carlo random sampling does not fully prove that the configuration we selected as optimal (Table 1) is a global optimum, it shows that this configuration is highly efficient. Further, and as detailed in the rebuttal to the Weaknesses of the Public Review of Reviewer 2, analyses of the near-optimal solutions are compatible with the notion (resulting from the joint parameter sweeps that we added to Figures 6 and 7) that network optimality may be influenced by joint covariations of parameters. These new results are reported in Results (pages 5, 11 and 13) and in Figures 2, 6I and 7J.
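The sketch below illustrates the logic of such a random search; the parameter ranges shown are placeholders (the actual ranges are listed in Table 2) and the loss function is a toy stand-in for the full network simulation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder ranges; the actual ranges used in the paper are given in Table 2.
param_ranges = {
    "metabolic_constant": (0.0, 30.0),
    "noise_intensity":    (0.0, 5.0),
    "tau_E":              (5.0, 30.0),   # membrane time constant of E neurons (ms)
    "tau_I":              (5.0, 30.0),   # membrane time constant of I neurons (ms)
    "ratio_E_to_I":       (1.0, 8.0),
    "ratio_II_to_EI":     (1.0, 6.0),
}

def average_loss(params):
    # Toy stand-in: in the actual analysis this would simulate the network and
    # return g_L * encoding error + (1 - g_L) * metabolic cost.
    return sum(((v - 0.5 * (lo + hi)) / (hi - lo)) ** 2
               for v, (lo, hi) in ((params[k], param_ranges[k]) for k in params))

best_loss, best_params = np.inf, None
for _ in range(10_000):
    candidate = {k: rng.uniform(lo, hi) for k, (lo, hi) in param_ranges.items()}
    loss = average_loss(candidate)
    if loss < best_loss:
        best_loss, best_params = loss, candidate
```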
Some more specific points:
(1) In general, I find it difficult to understand the scaling of the RMSE, cost, and loss values in Figures 4-7. Why are RMSE values in the range of 1-10, whereas loss and cost values are in the range of 0-1? Perhaps the authors can explicitly write the values of the RMSE and loss for the simulation in Figure 1G as a reference point.
Encoding error (RMSE), metabolic cost (MC) and average loss for a well performing network are within the range of 1-10 (see Fig. 8G or 7C in the first submission). To ease the visualization of results, we normalized the cost and the loss on Figs. 6-8 in order to plot them on the same figure (while the computation of the optima is done following the Eq. 39 and is without normalization). We have now explicitly written the values of RMSE, MC and the average loss (non-normalized) for the simulation in Fig. 1D on page 5, as suggested by the reviewer. We have also revised Fig. 4 and now show the absolute and not the relative values of the RMSE and the MC (metabolic cost).
(2) Optimal E-I neuron ratio of 4:1 and efficacy ratio of 3:1: besides being unintuitive in relation to previous work, are these two optimal settings related to one another? If there are 4x more excitatory neurons than inhibitory neurons, won't this affect the efficacy ratio of the weights of the two populations? What happens if these two parameters are varied together?
Thanks for this insightful point. Indeed, the optima of these two parameters are interdependent and positively correlated: if we decrease the ratio of E to I neuron numbers, the optimal efficacy ratio decreases as well. To better show this relation we added a 2-dimensional parameter search (Fig. 7J) in which we varied the two ratios jointly. The red cross on the right panel marks the optimal ratios used as optimal parameters in our study. These findings are discussed on page 13.
(3) Optimal dimensionality of M=[1,4]: Again, previous work (Calaim et al. 2022) would suggest that efficient spiking networks can code for arbitrary dimensional signals, but that performance depends on the redundancy in the network - the more neurons, the better the coding. From this, I don't understand how or why the authors find a minimum in Figure 7B. Why does coding performance get worse for small M?
We optimized all model parameters with M=3 and this is the reason why M=3 is the optimal number of inputs when we vary this parameter. Our network shows a distinct minimum of the encoding error as a function of the stimulus dimensionality for both E and I neurons (Fig. 8C, top). This minimum is reflected in the minimum of the average loss (Fig. 8C, bottom). The minimum of the loss is shifted (or biased) by the metabolic cost, with strong weighting of the cost lowering the optimal number of inputs. This is discussed on pages 13-14.
Here is a list of other, more minor points that the authors can consider addressing to make the results and text clearer:
(1) Feedforward efficient coding models: in the introduction (pg. 1) and discussion (pg. 11) it is mentioned that early efficient coding models, such as that of Olshausen & Field 96, were purely feedforward, which I believe to be untrue (e.g., see Eq. 2 of O&F 96). Later models made this even more explicit (Rozell et al. 2008). Perhaps the authors can either clarify what they meant by this, or downplay this point.
We sincerely apologize for the oversight present in the previous version of the text. We agree with the reviewer that the model in Olshausen and Field (1996) indeed defines a network with recurrent connections, and the same type of recurrent connectivity has been used by Rozell et al. (2008, 2013). The structure of the connectivity in Olshausen and Field (as well as in Rozell et al (2008)) is closely related to the structure of connectivity that we derived in our model. We have corrected the text in the introduction (page 1) to remove these errors.
(2) Pg. 2 - The authors state: "We draw tuning parameters from a normal distribution...", but in the methods, it states that these are then normalized across neurons, so perhaps the authors could add this here, or rephrase it to say that weights are drawn uniformly on the hypersphere.
We rephrased the description of how weights were determined (page 2).
(3) Pg. 2 - "We hypothesize the time-resolved metabolic cost to be proportional to the estimate of a momentary firing rate of the neural population" - from what I can see, this is not the usual population rate, which would be an average or sum of rates across the population.
Indeed, the time-dependent metabolic cost is not the population rate (in the sense of the sum of instantaneous firing rates across neurons), but is proportional to it by a factor of 1/τ_r. More precisely, we can define the instantaneous estimate of the firing rate of a single neuron i as z_i(t) = (1/τ_r) r_i(t), with r_i(t) as in Eq. 7. We have clarified this in the revised text on page 3.
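Eq. 7 itself is not reproduced here; under the standard assumption that r_i(t) is a leaky filter of the spike train (a form commonly used in efficient spiking networks, so this is a sketch rather than a quotation of Eq. 7), the relation reads, schematically,

```latex
\frac{dr_i(t)}{dt} \;=\; -\,\frac{r_i(t)}{\tau_r} \;+\; \sum_{k}\delta\!\left(t - t_i^{k}\right),
\qquad
z_i(t) \;=\; \frac{r_i(t)}{\tau_r},
\qquad
\mathrm{cost}(t) \;\propto\; \sum_i z_i(t),
```

so that r_i(t) increments by one with each spike t_i^k, decays with time constant τ_r, and z_i(t) has the units of a firing rate.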
(4) Pg. 3: "The synaptic strength between two neurons is proportional to their tuning similarity if the tuning similarity is positive" - based on the figure and results, this appears to be the case for I-E, E-I, and I-I connections, but not for E-E connections. This should be clarified in the text. Furthermore, one reference given in the subsequent sentence (Ko et al. 2011, ref. 51), is specifically about E-E connections, so doesn't appear to be relevant here.
We have now specified that the Eq. 24 does not describe E-E connections. We also agree that the reference (Ko et al. 2011) did not adequately support our claim and we thus removed it and revised the text on page 3 accordingly.
(5) Pg. 3: "the relative weight of the metabolic cost over the encoding error controls the operating regime of the network" and "and an operating regime controlled by the metabolic constant" - what do you mean by operating regime here?
We used the expression “operating regime” in the sense of a dynamical regime of the network. However, we agree that this expression may be confusing and we removed it in revision.
(6) Pg. 3: "Previous studies interpreted changes of the metabolic constant beta as changes to the firing thresholds, which has less biological plausibility" - can the authors explain why this is less plausible, or ideally provide a reference for it?
In biological networks, global variables such as brain state can strongly modulate the way neural networks respond to a feedforward stimulus. These variables influence neural activity in at least two distinct ways. One is by changing non-specific synaptic inputs to neurons, which is a network-wide effect (Destexhe and Pare, Nature Reviews Neurosci. 2003). This is captured in our model by changing the strength of the mean and fluctuations in the external currents. Beyond modulating synaptic currents, another way of modulating neural activity is by changing cell-intrinsic factors that modulate the firing threshold in biological neurons (Pozzorini et al. 2013). Previous studies on spiking networks with efficient coding interpreted the effect of the metabolic constant as changes to the firing threshold (Koren and Deneve, 2017, Gutierrez and Deneve 2019), which corresponds to cell-intrinsic factors. Here we instead propose that the metabolic constant modulates the neural activity by changing the non-specific synaptic input, homogeneously across all neurons in the network. Interpreting the metabolic constant as setting the mean of the non-specific synaptic input was necessary in our model to find an optimal set of parameters (as in Table 1) that is also biologically plausible. We revised the text accordingly (page 4).
(7) Pg. 4: Competition across neurons: since the model lacks E-E connectivity, it seems trivial to conclude that there is competition through lateral inhibition, and it can be directly determined from the connectivity. What is gained from running these perturbation experiments?
We agree that a reader with a good understanding of sparse / efficient coding theory can tell that there is competition across neurons with similar tuning already from the equation for the recurrent connectivity (Eq. 24). However, we presume that not all readers can see this from the equations and that it is worth showing this with simulations.
Following the reviewer's comment, we have now downplayed the result about the model manifesting lateral inhibition in general on page 6. We have also removed its extensive elaboration in Discussion.
One reason to run the perturbation experiments was to test to what extent the optimal model qualitatively replicates empirical findings, in particular the single-neuron perturbation experiments of Chettih and Harvey (2019), without specifically tuning any of the model parameters. We found that it does reproduce the main empirical findings qualitatively. We revised the text on page 5 accordingly.
A further reason to run these experiments was to refine predictions about the minimal amount of connectivity structure that generates perturbation response profiles qualitatively compatible with empirical observations. To establish this, we ran perturbation experiments while removing the connectivity structure of particular connectivity sub-matrices (E-I, I-I or I-E; Fig. S3F). This allowed us to determine which connectivity matrices have to be structured to observe results that qualitatively match the empirical findings. We found that the structure of E-I and I-E connectivity is necessary, but not the structure of I-I connectivity. Finally, we tested a partial removal of the connectivity structure, in which we replaced the precise (and optimal) connectivity structure with a simpler connectivity rule. In the optimal connectivity, the connection strength is proportional to the tuning similarity. The simpler rule, in contrast, only specifies that neurons with similar tuning share a connection; beyond this, the connection strength is random. Perturbation experiments in such a network obeying the simpler connectivity rule still qualitatively replicated the empirical results of Chettih and Harvey (2019). This is shown in Supplementary Fig. S2F and described on page 8.
(8) Pg. 4: "the optimal E-I network provided a precise and unbiased estimator of the multidimensional and time-dependent target signal" - from previous work (e.g., Calaim et al. 2022), I would guess that the estimator is indeed biased by the metabolic cost. Why is this not the case here? Did you tune the output weights to remove this bias?
Output weights were not tuned to remove the bias. In Fig. 1H of the first submission we plotted the bias for the network that minimizes the encoding error; we forgot to specify this in the text and figure caption, for which we apologize. We have now replaced this figure with a new one (Fig. 1E) in which we plot the bias of the network minimizing the average loss (with parameters as in Table 1). The bias of the network minimizing the error is close to zero, B^E = 0.02 and B^I = 0.03. The bias of the network minimizing the loss is stronger and negative, B^E = -0.15 and B^I = -0.34. In the Results, we now report the bias of both networks (i.e., optimizing the encoding error and optimizing the loss). We also added a plot showing trial-averaged estimates and the time-dependent bias in each stimulus dimension (Supplementary Fig. S1F). Note that the network minimizing the encoding error requires a lower metabolic constant (β = 6) than the network optimizing the loss (β = 14); however, the optimal metabolic cost in both networks is nonzero. We revised the text and explained these points on page 5.
(9) Pg. 4: "The distribution of firing rates was well described by a log-normal distribution" - I find this quite interesting, but it isn't clear to me how much this is due to the simulation of a finite-time noisy input. If the neurons all have equal tuning on the hypersphere, I would expect that the variability in firing is primarily due to how much the input correlates with their tuning. If this is true, I would guess that if you extend the duration of the simulation, the distribution would become tighter. Can you confirm that this is the stationary distribution of the firing rates?
We now simulated the network with longer simulation time (10 seconds of simulated time instead of 2 seconds used previously) and also iterated the simulation across 10 trials to report a result that is general across random draws of tuning parameters (previously a single set of tuning parameters was used). The reviewer is correct that the distribution of firing rates of E neurons has become tighter with longer simulation time, but distributions remain log-normal. We also recomputed the coefficient of variation (CV) using the same procedure. We updated these plots on Fig. 1F.
(10) Pg. 4: "We observed a strong average E-I balance" - based on the plots in Figure 1J, the inputs appear to be inhibition-dominated, especially for excitatory neurons. So by what criterion are you calling this strong average balance?
The reviewer is correct about the fact that the net synaptic input to single neurons in our optimal network shows excess inhibition and the network is inhibition-dominated, so we revised this sentence (page 5) accordingly.
(11) Pg. 4: Stronger instantaneous balance in I neurons compared to E neurons - this is curious, and I have two questions: (1) can the authors provide any intuition or explanation for why this is the case in the model? and (2) does this relate to any literature on balance that might suggest inhibitory neurons are more balanced than excitatory neurons?
In our model, I neurons receive excitatory and inhibitory synaptic currents through synaptic connections that are precisely structured. E neurons receive structured inhibition and a feedforward current. The feedforward current consists of M=3 independent OU processes projected onto the tuning vectors w_i^E of E neurons. We speculate that, because the synaptic inhibition and the feedforward current are different processes and the 3 OU inputs are independent, it is harder for E neurons to achieve an instantaneous balance as precise as that of I neurons. While we think that the feedforward current in our model reflects biologically plausible sensory processing, it is not a mechanistic model of feedforward processing. In biological neurons, real feedforward signals arrive as a series of complex feedforward synaptic inputs from upstream areas, while the feedforward current in our model is a sum of stimulus features and is thus a simplification of the biological process that generates feedforward signals. We speculate that a mechanistic implementation of the feedforward current could increase the instantaneous balance in E neurons. Furthermore, the presence of E-E connections could potentially also increase the instantaneous balance in E neurons. We revised the Discussion of these important questions, which lie on the side of model limitations and could be addressed in future work. We could not find any empirical evidence directly comparing the instantaneous balance in E versus I neurons. We have reported these considerations in the revised Discussion (page 16).
(12) Pg. 5, comparison with random connectivity: "Randomizing E-I and I-E connectivity led to several-fold increases in the encoding error as well as to significant increases in the metabolic cost" and Discussion, pg. 11: "the structured network exhibits several fold lower encoding error compared to unstructured networks": I'm wondering if these comparisons are fair. First, regarding activity changes that affect the metabolic cost - it is known that random balanced networks can have global activity control, so it is not straightforward that randomizing the connectivity will change the metabolic cost. What about shuffling the weights but keeping an average balance for each neuron's input weights? Second, regarding coding error, it is trivial that random weights will not map onto the correct readout. A fairer comparison, in my opinion, would at least be to retrain the output weights to find the best-fitting decoder for the three-dimensional signal, something more akin to a reservoir network.
Thank you for raising these interesting questions. The purpose of comparing networks with and without connectivity structure was to observe causal effects of the connectivity structure on the neural activity. We agree that the effect on the encoding error is close to trivial, because shuffling of connectivity weights decouples neural dynamics from decoding weights. We have carefully considered Reviewer's suggestions to better compare the performance of structured and unstructured networks.
In reply to the first point, we followed the reviewer's suggestion and compared the optimal network with a shuffled network that matched the optimal network in its average balance. This was achieved by increasing the metabolic constant, decreasing the noise intensity and slightly decreasing the feedforward stimulus (we did not find a way to match the net current in both cell types by changing a single parameter). When we compared the metabolic cost of the optimal network and the shuffled network with matched average balance, we still found a lower metabolic cost in the optimal network, even though the difference was now smaller. We replaced Fig. 3B from the first submission with these new results in Fig. 4B and commented on them in the text (page 7).
In reply to the second point, we followed the reviewer's suggestion and compared the encoding error (RMSE) of the optimal network with that of the network with shuffled connectivity in which decoding weights were trained to optimally reconstruct the target signal. As suggested, we analyzed the encoding error using decoding weights trained on the set of spike trains generated by the network, with linear least-squares regression minimizing the decoding error. For a fair quantitative comparison, and because we did not train the decoding weights of our structured model, we performed the same analysis on spike trains generated by networks with structured and with shuffled recurrent connectivity. We found that the encoding error is smaller in the E population, and much smaller in the I population, in the structured compared to the random network. Decoding weights found numerically in the optimal network approach the uniform distribution of weights that we used in our model (Fig. 4A, right). In contrast, decoding weights obtained from the random network do not converge to a uniform distribution, but instead form a much sparser distribution, in particular in I neurons (Supplementary Fig. S3A). These additional results, reported in the above-mentioned figures, are discussed in the text on page 14.
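A minimal sketch of this decoder-retraining step is given below; the exponential filtering of the spike trains and the variable names are our illustrative assumptions, and the actual analysis is described in the Methods.

```python
import numpy as np

def filtered_spikes(spikes, dt, tau):
    """Leaky (exponential) filtering of binary spike trains.
    spikes: array of shape (n_steps, n_neurons); returns traces of the same shape."""
    r = np.zeros_like(spikes, dtype=float)
    for t in range(1, spikes.shape[0]):
        r[t] = r[t - 1] * (1.0 - dt / tau) + spikes[t]
    return r

def fit_decoding_weights(spikes, target, dt, tau):
    """Linear least-squares fit of decoding weights mapping the filtered spike
    trains onto the M-dimensional target signal (target: shape (n_steps, M))."""
    r = filtered_spikes(spikes, dt, tau)
    weights, *_ = np.linalg.lstsq(r, target, rcond=None)
    return weights        # shape (n_neurons, M)
```

The same fit can be applied to spike trains from the structured and the shuffled network, and the resulting reconstruction errors compared.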
(13) Pg. 5: "a shift from mean-driven to fluctuation-driven spiking" and Pg. 11 "a network structured as in our efficient coding solution operates in a dynamical regime that is more stimulus-driven, compared to an unstructured network that is more fluctuation driven" - I would expect that the balanced condition dictates that spiking is always fluctuation driven. I'm wondering if the authors can clarify this.
We agree with the reviewer that networks with and without connectivity structure are fluctuation-driven, because in a mean-driven network the mean current must be suprathreshold (Ahmadian and Miller, 2021), which is not the case of either network. We removed the claim of the change from mean to fluctuation driven regime in the revised paper. We are grateful to the Reviewer for helping us tighten the elaboration of our findings.
(14) Pg. 5: "suggesting that variability of spiking is independent of the connectivity structure" - the literature of balanced networks argues against this. Is this not simply because you have a noisy input? Can you test this claim?
We thank the reviewer for the suggestion. We tested this claim by measuring the coefficient of variation in networks receiving a constant stimulus. In particular, we set the same strength in each of the M=3 stimulus dimensions and chose the stimulus amplitude so as to match the firing rate of the optimal network in response to the OU stimulus. We computed the coefficient of variation in 200 simulation trials. The removal of connectivity structure did not cause a significant change in the coefficient of variation in a network driven by a constant stimulus (Fig. 4E). These additional results are discussed in the text on page 7.
We also took up the suggestion regarding the claim that the variability of spiking is independent of the connectivity structure. We removed this claim in the revision, because we only tested a few specific cases in which the connectivity is structured with respect to tuning similarity (fully structured, fully unstructured and partially unstructured networks), which is not exhaustive of all possible structures that the recurrent connectivity may have.
(15) Pg. 6: "we also removed the connectivity structure only partially, keeping like-to-like connectivity structure and removing all structure beyond like-to-like" - can you clarify what this means, perhaps using an equation? What connectivity structure is there besides like-to-like?
In the optimal model, the strength of the synapse between a pair of neurons is proportional to the tuning similarity of the two neurons, i.e. the synaptic weight is proportional to Y_ij whenever Y_ij > 0 (see Eq. 24 and Fig. 1C(ii)). Besides networks with this optimal connectivity, we also tested networks with a simpler connectivity rule. The simpler rule prescribes a connection if the pair of neurons has similar tuning (Y_ij > 0) and no connection otherwise; the strength of the connection is otherwise random, and not proportional to the pairwise tuning similarity Y_ij as it is in the optimal network. We clarified this in the revision (page 8), also by avoiding the term “like-to-like” for the second type of network, which could indeed lead to confusion.
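The difference between the two rules can be sketched as follows (the decoding vectors, network sizes and weight scaling are illustrative placeholders, not the values used in the paper):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative tuning (decoding) vectors of pre- and postsynaptic populations,
# normalized to unit length, in an M = 3 dimensional feature space.
N_pre, N_post, M = 50, 50, 3
w_pre = rng.normal(size=(N_pre, M))
w_post = rng.normal(size=(N_post, M))
w_pre /= np.linalg.norm(w_pre, axis=1, keepdims=True)
w_post /= np.linalg.norm(w_post, axis=1, keepdims=True)

similarity = w_post @ w_pre.T          # pairwise tuning similarity Y_ij
mask = similarity > 0                  # connect only similarly tuned pairs

# Optimal rule: connection strength proportional to tuning similarity.
J_optimal = np.where(mask, similarity, 0.0)

# Simpler rule: same set of connections, but random positive strengths.
J_simpler = np.where(mask, rng.uniform(0.0, 1.0, size=mask.shape), 0.0)
```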
(16) Pgs. 6-7: "we indeed found that optimal coding efficiency is achieved with weak adaptation in both cell types" and "adaptation in E neurons promotes efficient coding because it enforces every spike to be error- correcting" - this was not clear to me. First, it appears as though optimal efficiency is achieved without adaptation nor facilitation, i.e., when the time constants are all equal. Indeed, this is what is stated in Table 1. So is there really a weak adaptation present in the optimal case? Second, it seems that the network already enforces each spike to be errorcorrecting without adaptation, so why and how would adaptation help with this?
We agree with the reviewer that the network without adaptation in E and I neurons is already optimal. It is also true that most spikes in an optimal network should already be error-correcting (apart from some spikes that may be caused by noise). However, regimes with weak adaptation in E neurons remain close to optimality. Spike-triggered facilitation, in contrast, adds spikes that are unnecessary and decreases network efficiency. We revised Fig. 5 (Fig. 4 in the first submission) and replaced the 2-dimensional plots of Fig. 4C-F with plots that show the differential effect of adaptation in E neurons (top) and in I neurons (bottom) on the encoding error (RMSE), the efficiency (average loss) and the firing rate (Fig. 5B-D). In the new Fig. 5C it is evident that the loss of the E and I populations grows slowly with adaptation in E neurons (top), while it grows faster with adaptation in I neurons (bottom). These considerations are explained in the revised text on page 9.
(17) Pg. 7: "adaptation in E neurons resulted in an increase of the encoding error in E neurons and a decrease in I neurons" - it would be nice if the authors could provide any explanation or intuition for why this is the case. Could it perhaps be because the E population has fewer spikes, making the signal easier to track for the I population?
We agree that this could indeed be the case. We commented on it in revision (page 9).
(18) Pg. 7: "The average balance was precise...with strong adaptation in E neurons, and it got weaker when increasing the adaptation in I neurons (Figure 4E)" - I found the wording of this a bit confusing. Didn't the balance get stronger with larger I time constants?
By increasing the time constant of I neurons, the average imbalance got weaker (closer to zero) in E neurons (Fig. 5G, left), but stronger (further away from zero) in I neurons (Fig. 5G, right). We have revised the text on page 9 to make this clearer.
(19) Pg. 7: Figure 4F is not directly described in the text.
We have now added text (page 9) commenting on this figure in revision.
(20) Pg. 8: "indicating that the recurrent network dynamics generates substantial variability even in the absence of variability in the external current" -- how does this observation relate to your earlier claim (which I noted above) that "variability of spiking is independent of connectivity structure"?
We agree that the claim about variability of spiking being independent of connectivity structure was overstated, and we have removed it. The observation we wanted to report is that both structured and unstructured networks have very similar levels of single-neuron spiking variability. The fact that much of the variability of the optimal network is generated by recurrent connections is not incompatible with this observation. We revised the related text (page 11) for clarity.
(21) Pg. 9: "We found that in the optimally efficient network, the mean E-I and I-E synaptic efficacy are exactly balanced" - isn't this by design based on the derivation of the network?
True, the I-E connectivity matrix is the transpose of the E-I connectivity matrix, and their means are the same by the analytical solution. This however remains a finding of our study. We have clarified this in the revised text (page 12).
(22) Pg. 30, eq. 25: the authors should verify if they include all possible connectivity here, or if they exclude EE connectivity beforehand.
We now specify that the equation for recurrent connectivity (Eq. 24, Eq. 25 in first submission) does not include the E-E connectivity in the revised text (page 41).
Reviewer #3 (Recommendations For The Authors):
Essential
(1) Currently, they measure the RMSE and cost of the E and I population separately, and the 1CT model. Then, they average the losses of the E and I populations, and compare that to the 1CT model, with the conclusion that the 1CT model has a higher average loss. However, it seems to me that only the E population should be compared to the 1CT model. The I population loss determines how well the I population can represent the E population representation (which it can do extremely well). But the overall coding accuracy of the network of the input signal itself is only represented by the E population. Even if you do combine the E and I losses, they should be summed, not averaged. I believe a more fair conclusion would be that the E/I networks have generally slightly worse performance because of needing to follow Dale's law, but are still highly efficient and precise nonetheless. Of course, I might be making a critical error somewhere above, and happy to be convinced otherwise!
We carefully considered the reviewer's comment and tested different ways of combining the losses of the E and I populations. We decided to follow the reviewer's suggestion and compare the loss of the E population of the E-I model with the loss of the one cell type model. As is already evident from Fig. 8G, this comparison indeed changes the result and makes the 1CT model more efficient. The sum of the losses of E and I neurons likewise results in the 1CT model being more efficient than the E-I model. Note, however, the robustness of the E-I model to changes in the metabolic constant (Fig. 6C, top): the firing rates of the E-I model stay within physiological ranges for any value of the metabolic constant, while the firing rates of the 1CT model skyrocket for metabolic constants lower than the optimal one (Fig. 8I).
We added to Results (page 14) a summary of these findings.
(2) The methods and main text should make much clearer what aspects of the derivation are novel, and which are not novel (see review weaknesses for specifics).
We specified these aspects, as discussed in more detail in the above reply to point 4 of the public review of Reviewer 1.
Request:
If possible, I would like to see the code before publication and give recommendations on that (is it easy to parse and reproduce, etc.)
We are happy to share the computer code with the reviewer and the community. We added a link to our public repository containing the computer code that we used for simulations and analysis to the preprint and submission (section “Code availability” on page 17).
Suggestions:
(1) I believe that for an eLife audience, the main text is too math-heavy at the beginning, and it could be much simplified, or more effort could be made to guide the reader through the math.
We tried to do our best to improve the clarity of description of mathematical expressions in the main text.
(2) Generally vector notation makes network equations for spiking neurons much clearer and easier to parse, I would recommend using that throughout the paper (and not just in the supplementary methods).
We now use vector notation throughout the paper whenever we think that this improves the intelligibility of the text.
(3) In the discussion or at the end of the results adding a clear section summarizing what the minimal requirements or essential assumptions are for biological networks to implement this theory would be helpful for experimentalists and theorists alike.
We have added such a section in Discussion (page 15).
(5) I think the title is a bit too cumbersome and hard to parse. Might I suggest something like 'Efficient coding and energy use in biophysically realistic excitatory-inhibitory spiking networks' or 'Biophysically constrained excitatory-inhibitory spiking networks can efficiently implement efficient coding'.
We followed reviewer’s suggestion and changed the title to “Efficient coding in biophysically realistic excitatory-inhibitory spiking networks.”
(6) How the connections were shuffled exactly was not clear to me in how it was described now. Did they just take the derived connectivity, and shuffle the connections around? I recommend a more explicit methods section on it (I might have missed it).
Indeed, the connections of the optimal network were randomly shuffled, without repetition, across all neuronal pairs of a specific connectivity matrix. This preserves all properties of the distribution of connectivity weights and removes only the structure of the connectivity, which is precisely what we wanted to test. We added a Methods section (“Removal of connectivity structure”) on pages 51-52 explaining how the connectivity structure is removed.
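A minimal sketch of this shuffling operation (our illustration; any exclusions of particular entries, such as within-population diagonals, are specified in the Methods) is:

```python
import numpy as np

def shuffle_connectivity(J, seed=0):
    """Randomly permute, without repetition, the entries of one connectivity
    matrix across all neuron pairs: the weight distribution is preserved, but
    the relation of the weights to tuning similarity is destroyed."""
    rng = np.random.default_rng(seed)
    flat = J.ravel().copy()
    rng.shuffle(flat)
    return flat.reshape(J.shape)
```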
(7) Figure 1 sub-panel ordering was confusing to read (first up down, then left right). Not sure if re- arranging is possible, but perhaps it could be A, B, and C at the top, with subsublabels (i) and (ii). Might become too busy though.
We followed this suggestion and rearranged the Fig. 1 as suggested by the reviewer.
(8) Equation 3 in the main text should specify that 'y' stands for either E or I.
This has been specified in the revision (page 3).
(9) Figure 1D shows a rough sketch of the types of connectivities that exist, but I would find it very useful to also see the actual connection strengths and the effect of enforcing Dale's law.
We revised this figure (now Fig. 1B (ii)) and added connection strengths as well as a sketch of a connection that was removed because of Dale’s law.
(10) The main text mentions how the readout weights are defined (normal distributions), but I think this should also be mentioned in the methods.
Agreed. We had a Methods section, “Parametrization of synaptic connectivity” (page 46), where we explain how the readout weights are defined. We apologize if the pointer to this section was not salient enough in the first submission. We made sure that the revised main text contains a clear pointer to this Methods section for details.
(11) The text seems to mix ‘decoding weights’ and ‘readout weights’.
Thanks for this suggestion to use consistent language. We opted for ‘decoding weights’ and removed ‘readout weights’.
(12) The way the paper is written makes it quite hard to parse what are new experimental predictions, and what results reproduce known features. I wonder if some sort of 'box' is possible with novel predictions that experimentalists could easily look at and design an experiment around.
We have revised the text accordingly. For every property of the model, we now clarify whether it is a prediction that has not yet been tested experimentally or whether it accounts for previously observed properties of biological neurons. Please see the reply to point 4 of Reviewer 1.
(13) Typo's etc.:
Page 5 bottom -- ("all") should have one of the quotes change direction (common latex typo, seems to be the only place with the issue).
We thank the reviewer for pointing out this typo that has been removed in revision.