Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Read more about eLife’s peer review process.
Editors
- Reviewing Editor: Laura Bradfield, University of Technology Sydney, Sydney, Australia
- Senior Editor: Kate Wassum, University of California, Los Angeles, Los Angeles, United States of America
Reviewer #1 (Public review):
Summary:
Maladaptive decision-making is a trait commonly seen in gambling disorders. Salient cues can impact decision-making and drive gambling, though how cues affect decision-making isn't well understood. This manuscript describes the impact of cueing distinct outcomes of a validated rodent cost/benefit decision-making task based on the human Iowa Gambling Task. Comparing six task variants, the authors describe the effect of adding salient cues to wins (scaling either with the size of the win or inversely), to every outcome regardless of win or loss, randomly to wins or losses, or to losses only. Behavioral results reveal that cueing wins increased risky choices. By contrast, presenting the cues randomly or cueing the losses reduced risky choices. Risk-preferring animals in the uncued, randomly cued, and loss-cued tasks showed sensitivity to devaluation, whereas rats in the win-cued conditions did not, suggesting cues blunt behavioral updating. Behavioral analyses were paired with computational modeling of initial acquisition, which revealed that risky decision-making was related to reduced punishment learning. These data provide unique insight into how cues may bias behavior and drive gambling-related phenotypes.
Strengths:
The detailed analyses provide interesting insight into how cues impact complex decision-making. While there has been a great deal of work into the impact of cues on choice, few studies integrate multiple probabilistic outcomes. Complementing these data with computational parameters helps the reader to understand what may be driving these differences in behavior. The manuscript is well-written, clearly explaining the relevance of the results and potential future directions.
Weaknesses:
Two main questions arise from these results. The first: when do behavioral differences emerge between the task variants? Based on the results and discussion, the cues increase the salience of either the wins or the losses, biasing behavior in favor of either risky or optimal choice. If this is the case, one might expect the cues to expedite learning, particularly in the standard and loss-cued conditions. Providing an analysis of the acquisition of the tasks may provide insight into how the cues are "teaching" decision-making and might explain how biases are formed and cemented.
The second question is whether the learning period used for the modeling impacts the interpretation of the behavioral results. The authors indicate that computational modeling was done on the first five sessions and used these data to predict preferences at baseline. Based on these results, punishment learning predicts choice preference. However, these animals are not naïve to the contingencies because of the forced-choice training prior to the task, which may impact behavior in these early sessions. Though punishment learning may initially predict risk preference, other parameters later in training may also predict behavior at baseline. The authors also present simulated data from the models for sessions 18-20, but according to the statistical analysis section, sessions 35-40 were used for analysis (and presumably presented in Figure 1). If the simulation is carried out on sessions 35-40, do the models fit the data? Finally, though the n's are small, it would be interesting to see how the devaluation impacts computational metrics. These additional analyses may help to explain the nuanced effects of the cues in the task variants.
Reviewer #2 (Public review):
Summary:
The manuscript by Hathaway et al. describes a set of elegant behavioral experiments designed to understand which aspects of cue-reward contingencies drive risky choice behavior. The authors developed several clever variants of the well-established rodent gambling task (also developed by this group) to understand how audiovisual cues alter learning, choice behavior, and risk. Computational and sophisticated statistical approaches were used to provide evidence that: (1) audiovisual cues drive risky choice if they are paired with rewards and decrease risk if only paired with loss, (2) pairing cues with rewards reduces learning from punishment, and (3) differences in risk-taking seem to be present early on in training.
Strengths:
The paper is well-written, the experiments are well-designed, and the results are highly interesting, particularly for understanding how cues can motivate and invigorate normal and abnormal behavior.
Weaknesses:
Additional support and evidence are needed for the claims made by the authors. Some of the statements are inconsistent with the data and/or analyses or are only weakly supportive of the claims.
Reviewer #3 (Public review):
Summary:
In this work, Hathaway and colleagues aim to understand how audiovisual cues at the time of outcome promote the selection of risky choices. A real-life illustration of this effect is found in electronic gambling machines, which signal a win with flashing lights and jingles, encouraging the player to keep betting. More specifically, the authors ask whether the cue has to be paired exclusively with wins, or whether it can be paired with both outcomes, or exclusively with loss outcomes, or occur randomly. To tackle this question, they employ a version of the Iowa Gambling Task adapted to rats, and test the effect of different rules of cue-outcome association on the probability of selecting the riskier options; they then test the effect of prior reward devaluation on the task; finally, they optimise computational models on the early phases of the experiment to investigate potential mechanisms underlying the behavioural differences.
Strengths:
The experimental approach is very well thought-out; in particular, the choice of the different task variants covers a wide range of potential hypotheses. Using this approach, they find that, although rats prefer the optimal choices, there is a shift towards selecting riskier options in the variants of the task where the cue is paired with win outcomes. They complement this population-average shift by showing that there is a concurrent increase in the number of risk-taking individuals in these tasks. They also make the novel discovery that pairing cues exclusively with loss outcomes reduces the tendency for risky decisions.
The computational strategy is appropriate and in keeping with the accepted state of the art: defining a set of candidate models, optimising them, comparing them, simulating the best ones to ensure they replicate the main experimental results, then analysing parameter estimates in the different tasks to speculate about potential mechanisms.
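The pipeline the reviewer describes (define candidate models, optimise them, compare them, simulate the winners) is standard for reinforcement-learning analyses of choice data. A minimal sketch of the fitting step is below, using a Q-learning model with separate reward and punishment learning rates fit by maximum likelihood; all names, values, and the simulated data are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: fitting a dual-learning-rate Q-learning model
# to four-option choice data by maximum likelihood. Illustrative only.
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, choices, rewards, n_options=4):
    alpha_rew, alpha_pun, beta = params
    q = np.zeros(n_options)
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        probs = np.exp(beta * q) / np.exp(beta * q).sum()  # softmax policy
        nll -= np.log(probs[choice] + 1e-12)
        delta = reward - q[choice]                         # prediction error
        alpha = alpha_rew if delta >= 0 else alpha_pun     # asymmetric update
        q[choice] += alpha * delta
    return nll

# Simulated example data: option 1 pays off 80% of the time, others never.
rng = np.random.default_rng(0)
choices = rng.integers(0, 4, size=200)
rewards = (choices == 1) * rng.binomial(1, 0.8, 200).astype(float)

result = minimize(neg_log_likelihood, x0=[0.3, 0.3, 2.0],
                  args=(choices, rewards),
                  bounds=[(0.01, 1), (0.01, 1), (0.1, 10)])
print(result.x)  # fitted [alpha_rew, alpha_pun, beta]
```

In this scheme, a reduced punishment learning rate (`alpha_pun`) of the kind the authors report would slow the devaluation of risky options after losses.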
Weaknesses:
There is a very problematic statistical stratagem that involves categorising individuals as either risky or optimal based on their choice probabilities. As a descriptive measure or outcome, this is fine, as previously highlighted in the results, but this label is then used as a factor in ANOVAs analysing the very same choice probabilities, which constitutes a circular argument (individuals categorised as risky because they make more risky choices are then found to make more risky choices...).
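The circularity the reviewer flags can be demonstrated with a toy simulation: labelling subjects by a split on a score and then testing whether the labelled groups differ on that same score yields a "significant" difference even in pure noise. This is purely illustrative and uses made-up data, not the authors' dataset.

```python
# Toy demonstration of circular analysis: categorise on a score,
# then test group differences on the same score. Illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
risk_scores = rng.normal(0, 1, size=40)        # pure noise, no true groups
labels = risk_scores > np.median(risk_scores)  # "risky" vs "optimal" by median split

t, p = stats.ttest_ind(risk_scores[labels], risk_scores[~labels])
print(p)  # vanishingly small despite no true group structure
```

A non-circular alternative is to relate the risky/optimal labels to an independent variable, such as devaluation sensitivity or model parameters, rather than to the choice probabilities that defined the labels.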
A second experiment was done to study the effect of devaluation on risky choices in the different tasks. The results, which are difficult to interpret from Figure 3, would suggest that reward devaluation affects choices in tasks where the win-cue pairing is not present. The authors interpret this result by saying that pairing wins with cues makes the individuals insensitive to reward devaluation. Countering this interpretation: if an individual is prone to making risky choices in a given task, this already points to a distorted sense of value, as the most rewarding strategy is to make optimal, non-risky choices.
While the overall computational approach is excellent, I believe that the choice of computational models is poor. Loss trials come at a double cost, something the authors might want to elaborate upon: firstly, the lost opportunity of not having selected a winning option, which is reflected in Q-learning by the fact that r=0; and secondly, a waiting period, which will affect the overall reward rate. The authors choose to combine these costs by converting the time penalty into "reward currency" using three different functions, which make up the three tested models. This is a bit of a wasted opportunity, as the question when comparing models is then not something like "are individuals in the paired win-cue tasks more sensitive to risk, or less sensitive to time?", but rather "what is the best way of converting time into Q-value currency to fit the data?" Instead, the authors could have contrasted models that explicitly track time as a separate variable (see, for example, "Impulsivity and risk-seeking as Bayesian inference under dopaminergic control" (Mikhael & Gershman 2021)) or give actions an extra risk bonus (as in "Nicotinic receptors in the VTA promote uncertainty seeking" (Naude et al 2016)).

Another weakness of the computational section is that, despite simulations having been made, Figure 5 only shows the simulated risk scores and not the individual choice probabilities, which would be a much more interesting metric by which to judge model validity. In the last section, the authors ask whether the parameter estimates (obtained from optimisation on the early sessions) could be used to predict risk preference. While this is an interesting question to address, the authors give very little explanation as to how they establish any predictive relationship. A figure and a more detailed explanation would have been warranted to support their claims.
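The modelling choice the reviewer questions, folding a time-out penalty into the reward term before the Q-update, can be made concrete with a small sketch. The three conversion functions below are hypothetical stand-ins for the three tested models (the source does not specify the authors' exact formulations), and the parameter values are arbitrary.

```python
# Hypothetical sketch: three ways of converting a time-out penalty into
# "reward currency" before a standard Q-learning update. The specific
# functional forms are illustrative assumptions, not the authors' models.
import math

def linear_cost(timeout_s, k=0.1):
    return -k * timeout_s                        # penalty grows without bound

def exponential_cost(timeout_s, k=0.1):
    return -(1 - math.exp(-k * timeout_s))       # penalty saturates at -1

def hyperbolic_cost(timeout_s, k=0.1):
    return -timeout_s / (1 + k * timeout_s)      # hyperbolic-discounting style

def q_update(q, reward, timeout_s, alpha, cost_fn):
    # A loss trial carries both r = 0 and the converted time penalty.
    effective_r = reward + (cost_fn(timeout_s) if reward == 0 else 0.0)
    return q + alpha * (effective_r - q)
```

Because the models differ only in `cost_fn`, model comparison here can only arbitrate between conversion functions; it cannot separate sensitivity to risk from sensitivity to time, which is the reviewer's point about tracking time as an explicit variable instead.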