a.) Detailed visualization of minute-to-minute statistics of animal 25 (in the sessions after the introduction of the novel object). From top to bottom, the plots show the % time spent within the 7 cm threshold of the object used by Akiti et al. (2022), with (cautious) and without (confident) the tail behind; the length of a bout at the object; and the number of bouts per minute. Orange lines are the box-car functions fitted to segment phases and illustrate the change in time, duration, and frequency statistics across phases. The transition points t1 and t2, as well as the initial cautious, final cautious, peak confident, and steady-state confident approach percentage times, are shown. The right plots show examples of minute-to-minute and phase-averaged approach time, duration, and frequency for (b.) brave, (c.) intermediate, and (d.) timid animals. Note that animals are ordered by the group-timidity animal index (see main text Section 2.2.6). Green indicates cautious and blue indicates confident approach. Darker colors indicate higher values. Averaging statistics over phases ignores idiosyncrasies of behavior to provide a high-level summary of learning dynamics.
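
As a rough illustration of how a box-car segmentation of this kind can be obtained, the sketch below fits a three-level piecewise-constant function to a per-minute series by exhaustive least-squares search over the two change points t1 and t2. The function name `fit_three_phase_boxcar`, the least-squares criterion, and the toy series are assumptions made for illustration; the caption does not specify the fitting procedure actually used.

```python
def fit_three_phase_boxcar(series):
    """Fit a 3-segment piecewise-constant (box-car) function.

    Segments are [0, t1), [t1, t2), [t2, n). Returns (t1, t2, levels),
    where levels are the segment means minimizing the squared error.
    NOTE: least-squares exhaustive search is an illustrative assumption.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least three samples for three segments")

    # Prefix sums of x and x^2 give each segment's squared error in O(1).
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def seg_mean(i, j):
        return (s1[j] - s1[i]) / (j - i)

    def seg_cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = None
    for t1 in range(1, n - 1):
        for t2 in range(t1 + 1, n):
            cost = seg_cost(0, t1) + seg_cost(t1, t2) + seg_cost(t2, n)
            if best is None or cost < best[0]:
                best = (cost, t1, t2)

    _, t1, t2 = best
    levels = (seg_mean(0, t1), seg_mean(t1, t2), seg_mean(t2, n))
    return t1, t2, levels


# Toy per-minute % time series (hypothetical values, not data from the paper).
toy = [10] * 5 + [40] * 8 + [20] * 7
t1, t2, levels = fit_three_phase_boxcar(toy)
```

The search is quadratic in the number of minutes, which is inexpensive for session-length series; the phase-averaged panels correspond to the three returned levels.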

Separating the three animal groups. The x-axis shows the ratio of total time spent in confident versus cautious bouts. The y-axis shows the ratio of bout time in the first 10 minutes of confident approach to that in the last 10 minutes of confident approach (set to 0 for timid animals, which do not have a confident phase). The horizontal line indicates y = 1.0. All 9 timid animals are close to the origin. We separate brave and intermediate animals according to the y = 1 line.

Markov decision process underlying the BAMDP model. Four real (nest, cautious object, confident object, retreat) and three imagined (cautious detect, confident detect, dead) states. Agent actions are italicized. Blue arrows indicate (possibly stochastic) transitions caused by agent actions. Green arrows indicate (possibly stochastic) forced transitions. Cautious approach provides less informational reward r2 < r1 but has a smaller chance of death p2 < p1 compared to confident approach. Travel and dying costs are not shown.

Hazard function learning for (a.) brave and (b.) timid animals. Brave animals start with a flexible hazard prior with a low mean for h2. This leads to longer bouts (first length 2, then 3 and 4), which imply that the hazard posterior quickly approaches zero (here, after 10 bouts). Timid animals start with an inflexible hazard prior with a higher mean for h2, and are limited to length-2 bouts. Their hazard posterior changes only slightly after 10 bouts.
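
The contrast between flexible and inflexible priors can be illustrated with a deliberately simplified conjugate model. The sketch below treats the hazard at a single bout duration as a Bernoulli probability with a Beta prior, parameterized by a prior mean and a concentration (low concentration = flexible, high concentration = inflexible). This Beta-Bernoulli stand-in, the function name `posterior_hazard_mean`, and all numerical values are assumptions for illustration only; they are not the hazard model or the fitted parameters of the paper.

```python
def posterior_hazard_mean(prior_mean, concentration, safe_bouts):
    """Posterior mean hazard after `safe_bouts` bouts with no detection.

    Beta(alpha, beta) prior with alpha = prior_mean * concentration and
    beta = (1 - prior_mean) * concentration; each safe bout adds 1 to beta.
    NOTE: a simplified stand-in, not the paper's hazard model.
    """
    if not 0.0 < prior_mean < 1.0:
        raise ValueError("prior_mean must lie strictly between 0 and 1")
    if concentration <= 0 or safe_bouts < 0:
        raise ValueError("concentration must be > 0 and safe_bouts >= 0")
    alpha = prior_mean * concentration
    beta = (1.0 - prior_mean) * concentration
    return alpha / (alpha + beta + safe_bouts)


# Hypothetical settings: a flexible, low-mean prior versus an inflexible,
# higher-mean prior, each updated after 10 safe bouts.
flexible = posterior_hazard_mean(prior_mean=0.2, concentration=2.0, safe_bouts=10)
inflexible = posterior_hazard_mean(prior_mean=0.3, concentration=200.0, safe_bouts=10)
```

Under these assumed values the flexible posterior mean falls from 0.2 to about 0.03, whereas the inflexible one moves only from 0.3 to about 0.29, mirroring the qualitative pattern described in the caption.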

Summary of model fit. Left panels: minute-to-minute time the animals spend within 7 cm of the novel object (top), bout duration (middle), and bout frequency (bottom). There are 26 animals (one per row), sorted by the group-timidity animal index (see main text Section 2.2.6). Central panels: the same values averaged over behavioral phases. Right panels: time, duration, and frequency of bouts generated as sample trajectories from the individual fits of the BAMDP model. Legend: green and blue distinguish cautious and confident bouts, respectively. Darker colors indicate higher values, and gray indicates zeros.

The bout durations of brave animals depend on the hazard prior. a.) Brave animals that initially perform cautious-2 bouts, then confident-3 bouts. The prior mean μ3 for τ = 3 is higher than in (c.) because there is some hazard to overcome before the animal performs a duration-3 bout. Blue indicates individual animals and black indicates the mean. The y-axis, E[μτ], shows μτ averaged over the ABCSMC posterior particles for each animal. b.) Cautious-2 then confident-4 animals. Since the μ4 prior mean is low, once the animal overcomes the τ = 2 hazard, it quickly transitions from duration 2 to 4. c.) Cautious-3 then confident-3 animals. These animals are fitted with a low μ3 prior and a high μ4 prior because they never perform duration-4 bouts. d.) Cautious-3 then confident-4 animals. Since the μ3 prior is lower than in (b.), these animals begin with duration-3 bouts.

a.) nCVaRα versus the group-timidity animal index ranking defined in Section 2.2.6. Color indicates the animal group. More timid animals are generally fitted by a lower nCVaRα. Prior hazard parameter for t=2 (b.), t=3 (c.), and t=4 (d.) versus timidity ranking. Dots indicate the mean; the probability density is represented by color, where darker means higher-density regions. The t=2 prior mean is similar across all animals (timid = 0.28 ± 0.02, intermediate = 0.26 ± 0.04, brave = 0.22 ± 0.08), explaining the short, cautious bouts all animals initially use to assess risk. However, timid animals are best fit with a lower-variance (inflexible) prior and higher t=3 and t=4 prior means. This leads to shorter, cautious bouts in the long run. Brave animals are fitted by a low-slope (indicated by lower means for t=3 and t=4) and high-variance (flexible) hazard prior. This allows them to perform longer bouts over time. The t=4 mean is low (panel d) for brave animals that perform length-4 bouts. Like brave animals, most intermediate animals have flexible, gradual hazards up to t=3.

a.) The relationship between G0 and the peak to steady-state change point for brave animals. The best-fit line is shown in black. Higher G0 means the agent explores longer, hence postponing the change point. b.) G0 versus peak to steady-state change point for timid animals. c.) Forgetting rate versus steady-state turns at the nest state for brave animals. A higher forgetting rate leads to quicker replenishment of the exploration pool and hence fewer turns at the nest before approaching the object. d.) Forgetting rate versus turns at the nest for timid animals. All correlations are significant with p < 0.002.

Non-identifiability of nCVaRα against the hazard prior. Animals are labeled using the group-timidity animal index. a.) The scatter plot shows the t=2 prior mean (μ2) versus nCVaRα for ABCSMC particles of timid animal 1. The ellipse indicates one standard deviation in a Gaussian density model. Animal 1 (and timid animals generally) can be fit either with a higher nCVaRα and a higher μ2, or with a lower nCVaRα and a lower μ2. The box-and-whisker plot illustrates the correlation between μ2 and nCVaRα across all timid animals. b.) The scatter plot shows an example intermediate animal 10; the box-and-whisker plot shows μ2 versus nCVaRα for the intermediate population. c.) The scatter plot shows an example animal 11 from the group containing cautious-2/confident-4 and cautious-2/confident-3 animals. This group of animals starts with duration-2 bouts and hence must overcome the prior μ3. The box-and-whisker plot shows μ3 versus nCVaRα for the population. d.) The scatter plot shows an example animal 25 from the group containing cautious-2/confident-4 and cautious-3/confident-4 animals. This group of animals eventually performs duration-4 bouts and hence must overcome the prior μ4. The box-and-whisker plot shows μ4 versus nCVaRα for the population. nCVaRα and μ are correlated in the ABCSMC posterior for all animals and are hence non-identifiable. p < 0.05 for all correlations.
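
The two quantities summarized in these scatter plots can be sketched as follows: the Pearson correlation between a hazard prior mean and nCVaRα across posterior particles, and the one-standard-deviation ellipse of a Gaussian fitted to the same points. The function names, the use of the sample covariance, and the toy particle values are assumptions for illustration; the caption does not state how the ellipse or the correlations were computed.

```python
import math


def _centered(xs, ys):
    """Validate paired samples and return them with their means removed."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    return [x - mx for x in xs], [y - my for y in ys]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between paired samples."""
    dx, dy = _centered(xs, ys)
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def one_sigma_ellipse(xs, ys):
    """Semi-axes and orientation (radians) of the 1-SD covariance ellipse.

    Uses the eigen-decomposition of the 2x2 sample covariance matrix.
    """
    dx, dy = _centered(xs, ys)
    n = len(dx)
    a = sum(d * d for d in dx) / (n - 1)              # var(x)
    c = sum(d * d for d in dy) / (n - 1)              # var(y)
    b = sum(p * q for p, q in zip(dx, dy)) / (n - 1)  # cov(x, y)
    mid = (a + c) / 2.0
    half = math.hypot((a - c) / 2.0, b)
    major = math.sqrt(mid + half)
    minor = math.sqrt(max(mid - half, 0.0))  # guard against round-off
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return major, minor, angle


# Hypothetical particles: prior mean (mu2) and nCVaR_alpha move together.
mu2 = [0.10, 0.20, 0.30, 0.40]
ncvar = [0.25, 0.30, 0.35, 0.40]
r = pearson_r(mu2, ncvar)
```

A correlation near ±1 with a thin, tilted ellipse is the signature of the trade-off described above: many combinations of the two parameters along the ellipse's major axis explain the behavior about equally well.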

Comparing behavior in the FONC and UONC conditions. There are 9 FONC and 11 UONC brave animals (one per row). Left panels: minute-to-minute time the animals spend within 7 cm of the novel object (top), bout duration (middle), and bout frequency (bottom). Animals are again sorted by the group-timidity animal index but split by experimental condition (UONC, then FONC). Central panels: the same values averaged over behavioral phases. Right panels: time, duration, and frequency of bouts generated as sample trajectories from the individual fits of the BAMDP model.

ABCSMC parameter fits of the 9 FONC and 11 UONC animals (with the latter replotted from Fig. 7 for convenience). The x-axis shows the group-timidity animal index, but UONC and FONC animals are separated. a.) Average nCVaRα over posterior particles of each animal. Color indicates the animal group. Dashed lines indicate the average (across animals) values of each condition (UONC brave or FONC brave). p-values for the Kolmogorov-Smirnov test of condition differences are shown. Here p < 0.05, so the nCVaRα values of brave FONC animals are significantly higher than those of brave UONC animals. b.) Exploration bonus pool, which is also significantly different between FONC and UONC animals. c.) Forgetting rate, which is not significantly different between the two conditions. Prior hazard parameter for t=2 (d.), t=3 (e.), and t=4 (f.). The probability density is represented by color, where darker means higher-density regions. Dots indicate the mean. Dashed lines indicate the average of mean values across animals, while dotted lines indicate the average of standard deviation values across animals. p-values testing the difference between the two conditions' means and standard deviations are shown on the right-hand side and left-hand side of the plots, respectively. Brave FONC animals have both a significantly lower hazard prior mean and a significantly lower standard deviation than brave UONC animals.
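
For readers unfamiliar with the test, the two-sample Kolmogorov-Smirnov statistic is the largest absolute gap between the empirical distribution functions of the two groups. The sketch below computes that statistic only; it does not compute a p-value (a library routine such as `scipy.stats.ks_2samp` returns both). The function name `ks_statistic` and the toy values are assumptions for illustration and are not the fitted parameters reported here.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(v) - F_b(v)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so it suffices
    # to compare them at every distinct value in the pooled sample.
    for v in sorted(set(a) | set(b)):
        f_a = bisect.bisect_right(a, v) / len(a)
        f_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


# Hypothetical per-animal parameter values for two conditions.
group_1 = [0.61, 0.70, 0.74, 0.80, 0.83]
group_2 = [0.35, 0.42, 0.50, 0.58, 0.66]
d = ks_statistic(group_1, group_2)
```

Because the statistic depends only on the ranks of the pooled values, it makes no assumption about the shape of the parameter distributions, which suits small groups such as the 9 and 11 animals compared here.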

Bayes net showing the relationship between the random variables in the noisy-or model. Only xτ is shown; xτ+1 depends on zt=1:τ+1, and so on.

Table of ABCSMC Parameters

Recovery targets versus the closest particles in the ABCSMC posterior. Each subplot shows one of the nine fitted parameters for all 26 animals. The colors of the points indicate the animal group. The gray y = x line represents perfect recovery of the recovery targets. Most points lie close to the y = x line, suggesting that our ABCSMC fitting algorithm has good recoverability.

Identical to Fig. 1, but with the recovery targets plotted against the (marginal) means of the ABCSMC posterior. We chose the final ABCSMC population (population 15) as the posterior. R2 is high for G0 and f, suggesting that these parameters are identifiable. R2 is low for nCVaRα and the hazard priors due to the non-identifiability discussed in the main text. In particular, R2 is less than 0.0 for nCVaRα and θ2-mean, suggesting that these parameters are the most confounded. However, R2 is high for θ2-deviation, suggesting that nCVaRα does not confound the flexibility of the hazard function. Finally, the R2 for θ3 is nearly zero. This is expected because timid and some intermediate animals do not perform duration-3 approaches, and for these animals θ3 can take on arbitrarily large values.
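
A negative R2 is possible when the coefficient of determination is measured against the y = x line rather than against a fitted regression line. The sketch below assumes that definition, with the posterior means acting as predictions of the recovery targets; the function name `r2_against_identity` and the toy numbers are illustrative assumptions, and the caption does not state the exact formula used.

```python
def r2_against_identity(targets, estimates):
    """R^2 of estimates as predictions of targets, relative to y = x.

    R^2 = 1 - SS_res / SS_tot, where SS_res sums (target - estimate)^2 and
    SS_tot sums (target - mean(targets))^2. The value is 1 for perfect
    recovery and negative when the estimates do worse than always
    predicting the average target.
    """
    if len(targets) != len(estimates) or len(targets) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_target = sum(targets) / len(targets)
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all targets are equal")
    ss_res = sum((t - e) ** 2 for t, e in zip(targets, estimates))
    return 1.0 - ss_res / ss_tot


# Hypothetical recovery targets and posterior means (not values from the paper).
good = r2_against_identity([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])        # perfect recovery
confounded = r2_against_identity([0.1, 0.2, 0.3], [0.3, 0.2, 0.1])  # reversed ordering
```

In the second toy case the estimates are anti-correlated with the targets, so the residual sum of squares exceeds the total sum of squares and R2 drops below zero.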

The ABCSMC posterior for animal 24. Univariate and bivariate marginals are shown on the diagonal and off-diagonal, respectively. Recovery targets are shown as green vertical lines in univariate plots and green points in bivariate plots. Marginal means are shown in orange. Recovery targets and means are close for G0 and f due to their identifiability. nCVaRα and the hazard prior parameters are non-identifiable; hence, their recovery targets lie farther from the means but still within a region of the posterior with support.