Food-washing monkeys recognize the law of diminishing returns

  1. Department of Anthropology, Dartmouth College, Hanover, United States
  2. Department of Biological Sciences, Dartmouth College, Hanover, United States
  3. Ecology, Evolution, Environment & Society, Dartmouth College, Hanover, United States
  4. School of Natural Sciences, University of California, Merced, United States
  5. The Santa Fe Institute, Santa Fe, United States
  6. Department of Biology, Chulalongkorn University, Bangkok, Thailand
  7. National Primate Research Center of Thailand, Chulalongkorn University, Saraburi, Thailand
  8. Department of Anthropology, Durham University, Durham, United Kingdom

Peer review process

Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.


Editors

  • Reviewing Editor
    Jenny Tung
    Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  • Senior Editor
    George Perry
    Pennsylvania State University, University Park, United States of America

Reviewer #1 (Public review):

In this paper, the authors had 2 aims:

(1) Measure macaques' aversion to sand and see if its removal is intentional, as it likely produces an unpleasant sensation and causes tooth damage.

(2) Test whether monkeys engage in suboptimal behavior by cleaning foods beyond the point of diminishing returns, and whether this is related to individual traits such as sex and rank, and to behavioral technique.

They attempted to achieve these aims through a combination of geochemical analysis of sand, field experiments, and comparing predictions to an analytical model.

The authors' conclusions were that they verified a long-standing assumption that monkeys have an aversion to sand as it contains many potentially damaging fine grained silicates, and that removing it via brushing or washing is intentional.

They also concluded that monkeys will clean food for longer than is necessary, i.e. beyond the point of diminishing returns, and that this is rank-dependent.

High and low-ranking monkeys tended not to wash their food, but instead over-brushed it, potentially to minimize handling time and maximize caloric intake, despite the long-term cumulative costs of sand.

This was interpreted through the *disposable soma hypothesis*, where dominants maximize immediate needs to maintain rank and increase reproductive success at the potential expense of long-term health and survival.

# Strengths

The field experiment seemed well designed, and their quantification of the physical and mineral properties of quartz particles (relative to human detection thresholds) seemed good relative to their feret diameter and particle circularity (to a reviewer that is not an expert in sand). The *Rank Determination* and *Measuring Sand* sections were clear.

In achieving Aim 1, the authors validated a commonly assumed but unmeasured function of macaque and primate behavior, a key finding for research on primate food processing and cultural transmission.

I commend their approach in trying to develop a quantitative model to generate predictions to compare to empirical data for their second aim.
This is something others should strive for.

I really appreciated the historical context of this paper in the introduction and found it very enjoyable and easy to read.

I do think that interpreting these results in the context of the *disposable soma hypothesis* and the potential implications in the *paleolithic matters* section about interpreting dental wear in the fossil record are worthwhile.

# Weaknesses

Several of my concerns in an earlier review were addressed in revision, which I appreciate. One thing I think could strengthen this paper is a clearer link to social foraging theory to explore heterogeneity in handling times (as the currency they are trying to maximize).

I am satisfied with the improvements in statistics and that I can access the code and data.

I am still struck that the analysis included only trials where <3 individuals were present. If rank is important, I would imagine that behavior might differ in social contexts where theft, scrounging, policing, aggression, or other distractions might occur, and where rank would affect foraging behavior. Maybe lower-ranking individuals prioritize rapid food intake then. If rank should be related to investment in this behavior, we might expect this to be magnified (or different) in social contexts where it would affect foraging. It might just be that the data were too hard to score or process in those settings, or the analysis was limited. Additionally, I think that more robust metrics of rank from more densely sampled focal follow data would be a better measure, but I acknowledge the limitations in getting the ideal. Since rank is central to the interpretation of these results, I think that the reduced social contexts in which rank was analyzed and the robustness of the data from which rank was calculated and analyzed are the main weaknesses of the evidence presented in this paper.

While some of the boxes about raccoons and Concorde Fallacy were interesting, they did feel like a bit of a distraction from the main message in the paper.

Reviewer #3 (Public review):

This revised paper provides evidence that food washing and brushing in wild long-tailed macaques are deliberate behaviors to remove sand that can damage tooth enamel. The demonstration of the immediate functional importance of these behaviors is nicely done, and there is some interesting initial evidence that macaques differ systematically in their investment in food cleaning based on dominance rank.

The authors interpret this evidence as support for "disposable soma" effects: that reduced time and effort invested in food washing by high-ranking individuals is attributable to prioritizing reproductive effort. Given that the analysis is on a single group with no longitudinal data, there are no fitness measures or fitness proxies, the energetic constraints faced by this population are not clear, and both sexes are combined into a single dominance hierarchy (trade-offs between different forms of investment are typically thought to differ between sexes), this conclusion is premature, although an interesting foundation for future studies.

More generally, the results directly supported by the data collection and analysis (grit on Koshima likely damages macaque teeth; processing food helps mitigate the damage; there is some interesting interindividual variation in food processing time, and that time is not always in line with what appears to be optimal) tend to be combined with interpretation that is much more speculative (e.g., the effect sizes observed are consequential for fitness; high-ranking animals are making choices that optimize their long-term fitness at the expense of their soma). This is in part a stylistic choice but can have the effect of drawing attention away from the stronger empirical findings and/or be misleading. Similarly, although I appreciate that the authors were trying to interpret and respond to previous feedback from reviewers, I found the addition of the box text on the raccoon nomenclature and on irrational behavior and the Concorde effect distracting (more intro-textbook style than journal article style).

Author response:

The following is the authors’ response to the original reviews.

We thank the reviewers for their constructive criticism. It is rare and gratifying to receive such thoughtful feedback, and the result is a much stronger paper. We made significant changes to our statistical analyses and figures to better differentiate the effects of sex and dominance rank on food-cleaning behaviors. These revisions uphold our original conclusion––that rank-related variation overwhelms any sex difference in cleaning behavior. We hope that these edits, together with the rest of our responses, provide a convincing demonstration of the tradeoffs of eliminating quartz from food surfaces.

Reviewer #1 (Public Review):

Summary

We have no objections to Reviewer 1’s summary of our manuscript.

Strengths

Reviewer 1 is extremely gracious, and we are grateful for the kind words.

Weaknesses

Reviewer 1 identified several weaknesses, enumerating three types: (1) statistics, (2) insufficient links to foraging theory, and (3) interpretation and validity of the model. The present response is organized around these same categories.

(1) Statistics

We put all of our data and code into the Zenodo repository prior to submission. This content should have been accessible to Reviewer 1 from the outset. But in any event, we are very sorry for the mix-up. To ensure access to our data and code during the present stage of review, we included the URL in the main manuscript and here: https://doi.org/10.5281/zenodo.14002737

(a) AIC and outcome distributions

Reviewer 1 criticized our use of AIC for determining model selection. We agree and this aspect of our manuscript is now removed. In lieu of AIC, we produced two data sets consisting of whole number counts (seconds) with means <5. The data were right-skewed due to high concentrations of biologically-meaningful zeros (i.e., bouts of food handling without any cleaning effort). Following the recommendations of Bolker et al. (2008) and others (Brooks et al. 2017, 2019), we chose an outcome distribution (zero-inflated Poisson, see response below) that best matched this data distribution. In addition, we evaluated the post-hoc performance of each of our models using the standardized residual diagnostic tools for hierarchical regression models available in the DHARMa package (Hartig, 2022). To further evaluate our choice of outcome distribution, we generated QQ-plots and residual vs. predicted plots for each model and included them in our revision as Figures S3-S5.

(b) zeros

Reviewer 1 expressed concern over our treatment of biologically-meaningful zeros, and recommended use of a zero-inflated GLMM with either a Poisson or negative binomial outcome distribution. We agree that such models are best for our two data sets. Accordingly, we fit a series of zero-inflated generalized linear mixed models (ZIGLMM) using the glmmTMB package in R, each with a logit-link function, a single zero-inflation parameter applying to all observations, and a Poisson error distribution. For the food-brushing model, we fit a zero-inflated Poisson (ZIP), which produced favorable standardized residual diagnostic plots with no major patterns of deviation (Figure S3) and minor but non-significant underdispersion (DHARMa dispersion statistic = 0.99, p = 0.80). For our two food-washing models, we used zero-inflated models with Conway-Maxwell Poisson (ZICMP) distributions, an error distribution chosen for its ability to handle data that are more underdispersed (DHARMa dispersion statistic = 8.2E-09, p = 0.74) than the standard zero-inflated Poisson (Brooks et al. 2019). Using this error distribution improved residual diagnostic plots over a standard ZIP model and we view any deviations in the standardized residuals as minor and attributable to the smaller sample size of our food-washing data set (see Figures S4 and S5) (Hartig, 2022). We reported the summarized fixed effects tests for each GLMM in Tables S1-S3 as Analysis of Deviance Tables (Type II Wald chi-square tests, one-sided) along with 𝜒² values, degrees of freedom, and p-values. Full model summaries with standard errors and confidence intervals are also included in Tables S4-S6. For all statistical analyses, we set 𝛼 = 0.05.
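To make the zero-inflation idea concrete, the following is a minimal Python sketch of a zero-inflated Poisson probability mass function and log-likelihood. It is not the glmmTMB model described above: it has no predictors, no random effects, and no Conway-Maxwell variant, and the function names (`zip_pmf`, `zip_loglik`, `simulate_zip`) and the parameter values are illustrative assumptions only.

```python
import math
import random

def zip_pmf(k, lam, pi):
    """P(Y = k): a structural zero with probability pi, otherwise Poisson(lam)."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson

def zip_loglik(counts, lam, pi):
    """Log-likelihood of independent counts under the ZIP model."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

def simulate_zip(n, lam, pi, rng):
    """Simulate n counts, e.g. seconds of cleaning with many no-cleaning bouts."""
    out = []
    for _ in range(n):
        if rng.random() < pi:
            out.append(0)  # structural zero: no cleaning effort at all
        else:
            # Poisson draw by Knuth's multiplication method (adequate for small lam)
            limit, k, prod = math.exp(-lam), 0, rng.random()
            while prod > limit:
                k += 1
                prod *= rng.random()
            out.append(k)
    return out

# Assumed values for illustration only, not estimates from the study.
rng = random.Random(0)
counts = simulate_zip(2000, lam=3.0, pi=0.6, rng=rng)
mean = sum(counts) / len(counts)
print("ZIP log-likelihood:    ", round(zip_loglik(counts, 3.0, 0.6), 1))
print("Poisson log-likelihood:", round(zip_loglik(counts, mean, 0.0), 1))
```

Setting `pi = 0` recovers an ordinary Poisson, so the two printed values show how much a plain Poisson is penalized when many of the zeros are structural.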

(2) Absence of Links to Foraging Theory

This critique has three components. The first revisits the absence of code for the optimal cleaning time model. This omission was an unfortunate error at the moment of submission, but our code is available now as a Mathematica notebook in Zenodo (https://doi.org/10.5281/zenodo.14002737). The second pivots around our scholarship, admonishing us for failing to acknowledge the marginal value theorem of Charnov (1976). It is a fair point and we have corrected the oversight with a citation to this classic paper. The third criticism is also rooted in scholarship, with Reviewer 1 asking for greater connection to the existing literature on optimal foraging theory, a point echoed in the summary assessment of the editors at eLife. This comment and the weight given to it by eLife’s editors put us in a difficult spot, as our paper is focused on the optimization of delayed gratification, not food acquisition per se. So, we are in the awkward position of gently resisting this recommendation while simultaneously agreeing with Reviewer 1 that we need to better situate our findings in the landscape of existing literature. To thread this needle, we produced Box 2 with a photograph and 410 words. This display box puts our findings into direct conversation with recent research focused on the sunk cost fallacy.

(3) Interpretation and validity of model relative to data

This critique is focused on the simulated brushing and washing results reported in Figure S1, along with its captioning, which was inadequate. We edited the caption to identify the author (JER) who simulated the brushing and washing behaviors of the monkeys. In addition, we clarified the number of brushing replicates (3) and washing replicates (3) for each of three treatments, for a total of 18 simulations.

We followed Reviewer 1’s suggestion, incorporating the experimental uncertainty of grit removal into our optimal cleaning time model. We drew % grit removed values from a distribution, discounting the rare event when values ≥ 100% were drawn. As the % grit removed is used to estimate the cleaning inefficiency parameter 𝑐 for brushing and washing, the included uncertainty now allows us to evaluate these parameters as distributions; and, in turn, obtain a distribution for our predicted brushing and washing optimal cleaning times. As we now describe in the main text, the optimal cleaning times for brushing and washing are 𝑡* = 0.98 ± 0.19 s and 𝑡* = 2.40 ± 0.74 s, respectively. We are grateful for Reviewer 1’s suggestion, for it added valuable context to our model predictions. Notably, the inclusion of experimental uncertainty did not change the qualitative nature of our results, or the interpretations of our model predictions compared to observed cleaning behaviors.
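The general shape of this uncertainty propagation can be sketched in Python. This is not the Mathematica model in the Zenodo repository. The normal sampling distribution, the saturating benefit curve B(t) = 1 - exp(-t/c), the mapping from % grit removed to c, and every numeric input below are assumptions chosen only to illustrate the steps: draw, discard values at or above 100%, convert to c, and solve a marginal-value-type condition B'(t) = B(t)/(h + t) for the optimum.

```python
import math
import random
import statistics

def draw_fraction_removed(mean, sd, rng):
    """Draw a grit-removal fraction, redrawing anything outside (0, 1)."""
    while True:
        p = rng.gauss(mean, sd)
        if 0.0 < p < 1.0:  # discards the rare draws at or above 100%
            return p

def inefficiency(p, duration):
    """Assumed mapping: p = 1 - exp(-duration / c), solved for c."""
    return -duration / math.log(1.0 - p)

def optimal_time(c, h):
    """Bisection on the marginal value condition B'(t) = B(t) / (h + t)."""
    def f(t):
        decay = math.exp(-t / c)
        return decay / c - (1.0 - decay) / (h + t)
    lo, hi = 1e-9, 1.0
    while f(hi) > 0:  # widen the bracket until the sign changes
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Placeholder inputs, not values from the study.
rng = random.Random(42)
times = [
    optimal_time(inefficiency(draw_fraction_removed(0.8, 0.1, rng), duration=2.0), h=3.0)
    for _ in range(2000)
]
print(f"t* = {statistics.fmean(times):.2f} +/- {statistics.stdev(times):.2f} s")
```

Under the assumed mapping, the rejection step is more than cosmetic: a draw of exactly 100% would make log(1 - p) undefined, so such draws cannot be converted into a finite value of c.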

We chose to exclude variability in handling time h when generating predicted cleaning time optima, at least in the main text. Our reasoning stems from the observation that handling time variability is long-tailed, with the longer handling times associated with behaviors that we do not account for in our analysis. For example, individuals carrying multiple cucumber slices to the ocean were apt to drop them, struggling at times to re-grasp so many at once. Such moments increased handling times substantially. Still, we acted on Reviewer 1’s suggestion, accounting for the tandem effects of handling time variability and uncertainty in % grit removed (see Figure S6). Drawing handling time estimates from a log-normal distribution fitted to the handling time data, we found that these dual sources of uncertainty did not qualitatively change our results. They added further uncertainty to the predicted washing time, but the mean remains roughly equivalent. (We note that brushing is assumed to have a constant handling time––composed of only assessment time and no travel––such that the results for brushing do not change.) Both analyses are included in the Mathematica notebook (https://doi.org/10.5281/zenodo.14002737).
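Fitting a log-normal distribution and drawing handling times from it can be sketched as follows. This is an illustration in Python rather than the authors' Mathematica code. The sample is synthetic, generated from assumed parameters (`mu = 1.0`, `sigma = 0.5` on the log scale) instead of the observed handling times, and the helper names are hypothetical.

```python
import math
import random
import statistics

def fit_lognormal(times):
    """Maximum-likelihood-style fit: mean and SD of the log-transformed times."""
    logs = [math.log(t) for t in times]
    return statistics.fmean(logs), statistics.stdev(logs)

def draw_handling_times(mu, sigma, n, rng):
    """Draw n handling times (seconds) from a log-normal distribution."""
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

# Synthetic stand-in for observed handling times (assumed parameters).
rng = random.Random(1)
observed = draw_handling_times(1.0, 0.5, 5000, rng)

mu_hat, sigma_hat = fit_lognormal(observed)
resampled = draw_handling_times(mu_hat, sigma_hat, 5000, rng)

# A long right tail pulls the mean above the median.
print("fitted mu, sigma:", round(mu_hat, 3), round(sigma_hat, 3))
print("mean > median:", statistics.fmean(resampled) > statistics.median(resampled))
```

Each resampled handling time could then be passed, together with a drawn cleaning inefficiency, to whatever routine computes the optimal cleaning time, which is how the two sources of uncertainty would be combined.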

Reviewer #2 (Public Review):

Summary

We have no objections to Reviewer 2’s summary of our manuscript.

Strengths

Reviewer 2 is extremely gracious, and we are grateful for the kind words.

Weaknesses

Reviewer 2 noted that our manuscript failed to provide “sufficient background on [our study] population of animals and their prior demonstrations of food-cleaning behavior or other object-handling behaviors (e.g., stone handling).” To address this comment, we edited the introduction (lines 56-58) to alert readers to the onset of regular food-cleaning behaviors sometime after December 26, 2004. In addition, we edited our methods text (lines 155-160) to highlight the onset and limited scope of prior research with this study population:

“The animals are well habituated to human observers due to regular tourism and sustained study since 2013 (Tan et al., 2018). Most of this research has revolved around stone tool-mediated foraging on mollusks, the only activity known to elicit stone handling (Malaivijitnond et al., 2007; Gumert and Malaivijitnond, 2012, 2013; Tan et al., 2015), although infants and juveniles will sometimes use stones during object play (Tan, 2017). There has been no prior examination of food-cleaning behaviors.”

Reviewer #3 (Public Review):

Reviewer 3 identified three weaknesses, which we address in three paragraphs.

Reviewer 3 questioned our methods for determining rank-dependent differences in cleaning behavior, arguing that our conclusions were unsupported. It is a fair point, and it compelled us to combine males and females into a single standardized ordinal rank of 24 individuals. This unified ranking is now reflected in the x-axes of Figure 2 and Figure S2. Plotting the data this way––see Figure S2––underscores Reviewer 3’s concern that sex and dominance rank are confounding variables. To address this problem, our GLMM included rank and sex as predictor variables, which controls for the effect of sex when assessing the relationship between rank and cleaning time across the three treatments. Reported in Tables S1-S3, these findings show that the effect of sex on either brushing or washing time was not significant. This result bolsters our original contention that rank-related variation in cleaning time overwhelms any sex differences.

Relatedly, Reviewer 3 questioned our conclusions on the effects of rank because our study was focused on a single social group. In other words, it is plausible that our results were heavily influenced by the idiosyncrasies of select individuals, not dominance rank per se. It is a fair point, and it compelled us to include individual ID as a random effect in each of our GLMMs. Including individual ID as a random intercept allowed us to control for inter-individual variation in cleaning duration while assessing the effects of rank. An analysis based on additional social groups or longitudinal data is certainly desirable, but also well beyond the scope of a Short Report for eLife.

Finally, Reviewer 3 objected to fragments of sentences in our abstract, introduction, and discussion, combining them into a criticism of claims that we did not and do not make. It probably wasn’t intentional, but it puts us in the awkward position of deconstructing a strawman:

● Reviewer 3 begins, “there is no evidence presented on the actual fitness-related costs of tooth wear or the benefits of slightly faster food consumption”. This statement is true while insinuating that collecting such evidence was our intent. To be clear, our experiment was never designed to measure tooth wear or reproductive fitness, nor do we make any claims of having done so.

● Reviewer 3 adds, “Support for these arguments is provided based on other papers, some of which come from highly resource-limited populations (and different species). But this is a population that is supplemented by tourists with melons, cucumbers, and pineapples!” We were puzzled over these sentences. The first fails to mention that the citations exist in our discussion. Citing relevant work in a discussion is a basic convention of scientific writing. But it seems the underlying intent of these words is to denigrate the value of our study population because two dozen tourists visit Koram Island once a day. Exclamations to the contrary, the amount of tourist-provisioned food in the diet of any one monkey is negligible.

● Last, Reviewer 3 commented on matters of style, objecting to “overly strong claims.” We puzzled over this criticism because the claims in question are broader points of introduction or discussion, not results. The root problem appears to be the final sentence of our abstract:

“Dominant monkeys abstained from washing, balancing the long-term benefits of mitigating tooth wear against immediate energetic requirements, an essential predictor of reproductive fitness.”

This sentence has three clauses. The first is a statement of results, whereas the second and third are meant to mirror our discussion on the importance of our findings. We combined the concepts into a single concluding sentence for the sake of concision, but we can appreciate how a reader could feel deceived, expecting to see data on tooth wear and fitness. So, our impression is that we are dealing with a simple misunderstanding of our own making, and that this single sentence explains Reviewer 3’s criticism and tone––it cast a long shadow over the substance of our paper. To resolve this problem, we edited the sentence:

“Dominant monkeys abstained from washing, a choice consistent with the impulses of dominant monkeys elsewhere: to prioritize rapid food intake and greater reproductive fitness over the long-term benefits of prolonging tooth function.”
