Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Editors
- Reviewing Editor: Guy Sella, Columbia University, New York, United States of America
- Senior Editor: David James, University of Sydney, Sydney, Australia
Reviewer #1 (Public Review):
In this paper, Hui and colleagues investigate how the predictive accuracy of a polygenic score (PGS) for body mass index (BMI) changes when individuals are stratified by 62 different covariates. After showing that the PGS has different predictive power across strata for 18 out of 62 covariates, they turn to understanding why these differences arise and whether predictive performance can be improved. First, they investigate which types of covariates result in the largest differences in PGS predictive power, finding that covariates with larger "main effects" on the trait and covariates with larger interaction effects (interacting with the PGS to affect the trait) tend to better stratify individuals by PGS performance. The authors then test whether including interactions between the PGS and covariates improves predictive accuracy, finding that linear models yield only modest increases in performance, whereas nonlinear models yield more substantial gains.
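The stratified analysis summarized above can be illustrated with a minimal sketch on simulated data. This is not the authors' pipeline: the effect sizes, the sample size, and the helper names `r2` and `pgs_r2_by_quintile` are all assumptions made for illustration. The trait is simulated with a PGS-by-covariate interaction, and the squared correlation between PGS and trait is then computed within each covariate quintile.

```python
import numpy as np

def r2(x, y):
    """Squared Pearson correlation between x and y."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def pgs_r2_by_quintile(pgs, covariate, trait, n_bins=5):
    """Return the PGS-trait R^2 within each quantile bin of the covariate."""
    edges = np.quantile(covariate, np.linspace(0, 1, n_bins + 1))
    # Interior edges only, so every individual falls in a bin 0..n_bins-1.
    bins = np.digitize(covariate, edges[1:-1])
    return [r2(pgs[bins == b], trait[bins == b]) for b in range(n_bins)]

# Simulated (hypothetical) data: the PGS effect grows with the covariate,
# i.e. there is a PGS x covariate interaction on the trait.
rng = np.random.default_rng(0)
n = 50_000
pgs = rng.standard_normal(n)
cov = rng.standard_normal(n)
trait = 0.3 * pgs + 0.3 * cov + 0.15 * pgs * cov + rng.standard_normal(n)

per_bin = pgs_r2_by_quintile(pgs, cov, trait)
print([round(v, 3) for v in per_bin])
```

In this toy setting the within-quintile R² rises from the lowest to the highest covariate quintile, because the interaction term makes the per-individual PGS slope depend on the covariate.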
Overall, the results are interesting and well-supported. The results will be broadly interesting to people using and developing PGS methods. Below I list some strengths and minor weaknesses.
Strengths:
A major impediment to the clinical use of PGS is the interaction between the PGS and various other routinely measured covariates, and this work provides a very interesting empirical study along these lines. The problem is interesting, and the work presented here is a convincing empirical study of the problem.
The result that PGS accuracy differs across covariates, but in a way that is not well captured by linear models with interactions, is important for PGS method development.
Weakness:
While arguably outside the scope of this paper, one shortcoming is the lack of a conceptual model explaining the results. It is interesting and empirically useful that PGS prediction accuracy differs across many covariates, but some of the results are hard to reconcile simultaneously. For example, it is interesting that triglyceride levels are associated with PGS performance across cohorts, but it seems like the effect on performance is discordant across datasets (Figure 2). Similarly, many of these effects have discordant (linear) interactions across cohorts (Figure 3). Overall it is surprising that the same covariates would be important but for presumably different reasons in different cohorts. Similarly, it would be good to discuss how the present results relate to the conceptual models in Mostafavi et al. (eLife 2020) and Zhu et al. (Cell Genomics 2023).
Reviewer #2 (Public Review):
This work follows in the footsteps of earlier work showing that BMI prediction accuracy can vary dramatically by context, even within a relatively ancestrally homogenous sample. This is an important observation that is worth the extension to different context variables and samples.
Many of the follow-up analyses commendably try to take us a step further, towards explaining the underlying observed trends of variable prediction accuracy for BMI. Some of these analyses, however, are somewhat confounded and hard to interpret.
For example, many of the covariates by which the authors stratify the sample may drive range restriction effects. Further, the covariates considered could be causally affected by genotype and causally affect BMI, with reverse causality effects; other covariates may be partially causally affected by both genotype and BMI, resulting in collider bias. Finally, population structure differences between quintiles of a covariate may drive variable levels of stratification. These issues can bias estimation and confound interpretation, and at least one of them intuitively seems like a concern for each of the context variables (e.g., the covariates SES, LDL, diet, age, smoking, and alcohol drinking).
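The range restriction concern can be made concrete with a toy simulation. This is a sketch under stated assumptions, not an analysis of the authors' data: the effect sizes are arbitrary, and the covariate is assumed to be causally downstream of the trait. The trait depends on the PGS with a constant effect and no interaction, yet PGS accuracy still differs across covariate quintiles.

```python
import numpy as np

def r2(x, y):
    """Squared Pearson correlation between x and y."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

rng = np.random.default_rng(1)
n = 200_000
pgs = rng.standard_normal(n)
# Constant PGS effect on the trait: no PGS x covariate interaction.
trait = 0.4 * pgs + rng.standard_normal(n)
# Assumption: the covariate is partly caused by the trait itself.
covariate = trait + 0.5 * rng.standard_normal(n)

# Assign individuals to covariate quintiles (bins 0..4).
edges = np.quantile(covariate, [0.2, 0.4, 0.6, 0.8])
bins = np.digitize(covariate, edges)

overall_r2 = r2(pgs, trait)
r2_by_bin = [r2(pgs[bins == b], trait[bins == b]) for b in range(5)]
print(round(overall_r2, 3), [round(v, 3) for v in r2_by_bin])
```

Here every within-quintile R² falls below the full-sample value, and the middle quintile, which spans the narrowest covariate range, is attenuated more than the two extreme quintiles. Differences in stratified accuracy can therefore arise without any change in the underlying genetic effect.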
The increased prediction accuracy observed with some of the age-dependent prediction models is notable. Despite the clear utility of this investigation, I am not aware of much existing work that shows such improvements for context-aware prediction models (compared to additive/main effect models). I would be curious to see if the predictive utility extends to held-out data from a data set distinct from the UKB, where the model was trained, or whether it replicates when predicting variation within families. Such analyses could strengthen the evidence for these models capturing direct causal effects, rather than other reasons for the associations existing in the UKB sample.
Reviewer #3 (Public Review):
Polygenic scores (PGSs), constructed based on genetic effect sizes estimated in genome-wide association studies (GWAS) and used to predict phenotypes in humans, have attracted considerable recent interest in human and evolutionary genetics, and in the social sciences. Recent work, however, has shown that PGSs have limited portability across ancestry groups, and that even within an ancestry group, their predictive accuracy varies markedly depending on characteristics such as the socio-economic status, age, and sex of the individuals in the samples used to construct them and to which they are applied. This study takes further steps in investigating and addressing the latter problem, focusing on body mass index, a phenotype of substantial biomedical interest. Specifically, it quantifies the effects of a large number of covariates and of interactions between these covariates and the PGS on prediction accuracy; it also examines the utility of including such covariates and interactions in the construction of predictors using both standard methods and artificial neural networks. This study would be of interest to investigators who develop and apply PGSs.
I should add that I have not worked on PGSs and am not a statistician, and apologize in advance if this has led to some misunderstandings.
Strengths:
- The paper presents a much more comprehensive assessment of the effects of covariates than previous studies. It finds many covariates to have a substantial effect, which further highlights the importance of this problem to the development and application of PGSs for BMI and more generally.
- The findings regarding the relationships between the effects of covariates and interactions between covariates and PGSs are, to the best of my knowledge, novel and interesting.
- The development of predictors that account for multiple covariates and their interactions with the PGS is, to the best of my knowledge, novel and may prove useful in future efforts to produce reliable PGSs.
- The improvement offered by the predictors that account for PGS and covariates using neural networks highlights the importance of non-linear interactions that are not addressed by standard methods, which is both interesting and likely to be of future utility.
Weaknesses:
- The paper would benefit substantially from extensive editing. It also uses terminology that is specific to recent literature on PGSs, thus limiting accessibility to a broader readership.
- The potential meaning of most of the results is not explored. Some examples are provided below:
• the paper emphasizes that 18/62 covariates examined show significant effects, but this result clearly depends on the covariates included. It would be helpful to provide more detail on how these covariates were chosen. Moreover, many of these covariates are likely to be correlated, making this result more difficult to interpret. Could these questions at least be partially addressed using the predictors constructed using all covariates and their interactions jointly (i.e., with LASSO)? In that regard, it would be helpful to know how many of the covariates and interactions were used in this predictor (I apologize if I missed that).
• While the relationship between covariate effects and covariate-PGS interaction effects is intriguing, it is difficult to interpret without articulating what one would expect, i.e., what would be an appropriate null.
• The finding that using artificial neural networks substantially improves prediction over more standard methods is especially intriguing, and highlights the potential importance of non-linear relationships between PGSs and covariates. These relationships remain hidden in a black box, however. Even fairly straightforward analyses, based on using different combinations of the PGS and/or covariates, may shed some light on these relationships. For example, one could analyze which covariates have a substantial effect on the prediction, or vary one covariate at a time for different values of the PGS.
- The relationship to previous work should be discussed in greater detail.
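As an illustration of the "vary one covariate at a time" probe suggested in the weaknesses above, the sketch below queries a fitted predictor at fixed PGS values while sweeping a single covariate. The function `black_box` is a purely hypothetical stand-in for a trained nonlinear model (it is not the authors' neural network), and the functional form, the age grid, and the helper name `pgs_slope_at` are assumptions.

```python
import numpy as np

def black_box(pgs, age):
    """Hypothetical stand-in for a fitted nonlinear BMI predictor."""
    # Assumed form: the PGS effect is strongest at age 40 and decays with age.
    return 25.0 + 1.5 * pgs * np.exp(-((age - 40.0) / 20.0) ** 2)

def pgs_slope_at(model, age, lo=-1.0, hi=1.0):
    """Change in prediction per unit PGS at a fixed covariate value."""
    return float((model(hi, age) - model(lo, age)) / (hi - lo))

# Sweep the covariate while holding everything else fixed.
ages = [40.0, 50.0, 60.0, 70.0]
slopes = [pgs_slope_at(black_box, a) for a in ages]
for a, s in zip(ages, slopes):
    print(f"age {a:.0f}: predicted BMI change per unit PGS = {s:.3f}")
```

Plotting such slopes against the covariate, one covariate at a time, would give a simple view of how a fitted model lets the PGS effect vary with context.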