Figure 1. A flowchart of the project.

Figure 2. PGS R² stratified by quintiles for quantitative variables and by binary variables. a) Continuous covariates with significant (p < 8.1×10⁻⁴) R² differences across quintiles in UKBB EUR. Pork and processed meat consumption per week were excluded from this plot in favor of pork and processed meat intake. b) Covariates with significant differences that were available in multiple cohorts. When traits had the same or directly comparable units between cohorts, we show the actual trait values on the x-axis; percentiles are shown for physical activity, alcohol intake frequency, and socioeconomic status, which had slightly differing phenotype definitions across cohorts. Townsend index and income were used as variables for socioeconomic status in UKBB and GERA, respectively. Note that the sign for Townsend index was reversed, since increasing Townsend index indicates lower socioeconomic status, while increasing income indicates higher socioeconomic status. Abbreviations: physical activity (PA), International Physical Activity Questionnaire (IPAQ).

Table 1. Model descriptive statistics for the 28 of 62 covariates that have significant (p < 0.05/62) PGS-covariate interaction terms in UKBB EUR. The third column is the percentage change in PGS effect per unit change in covariate (standard deviations for continuous variables; binary variables encoded as 0 or 1). The fifth column is the increase in model R² with a PGS-covariate interaction term versus a main-effects-only model. Abbreviations: blood pressure (BP), physical activity (PA), forced vital capacity (FVC), forced expiratory volume in 1 second (FEV1), International Physical Activity Questionnaire (IPAQ).

Figure 3. Relative percentage changes in PGS effect per unit change in covariate, for covariates that significantly changed PGS effect (i.e., significant interaction beta at Bonferroni p < 8.1×10⁻⁴, denoted by asterisks) and were present in multiple cohorts and ancestries.
Same covariate groupings and transformations were performed as in Figure 1. Similarly, actual values were used when variables had comparable units across cohorts, and standard deviations (SD) were used otherwise.

Figure 4. Relationships (Pearson correlations weighted by sample size) between maximum R² differences across strata, main effects of covariate on log(BMI), and PGS-covariate interaction effects on log(BMI). Main effect units are in standard deviations; interaction effect units are in PGS standard deviations multiplied by covariate standard deviations. Only continuous variables are plotted and modeled. GERA was excluded due to slightly different phenotype definitions.

Figure 5. Quantile regression effects of PGS_BMI (in units of log(BMI)) on log(BMI) at each decile of BMI in each cohort and ancestry. The effect of PGS_BMI increases as BMI itself increases, suggesting that no individual covariate-PGS interaction is responsible for the nonlinear effect of PGS_BMI.

S Figure 2. Three sets of simulated data with varying regression line slopes, showing how model R² changes when the regression line slope changes, all else being equal. Residuals were sampled from a normal distribution (mean=0, sigma=sqrt(π/2)) to give mean squared error=1. For each line, 5,000 x-values were sampled, uniformly distributed from 0 to 10. Despite having the same mean squared error, model R² increases as beta increases.

S Figure 3. Univariable association of PGS_BMI and log(BMI) in European UKBB, separately for the bottom and top quintiles of age. R² is higher in younger individuals, which is partially a consequence of the larger effect (as shown in S Figure 2), despite the mean squared error actually being higher.

Figure 6. Model R² from different machine learning models across cohorts and ancestries using age and gender as covariates (along with PGS_BMI and PCs 1-5).
Across all cohorts and ancestries, LASSO with PGS-age and PGS-gender interaction terms had better average 10-fold cross-validation R² than LASSO without interaction terms, while neural networks outperformed LASSO models.

Figure 7. PGS R² based on three GWAS setups. “Main effects” were from a typical main effect GWAS, “GxAge” effects were from a GWAS with a SNP-age interaction term, and “Age stratified” GWAS had main effects only but were conducted in four age quartiles. PGS R² was evaluated using two models: one with main effects only, and one with an additional PGS*Age interaction term.

S Figure 1. PGS-covariate interaction term -log₁₀(p-values) in UKBB EUR, with and without including the covariate PGS in the model; the mean -log₁₀(p) is reduced from 18.0899 to 14.97072 with their inclusion. Note that age and sex PGS were not calculated, and their interaction p-values are excluded from this figure.
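
The simulation described in the S Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' code. The sample size, x-range, and residual sigma follow the caption, whereas the three slope values, the function name `simulate_r2`, and the fixed seed are assumptions made for illustration. Reusing one seed gives every slope the same x-values and residuals, so only the slope differs between runs.

```python
import numpy as np

def simulate_r2(slope, n=5000, sigma=np.sqrt(np.pi / 2), seed=0):
    """Simulate y = slope * x + noise and return (fitted slope, MSE, R^2)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n)      # x-values uniform on 0-10, as in the caption
    noise = rng.normal(0.0, sigma, size=n)  # normal residuals with the caption's sigma
    y = slope * x + noise

    # Ordinary least-squares line; polyfit returns the highest-degree coefficient first.
    beta, intercept = np.polyfit(x, y, deg=1)
    fitted = beta * x + intercept

    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    return beta, mse, r2

# Hypothetical slopes; the caption does not state the values used in the figure.
slopes = (0.25, 0.5, 1.0)
results = {s: simulate_r2(s) for s in slopes}

for s in slopes:
    beta, mse, r2 = results[s]
    print(f"slope={s:.2f}  fitted beta={beta:.3f}  MSE={mse:.3f}  R2={r2:.3f}")
```

Because the residual sum of squares is the same for every slope while the total variance of y grows with the slope, R² = 1 - SS_res/SS_tot rises as beta increases. This is the behavior the caption describes.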