Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations
Figures
![](https://iiif.elifesciences.org/lax:81916%2Felife-81916-fig1-v2.tif/full/617,/0/default.jpg)
Total number of forecasts included in the evaluation, by target location, week-ahead horizon, and variable.
![](https://iiif.elifesciences.org/lax:81916%2Felife-81916-fig2-v2.tif/full/617,/0/default.jpg)
Ensemble forecasts of weekly incident cases in Germany over periods of increasing SARS-CoV-2 variants Delta (B.1.617.2, left) and Omicron (B.1.1.529, right).
Black indicates observed data. Coloured ribbons represent each weekly forecast of 1–4 weeks ahead (showing the median and the 50% and 90% probability intervals). For each variant, forecasts are shown over an x-axis bounded by the earliest dates at which 5% and 99% of sequenced cases were identified as the respective variant of concern. Vertical dotted lines indicate the approximate date on which the variant reached dominance (>50% of sequenced cases).
![](https://iiif.elifesciences.org/lax:81916%2Felife-81916-fig3-v2.tif/full/617,/0/default.jpg)
Performance of short-term forecasts aggregated across all individually submitted models and the Hub ensemble, by horizon, forecasting cases (left) and deaths (right).
Performance is measured by the relative weighted interval score scaled against a baseline (dotted line, 1), and by coverage of uncertainty at the 50% and 95% levels. Boxplots, with width proportional to the number of observations, show interquartile ranges, with outlying scores as faded points. The target range for each set of scores is shaded in yellow.
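As a rough illustration of the two metrics named in this caption, the sketch below computes a weighted interval score from a median and a set of central prediction intervals, scales it against a baseline forecast, and measures empirical interval coverage. All function names and numbers are hypothetical. The simple ratio used for the relative score is an assumption made for brevity; the study's own scaling procedure may differ.

```python
def interval_score(lower, upper, observed, alpha):
    """Interval score for a central (1 - alpha) prediction interval."""
    width = upper - lower
    # Penalties apply only when the observation falls outside the interval.
    under = (2.0 / alpha) * max(lower - observed, 0.0)
    over = (2.0 / alpha) * max(observed - upper, 0.0)
    return width + under + over


def weighted_interval_score(median, intervals, observed):
    """Weighted interval score; `intervals` maps alpha -> (lower, upper)."""
    total = 0.5 * abs(observed - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, observed, alpha)
    return total / (len(intervals) + 0.5)


def coverage(intervals, observations):
    """Share of observations that fall inside their (lower, upper) interval."""
    hits = [lo <= y <= hi for (lo, hi), y in zip(intervals, observations)]
    return sum(hits) / len(hits)


# Hypothetical forecasts for one target; alpha=0.5 is the 50% interval,
# alpha=0.05 is the 95% interval.
observed = 120.0
model = weighted_interval_score(100.0, {0.5: (90.0, 110.0), 0.05: (70.0, 130.0)}, observed)
baseline = weighted_interval_score(100.0, {0.5: (50.0, 150.0), 0.05: (0.0, 200.0)}, observed)

# Simplified relative score: values below 1 mean the model beat the baseline.
relative = model / baseline
```

Under this simplification, a relative score below 1 corresponds to points left of or below the dotted baseline line, and a well-calibrated model would have `coverage` close to 0.5 for its 50% intervals and 0.95 for its 95% intervals.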
![](https://iiif.elifesciences.org/lax:81916%2Felife-81916-fig4-v2.tif/full/617,/0/default.jpg)
Performance of short-term forecasts across models and the median ensemble (asterisk), by country, forecasting cases (top) and deaths (bottom) for 2-week-ahead forecasts, according to the relative weighted interval score.
Boxplots show interquartile ranges, with outliers as faded points, and the ensemble model performance is marked by an asterisk. The y-axis is cut off at an upper bound of 4 for readability.
Tables
Predictive performance of main ensembles, as measured by the mean ratio of interval scores against the baseline ensemble.
Horizon | Weighted mean | Weighted median | Unweighted mean | Unweighted median |
---|---|---|---|---|
Cases | ||||
1 week | 0.63 | 0.64 | 0.61 | 0.64 |
2 weeks | 0.72 | 0.71 | 0.69 | 0.69 |
3 weeks | 0.82 | 0.76 | 0.82 | 0.72 |
4 weeks | 1.07 | 0.86 | 1.12 | 0.78 |
Deaths | ||||
1 week | 0.65 | 0.61 | 1.81 | 0.61 |
2 weeks | 0.58 | 0.54 | 1.29 | 0.54 |
3 weeks | 0.64 | 0.57 | 1.17 | 0.53 |
4 weeks | 0.82 | 0.67 | 0.84 | 0.62 |
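
The mean and median columns above refer to different ways of combining individual models into an ensemble. A minimal sketch of the unweighted variants follows, assuming each model submits a value at the same set of quantile levels and the ensemble combines them level by level. The function name and all numbers are hypothetical, and the weighted variants are omitted because they depend on a weighting scheme not described in this section.

```python
from statistics import mean, median


def quantile_ensemble(forecasts, combine):
    """Combine forecasts quantile by quantile.

    `forecasts` is a list of dicts mapping quantile level -> predicted value;
    `combine` is a function such as `mean` or `median`.
    """
    levels = sorted(forecasts[0])
    return {q: combine([f[q] for f in forecasts]) for q in levels}


# Three hypothetical models; the third is far above the other two.
models = [
    {0.25: 90.0, 0.5: 100.0, 0.75: 110.0},
    {0.25: 95.0, 0.5: 105.0, 0.75: 120.0},
    {0.25: 400.0, 0.5: 500.0, 0.75: 650.0},
]

mean_ens = quantile_ensemble(models, mean)
median_ens = quantile_ensemble(models, median)
```

In this toy example the single extreme model pulls every quantile of the mean ensemble upward, while the median ensemble stays with the two models that agree.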
Additional files
- Supplementary file 1: EPIFORGE reporting guidelines. Completed checklist following reporting guidelines on epidemic forecasting research.
  https://cdn.elifesciences.org/articles/81916/elife-81916-supp1-v2.csv
- Supplementary file 2: Participating team metadata. Team metadata for teams participating in the European Forecast Hub and evaluated in this study.
  https://cdn.elifesciences.org/articles/81916/elife-81916-supp2-v2.csv
- MDAR checklist
  https://cdn.elifesciences.org/articles/81916/elife-81916-mdarchecklist1-v2.pdf