Research Article

Epidemiology and Global Health

Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations

London School of Hygiene & Tropical Medicine, United Kingdom
European Centre for Disease Prevention and Control (ECDC), Sweden
Karlsruhe Institute of Technology, Germany
Robert Koch Institute, Germany
University of Massachusetts Amherst, United States
Boston Children’s Hospital and Harvard Medical School, United States
Third Faculty of Medicine, Charles University, Czech Republic
Ecole Polytechnique Federale de Lausanne, Switzerland
Éducation nationale, France
Eidgenossische Technische Hochschule, Switzerland
Forschungszentrum Jülich GmbH, Germany
Frankfurt Institute for Advanced Studies, Germany
Fraunhofer Institute for Industrial Mathematics, Germany
Heidelberg University, Germany
Helmholtz Centre for Infection Research, Germany
IEM, Inc, United States
Independent researcher, Austria
Independent researcher, United States
Institut d’Investigacions Biomèdiques August Pi i Sunyer, Universitat Pompeu Fabra, Spain
Institute of Computer Science of the CAS, Czech Republic
Institute of Information Theory and Automation of the CAS, Czech Republic
Inverence, Spain
Los Alamos National Laboratory, United States
LUMSA University, Italy
Masaryk University, Czech Republic
Massachusetts Institute of Technology, United States
Max-Planck-Institut für Dynamik und Selbstorganisation, Germany
Medical University of Gdansk, Poland
Paul Scherrer Institute, Switzerland
Politecnico di Milano, Italy
Technical University of Kaiserslautern, Germany
Technische Universität Ilmenau, Germany
Universidad Carlos III de Madrid, Spain
Universidad Nacional de Educación a Distancia (UNED), Spain
Universitat de Barcelona, Spain
Universitat Politècnica de Catalunya, Spain
Universitat Trier, Germany
University of Cologne, Germany
University of Halle, Germany
University of Ljubljana, Slovenia
University of Milano-Bicocca, Italy
University of Molise, Italy
University of Oxford, United Kingdom
University of Palermo, Italy
University of Pavia, Italy
University of Perugia, Italy
University of Rome "La Sapienza", Italy
University of Rome "Tor Vergata", Italy
University of Southern California, United States
University of Sydney, Australia
University of Virginia, United States
University of Warsaw, Poland
University of Bialystok, Poland
University of Wroclaw, Poland
Universität Leipzig, Germany
Warsaw University of Technology, Poland
Wroclaw University of Science and Technology, Poland

Apr 21, 2023

https://doi.org/10.7554/eLife.81916

Abstract

Background:

Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here, we report on the performance of ensembles in predicting COVID-19 cases and deaths across Europe between 08 March 2021 and 07 March 2022.

Methods:

We used open-source tools to develop a public European COVID-19 Forecast Hub. We invited groups globally to contribute weekly forecasts for COVID-19 cases and deaths reported by a standardised source for 32 countries over the next 1–4 weeks. Teams submitted forecasts from March 2021 using standardised quantiles of the predictive distribution. Each week we created an ensemble forecast, where each predictive quantile was calculated as the equally-weighted average (initially the mean and then from 26th July the median) of all individual models’ predictive quantiles. We measured the performance of each model using the relative Weighted Interval Score (WIS), comparing models’ forecast accuracy relative to all other models. We retrospectively explored alternative methods for ensemble forecasts, including weighted averages based on models’ past predictive performance.

Results:

Over 52 weeks, we collected forecasts from 48 unique models. We evaluated 29 models’ forecast scores in comparison to the ensemble model. We found a weekly ensemble had a consistently strong performance across countries over time. Across all horizons and locations, the ensemble performed better on relative WIS than 83% of participating models’ forecasts of incident cases (with a total N=886 predictions from 23 unique models), and 91% of participating models’ forecasts of deaths (N=763 predictions from 20 models). Across a 1–4 week time horizon, ensemble performance declined with longer forecast periods when forecasting cases, but remained stable over 4 weeks for incident death forecasts. In every forecast across 32 countries, the ensemble outperformed most contributing models when forecasting either cases or deaths, frequently outperforming all of its individual component models. Among several choices of ensemble methods we found that the most influential and best choice was to use a median average of models instead of using the mean, regardless of methods of weighting component forecast models.

Conclusions:

Our results support combining forecasts from individual models into an ensemble in order to improve predictive performance across epidemiological targets and populations during infectious disease epidemics. Our findings further suggest that median ensemble methods yield better predictive performance than ones based on means. Our findings also highlight that forecast consumers should place more weight on incident death forecasts than incident case forecasts at forecast horizons greater than 2 weeks.

Funding:

AA, BH, BL, LWa, MMa, PP, SV funded by National Institutes of Health (NIH) Grant 1R01GM109718, NSF BIG DATA Grant IIS-1633028, NSF Grant No.: OAC-1916805, NSF Expeditions in Computing Grant CCF-1918656, CCF-1917819, NSF RAPID CNS-2028004, NSF RAPID OAC-2027541, US Centers for Disease Control and Prevention 75D30119C05935, a grant from Google, University of Virginia Strategic Investment Fund award number SIF160, Defense Threat Reduction Agency (DTRA) under Contract No. HDTRA1-19-D-0007, and respectively Virginia Dept of Health Grant VDH-21-501-0141, VDH-21-501-0143, VDH-21-501-0147, VDH-21-501-0145, VDH-21-501-0146, VDH-21-501-0142, VDH-21-501-0148. AF, AMa, GL funded by SMIGE - Modelli statistici inferenziali per governare l'epidemia, FISR 2020-Covid-19 I Fase, FISR2020IP-00156, Codice Progetto: PRJ-0695. AM, BK, FD, FR, JK, JN, JZ, KN, MG, MR, MS, RB funded by Ministry of Science and Higher Education of Poland with grant 28/WFSN/2021 to the University of Warsaw. BRe, CPe, JLAz funded by Ministerio de Sanidad/ISCIII. BT, PG funded by PERISCOPE European H2020 project, contract number 101016233. CP, DL, EA, MC, SA funded by European Commission - Directorate-General for Communications Networks, Content and Technology through the contract LC-01485746, and Ministerio de Ciencia, Innovacion y Universidades and FEDER, with the project PGC2018-095456-B-I00. DE., MGu funded by Spanish Ministry of Health / REACT-UE (FEDER). DO, GF, IMi, LC funded by Laboratory Directed Research and Development program of Los Alamos National Laboratory (LANL) under project number 20200700ER. DS, ELR, GG, NGR, NW, YW funded by National Institutes of General Medical Sciences (R35GM119582; the content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS or the National Institutes of Health). FB, FP funded by InPresa, Lombardy Region, Italy. HG, KS funded by European Centre for Disease Prevention and Control. 
IV funded by Agencia de Qualitat i Avaluacio Sanitaries de Catalunya (AQuAS) through contract 2021-021OE. JDe, SMo, VP funded by Netzwerk Universitatsmedizin (NUM) project egePan (01KX2021). JPB, SH, TH funded by Federal Ministry of Education and Research (BMBF; grant 05M18SIA). KH, MSc, YKh funded by Project SaxoCOV, funded by the German Free State of Saxony. Presentation of data, model results and simulations also funded by the NFDI4Health Task Force COVID-19 (https://www.nfdi4health.de/task-force-covid-19-2) within the framework of a DFG-project (LO-342/17-1). LP, VE funded by Mathematical and Statistical modelling project (MUNI/A/1615/2020), Online platform for real-time monitoring, analysis and management of epidemic situations (MUNI/11/02202001/2020); VE also supported by RECETOX research infrastructure (Ministry of Education, Youth and Sports of the Czech Republic: LM2018121), the CETOCOEN EXCELLENCE (CZ.02.1.01/0.0/0.0/17-043/0009632), RECETOX RI project (CZ.02.1.01/0.0/0.0/16-013/0001761). NIB funded by Health Protection Research Unit (grant code NIHR200908). SAb, SF funded by Wellcome Trust (210758/Z/18/Z).

Editor's evaluation

This large-scale collaborative study is a timely contribution that will be of interest to researchers working in the fields of infectious disease forecasting and epidemic control. This paper provides a comprehensive evaluation of the predictive skills of real-time COVID-19 forecasting models in Europe. The conclusions of the paper are well supported by the data and are consistent with findings from studies in other countries.

https://doi.org/10.7554/eLife.81916.sa0

Introduction

Epidemiological forecasts make quantitative statements about a disease outcome in the near future. Forecasting targets can include measures of prevalent or incident disease and its severity, for some population over a specified time horizon. Researchers, policy makers, and the general public have used such forecasts to understand and respond to the global outbreaks of COVID-19 (Van Basshuysen et al., 2021; CDC, 2020; European Centre for Disease Prevention and Control, 2021c). At the same time, forecasters use a variety of methods and models for creating and publishing forecasts, varying in both defining the forecast outcome and in reporting the probability distribution of outcomes (Zelner et al., 2021; James et al., 2021).

Within Europe, comparing forecasts across both models and countries can support a range of national policy needs simultaneously. European public health professionals operate across national, regional, and continental scales, with strong existing policy networks in addition to rich patterns of cross-border migration influencing epidemic dynamics. A majority of European countries also cooperate in setting policy with inter-governmental European bodies such as the European Centre for Disease Prevention and Control (ECDC). In this case, a consistent approach to forecasting across the continent as a whole can support accurately informing cross-European monitoring, analysis, and guidance (European Centre for Disease Prevention and Control, 2021c). At a regional level, multi-country forecasts can support a better understanding of the impact of regional migration networks. Meanwhile, where there is limited capacity for infectious disease forecasting at a national level, forecasters generating multi-country results can provide an otherwise-unavailable opportunity for forecasts to inform national situational awareness. Some independent forecasting models have sought to address this by producing multi-country results (Aguas et al., 2020; Adib et al., 2021; Agosto and Giudici, 2020; Agosto et al., 2021).

Variation in forecast methods and presentation makes it difficult to compare predictive performance between forecast models, and from there to derive objective arguments for using one forecast over another. This confounds the selection of a single representative forecast and reduces the reliability of the evidence base for decisions based on forecasts. A ‘forecast hub’ is a centralised effort to improve the transparency and usefulness of forecasts, by standardising and collating the work of many independent teams producing forecasts (Reich et al., 2019a). A hub sets a commonly agreed-upon structure for forecast targets, such as type of disease event, spatio-temporal units, or the set of quantiles of the probability distribution to include from probabilistic forecasts. For instance, a hub may collect predictions of the total number of cases reported in a given country for each day in the next 2 weeks. Forecasters can adopt this format and contribute forecasts for centralised storage in the public domain.

This shared infrastructure allows forecasts produced from diverse teams and methods to be visualised and quantitatively compared on a like-for-like basis, which can strengthen public and policy use of disease forecasts. The underlying approach to creating a forecast hub was pioneered in climate modelling and adapted for collaborative epidemiological forecasts of dengue (Johansson et al., 2019) and influenza in the USA (Reich et al., 2019a; Reich et al., 2019b). This infrastructure was adapted for forecasts of short-term COVID-19 cases and deaths in the US (Cramer et al., 2021a; Ray et al., 2020), prompting similar efforts in some European countries (Bracher et al., 2021c; Funk et al., 2020; Bicher et al., 2020).

Standardising forecasts allows for combining multiple forecasts into a single ensemble with the potential for an improved predictive performance. Evidence from previous efforts in multi-model infectious disease forecasting suggests that forecasts from an ensemble of models can be consistently high performing compared to any one of the component models (Johansson et al., 2019; Reich et al., 2019b; Viboud et al., 2018). Elsewhere, weather forecasting has a long-standing use of building ensembles of models using diverse methods with standardised data and formatting in order to improve performance (Buizza, 2019; Moran et al., 2016).

The European COVID-19 Forecast Hub (European Covid-19 Forecast Hub, 2023d) is a project to collate short-term forecasts of COVID-19 across 32 countries in the European region. The Hub is funded and supported by the ECDC, with the primary aim to provide reliable information about the near-term epidemiology of the COVID-19 pandemic to the research and policy communities and the general public (European Centre for Disease Prevention and Control, 2021c). Second, the Hub aims to create infrastructure for storing and analysing epidemiological forecasts made in real time by diverse research teams and methods across Europe. Third, the Hub aims to maintain a community of infectious disease modellers underpinned by open science principles.

We started formally collating and combining contributions to the European Forecast Hub in March 2021. Here, we investigate the predictive performance of an ensemble of all forecasts contributed to the Hub in real time each week, as well as the performance of variations of ensemble methods created retrospectively.

Materials and methods

We developed infrastructure to host and analyse prospective forecasts of COVID-19 cases and deaths. The infrastructure is compatible with equivalent research software from the US (Cramer et al., 2021c; Wang et al., 2021) and the German and Polish (Bracher et al., 2020) COVID-19 Forecast Hubs, and is easy to replicate for new forecasting collaborations.

Forecast targets and models

We sought forecasts for the incidence of COVID-19 as the total reported number of cases and deaths per week. We considered forecasts for 32 countries in Europe, including all countries of the European Union, European Free Trade Area, and the United Kingdom. We compared forecasts against observed data reported for each country by Johns Hopkins University (JHU, Dong et al., 2020). JHU data sources included a mix of national and aggregated subnational data. We aggregated incidence over the Morbidity and Mortality Weekly Report (MMWR) epidemiological week definition of Sunday through Saturday.
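The weekly aggregation described above can be sketched as follows. This is a minimal illustration, assuming daily counts keyed by calendar date; the function names `epiweek_end` and `weekly_incidence` are hypothetical and are not taken from the Hub codebase.

```python
from collections import defaultdict
from datetime import date, timedelta


def epiweek_end(day):
    """Return the Saturday closing the Sunday-to-Saturday week that contains `day`."""
    # date.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
    days_until_saturday = (5 - day.weekday()) % 7
    return day + timedelta(days=days_until_saturday)


def weekly_incidence(daily_counts):
    """Sum daily counts into weekly totals keyed by the week-ending Saturday."""
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        totals[epiweek_end(day)] += count
    return dict(totals)


# Illustrative (made-up) daily counts spanning two epidemiological weeks.
example_daily = {
    date(2021, 3, 7): 10,   # Sunday, opens the week ending 13 March
    date(2021, 3, 13): 5,   # Saturday, closes the same week
    date(2021, 3, 14): 7,   # Sunday, opens the following week
}
example_weekly = weekly_incidence(example_daily)
```

Keying each week by its final Saturday is one possible convention; it is an assumption of this sketch rather than a documented requirement.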

Teams could express their uncertainty around any single forecast target by submitting predictions for up to 23 quantiles (from 0.01 to 0.99) of the predictive probability distribution. Teams could also submit a single point forecast. At the first submission, we asked teams to add a pre-specified set of metadata briefly describing the forecasting team and methods (provided online and in supplementary information). No restrictions were placed on who could submit forecasts. To increase participation, we actively contacted known forecasting teams across Europe and the US and advertised among the ECDC network. Teams submitted a broad spectrum of model types, ranging from mechanistic to empirical models, agent-based and statistical models, and ensembles of multiple quantitative or qualitative models (described at European Covid-19 Forecast Hub, 2023a). We maintain a full project specification with a detailed submissions protocol (European Covid-19 Forecast Hub, 2023c).

We collected forecasts submitted weekly in real time over the 52-week period from 08 March 2021 to 07 March 2022. Teams submitted forecasts at the latest 2 days after the complete dataset for the latest forecasting week became available each Sunday. We implemented an automated validation programme to check that each new forecast conformed to standardised formatting. Forecast validation ensured a monotonic increase of predictions with each increasing quantile, integer-valued non-negative counts of predicted cases, as well as consistent date and location definitions.
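The kinds of checks listed above could look like the following sketch. It is not the Hub's validation programme: the function name `validate_forecast` and the error messages are hypothetical, and "monotonic increase" is interpreted here as non-decreasing values.

```python
def validate_forecast(quantile_levels, values):
    """Return a list of problems found in one quantile forecast (empty if valid)."""
    if len(quantile_levels) != len(values):
        return ["number of quantile levels and predicted values differ"]

    problems = []
    # Sort by quantile level so that monotonicity can be checked pairwise.
    pairs = sorted(zip(quantile_levels, values))
    for level, value in pairs:
        if not 0 < level < 1:
            problems.append(f"quantile level {level} is outside (0, 1)")
        if value < 0:
            problems.append(f"negative count {value} at quantile {level}")
        if value != int(value):
            problems.append(f"non-integer count {value} at quantile {level}")

    # Predictions must not decrease as the quantile level increases.
    for (_, lower_value), (_, upper_value) in zip(pairs, pairs[1:]):
        if upper_value < lower_value:
            problems.append("predictions decrease as the quantile level increases")
            break
    return problems
```

Returning a list of problems rather than raising on the first failure is a design choice of this sketch, so that a submitting team could see every issue at once.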

Each week we used all available valid forecasts to create a weekly real-time ensemble model (referred to as ‘the ensemble’ from here on), for each of the 256 possible forecast targets: incident cases and deaths in 32 locations over the following 1 through 4 weeks. The ensemble method was an unweighted average of all models’ forecast values, at each predictive quantile for a given location, target, and horizon. From 08 March 2021, we used the arithmetic mean. However, we noticed that including highly anomalous forecasts in a mean ensemble produced extremely wide uncertainty. To mitigate this, from 26th July 2021 onwards the ensemble instead used a median of all predictive quantiles.
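To illustrate the difference between the two unweighted ensembles, the sketch below combines quantile forecasts for a single target. The function `quantile_ensemble`, the model names, and all numbers are made up for illustration; models lacking any requested quantile level are skipped, in line with the exclusion of incomplete submissions from the ensemble.

```python
from statistics import mean, median


def quantile_ensemble(model_forecasts, levels, method="median"):
    """Combine per-model quantile forecasts for one target into an ensemble.

    model_forecasts maps a model name to a {quantile_level: value} dict.
    """
    combine = {"mean": mean, "median": median}[method]
    # Keep only models that supply every requested quantile level.
    complete = [f for f in model_forecasts.values()
                if all(level in f for level in levels)]
    if not complete:
        raise ValueError("no model provides the full set of quantile levels")
    return {level: combine([f[level] for f in complete]) for level in levels}


# Hypothetical forecasts: model_c is highly anomalous, model_d is incomplete.
example_forecasts = {
    "model_a": {0.25: 90, 0.5: 100, 0.75: 110},
    "model_b": {0.25: 95, 0.5: 105, 0.75: 120},
    "model_c": {0.25: 100, 0.5: 1000, 0.75: 5000},
    "model_d": {0.5: 1},
}
example_levels = [0.25, 0.5, 0.75]
median_ensemble = quantile_ensemble(example_forecasts, example_levels, "median")
mean_ensemble = quantile_ensemble(example_forecasts, example_levels, "mean")
```

In this toy example the single anomalous model pulls the upper quantile of the mean ensemble far above the other forecasts, while the median ensemble stays close to the two models that agree.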

We created an open and publicly accessible interface to the forecasts and ensemble, including an online visualisation tool allowing viewers to see past data and interact with one or multiple forecasts for each country and target for up to 4 weeks’ horizon (European Covid-19 Forecast Hub, 2023b). All forecasts, metadata, and evaluations are freely available and held on GitHub (European Covid-19 Forecast Hub, 2023d) (archived in real time at Sherratt, 2022), and Zoltar, a platform for hosting epidemiological forecasts (EpiForecasts, 2021; Reich et al., 2021). In the codebase for this study (covid19-forecast-hub-europe, 2022) we provide a simple method and instructions for downloading and preparing these data for analysis using R. We encourage other researchers to freely use and adapt this to support their own analyses.

Forecast evaluation

In this study, we focused only on the comparative performance of forecasting models relative to each other. Performance in absolute terms is available on the Hub website (European Covid-19 Forecast Hub, 2023b). For each model, we assessed calibration and overall predictive performance. We evaluated all previous forecasts against actual observed values for each model, stratified by the forecast horizon, location, and target. We calculated scores using the scoringutils R package (Bosse et al., 2023). We removed any forecast surrounding (both the week of, and the first week after) a strongly anomalous data point. We defined anomalous as where any subsequent data release revised that data point by over 5%.
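The 5% revision rule can be sketched as below. This is an assumption-laden illustration rather than the study's code: the names `is_anomalous` and `weeks_to_exclude` are hypothetical, revisions are measured relative to the first reported value, and weeks are keyed by their week-ending date.

```python
from datetime import date, timedelta


def is_anomalous(first_reported, later_reports, threshold=0.05):
    """True if any later data release differs from the first by more than `threshold`."""
    return any(abs(later - first_reported) > threshold * abs(first_reported)
               for later in later_reports)


def weeks_to_exclude(revisions, threshold=0.05):
    """Return the week-ending dates to drop from evaluation.

    revisions maps a week-ending date to (first_reported, [later_reports]).
    Both the anomalous week and the week after it are flagged.
    """
    excluded = set()
    for week_end, (first_reported, later_reports) in revisions.items():
        if is_anomalous(first_reported, later_reports, threshold):
            excluded.add(week_end)
            excluded.add(week_end + timedelta(weeks=1))
    return excluded


# Made-up revision history: only the second week is revised by more than 5%.
example_revisions = {
    date(2021, 3, 13): (1000, [1010, 1040]),
    date(2021, 3, 20): (1000, [1000, 1100]),
}
example_excluded = weeks_to_exclude(example_revisions)
```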

To investigate calibration, we assessed coverage as the correspondence between the forecast probability of an event and the observed frequency of that event. This usage follows previous work in epidemic forecasting (Bracher et al., 2021a), and is related to the concept of reliability for binary forecasts. We established the accuracy of each model’s prediction boundaries as the coverage of the predictive intervals. We calculated coverage at a given interval level $k$, where $k \in [0, 1]$, as the proportion $p$ of observations that fell within the corresponding central predictive intervals across locations and forecast dates. A perfectly calibrated model would have $p = k$ at all 11 levels (corresponding to 22 quantiles excluding the median). An underconfident model at level $k$ would have $p > k$, i.e. more observations fall within a given interval than expected. In contrast, an overconfident model at level $k$ would have $p < k$, i.e. fewer observations fall within a given interval than expected. We here focus on coverage at the $k = 0.5$ and $k = 0.95$ levels.
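As a concrete reading of this definition, the sketch below computes the proportion $p$ for one interval level $k$. The function name `interval_coverage` and the data layout (a list of quantile dictionaries paired with observations) are assumptions made for illustration.

```python
def interval_coverage(forecasts, level):
    """Proportion of observations inside the central predictive interval at `level`.

    forecasts is a list of (quantiles, observed) pairs, where quantiles maps
    a quantile level to its predicted value.
    """
    # A central interval at level k spans the (1 - k) / 2 and 1 - (1 - k) / 2 quantiles.
    lower_level = round((1 - level) / 2, 3)
    upper_level = round(1 - (1 - level) / 2, 3)
    hits = [quantiles[lower_level] <= observed <= quantiles[upper_level]
            for quantiles, observed in forecasts]
    return sum(hits) / len(hits)


# Made-up forecasts with identical 50% intervals and four different observations.
example_interval = {0.25: 10, 0.75: 20}
example_coverage = interval_coverage(
    [(example_interval, 15), (example_interval, 25),
     (example_interval, 12), (example_interval, 5)],
    level=0.5,
)
```

Here two of the four observations fall inside the interval, so $p = 0.5$ matches $k = 0.5$; a value above 0.5 would indicate underconfidence and a value below 0.5 overconfidence at this level.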

We also assessed the overall predictive performance of weekly forecasts using the Weighted Interval Score (WIS) across all available quantiles. The WIS represents a parsimonious approach to scoring forecasts based on uncertainty represented as forecast values across a set of quantiles (Bracher et al., 2021a), and is a strictly proper scoring rule, that is, it is optimal for predictions that come from the data-generating model. As a consequence, the WIS encourages forecasters to report predictions representing their true belief about the future (Gneiting and Raftery, 2007). Each forecast for a given location and date is scored based on an observed count of weekly incidence, the median of the predictive distribution and the predictive upper and lower quantiles corresponding to the central predictive interval level.
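For readers unfamiliar with the score, the sketch below follows the standard definition of the WIS given in Bracher et al., 2021a: a weighted sum of the absolute error of the median and the interval scores of each central predictive interval. It is an illustration only (the study itself used the scoringutils R package), and the function names are hypothetical.

```python
def interval_score(lower, upper, observed, alpha):
    """Interval score of a central (1 - alpha) predictive interval."""
    width = upper - lower
    # Penalties apply only when the observation falls outside the interval.
    below = (2 / alpha) * max(lower - observed, 0)
    above = (2 / alpha) * max(observed - upper, 0)
    return width + below + above


def weighted_interval_score(quantiles, observed):
    """WIS for one forecast given as {quantile_level: value}.

    Assumes the median (0.5) and symmetric pairs of quantile levels are present.
    """
    lower_levels = sorted(level for level in quantiles if level < 0.5)
    total = 0.5 * abs(observed - quantiles[0.5])
    for lower_level in lower_levels:
        alpha = 2 * lower_level
        upper_level = round(1 - lower_level, 3)
        total += (alpha / 2) * interval_score(
            quantiles[lower_level], quantiles[upper_level], observed, alpha
        )
    return total / (len(lower_levels) + 0.5)
```

A useful sanity check of this definition is that a forecast whose quantiles are all equal (a point forecast) receives a WIS equal to its absolute error, and that lower scores are better.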

Not all models provided forecasts for all locations and dates, and we needed to compare predictive performance in the face of various levels of missingness across each forecast target. Therefore we calculated a relative WIS. This is a measure of forecast performance which takes into account that different teams may not cover the same set of forecast targets (i.e. weeks and locations). The relative WIS is computed using a pairwise comparison tournament where for each pair of models a mean score ratio is computed based on the set of shared targets. For each model, we then take the geometric mean of its mean score ratios against all other models. The relative WIS of a model with respect to another model is the ratio of their respective geometric means, such that smaller values indicate better performance.

We scaled the relative WIS of each model with the relative WIS of a baseline model, for each forecast target, location, date, and horizon. The baseline model assumes case or death counts stay the same as the latest data point over all future horizons, with expanding uncertainty, described previously in Cramer et al., 2021b. In this study, we report the relative WIS of each model with respect to the baseline model.
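The pairwise tournament and the scaling against the baseline can be sketched as follows. This is a simplified illustration, not the scoringutils implementation: the function names are hypothetical, each model's ratio against itself (equal to 1) is included in its geometric mean, and all mean scores are assumed to be positive.

```python
from math import prod


def mean_score_ratio(scores_a, scores_b):
    """Ratio of mean scores of two models over the targets both have forecast."""
    shared = scores_a.keys() & scores_b.keys()
    if not shared:
        return None
    mean_a = sum(scores_a[target] for target in shared) / len(shared)
    mean_b = sum(scores_b[target] for target in shared) / len(shared)
    return mean_a / mean_b


def scaled_relative_scores(scores, baseline):
    """Relative score of each model, scaled so that the baseline model equals 1.

    scores maps a model name to a {target: score} dict.
    """
    geometric_means = {}
    for model, model_scores in scores.items():
        ratios = [ratio for other in scores.values()
                  if (ratio := mean_score_ratio(model_scores, other)) is not None]
        geometric_means[model] = prod(ratios) ** (1 / len(ratios))
    return {model: value / geometric_means[baseline]
            for model, value in geometric_means.items()}


# Made-up scores: model_b only forecast one of the two targets.
example_scores = {
    "baseline": {"t1": 10, "t2": 20},
    "model_a": {"t1": 5, "t2": 10},
    "model_b": {"t2": 30},
}
example_relative = scaled_relative_scores(example_scores, baseline="baseline")
```

In this toy example model_a scores half the baseline on every shared target and receives a scaled relative score of 0.5, while model_b receives 1.5 even though it was only compared on the single target it forecast.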

Retrospective ensemble methods

We retrospectively explored alternative methods for combining forecasts for each target at each week. A natural way to combine probability distributions available in the quantile format used here (Genest, 1992) is

$$F^{-1}(\alpha) = \sum_{i=1}^{n} w_{i} F_{i}^{-1}(\alpha),$$

where $F_{1}, \dots, F_{n}$ are the cumulative distribution functions of the individual probability distributions (in our case, the predictive distributions of each forecast model $i$ contributed to the hub), $w_{i}$ are a set of weights in $[0, 1]$, and $\alpha$ are the quantile levels, such that, following notation introduced in Genest, 1992,

$$F_{i}^{-1}(\alpha) = \inf \{t : F_{i}(t) \geq \alpha\}.$$

Different ensemble choices then mainly translate to the choice of weights $w_{i}$. An arithmetic mean ensemble uses weights $w_{i} = 1/n$, where all weights are equal and sum to 1.

Alternatively, we can choose a set of weights to apply to forecasts before they are combined. Numerous options exist for choosing these weights with the aim to maximise predictive performance, including choosing weights to reflect each forecast’s past performance (thereby moving from an untrained to a trained ensemble). A straightforward choice is so-called inverse score weighting. In this case, the weights are calculated as

$$w_{i} = \frac{1}{S_{i}},$$

where $S_{i}$ reflects the forecasting skill calculated as the relative WIS of forecaster $i$, calculated over all available model data, and normalised so that weights sum to 1. This method of weighting was found in the US to outperform unweighted scores during some time periods (Taylor and Taylor, 2023) but this was not confirmed in a similar study in Germany and Poland (Bracher et al., 2021c).
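A minimal sketch of inverse score weighting, applied to a weighted mean of one quantile, is shown below. The function names and all numbers are hypothetical; the relative WIS values are assumed to be positive.

```python
def inverse_score_weights(relative_wis):
    """Weights proportional to 1 / S_i, normalised to sum to 1."""
    inverse = {model: 1 / score for model, score in relative_wis.items()}
    total = sum(inverse.values())
    return {model: value / total for model, value in inverse.items()}


def weighted_mean_quantile(values, weights):
    """Weighted mean of one quantile across models (both dicts keyed by model name)."""
    return sum(weights[model] * value for model, value in values.items())


# Made-up relative WIS: model_a has historically performed best (lowest score).
example_weights = inverse_score_weights(
    {"model_a": 0.5, "model_b": 1.0, "model_c": 2.0}
)
example_median_quantile = weighted_mean_quantile(
    {"model_a": 100, "model_b": 200, "model_c": 300}, example_weights
)
```

With these numbers the best-performing model receives four times the weight of the worst, so the combined value is pulled towards its forecast compared with the unweighted mean of 200.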

When constructing ensembles from quantile means, a single outlier can have an oversized effect on the ensemble forecast. Previous research has found that a median ensemble, replacing the arithmetic mean of each quantile with a median of the same values, yields competitive performance while maintaining robustness to outlying forecasts (Ray et al., 2022). Building on this, we also created weighted median ensembles using the weights described above and a Harrell-Davis quantile estimator with a beta function to approximate the weighted percentiles (Harrell and Davis, 1982). We then compared the performance of unweighted and inverse relative WIS weighted mean and median ensembles, comparing the ratio of interval scores between each ensemble model relative to the baseline model.
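To give an intuition for a weighted median, the sketch below uses a simple step-function definition: the smallest value at which the cumulative weight reaches half of the total. This is deliberately not the Harrell-Davis estimator used in the study, which smooths across order statistics; the function name and numbers are illustrative, and weights are assumed to be positive.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total weight.

    A simplified stand-in for illustration, not the Harrell-Davis estimator.
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    total = sum(weights)
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= 0.5 * total:
            return value
    return max(values)  # only reached through floating-point rounding


# One quantile from three hypothetical models, the last being an outlier.
example_values = [100, 105, 1000]
equal_weighted = weighted_median(example_values, [1, 1, 1])
skill_weighted = weighted_median(example_values, [0.6, 0.3, 0.1])
```

Even when the outlying model carries some weight, the result stays at one of the central forecasts, which is the robustness property that motivates median ensembles.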

Results

For 32 European countries, we collected, visualised, and made available online weekly COVID-19 forecasts and observed data (Sherratt, 2022). Over the whole study period, we collected forecasts from 48 unique models. Modellers created forecasts choosing from a set of 32 possible locations, four time horizons, and two variables, and modellers variously joined and left the Hub over time. This meant the number of models contributing to the Hub varied over time and by forecasting target. Using all models and the ensemble, we created 2139 forecasting scores, where each score summarises a unique combination of forecasting model, variable, country, and week ahead horizon (Figure 1).

Figure 1

Total number of forecasts included in evaluation, by target location, week ahead horizon, and variable.

Of the total 48 models, we received the most forecasts for Germany, with 29 unique models submitting 1-week case forecasts, while only 12 models ever submitted 4-week case or death forecasts for Liechtenstein. Modelling teams also differed in how they expressed uncertainty. Only three models provided point forecasts with no estimate of uncertainty around their predictions, while 41 models provided the full set of 23 probabilistic quantiles across the predictive distribution for each target.

In this evaluation we included 29 models in comparison to the ensemble forecast (Figure 1). We have included metadata provided by modellers in the supplement and online (Sherratt, 2022). In this evaluation, at most 15 models contributed forecasts for cases in Germany at the 1 week horizon, with an accumulated 592 forecast scores for that single target over the study period. In contrast, deaths in Finland at the 2 week horizon saw the smallest number of forecasts, with only 6 independent models contributing 24 forecast scores at any time over the 52-week period. Of the 29 models included in this evaluation, 5 models provided fewer than the full set of 23 quantiles, and were excluded when creating the ensemble. No ensemble forecast was composed of fewer than 3 independent models.

We visually compared the absolute performance of forecasts in predicting numbers of incident cases and deaths. We observed that forecasts predicted well in times of stable epidemic behaviour, while struggling to accurately predict at longer horizons around inflection points, for example during rapid changes in population-level behaviour or surveillance. Forecast models varied widely in their ability to predict and account for the introduction of new variants, giving the ensemble forecast over these periods a high level of uncertainty. An example of weekly forecasts from the ensemble model is shown in Figure 2.

Figure 2

Ensemble forecasts of weekly incident cases in Germany over periods of increasing SARS-CoV-2 variants Delta (B.1.617.2, left) and Omicron (B.1.1.529, right).

Black indicates observed data. Coloured ribbons represent each weekly forecast of 1–4 weeks ahead (showing median, 50%, and 90% probability). For each variant, forecasts are shown over an x-axis bounded by the earliest dates at which 5% and 99% of sequenced cases were identified as the respective variant of concern, while vertical dotted lines indicate the approximate date that the variant reached dominance (>50% sequenced cases).

In relative terms, the ensemble of all models performed well compared to both its component models and the baseline. By relative WIS scaled against a baseline of 1 (where a score <1 indicates outperforming the baseline), the median score of forecasts from the Hub ensemble model was 0.71 (interquartile range, IQR, 0.61–0.88). Meanwhile the median score of forecasts across all participating models (excluding the Hub ensemble) was 1.04 (IQR 0.82–1.36).

Across all horizons and locations, the ensemble performed better on scaled relative WIS than 83% of forecast scores when forecasting cases (with a total N=886 from 23 unique models), and 91% of scores for forecasts of incident deaths (N=763 scores from 20 models). We also saw high performance from the ensemble when evaluating against all models including those who did not submit the full set of probabilistic quantile predictions (80% for cases with N=1006 scores from 28 models, and 88% for deaths, N=877 scores from 24 models).

The performance of individual and ensemble forecasts varied by length of the forecast horizon (Figure 3). At each horizon, the typical performance of the ensemble outperformed both the baseline model and the aggregated scores of all its component models, although we saw wide variation between individual models in performance across horizons. Both individual models and the ensemble saw a trend of worsening performance at longer horizons when forecasting cases: the median scaled relative WIS of the ensemble across locations worsened from 0.62 for 1-week ahead forecasts to 0.9 when forecasting 4 weeks ahead. Performance for forecasts of deaths was more stable over 1 through 4 weeks, with median ensemble performance moving from 0.69 to 0.76 across the 4-week horizons.

Figure 3

Performance of short-term forecasts aggregated across all individually submitted models and the Hub ensemble, by horizon, forecasting cases (left) and deaths (right).

Performance measured by relative weighted interval score scaled against a baseline (dotted line, 1), and coverage of uncertainty at the 50% and 95% levels. Boxplots, with width proportional to the number of observations, show interquartile ranges with outlying scores as faded points. The target range for each set of scores is shaded in yellow.

We observed similar trends in performance across horizon when considering how well the ensemble was calibrated with respect to the observed data. At 1 week ahead the case ensemble was well calibrated (ca. 50% and 95% nominal coverage at the 50% and 95% levels, respectively). This did not hold at longer forecast horizons as the case forecasts became increasingly over-confident. Meanwhile, the ensemble of death forecasts was well calibrated at the 95% level across all horizons, and the calibration of death forecasts at the 50% level improved with lengthening horizons compared to being underconfident at shorter horizons.

The ensemble also performed consistently well in comparison to individual models when forecasting across countries (Figure 4). In total, across 32 countries forecasting for 1 through 4 weeks, when forecasting cases the ensemble outperformed 75% of component models in 22 countries, and outperformed all available models in 3 countries. When forecasting deaths, the ensemble outperformed 75% and 100% of models in 30 and 8 countries, respectively. Considering only the 2-week horizon shown in Figure 4, the ensemble of case forecasts outperformed 75% of models in 25 countries and all models in only 12 countries. At the 2-week horizon for forecasts of deaths, the ensemble outperformed 75% and 100% of its component models in 30 and 26 countries, respectively.

Figure 4

Performance of short-term forecasts across models and median ensemble (asterisk), by country, forecasting cases (top) and deaths (bottom) for 2-week ahead forecasts, according to the relative weighted interval score.

Boxplots show interquartile ranges, with outliers as faded points, and the ensemble model performance is marked by an asterisk. The y-axis is cut off at an upper bound of 4 for readability.
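
The country-level tallies reported alongside Figure 4 can be sketched as follows. This is an illustrative sketch only: the country labels and scores are invented, and treating "outperformed 75% of models" as "scored strictly lower than at least 75% of them" is an assumption about how the threshold is applied.

```python
def share_outperformed(ensemble_score, model_scores):
    """Fraction of component models with a strictly worse (higher) score."""
    worse = sum(1 for score in model_scores if score > ensemble_score)
    return worse / len(model_scores)


def count_countries(scores_by_country, threshold):
    """Number of countries where the ensemble beats at least `threshold` of models."""
    return sum(
        1
        for ensemble_score, model_scores in scores_by_country.values()
        if share_outperformed(ensemble_score, model_scores) >= threshold
    )


# Hypothetical relative WIS values: (ensemble, [component models]) per country.
scores_by_country = {
    "A": (0.7, [0.8, 0.9, 1.1, 0.6]),
    "B": (0.5, [0.6, 0.7, 0.9, 1.2]),
    "C": (0.9, [0.8, 0.7, 1.0, 1.3]),
}

beats_three_quarters = count_countries(scores_by_country, threshold=0.75)
beats_all = count_countries(scores_by_country, threshold=1.0)
```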

We considered alternative methods for creating ensembles from the participating forecasts, using either a mean or median to combine either weighted or unweighted forecasts. We evaluated each alternative ensemble model against the baseline model, taking the mean score ratio across all targets (Table 1). Across locations, the median matched or outperformed the mean at every horizon from 1 through 4 weeks for both case and death targets, with the exception of cases at the 1-week horizon. This held regardless of whether the component forecasts were weighted or unweighted by their individual past performance. Between methods of combination, weighting made little difference to the performance of the median ensemble, but appeared to improve performance of a mean ensemble in forecasting deaths.

Table 1

Predictive performance of main ensembles, as measured by the mean ratio of interval scores against the baseline ensemble.

Horizon	Weighted mean	Weighted median	Unweighted mean	Unweighted median
Cases
1 week	0.63	0.64	0.61	0.64
2 weeks	0.72	0.71	0.69	0.69
3 weeks	0.82	0.76	0.82	0.72
4 weeks	1.07	0.86	1.12	0.78
Deaths
1 week	0.65	0.61	1.81	0.61
2 weeks	0.58	0.54	1.29	0.54
3 weeks	0.64	0.57	1.17	0.53
4 weeks	0.82	0.67	0.84	0.62
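
The four combinations compared in Table 1 can be sketched as quantile-wise averages across models. This is a minimal sketch under stated assumptions: the model names, forecast values, and weights are hypothetical, and the weighted median here is a simple cumulative-weight rule chosen for illustration, which may differ from the weighted quantile estimator used in the study.

```python
from statistics import mean, median


def weighted_mean(values, weights):
    """Weighted arithmetic mean; weights need not sum to 1."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value whose cumulative normalised weight reaches 0.5."""
    total = sum(weights)
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight / total
        if cumulative >= 0.5:
            return value
    return max(values)  # guard against floating-point rounding


def combine(forecasts, method="median", weights=None):
    """Combine {model: {quantile_level: value}} into {quantile_level: value}."""
    models = sorted(forecasts)
    levels = sorted(forecasts[models[0]])
    combined = {}
    for level in levels:
        values = [forecasts[m][level] for m in models]
        if weights is None:
            combined[level] = mean(values) if method == "mean" else median(values)
        else:
            w = [weights[m] for m in models]
            if method == "mean":
                combined[level] = weighted_mean(values, w)
            else:
                combined[level] = weighted_median(values, w)
    return combined


# Hypothetical forecasts from three models; "m3" is an outlier.
forecasts = {
    "m1": {0.25: 90, 0.5: 100, 0.75: 110},
    "m2": {0.25: 95, 0.5: 105, 0.75: 120},
    "m3": {0.25: 300, 0.5: 400, 0.75: 500},
}
weights = {"m1": 0.45, "m2": 0.45, "m3": 0.1}  # e.g. derived from past scores

unweighted_median = combine(forecasts, method="median")
unweighted_mean = combine(forecasts, method="mean")
weighted_median_ens = combine(forecasts, method="median", weights=weights)
weighted_mean_ens = combine(forecasts, method="mean", weights=weights)
```

In this toy example the outlying model shifts both mean ensembles substantially, while the median ensembles stay with the two models that agree.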

Discussion

We collated 12 months of forecasts of COVID-19 cases and deaths across 32 countries in Europe, collecting from multiple independent teams and using a principled approach to standardising both forecast targets and the predictive distribution of forecasts. We combined these into an ensemble forecast and compared the relative performance of forecasts between models, finding that the ensemble forecasts outperformed most individual models across all countries and horizons over time.

Across all models we observed that forecasting changes in trend in real time was particularly challenging. Our study period included multiple fundamental changes in viral-, individual-, and population-level factors driving the transmission of COVID-19 across Europe. In early 2021, the introduction of vaccination started to change population-level associations between infections, cases, and deaths (European Centre for Disease Prevention and Control, 2021b), while the Delta variant emerged and became dominant (European Centre for Disease Prevention and Control, 2021a). Similarly, from late 2021 we saw waning individual immunity interact with the emergence and global spread of the Omicron variant (European Centre for Disease Prevention and Control, 2022b). Neither the extent nor the timing of these factors was uniform across the European countries covered by the Forecast Hub (European Centre for Disease Prevention and Control, 2023). This meant that the performance of any single forecasting model depended partly on the ability, speed, and precision with which it could adapt to new conditions for each forecast target.

We observed a contrast between a more stable performance of forecasting deaths further into the future compared to forecasts of cases. Previous work has found rapidly declining performance for case forecasts with increasing horizon (Cramer et al., 2021b; Castro et al., 2020), while death forecasts can perform well with up to 6 weeks lead time (Friedman et al., 2021). We can link this to the specific epidemic dynamics in this study.

First, COVID-19 has a typical serial interval of less than a week (Alene et al., 2021). This implies that case forecasts of more than 2 weeks only remain valid if rates of both transmission and detection remain stable over the entire forecast horizon. In contrast, we saw rapid changes in epidemic dynamics across many countries in Europe over our study period, impacting the longer term case forecasts.

Second, we can interpret the higher reliability of death forecasts as due to the different lengths and distributions of time lags from infection to case and death reporting (Jin, 2021). For example, a spike in infections may be matched by a consistently sharp increase in case reporting, but a longer tailed distribution of the subsequent increase in death reports. This creates a lower magnitude of fluctuation in the time-series of deaths compared to that of cases. Similarly, surveillance data for death reporting is substantially more consistent, with fewer errors and retrospective corrections, than case reporting (Català et al., 2021).

Third, we also note that the performance of trend-based forecasts may have benefited from the slower changes to trends in incident deaths caused by gradually increasing vaccination rates. These features allow forecasters to incorporate the effect of changes in transmission more easily when forecasting deaths, compared to cases.

We found the ensemble in this study continued to outperform both other models and the baseline at up to 4 weeks ahead. Our results support previous findings that ensemble forecasts are the best or nearly the best performing models with respect to absolute predictive performance and appropriate coverage of uncertainty (Funk et al., 2020; Viboud et al., 2018; Cramer et al., 2021b). While the ensemble was consistently high performing, it was not strictly dominant across all forecast targets, reflecting findings from previous comparable studies of COVID-19 forecasts (Bracher et al., 2021c; Brooks, 2020). Our finding suggests the usefulness of an ensemble as a robust summary when forecasting across many spatio-temporal targets, without replacing the importance of communicating the full range of model predictions.

When exploring variations in ensemble methods, we found that the choice of median over mean yielded the most consistent improvement in predictive performance, regardless of the method of weighting. Other work has supported the importance of the median in providing a stable forecast that better accounts for outlier forecasts than the mean (Brooks, 2020), although this finding may be dependent on the quality of the individual forecast submissions. In contrast, weighting models by past performance did not result in any consistent improvement in performance. This is in line with existing mixed evidence for any optimal ensemble method for combining short-term probabilistic infectious disease forecasts. Many methods of combination have performed competitively in analyses of forecasts for COVID-19 in the US, including the simple mean and weighted approaches outperforming unweighted or median methods (Taylor and Taylor, 2023). This contrasts with other analyses finding weighted methods to give similar performance to a median average (Ray et al., 2020; Brooks, 2020). We can partly explain this inconsistency if performance of each method depends on the outcome being predicted (cases, deaths), its count (incident, cumulative) and absolute level, the changing disease dynamics, and the varying quality and quantity of forecasting teams over time.

We note several limitations in our approach to assessing the relative performance of an ensemble among forecast models. While we have described differences in model scores, we have not used any formal statistical test for comparing forecast scores, such as the Diebold-Mariano test (Diebold and Mariano, 1995), recognising that it is unclear how this is best achieved across many models. Our results are the outcome of evaluating forecasts against a specific performance metric and baseline, where multiple options for evaluation exist and the choice reflects the aim of the evaluation process. Further, our choice of baseline model affects the given performance scores in absolute terms, and more generally the choice of appropriate baseline for epidemic forecast models is not obvious when assessing infectious disease forecasts. The model used here is supported by previous work (Cramer et al., 2021b), yet previous evaluation in a similar context has suggested that choice of baseline affects relative performance in general (Bracher et al., 2021b), and future research should be done on the best choices of baseline models in the context of infectious disease epidemics.

Our assessment of forecast performance may further have been inaccurate due to limitations in the observed data against which we evaluated forecasts. We sourced data from a globally aggregated database to maintain compatibility across 32 countries (Dong et al., 2020). However, this made it difficult to identify the origin of lags and inconsistencies between national data streams, and to what extent these could bias forecasts for different targets. In particular, we saw some real time data revised retrospectively, introducing bias in either direction where the data used to create forecasts were not the same as those used to evaluate them. We attempted to mitigate this by using an automated process for determining data revisions, and excluding forecasts made at a time of missing, unreliable, or heavily revised data. We also recognise that evaluating forecasts against updated data is a valid alternative approach used elsewhere (Cramer et al., 2021b). More generally it is unclear if the expectation of observation revisions should be a feature built into forecasts. Further research is needed to understand the perspective of end-users of forecasts in order to assess this.

The focus of this study was describing and summarising an ensemble of many models. We note that we have little insight into the individual methods and wide variety of assumptions that modellers used. While we asked modellers to provide a short description of their methods, we did not create a rigorous framework for this, and we did not document whether modellers changed the methods for a particular submitted model over time. Both the content of and variation in modelling methods and assumptions are likely to be critical to explaining performance, rather than describing or summarising it. Exploring modellers’ methods and relating this to forecast performance will be an important area of future work.

In an emergency setting, access to visualised forecasts and underlying data is useful for researchers, policymakers, and the public (CDC, 2020). Previous European multi-country efforts to forecast COVID-19 have included only single models adapted to country-specific parameters (Aguas et al., 2020; Adib et al., 2021; Agosto et al., 2021).

The European Forecasting Hub acted as a unique tool for creating an open-access, cross-country modelling network, and connecting this to public health policy across Europe. By opening participation to many modelling teams, and with high international participation, we were able to create robust ensemble forecasts across Europe. This also allowed comparison across forecasts built with different interpretations of current data, on a like-for-like scale in real time. The European Hub has supported policy outputs at an international, regional, and national level, including Hub forecasts cited weekly in ECDC Communicable Disease Threats Reports (European Centre for Disease Prevention and Control, 2022a).

For forecast producers, an easily accessible comparison between results from different methods can highlight individual strengths and weaknesses and help prioritise new areas of work. Collating time-stamped predictions ensures that we can test true out-of-sample performance of models and avoid retrospective claims of performance. Testing the limits of forecasting ability with these comparisons forms an important part of communicating any model-based prediction to decision makers. For example, the weekly ECDC Communicable Disease Threats reports include the specific results of this work by qualitatively highlighting the greater uncertainty around case forecasts compared to death forecasts.

This study raises many further questions which could inform epidemic forecast modellers and users. The dataset created by the European Forecast Hub is an openly accessible, standardised, and extensively documented catalogue of real time forecasting work from a range of teams and models across Europe (European Covid-19 Forecast Hub, 2023b), and we recommend its use for further research on forecast performance. In the code developed for this study, we provide a worked example of downloading and using both the forecasts and their evaluation scores (covid19-forecast-hub-europe, 2022).

Future work could explore the impact on forecast models of changing epidemiology at a broad spatial scale by combining analyses of trends and turning points in cases and deaths with forecast performance, or extending to include data on vaccination, variant, or policy changes over time. There is also much scope for future research into methods for combining forecasts to improve performance of an ensemble. This includes altering the inclusion criteria of forecast models based on different thresholds of past performance, excluding or including only forecasts that predict the lowest and highest values (trimming) (Taylor and Taylor, 2023), or using alternative weighting methods such as quantile regression averaging (Funk et al., 2020). Exploring these questions would add to our understanding of real time performance, supporting and improving future forecasting efforts.
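
As one concrete illustration of the trimming idea mentioned above, the sketch below drops the lowest and highest predictions at a given quantile level before averaging (exterior trimming). It is a hypothetical example: the function name, the number of values trimmed, and the example predictions are assumptions, and this is only one of several possible trimming schemes.

```python
def exterior_trimmed_mean(values, trim_each_side=1):
    """Mean after discarding the `trim_each_side` lowest and highest values."""
    ordered = sorted(values)
    if 2 * trim_each_side >= len(ordered):
        raise ValueError("not enough forecasts left after trimming")
    kept = ordered[trim_each_side:len(ordered) - trim_each_side]
    return sum(kept) / len(kept)


# Hypothetical predictions from five models for one quantile level.
predictions = [90, 100, 105, 110, 400]
trimmed = exterior_trimmed_mean(predictions, trim_each_side=1)
```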

We see additional scope to adapt the Hub format to the changing COVID-19 situation across Europe. We have extended the Forecast Hub infrastructure to include short term forecasts for hospitalisations with COVID-19, which is a challenging task due to limited data across the locations covered by the hub. As the policy focus shifts from immediate response to anticipating changes brought by vaccinations or the geographic spread of new variants (European Centre for Disease Prevention and Control, 2023), we are also separately investigating models for longer term scenarios in addition to the short term forecasts in a similar framework to existing scenario modelling work in the US (Borchering et al., 2021).

In conclusion, we have shown that during a rapidly evolving epidemic spreading through multiple populations, an ensemble forecast performed consistently well across a large matrix of forecast targets, typically outperforming the majority of its separate component models and a naive baseline model. In addition, we have linked issues with the predictability of short-term case forecasts to underlying COVID-19 epidemiology, and shown that ensemble methods based on past model performance were unable to reliably improve forecast performance. Our work constitutes a step towards both unifying COVID-19 forecasts and improving our understanding of them.

Data availability

All source data were openly available before the study, originally available at: https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe (copy archived at swh:1:rev:b4d66c495e07c12d88384506154cf58f08592365). All data and code for this study are openly available on GitHub: https://github.com/covid19-forecast-hub-europe/euro-hub-ensemble.

References

1. Adib K
2. Hancock PA
3. Rahimli A
4. Mugisa B
5. Abdulrazeq F
6. Aguas R
7. White LJ
8. Hajjeh R
9. Al Ariqi L
10. Nabeth P
(2021) A participatory modelling approach for investigating the spread of covid-19 in countries of the eastern Mediterranean region to support public health decision-making
BMJ Global Health 6:e005207.

https://doi.org/10.1136/bmjgh-2021-005207
1. Agosto A
2. Giudici P
(2020) A poisson autoregressive model to understand COVID-19 contagion dynamics
Risks 8:77.

https://doi.org/10.3390/risks8030077
1. Agosto A
2. Campmas A
3. Giudici P
4. Renda A
(2021) Monitoring COVID-19 contagion growth
Statistics in Medicine 40:4150–4160.

https://doi.org/10.1002/sim.9020
1. Aguas R
2. White L
3. Hupert N
4. Shretta R
5. Pan-Ngum W
6. Celhay O
7. Moldokmatova A
8. Arifi F
9. Mirzazadeh A
10. Sharifi H
11. Adib K
12. Sahak MN
13. Franco C
14. Coutinho R
15. CoMo Consortium
(2020) Modelling the COVID-19 pandemic in context: an international participatory approach
BMJ Global Health 5:e003126.

https://doi.org/10.1136/bmjgh-2020-003126
1. Alene M
2. Yismaw L
3. Assemie MA
4. Ketema DB
5. Gietaneh W
6. Birhan TY
(2021) Serial interval and incubation period of COVID-19: a systematic review and meta-analysis
BMC Infectious Diseases 21:257.

https://doi.org/10.1186/s12879-021-05950-x
Preprint
1. Bicher M
2. Zuba M
3. Rainer L
4. Bachner F
5. Rippinger C
6. Ostermann H
7. Popper N
8. Thurner S
9. Klimek P
(2020) Supporting COVID-19 Policy-Making with a Predictive Epidemiological Multi-Model Warning System
medRxiv.

https://doi.org/10.1101/2020.10.18.20214767
1. Borchering RK
2. Viboud C
3. Howerton E
4. Smith CP
5. Truelove S
6. Runge MC
7. Reich NG
8. Contamin L
9. Levander J
10. Salerno J
11. van Panhuis W
12. Kinsey M
13. Tallaksen K
14. Obrecht RF
15. Asher L
16. Costello C
17. Kelbaugh M
18. Wilson S
19. Shin L
20. Gallagher ME
21. Mullany LC
22. Rainwater-Lovett K
23. Lemaitre JC
24. Dent J
25. Grantz KH
26. Kaminsky J
27. Lauer SA
28. Lee EC
29. Meredith HR
30. Perez-Saez J
31. Keegan LT
32. Karlen D
33. Chinazzi M
34. Davis JT
35. Mu K
36. Xiong X
37. Pastore Y Piontti A
38. Vespignani A
39. Srivastava A
40. Porebski P
41. Venkatramanan S
42. Adiga A
43. Lewis B
44. Klahn B
45. Outten J
46. Schlitt J
47. Corbett P
48. Telionis PA
49. Wang L
50. Peddireddy AS
51. Hurt B
52. Chen J
53. Vullikanti A
54. Marathe M
55. Healy JM
56. Slayton RB
57. Biggerstaff M
58. Johansson MA
59. Shea K
60. Lessler J
(2021) Modeling of future COVID-19 cases, hospitalizations, and deaths, by vaccination rates and nonpharmaceutical intervention scenarios-United States, april-september 2021
MMWR. Morbidity and Mortality Weekly Report 70:719–724.

https://doi.org/10.15585/mmwr.mm7019e3
Software
1. Bosse NI
2. Gruson H
3. Funk S
4. Abbott S
(2023) Scoringutils: utilities for scoring and assessing predictions
CRAN.

https://github.com/epiforecasts/scoringutils
Software
(2020) The German and Polish COVID-19 forecast hub
Github.

https://github.com/KITmetricslab/covid19-forecast-hub-de
1. Bracher J
2. Ray EL
3. Gneiting T
4. Reich NG
(2021a) Evaluating epidemic forecasts in an interval format
PLOS Computational Biology 17:e1008618.

https://doi.org/10.1371/journal.pcbi.1008618
Preprint
1. Bracher J
2. Wolffram D
3. Deuschel J
4. Görgen K
5. Ketterer JL
6. Ullrich A
7. Abbott S
8. Barbarossa MV
9. Bertsimas D
10. Bhatia S
11. Bodych M
12. Bosse NI
13. Burgard JP
14. Castro L
15. Fairchild G
16. Fiedler J
17. Fuhrmann J
18. Funk S
19. Gambin A
20. Gogolewski K
21. Heyder S
22. Hotz T
23. Kheifetz Y
24. Kirsten H
25. Krueger T
26. Krymova E
27. Leithäuser N
28. Li ML
29. Meinke JH
30. Miasojedow B
31. Michaud IJ
32. Mohring J
33. Nouvellet P
34. Nowosielski JM
35. Ozanski T
36. Radwan M
37. Rakowski F
38. Scholz M
39. Soni S
40. Srivastava A
41. Gneiting T
42. Schienle M
(2021b) National and Subnational Short-Term Forecasting of COVID-19 in Germany and Poland during Early 2021
medRxiv.

https://doi.org/10.1101/2021.11.05.21265810
1. Bracher J
2. Wolffram D
3. Deuschel J
4. Görgen K
5. Ketterer JL
6. Ullrich A
7. Abbott S
8. Barbarossa MV
9. Bertsimas D
10. Bhatia S
11. Bodych M
12. Bosse NI
13. Burgard JP
14. Castro L
15. Fairchild G
16. Fuhrmann J
17. Funk S
18. Gogolewski K
19. Gu Q
20. Heyder S
21. Hotz T
22. Kheifetz Y
23. Kirsten H
24. Krueger T
25. Krymova E
26. Li ML
27. Meinke JH
28. Michaud IJ
29. Niedzielewski K
30. Ożański T
31. Rakowski F
32. Scholz M
33. Soni S
34. Srivastava A
35. Zieliński J
36. Zou D
37. Gneiting T
38. Schienle M
39. List of Contributors by Team
(2021c) A pre-registered short-term forecasting study of covid-19 in Germany and Poland during the second wave
Nature Communications 12:5173.

https://doi.org/10.1038/s41467-021-25207-0
Website
1. Brooks L
(2020) Comparing ensemble approaches for short-term probabilistic COVID-19 forecasts in the U.S
Accessed April 1, 2022.

https://forecasters.org/blog/2020/10/28/comparing-ensemble-approaches-for-short-term-probabilistic-covid-19-forecasts-in-the-u-s/
1. Buizza R
(2019) Introduction to the special issue on "25 years of ensemble forecasting"
Quarterly Journal of the Royal Meteorological Society 145:1–11.

https://doi.org/10.1002/qj.3370
1. Castro M
2. Ares S
3. Cuesta JA
4. Manrubia S
(2020) The turning point and end of an expanding epidemic can not be precisely forecast
PNAS 117:26190–26196.

https://doi.org/10.1073/pnas.2007868117
(2021) Robust estimation of diagnostic rate and real incidence of COVID-19 for European policymakers
PLOS ONE 16:e0243701.

https://doi.org/10.1371/journal.pone.0243701
Software
1. CDC
(2020) Coronavirus disease 2019
COVID-19.

https://www.cdc.gov/coronavirus/2019-ncov/science/forecasting/forecasting.html
Software
1. covid19-forecast-hub-europe
(2022) Predictive performance of multi-model ensemble forecasts of covid-19 across European nations
Github.

https://github.com/covid19-forecast-hub-europe/euro-hub-ensemble
Preprint
1. Cramer EY
2. Huang Y
3. Wang Y
4. Ray EL
5. Cornell M
6. Bracher J
7. Brennen A
8. Castero Rivadeneira AJ
9. Gerding A
10. House K
11. Jayawardena D
12. Kanji AH
13. Khandelwal A
14. Le K
15. Niemi J
16. Stark A
17. Shah A
18. Wattanchit N
19. Zorn MW
20. Reich NG
(2021a) The United States COVID-19 Forecast Hub Dataset
medRxiv.

https://doi.org/10.1101/2021.11.04.21265886
Preprint
1. Cramer EY
2. Ray EL
3. Lopez VK
4. Bracher J
5. Brennen A
6. Castro Rivadeneira AJ
7. Gerding A
8. Gneiting T
9. House KH
10. Huang Y
11. Jayawardena D
12. Kanji AH
13. Khandelwal A
14. Le K
15. Mühlemann A
16. Niemi J
17. Shah A
18. Stark A
19. Wang Y
20. Wattanachit N
21. Zorn MW
22. Gu Y
23. Jain S
24. Bannur N
25. Deva A
26. Kulkarni M
27. Merugu S
28. Raval A
29. Shingi S
30. Tiwari A
31. White J
32. Abernethy NF
33. Woody S
34. Dahan M
35. Fox S
36. Gaither K
37. Lachmann M
38. Meyers LA
39. Scott JG
40. Tec M
41. Srivastava A
42. George GE
43. Cegan JC
44. Dettwiller ID
45. England WP
46. Farthing MW
47. Hunter RH
48. Lafferty B
49. Linkov I
50. Mayo ML
51. Parno MD
52. Rowland MA
53. Trump BD
54. Zhang-James Y
55. Chen S
56. Faraone SV
57. Hess J
58. Morley CP
59. Salekin A
60. Wang D
61. Corsetti SM
62. Baer TM
63. Eisenberg MC
64. Falb K
65. Huang Y
66. Martin ET
67. McCauley E
68. Myers RL
69. Schwarz T
70. Sheldon D
71. Gibson GC
72. Yu R
73. Gao L
74. Ma Y
75. Wu D
76. Yan X
77. Jin X
78. Wang YX
79. Chen Y
80. Guo L
81. Zhao Y
82. Gu Q
83. Chen J
84. Wang L
85. Xu P
86. Zhang W
87. Zou D
88. Biegel H
89. Lega J
90. McConnell S
91. Nagraj VP
92. Guertin SL
93. Hulme-Lowe C
94. Turner SD
95. Shi Y
96. Ban X
97. Walraven R
98. Hong QJ
99. Kong S
100. van de Walle A
101. Turtle JA
102. Ben-Nun M
103. Riley S
104. Riley P
105. Koyluoglu U
106. DesRoches D
107. Forli P
108. Hamory B
109. Kyriakides C
110. Leis H
111. Milliken J
112. Moloney M
113. Morgan J
114. Nirgudkar N
115. Ozcan G
116. Piwonka N
117. Ravi M
118. Schrader C
119. Shakhnovich E
120. Siegel D
121. Spatz R
122. Stiefeling C
123. Wilkinson B
124. Wong A
125. Cavany S
126. España G
127. Moore S
128. Oidtman R
129. Perkins A
130. Kraus D
131. Kraus A
132. Gao Z
133. Bian J
134. Cao W
135. Ferres JL
136. Li C
137. Liu TY
138. Xie X
139. Zhang S
140. Zheng S
141. Vespignani A
142. Chinazzi M
143. Davis JT
144. Mu K
145. y Piontti AP
146. Xiong X
147. Zheng A
148. Baek J
149. Farias V
150. Georgescu A
151. Levi R
152. Sinha D
153. Wilde J
154. Perakis G
155. Bennouna MA
156. Nze-Ndong D
157. Singhvi D
158. Spantidakis I
159. Thayaparan L
160. Tsiourvas A
161. Sarker A
162. Jadbabaie A
163. Shah D
164. Penna ND
165. Celi LA
166. Sundar S
167. Wolfinger R
168. Osthus D
169. Castro L
170. Fairchild G
171. Michaud I
172. Karlen D
173. Kinsey M
174. Mullany LC
175. Rainwater-Lovett K
176. Shin L
177. Tallaksen K
178. Wilson S
179. Lee EC
180. Dent J
181. Grantz KH
182. Hill AL
183. Kaminsky J
184. Kaminsky K
185. Keegan LT
186. Lauer SA
187. Lemaitre JC
188. Lessler J
189. Meredith HR
190. Perez-Saez J
191. Shah S
192. Smith CP
193. Truelove SA
194. Wills J
195. Marshall M
196. Gardner L
197. Nixon K
198. Burant JC
199. Wang L
200. Gao L
201. Gu Z
202. Kim M
203. Li X
204. Wang G
205. Wang Y
206. Yu S
207. Reiner RC
208. Barber R
209. Gakidou E
210. Hay SI
211. Lim S
212. Murray CJL
213. Pigott D
214. Gurung HL
215. Baccam P
216. Stage SA
217. Suchoski BT
218. Prakash BA
219. Adhikari B
220. Cui J
221. Rodríguez A
222. Tabassum A
223. Xie J
224. Keskinocak P
225. Asplund J
226. Baxter A
227. Oruc BE
228. Serban N
229. Arik SO
230. Dusenberry M
231. Epshteyn A
232. Kanal E
233. Le LT
234. Li CL
235. Pfister T
236. Sava D
237. Sinha R
238. Tsai T
239. Yoder N
240. Yoon J
241. Zhang L
242. Abbott S
243. Bosse NI
244. Funk S
245. Hellewell J
246. Meakin SR
247. Sherratt K
248. Zhou M
249. Kalantari R
250. Yamana TK
251. Pei S
252. Shaman J
253. Li ML
254. Bertsimas D
255. Lami OS
256. Soni S
257. Bouardi HT
258. Ayer T
259. Adee M
260. Chhatwal J
261. Dalgic OO
262. Ladd MA
263. Linas BP
264. Mueller P
265. Xiao J
266. Wang Y
267. Wang Q
268. Xie S
269. Zeng D
270. Green A
271. Bien J
272. Brooks L
273. Hu AJ
274. Jahja M
275. McDonald D
276. Narasimhan B
277. Politsch C
278. Rajanala S
279. Rumack A
280. Simon N
281. Tibshirani RJ
282. Tibshirani R
283. Ventura V
284. Wasserman L
285. O’Dea EB
286. Drake JM
287. Pagano R
288. Tran QT
289. Tung Ho LS
290. Huynh H
291. Walker JW
292. Slayton RB
293. Johansson MA
294. Biggerstaff M
295. Reich NG
(2021b) Evaluation of Individual and Ensemble Probabilistic Forecasts of COVID-19 Mortality in the US
medRxiv.

https://doi.org/10.1101/2021.02.03.21250974
Software
1. Cramer E
2. Wang SY
3. Reich NG
4. Hanna A
5. Niem J
6. House K
7. Huang YD
(2021c) Reichlab/covid19-forecast-hub: release for zenodo, 20210816
Zenodo.

https://doi.org/10.5281/zenodo.5208210
1. Diebold FX
2. Mariano RS
(1995) Comparing predictive accuracy
Journal of Business & Economic Statistics 13:253–263.

https://doi.org/10.1080/07350015.1995.10524599
1. Dong E
2. Du H
3. Gardner L
(2020) An interactive web-based dashboard to track covid-19 in real time
The Lancet. Infectious Diseases 20:533–534.

https://doi.org/10.1016/S1473-3099(20)30120-1
Software
1. EpiForecasts
(2021) Project: ECDC European COVID-19 forecast hub, version 0.1
Zoltar.

https://www.zoltardata.com/project/238
Website
1. European Centre for Disease Prevention and Control
(2021a) Threat assessment brief: implications for the EU/EEA on the spread of the SARS-cov-2 delta (B.1.617.2) variant of concern
Accessed April 1, 2022.

https://www.ecdc.europa.eu/en/publications-data/threat-assessment-emergence-and-impact-sars-cov-2-delta-variant
Website
1. European Centre for Disease Prevention and Control
(2021b) Interim guidance on the benefits of full vaccination against COVID-19 for transmission and implications for non-pharmaceutical interventions
Accessed April 1, 2022.

https://www.ecdc.europa.eu/en/publications-data/interim-guidance-benefits-full-vaccination-against-covid-19-transmission
Website
1. European Centre for Disease Prevention and Control
(2021c) Forecasting COVID-19 cases and deaths in Europe-new hub will support European pandemic planning
Accessed April 1, 2022.

https://www.ecdc.europa.eu/en/news-events/forecasting-covid-19-cases-and-deaths-europe-new-hub
Website
1. European Centre for Disease Prevention and Control
(2022a) Weekly threats reports (CDTR)
Accessed April 1, 2023.

https://www.ecdc.europa.eu/en/publications-and-data/monitoring/weekly-threats-reports
Website
1. European Centre for Disease Prevention and Control
(2022b) Assessment of the further spread and potential impact of the SARS-CoV-2 Omicron variant of concern in the EU/EEA, 19th update
Accessed April 1, 2022.

https://www.ecdc.europa.eu/en/publications-data/covid-19-omicron-risk-assessment-further-emergence-and-potential-impact
Website
1. European Centre for Disease Prevention and Control
(2023) Overview of the implementation of COVID-19 vaccination strategies and deployment plans in the EU/EEA
Accessed April 1, 2023.

https://www.ecdc.europa.eu/en/publications-data/overview-implementation-covid-19-vaccination-strategies-and-deployment-plans
Website
1. European Covid-19 Forecast Hub
(2023a) Community
Accessed April 1, 2023.

https://covid19forecasthub.eu/community.html
Website
1. European Covid-19 Forecast Hub
(2023b) European Covid-19 Forecast Hub
Accessed April 1, 2023.

https://covid19forecasthub.eu/index.html
Software
1. European Covid-19 Forecast Hub
(2023c) Covid19-forecast-hub-europe, version 9d13832
Github.

https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe
Software
1. European Covid-19 Forecast Hub
(2023d) Covid19-forecast-hub-europe, 2021, version 9d13832
Github.

https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe
1. Friedman J
2. Liu P
3. Troeger CE
4. Carter A
5. Reiner RC
6. Barber RM
7. Collins J
8. Lim SS
9. Pigott DM
10. Vos T
11. Hay SI
12. Murray CJL
13. Gakidou E
(2021) Predictive performance of international COVID-19 mortality forecasting models
Nature Communications 12:2609.

https://doi.org/10.1038/s41467-021-22457-w
Preprint
1. Funk S
2. Abbott S
3. Atkins BD
4. Baguelin M
5. Baillie JK
6. Birrell P
7. Blake J
8. Bosse NI
9. Burton J
10. Carruthers J
11. Davies NG
12. De Angelis D
13. Dyson L
14. Edmunds WJ
15. Eggo RM
16. Ferguson NM
17. Gaythorpe K
18. Gorsich E
19. Guyver-Fletcher G
20. Hellewell J
21. Hill EM
22. Holmes A
23. House TA
24. Jewell C
25. Jit M
26. Jombart T
27. Joshi I
28. Keeling MJ
29. Kendall E
30. Knock ES
31. Kucharski AJ
32. Lythgoe KA
33. Meakin SR
34. Munday JD
35. Openshaw PJM
36. Overton CE
37. Pagani F
38. Pearson J
39. Perez-Guzman PN
40. Pellis L
41. Scarabel F
42. Semple MG
43. Sherratt K
44. Tang M
45. Tildesley MJ
46. Van Leeuwen E
47. Whittles LK
48. CMMID COVID-19 Working Group
49. Imperial College COVID-19 Response Team
50. ISARIC4C Investigators
(2020) Short-Term Forecasts to Inform the Response to the Covid-19 Epidemic in the UK
medRxiv.

https://doi.org/10.1101/2020.11.11.20220962
1. Genest C
(1992) Vincentization revisited
The Annals of Statistics 20:1137–1142.

https://doi.org/10.1214/aos/1176348676
1. Gneiting T
2. Raftery AE
(2007) Strictly proper scoring rules, prediction, and estimation
Journal of the American Statistical Association 102:359–378.

https://doi.org/10.1198/016214506000001437
1. Harrell FE
2. Davis CE
(1982) A new distribution-free quantile estimator
Biometrika 69:635–640.

https://doi.org/10.1093/biomet/69.3.635
(2021) The use and misuse of mathematical modeling for infectious disease policymaking: lessons for the COVID-19 pandemic
Medical Decision Making 41:379–385.

https://doi.org/10.1177/0272989X21990391
1. Jin R
(2021) The lag between daily reported covid-19 cases and deaths and its relationship to age
Journal of Public Health Research 10:2049.

https://doi.org/10.4081/jphr.2021.2049
1. Johansson MA
2. Apfeldorf KM
3. Dobson S
4. Devita J
5. Buczak AL
6. Baugher B
7. Moniz LJ
8. Bagley T
9. Babin SM
10. Guven E
11. Yamana TK
12. Shaman J
13. Moschou T
14. Lothian N
15. Lane A
16. Osborne G
17. Jiang G
18. Brooks LC
19. Farrow DC
20. Hyun S
21. Tibshirani RJ
22. Rosenfeld R
23. Lessler J
24. Reich NG
25. Cummings DAT
26. Lauer SA
27. Moore SM
28. Clapham HE
29. Lowe R
30. Bailey TC
31. García-Díez M
32. Carvalho MS
33. Rodó X
34. Sardar T
35. Paul R
36. Ray EL
37. Sakrejda K
38. Brown AC
39. Meng X
40. Osoba O
41. Vardavas R
42. Manheim D
43. Moore M
44. Rao DM
45. Porco TC
46. Ackley S
47. Liu F
48. Worden L
49. Convertino M
50. Liu Y
51. Reddy A
52. Ortiz E
53. Rivero J
54. Brito H
55. Juarrero A
56. Johnson LR
57. Gramacy RB
58. Cohen JM
59. Mordecai EA
60. Murdock CC
61. Rohr JR
62. Ryan SJ
63. Stewart-Ibarra AM
64. Weikel DP
65. Jutla A
66. Khan R
67. Poultney M
68. Colwell RR
69. Rivera-García B
70. Barker CM
71. Bell JE
72. Biggerstaff M
73. Swerdlow D
74. Mier-Y-Teran-Romero L
75. Forshey BM
76. Trtanj J
77. Asher J
78. Clay M
79. Margolis HS
80. Hebbeler AM
81. George D
82. Chretien JP
(2019) An open challenge to advance probabilistic forecasting for dengue epidemics
PNAS 116:24268–24274.

https://doi.org/10.1073/pnas.1909865116
1. Moran KR
2. Fairchild G
3. Generous N
4. Hickmann K
5. Osthus D
6. Priedhorsky R
7. Hyman J
8. Del Valle SY
(2016) Epidemic forecasting is messier than weather forecasting: the role of human behavior and Internet data streams in epidemic forecast
The Journal of Infectious Diseases 214:S404–S408.

https://doi.org/10.1093/infdis/jiw375
Preprint
1. Ray EL
2. Wattanachit N
3. Niemi J
4. Kanji AH
5. House K
6. Cramer EY
7. Bracher J
8. Zheng A
9. Yamana TK
10. Xiong X
11. Woody S
12. Wang Y
13. Wang L
14. Walraven RL
15. Tomar V
16. Sherratt K
17. Sheldon D
18. Reiner RC
19. Prakash BA
20. Osthus D
21. Li ML
22. Lee EC
23. Koyluoglu U
24. Keskinocak P
25. Gu Y
26. Gu Q
27. George GE
28. España G
29. Corsetti S
30. Chhatwal J
31. Cavany S
32. Biegel H
33. Ben-Nun M
34. Walker J
35. Slayton R
36. Lopez V
37. Biggerstaff M
38. Johansson MA
39. Reich NG
(2020) Ensemble Forecasts of Coronavirus Disease 2019 (COVID-19) in the U.S
medRxiv.

https://doi.org/10.1101/2020.08.19.20177493
1. Ray EL
2. Brooks LC
3. Bien J
4. Biggerstaff M
5. Bosse NI
6. Bracher J
7. Cramer EY
8. Funk S
9. Gerding A
10. Johansson MA
11. Rumack A
12. Wang Y
13. Zorn M
14. Tibshirani RJ
15. Reich NG
(2022) Comparing trained and untrained probabilistic ensemble forecasts of COVID-19 cases and deaths in the United States
International Journal of Forecasting 1:005.

https://doi.org/10.1016/j.ijforecast.2022.06.005
1. Reich NG
2. Brooks LC
3. Fox SJ
4. Kandula S
5. McGowan CJ
6. Moore E
7. Osthus D
8. Ray EL
9. Tushar A
10. Yamana TK
11. Biggerstaff M
12. Johansson MA
13. Rosenfeld R
14. Shaman J
(2019a) A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States
PNAS 116:3146–3154.

https://doi.org/10.1073/pnas.1812594116
1. Reich NG
2. McGowan CJ
3. Yamana TK
4. Tushar A
5. Ray EL
6. Osthus D
7. Kandula S
8. Brooks LC
9. Crawford-Crudell W
10. Gibson GC
11. Moore E
12. Silva R
13. Biggerstaff M
14. Johansson MA
15. Rosenfeld R
16. Shaman J
(2019b) Accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the U.S
PLOS Computational Biology 15:e1007486.

https://doi.org/10.1371/journal.pcbi.1007486
1. Reich NG
2. Cornell M
3. Ray EL
4. House K
5. Le K
(2021) The Zoltar forecast archive, a tool to standardize and store interdisciplinary prediction research
Scientific Data 8:59.

https://doi.org/10.1038/s41597-021-00839-5
Data
1. Sherratt K
(2022) European COVID-19 Forecast Hub
Zenodo.

https://doi.org/10.5281/zenodo.7356267
1. Taylor JW
2. Taylor KS
(2023) Combining probabilistic forecasts of COVID-19 mortality in the United States
European Journal of Operational Research 304:25–41.

https://doi.org/10.1016/j.ejor.2021.06.044
(2021) Three ways in which pandemic models may perform a pandemic
Erasmus Journal for Philosophy and Economics 14:110–127.

https://doi.org/10.23941/ejpe.v14i1.582
(2018) The RAPIDD Ebola forecasting challenge: synthesis and lessons learnt
Epidemics 22:13–21.

https://doi.org/10.1016/j.epidem.2017.08.002
Data
1. Wang SY
2. Stark A
3. Ray E
4. Bosse N
5. Reich NG
6. Sherratt K
7. Shah A
(2021) reichlab/covidHubUtils: Repository release for Zenodo
Zenodo.

https://doi.org/10.5281/zenodo.5207940
1. Zelner J
2. Riou J
3. Etzioni R
4. Gelman A
(2021) Accounting for uncertainty during a pandemic
Patterns 2:100310.

https://doi.org/10.1016/j.patter.2021.100310

Article and author information

Author details

Katharine Sherratt

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Conceptualization, Data curation, Software, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review and editing

For correspondence
katharine.sherratt@lshtm.ac.uk

Competing interests
No competing interests declared

ORCID iD: 0000-0003-2049-3423
Hugo Gruson

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Software, Writing – review and editing

Competing interests
No competing interests declared
Rok Grah

European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden

Contribution
Funding acquisition, Project administration, Writing – review and editing

Competing interests
No competing interests declared
Helen Johnson

European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden

Contribution
Funding acquisition, Project administration, Writing – review and editing

Competing interests
No competing interests declared
Rene Niehus

European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden

Contribution
Funding acquisition, Project administration, Writing – review and editing

Competing interests
No competing interests declared
Bastian Prasse

European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden

Contribution
Funding acquisition, Project administration, Writing – review and editing

Competing interests
No competing interests declared
Frank Sandmann

European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden

Contribution
Funding acquisition, Project administration, Writing – review and editing

Competing interests
No competing interests declared
Jannik Deuschel

Karlsruhe Institute of Technology, Karlsruhe, Germany

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Daniel Wolffram

Karlsruhe Institute of Technology, Karlsruhe, Germany

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-0318-3669
Sam Abbott

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Validation, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Alexander Ullrich

Robert Koch Institute, Berlin, Germany

Contribution
Software, Visualization

Competing interests
No competing interests declared
Graham Gibson

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Evan L Ray

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Nicholas G Reich

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Daniel Sheldon

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Yijin Wang

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-4438-6366
Nutcha Wattanachit

University of Massachusetts Amherst, Amherst, United States

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Lijing Wang

Boston Children’s Hospital and Harvard Medical School, Boston, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jan Trnka

Third Faculty of Medicine, Charles University, Prague, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1786-7562
Guillaume Obozinski

École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Tao Sun

École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-6357-6726
Dorina Thanou

École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Loic Pottier

Éducation nationale, Valbonne, France

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Ekaterina Krymova

Eidgenössische Technische Hochschule, Zurich, Switzerland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jan H Meinke

Forschungszentrum Jülich GmbH, Jülich, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Maria Vittoria Barbarossa

Frankfurt Institute for Advanced Studies, Frankfurt, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Neele Leithauser

Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jan Mohring

Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Johanna Schneider

Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-9330-2838
Jaroslaw Wlazlo

Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jan Fuhrmann

Heidelberg University, Heidelberg, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-7091-3740
Berit Lange

Helmholtz Centre for Infection Research, Braunschweig, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Isti Rodiah

Helmholtz Centre for Infection Research, Braunschweig, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Prasith Baccam

IEM, Inc, Bel Air, United States

Contribution
Methodology, Writing – review and editing

Competing interests
Affiliated with IEM, Inc. The author has no financial interests to declare
Heidi Gurung

IEM, Inc, Bel Air, United States

Contribution
Methodology, Writing – review and editing

Competing interests
Affiliated with IEM, Inc. The author has no financial interests to declare
Steven Stage

IEM, Inc, Baton Rouge, United States

Contribution
Methodology, Writing – review and editing

Competing interests
Affiliated with IEM, Inc. The author has no financial interests to declare
Bradley Suchoski

IEM, Inc, Bel Air, United States

Contribution
Methodology, Writing – review and editing

Competing interests
Affiliated with IEM, Inc. The author has no financial interests to declare
Jozef Budzinski

Independent researcher, Vienna, Austria

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Robert Walraven

Independent researcher, Davis, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Inmaculada Villanueva

Institut d’Investigacions Biomèdiques August Pi i Sunyer, Universitat Pompeu Fabra, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-4940-085X
Vit Tucek

Institute of Computer Science of the CAS, Prague, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Martin Smid

Institute of Information Theory and Automation of the CAS, Prague, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Milan Zajicek

Institute of Information Theory and Automation of the CAS, Prague, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-3226-7266
Cesar Perez Alvarez

Inverence, Madrid, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Borja Reina

Inverence, Madrid, Spain

Contribution
Methodology, Writing – review and editing, Conceptualization

Competing interests
No competing interests declared
Nikos I Bosse

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Sophie R Meakin

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Lauren Castro

Los Alamos National Laboratory, Los Alamos, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Geoffrey Fairchild

Los Alamos National Laboratory, Los Alamos, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Isaac Michaud

Los Alamos National Laboratory, Los Alamos, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Dave Osthus

Los Alamos National Laboratory, Los Alamos, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Pierfrancesco Alaimo Di Loro

LUMSA University, Rome, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Antonello Maruotti

LUMSA University, Rome, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8377-9950
Veronika Eclerova

Masaryk University, Brno, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8476-7740
Andrea Kraus

Masaryk University, Brno, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
David Kraus

Masaryk University, Brno, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Lenka Pribylova

Masaryk University, Brno, Czech Republic

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Dimitris Bertsimas

Massachusetts Institute of Technology, Cambridge, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Michael Lingzhi Li

Massachusetts Institute of Technology, Cambridge, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Saksham Soni

Massachusetts Institute of Technology, Cambridge, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jonas Dehning

Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Sebastian Mohr

Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Viola Priesemann

Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8905-5873
Grzegorz Redlarski

Medical University of Gdansk, Gdańsk, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Benjamin Bejar

Paul Scherrer Institute, Villigen, Switzerland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Giovanni Ardenghi

Politecnico di Milano, Milan, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Nicola Parolini

Politecnico di Milano, Milan, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Giovanni Ziarelli

Politecnico di Milano, Milan, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Wolfgang Bock

Technical University of Kaiserslautern, Kaiserslautern, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Stefan Heyder

Technische Universität Ilmenau, Ilmenau, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Thomas Hotz

Technische Universität Ilmenau, Ilmenau, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
David E Singh

Universidad Carlos III de Madrid, Leganes, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Miguel Guzman-Merino

Universidad Carlos III de Madrid, Leganes, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jose L Aznarte

Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
David Morina

Universitat de Barcelona, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Sergio Alonso

Universitat Politècnica de Catalunya, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-3989-8757
Enric Alvarez

Universitat Politècnica de Catalunya, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Daniel Lopez

Universitat Politècnica de Catalunya, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Clara Prats

Universitat Politècnica de Catalunya, Barcelona, Spain

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-1398-7559
Jan Pablo Burgard

Universität Trier, Trier, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-5771-6179
Arne Rodloff

University of Cologne, Cologne, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Tom Zimmermann

University of Cologne, Cologne, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Alexander Kuhlmann

University of Halle, Halle, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Janez Zibert

University of Ljubljana, Ljubljana, Slovenia

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Fulvia Pennoni

University of Milano-Bicocca, Milano, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Fabio Divino

University of Molise, Pesche, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Marti Catala

University of Oxford, Oxford, United Kingdom

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Gianfranco Lovison

University of Palermo, Palermo, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Paolo Giudici

University of Pavia, Pavia, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Barbara Tarantino

University of Pavia, Pavia, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Francesco Bartolucci

University of Perugia, Perugia, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Giovanna Jona Lasinio

University of Rome "La Sapienza", Rome, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Marco Mingione

University of Rome "La Sapienza", Rome, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Alessio Farcomeni

University of Rome "Tor Vergata", Rome, Italy

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-7104-5826
Ajitesh Srivastava

University of Southern California, Los Angeles, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Pablo Montero-Manso

University of Sydney, Sydney, Australia

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Aniruddha Adiga

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Benjamin Hurt

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Bryan Lewis

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-0793-6082
Madhav Marathe

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Przemyslaw Porebski

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8012-5791
Srinivasan Venkatramanan

University of Virginia, Charlottesville, United States

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Rafal P Bartczuk

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-0433-7327
Filip Dreger

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Anna Gambin

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Krzysztof Gogolewski

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-5523-5198
Magdalena Gruziel-Slomka

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Bartosz Krupa

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Antoni Moszyński

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Karol Niedzielewski

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jedrzej Nowosielski

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Maciej Radwan

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Franciszek Rakowski

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Marcin Semeniuk

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Ewa Szczurek

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Jakub Zielinski

University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8935-8137
Jan Kisielewski
1. University of Warsaw, Warsaw, Poland
2. University of Bialystok, Bialystok, Poland
Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Barbara Pabjan

University of Wroclaw, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Holger Kirsten

Universität Leipzig, Leipzig, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Yuri Kheifetz

Universität Leipzig, Leipzig, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Markus Scholz

Universität Leipzig, Leipzig, Germany

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Przemyslaw Biecek

Warsaw University of Technology, Warsaw, Poland

Present address
University of Warsaw, Warsaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Marcin Bodych

Wroclaw University of Science and Technology, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Maciej Filinski

Wroclaw University of Science and Technology, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Radoslaw Idzikowski

Wroclaw University of Science and Technology, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Tyll Krueger

Wroclaw University of Science and Technology, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Tomasz Ozanski

Wroclaw University of Science and Technology, Wroclaw, Poland

Contribution
Methodology, Writing – review and editing

Competing interests
No competing interests declared
Johannes Bracher

Karlsruhe Institute of Technology, Karlsruhe, Germany

Contribution
Software, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Sebastian Funk

London School of Hygiene & Tropical Medicine, London, United Kingdom

Contribution
Conceptualization, Software, Supervision, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2842-3406

Funding

Netzwerk Universitätsmedizin (Project egePan 01KX2021)

Jonas Dehning
Sebastian Mohr
Viola Priesemann

FISR (SMIGE - Modelli statistici inferenziali per governare l'epidemia, FISR 2020 - Covid-19 I Fase, FISR2020IP_00156, Codice Progetto - PRJ-0695)

Antonello Maruotti
Gianfranco Lovison
Alessio Farcomeni

Agència de Qualitat i Avaluació Sanitàries de Catalunya (Contract 2021_021OE)

Inmaculada Villanueva

European Centre for Disease Prevention and Control

Katharine Sherratt

European Commission (Communications Networks Content and Technology LC-01485746, Ministerio CIU/FEDER PGC2018-095456-B-I00)

Sergio Alonso
Enric Alvarez
Daniel Lopez
Clara Prats

Bundesministerium für Bildung und Forschung (05M18SIA)

Stefan Heyder
Thomas Hotz
Jan Pablo Burgard

Health Protection Research Unit (NIHR200908)

Nikos I Bosse

InPresa (Lombardy Region Italy)

Fulvia Pennoni
Francesco Bartolucci

Los Alamos National Laboratory

Lauren Castro

MUNI (Mathematical and Statistical modelling project (MUNI/A/1615/2020), MUNI/11/02202001/2020)

Veronika Eclerova
Lenka Pribylova

Ministerio de Sanidad

Cesar Perez Alvarez

Ministry of Science and Higher Education of Poland (28/WFSN/2021)

Rafal P Bartczuk

National Institute of General Medical Sciences (R35GM119582)

Graham Gibson

National Institutes of Health (1R01GM109718)

Lijing Wang

Virginia Department of Health (VDH-21-501-0141)

Aniruddha Adiga

Virginia Department of Health (VDH-21-501-0143)

Benjamin Hurt

Virginia Department of Health (VDH-21-501-0147)

Bryan Lewis

Virginia Department of Health (VDH-21-501-0142)

Lijing Wang

Virginia Department of Health (VDH-21-501-0148)

Madhav Marathe

Virginia Department of Health (VDH-21-501-0145)

Przemyslaw Porebski

Virginia Department of Health (VDH-21-501-0146)

Srinivasan Venkatramanan

Narodowe Centrum Badań i Rozwoju (INFOSTRATEG-I/0022/2021-00)

Przemyslaw Biecek

Horizon 2020 (PERISCOPE 101016233)

Paolo Giudici
Barbara Tarantino

German Free State of Saxony (LO-342/17-1)

Holger Kirsten
Yuri Kheifetz
Markus Scholz

Spanish Ministry of Health, Social Policy and Equality (REACT-UE (FEDER))

David E Singh

Wellcome Trust (210758/Z/18/Z)

Sam Abbott

RECETOX Přírodovědecké Fakulty Masarykovy Univerzity (LM2018121)

Veronika Eclerova

CETOCOEN EXCELLENCE (CZ.02.1.01/0.0/0.0/17-043/0009632)

Veronika Eclerova

RECETOX RI project (CZ.02.1.01/0.0/0.0/16-013/0001761)

Veronika Eclerova

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.