Estimation and worldwide monitoring of the effective reproductive number of SARS-CoV-2

Abstract
Editor's evaluation
eLife digest
Introduction
Results
Discussion
Materials and methods
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
Data availability
References
Article and author information
Metrics

Abstract

The effective reproductive number R_e is a key indicator of the growth of an epidemic. Since the start of the SARS-CoV-2 pandemic, many methods and online dashboards have sprung up to monitor this number through time. However, these methods are not always thoroughly tested, correctly placed in time, or are overly confident during high incidence periods. Here, we present a method for timely estimation of R_e, applied to COVID-19 epidemic data from 170 countries. We thoroughly evaluate the method on simulated data, and present an intuitive web interface for interactive data exploration. We show that, in early 2020, in the majority of countries the estimated R_e dropped below 1 only after the introduction of major non-pharmaceutical interventions. For Europe the implementation of non-pharmaceutical interventions was broadly associated with reductions in the estimated R_e. Globally though, relaxing non-pharmaceutical interventions had more varied effects on subsequent R_e estimates. Our framework is useful to inform governments and the general public on the status of epidemics in their country, and is used as the official source of R_e estimates for SARS-CoV-2 in Switzerland. It further allows detailed comparison between countries and in relation to covariates such as implemented public health policies, mobility, behaviour, or weather data.

Editor's evaluation

Understanding the trajectory of epidemic growth and predicting it in real-time is an important goal of epidemiological modelling. This work aggregates data from 170 countries in an effort to better understand how the effective reproduction number of SARS-CoV-2 spread evolved over time and across the world.

https://doi.org/10.7554/eLife.71345.sa0

eLife digest

Over the past two and a half years, countries around the globe have struggled to control the transmission of the SARS-CoV-2 virus within their borders. To manage the situation, it is important to have an accurate picture of how fast the virus is spreading. This can be achieved by calculating the effective reproductive number (Re), which describes how many people, on average, someone with COVID-19 is likely to infect. If the Re is greater than one, the virus is infecting increasingly more people, but if it is smaller than one, the number of cases is declining.

Scientists use various strategies to estimate the Re, which each have their own strengths and weaknesses. One of the main difficulties is that infections are typically recorded only when people test positive for COVID-19, are hospitalized with the virus, or die. This means that the data provides a delayed representation of when infections are happening. Furthermore, changes in these records occur later than measures that change the infection dynamics. As a result, researchers need to take these delays into account when estimating Re.

Here, Huisman, Scire et al. have developed a new method for estimating the Re based on available data records, statistically taking into account the above-mentioned delays. An online dashboard with daily updates was then created so that policy makers and the population could monitor the values over time.

For over two years, Huisman, Scire et al. have been applying their tool and dashboard to COVID-19 data from 170 countries. They found that public health interventions, such as mask requirements and lockdowns, did help reduce the Re in Europe. But the effects were not uniform across the globe, likely because of variations in how restrictions were implemented and followed during the pandemic. In early 2020, the Re only dropped below one after countries put lockdowns or other severe measures in place.

The Re values added to the dashboard over the last two years have been used pro-actively to inform public health policies in Switzerland and to monitor the spread of SARS-CoV-2 in South Africa. The team has also recently released programming software based on this method that can be used to track future disease outbreaks, and extended the method to estimate the Re using SARS-CoV-2 levels in wastewater.

Introduction

During an infectious-disease outbreak, such as the SARS-CoV-2 pandemic, accurate monitoring of the epidemic situation is critical to the decision-making process of governments and public health authorities. The magnitude of an epidemic, as well as its spatial and temporal infection dynamics determine the exposure risk posed to citizens in the near and long-term future, the pressure on critical infrastructure like hospitals, and the overall burden of disease to society.

The effective reproductive number $R_{e}$ is a key indicator to describe how efficiently a pathogen spreads in a given population at a given time (Anderson and May, 1991; Cauchemez et al., 2006; Wallinga and Lipsitch, 2007). It quantifies the average number of secondary infections caused by a primary infected individual. It also has a natural threshold value of 1, below which the epidemic reduces in size Anderson and May, 1991; Nishiura and Chowell, 2009. $R_{e}$ typically changes during the course of an epidemic as a result of the depletion of susceptible individuals, changed contact behaviour, seasonality of the pathogen, or the effect of pharmaceutical and non-pharmaceutical interventions (NPIs) (Anderson and May, 1991; Delamater et al., 2019; Scire et al., 2020; Ali et al., 2020; Flaxman et al., 2020).

Different methods have been developed to estimate $R_{e}$ . They broadly fall into two categories: those based on compartmental models, (e.g. Delamater et al., 2019; Kucharski et al., 2020; Zhou et al., 2020), and those that infer the number of secondary infections per infected individual directly, based on a time series of infection incidence, (e.g. Wallinga and Teunis, 2004; Cori et al., 2013). We focus on the latter class of methods as they rely on few, simple assumptions, are less prone to model misspecifications, and are well-suited for ongoing monitoring of the epidemic (Gostic et al., 2020). In particular, we consider the EpiEstim method of Cori et al., 2013.

The infection incidence based methods face the difficulty that infection events cannot be observed directly (Gostic et al., 2020). These events can only be surmised with a certain time lag, e.g. when individuals show symptoms and are tested, via contact tracing, or via periodic testing of a cohort of individuals (Nishiura and Chowell, 2009). To use these methods, one must thus employ a proxy for infection events (e.g. the observed incidence of confirmed cases, hospitalisations, or deaths). This proxy is either used directly in lieu of the infection incidence, or it is used as an indirect observation to infer past infections (Gostic et al., 2020). It is important to relate $R_{e}$ estimates to the timing of infection events because this allows multiple proxies of infection events, with differing delays, to be used separately to monitor the same epidemic (Scire et al., 2020). In addition, any factors that may affect transmission dynamics will do so at the time infections occurred. If $R_{e}$ is placed properly on this timescale, it can be compared directly to external covariates like weather and interventions (Flaxman et al., 2020; Soltesz et al., 2020). However, depending on the method used to infer the timing of infections from the observed incidence time series, one can also introduce biases such as smoothing sudden changes in $R_{e}$ (Gostic et al., 2020; Goldstein et al., 2009; Petermann and Wyler, 2020).

Several methods, software packages, and online dashboards have been developed to monitor developments in $R_{e}$ during the SARS-CoV-2 pandemic (e.g. Abbott et al., 2020b; Systrom et al., 2020; Tebé et al., 2020; Scott et al., 2020; Hamouda, 2020; Richter et al., 2020). A pipeline for the continuous estimation of $R_{e}$ using infection incidence based methods should include four critical steps: (i) gathering and curating observable proxy data of infection incidence, (ii) reconstruction of the unobserved infection events, (iii) $R_{e}$ estimation, and (iv) communication of the results, including uncertainty and potential biases. These four axes also define the differences between existing methods. The first step dictates e.g. the geographical scope of the $R_{e}$ estimates reported. During the SARS-CoV-2 epidemic, many local public health authorities have made case data publicly available. Depending on the data sources used, estimated $R_{e}$ values span from the scale of a city, region, country, or the entire globe (Systrom et al., 2020; Pan et al., 2020; Robert Koch-Institut, 2020). The second step, i.e. going from a noisy time series of indirect observations to an infection incidence time series, is technically challenging. Biases can be introduced easily, and accurately assessing the uncertainty around the inferred infection incidence is a challenge in itself (Gostic et al., 2020). For the third step, i.e. to estimate $R_{e}$ from a timeline of infection events, there are ready-to-use software packages (Cori et al., 2013; Obadia et al., 2012), which produce $R_{e}$ estimates along with an estimate of the uncertainty resulting from this step. Finally, the communication of results to the general public and decision makers is essential, but often overlooked. We present a pipeline, together with an online dashboard, for timely monitoring of $R_{e}$ . We use publicly available data gathered by different public health authorities. Wherever possible, we show results obtained from different types of case reports (confirmed cases, hospitalisations or deaths). This allows comparison across observation types and to balance the biases inherent in the different types. Results are updated daily, and can be found on https://ibz-shiny.ethz.ch/covid-19-re-international/. The results of this method have been used directly in public health policy making in Switzerland for the past 2 years, and were also communicated by the Federal Office of Public Health on https://www.covid19.admin.ch/en/overview during that time. Through continuous engagement with the public, scientific experts, and thorough evaluation on simulated scenarios, we have created a robust and transparent method of enduring relevance for the current and future epidemics.

Because $R_{e}$ estimates reflect changes in virus transmission dynamics, they can be used to assess the impact of public health interventions. Prior work on the relative impact of specific non-pharmaceutical interventions on $R_{e}$ has shown conflicting results (Flaxman et al., 2020; Esra et al., 2020; Banholzer et al., 2021; Soltesz et al., 2020; Haug et al., 2020; Kohanovski et al., 2022). These differences can be attributed mostly to different model formulations (Soltesz et al., 2020; Sharma et al., 2020; Banholzer et al., 2022), including differing assumptions on the independence of NPIs (Sharma et al., 2020), differing timescales over which the effect of the NPI was analysed (Flaxman et al., 2020; Haug et al., 2020), whether the time point of the NPI was assumed fixed or allowed to vary (Kohanovski et al., 2022), and differing geographical scope. There is a need to address whether the strength of measures and the speed of their implementation resulted in a larger and faster decrease of $R_{e}$ , and specifically whether highly restrictive lockdowns were necessary to achieve $R_{e} < 1$ . Further, it remains unclear how the impact of interventions differed across time and geographical regions. We add to this debate by using our $R_{e}$ estimates across geographical regions and timescales that include the lifting of many NPIs. While we cannot determine causal relationships, we use our method to assess likely associations.

Results

A pipeline to estimate the effective reproductive number of SARS-CoV-2

We have developed a pipeline to estimate the time-varying effective reproductive number of SARS-CoV-2 from observed COVID-19 case incidence time series (see Materials and Methods). The objective was to achieve stable estimates for multiple types of data, and with an adequate representation of uncertainty. At the core, we use the EpiEstim method (Cori et al., 2013) to estimate $R_{e}$ from a time series of infection incidence. To infer the infection incidence from a time series of (noisy) observations, we extended the deconvolution method by Goldstein et al. to deal with partially observed data and time-varying delay distributions (Gostic et al., 2020; Goldstein et al., 2009). We smooth the data prior to deconvolution to reduce numerical artefacts resulting from their weekly patterns and overall noisy nature. We compute point-wise 95% confidence intervals for the true $R_{e}$ values, using the union of a block bootstrap method, designed to account for variation in the case observations, and the credible intervals from EpiEstim. As observed incidence data we use COVID-19 confirmed case data, hospital admissions, and deaths (with type specific delay distributions, see Materials and Methods). We publish separate $R_{e}$ estimates for each of these types of incidence data. The most recent $R_{e}$ estimate lies further in the past than the most recent observed incidence data due to the delay between infection and case observation.

Evaluation on simulated data

To evaluate our method, we used simulations of several epidemic scenarios (see Materials and Methods for more details). For each scenario, we specified an $R_{e}$ time series. The specified $R_{e}$ trajectories were parametrised in a piecewise linear fashion. To mimic the course of the COVID-19 outbreaks observed in many European countries in 2020 (Lemaitre et al., 2020), we started with $R_{e}$ values around 3, then dropped to a value below 1 (the ‘initial decrease’), stayed around 1 in summer and slightly above 1 (the ‘second wave’) in autumn (Figure 1). From each specified $R_{e}$ trajectory, we stochastically simulated 100 time series of infections and their resulting case observations. To account for reporting effects and to better mimic observed COVID-19 case data from around the world, we added additional autocorrelated noise to the case observations.

Figure 1

Download asset Open asset

Evaluation of the pipeline on simulated data.

(A) The specified $R_{e}$ trajectory (black line) was used to stochastically simulate 100 trajectories of observed cases. From each trajectory, we estimated $R_{e}$ (yellow boxplots) and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). (B) Fraction of simulations for which the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. (C) Root mean squared relative error for every time point. (D) Fraction of simulations for which $R_{e}$ is estimated to be significantly above or below one, depending on the true value of $R_{e}$ .

We then used our method to infer the infection incidence and $R_{e}$ from each simulated time series of case observations, and compared these to the true underlying $R_{e}$ values (Figure 1). The results show that the method accurately estimates the effective reproductive number (Figure 1; metrics described in Materials and Methods). Across most time points, the 95% confidence interval includes the true $R_{e}$ value (coverage; Figure 1B). The low root mean squared error (RMSE) indicates that our point estimates closely track the true $R_{e}$ value (Figure 1C). Importantly, we correctly infer whether $R_{e}$ is significantly above or below 1 in this scenario: we never infer that $R_{e}$ is significantly above 1 when the true value is below 1, and only for two time points the estimates are significantly below 1 for some simulations when the true value is a little above 1 (Figure 1D). Due to the smoothing step prior to deconvolution, we slightly misestimate $R_{e}$ during steep changes (see Appendix 2, and Appendix 3—figure 1 for more scenarios). However, the inclusion of smoothing greatly improves the performance across scenarios with different types of observation noise (Appendix 3—figures 2 and 3). For a wide range of infection incidences, the 95% confidence interval is informative and covers the true value of $R_{e}$ (Appendix 3—figures 4 and 5). Our block-bootstrapping method greatly improves coverage compared to the out-of-the-box EpiEstim method. Our method also outperforms the common approach of using a fixed delay to infer the infection incidence time series (Appendix 3—figure 6).

We further tested the impact of misspecifying the delay between infections and observations. Misspecifying the mean of the delay distribution between infection and case observation by up to 2 days does not have a strong effect on the $R_{e}$ estimates, whereas larger misspecifications by 5 or 10 days lead to a more substantial decrease in coverage (Appendix 3—figure 7). Correspondingly, allowing for time-varying delay distributions has a pronounced effect on the estimated $R_{e}$ only when changes in the mean of the delay distribution are large (Appendix 3—figure 8). The impact of other model misspecifications, e.g. of the generation time interval, have been investigated by Gostic et al., 2020 for this class of methods.

Stability of the $R_{e}$ estimates in an outbreak monitoring context

As our $R_{e}$ estimates for SARS-CoV-2 were directly policy relevant in Switzerland, we investigated their stability as new data becomes available (up to 21 additional days of data; Figure 2). With each new day of incidence data, it becomes possible to estimate $R_{e}$ for an additional day in the past. The most recent available $R_{e}$ estimate will be delayed with respect to the observed case incidence by at least the median delay from infection to observation to ensures sufficient information is available in the data (and can be delayed further if required; see below). As time passes and more observations become available, one can estimate more recent values of $R_{e}$ . Importantly, the estimated $R_{e} (t)$ for a day $t$ is updated whenever additional data is added. This means that the $R_{e} (t)$ estimates initially change with each passing day before they settle on a long-term, stable value. In the analysis of Swiss COVID-19 case data in Figure 2B, $R_{e}$ estimates for Sept. 1 2020 to April 1 2021 based on data up to April 30 2021 are referred to as ‘stable $R_{e}$ estimates’. During rapid changes in the real $R_{e}$ , the initial estimates for $R_{e} (t)$ can occasionally under- or overshoot the long-term stable value. However, with our improved 95% confidence intervals (CI), the percentage of the first estimated CI for $R_{e} (t)$ that is contained within the stable CI is substantially improved compared to purely EpiEstim-based uncertainty intervals (Figure 2). This difference is particularly striking during periods of high case incidence (e.g. October 2020), when the EpiEstim uncertainty interval is very narrow.

Figure 2

Download asset Open asset

Stability of the Swiss $R_{e}$ estimates based on confirmed COVID-19 cases, upon adding additional days of observations.

(A) Line segments correspond to 3 weeks of estimates made with the same input data (e.g. data up to December 1st). The segments were assigned an arbitrary colour for ease of distinction. For each day, $R_{e}$ estimates and associated 95% confidence intervals (CIs) are overlaid, from the first possible estimate for that day up to estimates including 3 additional weeks of data. The latter, always the left end of a line segment, corresponds to the stable estimate. (B) Percentage of the first estimated CI that is contained in the stable CI based on data from 30 April 2021. This percentage was calculated as the width of the intersection of both CIs, divided by the width of the first CI. The colour indicates whether the stable $R_{e}$ estimate was contained in the first reported CI. In both rows, the left column shows uncertainty intervals from EpiEstim on the original data, and the right our improved 95% CIs. Both columns use the same pipeline, and differ only in the construction of the uncertainty intervals.

We complemented this analysis on empirical data with an assessment of the stability of $R_{e}$ estimates on synthetic data. Using the set-up described before, we simulated 4 $R_{e}$ scenarios: a constant trend, a slow increase, a slow decrease, and an abrupt decrease in $R_{e}$ . Contrary to before, we estimated $R_{e}$ repeatedly, adding an additional day to the simulation in each iteration (Appendix 4—figure 1). For each scenario, we compare ‘raw’ $R_{e}$ estimates to trajectories for which the 4 most recent $R_{e}$ estimates were removed. This analysis shows that the last few $R_{e}$ estimates can lay outside of the stabilized confidence interval, in particular when the real $R_{e}$ trend is increasing. Instead, the truncated trajectories appear more stable as their most recent estimate has already been consolidated over 4 days. This highlights a trade-off between timeliness and accuracy when publishing $R_{e}$ estimates. On our online dashboard we present truncated $R_{e}$ estimates for Swiss cantons since these estimates were directly policy-relevant in Switzerland.

Detailed data allows more precise analysis: the example of Switzerland

When detailed epidemiological data about individual cases is available (in the form of a line list), the precision of our method can be increased by relaxing the assumptions that (i) distributions of delays between infection and observation do not change through time and (ii) outbreaks occur in populations that are isolated at the country-level. In particular, we collaborated with the Federal Office of Public Health (FOPH) in Switzerland to relax these assumptions and further refine the monitoring of the Swiss SARS-CoV-2 epidemic.

The FOPH line list contains information on the delays between onset of symptoms and reporting - of a positive test, hospitalisation or death - for a substantial fraction of the reported cases. For each of these three types of case report, we estimate the time-varying empirical delay distribution for the delay from infection to reporting. We use this time-varying distribution as input to the deconvolution step, instead of the fixed delay distribution from the literature which is used for countries without an available line list (for details see Materials and Methods section 4.3). Each delay distribution is thus tailored to the specifics of the Swiss population and health system. Since each distribution varies through time, it reflects changes caused by e.g. improved contact tracing or overburdened health offices (see Appendix 4—figure 2; Appendix 1). Whenever available in the FOPH line list, we use the symptom onset date of patients as the date of observation and thus only deconvolve the incubation period to obtain a time series of infection dates. This was most relevant until early 2021, after which the date of symptom onset was rarely recorded anymore. For most days, the effect of these modifications on the $R_{e}$ point estimates is slight (Appendix 4—figure 3; shown for confirmed case data), yet the difference for a particular day can be as big as 20%.

Using FOPH data on the fraction of cases infected abroad, we can correct our $R_{e}$ estimate for imported cases. This is especially important in phases during which the local epidemic is seeded from abroad, and local transmission occurs at a low rate relative to case importation (Appendix 4—figure 4). This correction, which relies on EpiEstim (Cori et al., 2013), treats imported cases as pure infectors, and not infectees. It further assumes that imported cases transmit at the same rate as local cases. When this is not the case, e.g. when strict quarantines are imposed on travellers or travellers have more contacts compared to the rest of the population, and when a large fraction of all cases are imported, this can bias results (Tsang et al., 2021). Additionally, since we do not have data on the number of cases infected in Switzerland that are “exported" to other countries, we cannot correct for exports. Thus, the estimated $R_{e}$ value corrected for imports is a lower bound for the $R_{e}$ estimate which would be obtained if we could account for the location of infection of all cases detected in Switzerland or exported out of the country.

Comparison with existing methods

During the COVID-19 pandemic, research groups and local public health authorities around the world developed and used methods to estimate the effective reproductive number (Abbott et al., 2020a; Scott et al., 2020; Hamouda, 2020; Richter et al., 2020). To put our method into context, we compare against EpiNow2, a prominent R-package to estimate (Abbott et al., 2020a), and the official $R_{e}$ estimates for Germany (developed by the Robert Koch-Institut, 2020; RKI Hamouda, 2020) and Austria (developed by the Austrian Agency for Health and Food Safety; AGES Richter et al., 2020). A detailed comparison of the structure and features of these methods —with the addition of the epidemia R-package (Scott et al., 2020)— can be found in Supplementary file 2. To the best of our knowledge, our method is the only one that can account for variations through time in delay distributions and combine symptom onset data with case data.

We compiled publicly available $R_{e}$ estimates for Switzerland, Austria and Germany from these research groups and institutions (Figure 3; Abbott et al., 2022; Heiden, 2021; TU Graz AGES, 2021). In this case the underlying truth is unknown, yet we can compare how well the point estimates and confidence intervals correspond between different methods. Both the RKI and AGES publish very narrow confidence intervals, which are unlikely to accurately capture the uncertainty around the estimates (similar to Figure 2). In addition, the trend of AGES estimates appear shifted closer to the present than our estimates (Figure 3). This is because AGES applies EpiEstim directly to observed case data, thus indirectly assuming that case confirmation occurs on the day of infection. The estimates from EpiNow2 and ours follow a qualitatively similar trend, although the EpiNow2 estimates are smoother and there is a small lag between both estimates, likely due to differences in the specified observation delay distribution. An in-depth comparison exploring the accuracy and stability of estimates produced by all available methods lies beyond the scope of this work, but would certainly be beneficial to give a full picture of the state-of-the-art of $R_{e}$ estimation methods.

Figure 3

Download asset Open asset

Comparison of published $R_{e}$ estimates for three countries.

Point estimates are presented with a solid line and 95% confidence intervals are presented as coloured ribbons.

Monitoring $R_{e}$ during the COVID-19 pandemic

We developed an online dashboard (https://ibz-shiny.ethz.ch/covid-19-re-international/) on which we present daily-updated results of this $R_{e}$ estimation method applied to COVID-19 case data from 170 countries (Figure 4). For most countries, we include multiple observation sources, such as daily incidence of COVID-19 cases and deaths, and, when available, hospital admissions. We estimate $R_{e}$ separately from each incidence type and make the estimates available for download, as an open resource for other researchers and the general public alike.

Figure 4

Download asset Open asset

Example panels from the online dashboard.

(A) Swiss case incidence with evidence of weekly testing patterns (top row), $R_{e}$ estimates with associated 95% confidence intervals from four types of observation data (middle row), and timeline of stringency index and vaccination coverage (bottom row). (B) World map of incidence per 100’000 inhabitants over the last 14 days. One can also display the worldwide $R_{e}$ estimates instead. (C) Comparison of $R_{e}$ estimates across four countries (Austria, Chile, India and Morocco), with timelines of stringency indices and vaccination coverage. All panels were extracted on May 12, 2022. Dashboard url: https://ibz-shiny.ethz.ch/covid-19-re-international.

The online app allows for comparison through time within a single country, between multiple observation traces, and between multiple countries. The data download further allows users to put these estimates in relation to external covariates such as mobility, weather, or behavioural data. The map view enables comparison across larger geographical areas and additionally reports the cases per 100’000 inhabitants per 14 days. We additionally show the Oxford Stringency Index and vaccination coverage for context (Hale et al., 2021; Roser et al., 2020).

The effect of lockdowns in spring 2020 on the estimated $R_{e}$ of SARS-CoV-2

We assessed the association between non-pharmaceutical interventions (NPIs) and the estimated effective reproductive number $R_{e}$ during the early stage of the COVID-19 pandemic. We selected 20 European countries for which the reported data was free of major gaps or spikes, and for which we could estimate $R_{e}$ prior to the nationwide implementation of a lockdown in spring 2020. The dates of interventions were extracted from news reports (sources listed in Appendix 6—table 2), and ‘lockdown’ taken to refer to stay-at-home orders of differing intensity. Of the countries investigated, all except Sweden implemented a lockdown (19/20). Using case data, we inferred that $R_{e}$ was significantly above one prior to the lockdown measures in nearly all countries with a lockdown (15/19; Table 1). Denmark, which had a complex outbreak consisting of two initial waves, and Germany, which experienced a cluster of early cases, had an estimated $R_{e}$ significantly below one prior to this date. For countries with very short delays between the lockdown and the estimated date that $R_{e} < 1$ (e.g. Austria, Switzerland) we can not exclude the possibility that the ‘true’ $R_{e}$ may have been below 1 prior to the lockdown since our pipeline introduces smoothing to the estimates (see Appendix 2). The results are remarkably consistent across the different observation types (Appendix 6—table 1). However, the 95% confidence intervals tend to be wider for the estimates based on death incidence data because the number of deaths is much smaller than the number of cases, and the relative noise in observations tends to be higher.

Table 1

Investigating the relation between the date of ‘lockdown’ and the date when the estimated $R_{e}$ based on case reports dropped below 1.

Based on news reports, we report when a country implemented stay-at-home orders (a ‘lockdown’). The column ‘ ${\hat{R}}_{e} < 1$ ’ indicates when the $R_{e}$ point estimate first dropped below 1. The column ‘CI includes 1’ details the corresponding time interval where the 95% confidence interval included 1. Of the investigated countries that implemented a nationwide lockdown, four (Denmark, Germany, the Netherlands, Slovenia) had 95% confidence intervals that included 1 or were below before a nationwide lockdown was implemented. The column ‘Time until ${\hat{R}}_{e} < 1$ ’ indicates the number of days between the lockdown and the date that the $R_{e}$ point estimate dropped below 1.

Country	Lockdown	Re <1	CI includes 1	Time until Re <1
Austria	16–30	20–30	[20-03, 20-03]	4 days
Belgium	18–30	30–30	[25-03, 03-04]	12 days
Denmark	18–30	≤10–03	[≤10–03, 20–06]	–8 days
Finland	16–30	01-Feb	[29-03, 30-04]	17 days
France	17–30	27–30	[23-03, 07-04]	10 days
Germany	22–30	18–30	[17-03, 19-03]	–4 days
Ireland	27–30	01–100	[04–04, 15–04]	12 days
Italy	01–300	18–30	[17-03, 19-03]	8 days
Netherlands	23–30	01-Jan	[22-03, 10-04]	13 days
Norway	14–30	21–30	[17-03, 19-03]	7 days
Poland	25–30	01-Jan	[31-03, 17-04]	8 days
Portugal	16–30	28–30	[23-03, 15-04]	12 days
Romania	24–30	01-Jun	[31-03, 29-04]	13 days
Russian Federation	30–30	01-Apr	[01–05, 08–05]	35 days
Slovenia	20–30	23–30	[≤13–03, 26–03]	3 days
Spain	14–30	26–30	[25-03, 26-03]	12 days
Sweden		01-Jan	[06–03,≥03-05-2021]
Switzerland	17–30	22–30	[20-03, 22-03]	5 days
Turkey	21–30	01-Aug	[01–04, 13–04]	18 days
United Kingdom	24–30	30–30	[28-03, 20-04]	6 days

To consider the association between NPIs and the estimated $R_{e}$ for countries outside of Europe, we used the stringency index (SI) of the Blavatnik School of Government (Hale et al., 2020) to describe the public health response in different countries (Figure 4C). This is a compound measure describing e.g. whether a state has closed borders, schools, or workplaces. For example, a country with widespread information campaigns, partially closed borders, closed schools, and a ban on public events and gatherings with more than 10 people would have an SI slightly above 50. As reference date, we used the date when a country first exceeded a stringency index of 50 ( $t_{S I 50}$ ). Then, we investigated whether the estimated $R_{e}$ was significantly above 1 prior to the reference date (i.e. the lower bound of the 95% confidence interval was above 1). We excluded countries without $R_{e}$ estimates before the reference date $t_{S I 50}$ . Worldwide, for 35 out of the 42 countries which fulfilled the criteria for inclusion (list in Appendix 6), $R_{e}$ was significantly above 1 prior to $t_{S I 50}$ . As a sensitivity analysis, we performed the same calculation with a different reference date $t_{m a x}$ , defined as the date with the biggest increase in SI in the preceding 7 days. The results were very similar, with 38/45 countries significantly above one before $t_{m a x}$ (Appendix 6).

Insights into continent-specific impacts of NPIs

To further investigate the association between non-pharmaceutical interventions and changes in the estimated reproductive number of SARS-CoV-2, we extended our analysis beyond the first 2020 epidemic wave. We included both the implementation and lifting of NPIs until May 3rd 2021 (increases and decreases in stringency). For each week and each country, we calculated the change in stringency index over the preceding week ( $Δ S I_{t} = S I (t) - S I (t - 7)$ ) and the corresponding change in the estimated $R_{e}$ (during the same week $Δ {\hat{R}}_{e, t} = {\hat{R}}_{e} (t) - {\hat{R}}_{e} (t - 7)$ , 1 week later $Δ {\hat{R}}_{e, t + 7} = {\hat{R}}_{e} (t + 7) - {\hat{R}}_{e} (t)$ , or 2 weeks later $Δ {\hat{R}}_{e, t + 14} = {\hat{R}}_{e} (t + 14) - {\hat{R}}_{e} (t)$ ). If NPIs were effective, we expect increases in stringency to be associated with decreases in the estimated $R_{e}$ and vice versa. We do find such an association for increases in stringency e.g. in Europe. In Europe we also see that large increases in stringency are associated with larger decreases in $R_{e}$ 7–14 days after the change in SI. However, the association between decreases in stringency and changes in $R_{e}$ is heterogeneous on all continents (Figure 5A). This suggests that reversing non-pharmaceutical interventions had a different effect than introducing them.

Figure 5

Download asset Open asset

The association between the implementation or lifting of non-pharmaceutical interventions and changes in $R_{e}$ until May 2021.

(A) The change in the estimated $R_{e}$ at the same time as ( $R (t)$ ) or following ( $R (t + 7)$ and $R (t + 14)$ ) the implementation (above x-axis) or lifting (below x-axis) of NPIs in a given week. (B) The change in the estimated $R_{e}$ related to the change in mobility in the same week. The error bars indicate the Q1 and Q3 quartiles.

We repeated the same analysis for Europe, comparing against various measures of Google mobility data (Figure 5B). Increased mobility in residential areas, and decreased mobility at workplaces or grocery stores is associated with decreases in $R_{e}$ .

Discussion

We have developed a pipeline to estimate the effective reproductive number $R_{e}$ of SARS-CoV-2 for both timely monitoring and retrospective investigation. We evaluated our estimates on simulated data. We showed that the inferred $R_{e}$ curve can be over-smoothed on simulated data, but that this disadvantage is outweighed by the increased stability of the estimates. Overall, we show that the relative error in the $R_{e}$ estimates is small.

During the ongoing SARS-CoV-2 pandemic, $R_{e}$ estimates are of interest to health authorities, politicians, decision makers, the media and the general public. Because of this broad interest and the importance of $R_{e}$ estimates, it is crucial to communicate both the results as well as the associated uncertainty and caveats in an open, transparent and accessible way. This is why we display daily updated results on an online dashboard, accessible at https://ibz-shiny.ethz.ch/covid-19-re-international/. The dashboard shows $R_{e}$ estimates in the form of time series for each included country or region, and a global map containing the latest $R_{e}$ estimates and normalised incidence. For all countries, we further display a timeline of the stringency index of the Blavatnik School of Government (Hale et al., 2021), and current vaccination coverage.

A unique advantage of the monitoring method we have developed is the parallel use of different types of observation data, all reflecting the same underlying infection process (Scire et al., 2020). Wherever we have data of sufficient quality, we estimate $R_{e}$ separately based on confirmed cases, hospitalisations and death reports. The advantages and disadvantages of the different observation types are discussed in the Appendix 1. Comparing estimates from several types of data is a powerful way to evaluate the sensitivity of the results to the type of observations they were derived from. More generally, the method is applicable to any other type of incidence data, such as admissions to intensive care units or excess death data. We have also extended this method to make use of daily measurements of SARS-CoV-2 viral concentrations in wastewater (Huisman et al., 2022). The potential limitations of our $R_{e}$ estimation method are discussed in detail in the Appendix 1.

Any decision to implement, remove or otherwise adjust measures aimed at infection control will be informed by epidemiological, social and economic factors (Sebhatu et al., 2020). We can aid this decision making process by investigating the association between adjustments of public-health measures and the estimated $R_{e}$ . In particular, the merits of nation-wide lockdowns in the context of the COVID-19 pandemic have been heavily discussed, both in the scientific literature and the public sphere (Flaxman et al., 2020; Haug et al., 2020; Banholzer et al., 2021; Soltesz et al., 2020; Karberg, 2020). Analyses showing that $R_{e}$ estimates had dropped below 1 before the strictest measures were enforced were frequently used to claim that a lockdown was not necessary (Karberg, 2020). We showed that this argumentation cannot be applied universally: for 15 out of 20 European countries, we found that the estimated $R_{e}$ was significantly above 1 prior to the lockdown in spring of 2020. Interestingly, the result we obtain for Germany critically depends on whether we use symptom onset data, or more widely available case reports.

Extending our analysis beyond the first wave, we find differences between continents in the association between changes in the stringency of NPIs and changes in $R_{e}$ . This could reflect differences in the speed with which lockdowns were put into practice (Kohanovski et al., 2022), the de facto lockdown stringency, or socio-cultural aspects (Sebhatu et al., 2020; Mbow et al., 2020). It is often argued that, especially in countries with a large informal business sector, there may be a difference between the official containment measures and those adhered to or implemented de facto (Mbow et al., 2020). However, for continents where we find no significant correlation, this could also be because a large fraction of NPIs were implemented at a time for which we could not estimate changes in $R_{e}$ . Many African countries had early and strict government responses, often prior to the first detected cases. These are thought to have delayed the virus in establishing a foothold on the continent (Mbow et al., 2020).

Importantly, our analysis suggests that reversing non-pharmaceutical interventions may have a very different effect than introducing them. This could be because the situation is not fully reverted: due to increased public awareness, testing, contact tracing, and quarantine measures still in place. In addition, the epidemic situation - in terms of number of infected individuals - is likely different when measures are implemented or lifted.

Our analysis could be confounded by economic, social, and psychological factors motivating the implementation or release of measures. With the current stringency measures we cannot account for diversity in adherence to NPIs across geographic regions and through time. Cultural norms, defiance towards public authorities, "lockdown fatigue", and economic pressures are all among the factors that may determine whether NPIs are in fact adhered to. In addition, there is increasing evidence that weather may be a factor influencing $R_{e}$ through its effect on people’s behaviour and on properties of the virus (Morris et al., 2021). In the future, our tools to estimate $R_{e}$ could be used to explore associations of these many factors with $R_{e}$ estimates, with the aim of identifying minimal sets of factors that may ensure an $R_{e} < 1$ for a particular location.

Materials and methods

Overview of the software pipeline

Request a detailed protocol

The software pipeline we developed allows the estimation of $R_{e}$ from different proxies for the infection incidence, such as the time series of confirmed cases, hospitalisations or deaths. It provides a separate estimate of the $R_{e}$ trajectory through time for each proxy. In a first step, we smooth the case observations and deconvolve the smoothed observations by the type-specific delay distribution to obtain an estimate of the infection incidence time series. Second, we use the package EpiEstim to estimate the effective reproductive number Re from this infection incidence. We assess the uncertainty in the estimates using the union of a block bootstrap method, designed to account for variation in the case observations, and the credible intervals from EpiEstim.

Smoothing the case observations

Request a detailed protocol

To reduce the influence of weekly patterns in case reporting data, as well as reporting irregularities, we smooth the observed incidence data prior to deconvolution. To smooth the incidence data, we use local polynomial regression (LOESS) with 1st order polynomials and tricubic weights. The smoothing parameter alpha is set such that we include 21 days of data in the local neighbourhood of each point. After smoothing, we normalise to the original total number of cases. Here we use smoothing parameter 21 because it performs best overall in our simulations. We investigated the effect of this tuning parameter in simulations, see Appendix 5—figure 5.

Estimating the infection incidence through deconvolution

Request a detailed protocol

To recover the non-observed time series of infection incidence, we deconvolve the smoothed observed time series of COVID-19 case incidence with a delay distribution specific to the type of case detection (case confirmation, hospital admission, death). To this end, we extended the deconvolution method of Goldstein et al., 2009, which is itself an adaptation of the Richardson-Lucy algorithm (Richardson, 1972; Lucy, 1974) (essentially an expectation maximisation algorithm), to deal with zero-incidence case observations and time-varying delay distributions.

Formally, the method infers a deconvolved output time series $(λ_{1}, \dots, λ_{N})$ from an input time series $({\bar{D}}_{K}, \dots, {\bar{D}}_{N})$ , where $K \geq 1$ and ${\bar{D}}_{i}$ indicates the smoothed number of observations on day $i$ (e.g. confirmed cases, hospitalisations, or deaths). Let $m_{l}^{j}$ be the probability that an infection on day $j$ takes $l \geq 0$ days to be observed. If no line list data is available, $m_{l}^{j} = m_{l}$ and no time-variation of the delay distribution is assumed. Let q_j be the probability that an infection that occurred on day $j$ is observed during the time-window of observations, i.e. is counted towards $({\bar{D}}_{K}, \dots, {\bar{D}}_{N})$ . Then:

\begin{matrix} q_{j} & = \sum_{l = K - j}^{N - j} m_{l}^{j} . \end{matrix}

Let $E_{i}$ be the expected number of observed cases on day $i$ , for a given infection incidence $(λ_{k})$ :

\begin{aligned} E_{i} & = {\begin{cases} \sum_{j = 1}^{i} λ_{j} m_{i - j}^{j} & for K \geq i \geq N \\ 0 & for 0 < i < K . \end{cases} \end{aligned}

The deconvolution algorithm uses expectation maximisation (Dempster et al., 1977) to find a final infection incidence estimate, which has the highest likelihood of explaining the observed input time series. To do so, it starts from an initial guess of the infection incidence time series $Λ^{0} = (λ_{1}^{0}, \dots, λ_{N}^{0})$ , used to compute $E_{i}^{0}$ according to equation 2, and updates the estimate in each iteration $n$ according to the following formula:

λ_{j}^{n + 1} = \frac{λ_{j}^{n}}{q_{j}} \cdot \sum_{i = K}^{N} \frac{m_{i - j}^{j} {\bar{D}}_{i}}{E_{i}^{n}} .

The iteration proceeds until a termination criterion is reached. Here, we follow Goldstein et al. and iterate until the $χ^{2}$ statistic drops below 1 (Goldstein et al., 2009):

χ^{2} = \frac{1}{N - K + 1} \sum_{i = K}^{N} \frac{{(E_{i}^{n} - {\bar{D}}_{i})}^{2}}{E_{i}^{n}},

or 100 iterations have been reached.

Convergence is typically fast and the stopping criterion based on the $χ^{2}$ statistic is reached in a few iterations. Due to the smoothing prior to deconvolution, this is the case for the vast majority of the empirical data we analyzed. In some cases, e.g. when the observed incidence is especially noisy, convergence is slower and the threshold of 100 iterations is reached (on 26.5.2022, this was the case for 13 of the countries analyzed).

For the initial estimate of the incidence time series $Λ^{0}$ , we shift the observation time series backwards in time by the mode of the delay distribution μ Goldstein et al., 2009. However, this leaves a gap of unspecified values at the start and end of the time series $Λ_{0}$ . Contrary to Goldstein et al., we augment the shifted time series with the first observed value ( ${\bar{D}}_{K}$ ) on the left, and with the last observed value ( ${\bar{D}}_{N}$ ) on the right, to avoid initialising with a zero-value anywhere. If a day is initialised with zero incidence, it will also have zero incidence in the final estimate (compare equation (3)), which would be a potential source of bias.

We note that the Richardson-Lucy deconvolution algorithm accounts for ‘right truncation’, i.e. that not all infections are observed within the given observation time window (due to delay until symptoms/reporting), through the q_j indices.

Use of line list data

Request a detailed protocol

When information on the time variation of delays between symptom onset and observation is available (e.g. through a line list), this can be taken into account directly during the deconvolution step. In this case, we perform the deconvolution in two separate steps: first with the time-varying empirical onset-to-observation distributions, and then with the constant-through-time incubation period distribution. For those cases where symptom onset data is available, we only deconvolve with the incubation period distribution.

The $(m_{0}^{j}, \dots, m_{l_{m a x}}^{j})$ time-varying delay distributions from onset of symptoms to observation are determined as follows: for each date $j$ , at least 300 of the most recent recorded delays between symptom onset and observation, with onset date before $j$ , are taken into account; $l_{m a x}$ being the highest observed delay. To avoid biases caused by the intensity of testing and reporting varying throughout the week, recorded delays are included in full weeks going in the past, until at least 00 delays are included.

As the incidence data is right-truncated, we have to fix the distribution for the reporting delay ( $m_{l}^{j}$ ) after a certain day $j$ , so that delay distributions are not downward biased for infection dates close to the present. Let $({\bar{m}}_{0}, \dots, {\bar{m}}_{l_{m a x}})$ be the empirical probability density function of the delay (aggregated over the entire window of observations) and $n$ the 99th percentile of this distribution ( $n$ is the smallest integer for which $\sum_{i = 1}^{n} {\bar{m}}_{i} \geq 0.99$ ). For infection dates $z$ that are closer to the present than $n$ (i.e. $N - z < n$ , where $N$ is the index of the last available data point), we fix $(m_{0}^{z}, \dots, m_{l_{m a x}}^{z})$ to be equal to $(m_{0}^{N - n}, \dots, m_{l_{m a x}}^{N - n})$ .

Estimating the effective reproductive number Re

Request a detailed protocol

Once we have obtained an estimate for the time series of infection incidence, we use the method developed by Cori et al., 2013, implemented in the EpiEstim R package, to estimate $R_{e}$ .

Disease transmission is modelled with a Poisson process. At time $t$ , an individual infected at time $t - s$ causes new infections at a rate $R_{e} (t) \cdot w_{s}$ , where w_s is the value of the infectivity profile $s$ days after infection. The infectivity profile sums to 1, and can be approximated by the (discretised) serial interval distribution (Cori et al., 2013). The likelihood of the incidence $I_{t}$ at time $t$ is thus given by:

\begin{matrix} P (I_{t} | I_{0}, \dots, I_{t - 1}, R_{e} (t)) & = \frac{{(R_{e} (t) Λ_{t})}^{I_{t}} e^{- R_{e} (t) Λ_{t}}}{I_{t}!}, \end{matrix}

\begin{matrix} where Λ_{t} & = \sum_{s = 1}^{t} I_{t - s} w_{s} . \end{matrix}

The $R_{e}$ inference is performed in a Bayesian framework, and an analytical solution can be derived for the posterior distribution of $R_{e} (t)$ (see Cori et al., 2013; Web Appendix 1). We choose a gamma distributed prior on $R_{e} (t)$ with mean 1, and standard deviation 5.

For the gradually-changing $R_{e}$ estimates, we assume $R_{e}$ is constant over a sliding window of 3 days ( $τ = 3$ in EpiEstim), i.e. the reported $R_{e}$ estimate for day $T$ summarises the average $R_{e}$ over a 3 day period ending on day $T$ . In addition to these smooth estimates, we provide step-wise estimates of $R_{e}$ on our dashboard. In the step-wise analysis, $R_{e}$ is assumed to be constant on a number of intervals spanning the entire epidemic time window. These intervals are determined by dates at which public health interventions were implemented, altered, or lifted. All results reported here are based on the smooth $R_{e}$ estimates. In both cases, we use the mean of the posterior distribution of $R_{e}$ as the point estimate.

Estimating the uncertainty intervals

Request a detailed protocol

To account for the uncertainty in the case observations, we construct 95% bootstrap confidence intervals for $R_{e}$ . We first re-sample case observations as follows: given the original case observations $D_{t}, t = K, \dots, N$ , we apply the LOESS with smoothing parameter 21 days on the log-transformed data $l o g (D_{t} + 1)$ to obtain the smoothed value ${\hat{μ}}_{t}$ and additive residuals e_t. Here we use log-transformation to stabilise the variance of the residuals because it is the best overall choice among the transformations we tried. We compare to the commonly used square root transformation in Appendix 5—figure 1.

After obtaining the residuals e_t, we resample them to get bootstrap residuals $e_{t}^{*}$ and obtain the bootstrap case observations by

\begin{matrix} D_{t}^{*} = \max (\exp ({\hat{μ}}_{t} + e_{t}^{*}) - 1, 0) . \end{matrix}

We now discuss how we obtain a set of bootstrapped residuals $e_{t}^{*}$ , $t = K, \dots, N$ . Since the empirical residuals of most countries are autocorrelated (see Appendix 5—figure 2), we use an overlapping block bootstrap. Specifically, given the original residuals $(e_{K}, \dots, e_{N})$ , we start by taking a random block of $b$ consecutive residuals, which we denote by $(e_{1}^{* 1}, \dots, e_{b}^{* 1})$ . To account for weekly patterns in the case observations, we need to match the day of the week to the original case observations. That is, we keep the longest subvector $(e_{m_{1}}^{* 1}, \dots, e_{b}^{* 1})$ of $(e_{1}^{* 1}, \dots, e_{b}^{* 1})$ such that $e_{m_{1}}^{* 1}$ has the same day of the week as $e_{K}$ (e.g., both correspond to Fridays). We then randomly take a new block of $b$ consecutive residuals, which we denote by $(e_{1}^{* 2}, \dots, e_{b}^{* 2})$ . We keep its longest part $(e_{m_{2}}^{* 2}, \dots, e_{b}^{* 2})$ such that $e_{m_{2}}^{* 2}$ has the day of the week that follows on that of $e_{b}^{* 1}$ (e.g. if $e_{b}^{* 1}$ corresponds to a Tuesday, then $e_{m_{2}}^{* 2}$ must correspond to a Wednesday). We then glue these two sampled blocks together to get $(e_{m_{1}}^{* 1}, \dots, e_{b}^{* 1}, e_{m_{2}}^{* 2}, \dots, e_{b}^{* 2})$ . We repeat this process of adding blocks until the length of the re-sampled residuals is at least as large as that of the original residuals. If it is longer, we simply cut off the last part of the re-sampled residuals so that its length is the same. Finally, we re-index the re-sampled residuals as $(e_{K}^{*}, \dots, e_{N}^{*})$ . We present a concrete example in Appendix 2.

Choosing an optimal block size $b$ for the block bootstrap method is generally difficult. To capture week effects, we need a block size of at least 7. We tried different sizes and found that $b = 10$ tended to work well in various simulation settings (Appendix 5—figure 6).

Given a set of bootstrap case observations $(D_{K}^{*}, \dots, D_{N}^{*})$ , we apply our method to obtain an estimate for $R_{e} (t)$ . For ease of notation, we now denote this by ${\hat{θ}}^{*} (t)$ . By repeating the above steps 100 times, we obtain ${\hat{θ}}_{1}^{*} (t), \dots, {\hat{θ}}_{100}^{*} (t)$ . Then, we construct a Normal based bootstrap confidence interval for each time point $t$ by:

\begin{matrix} [\hat{θ} (t) - q_{z} (1 - \frac{α}{2}) \hat{s d} ({\hat{θ}}^{*} (t)), \hat{θ} (t) + q_{z} (1 - \frac{α}{2}) \hat{s d} ({\hat{θ}}^{*} (t))], \end{matrix}

where $\hat{θ} (t)$ denotes the estimated $R_{e} (t)$ based on the original case observations, $q_{z} (1 - \frac{α}{2})$ denotes the $1 - \frac{α}{2}$ quantile of the standard normal distribution, and $\hat{s d} ({\hat{θ}}^{*})$ denotes the empirical standard deviation of ${\hat{θ}}_{1}^{*} (t), \dots, {\hat{θ}}_{100}^{*} (t)$ . In this paper, we aim at a confidence interval level of 95%, so $α = 0.05$ . We use the Normal based bootstrap interval because we found that it performed best overall with respect to coverage in our simulations, when compared to other common choices like quantile and reversed-quantile bootstrap confidence intervals (Appendix 3—figure 3).

The above bootstrap method implicitly assumes that the variance of the residuals e_t is constant over time $t$ and does not depend on the value of the log-transformed data $\log (D_{t} + 1)$ . This assumption roughly holds when the case incidence is high. During periods of low case incidence (e.g. deaths or regional data in summer 2020 in Switzerland), this assumption is no longer appropriate. Therefore, to be conservative and rather err on the side of too wide uncertainty intervals, we also consider the credible interval of $R_{e}$ which is obtained by taking the 0.025 and 0.975 quantiles from the posterior distribution of $R_{e} (t)$ using EpiEstim based on the original data $(D_{K}, \dots, D_{N})$ . The final reported interval is then the union of the credible interval and the 95% bootstrap confidence interval. Based on our experience, the credible interval is typically wider during periods of very low case incidence and will then be reported. But at high case numbers, the bootstrap confidence interval will tend to be much wider than the credible interval and will be the reported one.

Finally, we point out that the choices of transformation, block size, smoothing parameters and type of bootstrap confidence interval in this paper might not be universal. The best choice can be different for different data sets (e.g., data sets from different countries).

Data

We gather case incidence data directly from public health authorities. Whenever accessible, we rely on data from local authorities. Otherwise, we use data from ‘Our World in Data’ since the European Centre for Disease Control (ECDC) has stopped its daily updates (December 2020) (Roser et al., 2020; European Centre for Disease Prevention and Control (ECDC), 2022). A table summarising the incidence data sources is available in Supplementary file 1. Information on the start and end of interventions, or major changes in testing policy, are obtained from media reports and the websites of public health authorities. The stringency index of the Blavatnik School of Government is accessed from their publicly available github repository (Hale et al., 2020). The vaccination coverage is taken from ‘Our World in Data’ (Roser et al., 2020).

We parametrise the discretised infectivity profile w_s using COVID-19 serial interval estimates from the literature (Nishiura et al., 2020). For a review of published serial interval estimates, see Griffin et al. (Griffin et al., 2020). The incubation period is parametrised by a gamma distribution with mean 5.3 days and SD 3.2 days (Linton et al., 2020). For countries for which we do not have access to line list data, i.e. all except Switzerland, Germany and Hong Kong at the time of writing, we assume delays from symptom onset to observation to be gamma-distributed, with parameters taken from the literature. Table 2 summarises the distributions used in our pipeline.

Table 2

Gamma distributions used in the pipeline: serial interval, incubation period, and the delay distributions assumed for each observation type.

Distribution	Mean (days)	SD (days)	Reference
Serial interval	4.8	2.3	Nishiura et al., 2020
Infection to onset of symptoms	5.3	3.2	Linton et al., 2020
Onset of symptoms to case confirmation	5.5	3.8	Bi et al., 2020
Onset of symptoms to hospital admission	5.1	4.2	Pellis et al., 2021
Onset of symptoms to death	15.0	6.9	Linton et al., 2020

For Switzerland, Germany and Hong Kong, we use line lists to build time-varying empirical distributions on delays between symptom onset and case confirmation, hospitalisation or death. During the deconvolution step we use the empirical delay distribution of the last 300 recorded cases prior to the infection date. Moreover, for the fraction of cases for which the date of onset of symptoms is known, we use the onset date directly instead of deconvolving a delay from onset to reporting, allowing for more precise estimation of the infection date. For Switzerland, line lists contain information on which cases were infected abroad. By considering imported cases and locally-transmitted cases separately in the deconvolution step, we obtain two separate time series, one for local infections and one for imported infections. EpiEstim can then estimate a corrected $R_{e}$ that excludes infections incurred abroad from the local transmission dynamics.

For comparison between methods, epiforecasts.io $R_{e}$ estimates were collected from https://github.com/epiforecasts/covid-rt-estimates/blob/master/national/cases/summary/rt.csv, accessing file versions from December 5 2020, March 15, June 24, October 1, November 10 2021 and January 10 2022. Robert Koch Institute estimates were collected from https://raw.githubusercontent.com/robert-koch-institut/SARS-CoV-2-Nowcasting_und_-R-Schaetzung/main/Nowcast_R_aktuell.csv, last accessed on January 10 2022. AGES estimates were collected from https://www.ages.at/fileadmin/AGES2015/Wissen-Aktuell/COVID19/R_eff.csv, last accessed on January 10 2022.

Simulations

Request a detailed protocol

In the simulations, we start by specifying different $R_{e}$ trajectories. To assess a range of scenarios, we parametrise $R_{e}$ as a piecewise linear trajectory, where we fix its plateau values and the time-points at which its slope changes. From each $R_{e}$ trajectory, we stochastically simulate 100 time series of infections and their corresponding case observations. We then use our pipeline to estimate $R_{e} (t)$ from each of these 100 times series and compare it to the true underlying value of $R_{e} (t)$ .

Assuming I₀ infected individuals on the first day, the infection incidence is simulated forward in time. The infection incidence on day $t$ is drawn from a Poisson distribution, corresponding to equation (6), using the specified $R_{e}$ time series and the discretised serial interval for SARS-CoV-2 (Nishiura et al., 2020) as the infectivity profile (see Cori et al., 2013; Web Appendix 11). These simulated infections are convolved with the observation type-specific delay distribution (Linton et al., 2020) to obtain the raw observation time series ${\tilde{D}}_{t}$ .

Since the raw observation time series ${\tilde{D}}_{t}$ are too smooth compared to the real data (Appendix 5—figure 4), we add noise to obtain our final simulated observation time series $D_{t}$ . The additional noise accounts for aspects of the observation process that are not covered by the delay distribution, such as weekend and holiday effects, the random and occasional delay in the recording of confirmed cases, and irregular components such as confirmed cases that are imported from abroad.

To obtain a realistic noise model for $D_{t}$ , we considered the empirical noise observed in real SARS-CoV-2 case data. For most countries the residuals are autocorrelated (Appendix 5—figure 2), which led us to fit ARIMA models to the observed residuals. We considered five simulation settings with different noise models obtained based on the confirmed case data from five countries (Switzerland, China, France, New Zealand, United States of America). Specifically, we first apply the LOESS smoother with smoothing parameter 21 days on the log-transformed confirmed case data to obtain additive residuals. We then chose the ARIMA model by fitting ARIMA models of various orders and assessing the resulting ACF and PACF plots of their residuals. This leads to five ARIMA models: ARIMA(2,0,1)(0,1,1), ARIMA(1,0,1)(0,0,0), ARIMA(0,0,6)(0,1,1), ARIMA(4,0,1)(1,0,0), and ARIMA(4,0,0)(0,0,0), based on the data from CHE, CHN, FRA, NZL, and USA, respectively. The final observation time series $D_{t} = {\tilde{D}}_{t} \cdot \exp (e_{t})$ , where e_t is simulated from the fitted ARIMA model. We present the simulated observations with the noise model based on CHE data in Appendix 5—figure 4. We emphasize that the ARIMA model is only used in simulations to obtain simulated observations that look roughly realistic. Our main approach to obtain the estimated $R_{e}$ and the related confidence intervals does not require fitting an ARIMA model. In particular, the block bootstrap method is fully non-parametric.

In the case of time-varying delay distributions, we assume that the mean of the delay distribution decreases by a fixed amount (1/20) each day, to a minimum of 2 days (e.g. for the confirmed cases this results in a range from 5.5 to 2). When estimating with a time-varying delay distribution, we draw observations from the true distributions, similar to line list information recorded by public health authorities. To assess the added value of the deconvolution method, we further compare against a method where we estimate the infection time series by shifting the observations back by the mean of the delay distribution (termed ‘fixed shift method’).

To quantify the performance of our method on the simulated scenarios, we compute the root mean squared error (RMSE) at time point $j$ :

\begin{matrix} R M S E (j) & = \sqrt{\frac{1}{M} \sum_{m = 1}^{M} {({\hat{R}}_{e} (j, m) - R_{e} (j))}^{2}}, \end{matrix}

where $M$ is the total number of simulations, ${\hat{R}}_{e} (j, m)$ the estimated $R_{e}$ and $R_{e} (j)$ the true $R_{e}$ at time $j$ , for simulation $m$ .

For each simulation we also compute the 95% confidence interval (CI) of our estimates across 100 bootstrap replicates. The empirical coverage indicates the fraction of simulations for which our CI includes the true $R_{e}$ value.

Implementation and method availability

Request a detailed protocol

Daily updated results of our method on global COVID-19 data are available online on https://ibz-shiny.ethz.ch/covid-19-re-international/.

The source code of the software pipeline is openly accessible at https://github.com/covid-19-Re/shiny-dailyRe; Angst, 2022, and the code necessary to reproduce the figures and analyses presented in this paper is available at https://github.com/covid-19-Re/paper-code; Huisman, 2022.

Appendix 1

Observation types and the influence of testing

Here, we briefly discuss the benefits and potential biases of the three types of observations we used. The most commonly used proxy for infection incidence is the incidence of confirmed cases. It is the least indirect way of observing infection events. However, it generally assumes that (i) the proportion of infected individuals that is tested, and (ii) the distribution of the delay between infection and testing are constant through time. Unfortunately, these assumptions do not generally hold.

As long as the sampling proportion is constant throughout the considered time period, the $R_{e}$ estimates of EpiEstim are not affected by under-sampling (Cori et al., 2013). During the COVID-19 epidemic, many countries initially restricted testing to only severe cases, before switching to a more extensive testing effort after curbing the first epidemic wave and ramping up testing capacity (Roser et al., 2020). Changes in testing strategy as well as bottlenecks in testing capacity can result in a varying fraction of infected individuals that are confirmed positive, and variation in the delay between infection and test confirmation. These variations can bias $R_{e}$ estimates, as the increase or decrease in case numbers between consecutive time points will be attributed to a change in infection incidence, rather than a change in testing.

However, it is important to note that the ‘memory’ inherent in the $R_{e}$ estimate is dictated by the infectivity profile w_s. An event at time $t$ which changes the proportion of true infection incidence observed per day, e.g. a change in testing policy, will bias the $R_{e}$ estimate for a number of days given by w_s (compare Materials and Methods, equation 6). For SARS-CoV-2 the time needed to reach the 95% quantile of w_s is 9days. We do not observe the infection incidence directly, but if the deconvolution is assumed to be perfect, the intuition for the number of days of biased $R_{e}$ estimates still holds.

It is further possible to investigate the influence of testing intensity, by applying the $R_{e}$ estimation method separately to a case incidence time series which is adjusted for the intensity of the testing effort. We have added this analysis to our online dashboard (where we show the number of confirmed cases / number of tests, normalised by the mean number of tests). However, one should note that such a normalisation does not take into account that the probability of test positivity might also change with the number of tests (e.g. by prioritising likely cases at low numbers of tests).

In contrast, the incidence of hospital admittance and deaths are likely based primarily on the severity of the symptoms, and mostly unaffected by changes in testing strategies, or the magnitude of the epidemic. This makes them valuable complementary observations of infection events (Goldstein et al., 2009). However, also here biases can occur. First, only a small fraction of all infections results in hospitalisation or death (a meta-analysis found an average infection fatality ratio for SARS-CoV-2 of 0.68% Meyerowitz-Katz and Merone, 2020). This fraction varies with the risk group of the infected population (Meyerowitz-Katz and Merone, 2020; Hauser et al., 2020; Esteve et al., 2020; Yang et al., 2020), introducing potential biases in $R_{e}$ estimations when outbreaks occur in particularly age-stratified settings. Also, new variants may result in different hospitalisation or fatality rates. Second, if a country’s health infrastructure becomes overburdened and hospitals are forced to triage or delay admission, we expect the fraction of hospital admissions to decrease, and deaths to increase. Third, the likelihood to die from an infection may change through time as new treatment strategies are developed or if hospitals are overburdened. Additionally, guidelines used to record COVID-19 as the cause of death have changed through time for some countries (Minder, 2020). Lastly, the delay between infection and hospitalisation or death is expected to be longer than the delay until case confirmation, with the result that these $R_{e}$ estimates are less timely. One should note that these observation type specific biases could also be seen as a source of information. The types simply describe a different epidemic if very structured populations with highly different mortality rates are captured (e.g. in elderly homes).

It is important to note that all analyses here are focused on the period before vaccination mediated immunity became widespread. Since vaccinations change the fraction of infections that eventually become hospitalised or die, they may introduce temporary biases for the $R_{e}$ estimated from hospitalisation and death incidence. We have added the metric of vaccination coverage to the online dashboard, so one can estimate when these effects start to become important.

Method Limitations

The $R_{e}$ estimation method we present in the main text relies on several assumptions. Here we highlight the limitations that occur when these assumptions are violated.

First, the geographical scale of the $R_{e}$ estimates is determined by the incidence data itself. The $R_{e}$ value calculated for a country represents an average, summarised across multiple local epidemics unfolding in different regions. $R_{e}$ values need not be identical in different local epidemics across a country or administrative region. In particular, in times of very low pathogen transmission, single super-spreading events can significantly increase the estimated $R_{e}$ of the entire country (Lloyd-Smith et al., 2005).

Second, in our deconvolution step we account for an incubation period and a delay from symptom onset to case observation. Implicitly, we thus assume that all reported cases come from symptomatic individuals. This is certainly true for hospitalised and deceased patients, but does not have to hold for all confirmed cases. Similar to the testing intensity (discussed in Section B.1), this would not bias our estimates as long as the fraction of asymptomatic or presymptomatic individuals is constant through time. However, the fraction of asymptomatic individuals could vary with the population structure and age-stratification. The fraction of tested presymptomatic individuals could vary with the testing strategy and the intensity of the testing effort.

Third, in our current analysis we assume a single serial interval distribution for all geographic locations and all times. However, behaviour, population contact structure, and cultural differences in dealing with infection symptoms, will cause geographic and temporal variations in the serial interval. In particular, the implementation of non-pharmaceutical interventions can significantly shorten the serial interval (Ali et al., 2020). Misspecification of the serial interval will lead to larger errors in $R_{e}$ estimates further away from one (Gostic et al., 2020).

Lastly, our estimates of the effective reproductive number $R_{e}$ are subject to changes in data reporting. There are frequent changes in the way in which public health offices update their observed incidence data: the number of variables shared (e.g. Brasil, the UK excluded testing information), their frequency (e.g. Swiss cantons moved to weekly data updates when daily numbers became low), the amount of data consolidation (i.e. to which extent values reported for a given day change in subsequent days), and what constitutes a COVID-19 case (Minder, 2020; Tokyay, 2020). These variables have all changed during the epidemic, frequently in response to political pressure or the magnitude of the local epidemic and the resulting workload at the public health offices (Minder, 2020). This affects the timeliness of our estimates, and can cause the estimated $R_{e}$ to change a bit between days.

Appendix 2

Discretisation of delay distributions

When approximating delay distributions by gamma distributions, we discretise these in the following fashion:

m_{l} = {\begin{matrix} \int_{0}^{0.5} f (x) d x & l = 0 \\ \int_{l - 0.5}^{l + 0.5} f (x) d x & l \in {1, 2, \dots}, \end{matrix}

where $f$ is either the probability density function (p.d.f) of the gamma-distributed delay distribution, or the p.d.f of the convolution of two independent gamma-distributed delay distributions. The former applies when line list data is available, and the observed data is deconvolved with the gamma-distributed incubation period separately from the empirical delay distribution of symptom onset to observation. The latter applies whenever the observed case data is jointly deconvolved with the incubation period and the delay between symptom onset and observation.

Because the probability density function of a convolution of two independent gamma distributions does not admit a simple form in the general case, we approximate the p.d.f by drawing a million independent pairs of samples, one from each gamma distribution, summing the pairs, and computing the empirical cumulative distribution function of the sampled distribution.

The effect of smoothing on our ability to infer when Re =1

Our LOESS smoothing roughly spreads sudden changes in $R_{e}$ over 20 days in the estimated $R_{e}$ . Whether this is a substantial problem depends on the smoothness of the true $R_{e}$ that we are trying to estimate. Direct observations of behavioural changes, specifically changes in mobility, suggest the true $R_{e}$ is quite smooth: for instance it took 2–3 weeks for mobility to drop to its lowest level in response to government interventions in Switzerland (Apple, 2021; Google, 2021).

However, to get a rough feeling for the impact smoothing would have on our estimates and downstream analysis in case the true $R_{e}$ does change abruptly, we can use a simple analysis using a linear approximation. In the case of a step-wise change from R₀ to R₁ (with $R_{0} > R_{1}$ ) at time t₀, the estimated smooth $R_{e}$ will start decreasing about 10 days prior to t₀, and take another 10 days after to reach the terminal value (Appendix 2—figure 1). When inferring the day that a certain threshold value was reached (e.g. $R_{e} = 1$ ) we will be off by a number of days $s$ , dictated by R₀ and R₁. Specifically, the delay $s$ is greater if the turning point $R_{t p} = \frac{R_{1} + R_{0}}{2}$ is further above 1, or the slope $a_{t p} = \frac{R_{1} - R_{0}}{20}$ is closer to 0:

s = (R_{t p} - 1) \frac{20}{R_{0} - R_{1}} .

Appendix 2—figure 1

Download asset Open asset

Schematic of the effect of smoothing on the ability to estimate when $R_{e} = 1$ .

The true $R_{e}$ is indicated by the black solid line, the black dashed line shows a linear approximation of the smoothed $R_{e}$ . Instead of crossing 1 at $t_{0}$ , this line crosses 1 at $t_{0} + s$ .

In Table 1 we have listed some possible delay values, using R₀ values spanning the range of values reported for SARS-CoV-2 (Alimohamadi et al., 2020). The delay is positive if $R_{0} > R_{1}$ and $R_{t p} > 1$ , which was the case for most countries around the 1st lockdown. In general, these numbers can be considered a ‘worst-case’ scenario: when the true underlying $R_{e}$ changes more gradually than considered here, the smoothing introduced by our pipeline will have a smaller effect.

Note that these calculations specifically refer to the point estimate. The estimates may stop being significantly above the threshold already earlier, especially when the confidence interval is wide and the slope is close to 0.

Appendix 2—table 1

The effect of smoothing on the ability to estimate when $R_{e} = 1$ .

These values were calculated using Equation 11.

R₀	R₁	$R_{t p}$	$a_{t p}$ (per day)	Delays (days)
6.0	0.0	3.0	–0.30	6.7
3.0	0.0	1.5	–0.15	3.3
3.5	0.5	2.0	–0.15	6.7
2.5	0.5	1.5	–0.10	5.0
3.3	0.9	2.1	–0.12	9.2
1.8	0.8	1.3	–0.05	6.0

A concrete example to illustrate the overlapping block bootstrap method

Appendix 2—table 2

Residuals and their corresponding days of the week.

Day of the week	Mon	Tue	Wed	Thu	Fri	Sat	Sun
Residuals		e₁	e₂	e₃	e₄	e₅	e₆
	e₇	e₈	e₉	e₁₀	e₁₁	e₁₂	e₁₃
	e₁₄	e₁₅	e₁₆	e₁₇	e₁₈	e₁₉	e₂₀
	e₂₁	e₂₂	e₂₃	e₂₄	e₂₅	e₂₆	e₂₇
	e₂₈	e₂₉	e₃₀

We illustrate the overlapping block bootstrap method with block length $b = 10$ in a small example. Let $(e_{1}, e_{2}, \dots, e_{30})$ denote residuals corresponding to the days of the week as shown in Table 2.

We first randomly sample a block of 10 consecutive residuals from $(e_{1}, e_{2}, \dots, e_{30})$ . Say the sampled residuals are $(e_{3}, e_{4}, \dots, e_{12})$ . Since the first residual e₁ of the original residuals corresponds to a Tuesday, we then take the longest part of $(e_{3}, e_{4}, \dots, e_{12})$ such that the first residual corresponds to Tuesday. In this case, we keep $(e_{8}, e_{9}, \dots, e_{12})$ .

The length of the sequence of re-sampled residuals is now only 5, which is less than the desired total length 30, so we have to sample again. We randomly sample a second block of 10 consecutive residuals from $(e_{1}, e_{2}, \dots, e_{30})$ . Say the sampled residuals are now $(e_{19}, e_{20}, \dots, e_{28})$ . Since the last residual e₁₂ of the first set of re-sampled residuals corresponds to a Saturday, we take the longest part of $(e_{19}, e_{20}, \dots, e_{28})$ such that the first residual corresponds to a Sunday. In this case, we keep $(e_{20}, e_{21}, \dots, e_{28})$ . We then glue it to the previously sampled residuals and obtain $(e_{8}, e_{9}, \dots, e_{12}, e_{20}, e_{21}, \dots, e_{28})$ with length 14. Since this is less than the desired length 30, we need to sample again.

So we randomly sample a third block of 10 consecutive residuals from $(e_{1}, e_{2}, \dots, e_{30})$ . Say the re-sampled residuals are $(e_{1}, e_{2}, \dots, e_{10})$ . Since the last residual e₂₈ of the kept residuals corresponds to a Monday, we take the longest part of $(e_{1}, e_{2}, \dots, e_{10})$ such that the first residual corresponds to a Tuesday. In this case, we keep $(e_{1}, e_{2}, \dots, e_{10})$ . We then glue it to the previously kept residuals and obtain $(e_{8}, e_{9}, \dots, e_{12}, e_{20}, e_{21}, \dots, e_{28}, e_{1}, e_{2}, \dots, e_{10})$ . Its length is 24, which is less than the original length of 30, so we keep going.

We randomly sample a fourth block of 10 consecutive residuals from $(e_{1}, e_{2}, \dots, e_{30})$ . Say the sample residuals are $(e_{17}, e_{18}, \dots, e_{26})$ . Since the last residual e₁₀ of the kept residuals corresponds to a Thursday, we take the longest part of $(e_{17}, e_{18}, \dots, e_{26})$ such that the first residual corresponds to Friday. In this case, we keep $(e_{18}, e_{19}, \dots, e_{26})$ . We then glue it to the previously kept residuals and obtain $(e_{8}, e_{9}, \dots, e_{12}, e_{20}, e_{21}, \dots, e_{28}, e_{1}, e_{2}, \dots, e_{10}, e_{18}, e_{19}, \dots, e_{26})$ . Note that its length is 33 which is larger than the original length of 30, so we cut the last three residuals $(e_{24}, e_{25}, e_{26})$ .

This means that we obtain $(e_{8}, e_{9}, \dots, e_{12}, e_{20}, e_{21}, \dots, e_{28}, e_{1}, e_{2}, \dots, e_{10}, e_{18}, e_{19}, \dots, e_{23})$ as our set of block bootstrapped residuals.

Appendix 3

Appendix 3—figure 1

Download asset Open asset

Performance of our method on simulated scenarios with differing slopes.

(A) The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). We varied the time it took to change from one $R_{e}$ value to the next, $t \in {7, 14, 28}$ (columns). Larger values of $t$ correspond to less abrupt changes. (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. (C) The root mean squared relative error for every time point. (D) The fraction of simulations where we estimate $R_{e}$ is significantly above or below one, depending on the true value of $R_{e}$ . We see that the method closely tracks the true $R_{e}$ in all scenarios, although the error is greater for steeper slopes. In the case of steeper changes in $R_{e}$ the overall size of the epidemic is also smaller, which explains the larger confidence intervals.

Appendix 3—figure 2

Download asset Open asset

Performance of our method, modified to skip the smoothing step in the pipeline, on simulated scenarios with observation noise.

The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases (with varying country-specific noise profiles; rows) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). Contrary to our normal pipeline, the observations were not smoothed prior to the deconvolution and $R_{e}$ estimation. We see that the estimates are highly variable.

Appendix 3—figure 3

Download asset Open asset

Performance of our method on simulated scenarios with observation noise.

The columns differ in the method used to construct confidence intervals: EpiEstim reports the 95% HPD of EpiEstim on the original data, estimateR refers to our method, Quantiles and Reverse Quantiles use the 5 and 95% quantiles of the estimated $R_{e}$ to construct the CIs. (A) The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases (with varying country-specific noise profiles; rows) 100 times. From each trajectory we estimated $R_{e}$ (yellow ribbons represent the estimated mean ± sd across 100 simulations), and constructed a 95% confidence interval (purple ribbons represent the mean ± sd of the estimated lower/upper endpoint). (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage.

Appendix 3—figure 4

Download asset Open asset

Performance of our method on simulated scenarios with varying population size, using confidence intervals from EpiEstim.

(A) We specified a constant $R_{e} \in {0.8, 1, 1.5}$ value (black line; rows) to simulate a trajectory of reported cases (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). The simulated scenarios had differing initial incidence of $I_{0} \in {10, 100, 1000, 5000, 10000}$ infections per day (columns). In the top row, $R_{e} < 1$ so the epidemic is decreasing. In the middle row, $R_{e} = 1$ , the epidemic is constant, and in the bottom row, $R_{e} > 1$ , the epidemic is increasing. The bias at the start is due to the initialisation of the simulation. (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. We see that the EpiEstim coverage strongly declines with increased epidemic size.

Appendix 3—figure 5

Download asset Open asset

Performance of our method on simulated scenarios with varying population size, using the Union of EpiEstim and Block bootstrap 95% confidence intervals.

(A) We specified a constant $R_{e} \in {0.8, 1, 1.5}$ value (black line; rows) to simulate a trajectory of reported cases (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). The simulated scenarios had differing initial incidence of $I_{0} \in {10, 100, 1000, 5000, 10000}$ infections per day (columns). In the top row, $R_{e} < 1$ so the epidemic is decreasing. In the middle row, $R_{e} = 1$ , the epidemic is constant, and in the bottom row, $R_{e} > 1$ , the epidemic is increasing. The bias at the start is due to the initialisation of the simulation. (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. We see that for a wide range of infection incidences, our 95% confidence interval is informative and covers the true value of $R_{e}$ .

Appendix 3—figure 6

Download asset Open asset

Performance of our method on simulated scenarios using a fixed shift versus the deconvolution to infer infection incidence.

The fixed shift method shifts the observations back by the mean of the delay distribution (here assumed to correspond to confirmed cases). (A) The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and constructed a 95% confidence interval (purple boxplots of the lower/upper endpoint). (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. The average coverage in this scenario was 0.90 with deconvolution and 0.78 with the fixed shift. (C) The root mean squared relative error for every time point. The average (cumulative) RMSE in this scenario was 0.0706 (25.4) with deconvolution and 0.0726 (26.6) with the fixed shift.

Appendix 3—figure 7

Download asset Open asset

Performance of our method on simulated scenarios with misspecified delay distributions.

When estimating $R_{e}$ , we misspecified the mean of the delay distribution (5.5 for symptom-onset to case confirmation) by the numbers above the columns. (A) The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and a 95% confidence interval (purple boxplots of the lower/upper endpoint). (B) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. The average coverage in this scenario was 0.72, 0.85, 0.88, 0.90, 0.90, 0.89, 0.82, 0.69 from –5 to 10. (C) The root mean squared relative error for every time point. The average (cumulative) RMSE in this scenario was 0.0989 (36.0), 0.0807 (29.2), 0.0746 (26.9), 0.0706 (25.4), 0.0701 (25.2), 0.0726 (26.0), 0.0909 (32.3), 0.134 (47.0) from –5 to 10.

Appendix 3—figure 8

Download asset Open asset

Performance of our method on simulated scenarios with time-varying delay distributions.

The observations were simulated with a time-varying delay distribution for (**A,B,C**) confirmed cases, or (**D,E,F**) deaths (see Methods), and then estimated with (right column) or without (left column) taking the time-varying distributions into account. (**A, D**) The specified $R_{e}$ trajectory (black line; see Methods) was used to simulate a trajectory of reported cases or deaths (with Swiss case observation noise) 100 times. From each trajectory we estimated $R_{e}$ (yellow boxplots), and a 95% confidence interval (purple boxplots of the lower/upper endpoint). (**B, E**) The fraction of simulations where the true $R_{e}$ value was within the 95% confidence interval. The dashed red line indicates the nominal 95% coverage. For the cumulative cases, the average coverage in this scenario was 0.89 without and 0.89 with time variation. For the deaths, the average coverage was 0.83 without and 0.92 with time variation. (**C, F**) The root mean squared relative error for every time point. For the cumulative cases, the average (cumulative) RMSE was 0.0728 (26.2) without and 0.0783 (28.5) with time variation. For the deaths, the average (cumulative) RMSE was 0.113 (39.5) without and 0.0861 (30.8) with time variation.

Appendix 4

Appendix 4—figure 1

Download asset Open asset

Stability of $R_{e}$ estimates at present.

We estimated $R_{e}$ through time repeatedly on 4 scenarios. With each new iteration, we added one new data point at present. Each $R_{e}$ trajectory is presented with its own colour. Purple trajectories are iterations for which the last known case data point was furthest in the past, yellow trajectories are trajectories for which the last known case data point was closest to the present. In each panel and for each $R_{e}$ trajectory, 100 simulation replicates were aggregated. The median of mean estimates are presented with lines and medians of upper and lower bounds of 95% confidence intervals are shown with translucent ribbons. For each scenario, two panels are presented. Each time the right panel correspond to raw estimates and the left panel corresponds to the same estimates with the last 4 $R_{e}$ values removed from each $R_{e}$ trajectory. (A) Stable $R_{e}$ . (B) Gradual increase in $R_{e}$ . (C) Gradual decrease in $R_{e}$ . (D) Abrupt decrease in $R_{e}$ .

Appendix 4—figure 2

Download asset Open asset

Mean delay in Switzerland between onset of symptoms and reporting.

For each date, the mean is taken over the last 300 reports with known symptom onset date, based on line list data from the FOPH. For early dates, before 300 reports are available, the mean is taken over the first 300 reports.

Appendix 4—figure 3

Download asset Open asset

Comparison of the $R_{e}$ estimates with or without accounting for known symptom onset dates and for time-variability on reporting delays.

The comparison is based on time series of confirmed cases in Switzerland, from line list data provided by the FOPH. Both the inclusion of known symptom onset dates and of the time-variability of reporting delay distributions have an effect on the Re estimates, in particular for early estimates in this case. The fraction of cases with known symptom onset date has drastically reduced since November 2020, hence the overlap in curves with and without symptom onset data for later dates. For each trajectory the point estimate is shown with a line, and the translucent ribbon indicates the 95% confidence interval.

Appendix 4—figure 4

Download asset Open asset

Effective reproductive number estimates with or without accounting for known imports.

The comparison is based on time series of confirmed cases in Switzerland, from line list data provided by the FOPH. The analysis ignoring imports is unbiased if the number of imports equals the number of exports. Since the analysis accounting for imports is not accounting for exports, the results are a lower limit for the effective reproductive number. Very few imported cases were reported since November 2020, hence the complete overlap in the curves after that date. For each trajectory the point estimate is shown with a line, and the translucent ribbon indicates the 95% confidence interval.

Appendix 5

Appendix 5—figure 1

Download asset Open asset

Square root and log transformations to stabilise the variance of residuals.

Each row corresponds to the results of observations from Switzerland (CHE), China (CHN), France (FRA), New Zealand (NZL) and the United States (USA), respectively. The first and last two plots correspond to the result of square root transformation and log transformation, respectively.

Appendix 5—figure 2

Download asset Open asset

Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the observations from five different countries.

In each row, the two plots are the ACF and PACF plots of the observations from Switzerland (CHE), China (CHN), France (FRA), New Zealand (NZL) and the United States (USA), respectively.

Appendix 5—figure 3

Download asset Open asset

Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of observations from Switzerland and the fitted ARIMA model.

The two plots on the upper row are the ACF and PACF plots of the observations from Switzerland. The two plots on the lower row are the ACF and PACF plots of the residuals of the fitted ARIMA model.

Appendix 5—figure 4

Download asset Open asset

Simulated observations with and without noise.

The upper row shows the real observations from Switzerland (twice the same). The other four rows show simulated observations, the left column shows simulations without the noise term ( ${\tilde{D}}_{t}$ in Section 4.7), and the right the simulated observations with the noise term ( $D_{t}$ in Section 4.7).

Appendix 5—figure 5

Download asset Open asset

Estimated $R_{e}$ with different smoothing windows.

For each trajectory the point estimate is shown with a yellow line, and the purple ribbon indicates the 95% confidence interval.

Appendix 5—figure 6

Download asset Open asset

Estimated $R_{e}$ with different block sizes.

The ribbons indicate the 95% confidence interval.

Appendix 6

SI 50 analysis

Reference date: first day the stringency index exceeded 50 ( $S I > 50$ ).

The 42 included countries: Algeria, Andorra(*), Australia, Austria, Belgium, Canada, Chile, Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Germany, Greece, Iceland, Indonesia, Iran, Ireland, Israel, Japan, South Korea(*), Lebanon, Malaysia, Mexico, Netherlands, Norway, Philippines, Poland, Portugal, Saudi Arabia, Singapore, Slovenia, Spain, Switzerland, Tajikistan(*), Thailand, United Arab Emirates, United Kingdom, United States, Vietnam.

A star indicates the country was not included in the $Δ S I$ analysis (e.g. because the biggest jump in SI took place prior to the first possible $R_{e}$ estimate).

For 37/42 countries the $R_{e}$ estimate was above one prior to the reference date, and significantly so for 35/42. The countries that reached $R_{e} < 1$ prior to the reference date were Andorra (17 days prior), Australia (2 day prior), Denmark (3 days prior), Japan (359 days prior), and Vietnam (3 days prior).

$Δ S I$ analysis

Reference date: date of the biggest 7 day increase in the SI.

The 45 included countries: Algeria, Australia, Austria, Belarus(*), Belgium, Canada, Chile, Colombia(*), Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Germany, Greece, Iceland, Indonesia, Iran, Ireland, Israel, Japan, Lebanon, Malaysia, Mexico, Netherlands, New Zealand(*), Norway, Philippines, Poland, Portugal, Russia(*), Saudi Arabia, Serbia(*), Singapore, Slovenia, Spain, Switzerland, Thailand, Turkey(*), United Arab Emirates, United Kingdom, United States, Vietnam.

A star indicates the country was not included in the $S I 50$ analysis (e.g. because $S I = 50$ was never reached).

For 41/45 countries the $R_{e}$ estimate was above one prior to the reference date, and significantly so for 38/45. The countries that reached $R_{e} < 1$ prior to the reference date were Australia (3 days prior), Denmark (4 days prior), Germany (4 days prior), and Vietnam (10 days prior).

Appendix 6—table 1

Investigating the relation between the date of ‘lockdown’ and the date that the estimated $R_{e}$ dropped below 1.

The first three columns contain the same information as the first four columns of the main text Table 1. The last two columns are analogous to the third (‘ ${\hat{R}}_{e} < 1$ based on confirmed cases’) but are based on $R_{e}$ estimates for hospitalisations and deaths respectively. For each of these observation types, we used our method to determine when the $R_{e}$ estimate first dropped below 1, and for which dates the corresponding 95% confidence interval contained 1. Further, we used news reports to determine when a country implemented stay-at-home orders (a ‘lockdown’). Based on our $R_{e}$ estimates for confirmed cases, Denmark, Germany, the Netherlands, and Slovenia had 95% confidence intervals that included or were below one before a nationwide lockdown was implemented. For $R_{e}$ estimates based on COVID-19 deaths, there are also four: Denmark, the Netherlands, Poland, and the United Kingdom. See Appendix 2 for smoothing related caveats.

Country	Lockdown	${\hat{R}}_{e} < 1$ based on Confirmed cases	${\hat{R}}_{e} < 1$ based on Deaths	${\hat{R}}_{e} < 1$ based on Hospitalisations
Austria	16–03	20–03 [20-03, 20-03]
Belgium	18–03	30–03 [25-03, 03-04]	26–03 [24-03, 26-03]	25–03 [24-03, 25-03]
Denmark	18–03	10–03 [10-03, 20-06]	22–03 [18-03, 07-01]
Finland	16–03	02–04 [29-03, 30-04]	07–04 [25-03, 11-04]
France	17–03	27–03 [23-03, 07-04]	24–03 [22-03, 26-03]	27–03 [25-03, 26-03]
Germany	22–03	18–03 [17-03, 19-03]	31–03 [23-03, 04-04]
Ireland	27–03	08–04 [04–04, 15–04]	05–04 [31-03, 09-04]	06–04 [06–04, 26–04]
Italy	10–03	18–03 [17-03, 19-03]	14–03 [01–03, 29–05]
Netherlands	23–03	05–04 [22-03, 10-04]	22–03 [19-03, 02-04]	26–03 [24-03, 26-03]
Norway	14–03	21–03 [17-03, 24-03]	25–03 [18-03, 08-04]
Poland	25–03	02–04 [31-03, 17-04]	09–04 [24-03, 01-12]
Portugal	16–03	28–03 [23-03, 15-04]	28–03 [21-03, 12-04]
Romania	24–03	06–04 [31-03, 29-04]	17–04 [25-03, 28-04]
Russian Federation	30–03	04–05 [01–05, 08–05]	18–05 [14-05, 12-12]
Slovenia	20–03	23–03 [13-03, 26-03]	26–03 [20–03, ≥03-05-2021]
Spain	14–03	26–03 [25-03, 26-03]
Sweden		01–04 [06–03,≥03-05-2021]	05–04 [13–03,≥03-05-2021]
Switzerland	17–03	22–03 [20-03, 22-03]	21–03 [18-03, 23-03]	18–03 [16-03, 18-03]
Turkey	21–03	08–04 [01–04, 13–04]	04–04 [31-03, 06-04]
United Kingdom	24–03	30–03 [28-03, 20-04]	25–03 [24-03, 25-03]	29–03 [27-03, 29-03]

Appendix 6—table 2

News and public resources used to determine when a country implemented the first non-pharmaceutical interventions, and a nationwide lockdown.

Country	First Measure	Lockdown	URL
Austria	10–03	16–03	https://mrc-ide.github.io/covid19estimates/#/interventions
Belgium	10–03	18–03	https://mrc-ide.github.io/covid19estimates/#/interventions
Denmark	12–03	18–03	https://mrc-ide.github.io/covid19estimates/#/interventions
Finland	16–03	16–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Finland#Response_by_sector
France	29–02	17–03	https://www.politico.eu/article/europes-coronavirus-lockdown-measures-compared/
Germany	06–03	22–03	https://www.bundesregierung.de/breg-de/themen/coronavirus/besprechung-der-bundeskanzlerin-mit-den-regierungschefinnen-und-regierungschefs-der-laender-1733248
Ireland	12–03	27–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_the_Republic_of_Ireland#Containment_phase
Italy	22–02	10–03	https://www.politico.eu/article/europes-coronavirus-lockdown-measures-compared/
Netherlands	10–03	23–03	https://www.volkskrant.nl/nieuws-achtergrond/bijeenkomsten-tot-1-juni-verboden-burgemeesters-mogen-handhaven-met-forse-boetes~b41b8508/
Norway	12–03	14–03	https://www.euractiv.com/section/coronavirus/short_news/norway-update-covid-19/
Poland	09–03	25–03	https://www.politico.eu/article/europes-coronavirus-lockdown-measures-compared/
Portugal	11–03	16–03	https://www.politico.eu/article/europes-coronavirus-lockdown-measures-compared/
Romania	21–02	24–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Romania
Russian Federation	25–03	30–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Russia
Slovenia	09–03	20–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Slovenia
Spain	10–03	14–03	https://www.elmundo.es/espana/2020/03/13/5e6b844e21efa0dd258b45a5.html
Sweden	11–03
Switzerland	28–02	17–03	‘Verordnung 2 über Massnahmen zur Bekämpfung des Coronavirus (COVID-19)’
Turkey	12–03	21–03	https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Turkey#Government_response
United Kingdom	12–03	24–03	https://www.bbc.com/news/uk-52012432

Data availability

The source code of the pipeline is available at https://github.com/covid-19-Re/shiny-dailyRe (copy archived at swh:1:rev:012c2892381ed8e246843556ca6ee502f134a6e0); this includes a script to download the required incidence data from public sources. The resulting estimates (updated daily) are available at: https://github.com/covid-19-Re/dailyRe-Data. The code and data (such as case, hospitalization and death occurence data) necessary to reproduce the figures in the paper is at https://github.com/covid-19-Re/paper-code (copy archived at swh:1:rev:edbe2e7a9ca3c5183772e94909d36fbe23e124cb). Note that for Switzerland we use delay distributions from a line list provided to us by the Federal Office of Public Health (FOPH), with one row per infected individual including patient information. Due to privacy concerns, we are not allowed to share this original data. To obtain access to these records, interested individuals should contact the FOPH directly.

The following data sets were generated

1. Angst D
2. Chen C
(2022) GitHub
ID dailyRe-Data. dailyRe-Data.

https://github.com/covid-19-Re/dailyRe-Data

References

Software
1. Abbott S
2. Hellewell J
3. Sherratt K
4. Gostic K
5. Hickson J
6. Badr HS
7. DeWitt M
8. Thompson R
9. Funk S
(2020a) EpiNow2: estimate real-time case counts and time-varying epidemiological parameters
Zenodo.

https://doi.org/10.5281/zenodo.5036949
1. Abbott S
2. Hellewell J
3. Thompson RN
4. Sherratt K
5. Gibbs HP
6. Bosse NI
7. Munday JD
8. Meakin S
9. Doughty EL
10. Chun JY
11. Chan YWD
12. Finger F
13. Campbell P
14. Endo A
15. Pearson CAB
16. Gimma A
17. Russell T
18. Flasche S
19. Kucharski AJ
20. Eggo RM
21. Funk S
22. CMMID COVID modelling group
(2020b) Estimating the time-varying reproduction number of SARS-cov-2 using national and subnational case counts
Wellcome Open Research 5:112.

https://doi.org/10.12688/wellcomeopenres.16006.2
- Google Scholar
Website
1. Abbott S
2. Hellewell J
3. Thompson RN
4. Sherratt K
5. Gibbs HP
6. Bosse NI
7. Munday JD
8. Meakin S
9. Doughty EL
10. Chun JY
11. Chan YWD
12. Finger F
13. Campbell P
14. Endo A
15. Pearson CAB
16. Gimma A
17. Russell T
(2022) Temporal variation in transmission during the covid-19 outbreak
Accessed May 10, 2022.

https://epiforecasts.io/covid/
1. Ali ST
2. Wang L
3. Lau EHY
4. Xu X-K
5. Du Z
6. Wu Y
7. Leung GM
8. Cowling BJ
(2020) Serial interval of SARS-cov-2 was shortened over time by nonpharmaceutical interventions
Science 369:1106–1109.

https://doi.org/10.1126/science.abc9004
- PubMed
- Google Scholar
(2020) Estimate of the basic reproduction number for COVID-19: A systematic review and meta-analysis
Journal of Preventive Medicine and Public Health = Yebang Uihakhoe Chi 53:151–157.

https://doi.org/10.3961/jpmph.20.076
- PubMed
- Google Scholar
Book
1. Anderson RM
2. May RM
(1991)
Infectious Diseases of Humans: Dynamics and Control

Oxford: Oxford University Press.
- Google Scholar
Software
1. Angst D
(2022) Shiny-dailyre, version swh:1:rev:012c2892381ed8e246843556ca6ee502f134a6e0
Software Heritage.

https://github.com/covid-19-Re/shiny-dailyRe
Website
1. Apple
(2021) COVID-19 Mobility Trends Reports
Accessed May 6, 2021.

https://covid19.apple.com/mobility
1. Banholzer N
2. van Weenen E
3. Lison A
4. Cenedese A
5. Seeliger A
6. Kratzwald B
7. Tschernutter D
8. Salles JP
9. Bottrighi P
10. Lehtinen S
11. Feuerriegel S
12. Vach W
(2021) Estimating the effects of non-pharmaceutical interventions on the number of new infections with COVID-19 during the first epidemic wave
PLOS ONE 16:e0252827.

https://doi.org/10.1371/journal.pone.0252827
- PubMed
- Google Scholar
Preprint
1. Banholzer N
2. Lison A
3. Özcelik D
4. Stadler T
5. Feuerriegel S
6. Vach W
(2022) The Methodologies to Assess the Effects of Non-Pharmaceutical Interventions during COVID-19: A Systematic Review
medRxiv.

https://doi.org/10.1101/2022.04.14.22273858
- Google Scholar
1. Bi Q
2. Wu Y
3. Mei S
4. Ye C
5. Zou X
6. Zhang Z
7. Liu X
8. Wei L
9. Truelove SA
10. Zhang T
11. Gao W
12. Cheng C
13. Tang X
14. Wu X
15. Wu Y
16. Sun B
17. Huang S
18. Sun Y
19. Zhang J
20. Ma T
21. Lessler J
22. Feng T
(2020) Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study
The Lancet. Infectious Diseases 20:911–919.

https://doi.org/10.1016/S1473-3099(20)30287-5
- PubMed
- Google Scholar
(2006) Estimating in real time the efficacy of measures to control emerging communicable diseases
American Journal of Epidemiology 164:591–597.

https://doi.org/10.1093/aje/kwj274
- PubMed
- Google Scholar
(2013) A new framework and software to estimate time-varying reproduction numbers during epidemics
American Journal of Epidemiology 178:1505–1512.

https://doi.org/10.1093/aje/kwt133
- PubMed
- Google Scholar
(2019) Complexity of the basic reproduction number (R0)
Emerging Infectious Diseases 25:171901.

https://doi.org/10.3201/eid2501.171901
- Google Scholar
(1977) Maximum likelihood from incomplete data via the em algorithm
Journal of the Royal Statistical Society 39:1–22.

https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
- Google Scholar
Preprint
1. Esra R
2. Jamieson L
3. Fox MP
4. Letswalo D
5. Ngcobo N
6. Mngadi S
7. Estill J
8. Meyer-Rath G
9. Keiser O
(2020) Evaluating the Impact of Non-Pharmaceutical Interventions for SARS-CoV-2 on a Global Scale
medRxiv.

https://doi.org/10.1101/2020.07.30.20164939
- Google Scholar
(2020) National age and coresidence patterns shape COVID-19 vulnerability
PNAS 117:16118–16120.

https://doi.org/10.1073/pnas.2008764117
- Google Scholar
Website
1. European Centre for Disease Prevention and Control (ECDC)
(2022) Daily number of new reported cases of COVID-19 by country worldwide
Accessed October 2, 2020.

https://opendata.ecdc.europa.eu/covid19/casedistribution/csv
1. Flaxman S
2. Mishra S
3. Gandy A
4. Unwin HJT
5. Mellan TA
6. Coupland H
7. Whittaker C
8. Zhu H
9. Berah T
10. Eaton JW
11. Monod M
12. Imperial College COVID-19 Response Team
13. Ghani AC
14. Donnelly CA
15. Riley S
16. Vollmer MAC
17. Ferguson NM
18. Okell LC
19. Bhatt S
(2020) Estimating the effects of non-pharmaceutical interventions on covid-19 in europe
Nature 584:257–261.

https://doi.org/10.1038/s41586-020-2405-7
- PubMed
- Google Scholar
1. Goldstein E
2. Dushoff J
3. Ma J
4. Plotkin JB
5. Earn DJD
6. Lipsitch M
(2009) Reconstructing influenza incidence by deconvolution of daily mortality time series
PNAS 106:21825–21829.

https://doi.org/10.1073/pnas.0902958106
- PubMed
- Google Scholar
Website
1. Google
(2021) COVID-19 Community Mobility Reports
Accessed May 6, 2021.

https://www.google.com/covid19/mobility
1. Gostic KM
2. McGough L
3. Baskerville EB
4. Abbott S
5. Joshi K
6. Tedijanto C
7. Kahn R
8. Niehus R
9. Hay JA
10. De Salazar PM
11. Hellewell J
12. Meakin S
13. Munday JD
14. Bosse NI
15. Sherrat K
16. Thompson RN
17. White LF
18. Huisman JS
19. Scire J
20. Bonhoeffer S
21. Stadler T
22. Wallinga J
23. Funk S
24. Lipsitch M
25. Cobey S
(2020) Practical considerations for measuring the effective reproductive number, rt
PLOS Computational Biology 16:1008409.

https://doi.org/10.1371/journal.pcbi.1008409
- PubMed
- Google Scholar
1. Griffin J
2. Casey M
3. Collins Á
4. Hunt K
5. McEvoy D
6. Byrne A
7. McAloon C
8. Barber A
9. Lane EA
10. More S
(2020) Rapid review of available evidence on the serial interval and generation time of COVID-19
BMJ Open 10:e040263.

https://doi.org/10.1136/bmjopen-2020-040263
- PubMed
- Google Scholar
Book
1. Hale T
2. Webster S
3. Petherick A
4. Phillips T
5. Kira B
(2020)
Oxford COVID-19 Government Response Tracker

Blavatnik School of Government.
- Google Scholar
1. Hale T
2. Angrist N
3. Goldszmidt R
4. Kira B
5. Petherick A
6. Phillips T
7. Webster S
8. Cameron-Blake E
9. Hallas L
10. Majumdar S
11. Tatlow H
(2021) A global panel database of pandemic policies (oxford COVID-19 government response tracker)
Nature Human Behaviour 5:529–538.

https://doi.org/10.1038/s41562-021-01079-8
- PubMed
- Google Scholar
Software
1. Hamouda O
(2020) Schätzung der aktuellen entwicklung der sars-cov-2-epidemie in deutschland–nowcasting
Zenodo.

https://zenodo.org/record/5712013#.Ywb3-nZBw2w
1. Haug N
2. Geyrhofer L
3. Londei A
4. Dervic E
5. Desvars-Larrive A
6. Loreto V
7. Pinior B
8. Thurner S
9. Klimek P
(2020) Ranking the effectiveness of worldwide COVID-19 government interventions
Nature Human Behaviour 4:1303–1312.

https://doi.org/10.1038/s41562-020-01009-0
- PubMed
- Google Scholar
(2020) Estimation of SARS-cov-2 mortality during the early stages of an epidemic: A modeling study in hubei, china, and six regions in europe
PLOS Medicine 17:1003189.

https://doi.org/10.1371/journal.pmed.1003189
- PubMed
- Google Scholar
Software
1. Heiden M
(2021) Sars-cov-2-nowcasting und -r-schaetzung
Zenodo.

https://zenodo.org/record/5712013#.YvpNmXZBzIU
Software
1. Huisman JS
(2022) Paper-code, version swh:1:rev:edbe2e7a9ca3c5183772e94909d36fbe23e124cb
Software Heritage.

https://github.com/covid-19-Re/paper-code
Preprint
1. Huisman JS
2. Scire J
3. Caduff L
4. Fernandez-Cassi X
5. Ganesanandamoorthy P
6. Kull A
7. Scheidegger A
8. Stachler E
9. Boehm AB
10. Hughes B
11. Knudson A
12. Topol A
13. Wigginton KR
14. Wolfe MK
15. Kohn T
16. Ort C
17. Stadler T
18. Julian TR
(2022) Wastewater-Based Estimation of the Effective Reproductive Number of Sars-Cov-2
medRxiv.

https://doi.org/10.1101/2021.04.29.21255961
- Google Scholar
Website
1. Karberg S
(2020) Der “überflüssige” Lockdown? Tagesspiegel
Accessed October 22, 2020.

https://www.tagesspiegel.de/wissen/der-ueberfluessige-lockdown-ja-der-r-wert-sank-schon-vor-der-kontaktsperre-aber-/25767642.html
(2022) Inferring the effective start dates of non-pharmaceutical interventions during covid-19 outbreaks
International Journal of Infectious Diseases 117:361–368.

https://doi.org/10.1016/j.ijid.2021.12.364
- PubMed
- Google Scholar
1. Kucharski AJ
2. Russell TW
3. Diamond C
4. Liu Y
5. Edmunds J
6. Funk S
7. Eggo RM
8. Sun F
9. Jit M
10. Munday JD
(2020) Early dynamics of transmission and control of COVID-19: a mathematical modelling study
The Lancet. Infectious Diseases 20:553–558.

https://doi.org/10.1016/S1473-3099(20)30144-4
- PubMed
- Google Scholar
(2020) Assessing the impact of non-pharmaceutical interventions on SARS-cov-2 transmission in switzerland
Swiss Medical Weekly 150:w20295.

https://doi.org/10.4414/smw.2020.20295
- PubMed
- Google Scholar
1. Linton NM
2. Kobayashi T
3. Yang Y
4. Hayashi K
5. Akhmetzhanov AR
6. Jung S-M
7. Yuan B
8. Kinoshita R
9. Nishiura H
(2020) Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data
Journal of Clinical Medicine 9:E538.

https://doi.org/10.3390/jcm9020538
- PubMed
- Google Scholar
(2005) Superspreading and the effect of individual variation on disease emergence
Nature 438:355–359.

https://doi.org/10.1038/nature04153
- PubMed
- Google Scholar
1. Lucy LB
(1974) An iterative technique for the rectification of observed distributions
The Astronomical Journal 79:745.

https://doi.org/10.1086/111605
- Google Scholar
1. Mbow M
2. Lell B
3. Jochems SP
4. Cisse B
5. Mboup S
6. Dewals BG
7. Jaye A
8. Dieye A
9. Yazdanbakhsh M
(2020) COVID-19 in africa: dampening the storm?
Science 369:624–626.

https://doi.org/10.1126/science.abd3902
- PubMed
- Google Scholar
1. Meyerowitz-Katz G
2. Merone L
(2020) A systematic review and meta-analysis of published research data on COVID-19 infection fatality rates
International Journal of Infectious Diseases 101:138–148.

https://doi.org/10.1016/j.ijid.2020.09.1464
- PubMed
- Google Scholar
Website
1. Minder R
(2020) Counting Bodies and Pointing Fingers as Spain Tallies Coronavirus Dead
Accessed October 22, 2021.

https://www.nytimes.com/2020/04/16/world/europe/coronoavirus-spain-death-toll.html
1. Morris DH
2. Yinda KC
3. Gamble A
4. Rossine FW
5. Huang Q
6. Bushmaker T
7. Fischer RJ
8. Matson MJ
9. Van Doremalen N
10. Vikesland PJ
11. Marr LC
12. Munster VJ
13. Lloyd-Smith JO
(2021) Mechanistic theory predicts the effects of temperature and humidity on inactivation of SARS-cov-2 and other enveloped viruses
eLife 10:e65902.

https://doi.org/10.7554/eLife.65902
- PubMed
- Google Scholar
Book
1. Nishiura H
2. Chowell G
(2009)
The effective reproduction number as a prelude to statistical estimation of time-dependent epidemic trends

In: Bettencourt Lu, Castillo-Chavez Carlos, editors. Mathematical and Statistical Estimation Approaches in Epidemiology, Vol 1. Dordrecht, Netherlands: Springer. pp. 103–121.
- Google Scholar
(2020) Serial interval of novel coronavirus (COVID-19) infections
International Journal of Infectious Diseases 93:284–286.

https://doi.org/10.1016/j.ijid.2020.02.060
- PubMed
- Google Scholar
(2012) The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks
BMC Medical Informatics and Decision Making 12:1–9.

https://doi.org/10.1186/1472-6947-12-147
- PubMed
- Google Scholar
1. Pan A
2. Liu L
3. Wang C
4. Guo H
5. Hao X
6. Wang Q
7. Huang J
8. He N
9. Yu H
10. Lin X
11. Wei S
12. Wu T
(2020) Association of public health interventions with the epidemiology of the COVID-19 outbreak in wuhan, china
JAMA 323:1915–1923.

https://doi.org/10.1001/jama.2020.6130
- PubMed
- Google Scholar
(2021) Challenges in control of covid-19: short doubling time and long delay to effect of interventions
Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 376:20200264.

https://doi.org/10.1098/rstb.2020.0264
- PubMed
- Google Scholar
1. Petermann M
2. Wyler D
(2020) A pitfall in estimating the e ective reproductive number rt for COVID-19
Swiss Medical Weekly 150:w20307.

https://doi.org/10.4414/smw.2020.20307
- PubMed
- Google Scholar
1. Richardson WH
(1972) Bayesian-based iterative method of image restoration*
Journal of the Optical Society of America 62:55.

https://doi.org/10.1364/JOSA.62.000055
- Google Scholar
Website
(2020) Methodenbeschreibung für die Schätzung von epidemiologischen Parametern des COVID19 Ausbruchs, Österreich. AGES- Österreichische Agentur für Gesundheit und Ernährungssicherheit
Accessed April 20, 2020.

https://graz.pure.elsevier.com/de/publications/methodenbeschreibung-f%C3%BCr-die-sch%C3%A4tzung-von-epidemiologischen-para
Website
1. Robert Koch-Institut
(2020) Täglicher Lagebericht des RKI zur Coronavirus-Krankheit-2019 (COVID-19)
Accessed November 16, 2020.

https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Situationsberichte/Maerz_2022/2022-03-09-de.pdf?__blob=publicationFile
Website
(2020) Coronavirus Pandemic (COVID-19). Our World in Data
Accessed January 10, 2022.

https://ourworldindata.org/coronavirus
1. Scire J
2. Nadeau S
3. Vaughan T
4. Brupbacher G
5. Fuchs S
6. Sommer J
7. Koch KN
8. Misteli R
9. Mundorff L
10. Götz T
11. Eichenberger T
12. Quinto C
13. Savic M
14. Meienberg A
15. Burkard T
16. Mayr M
17. Meier CA
18. Widmer A
19. Kuehl R
20. Egli A
21. Hirsch HH
22. Bassetti S
23. Nickel CH
24. Rentsch KS
25. Kübler W
26. Bingisser R
27. Battegay M
28. Tschudin-Sutter S
29. Stadler T
(2020) Reproductive number of the COVID-19 epidemic in switzerland with a focus on the cantons of basel-stadt and basel-landschaft
Swiss Medical Weekly 150:w20271.

https://doi.org/10.4414/smw.2020.20271
- PubMed
- Google Scholar
Software
1. Scott JA
2. Gandy A
3. Mishra S
4. Unwin J
5. Flaxman S
6. Bhatt S
(2020)
Epidemia: modeling of epidemics using hierarchical bayesian models, version version 1.0.0

R Package.
(2020) Explaining the homogeneous diffusion of COVID-19 nonpharmaceutical interventions across heterogeneous countries
PNAS 117:21201–21208.

https://doi.org/10.1073/pnas.2010625117
- PubMed
- Google Scholar
Preprint
1. Sharma M
2. Mindermann S
3. Brauner JM
4. Leech G
5. Stephenson AB
6. Gavenčiak T
7. Kulveit J
8. Teh YW
9. Chindelevitch L
10. Gal Y
(2020) On the Robustness of Effectiveness Estimation of Nonpharmaceutical Interventions against COVID-19 Transmission
arXiv.

https://arxiv.org/abs/2007.13454v1
- Google Scholar
1. Soltesz K
2. Gustafsson F
3. Timpka T
4. Jaldén J
5. Jidling C
6. Heimerson A
7. Schön TB
8. Spreco A
9. Ekberg J
10. Dahlström Ö
11. Bagge Carlson F
12. Jöud A
13. Bernhardsson B
(2020) The effect of interventions on COVID-19
Nature 588:E26–E28.

https://doi.org/10.1038/s41586-020-3025-y
- PubMed
- Google Scholar
Website
(2020) Rt covid live
Accessed January 10, 2022.

https://github.com/rtcovidlive/covid-model
1. Tebé C
2. Valls J
3. Satorra P
4. Tobías A
(2020) COVID19-world: a shiny application to perform comprehensive country-specific data visualization for SARS-cov-2 epidemic
BMC Medical Research Methodology 20:1–7.

https://doi.org/10.1186/s12874-020-01121-9
- PubMed
- Google Scholar
Website
1. Tokyay M
(2020) Turkey hits record COVID-19 cases after change in reporting
Accessed April 26, 2021.

https://www.arabnews.com/node/1768881/middle-east
1. Tsang TK
2. Wu P
3. Lau EHY
4. Cowling BJ
(2021) Accounting for imported cases in estimating the time-varying reproductive number of coronavirus disease 2019 in hong kong
The Journal of Infectious Diseases 224:783–787.

https://doi.org/10.1093/infdis/jiab299
- PubMed
- Google Scholar
Website
1. TU Graz AGES
(2021) Epidemiologische parameter des covid19 ausbruchs, update 11.11.2021, österreich, 2020/2021
Accessed January 10, 2022.

https://wissenaktuell.ages.at/download/0/0/7951f8dc4672000514e611c70195db45c55a65f2/fileadmin/AGES2015/Wissen-Aktuell/COVID19/Update_Epidemiologische_Parameter_des_COVID19_Ausbruchs_2021-11-19.pdf
1. Wallinga J
2. Teunis P
(2004) Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
American Journal of Epidemiology 160:509–516.

https://doi.org/10.1093/aje/kwh255
- PubMed
- Google Scholar
1. Wallinga J
2. Lipsitch M
(2007) How generation intervals shape the relationship between growth rates and reproductive numbers
Proceedings. Biological Sciences 274:599–604.

https://doi.org/10.1098/rspb.2006.3754
- PubMed
- Google Scholar
Preprint
1. Yang W
2. Kandula S
3. Huynh M
4. Greene SK
5. Wye GV
6. Li W
7. Chan HT
8. McGibbon E
9. Yeung A
10. Olson D
11. Fine A
12. Shaman J
(2020) Estimating the Infection Fatality Risk of COVID-19 in New York City during the Spring 2020 Pandemic Wave
medRxiv.

https://doi.org/10.1101/2020.06.27.20141689
- Google Scholar
1. Zhou T
2. Liu Q
3. Yang Z
4. Liao J
5. Yang K
6. Bai W
7. Lu X
8. Zhang W
(2020) Preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov
Journal of Evidence-Based Medicine 13:3–7.

https://doi.org/10.1111/jebm.12376
- PubMed
- Google Scholar

Article and author information

Author details

Jana S Huisman
1. Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland
2. Swiss Institute of Bioinformatics, Lausanne, Switzerland
3. Department of Biosystems Science and Engineering, ETH Zurich, Swiss Federal Institute of Technology, Basel, Switzerland
Contribution
Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing

Contributed equally with
Jérémie Scire

For correspondence
jana.huisman@env.ethz.ch

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-1782-8109
Jérémie Scire
1. Swiss Institute of Bioinformatics, Lausanne, Switzerland
2. Department of Biosystems Science and Engineering, ETH Zurich, Swiss Federal Institute of Technology, Basel, Switzerland
Contribution
Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing

Contributed equally with
Jana S Huisman

Competing interests
No competing interests declared
Daniel C Angst

Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland

Contribution
Conceptualization, Data curation, Software, Validation, Investigation, Visualization, Project administration, Writing – review and editing

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-6512-4595
Jinzhou Li

Department of Mathematics, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland

Contribution
Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Richard A Neher
1. Swiss Institute of Bioinformatics, Lausanne, Switzerland
2. Biozentrum, University of Basel, Basel, Switzerland
Contribution
Methodology, Writing – review and editing

Competing interests
Reviewing editor, eLife

"This ORCID iD identifies the author of this article:" 0000-0003-2525-1407
Marloes H Maathuis

Department of Mathematics, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland

Contribution
Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – review and editing

Competing interests
No competing interests declared
Sebastian Bonhoeffer

Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland

Contribution
Conceptualization, Resources, Supervision, Investigation, Project administration, Writing – review and editing

Contributed equally with
Tanja Stadler

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-8052-3925
Tanja Stadler
1. Swiss Institute of Bioinformatics, Lausanne, Switzerland
2. Department of Biosystems Science and Engineering, ETH Zurich, Swiss Federal Institute of Technology, Basel, Switzerland
Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Validation, Investigation, Methodology, Project administration, Writing – review and editing

Contributed equally with
Sebastian Bonhoeffer

For correspondence
tanja.stadler@bsse.ethz.ch

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-6431-535X

Funding

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (31CA30_196267)

Tanja Stadler

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (200021_172603)

Marloes H Maathuis

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (310030B_176401)

Sebastian Bonhoeffer

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (407240-167121)

Sebastian Bonhoeffer
Tanja Stadler

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the Federal Office of Public Health Switzerland for access to their line list data, and Jūlija Pečerska for help in finding governmental datasets on COVID-19. We thank members of the modelling group of the Swiss National Covid-19 science task force for helpful discussions, and Markus Petermann and Daniel Wyler for their comments and suggestions. This work was supported by the Swiss National Science Foundation (SNSF) through grant number 31CA30_196267 (to TS), 200021_172603 (to MHM), 310030B_176401 (to SB), and NRP72 grant 407240–167121 (to SB and TS).

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

2,113

views
349

downloads
51

citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Article PDF

Open citations (links to open the citations from this article in various online reference manager services)

Mendeley

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Jana S Huisman
Jérémie Scire
Daniel C Angst
Jinzhou Li
Richard A Neher
Marloes H Maathuis
Sebastian Bonhoeffer
Tanja Stadler

(2022)

Estimation and worldwide monitoring of the effective reproductive number of SARS-CoV-2

eLife 11:e71345.

https://doi.org/10.7554/eLife.71345

Share this article

Cite this article

Evaluation of the pipeline on simulated data.

Stability of the Swiss Re estimates based on confirmed COVID-19 cases, upon adding additional days of observations.

Comparison of published Re estimates for three countries.

Example panels from the online dashboard.

Investigating the relation between the date of ‘lockdown’ and the date when the estimated Re based on case reports dropped below 1.

The association between the implementation or lifting of non-pharmaceutical interventions and changes in Re until May 2021.

Gamma distributions used in the pipeline: serial interval, incubation period, and the delay distributions assumed for each observation type.

Schematic of the effect of smoothing on the ability to estimate when Re=1.

The effect of smoothing on the ability to estimate when Re=1.

Residuals and their corresponding days of the week.

Performance of our method on simulated scenarios with differing slopes.

Performance of our method, modified to skip the smoothing step in the pipeline, on simulated scenarios with observation noise.

Performance of our method on simulated scenarios with observation noise.

Performance of our method on simulated scenarios with varying population size, using confidence intervals from EpiEstim.

Performance of our method on simulated scenarios with varying population size, using the Union of EpiEstim and Block bootstrap 95% confidence intervals.

Performance of our method on simulated scenarios using a fixed shift versus the deconvolution to infer infection incidence.

Performance of our method on simulated scenarios with misspecified delay distributions.

Performance of our method on simulated scenarios with time-varying delay distributions.

Stability of Re estimates at present.

Mean delay in Switzerland between onset of symptoms and reporting.

Comparison of the Re estimates with or without accounting for known symptom onset dates and for time-variability on reporting delays.

Effective reproductive number estimates with or without accounting for known imports.

Square root and log transformations to stabilise the variance of residuals.

Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the observations from five different countries.

Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of observations from Switzerland and the fitted ARIMA model.

Simulated observations with and without noise.

Estimated Re with different smoothing windows.

Estimated Re with different block sizes.

Investigating the relation between the date of ‘lockdown’ and the date that the estimated Re dropped below 1.

News and public resources used to determine when a country implemented the first non-pharmaceutical interventions, and a nationwide lockdown.

Author details

Jana S Huisman

Contribution

Contributed equally with

For correspondence

Competing interests

Jérémie Scire

Contribution

Contributed equally with

Competing interests

Daniel C Angst

Contribution

Competing interests

Jinzhou Li

Contribution

Competing interests

Richard A Neher

Contribution

Competing interests

Marloes H Maathuis

Contribution

Competing interests

Sebastian Bonhoeffer

Contribution

Contributed equally with

Competing interests

Tanja Stadler

Contribution

Contributed equally with

For correspondence

Competing interests

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organisms

Further reading

Stability of the Swiss $R_{e}$ estimates based on confirmed COVID-19 cases, upon adding additional days of observations.

Comparison of published $R_{e}$ estimates for three countries.

Investigating the relation between the date of ‘lockdown’ and the date when the estimated $R_{e}$ based on case reports dropped below 1.

The association between the implementation or lifting of non-pharmaceutical interventions and changes in $R_{e}$ until May 2021.

Schematic of the effect of smoothing on the ability to estimate when $R_{e} = 1$ .

The effect of smoothing on the ability to estimate when $R_{e} = 1$ .

Stability of $R_{e}$ estimates at present.

Comparison of the $R_{e}$ estimates with or without accounting for known symptom onset dates and for time-variability on reporting delays.

Estimated $R_{e}$ with different smoothing windows.

Estimated $R_{e}$ with different block sizes.

Investigating the relation between the date of ‘lockdown’ and the date that the estimated $R_{e}$ dropped below 1.