
imperialcollegelondon / covid19model


Code for modelling estimated deaths and cases for COVID19.

License: MIT License

R 13.00% Stan 82.58% Dockerfile 0.01% Shell 0.05% Python 0.06% Jupyter Notebook 4.31%
covid-19 bayesian-statistics renewal-process branching-process intervention-study probabilistic-models statistical-models statistical-inference statistical-computing

covid19model's Introduction

covid19model DOI

Code for modelling estimated deaths and infections for COVID-19 from "Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe", Flaxman, Mishra, Gandy et al., Nature, 2020, the published version of our original Report 13.

If you are looking for the individual-based model used in Imperial's Report 9, Ferguson, Laydon, Nedjati-Gilani et al., please look here.

Version 12 Release DOI

This is the release related to our Tiers paper, where we use a latent factor model to estimate the effectiveness of the tiers system in England. The peer-reviewed version will be out soon. All other code is unchanged from previous releases.

To reproduce the results of the paper, run source("tiers/spline_LFA.R") from RStudio or Rscript tiers/spline_LFA.R from the command line.

The instructions for reproducing the European report, Italy report, Brazil report, USA report, Nature paper, IFR, USA age-specific report, Nature Communications paper, and Science paper are the same as before (see version 3, version 4, version 5, version 6, version 7, version 8, version 9, version 10 and version 11). This release is specific to the Tiers paper.

This release has been checked on macOS Catalina version 10.15.6/7 and Ubuntu version 18.04.2.

The authors of the version 12 release are Daniel J Laydon, Swapnil Mishra, and Samir Bhatt.

Version 11 Release DOI

This is the release related to our Science paper, where we use age-specific mobility data to estimate the epidemic in the USA, accounting for age-specific heterogeneity. All other code is unchanged from previous releases.

To run this code you need to follow the steps listed in the age-specific model here.

The code should be run in full mode to obtain credible results; estimates from anything other than a full run are not recommended.

The instructions for reproducing the European report, Italy report, Brazil report, USA report, Nature paper, IFR, USA age-specific report and Nature Communications paper are the same as before (see version 3, version 4, version 5, version 6, version 7, version 8, version 9 and version 10). This release is specific to the Science paper.

This release has been checked on macOS Catalina version 10.15.6/7 and Ubuntu version 18.04.2.

The authors of the version 11 release are Melodie Monod, Alexandra Blenkinsop, Xiaoyue Xi, H Juliette Unwin, Swapnil Mishra and Oliver Ratmann.

Version 10 Release DOI

This is the release related to Unwin, H.J.T., Mishra, S., Bradley, V.C. et al. State-level tracking of COVID-19 in the United States. Nat Commun 11, 6189 (2020). https://doi.org/10.1038/s41467-020-19652-6, where we use mobility data to estimate the situation in all states of the USA. All other code is still the same.

To run this code you can directly run the base-usa.r file or, from the command line with the repository as the current directory, run the following command: Rscript base-usa-cases.r

The code should be run in full mode to obtain credible results; estimates from anything other than a full run are not recommended.

The instructions for the European, Italy, Brazil, USA, Nature, IFR and USA age-specific code are the same as before (see version 3, version 4, version 5, version 6, version 7, version 8, and version 9). This release is specific to the USA paper (soon to be out) and the medRxiv paper.

This release has been checked on macOS Catalina version 10.15.6/7 and Ubuntu version 18.04.2.

Version 9 Release DOI

This is the release related to report 32 and the medRxiv paper, where we use age-specific mobility data to estimate the epidemic in the USA, accounting for age-specific heterogeneity. All other code is unchanged from previous releases.

To run this code you need to follow the steps listed in the age-specific model here.

The code should be run in full mode to obtain credible results; estimates from anything other than a full run are not recommended.

The instructions for the European, Italy, Brazil, USA, Nature, and IFR code are the same as before (see version 3, version 4, version 5, version 6, version 7, and version 8). This release is specific to report 32.

This release has been checked on macOS Catalina version 10.15.6/7 and Ubuntu version 18.04.2.

Version 8 Release DOI

The code for running models remains unchanged in this release. We provide scripts to calculate the IFR for European countries and USA states. We hope that releasing the IFR computation code will enable researchers to adapt the IFR calculations to their own requirements. The two files to run are compute-ifr-europe.r and compute-ifr-usa.r.

The IFR calculation code will need changes for a new country if you have different age-specific bands. The code provided is not universal code that will work for every country; the intention is to help researchers modify it easily for their own setup.

The IFR code needs an additional package named socialmixr, which is not specified in the environment file because it is not available through conda. You will need to install it from CRAN.
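
For example, a minimal sketch of installing it from within R (assuming access to a CRAN mirror):

  # install socialmixr from CRAN, then load it before running the IFR scripts
  install.packages("socialmixr")
  library(socialmixr)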

Version 7 Release DOI

This code is the exact code that was used in Flaxman, Mishra, Gandy et al. "Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe," Nature, 2020. https://www.nature.com/articles/s41586-020-2405-7

To run the code, from the main folder in RStudio run source("base-nature.r"), or from the command line run Rscript base-nature.r.

The code should be run in full mode to obtain results---debug mode is only to check that your environment has the required libraries; results will not be reliable as the MCMC chains will not have converged.

The repository with posterior draws of the model in Flaxman, Mishra, Gandy et al. "Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe," Nature, 2020. https://www.nature.com/articles/s41586-020-2405-7 is here.

This code doesn't supersede our earlier model; it is here so that everyone has direct access to the code used in Flaxman, Mishra, Gandy et al. "Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe," Nature, 2020. https://www.nature.com/articles/s41586-020-2405-7.

The instructions for the European, Italy, Brazil, and USA code are the same as before (see version 3, version 4, version 5, version 6).

Version 6 Release

This is the release related to report 23, where we use mobility data to estimate the situation in all states of the USA. All other code is still the same.

To run this code you can directly run the base-usa.r file or, from the command line with the repository as the current directory, run the following command: Rscript base-usa.r

The code should be run in full mode to obtain any results; estimates from anything other than a full run are not recommended.

The instructions for the European, Italy and Brazil code are the same as before (see version 3, version 4 and version 5). This release is specific to the USA report.

This release has been checked on macOS Catalina version 10.15.6 and Ubuntu version 18.04.2. A full run takes about 20 hours using 4 cores.

Version 5 Release

This is the release related to report 21, where we use mobility data to estimate the situation in Brazil. All other code is still the same.

To run this code you can directly run the base-Brazil.r file or, from the command line with the repository as the current directory, run the following command: Rscript base-Brazil.r

The code should be run in full mode to obtain any results; estimates from anything other than a full run are not recommended.

The instructions for the European and Italy code are the same as before (see version 3 and version 4). This release is specific to the Brazil report.

Version 4 Release

This is the release related to report 20, where we use mobility data to estimate the situation in Italy. All other code is still the same.

To run this code you can directly source the base-italy.r file in RStudio inside the project or, from the command line with the repository as the current directory, run the following command: Rscript base-italy.r base-italy google interventions '~ -1 + residential + transit + averageMobility' '~ -1 + residential + transit + averageMobility'

The code for scenarios runs only in full mode, not in short-run or debug mode; estimates from anything other than a full run are not recommended.

The instructions for the European code are below. This release is specific to the Italy report.

Version 3 Release

In this update, we first extended our model from version 2 to have 'partial pooling' of the lockdown effect across all countries. This means there is now a global effect of lockdown alongside a country-specific lockdown effect. We also made our code modular, made the Stan code faster (with help from the community), and we now also produce CSV outputs.

You can get the CSV files directly here and the new model description here

Notice

⚠️ The Python code is currently not updated and won't work; it is valid only for the version 1 model and data.

⚠️ base_general.r, base_general.stan, base_general_speed.stan and base_general_speed2.stan are valid models only for version 2

⚠️ This code is released with no support. We try our best to look at issues and pull requests but cannot help people with setup most of the time. We provide Docker images and a conda environment file to make it easy to get started; any other approach assumes the user can handle their computing environment appropriately.

⚠️ This model is in active development, so parameter names, behaviours, and output file formats will change without notice.

⚠️ As with any mathematical model, it is easy to misconfigure inputs and therefore get meaningless outputs. The development team only endorses outputs it has itself generated.

Version 2 Release

In this update we extend our original model to include (a) population saturation effects, (b) prior uncertainty on the infection fatality ratio and (c) a more balanced prior on intervention effects. We also (d) included another 3 countries (Greece, the Netherlands and Portugal). The updated technical detail is available here.

You can directly look at our results here

This repository contains code for replication purposes. Bleeding-edge code and new developments are in a private repository. Please contact the report authors about collaborations.

Contributing

We welcome all potential collaborators and contributors from the wider community. Please see contributing for more details.

Installing dependencies

Using Conda

An environment.yml file is provided and can be used to build a virtual environment containing all model dependencies. Create the environment using:

conda env create -f environment.yml

Then activate the environment for use:

conda activate covid19model

This does not include the packages required for plotting maps. The following packages are required:

library(ggplot2)
library(ggstance)
library(ggrepel)
library(cowplot)
library(geofacet)
library(broom)
library(maptools)
library(rgeos)
library(rgdal)
library(colorspace)
library(sf)
library(scales)
library(bayesplot)
library(matrixStats)
library(boot)
library(lubridate)
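
If some of these packages are missing, a minimal sketch for installing them from CRAN follows (assuming they are available on your mirror; the spatial packages may also need system libraries such as GDAL/GEOS):

  # install any map-plotting packages that are not yet available locally
  pkgs <- c("ggplot2", "ggstance", "ggrepel", "cowplot", "geofacet", "broom",
            "maptools", "rgeos", "rgdal", "colorspace", "sf", "scales",
            "bayesplot", "matrixStats", "boot", "lubridate")
  missing <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing) > 0) install.packages(missing)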

Using Docker

A Docker image providing all model dependencies is available. See docker/README.md for details of running the model with Docker.

Other

If you wish to install packages into your native R environment or with a system package manager please see environment.yml for a full list of dependencies.

How to run the code

There are two ways to run our code:

  • Open the RStudio project file covid19model.Rproj in RStudio and run/source the base.r file
  • To run from the command line, enter the cloned directory and type Rscript base.r base in a terminal

Please note that, to keep waiting time short, sampling is set by default to run for a short period. For proper estimates, please run in FULL mode, either by setting the flag --full or the environment variable FULL=TRUE. This will run sampling for 4000 iterations with 2000 warmup iterations and 4 chains. The run time for 14 countries with the new, faster version 3 code is around 50 minutes to 1 hour.
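
As an illustrative sketch (one of several equivalent ways), a full run can also be launched from within R by setting the environment variable before sourcing the script:

  # request a full run (4000 iterations, 2000 warmup, 4 chains) before sourcing base.r
  Sys.setenv(FULL = "TRUE")
  source("base.r")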

Run mode settings

Three different run modes are supported:

  • DEBUG, which can be enabled either by setting the flag --debug when running the base.r file as such:
    • Rscript base.r base --debug, or by setting the environment variable DEBUG to TRUE.
  • DEFAULT, which runs if neither full nor debug is set. Please note that for proper estimates FULL should always be set.
  • FULL, which must always be used if you want to obtain reliable results, and can be enabled by setting the flag --full on the command line:
    • Rscript base.r base --full, or by setting the environment variable FULL to TRUE.

Results

  • The results are stored in two folders: results and figures.
  • results contains the stored Stan fits and the data used for plotting.
  • figures contains the images with daily cases, daily deaths and Rt for all countries.

covid19model's People

Contributors

bgoodri, cc-a, dependabot[bot], ettieunwin, flaxter, fvalka, harrisonzhu508, hrkulkarmsft, mansmeg, melodiemonod, payoto, s-mishra


covid19model's Issues

Errors due to forecast duration (N2) for data after the 18/04/2020

Describe the bug

When running with data including and after 18/04/20, plot-forecast.r fails to run.

Diagnostic

This is because, at that stage, Italy has 84 days of data, forecasting is attempted for 7 days (hard-coded in the forecast), and the model is only run for 90 days (N2 in base.r).

The fixes

After a bit of poking around to find a fix:

  • In base.r, increase the value of N2 to 100 (its max allowed). This fix works for 10 days; a minimal sketch of the change follows this list.
  • Increase the number of values in file data/serial_interval.csv
    • should this just be padded with zeros? I'm not sure what those do.
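
A minimal sketch of the first fix, assuming N2 is assigned near the top of base.r as described above (the exact line may differ in your checkout):

  # in base.r: extend the modelling horizon so 84 days of data plus 7 forecast days fit
  N2 <- 100  # previously 90; must not exceed the number of values in data/serial_interval.csv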

To Reproduce

Steps to reproduce the behavior:

  1. Run Rscript data/fetch-ecdc.r
  2. Run Rscript base.r base --debug
  3. Inspect closely the output (this does not appear in the CI runs as they don't monitor the status of intermediate steps)

Error message

Saving 10 x 7 in image
Error in estimated.deaths[, 1:N2, i] : subscript out of bounds
Calls: make_forecast_plot -> colMeans -> is.data.frame
In addition: Warning messages:
1: Transformation introduced infinite values in continuous y-axis
2: Removed 27 rows containing missing values (geom_bar).
3: Transformation introduced infinite values in continuous y-axis
4: Removed 27 rows containing missing values (geom_bar).
5: Transformation introduced infinite values in continuous y-axis
6: Removed 27 rows containing missing values (geom_bar).
7: Transformation introduced infinite values in continuous y-axis
8: Removed 27 rows containing missing values (geom_bar).
Execution halted
[1] "Running results/base-414043-stanfit.Rdata"
Error in prediction[, 1:length(x), i] : subscript out of bounds
Calls: colMeans -> is.data.frame
Execution halted

Instructions are incorrect for new Dockerfile

Describe the bug
15 hours or so ago the Dockerfile was replaced with one that doesn't use run_model_script.sh. As a result, the command line in the README tries to pass a uid/gid to Rscript and fails.

To Reproduce
Steps to reproduce the behavior:

$ docker run --rm -it -v $(pwd):/var/model harrisonzhu5080/covid19model:latest $(id -u) $(id -g)
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"1459431370\": executable file not found in $PATH": unknown.

If you leave off the uid/gid, you get a different error (running with a locally built image):

$ docker run --rm -it -v $(pwd):/var/model covid19model:latest
[snip...]
[1] "Portugal has 50 days of data"
[1] "First non-zero cases is on day 60, and 30 days before 10 deaths is day 46"
[1] "Netherlands has 57 days of data"
Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! In file included from /usr/lib/R/site-library/RcppEigen/include/Eigen/Core:392,
                 from /usr/lib/R/site-library/RcppEigen/include/Eigen/Dense:1,
                 from /usr/lib/R/site-library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13,
                 from <command-line>:
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/arch/SSE/PacketMath.h:60:39: warning: ignoring attributes on template argument ‘__m128’ {aka ‘__vector(4) float’} [-Wignored-attributes]
   60 | template<> struct is_arithmetic<__m128>  { enum { value = true }; };
      |                                       ^
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/arch/SSE/PacketMath.h:61:40: warning: ignoring attributes on template argument ‘__m128i’ {aka ‘__vector(2) long long int’} [-Wignored-attributes]
   61 | template<> struct is_arithmetic<__m128i> { enum { value = true }; };
      |              
Calls: stan_model ... cxxfunctionplus -> <Anonymous> -> cxxfunction -> compileCode
In addition: Warning message:
In system(cmd, intern = !verbose) :
  running command '/usr/lib/R/bin/R CMD SHLIB file76f43ff93.cpp 2> file76f43ff93.cpp.err.txt' had status 1
Error in sink(type = "output") : invalid connection
Calls: stan_model -> cxxfunctionplus -> sink
Execution halted

This one appears to be a permission error.

Expected behavior
The program runs without errors


Desktop (please complete the following information):

  • OS: OSX
  • Version 10.15.3
$ uname -a                   
Darwin GBLON1PLA00110 19.3.0 Darwin Kernel Version 19.3.0: Thu Jan  9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64 x86_64

Additional context

You can pass the uid and gid to docker containers without a wrapper script, and this is how I got the program to run:

docker run --user $(id -u):$(id -g) --rm -it -v $(pwd):/var/model covid19model:latest 

Number of deaths in Norway is wrong

Thank you for working on and sharing this code!
But as already mentioned in Norwegian newspapers, the death numbers for Norway are completely wrong. I don't know where the responsibility to update the ECDC database lies, but it is clearly not being updated correctly. (ref article: https://www.dagbladet.no/nyheter/tror-viruset-har-smittet-over-40-000-nordmenn/72317406)
I hope other countries work harder to provide the right numbers, since so many people use this report as the closest thing to the truth about this epidemic.
KR Larsen

Interpreting the `EpidemicStart` variable

Hello,

thank you for your work! I'm a newcomer in Bayesian models, so I found your code really interesting.

The EpidemicStart data is not so clear to me: I figured it is the day on which you consider the epidemic to have started in each country, but all the array elements have the same value, 31 (it is calculated as index1+1-index2, but index2 = index1-30).

You use these values as a cut-off for deaths in the Stan model: as you state in COVID Report 13, you only include deaths once they exceed 10 for each country. But in this way the model includes all death data from the 31st day on, for all countries, so maybe you can check this variable.
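
For reference, a simplified sketch of how these indices are computed in base.r (variable names follow the description above; the actual code may differ slightly), which shows why every country ends up with EpidemicStart = 31:

  # index1: first day on which cumulative deaths reach 10 for this country
  index1 <- which(cumsum(d1$Deaths) >= 10)[1]
  # index2: thirty days earlier, used as the start of the truncated data
  index2 <- index1 - 30
  # after truncating d1 to start at index2, the epidemic start index is always
  EpidemicStart <- index1 + 1 - index2   # = index1 + 1 - (index1 - 30) = 31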

Regards,

Luca

Add 10 Latin American countries; consider curfew and partial quarantines

Dear all, thank you for your work. My name is Leonardo Calcagno, I am a Doctor in Economics from the University of Orleans (France) and am currently working in Argentina's National Audit Office (AGN).
Is your feature request related to a problem? Please describe.
I worked on running the model with the Latin American countries/territories that reported 10 or more COVID-19 deaths by 15 April 2020: Argentina, Brazil, Chile, Colombia, Dominican Republic, Ecuador, Mexico, Panama, Peru and Puerto Rico. The full process, from cloning version 1 up to where I am right now, is on my GitHub page https://github.com/leonardo-calcagno/covid19model_latam/tree/latam

Describe the solution you'd like

  • I added interventions on those 10 Latin American countries in the data folder, with the sources I found for determining intervention dates
  • Classified first interventions as done for Europe, then Latin-American specific interventions (curfew and partial or regional quarantine)
  • Computed IFR ratios based on UNPOPS population data (but without taking into account social contact matrices);
  • Corrected the source data for Puerto Rico with data taken from https://github.com/CSSEGISandData. Corrected the COVID-19-up-to-date csv to bring observations up to 31/12/2019 for all countries (to avoid bugs);
  • Added a forecast plot with counter-factual estimated deaths, taken from the estimated.deaths.cf variable. Since it's version 1 it is unreliable for more advanced stages of the epidemic, when some immunity starts to build up, but it remains useful for the early stages (it shows Brazil deaths followed counter-factual estimated deaths up to the first week of April).
  • Ended up commenting out Ecuador due to probably under-reported deaths, as claimed by the mayor of Guayaquil https://www.elcomercio.com/actualidad/guayaquil-enfermos-muertos-viteri-coronavirus.html. She claims there were 1500 additional deaths in the city of Guayaquil in March 2020 compared with March 2019; that 50 municipal civil servants died from COVID-19; and that reported deaths from COVID-19 are false because not enough tests are done on dead people to determine whether COVID-19 was the cause of death. The daily number of deaths is thus likely an order of magnitude higher than currently reported.

Describe alternatives you've considered
I ran three types of simulations:
1- pooling Latin American and European countries together, with version 1 covariates
2- pooling only among Latin American countries, with version 1 covariates
3- pooling only among Latin American countries, with School Closure, Public events banned, lockdown, curfew and partial quarantine as covariates

I found that, when pooled with European countries, Latin American lockdown measures brought contagion rates below 1, whereas when pooled only among Latin American countries, lockdown measures were less effective. I am not sure whether this is due to differences in classification, to the fact that European countries are at a more advanced stage of the epidemic, or to lockdown measures being less effective in Latin America.

More generally, why not distinguish between hard and soft lockdown, or partial and full quarantine, in European countries as well?

Additional context
Here I put figures for Brazil respectively pooled with European countries; pooled only between Latin American countries; and pooled between Latin American countries with specific covariates.
Brazil (screenshots omitted):

  • Pooled with European countries: two result figures plus a counterfactual figure
  • Pooled with Latin American countries, same covariates as version 1: two result figures plus a counterfactual figure
  • Pooled with Latin American countries, specific covariates: two result figures plus a counterfactual figure

Providing the source for IFR calculations (code, attack rates, contact matrices)

Low priority (I think)

Is your feature request related to a problem? Please describe.
To expand the model to new countries or regions, an accurate estimate of the IFR is needed, and this is not simple to obtain.

Describe the solution you'd like
Could you open-source the code and the data used in the IFR calculations already included in the repository? Either in this repo, or in a separate one?

At a minimum it would be nice to have the contact matrices, to avoid redoing the work done in report 12 of finding all the country-specific data to process.

Describe alternatives you've considered
Attempting to reproduce it, but this seems like it would be so much easier with the input data.

Graphs & csv files only with data up to 24/4

Describe the bug
The data presented in the country graphs on the website and in the csv files comes from a model run only up to the 24th of April (it is not the fetch code, that works great; is it the N2 variable?).
To Reproduce
Open website

Expected behavior
Previous models used data up to last available date

Screenshots

Desktop (please complete the following information):
Any, just reporting an issue with data on website.

Additional context
Didn't run the model locally, just the fetch data code.

Added to Open Source COVID-19

Thanks for your work to help the people in need! Your site has been added to the Open-Source-COVID-19 page, which collects open source projects related to COVID-19, including maps, data, news, api, analysis, medical and supply information, etc. Please share to anyone who might need the information in the list, or will possibly contribute to some of those projects. You are also welcome to recommend more projects.

http://open-source-covid-19.weileizeng.com/

Cheers!

v2.0 vs v1.0 results dynamic explanation

Hi All,

It would also be good to have some high-level explanation of a few very visible changes in the results of v2.0 (April 12th) vs Report 13 (v1.0, March 30th):

  • a significant decrease in the estimated mean percentage of total population infected for Italy and Spain;
  • the post-lockdown Rt estimate dropping below 1 for most countries.
    Is this mainly due to new data or to model improvements?

thx and rgds
Slava

Add a CONTRIBUTING.md

To leverage community engagement as much as possible I suggest adding a CONTRIBUTING.md file. GitHub has a good introduction.

Letting people know what they can do and how they can most effectively help you will get the most out of open sourcing your work.

My last interfering suggestion I promise ;)

Hospitalization

I am wondering whether hospitalization is included or not. I guess not...

Is it possible to include hospitalization as you did in the former Report 9? From society's perspective this is very important, because I guess measures are mainly driven by the projected peak of hospitalizations.

Discrepancy between implementation and paper?

I'm currently trying to reproduce the model in another PPL and I'm slightly confused when comparing the model and code.

In the paper (p. 20), the seeding prior is given as τ ~ Exponential(0.03) and y ~ Exponential(τ) (equation screenshot omitted), while in the code we have

tau ~ exponential(0.03);
target += -sum(y_raw); // exponential(1) prior on y_raw implies y ~ exponential(1 / tau)

vector<lower = 0>[M] y = tau * y_raw;

which means the code is implementing Exponential(0.03) and Exponential(1 / τ) rather than Exponential(0.03) and Exponential(τ) as specified in the paper (or with inverse parameters, depending on which parameterisation you're using in the paper).

So I'm wondering if this is a mistake in the code or just a typo in the paper? :)
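
As a quick numerical check of the scaling relationship under discussion (a standalone sketch, not part of the model code): if y_raw ~ Exponential(rate = 1) and y = tau * y_raw, then y has mean tau, i.e. y ~ Exponential(rate = 1/tau).

  # simulate the transformation used in the Stan code and check the implied rate
  set.seed(1)
  tau <- 30
  y_raw <- rexp(1e6, rate = 1)   # exponential(1) prior on y_raw
  y <- tau * y_raw               # y = tau * y_raw, as in the Stan code
  mean(y)                        # approximately 30, i.e. y ~ exponential(rate = 1 / tau)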

Also, from the above snippet from the paper, it seems to me that y[m] should be a random vector of length num_impute or N0 (6 in the paper) for each m, while in the code all the imputed values appear to be the same value:

prediction[1:N0,m] = rep_vector(y[m],N0); // learn the number of cases in the first N0 days

Is this correct?

Thanks!

(Btw, really appreciate you open-sourcing this!)

Add additional country - Finland

Is your feature request related to a problem? Please describe.
We are currently working on including Finnish data in the model and would be happy to contribute this to your repo. What is the best way for us to do that?

Describe the solution you'd like
I would need information on which files (csv) need to be updated and how. The ECDC data already has death counts for Finland, so that is already handled.

I guess the following files need to be updated:

  1. interventions.csv when interventions have been done.
  2. popt_ifr.csv: I'm not totally sure about this. Is this computed based on age groups outside the repo, since the file contains one value per country?
  3. ages.csv: this is not super clear to me. Are these the age group sizes in thousands?

With kind regards
Måns


Percentage of total population infected

I am wondering how reliable this attack-rate data is. Just to give some examples:
In a report from 30 March (https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/) you put the numbers for Spain and Germany at 15% and 0.78%, respectively. Your latest numbers for both countries are 6.6% and 0.75%, respectively (a decrease for Spain and no increase for Germany). At the same time, the German Robert Koch Institute just published a report according to which about 8.6% of roughly half a million people who participated in an antibody study were found positive. Assuming that the study is fairly representative of the entire country, this would mean there are about 7 million positives in Germany and that your model estimate is off by a factor of >10.

We did some calculations of our own. Based on an infection fatality rate of about 0.3% (the average of the IFRs estimated in the Diamond Princess study (0.5%, see https://www.medrxiv.org/content/10.1101/2020.03.05.20031773v2), the German case cluster study (0.37%, see https://www.land.nrw/sites/default/files/asset/document/zwischenergebnis_covid19_case_study_gangelt_0.pdf), and the German antibody study (0.05%, see above, resulting from 8.6% or 7 million positives and 3500 deaths)), we would obtain attack rates for most countries in the double digits (especially considering that deaths are under-reported).
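
A back-of-envelope sketch of this kind of calculation, using only the figures quoted in this issue (with the population of Germany taken as roughly 83 million):

  # implied infections and attack rate from reported deaths and an assumed IFR
  deaths      <- 3500       # reported COVID-19 deaths in Germany, as quoted above
  ifr         <- 0.003      # assumed infection fatality rate of 0.3%
  population  <- 83e6       # approximate population of Germany
  infections  <- deaths / ifr              # implied number of infections (~1.17 million)
  attack_rate <- infections / population   # implied fraction infected (~1.4%)
  # with the ~0.05% IFR implied by the antibody study, the same deaths imply
  # 3500 / 0.0005 = 7 million infections, i.e. an attack rate of roughly 8.4%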

I can't seem to get this to run locally on my mac

This is my error message at the end:

covariate_size_effects_error <- system(paste0("Rscript covariate-size-effects.r ", filename,'-stanfit.Rdata'),intern=FALSE)
Loading required package: magrittr
This is bayesplot version 1.7.1

  • Online documentation and vignettes at mc-stan.org/bayesplot
  • bayesplot theme set to bayesplot::theme_default()
    • Does not affect other ggplot2 plots
    • See ?bayesplot_theme_set for details on theme setting

Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())


Attaching package: ‘cowplot’

The following object is masked from ‘package:ggpubr’:

get_legend

Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'
Calls: data.frame -> as.matrix -> as.matrix.default -> array
Execution halted

if(covariate_size_effects_error != 0){
  stop(sprintf("Error while plotting covariate size effects! Code: %d", covariate_size_effects_error))
}

Error in eval(ei, envir) :
  Error while plotting covariate size effects! Code: 1

Docker image run on MacOS ends up in error


To Reproduce
Steps to reproduce the behavior:
$ docker run covid19model

Expected behavior
Expected results folder populated and figures generated.

Screenshots
Error in compileCode(f, code, language = language, verbose = verbose) :
Compilation ERROR, function(s)/method(s) not created! In file included from /usr/lib/R/site-library/RcppEigen/include/Eigen/Core:392,
from /usr/lib/R/site-library/RcppEigen/include/Eigen/Dense:1,
from /usr/lib/R/site-library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13,
from <command-line>:
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/arch/SSE/PacketMath.h:60:39: warning: ignoring attributes on template argument ‘__m128’ {aka ‘__vector(4) float’} [-Wignored-attributes]
60 | template<> struct is_arithmetic<__m128> { enum { value = true }; };
| ^
/usr/lib/R/site-library/RcppEigen/include/Eigen/src/Core/arch/SSE/PacketMath.h:61:40: warning: ignoring attributes on template argument ‘__m128i’ {aka ‘__vector(2) long long int’} [-Wignored-attributes]
61 | template<> struct is_arithmetic<__m128i> { enum { value = true }; };
|
Calls: stan_model ... cxxfunctionplus -> <Anonymous> -> cxxfunction -> compileCode
In addition: Warning message:
In system(cmd, intern = !verbose) :
running command '/usr/lib/R/bin/R CMD SHLIB file6f80ddb.cpp 2> file6f80ddb.cpp.err.txt' had status 1
Error in sink(type = "output") : invalid connection
Calls: stan_model -> cxxfunctionplus -> sink

Desktop (please complete the following information):

  • OS: MacOS Catalina
  • Version 10.15.3

Additional context
N/A

GitHub Actions

Following on from #4, I wanted to discuss a little of what should be achievable using GitHub Actions to support publishing the model. At present I see GitHub Actions as providing two potentially useful features.

  1. Testing that the model runs on a variety of different operating systems (Linux + Windows), under different environments (Docker + Conda). This is basic validation that other people can use the model in a way that is fairly agnostic w.r.t how it is setup and run. This catches other simple to make errors such as adding a new dependency but forgetting to add it to environment.yml or forgetting to check in a new file with a commit.

The reason I say runs above is to emphasise the limitations of this form of testing. It will only check that the model can be run; it provides no assurance about the correctness of the code or that any results produced are meaningful. It can be easy to rely too much on tests of this type, but they should be considered only the most basic and crude check of a set of changes. With the attention this project has garnered you might find random PRs coming in (including from me, apparently) and I don't want to give you a false sense of security about merging them. In an ideal world the project would include test code that could more robustly check incoming changes, but that will have to remain a manual process for now.

  2. Automate the building of the Docker image and its publication for new commits to master. This saves users from having to build the Docker image themselves, but also makes the model much easier to run with Docker alternatives such as Singularity that are compatible with HPC environments.

I don't think either of these is very difficult to put together. I'm sure there are all sorts of things that GitHub Actions could be used for here that I've not thought of, so let me know if any other automatable tasks spring to mind!

Please let me know what you think. If you're interested I think a sensible way forward is for me to prototype something in my fork so you can see it in action.

Provide output as text files.

Is your feature request related to a problem? Please describe.
It's quite costly to obtain the output of this model at the moment. One has to prepare their system to run it, then do so for a few hours.

Describe the solution you'd like
If you guys are running it most days anyway, couldn't you just provide the output? I see the graphs on your website, but no raw data output from anything I can see. It would be very helpful!

Describe alternatives you've considered
Putting the graph into one of those graph digitisers. Or just running it myself.

Additional context
Just a thank-you for your work on this!

Contribute 50 state variant?

I have data (intervention dates, IFRs) for all U.S. states plus my county.

I've run this through an earlier version of your model (I mirrored into a private repo after commit 10a42cf).

Please let me know if/how you'd like me to contribute this back or keep it as a mirrored project (which I expect will be public shortly).

Prediction for Indian data, only Infection part

Hi,
I am a newcomer to STAN. I tried to predict the spread in India using the code. I modified the code accordingly. But it is still not working. I am interested in the prediction of infections only.

Any help in this regard will be highly appreciated.
Stay safe.
with best wishes,
Arindom

Code_India.zip

Possible to keep data updated and continuously update the figures?

Is it possible to produce daily reports using this model to show how estimates of post-intervention transmission and total percentage of population vary with data up to a certain day, and to update the data to be current?

I see many people citing the percent infected as of March 28 and post-intervention R figures, so I think there's a value in making sure the estimates are informed by the most recent data, and seeing how the estimates change as new data is added.

Clarification surrounding counterfactual estimation

I was wondering if it would be possible to get some clarification around the precise methodology used to obtain the counterfactual results in table 2 of your most recent report.

In particular, is that the case that

  1. inference was performed in a model with time-varying R_{t, m} to obtain the posterior over R_{0, m}, which was subsequently used as a distribution over R_{0, m} in a counterfactual model, from which samples are generated assuming R_{t, m} = R_{0, m} for all t, or
  2. two separate models were fit, one in which R_{t, m} varies and one in which it does not,

(presumably 2 would estimate a lower R_{0, m} for the counterfactuals than 1), or something else was done?

I ask because page 13, para 2 of the report begins "To understand the impact of interventions, we fit a counterfactual model without the interventions and compare this to the actual model.", suggesting that multiple models have been fit, which doesn't match what I was imagining you're doing in table 2.

Apologies in advance if I've missed the bit of the report where you explain this in detail!

Bugs in data

Describe the bug
I have refactored a lot of the preprocessing and put most of it into an R package so that tests can be added to the data (I'm adding my own). While doing this I found three (smallish) bugs in the data:

  1. In d1 for Spain, Self_isolate_if_ill is incorrectly set to the 14th of March, not the 17th as in the interventions.csv file.

  2. In d1 for Greece, the 3rd and 4th of March are missing, so the series is shifted by two days. This is padded for in the Deaths variable but not in the Cases variable, and hence the two differ.

  3. Padding does not seem to work when adding additional countries. I have not delved deeper into this though. (I interpret padding as adding the missing days with 0 cases and deaths; a small sketch of that interpretation follows this list.)
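
A minimal sketch of that interpretation of padding, using hypothetical column names (Date, Cases, Deaths) for the per-country data frame described here:

  # pad a per-country data frame so every calendar day is present, with 0 cases/deaths
  pad_days <- function(d) {
    all_days <- data.frame(Date = seq(min(d$Date), max(d$Date), by = "day"))
    out <- merge(all_days, d, by = "Date", all.x = TRUE)
    out$Cases[is.na(out$Cases)]   <- 0
    out$Deaths[is.na(out$Deaths)] <- 0
    out
  }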

To Reproduce
Run the code and store d1 variables, check dates and compare with interventions.csv

Expected behavior
The data should be the same as in ECDC data and interventions.csv

Additional context
I have refactored most of the data processing and put this functionality into an R package. That lets me check package dependencies as well as include tests of the R code. I now have test suites asserting that this functionality is correct (i.e. gives identical results to your code, minus the mentioned bugs). I could make a separate PR for this refactoring if you are interested.

Reports not displaying on old Android browser

Describe the bug

Only the header and footer of reports display in an old Android tablet browser.
There is no problem in Chrome on the same tablet.

To Reproduce
Steps to reproduce the behavior:
"COVID-19 Planning Tools

Daily updated estimates for Europe

...

Find out more >"

Works fine in Chrome on Android Tablet but not on default browser on same Tablet.

The default Android browser has a broken link from:

https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-planning-tools/

to:

https://imperialcollegelondon.github.io/covid19estimates/#/details/United_Kingdom

It shows the footer and the top banner with drop-down menus, but they do not work.

Same problem for link from README in this repo at:

"You can look at our results here"

in:

https://github.com/ImperialCollegeLondon/covid19model

to:

https://imperialcollegelondon.github.io/covid19estimates

Works ok in Chrome but not in default Android browser.

Expected behavior

Tools link shows UK data in Chrome browser.
README link shows introduction.

Desktop (please complete the following information):

  • Android System Webview 80.0.3987.132 on Samsung Galaxy 10.5 SM-T800, Android 5.02

Additional context

Seems bizarre, but I double-checked after a SHIFT refresh of the cache.

Page footer as well as header is visible.

Suggested Fix:

Probably not worth fixing the actual problem. You could just add a note in the header and/or footer:

"If page not displaying properly update Chrome browser"

Interpretation of CFR

Sorry to step in here, but saw the article and wanted to share something.

From the code it seems that you assume that CFR is the probability of dying given the infection.
However, you should take into account two opposite effects:

  1. While death data may be rather clean (not really, but let's leave that aside), the number of cases is a mess. I can speak for my country (Spain). This renders the CFR useless, as it is greatly overestimated by this effect and is not even consistent through time.

  2. Even with clean data, I think you are forgetting that, due to the death lag and epidemic growth, the observed CFR is greatly underestimated by an opposite effect:

Maybe an example helps:

Assumptions:
  Real mortality: 3%
  Total cases on day 0: 100
  Daily increase: 30%

What we see:
  Total cases on day 10: 100 * 1.3^10 ≈ 1400
  Total deaths on day 10: 100 * 0.03 = 3
  CFR observed on day 10: 3 / 1400 ≈ 0.2%

The observed CFR is about 14 times lower than the real mortality.
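
A small sketch reproducing this arithmetic (using the assumptions stated above):

  # naive observed CFR when deaths lag cases during exponential growth
  real_ifr   <- 0.03                  # assumed true mortality of 3%
  cases_d0   <- 100                   # cases on day 0
  growth     <- 1.30                  # 30% daily growth in cases
  cases_d10  <- cases_d0 * growth^10  # ~1379 cases by day 10
  deaths_d10 <- cases_d0 * real_ifr   # only day-0 cases have had time to die: 3 deaths
  cfr_obs    <- deaths_d10 / cases_d10
  c(observed_cfr = cfr_obs, underestimation = real_ifr / cfr_obs)  # ~0.2%, ~14x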

regards,
Jaime

Simulate the Risk of lifting measures

Is your feature request related to a problem? Please describe.
Great people at Imperial, I think you need to publish a report 15 on what happens when you lift measures. Your report 9 had such a good impact on politicians. I am scared that politicians do not understand the dynamics when we lift measures on a much bigger base of infectious people without noticing any change for 14 days, because of the delay from exposed -> infectious -> hospital -> ...

Describe the solution you'd like
An extension of the measures table with a new column containing either "implemented" or "lifted" at date X.

Describe alternatives you've considered
With an extended version of an SEIR model. But people at Imperial, it does not matter how, but it matters who publishes on these topics.

Python Interface

To make it easier for further development within the community, I suggest we convert some of the pre-Stan code in base.r into Python. It should be mostly a pandas/numpy/datetime job. After that we can use the PyStan interface (https://pystan.readthedocs.io/en/latest/) to compile the Stan code and feed in data via Python.

Generation of alpha covar table failed!

Describe the bug
Running on Docker under Linux, it dies with Generation of alpha covar table failed!

To Reproduce
Steps to reproduce the behavior:

  1. Set up Docker on Linux 2 AMI or AWS EC2 as per these instructions
  2. Run docker run --rm -v $(pwd)/results:/var/model/results -v $(pwd)/figures:/var/model/figures harrisonzhu5080/covid19model:latest
  3. Wait 30 minutes (faster with more vCPUs probably)
  4. Error message:
There were 50 or more warnings (use warnings() to see the first 50)
[1] "Make forecast table"
[1] "Running results/base-318697-stanfit.Rdata"
Error in `$<-.data.frame`(`*tmp*`, value, value = character(0)) : 
  replacement has 0 rows, data has 14
Calls: $<- -> $<-.data.frame
Execution halted
Error: Generation of alpha covar table failed! Code: 1
Execution halted

Full transcript here

Expected behavior
DNA code for a virus that, upon infecting a human, renders them immune to COVID-19.
Or, failing that, complete without errors.

g++.exe: error: : No such file or directory

Hello,

When attempting to run the code, specifically line 201

m = stan_model(paste0('stan-models/',StanModel,'.stan'))

I get the following error;

Error in compileCode(f, code, language = language, verbose = verbose) : Compilation ERROR, function(s)/method(s) not created! g++.exe: error: sheaky/Documents/R/R-3.6.3/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp: No such file or directory g++.exe: error: sheaky/Documents/R/R-3.6.3/library/StanHeaders/include: No such file or directory g++.exe: error: sheaky/Documents/R/R-3.6.3/library/RcppEigen/include: No such file or directory make: *** [C:/Users/OLIVER~1/DOCUME~1/R/R-36~1.3/etc/x64/Makeconf:215: file17041f307c4a.o] Error 1

The correct path to the library is C:/Users/oliver sheaky/Documents/R/R-3.6.3/library; I am assuming it is failing to handle the space in my username?

I followed the instructions here for setting up RStan in case that was causing the issue, but that didn't resolve it.

Apologies if it's a super simple fix; I have never used RStan before.

Thanks in advance for your help,
Oliver

The IFR can be estimated from the data

This issue is for information only. I want to share the observation that the infection fatality ratio, that is such a crucial input to the model and whose true value is so uncertain, can be estimated from the mortality data itself, and the result is surprising. The reason the IFR is a priori so uncertain is that what we can observe is only the CFR, the case fatality ratio. We do not know how many undetected infections there are for every detected case. The Imperial group have done their best to come up with good values for the IFR, but a large uncertainty remains.

As we describe at https://github.com/gustavdelius/covid19model/blob/master/covid19_IFR_report.pdf, if you put a broader prior distribution on the mean of the IFR, to acknowledge our uncertain knowledge, the posterior is actually quite peaked at a value that is an order of magnitude smaller than previously assumed, i.e., the model and the data taken together suggest that the virus is less deadly than assumed, but at the same time also more infectious.

The reason it is possible to extract information about the IFR from the mortality data is that the IFR determines the number of infections deduced from those mortalities; those infections are assumed to lead to immunity, and the build-up of immunity over time lowers R_t, which, it turns out, is detectable in the shape of the mortality curves. This is illustrated in the graph below from the model with the broader prior, where the curvature in R_t due to immunity is clearly visible.

(Figure omitted: United Kingdom three-panel plot from step_ifr-1491471-stanfit.Rdata.)
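
For intuition, a small sketch of the saturation mechanism described above, in the spirit of the population saturation effect mentioned in the Version 2 release notes (illustrative only; not the exact formula used in the Stan code):

  # effective reproduction number falls as cumulative infections deplete susceptibles;
  # a smaller IFR implies more infections for the same deaths, hence a stronger bend in R_t
  adjusted_Rt <- function(Rt, cumulative_infections, population) {
    Rt * (1 - cumulative_infections / population)
  }
  adjusted_Rt(Rt = 3, cumulative_infections = c(1e6, 7e6), population = 66e6)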

This observation that the data can be used to learn values for the IFR may be of particular value to people who are wanting to extend the model to new countries or regions and are not sure how to choose the IFR.

You can play with this at https://github.com/gustavdelius/covid19model and I welcome suggestions and questions on the issue tracker there, https://github.com/gustavdelius/covid19model/issues . The report itself has been put on medRxiv.org.

It is hopefully not necessary, but nevertheless I would like to stress that it is important to understand that these are model results, valid only in the context of this model, which may or may not adequately reflect the real world. Do not go out and break the shutdown based on this model.

Request for code for report 9

This code is for report 13 and appears to be different to the code used to generate the famous Report 9, from the 16th March:

https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf

This codebase is written mostly in R, but in tweets Prof Neil Ferguson claimed that the code used for this report was "thousands of lines of C". There's no C anywhere in this codebase.

I think what people would most appreciate seeing is the model that drove the UK's sudden change of tactics, which this one doesn't appear to be. Can you please publish Prof Ferguson's original code as well?

more dependencies?

In environment.yml I think you need to add r-jsonlite and r-svglite, and you should make it clear in the requirements that a JDK is needed (at least on Mac).

Add 44 US States & District of Columbia

I have extended your model to 44 US States and the District of Columbia, using data from Johns Hopkins and adapting it for the input format of your model.

For state-specific IFR, I used the European-country IFR average, since test coverage in the USA is insufficient.

Due to some hard-coding related to the European data, I used new _us.r files derived from the original. My progress is at a forked repository. I am uploading my results here.

If the authors are interested in the simple addition of "United States," this would be a relatively straightforward change. My intention going forward is to add a factor (k=7) for "partial lockdown" to represent situations where large municipalities locked down before the entire state.

Mark the model for Sweden as invalid or similar

Because the model for Sweden is using incorrect data, I think you should somehow mark the corresponding web page [1] as invalid. Perhaps you could add a banner or note at the top of the web page, or maybe just remove it.

As for why I say the data it is using is invalid, have a look at the screenshots below for comparison. See #58 for additional details.

I also wonder if you have checked the data for other countries?

Number of deceased per day, as taken from Sweden's "Folkhälsomyndigheten" on the 16th of April: (screenshot omitted)

Compare this with what is currently (on the 16th) being shown on the model page for Sweden: (screenshot omitted)

[1] https://mrc-ide.github.io/covid19estimates/#/details/Sweden

Comment on trends in cumulative infection predictions and Rt

I looked at predicted cumulative infections for the March 28 and April 9 runs. Of the countries with some sort of lockdown (all of them except Sweden), 5 show lower predicted percentages infected on April 9 than on March 28. There is a positive correlation between the ratio of the April 9 and March 28 infection estimates and the day in March on which lockdown started. Qualitatively there are similar trends for post-lockdown Rt: all countries except Norway had Rt > 1 on March 28; all countries except Belgium had Rt < 1 on April 9.

Given that cumulative infections and Rt are important data points for policy (the main things I've seen cited from this work), could you comment on that? I wonder if a lag of transmission in home/work post-lockdown-day could be added to the model?

Discrepancy for Spain

Hi Swapnil / All,

Thank you for all your efforts in this project.

I am just joining this space and have a few questions. Sorry if some of them have already been addressed.

The first one is why, in v2.0 at April 12th, predicted deaths for Spain are far outside the 95% CI for quite a few recent dates. I understood that, given the number of cases and the age of the interventions, Spanish data were used intensively for model calibration.

thx and rgds
Slava

"Total infected" wording?

https://mrc-ide.github.io/covid19estimates/#/total-infected

Hi, I read the caption "Posterior model estimates of percentage of total population infected as of 2020-04-15." before reading the sentence "We report the total percentage of the population infected over the course of the pandemic not the population as of now."

Can it be rephrased to "Posterior model estimates of percentage of total population infected over the course of the pandemic. Estimates as of 2020-04-15." ?

How to reproduce results / run the code?

I applaud you for your work and for making the code available. That said, I'm not sure how to run the code. I infer from the line below that I'm supposed to run this from the command line with arguments (or at least that if no arguments are provided then base is run), but I'm not sure whether that's the right approach or, if it is, what arguments I should pass to Rscript.

args = commandArgs(trailingOnly=TRUE)

There are other inputs that are apparently needed (like the environment variable PBS_JOBID), but I'm not sure what they do.

Can you provide one or two examples in the README.md?

Thanks again.

Modularisation of pre and post processing for application to regional data

This is a feature request but also an offer to help

Is your feature request related to a problem? Please describe.
I'm adapting this code to run on data for French regions. The pre- and post-processing is brittle (hard-coded number of countries, hard-coded reference countries) and would benefit from modularisation/parameterisation.

Describe the solution you'd like
Functionalise and parameterise the pre- and post-processing to tolerate arbitrary geographical regions, provided the data is present and in the correct format.

Describe alternatives you've considered
I've already started adapting the code:
https://github.com/payoto/covid19model/tree/france-regions

I will probably modularise it a bit anyway, but I would have more incentive to do it properly if it can be useful to others (and it would facilitate maintenance of my fork).

Would you be interested in a PR refactoring parts of the scripts into functions?

improving Stan model efficiency and readability

Is your feature request related to a problem? Please describe.

The issue is related to making the Stan code more efficient and readable.

Describe the solution you'd like

Ideally, someone codes the suggested changes. With some feedback, I can code the changes myself, but I didn't want to make a massive PR that the original developers didn't like.

Describe alternatives you've considered

Remove variable x because it's not used. (We're developing a lint mode
that'll flag issues like unused variables.)


Remove transformed data variable delta because it's not used.


Rename covariate1, ..., covariate6 to x1, ..., x6 because x is conventional for covariates and the term "covariate" doesn't convey much information. Giving these meaningful names would be another possibility.


Clarify the comment on N2 as to what "days of observed data" means. Is it max(N)?


First, rewrite Rt[ , m] for readability:

   Rt[ , m] = mu[m] * exp(covariate1[, m] * (-alpha[1])
                           + covariate2[, m] * (-alpha[2])
                           + covariate3[, m] * (-alpha[3])
                           + covariate4[, m] * (-alpha[4])
                           + covariate5[, m] * (-alpha[5])
                           + covariate6[, m] * (-alpha[6]));

Then rearrange, move to declare/define style, and follow with a loop that scales each column by its country-level intercept mu[m]

  matrix[N2, M] Rt = exp(-(covariate1 * alpha[1]
                           + covariate2 * alpha[2]
                           + covariate3 * alpha[3]
                           + covariate4 * alpha[4]
                           + covariate5 * alpha[5]
                           + covariate6 * alpha[6]));
  for (m in 1:M) Rt[ , m] *= mu[m];

For readability, I'd sacrifice a bit of speed and use

  y_raw ~ exponential(1);  // implies y ~ exponential(1 / tau);

instead of

  target += -sum(y_raw);  // ...
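
To make the pairing concrete, here is a minimal sketch of the scaled version, assuming (as the comment above implies) that y is defined as tau * y_raw; the vectorised declarations are illustrative rather than the model's exact ones:

  parameters {
    real<lower = 0> tau;
    vector<lower = 0>[M] y_raw;
  }
  transformed parameters {
    vector<lower = 0>[M] y = tau * y_raw;  // so y ~ exponential(1 / tau)
  }
  model {
    tau ~ exponential(0.03);
    y_raw ~ exponential(1);
  }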

Keep tau itself on unit scale for more efficient adaptation:

  parameters {
    real<lower = 0> tau_unit;

  transformed parameters {
    real<lower = 0> tau = tau_unit / 0.03;

  model {
    tau_unit ~ exponential(1);  // implies tau ~ exponential(0.03)

The convolution code can be simplified from

  for (m in 1:M) {
    prediction[1:N0, m] = rep_vector(y[m], N0);
    for (i in (N0 + 1):N2) {
      real convolution = 0;
      for (j in 1:(i - 1)) {
        convolution += prediction[j, m] * SI[i - j];
      }
      prediction[i, m] = Rt[i, m] * convolution;
    }

to

  for (m in 1:M) {
    for (i in 1:N0) {
      prediction[i, m] = y[m];
    }
    for (i in (N0 + 1):N2) {
      real convolution = y[m] * sum(SI[1:(i - 1)]);
      prediction[i, m] = Rt[i, m] * convolution;
    }

It's a bit counter-intuitive, but the loop is faster because it doesn't require a new vector copy on the right-hand side, and loops without autodiff are very fast in Stan. Given that prediction[j, m] was just defined to be y[m], y[m] can be used directly to cut down on memory access in the loop.


I assume the centered parameterization

mu ~ normal(2.4, kappa)

works better than the non-centered one, given that other parameters are non-centered. Otherwise, this needs to be rewritten in terms of

mu_std ~ std_normal();

and

mu = 2.4 + kappa * mu_std;
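
For reference, a minimal sketch of what the non-centered version could look like, treating mu as a length-M vector (note that the original lower bound of zero on mu would need separate handling, since the shifted-and-scaled form isn't constrained to be positive):

  parameters {
    real<lower = 0> kappa;
    vector[M] mu_std;
  }
  transformed parameters {
    vector[M] mu = 2.4 + kappa * mu_std;  // implies mu ~ normal(2.4, kappa)
  }
  model {
    mu_std ~ std_normal();
  }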

phi and kappa can be put on unit scales,

  transformed parameters {
    real<lower = 0> phi = 5 * phi_std;
    real<lower = 0> kappa = 0.5 * kappa_std;

  model {
    phi_std ~ std_normal();
    kappa_std ~ std_normal();

The magic number 1e-9 in the definition of E_deaths[1, m] needs documentation. E_deaths shows up in the negative binomial likelihood as a rate parameter. Is it never used because of EpidemicStart? If it's not intended to be used, filling the unused entries of E_deaths with NaN will make sure they raise an error if they're accidentally used somewhere in the target density.
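
For instance, a sketch of the NaN-filling idea, assuming EpidemicStart is a per-country integer array and the likelihood really does start there (not_a_number() is Stan's built-in NaN):

  for (m in 1:M) {
    for (i in 1:(EpidemicStart[m] - 1)) {
      E_deaths[i, m] = not_a_number();  // should never be read if the likelihood starts at EpidemicStart[m]
    }
  }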


Put the rep_matrix calls in transformed data so they're only done once.

transformed data {
  matrix[N2, M] matrix_1000 = rep_matrix(1000, N2, M);
  matrix[N2, M] matrix_0 = rep_matrix(0, N2, M);

generated quantities {
  matrix[N2, M] lp0 = matrix_1000;
  matrix[N2, M] lp1 = matrix_1000;
  ...

The magic number 1000 in the definition of matrix_1000 should be documented.


One way to document constants is to define them as transformed data variables with meaningful names.
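
For example (these names are hypothetical and only illustrate the naming idea; the meaning of 1000 still needs documenting by the authors):

  transformed data {
    real deaths_seed_rate = 1e-9;  // placeholder rate for E_deaths[1, m] discussed above
    real lp_fill_value = 1000;     // hypothetical name for the magic 1000 used for lp0 / lp1
  }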


The E_deaths loop can be pulled up into a function.

  functions {
    matrix expected_deaths(matrix prediction, matrix f) {
      int N = rows(prediction);   // days
      int M = cols(prediction);   // countries
      matrix[N, M] y;
      for (m in 1:M) {
        y[1, m] = 1e-9;
        for (i in 2:N) {
          y[i, m] = 0;
          for (j in 1:(i - 1)) {
            y[i, m] += prediction[j, m] * f[i - j, m];
          }
        }
      }
      return y;
    }
  }

And then the blocks of code can be replaced by

  E_deaths = expected_deaths(prediction, f);
  ...
  E_deaths0 = expected_deaths(prediction0, f);

Also, f is an unusual symbol for a matrix.


I'm working on getting ragged arrays into the language---sorry we don't
have those yet or the indexing would be much simpler!

Static/Dynamic Website

We should create a visualisation tool, as part of the modelling workflow, that can run in the CI/CD pipeline. Let us know below what features you would like to see included! :) It could follow the lines of https://covid19uk.live/

Add Latinoamerica countries: Ecuador

Is your feature request related to a problem? Please describe.
Yes: I have been working with the model on Ecuador data. If it is possible to include it, I would like to contribute.

Describe the solution you'd like
I have reproduced the full script and generated the charts for Ecuador.

  • I am using the official reports of the Ecuadorian Government (for intervention dates, deaths, and confirmed new cases) and WHO (OMS) data.
  • I am using age-group data for Ecuador.
  • Finally, I changed the chart labels to Spanish.

Describe alternatives you've considered
For the translation, I just changed the labels.

Additional context
Screenshots
Ecuador_three_pannel_base-420305-stanfit Rdata
Ecuador_forecast_base-420305-stanfit Rdata

Sweden intervention measures not in sync between current online results and PDF report

The PDF version of the report looking at the impact of non-pharmaceutical interventions (https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/) lists different interventions than the most recent online version (https://imperialcollegelondon.github.io/covid19estimates/#/details/Sweden).
Specifically, Sweden did implement partial school closures (e.g. https://www.svt.se/nyheter/inrikes/gymnasieskolor-och-hogskolor-rekommenderas-distansundervisning) - high schools from year 10, universities, and adult education switched to distance learning from March 18th - which do not show up in the estimates shown in the most recent results, up to April 14th.
I wonder if this is intentional and how this would affect the current Rt and estimates.

P.S.: Thanks for the very informative site and for sharing the source!
