lisphilar / covid19-sir

CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.

Home Page: https://lisphilar.github.io/covid19-sir/

License: Apache License 2.0

Python 68.43% Makefile 0.81% Jupyter Notebook 30.77%
covid19 epidemiology python covid covid-19 coronavirus epidemic-simulations epidemic-model data-science analysis

covid19-sir's Introduction

Twitter: @lisphilar

👨‍💼 I work as a data analyst in the clinical data science field and 👨‍💻 develop Python projects, including the CovsirPhy library, as a hobby.
Clinical research associate previously.

Clinical data scientist in Japan. Python library developer.
Molecular biology (2012-2018, M.Eng.), bioinformatics (2014-, self-taught), pharmaceutical clinical trials (2018-2023, CRA), medical data analysis (2024-).
#CovsirPhy

Category | Item | Information
Info | Keyword | Molecular biology, Neurochemistry, Bioinformatics, Clinical trials, Clinical data science, Python library development, Japanese history
Info | Language | Japanese, Python, English
Info | Where | Japan
Info | Favorite book | What Is Life? by Schrödinger
Job | 2018/4-2023/12 | CRA / Clinical research associate
Job | 2024/1-current | Data analyst in clinical data science
Academic | Degree | Bachelor of Engineering (Life Science Program); Master of Engineering (Chemical and Energy Engineering)
Academic | College | College of Engineering Science, YOKOHAMA National University, Japan.
Academic | Graduate school | Graduate school of Engineering, YOKOHAMA National University, Japan.
Academic | Subject of bachelor/master's thesis | Protein-protein interaction site analysis to develop a drug for central nervous system injuries
Tool | Python | From 2014
Tool | R | Learned from 2012 to 2014, from 2023
Tool | Editor | Visual Studio Code
Tool | Note taking | Obsidian, Google Colaboratory
Tool | OS | Windows / Windows Subsystem for Linux
Learning | Data science with Python (self-study) / Coursera / edX

Lisphilar: Life science + philosophy of science + molecular biology

Products

I have published the following libraries and notebooks. Please collaborate with me for development!

COVID-19 data analysis

Scenario files of Sengokushi (Japanese history, written in Japanese)

covid19-sir's People

Contributors

deepgohil, dependabot-preview[bot], dependabot[bot], ilyasst, inglezos, lisphilar, mehrdadbn9, mihan786chistie, rebeccadavidsson, renovate-bot, sourcery-ai[bot]

covid19-sir's Issues

Interface to create example datasets easily

Is your feature request related to a problem? Please describe.
Interface to create example datasets is necessary to provide tools to develop new ODE models.

Describe the solution you'd like

  • JHUData class includes the methods of PhaseData, SRData and ODEData
  • Create ExampleData that is a sub-class of JHUData
  • ExampleData produces example datasets with pre-set/applied parameters and models

ImportError after pip install in Kaggle Notebook

Summary:
In a Kaggle Notebook, the following installation causes ImportError (cannot import name 'ModelBase' from covsirphy.ode.mbase).

!pip install covsirphy

CovsirPhy version 2.5.2

Environment:
Python 3.8, pipenv, WSL.

Need change of ODE simulation system from non-dim to dimensional system

As mentioned in #4 (comment) and #1 (comment), we need to change the ODE simulation system from a non-dimensional system to a dimensional system.

Advantage of non-dimensional system:

  • we can estimate the parameter values efficiently and compare the parameter values
  • we can compare the parameter values with that of the other countries easily

Disadvantages of the non-dimensional system (may be the root cause of issue #1):

  • subtraction in differential equations reduces the accuracy of numerical simulation
  • it is difficult to determine the minimum value of dydt
  • dimensionalization of simulated values increases the error against actual values
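
The relationship between the two systems can be sketched with the SIR-F model. This is only an illustration under stated assumptions, not CovsirPhy's implementation: the function names are hypothetical, the equations follow the commonly documented form of SIR-F, and time stays in non-dimensional steps in both forms so that only the population scaling differs.

```python
def sirf_nondim(state, theta, kappa, rho, sigma):
    """Right-hand side of the non-dimensional SIR-F model.

    state = (x, y, z, w): susceptible, infected, recovered and fatal
    cases, each divided by the total population.
    """
    x, y, z, w = state
    new_infection = rho * x * y
    dx = -new_infection
    dy = (1 - theta) * new_infection - (sigma + kappa) * y
    dz = sigma * y
    dw = theta * new_infection + kappa * y
    return (dx, dy, dz, dw)


def sirf_dim(state, population, theta, kappa, rho, sigma):
    """The same model written for numbers of cases (S, I, R, F)."""
    s, i, r, f = state
    new_infection = rho * s * i / population
    ds = -new_infection
    di = (1 - theta) * new_infection - (sigma + kappa) * i
    dr = sigma * i
    df = theta * new_infection + kappa * i
    return (ds, di, dr, df)


# Example values (hypothetical, chosen only for illustration)
params = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}
population = 1_000_000
nondim_state = (0.999, 0.001, 0.0, 0.0)
dim_state = tuple(v * population for v in nondim_state)

nondim_rates = sirf_nondim(nondim_state, **params)
dim_rates = sirf_dim(dim_state, population, **params)
```

In exact arithmetic `dim_rates` equals `nondim_rates` multiplied by the population, so the choice between the two systems mainly affects floating-point behaviour and how errors scale when values are converted back to case counts.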

USA scenario analysis: not show line in S-R trend analysis

Summary:
With USA data, the following code does not show the fitting line in S-R trend analysis.

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, country="US")
scenario.trend()

Environment:
Python 3.8, pipenv, WSL.

JHUData.subset(country="US") causes KeyError in Kaggle

Summary:
JHUData.subset(country="US") causes KeyError with Kaggle datasets.

CovsirPhy version 2.5.2

Related classes:

  • covsirphy.JHUData

Codes and outputs:
(Local environment with Kaggle API)

import covsirphy as cs
data_loader = cs.DataLoader("input")
jhu_data = cs.JHUData("/kaggle/input/novel-corona-virus-2019-dataset/covid_19_data.csv")
population_data = cs.PopulationData(
    "/kaggle/input/covid19-global-forecasting-locations-population/locations_population.csv"
)
scenario = cs.Scenario(jhu_data, population_data, country="US")
scenario.records()

This causes KeyError as follows.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-170-0205c0c93a9f> in <module>
      1 usa_scenario = cs.Scenario(jhu_data, pop_data, "US")
----> 2 usa_scenario.records().tail()

/opt/conda/lib/python3.6/site-packages/covsirphy/analysis/scenario.py in records(self, show_figure, filename)
     95             Records with Recovered > 0 will be selected.
     96         """
---> 97         df = self.jhu_data.subset(country=self.country, province=self.province)
     98         if not show_figure:
     99             return df

/opt/conda/lib/python3.6/site-packages/covsirphy/cleaning/jhu_data.py in subset(self, country, province, start_date, end_date, population)
    251         # Subset with area
    252         df = self._subset_area(
--> 253             country, province=province, population=population
    254         )
    255         # Subset with Start/end date

/opt/conda/lib/python3.6/site-packages/covsirphy/cleaning/jhu_data.py in _subset_area(self, country, province, population)
    206                 return df.loc[df[self.R] > 0, :]
    207             raise KeyError(
--> 208                 f"Records of {province} in {country} were not registered.")
    209         # Province was not selected and COVID-19 Data Hub dataset
    210         c_level_set = set(

KeyError: 'Records of - in US were not registered.'

Environment:
Python 3.8, pipenv, WSL.

Estimator of ODE parameters does not perform parallel jobs

Summary:
Estimator.run(n_jobs=-1) does not perform parallel jobs. This may be caused by the threading method of the Optuna package.

CovsirPhy version 2.3.2

Related classes:

  • covsirphy.Estimator
  • covsirphy.Scenario

Environment:
Python 3.8, pipenv, WSL.

Failed in addition of past phase manually

Dear Rakesh,
Thank you for your feed-back.
I changed the code to fix #110 in GitHub, but new error occurred as you mentioned. I will fix this issue today and update the package in PyPI.

By the way, please edit each "summary", "Codes and outputs" section etc. in the issue template. I think this is more useful for you.

Dear Lisphilar,
Please let me know how to divide the phases in a scenario according to my own needs. Kindly help me if possible.

With regards,
Rakesh

Originally posted by @SM-ins in #116 (comment)

Fixing Bug: ParserError with Population class

Hi,

pop_data = cs.Population(
    "../input/world-population/API_EN.POP.DNST_DS2_en_csv_v2.csv"
)
pop_data.cleaned().tail()

I am getting the error below when running the code above. Please help if possible.

ParserError: Error tokenizing data. C error: Expected 3 fields in line 5, saw 62

With regards,
Rakesh

Revise stdout of parameter estimation

Summary:
Stdout of parameter estimation in scenario analysis could be revised.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.Scenario

Codes and outputs:

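import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
# The original report omits this line; the country is an assumption
scenario = cs.Scenario(jhu_data, population_data, country="Japan")
scenario.trend(set_phases=True)
scenario.estimate(cs.SIRF)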

This code prints messages such as
"10th phase with SIR-F model finished 67 in 1 min 3 sec."
"10th phase with SIR-F model finished 67 trials in 1 min 3 sec." would be better.

Environment:
Python 3.8, pipenv, WSL.

Change default value of Estimator.run(timeout_iteration) to 5 seconds

Summary:
Because parameter estimation completes within 5 seconds in some phases, the default value of timeout_iteration of Estimator.run() can be 5 seconds.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.Estimator

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, "Japan")
scenario.trend()
scenario.estimate(cs.SIRF)

Environment:
Python 3.8, pipenv, WSL.

Select a country with ISO3 code

Is your feature request related to a problem? Please describe.
In version 2.4, we can specify countries only with country name in JHUData and Scenario class.

Describe the solution you'd like
For standard users, create a method of CleaningBase class to convert ISO3 code to country name.
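
A minimal sketch of such a conversion is shown below. The lookup table and the function name are hypothetical and are not part of CleaningBase; a real implementation would need a complete ISO3 table and the country names used by the dataset (for example "US").

```python
# Hypothetical lookup table; a real one would cover every country.
ISO3_TO_NAME = {"JPN": "Japan", "ITA": "Italy", "USA": "US", "IND": "India"}


def to_country_name(country):
    """Return the country name for an ISO3 code, or the input unchanged."""
    return ISO3_TO_NAME.get(country.upper(), country)


name_from_code = to_country_name("JPN")
name_unchanged = to_country_name("Japan")
```

Returning the input unchanged when it is not a known code lets callers pass either form, which keeps existing code that uses country names working.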

ModuleNotFoundError: No module named 'better_exceptions' in installation

Summary:
When installing CovsirPhy with the pip command, "ModuleNotFoundError: No module named 'better_exceptions'" occurs and the package cannot be installed.
This error was mentioned as a comment of Kaggle notebook.

CovsirPhy version 2.2.2

Codes and outputs:

pip install git+https://github.com/lisphilar/covid19-sir#egg=covsirphy

This code causes the following error.

Collecting covsirphy from git+https://github.com/lisphilar/covid19-sir#egg=covsirphy
  Cloning https://github.com/lisphilar/covid19-sir to /tmp/pip-build-y0_kp_8r/covsirphy
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-y0_kp_8r/covsirphy/setup.py", line 3, in <module>
        setup()
...
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 994, in _gcd_import
      File "<frozen importlib._bootstrap>", line 971, in _find_and_load
      File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 994, in _gcd_import
      File "<frozen importlib._bootstrap>", line 971, in _find_and_load
      File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 678, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/tmp/pip-build-y0_kp_8r/covsirphy/covsirphy/__init__.py", line 6, in <module>
        import better_exceptions
    ModuleNotFoundError: No module named 'better_exceptions'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-y0_kp_8r/covsirphy/

Environment:
Python 3.6.4, pip, WSL.

Long-term ODE simulation shows negative number of cases

Summary:
The number of cases must be a non-negative integer. However, long-term ODE simulation shows negative numbers of cases.

Version 2.0.1

Related classes:

  • covsirphy.analysis.simulator.ODESimulator
  • covsirphy.ode.sirf.SIRF

Code:

import covsirphy as cs
# Settings
eg_population = 1_000_000
eg_tau = 1440
step_n = 1000  # Step number of simulation
param_dict = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}
y0_dict = {"x": 0.999, "y": 0.001, "z": 0, "w": 0}
# Simulation
simulator = cs.ODESimulator(country="Example", province="Example-1")
simulator.add(
    model=cs.SIRF, step_n=step_n, population=eg_population,
    param_dict=param_dict, y0_dict=y0_dict
)
simulator.run()
# Non-dimensional
nondim_df = simulator.non_dim()

Output:
nondim_df is a dataframe and shows predicted values in non-dimensional ODE model.
(t: time step, x: Susceptible/Population, y: Infected/Population, z: Recovered/Population, w: Fatal/Population for SIR-F model.)
x, y, z and w must be non-negative because they represent numbers of cases divided by the population, but some x values are negative.
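
A check for this condition can be sketched independently of the library. The integrator below is a plain fixed-step Runge-Kutta (RK4) scheme written only for illustration; it is not what ODESimulator uses, all function names are hypothetical, and the sketch neither reproduces nor diagnoses the reported behaviour. It only shows how negative compartments could be flagged.

```python
def sirf_rhs(state, theta, kappa, rho, sigma):
    """Non-dimensional SIR-F right-hand side for state = (x, y, z, w)."""
    x, y, z, w = state
    new_infection = rho * x * y
    return (
        -new_infection,
        (1 - theta) * new_infection - (sigma + kappa) * y,
        sigma * y,
        theta * new_infection + kappa * y,
    )


def rk4_step(state, dt, **params):
    """Advance the state by one fixed RK4 step of size dt."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = sirf_rhs(state, **params)
    k2 = sirf_rhs(shifted(state, k1, dt / 2), **params)
    k3 = sirf_rhs(shifted(state, k2, dt / 2), **params)
    k4 = sirf_rhs(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(y0, step_n, dt=1.0, **params):
    """Return the list of states for steps 0..step_n."""
    states = [tuple(y0)]
    for _ in range(step_n):
        states.append(rk4_step(states[-1], dt, **params))
    return states


def negative_steps(states):
    """Return (step, state) pairs in which any compartment is below zero."""
    return [(t, s) for t, s in enumerate(states) if min(s) < 0]


# Parameter and initial values taken from the report above
param_dict = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}
states = simulate((0.999, 0.001, 0.0, 0.0), step_n=1000, **param_dict)
bad = negative_steps(states)
```

A helper like `negative_steps` could be run on the output of any solver to locate the first step at which a compartment becomes negative.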

Frequency:
Always

Environment:
Python 3.8, pipenv, WSL

Show parameter values and OxCGRT scores in the same dataframe

Is your feature request related to a problem? Please describe.
As mentioned in #3, it is useful to show the parameter values and OxCGRT scores in a dataframe. This new method will be used for learning the relationship of parameter values and OxCGRT scores.

Describe the solution you'd like

  1. Create cs.Scenario(..., country="country name") instance
  2. Perform S-R trend analysis and find change points and set phases with cs.Scenario.trend()
  3. Calculate parameter values of phases with cs.Scenario.estimate(cs.SIRF)
  4. Calculate parameter values of each day using this new method
  5. Combine with OxCGRT data using this new method and create a dataframe
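
Steps 4 and 5 can be sketched as follows. This is a minimal illustration with hypothetical data and function names, using plain dictionaries instead of dataframes; the score values are stand-ins and are not real OxCGRT data.

```python
from datetime import date, timedelta

# Hypothetical phase-wise estimates: (start date, end date, parameter values)
phases = [
    (date(2020, 4, 1), date(2020, 4, 3), {"rho": 0.20, "sigma": 0.075}),
    (date(2020, 4, 4), date(2020, 4, 5), {"rho": 0.12, "sigma": 0.080}),
]
# Hypothetical daily scores keyed by date (stand-in for OxCGRT data)
scores = {
    date(2020, 4, 1): 40.0,
    date(2020, 4, 2): 40.0,
    date(2020, 4, 3): 55.0,
    date(2020, 4, 4): 55.0,
}


def daily_parameters(phases):
    """Expand phase-wise parameter values to one record per day."""
    records = {}
    for start, end, params in phases:
        day = start
        while day <= end:
            records[day] = dict(params)
            day += timedelta(days=1)
    return records


def combine(daily_params, scores):
    """Join both tables on the date, keeping only dates present in both."""
    return [
        {"Date": day, **params, "Score": scores[day]}
        for day, params in sorted(daily_params.items())
        if day in scores
    ]


table = combine(daily_parameters(phases), scores)
```

Each row of `table` holds the date, the parameter values of the phase covering that date, and the score for that date, which is the shape needed to study how the two relate.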

(with Kaggle API) KeyError for covsirphy.Population.value(country="JPN")

Summary:
KeyError was raised by covsirphy.Population.value(country="JPN") when the datasets had been downloaded with "input.py" (Kaggle API).

CovsirPhy version 2.3.0

Related classes:

  • covsirphy.cleaning.population.Population

Codes and outputs:

import covsirphy as cs
data_loader = cs.DataLoader("input")
population_data = data_loader.population()
population_data.value("JPN")

This code returns KeyError: 'JPN is not registered. Please use ISO3 code, like JPN.'

Environment:
Python 3.8, pipenv, WSL.

Citation was set mistakenly for local datasets

Summary:
DataLoader.jhu() etc. must set the citations when retrieving from remote servers. However, citation was set when using local files downloaded from Kaggle API.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.DataLoader

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu(local_file="covid_19_data.csv")

Environment:
Python 3.8, pipenv, WSL.

How to predict the number of cases accurately

Topic:
How can we predict the number of cases accurately?

Note:
With version 2.0.0, we perform the following steps.

  1. Split time-series data to some phases using S-R trend analysis
  2. Estimate the parameter values of an ODE model using data of each phase
  3. Predict parameter values of future phases using values of the last phase
  4. Simulate the ODE model with predicted parameter values

Please share your ideas to update the steps/create a new approach.

Unnecessary optimization is done for fixed parameters

Summary:
Unnecessary optimization is done for fixed parameters in hyperparameter optimization of math models. This bug was mentioned by prbocca on a Kaggle notebook.

CovsirPhy version 2.1.1

Related classes:

  • covsirphy.phase.estimator.Estimator

Current:

p_dict.update({
    k: trial.suggest_uniform(k, *v)
    for (k, v) in model_param_dict.items()
})

Proposed:

p_dict.update({
    k: trial.suggest_uniform(k, *v)
    for (k, v) in model_param_dict.items()
    if k not in self.fixed_dict
})
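
The effect of the proposed filter can be checked in isolation. In the sketch below, StubTrial is a hypothetical stand-in for the optimizer's trial object and suggest_params is a hypothetical helper; neither is part of CovsirPhy or Optuna.

```python
import random


class StubTrial:
    """Hypothetical stand-in for an optimizer trial object."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.suggested = []  # names of parameters that were sampled

    def suggest_uniform(self, name, low, high):
        self.suggested.append(name)
        return self._rng.uniform(low, high)


def suggest_params(trial, model_param_dict, fixed_dict):
    """Suggest values only for parameters that are not fixed."""
    p_dict = dict(fixed_dict)
    p_dict.update({
        k: trial.suggest_uniform(k, *v)
        for (k, v) in model_param_dict.items()
        if k not in fixed_dict
    })
    return p_dict


ranges = {"theta": (0, 1), "kappa": (0, 1), "rho": (0, 1), "sigma": (0, 1)}
trial = StubTrial()
p_dict = suggest_params(trial, ranges, fixed_dict={"theta": 0.002})
```

After this call, `trial.suggested` lists only the free parameters, so the fixed value of theta is neither resampled nor overwritten.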

Documentation of the detail of usage

Is your feature request related to a problem? Please describe.
Quick usage is in README.md and example codes are in example directory.
However, it is difficult to get detailed information about this package.

Describe the solution you'd like
Create GitHub Pages with Sphinx to document the details of this package.

Figure of S-R trend analysis: 10th phase converted to 0Initial in legend

Summary:
In the figure of S-R trend analysis, the 10th phase was labeled as the "0Initial" phase.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.ChangeFinder
  • covsirphy.Trend

Codes and outputs:

import covsirphy as cs
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, country="Japan")
scenario.trend()

Environment:
Python 3.8, pipenv, WSL.

ImportError of CovsirPhy: cannot import name 'ModelBase'

Summary:
When importing CovsirPhy in Kaggle Notebook, ImportError occurred.

CovsirPhy version 2.5.1

Environment:
Python 3.8, pipenv, WSL.

Codes:

!pip install covsirphy
import covsirphy as cs

Error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-3-a646b9deb9e0> in <module>
----> 1 import covsirphy as cs
      2 cs.get_version()

/opt/conda/lib/python3.6/site-packages/covsirphy/__init__.py in <module>
     10     better_exceptions_installed = False
     11 from covsirphy.__version__ import __version__
---> 12 from covsirphy.analysis import ODESimulator, ChangeFinder
     13 from covsirphy.analysis import PhaseSeries, Scenario
     14 from covsirphy.cleaning import Term, CleaningBase, DataLoader

/opt/conda/lib/python3.6/site-packages/covsirphy/analysis/__init__.py in <module>
     13 
     14 for m in modules:
---> 15     m_imported = import_module(f"{__name__}.{m.stem}")
     16     for (k, v) in m_imported.__dict__.items():
     17         if not k.startswith("__"):

/opt/conda/lib/python3.6/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127 
    128 

/opt/conda/lib/python3.6/site-packages/covsirphy/analysis/scenario.py in <module>
     13 import numpy as np
     14 import pandas as pd
---> 15 from covsirphy.ode import ModelBase
     16 from covsirphy.cleaning import JHUData, PopulationData, Term
     17 from covsirphy.phase import Estimator

/opt/conda/lib/python3.6/site-packages/covsirphy/ode/__init__.py in <module>
     13 
     14 for m in modules:
---> 15     m_imported = import_module(f"{__name__}.{m.stem}")
     16     for (k, v) in m_imported.__dict__.items():
     17         if not k.startswith("__"):

/opt/conda/lib/python3.6/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127 
    128 

/opt/conda/lib/python3.6/site-packages/covsirphy/ode/sirfv.py in <module>
      3 
      4 import numpy as np
----> 5 from covsirphy.ode.mbase import ModelBase
      6 
      7 

/opt/conda/lib/python3.6/site-packages/covsirphy/ode/mbase.py in <module>
      3 
      4 import numpy as np
----> 5 from covsirphy.ode.mbasecom import ModelBaseCommon
      6 
      7 

/opt/conda/lib/python3.6/site-packages/covsirphy/ode/mbasecom.py in <module>
      2 # -*- coding: utf-8 -*-
      3 
----> 4 from covsirphy.cleaning.term import Term
      5 
      6 

/opt/conda/lib/python3.6/site-packages/covsirphy/cleaning/__init__.py in <module>
     13 
     14 for m in modules:
---> 15     m_imported = import_module(f"{__name__}.{m.stem}")
     16     for (k, v) in m_imported.__dict__.items():
     17         if not k.startswith("__"):

/opt/conda/lib/python3.6/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127 
    128 

/opt/conda/lib/python3.6/site-packages/covsirphy/cleaning/example_data.py in <module>
      4 import pandas as pd
      5 from covsirphy.cleaning.jhu_data import JHUData
----> 6 from covsirphy.analysis.simulator import ODESimulator
      7 from covsirphy.ode.mbase import ModelBase
      8 

/opt/conda/lib/python3.6/site-packages/covsirphy/analysis/simulator.py in <module>
      7 from scipy.integrate import solve_ivp
      8 from covsirphy.cleaning.term import Term
----> 9 from covsirphy.ode.mbase import ModelBase
     10 
     11 

ImportError: cannot import name 'ModelBase'

When the import was run a second time, it completed successfully.

OSError when trying update input folder in Kaggle

Dear Rakesh(@SM-ins),
Thank you for your feed-back!

cs.Population cannot read the CSV files downloaded directly from the World Bank. We need the DataLoader class to use them.
Please run the following code with the latest version (later than 2.3.0).

data_loader = cs.DataLoader("../input")
pop_data = data_loader.population()
pop_data.cleaned().tail()

The type of pop_data is cs.Population, and "locations_population.csv" will be saved in the "../input" directory. This file is different from "API_EN.POP.DNST_DS2_en_csv_v2.csv", but it is well organized.

Best Regards,
Lisphilar

Dear Lisphilar,
Thank you so much for the response.

I tried the following code:

data_loader = cs.DataLoader("../input")

pop_data = data_loader.population()
pop_data.cleaned().tail()

But now I am getting a new error:
OSError: [Errno 30] Read-only file system: '/kaggle/input/locations_population.csv'

With regards,
Rakesh

Originally posted by @SM-ins in #42 (comment)

Speed-up of ODESimulator using numba.njit

Summary:
Scenario.estimate() is time-consuming and uses ODESimulator many times. To accelerate ODESimulator, consider using the numba package.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.ODESimulator
  • covsirphy.Scenario

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, "Japan")
scenario.trend()
scenario.estimate(cs.SIRF)

scenario.estimate(cs.SIRF) takes 3-5 minutes.

Environment:
Python 3.8, pipenv, WSL.

How to replace JHU data with country-wise data (India) and error in cleaning country level datasets

Dear @lisphilar ,

While running the following code, I am getting the error shown below:

ind_data = cs.CountryData("/kaggle/input/covid19-in-india/covid_19_india.csv",Country="India")
ind_data.set_variables(
date="Date", confirmed="Positive", fatal="Fatal", recovered="Discharged", province=None
)
ind_data.cleaned().tail()

TypeError: __init__() got an unexpected keyword argument 'Country'

I just replaced Japan with India as the country and changed the file path too, but in spite of that I am getting the above error. Will you please help?

With regards,
Rakesh

population_data.value() returns total value of all records in one area

Summary:
population_data.value() returns total value of all records in one area, not the last value.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.PopulationData

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
population_data.value("Italy")

This code returns 21279409008, which is greater than the population of Italy.

Environment:
Python 3.8, pipenv, WSL.

Cleaned dataset of country-specific data is empty

Summary:
cs.CountryData.cleaned() needs to return a non-empty dataframe, but returns an empty dataframe.

Version 2.1.0

Related classes:

  • covsirphy.cleaning.country_data.CountryData

Code:

import covsirphy as cs
jpn_data = cs.CountryData("input/covid_jpn_total.csv", country="Japan")
jpn_data.set_variables(
    date="Date", confirmed="Positive", fatal="Fatal", recovered="Discharged"
)
print(jpn_data.cleaned())

This code returns an empty dataframe.

Environment:
Python 3.8, pipenv, WSL

Add example dataset to this repository

Is your feature request related to a problem? Please describe.
To try this package, it is necessary to prepare a dataset in advance. This sometimes prevents new users from using this package.

Describe the solution you'd like
Include Japanese dataset to this package.
Kaggle: COVID-19 dataset in Japan is maintained by me and this can be included in this repository.

PopulationData.value(): add "date" argument

Is your feature request related to a problem? Please describe.
Population values may change in the near future and COVID-19 Data Hub includes the population values for each date. "Date" argument will be useful for phase-dependent analysis.

Describe the solution you'd like
Add a "date" argument to PopulationData.value() so that this method returns the value on that date.
The default value of date will be None, and None means the last date.
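
The proposed behaviour can be sketched as a standalone function. The records, the earlier population figure, and the function name are hypothetical, and falling back to the latest record on or before the requested date is an assumption of this sketch rather than something stated in the request.

```python
import datetime

# Hypothetical records for one area: (date, population)
records = [
    (datetime.date(2020, 1, 1), 126_500_000),
    (datetime.date(2020, 7, 1), 126_180_643),
]


def population_value(records, date=None):
    """Return the population on the given date, or on the last date if None.

    When the date falls between records, the latest record on or before
    that date is used (an assumption of this sketch).
    """
    ordered = sorted(records)
    if date is None:
        return ordered[-1][1]
    candidates = [value for day, value in ordered if day <= date]
    if not candidates:
        raise KeyError(f"No population value registered on or before {date}.")
    return candidates[-1]


latest = population_value(records)
in_march = population_value(records, date=datetime.date(2020, 3, 1))
```

Returning a single record instead of aggregating over all records of the area also avoids summing values that were registered on different dates.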

Error in find change points with CovsirPhy 2.5.4-alpha

Dear Lisphilar,
Thank you for the updates. As you have made a single scenario phase for India, we are unable to predict the future data. I now get the following error when executing the prediction model.

ind_scenario.clear()
ind_scenario.add_phase(days=7)
ind_scenario.simulate().tail(7).style.background_gradient(axis=0)

NameError: Initial value of Susceptible must be specified in @y0_dict.

Please check it once, if possible.

Thank you.

Make input.sh compatible with all OSes (re-write input.sh using python)

Is your feature request related to a problem? Please describe.
Currently, input.sh works on Ubuntu (it might work on macOS if SVN is available, but I did not test it). However, it definitely cannot be used on Windows.

Describe the solution you'd like
input.sh could be rewritten in Python, which would make it possible to execute it on any OS as long as the Python environment is properly set up.

Set random seed of hyperparameter optimization

Is your feature request related to a problem? Please describe.
For reproducibility, CovsirPhy needs to set random seed of hyperparameter optimization.

Describe the solution you'd like
Add argument seed to covsirphy.ChangeFinder.run() and covsirphy.Estimator.run(), and set the seed to Optuna package.

Describe alternatives you've considered
Repeat optimization until the results become constant.
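
The effect of a seed argument can be illustrated without Optuna. The sketch below uses a plain random search with a seeded generator as a stand-in for the real optimizer; random_search, the toy objective, and the bounds are all hypothetical. In Optuna itself, samplers such as optuna.samplers.TPESampler accept a seed argument.

```python
import random


def random_search(objective, bounds, n_trials, seed=None):
    """Minimise `objective` by uniform random sampling with a seeded RNG."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {k: rng.uniform(low, high) for k, (low, high) in bounds.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def objective(params):
    # Toy error function with its minimum at rho=0.2, sigma=0.075
    return (params["rho"] - 0.2) ** 2 + (params["sigma"] - 0.075) ** 2


bounds = {"rho": (0.0, 1.0), "sigma": (0.0, 1.0)}
first = random_search(objective, bounds, n_trials=100, seed=0)
second = random_search(objective, bounds, n_trials=100, seed=0)
```

With the same seed, `first` and `second` are identical, which is the reproducibility the request asks for; with seed=None each run draws different samples.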

Keep track of parameter values/reproduction number of all countries

Is your feature request related to a problem? Please describe.
Simple code is needed to keep track of the parameter values and reproduction number of all countries.

Describe the solution you'd like

  • Method of JHUData to get the country list.
  • Create a class to track parameter values of all countries.

[New] dataset of Population pyramid

What dataset you need?
The population pyramid dataset needs to include the population values by age group in each country.

How will you use the dataset for your analysis?
As mentioned in #53 and my Kaggle Notebook, population pyramid data is useful to analyse beta/rho parameter of SIR-F model.

Some phases have small number of days (<=2) suggested by ChangeFinder

Hi Lisphilar,
I am getting the following error while running the code below:

ind_scenario.trend()

ValueError: @end_date must be over 23Jun2020.

Please help me out if possible...

With regards,
Rakesh

Updating of dataset in Kaggle

Dear @lisphilar ,

Please help me update the dates in the following scenarios. I want to set the dates myself.

1)ind_scenario = cs.Scenario(jhu_data, pop_data, "India")
ind_scenario.records().tail()

2)ind_scenario.trend()

With regards,
Rakesh

India scenario analysis: records from 10Jun2020 was not included in analysis

Dear Lisphilar,

Is it possible to divide the phases for India from 25 March to the current date (17 July) in scenario analysis? Please help me if possible.

With regards,
Rakesh

Last date of ODE simulation does not match end date of the last phase

Summary:
Scenario.simulate() needs to simulate until the end date of last phase. However, the last date of the dataframe returned by Scenario.simulate() does not match the end date of the last phase.

Version 2.0.2

Related classes:

  • covsirphy.analysis.scenario.Scenario
  • covsirphy.analysis.simulator.ODESimulator

Code/output 1:

import covsirphy as cs
# Read dataset
jhu_data = cs.JHUData("input/covid_19_data.csv")
pop_data = cs.Population("input/locations_population.csv")
# Set phase
ita_scenario = cs.Scenario(jhu_data, pop_data, country="Italy")
ita_scenario.trend(n_points=4, set_phases=True)
ita_scenario.add_phase(end_date="31Dec2020")
# Show the end date of the last phase
print(ita_scenario.get("End", phase="last"))

This returns "31Dec2020"

Code/output 2:

# Hyper parameter estimation
ita_scenario.estimate(cs.SIRF)
# Simulation
pred_df = ita_scenario.simulate()
print(pred_df.loc[pred_df.index[-1], "Date"])

This returns "27Sep2020" etc.

Environment:
Python 3.8, pipenv, WSL

TypeError of scenario.param_history(show_box_plot=False)

Summary:
scenario.param_history(show_box_plot=False) raises TypeError.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.Scenario

Codes and outputs:

import covsirphy as cs
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, country="Japan")
scenario.trend()
scenario.estimate(cs.SIRF)
scenario.param_history(targets=["Rt"], divide_by_first=False, show_box_plot=False)

This code raises TypeError: line_plot() got an unexpected keyword argument 'show_figure'

Environment:
Python 3.8, pipenv, WSL.

Change data source: the number of cases, JHU to COVID-19 Data Hub

We are currently using the JHU dataset with cs.DataLoader.jhu() and cs.JHUData().
However, this dataset has critical errors (e.g. Italy: Confirmed=241184 and Recovered=11811 on 03Jul2020, Recovered << Confirmed) and the errors may not be corrected. So, we need to change to a data source that is maintained.

I found COVID-19 Data Hub and this has Python Interface.
We can retrieve the datasets as follows.

# Install beforehand with: pip install covid19dh
import covid19dh
# Country level
country_df = covid19dh.covid19(country=None, level=1, verbose=False)
# For some countries, province-level data is included
province_df = covid19dh.covid19(country=None, level=2, verbose=False)
# List of citation
covid19dh.cite(country_df)

OxCGRT data and population values are included in this dataset.

In the next version, I will try to change the data source.
Thank you.

Updating of dataset in Kaggle

Dear @lisphilar ,

Please help me update the dates in the following scenarios. I would like to set the dates myself.

1) ind_scenario = cs.Scenario(jhu_data, pop_data, "India")
ind_scenario.records().tail()

2) ind_scenario.trend()

With regards,
Rakesh

Low accuracy of parameter estimation for SIR-FV and SEWIR-F model

Summary:
Accuracy of parameter optimization is high for the SIR, SIR-D and SIR-F models (RMSLE scores are about 0.1), but low for the SIR-FV and SEWIR-F models (RMSLE scores are about 30).
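
For reference, RMSLE (root mean squared logarithmic error) compares observed and predicted values on a log scale. The sketch below uses the standard definition for a single series. How CovsirPhy combines several variables into one score is not stated in this report, so this is an illustration rather than the library's implementation, and the numbers are made up.

```python
import math


def rmsle(observed, predicted):
    """Root mean squared logarithmic error of two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for o, p in zip(observed, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))


observed = [100, 200, 400]  # made-up values for illustration
print(rmsle(observed, observed))         # identical series give 0.0
print(rmsle(observed, [110, 190, 420]))  # small deviations give a small score
```

Because the error is taken on logarithms, a score near 30 implies predictions that differ from the records by many orders of magnitude.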

CovsirPhy version 2.2.1

Related classes:

  • covsirphy.SIRFV
  • covsirphy.SEWIRF

Codes:
The code is in example/sirfv_model.py and example/sewirf_model.py.

Environment:
Python 3.8, pipenv, WSL.

PopulationData.update() adds population values without init

Summary:
If a population value has already been registered, PopulationData.update() does not register the new value correctly.

CovsirPhy version 2.4.1

Related classes:

  • covsirphy.PopulationData

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
# Update population value
population_data.update(126_180_643, country="Japan")
population_data.value("Japan")

This code does not return 126180643.
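
The behaviour this report expects, where a new value replaces the registered one, can be sketched as follows. PopulationRegistry is a hypothetical stand-in keyed by country and province. It is not CovsirPhy's implementation, and the first value is made up.

```python
class PopulationRegistry:
    """Hypothetical population table keyed by (country, province)."""

    def __init__(self):
        self._values = {}

    def update(self, value, country, province="-"):
        # Overwrite any existing entry so only one value exists per key.
        self._values[(country, province)] = int(value)

    def value(self, country, province="-"):
        return self._values[(country, province)]


registry = PopulationRegistry()
registry.update(126_000_000, country="Japan")  # made-up initial value
registry.update(126_180_643, country="Japan")  # value from the report
print(registry.value("Japan"))  # 126180643
```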

Environment:
Python 3.8, pipenv, WSL.

End date of Scenario.simulate() does not match the end date of a phase

Summary:
End date of Scenario.simulate() does not match the end date of a phase.

CovsirPhy version 2.4.2

Related classes:

  • covsirphy.Scenario

Codes and outputs:

import covsirphy as cs
# Dataset preparation
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()
population_data = data_loader.population()
scenario = cs.Scenario(jhu_data, population_data, country="Italy")
scenario.trend()
scenario.estimate(cs.SIRF)
scenario.add_phase(end_date="01Jan2020")
scenario.summary()
scenario.simulate()

The last date of summary was 01Jan2020, but the last date of simulated records was 22Dec2020.

Environment:
Python 3.8, pipenv, WSL.

DataLoader failed in saving CSV files

Summary:
DataLoader failed to save the dataset retrieved from COVID-19 Data Hub.

CovsirPhy version 2.4.0

Related classes:

  • covsirphy.DataLoader

Codes and outputs:

import covsirphy as cs
data_loader = cs.DataLoader("input")
jhu_data = data_loader.jhu()

This code should save a CSV file in the "input" directory, but it failed.

Environment:
Python 3.8, pipenv, WSL.

Automatic downloading of dataset: total population

Is your feature request related to a problem? Please describe.
As mentioned in #26, this package needs to include a data loader which enables us to download the datasets automatically. In this issue, the dataset about "total population" will be discussed.

Describe the solution you'd like

  • Find/create a dataset about "total population of each country"
  • Create a Python class for automatic downloading

Dataset design

  • Downloading the dataset must NOT require API keys, including Kaggle API keys.
  • The dataset must include country names and population values.
  • The dataset should include province names because CovsirPhy.Population uses the province field.
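
The design requirements above could be checked against a candidate file along these lines. This is a sketch only: the column names "Country", "Province" and "Population" are assumptions, since the actual layout is not decided in this issue.

```python
import csv
import io

# Hypothetical column names; the real dataset layout is not fixed here.
REQUIRED = {"Country", "Population"}
RECOMMENDED = {"Province"}


def check_columns(csv_text):
    """Return (missing_required, missing_recommended) for a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = set(reader.fieldnames or [])
    return sorted(REQUIRED - columns), sorted(RECOMMENDED - columns)


sample = "Country,Population\nJapan,126180643\n"
missing_required, missing_recommended = check_columns(sample)
print(missing_required, missing_recommended)  # [] ['Province']
```

A dataset with a non-empty first list would be rejected, while a non-empty second list would only be a warning.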

Dear Rakesh(@SM-ins),

Thank you for your feedback!

cs.Population cannot read CSV files downloaded directly from the World Bank. We need the DataLoader class to use them.
Please run the following code with the latest version (>2.3.0).

import covsirphy as cs
data_loader = cs.DataLoader("../input")
pop_data = data_loader.population()
pop_data.cleaned().tail()

pop_data is an instance of cs.Population, and "locations_population.csv" will be saved in the "../input" directory. This file is different from "API_EN.POP.DNST_DS2_en_csv_v2.csv", but it is well organized.

Best Regards,
Lisphilar

Originally posted by @lisphilar in #42 (comment)
