ebola's Introduction

Data for the 2014 Global Ebola outbreak

Announcements

As of Dec 15, 2015, I will no longer be updating the data. Pull requests will be accepted.

Please refer to Brian Rowe's R package for scraping Liberia's sitreps.

Contents

DataMarket has made these data available through their API here. The DataMarket API is documented here. To access it programmatically you need a sharing key, which you can find in the file 'datamarket_sharingkey.txt'.

  • country_timeseries.csv contains a time series of case counts and deaths from the World Health Organization and WHO situation reports.
  • liberia_data/ contains .csv files of data provided by the Liberia Ministry of Health. I have noticed the data are somewhat inconsistent. Cross-check the data when analyzing.
  • sl_data/ contains .csv files of data provided by the Sierra Leone Ministry of Health
  • guinea_data/ contains a mix of .csv and PDF files from the Guinea Ministry of Health. These data are not consistently available online, so I will keep the PDFs in the repo for reference.
  • mali_data/ contains a mix of .csv and PDF files from the Mali Ministry of Health.
  • who_data/ contains data from the WHO that compare sitrep case counts with patient database counts for select cities and countries.
  • data_products/ contains analyses, processing scripts, etc. Highlights include:
    • liberia_data.py converts the liberia_data csv files into a multidimensional pandas dataframe. Pandas is a requirement for this script. Optional argument allows output to .csv. You can run this script with ./liberia_data.py --help to learn more.
  • line_list.csv is a line listing I manually compiled from media reports and published case series of case clusters. It is unverified and almost certainly contains errors. Use with extreme caution. The legrand compartment specifies which infectious compartment each case would originate from in the Legrand et al. model. The source_id column is the case_id of the node from whom the case was infected.
  • Sierraleone_country.csv and SierraLeone_town.csv are from the Sierra Leone Ministry of Health website. Data in SierraLeone_town.csv are cumulative confirmed cases - counts do not include suspected or probable cases. These spreadsheets will no longer be updated as of Sept 12 (newer data can be found in the sl_data/* files), but pull requests will be accepted.
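
Since several of these files hold cumulative counts that were digitized by hand, one simple cross-check is that a cumulative series should never decrease. The sketch below is only an illustration and is not part of the repo: the function name and the sample values are hypothetical, and a drop is not always an entry error because source reports are sometimes revised.

```python
from datetime import date


def find_decreases(series):
    """Return (date, previous, current) for every drop in a cumulative series.

    `series` is a list of (date, count) pairs sorted by date. A count of
    None marks a missing value and is skipped.
    """
    drops = []
    previous = None
    for day, count in series:
        if count is None:
            continue
        if previous is not None and count < previous:
            drops.append((day, previous, count))
        previous = count
    return drops


# Made-up values for illustration only, not taken from the repo.
sample = [
    (date(2014, 9, 1), 10),
    (date(2014, 9, 2), 12),
    (date(2014, 9, 3), None),
    (date(2014, 9, 4), 11),
]
print(find_decreases(sample))
```

Any rows it flags are candidates for checking against the original situation report.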

Disclaimer

I cannot guarantee the accuracy of these data. They are digitized by hand (and sometimes with Tabula), so there may be data entry errors; there may also be changes and errors in the source data. I will provide updates when possible.

Contact

I am Caitlin Rivers, formerly of the Network Dynamics and Simulation Science Laboratory at Virginia Tech. Also see the NDSSL website for additional Ebola data resources. You can reach me at:

Please note: I receive numerous requests for customized versions of these data. I am not able to accommodate these requests.

Contribute

Please feel free to send a pull request or a cup of coffee.

ebola's People

Contributors

aflaxman, carlosp420, chendaniely, chenghlee, chrisvoncsefalvay, cmrivers, dkergl, donpdonp, elofgren, gfairchild, grlurton, isaacyeaton, jabbate, jeremybmerrill, jsoma, kdodia, luiscape, orthographic-pedant, pallih, pierrepo, rcquan, reidpr, rikblok, runarberg, samccone, seanbeatty, sergestinckwich, shawnacscott, tallenaz, waleo

ebola's Issues

Worth adding main local and global responses?

I was wondering if we could also record in this repo the main local (at the country scale) or global (i.e. worldwide) responses to fight the virus. E.g.:

date: 2014-08-08
actor: WHO
action: "Outbreak is an international public health emergency"
source: "http://en.wikipedia.org/wiki/Ebola_virus_epidemic_in_West_Africa"

date: 2014-08-24
actor: WHO
action: "Ebola Response Roadmap"
source: "http://apps.who.int/iris/bitstream/10665/134771/1/roadmapsitrep_24Sept2014_eng.pdf"

date: 2014-09-18
actor: UN
action: "Threat to international peace and security"
source: "http://www.bbc.com/news/world-africa-29262968"

date: 2014-09-19
actor: "Sierra Leone"
action: "Three-day lockdown"
source: "http://www.nytimes.com/2014/09/20/world/africa/ebola-outbreak.html"
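
If records like these were added, they could be loaded into a small typed structure and sorted by date. This is just a sketch of one possible representation: the `ResponseEvent` class and `parse_event` helper are hypothetical names, and the field names simply mirror the keys used in the examples above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ResponseEvent:
    day: date
    actor: str
    action: str
    source: str


def parse_event(record):
    """Build a ResponseEvent from a dict with date/actor/action/source keys."""
    return ResponseEvent(
        day=date.fromisoformat(record["date"]),  # expects YYYY-MM-DD
        actor=record["actor"],
        action=record["action"],
        source=record["source"],
    )


# Two of the example records from this issue, deliberately out of order.
records = [
    {
        "date": "2014-09-19",
        "actor": "Sierra Leone",
        "action": "Three-day lockdown",
        "source": "http://www.nytimes.com/2014/09/20/world/africa/ebola-outbreak.html",
    },
    {
        "date": "2014-08-08",
        "actor": "WHO",
        "action": "Outbreak is an international public health emergency",
        "source": "http://en.wikipedia.org/wiki/Ebola_virus_epidemic_in_West_Africa",
    },
]

events = sorted((parse_event(r) for r in records), key=lambda e: e.day)
for event in events:
    print(event.day.isoformat(), event.actor, "-", event.action)
```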

repo license

Someone in irc ##ebola asked about the license used for the data in this repo. It's not something I've thought about, but it's probably a good idea to define one. There are a number of applicable licenses. My inclination would be to use the most open one possible - cczero - which is essentially the public domain.

http://creativecommons.org/publicdomain/zero/1.0/

Putting a text file named LICENSE in the top-level directory is the normal way to signify the license in use.

Other options are the GNU Free Documentation License, the MIT License, etc.

Organize Guinea digitization project

Figure out a way to make clear which Guinea situation reports have already been converted from PDF to .csv, and which still need to be done. I want to keep the PDFs in the repo even after they are digitized, since they are not easily available online like the SL and Liberia sitreps.

Some Guinea feature coordinates are off

I noticed in the locations.geojson file that some of the coordinates are off -- 3 locations in Guinea are getting converted to [-104.798844, 39.87166390000001], which is in Denver, Colorado. Not sure how this is happening but will try to track it down as time permits.

    {
      "type": "Feature",
      "properties": {
        "address": "Gueckedou",
        "country": "Guinea"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -104.798844,
          39.87166390000001
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {
        "address": "Dinguiraye",
        "country": "Guinea"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -104.798844,
          39.87166390000001
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {
        "address": "Nzerekore",
        "country": "Guinea"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -104.798844,
          39.87166390000001
        ]
      }
    },
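
One way to catch this kind of problem automatically is to test each point against a rough bounding box for its country. The sketch below is a minimal illustration, with several assumptions: the bounding box values are approximate, the function name is hypothetical, the file is assumed to be a GeoJSON FeatureCollection, and the Conakry entry is an illustrative point that is not copied from locations.geojson.

```python
# Approximate bounding box for Guinea in degrees (an assumption for this sketch).
GUINEA_BBOX = {"min_lon": -15.5, "max_lon": -7.0, "min_lat": 7.0, "max_lat": 13.0}


def features_outside_bbox(feature_collection, country, bbox):
    """Return addresses of Point features for `country` that fall outside `bbox`."""
    outside = []
    for feature in feature_collection["features"]:
        if feature["properties"].get("country") != country:
            continue
        # GeoJSON stores positions as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        inside = (
            bbox["min_lon"] <= lon <= bbox["max_lon"]
            and bbox["min_lat"] <= lat <= bbox["max_lat"]
        )
        if not inside:
            outside.append(feature["properties"]["address"])
    return outside


sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"address": "Gueckedou", "country": "Guinea"},
            "geometry": {"type": "Point", "coordinates": [-104.798844, 39.87166390000001]},
        },
        {
            # Illustrative point with approximate coordinates.
            "type": "Feature",
            "properties": {"address": "Conakry", "country": "Guinea"},
            "geometry": {"type": "Point", "coordinates": [-13.7, 9.5]},
        },
    ],
}

print(features_outside_bbox(sample, "Guinea", GUINEA_BBOX))
```

A box check only flags gross errors like the Denver coordinates; it will not catch a point that is misplaced within the country.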

untracked file after fastforward

Hello,

After my last merge from upstream/master this file is showing up as untracked

"drcongo_data/originals/OMS_Rapport de Situation 01_Cas de gastro-ent\303\251rite h\303\251morragique et f\303\251brile.pdf"

After removing it locally, git thinks it needs to be git removed, which is weird. git ls-files shows it has been removed.
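
For reference, the path above is shown in git's quoted form, where each non-ASCII byte is printed as an octal escape (setting `core.quotePath` to false makes git print such paths verbatim). The sketch below decodes that form so the name is readable. It is not a diagnosis of the untracked-file problem, the function name is hypothetical, and it handles only octal escapes, not other escapes such as `\"` or `\\`.

```python
import re


def decode_git_quoted_path(quoted):
    """Decode the octal byte escapes in a git-quoted path into a UTF-8 string."""
    inner = quoted.strip('"').encode("ascii")
    # Replace each \ooo escape with the single byte it stands for.
    raw = re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        inner,
    )
    return raw.decode("utf-8")


quoted = (
    r'"drcongo_data/originals/OMS_Rapport de Situation 01_Cas de '
    r'gastro-ent\303\251rite h\303\251morragique et f\303\251brile.pdf"'
)
print(decode_git_quoted_path(quoted))
```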

Revise Sierra Leone data handling

Per @donpdonp:

The dates are all ahead by one day.
The file "2014-08-13-v77.csv" contains CSV data that says 8/13/2014 from report 77.

The problem is the data is from report 77 but report 77 contains data from 2014-8-12 (it was reported on 2014-8-13).

The date field for Sep 25, 26, 27, and 30 uses a two-digit year; it should be four digits.
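
Both fixes could be scripted. The sketch below is one possible approach, not the repo's actual processing code: the function names are hypothetical, filenames are assumed to follow the `YYYY-MM-DD-vNN.csv` pattern shown above and to carry the publication date, and the date field is assumed to be month/day/year as in "8/13/2014".

```python
import re
from datetime import date, datetime, timedelta

FILENAME_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-v(\d+)\.csv$")


def data_date_from_filename(filename):
    """Return (data_date, report_number) for a file named by publication date.

    The data date is taken to be one day before the publication date.
    """
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    year, month, day, report = (int(group) for group in match.groups())
    return date(year, month, day) - timedelta(days=1), report


def normalize_date_field(value):
    """Return a month/day/year string with a four-digit year."""
    # Try the four-digit form first so "2014" is never read as a two-digit year.
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return f"{parsed.month}/{parsed.day}/{parsed.year}"
    raise ValueError(f"unrecognized date: {value!r}")


print(data_date_from_filename("2014-08-13-v77.csv"))
print(normalize_date_field("9/25/14"))
```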

Standardized filenames

It could be helpful to run with a consistent naming convention like country-datasource-YYYY-MM-DD.csv, e.g. liberia-case_reports-2014-09-29.csv - I think it might make sorting and browsing a little easier.

It might also help with #37 so that guinea-report-2014-09-29.pdf could sit next to guinea-report-2014-09-29.csv and you'd have a better idea what still needs to be digitized.
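
To make the idea concrete, here is a rough sketch of helpers for the proposed convention. The function names and the exact pattern (lowercase letters and underscores for the country and data source) are assumptions for illustration, and `guinea-report-2014-09-30.pdf` is a made-up filename.

```python
import re
from datetime import date

NAME_RE = re.compile(
    r"^(?P<country>[a-z_]+)-(?P<source>[a-z_]+)-"
    r"(?P<date>\d{4}-\d{2}-\d{2})\.(?P<ext>csv|pdf)$"
)


def build_name(country, source, day, ext="csv"):
    """Build a filename of the form country-datasource-YYYY-MM-DD.ext."""
    return f"{country}-{source}-{day.isoformat()}.{ext}"


def parse_name(filename):
    """Split a conforming filename into (country, source, date, extension)."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a standardized filename: {filename!r}")
    return (
        match["country"],
        match["source"],
        date.fromisoformat(match["date"]),
        match["ext"],
    )


def undigitized(filenames):
    """Return the PDFs that have no CSV with the same stem, sorted by name."""
    csv_stems = {name[:-4] for name in filenames if name.endswith(".csv")}
    return sorted(
        name for name in filenames
        if name.endswith(".pdf") and name[:-4] not in csv_stems
    )


files = [
    "guinea-report-2014-09-29.pdf",
    "guinea-report-2014-09-29.csv",
    "guinea-report-2014-09-30.pdf",  # made-up name: a PDF with no CSV yet
]
print(build_name("liberia", "case_reports", date(2014, 9, 29)))
print(undigitized(files))
```

Because the date is written as YYYY-MM-DD, a plain alphabetical sort of the filenames is also a chronological sort within each country and data source.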

Universal variables?

How do you feel about changing the variable names to be consistent across countries? Right now it is a mix of the variables as described in the sitreps (which include whitespace and punctuation), and more machine-friendly shortened variable names. The change would cover both old and new files. cc @pallih
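
As a starting point for discussion, a small helper could derive machine-friendly names from the sitrep labels. This is only a sketch: the function name is hypothetical, the example labels are invented rather than copied from the sitreps, and a real mapping would probably also need a hand-written lookup table so that equivalent variables from different countries end up with the same name.

```python
import re


def machine_friendly(name):
    """Lowercase a label and replace runs of non-alphanumerics with '_'."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", name.strip().lower())
    return cleaned.strip("_")


# Invented labels for illustration only.
for label in ["New Case/s (Suspected)", "  Total death/s in confirmed cases "]:
    print(repr(label), "->", machine_friendly(label))
```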

add citation to root readme.md

Not sure if there's a canonical way to cite a github repo, but it seems like it'll be a good idea for this repository. This is one example I took from here

Rivers C., Ebola, (2014), GitHub repository, https://github.com/cmrivers/ebola/

SL date issues

Just in case my message on SL data file 2014-11-10.csv gets missed, there are inconsistencies with the data "dates" from Oct 31st onward - I am currently going through to put the correct date for which the data are describing, which is typically one day prior to the SitRep publication date. On Nov 12/15th, the file names changed from reflecting the date for the data to describing the date on which the report was published. This is confusing. I propose changes to those file names to match the date for the data they describe. Keeping the SitRep Version # is brilliant though. It's a shame the health ministry does not use a date system to identify each report in the file name!
