california-coronavirus-data's Issues

Vaccination data outdated

The vaccination file hasn’t been updated since February 10. This is very useful data so I’d love to see it kept current.

In cdph-county-cases-deaths.csv the county FIPS code is being treated as an integer

The data dictionaries say that the county FIPS codes are strings, and they need to retain their leading zeroes so they can be used to construct 5-digit state+county FIPS codes. But at least in cdph-county-cases-deaths.csv, the leading zeroes are being stripped and the field appears to be treated as an integer. All the other files I looked at retain the leading zeroes and treat the FIPS codes as strings.

See e.g. the first few lines of cdph-county-cases-deaths.csv

date,county,fips,population,confirmed_cases,reported_cases,probable_cases,reported_and_probable_cases,reported_deaths
2021-10-17,Alameda,1,1643700,117413,117423,2860,120283,1376
2021-10-17,Alpine,3,1148,103,103,0,103,0
2021-10-17,Amador,5,37829,5340,5341,62,5403,64

vs. the first few lines of latimes-county-totals.csv

date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-09-02,Alameda,001,109537,1316,655,2
2021-09-02,Alpine,003,89,0,0,0
2021-09-02,Amador,005,4559,53,33,0
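Until the file is fixed, a consumer can restore the padding on read. This is a minimal sketch, not the repository's own code: it assumes the stripped values are otherwise correct, uses the three sample rows quoted above, and the helper name `full_fips` is made up for illustration. California's state FIPS code is 06.

```python
import csv
import io

# Sample rows copied from the issue text (cdph-county-cases-deaths.csv).
SAMPLE = """date,county,fips,population,confirmed_cases,reported_cases,probable_cases,reported_and_probable_cases,reported_deaths
2021-10-17,Alameda,1,1643700,117413,117423,2860,120283,1376
2021-10-17,Alpine,3,1148,103,103,0,103,0
2021-10-17,Amador,5,37829,5340,5341,62,5403,64
"""

CA_STATE_FIPS = "06"  # California's two-digit state FIPS code


def full_fips(county_fips: str) -> str:
    """Pad a county FIPS to three digits and prefix the state code.

    Hypothetical helper: works on both stripped ("1") and intact ("001") values.
    """
    return CA_STATE_FIPS + county_fips.strip().zfill(3)


# csv.DictReader keeps every field as a string, so nothing is re-parsed as an int.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
codes = [full_fips(row["fips"]) for row in rows]
print(codes)  # ['06001', '06003', '06005']
```

Because the helper pads rather than assumes, the same call works on files such as latimes-county-totals.csv that already carry three-digit codes.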

Possible data error in state hospitalization totals

The confirmed_hospitalizations column in cdph-state-totals.csv seems to have a bad data entry, introduced in commit 57e198.

Specifically, for 2020-04-17, the confirmed_hospitalizations is 2180, whereas for the previous day 2020-04-16 the total was 3141, and the following day for 2020-04-18 is 3221.

Is this a typo? Should 2180 instead be 3180, which would make much more sense and would not be a large deviation from the previous and subsequent totals?
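A check like the following would surface this kind of entry automatically. It is only a sketch: the 20% threshold and the function name `find_outliers` are assumptions chosen for illustration, and the three values are the ones quoted in this issue.

```python
# Values quoted in the issue text (cdph-state-totals.csv, confirmed_hospitalizations).
hospitalizations = {
    "2020-04-16": 3141,
    "2020-04-17": 2180,
    "2020-04-18": 3221,
}

THRESHOLD = 0.2  # assumption: flag days more than 20% away from the neighbor mean


def find_outliers(series, threshold=THRESHOLD):
    """Return dates whose value deviates sharply from the mean of the adjacent days."""
    dates = sorted(series)  # ISO dates sort chronologically as strings
    flagged = []
    for prev, cur, nxt in zip(dates, dates[1:], dates[2:]):
        neighbor_mean = (series[prev] + series[nxt]) / 2
        if neighbor_mean == 0:
            continue  # avoid dividing by zero on empty stretches
        if abs(series[cur] - neighbor_mean) / neighbor_mean > threshold:
            flagged.append(cur)
    return flagged


suspects = find_outliers(hospitalizations)
print(suspects)  # ['2020-04-17']
```

With 3180 substituted for 2180, the same check flags nothing, which is consistent with the typo theory but does not prove it.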

Negative tests for counties

It would be super helpful if the total number of tests (positive + negative) were included in the county-level data.

Bay Area cities?

Hi, as of 4/9/20, Santa Clara County is posting city case counts. Do you have an estimate of if/when your tool will start including Bay Area daily city counts?

cdph-adult-and-senior-care-facilities.csv is no longer updating

The cdph-adult-and-senior-care-facilities.csv dataset hasn't been updated since April 6th. It looks like the data's still being updated daily by CDSS, but as it's in PDF form, I fully understand how annoying that is to deal with. Just wanted to put that on the radar in case it's not a known issue.

Also, the README section for this dataset has a bad link to the source file.

Thanks again for all the wonderful resources!

Some CSV files no longer being updated

I use the "cdph-county-cases-deaths.csv" and "latimes-place-totals.csv" files.
The former was last updated around Apr 15th and the latter around Apr 19th.

I use them both on my little site https://manylakes.io/

Have you stopped maintaining these files or is something broken?

If you have stopped updating them, please annotate so people know this.
Also, if they are going away permanently, can you advise consumers of the data on a good replacement?

`fips` field in latimes-county-totals.csv missing leading zeros

Heya, I grabbed this CSV today and found the fips field missing leading zeros; I cruised the file's history and it looks like this started with 860ffb9, on 4/25.

FWIW, I don't see this in the other files I glanced at that also have a FIPS field, but I'll update issue if I do. I'd poke through the code to find where the data type changed, but I'm on deadline!

Thanks for this, and I hope y'all are well.

Incorrect data in multiple .csvs

Multiple .csvs seem to contain incorrect or malformed data.
The following files seem to be affected:

  • latimes-agency-totals.csv
  • latimes-county-totals.csv
  • latimes-state-totals.csv

latimes-agency-totals.csv data from 2021-08-17

agency,county,fips,date,confirmed_cases,deaths,did_not_update
Alameda,Alameda,001,2021-08-17,,,
Berkeley,Alameda,001,2021-08-17,,,
Alpine,Alpine,003,2021-08-17,,,

latimes-agency-totals.csv data from 2021-08-16

Alameda,Alameda,001,2021-08-16,97878,1251,
Berkeley,Alameda,001,2021-08-16,4223,51,
Alpine,Alpine,003,2021-08-16,89,0,TRUE

latimes-county-totals.csv data from 2021-08-17

date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,Alameda,001,0,0,-102101,-1302
2021-08-17,Alpine,003,0,0,-89,0
2021-08-17,Amador,005,0,0,-4199,-50

latimes-county-totals.csv data from 2021-08-16

2021-08-16,Alameda,001,102101,1302,628,1
2021-08-16,Alpine,003,89,0,0,0
2021-08-16,Amador,005,4199,50,69,1

latimes-state-totals.csv

date,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,78720,1386,-4041164,-62738
2021-08-16,4119884,64124,19749,55
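Rows like these can be caught before they reach a chart. The sketch below is an assumption about what a consumer-side check might look like, not part of the repository: it flags rows in the latimes-county-totals.csv layout whose daily counts are blank or negative. The function name `suspicious_rows` is hypothetical and the sample is a subset of the rows quoted above.

```python
import csv
import io

# Subset of the latimes-county-totals.csv rows quoted in the issue text.
SAMPLE = """date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,Alameda,001,0,0,-102101,-1302
2021-08-17,Alpine,003,0,0,-89,0
2021-08-16,Alameda,001,102101,1302,628,1
2021-08-16,Alpine,003,89,0,0,0
"""


def suspicious_rows(text):
    """Return rows whose daily counts are missing or negative."""
    bad = []
    for row in csv.DictReader(io.StringIO(text)):
        daily = (row["new_confirmed_cases"], row["new_deaths"])
        if any(value.strip() == "" for value in daily):
            bad.append(row)  # blank cell, as in the agency file on 2021-08-17
        elif any(int(value) < 0 for value in daily):
            bad.append(row)  # cumulative total went down
    return bad


flagged = suspicious_rows(SAMPLE)
for row in flagged:
    print(row["date"], row["county"], row["new_confirmed_cases"], row["new_deaths"])
```

A small negative value can be a legitimate correction by the reporting agency, so a flagged row is a prompt for review rather than proof of an error.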

latimes-place-totals.csv has integers in string-typed columns of some rows.

In latimes-place-totals.csv there are some ill-formed rows:

  1. Several cases of "id" column containing an integer rather than a string
  2. Several cases of "name" column containing an integer rather than a string
  3. There are also cases of "population" column being empty but that's probably not an error, more likely missing data
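The first two cases in the list above can be located with a scan for digit-only values. This is a rough sketch: the sample rows are invented for illustration and are not taken from the real file, only the three column names mentioned in this issue are used, and `numeric_looking` is a hypothetical helper.

```python
import csv
import io

# Hypothetical rows for illustration only; NOT copied from latimes-place-totals.csv.
SAMPLE = """id,name,population
place-a,Example City,1000
12345,Example Town,
place-c,90210,500
"""


def numeric_looking(text, columns=("id", "name")):
    """Return (line number, column, value) for string columns holding only digits."""
    hits = []
    # start=2 because line 1 of the file is the header row
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        for col in columns:
            if row[col].strip().isdigit():
                hits.append((line_no, col, row[col]))
    return hits


hits = numeric_looking(SAMPLE)
print(hits)  # [(3, 'id', '12345'), (4, 'name', '90210')]
```

Empty population cells are deliberately ignored here, matching the reading above that they are missing data rather than errors.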

Skilled nursing facilities feed has been down

Hi,

I noticed this feed's been down for about a week. I'm guessing that's due to the official SNF feed no longer providing daily CSV exports, leaving the only available data in the dashboard? Unless I missed a new location for those, anyway.

Given that situation, I've become adept at writing scrapers for Tableau dashboards. So if the feed is out of operation indefinitely due to the lack of CSV exports, let me know. I'll work on it from my end, and perhaps I can help restore the feeds by giving you an endpoint to consume.

Loss of historical data on skilled nursing facilities?

The past few days, the cdph-skilled-nursing-facilities.csv dataset has contained information only for the current day, losing all historical data. Is this intentional, or is it just an error? I haven't been able to track the trends from this dataset due to this issue.

Redundant copies of data in cdph-age.csv

Low priority issue - it looks like there is a great deal of redundant data in cdph-age.csv:

  • Each age-date from 2020-05-21 through 2021-07-28 appears 81 times.
  • Each age-date from 2021-08-04 appears 58 times.
  • Each age-date from 2021-08-11 appears 18 times.

Here are the first 19 lines:


   date       age   confirmed_cases_total confirmed_cases_percent deaths_total deaths_percent
 1 2021-08-11 0-4                   97440                   0.024            7              0
 2 2021-08-11 0-4                   97440                   0.024            7              0
 3 2021-08-11 0-4                   97440                   0.024            7              0
 4 2021-08-11 0-4                   97440                   0.024            7              0
 5 2021-08-11 0-4                   97440                   0.024            7              0
 6 2021-08-11 0-4                   97440                   0.024            7              0
 7 2021-08-11 0-4                   97440                   0.024            7              0
 8 2021-08-11 0-4                   97440                   0.024            7              0
 9 2021-08-11 0-4                   97440                   0.024            7              0
10 2021-08-11 0-4                   97440                   0.024            7              0
11 2021-08-11 0-4                   97440                   0.024            7              0
12 2021-08-11 0-4                   97440                   0.024            7              0
13 2021-08-11 0-4                   97440                   0.024            7              0
14 2021-08-11 0-4                   97440                   0.024            7              0
15 2021-08-11 0-4                   97440                   0.024            7              0
16 2021-08-11 0-4                   97440                   0.024            7              0
17 2021-08-11 0-4                   97440                   0.024            7              0
18 2021-08-11 0-4                   97440                   0.024            7              0
19 2021-08-11 5-17                 433229                   0.109           23              0

Consequently, the data file is 117,390 rows long, but there are only 1,460 distinct data points.
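Consumers who need a clean copy in the meantime can drop exact duplicates while reading. The following is a minimal sketch, assuming duplicates are byte-identical rows as in the listing above; the sample is an abbreviated version of that listing and the function name `dedupe` is made up for illustration.

```python
import csv
import io

# Abbreviated from the listing in the issue text: three copies of one row, then the next age group.
SAMPLE = """date,age,confirmed_cases_total,confirmed_cases_percent,deaths_total,deaths_percent
2021-08-11,0-4,97440,0.024,7,0
2021-08-11,0-4,97440,0.024,7,0
2021-08-11,0-4,97440,0.024,7,0
2021-08-11,5-17,433229,0.109,23,0
"""


def dedupe(text):
    """Return (header, unique rows in first-seen order, total rows read)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    seen = set()
    unique = []
    total = 0
    for row in reader:
        total += 1
        key = tuple(row)  # lists are unhashable, so use a tuple as the set key
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return header, unique, total


header, unique, total = dedupe(SAMPLE)
print(f"{total} rows read, {len(unique)} distinct")  # 4 rows read, 2 distinct
```

Keeping first-seen order means the output stays sorted the same way as the source file.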
