datadesk / california-coronavirus-data
The Los Angeles Times' open-source archive of California coronavirus data
Home Page: https://www.latimes.com/coronavirustracker
License: Other
In the first row of the latimes-agency-totals.csv file, the field name 'x' was changed to 'lon', but 'y' remains 'y'.
The readme still refers to 'x' and 'y'.
This feels like a mistake. It should probably be changed back to x and y or changed to lon and lat.
The file being referred to
https://github.com/datadesk/california-coronavirus-data/blob/master/latimes-place-totals.csv
The vaccination file hasn’t been updated since February 10. This is very useful data so I’d love to see it kept current.
The data dictionaries say that the county FIPS codes are strings, and it's important that they retain their leading zeroes so they can be used to construct 5-digit state+county FIPS codes. But at least in the case of cdph-county-cases-deaths.csv, it appears that the leading zeroes are being stripped and the field is being treated as an integer. All the other files I looked at appear to have retained the leading zeroes and treat the FIPS codes as strings.
See e.g. the first few lines of cdph-county-cases-deaths.csv:
date,county,fips,population,confirmed_cases,reported_cases,probable_cases,reported_and_probable_cases,reported_deaths
2021-10-17,Alameda,1,1643700,117413,117423,2860,120283,1376
2021-10-17,Alpine,3,1148,103,103,0,103,0
2021-10-17,Amador,5,37829,5340,5341,62,5403,64
vs. the first few lines of latimes-county-totals.csv
date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-09-02,Alameda,001,109537,1316,655,2
2021-09-02,Alpine,003,89,0,0,0
2021-09-02,Amador,005,4559,53,33,0
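For anyone consuming the file while the leading zeroes are missing, here is a minimal workaround sketch. It assumes the `fips` column holds a county code that should be three digits wide, and it prefixes California's state code (`06`) to build the 5-digit code. The helper name `full_fips` is hypothetical, not something from this repository.

```python
import csv
import io

STATE_FIPS = "06"  # California's two-digit state FIPS code


def full_fips(county_fips: str) -> str:
    """Zero-pad a county code to three digits and prefix the state code."""
    return STATE_FIPS + county_fips.strip().zfill(3)


# Sample rows copied from the cdph-county-cases-deaths.csv excerpt above
sample = """date,county,fips,population,confirmed_cases,reported_cases,probable_cases,reported_and_probable_cases,reported_deaths
2021-10-17,Alameda,1,1643700,117413,117423,2860,120283,1376
2021-10-17,Alpine,3,1148,103,103,0,103,0
2021-10-17,Amador,5,37829,5340,5341,62,5403,64
"""

# csv.DictReader yields every field as a string, so nothing is coerced to int
rows = list(csv.DictReader(io.StringIO(sample)))
codes = {row["county"]: full_fips(row["fips"]) for row in rows}
print(codes)
```

The same helper leaves already-padded values such as `001` unchanged, so it should be safe to apply to the other files too.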
Hello,
Your dataset was added to CoronaWhy (https://www.coronawhy.org/) Data Lake on Dataverse as a piece of common COVID-19 data frame https://datasets.coronawhy.org/dataset.xhtml?persistentId=doi:10.5072/FK2/ONDTAK
Would you be willing to help with the maintenance of your dataset in Dataverse, e.g. adding the relevant metadata and keeping the dataset up-to-date? That will help to make the dataset findable and accessible for the medical science community.
The confirmed_hospitalizations column in cdph-state-totals.csv seems to have a possible bad data entry, introduced in commit 57e198.
Specifically, for 2020-04-17, confirmed_hospitalizations is 2180, whereas the total for the previous day, 2020-04-16, was 3141, and the total for the following day, 2020-04-18, is 3221.
Is this a typo? Should 2180 instead be 3180, which would make much more sense and would not deviate much from the previous and subsequent totals?
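A rough way to spot entries like this automatically is to compare each day with its two neighbours. The sketch below is only an illustration: the function name `flag_spikes` and the 20% tolerance are assumptions, and the three values are the ones quoted in this report.

```python
def flag_spikes(series, tolerance=0.2):
    """Return dates whose value falls outside the range spanned by the
    previous and next day, widened by `tolerance` (a relative margin)."""
    flagged = []
    for (_, prev), (date, value), (_, nxt) in zip(series, series[1:], series[2:]):
        low, high = min(prev, nxt), max(prev, nxt)
        if value < low * (1 - tolerance) or value > high * (1 + tolerance):
            flagged.append(date)
    return flagged


# confirmed_hospitalizations values quoted in the report above
series = [("2020-04-16", 3141), ("2020-04-17", 2180), ("2020-04-18", 3221)]
suspect_dates = flag_spikes(series)
print(suspect_dates)
```

With the suggested value of 3180 in place of 2180, the same check flags nothing.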
2020-06-04 may have total confirmed cases and new confirmed cases switched? That row lists only 5195 new cases, so new confirmed cases is a very large negative number that's skewing my charts.
Thanks,
Andrew
https://github.com/datadesk/california-coronavirus-data/blob/master/latimes-state-totals.csv
Hello,
The last update to cdph-hospital-patient-county-totals.csv and cdph-hospital-patient-state-totals.csv (both in https://github.com/datadesk/california-coronavirus-data) occurred on 6/12/22 (6/11/22 data). Have updates been discontinued for these files? If yes, is there another good source for this data?
Thank you
It would be super helpful if the total number of tests (positive + negative) were included in the county-level data.
It's the end of 4/17 and there have been no updates for 2 days. Is this repository being abandoned?
Hi, as of 4/9/20, Santa Clara County is posting city case counts. Do you have an estimate of if/when your tool will start including Bay Area daily city counts?
date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-13,San Mateo,081,0,0,-46231,-592
It seems all data to date for San Mateo was zeroed out for 2021-08-13 in commit cf492f4.
Hi,
For the past couple of months, the cdph-adult-and-senior-care-facilities.csv dataset has lacked any new data. Is there a scraping problem, or is the data being retired?
Thanks!
The cdph-adult-and-senior-care-facilities.csv dataset hasn't been updated since April 6th. It looks like the data is still being updated daily by CDSS, but as it's in PDF form, I fully understand how annoying that is to deal with. Just wanted to put that on the radar in case it's not a known issue.
Also, the README section for this dataset has a bad link to the source file.
Thanks again for all the wonderful resources!
I use the "cdph-county-cases-deaths.csv" and "latimes-place-totals.csv" files.
The former was last updated around Apr 15th and the latter around Apr 19th.
I use them both on my little site https://manylakes.io/
Have you stopped maintaining these files or is something broken?
If you have stopped updating them, please annotate so people know this.
Also, if they are going away permanently, can you advise consumers of the data on a good replacement?
Heya, I grabbed this CSV today and found the fips field missing leading zeros; I cruised the file's history and it looks like this started with 860ffb9, on 4/25.
FWIW, I don't see this in the other files I glanced at that also have a FIPS field, but I'll update this issue if I do. I'd poke through the code to find where the data type changed, but I'm on deadline!
Thanks for this, and I hope y'all are well.
Alameda county stats in latimes-place-totals.csv have not changed since 2021-04-04 (9 days ago).
Perhaps an ETL glitch?
The data for Sonoma County for 2021-04-16 reads:
2021-04-16,Sonoma,097,0,0,-29632,-311
It looks like it probably should be:
2021-04-16,Sonoma,097,29632,311,0,0
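Rows like this share a recognisable signature: the cumulative total is zero while the daily change is negative. A minimal detection sketch follows, assuming the column order used in latimes-county-totals.csv; the helper name `is_zeroed_out` is hypothetical. The two sample rows are the reported row and the suggested correction from this report.

```python
import csv
import io

HEADER = "date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths"

# First data row: as published. Second data row: the correction suggested above.
sample = HEADER + """
2021-04-16,Sonoma,097,0,0,-29632,-311
2021-04-16,Sonoma,097,29632,311,0,0
"""


def is_zeroed_out(row: dict) -> bool:
    """True when the cumulative total is 0 but the daily change is negative."""
    return int(row["confirmed_cases"]) == 0 and int(row["new_confirmed_cases"]) < 0


rows = list(csv.DictReader(io.StringIO(sample)))
flags = [is_zeroed_out(row) for row in rows]
print(flags)
```

Only the published row is flagged; the suggested correction passes the check.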
Data dump of 7/24/2021 has ill-formed data in latimes-agency-totals.csv:
Example: note deaths and confirmed_cases are blank
Pasadena,Los Angeles,037,2021-07-23,,,
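A quick way to list the affected agencies is to look for rows whose count fields are empty. This is a sketch only: the header line is assumed to match the latimes-agency-totals.csv excerpt quoted elsewhere on this page, and `has_blank_counts` is a hypothetical helper name.

```python
import csv
import io

# Header assumed from another excerpt of latimes-agency-totals.csv;
# the data row is the example quoted above.
sample = """agency,county,fips,date,confirmed_cases,deaths,did_not_update
Pasadena,Los Angeles,037,2021-07-23,,,
"""


def has_blank_counts(row: dict) -> bool:
    """True when either cumulative count is an empty string."""
    return row["confirmed_cases"] == "" or row["deaths"] == ""


rows = list(csv.DictReader(io.StringIO(sample)))
blank_agencies = [row["agency"] for row in rows if has_blank_counts(row)]
print(blank_agencies)
```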
Hasn't updated in a week (latest is Feb 25).
Hello,
I note that the latimes-county-totals.csv hasn't been updated since September 2. Has this file been retired?
Multiple CSV files seem to contain incorrect or malformed data.
The following files seem to be affected:
latimes-agency-totals.csv data from 2021-08-17
agency,county,fips,date,confirmed_cases,deaths,did_not_update
Alameda,Alameda,001,2021-08-17,,,
Berkeley,Alameda,001,2021-08-17,,,
Alpine,Alpine,003,2021-08-17,,,
latimes-agency-totals.csv data from 2021-08-16
Alameda,Alameda,001,2021-08-16,97878,1251,
Berkeley,Alameda,001,2021-08-16,4223,51,
Alpine,Alpine,003,2021-08-16,89,0,TRUE
latimes-county-totals.csv data from 2021-08-17
date,county,fips,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,Alameda,001,0,0,-102101,-1302
2021-08-17,Alpine,003,0,0,-89,0
2021-08-17,Amador,005,0,0,-4199,-50
latimes-county-totals.csv data from 2021-08-16
2021-08-16,Alameda,001,102101,1302,628,1
2021-08-16,Alpine,003,89,0,0,0
2021-08-16,Amador,005,4199,50,69,1
latimes-state-totals.csv
date,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,78720,1386,-4041164,-62738
2021-08-16,4119884,64124,19749,55
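Since these are cumulative totals, one sanity check is that they should not decrease from one day to the next. The sketch below recomputes the day-over-day change from the two latimes-state-totals.csv rows quoted above; the function name `find_decreases` is an assumption for illustration.

```python
import csv
import io

# Rows copied from the latimes-state-totals.csv excerpt above
sample = """date,confirmed_cases,deaths,new_confirmed_cases,new_deaths
2021-08-17,78720,1386,-4041164,-62738
2021-08-16,4119884,64124,19749,55
"""


def find_decreases(rows, field="confirmed_cases"):
    """Return (date, delta) pairs where a cumulative field went down."""
    ordered = sorted(rows, key=lambda r: r["date"])  # ISO dates sort as text
    drops = []
    for prev, curr in zip(ordered, ordered[1:]):
        delta = int(curr[field]) - int(prev[field])
        if delta < 0:
            drops.append((curr["date"], delta))
    return drops


rows = list(csv.DictReader(io.StringIO(sample)))
case_drops = find_decreases(rows)
death_drops = find_decreases(rows, field="deaths")
print(case_drops, death_drops)
```

The recomputed deltas match the negative new_confirmed_cases and new_deaths values in the 2021-08-17 row, which suggests the daily change was derived from a bad cumulative total rather than entered wrongly on its own.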
In latimes-place-totals.csv, there are some ill-formed rows:
Hi,
I noticed this feed has been down for about a week. I'm guessing that's because the official SNF feed no longer provides daily CSV exports, leaving the dashboard as the only place the data is available? Unless I missed a new location for those, anyway.
Given that situation, I've become adept at writing scrapers for Tableau dashboards. So if the feed is out of operation indefinitely due to the lack of CSV exports, let me know. I'll work on it from my end, and perhaps I can help restore the feeds by giving you an endpoint to consume.
For the past few days, the cdph-skilled-nursing-facilities.csv dataset has contained information only for the current day, losing all historical data. Is this intentional, or is it just an error? I haven't been able to track the trends from this dataset due to this issue.
Low priority issue - it looks like there is a great deal of redundant data in cdph-age.csv:
Here are the first 19 lines:
date age confirmed_cases_total confirmed_cases_percent deaths_total deaths_percent
1 2021-08-11 0-4 97440 0.024 7 0
2 2021-08-11 0-4 97440 0.024 7 0
3 2021-08-11 0-4 97440 0.024 7 0
4 2021-08-11 0-4 97440 0.024 7 0
5 2021-08-11 0-4 97440 0.024 7 0
6 2021-08-11 0-4 97440 0.024 7 0
7 2021-08-11 0-4 97440 0.024 7 0
8 2021-08-11 0-4 97440 0.024 7 0
9 2021-08-11 0-4 97440 0.024 7 0
10 2021-08-11 0-4 97440 0.024 7 0
11 2021-08-11 0-4 97440 0.024 7 0
12 2021-08-11 0-4 97440 0.024 7 0
13 2021-08-11 0-4 97440 0.024 7 0
14 2021-08-11 0-4 97440 0.024 7 0
15 2021-08-11 0-4 97440 0.024 7 0
16 2021-08-11 0-4 97440 0.024 7 0
17 2021-08-11 0-4 97440 0.024 7 0
18 2021-08-11 0-4 97440 0.024 7 0
19 2021-08-11 5-17 433229 0.109 23 0
Consequently, the data file is 117,390 rows long, but there are only 1,460 distinct data points.
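Until the file is cleaned up, consumers can drop the repeats on their side. A minimal sketch, using tuples that mirror the rows in the excerpt above (18 identical `0-4` rows followed by one `5-17` row):

```python
# Rows mirror the excerpt above:
# (date, age, confirmed_cases_total, confirmed_cases_percent, deaths_total, deaths_percent)
rows = [("2021-08-11", "0-4", 97440, 0.024, 7, 0)] * 18
rows.append(("2021-08-11", "5-17", 433229, 0.109, 23, 0))

# dict.fromkeys keeps the first occurrence of each row and preserves order
unique_rows = list(dict.fromkeys(rows))

print(len(rows), len(unique_rows))
```

Applied to the whole file, the same approach should reduce it to its distinct rows without reordering them.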