seandavi / sars2pack

An R package with over 50 highly cited, ready-to-use, up-to-date COVID-19 pandemic data resources

Home Page: https://seandavi.github.io/sars2pack/

License: Other

R 99.54% Dockerfile 0.46%
rstats-package rstats data-science datascience data-visualization covid-19 datasets coronavirus coronavirus-tracking biomedical-data

sars2pack's Introduction

sars2pack

Overview

The sars2pack R package provides one-line access to over 40 COVID-related datasets. Datasets are accessed in real time directly from their sources and then transformed to tidy-data form where possible and applicable. The result of each dataset accessor is a ready-to-use R dataset, often a dataframe. Documentation includes dataset descriptions, sources and references, and examples. Online documentation is available in two locations:

Questions addressed by sars2pack

  • What are the current and historical total, new cases, and deaths of COVID-19 at the city, county, state, national, and international levels?
  • How do changes in infection rates differ across locations?
  • What are the non-pharmacological interventions in place at the local and national levels?
  • In the United States, what is the geographical distribution of healthcare capacity (ICU beds, total beds, doctors, etc.)?
  • What are the published values of key epidemic parameters, as curated from the literature?

Installation

# If you do not have BiocManager installed:
install.packages('BiocManager')

# Then, if sars2pack is not already installed:
BiocManager::install('seandavi/sars2pack')

After the one-time installation, load the package to get started.

library(sars2pack)

Available datasets

| name | accessor | data\_type | geographical | geospatial | region | resolution | url |
|---|---|---|---|---|---|---|---|
| United States county-level geographic details | us\_county\_geo\_details | c("demographics", "geographic") | TRUE | TRUE | United States | admin2 | [LINK](https://github.com/josh-byster/fips_lat_long) |
| OECD International Unemployment Data | oecd\_unemployment\_data | c("economics", "time series") | TRUE | FALSE | World | admin0 | [LINK](https://oecd.org) |
| healthdata.org COVID-19 Mobility Observations and Projections | healthdata\_mobility\_data | c("mobility", "time series", "projections") | TRUE | FALSE | International | c("admin0", "admin1") | [LINK](https://covid19.healthdata.org/projections) |
| healthdata.org COVID-19 Testing Observations and Projections | healthdata\_testing\_data | c("testing", "time series", "projections") | TRUE | FALSE | International | c("admin0", "admin1") | [LINK](https://covid19.healthdata.org/projections) |
| Our World In Data testing and cases reporting | owid\_data | c("time series", "cases", "deaths", "testing") | TRUE | FALSE | World | admin0 | [LINK](https://ourworldindata.org/coronavirus) |
| CovidTracker data | covidtracker\_data | c("time series", "cases", "deaths", "testing") | TRUE | FALSE | United States | admin1 | [LINK](https://covidtracking.com/) |
| European CDC world tracking | ecdc\_data | c("time series", "cases", "deaths") | TRUE | FALSE | World | admin0 | [LINK](https://www.ecdc.europa.eu/en/covid-19) |
| EU data Github aggregator | eu\_data\_cache\_data | c("time series", "cases", "deaths") | TRUE | FALSE | Europe | c("admin0", "admin1") | [LINK](https://github.com/covid19-eu-zh/covid19-eu-data) |
| USA Facts | usa\_facts\_data | c("time series", "cases", "deaths") | TRUE | FALSE | United States | admin1 | [LINK](https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/) |
| Johns Hopkins dataset | jhu\_data | c("time series", "cases", "deaths") | TRUE | FALSE | World | admin0 | [LINK](https://github.com/CSSEGISandData/COVID-19) |
| Johns Hopkins US-centric data | jhu\_us\_data | c("time series", "cases", "deaths") | TRUE | FALSE | United States | c("admin1", "admin2") | [LINK](https://github.com/CSSEGISandData/COVID-19) |
| New York Times county level data | nytimes\_county\_data | c("time series", "cases", "deaths") | TRUE | FALSE | United States | admin2 | [LINK](https://raw.githubusercontent.com/nytimes/covid-19-data) |
| New York Times state level data | nytimes\_state\_data | c("time series", "cases", "deaths") | TRUE | FALSE | United States | admin1 | [LINK](https://raw.githubusercontent.com/nytimes/covid-19-data) |
| The Economist: Excess deaths during COVID pandemic | economist\_excess\_deaths | c("time series", "deaths", "excess deaths") | TRUE | FALSE | International | c("admin0", "admin1") | [LINK](https://github.com/TheEconomist/covid-19-excess-deaths-tracker) |
| The Financial Times: Excess deaths during COVID pandemic | financial\_times\_excess\_deaths | c("time series", "deaths", "excess deaths") | TRUE | FALSE | International | c("admin0", "admin1") | [LINK](https://github.com/Financial-Times/coronavirus-excess-mortality-data) |
| US CDC excess deaths dataset | cdc\_excess\_deaths | c("time series", "deaths", "excess deaths") | TRUE | FALSE | United States | admin1 | [LINK](https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.html) |
| Descartes Labs Mobility Data | descartes\_mobility\_data | c("time series", "mobility") | TRUE | FALSE | United States | admin1 | [LINK](https://raw.githubusercontent.com/descarteslabs/DL-COVID-19) |
| Apple mobility data from maps | apple\_mobility\_data | c("time series", "mobility") | TRUE | FALSE | World | c("admin0", "admin1", "admin2", "admin3") | [LINK](https://www.apple.com/covid19/mobility) |
| Healthdata.org projections of hospital utilization and deaths | healthdata\_projections\_data | c("time series", "projections", "cases", "deaths") | TRUE | FALSE | c("United States", "World") | c("admin1", "admin2") | [LINK](http://www.healthdata.org/covid) |
| Healthdata.org mobility data | healthdata\_mobility\_data | c("time series", "projections", "mobility") | TRUE | FALSE | c("United States", "World") | c("admin1", "admin2") | [LINK](http://www.healthdata.org/covid) |
| United States CDC Social Vulnerability Index | cdc\_social\_vulnerability\_index | demographics | TRUE | FALSE | United States | admin2 | [LINK](https://svi.cdc.gov/) |
| US county health rankings from ‘’ | us\_county\_health\_rankings | demographics | TRUE | FALSE | United States | c("admin0", "admin1", "admin2") | [LINK](https://www.countyhealthrankings.org) |
| Country metadata from restcountries.eu | country\_metadata | demographics | TRUE | FALSE | World | admin0 | [LINK](https://restcountries.eu) |
| Extensive United States hospital capabilities | us\_hospital\_details | healthcare capacity | TRUE | TRUE | United States | individual hospital | [LINK](https://hub.arcgis.com/datasets/geoplatform::hospitals) |
| Kaiser Family Foundation ICU bed data | kff\_icu\_beds | healthcare capacity | TRUE | TRUE | United States | Individual hospital | [LINK](https://khn.org/news/as-coronavirus-spreads-widely-millions-of-older-americans-live-in-counties-with-no-icu-beds) |
| CovidCare United States Healthcare Capacity | us\_healthcare\_capacity | healthcare capacity | TRUE | TRUE | United States | Individual hospital | [LINK](https://github.com/covidcaremap/covid19-healthsystemcapacity) |
| GISAID metadata from thousands of SARS-CoV-2 sequences | cov\_glue\_lineage\_data | line list | TRUE | FALSE | World | multiple | [LINK](https://github.com/hCoV-2019/lineages) |
| beoutbreakprepared | beoutbreakprepared\_data | line list | TRUE | FALSE | World | patient | [LINK](https://github.com/beoutbreakprepared/nCoV2019) |
| Published epidemic parameters for COVID-19 | param\_estimates\_published | miscellaneous | FALSE | FALSE | list() | list() | [LINK](https://github.com/midas-network/COVID-19/blob/master/parameter_estimates/2019_novel_coronavirus/estimates.csv) |
| Google mobility data | google\_mobility\_data | mobility | TRUE | FALSE | World | c("admin0", "admin1", "admin2") | [LINK](https://www.google.com/covid19/mobility/) |
| Newick tree from thousands of SARS-CoV-2 sequences | cov\_glue\_newick\_data | phylogenetic | FALSE | FALSE | World | multiple | [LINK](https://github.com/hCoV-2019/lineages) |
| Aggregated projections from US CDC | cdc\_aggregated\_projections | projections | TRUE | FALSE | list() | c("admin0", "admin1") | [LINK](https://www.cdc.gov/coronavirus/2019-ncov/covid-data/forecasting-us.html) |
| CoronaNet government response database | coronanet\_government\_response\_data | public policy | TRUE | FALSE | World | c("admin0", "admin1") | [LINK](https://coronanet-project.org/index.html) |
| Oxford Government Policy Intervention time series | government\_policy\_timeline | public policy | TRUE | FALSE | World | admin0 | [LINK](https://www.bsg.ox.ac.uk/research/research-projects/oxford-covid-19-government-response-tracker) |
| United States social distancing policies | us\_state\_distancing\_policy | public policy | TRUE | FALSE | United States | admin1 | [LINK](https://github.com/COVID19StatePolicy/SocialDistancing/) |

Case tracking

Updated tracking of city, county, state, national, and international confirmed cases, deaths, and testing is critical to driving policy, implementing interventions, and measuring their effectiveness. Case tracking datasets include date, a count of cases, and usually numerous other pieces of information related to location of reporting, etc.

Accessing case-tracking datasets is typically done with one function per dataset. The example here uses data from the European Centre for Disease Prevention and Control (ECDC).

ecdc = ecdc_data()

Get a quick overview of the dataset.

head(ecdc)

## # A tibble: 6 x 8
## # Groups:   location_name, subset [6]
##   date       location_name iso2c iso3c population_2019 continent subset    count
##   <date>     <chr>         <chr> <chr>           <dbl> <chr>     <chr>     <dbl>
## 1 2019-12-31 Afghanistan   AF    AFG          38041757 Asia      confirmed     0
## 2 2019-12-31 Afghanistan   AF    AFG          38041757 Asia      deaths        0
## 3 2019-12-31 Algeria       DZ    DZA          43053054 Africa    confirmed     0
## 4 2019-12-31 Algeria       DZ    DZA          43053054 Africa    deaths        0
## 5 2019-12-31 Armenia       AM    ARM           2957728 Europe    confirmed     0
## 6 2019-12-31 Armenia       AM    ARM           2957728 Europe    deaths        0

The ecdc dataset is just a data.frame (actually, a tibble), so standard R or tidyverse functionality can answer basic questions with little code. The next code block builds top10, a ranking of the countries with the most deaths recorded to date. Note that if you run this on your own computer, the data will reflect today’s values.

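library(dplyr)
# Keep, for each country, the row(s) where the cumulative death count peaks,
# then rank by that count. Ties on the maximum keep every tied row, so a
# country whose count plateaus can appear more than once in the result.
top10 = ecdc %>% filter(subset=='deaths') %>%
    group_by(location_name) %>%
    filter(count==max(count)) %>%
    arrange(desc(count)) %>%
    head(10) %>% select(-starts_with('iso'),-continent,-subset) %>%
    mutate(rate_per_100k = 1e5*count/population_2019)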

Finally, present a nice table of those countries:

knitr::kable(
    top10,
    caption = "Reported COVID-19-related deaths in ten most affected countries.",
    format = 'pandoc')

Reported COVID-19-related deaths in ten most affected countries.

| date | location_name | population_2019 | count | rate_per_100k |
|---|---|---|---|---|
| 2020-07-06 | United_States_of_America | 329064917 | 129947 | 39.489776 |
| 2020-07-06 | Brazil | 211049519 | 64867 | 30.735441 |
| 2020-07-06 | United_Kingdom | 66647112 | 44220 | 66.349462 |
| 2020-07-06 | Italy | 60359546 | 34861 | 57.755570 |
| 2020-07-06 | Mexico | 127575529 | 30639 | 24.016361 |
| 2020-07-04 | France | 67012883 | 29893 | 44.607841 |
| 2020-07-05 | France | 67012883 | 29893 | 44.607841 |
| 2020-07-06 | France | 67012883 | 29893 | 44.607841 |
| 2020-05-24 | Spain | 46937060 | 28752 | 61.256500 |
| 2020-07-06 | India | 1366417756 | 19693 | 1.441214 |
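
France appears three times in the table above because its cumulative count is identical on three consecutive dates, and every row tied for the maximum is kept. The following is a minimal base-R sketch of one way to keep a single row per country (the most recent date among ties). The helper name `one_row_per_country` is hypothetical and not part of sars2pack; the toy rows reuse the France and Spain values from the table.

```r
# Toy rows copied from the table above (France and Spain only).
deaths <- data.frame(
  date = as.Date(c("2020-07-04", "2020-07-05", "2020-07-06", "2020-05-24")),
  location_name = c("France", "France", "France", "Spain"),
  count = c(29893, 29893, 29893, 28752),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one peak row per country, latest date wins on ties.
one_row_per_country <- function(df) {
  pieces <- lapply(split(df, df$location_name), function(g) {
    peak <- g[g$count == max(g$count), ]
    peak[which.max(as.numeric(peak$date)), ]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out[order(-out$count), ]
}

peaks <- one_row_per_country(deaths)
print(peaks)
```

With dplyr 1.0.0 or later, `slice_max(count, n = 1, with_ties = FALSE)` on the grouped data is another way to keep a single row per group.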

Examine the spread of the pandemic throughout the world by examining cumulative deaths reported for the top 10 countries above.

ecdc_top10 = ecdc %>% filter(location_name %in% top10$location_name & subset=='deaths')
plot_epicurve(ecdc_top10,
              filter_expression = count > 10, 
              color='location_name')

Comparing the features of disease spread is easiest if all curves are shifted to “start” at the same absolute level. In this case, shift the origin for each country to the first time point at which more than 100 cumulative deaths had been reported. Note how some curves cross others, which suggests less infection control at the same relative time in the pandemic for that country (e.g., Brazil).

ecdc_top10 %>% align_to_baseline(count>100,group_vars=c('location_name')) %>%
    plot_epicurve(date_column = 'index',color='location_name')
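
To make the idea of shifting the origin concrete, here is a minimal base-R sketch of one plausible reading of the alignment step: drop rows at or below the threshold, then count days since the first remaining date within each location. This is an illustration only and not the implementation of `align_to_baseline()`; the function name `align_sketch` and the toy data are assumptions.

```r
# Toy cumulative counts for two made-up locations, "A" and "B".
toy <- data.frame(
  date = as.Date("2020-03-01") + c(0:3, 0:3),
  location_name = rep(c("A", "B"), each = 4),
  count = c(50, 120, 200, 300, 10, 20, 150, 400),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per location, keep rows above the threshold and
# add an 'index' column giving days since the first such row.
align_sketch <- function(df, threshold = 100) {
  pieces <- lapply(split(df, df$location_name), function(g) {
    g <- g[order(g$date), ]
    g <- g[g$count > threshold, ]
    if (nrow(g) == 0) return(NULL)  # location never crosses the threshold
    g$index <- as.integer(g$date - min(g$date))
    g
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

aligned <- align_sketch(toy)
print(aligned)
```

Plotting `count` against `index` rather than `date` then places every location on a common relative time axis.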

Contributions

Pull requests are gladly accepted on GitHub.

Adding new datasets

See the Adding new datasets vignette.

Similar work

sars2pack's People

Contributors

jcmallery, joe-wasserman, kevinrue, richardmn, seandavi, vjcitn

sars2pack's Issues

[New Data Resource] Center for Economic Progress Report on minorities in frontline jobs

Describe the data resource you'd like to add

Before the COVID-19 pandemic, more than 30 million US workers were employed in six broad industries that are now on the frontlines of the response. They include grocery store clerks, nurses, cleaners, warehouse workers, and bus drivers, among others. They were essential before the pandemic hit, yet also overworked, underpaid, under protected, and under appreciated. The tables below provide a basic demographic profile of workers in these frontline industries.

Access details

Additional context
https://cepr.net/a-basic-demographic-profile-of-workers-in-frontline-industries/

[BUG] Articles no longer updating

Describe the bug
Vignettes no longer updating. Last update date: May 17, 2020

Expected behavior
Daily updated articles, with at most a ~24-hour delay

Additional context
Appears to affect all vignettes

Looks like GH actions is now breaking due to bioc docker version mismatch

Looks like:

  • Bioconductor BiocManager is now set to use Bioc 3.11
  • Bioconductor:Bioconductor_docker is using R 3.6.3, although I think rocker:devel is now running R-devel (R-4.1.0 pre)
  • Bioc 3.11 requires R version 4.0

So, bioc supported docker is out-of-sync with bioc versioning. Will likely have to ditch Bioconductor docker images for at least the times around releases, as this is likely to happen again.

[New Data Resource]: https://github.com/COVID19StatePolicy/SocialDistancing

Description

This is a routinely-maintained data repository for US state-level distancing policies to the 2019 novel coronavirus (SARS-CoV-2), the cause of COVID-19. It was developed and is maintained by researchers at the University of Washington, Seattle, WA, USA.

These data will be useful for US state-level modeling, for example implementing the Imperial College covid19model

Access details

Data elements

  • File: "USstatesCov19distancingpolicy.csv". Prior datasets are archived with date stamps in the format of YYYYMMDD.
  • location_id: State-level unique identifier per the Global Burden of Disease (GBD) study.
  • StateFIPS: State-level Federal Information Processing Standard (FIPS) code.
  • StatePostal: Two-letter state postal code. This corresponds to StatePostal in the "state_id.csv".
  • StateName: State name. This corresponds to StateName in "state_id.csv".
  • StatePolicy : String variable of state policies, as described below:
    - EmergDec: Emergency declaration; currently includes State of Emergency, Public Health Emergency, and Public Health Disaster declarations.
    - GathRecomAny: Any recommendation against gathering that stops short of a formal mandate or restriction of gatherings. Includes phrasing such as "advises against mass gatherings" and "constituents should avoid gatherings of more than 100" that implies a recommendation rather than a restriction.
    - GathRestrictAny: Restriction of any gathering; includes formal mandate or an executive order that uses phrasing such as "prohibits all mass gatherings" (per definition of mass gathering) and "constituents must avoid gatherings of more than 100". The first issuance of a gathering restriction of any size is coded with this date.
    - GathRestrict1000: Restriction of any gathering exceeding 1000 persons; coding followed the "GathRestrict" criteria. Some mandates include exceptions for essential businesses and organizations; these cases are still coded as a restriction applicable to the general public.
    - GathRestrict500: Restriction of any gathering exceeding 500 persons; coding followed the "GathRestrict" criteria and used the same coding approach as "GathRestrict1000" (i.e., considered a restriction if applicable to the general public).
    - GathRestrict250: Restriction of any gathering exceeding 250 persons; coding followed the same criteria as above.
    - GathRestrict100: Restriction of any gathering exceeding 100 persons; coding followed the same criteria as above.
    - GathRestrict50: Restriction of any gathering exceeding 50 persons; coding followed the same criteria as above.
    - GathRestrict25: Restriction of any gathering exceeding 25 persons; coding followed the same criteria as above.
    - GathRestrict10: Restriction of any gathering exceeding 10 persons; coding followed the same criteria as above.
    - GathRestrict5: Restriction of any gathering exceeding 5 persons; coding followed the same criteria as above.
    - SchoolClose: Formal closing of (at minimum) public schools. Where possible, additional information on types of school closings are provided in "PolicyCodingNotes".
    - RestaurantRestrict: Restriction or limitation of restaurants and other venues where food is consumed on-premises. Coding a case as a restriction requires a formal restriction on operations (e.g., offsite consumption only, limiting services to only take-away, delivery, or curbside drop-off) or mandate for substantially reducing operations (e.g., restaurant closure must occur unless 10 or fewer patrons are dining at a time).
    - OtherBusinessClose: Mandate to fully close operations of any category of business. Coding a case as an other business closure requires the executive order to use phrasing indicative of a mandate (e.g., "casinos must close", "operations at fitness centers and entertainment venues must cease by date"). A given state may have multiple cases of other business closures as they often occurred in phases (e.g., fitness centers and gyms on March 13, 2020; casinos and entertainment venues on March 15, 2020; personal service businesses like barbers and nail salons on March 19, 2020); thus, where possible, separate entries are provided for each mandate.
    - NEBusinessClose: Mandate to close all non-essential businesses. Coding a case as a closure order requires the executive order to use phrasing indicative of a mandate (e.g., "non-essential businesses are required to close", "non-essential businesses must cease operations by date"). Coding does not distinguish among states' classification of essential versus non-essential businesses, as they vary substantially by state.
    - StayAtHome: Mandate for individuals to stay at home for all non-essential activities. Coding a case as a stay-at-home order mandate requires the executive order to use phrasing indicative of a mandate (e.g., "must stay at home"); otherwise it is coded as 0 for the "Mandate" variable if it uses advisory phrasing. Coding does not distinguish among states' classification of essential versus non-essential activities, as they vary substantially by state. Shelter-in-place and stay-at-home orders are considered to be equivalent.
    - StateCurfew: Mandates specific curfews at which residents are not to be outside their homes unless performing essential activities, as defined by the state. Coding a case as a state curfew requires specific curfew times (thus "stay-at-home" mandates were not considered curfews).
    - Quarantine: Quarantines mandated for people entering the state, requiring a period of self-isolation. Quarantines may be imposed on all people entering the state, out-of-state residents, or travelers from a particular state or city. Quarantine length and who is covered by the policy can be found in the "PolicyNotes" variable. This policy type was added April 3, 2020.
    - TravelRestrictIntra: Restrictions on travel within the state. These restrictions can be between cities or counties or within them. The "StateWide" variable reflects whether these restrictions are applicable across the state (coded as 1) or only for local areas (coded as 0). This policy type was added April 3, 2020.
    - TravelRestrictExit: Policies which prohibit residents of a state from leaving the state. These policies may have exceptions for essential businesses. This policy type was added April 3, 2020.
    - TravelRestrictEntry: Travel restriction mandates that limit non-residents from entering a given state. These policies may have exceptions or exemptions for essential businesses or their employees, and they may include restrictions for commercial lodging for non-residents. This policy type was added April 8, 2020.
  • Mandate: Binary variable indicating whether the policy applied is a mandate (1) or is advisory or a recommendation (0). This is coded on the basis of the order's phrasing (e.g., "residents are advised to stay at home and avoid unnecessary travel" would be coded as 0 for mandate as a "StayAtHome" policy). This variable was added on March 30, 2020.
  • DateIssued: Date of policy issuance. The date of signing of the policy document (e.g., executive order) was used wherever possible. Format is YYYYMMDD (e.g., March 16, 2020 is 20200316). Entries are not currently included for most non-statewide policies; this documentation is in-progress.
  • DateEnacted: Date of policy enactment: the date of when the policy would be enforced, per descriptions available in policy documents. The format is YYYYMMDD. Entries are not currently included for most non-statewide policies; this documentation is in-progress.
  • DateExpiry: Date of policy expiry, if or as provided in the policy issuance or executive order. This date is meant to reflect when the policy or order would be in effect until or unless additional action is taken to extend, amend, or halt its status. The format is YYYYMMDD. This documentation is in-progress, as it was added on March 29, 2020 as a variable of interest.
  • DateEnded: Date the policy is ended. This date is meant to reflect when a policy is ended, particularly if it is halted or reversed prior to its expiry date. The format is YYYYMMDD. This documentation is in-progress.
  • PolicyCodingNotes: Coder notes. Information on specific businesses closed, type of emergency declaration, potential exceptions, etc., are provided here.
  • PolicySource: Currently available source for each policy issued. Sourcing by hard-copy PDF versus hyperlinks is in-progress.
  • StateWide: Binary variable indicating whether the policy applied statewide (1) or for local areas (0).
  • LastUpdated: Date of last update for the given state-policy observation. The format is YYYYMMDD.
  • LastUpdatedNotes: Coder notes on last updates. This reflects notable changes since the last update, especially if a date has been recoded (e.g., switching to coding orders enacted at 11:59 pm on date1 to date1+1 for its enactment timing) or the "StatePolicy" type has been amended (e.g., some initial coding of "NEBusinessClose" policies were applicable to non-essential in-person retail businesses only, not all non-essential businesses as defined by state).
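
As a rough illustration of working with these fields, the sketch below parses the YYYYMMDD date format and selects stay-at-home mandates using the documented `StatePolicy` and `Mandate` columns. The rows are invented placeholders (the states "AA" and "BB" and all dates are not real policy data), and only a few of the documented columns are included.

```r
# Invented placeholder rows using column names from the data dictionary above.
policies <- data.frame(
  StatePostal = c("AA", "AA", "BB"),
  StatePolicy = c("EmergDec", "StayAtHome", "StayAtHome"),
  Mandate = c(1, 1, 0),
  DateEnacted = c(20200229, 20200323, 20200320),
  stringsAsFactors = FALSE
)

# YYYYMMDD values may be read as numbers, so convert to character first.
policies$DateEnacted <- as.Date(as.character(policies$DateEnacted),
                                format = "%Y%m%d")

# Keep only stay-at-home policies coded as mandates (Mandate == 1).
stay_home_mandates <- policies[policies$StatePolicy == "StayAtHome" &
                                 policies$Mandate == 1, ]
print(stay_home_mandates)
```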

[New Data Resource] Keystone Strategy NPI catalog for US

Describe the data resource you'd like to add
Our dataset on non-pharmaceutical interventions, including school closings, public venue closings, etc., by county

  • US
  • county-level
  • non-geospatial
  • curated, updated

Access details

Additional context
Came from SafeGraph datasets channel in slack.

[BUG] `covidtracker_data()` fails

Describe the bug
covidtracker_data() fails to download the expected data. I think the cause is that they started versioning their API, so the correct URL to the csv is now "http://covidtracking.com/api/v1/states/daily.csv"

To Reproduce
covidtracker_data() returns the following error:

Error: Can't subset columns that don't exist.
x The columns `date`, `state`, `positive`, `negative`, `pending`, etc. don't exist.

Output of sessionInfo

R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plotly_4.9.2               sars2pack_0.0.47           sf_0.9-2                   R0_1.2-6                  
 [5] MASS_7.3-51.6              geofacet_0.1.10            tidyquant_1.0.0            quantmod_0.4-15           
 [9] TTR_0.23-5                 PerformanceAnalytics_2.0.4 xts_0.11-2                 zoo_1.8-6                 
[13] lubridate_1.7.8            ggplot2_3.3.0              purrr_0.3.4                tidyr_1.0.2               
[17] dplyr_0.8.5                Cairo_1.5-10               knitr_1.24                

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1      ellipsis_0.3.0        class_7.3-17          base64enc_0.1-3       rstudioapi_0.10      
  [6] npsurv_0.4-0          MatrixModels_0.4-1    ggrepel_0.8.1         bit64_0.9-7           fansi_0.4.1          
 [11] xml2_1.3.1            splines_3.6.3         lsei_1.2-0            jsonlite_1.6.1        mcmc_0.9-7           
 [16] dbplyr_1.4.2          png_0.1-7             rgeos_0.5-2           shiny_1.4.0.2         BiocManager_1.30.10  
 [21] readr_1.3.1           compiler_3.6.3        httr_1.4.1            lazyeval_0.2.2        assertthat_0.2.1     
 [26] Matrix_1.2-18         fastmap_1.0.1         cli_2.0.2             later_1.0.0           htmltools_0.4.0      
 [31] quantreg_5.54         tools_3.6.3           coda_0.19-3           gtable_0.3.0          glue_1.4.0           
 [36] reshape2_1.4.3        rappdirs_0.3.1        Rcpp_1.0.4.6          cellranger_1.1.0      imguR_1.0.3          
 [41] webdriver_1.0.5       vctrs_0.2.4           countrycode_1.1.2     debugme_1.1.0         crosstalk_1.0.0      
 [46] xfun_0.8              stringr_1.4.0         ps_1.3.0              openxlsx_4.1.4        rvest_0.3.5          
 [51] mime_0.7              lifecycle_0.2.0       scales_1.0.0          BiocStyle_2.14.4      hms_0.5.0            
 [56] promises_1.1.0        coarseDataTools_0.6-5 SparseM_1.77          yaml_2.2.1            curl_4.3             
 [61] memoise_1.1.0         gridExtra_2.3         incidence_1.7.1       stringi_1.4.3         RSQLite_2.1.2        
 [66] highr_0.8             e1071_1.7-3           zip_2.0.4             rlang_0.4.5           pkgconfig_2.0.3      
 [71] geogrid_0.1.1         evaluate_0.14         lattice_0.20-41       labeling_0.3          htmlwidgets_1.5.1    
 [76] bit_1.1-15.2          tidyselect_1.0.0      processx_3.4.1        showimage_1.0.0       plyr_1.8.4           
 [81] magrittr_1.5          R6_2.4.1              generics_0.0.2        DBI_1.1.0             pillar_1.4.3         
 [86] withr_2.1.2           fitdistrplus_1.0-14   units_0.6-4           survival_3.1-12       sp_1.3-1             
 [91] tibble_3.0.1          crayon_1.3.4          Quandl_2.10.0         KernSmooth_2.23-17    utf8_1.1.4           
 [96] BiocFileCache_1.10.2  rmarkdown_2.1         jpeg_0.1-8.1          rnaturalearth_0.1.0   grid_3.6.3           
[101] readxl_1.3.1          data.table_1.12.6     blob_1.2.1            callr_3.3.1           digest_0.6.25        
[106] classInt_0.4-1        xtable_1.8-4          httpuv_1.5.2          MCMCpack_1.4-6        munsell_0.5.0        
[111] viridisLite_0.3.0     EpiEstim_2.2-1        quadprog_1.5-7

[New Data Resource] Florida DOH line list

Describe the data resource you'd like to add
Florida up-to-date line list for all deaths and cases.

Access details

Additional context
Add any other context or screenshots about the data request here.

Please consider a pull request in addition to or in place of this feature request.

Add column descriptions to imported datasets.

> x
    a b
1   1 A
2   2 B
3   3 C
4   4 D
5   5 E
6   6 F
7   7 G
8   8 H
9   9 I
10 10 J
> x$a
 [1]  1  2  3  4  5  6  7  8  9 10
> attr(x$a,'description') = 'This is the description'
> x
    a b
1   1 A
2   2 B
3   3 C
4   4 D
5   5 E
6   6 F
7   7 G
8   8 H
9   9 I
10 10 J
> x$a
 [1]  1  2  3  4  5  6  7  8  9 10
attr(,"description")
[1] "This is the description"
> b = dplyr::filter(x, a>5)
> attr(b$a,'description')
[1] "This is the description"
> b
   a b
1  6 F
2  7 G
3  8 H
4  9 I
5 10 J
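
Building on the transcript above, a small helper could collect the per-column `description` attributes into a single lookup. This is only a sketch: `column_descriptions` is a hypothetical name and not an existing sars2pack function, and the descriptions are placeholder text.

```r
x <- data.frame(a = 1:3, b = c("A", "B", "C"), stringsAsFactors = FALSE)
attr(x$a, "description") <- "Example integer column"
attr(x$b, "description") <- "Example letter column"

# Hypothetical helper: return a named character vector of column descriptions,
# with NA for columns that have no 'description' attribute.
column_descriptions <- function(df) {
  vapply(df, function(col) {
    d <- attr(col, "description", exact = TRUE)
    if (is.null(d)) NA_character_ else d
  }, character(1))
}

descs <- column_descriptions(x)
print(descs)
```

Whether attributes survive a given transformation depends on the function used, so it may be worth checking after each processing step.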

[New Data Resource]: http://us-covid19-vulnerability.socialprogress.org/

Describe the data resource you'd like to add
Include:

| Dimension | Indicator | Weighting Factor | Source |
|---|---|---|---|
| Population Demographics | Population aged 80+, percentage of total pop. | 16 | 2018 American Community Survey, 5-year data |
| | Population aged 70-79, percentage of total pop. | 9 | 2018 American Community Survey, 5-year data |
| | Population aged 60-69, percentage of total pop. | 4 | 2018 American Community Survey, 5-year data |
| | Population density per 10,000 | 1 | 2018 American Community Survey, 5-year data |
| | Nursing home population per 1,000 | 1 | Homeland Infrastructure Foundation-Level Data |
| | Prison population per 1,000 | 1 | Homeland Infrastructure Foundation-Level Data |
| Underlying Health Issues | High blood pressure prevalence in adults 18+ | 7 | 2018 CDC 500 Cities |
| | Cancer prevalence in adults 18+ | 7 | 2018 CDC 500 Cities |
| | Asthma prevalence among adults 18+ | 1 | 2018 CDC 500 Cities |
| | Coronary heart disease prevalence in adults 18+ | 12 | 2018 CDC 500 Cities |
| | COPD prevalence among adults 18+ | 7 | 2018 CDC 500 Cities |
| | Active smoking in adults 18+ | 1 | 2018 CDC 500 Cities |
| | Diabetes prevalence in adults 18+ | 8 | 2018 CDC 500 Cities |
| Health Infrastructure | Percentage of pop. 18-64 with no health insurance | 1 | 2018 CDC 500 Cities |
| | Number of hospital beds per 1,000 pop. within 25km radius | 1 | Homeland Infrastructure Foundation-Level Data |
| | Number of urgent care facilities per 1,000 pop. within 25km radius | 1 | Homeland Infrastructure Foundation-Level Data |
| | Ratio of the population to primary care physicians | 1 | 2019 County Health Rankings and Roadmaps |

Access details

Additional context
https://covid19risk.eastus.cloudapp.azure.com/#tab-7292-3

Use BiocFileCache() for caching

Looks like this will work; nice improvements since I last looked.

For all the _data functions, use a layer of BiocFileCache between the user and the download.

Need to check what resources have viable metadata (to see that expiration occurs in a timely manner). Consider adding a "default timeout" option for these resources, as many are updated one or more times per day.
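
The "default timeout" idea can be sketched without any caching package. The example below is plain base R and does not use the BiocFileCache API: it treats a cached file as stale once it is older than a configurable age and only then calls the download function again. The names `cache_is_stale`, `cached_fetch`, and `max_age_hours` are hypothetical, and `fake_fetch` stands in for a real download.

```r
# Hypothetical helper: a cached file is stale if missing or older than max_age_hours.
cache_is_stale <- function(path, max_age_hours = 12, now = Sys.time()) {
  if (!file.exists(path)) return(TRUE)
  age <- as.numeric(difftime(now, file.mtime(path), units = "hours"))
  age > max_age_hours
}

# Hypothetical helper: refresh the cache only when it is stale, then read it.
cached_fetch <- function(path, fetch_fun, max_age_hours = 12) {
  if (cache_is_stale(path, max_age_hours)) {
    writeLines(fetch_fun(), path)
  }
  readLines(path)
}

# Stand-in for a real download; counts how often it is called.
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  "date,count"
}

cache_file <- tempfile(fileext = ".csv")
first <- cached_fetch(cache_file, fake_fetch)   # downloads
second <- cached_fetch(cache_file, fake_fetch)  # served from the cache
```

A per-resource `max_age_hours` would let frequently updated sources expire sooner than static ones.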

us_state_distancing_policy format changed

Describe the bug

test failure

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (???): us_state_distancing_policy column names match ────────────────
`cnames` not equal to names(dsets[[dset]]$columns).
Lengths differ: 43 is not 39
── Failure (???): us_state_distancing_policy column types match ────────────────
`ctypes` not equal to dsets[[dset]]$columns.
Length mismatch: comparison on first 39 components
Component "curfew": 1 string mismatch
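
When an upstream format changes like this, a report of which columns were added, removed, or changed type can be more informative than a length mismatch. Below is a minimal sketch of such a comparison in base R. The helper `compare_schema` and the expected-type vector are hypothetical and do not reflect the package's actual test code; apart from `curfew`, which appears in the failure above, the column names are invented.

```r
# Hypothetical helper: compare a data frame against expected column types.
compare_schema <- function(df, expected_types) {
  observed <- vapply(df, function(col) class(col)[1], character(1))
  shared <- intersect(names(observed), names(expected_types))
  list(
    added = setdiff(names(observed), names(expected_types)),
    removed = setdiff(names(expected_types), names(observed)),
    type_changed = shared[observed[shared] != expected_types[shared]]
  )
}

# Expected catalog entry (illustrative) versus a freshly downloaded frame.
expected <- c(state = "character", curfew = "logical")
downloaded <- data.frame(state = "AA", curfew = "yes", new_policy = TRUE,
                         stringsAsFactors = FALSE)

schema_diff <- compare_schema(downloaded, expected)
print(schema_diff)
```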

Document how to add a new datasource.

  • Caching using biocfilecache
  • Documentation style
  • Data catalog entry
  • Basic testing
    • URL exists
    • column names
    • sanity checks
    • data type checks on columns like date and geometry
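
The sanity and data-type checks in the list above could look roughly like the following base-R sketch. It assumes a dataset with `date` and `count` columns, as in the ECDC example earlier on this page; the function name `check_dataset` and the specific rules are illustrative assumptions, not the package's test suite.

```r
# Hypothetical checker: returns a character vector of problems (empty if none).
check_dataset <- function(df) {
  problems <- character(0)
  if (!inherits(df$date, "Date")) {
    problems <- c(problems, "date is not of class Date")
  } else if (any(df$date > Sys.Date(), na.rm = TRUE)) {
    problems <- c(problems, "date contains future dates")
  }
  if (!is.numeric(df$count)) {
    problems <- c(problems, "count is not numeric")
  } else if (any(df$count < 0, na.rm = TRUE)) {
    problems <- c(problems, "count contains negative values")
  }
  problems
}

good <- data.frame(date = as.Date("2020-07-06"), count = 10)
bad <- data.frame(date = "2020-07-06", count = -1, stringsAsFactors = FALSE)

good_problems <- check_dataset(good)
bad_problems <- check_dataset(bad)
print(bad_problems)
```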

[BUG] min_bic doesn't always converge--causes CI to time out.

Describe the bug
When running the CI on GH actions, runs time out (intermittently) when the min_bic example runs.

Expected behavior
Have min_bic either gracefully terminate after some number of iterations, or recode....

I have disabled the example for now. It needs to be re-enabled once this is fixed.
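As a generic illustration of "gracefully terminate after some number of iterations", the sketch below caps the iterations of `stats::optim()` and inspects its convergence code. It makes no claim about how min_bic is implemented; the Rosenbrock objective and the limit of 20 are arbitrary choices for the example.

```r
# Rosenbrock function: a standard, slow-to-converge test objective.
rosenbrock <- function(p) {
  100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
}

# Cap the number of iterations so the call always returns.
fit <- optim(c(-1.2, 1), rosenbrock, method = "Nelder-Mead",
             control = list(maxit = 20))

# optim() sets convergence to 1 when the iteration limit was reached.
if (fit$convergence != 0) {
  message("optimizer stopped before converging (code ", fit$convergence, ")")
}
```

With a cap in place, an example that fails to converge returns quickly with a non-zero code instead of running until the CI job times out.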

[New Data Resource] ACAPS Government Measures dataset

Describe the data resource you'd like to add
Include:
The COVID19 Government Measures Dataset puts together all the measures implemented by governments worldwide in response to the Coronavirus pandemic. Data collection includes secondary data review. The researched information available falls into five categories:

Social distancing
Movement restrictions
Public health measures
Social and economic measures
Lockdowns

Each category is broken down into several types of measures.

ACAPS consulted sources from governments, media, the United Nations, and other organisations.

Access details

How to enforce the data update?

Thanks for the great package.
I cannot figure out how to update the data.
I checked that the data have been updated on usafacts.org, but I do not get the updated data either by rerunning usa_facts_data() or by reinstalling the package.

cdc_excess_deaths dates may have changed format

Hi Sean,

I love this package! I noticed tonight that when I call cdc_excess_deaths, the dates aren't parsing correctly unless I use ymd rather than mdy on the "date" (Week Ending Date) column. Looks like the CDC's format might have changed on you?

Thanks again for making covid data analysis with R so simple,
--Nancy
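A parser that tries more than one date format would tolerate this kind of upstream change. The following is a minimal base R sketch, not the package's code: `parse_week_ending()` is a hypothetical helper, and the two formats are the base R equivalents of `ymd` and `mdy` with assumed separators.

```r
# Hypothetical helper: try each format in turn and keep the first one that
# parses every non-missing value.
parse_week_ending <- function(x, formats = c("%Y-%m-%d", "%m/%d/%Y")) {
  for (fmt in formats) {
    parsed <- as.Date(x, format = fmt)
    if (!anyNA(parsed[!is.na(x)])) {
      return(parsed)
    }
  }
  stop("none of the supplied formats parsed all dates")
}

# Both spellings of the same day give the same Date.
parse_week_ending(c("2020-03-28", "2020-04-04"))
parse_week_ending(c("03/28/2020", "04/04/2020"))
```

Stopping with an error when no format fits makes a further format change fail loudly rather than silently producing missing dates.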

Create data resource overview

Include in a table format:

  • column names
  • dimensions
  • column descriptions
  • ? example data
  • description
  • source
  • relationships (could do this programmatically by looking for columns that have similar names and/or values between datasets)
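The programmatic "relationships" idea can be sketched by intersecting column names for every pair of datasets. This is only an illustration with toy data frames: `shared_columns()` is a hypothetical helper, it compares exact names only (not values or similar names), and it expects a named list of at least two datasets.

```r
# Hypothetical helper: for every pair of datasets, list the column names
# they have in common. Pairs with nothing in common are dropped.
shared_columns <- function(datasets) {
  pairs <- combn(names(datasets), 2, simplify = FALSE)
  out <- lapply(pairs, function(p) {
    intersect(names(datasets[[p[1]]]), names(datasets[[p[2]]]))
  })
  names(out) <- vapply(pairs, paste, character(1), collapse = " ~ ")
  out[lengths(out) > 0]
}

# Toy datasets with made-up columns.
toy_sets <- list(
  cases    = data.frame(date = as.Date("2020-04-01"), fips = "01001", count = 1L),
  mobility = data.frame(date = as.Date("2020-04-01"), fips = "01001", index = 0.5),
  beds     = data.frame(fips = "01001", icu_beds = 6L)
)
shared <- shared_columns(toy_sets)
shared
```

The shared names are candidate join keys and could fill the relationships column of the overview table.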

[BUG]

Describe the bug
A clear and concise description of what the bug is.

To Reproduce

Ideally, submit a reproducible example with code and error. If that is not possible, submit what you can.
I can't install the sars2pack package; I have a screenshot of the error.
Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.
error

Output of sessionInfo
Type sessionInfo() into your R session and paste the results here surrounded by '```' above and below.

Additional context
Add any other context about the problem here.

[New Data Resource] World Population Prospects from UN

Describe the data resource you'd like to add

The 2019 Revision of World Population Prospects is the twenty-sixth round of official United Nations population estimates and projections that have been prepared by the Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat.

These CSV files are encoded in UTF-8 and all of them have the following columns:

LocID (numeric): numeric code for the location; for countries and areas, it follows the ISO 3166-1 numeric standard
Location (string): name of the region, subregion, country or area
VarID (numeric): numeric code for the variant
Variant (string): projection variant name (Medium is the most used); for more information see Definition of projection variants
Time (string): label identifying the single year (e.g. 1950) or the period of the data (e.g. 1950-1955)
MidPeriod (numeric): numeric value identifying the mid period of the data, with the decimal representing the month (e.g. 1950.5 for July 1950)
Use the LocID, VarID and Time columns to link the data across the different files, if necessary. Note that Time differs between single year (e.g. 1950) and period (e.g. 1950-1955) data.
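Linking two of these files on the three key columns could look like the sketch below. The data frames are toy stand-ins with made-up values, and the measure columns `PopTotal` and `TFR` are assumptions for illustration. Only LocID, VarID and Time come from the description above.

```r
# Toy stand-ins for two WPP files (values are made up for illustration).
population <- data.frame(
  LocID = c(4, 4), VarID = c(2, 2), Time = c("1950", "1951"),
  PopTotal = c(100, 102), stringsAsFactors = FALSE
)
fertility <- data.frame(
  LocID = c(4, 4), VarID = c(2, 2), Time = c("1950", "1950-1955"),
  TFR = c(7.4, 7.5), stringsAsFactors = FALSE
)

# Inner join on the three key columns. Time is kept as a string so that
# single-year and period labels are never confused.
linked <- merge(population, fertility, by = c("LocID", "VarID", "Time"))
linked
```

Only the row with Time equal to "1950" appears in both toy tables, so the period row "1950-1955" does not match any single-year row.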

Access details

Additional context
Add any other context or screenshots about the data request here.

Please consider a pull request in addition to or in place of this feature request.
