csodata's Introduction

csodata

An R package for downloading CSO data.

The csodata package allows for easily downloading CSO (Central Statistics Office, the statistics agency of Ireland) PxStat data into R. It also includes multiple functions for examining the metadata of CSO tables, as well as a function to download geographic data in the ESRI vector format from the CSO website.

PxStat is the Central Statistics Office’s (CSO) online database of Official Statistics. This database contains current and historical data series compiled from CSO statistical releases. The CSO PxStat Application Programming Interface (API), which this package uses, provides access to PxStat data in JSON-stat format. This dissemination tool gives developers machine-to-machine access to CSO PxStat data.

References

Graeme Walsh (2018). statbanker R package version 6.2.0. For inspiration and code used for reshaping tables.

csodata's People

Contributors

conor-crowley, jamesor19, olivroy, vytashub

csodata's Issues

error while running install_github("CSOIreland/csodata")

Hello,

I am getting the following error when trying to install from GitHub.

All the best.

Best regards,

─ preparing 'csodata':
✔ checking DESCRIPTION meta-information ...
E checking vignette meta-information ...
Output(s) listed in 'build/vignette.rds' but not in package:
'inst/doc/quick_start_guide.html'
Run R CMD build without --no-build-vignettes to re-create
Error: Failed to install 'csodata' from GitHub:
! System command 'Rcmd.exe' failed
In addition: Warning messages:
1: In untar2(tarfile, files, list, exdir, restore_times) :
skipping pax global extended headers
2: In untar2(tarfile, files, list, exdir, restore_times) :
skipping pax global extended headers

Issue with pivot_format = "tidy" in cso_get_data()

Using the pivot_format = "tidy" parameter causes an error in cso_get_data().

The problem would appear to be with the line

data <- tidyr::pivot_wider(data, names_from = Statistic ,values_from = value)

as the variable value has been renamed to Value earlier.
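A minimal sketch of the likely fix, assuming the column really is named Value by the time the pivot runs. The toy data frame is hypothetical and only mimics a long-format table with Statistic and Value columns; it is not taken from the package.

```r
library(tidyr)

# Hypothetical long-format data (illustrative names and numbers only).
data <- data.frame(
  Statistic = c("Stat A", "Stat B", "Stat A", "Stat B"),
  Year      = c(2020, 2020, 2021, 2021),
  Value     = c(100, 10, 110, 12)
)

# values_from must match the renamed column (Value, not value).
wide <- tidyr::pivot_wider(data, names_from = Statistic, values_from = Value)
print(wide)
```

With values_from = value the call fails here because no column of that name exists, which matches the behaviour described above.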

Warning when downloading table using cso_get_table()

A warning periodically appears when using cso_get_table(). It is believed to be linked to the cache, but this is still being investigated.

In addition: Warning messages:
1: In normalizePath(path.expand(path), winslash, mustWork) :
path[1]="path/to/ RCache": The system cannot find the file specified
2: In normalizePath(path.expand(path), winslash, mustWork) :

Issue with windows file paths when flush_cache=TRUE

The flush_cache parameter appears not to work on Linux for the functions cso_get_data(), cso_get_geo() and cso_get_toc().

These functions contain the code paste0(R.cache::getCacheRootPath(),"\\csodata") when a call to file.path() might be preferable.
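The difference can be sketched as follows. The cache root here is a hypothetical stand-in under tempdir(); the package itself obtains it from R.cache::getCacheRootPath().

```r
# Hypothetical cache root used only for this illustration.
cache_root <- file.path(tempdir(), "R.cache")

# Hard-coded Windows separator: on Linux the backslash becomes part of the
# directory name instead of separating two path components.
windows_only <- paste0(cache_root, "\\csodata")

# file.path() joins components with "/", which R accepts on every platform.
portable <- file.path(cache_root, "csodata")

dir.create(portable, recursive = TRUE, showWarnings = FALSE)
print(windows_only)
print(portable)
```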

Issue with date-time format in cso_get_toc()

The LastModified column of the data.frame returned by cso_get_toc() contains some NA values. (All the values are NA when run on a Linux machine.)

The 'Z' in the format = argument below may be the problem.

tbl3$LastModified <- as.POSIXct(tbl3$LastModified, format = "%Y-%m-%dT%H:%M:%SZ")

When the 'Z' is removed (format = "%Y-%m-%dT%H:%M:%S"), all the values appear in the data.frame.
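A small sketch of why the literal 'Z' can produce NA values. The timestamp strings are hypothetical examples, not values taken from the API; the point is only that a literal character in the format must match the input at that position, while trailing input after the format is ignored.

```r
# Hypothetical timestamps in three plausible shapes.
stamps <- c(
  "2023-01-15T11:00:00Z",      # 'Z' directly after the seconds
  "2023-01-15T11:00:00.000Z",  # fractional seconds before the 'Z'
  "2023-01-15T11:00:00"        # no 'Z' at all
)

# The literal 'Z' must follow the seconds, otherwise parsing fails.
with_z <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Without the 'Z', anything after the seconds is ignored.
without_z <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

print(data.frame(stamps, with_z = is.na(with_z), without_z = is.na(without_z)))
```

Setting tz = "UTC" explicitly is an assumption made here so that the parsed times do not depend on the local time zone.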

Issue with include_ids parameter in cso_get_data()

Setting include_ids = TRUE causes an error in cso_get_data().

Possibly:

data_id <- rjstat::fromJSONstat(parse(json_data)$result, naming = "id", use_factors = use_factors)

should be changed back to:

data_id <- rjstat::fromJSONstat(json_data, naming = "id", use_factors = use_factors)

as in an earlier version of this package.

Very slow response from data.cso.ie

Hi,
Using the latest CRAN release, csodata 1.4.2.
Observed: requests frequently time out with the message:
"Warning: Failed retrieving table of contents. Please check internet connection and that data.cso.ie is online"
Loaded cached data

This is despite data.cso.ie being open in my browser and the internet connection working well.
Eventually it responds, but it can take several tries.

Issues with County Council geographic data from cso_get_geo()

There is an issue with the County Council geographic data from cso_get_geo().

The following code:
cso_get_geo_meta(cso_get_geo("cc"))

when run results in the following error

Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented,  : 
  Loop 50 is not valid: Edge 1391 has duplicate vertex with edge 1394

It is unclear whether the issue lies with cso_get_geo_meta() or with the sf data frame itself. This needs to be investigated.

Need to add new county council shape file to reflect boundary changes

In 2019 Cork city expanded its boundary. The updated shape files showing the change can be seen on the CSO website. This shape file cannot currently be retrieved through the cso_get_geo() function.

This is just one example of a change that I have noticed, but there may be others not reflected in the shape files currently offered through cso_get_geo().

cso_get_data not grabbing most recent data

A user downloading data received out-of-date results because PxStat had been updated while the package was still retrieving the old version from the cache. We need to investigate possibly emptying the cache and downloading fresh data at certain times.
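One possible approach, sketched under assumptions: treat a cached file as stale once it passes a maximum age and download again in that case. The helper cache_is_stale() and the 24-hour threshold are hypothetical and not part of csodata; a temporary file stands in for a cached table.

```r
# Hypothetical helper: TRUE when a cached file is missing or older than
# max_age_hours, based on its modification time.
cache_is_stale <- function(path, max_age_hours = 24) {
  if (!file.exists(path)) return(TRUE)
  age_hours <- as.numeric(
    difftime(Sys.time(), file.info(path)$mtime, units = "hours")
  )
  age_hours > max_age_hours
}

# A temporary file standing in for a cached table.
cached_file <- tempfile(fileext = ".rds")
saveRDS(data.frame(x = 1), cached_file)
fresh <- cache_is_stale(cached_file)  # checked right after writing

# Backdate the modification time by two days to simulate an old cache entry.
Sys.setFileTime(cached_file, Sys.time() - 2 * 24 * 60 * 60)
old <- cache_is_stale(cached_file)

print(c(fresh = fresh, old = old))
```

An age check only approximates "PxStat has been updated"; comparing against a last-modified timestamp from the table of contents would be more precise, if one is reliably available.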
