oecd's People

Contributors

briatte, expersso, lsmantiz

oecd's Issues

get_dataset fails in latest version of R (4.2.0)

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Package: OECD version 0.2.5 built on R 4.2.0.

Hi, an error is shown when running the following code in the latest R version 4.2.0.

Steps to reproduce:

OECD::get_dataset("EO",
                  pre_formatted = TRUE,
                  filter = "DNK+FIN+ISL+NOR+SWE+EA17.EXCHEB.A",
                  start_time = "1979",
                  end_time = "2021"
)

The error shown is:

Error in download.file(path, destfile, method, quiet, mode, ...) : 
  cannot open URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/TIMELY_IE/DNK+FIN+ISL+NOR+SWE+EA17.EXCHEB.A/all?startTime=1979&endTime=2021'
In addition: Warning message:
In download.file(path, destfile, method, quiet, mode, ...) :
  URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/TIMELY_IE/DNK+FIN+ISL+NOR+SWE+EA17.EXCHEB.A/all?startTime=1979&endTime=2021': 
status was 'SSL peer certificate or SSH remote key was not OK'

It seems to be an issue with download.file() in R 4.2.0; see, for example, the related discussion on Stack Overflow.

Full session info:

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=Swedish_Sweden.utf8  LC_CTYPE=Swedish_Sweden.utf8    LC_MONETARY=Swedish_Sweden.utf8
[4] LC_NUMERIC=C                    LC_TIME=Swedish_Sweden.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] openxlsx_4.2.5     zoo_1.8-10         lattice_0.20-45    OECD_0.2.5         compare_0.2-6      dkstat_0.08       
 [7] pxweb_0.13.1       lubridate_1.8.0    filesstrings_3.2.2 readxl_1.4.0       writexl_1.4.0      NSDB_0.1.6        
[13] pxR_0.42.4         RJSONIO_1.3-1.6    reshape2_1.4.4     forcats_0.5.1      stringr_1.4.0      purrr_0.3.4       
[19] readr_2.1.2        tidyr_1.2.0        tibble_3.1.7       ggplot2_3.3.6      tidyverse_1.3.1    dplyr_1.0.9       
[25] eurostat_3.7.10    plyr_1.8.7        

loaded via a namespace (and not attached):
 [1] httr_1.4.3         bit64_4.0.5        vroom_1.5.7        jsonlite_1.8.0     here_1.0.1         modelr_0.1.8      
 [7] assertthat_0.2.1   countrycode_1.4.0  cellranger_1.1.0   pillar_1.7.0       backports_1.4.1    glue_1.6.2        
[13] rvest_1.0.2        RefManageR_1.3.0   colorspace_2.0-3   pkgconfig_2.0.3    broom_0.8.0        haven_2.5.0       
[19] scales_1.2.0       tzdb_0.3.0         proxy_0.4-27       generics_0.1.2     ellipsis_0.3.2     withr_2.5.0       
[25] cli_3.3.0          magrittr_2.0.3     crayon_1.5.1       strex_1.4.2        fs_1.5.2           fansi_1.0.3       
[31] xml2_1.3.3         class_7.3-20       tools_4.2.0        hms_1.1.1          lifecycle_1.0.1    munsell_0.5.0     
[37] reprex_2.0.1       zip_2.2.0          compiler_4.2.0     e1071_1.7-11       rlang_1.0.3        classInt_0.4-7    
[43] grid_4.2.0         rstudioapi_0.13    regions_0.1.8      readsdmx_0.3.0     gtable_0.3.0       DBI_1.1.3         
[49] curl_4.3.2         R6_2.5.1           bit_4.0.4          utf8_1.2.2         rprojroot_2.0.3    KernSmooth_2.23-20
[55] stringi_1.7.6      parallel_4.2.0     Rcpp_1.0.8.3       vctrs_0.4.1        dbplyr_2.2.1       tidyselect_1.1.2  

incomplete data download in comparison to OECD.stat statistics

I am using the OECD package to download R&D expenditure statistics, in particular: Gross domestic expenditure on R&D by sector of performance and socio-economic objective (SEO), which can be accessed here: https://stats.oecd.org/Index.aspx?DataSetCode=GERD_SEO#

The data series on the OECD.stat site runs till 2019.

When I download the data with get_dataset("gerd_objective_nabs2007"), the above-mentioned database is downloaded, but the series stops at 2015 instead of 2019.

It is not clear to me whether this is related to the OECD package or to the OECD API. I hope you can clarify. If it is the API, could you please show me how to reproduce the request that is sent to the server, so that I can forward it to OECD.Stat and ask what is going on?
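
For readers who want to reproduce such a request by hand, the error messages quoted elsewhere in this tracker show the shape of the legacy request URL. The following is a minimal sketch that rebuilds it; the helper name legacy_oecd_url is hypothetical, and the fallback to "all" when no filter is given is an assumption about the package's behaviour rather than something confirmed in this thread.

```r
# Hypothetical helper: rebuild the legacy stats.oecd.org request URL
# in the form seen in the error messages quoted in other issues.
legacy_oecd_url <- function(dataset, filter = NULL,
                            start_time = NULL, end_time = NULL) {
  # Collapse a list filter into dot/plus notation,
  # e.g. list(c("DEU", "FRA"), "MW") becomes "DEU+FRA.MW"
  if (is.list(filter)) {
    filter <- paste(vapply(filter, paste, character(1), collapse = "+"),
                    collapse = ".")
  }
  # Assumption: an omitted filter is sent as "all"
  if (is.null(filter)) filter <- "all"

  url <- sprintf("https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/%s/%s/all",
                 dataset, filter)

  # Append only the time bounds that were actually supplied
  query <- c(startTime = start_time, endTime = end_time)
  if (length(query) > 0) {
    url <- paste0(url, "?", paste0(names(query), "=", query, collapse = "&"))
  }
  url
}

# Print the URL so it can be pasted into a browser or sent to the data provider
print(legacy_oecd_url("gerd_objective_nabs2007"))
```

Opening the printed URL directly in a browser may help separate a package problem from an API problem, since the package is then out of the loop.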

Max Data?

Many thanks for the awesome package.

I tried to pull the "TISP_EBOPS2010" data:
OECD <- get_dataset("TISP_EBOPS2010")

and I see that the object is 1,000,000 obs of 11 variables.
I quickly skimmed through the OECD API documentation to check whether there is any data limit or whether the exact 1,000,000 obs is a coincidence.

To me it doesn't seem like a coincidence, and I am not sure what limits the dataset. Is it the OECD API or a memory limitation?

Any ideas?

I don't want to filter anything as I would like to download the whole database and do the subset selection at a later stage.

thanks,
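
One possible workaround, if the cap turns out to apply per request (which this thread does not confirm), is to download the dataset in time windows and stack the pieces. The sketch below is illustrative only: year_chunks and download_in_chunks are hypothetical helpers, and the year range in the usage comment is a placeholder, not the actual coverage of TISP_EBOPS2010.

```r
# Split an inclusive year range into consecutive windows of `size` years
year_chunks <- function(start, end, size) {
  starts <- seq(start, end, by = size)
  ends <- pmin(starts + size - 1, end)
  data.frame(start_time = starts, end_time = ends)
}

# Download each window separately and stack the results.
# Assumes every window returns the same columns, so rbind() can combine them.
download_in_chunks <- function(dataset, start, end, size = 5) {
  chunks <- year_chunks(start, end, size)
  parts <- lapply(seq_len(nrow(chunks)), function(i) {
    OECD::get_dataset(dataset,
                      start_time = chunks$start_time[i],
                      end_time = chunks$end_time[i])
  })
  do.call(rbind, parts)
}

# Usage (placeholder years; requires network access):
# full <- download_in_chunks("TISP_EBOPS2010", 2000, 2015, size = 2)
```

If any single window still comes back with exactly 1,000,000 rows, the window is probably too wide and should be narrowed.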

using the API to access OECD microdata

Hello
I would very much like to access Donor Agency information in the CRS1 dataset using the OECD package via the OECD API.
Using EU Institutions as an example, if I go onto the OECD website and download disbursement data for EU Institutions there is a column 'Donor Agency' that is not included if I pull EU Institutions using the API (code 918). Is there a way to also include the Donor Agency information in the OECD package?
Many thanks

Running get_data_structure("DUR_D")

I follow the package instructions but get back the following:

ttt<-get_data_structure("DUR_D")
Error in data.frame(data_structure@concepts) :
trying to get slot "concepts" from an object (class "data.frame") that is not an S4 object

OECD package maintenance

Hi, there seems to be a change in the data structure, as some functions of the package are not working anymore (they were a few weeks ago).
For instance:
OECD::get_data_structure("SNA_TABLE11")
throws an error

Error in data.frame(data_structure@concepts) : 
  trying to get slot "concepts" from an object (class "data.frame") that is not an S4 object 

It doesn't look like a very heavy fix, but it would be nice to have it done.
By the way, some data fields have changed names, which is not really cool.

New get_dataset() not working for all series due to hardcoded 1.1

The new get_dataset() function for the new API has hardcoded 1.1 into the query URL, but some queries seem to require a different version number.

For example, I would like to fetch this series: https://data-explorer.oecd.org/vis?fs%5b0%5d=Topic%2C1%7CEconomy%23ECO%23%7CLeading%20indicators%23ECO_LEA%23&pg=0&fc=Topic&bp=true&snb=1&vw=tb&df%5bds%5d=dsDisseminateFinalDMZ&df%5bid%5d=DSD_STES%40DF_CLI&df%5bag%5d=OECD.SDD.STES&df%5bvs%5d=4.1&pd=%2C&dq=.M.LI...AA...H&ly%5brw%5d=TIME_PERIOD&ly%5bcl%5d=REF_AREA&to%5bTIME_PERIOD%5d=false&lo=10&lom=LASTNPERIODS

The data query for it should be:

https://sdmx.oecd.org/public/rest/data/OECD.SDD.STES,DSD_STES@DF_CLI,4.1/.M.LI...AA...H?startPeriod=2023-05

However, get_dataset() adds 1.1 after the series name "OECD.SDD.STES,DSD_STES@DF_CLI", even though it should be 4.1.

With 1.1, the query returns a 404.

The same issue applies to the structure query.
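
To illustrate the point, here is a minimal sketch of a URL builder that takes the version as a parameter instead of hardcoding it. The function name new_oecd_data_url and its arguments are hypothetical and are not part of the package; the URL layout is copied from the working query quoted above.

```r
# Hypothetical helper: build a data query for sdmx.oecd.org with an explicit
# dataflow version instead of a hardcoded "1.1".
new_oecd_data_url <- function(agency, dataflow, version,
                              filter = "", start_period = NULL) {
  # Dataflow reference in the form "agency,dataflow,version"
  flow_ref <- paste(agency, dataflow, version, sep = ",")
  url <- sprintf("https://sdmx.oecd.org/public/rest/data/%s/%s",
                 flow_ref, filter)
  if (!is.null(start_period)) {
    url <- paste0(url, "?startPeriod=", start_period)
  }
  url
}

# Reproduces the query quoted in this issue
print(new_oecd_data_url("OECD.SDD.STES", "DSD_STES@DF_CLI", "4.1",
                        filter = ".M.LI...AA...H",
                        start_period = "2023-05"))
```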

Problem with proxy config calling get_datasets

I'm trying to get this useful package up and running behind a corporate firewall, where we need to specify proxy server details - specifically the proxy server, and to use Windows authentication. I've tackled a few similar issues in the past, and have solutions for packages using RCurl or httr. However, I've been having trouble with the get_datasets() function (which looks like it uses curl, behind read_xml).

I can't see a way to pass through or pre-configure the proxy settings we need as it stands, but I might be able to see a workaround. Any suggestions or advice would be welcome!

To give specific examples:

  1. This code gives a timeout as expected, since no proxy server is specified:
library(OECD)
dataset_list = get_datasets()
  2. I can specify the server as follows, but then I get an HTTP 407 authentication error:
myProxyServer = "..."
Sys.setenv(HTTPS_PROXY = myProxyServer)
dataset_list = get_datasets()

This is also expected, since in our setting we need to specify proxyuserpwd = ":" to work with Windows authentication.

  3. Having looked at the code behind get_datasets, I've figured out that the following adjustment works successfully:
library(httr)
library(xml2)

set_config(config(
  proxy = myProxyServer,
  proxyuserpwd = ":"
))

url = "https://stats.oecd.org/RestSDMX/sdmx.ashx/GetKeyFamily/all"
datasets = read_xml(GET(url))

This way, I can use httr::set_config to configure the full range of curl options for httr, and then pass the result of the GET into read_xml. The rest of the code behind get_datasets works as before.

Can you suggest another way of making this function work with our proxy config, or would you consider making a change so the package uses an httr-based approach? For the moment, I'll create a little helper function with the code above - but obviously I'd prefer not to maintain this separately from the package.

Thanks!
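
The helper function mentioned above could look roughly like the sketch below. The name fetch_oecd_xml is hypothetical, and the proxy address in the usage comment is a placeholder; unlike set_config(), this variant passes the configuration per request, so nothing global is changed.

```r
# Hypothetical helper: fetch an XML document through httr so that any curl
# option (proxy, proxyuserpwd, ...) can be supplied per request.
fetch_oecd_xml <- function(url, proxy_config = httr::config()) {
  response <- httr::GET(url, proxy_config)
  # Turn HTTP errors (e.g. 407) into R errors instead of parsing an error page
  httr::stop_for_status(response)
  xml2::read_xml(httr::content(response, as = "text", encoding = "UTF-8"))
}

# Usage (placeholder proxy address; requires network access):
# cfg <- httr::config(proxy = "proxy.example.com:8080", proxyuserpwd = ":")
# datasets_xml <- fetch_oecd_xml(
#   "https://stats.oecd.org/RestSDMX/sdmx.ashx/GetKeyFamily/all", cfg
# )
```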

Error using OECD::get_data_structure(dataset = 'RS_GBL')

I am using the package OECD to extract data from the Global Revenue Statistics Database. I already know the id for this database but I get the following error:

library(OECD)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)

dataset_list <- OECD::get_datasets()

dataset_list |> 
  dplyr::filter(stringr::str_detect(title, 
                                    pattern='Global Revenue Statistics Database'))
#> # A tibble: 1 × 2
#>   id     title                             
#>   <chr>  <chr>                             
#> 1 RS_GBL Global Revenue Statistics Database

OECD::get_data_structure(dataset = 'RS_GBL')
#> Error in data_structure@concepts: no applicable method for `@` applied to an object of class "data.frame"

Created on 2023-07-18 with reprex v2.0.2

The information of my current R session is the following:

sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 11 x64 (build 22621)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/Bogota
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] stringr_1.5.0 dplyr_1.1.2   OECD_0.2.5   
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.3       cli_3.6.1         knitr_1.43        rlang_1.1.1      
#>  [5] xfun_0.39         stringi_1.7.12    purrr_1.0.1       styler_1.10.1    
#>  [9] generics_0.1.3    glue_1.6.2        htmltools_0.5.5   fansi_1.0.4      
#> [13] rmarkdown_2.23    R.cache_0.16.0    tibble_3.2.1      evaluate_0.21    
#> [17] fastmap_1.1.1     yaml_2.3.7        lifecycle_1.0.3   compiler_4.3.1   
#> [21] fs_1.6.2          pkgconfig_2.0.3   rstudioapi_0.15.0 R.oo_1.25.0      
#> [25] R.utils_2.12.2    digest_0.6.33     R6_2.5.1          tidyselect_1.2.0 
#> [29] utf8_1.2.3        reprex_2.0.2      pillar_1.9.0      magrittr_2.0.3   
#> [33] R.methodsS3_1.8.2 tools_4.3.1       withr_2.5.0

Created on 2023-07-18 with reprex v2.0.2

get_dataset returns error message

Dear Eric,

this package could be incredibly helpful for me, thanks for publishing it.

Unfortunately, the code at the bottom of my post returns an error when I try to run it. I'm using R-4.1.3 in RStudio. get_datasets() works fine, but get_dataset() returns the following error message:

> df <- get_dataset("PATS_REGION", filter = "PCT_A.INVENTORS.BEL+BE10.TOTAL+BIOTECH", pre_formatted = TRUE)
Error in download.file(path, destfile, method, quiet, mode, ...) : 
  cannot open URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/PATS_REGION/PCT_A.INVENTORS.BEL+BE10.TOTAL+BIOTECH/all'
In addition: Warning message:
In download.file(path, destfile, method, quiet, mode, ...) :
  URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/PATS_REGION/PCT_A.INVENTORS.BEL+BE10.TOTAL+BIOTECH/all': status was 'Couldn't connect to server'

Here's the script I tried to run:

library(OECD)
library(httr)
httr::set_config(httr::use_proxy("192.168.78.10", port = 8080, auth = "basic"))
datasets <- get_datasets()
df <- get_dataset("PATS_REGION", filter = "PCT_A.INVENTORS.BEL+BE10.TOTAL+BIOTECH", pre_formatted = TRUE)

As I mentioned, get_datasets() returns the expected data frame. I suspect there may have been a change in the API?

get_dataset() fails on request

Probably related to #7

Trying to run the example given in Alternative data-acquisition strategy

library(OECD)

df <- get_dataset("PATS_REGION",
                  filter = "PCT_A.INVENTORS.BEL+BE10.TOTAL+BIOTECH", 
                  pre_formatted = TRUE)

fails with

<XMLInputError: XML content does not seem to be XML: ''>

Problem first emerged Saturday (16.06.2018) evening.

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=Estonian_Estonia.1257  LC_CTYPE=Estonian_Estonia.1257    LC_MONETARY=Estonian_Estonia.1257 LC_NUMERIC=C                     
[5] LC_TIME=Estonian_Estonia.1257    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] OECD_0.2.2.999

loaded via a namespace (and not attached):
 [1] httr_1.3.1      compiler_3.4.4  rsdmx_0.5-11    plyr_1.8.4      R6_2.2.2        tools_3.4.4     RCurl_1.95-4.10 yaml_2.1.19     Rcpp_0.12.17   
[10] bitops_1.0-6    XML_3.98-1.11  

Get OBS_STATUS

Is there any way of getting the OBS_STATUS data downloaded when using get_dataset to download tables from OECD?

OECD pension statistics

I am struggling to download pension statistics using the examples available.

Here is what I want to do:

# download from OECD
dataset_id <- "PNN_new"

# get structure of the dataset and labels
dstruc <- get_data_structure(dataset_id) 

# 13 dimensions 
# id: APLANS --> Total, by pension plan type
# id: ADEF   --> Total, by definition type
# id: ACONT  --> Total all funds

# filters
#  - Investments in Bonds (VAR id: 1000)
#  - Investment in Bonds issued by public administration (VAR id: 1270)
#  - Mutual Funds --> Bills and Bonds (VAR id: 1220)

Given the filters above I am trying for example this:

filter_list<-list(c("DEU","FRA"), "APLANS", "ADEF", "ACONT", "1000")
get_dataset(dataset = dataset_id, filter = filter_list)

But it results in a bad request:

  HTTP request failed with status: 400 Bad Request


I also have a question about how to download all countries together.

I got the dataset I need only through the alternative strategy you describe, but it would be good to know how to make the more direct one work as well.

OECD Stats discontinuation March 2024

It has recently been announced that the OECD Stats database will be switched off by the end of March 2024.

What will the implications be for users of this package? Does this R package support the new OECD Data Explorer that is replacing OECD Stats?

Filter by name of variable and not just order

I might be misunderstanding, but it looks like the filters are applied to the variables in strict order, so (from the vignette):

df <- get_dataset(dataset = "DUR_D", filter = list(c("DEU", "FRA"), "MW", "2024"))

works, but if I only want to apply the last filter I have to write

df <- get_dataset(dataset = "DUR_D", filter = list(NULL, NULL, "2024"))

What would be nice is to use the names() of the filter list to avoid having to type this, e.g.

df <- get_dataset(dataset = "DUR_D", filter = list(AGE = "2024"))

It doesn't seem like this would be too hard, I guess it would require a call to get_data_structure() to get the order right.

You could even match arguments in ... so that a call could look like:

df <- get_dataset(dataset = "DUR_D", AGE = "2024")
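
A minimal sketch of the idea, for illustration only: given the dimension order, a named filter can be expanded into the positional dot/plus string. The helper named_filter_to_string is hypothetical, and the dimension order c("COUNTRY", "SEX", "AGE") is an assumption used for the example; in practice it would have to come from get_data_structure().

```r
# Hypothetical helper: expand a named filter list into the positional
# filter string, leaving unspecified dimensions empty.
named_filter_to_string <- function(named_filter, dim_order) {
  unknown <- setdiff(names(named_filter), dim_order)
  if (length(unknown) > 0) {
    stop("Unknown dimension(s): ", paste(unknown, collapse = ", "))
  }
  parts <- vapply(dim_order, function(dim) {
    # A dimension missing from the list yields NULL, which collapses to ""
    paste(named_filter[[dim]], collapse = "+")
  }, character(1))
  paste(parts, collapse = ".")
}

# Assumed dimension order for illustration
dims <- c("COUNTRY", "SEX", "AGE")
print(named_filter_to_string(list(AGE = "2024"), dims))
print(named_filter_to_string(
  list(AGE = "2024", COUNTRY = c("DEU", "FRA"), SEX = "MW"), dims
))
```

Because the lookup is by name, the order in which the caller lists the filters no longer matters.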

Error get_dataset()

Hi,

I am having the following issue trying to access the data through R:

library(OECD)
dataset <- get_dataset("DUR_D")

Error in function (type, msg, asError = TRUE) :
Unknown SSL protocol error in connection to stats.oecd.org:443

I saw an earlier post on an error using get_databases, but this seems to be something different.

I am using RStudio 1.1.447 on Windows. Any help will be greatly appreciated.

Thanks!!

Antonio

unsafe legacy renegotiation disabled

I installed from CRAN.

I get the following error when I try to download the list of datasets

> datasets <- get_datasets()
Error in curl::curl_fetch_memory(url, handle = handle) : 
  OpenSSL/3.1.4: error:0A000152:SSL routines::unsafe legacy renegotiation disabled

Issue when importing data with get_dataset using filter

I get the following error: "Bad Request (HTTP 400).Error in rsdmx::readSDMX(url) : HTTP request failed with status: 400" when trying to import data with get_dataset() and a filter given as a simple list with just one member. I used the following code:

dataset = "TABLE2A"

filter_list <- list(c("20001"))
df <- get_dataset(dataset = dataset, filter = filter_list)

Bad Request (HTTP 400).Error in rsdmx::readSDMX(url) : HTTP request failed with status: 400

Regional & Metropolitan vars

I was wondering if you could include in the vignette an example of filtering regional or metropolitan variables. For these, neither strategy seems to work. The SDMX query is several pages long and I cannot really work with it, and the other filtering method does not seem to work for me either. However, without effective filtering the data table exceeds the API limit.

Is there a way to access, for example, any of the regional GDP variables with the package?

Stats comes wrong

Hello,
I'm not sure if I'm doing this the right way, but I wanted to report the spurious results I'm getting with this great package, OECD.
I couldn't find the same numbers on the OECD website, so I don't really know whether the numbers I receive from get_dataset are wrong or the data on the site is wrong.
Looking at the results, I would say that for some reason the numbers for 2016, CHE (Switzerland), are padded with "000".
I can provide more info if needed, just let me know.

library(tidyverse)
library(OECD)
library(DT)

Search_for <- "migration" # type search word or term

search_dataset(Search_for, get_datasets()) %>%
DT::datatable()

dat_str <- get_data_structure("MIG") # check the structure of the data

dat_str[[3]] %>%
DT::datatable()

mig <- get_dataset("MIG",
start_time = 2007, end_time = 2016) # change time span for the data

mig %>%
filter(VAR =="B21",GEN == "TOT",CO2 != "TOT") %>%
filter(COU == "CHE",obsTime == 2016) %>%
group_by(CO2,obsValue,obsTime) %>%
summarise(people=sum(obsValue)) %>%
arrange(desc(people))

# A tibble: 152 x 4
# Groups: CO2, obsValue [152]
   CO2   obsValue obsTime   people
 1 DEU   14354000    2016 14354000
 2 ITA   11044000    2016 11044000
 3 FRA    8233000    2016  8233000
 4 PRT    5967000    2016  5967000
 5 ESP    3380000    2016  3380000
 6 POL    2775000    2016  2775000
 7 HUN    2585000    2016  2585000
 8 GBR    2062000    2016  2062000
 9 AUT    2035000    2016  2035000
10 ROU    1932000    2016  1932000
# ... with 142 more rows

Error using 'get_datasets()'

Hello, I am getting the following error when I use 'get_datasets()': Error in curl::curl_fetch_memory(url, handle = handle) : Failure when receiving data from the peer
This was working fine before Christmas, and I can't figure out what's different. Be grateful for your help.
Thanks, Kim

Documentation on start_time and end_time formats

When I need to download annual data, I can simply set the input parameters start_time and end_time in OECD::get_dataset() to integers. This is also shown in the examples provided. However, the documentation does not specify what to do with time units other than years. For example, consider the OECD dataset MEI_CLI, which has monthly data. An example SDMX data URL is

https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/MEI_CLI/LOLITOAA.FRA+DEU+ITA+ESP.M/all?startTime=2000-01&endTime=2019-12

In R, I would write

cli_filters <- list(
    "LOLITOAA",
    c("DEU", "FRA", "ESP", "ITA"),
    "M"
)
cli_raw <- get_dataset("MEI_CLI", filter=cli_filters)

What would I write as arguments for start_time and end_time? Reading the SDMX URL, I get the hint that I can write start_time="2000-01", in the format YYYY-MM. However, this is undocumented.

This issue is only meant to suggest improving the documentation.
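
As a sketch of what such documentation might cover, the hypothetical helper below formats a date as a period string. The monthly form YYYY-MM follows the URL quoted above; the quarterly form YYYY-Qn is an assumption based on common SDMX conventions and is not confirmed in this thread.

```r
# Hypothetical helper: format a date as a period string for start_time/end_time.
#   "A" gives "2000", "M" gives "2000-01" (as in the URL above),
#   "Q" gives "2000-Q1" (assumed convention, unverified here)
oecd_period <- function(date, freq = c("A", "Q", "M")) {
  freq <- match.arg(freq)
  date <- as.Date(date)
  year <- format(date, "%Y")
  month <- as.integer(format(date, "%m"))
  switch(freq,
    A = year,
    Q = paste0(year, "-Q", (month - 1) %/% 3 + 1),
    M = format(date, "%Y-%m")
  )
}

# Usage with the monthly dataset from this issue (requires network access):
# cli_raw <- get_dataset("MEI_CLI", filter = cli_filters,
#                        start_time = oecd_period("2000-01-01", "M"),
#                        end_time = oecd_period("2019-12-01", "M"))
```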

OECD Economic Outlook Vintages

I'm trying to access OECD EO vintages and cannot.

I run the following:

library("readxl")
library("OECD")
library("rsdmx")
data_test <- get_dataset("EO59_VINTAGE", filter = "AUT", start_time = 1975)

and get the following error message:
Error in rsdmx::readSDMX(url) :
HTTP request failed with status: 400 Bad Request

The code works when I use EO60_MAIN or any later publication. None of the vintages seems to be available, although when I look at the list of data sets I can access through the OECD library, the vintages are all listed. Any idea of what the issue is?

Thanks!

more help on accessing data

I just discovered your package. Very interesting. I've been able to replicate the examples in the vignette, but I'm having problems understanding how to get other data. For instance, I would like to access the Adult Unemployment Rate and the Youth Unemployment Rate for France and the United States. I tried the following, but it's not working. I think my problem is working out how to set up a filter. Any help appreciated: more examples in the vignette would be useful to me and probably to others. Thanks!

library("OECD")
dataset <- "AEO2012_CH6_FIG4"
dstruc <- get_data_structure(dataset)
dstruc$MEASURE
##     id                                       label
## 1  AUR                 Adult unemployment rate (%)
## 2  YUR                 Youth unemployment rate (%)
## 3 YUAU Youth unemployment / Adult unemployment (%)
filter <- list(c("USA", "FRA"), c("AUR", "YUR"))
df <- get_dataset(dataset = dataset, filter = filter)
Error in rsdmx::readSDMX(url) : 
  HTTP request failed with status: 400 Bad Request

Running example from read me returns error

Running your example returns an error:

> library(OECD)
> dataset <- "OECD.SDD.NAD,DSD_NAAG@DF_NAAG_I,1.0"
> filter <- "A.USA+EU.B1GQ_R_POP+B1GQ_R_GR.USD_PPP_PS+PC."
> df <- get_dataset(dataset, filter)
Error in download.file(path, destfile, method, quiet, mode, ...) : 
  cannot open URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/OECD.SDD.NAD,DSD_NAAG@DF_NAAG_I,1.0/A.USA+EU.B1GQ_R_POP+B1GQ_R_GR.USD_PPP_PS+PC./all'
In addition: Warning message:
In download.file(path, destfile, method, quiet, mode, ...) :
  cannot open URL 'https://stats.oecd.org/restsdmx/sdmx.ashx/GetData/OECD.SDD.NAD,DSD_NAAG@DF_NAAG_I,1.0/A.USA+EU.B1GQ_R_POP+B1GQ_R_GR.USD_PPP_PS+PC./all': HTTP status was '500 Internal Server Error'

Error in get_data_structure and get_dataset: no package called XML

Hi,

I tried to install and use this package using R 3.6.2, RStudio 1.2.5033.
When I tried to follow the vignette, I get errors when trying to use
get_data_structure and get_dataset:
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
there is no package called ‘XML’.

I searched CRAN but no package called XML seems to be currently available for installation. There is xml2 and XML2R. Can you give advice where I can find this missing package?

Best regards
KHrehova

Source for current CRAN version (0.2.5)

The version of OECD on CRAN is 0.2.5 (https://cran.r-project.org/web/packages/OECD/index.html), but this repo only has 0.2.4. The CRAN version also causes issue #24 due to the replacement of the rsdmx package with readsdmx in get_data_structure. Given the apparent speed benefits of readsdmx, the following function (or an equivalent in base R) could be used to generate the same result as v0.2.4:

source("https://raw.githubusercontent.com/expersso/OECD/master/R/main.R")

get_data_structure_fixed <- function(dataset) {
  url <- paste0("https://stats.oecd.org/restsdmx/sdmx.ashx/GetDataStructure/", 
                dataset)
  
  data_structure <- readsdmx::read_sdmx(url) |>
    dplyr::mutate(id = gsub(paste0("CL_", dataset, "_"), "", id))
  
  code_list <- data_structure |>
    dplyr::select(id, value, label = en_description) |>
    split(factor(data_structure$id, levels = unique(data_structure$id))) |>
    purrr::map(
      \(x) dplyr::select(x, id = value, label) |> tibble::remove_rownames()
    )
  
  lookup <- tibble::enframe(c(
    OBS_VALUE = "Observation Value",
    TIME_FORMAT = "Time Format",
    UNIT = "Unit",
    POWERCODE = "Unit multiplier",
    REFERENCEPERIOD = "Reference period"
  ), name = "id", value = "description"
  ) |>
    dplyr::filter(id %in% names(code_list) | id == "OBS_VALUE")
  
  variable_desc <- data_structure |>
    dplyr::select(id, description = en) |>
    dplyr::distinct() |>
    dplyr::filter(!id %in% lookup$id) |>
    rbind(lookup)
  
  full_df_list <- c(VAR_DESC = list(variable_desc), code_list)
  
  full_df_list
}

test_data_structure <- function(dataset) {
  new <- get_data_structure_fixed(dataset)
  
  # From version 0.2.4
  ref <- get_data_structure(dataset)
  
  new$VAR_DESC <- new$VAR_DESC |> dplyr::arrange(id)
  ref$VAR_DESC <- ref$VAR_DESC |> dplyr::arrange(id)
  
  testthat::expect_identical(new, ref)
}

datasets <- c("GOV_DEBT", "DUR_D", "AIR_EMISSIONS", "TEL", "FUA_CITY")

for (ds in datasets) {
  test_data_structure(ds)
}
