eia's Introduction

eia

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

The eia package provides API access to data from the US Energy Information Administration (EIA).

Pulling data from the EIA API requires a registered API key. A key can be obtained at no cost from the EIA. A valid email address and agreement to the API Terms of Service are required to obtain a key.

eia includes functions for searching the EIA API data directory and importing various datasets. Datasets returned by these functions are provided in a tidy format or alternatively in more raw form. It also offers helper functions for working with EIA API date strings and time formats and for inspecting different summaries of data metadata. The package also provides control over API key storage and caching of API request results.

Installation

Install the CRAN release of eia with

install.packages("eia")

or install the development version from GitHub with

# install.packages("remotes")
remotes::install_github("ropensci/eia")

Example

After obtaining the API key, store it somewhere such as .Renviron so that you never have to handle the key explicitly when using the package. Alternatively, set it manually with eia_set_key() in the current R session. It can also always be passed explicitly to the key argument of a given eia function.

Load package and set key

library(eia)

# not run
eia_set_key("yourkey") # set API key if not already set globally
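As a rough illustration of the .Renviron approach, the sketch below reads a key from an environment variable using base R only. The helper get_key_from_env() and the variable name EIA_KEY_DEMO are hypothetical and chosen for this example; the variable name the package itself looks for is not stated here, so check the package documentation before relying on one.

```r
# Hypothetical helper (not part of eia): read an API key from an
# environment variable and fail early if it is not set.
get_key_from_env <- function(var) {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("No API key found in environment variable '", var, "'.", call. = FALSE)
  }
  key
}

# Placeholder value for illustration only. In practice the variable would be
# defined once in .Renviron rather than set in the session.
Sys.setenv(EIA_KEY_DEMO = "yourkey")
key <- get_key_from_env("EIA_KEY_DEMO")
```

The value could then be passed to the key argument of an eia function, or the package can be left to find a globally stored key on its own.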

Explore the API directory

Get a list of the EIA’s data directory (and sub-directories) with eia_dir().

# Top-level directory
eia_dir()
#> # A tibble: 14 × 3
#>    id                name                            description                
#>    <chr>             <chr>                           <chr>                      
#>  1 coal              Coal                            EIA coal energy data       
#>  2 crude-oil-imports Crude Oil Imports               Crude oil imports by count…
#>  3 electricity       Electricity                     EIA electricity survey data
#>  4 international     International                   Country level production, …
#>  5 natural-gas       Natural Gas                     EIA natural gas survey data
#>  6 nuclear-outages   Nuclear Outages                 EIA nuclear outages survey…
#>  7 petroleum         Petroleum                       EIA petroleum gas survey d…
#>  8 seds              State Energy Data System (SEDS) Estimated production, cons…
#>  9 steo              Short Term Energy Outlook       Monthly short term (18 mon…
#> 10 densified-biomass Densified Biomass               EIA densified biomass data 
#> 11 total-energy      Total Energy                    These data represent the m…
#> 12 aeo               Annual Energy Outlook           Annual U.S. projections us…
#> 13 ieo               International Energy Outlook    Annual international proje…
#> 14 co2-emissions     State CO2 Emissions             EIA CO2 Emissions data

# Electricity sub-directory
eia_dir("electricity")
#> # A tibble: 6 × 3
#>   id                              name                               description
#>   <chr>                           <chr>                              <chr>      
#> 1 retail-sales                    Electricity Sales to Ultimate Cus… "Electrici…
#> 2 electric-power-operational-data Electric Power Operations (Annual… "Monthly a…
#> 3 rto                             Electric Power Operations (Daily … "Hourly an…
#> 4 state-electricity-profiles      State Specific Data                "State Spe…
#> 5 operating-generator-capacity    Inventory of Operable Generators   "Inventory…
#> 6 facility-fuel                   Electric Power Operations for Ind… "Annual an…

Get data

Get annual retail electric sales for the Ohio residential sector since 2010

(d <- eia_data(
  dir = "electricity/retail-sales",
  data = "sales",
  facets = list(stateid = "OH", sectorid = "RES"),
  freq = "annual",
  start = "2010",
  sort = list(cols = "period", order = "asc")
))
#> # A tibble: 13 × 7
#>    period stateid stateDescription sectorid sectorName   sales `sales-units`    
#>     <int> <chr>   <chr>            <chr>    <chr>        <dbl> <chr>            
#>  1   2010 OH      Ohio             RES      residential 54474. million kilowatt…
#>  2   2011 OH      Ohio             RES      residential 53687. million kilowatt…
#>  3   2012 OH      Ohio             RES      residential 52288. million kilowatt…
#>  4   2013 OH      Ohio             RES      residential 52158. million kilowatt…
#>  5   2014 OH      Ohio             RES      residential 52804. million kilowatt…
#>  6   2015 OH      Ohio             RES      residential 51493. million kilowatt…
#>  7   2016 OH      Ohio             RES      residential 52524. million kilowatt…
#>  8   2017 OH      Ohio             RES      residential 49796. million kilowatt…
#>  9   2018 OH      Ohio             RES      residential 54452. million kilowatt…
#> 10   2019 OH      Ohio             RES      residential 52226. million kilowatt…
#> 11   2020 OH      Ohio             RES      residential 52553. million kilowatt…
#> 12   2021 OH      Ohio             RES      residential 53171. million kilowatt…
#> 13   2022 OH      Ohio             RES      residential 53312. million kilowatt…

and make a nice plot.

library(ggplot2)
# sales are reported in million kilowatt-hours (GWh), so sales / 1e3 is in TWh
ggplot(d, aes(x = period, y = sales / 1e3)) +
  geom_bar(col = "steelblue", fill = "steelblue", stat = "identity") +
  theme_bw() +
  labs(
    title = "Annual Retail Sales of Electricity (TWh)",
    subtitle = "State: Ohio; Sector: Residential",
    x = "Year", y = "Sales (TWh)"
  )

References

See the collection of vignette tutorials and examples as well as complete package documentation available at the eia package website.


Please note that the eia project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

eia's People

Contributors

daranzolin, jameslamb, leonawicz, maelle, mghoff

eia's Issues

interest in functions for eia bulk data files?

Hello,

Thank you for this package! I have been enjoying using it to access EIA data in a tidier format. I sometimes find myself wanting larger numbers of EIA series, which has caused me to turn toward the option of using the EIA bulk data files. Recently I have been trying to adapt parts of the code in eia::eia_series and using a handler with jsonlite::stream_in to process bulk data files in a way that matches up to the results from eia_series.

Is processing bulk data files something of interest for this package? Perhaps signatures something like:

eia_series_bulk <- function(con, tidy = TRUE)
eia_cats_bulk <- function(con, tidy = TRUE)

I have been working on this. What I have so far is quite slow and not very package-like, but I am hoping that, in combination with drake/targets, it will be a useful way for me to access large numbers of series.

thanks,
Jameel

Intermittent error from eia_cats()

Reprex below.

> library(eia)

> eia_get_key()
[1] "xxxxxx"

> eia_cats()
$`category`
# A tibble: 1 x 3
  category_id name          notes
  <chr>       <chr>         <chr>
1 371         EIA Data Sets ""   

$childcategories
# A tibble: 12 x 2
   category_id name                               
         <int> <chr>                              
 1           0 Electricity                        
 2       40203 State Energy Data System (SEDS)    
 3      714755 Petroleum                          
 4      714804 Natural Gas                        
 5      711224 Total Energy                       
 6      717234 Coal                               
 7      829714 Short-Term Energy Outlook          
 8     1292190 Crude Oil Imports                  
 9     2123635 U.S. Electric System Operating Data
10     2134384 International Energy Data          
11     2631064 International Energy Outlook       
12     2889994 U.S. Nuclear Outages               

> eia_cats(0)
$`category`
# A tibble: 1 x 4
  category_id parent_category_id name        notes
  <chr>       <chr>              <chr>       <chr>
1 0           371                Electricity ""   

$childcategories
# A tibble: 19 x 2
   category_id name                                                              
         <int> <chr>                                                             
 1           1 Net generation                                                    
 2          35 Total consumption                                                 
 3          32 Total consumption (Btu)                                           
 4          36 Consumption for electricity generation                            
 5          33 Consumption for electricity generation (Btu)                      
 6          37 Consumption for useful thermal output                             
 7          34 Consumption for useful thermal output (Btu)                       
 8        1017 Plant level data                                                  
 9          38 Retail sales of electricity                                       
10          39 Revenue from retail sales of electricity                          
11          40 Average retail price of electricity                               
12     1718389 Number of customer accounts                                       
13       41137 Fossil-fuel stocks for electricity generation                     
14       41138 Receipts of fossil fuels by electricity plants                    
15       41139 Receipts of fossil fuels by electricity plants (Btu)              
16       41140 Average cost of fossil fuels for electricity generation           
17       41141 Average cost of fossil fuels for electricity generation (per Btu) 
18       41142 Quality of fossil fuels in electricity generation : sulfur content
19       41143 Quality of fossil fuels in electricity generation : ash content   

> # |- petroleum ----
> eia_cats(714755)
$`category`
# A tibble: 1 x 4
  category_id parent_category_id name      notes
  <chr>       <chr>              <chr>     <chr>
1 714755      371                Petroleum ""   

$childcategories
# A tibble: 7 x 2
  category_id name                         
        <int> <chr>                        
1      714756 Summary                      
2      714757 Prices                       
3      714758 Crude Reserves and Production
4      714759 Refining and Processing      
5      714760 Imports/Exports and Movements
6      714802 Stocks                       
7      714803 Consumption/Sales 

> eia_cats(714756)
Error: lexical error: invalid char in json text.
                                       <!DOCTYPE html PUBLIC "-//W3C//
                     (right here) ------^
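The error text suggests that the server occasionally returned an HTML page where JSON was expected. A minimal sketch of a guard against that is shown below, in base R only. Both helpers are hypothetical and not part of eia; the parser argument stands in for whatever JSON parser is actually used (for example jsonlite::fromJSON).

```r
# Hypothetical helper: TRUE if the response text looks like an HTML page.
looks_like_html <- function(txt) {
  grepl("^[[:space:]]*<(!DOCTYPE|html)", txt, ignore.case = TRUE)
}

# Hypothetical wrapper: only pass the body to the parser if it is not HTML,
# so the caller gets a readable error instead of a lexical error.
parse_if_json <- function(txt, parser) {
  if (looks_like_html(txt)) {
    stop("Server returned an HTML page instead of JSON; try again later.",
         call. = FALSE)
  }
  parser(txt)
}

# Made-up response bodies for illustration.
html_body <- "<!DOCTYPE html><html><body>Service unavailable</body></html>"
json_body <- "{\"category\": {\"category_id\": \"714756\"}}"

ok <- parse_if_json(json_body, parser = identity)
bad <- tryCatch(parse_if_json(html_body, parser = identity),
                error = function(e) conditionMessage(e))
```

A caller could combine such a check with a short retry loop, since the failure is reported as intermittent.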

freq = 'weekly'

I noticed that the 'weekly' frequency is not supported. I was wondering if that was left out accidentally or for a particular reason.

.freq_specs <- function(freq){
  if (!is.character(freq) | length(freq) > 1)
    stop("'freq' must be one of: 'annual', 'yearly', 'monthly', 'daily', or 'hourly'.")
  paste0("&frequency=", freq)
}
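Note that the function above names the allowed values in its error message but does not check freq against them. A sketch of a stricter variant follows. The name freq_specs_checked() and the allowed argument are hypothetical, and including 'weekly' in the list is only an assumption for illustration; whether the API accepts it for a given dataset is not confirmed here.

```r
# Hypothetical variant of .freq_specs() that validates against an explicit set.
freq_specs_checked <- function(freq,
                               allowed = c("annual", "yearly", "monthly",
                                           "weekly", "daily", "hourly")) {
  if (!is.character(freq) || length(freq) != 1 || !(freq %in% allowed)) {
    stop("'freq' must be one of: ",
         paste0("'", allowed, "'", collapse = ", "), ".", call. = FALSE)
  }
  paste0("&frequency=", freq)
}

weekly_query <- freq_specs_checked("weekly")
rejected <- tryCatch(freq_specs_checked("fortnightly"),
                     error = function(e) "error")
```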

Feature Request: Parameter/variable to enable https

First of all: thanks for the great work. The package makes it really comfortable to get data from EIA.

Just a minor point/feature request:
Data from EIA are also available via https. As I prefer this way, I currently achieve this by changing the last line of function eia:::.eia_url manually from "http://api.eia.gov/" to "https://api.eia.gov/"

However, it could probably be useful (also for other users) if "https" could be parameterized in some way (e.g. as a variable or a function argument).

If this is a feature you would consider to implement, I would be happy to provide a pull request.
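One way this could be parameterized is sketched below. The function name base_url_sketch() and the option name eia.use_https are hypothetical; the internal eia:::.eia_url function may be structured differently.

```r
# Hypothetical sketch: choose the URL scheme from an argument that defaults
# to a session option, falling back to https when the option is unset.
base_url_sketch <- function(https = getOption("eia.use_https", TRUE)) {
  scheme <- if (isTRUE(https)) "https" else "http"
  paste0(scheme, "://api.eia.gov/")
}

secure_url <- base_url_sketch(TRUE)
plain_url <- base_url_sketch(FALSE)
```

With this shape, users could switch the scheme once per session via options() or per call via the argument.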

EIA API v1 is gone

With the retirement of version 1 of the EIA API on 2023-03-13, it seems this package no longer works. Is this correct?

Data available for download that is not available in API

Related to #1, this issue seeks feedback from users about data commonly accessed from the EIA website, but which may not be available through the API itself. I am interested in creating additional wrapper functions for pulling specific, popular datasets from EIA into R that may need to be done by downloading files from specific URLs because they are not available through the API. Please let me know of possible examples and I will keep them in mind.

EIA API version 2?

Is there a plan to update this package to use the EIA API version 2? I am interested in either helping to update this package, or in working on a new package to use the new version of the API. Using the eia package has been very helpful to me over the years.

EIA has released a version 2 of its API. My understanding is that while API v1 is still operating, it could be shut down soon. From the FAQ on their website:
"What exact day is APIv1 shutting down?
We aren't able to provide a specific date at this time, but we will as the date draws closer. For your planning purposes, we are targeting January 2023."

They are providing a backwards-compatibility endpoint which should allow results to be retrieved by series ID through that endpoint. However, I think at least some changes are necessary to use this backwards-compatibility endpoint.
EIA API documentation

In addition, it seems like they have worked hard to improve the metadata available and structure of the API overall. I am hopeful that there are possibilities in an R package to take advantage of the improvements they have made.

thanks!
Jameel

eia_data() documentation

I think we should improve the documentation of this function a bit.

Most specifically, I think the parameter entries for data, facets, freq, and start, end should still state what they are, following their type, before jumping right to "See details." There should be something nominal about them between type and referring to the Details section.

eia_data not returning values for international directory

I am trying to pull international data and getting responses with no value variable. Example code is below:

eia_set_key("safekey")

eia_data(dir = "international",
         freq = "annual",
         start = "2017",
         facets = list(activityId = 12,
                       productId = c(2)))

returns the following variables:

"period" 
"productId"
"productName"
"activityId"
"activityName"
"countryRegionId"
"countryRegionName"
"countryRegionTypeId"
"countryRegionTypeName"
"dataFlagId"
"dataFlagDescription"
"unitName"
"unit"

There is no value returned, although I see the value key in the JSON if I call the API online. What am I missing?
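One plausible reason, judging from the README example (which passes data = "sales"), is that the call above does not request a data column. The sketch below builds the kind of query string such a request might map to, so the difference is visible. build_query_sketch() is hypothetical and not part of eia, and the parameter syntax (data[], facets[...][]) is my understanding of API v2 rather than something confirmed in this thread.

```r
# Hypothetical sketch: assemble an API v2 style query string in base R.
# Brackets are left unencoded for readability; an HTTP client may encode them.
build_query_sketch <- function(freq, start, data, facets) {
  facet_parts <- unlist(lapply(names(facets), function(nm) {
    paste0("facets[", nm, "][]=", facets[[nm]])
  }))
  parts <- c(
    paste0("frequency=", freq),
    paste0("start=", start),
    paste0("data[]=", data),   # without a data column, no values are requested
    facet_parts
  )
  paste(parts, collapse = "&")
}

q <- build_query_sketch(
  freq = "annual",
  start = "2017",
  data = "value",
  facets = list(activityId = 12, productId = 2)
)
```

If that reading is right, adding data = "value" to the eia_data() call would be the thing to try first.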

Fetching Data

Dear Mr. Matthew,
Greetings.
I am writing to you seeking advice.

  • Let's say I want data on "Petroleum and other liquids production", annually per country:

    library(tidyverse)
    library(eia)
    data_ser <- eia_cats(2134915) %>%
      .[["childseries"]] %>%
      filter(grepl("INTL.53-1-.*-TBPD.A", series_id))

    y <- data_ser %>%
      pull(series_id) %>%
      .[1:100] %>% # since the API can return 100 series max
      eia_series(., start = 1990)

    I am wondering if this is the right way of getting the data or if there is a better way.

  • How can I get country names, abbreviations, or ISO2 codes using your package, in order to use them as the region parameter in the eia_geoset function?
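If more than 100 series are needed, one option is to split the ids into batches instead of truncating to the first 100. The sketch below uses base R only. chunk_ids() is a hypothetical helper, the ids are made up, and the limit of 100 per request is taken from the comment in the code above rather than verified.

```r
# Hypothetical helper: split a vector of series ids into batches of at most
# `size` elements, preserving order.
chunk_ids <- function(ids, size = 100) {
  split(ids, ceiling(seq_along(ids) / size))
}

# Made-up ids for illustration only.
ids <- sprintf("SERIES.%03d.A", 1:250)
chunks <- chunk_ids(ids)
chunk_sizes <- unname(lengths(chunks))

# Each batch could then be requested separately, e.g. (not run):
# results <- lapply(chunks, eia_series, start = 1990)
```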

Wrappers for popular datasets?

I have been considering creating a collection of wrapper functions that make it even more convenient to grab commonly requested datasets from the EIA API.

I do not know if this makes sense to bother with, however, until I can get more feedback from users; not only in terms of what users most often access the API for, but to see whether requests across users actually share a lot in common or tend to be completely unrelated. The API contains a huge amount of data, so knowing what others use it for is helpful for the ongoing development of this package in general.

I see some potential for popular datasets, where it makes sense, to at a minimum abstract away the need to enter series IDs. Series can be implicit based on the name of the specific wrapper function, and other args passed to it such as region if applicable. This is still just in the idea stage, but I'd like to hear others' thoughts.

Please use this issue to provide any feedback on what convenience wrappers might be helpful to the most users if you have insight into this. Please include a reproducible example (no API key required) showing the code currently used to request the given data and a sentence or two about it so I have some context. I don't want to make wrappers around everything; only a subset of the most popular datasets.

404 Error when trying to pull series from API

Hi, I'm trying to update data in a script that I've run in the past and I'm getting the following error. Is there a way to fix this/does this package still function? The last time I ran it (2 months ago) it was working perfectly.

library(eia)
eia_set_key('7eba73584cfb231e1c6ee0282ff8f2a8')

eia_series('STEO.BREPUUS.M')

This is the error that I get

Request failed [404]. Retrying in 1.9 seconds...
Request failed [404]. Retrying in 1 seconds...
Error: Page not found
