
gcamrpt's People

Contributors

calebbraun, pkyle, rplzzz, ssmithclimate, xavier-gutierrez


gcamrpt's Issues

Create input files for demo

We will need to create the control files for the demo on Thursday. We also need to locate a database for a scenario other than Reference, so that we can show the difference between merged and unmerged scenarios.

Tests Needed

We need tests for the following modules:

  • agriculture
  • emissions
  • end_use
  • land_use
  • primary_energy
  • water

Output scenario name not used in final output

I ran generate() with a scenario control file that specifies a different output scenario than the GCAM scenario. The scenario column in the resulting .csv used the original GCAM scenario name.

I was able to fix the problem by deleting line 357 in mcl.R:

if(!('scenario' %in% names(tbl)))

Is there a case where this check is necessary? If not, I can submit a PR for the fix.

Finish transportation modules

A few of the modules in transportation.R are still stubs, returning no data and issuing a warning. These need to be finished and to have tests written.

Write aggregation function

Accept a table, a string containing a vector of aggregation keys, and the name of an aggregation function. Group by the aggregation keys and apply the aggregation function. Handle the following errors:

  • One or more of the aggregation keys isn't in the table (warning, drop offending key(s) from the list).
  • The aggregation function doesn't exist (warning, return full table with no aggregation)
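
One possible shape for this, as a minimal base-R sketch. The function name `aggregate_table`, the comma-separated key format, and the `value` column name are all assumptions made for illustration; none of them comes from the package.

```r
# Hypothetical helper; name, signature, and key format are assumptions.
aggregate_table <- function(tbl, keys, fn_name, value_col = 'value') {
    # Split a comma-separated key string such as 'scenario, region' into names.
    keyvec <- trimws(strsplit(keys, ',', fixed = TRUE)[[1]])
    keyvec <- keyvec[nzchar(keyvec)]

    # Warn about keys that are not columns of the table, then drop them.
    missing_keys <- setdiff(keyvec, names(tbl))
    if (length(missing_keys) > 0) {
        warning('Dropping unknown aggregation keys: ',
                paste(missing_keys, collapse = ', '))
        keyvec <- intersect(keyvec, names(tbl))
    }

    # Warn and return the full table if the aggregation function is unknown.
    if (!exists(fn_name, mode = 'function')) {
        warning('Aggregation function not found: ', fn_name)
        return(tbl)
    }

    # Nothing left to group by: return the table unchanged.
    if (length(keyvec) == 0) {
        return(tbl)
    }

    # Group by the surviving keys and apply the function to the value column.
    fn <- match.fun(fn_name)
    stats::aggregate(tbl[value_col], by = tbl[keyvec], FUN = fn)
}
```

Calling `aggregate_table(tbl, 'region, sector', 'sum')` would then sum `value` within each region and sector pair. The real package may prefer dplyr grouping; base `aggregate` is used here only to keep the sketch dependency-free.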

Write function to run and return results for a list of queries and scenarios.

The function should accept a list of queries and a list of scenarios. It should use rgcam to run the queries. It must handle these special cases:

  • The query does not return results for any of the scenarios (warning, return nothing)
  • The query returns results for some, but not all of the scenarios (warning, return a table full of NA for the scenarios with no results)
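
A rough sketch of the two special cases, kept independent of rgcam so it can be run on its own. Here `run_one` is a hypothetical stand-in for whatever rgcam call ends up doing the work; it is assumed to take a query and a scenario name and return a data frame with a `scenario` column (zero rows when there are no results). Filling the missing scenarios with NA rows shaped like a successful result is one interpretation of "a table full of NA".

```r
# Hypothetical driver; run_one stands in for the rgcam call (an assumption).
run_queries <- function(queries, scenarios, run_one) {
    scenarios <- unlist(scenarios)
    results <- list()
    for (qname in names(queries)) {
        # Run this query once per scenario.
        per_scen <- lapply(scenarios, function(s) run_one(queries[[qname]], s))
        names(per_scen) <- scenarios
        has_data <- vapply(per_scen,
                           function(d) !is.null(d) && nrow(d) > 0,
                           logical(1))

        # Case 1: no scenario produced results. Warn and return nothing.
        if (!any(has_data)) {
            warning('Query ', qname, ' returned no results for any scenario')
            next
        }

        # Case 2: some scenarios are missing. Warn and fill them with NA rows
        # that have the same columns as a scenario that did return data.
        if (!all(has_data)) {
            warning('Query ', qname, ' returned no results for: ',
                    paste(scenarios[!has_data], collapse = ', '))
            template <- per_scen[[which(has_data)[1]]]
            for (s in scenarios[!has_data]) {
                filler <- template
                filler[] <- lapply(filler, function(col) rep(NA, length(col)))
                if ('scenario' %in% names(filler)) filler$scenario <- s
                per_scen[[s]] <- filler
            }
        }
        results[[qname]] <- do.call(rbind, unname(per_scen))
    }
    results
}
```

The return value is a named list with one table per query that produced any data; queries with no results at all are simply absent from the list.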

Add mechanism for caching intermediate results

There are a lot of derived outputs that result from simple arithmetic on previously calculated outputs. Right now we repeat all of that calculation, which is not ideal. This problem could become even worse if the same base variable is repeated with several different types of aggregation and/or filtering. We would end up repeating the entire calculation each time, when all we really want to do is to redo the filtering and aggregation steps.

I think we could do this by replacing the allqueries list with an environment. Then we could write intermediate outputs into the environment to be used in other modules. Modules would check the environment to see if their intermediate values are available, and if they aren't, they could call the relevant modules directly, causing the intermediates to be filled in, before proceeding.
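
To make the idea concrete, here is a minimal sketch of an environment used as a cache. The helper name `get_or_compute` and the toy `primary_energy` module are hypothetical; the only point is that an environment has reference semantics, so a value written by one module is visible to every later caller.

```r
# A shared environment; unlike a list, it is modified in place.
cache <- new.env(parent = emptyenv())

# Hypothetical helper: return the cached value, computing it on first use.
get_or_compute <- function(name, compute, cache) {
    if (!exists(name, envir = cache, inherits = FALSE)) {
        assign(name, compute(), envir = cache)
    }
    get(name, envir = cache, inherits = FALSE)
}

# Toy module that counts how often it actually runs.
calls <- 0
primary_energy <- function() {
    calls <<- calls + 1
    data.frame(year = c(2020, 2030), value = c(10, 12))
}

first  <- get_or_compute('primary_energy', primary_energy, cache)
second <- get_or_compute('primary_energy', primary_energy, cache)
# calls is 1 here: the second request was served from the environment.
```

Filtering and aggregation could then be applied to the cached table each time without repeating the underlying calculation.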

GCAM test data package

Right now it's hard to test packages that work on GCAM databases because the databases themselves are too large to conveniently include in the package. (This package is one of several that have this problem.) What we could do is create a separate package with a GCAM database in it, along with, possibly, a more comprehensive set of queries than the example queries in the rgcam package. Then packages that need a GCAM database just for testing can list the sample-data package under Suggests. We can write tests that use the database from the sample-data package and have testthat skip those tests if the sample-data package isn't available.

The sample-data package will still be larger than recommended, but now the problem is confined to a single package that nobody has to install, which is a big improvement over the current situation.

Write function to read input

We need a function (or functions) to read all of the user input. Right now that comprises:

  • Table of desired outputs
  • Table of scenarios and db filenames

We also need the directory containing the database files, but we can have that passed in as a string argument.
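
A minimal sketch of what such a reader might look like, assuming both tables are stored as CSV files. The function name, argument names, and the structure of the returned list are hypothetical.

```r
# Hypothetical reader; assumes both control tables are CSV files.
read_inputs <- function(outputs_file, scenarios_file, dbdir) {
    # Fail early if the database directory is missing.
    if (!dir.exists(dbdir)) {
        stop('Database directory does not exist: ', dbdir)
    }
    list(
        # Table of desired outputs
        outputs = utils::read.csv(outputs_file, stringsAsFactors = FALSE),
        # Table of scenarios and db filenames
        scenarios = utils::read.csv(scenarios_file, stringsAsFactors = FALSE),
        # Directory containing the database files, passed in as a string
        dbdir = dbdir
    )
}
```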

Write CSV output function

Write a function to output results as a CSV file or group of files, according to the user settings.

Write main control loop

Write the main control loop that collects the module information, runs the queries, runs the modules, and runs the output conversion.

Package fails to install

The error is

ERROR: hard-coded installation path: please report to the package maintainer and use ‘--no-staged-install’

This is the same issue we were seeing in JGCRI/rgcam#57. Further information here:
https://developer.r-project.org/Blog/public/2019/02/14/staged-install/index.html

As far as I can tell, the problem is in our startup function:

gcamrpt/R/zzz.R

Lines 3 to 7 in c2cc137

.onLoad <- function(libname, pkgname)
{
    qfile <- system.file('extdata/default-queries.xml', package=pkgname)
    parseQueries(qfile)
}

Based on my reading of the documentation above, this is a false-positive. This code should be safe because it never gets run until the package is loaded. If that's right, then we just need to restructure the call so it doesn't look to the installer like this path is being saved at install time. I'm thinking that burying the call to system.file in parseQueries should do the trick.
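
A sketch of the kind of restructuring described, under the untested assumption that moving the path lookup out of `.onLoad` is enough to satisfy the installer check. `load_default_queries` is a hypothetical wrapper; the parser is passed in as an argument so the sketch does not depend on the real signature of `parseQueries`.

```r
# Hypothetical wrapper: the path is resolved only when this runs at load time.
load_default_queries <- function(pkgname, parser) {
    qfile <- system.file('extdata/default-queries.xml', package = pkgname)
    # system.file returns '' when the file or the package cannot be found.
    if (!nzchar(qfile)) {
        warning('Default query file not found for package ', pkgname)
        return(invisible(NULL))
    }
    parser(qfile)
}

# The startup hook no longer contains the system.file call itself.
.onLoad <- function(libname, pkgname)
{
    load_default_queries(pkgname, parseQueries)
}
```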

Trim package exports

Right now I'm exporting most of the key functions in the package, but in fact we expect to have a fairly minimal interface. Before our first release we should trim out all of the exports that we don't expect users to call directly.

Set up CI

Set up CI runs on Travis and require checks to pass before merging is allowed.

IIASA format write-out crashing

I'm working on the transportation branch but the code looks the same on master, so this may be a problem for others too. If I try to write out the data in IIASA format (i.e., dataformat == 'IIASA'), I get the following crash after all of the queries have been run and all data aggregated:

Error in if (nrow(vardf) == 0) rslts[var] <- NULL : 
  argument is of length zero

The problem stems from line 221 in mcl.R, where the following line:

vardf <- rslts[var]

doesn't make a data frame; it makes a list of length one whose first element is a data frame. So, the next line, which is intended to check the number of rows of the data frame, instead causes a crash.
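
The difference between the two subsetting operators can be reproduced with a small made-up results list (the table names and contents below are illustrative only). Single brackets return a sub-list, whose `nrow` is `NULL`, so the comparison in the `if` has length zero. Double brackets return the data frame itself.

```r
# Toy stand-in for the results list; contents are made up.
rslts <- list(pop = data.frame(year = c(2020, 2030), value = c(1, 2)),
              gdp = data.frame(year = numeric(0), value = numeric(0)))
var <- 'gdp'

wrapped <- rslts[var]      # list of length one holding the data frame
vardf   <- rslts[[var]]    # the data frame itself

wrapped_rows <- nrow(wrapped)   # NULL, so `wrapped_rows == 0` has length zero
vardf_rows   <- nrow(vardf)     # a proper row count

# With double brackets the original check works as intended.
if (nrow(vardf) == 0) rslts[var] <- NULL
```

Note that single brackets are still the right choice on the left-hand side of the assignment, since `rslts[var] <- NULL` removes the element from the list.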

Write filtering function

Write a function that accepts a table, a string containing a list of predicates in s-exp notation (i.e., (operator, var, value), for example (==, sector, 'beef')), and start and end years. Return the table filtered to rows where all predicates are true and the years are between the start and end years (inclusive).
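
A minimal base-R sketch of such a filter. Several details are assumptions: the function name `filter_table`, a `year` column holding the years, predicates written one per parenthesized group, values that contain no commas or parentheses, and support for comparison operators only.

```r
# Hypothetical filter; predicate syntax details are assumptions (see above).
filter_table <- function(tbl, predicates, startyr, endyr, year_col = 'year') {
    # Start with the inclusive year range.
    keep <- tbl[[year_col]] >= startyr & tbl[[year_col]] <= endyr

    # Pull out each parenthesized group, e.g. "(==, sector, 'beef')".
    groups <- regmatches(predicates, gregexpr('\\([^()]*\\)', predicates))[[1]]
    allowed <- c('==', '!=', '<', '<=', '>', '>=')

    for (g in groups) {
        inner <- substr(g, 2, nchar(g) - 1)
        parts <- trimws(strsplit(inner, ',', fixed = TRUE)[[1]])
        if (length(parts) != 3 || !(parts[1] %in% allowed) ||
            !(parts[2] %in% names(tbl))) {
            warning('Skipping malformed predicate: ', g)
            next
        }
        column <- tbl[[parts[2]]]
        # Strip surrounding quotes; compare numerically for numeric columns.
        value <- gsub("^['\"]|['\"]$", '', parts[3])
        if (is.numeric(column)) value <- as.numeric(value)
        # All predicates must hold, so combine them with logical AND.
        keep <- keep & match.fun(parts[1])(column, value)
    }
    # which() drops rows where a comparison produced NA.
    tbl[which(keep), , drop = FALSE]
}
```

For example, `filter_table(tbl, "(==, sector, 'beef')", 2010, 2050)` would keep only the beef rows with years from 2010 through 2050.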
