jgcri / gcamrpt

Convert GCAM output to the format used by public IAM databases

License: GNU General Public License v2.0

R 100.00%

gcam coupled-human-natural-systems

gcamrpt's Issues

Add tests for module code

Right now the module code has no test coverage. We need to add tests for all existing modules.

We will need to create the control files for the demo on Thursday. We also need to locate a database for a scenario other than Reference, so that we can show the difference between merged and unmerged scenarios.

Tests Needed

We need tests for the following modules:

Output scenario name not used in final output

I ran generate() with a scenario control file that specifies a different output scenario than the GCAM scenario. The scenario column in the resulting .csv used the original GCAM scenario name.

I was able to fix the problem by deleting line 357 in mcl.R:

if(!('scenario' %in% names(tbl)))

Is there a case where this check is necessary? If not, I can submit a PR for the fix.

Write runModule generic

Write the generic for the runModule interface.

Finish transportation modules

A few of the modules in transportation.R are still stubs, returning no data and issuing a warning. These need to be finished and to have tests written.

Add IIASA format option for output.

Add an option to generate() to convert data to wide format before output.
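A minimal sketch of the long-to-wide conversion, using base R's reshape(). The function name to_wide and the column names year and value are assumptions for illustration; the exact column layout the IIASA databases expect is not specified in this issue.

```r
# Hypothetical helper (not part of gcamrpt): pivot a long table so that each
# year becomes its own column. Assumes columns named 'year' and 'value'.
to_wide <- function(long) {
    # Every column other than year and value identifies a row in the wide table
    id_cols <- setdiff(names(long), c("year", "value"))
    wide <- reshape(long, idvar = id_cols, timevar = "year", direction = "wide")
    # reshape() names the new columns 'value.<year>'; keep just the year
    names(wide) <- sub("^value\\.", "", names(wide))
    rownames(wide) <- NULL
    wide
}

long <- data.frame(
    scenario = "Reference",
    variable = rep(c("Population", "GDP"), each = 2),
    year     = rep(c(2020, 2025), times = 2),
    value    = c(1, 2, 10, 20),
    stringsAsFactors = FALSE
)
wide <- to_wide(long)
```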

Create electricity data module

Write the runModule method for the "Electricity" class.

Unit conversion for transportation service output needs to be fixed

Freight should be able to convert mass units, and both passenger and freight will fail if you try to convert to tonne-km or passenger-km (i.e., without a count unit).

Write aggregation function

Accept a table, a string containing a vector of aggregation keys, and the name of an aggregation function. Group by the aggregation keys and apply the aggregation function. Handle the following errors:

One or more of the aggregation keys isn't in the table (warning, drop offending key(s) from the list).
The aggregation function doesn't exist (warning, return full table with no aggregation)
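One possible shape for this function, in base R. The name aggregate_table, the comma-separated key format, and the value column name are all assumptions; the issue does not fix any of them.

```r
# Hypothetical sketch (names and signature are assumptions, not gcamrpt's API).
# tbl:     data frame with a numeric 'value' column
# keys:    comma-separated string of grouping columns, e.g. "scenario, region"
# fn_name: name of the aggregation function, e.g. "sum"
aggregate_table <- function(tbl, keys, fn_name) {
    keys <- trimws(strsplit(keys, ",", fixed = TRUE)[[1]])
    keys <- keys[nzchar(keys)]

    # Error case 1: warn about keys that aren't in the table and drop them
    missing_keys <- setdiff(keys, names(tbl))
    if (length(missing_keys) > 0) {
        warning("Dropping aggregation keys not in table: ",
                paste(missing_keys, collapse = ", "))
        keys <- intersect(keys, names(tbl))
    }

    # Error case 2: unknown function, so warn and return the table unaggregated
    if (!exists(fn_name, mode = "function")) {
        warning("Aggregation function '", fn_name,
                "' not found; returning table without aggregation")
        return(tbl)
    }
    fn <- match.fun(fn_name)

    if (length(keys) == 0) {
        # No usable keys left: collapse the whole table to one value
        return(data.frame(value = fn(tbl$value)))
    }
    aggregate(tbl["value"], by = tbl[keys], FUN = fn)
}
```

For example, aggregate_table(tbl, "region", "sum") would sum value within each region.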

Write function to run and return results for a list of queries and scenarios.

The function should accept a list of queries and a list of scenarios. It should use rgcam to run the queries. It must handle these special cases:

The query does not return results for any of the scenarios (warning, return nothing)
The query returns results for some, but not all of the scenarios (warning, return a table full of NA for the scenarios with no results)
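A sketch of the special-case handling only. The rgcam calls are deliberately left out: run_one is a hypothetical stand-in for whatever function actually runs a single query against a single scenario, and run_queries is likewise an illustrative name. Keeping the scenario name in the NA rows (rather than making it NA too) is a choice made here, not something the issue specifies.

```r
# Hypothetical sketch. run_one(query, scenario) is an assumed callback that
# returns a data frame (possibly with zero rows) for one query and scenario.
run_queries <- function(queries, scenarios, run_one) {
    results <- list()
    for (q in queries) {
        per_scen <- lapply(scenarios, function(s) run_one(q, s))
        names(per_scen) <- scenarios
        has_data <- vapply(per_scen,
                           function(d) is.data.frame(d) && nrow(d) > 0,
                           logical(1))

        # Case 1: no scenario produced results, so warn and return nothing
        if (!any(has_data)) {
            warning("Query '", q, "' returned no results for any scenario")
            next
        }

        # Case 2: some scenarios are missing, so warn and fill them with NA rows
        if (!all(has_data)) {
            warning("Query '", q, "' returned no results for: ",
                    paste(scenarios[!has_data], collapse = ", "))
            template <- per_scen[[which(has_data)[1]]]
            for (s in scenarios[!has_data]) {
                # Indexing rows by NA keeps the column types but blanks the values
                filler <- template[rep(NA_integer_, nrow(template)), , drop = FALSE]
                if ("scenario" %in% names(filler)) filler$scenario <- s
                rownames(filler) <- NULL
                per_scen[[s]] <- filler
            }
        }
        results[[q]] <- do.call(rbind, unname(per_scen))
    }
    results
}
```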

Hand validation of transportation outputs

We need to manually examine the comparison tables we use to test the transportation modules to validate that they are correct.

Add tests for package basic functionality

Should include all of the package infrastructure and at least one simple variable.

Add mechanism for caching intermediate results

There are a lot of derived outputs that result from simple arithmetic on previously calculated outputs. Right now we repeat all of that calculation, which is not ideal. This problem could become even worse if the same base variable is repeated with several different types of aggregation and/or filtering. We would end up repeating the entire calculation each time, when all we really want to do is to redo the filtering and aggregation steps.

I think we could do this by replacing the allqueries list with an environment. Then we could write intermediate outputs into the environment to be used in other modules. Modules would check the environment to see if their intermediate values are available, and if they aren't, they could call the relevant models directly, causing the intermediates to be filled in, before proceeding.
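A rough sketch of the environment-based cache described above. The helper name get_or_compute and the idea of passing the computation as a function are assumptions for illustration; how modules would actually register their intermediates is still open.

```r
# Hypothetical sketch: an environment standing in for the allqueries list
cache <- new.env(parent = emptyenv())

# Return the cached value for 'name' if present; otherwise call compute(),
# store the result in the environment, and return it.
get_or_compute <- function(name, compute, env) {
    if (!exists(name, envir = env, inherits = FALSE)) {
        assign(name, compute(), envir = env)
    }
    get(name, envir = env, inherits = FALSE)
}

# Example: a stand-in for an expensive base calculation
base_population <- function() {
    data.frame(year = c(2020, 2025), value = c(10, 20))
}

pop1 <- get_or_compute("population", base_population, cache)  # computed
pop2 <- get_or_compute("population", base_population, cache)  # served from cache
```

Because environments have reference semantics, anything a module writes into the cache is visible to every later module without having to pass an updated list back up the call chain.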

Check for queries that fail or return empty tables

We don't need this for the demo, but it will be something we will want to add shortly after we get the go-ahead to develop v1.0.

GCAM test data package

Right now it's hard to test packages that work on GCAM databases because the databases themselves are too large to conveniently include in the package. (This package is one of several that have this problem.) What we could do is create a separate package with a GCAM database in it, along with, possibly, a more comprehensive set of queries than the example queries in the rgcam package. Packages that need a GCAM database just for testing would then list the sample-data package under Suggests. Then we can write tests that use the database from the sample-data package and have testthat skip those tests if the sample-data package isn't available.

The sample-data package will still be larger than recommended, but now the problem is confined to a single package that nobody has to install, which is a big improvement over the current situation.
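A sketch of how a test could locate the optional database. The package name gcamsampledata and the extdata location are made up for illustration; no such package exists yet.

```r
# Hypothetical helper: return the directory holding the sample database, or
# NULL when the (assumed) sample-data package is not installed.
sample_db_path <- function(pkg = "gcamsampledata") {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        return(NULL)
    }
    path <- system.file("extdata", package = pkg)
    if (nzchar(path)) path else NULL
}

dbdir <- sample_db_path()
if (is.null(dbdir)) {
    message("Sample data package not available; database tests would be skipped")
}
```

Inside a testthat test, calling testthat::skip_if_not_installed() with the sample-data package name at the top of the test would give the skipping behavior described above.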

Change startyear and endyear to a single list of years

By user request.

Name change in docs

Change name in docs from iamrpt to gcamrpt.

Create GDP data module

Create the runModule method for the 'GDP_PPP' class.

Write function to read input

We need a function (or functions) to read all of the user input. Right now that comprises:

Table of desired outputs
Table of scenarios and db filenames

We also need the directory containing the database files, but we can have that passed in as a string argument.

Write CSV output function

Write a function to output results as a CSV file or group of files, according to the user settings.

Write main control loop

Write the main control loop that collects the module information, runs the queries, runs the modules, and runs the output conversion.

Create Population module

Write the runModule method for the "Population" class.

Package fails to install

The error is

ERROR: hard-coded installation path: please report to the package maintainer and use ‘--no-staged-install’

This is the same issue we were seeing in JGCRI/rgcam#57. Further information here:
https://developer.r-project.org/Blog/public/2019/02/14/staged-install/index.html

As far as I can tell, the problem is in our startup function:

gcamrpt/R/zzz.R

Lines 3 to 7 in c2cc137

 .onLoad <- function(libname, pkgname)
 {
     qfile <- system.file('extdata/default-queries.xml', package=pkgname)
     parseQueries(qfile)
 }

Based on my reading of the documentation above, this is a false positive. This code should be safe because it never gets run until the package is loaded. If that's right, then we just need to restructure the call so it doesn't look to the installer like this path is being saved at install time. I'm thinking that burying the call to system.file in parseQueries should do the trick.
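One way the restructuring might look. This is untested against the staged-install check, and parse_queries_sketch is only a stand-in for parseQueries, whose real body is not shown here; default_query_file is also a hypothetical helper.

```r
# Hypothetical helper: resolve the default query file at call time
default_query_file <- function(pkgname) {
    system.file("extdata/default-queries.xml", package = pkgname)
}

# Stand-in for parseQueries(): with no file given, it looks up the default
# itself, so .onLoad never holds the path.
parse_queries_sketch <- function(qfile = NULL, pkgname = "gcamrpt") {
    if (is.null(qfile)) {
        qfile <- default_query_file(pkgname)
    }
    if (!nzchar(qfile)) {
        warning("Default query file not found for package '", pkgname, "'")
        return(invisible(NULL))
    }
    qfile  # the real function would parse the XML file here
}

.onLoad <- function(libname, pkgname) {
    parse_queries_sketch(pkgname = pkgname)
}
```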

Write Excel output function

Write a function to output the tables generated by the rest of the code.

Trim package exports

Right now I'm exporting most of the key functions in the package, but in fact we expect to have a fairly minimal interface. Before our first release we should trim out all of the exports that we don't expect users to call directly.

Set up CI

Set up CI runs on Travis and require checks to pass before merging is allowed.

IIASA format write-out crashing

I'm working on the transportation branch but the code looks the same on master, so this may be a problem for others too. If I try to write out the data in IIASA format (i.e., dataformat == 'IIASA'), I get the following crash after all of the queries have been run and all data aggregated:

Error in if (nrow(vardf) == 0) rslts[var] <- NULL : 
  argument is of length zero

The problem stems from line 221 in mcl.R, where the following line:

vardf <- rslts[var]

doesn't make a data frame; it makes a list of length one whose first element is a data frame. So, the next line, which is intended to check the number of rows of the data frame, instead causes a crash.
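The difference is easy to reproduce outside the package. The rslts list below is a made-up stand-in with the same structure; the likely fix is to extract the element with double brackets.

```r
# Stand-in for the results list: variable name -> data frame
rslts <- list(Population = data.frame(year = c(2020, 2025), value = c(1, 2)))
var <- "Population"

single <- rslts[var]     # single brackets: a list of length one
double <- rslts[[var]]   # double brackets: the data frame itself

is.data.frame(single)    # FALSE
nrow(single)             # NULL, so nrow(single) == 0 has length zero
nrow(double)             # 2

# With double brackets the row-count check behaves as intended
vardf <- rslts[[var]]
if (nrow(vardf) == 0) rslts[var] <- NULL
```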

Write filtering function

Write a function that accepts a table, a string containing a list of predicates in s-exp notation (i.e., (operator, var, value), for example (==, sector, 'beef')), and start and end years. Return the table filtered to rows where all predicates are true and the years are between the start and end years (inclusive).
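A simplified sketch of such a filter in base R. The name filter_table, the year column name, and the set of allowed operators are assumptions. It also assumes that values contain no commas or parentheses, which a real parser would need to handle.

```r
# Hypothetical sketch (not gcamrpt's API). predicates is a string such as
# "(==, sector, 'beef'), (>, value, 1)"; all predicates must hold.
filter_table <- function(tbl, predicates, startyear, endyear) {
    keep <- tbl$year >= startyear & tbl$year <= endyear
    allowed <- c("==", "!=", "<", ">", "<=", ">=")

    # Pull out each parenthesized triple
    preds <- regmatches(predicates, gregexpr("\\([^()]*\\)", predicates))[[1]]
    for (p in preds) {
        inner <- gsub("^\\(|\\)$", "", p)
        parts <- trimws(strsplit(inner, ",", fixed = TRUE)[[1]])
        if (length(parts) != 3 || !(parts[1] %in% allowed) ||
            !(parts[2] %in% names(tbl))) {
            warning("Skipping malformed predicate: ", p)
            next
        }
        value <- parts[3]
        if (grepl("^'.*'$", value) || grepl('^".*"$', value)) {
            # Quoted values are compared as strings
            value <- substr(value, 2, nchar(value) - 1)
        } else {
            # Unquoted values are treated as numbers
            value <- suppressWarnings(as.numeric(value))
            if (is.na(value)) {
                warning("Skipping predicate with non-numeric value: ", p)
                next
            }
        }
        keep <- keep & match.fun(parts[1])(tbl[[parts[2]]], value)
    }
    tbl[keep, , drop = FALSE]
}
```

Restricting the operator to a fixed list avoids evaluating arbitrary user-supplied text as code.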

Add queries.xml file to extra data.

We can start with the simple one included with rgcam and expand it as required.
