ed2-mandifore

Project Status: Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.

The goal of ed2-mandifore is to run ED2 with Setaria in already grown, complex ecosystems using weather data from MANDIFORE sites.

Reproducibility

This project uses renv to manage package dependencies. This is essential for reproducing this work as the version of PEcAn.ED2 used is installed from a pull request that will likely never be merged. Using a different version of PEcAn.ED2 will result in errored runs. Run renv::restore() to install dependencies.

Setup scripts

Scripts 00 and 01 have already been run and have generated the data in data/. They do not need to be run again.

  1. Start by sourcing 02_setup-runs.R to generate files in the transect/ directory.
  2. In the terminal, navigate to a particular run directory (e.g. ./transect/MANDIFORE-SEUS-352/pine) and start the job as a background process with ./run.sh.
  3. Follow the checklist below to confirm that the job is running correctly.

Job Start Checklist

Do all of this before starting the next job!

  • Is the R output being saved to workflow.Rout?
  • Are all the expected run/ and out/ folders created locally?
  • Find settings_checked.xml and look through it.
  • Are all the expected run/ and out/ folders created on the HPC?
  • Once the job starts on the HPC, record the SLURM job ID. It is not printed anywhere in the logs, so you will need to manually copy and paste it somewhere (e.g. into the pid.nohup file that has the local PID).
  • Spot-check the log files for multiple runs to confirm that the simulation has started.
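
The local parts of this checklist could be scripted. The following is a minimal sketch, not part of the project: check_job_start is a hypothetical helper, and it assumes that workflow.Rout, run/ and out/ sit directly under the directory you pass in, and that settings_checked.xml is somewhere below it. The HPC-side checks and the SLURM job ID still need to be done by hand.

```shell
#!/bin/sh
# check_job_start DIR
# Hypothetical helper: reports which of the local checklist items exist in DIR.
# Assumes workflow.Rout, run/ and out/ live directly under DIR (an assumption).
check_job_start() {
  dir="$1"
  rc=0

  # R output should be accumulating in workflow.Rout
  if [ -s "$dir/workflow.Rout" ]; then
    echo "ok: workflow.Rout"
  else
    echo "MISSING or empty: workflow.Rout"
    rc=1
  fi

  # Local run/ and out/ folders
  for d in run out; do
    if [ -d "$dir/$d" ]; then
      echo "ok: $d/"
    else
      echo "MISSING: $d/"
      rc=1
    fi
  done

  # settings_checked.xml may be nested, so search for it
  xml=$(find "$dir" -name settings_checked.xml -print 2>/dev/null | head -n 1)
  if [ -n "$xml" ]; then
    echo "ok: $xml (still read through it by hand)"
  else
    echo "MISSING: settings_checked.xml"
    rc=1
  fi

  return "$rc"
}
```

Calling check_job_start ./transect/MANDIFORE-SEUS-352/pine prints one line per item and returns non-zero if anything is missing.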

...now you can start another job.

Job analytics

If you want to know how long a job took on the HPC you can use:

sacct -j <jobid> -o Start,End,Elapsed
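
If you want to compare run times across jobs, it can help to turn the Elapsed value into seconds. This is a small sketch, assuming Elapsed is printed as [DD-]HH:MM:SS (or MM:SS for very short jobs); elapsed_to_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# elapsed_to_seconds ELAPSED
# Hypothetical helper: converts a SLURM-style elapsed time such as
# 17:28:25 or 1-02:03:04 into a number of seconds.
elapsed_to_seconds() {
  echo "$1" | awk '{
    days = 0
    t = $1
    # Optional leading "DD-" part
    if (index(t, "-") > 0) {
      split(t, a, "-")
      days = a[1]
      t = a[2]
    }
    n = split(t, p, ":")
    if (n == 3) {
      s = p[1] * 3600 + p[2] * 60 + p[3]   # HH:MM:SS
    } else {
      s = p[1] * 60 + p[2]                 # MM:SS
    }
    print days * 86400 + s
  }'
}
```

For example, elapsed_to_seconds 17:28:25 prints 62905. The value can be taken from sacct by adding -X -n and requesting only the Elapsed field.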

To check remaining compute hours:

va

Contributors

aariq, kristinariemer

ed2-mandifore's Issues

Old pss and css files not compatible with modern ED2

The history files from Mike were created with ED2.1.0 at some point (not sure exactly when) and are meant to work with IED_INIT_MODE=3, which is now deprecated in favor of IED_INIT_MODE=6. The .pss files have a different format with the new mode. Also, the .css files contain some PFTs that have been replaced (12, 13, 14), so these need to be replaced with some modern PFTs. Other changes might be necessary also.

Fix messed up git situation

Accidentally committed a large file on #19, now can't push to GitHub. Untracked large file, but still can't push. Can I squash commits or something??

Some jobs not starting using run.sh

Some jobs apparently don't start when using the run.sh script. The jobid is NULL and the workflow. Not sure if there is a pattern to this or if it's stochastic.

More than 3 simultaneous HPC jobs hits CPU limit?

It seems like we are either getting close to the compute limit, or there's some other limit about the number of simultaneous jobs/nodes/cores.

(puma) [ericrscott@wentletrap ~]$ squeue -u ericrscott
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           6753242  standard PEcAn-SA ericrsco PD       0:00      2 (AssocGrpCPUMinutesLimit)
           6753209  standard PEcAn-SA ericrsco  R   17:28:25      2 r1u11n2,r4u29n2
           6753173  standard PEcAn-SA ericrsco  R   17:40:42      2 r1u26n1,r2u32n1
           6753207  standard PEcAn-SA ericrsco  R   17:34:33      2 r4u08n2,r4u13n2
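
To keep an eye on this while jobs are queued, the squeue output can be summarised by state and pending reason. A minimal sketch, assuming the default column layout shown above (state in column 5, reason in the last column); summarize_squeue is a hypothetical helper that reads squeue output, header included, on standard input.

```shell
#!/bin/sh
# summarize_squeue
# Hypothetical helper: counts jobs per state and, for pending (PD) jobs,
# per reason. Expects default squeue output with its header line on stdin.
summarize_squeue() {
  awk '
    NR > 1 && NF >= 5 {
      state[$5]++
      if ($5 == "PD") reason[$NF]++
    }
    END {
      for (s in state)  print "state", s, state[s]
      for (r in reason) print "pending-reason", r, reason[r]
    }
  ' | LC_ALL=C sort
}
```

Usage would be squeue -u ericrscott | summarize_squeue. A pending-reason line containing AssocGrpCPUMinutesLimit indicates that the job is being held by the group's CPU-minutes allocation rather than by a cap on the number of simultaneous jobs.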

Figure out what to do about PEcAn breaking allometry equations

Currently PEcAn overwrites ED2 defaults with its own defaults resulting in errored runs.

The simplest fix is to just turn off this behavior in PEcAn and let ED2 use its own defaults, but this will never be merged into the develop branch. Options:

  1. Keep a fork/branch that we install PEcAn.ED2 from
  2. Try to fix PEcAn.ED2 in a less invasive way:
    • re-name ED2IN.r2.2.0.github to ED2IN.rgit so history.rgit is used for defaults
    • possibly edit ED2IN to use ED 2.2.0 defaults for everything (i.e. IMETRAD, IALLOM and ISTRUCT_GROWTH_SCHEME) (compare with what other ED2IN files in inst/ do)
    • update suspected bad values in history.rgit

Figure out why numbers seem wrong

NPP seems way too high
AGB seems way too low
transpiration seems way too low

  • Inspect raw data coming from .h5 files
  • Triple-check calculations and conversions
  • Ask in ED2 discussions if I'm getting the units and conversions right

New sites for model training data

  • Select 20 sites from SEUS that we haven't run yet
  • Generate run files
  • Change number of cores (programmatically in setup script?) to encumber fewer CPU hours
  • Start all runs

Get list of ED2 pfts

Create table with columns: pft number, number of sites in PNW, number of sites in SEUS

Separate Patches

The initial idea of having three patches representing three distinct ecosystems didn't appear to work. After just a few timepoints in the simulation it appears that all PFTs were in all patches. Rather than figuring out how to make patches not interact and not split, it might be better to just do these as separate runs. It'll take ~3x longer, but we already know how to do it.

turn on SA

Figure out PFT mappings

Files from Mike have ED2 PFT numbers, but we need to create a lookup table to decide which PEcAn PFTs these match, potentially taking into account whether the sites are in the PNW (Pacific Northwest) or SEUS (southeastern US). Also keep in mind that 3 PFTs are "upgraded":

12 -> 9
13 -> 10
14 -> 3

Although I'm suspicious about that last one, since in modern PEcAn ED PFT 3 is broadleaf_evergreen_tropical_tree and it doesn't make sense that it occurs in the PNW sites.
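
Applying the three "upgrades" to a cohort file could look like the sketch below. It is illustrative only: remap_pfts is a hypothetical helper, and it assumes the file has a single header line and that the PFT number is in a whitespace-separated column whose position you pass in (defaulting to 6, which should be checked against the actual .css files). The 14 -> 3 entry is included as listed above but should be treated as provisional.

```shell
#!/bin/sh
# remap_pfts FILE [PFT_COLUMN]
# Hypothetical helper: prints FILE with old PFT numbers replaced.
# Assumes one header line and a whitespace-separated PFT column
# (column 6 by default; this position is an assumption).
remap_pfts() {
  col="${2:-6}"
  awk -v col="$col" '
    BEGIN { map[12] = 9; map[13] = 10; map[14] = 3 }  # 14 -> 3 is provisional
    NR == 1 { print; next }                           # keep header as-is
    {
      # Rewriting a field makes awk re-join the row with single spaces,
      # so column alignment is lost on modified rows.
      if ($col in map) $col = map[$col]
      print
    }
  ' "$1"
}
```

Running remap_pfts old.css > new.css writes a remapped copy and leaves the original untouched, so the two files can be diffed before anything is replaced.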

Jobs started with run.sh don't use `renv`

R CMD BATCH doesn't load .Rprofile, so doesn't use renv and uses different package versions.

Solution:
Add source(".Rprofile") at the top of workflow.R. This doesn't work, though, unless you provide an absolute path or call setwd(), neither of which is a great solution. Maybe something like setwd("../..") would work?
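
One way to avoid hard-coding either an absolute path or a relative setwd() is to locate the project root from the run directory at launch time. The sketch below is an assumption-laden illustration, not the project's run.sh: find_renv_root is a hypothetical helper that walks up from a starting directory until it finds renv.lock.

```shell
#!/bin/sh
# find_renv_root [START_DIR]
# Hypothetical helper: walks up from START_DIR (default: current directory)
# and prints the first directory that contains renv.lock.
find_renv_root() {
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -f "$dir/renv.lock" ]; then
      echo "$dir"
      return 0
    fi
    # Reached the filesystem root without finding a lockfile
    [ "$dir" = "/" ] && return 1
    dir=$(dirname "$dir")
  done
}
```

The result could then be exported from run.sh under a name such as PROJECT_ROOT (a made-up variable) and picked up at the top of workflow.R with renv::load(Sys.getenv("PROJECT_ROOT")), so the script no longer depends on the working directory it was started from.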

Split workflow.R

Split workflow.R into setup script, submit runs script, and process results script. This will hopefully reduce fragility and allow for some manual checking while maintaining some automation.

Also create checklists (could be programmatic, could be literal list) for each stage.

"Production" runs

Longer runs, more ensembles, all the sites, SA with ± 1, 2, 3 SD, etc.

debug model2netcdf.ED2()

For some reason model2netcdf.ED2() fails for the prairie run at site 1123 for ensemble member 1 at 2003.

> model2netcdf.ED2(
       # .x,
       outdirs[[1]],
       29.365195,
       -82.810137,
       '2002-06-01',
       '2012-06-30',
       c(
         SetariaWT = 1L,
         sentinel_ebifarm.c3grass = 5L,
         sentinel_ebifarm.c4grass = 16L,
         ebifarm.forb = 12L
       ),
       process_partial = TRUE
     )

2023-03-20 14:57:05 INFO   [model2netcdf.ED2] : 
   ----- Processing year: 2002 
2023-03-20 14:57:05 INFO   [read_E_files] : 
   *** Reading -E- file *** 
2023-03-20 14:57:15 INFO   [model2netcdf.ED2] : 
   *** Writing netCDF file *** 
2023-03-20 14:57:15 INFO   [model2netcdf.ED2] : 
   ----- Processing year: 2003 
2023-03-20 14:57:15 INFO   [read_E_files] : 
   *** Reading -E- file *** 
2023-03-20 14:57:25 INFO   [model2netcdf.ED2] : 
   *** Writing netCDF file *** 
Error in ncdf4::ncvar_put(nc = nc, varid = varid, vals = vals, start = start,  : 
  ncvar_put: error: you asked to write 48 values, but the passed data array only has 44 entries!
In addition: Warning message:
Returning more (or less) than 1 row per `summarise()` group was deprecated in dplyr 1.1.0.
ℹ Please use `reframe()` instead.
ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()` always returns an ungrouped
  data frame and adjust accordingly.
ℹ The deprecated feature was likely used in the PEcAn.ED2 package.
  Please report the issue to the authors.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated. 

Decide how to "plant" Setaria

Currently I've added Setaria by editing .css files to include Setaria in every patch with dbh = 0.6 (cm) and n = 1 (plants/m^2). dbh = 0.6 came from what @KristinaRiemer used in previous runs with .css files and n = 1 came from @dlebauer's suggestion. If these numbers are fine, this can be closed. Otherwise let's decide on a method.

Select sites

We said we'd try for ~ 50 sites. Someone needs to come up with criteria and/or a sampling procedure for narrowing down the sites.

Clone and simplify non-setaria PFTs

Remove most priors from PFTs. Keep a few parameters so SA and ensemble analysis generate variation. Replace uniform priors when possible.

Params to keep:

  • SLA
  • Vcmax
  • stomatal slope
  • cuticular conductance
  • quantum efficiency
  • fineroot2leaf

Some more "custom" PFTs like ebifarm.forb might need more priors to make them more distinct from the default ED2 PFT they're based on.

Add cleanup code

Remove .h5 files if .nc files were successfully created at the end of a run so Welsch doesn't fill up.
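
A conservative version of this cleanup might look like the following sketch. cleanup_h5 is a hypothetical helper, and "at least one non-empty .nc file exists in the same directory" is used as a stand-in for "successfully created"; a stricter check (for example, one .nc file per simulated year) may be more appropriate.

```shell
#!/bin/sh
# cleanup_h5 OUTDIR
# Hypothetical helper: deletes the .h5 files directly inside OUTDIR, but only
# if at least one non-empty .nc file is present there. The success criterion
# is an assumption; tighten it before using on real output.
cleanup_h5() {
  outdir="$1"
  first_nc=$(find "$outdir" -maxdepth 1 -type f -name '*.nc' -size +0 -print | head -n 1)
  if [ -z "$first_nc" ]; then
    echo "no non-empty .nc files in $outdir; keeping .h5 files" >&2
    return 1
  fi
  find "$outdir" -maxdepth 1 -type f -name '*.h5' -exec rm -f {} +
}
```

It would be called once per ensemble output folder at the end of a run, e.g. in a loop over the out/ subdirectories.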
