
r3pg's Issues

size distributions

I noticed that there is the option "correct_sizeDist". I think it is good to have d_sizeDist, but correct_sizeDist could remain as correct_bias. This is because d_sizeDist clearly contains the parameters that are used for size distributions. But these distributions are not only used for the bias correction. They are also used to calculate variables for the largest crop trees (e.g. mean dominant height) and can be important outputs for foresters who need to divide the stand into size classes due to their differences in wood value. So the bias correction is one of several uses for the distributions. The name correct_bias would describe more specifically what the option does, as opposed to correct_sizeDist, because the size distribution itself is not being changed. I should have said this more clearly before.

Question: should we create separate, documented checkInput functions?

At the moment, there is the internal chk_input function.

I wonder if we should create separate, visible checkInput functions, e.g. checkSiteInput. Apart from the fact that a user could check a separate input in this way, this would provide an opportunity to document the input format for each input more clearly, without blowing up the help of the main function, i.e. the long list about the parameter inputs could be removed in favor of a link to the separate check functions.

On the other hand, a user then has to jump between help files, so not sure if it is a good idea.
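
For illustration, a hypothetical shape such a function could take (the name and required column set are assumptions for the sketch, not the package API):

check_site <- function(site){
  # columns assumed here from the d_site example data; adjust to the real spec
  req <- c('latitude', 'altitude', 'soil_class', 'asw_i', 'asw_min',
           'asw_max', 'from', 'to')
  missing <- setdiff(req, names(site))
  if (length(missing) > 0)
    stop('site is missing columns: ', paste(missing, collapse = ', '))
  invisible(site)
}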

Generate climate data

We need a function to generate climate data in case the user provides only average monthly data.

e.g.

create_climate <- function(data, n_year = 10, i_month = 1){
  # rotate the 12 monthly rows so the series starts at i_month,
  # then stack n_year copies of the year into one table
  data <- rbind(data[i_month:12, , drop = FALSE], data[seq_len(i_month - 1), , drop = FALSE])
  do.call(rbind, replicate(n_year, data, simplify = FALSE))
}

growth is negligible if trees are planted more than one or two months after simulation starts

While I haven't looked through the code to see where this issue is coming from, there appear to be some assumptions about the relationship between d_site$from and d_species$planted. If from is too far before planted then the trees don't grow.

There are a few curious things here:

  • stems_n at the beginning of the stand output matches the number of stems planted even though no trees have been planted.
  • GPP is small but nonzero (~1E-8) before planting and then declines after planting, resulting in GPP being too low by several orders of magnitude by a couple decades into the stand trajectory. This carries through into NPP and explains the lack of growth.
  • The above behavior doesn't occur if from precedes planted by one or two months, but whether the maximum difference before growth shuts down is one or two months seems to vary seasonally. Not sure if there's a requirement that from and planted be in the same year or if something else is going on.
  • from can occur after planted, in which case the trees sit in stasis until timesteps start. This is unsurprising from a code perspective but, since there doesn't appear to be any constraint on the lag to simulation start, it's possible to skip decades of growth.

The workaround I'm using is to set from = planted, which is no trouble for my purposes, but I wanted to capture this since it seems like there's some missing argument checking, maybe some bugs, and others might find it harder to troubleshoot the issue. Not sure if the vignette should have from and planted getting initialized by a shared variable or one picking up its value from the other, but either coding method would discourage their divergence.
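
For example, the shared-variable option might look like this (the date is a placeholder):

stand_start <- '2000-01'        # hypothetical establishment month
d_site$from <- stand_start
d_species$planted <- stand_start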

I also wonder if these apparent issues around triggering planting as a management action from a specific date might partially or fully explain #36.

cohort age tends not to be the trees' true age

As of r3PG 0.1.3, with the workaround for #52, what I get in the output is that the start month of the simulation is age 0. This is equivalent to assuming the simulation starts at germination, which is awkward for silviculture based on nursery stock such as plug+0 or 2+0. I didn't see a way of addressing this in #42 or #49.

The workaround I'm using for now is to maintain distinct r3PGage and trueAge columns in the tibbles. This isn't a big deal as we don't have complex, multi-cohort stands and therefore don't have too many planting age offsets to keep track of. But it'd be easier if r3PG could emit fully accurate tree ages. An alternate workaround is to start the simulation in the greenhouse and then carry it through any container to bare root transitions before planting at the harvest unit. However, it's a much simpler setup if weather inputs are needed only for the management unit the trees end up on and you don't need to insert "thinning" treatments to represent exogenous planting mortality.
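
For illustration, a minimal sketch of that bookkeeping, assuming a wide output tibble with r3PG's reported age in a column named r3PGage (the column names and the offset are illustrative):

library(dplyr)

age_at_start <- 2   # trees' true age (years) at the first timestep, e.g. 2+0 stock
stand <- stand %>% mutate(trueAge = r3PGage + age_at_start)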

A second motivation for such age bookkeeping is correctly selecting the simulation months matching field measurements in order to calculate model error. Some measurement datasets define age 0 as the planting age and therefore compose naturally with r3PG's current implementation. Most of the ones I work with don't, though, and translating the dataset to r3PG's age origin tends to be more complex than maintaining two age columns.

It's also common in the monitoring plot datasets I work with that planting records are minimal or absent and measurements don't start until trees have reached some requirement such as a 2 cm or 5 cm DBH. In such cases there may not be data to initialize a simulation cohort until the trees are 5-12+ years old, resulting in a greater disparity between true ages and 3-PG ages. On production units the first inventory might not be until somewhere in the range of age 20-35, which in some species is enough that it's probably a good idea to start manipulating the MaxAge, nAge, and rAge species parameters to account for discrepancies between 3-PG's ages and the trees' actual ages. This too seems like avoidable complexity. Iterating sims to figure out when to introduce a naturally recruited cohort, so that running forward matches its measurements, is in principle probably preferable with respect to fully capturing stand state in a model. However, such an approach isn't always feasible or practical.

I think the cleanest way to simplify these cases might be to have d_species$germinated as well as d_species$planted, calculating age from germinated but continuing to use planted as the trigger for the management action. That makes it easy to specify a plausible spring germination month, for example, and lets the code deal with sorting out the timing details of things like fall versus winter plantings rather than making users recalculate an offset. If germinated defaults to planted it's a non-breaking change and transparent to existing code.

Travis CI build error

OK, I'll file an issue now because a colleague told me they have exactly the same problem with BayesianTools, so maybe worth discussing it here.

During install, we are getting the following error on Travis for builds that definitely ran through on Travis before

* DONE (ROI)
* installing *source* package ‘Rglpk’ ...
** package ‘Rglpk’ successfully unpacked and MD5 sums checked
** libs
/bin/bash: line 0: cd: GLPK: No such file or directory
Makevars:10: recipe for target 'GLPK.ts' failed
make: *** [GLPK.ts] Error 1
ERROR: compilation failed for package ‘Rglpk’
* removing ‘/home/travis/R/Library/Rglpk’
Error in i.p(...) : 
  (converted from warning) installation of package ‘Rglpk’ had non-zero exit status
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
The command "Rscript -e 'deps <- remotes::dev_package_deps(dependencies = NA);remotes::install_deps(dependencies = TRUE);if (!all(deps$package %in% installed.packages())) { message("missing: ", paste(setdiff(deps$package, installed.packages()), collapse=", ")); q(status = 1, save = "no")}'" failed and exited with 1 during .
Your build has been stopped.

It's unclear if something is going wrong on Travis, whether a package dependency has somehow changed and is breaking the build, or something else. I don't have any problems locally, but I haven't re-installed all my packages.

spatial simulation in vignette broken with multidplyr 0.1.0

multidplyr's change from create_cluster() to new_cluster() came up already in #40. There are also updates needed with cluster_copy() and partition().

The code below hello worlds for me with r3PG 0.1.3 and multidplyr 0.1.0, though there are warnings about differing measure variable attributes and uneven raster pixels, and there is probably a reliability issue within multidplyr, as the R session crashed the first time I tried the code. The uneven pixels appear to be due to the site locations in coord.grid and are presumably not a new issue.

cl_in <- new_cluster(n = 2) %>% # or more cores if desired
  cluster_library(c('r3PG', 'purrr', 'dplyr', 'tidyr')) %>%
  cluster_copy(names = c("r3pg_grid", "species.grid", "thinn.grid", "param.draw"))

sim.grid <- inner_join(site.grid, climate.grid, by = 'grid_id')  %>%
  sample_n(10) %>%
  group_by(grid_id) %>% # partition() now operates on the tibble's groups
  partition(cluster = cl_in) %>%
  mutate( out = map2(site, forc, ~r3pg_grid(.x, .y))) %>%
  select( grid_id, out) %>%
  collect() %>%
  unnest_legacy() %>%
  ungroup()

testthat error

Hi @florianhartig, would you mind adding a simple working example to the testthat part, e.g. TRUE == TRUE?
Since you have added the test directory, I'm getting errors during checking:

Error in test_files(paths, reporter = reporter, env = env, stop_on_failure = stop_on_failure,  : 
  No matching test file in dir
Calls: test_check -> test_package_dir -> test_dir -> test_files
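
A minimal placeholder would be something like the following, assuming the standard testthat layout (tests/testthat.R plus a tests/testthat/test-basic.R file):

library(testthat)

test_that('placeholder', {
  expect_true(TRUE)
})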

Fortran runtime error

I can compile the package, but when I try running the example I get

Fortran runtime error: Funny sized logical array

I assume it's a problem with my Fortran version or something similar, but I'm not sure. Any idea?

3PG model is insensitive to parameter variation (Morris Sensitivity Analysis)

Dear,

I'm trying to analyze the sensitivity of my 3PG model using the Morris method as it was described in the vignette example.
But the analysis returned a data frame full of zeros, suggesting that the model is insensitive to variation in the parameter values.

So basically, the species I'm working on is Cedrus atlantica Manetti (blue cedar), for which no model has been developed before. To get around the lack of species-specific parameters, I decided to use "param_solling" (default, min and max values).

Also, biomass (stem, foliage and roots), DBH and NPP were used to calculate the log likelihood [NPP for the entire period month by month, and for the other variables the beginning and the end of the period].

When I calculated log likelihood value using default parameters I got -915. But the model seems to be insensitive to parameter variation.

Is there any issue with the approach ?

Any help will be appreciated.

Catarina S.

NB: @issamyax, the user who asked the preceding question, and I are working on the same project.

placette70.xlsx

Output list

Is there a list of all output variables that includes their description and units?

Preparing CRAN submission 0.1.0

Issue to collect last points / discuss for the first CRAN submission.

Vova, have you run the package through winbuilder already? I can't do it, because emails always go to the maintainer.

3-PG manual

Would it be possible to somewhere refer to the manual I wrote? It points people in the direction of different sources of information and hopefully facilitates higher quality use of 3-PG. It is not a replacement for the Sands work, but it tries to fill in some information that appears to be misunderstood when trying to obtain data to estimate the parameters.

Forrester, D. I., 2020. 3-PG User Manual (available from https://sites.google.com/site/davidforresterssite/home/projects/3PGmix/3pgmixdownload). Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland. 70 p.

removing one of the cohort

This issue appears if one of the cohorts is thinned to 0. Unfortunately, the last thinning for Picea abies_1 produces NA and causes all of the simulation to be NA afterwards.

library(readxl)

f_loc <- ""   # path to the input workbook
site_in <- read_xlsx(f_loc, 'site')

out_3PG <- run_3PG(
  site        = site_in,
  species     = read_xlsx(f_loc, 'species'),
  climate     = prepare_climate(climate = read_xlsx(f_loc, 'climate'),
                                from = site_in$from, to = site_in$to),
  thinning    = read_xlsx(f_loc, 'thinning'),
  parameters  = read_xlsx(f_loc, 'parameters'),
  size_dist   = read_xlsx(f_loc, 'sizeDist'),
  settings    = list(light_model = 2, transp_model = 2, phys_model = 2,
                     height_model = 2, correct_bias = 1, calculate_d13c = 0),
  check_input = TRUE, df_out = TRUE)

This issue is not occurring if I test with the internal data.

library(r3PG)
library(dplyr)
library(ggplot2)

# set the final thinning event to 0 stems to try to reproduce the NA issue
d_thinning$stems_n <- c(500.1, 600.8, 0)

out_3PG <- run_3PG(
  site        = d_site, 
  species     = d_species, 
  climate     = d_climate, 
  thinning    = d_thinning,
  parameters  = d_parameters, 
  size_dist   = d_sizeDist,
  settings    = list(light_model = 2, transp_model = 2, phys_model = 2, 
    height_model = 1, correct_bias = 0, calculate_d13c = 0),
  check_input = TRUE, df_out = TRUE)

sel_var <- c('biom_stem', 'biom_foliage', 'biom_root', 'stems_n')

out_3PG %>%
  filter( variable %in% sel_var ) %>%
  ggplot( aes(date, value, color = species) ) +
  geom_line() +
  facet_wrap(~variable, scales = 'free') +
  theme_classic()

1019000_input.xlsx

Roxygen old usage or not

I added Roxygen: list(old_usage = TRUE) to the DESCRIPTION, which means that parameter values in the help will not be separated by a line break - this is how it used to be in R.

Since the last Roxygen version, they changed this, but I prefer the old style. You can try for yourself which you like best.

src/Makevars has all: $(SHLIB) clean

See #23 (comment)

Dear maintainer,
Please see the problems shown on
<https://cran.r-project.org/web/checks/check_results_r3PG.html>.
Please correct before 2020-05-26 to safely retain your package on CRAN.
The CRAN Team

src/Makevars has

all: $(SHLIB) clean

'Writing R Extensions' warned you that was wrong, and it leads to frequent failures with parallel makes.

I think this is one of their new checks (a similar one was found here for rstantools).

OK, I think I have fixed it, based on this fix for parallel make (following advice from Brian Ripley) and this.
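
For reference, one parallel-safe pattern (a sketch; the exact recipe used in the fix may differ) is to run the cleanup from the all recipe after $(SHLIB) is built, rather than listing clean as a sibling prerequisite that parallel make may schedule first:

all: $(SHLIB)
	@rm -f *.o *.mod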

Year and month added to climate data

Could the climate be provided so that it includes a year and month column, and can therefore start before the simulation start and end after the simulation end? This way users can enter all the climate data they have for a given site, and then use the same climate data for different simulations in terms of start and end times - the same as in the VBA version. This would make it possible to examine different rotations using the same climate data input.
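
This is essentially what prepare_climate() (used elsewhere in these issues) provides; a sketch of the requested workflow, with a hypothetical file name and dates:

clim_all <- read.csv('climate_site.csv')   # full record with year and month columns
clim_run <- prepare_climate(climate = clim_all, from = '2000-01', to = '2009-12')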

Volume prediction by R3PG

I tried using r3PG to simulate one of my study sites and I found that the DBH and height predictions were doing fine, but the volume predictions were quite funny. The volumes were over-predicted (DBH = 1.97 cm and height = 1.59 m was yielding a volume of 344.8 m3 ha-1). I checked the function used for volume calculation and found it was quite different from the 3PGpjs version. I went further and calculated the volume manually using this equation from the r3PG code, and I got the same funny volume.

Regrowth of the trees after harvesting

From the user

I have a mixed stand composed of 5 species of different growth rates planted at the same time and would like to simulate the harvest of only one of them, followed by its regrowth through resprouting. I tried to simulate this in 3PGmix in Excel by "planting" the same species twice by repeating the species and using two different planting dates and a thinning of all individuals for the first line, but it did not work. Do you see any way to simulate this using r3PG?

We can consider adding this option

Clean previous releases

Once we are ready with the package, remove all the previous releases and make the first one v0.1.0.

Error: 'create_cluster'

I prepared the code below:

multidplyr::create_cluster() %>%
  cluster_library(c('r3PG', 'purrr', 'dplyr', 'tidyr')) %>%
  cluster_copy( r3pg_grid ) %>%
  cluster_copy( species.grid ) %>%
  cluster_copy( thinn.grid ) %>%
  cluster_copy( param.draw )

However, I got the following error message:
Error: 'create_cluster' is not an exported object from 'namespace:multidplyr'

Was this function create_cluster removed? (https://www.rdocumentation.org/packages/multidplyr/versions/0.0.0.9000/topics/create_cluster)
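
For reference, multidplyr 0.1.0 replaced create_cluster() with new_cluster(), and cluster_copy() now takes a character vector of names. A sketch of the equivalent setup (assuming the same objects exist in the session; this matches the updated vignette code in the multidplyr 0.1.0 issue above):

library(dplyr)
library(multidplyr)

cl <- new_cluster(2) %>%
  cluster_library(c('r3PG', 'purrr', 'dplyr', 'tidyr')) %>%
  cluster_copy(names = c('r3pg_grid', 'species.grid', 'thinn.grid', 'param.draw'))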

Follow-up questions on Bayesian calibration example in the vignette

Dear @florianhartig @trotsiuk

I tried running the Bayesian calibration using the data provided in the vignette build, and the result I got was different from yours. I ran both the Morris sensitivity analysis and the Bayesian calibration. The Morris sensitivity results seem to be the same. I must say here that I don't have advanced knowledge of Bayesian statistics or of using BayesianTools in R.

Here is what I observed.

Running this line of code:

mcmc_out <- runMCMC(
  bayesianSetup = mcmc_setup,
  sampler = "DEzs",
  settings = list(iterations = 4e+03, nrChains = 3))

On the console, I saw this:

runMCMC terminated after 37.8599999999999seconds002 . Current logp -474.9476 -469.4768 -468.8282 . Please wait!
runMCMC terminated after 34.8800000000001seconds002 . Current logp -341.9094 -339.1239 -342.8411 . Please wait!
runMCMC terminated after 34.3899999999999seconds002 . Current logp -357.7484 -358.9804 -356.8762 . Please wait!

Question: was it supposed to terminate? Is it doing the right thing here?

Also, the result of gelmanDiagnostics(mcmc_out) gave me 95.2, contrary to the 1.11 you got.

I have attached a graph showing the observed, calibrated, and default values. [attachment: calibration plot]

Please, can you help clarify what I am doing wrong?

Also, I am doing multiple runs for my simulations; we wrote a kind of pipeline to make r3PG loop over multiple sites. I was trying to go over this method of calibrating the r3PG model to see how I can use it.

I would also like to get your advice on how to do that, since the vignette explains only a single run.

Thank you.

provide the light absorption as an output

from @DavidForrester

Would it be possible to provide the light absorption as an output? I was looking at par; I guessed this is photosynthetically active radiation (as opposed to the absorbed par), but if it were, it would be solar radiation / 2 and would not vary between species. If it is the absorbed par, it might be better to name it apar, which is often used, and which would distinguish it from par, which is also often used, but for photosynthetically active radiation.

Data check

Add a check for NA in the input data: climate (especially co2, ...).
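
A sketch of the kind of check intended, assuming a climate data frame; the column list is an assumption and should follow the real input spec:

check_climate_na <- function(climate,
    cols = c('tmp_min', 'tmp_max', 'prcp', 'srad', 'frost_days', 'co2')){
  cols <- intersect(cols, names(climate))
  bad <- cols[vapply(climate[cols], anyNA, logical(1))]
  if (length(bad) > 0)
    stop('NA values in climate columns: ', paste(bad, collapse = ', '))
  invisible(TRUE)
}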

Soil water availability

Hi Vova,

I was wondering what maximum and minimum available soil water (asw_min, asw_max) actually are, and what the difference is to field capacity and permanent wilting point. I was also wondering if it is only the difference between maximum and minimum that matters in the end as water storage in the model?

Best wishes,
Johannes

vignette passes a default standard deviation vector to likelihoodIidNormal

The signature of likelihoodIidNormal is

likelihoodIidNormal(predicted, observed, sd)

but the vignette has

  err_def <- setNames(error_solling$default, error_solling$param_name)

  logpost <- sapply(1:6, function(i) {
    likelihoodIidNormal( sim.df[,i], observ_solling_mat[,i], err_def[i] ) 
  }) %>% sum(.)

I have three concerns about this.

  1. It's unclear where the default values come from. Since the other columns in error_solling are min and max and default is always in between, it looks like the intent might be to have some sort of constrained error model. But min and max don't seem to be used, and I'm not seeing as much of a connection between default and the data as I would expect. For example, err_biom_root and err_biom_foliage both default to 1, even though simulated root biomass comes out several times larger than foliage biomass and is usually considerably harder to measure than foliage biomass.
  2. It's implausible the covariance matrix is diagonal. I recognize r3PG is limited to what BayesianTools supports and that there is no likelihoodNormal(predicted, observed, covariance) available. But treating error in basal area, height, and stem biomass as independent of error in DBH leaves me less confident in the accuracy of Morris and DE-MCMC results than I would like. I don't see an issue tracking this in either r3PG or in the BayesianTools repo, so wanted to capture it.
  3. The name likelihoodIidNormal implies a scalar standard deviation, but r3PG is passing in a vector that's definitely not indicating identical distributions. I would therefore expect the r3PG vignette to fail in parameter validation on the BayesianTools side. But it doesn't. Since the BayesianTools documentation is terse, it's unclear if there's vector support (suggesting likelihoodIidNormal is probably an obsolete name) or if something like the default standard deviation for err_basal_area is getting applied to all six measurements.

@florianhartig, if you'd like a tracking issue for 2 or 3 on the BayesianTools side let me know and I can type something up.

Load data in vignette

In the vignette, if we do

load('vignette_data/solling.rda')

this works for us, but not for the user, right? We could probably load directly from the internet, as in

https://raw.githubusercontent.com/trotsiuk/r3PG/master/pkg/data-raw/vignette_data/solling.rda

If we do this, we could also move the data out of the package, e.g. in the repo under a folder data. It would even be possible to provide further datasets there.
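
For example (a sketch; load() accepts a URL connection directly):

load(url('https://raw.githubusercontent.com/trotsiuk/r3PG/master/pkg/data-raw/vignette_data/solling.rda'))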

change naming for sub_climate?

Hi Vova, it's more a question of style, but the recommendation is that function names should be understandable. If I understand correctly, sub_climate can be used to subset or replicate climate data. Maybe prepareClimateTable would be a more accurate / understandable name?

Include default Parameters for European tree species into R package?

Hey @trotsiuk and @DavidForrester,

I think it would be quite useful to include the parameter estimates from https://link.springer.com/article/10.1007/s10342-021-01370-3 in the package. Specifically, what we could do is simply save the MAP parameter estimates as a data object, but ideally we would of course also save the uncertainty. BayesianTools includes a function to create a prior from a posterior sample, so we could apply this and save the respective BT object.

Combined with the next issue that I will open, this would allow a much swifter use of 3PG.

soil classes

I also saw that the soil textures are defined in d_site, which is good (I still don't know why they are considered species parameters in VBA). But in the VBA version, when the soil classes are defined in this way, there are only about 4 classes. In reality, there is a gradient of soil types, and that is when the "species parameters" SWconst and SWpower were useful - this was commonly used. I think this can be solved by adding more options to the 4 that you already have, as shown in the attached Excel file. Another option would be to have the soil class in d_site specified using both SWconst and SWpower.
soil_textures.xlsx
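
For reference, a sketch of the class-to-parameter mapping I believe the VBA version uses, with soil_class 1 = sandy through 4 = clay (the coefficients are as commonly given in the 3-PG literature and should be treated as an assumption):

sw_const <- function(soil_class) 0.8 - 0.10 * soil_class   # 0.7, 0.6, 0.5, 0.4
sw_power <- function(soil_class) 11 - 2 * soil_class       # 9, 7, 5, 3

Allowing non-integer classes in such a mapping, or accepting SWconst/SWpower directly in d_site, would cover the requested gradient of textures.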

How to set up 3pg for mixed forests?

Question from a user:

I would like to assess the contribution of [redacted] Forest (a mixed forest of cedar and oak) to carbon sequestration efforts using the 3PG model.

Forest stands in [redacted] Forest do not have the same age nor the same structure. In some locations stands are pure (containing only one species); in other locations, stands are mixed. Also, in some locations cedar is dominant, in others oak is dominant. Thus, I think that I could not use the same species input data frame to predict my dependent variable, which is NPP.

Height equation

Add the option for multiple height equations:

A Michajlow (or Schumacher) function (Michajlow, 1952) is used to predict heights and live crown lengths. The version without the "+ n_HC · C · DBH" term was found to be the most appropriate for the EFM data by Zell (2016).

y = 1.3 + a_H · exp(-n_HB / DBH) + n_HC · C · DBH

where y is height (HTOT) or live-crown length (LCL) in metres, C is the competition variable of 3-PG, a_H, n_HB and n_HC are fitted parameters for 3-PG, and DBH is diameter at 1.3 m in cm.
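
A sketch of the proposed function in R (argument names follow the issue text, not existing r3PG parameter names):

height_michajlow <- function(dbh, comp, a_H, n_HB, n_HC = 0){
  # with n_HC = 0 this is the reduced form preferred by Zell (2016)
  1.3 + a_H * exp(-n_HB / dbh) + n_HC * comp * dbh
}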

References

Michajlow, J., 1952. Mathematische Formulierung des Gesetzes für Wachstum und Zuwachs der Waldbäume und Bestände. Schweiz. Z. Forstw 103, 368-380.
Zell, J., 2016. A climate sensitive single tree stand simulator for Switzerland. Birmensdorf, Swiss Federal Institute of Forest, Snow and Landscape Research WSL. 107 p.

nonlinearfunction.docx

Settings

At the moment, all settings have to be provided, right?

settings = list(light_model = 1, transp_model = 1, phys_model = 1,
                correct_bias = 0, calculate_d13c = 0)

We could also implement a function like in BT which checks the settings. I have done this for another model, which is, however, not public. I can send you an example. However, it's not really a priority.
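
A sketch of what such a checker could look like (a hypothetical helper that fills defaults rather than requiring the full list; default values taken from the example above):

check_settings <- function(settings = list()){
  defaults <- list(light_model = 1, transp_model = 1, phys_model = 1,
                   correct_bias = 0, calculate_d13c = 0)
  stopifnot(all(names(settings) %in% names(defaults)))
  modifyList(defaults, settings)   # unspecified entries keep their defaults
}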

CRAN and NEWS file

Hi Vova,

in c226cad I created a CRAN and NEWS file.

The NEWS file is required (or at least strongly recommended) to document the changes in each release. I usually copy what's written there to the CRAN submission and to the GitHub releases (https://github.com/trotsiuk/r3PG/releases), which I would do at least for every CRAN release; I often do intermediate releases as well.

The CRAN file is for our own documentation, but most packages do this - paste there the text you write in your CRAN submission, plus the responses you get.

At the moment, I just copied over the text from DHARMa and modified to have a bare minimum. Feel free to modify / populate this.

Change pkg to r3PG?

This is just cosmetic, but I wonder if it would look nicer to change the pkg subfolder to r3PG? At least it seems to me that most packages I know (at least mine) have this structure. Maybe I'm just projecting my preference here; technically, it doesn't really matter.

Unit tests

We should add some testthat unit tests. I would suggest that we make 3 test cases for the run function:

  • test the basic functionality
  • tests for sanity (could be done together with the latter), i.e. that changes in recommended parameter space don't create crazy results
  • tests for compatibility with the other model versions, i.e. that results are similar to those of the Visual Basic versions.

For the latter, I thought we could run the Visual Basic model with the vignette drivers for a few parameter combinations (for both model versions), and then check in the test that the new version delivers close to identical results. This way, we can also immediately check this off the list of things to do for the paper.
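
A sketch of the first case, assuming the internal example data ships with the package (as used elsewhere in these issues):

library(testthat)
library(r3PG)

test_that('run_3PG works on the internal example data', {
  out <- run_3PG(site = d_site, species = d_species, climate = d_climate,
                 thinning = d_thinning, parameters = d_parameters,
                 size_dist = d_sizeDist,
                 settings = list(light_model = 1, transp_model = 1, phys_model = 1,
                                 height_model = 1, correct_bias = 0, calculate_d13c = 0),
                 check_input = TRUE, df_out = TRUE)
  expect_s3_class(out, 'data.frame')
  expect_gt(nrow(out), 0)
})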

Tave & VPD input data

If available, the user shall provide tave and vpd directly. Only if they don't have this information shall it be calculated by a simple approximation.

Within run_3PG there shall be a part to calculate tave if not available, and ...
The Fortran code shall take all variables ...
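
A sketch of the usual simple approximations from the 3-PG literature (whether run_3PG uses exactly these forms is an assumption):

# saturated vapour pressure (mbar), Magnus-type form commonly used in 3-PG
svp <- function(t) 6.10780 * exp(17.2690 * t / (237.30 + t))

tave_approx <- function(tmp_min, tmp_max) (tmp_min + tmp_max) / 2
vpd_approx  <- function(tmp_min, tmp_max) (svp(tmp_max) - svp(tmp_min)) / 2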
