bcgcalc's Introduction

Erik W. Leppo

Working as a data scientist in the environmental field with Tetra Tech (since 1994).

Working in R since 2006 and Shiny since 2017. Using GitHub for version control since 2016.

bcgcalc's People

Contributors

hedintt, leppott

Forkers

blocktt

bcgcalc's Issues

CT BCG - documentation

Is your feature request related to a problem? Please describe.
Update Vignette with specifics on adding new model.

Describe the solution you'd like
use CT as an example.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

BCG.Metric.Membership - fails if try to calculate bugs and fish for same model

Describe the bug
Error "missing values are not allowed in subscripted assignments of data frames".

Function fails.

Works if use only one sitetype.

To Reproduce
Run BCG.Metric.Membership with metrics from 2 site types (e.g., bugs and fish) that do not overlap completely. This leads to NA values when assigning 0 and 1 for membership.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
Errors in code here:

image

Additional context
Change to ifelse() to avoid issues with NAs.

OLD
image

NEW
image
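
The failure and the ifelse() alternative can be sketched with a small made-up data frame. The column names (VALUE, THRESH, MEMB) and the comparison are assumptions for illustration only, not the package's actual code.

```r
# Hypothetical data: the second row has no threshold for its site type,
# so the comparison below evaluates to NA for that row.
df <- data.frame(SITE_TYPE = c("bugs", "fish", "bugs"),
                 VALUE     = c(10, 5, 2),
                 THRESH    = c(8, NA, 4))

# OLD style: subscripted assignment with a logical index that contains NA.
res_old <- tryCatch({
  df_old <- df
  df_old$MEMB <- 0
  df_old[df_old$VALUE >= df_old$THRESH, "MEMB"] <- 1
  "no error"
}, error = function(e) conditionMessage(e))

# NEW style: ifelse() works element-wise and carries the NA through.
df$MEMB <- ifelse(df$VALUE >= df$THRESH, 1, 0)
```

Here res_old holds the error message from the old approach, while df$MEMB is filled for the rows that can be evaluated and left as NA for the row without a threshold.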

CITATION missing

Describe the bug
Citation not appearing for package.

To Reproduce
citation("BCGcalc")

Expected behavior
Citation should appear in R.

Screenshots
image

Additional context
Pointed out by user.

Could be generated from DESCRIPTION.

Alternately could create a CITATION file.
http://r-pkgs.had.co.nz/inst.html#inst-citation
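
A minimal sketch of what an inst/CITATION file could contain, using bibentry() from the utils package. The title, year, and note values are placeholders to be filled from DESCRIPTION, and the URL is assumed from the GitHub install path used elsewhere on this page.

```r
# Sketch of the contents of inst/CITATION (placeholder values marked with <>).
cit <- bibentry(
  bibtype = "Manual",
  title   = "BCGcalc: <package title from DESCRIPTION>",
  author  = person("Erik W.", "Leppo"),
  year    = "<year>",
  note    = "R package version <version>",
  url     = "https://github.com/leppott/BCGcalc"  # assumed repository URL
)

# Print the entry the way citation() would display it.
print(cit, style = "text")
```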

Thermal metrics incorrect.

Thermal metrics are calculated for any matching word rather than an exact match.

That is, the "COLD" metrics are counting both "COLD" and "COLD_COOL" and "WARM" metrics are counting both "WARM" and "COOL_WARM".

pi, pt, and nt are affected.
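
The difference between partial and exact matching can be illustrated as follows. The vector of thermal indicator values is made up, and whether the package used grepl() for this step is an assumption.

```r
# Hypothetical thermal indicator values for five taxa.
ti <- c("COLD", "COLD_COOL", "COOL_WARM", "WARM", "COLD")

# Pattern matching counts every value that contains the word,
# so "COLD_COOL" is counted as "COLD".
n_cold_partial <- sum(grepl("COLD", ti))

# Exact comparison counts only the intended category.
n_cold_exact <- sum(ti == "COLD")
n_warm_exact <- sum(ti == "WARM")
```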

Model - add - Maritime NW High-Low Gradient

Is your feature request related to a problem? Please describe.
Add Maritime NW BCG model for the High Gradient-Low Elevation class

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Some tiers have a best 2 of 3 trigger. Really uses the 2nd value so can program using the median of the three values.

Also update the "flags" file.
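
A short check of the "best 2 of 3" idea noted above: with exactly three values, the second-best value and the median are the same number. The membership values are made up for illustration.

```r
# Hypothetical membership values for the three rules in a "best 2 of 3" group.
memb <- c(0.2, 0.9, 0.6)

# Second-highest value, written out explicitly.
second_best <- sort(memb, decreasing = TRUE)[2]

# With exactly three values the median picks the same element.
med <- median(memb)
```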

Error Checking - BCG.Level.Membership

Describe the bug
Need error checking to ensure have all metrics in rules table.

To Reproduce
If metric names are mismatched, only those metrics that match are evaluated.

This can result in bad level membership calculations.

For example, if a level uses the minimum of the Rule0 values and all the "0" values drop out but a "1" is kept, then membership is "1" instead of "0".

Expected behavior
Should throw a hard error with message.

Additional context
NA
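
One way to get a hard error is to compare the metric names in the rules table with the calculated metric names before evaluating anything. The helper name check_rule_metrics and its arguments are hypothetical; this is a sketch, not the package's implementation.

```r
# Hypothetical helper: stop if any metric named in the rules table is
# missing from the calculated metric values.
check_rule_metrics <- function(rule_metrics, calc_metrics) {
  missing_met <- setdiff(unique(rule_metrics), calc_metrics)
  if (length(missing_met) > 0) {
    stop("Metrics in rules table not found in metric values: ",
         paste(missing_met, collapse = ", "))
  }
  invisible(TRUE)
}

# All rule metrics present: passes silently.
rules_ok <- check_rule_metrics(c("nt_total", "pi_ti_c"),
                               c("nt_total", "pi_ti_c", "nt_ti_c"))

# One rule metric missing: error message names the missing metric.
msg <- tryCatch(check_rule_metrics(c("nt_total", "pi_ti_c"), c("nt_total")),
                error = function(e) conditionMessage(e))
```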

Slopes - more information on getting data

Is your feature request related to a problem? Please describe.
From user SH:

  1. Slopes: there is talk of using the NHD Slope data, but no real description of how to find and use this data. I worked with Ryan Hill to download the data, bring it in, and merge with my bug samples. While the code has examples of how to bring it in, it doesn’t document the data itself. I’d recommend including links to where to find this data. I think what Ryan gave me was for the entire PNW, so it should work for everyone.

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Update Readme

Is your feature request related to a problem? Please describe.
Update ReadMe

Describe the solution you'd like

  1. Use only the example that installs the vignette.
  2. Include some example code for running functions from the package.
  3. Use "library" instead of "require" in examples.

Describe alternatives you've considered
NA

Additional context
Want a working example for getting only select metrics.

Update ReadMe for metric.values

Is your feature request related to a problem? Please describe.
ReadMe example doesn't reference BioMonTools library for metric.values.

Describe the solution you'd like
see above

Describe alternatives you've considered
NA

Additional context
NA

Metadata - taxa list

Is your feature request related to a problem? Please describe.
Meta data for master taxa list fields and valid values. Include sources of information.

Describe the solution you'd like
Not sure of best location, Vignette and/or Excel files. Probably both locations.

Describe alternatives you've considered
Short list in help file for metric.values is too brief.

Additional context
NA

CT BCG - flags

Is your feature request related to a problem? Please describe.
Add flags specific to CT BCG.

Describe the solution you'd like
Document any warnings that should be flagged so the user can best interpret the output. Implementation in the R package should be similar to what was done for the PacNW project. The framework is already in the package code.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

CT BCG - add model thresholds

Is your feature request related to a problem? Please describe.
Add CT BCG model thresholds.

Describe the solution you'd like
Add model to Excel file so available for calculation.

1 macroinvertebrate model and 3 fish models.

Describe alternatives you've considered
NA

Additional context
Model is in Access (available as a project file).

Gerritsen J, Jessup B. 2007. Calibration of the biological condition gradient for high gradient streams of Connecticut. Report prepared for US EPA Office of Science and Technology and the Connecticut Department of Environmental Protection. Tetra Tech, Maryland.

Stamp J, Gerritsen J. 2013. A biological condition gradient assessment model for stream fish communities of Connecticut. Final Report. Report prepared for US EPA Office of Science and Technology and the Connecticut Department of Environmental Protection. Tetra Tech, Maryland.

QC Check - metric.values - Exclude as T/F

Is your feature request related to a problem? Please describe.
Ensure have T/F in Exclude column for metric.values.
Some users may have Y/N or 1/0.

Describe the solution you'd like
Add statement to help file and vignette. But also have a warning in code.

Describe alternatives you've considered
Should be in the code. User may have Y/N or 1/0 (Access) and assume it is working.

Additional context
Need to be clear it is working.
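
A warning of this kind could look like the sketch below. The helper name check_logical_col is hypothetical, and the message wording is only a suggestion.

```r
# Hypothetical check: warn when a flag column is not logical (TRUE/FALSE).
check_logical_col <- function(df, col) {
  if (!is.logical(df[[col]])) {
    warning("Column '", col, "' should be TRUE/FALSE; found class '",
            class(df[[col]])[1], "'.")
    return(FALSE)
  }
  TRUE
}

# Made-up inputs: one with Y/N (as from Access) and one with TRUE/FALSE.
df_yn <- data.frame(TAXAID = c("A", "B"), EXCLUDE = c("Y", "N"),
                    stringsAsFactors = FALSE)
df_tf <- data.frame(TAXAID = c("A", "B"), EXCLUDE = c(TRUE, FALSE))

ok_tf  <- check_logical_col(df_tf, "EXCLUDE")
msg_yn <- tryCatch(check_logical_col(df_yn, "EXCLUDE"),
                   warning = function(w) conditionMessage(w))
```

The same check would apply to a 1/0 column, which R reads as numeric rather than logical.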

Update example data; Data_BCG_PacNW.xlsx

Describe the bug
Remove unneeded worksheets.
Update phylogeny.

To Reproduce
Errors in the phylogeny fields prevent Excluded taxa from being assigned.
Fixed in database.

Expected behavior
NA

Screenshots
NA

Additional context
NA

Move qc.checks function to BioMonTools package

Is your feature request related to a problem? Please describe.
Move qc.checks function to BioMonTools package

Describe the solution you'd like
Related to metric.values so move to the more appropriate package.

Will need to update the vignette.

Describe alternatives you've considered
Better to have in 1 package than spread across 2.

Additional context
Keep flags Excel file in this package. (But have in BioMonTools also).

Function not found (dcast and melt from reshape2)

Describe the bug
Flags example using dcast fails.

To Reproduce
Steps to reproduce the behavior:
Vignette - Level Assignment section

# long to wide format
df.flags.wide <- dcast(df.flags, SAMPLEID ~ CHECKNAME, value.var="FLAG")

Error - could not find function "dcast"

Expected behavior
Converts data frame from long to wide format.
Uses reshape2::dcast

library(reshape2) is at the top of the section.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Ensure that library(reshape2) is called in each example and vignette example section that uses dcast or melt. That should be enough.

reshape2 is in the "Imports" in DESCRIPTION.

reshape2::foo is used in functions for both dcast and melt.

New Metric - Percent Baetis tricaudatus complex + Simuliidae individual

Is your feature request related to a problem? Please describe.
Add new metric to metric.values; Percent Baetis tricaudatus complex + Simuliidae individual

Describe the solution you'd like
See above.

Describe alternatives you've considered
NA

Additional context
Easy enough to add into the code.

"pi_SimBtri"

Metadata - metric names

Is your feature request related to a problem? Please describe.
Metric names file is old/incomplete.
\extdata\MetricNames.xlsx

Describe the solution you'd like
Move to \doc and save as PDF so can access from help.
Add all metrics and group by how appear in metric.values function output.

Describe alternatives you've considered
File is incomplete. Don't have all names. And some descriptions incomplete.

Additional context
NA

Metric Membership - fails with extra columns from metric.values

Metric Membership fails if include cols2keep in metric.values.

Changes the structure of the input file for BCG.Metric.Membership and the code fails.

Temporary workaround = don't include extra columns.

Permanent fix = modify code to account for extra columns.

Examples; names not always consistent

Is your feature request related to a problem? Please describe.
Some names in examples are not always consistent between functions and vignette.

Describe the solution you'd like
Ensure the same format is always used in case users mix code.

Describe alternatives you've considered
NA

Additional context
NA

Write statements - intermediate as TSV, final as CSV

Is your feature request related to a problem? Please describe.
Be consistent in write statements.
TSV = intermediate step
CSV = final result

Describe the solution you'd like
Check examples and vignette

Describe alternatives you've considered
Have a mix in package so need to be consistent.

Additional context
NA

metric.values - NonTarget as T/F

Is your feature request related to a problem? Please describe.
In metric.values NonTarget must be TRUE or FALSE. No warning if not.

Describe the solution you'd like
Add a warning if no "FALSE" values.

Describe alternatives you've considered
Same solution as for Exclude column.

Additional context
NA

metric.values - fun.MetricNames not working

Describe the bug
Get error when try to use this variable.

To Reproduce
Steps to reproduce the behavior:

image

Expected behavior
Should get back a data frame with only those columns included.

Screenshots
See code above

Desktop (please complete the following information):
v1.2.2.9005

Additional context
NA

Metric Names - Not consistent

Is your feature request related to a problem? Please describe.
Metric names not always in consistent format.

e.g., most habit metrics are plural but burrow is not, and none of the ffg metrics are plural.

Describe the solution you'd like
Habit metrics could be "cling", "climb", "burrow", "swim" or something similar and consistent format.

Describe alternatives you've considered
Prefixes are consistent; only the endings differ.

Additional context
Need to make the change sooner rather than later, before developing a larger user base.

Map Vignette - color scale

Is your feature request related to a problem? Please describe.
On the map vignette change the scale. Cyan is too jarring and can't see other points all that well.

Describe the solution you'd like
Add to ggplot code.

  • scale_color_gradientn(colours = terrain.colors(5))

Describe alternatives you've considered
Could use color brewer or other sets of colors.

Additional context
Existing map and another using terrain colors.

image

image

Additional Report - Climate Indicator Metrics

Is your feature request related to a problem? Please describe.
Create a report for climate indicators.

  • Thermal preference (cold, cold/cool, cool/warm, warm)
      o Number of taxa
      o Percent of taxa
      o Percent individuals
  • Percent Baetis tricaudatus complex + Simuliidae individual

Describe the solution you'd like
Report (RMD), maybe an R Notebook.

Describe alternatives you've considered
Could be included in example code for "metric.values".

This might be the better / easier solution. Already have code to produce metrics. Just include selection code and output.

Additional context
Example code.

library(BCGcalc)
library(readxl)
library(knitr)

# Load Data
df.data <- read_excel(system.file("./extdata/Data_BCG_PacNW.xlsx"
                                  , package="BCGcalc"))

# Columns to keep
myCols <- c("Area_mi2", "SurfaceArea", "Density_m2", "Density_ft2")

# Run Function
df.metval <- metric.values(df.data, "bugs", fun.cols2keep=myCols)

# View Results
#View(df.metval)

# Metrics of Interest
# thermal indicator (ti)
#names(df.metval)[grepl("ti", names(df.metval))]
col.met2keep <- c("ni_total", "nt_total", "nt_ti_c", "nt_ti_cc", "nt_ti_cw"
                  , "nt_ti_w", "pi_ti_c", "pi_ti_cc", "pi_ti_cw", "pi_ti_w"
                  , "pt_ti_c", "pt_ti_cc", "pt_ti_cw", "pt_ti_w")
col.ID <- c("SAMPLEID", toupper(myCols), "INDEX_NAME", "SITE_TYPE")

# Output
df.metval.ci <- df.metval[, c(col.ID, col.met2keep)]

# RMD table
kable(df.metval.ci[1:10, ])

# Save
write.table(df.metval.ci, "metrics.thermalindicators.tsv"
            , col.names=TRUE, row.names=FALSE, sep="\t")

Default dataset generates level 6 membership for all sample_IDs

When using the default dataset "Data_BCG_PacNW.xlsx", the level assignment for all samples (n = 678 sample IDs) is 6. This was run in R 3.5.2 with the latest version of BCGcalc. I have reproduced this result multiple times. I would expect level assignment to range from 2 to 6 for these samples, which represent the calibration dataset.

Here is a screenshot of the output...
image

metric.values - required fields (more than described in help)

Describe the bug
The function metric.values doesn't work unless the input has more fields than specified in the help file.

To Reproduce
Steps to reproduce the behavior:

  1. Use only the columns specified in the help file as your data input for metric.values.

col.help <- c("SAMPLEID", "TAXAID", "N_TAXA", "EXCLUDE", "SITE_TYPE", "NONTARGET", "PHYLUM", "CLASS", "ORDER", "FAMILY", "SUBFAMILY", "GENUS", "FFG", "HABIT", "LIFE_CYCLE", "TOLVAL", "BCG_ATTR")

  2. Errors about missing fields continue until the following are added.

col.needed <- c("INDEX_NAME", "SITE_TYPE", "THERMAL_INDICATOR", "SUBPHYLUM", "TRIBE")

Expected behavior
Would be beneficial to:

  1. check for all required fields in the function and return an error code if missing.

  2. Have all required fields stated in the help file.
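
A sketch of the first suggestion, listing the missing fields up front so the user sees all of them at once. The required list combines the two vectors reported in this issue; the input data frame is made up, and matching on upper-case names is an assumption.

```r
# Fields from the help file plus the additional fields reported in this issue.
col.help <- c("SAMPLEID", "TAXAID", "N_TAXA", "EXCLUDE", "SITE_TYPE",
              "NONTARGET", "PHYLUM", "CLASS", "ORDER", "FAMILY", "SUBFAMILY",
              "GENUS", "FFG", "HABIT", "LIFE_CYCLE", "TOLVAL", "BCG_ATTR")
col.needed <- c("INDEX_NAME", "SITE_TYPE", "THERMAL_INDICATOR", "SUBPHYLUM",
                "TRIBE")
col.req <- unique(c(col.help, col.needed))

# Hypothetical input that has only three of the required fields.
df.in <- data.frame(SampleID = "s1", TaxaID = "Baetis", N_Taxa = 5)

# Compare on upper-case names and report everything that is missing.
col.missing <- col.req[!col.req %in% toupper(names(df.in))]
msg <- paste0("Missing required fields: ",
              paste(col.missing, collapse = ", "))
```

Inside the function, msg could be passed to stop() when col.missing is not empty.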

Screenshots
image

Desktop (please complete the following information):
image

CT BCG - test cases

Is your feature request related to a problem? Please describe.
Develop test cases for new CT models.

Describe the solution you'd like
Create small dataset to include in testthat.

Describe alternatives you've considered
Don't want to burden package with "all" data.

Additional context
QC data outside of package. Only want small set of samples to ensure don't break things in the future.

HBI calculation

Describe the bug
metric.values HBI result is NA if any TolVal values are NA.

To Reproduce
See above

Expected behavior
NA TolVal values should be removed from the calculation.

Screenshots
NA

Desktop (please complete the following information):
NA

Smartphone (please complete the following information):
NA

Additional context
Add na.rm=TRUE to sum function in numerator.

Could affect other metrics.
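
The effect of na.rm=TRUE can be seen with a small made-up sample. The formula used here (abundance-weighted mean tolerance value) is the usual HBI form and is assumed, not copied from the package. The third calculation, which also drops taxa without a TolVal from the denominator, is an alternative to consider rather than what the issue proposes.

```r
# Hypothetical sample: counts and tolerance values, one taxon lacking a TolVal.
n_taxa <- c(10, 20, 5)
tolval <- c(2, 6, NA)

# Without na.rm the NA propagates and the metric is NA.
hbi_na <- sum(n_taxa * tolval) / sum(n_taxa)

# na.rm = TRUE in the numerator only, as proposed above.
hbi_num <- sum(n_taxa * tolval, na.rm = TRUE) / sum(n_taxa)

# Alternative (assumption): also exclude those individuals from the denominator.
has_tv   <- !is.na(tolval)
hbi_both <- sum(n_taxa[has_tv] * tolval[has_tv]) / sum(n_taxa[has_tv])
```

The two non-NA results differ because the numerator-only version still counts the individuals without a TolVal in the total.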

Failed Install - R v3.6.0 - gzip error

Describe the bug
Error with installing from GitHub.

To Reproduce
Steps to reproduce the behavior:
devtools::install_github("leppott/BCGcalc")

Expected behavior
Should still install properly.

Screenshots
If applicable, add screenshots to help explain your problem.

image

Additional context
Add any other context about the problem here.

QC Checks to Flags

Is your feature request related to a problem? Please describe.
Update QC Checks to Flags. But leave function the same.

Describe the solution you'd like
Multiple fixes across the package.

  1. Vignette - section header, QC Check to Flags
  2. Updated files for the examples
  3. Update the example (function and vignette)

Describe alternatives you've considered
From Jen, based on comments from the group.

Additional context
NA

QC checks - not all included in final example in Level Assignment

Not all extra columns (cols2keep) are showing up in the final example.

This is a "case" issue. Convert to lower case in qc.check. Also had to account for NA in the "EVAL" column.

In the BCG.Level.Assignment example, metrics were run once to get membership and a 2nd time to include extra columns for qc.checks. Once Issue #11 is fixed, the example code can be changed.

update Rules.xlsx

Is your feature request related to a problem? Please describe.
Include revised Rules.xlsx in Package.

Describe the solution you'd like

  1. Renamed "BCG_PacNW" to "BCG_PacNW_v1_500ct"
  2. Revise in code when import. Multiple functions and in multiple places in Vignette.

Describe alternatives you've considered
NA

Additional context
Added 300 ct model worksheet to file and removed unused worksheets.

Final BCG assignment - proportional

Is your feature request related to a problem? Please describe.
So has anyone ever looked at scoring based on probabilities? I am thinking it might be a good way to get a more continuous BCG score, as well as capturing partial group memberships. Here is my first crack at it.

The first sample has a really high probability of L6, but a very low probability of L4. (How often does it happen that a sample skips a level between two others?) Thus the overall weighted value is slightly lower than L6.

The 3rd sample is mostly a 3, but 1/3 L4. So the overall score moves slightly closer to 4, at 3.3.

Describe the solution you'd like
Add a new column to the final results. See below.

SAMPLEID INDEX_NAME SITE_TYPE L1 L2 L3 L4 L5 L6   L1 L2 L3 L4 L5 L6 FINAL_BCG value
00001CSR BCG_PacNW_v1_500ct hi 0 0 0 0.054422 0 0.945578   0 0 0 0.217687 0 5.673469 5.9
00001GRD BCG_PacNW_v1_500ct hi 0 0 1 0 0 0   0 0 3 0 0 0 3.0
00001GRDa BCG_PacNW_v1_500ct hi 0 0 0.666667 0.333333 0 0   0 0 2 1.333333 0 0 3.3
00001HOOD BCG_PacNW_v1_500ct hi 0 0 0.5 0.5 0 0   0 0 1.5 2 0 0 3.5
00001HOODa BCG_PacNW_v1_500ct hi 0 0 0.1 0.6 0.3 0   0 0 0.3 2.4 1.5 0 4.2
00001JDE BCG_PacNW_v1_500ct hi 0 0.363636 0.636364 0 0 0   0 0.727273 1.909091 0 0 0 2.6
00001REM BCG_PacNW_v1_500ct hi 0 1 0 0 0 0   0 2 0 0 0 0 2.0
00001SPS BCG_PacNW_v1_500ct lo 0 0 1 0 0 0   0 0 3 0 0 0 3.0
00001WSE BCG_PacNW_v1_500ct hi 0 1 0 0 0 0   0 2 0 0 0 0 2.0
00002CSR BCG_PacNW_v1_500ct lo 0 0.4 0.6 0 0 0   0 0.8 1.8 0 0 0 2.6
00002GRD BCG_PacNW_v1_500ct hi 0 0 0 1 0 0   0 0 0 4 0 0 4.0
00002HOOD BCG_PacNW_v1_500ct hi 0 0.8 0.2 0 0 0   0 1.6 0.6 0 0 0 2.2
00002HOODa BCG_PacNW_v1_500ct hi 0 1 0 0 0 0   0 2 0 0 0 0 2.0
00002JDE BCG_PacNW_v1_500ct hi 0 0 0.666667 0.333333 0 0   0 0 2 1.333333 0 0 3.3
00002REM BCG_PacNW_v1_500ct hi 0 1 0 0 0 0   0 2 0 0 0 0 2.0
00002SPSa BCG_PacNW_v1_500ct lo 0 0.4 0.6 0 0 0   0 0.8 1.8 0 0 0 2.6
00002SPSad BCG_PacNW_v1_500ct lo 0 0 1 0 0 0   0 0 3 0 0 0 3.0
00002WSE BCG_PacNW_v1_500ct lo 0 0 1 0 0 0   0 0 3 0 0 0 3.0
00003CSR BCG_PacNW_v1_500ct hi 0 0.6 0.4 0 0 0   0 1.2 1.2 0 0 0 2.4
00003GRD BCG_PacNW_v1_500ct lo 0 0.166667 0.733333 0.1 0 0   0 0.333333 2.2 0.4 0 0 2.9
00003JDE BCG_PacNW_v1_500ct hi 0 0.8 0.2 0 0 0   0 1.6 0.6 0 0 0 2.2
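
The weighted value in the table above appears to be the sum over levels of level number times membership. A short sketch reproducing the first three rows, with the membership values copied from the table:

```r
# Level memberships (L1..L6) for the first three samples in the table.
memb <- rbind("00001CSR"  = c(0, 0, 0, 0.054422, 0, 0.945578),
              "00001GRD"  = c(0, 0, 1, 0, 0, 0),
              "00001GRDa" = c(0, 0, 0.666667, 0.333333, 0, 0))
colnames(memb) <- paste0("L", 1:6)

# Weighted score: sum over levels of (level number x membership).
bcg_value <- as.vector(memb %*% 1:6)
names(bcg_value) <- rownames(memb)

# Rounded to one decimal place, as in the "value" column.
bcg_value_rounded <- round(bcg_value, 1)
```

A new column could be added to the final results in the same way, using the six membership columns as the matrix.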

Describe alternatives you've considered
NA

Additional context
NA

Update BCG rules for PacNW workgroup

The PacNW BCGers are going through the confirmation round right now (their homework is due in the next 1-2 weeks). They ended up simplifying some of the rules (e.g., no more alternate for Level 3) and changing a few thresholds. I want to wait until the confirmation is done before sending you a list of the final updates.

JS, email 2018-05-10

Check and clean

Run "check" and clean up any errors, warnings, or notes.

v1.1.0.9010 has some of the above but nothing critical.

Examples - should be only PacificNW

Is your feature request related to a problem? Please describe.
Remove Indiana examples and data.

Describe the solution you'd like
Want only the Pacific NW data in the package.

Describe alternatives you've considered
Some users confused.

Additional context
NA

rarify function

Is your feature request related to a problem? Please describe.
The package BCGcalc and ContDataQC both have the "rarify" function. Make the same in both. Documentation differs.

Describe the solution you'd like
The function in ContDataQC seems to have better example code.

Describe alternatives you've considered
Some have used the other function.

Additional context
NA

Examples - donotrun "write" statements

Is your feature request related to a problem? Please describe.
Bad practice to write files in examples. Move to "donotrun".

Describe the solution you'd like
Make sure "write" statements are at the end of each example (as much as is appropriate) and mark as "donotrun".

Describe alternatives you've considered
Either comment out or mark as donotrun.

Additional context
NA

ReadMe - libraries

Update libraries in ReadMe to those from BCGcalc. The current list is from ContDataQC.

Map example (vignette)

Is your feature request related to a problem? Please describe.
Add a map results example.

Describe the solution you'd like
Could be a new vignette or included in the existing vignette.

Describe alternatives you've considered
NA

Additional context
Use ggplot.

BCG calculation primer

Is your feature request related to a problem? Please describe.
Add short document to 'doc' for directions on BCG calculations.

Describe the solution you'd like
BCGcalc_README_20180918.pdf

Describe alternatives you've considered
NA

Additional context
NA
