leppott / bcgcalc

Metric calculation and other tools for Biological Condition Gradient

Home Page: https://leppott.github.io/BCGcalc/

License: MIT License

R 12.54% HTML 87.46%

bcg biological condition gradient metric r-package

bcgcalc's Introduction

Erik W. Leppo

Working as a data scientist in the environmental field with Tetra Tech (since 1994).

Working in R since 2006 and Shiny since 2017. Using GitHub for version control since 2016.

bcgcalc's Issues

CT BCG - documentation

Is your feature request related to a problem? Please describe.
Update Vignette with specifics on adding new model.

Describe the solution you'd like
use CT as an example.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Move bioassessment functions to another package (BioMonTools)

Is your feature request related to a problem? Please describe.
Move rarify and metric.values into a separate package (BioMonTools).

https://github.com/leppott/BioMonTools

Describe the solution you'd like
Want to keep this package "on topic" without extras.

Describe alternatives you've considered
Don't want the same function in multiple packages.

Additional context
NA

BCG.Metric.Membership - fails if try to calculate bugs and fish for same model

Describe the bug
Error "missing values are not allowed in subscripted assignments of data frames".

Function fails.

Works if use only one sitetype.

To Reproduce
Run BCG.Metric.Membership with metrics from 2 site types (e.g., bugs and fish) that do not overlap completely. The non-overlapping metrics produce NA when 0 and 1 are assigned for membership.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
Errors in code here:

Additional context
Change to ifelse() to avoid issues with NAs.

OLD

NEW
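
The original OLD and NEW snippets are not preserved here. As a minimal sketch of the pattern the issue describes, assuming a hypothetical data frame with columns METRIC_VALUE, LOWER, and MEMBERSHIP (none of these names come from the package), the failure and the ifelse() alternative might look like this:

```r
# Hypothetical metric values; NA marks a metric that does not apply to a site type.
df <- data.frame(METRIC_VALUE = c(0.2, NA, 0.9),
                 LOWER        = 0.5)
df$MEMBERSHIP <- NA_real_

# Subscripted assignment with a logical index that contains NA raises an error.
res_old <- tryCatch({
  df[df$METRIC_VALUE >= df$LOWER, "MEMBERSHIP"] <- 1
  "ok"
}, error = function(e) conditionMessage(e))
print(res_old)

# ifelse() is vectorized and carries the NA through instead of failing.
df$MEMBERSHIP <- ifelse(df$METRIC_VALUE >= df$LOWER, 1, 0)
print(df)
```

With ifelse() the rows that have no value for a metric stay NA, so they can be filtered or handled explicitly in a later step.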

CITATION missing

Describe the bug
Citation not appearing for package.

To Reproduce
citation("BCGcalc")

Expected behavior
Citation should appear in R.

Screenshots

Additional context
Pointed out by user.

Could be generated from DESCRIPTION.

Alternately could create a CITATION file.
http://r-pkgs.had.co.nz/inst.html#inst-citation
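
A rough sketch of what an inst/CITATION file could contain, built with utils::bibentry(). The title, year, and version below are placeholders rather than values taken from the package's DESCRIPTION, and the `meta` list is a stand-in for the object R supplies when it reads a real CITATION file:

```r
# Stand-in for the `meta` object that R provides inside inst/CITATION.
# Omit this line in a real CITATION file; the version here is a placeholder.
meta <- list(Version = "0.0.0")

cit <- bibentry(
  bibtype = "Manual",
  # Placeholder title based on the repository description.
  title   = "BCGcalc: Metric calculation and other tools for Biological Condition Gradient",
  author  = person(given = "Erik W.", family = "Leppo"),
  year    = format(Sys.Date(), "%Y"),  # placeholder; use the release year
  note    = paste("R package version", meta$Version),
  url     = "https://github.com/leppott/BCGcalc"
)

print(cit, style = "text")
```

Once a file like this is installed with the package, citation("BCGcalc") should return the entry instead of an auto-generated one.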

Terminology - Change Region to SiteType

For site class change Region to SiteType.

Thermal metrics incorrect.

Thermal metrics are calculated for any partial match on the word rather than an exact match.

That is, the "COLD" metrics count both "COLD" and "COLD_COOL", and the "WARM" metrics count both "WARM" and "COOL_WARM".

pi, pt, and nt are affected.
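
To illustrate the difference, here is a small sketch with a hypothetical taxa table (the column names follow those mentioned elsewhere in these issues, but the data and variable names are made up). Partial matching with grepl() over-counts, while an exact comparison does not:

```r
# Hypothetical taxa list for a single sample.
df <- data.frame(
  TAXAID            = c("A", "B", "C", "D"),
  N_TAXA            = c(10, 20, 30, 40),
  THERMAL_INDICATOR = c("COLD", "COLD_COOL", "COOL_WARM", "WARM"),
  stringsAsFactors  = FALSE
)

# Partial matching also counts "COLD_COOL" as "COLD".
nt_cold_partial <- sum(grepl("COLD", df$THERMAL_INDICATOR))

# Exact matching counts only "COLD".
is_cold <- df$THERMAL_INDICATOR == "COLD"
nt_cold <- sum(is_cold)                                    # number of taxa
pt_cold <- 100 * nt_cold / nrow(df)                        # percent of taxa
pi_cold <- 100 * sum(df$N_TAXA[is_cold]) / sum(df$N_TAXA)  # percent of individuals

print(c(partial = nt_cold_partial, exact = nt_cold))
```

An anchored regular expression such as grepl("^COLD$", x) would give the same result as the exact comparison.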

Model - add - Maritime NW High-Low Gradient

Is your feature request related to a problem? Please describe.
Add Maritime NW BCG model for the High Gradient-Low Elevation class

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Some tiers have a best 2 of 3 trigger. Really uses the 2nd value so can program using the median of the three values.

Also update the "flags" file.
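
The "best 2 of 3" note above can be checked with a short sketch. It assumes the trigger is decided by the second-highest of three membership values; the values themselves are hypothetical:

```r
# Hypothetical membership values (0 to 1) for the three metrics in one trigger.
memb <- c(0.2, 0.9, 0.6)

# If the best two of three must pass, the 2nd highest value decides the outcome.
second_best <- sort(memb, decreasing = TRUE)[2]

# For exactly three values, the median is that same 2nd value.
memb_median <- median(memb)

print(c(second_best = second_best, median = memb_median))
```

This shortcut holds only for groups of exactly three values; a "best 2 of 4" rule would need the sort-and-index form instead.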

Error Checking - BCG.Level.Membership

Describe the bug
Need error checking to ensure have all metrics in rules table.

To Reproduce
If have mismatched metric names then only those metrics that match are evaluated.

Can result in bad level membership calculations.

For example, if a level uses the minimum of its Rule0 metrics and all of the metrics with membership "0" drop out because their names do not match, while a metric with "1" is kept, then the level membership is "1" instead of "0".

Expected behavior
Should throw a hard error with message.
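
One possible shape for such a check is sketched below. The helper name check_rule_metrics() and its arguments are hypothetical and are not part of the package:

```r
# Hypothetical helper: stop if any metric named in the rules table is absent
# from the metric membership input.
check_rule_metrics <- function(metrics_input, metrics_rules) {
  missing_metrics <- setdiff(metrics_rules, metrics_input)
  if (length(missing_metrics) > 0) {
    stop("Metrics in the rules table are missing from the input: ",
         paste(missing_metrics, collapse = ", "),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Usage with made-up metric names: one rules metric is misspelled in the input.
msg <- tryCatch(
  check_rule_metrics(metrics_input = c("nt_total", "pi_ti_C"),
                     metrics_rules = c("nt_total", "pi_ti_c")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Failing early this way avoids silently evaluating a level on a subset of its rules.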

Additional context
NA

Slopes - more information on getting data

Is your feature request related to a problem? Please describe.
From user SH:

Slopes... there is talk of using the NHD Slope data, but no real description of how to find and use this data. I worked with Ryan Hill to download the data, bring it in, and merge it with my bug samples. While the code has examples of how to bring it in, it doesn't document the data itself. I'd recommend including links to where to find this data. I think what Ryan gave me was for the entire PNW, so it should work for everyone.

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Update Readme

Is your feature request related to a problem? Please describe.
Update ReadMe

Describe the solution you'd like

Use only the example that installs the vignette.
Include some example code for running functions from the package.
Use "library" instead of "require" in examples.

Describe alternatives you've considered
NA

Additional context
Want a working example for getting only select metrics.

Update ReadMe for metric.values

Is your feature request related to a problem? Please describe.
ReadMe example doesn't reference BioMonTools library for metric.values.

Describe the solution you'd like
see above

Describe alternatives you've considered
NA

Additional context
NA

Metadata - taxa list

Is your feature request related to a problem? Please describe.
Meta data for master taxa list fields and valid values. Include sources of information.

Describe the solution you'd like
Not sure of best location, Vignette and/or Excel files. Probably both locations.

Describe alternatives you've considered
Short list in help file for metric.values is too brief.

Additional context
NA

CT BCG - flags

Is your feature request related to a problem? Please describe.
Add flags specific to CT BCG.

Describe the solution you'd like
Document any warnings that should be flagged so the user can best interpret the output. Implementing this in the R package should be similar to what was done for the PacNW project. The framework already exists in the package code.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

CT BCG - add model thresholds

Is your feature request related to a problem? Please describe.
Add CT BCG model thresholds.

Describe the solution you'd like
Add model to Excel file so available for calculation.

1 macroinvertebrate model and 3 fish models.

Describe alternatives you've considered
NA

Additional context
Model is in Access (available as a project file).

Gerritsen J, Jessup B. 2007. Calibration of the biological condition gradient for high gradient streams of Connecticut. Report prepared for US EPA Office of Science and Technology and the Connecticut Department of Environmental Protection. TetraTech, Maryland

Stamp, J., Gerritsen J. 2013. A biological condition gradient assessment model for stream fish communities of Connecticut-Final Report. Report prepared for US EPA Office of Science and Technology and the Connecticut Department of Environmental Protection. TetraTech, Maryland

QC Check - metric.values - Exclude as T/F

Is your feature request related to a problem? Please describe.
Ensure have T/F in Exclude column for metric.values.
Some users may have Y/N or 1/0.

Describe the solution you'd like
Add statement to help file and vignette. But also have a warning in code.

Describe alternatives you've considered
Should be in the code. User may have Y/N or 1/0 (Access) and assume it is working.

Additional context
Need to be clear it is working.
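
A sketch of what such a check might look like follows. The function name check_logical_col() and the set of accepted spellings are assumptions made for illustration, not the package's behavior:

```r
# Hypothetical helper: return a logical vector, warning when the input
# column was not already TRUE/FALSE.
check_logical_col <- function(x, col_name) {
  if (is.logical(x)) {
    return(x)
  }
  warning("Column '", col_name, "' is not TRUE/FALSE; attempting to convert.",
          call. = FALSE)
  key <- toupper(trimws(as.character(x)))
  out <- rep(NA, length(key))                      # unrecognized values stay NA
  out[key %in% c("TRUE", "T", "Y", "YES", "1")]  <- TRUE
  out[key %in% c("FALSE", "F", "N", "NO", "0")]  <- FALSE
  out
}

# Usage with Y/N and 1/0 input; warnings are suppressed only to keep the demo quiet.
excl_yn <- suppressWarnings(check_logical_col(c("Y", "N", "n"), "EXCLUDE"))
excl_10 <- suppressWarnings(check_logical_col(c(1, 0, 1), "EXCLUDE"))
print(excl_yn)
print(excl_10)
```

Whether the function should convert with a warning or stop with an error is a design choice; a warning keeps existing scripts running while still telling the user the column was not in the expected form.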

Update example data; Data_BCG_PacNW.xlsx

Describe the bug
Remove unneeded worksheets.
Update phylogeny.

To Reproduce
Errors in the phylogeny fields prevent Excluded taxa from being assigned.
Fixed in database.

Expected behavior
NA

Screenshots
NA

Additional context
NA

Move qc.checks function to BioMonTools package

Is your feature request related to a problem? Please describe.
Move qc.checks function to BioMonTools package

Describe the solution you'd like
Related to metric.values so move to the more appropriate package.

Will need to update the vignette.

Describe alternatives you've considered
Better to have in 1 package than spread across 2.

Additional context
Keep flags Excel file in this package. (But have in BioMonTools also).

Function not found (dcast and melt from reshape2)

Describe the bug
Flags example using dcast fails.

To Reproduce
Steps to reproduce the behavior:
Vignette - Level Assignment section

# long to wide format
df.flags.wide <- dcast(df.flags, SAMPLEID ~ CHECKNAME, value.var="FLAG")

Error - could not find function "dcast"

Expected behavior
Converts data frame from long to wide format.
Uses reshape2::dcast

library(reshape2) is at the top of the section.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Ensure that library(reshape2) is called in each example and vignette example section that uses dcast or melt; that should resolve the error.

reshape2 is in the "Imports" in DESCRIPTION.

reshape2::foo is used in functions for both dcast and melt.
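
As an alternative to attaching the package in every example, the call can be namespaced. The sketch below uses a made-up df.flags table (the check names and flag values are hypothetical) and only runs the conversion when reshape2 is installed:

```r
# Hypothetical flags table in long format.
df.flags <- data.frame(
  SAMPLEID  = c("S1", "S1", "S2", "S2"),
  CHECKNAME = c("check_a", "check_b", "check_a", "check_b"),
  FLAG      = c("PASS", "FAIL", "PASS", "PASS"),
  stringsAsFactors = FALSE
)

# Namespaced call: works without library(reshape2) being attached.
if (requireNamespace("reshape2", quietly = TRUE)) {
  df.flags.wide <- reshape2::dcast(df.flags, SAMPLEID ~ CHECKNAME,
                                   value.var = "FLAG")
  print(df.flags.wide)
}
```

The namespaced form is slightly more verbose but does not depend on what the user has attached earlier in the session.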

New Metric - Percent Baetis tricaudatus complex + Simuliidae individual

Is your feature request related to a problem? Please describe.
Add new metric to metric.values; Percent Baetis tricaudatus complex + Simuliidae individual

Describe the solution you'd like
See above.

Describe alternatives you've considered
NA

Additional context
Easy enough to add into the code.

"pi_SimBtri"

Metadata - metric names

Is your feature request related to a problem? Please describe.
Metric names file is old/incomplete.
\extdata\MetricNames.xlsx

Describe the solution you'd like
Move to \doc and save as PDF so can access from help.
Add all metrics and group by how appear in metric.values function output.

Describe alternatives you've considered
File is incomplete. Don't have all names. And some descriptions incomplete.

Additional context
NA

Metric Membership - fails with extra columns from metric.values

Metric Membership fails if include cols2keep in metric.values.

Changes the structure of the input file for BCG.Metric.Membership and the code fails.

Temporary workaround = don't include extra columns.

Permanent fix = modify the code to account for extra columns.

Examples; names not always consistent

Is your feature request related to a problem? Please describe.
Some names in examples are not always consistent between functions and vignette.

Describe the solution you'd like
Ensure the same format is always used in case users mix code.

Describe alternatives you've considered
NA

Additional context
NA

Write statements - intermediate as TSV, final as CSV

Is your feature request related to a problem? Please describe.
Be consistent in write statements.
TSV = intermediate step
CSV = final result

Describe the solution you'd like
Check examples and vignette

Describe alternatives you've considered
Have a mix in package so need to be consistent.

Additional context
NA

metric.values - NonTarget as T/F

Is your feature request related to a problem? Please describe.
In metric.values NonTarget must be TRUE or FALSE. No warning if not.

Describe the solution you'd like
Add a warning if no "FALSE" values.

Describe alternatives you've considered
Same solution as for Exclude column.

Additional context
NA

metric.values - fun.MetricNames not working

Describe the bug
Get error when try to use this variable.

To Reproduce
Steps to reproduce the behavior:

Expected behavior
Should get back a data frame with only those columns included.

Screenshots
See code above

Desktop (please complete the following information):
v1.2.2.9005

Additional context
NA

Metric Names - Not consistent

Is your feature request related to a problem? Please describe.
Metric names not always in consistent format.

e.g., most habit metrics are plural but burrow is not, and none of the ffg metrics are plural.

Describe the solution you'd like
Habit metrics could be "cling", "climb", "burrow", "swim" or something similar and consistent format.

Describe alternatives you've considered
Prefixes are consistent; only the endings vary.

Additional context
Need to make the change sooner rather than later, before a larger user base develops.

Map Vignette - color scale

Is your feature request related to a problem? Please describe.
On the map vignette change the scale. Cyan is too jarring and can't see other points all that well.

Describe the solution you'd like
Add to ggplot code.

scale_color_gradientn(colours = terrain.colors(5))

Describe alternatives you've considered
Could use color brewer or other sets of colors.

Additional context
Existing map and another using terrain colors.

Additional Report - Climate Indicator Metrics

Is your feature request related to a problem? Please describe.
Create a report for climate indicators.

• Thermal preference (cold, cold/cool, cool/warm, warm)
o Number of taxa
o Percent of taxa
o Percent individuals
o Percent Baetis tricaudatus complex + Simuliidae individual

Describe the solution you'd like
Report (RMD), maybe an R Notebook.

Describe alternatives you've considered
Could be included in example code for "metric.values".

This might be the better / easier solution. Already have code to produce metrics. Just include selection code and output.

Additional context
Example code.

library(BCGcalc)
library(readxl)
library(knitr)

# Load Data
df.data <- read_excel(system.file("./extdata/Data_BCG_PacNW.xlsx"
                                  , package="BCGcalc"))

# Columns to keep
myCols <- c("Area_mi2", "SurfaceArea", "Density_m2", "Density_ft2")

# Run Function
df.metval <- metric.values(df.data, "bugs", fun.cols2keep=myCols)

# View Results
#View(df.metval)

# Metrics of Interest
# thermal indicator (ti)
#names(df.metval)[grepl("ti", names(df.metval))]
col.met2keep <- c("ni_total", "nt_total", "nt_ti_c", "nt_ti_cc", "nt_ti_cw"
                  , "nt_ti_w", "pi_ti_c", "pi_ti_cc", "pi_ti_cw", "pi_ti_w"
                  , "pt_ti_c", "pt_ti_cc", "pt_ti_cw", "pt_ti_w")
col.ID <- c("SAMPLEID", toupper(myCols), "INDEX_NAME", "SITE_TYPE")

# Output
df.metval.ci <- df.metval[, c(col.ID, col.met2keep)]

# RMD table
kable(df.metval.ci[1:10, ])

# Save
write.table(df.metval.ci, "metrics.thermalindicators.tsv"
            , col.names=TRUE, row.names=FALSE, sep="\t")

Default dataset generates level 6 membership for all sample_IDs

When using the default dataset "Data_BCG_PacNW.xlsx", the level assignment for all samples (n = 678 sample IDs) is 6. This was run in R 3.5.2 with the latest version of BCGcalc. I have reproduced this result multiple times. I would expect level assignments to range from 2 to 6 for these samples, which represent the calibration dataset.

Here is a screenshot of the output...

metric.values - required fields (more than described in help)

Describe the bug
The function metric.values doesn't work unless have more fields than specified in help file.

To Reproduce
Steps to reproduce the behavior:

Use only the columns specified in the help file as your data input for metric.values.

col.help <- c("SAMPLEID", "TAXAID", "N_TAXA", "EXCLUDE", "SITE_TYPE", "NONTARGET", "PHYLUM", "CLASS", "ORDER", "FAMILY", "SUBFAMILY", "GENUS", "FFG", "HABIT", "LIFE_CYCLE", "TOLVAL", "BCG_ATTR")

Will keep getting errors about missing fields until the following are added.

col.needed <- c("INDEX_NAME", "SITE_TYPE", "THERMAL_INDICATOR", "SUBPHYLUM", "TRIBE")

Expected behavior
Would be beneficial to:

check for all required fields in the function and return an error code if missing.
Have all required fields stated in the help file.

Screenshots

Desktop (please complete the following information):

CT BCG - test cases

Is your feature request related to a problem? Please describe.
Develop test cases for new CT models.

Describe the solution you'd like
Create small dataset to include in testthat.

Describe alternatives you've considered
Don't want to burden package with "all" data.

Additional context
QC data outside of package. Only want small set of samples to ensure don't break things in the future.

HBI calculation

Describe the bug
metric.values HBI result is NA if any TolVal values are NA.

To Reproduce
See above

Expected behavior
NA TolVal values should be removed from the calculation rather than making the result NA.

Screenshots
NA

Desktop (please complete the following information):
NA

Smartphone (please complete the following information):
NA

Additional context
Add na.rm=TRUE to sum function in numerator.

Could affect other metrics.
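
The effect can be seen in a small sketch with hypothetical counts and tolerance values. It uses the usual HBI form, sum(n * tolval) / sum(n). Whether individuals without a tolerance value should also be left out of the denominator is treated here as an open choice, so both variants are shown:

```r
# Hypothetical sample: the second taxon has no tolerance value.
df <- data.frame(N_TAXA = c(10, 5, 5),
                 TOLVAL = c(2, NA, 6))

# Current behavior: a single NA makes the whole result NA.
hbi_na <- sum(df$N_TAXA * df$TOLVAL) / sum(df$N_TAXA)

# na.rm = TRUE in the numerator only: all individuals stay in the denominator.
hbi_num_only <- sum(df$N_TAXA * df$TOLVAL, na.rm = TRUE) / sum(df$N_TAXA)

# Alternative: drop taxa without a tolerance value from both sums.
has_tv <- !is.na(df$TOLVAL)
hbi_drop <- sum(df$N_TAXA[has_tv] * df$TOLVAL[has_tv]) / sum(df$N_TAXA[has_tv])

print(c(na = hbi_na, numerator_only = hbi_num_only, dropped = hbi_drop))
```

The two non-NA variants differ whenever any taxon lacks a tolerance value, so the choice should be stated in the help file.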

Save data in Example

Include in each function's example section a write.table line.

Failed Install - R v3.6.0 - gzip error

Describe the bug
Error with installing from GitHub.

To Reproduce
Steps to reproduce the behavior:
devtools::install_github("leppott/BCGcalc")

Expected behavior
Should still install properly.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

Match up qc.check and level.assignment

Users want to have QC checks (flags) alongside final level assignments.

QC Checks to Flags

Is your feature request related to a problem? Please describe.
Update QC Checks to Flags. But leave function the same.

Describe the solution you'd like
Multiple fixes across the package.

Vignette - section header, QC Check to Flags
Updated files for the examples
Update the example (function and vignette)

Describe alternatives you've considered
From Jen, based on comments from the group.

Additional context
NA

QC checks - not all included in final example in Level Assignment

Not all extra columns (col2keep) showing up in final example.

"case" issue. Convert to lower case in qc.check. And had to account for NA in "EVAL" column.

In BCG.Level.Assignment example ran metrics once to get membership and a 2nd time to include extra columns for qc.checks. When fix Issue #11 then can change the example code.

update Rules.xlsx

Is your feature request related to a problem? Please describe.
Include revised Rules.xlsx in Package.

Describe the solution you'd like

Renamed "BCG_PacNW" to "BCG_PacNW_v1_500ct"
Revise in code when import. Multiple functions and in multiple places in Vignette.

Describe alternatives you've considered
NA

Additional context
Added 300 ct model worksheet to file and removed unused worksheets.

Final BCG assignment - proportional

Is your feature request related to a problem? Please describe.
So has anyone ever looked at scoring based on probabilities? I am thinking it might be a good way to get a more continuous BCG score, as well as capturing partial group memberships. Here is my first crack at it.

The first sample has a really high probability of L6, but a very low probability of L4. (How often does it happen that it skips a level between two others?) Thus the overall weighted value is slightly lower than L6.

The 3rd sample is mostly a 3, but 1/3 L4. So the overall score moves slightly closer to 4, at 3.3.

Describe the solution you'd like
Add a new column to the final results. See below.

SAMPLEID	INDEX_NAME	SITE_TYPE	L2	L3	L4	L5	L6	2*L2	3*L3	4*L4	5*L5	6*L6	FINAL_BCG value
00001CSR	BCG_PacNW_v1_500ct	hi	0	0	0.054422	0	0.945578	0	0	0.217687	0	5.673469	5.9
00001GRD	BCG_PacNW_v1_500ct	hi	0	1	0	0	0	0	3	0	0	0	3.0
00001GRDa	BCG_PacNW_v1_500ct	hi	0	0.666667	0.333333	0	0	0	2	1.333333	0	0	3.3
00001HOOD	BCG_PacNW_v1_500ct	hi	0	0.5	0.5	0	0	0	1.5	2	0	0	3.5
00001HOODa	BCG_PacNW_v1_500ct	hi	0	0.1	0.6	0.3	0	0	0.3	2.4	1.5	0	4.2
00001JDE	BCG_PacNW_v1_500ct	hi	0.363636	0.636364	0	0	0	0.727273	1.909091	0	0	0	2.6
00001REM	BCG_PacNW_v1_500ct	hi	1	0	0	0	0	2	0	0	0	0	2.0
00001SPS	BCG_PacNW_v1_500ct	lo	0	1	0	0	0	0	3	0	0	0	3.0
00001WSE	BCG_PacNW_v1_500ct	hi	1	0	0	0	0	2	0	0	0	0	2.0
00002CSR	BCG_PacNW_v1_500ct	lo	0.4	0.6	0	0	0	0.8	1.8	0	0	0	2.6
00002GRD	BCG_PacNW_v1_500ct	hi	0	0	1	0	0	0	0	4	0	0	4.0
00002HOOD	BCG_PacNW_v1_500ct	hi	0.8	0.2	0	0	0	1.6	0.6	0	0	0	2.2
00002HOODa	BCG_PacNW_v1_500ct	hi	1	0	0	0	0	2	0	0	0	0	2.0
00002JDE	BCG_PacNW_v1_500ct	hi	0	0.666667	0.333333	0	0	0	2	1.333333	0	0	3.3
00002REM	BCG_PacNW_v1_500ct	hi	1	0	0	0	0	2	0	0	0	0	2.0
00002SPSa	BCG_PacNW_v1_500ct	lo	0.4	0.6	0	0	0	0.8	1.8	0	0	0	2.6
00002SPSad	BCG_PacNW_v1_500ct	lo	0	1	0	0	0	0	3	0	0	0	3.0
00002WSE	BCG_PacNW_v1_500ct	lo	0	1	0	0	0	0	3	0	0	0	3.0
00003CSR	BCG_PacNW_v1_500ct	hi	0.6	0.4	0	0	0	1.2	1.2	0	0	0	2.4
00003GRD	BCG_PacNW_v1_500ct	lo	0.166667	0.733333	0.1	0	0	0.333333	2.2	0.4	0	0	2.9
00003JDE	BCG_PacNW_v1_500ct	hi	0.8	0.2	0	0	0	1.6	0.6	0	0	0	2.2
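
The weighted score in the table can be reproduced with a short sketch: multiply each level membership by its level number and sum across levels. The three rows below are copied from the table; the object names are illustrative only:

```r
# Level memberships (L2 to L6) for three samples taken from the table above.
memb <- matrix(c(0, 0,        0.054422, 0,   0.945578,
                 0, 0.666667, 0.333333, 0,   0,
                 0, 0.1,      0.6,      0.3, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("00001CSR", "00001GRDa", "00001HOODa"),
                               paste0("L", 2:6)))

# Weight each membership by its level number and sum across levels.
bcg_levels <- 2:6
score <- as.vector(memb %*% bcg_levels)
names(score) <- rownames(memb)

print(round(score, 1))
```

The result is a continuous value between 2 and 6 that reflects partial membership in adjacent levels.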

Describe alternatives you've considered
NA

Additional context
NA

Update BCG rules for PacNW workgroup

The PacNW BCGers are going through the confirmation round right now (their homework is due in the next 1-2 weeks). They ended up simplifying some of the rules (e.g., no more alternate for Level 3) and changing a few thresholds. I want to wait until the confirmation is done before sending you a list of the final updates.

JS, email 2018-05-10

Check and clean

Run "check" and clean up any errors, warnings, or notes.

v1.1.0.9010 has some of the above but nothing critical.

Examples - should be only PacificNW

Is your feature request related to a problem? Please describe.
Remove Indiana examples and data.

Describe the solution you'd like
Want only the Pacific NW data in the package.

Describe alternatives you've considered
Some users confused.

Additional context
NA

rarify function

Is your feature request related to a problem? Please describe.
The package BCGcalc and ContDataQC both have the "rarify" function. Make the same in both. Documentation differs.

Describe the solution you'd like
The function in ContDataQC seems to have better example code.

Describe alternatives you've considered
Some have used the other function.

Additional context
NA

Describe alternatives you've considered
Either comment out or mark as donotrun.

Additional context
NA

ReadMe - libraries

Update libraries in ReadMe to those from BCGcalc. The current list is from ContDataQC.

Map example (vignette)

Is your feature request related to a problem? Please describe.
Add a map results example.

Describe the solution you'd like
Could be a new vignette or included in the existing vignette.

Describe alternatives you've considered
NA

Additional context
Use ggplot.

BCG calculation primer

Is your feature request related to a problem? Please describe.
Add a short document to 'doc' with directions on BCG calculations.

Describe the solution you'd like
BCGcalc_README_20180918.pdf

Describe alternatives you've considered
NA

Additional context
NA
