ropensci / birdsize

R package to simulate avian body size distributions (primarily for species in the North American Breeding Bird Survey).

Home Page: https://docs.ropensci.org/birdsize

License: Other

R 100.00%

birdsize's Introduction

rOpenSci

This repository has been archived. The former README is now in README-NOT.md.

birdsize's Issues

scoping

Overall the core intended functionality of this package is to simulate size measurements for birds based on either a species ID (following BBS AOU) or a mean body size (if you want to do simulations).

Body masses are simulated as draws from a normal distribution with a given mean and standard deviation.

If you supply a species code, look up these parameters from the lookup table (sd_table).

If you supply a species' mean, estimate the sd from that mean according to the scaling relationship (use estimate_sd).
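
The logic above could be sketched roughly as follows. This is a minimal illustration, not the package's implementation: `sketch_estimate_sd()`, `sketch_simulate_masses()`, the `pars` list, and its intercept and slope values are all hypothetical, and the log-log form of the scaling relationship is an assumption.

```r
# Hypothetical scaling parameters (illustrative values only); real values
# would come from fitting the scaling relationship to data.
pars <- list(intercept = -1.5, slope = 1.0)

# Estimate sd from a mean, ASSUMING log(sd) = intercept + slope * log(mean).
sketch_estimate_sd <- function(sp_mean, pars) {
  exp(pars$intercept + pars$slope * log(sp_mean))
}

# Draw n body masses from a normal distribution.
# If no sd is supplied, estimate it from the mean.
sketch_simulate_masses <- function(n, sp_mean, sp_sd = NULL, pars = NULL) {
  if (is.null(sp_sd)) {
    sp_sd <- sketch_estimate_sd(sp_mean, pars)
  }
  rnorm(n, mean = sp_mean, sd = sp_sd)
}

# Simulate 100 individuals of a species with a mean mass of 25.
masses <- sketch_simulate_masses(n = 100, sp_mean = 25, pars = pars)
```

When a species code is supplied instead, the mean and sd would be looked up (from sd_table) and passed in directly as `sp_mean` and `sp_sd`.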

Other wider and further afield questions include...

species name lookup fxn (make AOU table user-visible, possibly include string search fxns)
supplying a species x abundance table (apply core sim fxn over multiple species)
supply a bbs route and year(s) ---> x this would require including bbs data in the package which is a whole attributions and data management and data updating situation that I think actually hampers broader utility more than it necessarily helps
0 standard deviation (use species' means only)
estimating individual level bmr
returning community-wide summary statistics (total biomass, total energy use, mean individual size, mean individual bmr)
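
As a rough sketch of the species x abundance idea and the community-wide summaries listed above: the table, its column names, and `sketch_simulate_community()` are hypothetical, and the means and sds are made-up values. BMR is left out because it needs an allometric relationship that is not specified here.

```r
# Hypothetical species x abundance table with illustrative means and sds.
community <- data.frame(
  species   = c("sp_a", "sp_b"),
  abundance = c(3, 2),
  mean_mass = c(20, 100),
  sd_mass   = c(2, 10)
)

# Simulate one mass per individual, species by species.
sketch_simulate_community <- function(community) {
  sims <- lapply(seq_len(nrow(community)), function(i) {
    data.frame(
      species = rep(community$species[i], community$abundance[i]),
      mass = rnorm(
        community$abundance[i],
        mean = community$mean_mass[i],
        sd = community$sd_mass[i]
      )
    )
  })
  do.call(rbind, sims)
}

individuals <- sketch_simulate_community(community)

# Community-wide summaries computed from the simulated individuals.
summary_stats <- data.frame(
  n_individuals = nrow(individuals),
  total_biomass = sum(individuals$mass),
  mean_individual_size = mean(individuals$mass)
)
```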

Vignettes for different use cases

Vignettes or add-on fxns for plotting and analyzing ISDs? again may be drifting more specialized

am I then going to refactor bbs-size-shifts to use this package? yep, probably. should not be too hard

s e e d s. this is a rabbit hole but I'm unsure whether to muck about with seeds in this package at all, or if you can set.seed() and then call a function from this package (that runs rnorm()) and have the seed stay consistent. i.e. leave seed management up to the user. or if you have to set the seed within a function. on reflection it seems pretty sus to build an r package that messes around with your seeds, but scope. idk.
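
A quick check of the "leave seed management up to the user" option. `sketch_draw_masses()` is a hypothetical stand-in for a package function that calls rnorm() internally; it does not touch the seed itself.

```r
# Stand-in for a package function that runs rnorm() and never sets a seed.
sketch_draw_masses <- function(n, sp_mean, sp_sd) {
  rnorm(n, mean = sp_mean, sd = sp_sd)
}

# The user sets the seed, then calls the function.
set.seed(2022)
first <- sketch_draw_masses(5, sp_mean = 25, sp_sd = 2)

# Resetting the same seed reproduces the same draws.
set.seed(2022)
second <- sketch_draw_masses(5, sp_mean = 25, sp_sd = 2)

identical(first, second)
```

Because rnorm() inside a function uses the same session-level random number generator that set.seed() controls, the two draws match, so the package would not need to manage seeds on the user's behalf.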

fxn naming to comply with ROpenSci norms

ROS uses object_verb; that may make more sense than my usual wordiness

This is probably most relevant for user-facing functions

This package doesn't have package-defined classes

So thinking through what things have what things done to them, that a user will see

raw_masses

pkgcheck results - community_data

Checks for birdsize (v0.0.0.9000)

git hash: 573807a7

✔️ Package name is available
✔️ has a 'codemeta.json' file.
✔️ has a 'contributing' file.
✔️ uses 'roxygen2'.
✔️ 'DESCRIPTION' has a URL field.
✔️ 'DESCRIPTION' has a BugReports field.
✔️ Package has at least one HTML vignette
✔️ All functions have examples.
✔️ Package has continuous integration checks.
✖️ Package coverage is 58.9% (should be at least 75%).
✔️ R CMD check found no errors.
✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

birdsize ROpenSci review

scope inquiry to ROpenSci
- f/u re: data inclusion
finish vignettes
(regardless of ROpenSci review, consider application paper at MEE?)

Retriever data in vignette

see if @weecology/LDATS, @weecology/MATSS have any examples

fxn to simulate a population measure given mean, abundance, and sd

vignette comparing simulating with and without sd

fxn to simulate a population measure given species code or mean (toggle based on args provided)

notes from today

downloaded retriever data; use this later to add in community_clean fxns and a vignette demoing use of data as it comes from the retriever
~~switched to s4 species class to coerce data types and clean up species_define/looking up stuff~~ S4 is fun to play with but I don't think it adds a lot in this specific context.
renamed user facing fxns to object_verb

docs for this refactor are a little sketchy

fxn to go get mean and sd given species code

fxn to simulate community-wide measures given species codes and abundances

Meta stuff

sd_table

fxn to estimate individual level bmr

Update species tables

There are about 30 records for species in the most recent BBS data release (via retriever) that aren't in raw_masses. This appears to me to be a mixture of subspecies conflicts, name changes, and a random 🦜

It's bugging me so I'm thinking to update raw_masses with those records. There might be a pathway via matching, or simply appending those rows to raw_masses and revisiting in excel

fxn to simulate sd given mean

add_estimated_sds(clean_size_data, sd_pars)

Fxns to document better

ROpenSci review

This package might fall under scope for ROpenSci review (which could provide valuable feedback and would speed up CRAN/JOSS submission) as either "data retrieval" or "automating a field/lab process": https://devguide.ropensci.org/policies.html#aims-and-scope.

Planned vignettes

overview vignette
bbs data
scaling relationship

scaling relationship

vignette for simulating population + community given means

JOSS notes

https://joss.readthedocs.io/en/latest/submitting.html

This package might be too small/lightweight based on LOC. We'll see.

pkgcheck results - main

Checks for birdsize (v0.0.0.9000)

git hash: cddbbff1

✔️ Package name is available
✔️ has a 'codemeta.json' file.
✖️ does not have a 'contributing' file.
✔️ uses 'roxygen2'.
✔️ 'DESCRIPTION' has a URL field.
✔️ 'DESCRIPTION' has a BugReports field.
✔️ Package has at least one HTML vignette
✖️ These functions do not have examples: [draw_population, lookup_species_pars, simulate_population].
✔️ Package has continuous integration checks.
✔️ Package coverage is 81.4%.
✔️ R CMD check found no errors.
✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE

add fxnality to simulate population measure given epithet

fxn or toggle to "simulate" with no sd (use only sp means)

accessing species table from bbs data releases

the table here: https://www.sciencebase.gov/catalog/file/get/5ea04e9a82cefae35a129d65?f=__disk__6f%2F16%2F1f%2F6f161fc7c7db1dcaf1259deb02d824700f280460&allowOpen=true

from the 2020 release

is really hard to parse in R using read.table or readLines because the separators are nonstandard and there are irregular line lengths.

Ideally I would prefer to pull in the species list from the URL and then parse it rather than have to include more of the BBS dataset in this package. But that's not seeming tractable at least from this data source.

get_sp_mean_size(sd_dat)

fxn to return community-or-pop-wide summary statistics

lookup table of species x common name x aou

error handling

Currently using verbose if() + stop() for informative error messages. May want to shift to something gentler.
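
For comparison, a sketch of the current if() + stop() style next to one possible gentler alternative (warn and return NA). Both function names and the specific checks are hypothetical and are not taken from the package.

```r
# Strict version: stop with an informative message on bad input.
check_mean_strict <- function(sp_mean) {
  if (!is.numeric(sp_mean)) {
    stop("`sp_mean` must be numeric.")
  }
  if (any(is.na(sp_mean)) || any(sp_mean <= 0)) {
    stop("`sp_mean` must be positive and non-missing.")
  }
  invisible(sp_mean)
}

# Gentler version: warn and replace offending values with NA.
check_mean_gentle <- function(sp_mean) {
  if (!is.numeric(sp_mean)) {
    warning("`sp_mean` is not numeric; returning NA.")
    return(rep(NA_real_, length(sp_mean)))
  }
  bad <- is.na(sp_mean) | sp_mean <= 0
  if (any(bad)) {
    warning("Non-positive or missing `sp_mean` values replaced with NA.")
    sp_mean[bad] <- NA_real_
  }
  sp_mean
}
```

The gentler version lets a batch run over many species finish, at the cost of NA values that have to be handled downstream.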

synthetic route rather than new hartford

If you really don't want to redistribute any data, you could create a synthetic route with the dims from the new hartford route but synthetic data

get_sd_parameters(raw_size_data):

Given raw_size_data, fits the lm and returns the intercept and slope pars for the scaling relationship
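
A minimal sketch of what such a function might look like. The column names (`mean_mass`, `sd_mass`), the log-log model form, and the synthetic data are assumptions for illustration; `sketch_get_sd_parameters()` is not the package's function.

```r
# Synthetic data for illustration only; these are not real bird measurements.
set.seed(1)
raw_size_data <- data.frame(mean_mass = c(10, 20, 50, 100, 500, 1000))
raw_size_data$sd_mass <- exp(
  -2 + 0.9 * log(raw_size_data$mean_mass) +
    rnorm(nrow(raw_size_data), sd = 0.05)
)

# Fit a linear model of log(sd) on log(mean) and return its two coefficients.
sketch_get_sd_parameters <- function(raw_size_data) {
  fit <- lm(log(sd_mass) ~ log(mean_mass), data = raw_size_data)
  coefs <- unname(coef(fit))
  list(intercept = coefs[1], slope = coefs[2])
}

sd_pars <- sketch_get_sd_parameters(raw_size_data)
```

The returned list has the shape that a function like estimate_sd(sp_mean, pars) could consume.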

Next steps

The next milestone for this is going to be ROpenSci review.
For that:

Clean up function documentation
Make sure datasets and manuscripts are cited properly throughout
Rewrite vignettes (see ipad notes)
Remove hartland data

Current pkgstats loc is 384.

Could add fxns, have also wanted to add shiny app, but I think do the above and then return to adding additional pieces.

fxn to simulate community-wide measures given means and abundances

citations

Dunning 2008

Pardieck 2020 BBS data release

Fristoe et al 2015

Thibault et al 2011

Harris et al 2018

fxn to clean raw bbs survey data

clean_sp_size_data(raw_size_data)

estimate_sd(sp_mean, pars)

birdsize methods/software note paper

fxn to lookup aou given species name?

fxns to parse bbs data --> expected format

go from raw downloads from e.g. https://www.sciencebase.gov/catalog/item/5ea04e9a82cefae35a129d65

to what you get out of the retriever

this would allow folks to go from raw download --> interface with this package without needing the retriever. seems slightly fiddly but doable.

prioritization of methods

If both species_mean and species_code are provided, which to use?

I wrote the function to default to using species_mean, but on reflection I feel like using species_code may be preferable.
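
One way the species_code-first option could look, as a sketch. `sketch_resolve_input()` and its return shape are hypothetical; the point is only that the precedence is explicit and the user is told when an argument is ignored.

```r
# Decide which input to use when one or both are supplied.
sketch_resolve_input <- function(species_code = NULL, species_mean = NULL) {
  if (is.null(species_code) && is.null(species_mean)) {
    stop("Provide either `species_code` or `species_mean`.")
  }
  if (!is.null(species_code) && !is.null(species_mean)) {
    message("Both supplied; using `species_code` and ignoring `species_mean`.")
  }
  # species_code takes precedence whenever it is present.
  if (!is.null(species_code)) {
    return(list(method = "species_code", value = species_code))
  }
  list(method = "species_mean", value = species_mean)
}

# 1234 is a placeholder, not a real species code.
resolved <- sketch_resolve_input(species_code = 1234, species_mean = 20)
```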

pkgcheck failure notes

I'm getting a variety of interesting failures running pkgcheck either locally or via actions.

I've tried:

deleting all of pkgcheck's cache files
deleting and redownloading the whole repo
running the checks interactively; fails with the same error (can't subset column that doesn't exist).

This is new today.

Add functionality to directly access BBS data?

I could add functionality to download and process BBS data through this package. So far I have not done this, in order to keep this package lightweight and focused on the key functionality of estimating size measurements for birds. Adding it might make it easier for users to extract size estimates for populations/communities on specific routes.

The rdataretriever package includes functionality (and a vignette) for loading BBS data; MATSS does as well (using the retriever under the hood). (In fact I have always worked with BBS data downloaded via MATSS.)

My thinking is to avoid duplicating effort by not reworking the problem of downloading BBS data, and instead to work through the exercise of downloading BBS data using the general-user instructions for the retriever and then using birdsize to work with those data, probably including instructions or a vignette for doing so.
