diazrenata / scads

Statistically constrained abundance distributions
License: MIT License
A number of functions are popular for fitting the SAD, but it's not clear how much of the support for them comes from the fact that they generate hollow curves, versus whether they meaningfully predict observed vectors above and beyond the constraint imposed by the feasible set.
The logic here has some nuance, and possibly some circularity, to it, and probably needs more thought. But:
Where do vectors drawn from the fitted distribution (lognormal, geometric, etc), that have been constrained/selected to have the correct S and N, fall in the feasible set compared to empirical distributions?
This has some nuance to it, because constraining the samples to fall within the feasible set (i.e., to have the right S and N) may drag them away from what is likely for the function. An alternative might be to calculate the likelihood of each of the FS samples given the function, and see whether the empirical vector has an especially high likelihood compared to the bulk of the FS.
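One hedged way to make the likelihood comparison concrete: enumerate the full feasible set for small S and N, evaluate the likelihood of every element under the fitted function, and ask where the empirical vector falls. This is a minimal Python sketch, not the repo's R code; the example vector, the Poisson stand-in for the fitted function, and all names here are illustrative assumptions. (A Poisson is used rather than a geometric because a geometric assigns identical likelihood to every vector with the same S and N, which would make the comparison uninformative.)

```python
# Sketch: where does the empirical vector's likelihood, under a fitted
# distribution, fall relative to the likelihoods of feasible-set samples?
# Small S and N so the whole feasible set can be enumerated exactly.
import math

def partitions(n, k, max_part=None):
    """Yield all partitions of n into exactly k positive parts (nonincreasing)."""
    if max_part is None:
        max_part = n
    if k == 1:
        if n <= max_part:
            yield (n,)
        return
    for first in range(min(n - k + 1, max_part), 0, -1):
        for rest in partitions(n - first, k - 1, first):
            yield (first,) + rest

def pois_loglik(vec, lam):
    """Log-likelihood of abundances under Poisson(lam).

    Note: a geometric would be degenerate here, since its likelihood
    p^S * (1 - p)^(N - S) depends only on S and N, which are fixed
    across the feasible set.
    """
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in vec)

obs = (12, 4, 2, 1, 1)            # hypothetical empirical vector: S = 5, N = 20
S, N = len(obs), sum(obs)
lam_hat = N / S                   # MLE for the Poisson mean

fs = list(partitions(N, S))       # the full feasible set for S = 5, N = 20
logliks = [pois_loglik(v, lam_hat) for v in fs]
obs_ll = pois_loglik(obs, lam_hat)

# fraction of feasible-set elements with likelihood <= the empirical vector's
percentile = sum(ll <= obs_ll for ll in logliks) / len(fs)
```

At real S and N the feasible set can't be enumerated, so `fs` would be a sample of draws instead; the percentile logic is unchanged.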
e1071::skewness calculates skewness via one of three formulas (type = 1, 2, or 3). Try the other ones and see if you get the same results.
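For reference, the three formulas behind e1071's `type` argument (after Joanes & Gill 1998) can be written out directly; this is a Python transcription for comparison, not the package's code:

```python
# The three skewness formulas implemented by e1071::skewness (types 1-3).
import math

def skewness(x, type=3):
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    if m2 == 0:
        return 0.0                             # flat vector: no spread, call it 0
    g1 = m3 / m2 ** 1.5                        # type 1: classical moment ratio
    if type == 1:
        return g1
    if type == 2:                              # type 2: adjusted Fisher-Pearson
        return g1 * math.sqrt(n * (n - 1)) / (n - 2)
    return g1 * ((n - 1) / n) ** 1.5           # type 3: e1071's default
```

Types 2 and 3 are monotone rescalings of type 1 for a fixed n, so the *percentile* of the observed skewness within a feasible set (where every vector has the same S) should be identical under all three; the raw values will differ.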
https://onlinelibrary.wiley.com/doi/epdf/10.1111/ecog.03424 Is this data available? Would be super fun.
https://zenodo.org/record/1120445
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0112850
What is the (real) sampling limit? So far I have been running up against my own patience + disk space in my personal hpg division.
The skewness percentile effectively gives a P-value-like measure of how consistent the difference is. It would be nice to also be able to describe the magnitude of the difference.
Current idea: the difference between the actual vector and the 1:1 line? The actual vector should always fall below the 1:1 line, so there's no issue with absolute values.
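A minimal sketch of that magnitude metric, assuming the comparison is between paired observed and expected values (e.g., rank-by-rank observed abundance vs. the feasible-set central tendency, plotted against a 1:1 line); the function name and framing are illustrative, not from the repo:

```python
# Mean signed deviation of paired (expected, observed) points from the 1:1 line.
# If the observed vector always sits below the line, this is always <= 0,
# so its magnitude can be compared across sites without taking absolute values.
def mean_deviation_from_identity(expected, observed):
    assert len(expected) == len(observed)
    return sum(o - e for e, o in zip(expected, observed)) / len(expected)
```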
Especially as you pull more distinct samples from the feasible set, you're not mapping what is likely so much as what is possible; all possible draws from any other distribution must be within the feasible set.
That said, especially as S and N get big and the feasible set balloons, it should take a while to happen upon a really weird one (for example, a flat vector).
See issue 5 in scadsplants: diazrenata/scadsplants#5. Carry this forward in future analyses.
Behavior at small S, small N, small N/S, and then increasing:
Dimensions within:
S <- c(2:10)
N <- c(2:19, seq(20, 200, by = 10))
S <- c(10, 20, 30, seq(50, 250, by = 50))
N <- seq(50, 250, by = 50)
S <- c(seq(5, 40, by = 5), 50, 75)
N <- c(seq(500, 2500, by = 500), 5000, 7500)
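The R vectors above define the grids; crossing them still needs a feasibility filter, since every species requires at least one individual (N >= S). A Python sketch of the cross for the first regime (values copied from above; the filter is the point):

```python
# Cross S and N values for the small-S, small-N regime and keep only
# (S, N) combinations that admit a feasible set, i.e. N >= S.
from itertools import product

S_vals = list(range(2, 11))                             # S <- c(2:10)
N_vals = list(range(2, 20)) + list(range(20, 201, 10))  # N <- c(2:19, seq(20, 200, by = 10))
combos = [(s, n) for s, n in product(S_vals, N_vals) if n >= s]
```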
So to do this you would generate feasible sets for the observed vector > calculate skewnesses > calculate the observed percentile; then do the entire process on vectors that have the same values for the abundant species but have singletons added or removed, and see if the percentile changes.
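The pipeline above can be sketched end to end at small S and N, where the feasible set can be enumerated exactly instead of sampled. The vectors and helper names here are illustrative assumptions, not the repo's implementation:

```python
# Singleton-sensitivity check: observed skewness percentile within the full
# feasible set, recomputed after adding one singleton to the vector.

def partitions(n, k, max_part=None):
    """Yield all partitions of n into exactly k positive parts (nonincreasing)."""
    if max_part is None:
        max_part = n
    if k == 1:
        if n <= max_part:
            yield (n,)
        return
    for first in range(min(n - k + 1, max_part), 0, -1):
        for rest in partitions(n - first, k - 1, first):
            yield (first,) + rest

def skew(x):
    """Type-3 (e1071 default) sample skewness; 0 for a perfectly flat vector."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    if m2 == 0:
        return 0.0
    return (m3 / m2 ** 1.5) * ((n - 1) / n) ** 1.5

def skew_percentile(obs):
    """Percentile of the observed skewness within the full feasible set for its S, N."""
    fs_skews = [skew(v) for v in partitions(sum(obs), len(obs))]
    return sum(s <= skew(obs) for s in fs_skews) / len(fs_skews)

obs = (10, 5, 2, 2, 1)                             # S = 5, N = 20
pct = skew_percentile(obs)
pct_plus_singleton = skew_percentile(obs + (1,))   # S = 6, N = 21
```

Comparing `pct` with `pct_plus_singleton` is the sensitivity question: does one undetected singleton move the observed percentile meaningfully?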
What's the maximum S and N I can calculate a p table for in R?
Download the data from Xiao (White?) et al 2012 and see what the approximate maximum S and N are.
Closely related to #17.
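One way to get a feel for the limit is to look at the combinatorial object itself: the number of partitions of N into exactly S positive parts, which is what a p table has to cover. A sketch of the standard recurrence (in Python, not the repo's R; memoizing this counts how fast the table's entries blow up, though the R memory ceiling itself still has to be measured empirically):

```python
# p(n, k) = number of partitions of n into exactly k positive parts,
# via the standard recurrence p(n, k) = p(n - 1, k - 1) + p(n - k, k):
# either the partition contains a 1 (drop it), or it doesn't (subtract 1
# from every part).
from functools import lru_cache

@lru_cache(maxsize=None)
def n_partitions(n, k):
    if k <= 0 or n < k:
        return 0
    if k == 1 or k == n:
        return 1
    return n_partitions(n - 1, k - 1) + n_partitions(n - k, k)
```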
See theories and predicted distributions collected in https://onlinelibrary.wiley.com/doi/full/10.1111/j.1461-0248.2007.01094.x. How close are the predictions to the empirical vectors in terms of their positions within the feasible set?
This gets complex because you're kind of mixing approaches. Many of these predictions are idealized distributions, not actual vectors of counts. Most draws from some of these distributions don't sum to the state variables.
I have previously tried evaluating the likelihood of all samples from the feasible set under the predicted distribution, and then seeing if the empirical vector has the highest likelihood. I think this tells us whether the predicted distribution is pointing at hollow curves in the feasible set generally, or specifically at the empirical hollow curve. But note that winning in this sense is more like a Venn diagram: the empirical is at the intersection of things-predicted-by-the-distribution and the feasible set.