scads's Issues

Skewness and singletons

  • Empirical datasets are most likely to have errors in the precise number of very rare species (veil-line-type issues).
  • Does fuzzing the rare tail of an empirical SAD change its position in the feasible set?

To do this: generate feasible-set (FS) samples for the observed vector, calculate their skewness, and calculate the percentile of the observed skewness. Then repeat the entire process on vectors that keep the same values for the abundant species but have singletons added or removed, and see whether the percentile changes.
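A minimal Python sketch of the procedure above, for illustration only (the range fragments elsewhere in these notes are R-style, so the project itself is presumably in R). It assumes the feasible set is all partitions of N into exactly S positive parts, samples it uniformly with a standard counting recurrence, and uses moment-based skewness. The names `count_parts`, `sample_fs`, `skew_percentile`, and `shift_singletons`, and the `obs` vector, are hypothetical.

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parts(n, k):
    """Number of partitions of n into exactly k positive parts."""
    if k == 0:
        return 1 if n == 0 else 0
    if n < k:
        return 0
    # Either the smallest part is 1, or every part is >= 2.
    return count_parts(n - 1, k - 1) + count_parts(n - k, k)

def sample_fs(n, k, rng):
    """Draw one partition of n into exactly k parts, uniformly at random."""
    parts, offset = [], 0
    while k > 0:
        if rng.randrange(count_parts(n, k)) < count_parts(n - 1, k - 1):
            parts.append(1 + offset)   # smallest remaining part is 1 (plus offset)
            n, k = n - 1, k - 1
        else:
            offset += 1                # all remaining parts >= 2: peel 1 off each
            n -= k
    return sorted(parts, reverse=True)

def skewness(x):
    """Moment-based skewness m3 / m2^1.5 (0.0 if all values are equal)."""
    mean = sum(x) / len(x)
    m2 = sum((v - mean) ** 2 for v in x) / len(x)
    m3 = sum((v - mean) ** 3 for v in x) / len(x)
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

def skew_percentile(obs, ndraws=2000, seed=1):
    """Percent of feasible-set draws with skewness below the observed skewness."""
    rng = random.Random(seed)
    n, k = sum(obs), len(obs)
    s_obs = skewness(obs)
    draws = (skewness(sample_fs(n, k, rng)) for _ in range(ndraws))
    return 100.0 * sum(d < s_obs for d in draws) / ndraws

def shift_singletons(obs, delta):
    """Add (delta > 0) or remove (delta < 0) singletons; other species unchanged."""
    kept = sorted(obs, reverse=True)
    if delta >= 0:
        return kept + [1] * delta
    drop = min(-delta, kept.count(1))  # cannot remove more singletons than exist
    return kept[:len(kept) - drop]

obs = [40, 22, 11, 6, 3, 2, 1, 1, 1]   # hypothetical observed abundances
for delta in (-2, 0, 2):
    v = shift_singletons(obs, delta)
    print(delta, len(v), sum(v), skew_percentile(v))
```

Note that adding or removing singletons changes both S and N, so each perturbed vector is compared against its own feasible set. The recursive count is only practical for modest N; larger values would need an iterative table.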

Positions of common approximations within FS

A number of functions are popular for fitting the SAD, but it's not clear how much of the support for them comes from the fact that they generate hollow curves, versus from meaningfully predicting observed vectors above and beyond the constraint imposed by the feasible set.

The logic here has some nuance/chirality to it, and probably needs more thought. But:

Where do vectors drawn from the fitted distribution (lognormal, geometric, etc.) and constrained/selected to have the correct S and N fall in the feasible set, compared to empirical distributions?

This has some nuance to it because constraining the samples to fall within the feasible set (have the right S and N) may drag them away from what is likely for the function. An alternative might be to calculate the likelihood of each of the FS samples given the function, and see if the empirical vector has an especially high likelihood compared to the bulk of the FS.
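The likelihood-based alternative could look roughly like this Python sketch. It makes several assumptions that are not in the notes: a lognormal density evaluated at integer abundances and renormalised over 1..N stands in for "the function", its parameters are the mean and standard deviation of the log abundances, species are treated as independent draws (the number of orderings of each vector is ignored), and S and N are small enough to enumerate the whole feasible set. All names and the `obs` vector are hypothetical.

```python
import math

def partitions(n, k, max_part=None):
    """Yield every partition of n into exactly k positive parts, largest first."""
    if max_part is None:
        max_part = n
    if k == 0:
        if n == 0:
            yield []
        return
    lo = -(-n // k)                      # ceil(n / k): largest part is at least this
    hi = min(max_part, n - (k - 1))      # leave at least 1 for each other part
    for first in range(hi, lo - 1, -1):
        for rest in partitions(n - first, k - 1, first):
            yield [first] + rest

def fit_lognormal(obs):
    """Mean and sd of log abundances (a crude fit; sd must be > 0)."""
    logs = [math.log(v) for v in obs]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)

def log_pmf_table(mu, sigma, n_max):
    """log P(n) for n = 1..n_max: lognormal density at integers, renormalised."""
    logw = [-math.log(n) - (math.log(n) - mu) ** 2 / (2 * sigma ** 2)
            for n in range(1, n_max + 1)]
    m = max(logw)
    log_z = m + math.log(sum(math.exp(w - m) for w in logw))
    return [w - log_z for w in logw]

def loglik(vec, table):
    """Log-likelihood of an abundance vector, species treated as independent."""
    return sum(table[v - 1] for v in vec)

obs = [12, 5, 2, 1]                      # hypothetical empirical vector
N, S = sum(obs), len(obs)
table = log_pmf_table(*fit_lognormal(obs), N)
fs = list(partitions(N, S))
ll = [loglik(v, table) for v in fs]
ll_obs = loglik(obs, table)
pct = 100.0 * sum(x < ll_obs for x in ll) / len(ll)
best = max(fs, key=lambda v: loglik(v, table))
print(len(fs), round(pct, 1), best)
```

A high percentile with a different `best` vector would suggest the function points at hollow curves in general rather than at the empirical one specifically.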

Comparing coefficients

  • Am I understanding the Euclidean distance piece correctly? Would it make more sense to do boxplots for each coefficient, or some kind of multivariate/ordination thing?

Sampling limit?

What is the (real) sampling limit? So far I have been running up against my own patience + disk space in my personal hpg division.

Behaviors at range of S*N

Behavior at small S, small N, small N/S, and then increasing:

  • number of unique elements (of, say, 10000 draws)
  • violins(?) of skewness, Shannon, Simpson
  • heatmaps(?)
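The summaries in this list are straightforward to compute once a set of draws exists. A small Python sketch, assuming natural-log Shannon entropy and the Gini-Simpson form of Simpson's index (1 minus the sum of squared proportions); the notes do not say which variants are intended, and the function names and `draws` are hypothetical.

```python
import math

def shannon(x):
    """Shannon entropy -sum(p * ln p) over relative abundances."""
    total = sum(x)
    return -sum((v / total) * math.log(v / total) for v in x if v > 0)

def simpson(x):
    """Gini-Simpson index, 1 - sum(p^2)."""
    total = sum(x)
    return 1.0 - sum((v / total) ** 2 for v in x)

def n_unique(draws):
    """Number of distinct abundance vectors, ignoring the order of species."""
    return len({tuple(sorted(d, reverse=True)) for d in draws})

draws = [[4, 1], [3, 2], [1, 4], [3, 2]]   # hypothetical draws for S = 2, N = 5
print(n_unique(draws))                      # 2
print([round(shannon(d), 3) for d in draws])
print([round(simpson(d), 3) for d in draws])
```

Comparing `n_unique` against the number of draws gives a quick sense of how saturated a small feasible set is.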

Dimensions within:

  • For small S & N, all possible combinations for which N > S:
    • S <- c(2:10)
    • N <- c(2:19, seq(20, 200, by = 10))
  • To explore the broader space, combinations of:
    • Large S, moderate N:
      • S <- c(10, 20, 30, seq(50, 250, by = 50))
      • N <- seq(50, 250, by = 50)
    • Moderate S, large N:
      • S <- c(seq(5, 40, by = 5), 50, 75)
      • N <- c(seq(500, 2500, by = 500), 5000, 7500)
      • Exclude combinations of S > 40, N > 2500.
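The combinations above can be written out explicitly to check how many (S, N) pairs each block produces. This is a Python transcription of the R-style ranges in the list, with two assumptions: the N > S filter is applied to every block (the notes state it only for the small block), and "S > 40, N > 2500" is read as excluding pairs where both conditions hold at once.

```python
from itertools import product

def grid(s_values, n_values, keep=lambda s, n: True):
    """All (S, N) pairs with N > S that also pass the optional keep() filter."""
    return [(s, n) for s, n in product(s_values, n_values)
            if n > s and keep(s, n)]

# Small S and N: S = 2..10, N = 2..19 plus 20, 30, ..., 200.
small = grid(range(2, 11), list(range(2, 20)) + list(range(20, 201, 10)))

# Large S, moderate N.
large_s = grid([10, 20, 30] + list(range(50, 251, 50)), range(50, 251, 50))

# Moderate S, large N, dropping pairs with both S > 40 and N > 2500.
large_n = grid(list(range(5, 41, 5)) + [50, 75],
               list(range(500, 2501, 500)) + [5000, 7500],
               keep=lambda s, n: not (s > 40 and n > 2500))

print(len(small), len(large_s), len(large_n))
```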

Evaluate historical predictions relative to FS

Closely related to #17.

See the theories and predicted distributions collected in https://onlinelibrary.wiley.com/doi/full/10.1111/j.1461-0248.2007.01094.x. How close are the predictions to the empirical in terms of their positions within the feasible set?

This gets complex because you're kind of mixing approaches. Many of these predictions are idealized distributions, not actual vectors of counts. Most draws from some of these distributions don't sum to the state variables.

I have previously tried evaluating the likelihood of all samples from the feasible set under the predicted distribution, and then seeing if the empirical vector has the highest likelihood. I think this tells us if the predicted distribution is pointing at hollow curves in the feasible set generally, or at specifically the empirical hollow curve. But note that winning in this sense is more like a Venn diagram: the empirical vector is at the intersection of things-predicted-by-the-distribution and the feasible set.

Magnitude of skewness differences

The skewness percentile effectively gives a p-value-like measure of how consistently the observed vector differs from the feasible set. It would be nice to also be able to describe the magnitude of the difference.

Current idea: The difference between the actual vector and the 1:1 line? It should always be lower than the 1:1 line, so there's no issue of absolute values.

Map the fs space

Especially as you pull more distinct samples from the feasible set, you're not mapping what is likely so much as what is possible; all possible draws from any other distribution must be within the feasible set.

That said, especially as S and N get big and the fs balloons, it should take a while to happen upon a really weird one (for example, a flat one).

Data foraging

NEON:

  • Plants
  • Ground beetles?
    • Aquatic plants: too many unidentified
    • Aquatic macroinverts? "identified to lowest practical taxon...genus or species" seems likely to muddy the water
  • Fish
  • Small mammals

Maximum S and N

What's the maximum S and N I can calculate a p table for in R?
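If "p table" here means the table of partition counts, with entry (N, S) giving the number of partitions of N into exactly S parts (an assumption; the notes do not define it), the recurrence itself is cheap and the practical limits are memory and number size. The Python sketch below builds the table iteratively; `partition_table` is a hypothetical name. Python integers are arbitrary precision, whereas counts stored as R doubles would stop being exact above 2^53 and overflow near 1.8e308, so a big-integer package would presumably be needed there for large S and N.

```python
def partition_table(n_max, k_max):
    """table[n][k] = number of partitions of n into exactly k positive parts."""
    table = [[0] * (k_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, min(n, k_max) + 1):
            # Smallest part is 1, or all parts are >= 2 (subtract 1 from each).
            table[n][k] = table[n - 1][k - 1] + table[n - k][k]
    return table

t = partition_table(500, 50)
# Number of decimal digits in the count for N = 500, S = 50.
print(len(str(t[500][50])))
```

The table holds (N + 1) x (S + 1) entries, so memory grows with the product of the two maxima, and each entry grows in digits as N increases.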
