
Comments (13)

rueuntal avatar rueuntal commented on May 29, 2024

Hey @dmcglinn and @ngotelli - here is my preliminary thought, without a clear solution. Let's take a simpler example. Suppose we have 5 plots lined up in a linear fashion, as Dan's figure shows. If we look at S at the 2-plot level, there are 4 combos: 12, 23, 34, 45. However, these four combos are weighted differently in the two algorithms.

In the moving window algorithm, they are given exactly the same weight (25%), because that's how the window moves forward.
In our accumulation algorithm:
Starting from plot 1, there is only one combo that we can get for 2 plots, which is 12.
Starting from plot 2, there are two possibilities, 21 and 23.
Starting from plot 3, 32 and 34.
Starting from plot 4, 43 and 45.
Starting from plot 5, only one combo 54.
So in our algorithm, in calculating S(2 plots), we give 12 and 45 30% weight each, and 23 and 34 20% weight each.

I'd imagine this issue would persist for larger numbers of plots as well. Does this sound like a possible culprit to you?
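The weighting difference above is easy to verify with exact arithmetic. This is a hypothetical sketch in Python (not mobr code); it assumes a uniformly random starting plot and a uniformly random choice between equidistant neighbours:

```python
from fractions import Fraction

# Weight of each adjacent 2-plot combo under the two sampling schemes
# for 5 plots in a line.
n = 5
combos = [(i, i + 1) for i in range(1, n)]  # (1,2), (2,3), (3,4), (4,5)

# Moving window: each combo appears exactly once -> equal weight (1/4).
window = {c: Fraction(1, len(combos)) for c in combos}

# Accumulation: pick a starting plot uniformly, then move to a nearest
# unused neighbour (ties broken uniformly at random).
accum = {c: Fraction(0) for c in combos}
for start in range(1, n + 1):
    # neighbours of the start that exist within the transect
    nbrs = [p for p in (start - 1, start + 1) if 1 <= p <= n]
    for nb in nbrs:
        combo = tuple(sorted((start, nb)))
        accum[combo] += Fraction(1, n) * Fraction(1, len(nbrs))

print(window)  # every combo: 1/4
print(accum)   # (1,2) and (4,5): 3/10; (2,3) and (3,4): 1/5
```

The end combos 12 and 45 pick up extra weight because plots 1 and 5 have only one possible first step, exactly as described above.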

from mobr.

rueuntal avatar rueuntal commented on May 29, 2024

@dmcglinn does this issue only appear when there are ties?

dmcglinn avatar dmcglinn commented on May 29, 2024

I do think you may be on to something with your point about differential weighting. I did double check whether ties were the cause of the problem, and it does not appear that they are - I added random uniform noise to the site coordinates so that the distances are no longer exactly identical between different samples.

rueuntal avatar rueuntal commented on May 29, 2024

Too bad! I was hoping this was something simple... Dan, would it be possible for you to email me the site-by-species matrix of one such simulation?

rueuntal avatar rueuntal commented on May 29, 2024

Actually I just realized that the problem may persist even when there are no ties. Suppose that with the random noise, plots 1 & 2 are the closest pair. Then the sliding window would still give equal weight to all combos, while the combo 23 would be weighted even less in our algorithm - when the starting point is plot 2, it would always go to plot 1 as the next stop.

I'm still not entirely sure if this is causing the jaggedness. But if it is the problem, then I think both methods are legitimate and it's down to our decision.
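This no-ties case can also be illustrated with a small sketch (hypothetical Python, not mobr code; it assumes each start steps to its single nearest neighbour, which is deterministic once the coordinates are jittered):

```python
import random
from collections import Counter

# Jitter the coordinates of 5 plots on a line so no two pairwise
# distances tie, then record which 2-plot combo each starting plot
# produces when it steps to its nearest neighbour.
random.seed(1)
coords = [i + random.uniform(-0.1, 0.1) for i in range(1, 6)]

counts = Counter()
for s in range(5):
    nearest = min((j for j in range(5) if j != s),
                  key=lambda j: abs(coords[j] - coords[s]))
    counts[tuple(sorted((s + 1, nearest + 1)))] += 1

# Each start now contributes exactly one combo; any interior combo that
# loses the "nearest" contest at both of its plots is never a first step.
print(counts)
```

With the jitter bounded well below the plot spacing, the nearest neighbour is always an adjacent plot, but the five first steps no longer spread evenly over the four combos.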

rueuntal avatar rueuntal commented on May 29, 2024

Hey @dmcglinn - could you send me the community matrix of your simulated community (or point me to an empirical data set that has this behavior)?

dmcglinn avatar dmcglinn commented on May 29, 2024

Hey @rueuntal you can simulate the community using this R code:

gauss.niche <- function(m, z, u, s){
 ## Purpose: to provide an exponential unimodal function
 ## which is a model for a species response to the environment
 ## Called within the functions 'peaks' and 'sim.init.uni'
 ## from Palmer 1992
 ## Arguments:
 ## 'm' is the maximum performance
 ## 'z' is the environment at a given coordinate
 ## 'u' is the environmental optimum
 ## 's' is the habitat breadth
 m * exp(-0.5 * (z - u)^2 / s^2)
}


S = 15      # number of species
N = 100     # number of samples
env = 1:N   # gradient / spatial locations
sp_optima = round(seq(min(env), max(env), length.out=S))
m = 2.5
s = 2
dat = data.frame(sp=rep(1:S, each=N), expand.grid(env, sp_optima))
names(dat) = c('sp', 'env', 'optima')
resp = round(mapply(gauss.niche, z=dat$env, u=dat$optima,
                    MoreArgs = list(m=m, s=s)))
     # if Poisson noise is desired, add `+ rpois(length(dat$env), 0.05)`
     # inside the round() call above
dat = data.frame(dat, resp)
head(dat)

comm = with(dat, tapply(resp, list(env, sp), sum))
comm[1:10, 1:10]

rueuntal avatar rueuntal commented on May 29, 2024

Great thanks!

rueuntal avatar rueuntal commented on May 29, 2024

Hey @dmcglinn I think I know why the jagged behavior occurs in this particular example. This simulated community is highly structured - in the one I got, there is exactly ONE species present in 99 of the 100 plots and zero in the remaining one (I don't remember which; somewhere in the middle). The head and tail of the community matrix look like this:

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1   2 0 0 0 0 0 0 0 0  0  0  0  0  0  0
2   2 0 0 0 0 0 0 0 0  0  0  0  0  0  0
3   2 0 0 0 0 0 0 0 0  0  0  0  0  0  0
4   1 0 0 0 0 0 0 0 0  0  0  0  0  0  0
5   0 1 0 0 0 0 0 0 0  0  0  0  0  0  0
...
96  0 0 0 0 0 0 0 0 0  0  0  0  0  1  0
97  0 0 0 0 0 0 0 0 0  0  0  0  0  0  1
98  0 0 0 0 0 0 0 0 0  0  0  0  0  0  2
99  0 0 0 0 0 0 0 0 0  0  0  0  0  0  2
100 0 0 0 0 0 0 0 0 0  0  0  0  0  0  2

And here's the spatial curve I got:
[Figure: spatial rarefaction curve for the simulated community]
The break points are different from those in your plot (different parameterization?) but I suspect the underlying reason is the same. Let's look at the four points at the far right - if the number of plots is >= 97, we always get all 15 species. Looking back at the community matrix, this makes sense - each species occurs in at least 4 consecutive plots, so we'd have to remove at least 4 plots (i.e. nplots <= 96) to begin to lose species.
This is true in both our algorithm and the moving window algorithm, of course. But the reason we see a bump at nplot = 96 in ours but not in the moving window is that our algorithm is FAR MORE LIKELY to lose (or I should say, not cover) those four plots at either end. In the moving window, there are only 5 starting points when nplot = 96: 1-96, 2-97, 3-98, 4-99, 5-100. The first and last combos give S = 14, because the 4 consecutive plots are left out at one end, while the middle three combos give S = 15. In our approach, we can start from any of the 100 plots, and for the vast majority of these starting points we'd not hit the 4 plots at one end or the other. For example:
Starting from 1 - go all the way to 96, leaving out 97-100.
Starting from 10 - go all the way to 1 and all the way to 19, then to 96.
Starting from 40 - go all the way to 1 and all the way to 79, then to 96.
...
You can probably see my point - unless the starting point is very close to the center (e.g., starting from 50 we'd get 2-97 or 3-98), one end is always left out. This is much, much more likely in our algorithm than in the moving window. Personally I see this as a difference in sampling (i.e., how do we take a sample of n adjacent plots from the landscape?) rather than a problem.
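The edge-coverage argument can be made concrete with a quick count. This is a Python sketch (not mobr code); it assumes plots are accumulated in order of distance from the starting plot, with distance ties broken toward the lower index:

```python
# Count how many of the 100 starting plots, when accumulating the 96
# nearest plots, leave out an entire 4-plot end (1-4 or 97-100) and
# therefore lose the species confined to that end (S = 14, not 15).
n, k = 100, 96
ends = [{1, 2, 3, 4}, {97, 98, 99, 100}]

misses = 0
for s in range(1, n + 1):
    # accumulate by distance from the start; ties go to the lower index
    order = sorted(range(1, n + 1), key=lambda p: (abs(p - s), p))
    sampled = set(order[:k])
    if any(not (end & sampled) for end in ends):
        misses += 1
print(misses, "of", n, "starting plots miss an end")  # 97 of 100

# Moving window for comparison: only 2 of the 5 windows of 96
# contiguous plots (1-96 and 5-100) miss an end.
windows = [set(range(a, a + k)) for a in range(1, n - k + 2)]
window_misses = sum(any(not (end & w) for end in ends) for w in windows)
print(window_misses, "of", len(windows), "windows miss an end")
```

Under this tie-breaking rule, 97 of the 100 accumulation curves fail to cover one 4-plot end, versus 2 of the 5 moving windows (40%), which matches the intuition that only near-central starting plots cover both ends.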

dmcglinn avatar dmcglinn commented on May 29, 2024

Ok I think you nailed it! Thank you so much for such a careful comparison of the two curves. I totally see your point about why the moving window SAR and our spatial rarefaction curve diverge increasingly at large scales. Both approaches place more weight on samples near the center of the sampling extent because they are included in more curves, but it appears that the spatial rarefaction curve does this more than the moving window because there are more possible curves that do not capture the edges. My personal feeling is that the moving window more faithfully captures the extents of interest, but I also recognize that the moving window approach is really designed for spatially regular or contiguous sampling schemes, whereas our approach shines in messier designs.

Do you think this is possibly further evidence that we shouldn't over interpret the pattern of the spatial rarefaction curves at large scales? So for example these curves from Jon's burn study:

[Figure: rarefaction curves from Jon's burn study]

What would you make of the wiggly pattern in those curves at large scales? (Also note that the x-axis is log scaled in the first plot but not in the others - more code to fix ;))

rueuntal avatar rueuntal commented on May 29, 2024

Yeah I totally agree - the moving window is kind of an "unbiased" sampling, because at each scale it selects each possible configuration exactly once. I have been thinking this evening about how to implement the moving window for a scattered plot design, but couldn't come up with an easy solution.

And you are right, this just adds additional ambiguity to the effect of aggregation (besides the fact that the scale is kind of arbitrary, and the effect inevitably diminishes toward the largest scale; oh well). We should probably remove the language saying that the spatial curve is equivalent to the SAR and briefly discuss this in a caveat.

Thanks again for pointing out the issue and providing such a clear cut example!

dmcglinn avatar dmcglinn commented on May 29, 2024

When we point this out, though, we need to note that the two curves really are very close but technically not identical; we could use this example as a supplement to demonstrate it.

rueuntal avatar rueuntal commented on May 29, 2024

Sounds good.
