dynamic_range_model's People

Contributors

aallyn, afredston, danovando, mpinsky, snow942

dynamic_range_model's Issues

size selectivity of survey and frequency distribution of size classes

Adding size structure in the model is unfortunately a little more complicated than just fitting the model to the data over an additional dimension. Because of survey size selectivity, small fish are very rare, so the data makes it look like there are more adults than small juveniles. Of course, that's not biologically possible, and the reverse should be true.

In other words, at the moment the model is trying to predict a population with few recruits and many adults, which we clearly need to fix. @DanOvando suggested rescaling the survey data to account for size selectivity, which I'll look into. I also think we may need to model presence and abundance separately (i.e. a delta model) so the population model is just trying to predict positive counts rather than tons of zeroes.

temperature data to use in model

need to decide on & get a permanent source for temperature data to use in the model. @mpinsky any thoughts on this? do you have access to the methods that were used to extract ROMS data for the cod challenge? I've only seen ROMS datasets prepared by oceanographers but can look into it.

sensitivity to initial conditions + model stochasticity

some thoughts following a model discussion in Stats 691 with Michael Stein:

the model currently assumes that the initial conditions (starting population sizes in each patch of the three size classes) are known without error. this is definitely not the case; I'd like to explore sensitivity to this assumption, maybe by randomly varying the year 0 inputs to see if model outputs change much.
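A cheap version of that sensitivity check: jitter the year-0 abundances with multiplicative noise and re-run the model from each perturbed state. A sketch with a hypothetical 20% CV; the function name and shapes are mine.

```python
import numpy as np

rng = np.random.default_rng(42)

def perturb_initial_conditions(n0, cv=0.2, n_draws=100):
    """Jitter the assumed year-0 abundances with multiplicative
    lognormal noise (hypothetical CV of 20%), producing alternative
    starting states to re-run the model from. The mean correction
    keeps the draws unbiased around the original values."""
    n0 = np.asarray(n0, dtype=float)
    sigma = np.sqrt(np.log(1.0 + cv**2))
    noise = rng.lognormal(mean=-0.5 * sigma**2, sigma=sigma,
                          size=(n_draws,) + n0.shape)
    return n0 * noise  # each row is one candidate initial state

draws = perturb_initial_conditions([1000.0, 400.0, 150.0])
```

If model outputs barely move across the 100 re-runs, the known-initial-conditions assumption is probably harmless; if they diverge, it needs to be in the model.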

relatedly, this may be an overlooked source of error in every time step, if the model assumes that the population sizes simulated last year are known without error. right now, really the only source of variability seems to be the initial parameter draws; after that the model is fully deterministic (?). I'm not sure computationally how we'd do this but Michael suggested adding just a little bit of variability in N_A, N_J, and N_Y every year, maybe with Markov Chain models (which I've only used in MCMC, need to look into this).
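Computationally, the simplest form of this is multiplicative process error applied after each deterministic update. A toy sketch: the `transition` function here is a stand-in for the model's actual projection step, and sigma is a made-up process-error scale.

```python
import numpy as np

rng = np.random.default_rng(0)

def step_with_process_error(n_y, n_j, n_a, transition, sigma=0.1):
    """Run one deterministic projection step (`transition` stands in
    for the model's update) and then multiply each stage by lognormal
    process error, so last year's simulated abundances are no longer
    treated as known without error."""
    n_y2, n_j2, n_a2 = transition(n_y, n_j, n_a)
    eps = rng.lognormal(mean=0.0, sigma=sigma, size=3)
    return n_y2 * eps[0], n_j2 * eps[1], n_a2 * eps[2]

# toy transition: adults produce young, stages graduate and survive
toy = lambda y, j, a: (0.5 * a, 0.3 * y + 0.7 * j, 0.2 * j + 0.85 * a)
n_y, n_j, n_a = step_with_process_error(1000.0, 400.0, 150.0, toy)
```

Iterating this step makes the trajectory a Markov chain in (N_Y, N_J, N_A): next year's state depends only on this year's state plus noise.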

biologically informed length class break points

Right now, the breakpoints between the three size classes in the model are determined as:

Small juvenile / large juvenile: halfway between smallest size caught, and length at maturity
Large juvenile / adult: length at maturity (mean of Lm for two sexes, if different)

The large juv / adult breakpoint is biologically informed, but the small juv / large juv breakpoint is not. It may be worth considering what the meaningful differences between the latter two groups are, and whether we can find a more biologically informed break point.
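The current rule is simple enough to write down explicitly, which also makes it easy to swap in a more biologically informed small/large juvenile cut later. Function and argument names are mine; the logic matches the definitions above.

```python
def size_class_breaks(min_length_caught, lm_female, lm_male=None):
    """Breakpoints as currently defined: the small/large juvenile cut
    falls halfway between the smallest size caught and length at
    maturity; the large juvenile/adult cut is length at maturity
    (mean of the two sexes, if they differ)."""
    l_mat = lm_female if lm_male is None else (lm_female + lm_male) / 2
    small_large = (min_length_caught + l_mat) / 2
    return small_large, l_mat

breaks = size_class_breaks(10.0, 70.0, 66.0)  # -> (39.0, 68.0)
```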

Sample data spatial filtering and VAST domain

While fitting the VAST model to the summer flounder data a few things jumped out. There seems to be a rogue inland sample data point (Chesapeake Bay?). I went ahead and filtered this out. Though it made me wonder about what our spatial domain should be and if we should go with something slightly different than the default VAST northwest Atl grid, which extends into areas now regularly surveyed by the SEFSC. Image below for reference -- black points are the default VAST northwest Atl grid, the red points are the sample data. For this PR, I grabbed a NOAA NEFSC bottom trawl survey polygon and used that as our domain. The shapefile is here, though happy to use something else if it is more appropriate.
[figure: default VAST northwest Atlantic grid (black points) with the summer flounder sample data (red points)]
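For screening points like the rogue inland sample, a point-in-polygon test against the chosen domain does the job. A minimal pure-Python ray-casting sketch, with a toy rectangular domain standing in for the survey polygon; in practice we'd read the shapefile with geopandas and test against its geometry.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting point-in-polygon test. `poly` is a list of (x, y)
    vertices; the toy rectangle below stands in for the NEFSC survey
    footprint."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# keep only sample points inside a toy domain polygon (lon, lat)
domain = [(-76.0, 35.0), (-65.0, 35.0), (-65.0, 45.0), (-76.0, 45.0)]
samples = [(-70.0, 40.0), (-76.4, 37.5)]  # second point is "inland"
kept = [p for p in samples if point_in_polygon(p[0], p[1], domain)]
```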

speed up model

As soon as I added spatial structure, the model slowed to the point that a run takes roughly 4-5 hours. Need to look into ways to speed up Stan models, or parallelize runs on a server.

integrate ROMS for future forecasting

We need to get a ROMS product for this region that we can use for forecasting future states, and re-run the model fitted to historical ROMS values instead of the in situ temperature data.

deal with seasonality in surveys

To combine data from surveys conducted in different seasons, which differ in both methods (temperature, timing) and results (e.g., for spiny dogfish, different patterns in distribution, abundance, etc.), we need some statistical way to account for this source of variation... an observation model?
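One crude form of such an observation model: give each season its own catchability q relative to a reference season and rescale counts onto the reference scale. The sketch below estimates q naively from mean catch; a real version would estimate q jointly inside the model, and all labels and numbers here are made up.

```python
import numpy as np

def seasonal_catchability(counts, seasons, reference="spring"):
    """Rescale counts onto a reference season's scale using a naive
    per-season catchability q = (season mean catch) / (reference mean
    catch). Illustrative only; q should really be estimated jointly
    with abundance inside the model."""
    counts = np.asarray(counts, dtype=float)
    seasons = np.asarray(seasons)
    ref_mean = counts[seasons == reference].mean()
    q = {s: counts[seasons == s].mean() / ref_mean
         for s in np.unique(seasons)}
    return counts / np.array([q[s] for s in seasons])

adj = seasonal_catchability([10, 12, 30, 36],
                            ["spring", "spring", "fall", "fall"])
```

Here the fall hauls catch three times as much on average, so they are scaled down by q = 3 to be comparable with spring.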

incorporate additional survey data

Right now the model just uses NEUS spring survey data, which is the simplest because the survey has been very consistent, and every haul collected length data. Other datasets to consider incorporating:

  • SEUS SEAMAP trawl survey (has length data for some, not all, tows; but has biomass for all)
  • Scotian Shelf trawl data (has biomass for all?)
  • Southeast Reef Fish Survey (SERFS) - fixed gear

choosing informative priors

for spiny dogfish, a priority should be choosing informative priors, even if they are just constrained to reasonable orders of magnitude. this should help with parameter identifiability among the life stages for which we have little or no data. for cod (below), it was assumed that the average sizes at the two life stage transitions were known without error. the other rates were drawn from uniform distributions.

I think both of these lines of thought should probably be revisited for spiny dogfish. while it adds parameters to estimate, I'm not sure we can assume L_J and L_Y are known without error. further, I think uniform priors are increasingly discouraged in Bayesian methods because they can disproportionately weight extreme values relative to other distributions with some central tendency. not sure if that's applicable to ABC too, but worth discussing.

        import numpy as np

        # Prior draws for the cod model. The life-stage transition
        # lengths L_J and L_Y are fixed (assumed known without error);
        # the remaining parameters are drawn from uniform distributions.
        L_0_theta   = np.random.uniform(0, 4)
        L_inf_theta = np.random.uniform(100, 180)
        L_J_theta   = 34                # transition length, fixed
        L_Y_theta   = 68                # transition length, fixed
        Topt_theta  = np.random.uniform(6, 15)
        width_theta = np.random.uniform(1, 2)
        kopt_theta  = np.random.uniform(0.1, 1)
        xi_theta    = np.random.uniform(0, 0.25)
        m_J_theta   = np.random.uniform(0, 0.1)
        m_Y_theta   = np.random.uniform(0, 0.1)
        m_A_theta   = np.random.uniform(0, 0.1)
        K_theta     = np.random.uniform(1, 1000)
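As one alternative to the flat draws above, a weakly informative lognormal can be matched to the same plausible range, concentrating mass around a central tendency instead of weighting the extremes equally. A sketch; it only works for strictly positive bounds, so priors currently bounded at zero (e.g. the m_* rates) would need a small nonzero lower bound.

```python
import numpy as np

rng = np.random.default_rng(1)

def lognormal_from_interval(low, high):
    """Draw from a lognormal whose central ~95% spans [low, high];
    requires low > 0. Illustrative only; actual prior choices need
    species-specific justification."""
    mu = 0.5 * (np.log(low) + np.log(high))
    sigma = (np.log(high) - np.log(low)) / 4.0
    return rng.lognormal(mu, sigma)

# e.g. replacing kopt_theta = np.random.uniform(0.1, 1):
kopt_theta = lognormal_from_interval(0.1, 1.0)
```

The median of this prior sits at the geometric midpoint of the interval (here about 0.32) rather than spreading probability evenly out to both edges.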

choosing temperature variables for model fits

As currently written, when fitting the dynamic range model to real population data over space and time, we also pass the model a matrix of temperature values (patches x years). I'll start out just using a mean of in situ bottom temperature readings from the trawls in each patch in each year (including all hauls, not just those where the focal species was found), but other suggestions are welcome.
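The patches x years matrix described here can be assembled directly from the haul table with a pivot; column names (`patch`, `year`, `bottom_temp`) are assumptions about the data layout.

```python
import pandas as pd

def temperature_matrix(hauls):
    """Build the patches x years temperature matrix: mean in situ
    bottom temperature over ALL hauls in each patch and year,
    regardless of whether the focal species was caught. Assumes
    columns named patch, year, and bottom_temp."""
    return hauls.pivot_table(index="patch", columns="year",
                             values="bottom_temp", aggfunc="mean")

hauls = pd.DataFrame({
    "patch":       [1, 1, 2, 2],
    "year":        [2000, 2000, 2000, 2001],
    "bottom_temp": [8.0, 10.0, 12.0, 11.0],
})
temps = temperature_matrix(hauls)
```

Patch-year cells with no hauls come out as NaN, which flags where the temperature input would need filling from another source.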
