kingaa / pomp Goto Github PK

R package for statistical inference using partially observed Markov processes

Home Page: https://kingaa.github.io/pomp

License: GNU General Public License v3.0

R 73.69% C++ 18.57% C 7.10% Makefile 0.64%

pomp mathematical-modelling statistical-inference dynamical-systems stochastic-processes state-space markov-model particle-filter likelihood-free likelihood

pomp's Introduction

pomp

an R package for statistical inference on partially observed Markov processes

For installation instructions and user guides, see the package website.

Related packages:

pomp's People

Contributors

Stargazers

Watchers

pomp's Issues

zeronames functionality

I find the zeronames argument of pomp very helpful, but I needed a little trial and error to find out how it works exactly:
From the documentation it was not immediately clear to me whether "setting to zero after each observation" refers to the time points given in the time series data frame or the simulation step (delta.t, dt). Given the former, it might be worthwhile to warn users of "missing values", i.e. for reasonable results a time series of incidences needs to be equidistant and if a data point is missing the simulation will suggest a doubled incidence.
A more tricky situation arises if the simulation is started at a time point t0 that lies more than the typical time interval in the time series before the first observation in the time series - which I would like to do for the dynamics to equilibrate before I do inference. This results in a situation in which the simulated variable has accumulated a much larger value than expected and observed. Is there an elegant way to fix this issue, i.e. to tell pomp how to evalutate the variable corresponding to the time series at the first data point?

How to shorten the simulation time on the basis of credible results

I have run a simulation using the parallel computation (doMC package) on a cloud server (Intel Xeon CPU,Cores:16), however, it haven't be finished for nearly four days. The dataset in the simulation is the weekly case for two years. The Np and Nmif argument in mif function is 2000 and 100(I don't know whether it is enough to get the credible result), respectively. How could i make the simulation faster? You have developed the mpifarm package for parallel computation and how about the efficiency compared to the doMC package?
Another question is that how do i set the initial value of every parameter? In my situation, there are nearly 8 parameters which i can't decide the potential range of parameters because no reference have reported them. Some papers have used the traj.match function to get the potential value for parameters, however, the results are also related to the original parameter value during the process to build the pomp model. What is the standard way to set the parameter value before the simulation?

multivariate pomp

I used your code for multivariate pomp given in faq. But I could not figure out how to provide skeleton to the pomp and how to write traj.match function.

Add better global optimization method for trajectory matching

Is it possible to add some better optimization method in traj.match()? I would recommend add genetic algorithm or alike method to make it better on finding global optimum.
Thanks a lot.

error with the initializer of pomp

Hi, I am facing a problem with the initializer of the pomp. It give me the error of

Error in try(.Call(simulation_computations, object, params, times, t0,  : 
  'init.state' error: user 'initializer' must return a named numeric vector
Error: ‘simulate’ error

The code I am using is:

library(pomp)
library(foreach)
library(gtools)
data<-c(3,8,28,76,222,293,257,237,192,126,70,28,12,5)

rmeas<-"
cases=rnbinom_mu(theta,exp(rho)*I/(1+exp(rho)));
"
dmeas<-"
lik=dnbinom_mu(cases,theta,exp(rho)*I/(1+exp(rho)),give_log);
"

sir.step<-"
double rate[2];
double dN[2];
double P=S+I+R;
rate[0]=bet*I/P;
rate[1]=mu;
reulermultinom(1,S,&rate[0],dt,&dN[0]);
reulermultinom(1,I,&rate[1],dt,&dN[1]);
if(!R_FINITE(S)) Rprintf(\"%lg %lg %lg %lg %lg %lg %lg %lg %lg\\n\",dN[0],rate[0],dN[1],rate[1],bet,mu,S,I,R);
S+=-dN[0];
I+=dN[0]-dN[1];
R+=dN[1];
"

sir<-pomp(data=data.frame(cases=NA,time=seq(1,14,by=1)),
          times="time",t0=0,dmeasure=Csnippet(dmeas),
          rmeasure=Csnippet(rmeas),rprocess=euler.sim(
            step.fun=Csnippet(sir.step),delta.t=1/14/2),
          statenames=c("S","I","R"),
          paramnames=c("popsize","bet","mu","rho","theta1","theta2","theta"),
          initializer=function(params,t0,...){
            ps<-exp(params["theta1"])/(1+exp(params["theta1"])+exp(params["theta2"]))
            pi<-1/(1+exp(params["theta1"])+exp(params["theta2"]))
            pr<-exp(params["theta2"])/(1+exp(params["theta1"])+exp(params["theta2"]))
            return(setNames(c(rmultinom(1,params["popsize"],prob=c(ps,pi,pr))),c("S","I","R")))
          },params=c(popsize=763,theta=100,bet=2,mu=0.5,rho=3,theta1=5,theta2=1))
sir<-simulate(sir,seed = 73691676L)

When I just run the initializer function, I could get the return value

S     I    R 
741   8  14

where S, I, R are the statenames of my pomp system, but it give the error

Error in try(.Call(simulation_computations, object, params, times, t0,  : 
  'init.state' error: user 'initializer' must return a named numeric vector
Error: ‘simulate’ error

which means that the return value of the initializer has no name, but it has the name, what is the problem?

And there is also such problem when run the mif function, where is the problem?

How Could we add A seasonal state?

In state space model we could probably be interested in adding a seasonal state, for example :

       t = 1......T ;   
       y(t) = mu(t) + Seas(t) + eps(t) ; 
       mu(t+1) = mu(t) + u(t)
       Seas(t) = = - sum(Seas(t-j) , j = 1 .... S-1 ) + w(t)

with S the number of seasons.
Where eps(t) , u(t) , w(t) are errors following a normal distribution with mean 0 and a given variance.

Please How can I specify rprocess and rmeasure for this case?
Thanks in advance.
note that the seasonal components could be modeled with by vectors of size S : number of seasons , and each components follow a random walk constrained to sum to zero for each time t

for more about modeling seasonality please have a look at this paper : link below
http://ec.europa.eu/eurostat/documents/3888793/5842169/KS-DT-06-019-EN.PDF/f046f027-ef69-4095-aa49-d55119998d19
Thanks in advance

covar names missing after coercing pomp objects to data frame

I ran into a little problem today when trying to simulate ensemble distributions at a single time point. It can be demonstrated by adding covariates to the pomp object in the #54:

library(pomp)

create_example <- function(times = c(1,2), t0 = 0, mu = 0.001, N_0 = 1,
                                covar = data.frame(x = c(0, 1), time = c(0, 52))) {
  data <- data.frame(time = times, reports = NA)
  mmodel <- reports ~ pois(ct)
  rate.fun <- function(j, x, t, params, covars, ...) {
      switch(j, params["mu"]*x["N"], stop("unrecognized event ",j))
  }
  rprocess <- gillespie.sim(rate.fun = rate.fun, v=rbind(N=-1, ct=1))
  initializer <- function(params, t0, ...) {
     c(N=N_0,ct=12)
  }
  pomp(data = data, times = "time", t0 = t0, params = c(mu=mu),
            rprocess = rprocess, initializer = initializer,
            zeronames = "ct", paramnames = "mu", statenames = c("N","ct"),
            measurement.model = mmodel, covar = covar, tcovar = "time")
}
simulate(create_example(times = 1), as.data.frame=TRUE)
simulate(create_example(times = c(1,2)), as.data.frame=TRUE)

With pomp 1.15.3.1 (current master branch on github), I get the following output:

  time reports N ct         NA sim
1    1       0 1  0 0.01923077   1

whereas the second example is fine:

  time reports N ct          x sim
1    1       0 1  0 0.01923077   1
2    2       0 1  0 0.03846154   1

The output with missing variable names can lead to cryptic error messages when used in a pipe with some dplyr functions. In case it's of use, my solution was to change lookup_in_table like this: e3bo/pomp@72741ae .

statenames error with R version 3.2.2

I get the following error message using the R 3.2.2 version, but not the R 3.0.3 version:

"Error in paste(statenames, ".0", sep = "") : argument "statenames" is missing, with no default
Error: ‘simulate’ error"

Attached is the relevant section of R code (r_code.txt) and the C code. I tried to load up the .R and .c files but got an 'Unfortunately, we don’t support that file type.' message when I tried to load them. The relevant R code to generate the .dll file from the c code is included.

r_code.txt
lbvseirNoSeasFreqTwoParsMeasureBinom.txt

Demo(sir) don't work

The demo(sir) cann't work well in my compute. Error messages are:

Error in (function (name = NULL, dir = NULL, statenames, paramnames, covarnames, :
cannot compile shared-object library ‘C:/Users/dell/AppData/Local/Temp/RtmpSEv0Pr/pompB0D5B1C24832.dll’
In addition: Warning message:
running command '"D:/R-3.2.1/bin/R" PKG_CFLAGS=" -ID:/R-3.2.1/library/pomp/include" CMD SHLIB -o C:/Users/dell/AppData/Local/Temp/RtmpSEv0Pr/pompB0D5B1C24832.dll C:/Users/dell/AppData/Local/Temp/RtmpSEv0Pr/pompB0D5B1C24832.c' had status 1
Error: in ‘pomp’: error in building shared-object library from Csnippets:
Error in (function (name = NULL, dir = NULL, statenames, paramnames, covarnames, :
cannot compile shared-object library ‘C:/Users/dell/AppData/Local/Temp/RtmpSEv0Pr/pompB0D5B1C24832.dll’

Session information for my compute is as followed:

 R version 3.2.1 (2015-06-18)
 Platform: x86_64-w64-mingw32/x64 (64-bit)
 Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936  LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base     

 other attached packages:
 [1] pomp_0.70-1

 loaded via a namespace (and not attached):
 [1] deSolve_1.12    subplex_1.1-4   tools_3.2.1     coda_0.17-1     mvtnorm_1.0-2  
 [6] grid_3.2.1      nloptr_1.0.4    lattice_0.20-31

Transform parametervector

Is it possible to write a Csnippet for parameter transformation in case the parameter is a vector?
For example the parameter is rho_S=(rho_S_1,...rho_S_17). My approach so far for the "fromEstimationScale" is:

exptrans<-Csnippet("
                   int n=17;
                   int j;
                   double *rho_S=&rho_S_1;
                   double *Trho_S=&Trho_S_1;

                   for (j = 0; j < n; j++){
                   Trho_S[j]=exp(rho_S[j]);
                   }
                   ")

How extract the loglikelihood from the mcmc

I am using pmcmc by the function

pmcmc1<-foreach(i=1:3,.combine=c) %dopar%{
  pmcmc(pomp(sir,dprior=sir.dprior),start=theta.mif,Nmcmc=100000,Np=100,max.fail=Inf,
        proposal=mvn.diag.rw(c(bet=0.1,gam=0.3,mu=0.1,rho=0.1,theta1=0.5,theta2=0.5,theta3=0.5)))
}

which is running 3 different chains by parallel, and when I use

plot(pmcmc1)

I could see the trace plot with the log likelihood, now I need to extract the log likelihood, how could I extract them?

Thanks in advance.

pmcmc function could return Nan Nan Nan value for SIR model using the pomp

I am using SIR model by the pomp. Former, I found the the pomp very useful. But these days, after I update my R studio, I am facing some problem for "task 2 failed - "in ‘pmcmc’: error in ‘pfilter’: Error : in ‘pfilter’: ‘dmeasure’ returns non-finite value."(The weird thing is that sometime it could work out, but sometime it could not work out. But I used the same code.) So I add the print part to the dmeasure, and found that sometime, the pomp could get nan nan nan value for the S, I, R, some times not. How could that happen? Is that a bug? I linked my code below, it is almost the same code as on the paper: Statistical Inference for Partially Observed Markov Processes via the R Package pomp(in section 5).

library(pomp)
library(foreach)
rmeas<-"
cases=rnbinom_mu(theta,rho*I);
"
dmeas<-"
lik=dnbinom_mu(cases,theta,rho*I,give_log);
if (!R_FINITE(lik)) 
  Rprintf(\"%lg %lg %lg %lg %lg %lg\\n\",cases,theta,rho,S,I,R);
"

sir.step<-"
double rate[2];
double dN[2];
rate[0]=be*I;
rate[1]=mu;
reulermultinom(1,S,&rate[0],dt,&dN[0]);
reulermultinom(1,I,&rate[1],dt,&dN[1]);
S+=-dN[0];
I+=dN[0]-dN[1];
R+=dN[1];
"
sir<-pomp(data=data.frame(cases=NA,time=seq(1,14,by=1/2)),
          times="time",t0=-1/4,dmeasure=Csnippet(dmeas),
          rmeasure=Csnippet(rmeas),rprocess=euler.sim(
            step.fun=Csnippet(sir.step),delta.t=1/14/4),
          statenames=c("S","I","R"),
          paramnames=c("popsize","be","mu","rho","theta","S.0","I.0","R.0"),
          initializer=function(params,t0,...){
            fracs<-params[c("S.0","I.0","R.0")]
            setNames(c(round(params["popsize"]*fracs/sum(fracs))),c("S","I","R"))
          },params=c(popsize=769,be=0.002,mu=0.5,theta=500,rho=0.7,S.0=0.98,I.0=0.01,R.0=0.01))

sir<-simulate(sir,seed=190L)
plot(sir,type="p")
sir.log.tf<-function(params, ...) log(params)
sir.exp.tf<-function(params, ...) exp(params)
sir<-pomp(sir,toEstimationScale=sir.log.tf,fromEstimationScale=sir.exp.tf)
estpars<-c("be","mu","rho","S.0","I.0","R.0")
theta.true<-coef(sir)
mif1<-foreach(i =1:2,.combine = c)%dopar%{
  theta.guess<-theta.true
  theta.guess[estpars]<-rlnorm(n=length(estpars),
                               meanlog=log(theta.guess[estpars]),sdlog=1)
  mif(sir,Nmif=100,start=theta.guess, transform=TRUE,
      Np=2000,var.factor=2,cooling.fraction=0.7,
      rw.sd=c(be=0.02,mu=0.02,rho=0.02,S.0=0.02,I.0=0.02,R.0=0.02))
}
pf1<-foreach(mf=mif1,.combine=c) %dopar% {
  pf<-replicate(n=2,logLik(pfilter(mf,Np=10000)))
  logmeanexp(pf)
}
mf1<-mif1[[which.max(pf1)]]
theta.mif<-coef(mf1)
hyperparams<-list(min=coef(sir)/10,max=coef(sir)*10)
sir.dprior<-function(params, ...,log){
  f<-sum(dunif(params,min=hyperparams$min,max=hyperparams$max,
               log=TRUE))
  if(log) f else exp(f)
}

pmcmc1<-foreach(i=1:3,.combine=c) %dopar%{
  pmcmc(pomp(sir,dprior=sir.dprior),start=theta.mif,Nmcmc=20000,Np=100,max.fail=Inf,
        proposal=mvn.diag.rw(c(be=0.01,mu=0.01,rho=0.01,S.0=0.01,I.0=0.01,R.0=0.01)))
}

You could see that in the dmeas function, I print all of the parameter and when the error occurs, I could get the output:

27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan
27 500 0.842667 nan nan nan

Is that a bug? Or there is some problem in my code?

Why ‘dmeasure’ returns non-finite value？

Thank you for your time at first!
The fist code I used had no problem. Everything was fine during the calculation. The code and data are:

rm(list=ls())
library(ggplot2)
library(plyr)
library(reshape2)
library(magrittr)
library(pomp)
library(foreach)
library(iterators)
library(parallel)
library(doParallel)
data01 <- read.table("SD00.txt")
covartable <- data.frame(
  time=data01$time,
  B=data01$births,
  D=data01$deaths,
  P=data01$pop
)

dmeas = Csnippet("
                 if (ISNA(cases)) {
                 lik = (give_log) ? 0 : 1;
                 } else {
                 lik =  dnbinom_mu(cases, 1/od, H, 1);
                 lik = (give_log) ? lik : exp(lik);}
                 //  if (!R_FINITE(lik)) 
                 //  Rprintf(\"%lg %lg %lg %lg %lg %lg %lg\\n\",od,H,cases,gamma,beta1,beta2,lik);
                 ")
rmeas = Csnippet("
                 cases = rnbinom_mu(1/od,H);
                 ")

##initlz-----------------------------------
init <- Csnippet("
                 S = 97890000*(0.5);I = 250; H = 26;R = 97890000*(0.5);
                 ")
##Transition-------------------------------
toEst <- Csnippet("
                  Tbeta1  = log(beta1);//Tbeta2  = logit(beta2);
                  Tod = log(od);
                  
                  ")

fromEst <- Csnippet("
                    Tod = exp(od);
                    Tbeta1  = exp(beta1);
                    
                    ")
##-------------NAMES----------
statenames <- c("S","I","H","R","Beta")
rp_names <- c("od","gamma","beta1","beta2","Beta_Autm","th1","th2","phi1","phi2","iota")
## ----rprocess------------------------------------------------------------
rproc <- Csnippet("
                  double rate[6];
                  double dN[6];
                  double mud = 0.013506;
                  
                  double t1 = t-2014;
                  
                  
                  Beta =  beta1*(sin(phi1*M_PI*(t1+th1)));
                  if (t>0.8)
                  Beta = beta2*(sin(phi1*M_PI*(t1+th1)));
                  if (Beta < iota){
                  Beta = iota;  
                  }
                  double BetaAut;
                  BetaAut = Beta_Autm*cos(phi2*M_PI*(t1+th2));
                  if (BetaAut<0){
                  BetaAut = 0;
                  }
                  Beta += BetaAut;
                  rate[0] = B; // birth
                  rate[1] = Beta * I / P; // transmission
                  rate[2] = mud; // death from S
                  rate[3] = gamma; // recovery
                  rate[4] = mud; // death from I
                  rate[5] = mud; // death from R
                  dN[0] = rpois(rate[0] * dt);
                  reulermultinom(2, S, &rate[1], dt, &dN[1]);
                  reulermultinom(2, I, &rate[3], dt, &dN[3]);
                  reulermultinom(1, R, &rate[5], dt, &dN[5]);
                  S += dN[0] - dN[1] - dN[2];
                  I += dN[1] - dN[3] - dN[4];
                  R += dN[3] - dN[5];
                  H = 0;
                  H = dN[1];
                  ")
params<-c(od=0.03323176,beta1=17.05119617, beta2 = 35.06511975, Beta_Autm = 8.37132179,gamma = 6.37966530 , iota = 3.85213180,
           th1 = 0.13913023,th2 = 0.69688513,phi1 = 1.98411065,phi2 = 1.77986175,)
##------------------PUT into Together---------------

mesd <- pomp(
  data=data01,
  times="time",
  t0=2014,
  rprocess = euler.sim(step.fun = rproc, delta.t=1/52),
  rmeasure = rmeas,
  dmeasure = dmeas,
  covar=covartable,
  tcovar="time",
  statenames = statenames,
  paramnames = c(rp_names),
  initializer=init,
  toEstimationScale=toEst, 
  fromEstimationScale=fromEst,
  params=params
)
##---------------first run and vanilla particle filter to generate the liklihood------------
params1<-c(od=0.03323176,beta1=17.05119617, beta2 = 35.06511975, Beta_Autm = 8.37132179,gamma = 6.37966530 , iota = 3.85213180,
           th1 = 0.13913023,th2 = 0.69688513,phi1 = 1.98411065,phi2 = 1.77986175,) 
simreslt<-simulate(mesd,nsim=10,params = params1,as.data.frame=TRUE,include.data=TRUE)
simulate(mesd,nsim=9,params = params1,as.data.frame=TRUE,include.data=TRUE) %>%
  subset(time>0,select=c(time,sim,cases,Beta)) %>%
  melt(id=c("time","sim")) %>%
  ggplot(aes(x=time,y=value,color=ifelse(sim=="data","data","simulation"),
             alpha=(sim=="data"),
             group=sim))+
  scale_alpha_discrete(breaks=c(`TRUE`=1,`FALSE`=0.8))+
  labs(color="")+
  guides(alpha=FALSE)+
  geom_line()+
  facet_wrap(~variable,ncol=1,scales="free_y")+theme_bw()
logLik(pfilter(mesd,params=params1,Np=1000))
###---------------------mif2 1st-----------------------------

pf0 <- pfilter(mesd,params=params1,Np=2000);logLik(pf0)
mf <- mif2(mesd,
           Np=5000,
           Nmif=50,
           start=params1,
           cooling.type="geometric",
           cooling.fraction.50=0.5,
           transform=TRUE,
           rw.sd=rw.sd(od = 0.001,beta1 = 0.01, beta2 = 0.01,
                       Beta_Autm = 0.01,gamma = 0.001 , 
                       iota = 0.001,th1 = 0.001,th2 = 0.001,
                       phi1 = 0.001,phi2 = 0.001)
); `

SD00.txt
However, when I implement another parameter like the second code, and I only change the
init <- Csnippet(" S = 97890000*(ProRate);I = 250; H = 26;R = 97890000*(0.5); ")，
rp_names <- c("od","gamma","beta1","beta2","Beta_Autm","th1","th2","phi1","phi2","iota","ProRate")，
params1<-c(od=0.03323176,beta1=17.05119617, beta2 = 35.06511975, Beta_Autm = 8.37132179,gamma = 6.37966530 , iota = 3.85213180, th1 = 0.13913023,th2 = 0.69688513,phi1 = 1.98411065,phi2 = 1.77986175,ProRate = 0.5)
and，
mf <- mif2(mesd, Np=5000, Nmif=50, start=params1, cooling.type="geometric", cooling.fraction.50=0.5, transform=TRUE, rw.sd=rw.sd(od = 0.001,beta1 = 0.01, beta2 = 0.01, Beta_Autm = 0.01,gamma = 0.001 , iota = 0.001,th1 = 0.001,th2 = 0.001, phi1 = 0.001,phi2 = 0.001,ProRate = 0.001) );
The error " ‘dmeasure’ returns non-finite value" occurred.
I used
if (!R_FINITE(lik)) Rprintf(\"%lg %lg %lg %lg %lg %lg %lg %lg\\n\",od,H,cases,gamma,beta1,beta2,ProRate,lik);
as previous issue mentioned. And I get something like these:

0.120557 nan 22 5.11698 29.7311 19.988 nan
0.120427 nan 22 5.11292 30.0793 20.0104 nan
0.120234 nan 22 5.1152 29.2223 19.9833 nan
0.120355 nan 22 5.11239 29.2918 20.0043 nan
0.12026 nan 22 5.11311 29.9388 20.0393 nan
0.120391 nan 22 5.11393 29.9993 20.0182 nan

The state H seems return NaN. I am not sure where was the problem.
Please enlighten me! Thank you very much!

problem in compiling after defining dmeasure

I can compile the examples provided in the package so my machine should be set ok.

I have some compilation issue once I define a dmeasure.

Here follows a minimal example, which works OK. Then I insert the problematic line.

library(pomp)

data.dat <- read.csv(text="
                 time,y
                 1,-1.4910201e-01   
                 2,1.9068885e+00    
                 3,4.6734116e-01    
                 4,-6.7779706e-01   
                 5,-3.0348450e+00   
                 6,6.5399287e+00    
                 7,-6.6112666e-01   
                 8,2.5084616e+00    
                 9,1.7287442e-01    
                10,5.7009581e+00"
)

nlingauss.step <- Csnippet("
  double eps = rnorm(0,sigmax);
  x = 2*sin(exp(x)) + eps;
  ")

nlingauss.skel <- Csnippet("
                    Dx = 2*sin(exp(x));
                    ")

rmeas <- Csnippet("
  y = rnorm(x,sigmay);
                  ")

dmeas <- Csnippet("
  lik = dnorm(y,x,sigmay,give_log);
                  ")

nlingauss <- pomp(data.dat,time="time",t0=0,rprocess=discrete.time.sim(nlingauss.step,delta.t=1),rmeasure=rmeas,skeleton=nlingauss.skel,skeleton.type="map",skelmap.delta.t=1,statenames="x",paramnames=c("sigmax","sigmay"))

So the above is all fine. However after I add a dmeasure to the pomp object, that's when I get a compilation error.

nlingauss <- pomp(nlingauss,dmeasure=dmeas,statenames="x")

which produces

> nlingauss <- pomp(nlingauss,dmeasure=dmeas,statenames="x")
gcc -m64 -I"C:/PROGRA~1/R/R-32~1.2/include" -DNDEBUG     -I"d:/RCompile/r-compiling/local/local320/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.c -o C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.o
C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.c: In function '__pomp_dmeasure':
C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.c:16:17: error: 'sigmay' undeclared (first use in this function)
C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.c:16:17: note: each undeclared identifier is reported only once for each function it appears in
make: *** [C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.o] Error 1
Warning message:
running command 'make -f "C:/PROGRA~1/R/R-32~1.2/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-32~1.2/share/make/winshlib.mk" SHLIB="C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.dll" WIN=64 TCLBIN=64 OBJECTS="C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.o"' had status 2 
Error in pompCompile(fname = name, direc = pompSrcDir(dir), src = csrc,  : 
  cannot compile shared-object library ‘C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.dll’
In addition: Warning message:
running command '"C:/PROGRA~1/R/R-32~1.2/bin/R" PKG_CFLAGS=" -IC:/Users/Umberto/Documents/R/win-library/3.2/pomp/include" CMD SHLIB -o C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.dll C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.c' had status 1 
Error: in ‘pomp’: error in building shared-object library from Csnippets:
Error in pompCompile(fname = name, direc = pompSrcDir(dir), src = csrc,  : 
  cannot compile shared-object library ‘C:/Users/Umberto/AppData/Local/Temp/RtmpuSV96B/6928/pomp_2b64d54b1a2495001b2478f1625b67e1.dll’

I appreciate any suggestion. Thanks.

NaN output in mif2 algorithm

The problem I am facing is that after running a local search with the mif2 algorithm the object mifs_local contains several entries where all fitted parameters are NaN (e.g. mif_local[[2]] in the code below) whereas for others I get some output. Surprisingly, this does not produce an error (why?), however I see an error when trying to evaluate the likelihood with pfilter that dmeasure returns non-finite value.

Another big issue is that, even with a very low number of particles and mif2 iterations, I am facing extremely long running times although the code is parallelized and I have access to a cluster. I am wondering if this is normal or if I can change something to speed up my calculations?

sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] doParallel_1.0.10 magrittr_1.5      pomp_1.7          doMC_1.3.4        iterators_1.0.8   foreach_1.4.3     reshape2_1.4.1    plyr_1.8.4        ggplot2_2.1.0    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.6      knitr_1.13       munsell_0.4.3    colorspace_1.2-6 lattice_0.20-33  subplex_1.1-6    stringr_1.0.0    tools_3.3.1      gtable_0.2.0    
[10] coda_0.18-1      digest_0.6.9     nloptr_1.0.4     codetools_0.2-14 deSolve_1.13     stringi_1.1.1    compiler_3.3.1   scales_0.4.0     mvtnorm_1.0-5

Here is my model:

#observational model, observations are poisson
rmeas <- Csnippet("cases1 = rpois(H1);
                   cases2 = rpois(H2);
                   cases3 = rpois(H3);
                  ")
#allow for the possibility of NAs in the data
dmeas <-Csnippet(" if (ISNA(cases1)) {
                   lik = (give_log) ? 0 : 1;
                   } else {
                   lik =  dpois(cases1, H1, 1) +
                            dpois(cases2, H2, 1) +
                            dpois(cases3, H3, 1);
                   lik = (give_log) ? lik : exp(lik);
                  }")
# process model is Markovian SIRS with 3 age classes
sir.step <- Csnippet("
double rate[19];
double dN[19];
double Beta1;
double Beta2;
double Beta3;
double I;
I= I1+I2+I3;
Beta1 = beta1*(1 + beta11 * cos(M_2PI/52*time + phi)); 
Beta2 = beta2*(1 + beta11 * cos(M_2PI/52*time + phi)); 
Beta3 = beta3*(1 + beta11 * cos(M_2PI/52*time + phi)); 
rate[0] = alpha*N;
rate[1] = Beta1*I/N;
rate[2] = delta1;
rate[3] = Beta2*I/N;
rate[4] = delta2;
rate[5] = Beta3*I/N;
rate[6] = mu;
rate[7] = gamma;
rate[8] = delta1;
rate[9] = gamma;
rate[10] = delta2;
rate[11] = gamma;
rate[12] = mu;
rate[13] = delta1;
rate[14] = omega;  
rate[15] = delta2;  
rate[16] = omega;  
rate[17] = mu;  
rate[18] = omega;  
dN[0] = rpois(rate[0]*dt); // births are Poisson
reulermultinom(2, S1, &rate[1], dt, &dN[1]);
reulermultinom(2, S2, &rate[3], dt, &dN[3]);
reulermultinom(2, S3, &rate[5], dt, &dN[5]);
reulermultinom(2, I1, &rate[7], dt, &dN[7]);
reulermultinom(2, I2, &rate[9], dt, &dN[9]);
reulermultinom(2, I3, &rate[11], dt, &dN[11]);
reulermultinom(2, R1, &rate[13], dt, &dN[13]);
reulermultinom(2, R2, &rate[15], dt, &dN[15]);
reulermultinom(2, R3, &rate[17], dt, &dN[17]);
S1 += dN[0] - dN[1] - dN[2] + dN[14];
S2 += dN[2] - dN[3] - dN[4]  + dN[16];
S3 += dN[4] - dN[5] - dN[6] + dN[18];
I1 += dN[1]          - dN[7] - dN[8];
I2 += dN[3] + dN[8]  - dN[9] - dN[10];
I3 += dN[5] + dN[10] - dN[11] - dN[12];
R1 += dN[7]           - dN[13] - dN[14];
R2 += dN[9]  + dN[13] - dN[15] - dN[16];
R3 += dN[11] + dN[15] - dN[17] - dN[18];
H1 += dN[1];
H2 += dN[3];
H3 += dN[5];
")
# to and from estimation scale (phi should be in [-2*pi,0])
toEst <- Csnippet("Tbeta1  = log(beta1);
                   Tbeta2  = log(beta2);
                   Tbeta3  = log(beta3);
                   Tbeta11 = logit(beta11);
                   Tphi    = logit(-phi/M_2PI);
                  ")
fromEst <- Csnippet("Tbeta1  = exp(beta1);
                     Tbeta2  = exp(beta2);
                     Tbeta3  = exp(beta3);
                     Tbeta11 = expit(beta11);
                     Tphi    = -M_2PI*expit(phi);
                    ")
# initalizer
sir_initializer <- Csnippet("S1= 4872721;   I1= 2871; R1= 124408; H1=0;
                             S2= 54942298;  I2= 639;  R2= 57063;  H2=0;
                             S3= 19990218;  I3= 174;  R3= 9608;  H3=0;
                            ")
# read in the observations
obs <- read.table("data.txt")
# define the covariate time
tbasis <- seq(-6*52+1,nrow(obs),by=1)
basis <- as.matrix(data.frame(time=tbasis))
# define parameters
params <- c(beta1=1.287395e+01, beta2=2.409750e-01, beta3=4.034036e-01, 
                      beta11=1.509189e-01, phi= -1.706477e-03, gamma=1, delta1=1/(5*52),   
                      delta2=1/(55*52), alpha=1/(80*52), mu=1/(20*52), N=80000000, omega=1/(1*52))
# add at t=0 a row of NAs to not have problems with the accumulator variable since
# t0 is much less than t[1]
dat <- arrange(time,df=rbind(data.frame(time=0,cases1=NA,cases2=NA,cases3=NA),obs))
# define the pomp object
pomp(data = dat,
     times="time",
     t0=tbasis[1],
     dmeasure = dmeas,
     rmeasure = rmeas,
     rprocess = euler.sim(step.fun = sir.step, delta.t = 1/80),
     statenames = c("S1", "I1", "R1", "H1", "S2", "I2", "R2", "H2","S3","I3", "R3", "H3"),
     obsnames = c("cases1", "cases2","cases3"),
     paramnames = names(params),
     zeronames=c("H1", "H2", "H3"),
     initializer=sir_initializer,
     toEstimationScale=toEst,
     fromEstimationScale=fromEst,
     covar=basis,
     tcovar=tbasis,
     params = params,
     PACKAGE="pomp"
) ->sir
# iterated filtering with only local search
run_level <- 1
switch(run_level,
       {sir_Np=100; sir_Nmif=10; sir_Neval=10; sir_Nglobal=10; sir_Nlocal=10}
)
require(doParallel)
cores <- 4
registerDoParallel(cores)
mcopts <- list(set.seed=TRUE)
set.seed(396658101,kind="L'Ecuyer")
sir_rw.sd <- 0.02
sir_cooling.fraction.50 <- 0.5
sir.start <- coef(sir)
# check if particle filter works
stew(file=sprintf("pf-%d.rda",run_level),{
  t_pf <- system.time(
    pf <- foreach(i=1:10,.packages='pomp',
                  .options.multicore=mcopts) %dopar% try(
                    pfilter(sir,params=sir.start,Np=sir_Np)
                  )
  )
},seed=1320290398,kind="L'Ecuyer")
(L_pf <- logmeanexp(sapply(pf,logLik),se=TRUE))

The output of the particle filter is

                   se 
-16108.933     10.918

Now running the mif2 algorithm

stew(file=sprintf("local_search_seas-%d.rda",run_level),{
  t_local <- system.time({
    mifs_local <- foreach(i=1:sir_Nlocal,.packages='pomp', .combine=c, 
                          .options.multicore=mcopts) %dopar%  {
      mif2(
        sir,
        start=sir.start,
        Np=sir_Np,
        Nmif=sir_Nmif,
        cooling.type="geometric",
        cooling.fraction.50=sir_cooling.fraction.50,
        transform=TRUE,
        rw.sd=rw.sd(
          beta1=sir_rw.sd,
          beta2=sir_rw.sd,
          beta3=sir_rw.sd,
          beta11=sir_rw.sd,
          phi=sir_rw.sd
        )
      )
    }
  })
},seed=900242057,kind="L'Ecuyer")

Here I obtain for example

coef(mifs_local[[1]])
        beta1         beta2         beta3        beta11           phi         gamma        delta1 
 1.635396e+01  3.652980e-01  6.534982e-01  1.771040e-01 -9.856913e-04  1.000000e+00  3.846154e-03 
       delta2         alpha            mu             N         omega 
 3.496503e-04  2.403846e-04  9.615385e-04  8.000000e+07  1.923077e-02

BUT

coef(mifs_local[[2]])
 beta1  beta2  beta3 beta11    phi  gamma delta1 delta2  alpha     mu      N  omega 
   NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN

which gives the following error when evaluating pfilter

stew(file=sprintf("lik_local_seas-%d.rda",run_level),{
  t_local_eval <- system.time({
    liks_local <- foreach(i=1:sir_Nlocal,.packages='pomp',.combine=rbind) %dopar% {
      evals <- replicate(sir_Neval, logLik(pfilter(sir,params=coef(mifs_local[[i]]),Np=sir_Np)))
      logmeanexp(evals, se=TRUE)
    }
  })
},seed=900242057,kind="L'Ecuyer")

Error in { : 
  task 2 failed - "in ‘pfilter’: ‘dmeasure’ returns non-finite value."

Cannot run mif2 on BBS example

Has anyone ever run mif2 on the BBS example? pfilter works, but not mif2.
The error message is

Error: in ‘mif2’: particle-filter error:Error : ‘mif2.pfilter’ error: ‘dmeasure’ returns non-finite value

which might have to do with the fact that the variable S is almost entirely missing.

If anyone has actually run this before, I would be grateful to see the code.

Probe matching examples, probe.period

I am using pomp's functionality to estimate parameters on summary statistics such as mean, standard deviation etc. where the probes seem to work fine. I have also tried probe.period but got error messages such as this one:
Fehler: in 'probe.match.objfun': applying probes to simulated data: missing values in object
I had already deleted "NA" in the original data for test purposes. Is it possible that the occasional "NA" might be mishandled in spec.pgram (called in probe.period) as its default value for na.action is na.fail ?

Are there other examples demonstrationg pomp's use in probe matching than the one in the J. Stat. Software Paper? I have found out by trial and error that I can provide a pomp object with starting parameters (start=..., est=... or params =...) or alternatively an object returned by the probe function (as in the paper). The former will probably more helpful to sample different starting values for the optimization procedure. Are there any more detailled tutorials available on this or related issues?

printing in Csnippet

How to give print command in Csnippet?

File handling and access of (multidimensional) arrays within Csnippets

Is it possible to read data from file in the Csnippets, in particular for initialization or parameter choice – and how do I mask quotation marks in the file handles?
Is it otherwise possible to read a parameter matrix (e.g. a contact matrix) within in R and use it within the Csnippets (and how do I deal with the different offset in array indices starting from 0 or 1 in C and R, respectively)?

pomp example request

I was attempting to use your pomp R package and was wondering if you had any example code for something like a linear Gaussian state space model.

I'm a little confused as to what goes where and was having trouble getting started: for example, with

rprocess = discrete.time.sim(step.fun = f, delta.t=1)

I wanted to use rnorm to simulate an AR(1) simple random walk process which has Gaussian noise with mean 0 and var 1 but I'm unsure how to specify rnorm as a function. For example, rnorm(Np,x_{t-1},1) is the generic form but I'm unsure how to code this within rprocess.

Would really appreciate any help.

Fit multiple time series

This is not really an issue, but more a request for some pointers. All the examples I found deal with the situation were you have one long time series. In contrast I have a large number of short time series. Is it possible to fit data such as that to a model? The data model does not seem to allow one to pass multiple time series. (The time series are independent of each other, but of course dependent within the time series).

Can I convert a named vector or list with random-walk perturbations to rw.sd function?

Hi,
I'm trying to estimate parameters using mif2. Due to an update (I guess) you need to specify random-walk perturbations with the rw.sd function. Before, a named vector worked fine.
My question is if there is a way to convert a named vector to the rw.sd function since my model has got too many parameters to write them all down individually like rw.sd (a=0.01, b=0.01,...)?
Thanks
Jochen

Fitting a mixed model on replicate time series

I am looking for any documentation or examples of how one might use pomp to fit a mixed model over multiple time series. I have time series data for a livestock disease outbreaks on multiple farms, and would like to fit the same model simultaneously across all of them, estimating key epidemiological parameters (contact rate, etc.) as draws from a common distribution. This can be done with Stan if I used ODE trajectory matching, but extinction processes are important in this case so I have been opting to use a Gillespie simulation in pomp. Any information you can direct me to?

Loglikelihood in mif2 search decreases before converging

I performed a global mif2 search with the model in detailed described in Issue 25. The result I observe confuses me, because for the first few iterations the loglikelihood increases quite much, however stabilises at a value which is below the initially found region with higher loglikelihood. In order to see the phenomenon I am talking about, the plot below does not show the first 2 iterations (the loglikelihood is starting somewhere around -10500 for the first iteration).

I tried different constellations of intensities of the random walk, used higher and lower values for the cooling fraction and also different number of particles and iterations, but in all cases I sooner or later observe that the loglikelihood stabilizes at a value which does not look optimal after seeing higher likelihoods. Do you have any explanation for this or am I doing something wrong? Thanks a lot.

Here the code for the mif2 search:

sir_box <- rbind(
  beta1=c(10,13),
  beta2=c(0.2,0.6),
  beta3=c(0.2,0.6),
  beta11=c(0.11,0.15),
  phi=c(0,0.05),
  od=c(0.001,1)
)

sir_fixed_params <- c(gamma=1, delta1=1/(5*52), delta2=1/(55*52), alpha=1/(80*52), 
                      omega=1/(1*52), mu=1/(20*52), N=80000000, y1=dat$cases1[2], 
                      y2=dat$cases2[2], y3=dat$cases3[2])

stew(file="bi_box_eval_negbin-%d.rda",{

  t_global <- system.time({
    mifs_global <- foreach(i=1:20,.packages='pomp', .combine=c, .options.multicore=mcopts) %dopar% {
      mif2(
        sir,
        start=c(apply(sir_box,1,function(x)runif(1,min=x[1],max=x[2])),sir_fixed_params),
        Np=4000,
        Nmif=200,
        cooling.type="geometric",
        cooling.fraction.50=0.5,
        transform=TRUE,
        rw.sd=rw.sd(beta1=0.002,beta2=0.002,beta3=0.002,beta11=0.002,phi=0.002,od=0.01)
      )
    }
  })
},seed=1270401374,kind="L'Ecuyer")

mifs_global %>%
  conv.rec(c("loglik","nfail","beta1","beta2","beta3","beta11","phi","od")) %>%
  melt() %>%subset(iteration>2)%>%
  ggplot(aes(x=iteration,y=value,color=variable,group=L1))+
  geom_line()+
  guides(color=FALSE)+
  labs(x="MIF2 Iteration",y="")+
  facet_wrap(~variable,scales="free_y",ncol=2)+
  theme_bw()

Failing to fit an SCIRS epidemic model with pomp package: Could the problem be my Csnippets?

I'm using the pomp package to simulate and fit a stochastic model for the first time and running into some errors, which i'm unable to solve based on your previous answers.
The model is an SCIRS ( Susceptible - Carrier - Ill - Recovered) model with birth, natural death, and disease related death. Carriers are asymptomatic and can either clear infection then have a temporary immunity. or become ill. Ill individual then recover from disease with temporary immunity as well. Carriers and ill individuals are both infectious (I have attached the model compartments flow below)

Note that transmission parameter beta(t) and invasion parameter a(t) have seasonal variation during the year.

I'm getting the following errors messages when trying to infer epidemiological parameters from weekly incidence time-series using the mif2 function. My dmeasure returns NaN with the following message :
"dmeasure returning a non-finite value..."
When trying to estimate the likelihood with pfilter i also get the following error message.
"non finite state variables are being returned"

But before I can further proceed with your previous explanations on this kind of errors, Could you please check if my Csnippet for the model described above are properly coded for pomp? (In particular the random draws using reulermultinom?

First, I'm suspecting that my Csnippets for the the deterministic and stochastic skeleton might be the problem. I have previously coded and successfully fitted the deterministic version of that same model in plain R code. Now i would like to fit a stochastic version of the same model with the pomp package.

## define deterministic skeleton for POMP Object.
SCIRS_determ.c<-'
        double a, beta, births;
        double year = 365.0;
        double pi = 3.141593;
        double rate[11]; 
        double term[11];
        
        // double population = Susc + Carrier + Ill + Recov;
        // I commented the line above as i store population size in
        // a covariates data-frame.

        // Seasonal function for invasion. Their are some evidence that the invasion
        // parameter < a > could be up to 100 -fold the baseline or average value during dry season
        // teta is the time when epidemic picks
        a = a0*((49.5*epsilon_a)*cos(2*pi*(t-teta)/year)+(1 + 49.5*epsilon_a));
        
        // When we make no assumption about the magnitude of variations of parameter < a >
        // a = a0*(1 + epsilon_a*cos(2*pi*(t-teta)/year));
        
        // Seasonal function for transmission
        beta = beta0*(1 + epsilon_b*cos(2*pi*(t-teta)/year));
     
        // new births
        births = mu*population + ((gamma*Ill)*act_comp_mening_death);

        // act_comp_mening_death is a variable that I can set to 0 or 1
        // from the parameters file, to decide if births are compensated
        // by both natural and meningitis related death. Since meningitis case fatality
        // is relatively low it effect on large population may be marginal. 
        // but in this case we are simulating relatively small populations (size around 10000).
        
        // The transition rates
        rate[0] = births; // births into susceptibles
        rate[1] = beta*(Carrier + Ill)/population; // infection 
        rate[2] = a;  //invasion
        rate[3] = alpha; // carriage recovery 
        rate[4] = rho; // Disease recovery
        rate[5] = phi; // Re-susceptibility
        
        rate[6] = mu; // natural death of S
        rate[7] = mu;  // natural death of R
        rate[8] = gamma;  // meningitis death of I
        rate[9] = mu;  // natural death of I
        rate[10] = mu;  // natural death of C
        
        // Individuals moving between the model compartments within dt
        term[0] = rate[0];
        term[1] = rate[1]*Susc;
        term[2] = rate[2]*Carrier;
        term[3] = rate[3]*Carrier;
        term[4] = rate[4]*Ill;
        term[5] = rate[5]*Recov;
        
        term[6] = rate[6]*Susc;
        term[7] = rate[7]*Recov;
        term[8] = rate[8]*Ill;
        term[9] = rate[9]*Ill;
        term[10] = rate[10]*Carrier;
        
        // balance the equations
        DSusc = term[0] + term[5] - term[1] - term[6] ;
        DCarrier= term[1] - term[2] - term[3] - term[10];
        DIll = term[2] - term[4] - term[8] - term[9];
        DRecov = term[3] + term[4] - term[5] - term[7];
        DInc = term[2]; // true incidence is cumulative I over dt
'

## define the stochastic model for POMP object
SCIRS_stock.c<-'
        double a, beta, births;
        double year = 365.0;
        double pi = 3.141593;
        double rate[11];
        double trans[11];
        // Population size in a covariate data.frame instead.
        //double population = Susc + Carrier + Ill + Recov;
        
        // Seasonal function for invasion. Their are some evidence that the invasion
        // parameter < a > could be up to 100 -fold the baseline or average value during dry season
        // teta is the time when epidemic picks
        a = a0*((49.5*epsilon_a)*cos(2*pi*(t-teta)/year)+(1 + 49.5*epsilon_a));
        
       // otherwise when no assumption is made on magnitude of variations of the parameters
        //a = a0*(1 + epsilon_a*cos(2*pi*(t-teta)/year));
        
        // seasonal function for transmission
        beta = beta0*(1 + epsilon_b*cos(2*pi*(t-teta)/year));

        //New  births
        births = mu*population + ((gamma*Ill)*act_comp_mening_death);

        rate[0] = births; // births into susceptibles

        // The following rate will be used below to draw the number of individual to transition 
        // between each model compartment at time dt
        
        rate[1] = beta*(Carrier + Ill)/population; // infection.
        rate[2] = a;  //invasion
        rate[3] = alpha; // carriage recovery 
        rate[4] = rho; // Disease recovery
        rate[5] = phi; // Re-susceptibility
        
        rate[6] = mu; // natural death of S
        rate[7] = mu;  // natural death of R
        rate[8] = gamma;  // meningitis death of I
        rate[9] = mu;  // natural death of I
        rate[10] = mu;  // natural death of C
        
        trans[0] = rpois(rate[0]*dt);	// Assuming births are Poisson distributed
        
        reulermultinom(1,Susc,&rate[1],dt,&trans[1]);
        reulermultinom(1,Susc,&rate[6],dt,&trans[6]); 
        reulermultinom(1,Carrier,&rate[2],dt,&trans[2]);
        reulermultinom(1,Carrier,&rate[3],dt,&trans[3]); 
        reulermultinom(1,Carrier,&rate[10],dt,&trans[10]);
        reulermultinom(1,Ill,&rate[4],dt,&trans[4]);
        reulermultinom(1,Ill,&rate[8],dt,&trans[8]); 
        reulermultinom(1,Ill,&rate[9],dt,&trans[9]);
        reulermultinom(1,Recov,&rate[5],dt,&trans[5]); 
        reulermultinom(1,Recov,&rate[7],dt,&trans[7]);

        // balance the equations with transitions
        Susc += trans[0] - trans[1] - trans[6] + trans[5];
        Carrier += trans[1] - trans[2] - trans[3] - trans[10];
        Ill += trans[2] - trans[4] - trans[8] - trans[9];
        Recov += trans[3] + trans[4] - trans[5] - trans[7];
        Inc += trans[2]; // true incidence is cumulative I over dt
'

SCIRS_initlz <-"
        Susc = nearbyint(population*Susc_0);
        Carrier = nearbyint(population*Carrier_0);
        Ill = nearbyint(population*Ill_0);
        Recov = nearbyint(population*Recov_0);
        Inc = 0;
"

SCIRS_dmeas.c<- '
        lik = dpois(meningitis_cases, nearbyint(Inc), give_log);
        
        // print values of state var if log likelihood return non-finite value.
        if (!R_FINITE(lik)){
                Rprintf(\"%lg %lg %lg %lg %lg %lg\\n\",Susc, Carrier, Ill, Recov, Inc,lik);
        }
        
        lik = (give_log) ? lik : exp(lik);
        
'

SCIRS_rmeas.c<-'
        meningitis_cases = rpois(Inc);
        if (meningitis_cases >= 0.0) {
                meningitis_cases = nearbyint(meningitis_cases);
        } else {
                meningitis_cases = 0;
        }
'

## Model tranformed params values c code
SCIRS_toEst.c<- '
        // fixed params
        Tgamma = log(gamma);
        Tmu    = log(mu);
        Trho   = log(rho);
        Tact_comp_mening_death = 1;
        // Unknown params
        Tbeta0 = log(beta0);
        Talpha = log(alpha);
        Tphi = log(phi);
        Tteta = log(teta);
        Tepsilon_a = logit(epsilon_a);
        Tepsilon_b = logit(epsilon_b);
        Ta0 = log(a0);
        to_log_barycentric (&TSusc_0, &Susc_0, 4);
'

## Model untranformed params values c code
SCIRS_fromEst.c<- '
        // fixed params
        Tgamma = exp(gamma);
        Tmu    = exp(mu);
        Trho   = exp(rho);
        Tact_comp_mening_death = 1;
        // Unknown params
        Tbeta0 = exp(beta0);
        Talpha = exp(alpha);
        Tphi = exp(phi);
        Tteta = exp(teta);
        Tepsilon_a = expit(epsilon_a);
        Tepsilon_b = expit(epsilon_b);
        Ta0 = exp(a0);
        from_log_barycentric (&TSusc_0, &Susc_0, 4);
'

## pomp object
# new pomp model specification for both deterministic and stochastic simulations
time_col_name<-"time"

        pomp(
             data = sample_data_set[,c(time_col_name, "meningitis_cases")],
             times= time_col_name,
             t0 = sample_data_set[,time_col_name][1],
             rprocess=euler.sim(Csnippet(SCIRS_stock.c),delta.t=1/52), #stochastic version
             skeleton = vectorfield(Csnippet(SCIRS_determ.c)), # deterministic model
             dmeasure = Csnippet(SCIRS_dmeas.c),
             rmeasure = Csnippet(SCIRS_rmeas.c),
             toEstimationScale=Csnippet(SCIRS_toEst.c),
             fromEstimationScale=Csnippet(SCIRS_fromEst.c),
             initializer = Csnippet(SCIRS_initlz),
             covar = subset_demog_seguenega_bema,
             tcovar= time_col_name, #timevector name corresponding in the covarible dataframe
             zeronames=c("Inc"),
             statenames = c("Susc", "Carrier", "Ill", "Recov", "Inc"),
             paramnames = c("gamma", "beta0", "a0", "alpha", "epsilon_a",
                            "epsilon_b", "mu", "teta", "rho", "phi", "act_comp_mening_death"
                            ,"Susc_0","Carrier_0","Ill_0","Recov_0"
                            )
        ) -> SCIRS_pomp

# sample data set with weekly disease case report in a given year
    week     time meningitis_cases
106    1 2006.012                0
107    2 2006.031                0
108    3 2006.051                0
109    4 2006.070                0
110    5 2006.089                0
111    6 2006.108                0
112    7 2006.127                0
113    8 2006.146                0
114    9 2006.166                0
115   10 2006.185                0
116   11 2006.204                0
117   12 2006.223                1
118   13 2006.242                1
119   14 2006.261                4
120   15 2006.281                1
121   16 2006.300                1
122   17 2006.319                2
123   18 2006.338                1
124   19 2006.357                1
125   20 2006.376                0
126   21 2006.396                0
127   22 2006.415                0
128   23 2006.434                0
129   24 2006.453                0
130   25 2006.472                0
131   26 2006.491                0
132   27 2006.511                1
133   28 2006.530                0
134   29 2006.549                0
135   30 2006.568                0
136   31 2006.587                0
137   32 2006.606                0
138   33 2006.626                0
139   34 2006.645                0
140   35 2006.664                0
141   36 2006.683                0
142   37 2006.702                0
143   38 2006.721                0
144   39 2006.741                0
145   40 2006.760                0
146   41 2006.779                0
147   42 2006.798                0
148   43 2006.817                0
149   44 2006.836                0
150   45 2006.856                0
151   46 2006.875                0
152   47 2006.894                0
153   48 2006.913                0
154   49 2006.932                0
155   50 2006.951                0
156   51 2006.971                0
157   52 2006.990                0

# the corresponding covariate data-frame for that same year
    week     time population
106    1 2006.012      12350
107    2 2006.031      12350
108    3 2006.051      12350
109    4 2006.070      12350
110    5 2006.089      12350
111    6 2006.108      12350
112    7 2006.127      12350
113    8 2006.146      12350
114    9 2006.166      12350
115   10 2006.185      12350
116   11 2006.204      12350
117   12 2006.223      12350
118   13 2006.242      12350
119   14 2006.261      12350
120   15 2006.281      12350
121   16 2006.300      12350
122   17 2006.319      12350
123   18 2006.338      12350
124   19 2006.357      12350
125   20 2006.376      12350
126   21 2006.396      12350
127   22 2006.415      12350
128   23 2006.434      12350
129   24 2006.453      12350
130   25 2006.472      12350
131   26 2006.491      12350
132   27 2006.511      12350
133   28 2006.530      12350
134   29 2006.549      12350
135   30 2006.568      12350
136   31 2006.587      12350
137   32 2006.606      12350
138   33 2006.626      12350
139   34 2006.645      12350
140   35 2006.664      12350
141   36 2006.683      12350
142   37 2006.702      12350
143   38 2006.721      12350
144   39 2006.741      12350
145   40 2006.760      12350
146   41 2006.779      12350
147   42 2006.798      12350
148   43 2006.817      12350
149   44 2006.836      12350
150   45 2006.856      12350
151   46 2006.875      12350
152   47 2006.894      12350
153   48 2006.913      12350
154   49 2006.932      12350
155   50 2006.951      12350
156   51 2006.971      12350
157   52 2006.990      12350

Parameters and initial state variables values.

# params
SCIRS_theta<-c(
        gamma = 5.2 ,
        mu    = (1 / 54) ,
        rho   = 52 ,
        act_comp_mening_death = 1, # dimensionless
        # estimates from previous work
        beta0 = 105,
        a0 = 0.0001737079*year,
        alpha = 0.1331496*year,
        teta = 91,
        epsilon_a = 0.003086651,
        epsilon_b = 0.9707787,
        phi = 0.003297274*year
)

# initial states
SCIRS_init_state <- c(
        Susc_0 = 0.99, #proportion of the population
        Carrier_0 = 0.01, # approximately 1% of carriers in the population at start of the year.
        Ill_0 = 0,
        Recov_0 = 0,
        Inc_0 = 0
)

Deterministic skeleton simulation

SCIRS.traj <- trajectory(SCIRS_pomp, 
                         params = c(SCIRS_theta, SCIRS_init_state), 
                         as.data.frame = TRUE
                         )

output of deterministic simulation

Stochastic simulation

SCIRS_pomp %>% 
        simulate(params=c(SCIRS_theta,SCIRS_init_state),nsim=11,as.data.frame=TRUE,include.data=TRUE) -> SCIRS_sim

SCIRS_sim %>%
        ggplot(aes(x=time,y=meningitis_cases,group=sim,color=(sim=="data")))+
        guides(color=FALSE)+
        geom_line()+ geom_point()+ facet_wrap(~sim,ncol=2)

Output of stochastic simulation

The error i get when i run a pfilter on the pomp_object

Thanks in advance for your help.

Unused argument

I have written following code for multivariate pomp using example given in faq. I tried to add a skeleton function. But I am getting an error unused argument.

Error in map(skel = Csnippet("\n                                        Dx1 = a11*x1 + a12*x2;\n                                        Dx2 = a21*x1 + a22*x2;"),  : 
  unused argument (skel = Csnippet("\n                                        Dx1 = a11*x1 + a12*x2;\n                                        Dx2 = a21*x1 + a22*x2;"))

library(magrittr)
library(pomp)
library(ggplot2)
data <- read.csv("C:/Users/admin/Documents/phd/book2.csv", stringsAsFactors = FALSE)
data <- as.data.frame(data)

pomp(
  data,times="t", t0=0,
  rmeasure=Csnippet("
                    y1 = rnorm(x1+x2,sigma1);
                    y2 = rnorm(x2-x1,sigma2);
                    "),
  dmeasure=Csnippet("
                    lik = dnorm(y1,x1+x2,sigma1,1)+dnorm(y2,x2-x1,sigma2,1);
                    lik = (give_log) ? lik : exp(lik);
                    "),
  rprocess=discrete.time.sim(
    step.fun=Csnippet("
                      double tx1, tx2;
                      tx1 = rnorm(a11*x1 + a12*x2,nu1);
                      tx2 = rnorm(a21*x1 + a22*x2,nu2);
                      x1 = tx1; x2 = tx2;
                      "),
    delta.t=1),
  skeleton = map(skel= Csnippet("
                                        Dx1 = a11*x1 + a12*x2;
                                        Dx2 = a21*x1 + a22*x2;"),                                        
                                        delta.t=1 ),
  initializer=Csnippet("
                       x1 = 0;
                       x2 = 0;
                       "),
  
  #toEstimationScale = logtrans <- Csnippet("
   #                                           Tr1 = log(a11);
    #                                            TK1 = log(a21);
     #                                            Tsigma1 = log(sigma1);
      #                                            
       #                                   "),
  #
  #fromEstimationScale=exptransexptrans <- Csnippet("
   #                                                Tr1 = exp(a11);
    #                                                  TK1 = exp(21);
     #                                                Tsigma1 = exp(sigma1);
      #                                              
      #                                             "),
  statenames=c("x1","x2"),
  paramnames=c("a11","a12","a21","a22","sigma1","sigma2","nu1","nu2"),
  params=c(a11=0.5,a12=-0.1,a21=0.2,a22=-1,nu1=0.3,nu2=0.1,sigma1=0.1,sigma2=0.3)
  ) -> parus

Summarizing the posterior distribution

When using pmcmc for estimation, I note that the package outputs parameter estimates which can be extracted using the coef function. Does the software calculate these estimates using all pmcmc samples in a chain? Is there a way to specify burn-in? Does the package calculate standard deviations or credible intervals for parameters?

Issue with abc.R test file.

Hi,

I'm running into issues with the abc test file.

Lines 101 and 102 produce this error:

> try(abc7 <- c(abc2,abc3))
Error in validObject(.Object) : 
  invalid class “abcList” object: error in ‘c’: to be combined, ‘abc’ objects must have chains of equal length
> try(abc7 <- c(abc7,abc3))
Error in validObject(.Object) : 
  invalid class “abcList” object: error in ‘c’: to be combined, ‘abc’ objects must have chains of equal length

But that appears to be because of line 58: abc3 <- continue(abc3,Nabc=700)

Once 58 is omitted the chains error goes away, but more warning messages start to follow from there

Lines 106-108:

> plot(abc2,scatter=TRUE,y=NA)
Warning message:
in ‘plot-abc’: ‘y’ is ignored 
> plot(abc7,scatter=TRUE,y=NA)
Warning message:
in ‘plot-abc’: ‘y’ is ignored 
> try(plot(abc7,pars=c("alpha.1"),scatter=TRUE,y=NA))
Error : in ‘plot-abc’: can't make a scatterplot with only one variable
In addition: Warning message:
in ‘plot-abc’: ‘y’ is ignored

Line 145:

 abc9 <- abc(
+     abc8,Nabc=1,probes=probes.good,scale=scale.dat,epsilon=5,
+     proposal=mvn.rw.adaptive(rw.var=vmat,scale.start=500,shape.start=100))
Warning messages:
1: In chol.default(covmat, pivot = TRUE) :
  the matrix is either rank-deficient or indefinite
2: in ‘mvn.rw.adaptive’: degenerate proposal

Lines 149, 151, 155 and 161 with their own but similar errors :)

> try(abc(abc2,Nabc=3,probes=probes.good,scale=scale.dat,epsilon=1,
+         proposal=function(theta,...)stop("urp!")))
Error : in ‘abc’: in proposal function: urp!
> try(abc(pomp(ou2,dprior=function(params,log,...)stop("oof!")),
+         Nabc=3,probes=probes.good,scale=scale.dat,epsilon=1,
+         proposal=mvn.diag.rw(c(alpha.1=0.01,alpha.2=0.01))))
Error : in ‘abc’: ‘dprior’ error: oof!

> try(abc(abc2,Nabc=3,probes=probes.good,scale=scale.dat,epsilon=1,
+         proposal=function(theta,...) {
+             neval <<- neval+1
+             if (neval>1) stop("bif!") else theta
+         }))
Error : in ‘abc’: in proposal function: bif!

> try(abc(
+     pomp(ou2,dprior=function(params,log,...) {
+         neval <<- neval+1
+         if (neval>10) stop("eep!")
+         else if (log) 0
+         else 1
+     }),Nabc=20,probes=probes.good,scale=scale.dat,epsilon=1,
+     proposal=mvn.diag.rw(c(alpha.1=0.01,alpha.2=0.01))))
Error : in ‘abc’: ‘dprior’ error: eep!

And so on. All of the try statements have similar ends. Would love to play with this a bit more, just hoping to use this file as a reference.

Multitype epidemic models - state/parameter vectors or matrices

I am interested in multi-type epidemic models in which the states (statenames) will for example have to be subdivided into age groups. It would therefore be helpful to have vectors with the numbers of susceptible hosts in certain age classes S[0]…S[n](as well as infected etc.). Is there a way to formalize this within pomp or do I have to define states S0, S1,…,Sn manually?

Having this in mind, is it also possible to read data from file in the Csnippets, in particular for initialization or parameter choice – and how do I mask quotation marks in the file handles?
Is it otherwise possible to read a parameter matrix (e.g. the next generation matrix) within in R and use it within the Csnippets (and how do I deal with the different offset in array indices starting from 0 or 1 in C and R, respectively)?

double exp(double) in rprocess returns nan

My problem is that i want to test the covariate effect on the FOI and reporting rate, therefore i use a exp function in the rprocess to make sure the FOI is non-negative and monotonic increasing. However, it returns nan during the simulation. I try to use the pow function to replace the exp function, but it still doesn't work. My session infomation:

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] doMC_1.3.4 iterators_1.0.8 foreach_1.4.3 magrittr_1.5 pomp_1.9
loaded via a namespace (and not attached):
[1] deSolve_1.13 subplex_1.1-6 rsconnect_0.4.3 tools_3.3.1 coda_0.18-1 mvtnorm_1.0-5 codetools_0.2-14 grid_3.3.1 digest_0.6.10 nloptr_1.0.4 lattice_0.20-34

Codes and Data

Missing packages in "Getting started" vignette

This vignette depends on several packages besides "pomp": from what I could determine, at least "ggplot2", "magrittr", "foreach", "iterators" and "coda" (but perhaps more?). From these packages, only "coda" is being specifically loaded in the vignette.

Perhaps these packages should be explicitly loaded in the vignette (via library() or require()), to make it easier for people to follow the vignette?

FAQ: Issue 4.1

The FAQ has been very helpful in interpreting and locating problems in my Csnippets.

One suggestion to improve Issue 4.1 of the FAQ is to add another common cause for problems in the dmeasure function: trying to modify the pre-declared lik variable without first initializing it to zero. I know it's a trivial C rookie error, but perhaps adding that line to the FAQ may help in the future at least one or two people like me, saving them a couple of hours ;)

On the other hand, the suggestion of using Rprintf to diagnose problems was a good hint in the right direction.

Thanks, once again.

rbinom returns nan when using mif2 function

I'm trying to infer epidemiological parameters from incidence time-series using you mif2 function and I do not understand why the rbinom function of my step function sometimes returns nan values when I am using the mif2 function whereas it works fine when I only ran simulations using the simulate function.
Here is a part of my code:

seidr_step <- Csnippet('
  Rprintf(\"S=%lg\\tE=%lg\\tI=%lg\\tD=%lg\\tR=%lg\\n\", S, E, I, D, R);
  double R0_I = (1-f) * R0;
  Rprintf(\"R0_I=%lg\\n\", R0_I);
  double beta_I = R0_I * gamma;
  Rprintf(\"beta_I=%lg\\n\", beta_I);
  double lambda_I = beta_I / N * I;
  Rprintf(\"lambda_I=%lg\\n\", lambda_I);
  double R0_D = f * R0;
  Rprintf(\"R0_D=%lg\\n\", R0_D);
  double beta_D = R0_D * mu / p;
  Rprintf(\"beta_D=%lg\\n\", beta_D);
  double lambda_D = beta_D / N * D;
  Rprintf(\"lambda_D=%lg\\n\", lambda_D);

  // Transitions
  // From class S to class E
  double transSE = rbinom(S, 1.0 - exp(- lambda_I * dt)) + 
      rbinom(S, 1.0 - exp(- lambda_D * dt));
  Rprintf(\"S=%lg\\n\", S);
  Rprintf(\"dt=%lg\\n\", dt);
  Rprintf(\"1.0 - exp(- lambda_I * dt)=%lg\\n\", 1.0 - exp(- lambda_I * dt));
  Rprintf(\"rbinom(S, 1.0 - exp(- lambda_I * dt))=%lg\\n\",
       rbinom(S, 1.0 - exp(- lambda_I * dt)));
  Rprintf(\"1.0 - exp(- lambda_D * dt)=%lg\\n\", 1.0 - exp(- lambda_D * dt));
  Rprintf(\"rbinom(S, 1.0 - exp(- lambda_D * dt))=%lg\\n\", 
       rbinom(S, 1.0 - exp(- lambda_D * dt)));
  Rprintf(\"transSE=%lg\\n\", transSE);
  // From class E to class I
  double transEI = rbinom(E, 1.0 - exp(- sigma * dt));
  // From class I
  double transI = rbinom(I, 1.0 - exp(- gamma * dt));
  // to class D
  double transID = rbinom(transI, p);
  double transIR = transI - transID;
  // From class D to class R
  double transDR = rbinom(D, 1.0 - exp(- mu * dt));

  // Balance the equations
  S -= transSE;
  E += transSE - transEI;
  I += transEI - transID - transIR;
  D +=  transID - transDR;
  R += transIR + transDR;
  N_EI += transEI; // incidence of I cases
')

seidr.sim <- simulate(
	seidr,
	params=c(N=4000000, R0=2, f=0.5, sigma=30.9322/12, p=0.765, 
                         gamma=58.87097/12, mu=239.53655/12, rho=1, k=0), 
	nsim=1,
	as.data.frame=TRUE,
	include.data=TRUE
)

seidr.mif <- mif2(
	seidr,
	Np=1,
	Nmif=50,
	cooling.type="geometric",
	cooling.fraction.50=0.5,
	transform=F,
	start=c(
		N=4000000, R0=2, f=0.5, sigma=30.9322/12, p=0.765, 
                gamma=58.87097/12, mu=239.53655/12, rho=1, k=0
	),
	rw.sd=rw.sd(
		N=0.2, 
		R0=0.02, 
		f=0.02, 
		sigma=0, 
		p=0, 
		gamma=0, 
		mu=0.02, 
		rho=0, 
		k=0
	)
)

The first lines returned by the Rprintf calls are:

S=4e+06	E=0	I=1	D=0	R=0
R0_I=1.03331
beta_I=5.06932
lambda_I=1.26733e-06
R0_D=0.97515
beta_D=25.4252
lambda_D=0
S=4e+06
dt=0.01
1.0 - exp(- lambda_I * dt)=1.26733e-08
rbinom(S, 1.0 - exp(- lambda_I * dt))=nan
1.0 - exp(- lambda_D * dt)=0
rbinom(S, 1.0 - exp(- lambda_D * dt))=nan
transSE=nan

Could you help me please ?

Emma

Using R function instead of CSnippet

Hi, I'm trying to replicate this tutorial http://kingaa.github.io/pomp/vignettes/getting_started.html, but instead of use CSnippet I want to use R function. I have not found any example of this. I want to do this to call an external binary from R (with system()) and to understand better the pomp's implementation.

So, I've replaced this function in the tutorial:

step.fun <- Csnippet("
      double dW = rnorm(0,sqrt(dt));
      N += r*N*(1-N/K)*dt+sigma*N*dW;
    ")

for this

test.fun <- function(x,t,params,delta.t,...){
    dW <- rnorm(0, sqrt(t))
     N <- N + r*N*(1-N/K)*dt+sigma*N*dW
}

and then like in the tutorial

parus <- pomp(data=parus.dat,time="year",t0=1959, 
        rprocess=euler.sim(step.fun=test.fun,delta.t=1/305),
        statenames="N",paramnames=c("r","K","sigma"))

and simulate
simStates <- simulate(parus,nsim=10,params=c(r=0.2,K=200,sigma=0.5,N.0=200),states=TRUE)

but I have this error

Error in (function (x, t, params, delta.t, ...) :
object 'N' not found
Error: ‘simulate’ error

and of course, N is not declare in the R function, but for the CSnippet pomp set that variable.

If you have any example of suggestion of how to do this I'll be very grateful.

p.d: I've been using this page to try the implementation http://www.rdocumentation.org/packages/pomp/functions/plugins

obsnames not extracted from data when 1 row in data.frame

I stumbled upon this problem today when working on tests for the C snippet plugin. Here's an example that shows the problem with pomp 1.15 from CRAN:

create_example <- function(times = c(1,2), t0 = 0, mu = 0.001, N_0 = 1) {
  data <- data.frame(time = times, reports = NA)
  mmodel <- reports ~ pois(ct)
  rate.fun <- function(j, x, t, params, covars, ...) {
    switch(j, params["mu"]*x["N"], stop("unrecognized event ",j))
  }
  rprocess <- gillespie.sim(rate.fun = rate.fun, v=rbind(N=-1, ct=1), d=rbind(1, 0))
  initializer <- function(params, t0, ...) {
    c(N=N_0,ct=12)
  }
  pomp(data = data, times = "time", t0 = t0, params = c(mu=mu),
    rprocess = rprocess, initializer = initializer,
    zeronames="ct", paramnames=c("mu"), statenames=c("N","ct"),
    measurement.model = mmodel)
 }
 simulate(create_example(times = 1), as.data.frame=TRUE)
 simulate(create_example(times = c(1,2)), as.data.frame=TRUE)

The output from the first simulate call is:

Error: in ‘simulate’: variable 'reports' not found among the observables

Whereas the second call produces expected output:

   time reports N ct sim
 1    1       0 1  0   1
 2    2       0 1  0   1

The problem seems to be rooted in pomp.R in the line:

data <- t(sapply(data[-tpos],as.numeric))

The following seems to be a more robust replacement:

data <- do.call(rbind, lapply(data[-tpos], as.numeric))

Error plotting results from mif2()

Hi,
I'm getting an error when trying to plot diagnostic charts after running mif2(). It's quite possible that it's me doing something wrong but it's not at all obvious from the error. So if it turns out not to be a bug, it would be nice to get an error message that allows me to fix the issue myself.

mif2() runs without any warnings and the estimated parameter values make sense.

...

mif1 <- foreach(i = 1:10, .combine = c) %dopar% {
  mif2(model, Nmif = 100, start = perturbCoefs(coefs.true, sdCoefs), transform=TRUE,
       Np = 2000, rw.sd=sdCoefs_rw, cooling.fraction.50 = 0.05, verbose=TRUE)
}

pf1 <- foreach(mf = mif1, .combine = c) %dopar% {
  pf <- replicate(n = 10, logLik(pfilter(mf, Np = 10000)))
  logmeanexp(pf)
}

plot(mif1)
Hit <Return> to see next plot: 
Error in object@filter.mean[pars, ] : subscript out of bounds

The effective sample size chart is displayed but nothing else. Let me know if you'd like the complete data + model code so that you can replicate the problem.

Thanks

Fitting ecological data with multiple replicates per time point

Thank you very much for this great package. I am interested in using it to fit ecological data that has various replicate measures per time point. Before trying the particle filtering I am running some examples using MLE. Instead of sending my exact example I think it would be easier to simply use an example you have in "Model-based Inference in Ecology and Epidemiology (a short course)". Please see lesson "Introduction to inference: parameter estimation", in the first example under the section "Likelihood", you introduce a the dataset of community A into the pomp function. Say now that you have three replicates per time point, that would give you a dataset like the example one I upload here.
If I give pomp(data= "niameyAreps", ...) then I get an error that says

 in 'pomp': 'times' must be an increasing numeric vector

So my question is, can pomp take data that has multiple replicates per time point? and if so, then how do I restructure the data or the code so that it can use this extra information.

niameyAreps.zip

Acceptance step in ABC algorithm

I was reviewing the implementation of Toni et al's abc-mcmc algorithm and noticed that the acceptance step doesn't appear to include there M5. step. When the priors are uniform I see that alpha=1 and hence the probability of acceptance is 1, does this then mean that the POMP abc function can only be used with uniform priors?

I may be missing something crucial here or this may be an oversight?

Algorithm outline and code chunks below for information

Thanks for a great package!

In the abc function (line 150)

            ## ABC update rule
            distance <- sum(((datval-simval)/scale)^2)
            if( (is.finite(distance)) && (distance<epssq) ){
                theta <- theta.prop
                log.prior <- log.prior.prop
                .accepts <- .accepts+1L
            }

and in the pmcmc function

            ## PMCMC update rule (OK because proposal is symmetric)
            alpha <- exp(pfp.prop@loglik+log.prior.prop-pfp@loglik-log.prior)
            if (runif(1) < alpha) {
                pfp <- pfp.prop
                theta <- theta.prop
                log.prior <- log.prior.prop
                .accepts <- .accepts+1L
}

it don't work!

I plugged it in but it done broke.

Recommended optimization for traj.match

For the function of traj.match to solve ODE with reporting error, what is the recommended way for optimization?

I have try Nelder-Mead, subplex and SANN. It is highly likely that subplex give me some error/complaining about initial value. I have try use Nelder-Mead or SANN--> Nelder-Mead. However, increasing sample size in parameter space will give me better result in log likelihood. This probably mean that I didn't arrive the global optimum.

I have 10-20 parameters.

What would the recommended way to solve the optimization problem?

How to generate random normal variable in the C snippets

Hi, I am new in programming in R using C code. I want to generate random normal variable in the C snippets in the pomp, former I could define my own generating function in C as:

#include  < math.h >
#include  < stdlib.h >

double
randn (double mu, double sigma)
{
  double U1, U2, W, mult;
  static double X1, X2;
  static int call = 0;

  if (call == 1)
    {
      call = !call;
      return (mu + sigma * (double) X2);
    }

  do
    {
      U1 = -1 + ((double) rand () / RAND_MAX) * 2;
      U2 = -1 + ((double) rand () / RAND_MAX) * 2;
      W = pow (U1, 2) + pow (U2, 2);
    }
  while (W >= 1 || W == 0);

  mult = sqrt ((-2 * log (W)) / W);
  X1 = U1 * mult;
  X2 = U2 * mult;

  call = !call;

  return (mu + sigma * (double) X1);
}

How could I do this in the pomp, what I want is

sir.step<-"
double rate[2];
double dN[2];
double P=S+I+R;
rate[0]=bet*I/P;
rate[1]=mu;
reulermultinom(1,S,&rate[0],dt,&dN[0]);
reulermultinom(1,I,&rate[1],dt,&dN[1]);
if(!R_FINITE(S)) Rprintf(\"%lg %lg %lg %lg %lg %lg %lg %lg %lg\\n\",dN[0],rate[0],dN[1],rate[1],bet,mu,S,I,R);
S+=-dN[0];
I+=dN[0]-dN[1];
R+=dN[1];
"

sir<-pomp(data=data.frame(cases=data,time=seq(1,14,by=1)),
          times="time",t0=0,dmeasure=Csnippet(dmeas),
          rmeasure=Csnippet(rmeas),rprocess=euler.sim(
            step.fun=Csnippet(sir.step),delta.t=1/14/2),
          statenames=c("S","I","R"),
          paramnames=c("bet","mu","rho","theta1","theta2"),
          initializer=function(params,t0,...){
            ps<-exp(params["theta1"])/(1+exp(params["theta1"])+exp(params["theta2"]))
            pi<-1/(1+exp(params["theta1"])+exp(params["theta2"]))
            pr<-exp(params["theta2"])/(1+exp(params["theta1"])+exp(params["theta2"]))
            return(setNames(as.numeric(rmultinom(1,763,prob=c(ps,pi,1-ps-pi))),
                     c("S","I","R")))
          },params=c(bet=2,mu=0.5,rho=3,theta1=5,theta2=1))

I want to generate random normal variable in the sir.step function, how could I include < math.h >, < stdlib.h > the two header? Or you know some better way to generate random variable in the pomp? Thank you in advance.

How to demonstrate the result properly?

Thanks for a great package which helps me a lot on the research!
After running a few mif2 code, I think I have find the highest loglik.

Can I determine that the set of parameters value with highest log likelihood is the best result?
Why don't paper such as He , 2010 emphasize the confidence interval for parameters?
How to demonstrate the result properly? Which figures should I put into the paper?

Thank you!

resampling frequency

Hi
in the documentation it is clearly stated that the currently implemented particle filter resamples particles at each time point. I was wondering if there exist an experimental version of pomp (e.g. a version intended for future release) that allow users to control the resampling frequency, e.g. resample only when the ESS goes below a certain threshold.

Problem with compiling C snippets on Windows

I have a problem running pomp on my Windows (7, x64) machine. I have installed: pomp 1.2.2.1, R 3.2.3 and Rtools 3.3 (also tried 3.2).

Some of the C snippets do compile, but one of them doesn't. I get:

 fatal error: pomp.h: No such file or directory compilation terminated.

The problem is that I have

# include <pomp.h>

in this file instead of

 # include <C:/Users/***/Documents/R/win-library/3.2/pomp/include/pomp.h>

that I have for other generated .C files.

Any ideas why this might be happening?

problem in mif

I have written following code. Where I am getting the errors:

Error in attr(val, "system.time") <- tmg : 
  attempt to set an attribute on NULL
 pairs(~loglik+a11+a12+a21+a22+nu1+nu2+sigma1+sigma2,data=mle)
Error in eval(predvars, data, env) : object 'loglik' not found.

Can you please help me?

library(magrittr)
library(pomp)
library(ggplot2)
library(foreach)
library(doMPI)
library(doRNG)
library(iterators)
cl <- startMPIcluster()
registerDoMPI(cl)
registerDoRNG(348885445L)

data <- read.csv("C:/Users/admin/Documents/phd/book2.csv", stringsAsFactors = FALSE)
data <- as.data.frame(data)

pomp(
  data,times="t", t0=0,
  rmeasure=Csnippet("
                    y1 = rnorm(x1+x2,sigma1);
                    y2 = rnorm(x2-x1,sigma2);
                    "),
  dmeasure=Csnippet("
                    lik = dnorm(y1,x1+x2,sigma1,1)+dnorm(y2,x2-x1,sigma2,1);
                    lik = (give_log) ? lik : exp(lik);
                    "),
  rprocess=discrete.time.sim(
    step.fun=Csnippet("
                      double tx1, tx2;
                      tx1 = rnorm(a11*x1 + a12*x2,nu1);
                      tx2 = rnorm(a21*x1 + a22*x2,nu2);
                      x1 = tx1;x2= tx2;"),
    delta.t=1),
  skeleton = map(
    skel.de <- Csnippet("double rx1,rx2;
                        rx1= a11*x1 + a12*x2;
                        rx2= a21*x1 + a22*x2;
                        Dx1= rx1;Dx2= rx2;  
                        "),                                        
    delta.t=1 
    ),
  initializer=Csnippet("
                       x1 = 0;
                       x2 = 0;
                       "),
  
  toEstimationScale = logtrans <- Csnippet("
                                           Ta11 = (a11);
                                           Ta21 = (a21);
                                           Ta12 = (a12);
                                           Ta22 = (a22);
                                           Tsigma1 = (sigma1);
                                           Tsigma2 = (sigma2);
                                           Tnu1 = nu1;
                                           Tnu2 = nu2;
                                           "),
  
  fromEstimationScale=exptrans <- Csnippet("
                                           Ta11 = (a11);
                                           Ta21 = (a21);
                                           Ta12 = (a12);
                                           Ta22 = (a22);
                                           Tsigma1 = (sigma1);
                                           Tsigma2 = (sigma2);
                                           Tnu1 = nu1;
                                           Tnu2 = nu2;   
                                           "),
  statenames=c("x1","x2"),
  paramnames=c("a11","a12","a21","a22","sigma1","sigma2","nu1","nu2")
 # params=c(a11=0.5,a12=-0.1,a21=0.2,a22=-1,nu1=0.3,nu2=0.1,sigma1=0.1,sigma2=0.3)
  ) -> parus
bake(file = "parus.rds",{
  guesses <- sobolDesign(lower=c(a11=0.5,a12=-0.1,a21=0.2,a22=-1,nu1=0.3,nu2=0.1,sigma1=0.1,sigma2=0.3),
                         upper=c(a11=0.6,a12=0.0,a21=0.3,a22=-0.9,nu1=0.4,nu2=0.2,sigma1=0.2,sigma2=0.4),
                         nseq=100)
  print(guesses)
  library(foreach)
  foreach (guess = iter(guesses,"row"),.combine=rbind,
           .packages=c("pomp","magrittr"),.errorhandling="remove",
           .export="parus",.inorder=FALSE) %dopar% {
             parus %>% 
               mif2(start = unlist(guess),Nmif=5,Np=1000,transform=TRUE,
                    cooling.fraction.50=0.8,cooling.type="geometric",
                    rw.sd=rw.sd(a11=0.02,a12=0.02,a21=0.02,a22=0.02,nu1=0.02,nu2=0.02,sigma1=0.02,sigma2=0.02)) %>%
               mif2() -> mf
             ll <- logmeanexp(replicate(5,logLik(pfilter(mf))),se=TRUE)
             data.frame(loglik<-ll[1],loglik.se<-ll[2],as.list(coef(mf)))
           } -> mle
}) -> mle

## ----plot-mles-----------------------------------------------------------
pairs(~loglik+a11+a12+a21+a22+nu1+nu2+sigma1+sigma2,data=mle)

time varying random walk standard deviations

The developer version of pomp 0.44-1 has time varying random walk standard deviations. It would be great if this feature were integrated into the next release version.

-Micaela

What happens in case of a filtering failure?

I am doing inference with pmcmc and see repeated filtering failures. The manual says that this corresponds to the situation in which the log likelihood is below the parameter tol for all particles (is it actually the likelihood as the default value of tol is 1e-17 not -17?)).
Unfortunately, I have not found what happens at this point - does the simulation just continue until particles (hopefully) increase in likelihood, are particles reset, does something else happen...? I wonder particularly as it seems that the time scale of changes in the traces of my simulations seems to grow after an initial phase of faster (though not consistent) changes. It would be helpful to learn more to improve on the convergence of pmcmc (similar questions seem to have have been touched in #13).

Rprintf in Csnippet

I'm trying to estimate parameters in a multi-vairate epidemic model via the mif2 algorithm. Unfortunately there is often an error message that the dmeasure returns non-finite values during the mif2 process. In your documentation it says that using Rprintf in the dmeasure Csnippet can help debug the error.
If I use the Rprintf statement in my Csnippet code there is always an error message during the creation of the pomp-object.
Is there an example how to use the Rprintf statement in the Csnippet code?

Warning: "in 'table_lookup': extrapolating at ..."

First of all, thanks for maintaining such useful package. My experience with it has been positive, and the quantity and quality of documentation is quite good.

I have a question, though: when running trajectory matching, I'm always getting these specific warnings which are somewhat descriptive (i.e. i understand what's wrong), but a bit mysterious as well (i.e. i don't understand why it's happening).

For example, with a model with data/observables and covariates defined from t = 0 to t = 127 (with t0=0, on the call to the pomp() function), I get these warnings:

1: in 'table_lookup': extrapolating at 1.288334e+02
2: in 'table_lookup': extrapolating at 1.288334e+02
3: in 'table_lookup': extrapolating at 1.311213e+02
4: in 'table_lookup': extrapolating at 1.311213e+02
5: in 'table_lookup': extrapolating at 1.305028e+02
6: in 'table_lookup': extrapolating at 1.305028e+02
[...]

My interpretation of these warnings is that the code is trying to get data/observables or covariates for t > 127, and they are undefined at those points:

21 records of 4 observables, recorded from t = 0 to 127
zero time, t0 = 0
42 records of 16 covariates, recorded from t = 0 to 127

Is there an obvious reason why this behaviour (i.e. code trying to extrapolate outside the implicit time period defined by the observables+covariates) would occur? (perhaps I forgot to set up something?)

Should I be worried about these warnings? (it seems to me that the trajectory matching is working correctly otherwise, despite these warnings)

Once again, thanks for your work.

(If anything is required to better understand the source of the warnings [code, data, running diagnostic functions], please let me know.)

Issues with parallel computing

Although I understand that this is probably not really within the realm of your package I wonder whether you had similar problems - and maybe even have found a solution?! In your JSS paper (and supplementary code provided) you give examples how to parallelize computations. I have adapted this along the lines

library(doMC)
library(foreach)
registerDoMC()

pmcmc1 <- foreach(
    i=1:5,
    .inorder=FALSE,
    .packages="pomp",
    .combine=c,
    .options.multicore=list(set.seed=TRUE)
  ) %dopar% {
    pmcmc(pomp(sir, dprior = sir.dprior), start = para,
          Nmcmc = 600000, Np = 100, max.fail = Inf,
          proposal = mvn.diag.rw(c(Beta=0.02, gamma=0.01,mu =0.05, sigma = 0.0, 
          cov_primary=0.1, cov_child =0.1, cov_teen =0.1, rho1=0.005, rho2 = 0.005, 
          rho3=0.001,rho4=0.001,rho5=0.001, dispersion = 0)))
  }

for which I get an error message as described here:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17137

If I switch off prescheduling, i.e.

.options.multicore=list(set.seed=TRUE, preschedule = FALSE)

there is still the message about problems in fork.c - only the final message changes from

In mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed, :
all scheduled cores encountered errors in user code

In mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed, :
5 function calls resulted in an error

I do not find the information provided following the link above too helpful, particularly regarding the fact that the traces are about the size of 50MB (and not in the order of GB where problems were expected according to Brian Ripley). Have you encountered similar problems or could help me with any clues?