ssm's People

Contributors

ahwillia, bagibence, bantin, davidzoltowski, ekellbuch, em812, emdupre, gaoyuanjun, ghuckins, guyhwilson, jchutrue, jglaser2, schlagercollin, slinderman, wingillis, zashwood

ssm's Issues

Error when initializing an LDS with autoregressive observations

I'm getting an error when initializing an LDS with autoregressive observations. For example, this happens when running the notebook "1b Simple Linear Dynamical System" with the relevant lines in cell 6 changed to

# Create the model and initialize its parameters
lds = LDS(N, D, emissions="autoregressive")
lds.initialize(y)

The resulting error is

ValueError: all the input arrays must have same number of dimensions

It looks like the error traces to lines 514-515 of observations.py, where there is a zip around a potentially undefined input, although the problem seems to be more complicated than that.

StudentsT dof fixed point grew beyond bounds

It's possible this warning is being issued because my data is ill-conditioned. Going to look into it further, but here's what I'm seeing:

hmm = HMM(30, 10, observations='studentst')
hmm_lls = hmm.fit(list(pca_dict_concat.values()), 
                  method='em',
                  num_em_iters=200, verbose=True)

This warning repeats after roughly 28 iterations:

LP: -161679.7:  14%|█▍        | 28/200 [01:15<07:42,  2.69s/it]/home/jmarkow/dev/ssm/ssm/util.py:201: UserWarning: generalized_newton_studentst_dof fixed point grew beyond bounds [0.001,20].

random_rotation

Hi Scott,

Thanks for this great, easy-to-use library!
I may have found a small bug (only relevant for the .initialize methods, as far as I can see): if you meant the function random_rotation in ssm/ssm/util.py to output a random n x n rotation matrix (a rotation in n-dimensional space around a uniformly random axis, with a small random or specified angle), then the current function doesn't exactly do that. (You can see this by checking that the determinant of the output matrix is zero instead of one.) But this can be easily remedied by changing line 81 from out = np.zeros((n, n)) to out = np.eye(n). (The current output matrix first projects onto the hyperplane perpendicular to the random axis and then rotates the projection in that plane.)
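For concreteness, here is a minimal sketch of the remedied construction (a paraphrase of the described function, not the verbatim ssm code); the determinant check confirms the fix:

import numpy as np

def random_rotation_fixed(n, theta=0.1):
    # Embed a 2-D rotation in a random orthonormal basis. Starting from
    # np.eye(n) instead of np.zeros((n, n)) leaves the orthogonal complement
    # of the rotation plane untouched, so the result is a proper rotation.
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    out = np.eye(n)  # the proposed fix; was np.zeros((n, n))
    out[:2, :2] = rot
    q = np.linalg.qr(np.random.randn(n, n))[0]  # random orthonormal basis
    return q @ out @ q.T

print(np.linalg.det(random_rotation_fixed(5)))  # ~1.0 with the fix, 0.0 without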

EM Fit breaks when log-likelihood gets too large

When I do an EM fit I get this assertion error, I think because the log-likelihood becomes infinite:

Iteration 9. LL: 45156.53447347552
Iteration 10. LL: 48537.887453636096
Iteration 11. LL: 74795.85433022543

AssertionError                            Traceback (most recent call last)
<ipython-input-5-2d40179a6593> in <module>()
      1 N_iters = 25
      2 hmm = HMM(40, 5, observations="ar")
----> 3 hmm_lls = hmm.fit(copy.deepcopy(data[:6000]), method="em", num_em_iters=N_iters, verbose=True)

/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
     55             tags = [tags]
     56 
---> 57         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
     58 
     59     return wrapper

/core.py in fit(self, datas, inputs, masks, tags, method, initialize, **kwargs)
    190             self.initialize(datas, inputs=inputs, masks=masks, tags=tags)
    191 
--> 192         return self._fitting_methods[method](datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    193 
    194 

/core.py in _fit_em(self, datas, inputs, masks, tags, num_em_iters, verbose, **kwargs)
    174 
    175             # Store progress
--> 176             lls.append(self.log_probability(datas, inputs, masks, tags))
    177 
    178             if verbose:

/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
     55             tags = [tags]
     56 
---> 57         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
     58 
     59     return wrapper

/core.py in log_probability(self, datas, inputs, masks, tags)
    128             log_likes = self.observations.log_likelihoods(data, input, mask, tag)
    129             lp += hmm_normalizer(log_pi0, log_Ps, log_likes)
--> 130             assert np.isfinite(lp)
    131         return lp
    132 

AssertionError: 

out of memory issue

Running a simple HMM on 3M x 15 data points takes up ~140 GB of memory over the course of model iterations.

sampling with Poisson emissions gives ValueError

The following lines raise a ValueError most of the time:

true_lds = ssm.LDS(15, 10, dynamics="none", emissions="poisson")
x,y = true_lds.sample(1000)

The traceback looks like this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-27cbe882d496> in <module>
----> 1 x,y = true_lds.sample(1000)

/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py in sample(self, T, input, tag, prefix, with_noise)
   1018
   1019     def sample(self, T, input=None, tag=None, prefix=None, with_noise=True):
-> 1020         (_, x, y) = super().sample(T, input=input, tag=tag, prefix=prefix, with_noise=with_noise)
   1021         return (x, y)

/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py in sample(self, T, input, tag, prefix, with_noise)
    252         # Sample observations given latent states
    253         # TODO: sample in the loop above?
--> 254         y = self.emissions.sample(z, x, input=input, tag=tag)
    255         return z[pad:], x[pad:], y[pad:]
    256

/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/emissions.py in sample(self, z, x, input, tag)
    640         z = np.zeros_like(z, dtype=int) if self.single_subspace else z
    641         lambdas = self.mean(self.forward(x, input, tag))
--> 642         y = npr.poisson(lambdas[np.arange(T), z, :])
    643         return y
    644

~/anaconda3/lib/python3.7/site-packages/autograd/tracer.py in f_wrapped(*args, **kwargs)
     46             return new_box(ans, trace, node)
     47         else:
---> 48             return f_raw(*args, **kwargs)
     49     f_wrapped.fun = f_raw
     50     f_wrapped._is_autograd_primitive = True

mtrand.pyx in numpy.random.mtrand.RandomState.poisson()

common.pyx in numpy.random.common.disc()

common.pyx in numpy.random.common.discrete_broadcast_d()

common.pyx in numpy.random.common.check_array_constraint()

ValueError: lam value too large

I think this means that somewhere the lambda parameter passed to npr.poisson is too large. According to the documentation, "Because the output is limited to the range of the C long type, a ValueError is raised when lam is within 10 sigma of the maximum representable value."

I don't think this is a big issue per se -- it's just because the forward function generating the lambdas is unstable. But maybe the default initializations could be adjusted so that this doesn't happen.
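In the meantime, a hedged workaround (a sketch, not the ssm fix) is to cap the rates before sampling, so np.random.poisson never sees a value near the C long limit:

import numpy as np

lambdas = np.exp(40 * np.random.randn(5))  # stand-in for the unstable rates
LAM_MAX = 1e10                             # arbitrary large-but-safe cap (an assumption)
y = np.random.poisson(np.clip(lambdas, 0, LAM_MAX))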

Newton step fails for StudentsT DOF parameter

Reproducer:

import ssm
from ssm.util import SEED
import autograd.numpy as np
np.random.seed(seed=SEED)
true_lds = ssm.SLDS(15, 3, 10, transitions="nn_recurrent", dynamics="diagonal_t", emissions="studentst")
z, x, y = true_lds.sample(100)

new_lds = ssm.SLDS(15, 3, 10, transitions="nn_recurrent", dynamics="diagonal_t", emissions="studentst")
new_lds.fit(y, method="svi", variational_posterior="mf", num_init_iters=2)

Traceback:

Traceback (most recent call last):
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/quick_test.py", line 9, in <module>
    new_lds.fit(y, method="svi", variational_posterior="mf", num_init_iters=2)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 110, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 856, in fit
    self.initialize(datas, inputs, masks, tags, num_iters=num_init_iters)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 110, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 184, in initialize
    method="em", num_iters=num_iters)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 110, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/hmm.py", line 480, in fit
    return _fitting_methods[method](datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/hmm.py", line 449, in _fit_em
    self.observations.m_step(expectations, datas, inputs, masks, tags, **observations_mstep_kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/observations.py", line 1376, in m_step
    self._m_step_nu(expectations, datas, inputs, masks, tags)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/observations.py", line 1475, in _m_step_nu
    self._log_nus[k] = np.log(generalized_newton_studentst_dof(E_taus[k], E_logtaus[k]))
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/regression.py", line 462, in generalized_newton_studentst_dof
    assert a > 0 and b < 0, "generalized_newton_studentst_dof encountered invalid values of a,b"
AssertionError: generalized_newton_studentst_dof encountered invalid values of a,b

Evaluating log probability with Categorical observations

Firstly, thank you very much for your code base; it is really nice to use.

assert(data.dtype == int or data.dtype == bool) (line 393 of observations.py) throws an error when the Simple HMM notebook is altered to use "categorical" observations and the log probability is then evaluated, i.e. when running

true_hmm = HMM(5, 3, observations="categorical")
z, y = true_hmm.sample(1000)
true_ll = true_hmm.log_probability(y)

This is due to data = np.zeros((T+1, D)) (line 73 of core.py), which instantiates data as a numpy array of float64s.
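A minimal workaround (a sketch that sidesteps the dtype assertion rather than fixing the sampling code) is to cast the sampled data back to int before evaluating:

true_ll = true_hmm.log_probability(y.astype(int))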

installation issue: .pyx

Hi,

I get unknown file type '.pyx' (from 'ssm/messages.pyx') during installation.
I've tried upgrading setuptools, replacing .pyx with .c in the setup script, and installing Pyrex, but none of these solved the issue.
Could you share your conda list output?

Heejae

Error compiling cstats

When I try to install the newest version of ssm, I get the following error:

    error: unknown file type '.pyx' (from 'ssm/cstats.pyx')

I was able to install by adding this line to the beginning of the setup.py script:
cythoned_files = cythonize('ssm/*.pyx')
and changing the two lines referencing *.pyx files to reference *.c files instead, i.e.:
sources=["ssm/messages.c"].

However, I'm not familiar with compiling and installing Cython code, so this might not be the most elegant way to install ssm.
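A more standard route, assuming Cython is available at build time, is to let cythonize build the extension list directly in setup.py, so setuptools never sees a bare .pyx source (a sketch, not the project's actual setup script):

from setuptools import setup
from Cython.Build import cythonize

setup(
    name="ssm",
    ext_modules=cythonize("ssm/*.pyx"),  # compiles .pyx -> .c and registers the extensions
)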

StudentsTEmissions log-likelihoods are the wrong shape

Fitting an SLDS with StudentsT Emissions fails because of a shape mismatch:

import ssm
import autograd.numpy as np
np.random.seed(seed=123457)
true_lds = ssm.SLDS(15, 3, 10, transitions="standard", dynamics="gaussian", emissions="studentst")
z, x, y = true_lds.sample(100)

new_lds = ssm.SLDS(15, 3, 10, transitions="standard", dynamics="gaussian", emissions="studentst")
new_lds.fit(y, method="svi", variational_posterior="mf")

The mismatch happens in lds.py:

Traceback (most recent call last):
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/quick_test.py", line 8, in <module>
    new_lds.fit(y, method="svi", variational_posterior="mf")
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 862, in fit
    elbos = _fitting_methods[method](posterior, datas, inputs, masks, tags, learning=True, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 422, in _fit_svi
    elbos = [-_objective(params, 0) * T]
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 412, in _objective
    obj = self.elbo(variational_posterior, datas, inputs, masks, tags)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 141, in wrapper
    return f(self, arg0, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 312, in elbo
    log_likes += self.emissions.log_likelihoods(data, input, mask, tag, x)
ValueError: operands could not be broadcast together with shapes (100,3) (100,15) (100,3) 

Should be straightforward to fix.

Feature request: Convergence detection

I would also like some version of "detecting convergence", in other words a check of whether the ELBO has stabilized. The reason is that I am comparing run times with another algorithm that has a convergence condition, and to be fair to your algorithm code I'll have to fine-tune the number of iterations for every problem set.

This may be close to a duplicate of "early stopping", but I do not need it to be a function of any test set.
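For reference, a minimal sketch of the kind of check I have in mind, applied to the per-iteration log probabilities that fit already returns (the thresholds are arbitrary placeholders, not ssm defaults):

def has_converged(lls, tol=1e-4, window=5):
    # Declare convergence when the relative change in the objective over
    # the last `window` iterations falls below `tol`.
    if len(lls) < window + 1:
        return False
    prev, curr = lls[-1 - window], lls[-1]
    return abs(curr - prev) < tol * (abs(prev) + 1e-8)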

P.S. got the code running!

Fitting fails with AR emissions

import ssm
true_lds = ssm.LDS(15, 10, dynamics="none", emissions="ar")
x, y = true_lds.sample(1000)

lds = ssm.LDS(15, 10, dynamics="none", emissions="ar")
lds.fit(y, method="svi", variational_posterior="mf")

fails because of a dimension mismatch when trying to initialize the ARHMM using KMeans. In the case of an LDS (as opposed to an SLDS) the number of clusters passed to KMeans is just 1, so it's pretty much a no-op anyway.

It's line 111 (observations.py) that fails, because the dimension of data is wrong -- it's T x T x D instead of T x D.

Seems like the cause might be here:

xs = [self.emissions.invert(data, input, mask, tag)
      for data, input, mask, tag in zip(datas, inputs, masks, tags)]

This outputs a list where each entry is T x T x D.

Put autograd in install_requires?

Easy to fix, but I came across an import error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-4-62c6107d478d> in <module>()
      8 from sklearn.mixture import BayesianGaussianMixture
      9 from sklearn.decomposition import PCA
---> 10 from ssm.models import SLDS, LDS
     11 from scipy import linalg
     12 from copy import deepcopy

~/python_repos/ssm/ssm/models.py in <module>()
----> 1 from ssm.core import _HMM, _LDS, _SwitchingLDS
      2 
      3 from ssm.init_state_distns import InitialStateDistribution
      4 
      5 from ssm.transitions import \

~/python_repos/ssm/ssm/core.py in <module>()
      3 from functools import partial
      4 
----> 5 import autograd.numpy as np
      6 import autograd.numpy.random as npr
      7 from autograd.scipy.misc import logsumexp

ModuleNotFoundError: No module named 'autograd'

Bugs in SLDS (log transition matrices shape error)

Hi, thanks for this great work in ssm package!

However, when I was running SLDS notebook, the code broke down here:
q_mf_elbos = slds.fit(q_mf, y_masked, masks=mask, num_iters=1000, initialize=False).

It threw an AssertionError. After inspecting the code, I found that one line is commented out in the StationaryTransitions class:

def log_transition_matrices(self, data, input, mask, tag):
    T = data.shape[0]
    log_Ps = self.log_Ps - logsumexp(self.log_Ps, axis=1, keepdims=True)
    # return np.tile(log_Ps[None, :, :], (T-1, 1, 1))
    return log_Ps[None, :, :]

This makes log_Ps have shape (1, K, K), which does not match the assertion in messages.pyx that it should have shape (T, K, K).

Uncommenting that line and returning the tiled array fixes the issue, but I don't know whether this would affect other functions of the package.
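Concretely, the fixed method would read as follows (the quoted function with the suggested line restored; whether this breaks anything else is the open question):

def log_transition_matrices(self, data, input, mask, tag):
    T = data.shape[0]
    log_Ps = self.log_Ps - logsumexp(self.log_Ps, axis=1, keepdims=True)
    # Tile the stationary matrix across all transitions so the shape
    # matches what messages.pyx asserts.
    return np.tile(log_Ps[None, :, :], (T-1, 1, 1))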

Add Hidden Semi Markov Model (HSMM) with negative binomial observations

Semi-Markov models explicitly model the duration distribution over time spent in each discrete state.
This introduces a non-Markovian dependency, but it's not arbitrarily complex. We can include semi-Markov models by extending the state space of the HMM. This can be done efficiently for certain classes of duration distributions, like the negative binomial distribution with an integer shape parameter. See Matt Johnson's thesis, for example.
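As a sketch of the state-space extension (a hypothetical helper, not ssm API): a state with a negative binomial NB(r, p) duration can be expanded into r sub-states, each holding with probability p, so the total dwell time is a sum of r geometrics, i.e. negative binomial:

import numpy as np

def embed_nb_hsmm(P, rs, ps):
    # Expand a K-state HSMM transition matrix P (zero diagonal) with
    # NB(r_k, p_k) durations into an ordinary HMM transition matrix.
    offsets = np.concatenate([[0], np.cumsum(rs)]).astype(int)
    P_big = np.zeros((offsets[-1], offsets[-1]))
    for k in range(len(rs)):
        for i in range(rs[k]):
            s = offsets[k] + i
            P_big[s, s] = ps[k]                  # dwell in sub-state
            if i < rs[k] - 1:
                P_big[s, s + 1] = 1 - ps[k]      # advance within state k
            else:
                for j in range(len(rs)):         # exit to another HSMM state
                    P_big[s, offsets[j]] += (1 - ps[k]) * P[k, j]
    return P_big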

small bug in emissions.py

Hey @slinderman, there's a tiny bug in the permute function of the _OrthogonalLinearEmissions class on line 268 of emissions.py:

self._Ms = self._Bs[perm]

should be

self._Ms = self._Ms[perm]

This is obviously an easy fix; do you have a preferred way for me to address small bugs like this in the future (e.g. through a pull request? I've never done one before)?

NaNs in gradients when fitting RSLDS

I sometimes get errors (seemingly at random, depending on the initialization) when trying to fit an RSLDS. The following lines reproduce the error:

import ssm
import autograd.numpy as np
np.random.seed(seed=123457)
true_lds = ssm.SLDS(15, 3, 10, transitions="rbf_recurrent", dynamics="gaussian", emissions="gaussian")
z, x, y = true_lds.sample(100)

new_lds = ssm.SLDS(15, 3, 10, transitions="rbf_recurrent", dynamics="gaussian", emissions="gaussian")
new_lds.fit(y, method="svi", variational_posterior="mf")

The traceback looks like this:

Traceback (most recent call last):
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/quick_test.py", line 8, in <module>
    new_lds.fit(y, method="svi", variational_posterior="mf")
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 862, in fit
    elbos = _fitting_methods[method](posterior, datas, inputs, masks, tags, learning=True, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 430, in _fit_svi
    params, val, g, state = step(value_and_grad(_objective), params, itr, state)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/optimizers.py", line 44, in _step
    step(_value_and_grad, _x, itr, state=state, *args, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/optimizers.py", line 75, in adam_step
    val, g = value_and_grad(x, itr)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/optimizers.py", line 41, in _value_and_grad
    v, g = value_and_grad(unflatten(x), i)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/wrap_util.py", line 20, in nary_f
    return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/differential_operators.py", line 140, in value_and_grad
    return ans, vjp(vspace(ans).ones())
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/core.py", line 14, in vjp
    def vjp(g): return backward_pass(g, end_node)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/core.py", line 21, in backward_pass
    ingrads = node.vjp(outgrad[0])
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/core.py", line 78, in <lambda>
    return lambda g: (vjp_0(g), vjp_1(g))
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/scipy/linalg.py", line 29, in vjp
    v = al2d(solve_triangular(a, g, trans=_flip(a, trans), lower=lower))
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/autograd/tracer.py", line 48, in f_wrapped
    return f_raw(*args, **kwargs)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/scipy/linalg/basic.py", line 336, in solve_triangular
    b1 = _asarray_validated(b, check_finite=check_finite)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/scipy/_lib/_util.py", line 239, in _asarray_validated
    a = toarray(a)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py", line 496, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

I also get a message that L-BFGS-B ran out of iterations before converging during the initial ARHMM fitting, which might be related.

Make classes for public API models

Currently methods defined in models.py serve as wrappers that return the "private" classes implemented in core.py. It would be nice to have public classes instead of standalone methods. Some example skeleton code:

from ssm.core import _HMM
class HMM(_HMM):
    def __init__(self, args):
        # Check that arguments are valid, set up transition & observation
        # models, etc.
        super().__init__(args)

That way users such as myself could write subclasses that inherit from the public HMM object while implementing additional methods in an application-specific manner. More example skeleton code:

from ssm.models import HMM
class CustomHMM(HMM):
    def __init__(self, args):
        # Convert data into format expected by public HMM object,
        # do additional checks, then forward args to superclass
        super().__init__(args)
    
    # Example additional method not implemented either in
    # the public HMM API or the private _HMM object
    def fit_ext(self, ext_args):
        # Learn a mapping from the internal state space to an external
        # space, e.g. the position of the animal
        pass

Early stopping?

I was poking around and I'm not sure if this is already supported, but I was wondering if I could use the fit method like so,

train_data, test_data = split(all_the_data)
hmm = HMM(25, 50, observations='gaussian')
hmm_lls = hmm.fit(train_data, method='em', num_em_iters=200, verbose=True, test_data=test_data)

and have the fit method compute the likelihood on test_data, stopping the fit once the test likelihood no longer improves. See the section of the xgboost docs on early stopping as an example. At the very least, it would save some time.

interface to arbitrary data

Hi Scott! I'm going to use your older pyhsmm in the meantime, but it would be great if you could include a natural interface to working with real data. What I mean is, rather than just having your objects built around generating data from a given model and then fitting it, having some kind of sklearn style interface to estimate a model from a timeseries.

Take care,
Kam

scipy.misc logsumexp error

I'm trying to run the ssm code and keep getting errors that look like this:
Traceback (most recent call last):
File "ssm.py", line 18, in
from ssm.models import HMM
File "/tigress/vcorbit/ssm/ssm/models.py", line 1, in
from ssm.core import BaseHMM, BaseHSMM, BaseLDS, BaseSwitchingLDS
File "/tigress/vcorbit/ssm/ssm/core.py", line 8, in
from autograd.scipy.misc import logsumexp
ImportError: cannot import name 'logsumexp'

I googled and it looks like logsumexp is actually part of scipy.special, not scipy.misc, so I tried editing that in the source code, but then it just throws another error that looks like this except for a different file.

We had this code working a few months ago; was there possibly an update that broke things?

Thanks!

Sampling and Input Driven Transitions

Hi,

Thanks for your help with the last issue.

I modified your Simple HMM notebook to use your input driven transitions class. There is an issue with the sample function because of the padding that you use in the case where input is not provided. In particular, if you run:

T = 100
K = 2
D = 1
M = 3
true_hmm = HMM(K, D, M, observations="gaussian", transitions="inputdriven")

input = npr.randn(T, M)

z, y = true_hmm.sample(T, input=input)

you get an assertion error.

NameError when initializing a simple LDS

Running the following example code:
from ssm.models import LDS

# Set the parameters of the HMM
T = 200  # number of time bins
K = 5    # number of discrete states
D = 2    # number of latent dimensions
N = 10   # number of observed dimensions

# Make an LDS with the somewhat interesting dynamics parameters
true_lds = LDS(N, D, emissions="gaussian")
results in the following error:


NameError Traceback (most recent call last)
in
8 # Make an LDS with the somewhat interesting dynamics parameters
9
---> 10 true_lds = LDS(N, D, emissions="gaussian")

[DIR]/ssm/ssm/models.py in LDS(N, D, M, dynamics, dynamics_kwargs, hierarchical_dynamics_tags, emissions, emission_kwargs, hierarchical_emission_tags, **kwargs)
393
394 # Make the HMM
--> 395 return BaseLDS(N, D, M, dynamics_distn, emission_distn)
396
397

[DIR]/ssm/ssm/core.py in __init__(self, N, D, M, dynamics, emissions)
1007 init_state_distn = InitialStateDistribution(1, D, M)
1008 transitions = StationaryTransitions(1, D, M)
-> 1009 super(_LDS, self).__init__(N, 1, D, M, init_state_distn, transitions, dynamics, emissions)
1010
1011 @ensure_slds_args_not_none

NameError: name '_LDS' is not defined

Bug in Students-T observations

When I run an HMM with StudentsT observations, I get the error:

AttributeError: 'StudentsTObservations' object has no attribute '_log_likelihoods'

StudentsTObservations inherits from _Observations, where the method is defined as log_likelihoods, without the leading underscore.

This only affects 'StudentsTObservations' because other observations define their own m_step methods.

Fitting LDS with no dynamics fails

The following combination of settings fails for fitting an LDS:
dynamics = none, emissions = studentst, method = svi, posterior = mf

This is the traceback:

Traceback (most recent call last):
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/tests/test_lds.py", line 480, in test_lds_sample_and_fit
    fit_lds.fit(y, method=method, variational_posterior=posterior, alpha=0)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 857, in fit
    self.initialize(datas, inputs, masks, tags)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/lds.py", line 171, in initialize
    self.emissions.initialize(datas, inputs, masks, tags)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/emissions.py", line 495, in initialize
    pca = self._initialize_with_pca(datas, inputs=inputs, masks=masks, tags=tags)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py", line 108, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/emissions.py", line 190, in _initialize_with_pca
    pca, xs, ll = pca_with_imputation(self.D, resids, masks, num_iters=num_iters)
  File "/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/preprocessing.py", line 36, in pca_with_imputation
    x = pca.fit_transform(data)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/sklearn/decomposition/pca.py", line 360, in fit_transform
    U, S, V = self._fit(X)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/sklearn/decomposition/pca.py", line 382, in _fit
    copy=self.copy)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 542, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 49, in _assert_all_finite
    if is_float and (np.isfinite(_safe_accumulator_op(np.sum, X))):
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/sklearn/utils/extmath.py", line 688, in _safe_accumulator_op
    result = op(x, *args, **kwargs)
  File "<__array_function__ internals>", line 6, in sum
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2182, in sum
    initial=initial, where=where)
  File "/Users/Bantin/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
FloatingPointError: invalid value encountered in reduce

Alternatives to autograd

This is not a real issue. I am just wondering if you are considering alternatives, given that Autograd is not being developed anymore. I've seen the JAX branch, but I'm guessing you are not willing to make that move yet.

I am raising this point in part because I've noticed how slow Autograd can be in comparison to alternatives like PyTorch. Especially when using NeuralRecurrentTransitions, differentiating through the NN seems to be the main bottleneck, at least in my use case.

I was able to achieve a severalfold speedup by moving that part of the code to PyTorch, although I can see that PyTorch may be overkill and could hurt the flexibility of the framework.

HMM.filter fails after switch to Numba

Trying to call HMM.filter gives an error because Numba is not able to compile the function in nopython mode.

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-9-ffcb976fe52f> in <module>
      1 # get a random reach
      2 test_trial_spikes = np.random.choice(train_datas)
----> 3 posterior = simple_hmm.filter(test_trial_spikes)

/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/util.py in wrapper(self, data, input, mask, tag, **kwargs)
    155 
    156         mask = np.ones_like(data, dtype=bool) if mask is None else mask
--> 157         return f(self, data, input=input, mask=mask, tag=tag, **kwargs)
    158     return wrapper
    159 

/Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/hmm.py in filter(self, data, input, mask, tag)
    264         Ps = self.transitions.transition_matrices(data, input, mask, tag)
    265         log_likes = self.observations.log_likelihoods(data, input, mask, tag)
--> 266         return hmm_filter(pi0, Ps, log_likes)
    267 
    268     @ensure_args_not_none

~/anaconda3/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
    374                 e.patch_message(msg)
    375 
--> 376             error_rewrite(e, 'typing')
    377         except errors.UnsupportedError as e:
    378             # Something unsupported is present in the user code, add help info

~/anaconda3/lib/python3.7/site-packages/numba/dispatcher.py in error_rewrite(e, issue_type)
    341                 raise e
    342             else:
--> 343                 reraise(type(e), e, None)
    344 
    345         argtypes = []

~/anaconda3/lib/python3.7/site-packages/numba/six.py in reraise(tp, value, tb)
    656             value = tp()
    657         if value.__traceback__ is not tb:
--> 658             raise value.with_traceback(tb)
    659         raise value
    660 

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.typeinfer.CallConstraint object at 0x7fde6b4fb990>.
got an unexpected keyword argument 'axis'
[1] During: resolving callee type: type(CPUDispatcher(<function logsumexp at 0x7fde90a72830>))
[2] During: typing of call at /Users/bantin/Documents/Linderman-Shenoy/ssm/ssm/messages.py (75)

Enable logging at debug level for details.

File "../ssm/messages.py", line 75:
def hmm_filter(pi0, Ps, ll):
    <source elided>
    # Predict forward with the transition matrix
    pz_tt = np.exp(alphas - logsumexp(alphas, axis=1, keepdims=True))
    ^

This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.

To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/latest/reference/numpysupported.html

For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile

If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new

I will look into this. We should also beef up the regression tests to avoid this in the future.

Lin alg error for 1D data

Hi, I am an undergrad working with Arunesh at LIINC and I ran into an issue running HMM on 1D time series data. Each time series varies in length but we roughly have 50 videos x 30000 frames x 1 "channel". Below is a sample script that replicates this behavior and the error that I get.

It seems that np.linalg.cholesky expects a square matrix but does not receive one in this case. Is there any way to run an HMM when D=1?

import numpy as np
from ssm.models import HMM

# HMM parameters
K = 5
D = 1
T = 30000

def generate(K, T):
    states_set = np.arange(K)
    states = np.zeros(T)
    x = np.zeros(T)

    # assume uniform initial distribution on states
    states[0] = np.random.choice(states_set)  # was np.random.choice(states), which always picks 0

    # uniform random transition matrix
    transitions = []
    for state in range(K):
        probabilities = np.random.random_sample((K,))
        probabilities = probabilities / np.sum(probabilities)
        transitions.append(probabilities)
    transitions = np.array(transitions)

    # generate states
    for t in range(1,T):
        prev = int(states[t-1])
        states[t] = np.random.choice(a=states_set,p=transitions[prev])

    # normal emissions distribution
    emissions = []
    for state in range(K):
        emissions.append(np.random.normal)

    for t in range(T):
        curr_state = int(states[t])
        x_t = emissions[curr_state](curr_state, curr_state / 10)
        x[t] = x_t  # was x[0] = x_t, which only ever wrote the first sample

    return x.reshape(T,1)

# 5 x 30000 x 1
x = []
for _ in range(5):
    x.append(generate(K,T))

# run HMM on data
arhmm = HMM(K, D)
arhmm.fit(x,method="em")
 File "hmm.py", line 50, in <module>
    arhmm.fit(x,method="em")
  File "/home/kevin/microssm/ssm/ssm/util.py", line 109, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/home/kevin/microssm/ssm/ssm/core.py", line 369, in fit
    self.initialize(datas, inputs=inputs, masks=masks, tags=tags)
  File "/home/kevin/microssm/ssm/ssm/util.py", line 109, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/home/kevin/microssm/ssm/ssm/core.py", line 56, in initialize
    self.observations.initialize(datas, inputs=inputs, masks=masks, tags=tags)
  File "/home/kevin/microssm/ssm/ssm/util.py", line 109, in wrapper
    return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
  File "/home/kevin/microssm/ssm/ssm/observations.py", line 105, in initialize
    self._sqrt_Sigmas = np.linalg.cholesky(Sigmas + 1e-8 * np.eye(self.D))
  File "/home/kevin/miniconda/envs/micro-ssm/lib/python3.6/site-packages/autograd/tracer.py", line 48, in f_wrapped
    return f_raw(*args, **kwargs)
  File "/home/kevin/miniconda/envs/micro-ssm/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 730, in cholesky
    _assertNdSquareness(a)
  File "/home/kevin/miniconda/envs/micro-ssm/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 215, in _assertNdSquareness
    raise LinAlgError('Last 2 dimensions of the array must be square')
numpy.linalg.linalg.LinAlgError: Last 2 dimensions of the array must be square

sampling with AR observations overflows

The following lines cause an overflow:

true_lds = ssm.LDS(15, 10, dynamics="none", emissions="ar")
x, y = true_lds.sample(3000)

My immediate guess is that, because in this case we are essentially just iterating a discrete-time LDS, if the spectral radius of the A matrix (in the AR observations) is greater than 1, the system is unstable and blows up. It should be an easy fix to initialize the matrix with smaller eigenvalues.
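A sketch of that fix (assuming access to the dynamics matrix; the rescaling is generic, not ssm-specific):

import numpy as np

def stabilize(A, radius=0.95):
    # Shrink A until its spectral radius is below `radius` (< 1), so the
    # iterated linear map no longer blows up.
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    return A if rho <= radius else A * (radius / rho)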

Errors on installation

Hi @slinderman,
Apologies, I am getting errors on the install:

Requirement already satisfied: six in /home/andrewcz/miniconda3/envs/myenv1/lib/python3.7/site-packages (from cycler>=0.10->matplotlib->ssm==0.0.1) (1.12.0)
Installing collected packages: ssm
Running setup.py develop for ssm
ERROR: Complete output from command /home/andrewcz/miniconda3/envs/myenv1/bin/python -c 'import setuptools, tokenize;__file__='"'"'/home/andrewcz/ssm/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps:
ERROR: USE_OPENMP False
running develop
running egg_info
writing ssm.egg-info/PKG-INFO
writing dependency_links to ssm.egg-info/dependency_links.txt
writing requirements to ssm.egg-info/requires.txt
writing top-level names to ssm.egg-info/top_level.txt
reading manifest file 'ssm.egg-info/SOURCES.txt'
writing manifest file 'ssm.egg-info/SOURCES.txt'
running build_ext
building 'ssm.messages' extension
gcc -pthread -B /home/andrewcz/miniconda3/envs/myenv1/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/andrewcz/.local/lib/python3.7/site-packages/numpy/core/include -I/home/andrewcz/miniconda3/envs/myenv1/include/python3.7m -I/home/andrewcz/miniconda3/envs/myenv1/include/python3.7m -c ssm/messages.cpp -o build/temp.linux-x86_64-3.7/ssm/messages.o -O3
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/andrewcz/.local/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
from /home/andrewcz/.local/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/andrewcz/.local/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from ssm/messages.cpp:629:
/home/andrewcz/.local/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
g++ -pthread -shared -B /home/andrewcz/miniconda3/envs/myenv1/compiler_compat -L/home/andrewcz/miniconda3/envs/myenv1/lib -Wl,-rpath=/home/andrewcz/miniconda3/envs/myenv1/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/ssm/messages.o -o build/lib.linux-x86_64-3.7/ssm/messages.cpython-37m-x86_64-linux-gnu.so
/home/andrewcz/miniconda3/envs/myenv1/compiler_compat/ld: build/temp.linux-x86_64-3.7/ssm/messages.o: unable to initialize decompress status for section .debug_info
/home/andrewcz/miniconda3/envs/myenv1/compiler_compat/ld: build/temp.linux-x86_64-3.7/ssm/messages.o: unable to initialize decompress status for section .debug_info
/home/andrewcz/miniconda3/envs/myenv1/compiler_compat/ld: build/temp.linux-x86_64-3.7/ssm/messages.o: unable to initialize decompress status for section .debug_info
/home/andrewcz/miniconda3/envs/myenv1/compiler_compat/ld: build/temp.linux-x86_64-3.7/ssm/messages.o: unable to initialize decompress status for section .debug_info
build/temp.linux-x86_64-3.7/ssm/messages.o: file not recognized: file format not recognized
collect2: error: ld returned 1 exit status
error: command 'g++' failed with exit status 1
----------------------------------------
ERROR: Command "/home/andrewcz/miniconda3/envs/myenv1/bin/python -c 'import setuptools, tokenize;file='"'"'/home/andrewcz/ssm/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps" failed with error code 1 in /home/andrewcz/ssm/

M-step update for sticky transitions

Hi Scott,

It was great meeting you at SfN. Once again: we are massive fans of ssm, so I want to thank you and everyone who has worked on it for producing such a nice code base.

I wanted to ask whether the M-step update for the sticky transitions is correct when your alpha parameter != 1. In particular, I just went through the math and believe I got a different update (where I am now using alpha as the Dirichlet distribution parameter):
[Screenshot of the derivation; the resulting MAP update is P_kj ∝ Σ_t E[z_t = k, z_{t+1} = j] + κ δ_kj + (α − 1).]

Specifically, I think the M-Step for the Sticky transitions should be:

        expected_joints = sum([np.sum(Ezzp1, axis=0) for _, Ezzp1, _ in expectations]) + 1e-8
        expected_joints += self.kappa * np.eye(self.K) + (self.alpha-1) * np.ones((self.K, self.K))
        P = expected_joints / expected_joints.sum(axis=1, keepdims=True)
        self.log_Ps = np.log(P)

Numerical stability?

Thank you for building this great package.

I have more of a question rather than an issue:
It's stated in Rabiner's review of hidden Markov models that the naive forward-backward message-passing algorithms are numerically unstable; are you able to avoid this problem just by taking logs first?
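For context, a minimal generic sketch of the log-space forward recursion that avoids the underflow Rabiner describes (not the ssm implementation, which per the tracebacks above also combines terms with logsumexp):

import numpy as np
from scipy.special import logsumexp

def log_forward(log_pi0, log_P, log_likes):
    # log_pi0: (K,), log_P: (K, K), log_likes: (T, K).
    # Working with log alpha_t and combining terms via logsumexp avoids the
    # underflow of the naive probability-space recursion.
    T, K = log_likes.shape
    log_alphas = np.zeros((T, K))
    log_alphas[0] = log_pi0 + log_likes[0]
    for t in range(1, T):
        log_alphas[t] = logsumexp(log_alphas[t - 1][:, None] + log_P, axis=0) + log_likes[t]
    return logsumexp(log_alphas[-1])  # log p(x_{1:T})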

ssm.py script

I have tried downloading and redownloading this package and I just cannot find an actual ssm.py script in any of the downloaded folders. Is it possible it somehow got removed from the download? I am able to import ssm in Python, so it seems to be downloading properly; I'm just looking for the actual script so I can learn how it works.

EM on HMM divide by zero error

Seeing this error when running more than roughly 20 iterations for a Gaussian-emission HMM. The fit with <20 iterations looks somewhat reasonable, and the features being fed to the model have been Cholesky-whitened.

>>> print(lst[0][:, :10].shape)
(11818, 10)
>>> hmm = HMM(10, 10, observations='gaussian')
>>> lls = hmm.fit(lst[0][:, :10], method="em", num_em_iters=100, verbose=True)
Iteration 0.  LL: -93230.4
Iteration 1.  LL: -91421.0
Iteration 2.  LL: -90526.6
Iteration 3.  LL: -89963.6
Iteration 4.  LL: -89623.7
Iteration 5.  LL: -89415.4
Iteration 6.  LL: -89252.0
Iteration 7.  LL: -89134.4
Iteration 8.  LL: -89021.7
Iteration 9.  LL: -88901.5
Iteration 10.  LL: -88773.6
Iteration 11.  LL: -88664.0
Iteration 12.  LL: -88571.6
Iteration 13.  LL: -88487.1
Iteration 14.  LL: -88405.6
Iteration 15.  LL: -88310.8
Iteration 16.  LL: -88164.2
Iteration 17.  LL: -87965.4
Iteration 18.  LL: -87695.1
Iteration 19.  LL: -87387.7
Iteration 20.  LL: -87044.9
Iteration 21.  LL: -86613.1
Iteration 22.  LL: -86215.6
Iteration 23.  LL: -85858.3
Iteration 24.  LL: -85345.8
Iteration 25.  LL: -84206.9
Iteration 26.  LL: -82743.6
Iteration 27.  LL: -80948.3
Iteration 28.  LL: -78257.4
Iteration 29.  LL: -74076.1
Iteration 30.  LL: -67966.6
Iteration 31.  LL: -59290.6
Iteration 32.  LL: -50023.7
Iteration 33.  LL: -26831.8
/home/jm447/miniconda3/envs/moseq2/lib/python3.6/site-packages/autograd/tracer.py:48: RuntimeWarning: divide by zero encountered in log
  return f_raw(*args, **kwargs)
/home/jm447/python_repos/ssm/ssm/observations.py:108: RuntimeWarning: divide by zero encountered in true_divide
  (np.log(2 * np.pi * sigmas) + (data[:, None, :] - mus)**2 / sigmas)
---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)
<ipython-input-1138-9222da7c2526> in <module>()
      1 hmm = HMM(10, 10, observations='gaussian')
----> 2 lls = hmm.fit(lst[0][:, :10], method="em", num_em_iters=100, verbose=True)

~/python_repos/ssm/ssm/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
    102             tags = [tags]
    103 
--> 104         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    105 
    106     return wrapper

~/python_repos/ssm/ssm/core.py in fit(self, datas, inputs, masks, tags, method, initialize, **kwargs)
    246             self.initialize(datas, inputs=inputs, masks=masks, tags=tags)
    247 
--> 248         return self._fitting_methods[method](datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    249 
    250 

~/python_repos/ssm/ssm/core.py in _fit_em(self, datas, inputs, masks, tags, num_em_iters, verbose, debug, **kwargs)
    230 
    231             # Store progress
--> 232             lls.append(self.log_probability(datas, inputs, masks, tags))
    233 
    234             if verbose:

~/python_repos/ssm/ssm/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
    102             tags = [tags]
    103 
--> 104         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    105 
    106     return wrapper

~/python_repos/ssm/ssm/core.py in log_probability(self, datas, inputs, masks, tags)
    150             log_pi0 = self.init_state_distn.log_initial_state_distn(data, input, mask, tag)
    151             log_Ps = self.transitions.log_transition_matrices(data, input, mask, tag)
--> 152             log_likes = self.observations.log_likelihoods(data, input, mask, tag)
    153             lp += hmm_normalizer(log_pi0, log_Ps, log_likes)
    154             assert np.isfinite(lp)

~/python_repos/ssm/ssm/observations.py in log_likelihoods(self, data, input, mask, tag)
    106         mask = np.ones_like(data, dtype=bool) if mask is None else mask
    107         return -0.5 * np.sum(
--> 108             (np.log(2 * np.pi * sigmas) + (data[:, None, :] - mus)**2 / sigmas)
    109             * mask[:, None, :], axis=2)
    110 

FloatingPointError: invalid value encountered in true_divide

Input Driven Transitions - learning Ws

Hi Scott,

I was just wondering whether it is intentional that the weights Ws for the InputDrivenTransitions class are not updated during fitting. In particular, if you modify your first notebook to print the Ws parameters before and after fitting, you can see that they are the same. This is because the M-step for InputDrivenTransitions is inherited from the StickyTransitions class. I suspect this should not be the case, and that Ws should be learned (a sketch of a possible fix follows the reproducer below). Thanks.

### Set the parameters of the HMM
T = 100  # number of time bins
K = 2    # number of discrete states
D = 1    # data dimension
M = 3    # input dimension

# Make an HMM
true_hmm = HMM(K, D, M, observations="categorical", transitions="inputdriven", observation_kwargs={"C": 3})

# Sampled input
input = npr.randn(T, M)

# Sample some data from the HMM
z, y = true_hmm.sample(T, input=input)

# True log-likelihood
true_ll = true_hmm.log_probability(y, inputs=input[1:])

# New HMM
N_iters = 500
hmm = HMM(K, D, M, observations="categorical", transitions="inputdriven", observation_kwargs={"C": 3})

# Print Ws before fitting
hmm.params[1][1]

# Fit
hmm_lls = hmm.fit(y, inputs=input[1:], method="em", num_em_iters=N_iters, verbose=True)

# Print Ws after fitting
hmm.params[1][1]
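A self-contained sketch of what learning Ws could look like: a gradient-based M-step on the expected log transition probabilities from the E-step. All names here are hypothetical illustrations, not ssm API:

import autograd.numpy as np
from autograd import grad
from autograd.scipy.special import logsumexp

def m_step_Ws(Ws, log_Ps, inputs, Ezzp1, lr=1e-2, num_steps=50):
    # Hypothetical sketch (not ssm API): gradient ascent on the expected
    # log transition probabilities, so that Ws is actually updated.
    # Ws: (K, M) input weights, log_Ps: (K, K) baseline log transitions,
    # inputs: (T, M), Ezzp1: (T-1, K, K) expected joint state posteriors.
    def objective(Ws):
        bias = np.dot(inputs[1:], Ws.T)                     # (T-1, K)
        log_trans = log_Ps[None, :, :] + bias[:, None, :]   # bias on destination state
        log_trans = log_trans - logsumexp(log_trans, axis=2, keepdims=True)
        return np.sum(Ezzp1 * log_trans)
    g = grad(objective)
    for _ in range(num_steps):
        Ws = Ws + lr * g(Ws)
    return Ws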

hmm forward pass error

ssm.messages.forward_pass() throws an assertion error without any additional info. Minimal working example:

import numpy as np
from ssm import HMM

model_kwargs = {'transitions': 'standard', 'observations': 'ar'}
K = 2    # discrete states
D = 8    # data dim
M = 0    # inputs dim
T = 100  # trial length
N = 10   # num trials

arhmm = HMM(K=K, D=D, M=M, **model_kwargs)
sim_data = [np.random.rand(T, D) for _ in range(N)]
logprobs = arhmm.fit(datas=sim_data, method='em', num_em_iters=50)

Traceback:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-20-8955b9e168f5> in <module>()
     10 sim_data = [np.random.rand(T, D) for _ in range(N)]
     11 
---> 12 logprobs = arhmm.fit(datas=sim_data, method='em', num_em_iters=50)

~/Dropbox/github/ssm/ssm/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
    107             tags = [tags]
    108 
--> 109         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    110 
    111     return wrapper

~/Dropbox/github/ssm/ssm/hmm.py in fit(self, datas, inputs, masks, tags, method, initialize, **kwargs)
    479             self.initialize(datas, inputs=inputs, masks=masks, tags=tags)
    480 
--> 481         return _fitting_methods[method](datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    482 
    483 

~/Dropbox/github/ssm/ssm/hmm.py in _fit_em(self, datas, inputs, masks, tags, num_em_iters, tolerance, init_state_mstep_kwargs, transitions_mstep_kwargs, observations_mstep_kwargs)
    435         M-step: analytical maximization of E_{p(z | x)} [log p(x, z; theta)].
    436         """
--> 437         lls = [self.log_probability(datas, inputs, masks, tags)]
    438 
    439         pbar = trange(num_em_iters)

~/Dropbox/github/ssm/ssm/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
    107             tags = [tags]
    108 
--> 109         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    110 
    111     return wrapper

~/Dropbox/github/ssm/ssm/hmm.py in log_probability(self, datas, inputs, masks, tags)
    301     @ensure_args_are_lists
    302     def log_probability(self, datas, inputs=None, masks=None, tags=None):
--> 303         return self.log_likelihood(datas, inputs, masks, tags) + self.log_prior()
    304 
    305     def expected_log_likelihood(

~/Dropbox/github/ssm/ssm/util.py in wrapper(self, datas, inputs, masks, tags, **kwargs)
    107             tags = [tags]
    108 
--> 109         return f(self, datas, inputs=inputs, masks=masks, tags=tags, **kwargs)
    110 
    111     return wrapper

~/Dropbox/github/ssm/ssm/hmm.py in log_likelihood(self, datas, inputs, masks, tags)
    295             log_Ps = self.transitions.log_transition_matrices(data, input, mask, tag)
    296             log_likes = self.observations.log_likelihoods(data, input, mask, tag)
--> 297             ll += hmm_normalizer(log_pi0, log_Ps, log_likes)
    298             assert np.isfinite(ll)
    299         return ll

~/anaconda3/lib/python3.6/site-packages/autograd/tracer.py in f_wrapped(*args, **kwargs)
     46             return new_box(ans, trace, node)
     47         else:
---> 48             return f_raw(*args, **kwargs)
     49     f_wrapped.fun = f_raw
     50     f_wrapped._is_autograd_primitive = True

~/Dropbox/github/ssm/ssm/primitives.py in hmm_normalizer(log_pi0, log_Ps, ll)
     27     ll = to_c(ll)
     28 
---> 29     forward_pass(log_pi0, log_Ps, ll, alphas)
     30     return logsumexp(alphas[-1])
     31 

~/Dropbox/github/ssm/ssm/messages.pyx in ssm.messages.forward_pass()

~/Dropbox/github/ssm/ssm/messages.pyx in ssm.messages.forward_pass()

AssertionError: 

Error when running Poisson SLDS notebook

After pulling the latest version, I got TypeError: _fit_em() got an unexpected keyword argument 'num_iters' when calling slds.initialize(y_masked, masks=mask) in the Poisson SLDS example notebook.

I think it's because of the removal of **kwargs, so I'm guessing similar errors might occur in other places as well.
