oseiskar / simdkalman Goto Github PK

View Code? Open in Web Editor NEW

175.0 13.0 36.0 167 KB

Python Kalman filters vectorized as Single Instruction, Multiple Data

Home Page: https://simdkalman.readthedocs.io/

License: MIT License

Python 99.76% Shell 0.24%

kalman-filter python

simdkalman's Introduction

SIMD Kalman

Fast Kalman filters in Python leveraging single-instruction multiple-data vectorization. That is, running n similar Kalman filters on n independent series of observations. Not to be confused with SIMD processor instructions.

import simdkalman

kf = simdkalman.KalmanFilter(
    state_transition = np.array([[1,1],[0,1]]),
    process_noise = np.diag([0.1, 0.01]),
    observation_model = np.array([[1,0]]),
    observation_noise = 1.0)

data = numpy.random.normal(size=(200, 1000))

# smooth and explain existing data
smoothed = kf.smooth(data)
# predict new data
pred = kf.predict(data, 15)

See examples/example.py for a more comprehensive example and ReadTheDocs for the full documentation. For the changelog, see releases page

According to examples/benchmark.py. This can be up to 100x faster than pykalman and 70x faster than filterpy when can be vectorized over many independent timeseries. Also in the non-vectorized case, it can be 2x faster.

Installation

pip install simdkalman

Development

Create virtualenv
- Python 2: virtualenv venvs/python2
- Python 3: python3 -m venv venvs/python3
Activate virtualenv: source venvs/pythonNNN/bin/activate
Install locally pip install -e .[dev,test,docs]
./run-tests.sh
deactivate virtualenv

Distribution

(personal howto)

Once:

create an account in https://testpypi.python.org/pypi and https://pypi.python.org/pypi
create ~/.pypirc as described here
sudo pip install twine
create testing virutalenvs:
- virtualenv venvs/test-python2
- python3 -m venv venvs/test-python3

Each distribution:

# first, set version in setup.py
# then, download the wheel.zip artifact from Github and extract it to dist/
# or create it manually: python setup.py bdist_wheel

# test PyPI site
twine upload --repository testpypi dist/simdkalman-VERSION*
# the real thing
twine upload dist/simdkalman-VERSION*
# update git tags
git tag VERSION -m "released VERSION"
git push --tags

Test installation from the test site with

source venvs/test-pythonNNN/bin/activate
pip install \
    --index-url https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple \
    simdkalman --upgrade
# or the real thing with just
# pip install simdkalman
pip install matplotlib
python examples/example.py
deactivate

simdkalman's People

Contributors

Stargazers

Watchers

simdkalman's Issues

Regression with time-varying coefficients

Hello,

I am trying to implement regression with time-varying coefficients as described in Time Series Analysis by State Space Methods: Second Edition section 3.6.1. For that purpose I use the last example given at lectures: kalman-filters.

I somehow do not manage to replicate the results of pykalman and cannot tell why. I cannot see that I am doing something wrong.

I've summarized my results in the following notebook: https://nbviewer.jupyter.org/gist/cs224/031305297abc17a5546d4f0c2073716c

First I replicate the desired results with pykalman and then I try to do the same with simdkalman. simdkalman only returns my first state parameter and a copy of the first state parameter instead of returning the filter results for both state parameters.

I'd prefer to use simdkalman as it seems to be faster and actively developed.

Thanks a lot,
Christian

Bug in KalmanFilter.compute

I think there is a bug in KalmanFilter.compute, more specifically on line 522 of kalmanfilter.py. I noticed that when I tried to perform KalmanFilter.em in a scenario where observations consist of more than 1 variable, the code crashes due to broadcasting errors.

On line 522, result.filtered.gains is initialized as follows:
result.filtered.gains = np.empty((n_vars, n_measurements, n_states, n_states))
while I think it should be:
result.filtered.gains = np.empty((n_vars, n_measurements, n_states, n_obs))

Would it be possible to have a look, and hotfix if needed?
Many thanks in advance!

Single-precision mode

Optionally supporting 32-bit floats instead of only double precision would significantly reduce the memory consumption of the library, as discussed in #25 (comment)

Using multiprocessing pathos?

Hi.
I'm quite new when using multiprocessing. I want to try using pathos for multiprocessing per block of numpy array but it seems that your code is not callable by Pathos.

kf = simdkalman.KalmanFilter(
    state_transition,
    process_noise_cov,
    observation_model,
    observation_noise_cov)
with tqdm(total=train.as_matrix().shape[0], desc='  Encoding', unit='cols') as pbar:
    for n_rows in range(train.as_matrix().shape[0]):
        block=train.as_matrix()[n_rows:n_rows+100, :]
        result = kf.compute(block, 60)
        results = ProcessingPool().map(result, block)
        X.append(results.smoothed.observations.mean)
        pred.append(results.predicted.observations.mean)

TypeError: 'Result' object is not callable

Vectorized online update

I didn't see any indication in the documentation and examples of whether vectorized online update of multiple data streams is possible.

Example:

import numpy as np
import simdkalman

N_STREAMS = 3

kf = simdkalman.KalmanFilter(
    state_transition = np.array([[1,1],[0,1]]),
    process_noise = np.diag([0.1, 0.01]),
    observation_model = np.array([[1,0]]),
    observation_noise = 1.0)

initial_means = np.zeros((N_STREAMS, 2))
initial_covariances = np.tile(np.eye(2), (N_STREAMS, 1, 1)) # shape = (N_STREAMS, 2, 2)
observations = np.random.normal(size=(N_STREAMS, 1))

kf.update(initial_means, initial_covariances, observations)

This throws an assertion error but i'm not sure whether i'm doing something wrong or this kind of use is not supported.

Shape convention check

Hello Otto.
In line 200 of kalmanfilter the following dimension check is made, along with others:
n_states = state_transition.shape[0]
Should this be
n_states = state_transition.shape[-2]
to support the use of multiple transition matrices, or is that not supported?
Peter

AssertionError

Hi, I'm new to kalman, can you help me? This is my code:

from simdkalman.primitives import predict, update
import numpy as np

 A = np.eye(4)  # A 
 Q = R = np.eye(4) * 10 ** -5  # Q and R
 H = np.eye(4)  # H 
# init values
# where the res is an another functions's output, it is an numpy array ,shape is (4,2), like this:[[237 149] [162 223] [378 227] [260 
# 149]]  and res is time_varying array because the output in every frame is different
m = res
P = np.array([[9, 0, 0, 0], [0, 25, 0, 0], [0, 0, 4, 0], [0, 0, 0, 4]])  # P

m, P = predict(m, P, A, Q)
m, P = update(m, P, H, R, np.eye(4)*0.01)

But I always get the assert error:

Traceback (most recent call last):
File "D:/PycharmProjects/first_ldw/test.py", line 64, in
m,P= update(m, P, H, R, np.eye(4)0.01)
File "D:\Python38\lib\site-packages\simdkalman\primitives.py", line 146, in update
return _update(prior_mean, prior_covariance, observation_model, observation_noise, measurement)[:2]
File "D:\Python38\lib\site-packages\simdkalman\primitives.py", line 52, in reshaped_func
outputs = func([to_3d_array(a) for a in args], **kwargs)
File "D:\Python38\lib\site-packages\simdkalman\primitives.py", line 95, in _update
assert(measurement.shape[-2:] == (m,1))
AssertionError

Where should I modify my code ?
Thanks~

Online Smoothing via update() and smooth_current()?

Hello,

This library has been quite helpful! I was wondering how to properly call the update and smooth_current functions to update the smoother in an online fashion.

For example, if I have the following to smooth 10000 normal random variables

data = np.random.normal(size=10000)

 ##KF specificiation taken from examples
kf = simdkalman.KalmanFilter(
    state_transition = [[1,1],[0,1]],        # matrix A
    process_noise = np.diag([0.1, 0.01]),    # Q
    observation_model = np.array([[1,0]]),   # H
    observation_noise = 1.0)

kf_results = kf.smooth(data)

And I wanted to get the smoothed value for the 10,001st term would I do the following?

#Snippet2

m = kf_results.states.mean[-1]
P = kf_results.states.mean[-1]
y = np.random.normal() #10,001st term
post_mean, post_cov, loglike = kf.update(m = m, P = P, y=y)
smooth_mean, smooth_cov, smooth_gain = kf.smooth_current(m = m, P = P, ms=post_mean Ps=post_cov)
new_smoothed_observation = smooth_mean
m = post_mean
P = posterior_cov

I'm finding that the above doesn't do a very good job of recreating the smoothing on the whole series especially if I continue this process over longer time periods. For example:

Here are the results from the last 500 points of the kf.smooth(data) call

And here are the results from using the following for loop, adapting #Snippet 2 above

m = kf_results.states.mean[-1]
P = kf_results.states.cov[-1]
smoothed_current_obs = []
for index in range(9500,10000):
    y = data[index] #get next obs to smooth
    post_mean, post_cov, loglike = kf.update(m = m, P = P, y=y.reshape(-1,1))
    smooth_mean, smooth_cov, smooth_gain = kf.smooth_current(m = m, P = P, ms=post_mean, Ps=post_cov)
    smoothed_current_obs.append(smooth_mean[0][0])
    m = post_mean
    P = post_cov

Is it possible to recreate the output of kf.smooth() using kf.update() and kf.smooth_current()?

Here's the full code example to make it easier to run

import numpy as np
import simdkalman
import pandas as pd


data = np.random.normal(size=10000)

 ##KF specificiation taken from examples
kf = simdkalman.KalmanFilter(
    state_transition = [[1,1],[0,1]],        # matrix A
    process_noise = np.diag([0.1, 0.01]),    # Q
    observation_model = np.array([[1,0]]),   # H
    observation_noise = 1.0)

kf_results = kf.smooth(data)
smoothed_obs = kf_results.observations.mean

#Make Plots
df = pd.DataFrame([data,smoothed_obs]).T
df.columns = ['data','smoothed_obs']
df.iloc[-500:].plot()


m = kf_results.states.mean[-1]
P = kf_results.states.cov[-1]
smoothed_current_obs = []
for index in range(9500,10000):
    y = data[index] #get next obs to smooth
    post_mean, post_cov, loglike = kf.update(m = m, P = P, y=y.reshape(-1,1))
    smooth_mean, smooth_cov, smooth_gain = kf.smooth_current(m = m, P = P, ms=post_mean, Ps=post_cov)
    smoothed_current_obs.append(smooth_mean[0][0])
    m = post_mean
    P = post_cov  

df2 = pd.DataFrame([data[9500:10000],smoothed_current_obs]).T
df2.columns = ['data','smoothed_current_data']
df2.plot()

porting from pykalman to simdkalman

Hello!

I want to understand how to port a pykalman code to simdkalman, could anyone help me?

it's a implementation of kalman to understand the kinetic (position+velocity+acceleration) movement (KCA) from lopez prado (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2422183)

# by MLdP on 02/22/2014 <[email protected]>
# Kinetic Component Analysis
import numpy as np
from pykalman import KalmanFilter
#-------------------------------------------------------------------------------
def fitKCA(t,z,q,fwd=0):    
    '''
    Inputs:
        t: Iterable with time indices
        z: Iterable with measurements
        q: Scalar that multiplies the seed states covariance
        fwd: number of steps to forecast (optional, default=0)
    Output:
        x[0]: smoothed state means of position velocity and acceleration
        x[1]: smoothed state covar of position velocity and acceleration
    Dependencies: numpy, pykalman
    '''

    #1) Set up matrices A,H and a seed for Q
    h=(t[-1]-t[0])/t.shape[0]    
    # my doubt... should the H be = [[h,0]] ? considering https://github.com/oseiskar/simdkalman/issues/11

    A=np.array([[1,h,.5*h**2],     # space
                [0,1,h],           # velocity
                [0,0,1]])          # aceleration
    Q=q*np.eye(A.shape[0])

    #2) Apply the filter    
    kf=KalmanFilter(transition_matrices=A,transition_covariance=Q)

    #3) EM estimates
    kf=kf.em(z)
    # could this be done by smidkalman?

    #4) Smooth
    x_mean,x_covar=kf.smooth(z)
    # what about this one?

    #5) Forecast
    for fwd_ in range(fwd):
        x_mean_,x_covar_=kf.filter_update(filtered_state_mean=x_mean[-1], \
            filtered_state_covariance=x_covar[-1])
        x_mean=np.append(x_mean,x_mean_.reshape(1,-1),axis=0)
        x_covar_=np.expand_dims(x_covar_,axis=0)
        x_covar=np.append(x_covar,x_covar_,axis=0)

    #6) Std series
    x_std=(x_covar[:,0,0]**.5).reshape(-1,1)
    for i in range(1,x_covar.shape[1]):
        x_std_=x_covar[:,i,i]**.5
        x_std=np.append(x_std,x_std_.reshape(-1,1),axis=1)
    return x_mean,x_std,x_covar

Question: How to use this with a dataframe?

Amazing stuff and I'm curious if I can use it with pandas dataframe. For example I have this dataframe

d = {'A': [8, 2, 4, 5, 6, 4, 3, 5, 5, 3]}
df = pd.DataFrame(data=d)

print(df)

which looks like this

How can I add a row with the Kalman prediction for row 10?

I came this far

kf = simdkalman.KalmanFilter(
    state_transition = [[1,1],[0,1]],        # matrix A
    process_noise = np.diag([0.1, 0.01]),    # Q
    observation_model = np.array([[1,0]]),   # H
    observation_noise = 1.0)

pr = kf.predict(df.squeeze(), 1)

to get one predicted value.

But adding it to the dataframe gives an error.

df.append(pr)

TypeError: cannot concatenate object of type '<class 'simdkalman.kalmanfilter.KalmanFilter.Result'>'; only Series and DataFrame objs are valid

I understand why I get the error but I don't know how to 'extract' the predicted value from object.

Readthedocs minor issues, e.g. example needs numpy import

Not that anybody who can figure out the Kalman filter won't be able to figure out how to import numpy, but for completeness of the example you might as well add the import:

https://simdkalman.readthedocs.io/en/latest/

I'm personally partial to

import numpy as np

Also, predict is spelled as "precit":

"See smooth and precit"

Thanks for writing these examples up.

Coupled observations example

Hi, I've been trying to build a coupled observations example, and I'm not entirely sure I have the dimensionality of the parameters correct, is this correct? If so it might be worth including as an example.

dt= 0.1  # 10 Hz samples
A = np.array([[1, dt], [0, 1]])
Q = np.array([[1.0, 0.1], [0, 0.1]])
H = np.diag([1, 1])
R = np.diag([0.5, 1.0])

kf = simdkalman.KalmanFilter(state_transition=A,
                             process_noise=Q,
                             observation_model=H,
                             observation_noise=R)

# Given speed and accel from independent sensors (doppler GPS and IMU)
# which are the same length and sampling rate.
#
# Mask out speeds that have high gradients as likely errors
speed[np.abs(np.gradient(speed, 1/10)) > 3] = np.nan

# Prepare an array of shape (1, n, 2)
data = np.array([np.array([speed, accel]).T])
data = kf.smooth(data, initial_value=data[0,0,:], initial_covariance=np.eye(2) * 0.1)
speed, accel = data.observations.mean[0].T

Masking out the bad speeds feels like it shouldn't be required, that coupling the processes together should result in matching acceleration and velocity processes. What am I doing wrong?

Parameter estimation

Does this package implement parameter estimation for Kalman filters?

Possible minor inconsistency in dimensions?

I was in the process of writing some unit tests for the multi-dim usage and noticed a possible minor inconsistency. Both KalmanFilter.predict and KalmanFilter.update require the initial means to take a certain shape. The latter preserves this shape in the output results object. However, the .predict method transposes the last two dimensions. Thus the shape of the means going in isn't the same as going out. If this output is merely intended as reporting that doesn't matter, but as you can see from this test script there might be a slight inconsistency here. However, perhaps I misunderstand the intent or am abusing the usage somehow.

LMK if you want this test script in a notebook or PR into simdkalman.

Data format - Kalman filter

Hi @oseiskar

I was just trying to run a experiment with the dataset and I am getting the following error.
I was wondering if you could assist.

The data set it -

measurements

masked_array(
data=[[1.83 , 1.314 , 0.124 , ..., 2.072 , 0.959 , 0.093 ],
[1.907 , 1.341 , 0.131 , ..., 2.109 , 0.972 , 0.1 ],
[1.935 , 1.36 , 0.138 , ..., 2.141 , 1. , 0.105 ],
...,
[4.864745, 2.99041 , 6.945452, ..., 5.772958, 1.830929, 0.58429 ],
[4.87512 , 3.002997, 7.048876, ..., 5.792051, 1.817048, 0.578934],
[4.887871, 3.013825, 7.171646, ..., 5.813568, 1.805832, 0.577737]],
mask=False,
fill_value=1e+20)

"""""""""

kf = simdkalman.KalmanFilter(A, np.linalg.inv(W_neg_sqrt @ W_neg_sqrt), C,np.linalg.inv(V_neg_sqrt @ V_neg_sqrt))
kf = kf.em(measurements, n_iter=10)

received the error -

ValueError Traceback (most recent call last)
in
68 measurements[~(K & ~M_test)] = np.ma.masked
69 kf = simdkalman.KalmanFilter(A, np.linalg.inv(W_neg_sqrt @ W_neg_sqrt), C,np.linalg.inv(V_neg_sqrt @ V_neg_sqrt))
---> 70 kf = kf.em(measurements, n_iter=10)
71 filtered_state_means, _ = kf.smooth(measurements)
72 yhat_em = filtered_state_means @ C.T

~/tutorial-env/lib/python3.7/site-packages/simdkalman/kalmanfilter.py in em(self, data, n_iter, initial_value, initial_covariance, verbose)
732 gains = True,
733 log_likelihood = False,
--> 734 verbose = verbose)
735
736 if verbose:

~/tutorial-env/lib/python3.7/site-packages/simdkalman/kalmanfilter.py in compute(self, data, n_test, initial_value, initial_covariance, smoothed, filtered, states, covariances, observations, likelihoods, gains, log_likelihood, verbose)
526 print('filtering %d/%d' % (j+1, n_measurements))
527
--> 528 y = data[:,j,...].reshape((n_vars, n_obs, 1))
529
530 tup = self.update(m, P, y, log_likelihood)

ValueError: cannot reshape array of size 119 into shape (119,48,1)

Can't process my data

Hi, sorry for my dumb question but i'm a newbie.
I'm trying to use your algorithm with my Gps data, but i keep getting error message: "unsupported operand type(s) for -: 'GPXTrackSegment' and 'float'"
How can i correctly implement it?
Polo_match.txt

if i want get the final A and kalman gains ,what should i do ？

I want to get A MATRIX and kalman gains ,what should i do?

example.py error

On running the example, I receive the error:

ValueError: ("Size of label '%s' for operand %d does not match previous terms.", 'B', 1)

Multi Dimensional Data Reshape Error

Hi There,

I'm using the Kalman Filter to perform weight estimation for a Bayesian inference problem. I had been using PyKalman, however due to lack of maintenance, I'm looking for alternatives and I landed on simdkalman.

I am not sure where my problem lies, but I'm getting a reshape error when I call smooth:

Here is a dummy setup that represents my data structures:

import simdkalman
import numpy as np
import numpy.random as random
import matplotlib.pyplot as plt

H = random.normal(size=(100,11)) # Observation model of size N_observations x M weights of interest
data = random.normal(size = (100,180)) # 100 Independent time series, each180 time samples long

kf = simdkalman.KalmanFilter(
    state_transition = np.eye(11),
    process_noise = 0.5*np.eye(11),
    observation_model = H,
    observation_noise = 0.5*np.eye(100)) 

smoothed = kf.smooth(data, 
      initial_value = np.zeros((11)), 
      initial_covariance=0.5*np.eye(11))

When I call kf.smooth I get:
ValueError: cannot reshape array of size 100 into shape (100,100,1)

Here is the same code I used to call with pyKalman (again a representation with correct array sizes, not actual data):

from pykalman import KalmanFilter

H = random.normal(size=(100,11)) # Observation model of size N_observations x M weights of interest
data = random.normal(size = (100,180)) # 100 Independent time series, each180 time samples long

kf = KalmanFilter(observation_matrices = H,
                        observation_offsets = np.zeros((100)),
                        observation_covariance = 0.5*np.eye(100),
                        initial_state_mean = np.zeros((11)),
                        initial_state_covariance = 0.5*np.eye(11),
                        transition_matrices = np.eye(11),
                        transition_offsets = np.zeros(11),
                        transition_covariance = 0.5*np.eye(11))

(smoothedMeans, smoothedCov) = kf.smooth(data.T)

Any idea why my formulation and setup would cause this reshape error?

Thank you