
timeseriesprediction.jl's Introduction

TimeseriesPrediction.jl logo


Repository for predicting timeseries using methods from nonlinear dynamics and timeseries analysis. It uses DelayEmbeddings.

Kuramoto-Sivashinsky example

Kuramoto-Sivashinsky Prediction

This example performs a temporal prediction of the Kuramoto-Sivashinsky model. It is a one-dimensional system with the spatial dimension shown on the x-axis and its temporal evolution along the y-axis. The algorithm makes iterative predictions into the future that stay similar to the true evolution for a while but eventually diverge.

timeseriesprediction.jl's People

Contributors

datseris, femtocleaner[bot], jonasisensee, juliatagbot, rikhuijzer


timeseriesprediction.jl's Issues

Better Boundary Conditions

I believe it would be a good idea to implement better options for boundary conditions.

One could rename ConstantBoundary to DirichletBoundary
and add a separate VonNeumannBoundary
(filling missing values with the value at the boundary).
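A minimal sketch of the proposed boundary types (the names come from this issue's proposal; the supertype, field names, and `padded_get` helper are hypothetical, not the package's API). A Dirichlet boundary pads with a fixed constant, while the Neumann-style boundary repeats the value at the edge:

```julia
# Hypothetical sketch of the proposed boundary conditions (1D case).
abstract type AbstractBoundary end

struct DirichletBoundary{T} <: AbstractBoundary
    c::T    # constant value used outside the field
end

struct VonNeumannBoundary <: AbstractBoundary end

# Look up field `u` at a possibly out-of-bounds index `i`.
padded_get(u::AbstractVector, i, b::DirichletBoundary) =
    checkbounds(Bool, u, i) ? u[i] : b.c
padded_get(u::AbstractVector, i, ::VonNeumannBoundary) =
    u[clamp(i, firstindex(u), lastindex(u))]
```

Corners of multidimensional fields would need extra care, as noted for CompositeBoundary below in this list.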

temporal prediction with different start point

I just noticed: in principle it is not necessary that the temporal prediction
starts at the end of the training set.
As long as an embedded state (or a long enough timeseries) is provided as the initial condition, there is no such requirement.

This change would add generality to the function and resulting prediction models.
The current behaviour would not change at all,
but instead of calling the prediction function directly we would add a layer of abstraction:

    temporalprediction(data, em, tsteps) =
        temporalprediction(data, em, data[end-getmaxτ(em):end], tsteps)

where

    temporalprediction(data, em, starting_timeseries, tsteps) = # prediction algorithm

Other KNN Algorithms

In most cases more than 90% of the runtime is spent within the KNN algorithm.

Therefore it may be worth looking into other KNN implementations.

Here is a list of various algorithms with Python interfaces:
http://ann-benchmarks.com/
https://github.com/erikbern/ann-benchmarks

In particular this one
https://github.com/nmslib/hnsw
might be of interest.

We will likely need a Julia wrapper, which should be its own package.
Maybe somehow connected to NearestNeighbors.jl?
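For reference, the hot path boils down to a standard NearestNeighbors.jl query; any alternative backend would need to cover roughly this call (a sketch with made-up data, not the package's actual call site):

```julia
using NearestNeighbors

data = rand(3, 1000)        # 3-dimensional points, one point per column
tree = KDTree(data)         # built once, queried many times
query = rand(3)
# 5 nearest neighbors of `query`, with distances sorted ascending
idxs, dists = knn(tree, query, 5, true)
```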


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

Prediction of flow dynamics using point processes (new prediction scheme)

https://aip.scitation.org/doi/full/10.1063/1.5016219?Track=CHAFT

From the paper's abstract:

Describing a time series parsimoniously is the first step to study the underlying dynamics. For a time-discrete system, a generating partition provides a compact description such that a time series and a symbolic sequence are one-to-one. But, for a time-continuous system, such a compact description does not have a solid basis. Here, we propose to describe a time-continuous time series using a local cross section and the times when the orbit crosses the local cross section. We show that if such a series of crossing times and some past observations are given, we can predict the system's dynamics with fine accuracy. This reconstructability neither depends strongly on the size nor the placement of the local cross section if we have a sufficiently long database. We demonstrate the proposed method using the Lorenz model as well as the actual measurement of wind speed.
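As a rough illustration of the crossing-times idea (my own minimal sketch, not code from the paper): given a sampled scalar observable, one can record the interpolated times of upward crossings through a threshold `x0`, which play the role of the local cross section:

```julia
# Return the times at which the sampled series (t, x) crosses x0 from below,
# using linear interpolation between samples.
function crossing_times(t::AbstractVector, x::AbstractVector, x0)
    tc = Float64[]
    for i in 1:length(x)-1
        if x[i] < x0 <= x[i+1]   # upward crossing inside this interval
            f = (x0 - x[i]) / (x[i+1] - x[i])
            push!(tc, t[i] + f * (t[i+1] - t[i]))
        end
    end
    return tc
end
```

The paper then uses such a series of crossing times, plus some past observations, as the predictive description of the flow.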



Docstring Description of `localmodel_stts` is veeeery poor

    This method works identically to [`localmodel_tsp`](@ref), by expanding the concept
    from vector-states to general array-states.

This is very poor. You have written a decent introduction on the DynamicalSystems.jl page; can you make the current description a bit more like an actual description?

Other Techniques for dimension reduction

There are approaches to dimension reduction other than PCA.
PCA is not always best (it finds only linear relationships).

For example, radial basis functions could sometimes be better.

To implement this, we need a RadialBasisEmbedding <: AbstractSpatialEmbedding
that provides an interface similar to PCAEmbedding:

    RadialBasisEmbedding(training_set, em::SpatioTemporalEmbedding) → embedding

and a reconstruction method

    embedding(inplace_vec, training_set, α, τ)
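A possible skeleton (everything here beyond the proposed type name is hypothetical; the real supertype constraint would come from the package, and center selection is left open):

```julia
# Hypothetical skeleton of the proposed embedding type.
struct RadialBasisEmbedding{T}  # <: AbstractSpatialEmbedding in the package
    centers::Vector{Vector{T}}  # RBF centers, e.g. chosen from the training set
    σ::T                        # kernel width
end

# Evaluate Gaussian radial basis features of a state vector.
function (em::RadialBasisEmbedding)(x::AbstractVector)
    [exp(-sum(abs2, x .- c) / (2em.σ^2)) for c in em.centers]
end
```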



ArgumentError: principal variance cannot exceed total variance.

This happens to me sometimes on the cluster.
So far the worker nodes haven't given me much more useful information.

I highly doubt that there is an error in MultivariateStats.jl, especially since there haven't been any changes since the last project.
I fear that the accumulation of floating point rounding errors of our beloved Float32 is coming back to bite us, but I cannot be certain.

So far this has happened rarely, so we could simply ignore this issue and repeat
the run until it succeeds.
This approach is questionable at best, though.
Another option could be to make explicit conversions to Float64
during the computation of the covmat and convert back to Float32 afterwards.
This will likely involve a few hours of work and result in a significant performance penalty.

    ArgumentError: principal variance cannot exceed total variance.
    Type at /home/isensee/.julia/packages/MultivariateStats/nNJuu/src/pca.jl:24
    #pcacov#8 at /home/isensee/.julia/packages/MultivariateStats/nNJuu/src/pca.jl:112
    #pcacov at ./none:0 [inlined]
    compute_pca at /home/isensee/.julia/dev/TimeseriesPrediction/src/pcaembedding.jl:81 [inlined]
    #PCAEmbedding#48 at /home/isensee/.julia/dev/TimeseriesPrediction/src/pcaembedding.jl:65
    Type at ./none:0
    macro expansion at ./logging.jl:322 [inlined]
    cross_estimation at /scratch15/isensee/embeddingresearch/src/cross_estimation.jl:5
    #112 at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:269
    run_work_thunk at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:56
    macro expansion at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:269 [inlined]
    #111 at ./task.jl:259
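The proposed Float64 conversion could look roughly like this (`covmat32` is a hypothetical helper name, not the package's; `X` holds one observation per column):

```julia
using Statistics

# Accumulate the covariance matrix in Float64 even when the data are Float32,
# then convert back to the working precision.
function covmat32(X::AbstractMatrix{Float32})
    X64 = Float64.(X)            # promote before accumulating
    C = cov(X64; dims = 2)       # (d × d) covariance across columns
    return Float32.(C)
end
```

The conversion costs an extra copy of the data in double precision, which is where the performance penalty mentioned above would come from.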

versions, documentation, paper

@JonasIsensee there are many commits on the master branch that are not in the stable version... Maybe it is a good idea to resume the versioning process? For example, the stable version still uses r_0 for the NamedTuple of the light cone, as I realized while working on the project.

In addition, your paper is now on arXiv. I think it is beneficial (especially for you) to have a citable BibTeX entry in the readme as well as in the documentation of the software.

Let's work a bit more on the symmetry API before putting it in the docs, I have some suggestions on it.



Make `localmodel_tsp` return the `B` end of a reconstruction only

When given a Reconstruction or an MDReconstruction, make it so that it automatically returns the last column, or the last B columns.

To keep the source code clear, we keep the lower-level method with an abstract dataset as is and rename it to _localmodel_tsp (with one _ in front). Then the methods that take a Reconstruction or MDReconstruction call this function and return only the last B columns.

If you have ::MDReconstruction{DxB, D, B, T, τ}, then you know you can create
a = SVector{B, Int}((DxB - i for i in B-1:-1:0)...) and simply return ret[:, a].

(When given a timeseries or Dataset, the behavior should remain as it is now.)
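The static index vector can be checked in isolation (`last_B_columns` is a hypothetical helper wrapping the expression above; requires StaticArrays):

```julia
using StaticArrays

# For a reconstruction with DxB total columns, build a static index vector
# selecting the last B columns, as proposed above.
last_B_columns(DxB::Int, ::Val{B}) where {B} =
    SVector{B, Int}((DxB - i for i in B-1:-1:0)...)
```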

Nearest Trajectory Strategy for Time Series Prediction

(figure omitted)

Currently the scheme for finding nearest neighbors does not distinguish neighbors based on where they lie within the existing dataset. Most of the time (and this is especially true for densely sampled data), nearest neighbors belong to a single, or at best a couple of, trajectory segments. This may mean that a specific trajectory segment is over-weighted for the prediction while other nearby segments are disregarded. The figure describes this perfectly.

It can be advantageous in some cases of timeseries prediction to expand the neighbor-finding strategy to "Nearest Trajectory". This process is described in the paper "A Nearest Trajectory Strategy for Time Series Prediction" by James McNames. You can find the pdf here.



make theiler window argument of `predict_timeseries`

From predict_timeseries:

    for n=1:p   #Iteratively estimate timeseries
        idxs,dists = neighborhood_and_distances(q,R, tree,ntype)
        xnn = R[idxs]
        ynn = R[idxs+step]
        q = method(q, xnn, ynn, dists)
        push!(s_pred, q[end])
    end

clearly the Theiler window is not used in neighborhood_and_distances(q, R, tree, ntype).

The Theiler window should be a keyword argument of predict_timeseries and be used as expected.

You can make w = mean(τ) for the call without R, and w = mean(R.delay) for the call with R.
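A generic post-filter illustrating the intended effect of a Theiler window (this is not the actual neighborhood_and_distances signature, just the idea): neighbors whose temporal index lies within `w` of the query index `n` are discarded, so temporally correlated points do not masquerade as dynamical neighbors.

```julia
# Drop neighbors within the Theiler window w of query index n.
function theiler_filter(idxs, dists, n, w)
    keep = [abs(i - n) > w for i in idxs]
    return idxs[keep], dists[keep]
end
```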

Parameter `c` used twice

Not sure if this is a big problem, but nonetheless:
we use c as the name for the ConstantBoundary parameter
and c as the speed in `light_cone_embedding`.

Parallelized calculation of PCAEmbedding

The only limiting factor for super large (>1000) embedding dimensions,
which can then be reduced by PCA, is the
calculation of the PCAEmbedding.

This could be parallelized
by splitting the dataset into different subsets and
computing the covariance matrix for each of them.
All covariance matrices are then averaged.
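A sketch of the chunked computation (`chunked_cov` is a hypothetical helper; instead of literally averaging per-chunk covariances, it computes the global mean once, accumulates the scatter matrix per chunk with `Threads.@threads`, and combines, which gives the exact result):

```julia
using Statistics

# Parallel covariance of X (one observation per column) over `nchunks` chunks.
function chunked_cov(X::AbstractMatrix; nchunks = Threads.nthreads())
    d, N = size(X)
    μ = mean(X; dims = 2)
    step = cld(N, nchunks)
    ranges = [i:min(i + step - 1, N) for i in 1:step:N]
    partials = [zeros(d, d) for _ in ranges]
    Threads.@threads for k in eachindex(ranges)
        for j in ranges[k]
            v = X[:, j] .- μ
            partials[k] .+= v * v'   # per-chunk scatter matrix
        end
    end
    return sum(partials) ./ (N - 1)
end
```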



Actually use the theiler window...

Well, neighborhood_and_distances has an implementation for using the Theiler window, but there is no information about it in localmodel_tsp, nor any arguments...

error running timeseriesprediction.ipynb

Running this cell gives the following error:

    estimate_delay(s, "first_zero")

    ArgumentError: Unknown method for estimate_delay.

    Stacktrace:
    [1] #estimate_delay#39(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(estimate_delay), ::Array{Float64,1}, ::String, ::StepRange{Int64,Int64}) at /home/steve/.julia/packages/DelayEmbeddings/7PwQ4/src/estimate_delay.jl:73
    [2] estimate_delay at /home/steve/.julia/packages/DelayEmbeddings/7PwQ4/src/estimate_delay.jl:33 [inlined] (repeats 2 times)
    [3] top-level scope at In[15]:1

My environment:
Status ~/.julia/environments/v1.2/Project.toml
[c52e3926] Atom v0.10.1
[6e4b80f9] BenchmarkTools v0.4.3
[5ae59095] Colors v0.9.6
[a93c6f00] DataFrames v0.19.3
[1313f7d8] DataFramesMeta v0.5.0
[abce61dc] Decimals v0.4.0
[31c24e10] Distributions v0.21.1
[61744808] DynamicalSystems v1.3.0
[5789e2e9] FileIO v1.0.7
[28b8d3ca] GR v0.41.0
[bd48cda9] GraphRecipes v0.4.0
[f67ccb44] HDF5 v0.12.3
[7073ff75] IJulia v1.20.0
[916415d5] Images v0.18.0
[f7bf1975] Impute v0.3.0
[70c4c096] Indicators v0.6.0
[a93385a2] JuliaDB v0.12.0
[e5e0dc1b] Juno v0.7.2
[194296ae] LibPQ v0.11.2
[add582a8] MLJ v0.2.3
[d491faf4] MLJModels v0.2.3
[86f7a689] NamedArrays v0.9.3
[eadc2687] Pandas v1.3.0
[14b8a8f1] PkgTemplates v0.6.2
[f0f68f2c] PlotlyJS v0.12.5
[91a5bcdd] Plots v0.26.2
[d330b81b] PyPlot v2.8.2
[1a8c2f83] Query v0.12.1
[612083be] Queryverse v0.3.1
[ce6b1742] RDatasets v0.6.3
[3cdcf5f2] RecipesBase v0.7.0
[295af30f] Revise v2.1.10
[03a91e81] SplitApplyCombine v0.4.1
[2913bbd2] StatsBase v0.32.0
[f3b207a7] StatsPlots v0.12.0
[054b7d4e] Strategems v0.2.0
[bd369af6] Tables v0.2.11
[a110ec8f] Temporal v0.6.1
[9e3dc215] TimeSeries v0.16.0
[f269a46b] TimeZones v0.9.2
[f218859d] TimeseriesPrediction v0.6.0
[9d95f2ec] TypedTables v1.2.0
[b8865327] UnicodePlots v1.1.0
[112f6efa] VegaLite v0.7.0

Improve spatio temporal prediction tests

  1. They never test cross predictions of periodic models.
  2. They must test the field values themselves, besides a coarse-grained average of a prediction.
  3. They should test both maximum and minimum errors.

Add tests for LinearLocalModel

There aren't any tests for LinearLocalModel in the file localmodeling_tests.jl. This would have caught the wrong definition of LinearLocalModel done in #20.

Implement a CompositeBoundary

Just an idea. In particular, this could be discussed extensively in the docs
to highlight extensibility.
This CompositeBoundary would
implement a way to treat each boundary with a separate condition type
(corners could be difficult).

I would like to see if this could be used to easily create a Möbius strip.



Cluster Weighted Modelling for timeseries prediction

In the reference "[1]: Eds. B. Schelter et al., Handbook of Time Series Analysis,
VCH-Wiley, pp 39-65 (2006)", from which we got the implementation for the Local Modelling, there is also an implementation for "Cluster Weighted Modelling" for predicting timeseries.

CWM essentially tries to estimate the joint density p(x, y), since this density allows computing derived quantities like the conditional forecast <y|x> for new query points.

If anyone wants to tackle this, contact me for the pdf!
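For a flavor of the conditional-forecast idea only (this is plain kernel regression, not CWM's mixture-of-local-models density estimate, and every name here is hypothetical):

```julia
# Estimate <y|x> at a query point x via Gaussian kernel weights over the
# training pairs (xs, ys). CWM proper would instead fit a cluster-weighted
# density p(x, y) and derive the same conditional expectation from it.
function conditional_forecast(xs::Vector{Float64}, ys::Vector{Float64}, x, σ)
    w = [exp(-((xi - x)^2) / (2σ^2)) for xi in xs]  # kernel weights
    return sum(w .* ys) / sum(w)                    # weighted mean of targets
end
```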



Consistency of source code for crossprediction

The docstring of crossprediction, which btw we may want to rename to crossestimation, has:

    crossprediction(source_train, target_train, source_pred,

yet none of these terms are in the source code for the function, which reads:

    function crossprediction(train_in ::AbstractVector{<:AbstractArray{T, Φ}},
                             train_out::AbstractVector{<:AbstractArray{T, Φ}},
                             pred_in  ::AbstractVector{<:AbstractArray{T, Φ}},

or

    function crossprediction(params, train_out, pred_in, R, tree; progress=true)

I think it is best if there is an agreement on the terms used and the same thing is used for documentation and source code.



Less memory usage with KDTrees

By default a KDTree stores a copy of the data.

AFAICT this could be avoided by creating the tree with

    data = reconstruct(U, em)
    dftree = DataFreeTree(KDTree, data)
    tree = injectdata(dftree, data)

Furthermore, in both cases the reconstruction (data)
is available under tree.data.
With this we would not have to explicitly pass the reconstruction to the prediction functions.

If we also write our own constructors, we can allow using views and strip away a safeguard in injectdata,
to take care of the case in temporalprediction where some reconstructed points are in the reconstruction but not part of the tree.



Documentation: light_cone_embedding and r0

The documentation of light_cone_embedding does not have r0 anymore (for no reason whatsoever).

But this creates a problem, as it states:

    The radius of the light cone evolves as: r = i*τ*c + r for each step i ∈ 0:γ.

while the two r symbols used in the equation actually mean different things.

Issue with light cone embedding

Here is how the embedding looks currently:

    time  | c = 1.0               | c = 2.0               | c = 0.0
    n + 1 | ..........o.......... | ..........o.......... | ..........o..........
    n     | .........x□x......... | .........x□x......... | .........x□x.........
    n - 1 | ..................... | ..................... | .....................
    n - τ | .......xxxxxxx....... | .....xxxxxxxxxx...... | .........xxx.........

What you notice is that even though τ is greater than 1, the first spatiotemporal frame is always directly behind the point to be predicted.

Isn't this wrong? You want something that is exactly τ away from the point to be predicted... Right?

@JonasIsensee could this be why the light cone embedding doesn't work that well? Or maybe I have misunderstood how this works?

Barkley in tests/system_defs.jl is very slow

I really like the implementation of the Barkley model that we have in
test/system_defs.jl because it is so flexible.
It turns out that it has an immense problem with Core.Box inside its stepper function.

This makes it about 30× slower than a simpler implementation.
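A minimal reproduction of the Core.Box problem, unrelated to the Barkley code itself: a closure over a variable that is reassigned after the closure is created gets boxed, making every access dynamically typed. Capturing via `let` avoids the box (note it also changes which value the closure sees, so the real fix must be chosen to preserve semantics):

```julia
# The closure captures `a` by reference (a Core.Box), because `a` is
# reassigned after the closure is created.
function boxed()
    a = 1.0
    f = () -> a
    a = 2.0
    return f()   # sees the reassigned value through the box
end

# `let a = a` introduces a fresh, never-reassigned local, so the capture
# is a plain Float64 and the closure is type stable.
function unboxed()
    a = 1.0
    f = let a = a
        () -> a
    end
    a = 2.0
    return f()   # sees the value captured at closure creation
end
```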

Write Tests for (Discrete) Maps

The package has only been tested and developed for continuous systems so far.

Everything should work just the same though. (Fingers crossed)

Write Tests which verify that it works

Saving PCAEmbedding to .jld2

It is not possible to load
a PCAEmbedding that was saved to a .jld2 file.

One has to import MultivariateStats first,
because otherwise MultivariateStats.PCA is not a known type and will be "reconstructed" (bad).

Likely fixed by exporting PCA.
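A sketch of the current workaround, using a plain MultivariateStats.PCA for brevity instead of a PCAEmbedding (jldsave/load are JLD2's standard API):

```julia
using JLD2
using MultivariateStats   # must be loaded before reading the file, or the
                          # PCA type is unknown and JLD2 falls back to a
                          # "reconstructed" placeholder type

X = randn(5, 100)                  # observations in columns
p = fit(PCA, X; maxoutdim = 2)

fn = joinpath(mktempdir(), "pca.jld2")
jldsave(fn; p)                     # saving works regardless
p2 = load(fn, "p")                 # loading needs MultivariateStats in scope
```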


