
timeseriesprediction.jl's Introduction

TimeseriesPrediction.jl logo


Repository for predicting timeseries using methods from nonlinear dynamics and timeseries analysis. It uses DelayEmbeddings.

Kuramoto-Sivashinsky example

Kuramoto-Sivashinsky Prediction

This example performs a temporal prediction of the Kuramoto-Sivashinsky model. It is a one-dimensional system with the spatial dimension shown on the x-axis and its temporal evolution along the y-axis. The algorithm makes iterative predictions into the future that stay similar to the true evolution for a while but eventually diverge.

timeseriesprediction.jl's People

Contributors

datseris, femtocleaner[bot], jonasisensee, juliatagbot, rikhuijzer


timeseriesprediction.jl's Issues

Better Boundary Conditions

I believe it would be a good idea to implement better options for boundary conditions.

One could rename ConstantBoundary to DirichletBoundary
and add a separate VonNeumannBoundary
(filling missing values with the value at the boundary).
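A minimal sketch of the proposed boundary types (the names come from this issue's proposal; the supertype, field names, and `padded_get` helper are hypothetical, not the package's API). A Dirichlet boundary pads with a fixed constant, while the Neumann-style boundary repeats the value at the edge:

```julia
# Hypothetical sketch of the proposed boundary conditions (1D case).
abstract type AbstractBoundary end

struct DirichletBoundary{T} <: AbstractBoundary
    c::T    # constant value used outside the field
end

struct VonNeumannBoundary <: AbstractBoundary end

# Look up field `u` at a possibly out-of-bounds index `i`.
padded_get(u::AbstractVector, i, b::DirichletBoundary) =
    checkbounds(Bool, u, i) ? u[i] : b.c
padded_get(u::AbstractVector, i, ::VonNeumannBoundary) =
    u[clamp(i, firstindex(u), lastindex(u))]
```

Corners of multidimensional fields would need extra care, as noted for CompositeBoundary below in this list.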

temporal prediction with different start point

I just noticed: in principle it is not necessary that the temporal prediction
starts at the end of the training set.
As long as an embedded state (or a long enough timeseries) is provided as the initial condition, there is no such requirement.

This change would add generality to the function and resulting prediction models.
The current behaviour would not change at all,
but instead of calling the prediction function directly we would add a layer of abstraction:

    temporalprediction(data, em, tsteps) =
        temporalprediction(data, em, data[end-getmaxτ(em):end], tsteps)

where

    temporalprediction(data, em, starting_timeseries, tsteps) = # prediction algorithm

Other KNN Algorithms

In most cases more than 90% of the runtime is spent within the KNN algorithm.

Therefore it may be worth looking into other KNN implementations.

Here is a list of various algorithms with Python interfaces:
http://ann-benchmarks.com/
https://github.com/erikbern/ann-benchmarks

In particular this one
https://github.com/nmslib/hnsw
might be of interest.

We will likely need a Julia wrapper, which should be its own package.
Maybe somehow connected to NearestNeighbors.jl?
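For reference, the hot path boils down to a standard NearestNeighbors.jl query; any alternative backend would need to cover roughly this call (a sketch with made-up data, not the package's actual call site):

```julia
using NearestNeighbors

data = rand(3, 1000)        # 3-dimensional points, one point per column
tree = KDTree(data)         # built once, queried many times
query = rand(3)
# 5 nearest neighbors of `query`, with distances sorted ascending
idxs, dists = knn(tree, query, 5, true)
```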


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

Prediction of flow dynamics using point processes (new prediction scheme)

https://aip.scitation.org/doi/full/10.1063/1.5016219?Track=CHAFT

From the paper's abstract:

Describing a time series parsimoniously is the first step to study the underlying dynamics. For a time-discrete system, a generating partition provides a compact description such that a time series and a symbolic sequence are one-to-one. But, for a time-continuous system, such a compact description does not have a solid basis. Here, we propose to describe a time-continuous time series using a local cross section and the times when the orbit crosses the local cross section. We show that if such a series of crossing times and some past observations are given, we can predict the system's dynamics with fine accuracy. This reconstructability neither depends strongly on the size nor the placement of the local cross section if we have a sufficiently long database. We demonstrate the proposed method using the Lorenz model as well as the actual measurement of wind speed.
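As a rough illustration of the crossing-times idea (my own minimal sketch, not code from the paper): given a sampled scalar observable, one can record the interpolated times of upward crossings through a threshold `x0`, which play the role of the local cross section:

```julia
# Return the times at which the sampled series (t, x) crosses x0 from below,
# using linear interpolation between samples.
function crossing_times(t::AbstractVector, x::AbstractVector, x0)
    tc = Float64[]
    for i in 1:length(x)-1
        if x[i] < x0 <= x[i+1]   # upward crossing inside this interval
            f = (x0 - x[i]) / (x[i+1] - x[i])
            push!(tc, t[i] + f * (t[i+1] - t[i]))
        end
    end
    return tc
end
```

The paper then uses such a series of crossing times, plus some past observations, as the predictive description of the flow.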



Docstring Description of `localmodel_stts` is veeeery poor

    This method works identically to [`localmodel_tsp`](@ref), by expanding the concept
    from vector-states to general array-states.

This is very poor. You have written a decent introduction on the DynamicalSystems.jl page; can you make the current description a bit more like an actual description?

Other Techniques for dimension reduction

There are approaches to dimension reduction other than PCA.
PCA is not always best (it finds only linear relationships).

For example, radial basis functions could sometimes be better.

To implement this, we need a RadialBasisEmbedding <: AbstractSpatialEmbedding
that provides an interface similar to PCAEmbedding:

    RadialBasisEmbedding(training_set, em::SpatioTemporalEmbedding) → embedding

and a reconstruction method

    embedding(inplace_vec, training_set, α, τ)
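A possible skeleton (everything here beyond the proposed type name is hypothetical; the real supertype constraint would come from the package, and center selection is left open):

```julia
# Hypothetical skeleton of the proposed embedding type.
struct RadialBasisEmbedding{T}  # <: AbstractSpatialEmbedding in the package
    centers::Vector{Vector{T}}  # RBF centers, e.g. chosen from the training set
    σ::T                        # kernel width
end

# Evaluate Gaussian radial basis features of a state vector.
function (em::RadialBasisEmbedding)(x::AbstractVector)
    [exp(-sum(abs2, x .- c) / (2em.σ^2)) for c in em.centers]
end
```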



ArgumentError: principal variance cannot exceed total variance.

This happens to me sometimes on the cluster.
So far the worker nodes haven't given me much more useful information.

I highly doubt that there is an error in MultivariateStats.jl, especially since there haven't been any changes since the last project.
I fear that the accumulation of floating point rounding errors of our beloved Float32 is coming back to bite us, but I cannot be certain.

So far this has happened rarely, so we could simply ignore this issue and repeat
the run until it succeeds.
This approach is questionable at best, though.
Another option could be to make explicit conversions to Float64
during the computation of the covmat and convert back to Float32 afterwards.
This will likely involve a few hours of work and result in a significant performance penalty.

    ArgumentError: principal variance cannot exceed total variance.
    Type at /home/isensee/.julia/packages/MultivariateStats/nNJuu/src/pca.jl:24
    #pcacov#8 at /home/isensee/.julia/packages/MultivariateStats/nNJuu/src/pca.jl:112
    #pcacov at ./none:0 [inlined]
    compute_pca at /home/isensee/.julia/dev/TimeseriesPrediction/src/pcaembedding.jl:81 [inlined]
    #PCAEmbedding#48 at /home/isensee/.julia/dev/TimeseriesPrediction/src/pcaembedding.jl:65
    Type at ./none:0
    macro expansion at ./logging.jl:322 [inlined]
    cross_estimation at /scratch15/isensee/embeddingresearch/src/cross_estimation.jl:5
    #112 at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:269
    run_work_thunk at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:56
    macro expansion at /usr/lfpn/SOURCES.ORIGINAL/julia/usr/share/julia/stdlib/v1.0/Distributed/src/process_messages.jl:269 [inlined]
    #111 at ./task.jl:259
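The proposed Float64 conversion could look roughly like this (`covmat32` is a hypothetical helper name, not the package's; `X` holds one observation per column):

```julia
using Statistics

# Accumulate the covariance matrix in Float64 even when the data are Float32,
# then convert back to the working precision.
function covmat32(X::AbstractMatrix{Float32})
    X64 = Float64.(X)            # promote before accumulating
    C = cov(X64; dims = 2)       # (d × d) covariance across columns
    return Float32.(C)
end
```

The conversion costs an extra copy of the data in double precision, which is where the performance penalty mentioned above would come from.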

versions, documentation, paper

@JonasIsensee there are many commits on the master branch that are not in the stable version... Maybe it is a good idea to resume the versioning process? For example, the stable version still uses r_0 for the NamedTuple of the light cone, as I realized while working on the project.

In addition, your paper is now on arXiv. I think it is beneficial (especially for you) to have a citable BibTeX entry in the readme as well as in the documentation of the software.

Let's work a bit more on the symmetry API before putting it in the docs, I have some suggestions on it.



Make `localmodel_tsp` return the `B` end of a reconstruction only

When given a Reconstruction or an MDReconstruction, make it so that it automatically returns the last column, or the last B columns.

To keep the source code clear, we keep the lower-level method with an abstract dataset as is and rename it to _localmodel_tsp (with one _ in front). Then the methods that take a Reconstruction or MDReconstruction call this function and return only the last B columns.

If you have ::MDReconstruction{DxB, D, B, T, τ}, then you know you can create
a = SVector{B, Int}((DxB - i for i in B-1:-1:0)...) and simply return ret[:, a].

(When given a timeseries or Dataset, the behavior should remain as it is now.)
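The static index vector can be checked in isolation (`last_B_columns` is a hypothetical helper wrapping the expression above; requires StaticArrays):

```julia
using StaticArrays

# For a reconstruction with DxB total columns, build a static index vector
# selecting the last B columns, as proposed above.
last_B_columns(DxB::Int, ::Val{B}) where {B} =
    SVector{B, Int}((DxB - i for i in B-1:-1:0)...)
```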

Nearest Trajectory Strategy for Time Series Prediction

(figure omitted)

Currently the scheme for finding nearest neighbors does not distinguish neighbors based on where they lie within the existing dataset. Most of the time (and this is especially true for densely sampled data), nearest neighbors belong to a single, or at best a couple of, trajectory segments. This may mean that a specific trajectory segment is over-weighted for the prediction while other nearby segments are disregarded. The figure describes this perfectly.

It can be advantageous in some cases of timeseries prediction to expand the neighbor-finding strategy to "Nearest Trajectory". This process is described in the paper "A Nearest Trajectory Strategy for Time Series Prediction" by James McNames. You can find the pdf here.



make theiler window argument of `predict_timeseries`

From predict_timeseries:

    for n=1:p   #Iteratively estimate timeseries
        idxs,dists = neighborhood_and_distances(q,R, tree,ntype)
        xnn = R[idxs]
        ynn = R[idxs+step]
        q = method(q, xnn, ynn, dists)
        push!(s_pred, q[end])
    end

clearly the Theiler window is not used in neighborhood_and_distances(q, R, tree, ntype).

The Theiler window should be a keyword argument of predict_timeseries and be used as expected.

You can make w = mean(τ) for the call without R, and w = mean(R.delay) for the call with R.
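A generic post-filter illustrating the intended effect of a Theiler window (this is not the actual neighborhood_and_distances signature, just the idea): neighbors whose temporal index lies within `w` of the query index `n` are discarded, so temporally correlated points do not masquerade as dynamical neighbors.

```julia
# Drop neighbors within the Theiler window w of query index n.
function theiler_filter(idxs, dists, n, w)
    keep = [abs(i - n) > w for i in idxs]
    return idxs[keep], dists[keep]
end
```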

Parameter `c` used twice

Not sure if this is a big problem, but nonetheless:
we use c as the name for the ConstantBoundary parameter
and c as the speed in `light_cone_embedding`.

Parallelized calculation of PCAEmbedding

The only limiting factor for super large (>1000) embedding dimensions,
which can then be reduced by PCA, is the
calculation of the PCAEmbedding.

This could be parallelized
by splitting the dataset into different subsets and
computing the covariance matrix for each of them.
All covariance matrices are then averaged.
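A sketch of the chunked computation (`chunked_cov` is a hypothetical helper; instead of literally averaging per-chunk covariances, it computes the global mean once, accumulates the scatter matrix per chunk with `Threads.@threads`, and combines, which gives the exact result):

```julia
using Statistics

# Parallel covariance of X (one observation per column) over `nchunks` chunks.
function chunked_cov(X::AbstractMatrix; nchunks = Threads.nthreads())
    d, N = size(X)
    μ = mean(X; dims = 2)
    step = cld(N, nchunks)
    ranges = [i:min(i + step - 1, N) for i in 1:step:N]
    partials = [zeros(d, d) for _ in ranges]
    Threads.@threads for k in eachindex(ranges)
        for j in ranges[k]
            v = X[:, j] .- μ
            partials[k] .+= v * v'   # per-chunk scatter matrix
        end
    end
    return sum(partials) ./ (N - 1)
end
```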



Actually use the theiler window...

Well, neighborhood_and_distances has an implementation for using the Theiler window, but there is no information about it in localmodel_tsp, nor any arguments...

error running timeseriesprediction.ipynb

Running this cell gives the following error:

    estimate_delay(s, "first_zero")

    ArgumentError: Unknown method for estimate_delay.

    Stacktrace:
    [1] #estimate_delay#39(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(estimate_delay), ::Array{Float64,1}, ::String, ::StepRange{Int64,Int64}) at /home/steve/.julia/packages/DelayEmbeddings/7PwQ4/src/estimate_delay.jl:73
    [2] estimate_delay at /home/steve/.julia/packages/DelayEmbeddings/7PwQ4/src/estimate_delay.jl:33 [inlined] (repeats 2 times)
    [3] top-level scope at In[15]:1

My environment:
Status ~/.julia/environments/v1.2/Project.toml
[c52e3926] Atom v0.10.1
[6e4b80f9] BenchmarkTools v0.4.3
[5ae59095] Colors v0.9.6
[a93c6f00] DataFrames v0.19.3
[1313f7d8] DataFramesMeta v0.5.0
[abce61dc] Decimals v0.4.0
[31c24e10] Distributions v0.21.1
[61744808] DynamicalSystems v1.3.0
[5789e2e9] FileIO v1.0.7
[28b8d3ca] GR v0.41.0
[bd48cda9] GraphRecipes v0.4.0
[f67ccb44] HDF5 v0.12.3
[7073ff75] IJulia v1.20.0
[916415d5] Images v0.18.0
[f7bf1975] Impute v0.3.0
[70c4c096] Indicators v0.6.0
[a93385a2] JuliaDB v0.12.0
[e5e0dc1b] Juno v0.7.2
[194296ae] LibPQ v0.11.2
[add582a8] MLJ v0.2.3
[d491faf4] MLJModels v0.2.3
[86f7a689] NamedArrays v0.9.3
[eadc2687] Pandas v1.3.0
[14b8a8f1] PkgTemplates v0.6.2
[f0f68f2c] PlotlyJS v0.12.5
[91a5bcdd] Plots v0.26.2
[d330b81b] PyPlot v2.8.2
[1a8c2f83] Query v0.12.1
[612083be] Queryverse v0.3.1
[ce6b1742] RDatasets v0.6.3
[3cdcf5f2] RecipesBase v0.7.0
[295af30f] Revise v2.1.10
[03a91e81] SplitApplyCombine v0.4.1
[2913bbd2] StatsBase v0.32.0
[f3b207a7] StatsPlots v0.12.0
[054b7d4e] Strategems v0.2.0
[bd369af6] Tables v0.2.11
[a110ec8f] Temporal v0.6.1
[9e3dc215] TimeSeries v0.16.0
[f269a46b] TimeZones v0.9.2
[f218859d] TimeseriesPrediction v0.6.0
[9d95f2ec] TypedTables v1.2.0
[b8865327] UnicodePlots v1.1.0
[112f6efa] VegaLite v0.7.0

Improve spatio temporal prediction tests

  1. They never test cross predictions of periodic models.
  2. They must test the field values themselves, besides a coarse-grained average of a prediction.
  3. They should test both maximum and minimum errors.

Add tests for LinearLocalModel

There aren't any tests for LinearLocalModel in the file localmodeling_tests.jl. This would have caught the wrong definition of LinearLocalModel done in #20.

Implement a CompositeBoundary

Just an idea. In particular, this could be discussed extensively in the docs
to highlight extensibility.
This CompositeBoundary would
implement a way to treat each boundary with a separate condition type
(corners could be difficult).

I would like to see if this could be used to easily create a Möbius strip.



Cluster Weighted Modelling for timeseries prediction

In the reference "[1]: Eds. B. Schelter et al., Handbook of Time Series Analysis,
VCH-Wiley, pp 39-65 (2006)", from which we got the implementation for the Local Modelling, there is also an implementation for "Cluster Weighted Modelling" for predicting timeseries.

CWM essentially tries to estimate the joint density p(x, y), since this density allows computing derived quantities like the conditional forecast <y|x> for new query points.

If anyone wants to tackle this, contact me for the pdf!
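For a flavor of the conditional-forecast idea only (this is plain kernel regression, not CWM's mixture-of-local-models density estimate, and every name here is hypothetical):

```julia
# Estimate <y|x> at a query point x via Gaussian kernel weights over the
# training pairs (xs, ys). CWM proper would instead fit a cluster-weighted
# density p(x, y) and derive the same conditional expectation from it.
function conditional_forecast(xs::Vector{Float64}, ys::Vector{Float64}, x, σ)
    w = [exp(-((xi - x)^2) / (2σ^2)) for xi in xs]  # kernel weights
    return sum(w .* ys) / sum(w)                    # weighted mean of targets
end
```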



Consistency of source code for crossprediction

The docstring of crossprediction, which btw we may want to rename to crossestimation, has:

    crossprediction(source_train, target_train, source_pred,

yet none of these terms are in the source code for the function, which reads:

    function crossprediction(train_in ::AbstractVector{<:AbstractArray{T, Φ}},
                             train_out::AbstractVector{<:AbstractArray{T, Φ}},
                             pred_in  ::AbstractVector{<:AbstractArray{T, Φ}},

or

    function crossprediction(params, train_out, pred_in, R, tree; progress=true)

I think it is best if there is an agreement on the terms used and the same thing is used for documentation and source code.



Less memory usage with KDTrees

By default a KDTree stores a copy of the data.

AFAICT this could be avoided by creating the tree with

    data = reconstruct(U, em)
    dftree = DataFreeTree(KDTree, data)
    tree = injectdata(dftree, data)

Furthermore, in both cases the reconstruction (data)
is available under tree.data.
With this we would not have to explicitly pass the reconstruction to the prediction functions.

If we also write our own constructors, we can allow using views and strip away a safeguard in injectdata,
to take care of the case in temporalprediction where some reconstructed points are in the reconstruction but not part of the tree.



Documentation: light_cone_embedding and r0

The documentation of light_cone_embedding does not have r0 anymore (for no reason whatsoever).

But this creates a problem, as it states:

    The radius of the light cone evolves as: r = i*τ*c + r for each step i ∈ 0:γ.

while the two r symbols used in the equation actually mean different things.

Issue with light cone embedding

Here is how the embedding looks currently:

    time  | c = 1.0               | c = 2.0               | c = 0.0
    n + 1 | ..........o.......... | ..........o.......... | ..........o..........
    n     | .........x□x......... | .........x□x......... | .........x□x.........
    n - 1 | ..................... | ..................... | .....................
    n - τ | .......xxxxxxx....... | .....xxxxxxxxxx...... | .........xxx.........

What you notice is that even though τ is greater than 1, the first spatiotemporal frame is always directly behind the point to be predicted.

Isn't this wrong? You want something that is exactly τ away from the point to be predicted... Right?

@JonasIsensee could this be why the light cone embedding doesn't work that well? Or maybe I have misunderstood how this works?

Barkley in tests/system_defs.jl is very slow

I really like the implementation of the Barkley model that we have in
test/system_defs.jl because it is so flexible.
It turns out that it has an immense problem with Core.Box inside its stepper function.

This makes it about 30× slower than a simpler implementation.
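A minimal reproduction of the Core.Box problem, unrelated to the Barkley code itself: a closure over a variable that is reassigned after the closure is created gets boxed, making every access dynamically typed. Capturing via `let` avoids the box (note it also changes which value the closure sees, so the real fix must be chosen to preserve semantics):

```julia
# The closure captures `a` by reference (a Core.Box), because `a` is
# reassigned after the closure is created.
function boxed()
    a = 1.0
    f = () -> a
    a = 2.0
    return f()   # sees the reassigned value through the box
end

# `let a = a` introduces a fresh, never-reassigned local, so the capture
# is a plain Float64 and the closure is type stable.
function unboxed()
    a = 1.0
    f = let a = a
        () -> a
    end
    a = 2.0
    return f()   # sees the value captured at closure creation
end
```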

Write Tests for (Discrete) Maps

The package has only been tested and developed for continuous systems so far.

Everything should work just the same though. (Fingers crossed)

Write Tests which verify that it works

Saving PCAEmbedding to .jld2

It is not possible to load
a PCAEmbedding that was saved to a .jld2 file.

One has to import MultivariateStats first,
because otherwise MultivariateStats.PCA is not a known type and will be "reconstructed" (bad).

Likely fixed by exporting PCA.
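A sketch of the current workaround, using a plain MultivariateStats.PCA for brevity instead of a PCAEmbedding (jldsave/load are JLD2's standard API):

```julia
using JLD2
using MultivariateStats   # must be loaded before reading the file, or the
                          # PCA type is unknown and JLD2 falls back to a
                          # "reconstructed" placeholder type

X = randn(5, 100)                  # observations in columns
p = fit(PCA, X; maxoutdim = 2)

fn = joinpath(mktempdir(), "pca.jld2")
jldsave(fn; p)                     # saving works regardless
p2 = load(fn, "p")                 # loading needs MultivariateStats in scope
```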


