microprediction / timemachines

Predict time-series with one line of code.

Home Page: https://www.microprediction.com/blog/popular-timeseries-packages

License: MIT License

Python 92.41% Jupyter Notebook 7.59%

timeseries time-series time-series-analysis timeseries-analysis timeseries-forecasting timeseries-data prediction predictive-modeling prediction-algorithm predictions

timemachines's Introduction

timemachines

Simple prediction functions (documented and assessed)

Because why not do things in walk-forward incremental fashion with one line of code? Here yt is a vector or scalar, and we want to predict yt (or its first coordinate if a vector) three steps in advance.

 from timemachines.skaters.somepackage.somevariety import something as f
 s = {}                                   # empty prior state before the first observation
 for yt in y:
     xt, xt_std, s = f(y=yt, s=s, k=3)

This emits a k-vector xt of forecasts, and corresponding k-vector xt_std of estimated standard errors. See skaters for choices of somepackage, somevariety and something. You can also ensemble, compose, bootstrap and do other things with one line of code.
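To make the calling pattern concrete without installing any forecasting package, here is a minimal sketch of a function that follows the same convention. The name `last_value_skater` and its internals are assumptions made for illustration; it is not one of the skaters shipped with timemachines.

```python
import math

def last_value_skater(y, s, k):
    """Hypothetical skater: forecast the last value, with a running error scale."""
    y0 = y[0] if isinstance(y, (list, tuple)) else y     # first coordinate if a vector
    if not s:                                            # empty state on the first call
        s = {"n": 0, "prev": None, "sse": 0.0}
    if s["prev"] is not None:
        s["sse"] += (y0 - s["prev"]) ** 2                # squared one-step-ahead error
        s["n"] += 1
    s["prev"] = y0
    std1 = math.sqrt(s["sse"] / s["n"]) if s["n"] else 1.0
    x = [y0] * k                                         # same point forecast at every horizon
    x_std = [std1 * math.sqrt(j + 1) for j in range(k)]  # random-walk style widening
    return x, x_std, s

y = [1.0, 2.0, 4.0, 3.0, 5.0]
s = {}
for yt in y:
    xt, xt_std, s = last_value_skater(y=yt, s=s, k=3)
print(xt, xt_std)
```

The caller carries `s` from one invocation to the next, which is all the walk-forward loop requires.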

See the docs.

Packages used

Skaters draw on functionality from popular python time-series packages like river, pydlm, tbats, pmdarima, statsmodels.tsa, neuralprophet, Facebook Prophet, Uber's orbit, Facebook's greykite and more. See the docs.

What's a "skater"?

More abstractly:

$$ f : (y_t, \text{state}; k) \mapsto \left( [\hat{y}(t+1),\hat{y}(t+2),\dots,\hat{y}(t+k)], [\sigma(t+1),\dots,\sigma(t+k)], \text{posterior state} \right) $$

where $\sigma(t+l)$ estimates the standard error of the prediction $\hat{y}(t+l)$.

If you prefer a legitimate (i.e. stateful) state machine, see FAQ question 1.
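One way a stateful wrapper might look is sketched below. This is an assumption about the general idea, not necessarily what the FAQ proposes; `StatefulSkater` and `running_mean_skater` are hypothetical names.

```python
class StatefulSkater:
    """Hypothetical wrapper that hides the state argument of a skater-style function."""

    def __init__(self, f, k=1):
        self.f = f
        self.k = k
        self.s = {}                       # state lives on the object

    def __call__(self, y):
        x, x_std, self.s = self.f(y=y, s=self.s, k=self.k)
        return x, x_std

def running_mean_skater(y, s, k):
    """Toy skater used only for illustration: forecasts the running mean."""
    s = dict(s) if s else {"n": 0, "total": 0.0}
    s["n"] += 1
    s["total"] += y
    mean = s["total"] / s["n"]
    return [mean] * k, [1.0] * k, s

machine = StatefulSkater(running_mean_skater, k=2)
for yt in [2.0, 4.0, 6.0]:
    x, x_std = machine(yt)
print(x)
```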

Skater function conventions

See docs/interface for description of skater inputs and outputs. Briefly:

  x, w, s = f(   y:Union[float,[float]],             # Contemporaneously observed data, 
                                                     # ... including exogenous variables in y[1:], if any. 
            s=None,                                  # Prior state
            k:float=1,                               # Number of steps ahead to forecast. Typically integer. 
            a:[float]=None,                          # Variable(s) known in advance, or conditioning
            t:float=None,                            # Time of observation (epoch seconds)
            e:float=None,                            # Non-binding maximal computation time ("e for expiry"), in seconds
            r:float=None)                            # Hyper-parameters ("r" stands for hype(r)-pa(r)amete(r)s)
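As a rough illustration of these argument names, the sketch below defines a toy function that accepts all of them. The function `ema_skater` is hypothetical, and treating `r` as a smoothing weight in (0,1) is an assumption made for this example only; `a`, `t` and `e` are accepted but ignored.

```python
from typing import List, Optional, Union

def ema_skater(y: Union[float, List[float]], s=None, k: int = 1,
               a: Optional[List[float]] = None, t: Optional[float] = None,
               e: Optional[float] = None, r: Optional[float] = None):
    """Toy skater with the documented argument names (illustrative only)."""
    y0 = y[0] if isinstance(y, (list, tuple)) else y   # target; y[1:] would be exogenous
    alpha = 0.5 if r is None else r                    # assumption: r acts as a smoothing weight
    if not s:
        s = {"ema": y0, "var": 0.0}                    # first observation initialises the state
    else:
        err = y0 - s["ema"]
        s = {"ema": s["ema"] + alpha * err,
             "var": (1 - alpha) * s["var"] + alpha * err ** 2}
    x = [s["ema"]] * k                                 # flat forecast over k steps
    x_std = [s["var"] ** 0.5] * k                      # crude error scale
    return x, x_std, s

s = None
for yt in [[1.0, 10.0], [3.0, 11.0], [5.0, 12.0]]:    # second coordinate is ignored here
    x, x_std, s = ema_skater(y=yt, s=s, k=2, r=0.5)
print(x, x_std)
```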

Contributions and capstone projects

See CONTRIBUTE.md and good first issues.
See the suggested steps for a capstone project.

Getting live help

FAQ.
See the Slack invite on my user page here.
Office hours here.
Learn how to deploy some of these models and try to win the daily $125 prize.

Install instructions

Oh what a mess the Python timeseries ecosystem is. So the packages that skaters draw on are not installed by default. See the methodical install instructions and be incremental for best results. The infamous xkcd cartoon really does describe the alternative quite well.

Cite

Thanks

    @electronic{cottontimemachines,
        title = {{Timemachines: A Python Package for Creating and Assessing Autonomous Time-Series Prediction Algorithms}},
        year = {2021},
        author = {Peter Cotton},
        url = {https://github.com/microprediction/timemachines}
    }

or something here.


timemachines's Issues

Provide utilities to enter M6

A bit early as I write this, but see m6

top_rated_model fails with k=2

This is almost certainly an aftermath of JSON leaderboard corruption. Reproduce the bug as follows:

  from timemachines.skatertools.recommendations.suggestions import top_rated_models, get_ratings
  from pprint import pprint
  max_seconds = 1
  min_count = 10
  for k in [1, 2, 5, 13]:
      print('k=' + str(k))
      suggestions = top_rated_models(k=k, max_seconds=500, require_passing=True)
      pprint(suggestions)
      print('')

consider removing statsmodels dependency

Neuralprophet is failing

Probably something obvious

should check n_jobs=1 throughout

should check n_jobs=1 throughout so relative timing is accurate

Add AutoTS

Create ats following the instructions provided here

Add biteopt

Duplicate of microprediction/humpday#9 but leaving here as nobody reads the humpday issues (not even me, apparently).

Documentation Question

Hi,

Was looking to find any instance / mention of Confidence or Prediction Intervals - could have this wrong, just a novice here looking into your package.

Thanks!
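For what it is worth, the README describes each skater as returning a k-vector of forecasts together with a k-vector of estimated standard errors. Assuming roughly Gaussian forecast errors (an assumption, not something the package guarantees), an approximate interval could be built as sketched here. The helper `approx_interval` and the numbers are placeholders for illustration.

```python
def approx_interval(x, x_std, z=1.96):
    """Return (lower, upper) lists, assuming roughly Gaussian forecast errors."""
    lower = [xi - z * si for xi, si in zip(x, x_std)]
    upper = [xi + z * si for xi, si in zip(x, x_std)]
    return lower, upper

# Placeholder forecasts and standard errors, standing in for skater output
x = [10.0, 10.5, 11.0]
x_std = [1.0, 1.5, 2.0]
lower, upper = approx_interval(x, x_std)
print(list(zip(lower, upper)))
```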

cache top_rated_models

Produce 2-d plot showing accuracy and speed

Produce a 2-d plot showing accuracy and speed, in log scale, to convey which algorithms are a good tradeoff.
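Before plotting, one could identify which algorithms are a good tradeoff by keeping only those not beaten on both error and run time. The sketch below does this and also computes log-scale coordinates for a scatter plot. The algorithm names and numbers are made-up placeholders, not measured results.

```python
import math

def pareto_front(results):
    """Keep entries not dominated on both error and seconds (lower is better)."""
    front = []
    for name, err, secs in results:
        dominated = any(e2 <= err and s2 <= secs and (e2 < err or s2 < secs)
                        for _, e2, s2 in results)
        if not dominated:
            front.append(name)
    return front

# (name, error, seconds): hypothetical values for illustration only
results = [
    ("fast_rough", 1.00, 0.01),
    ("slow_sharp", 0.80, 5.00),
    ("slow_rough", 1.10, 4.00),
    ("balanced", 0.85, 0.50),
]

front = pareto_front(results)
# Coordinates for a log-log scatter plot: x = log10(seconds), y = log10(error)
log_points = {name: (math.log10(secs), math.log10(err)) for name, err, secs in results}
print(front)
```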

consider removing sklearn dependency

Why are predictions changing on each instantiation?

(easy) Create notebook example for timecast

Steps:

Create notebook example hello_timecast of using timecast package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Add gluonts

Create glu

Create notebook example of forecast

Steps:

Create notebook example hello_forecast of using forecast package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Add some of the pycaret datasets as loop streams

https://github.com/pycaret/datasets

Improve Orbit skaters

Orbit-ml skaters should benefit from pre-processing.

import error when trying to run the example on homepage

Hello, I pip installed timemachines 0.7.2 with python3.8.5 on Ubuntu 20.04 LTS successfully. I can import timemachines package. But can't run the example on homepage due to missing imports on the first line:

from timemachines.skatertools.data import hospital_with_exog

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-e7d8e2dd7f0b> in <module>
----> 1 from timemachines.skatertools.data import hospital_with_exog
      2 from timemachines.skatertools.visualization.priorplot import prior_plot
      3 import matplotlib.pyplot as plt
      4 
      5 # Get some data

ImportError: cannot import name 'hospital_with_exog' from 'timemachines.skatertools.data' (/home/crayonfu/myProjects/ts_test/bt_venv/lib/python3.8/site-packages/timemachines/skatertools/data/__init__.py)

I can't find the class of hospital_with_exog in the directory of timemachines/skatertools/data on github repo. Am I missing something here? Thank you.

Create hot dog or not image contest

Steps

Find infinite collection of labeled hot dog or not images
Every 20 minutes, run a github action (similar to here) that replaces a file in the hot_dog_or_not repo with a new image.
At the same time, publish the answer to microprediction
Use set_repository() on the stream creator to point people to the hot_dog_or_not_repo

Merlion failing in elo ratings run

https://microprediction.github.io/timeseries-elo-ratings/html_leaderboards/univariate-k_003.html

Even more timeseries to include

from georgios Paraskevas....

You can also add:
https://pypi.org/project/pyemd/ for empirical mode decomposition. Quite useful for time series decomposition

TICC https://github.com/davidhallac/TICC for clustering of multivariate time series
and Greedy Gaussian Segmentation
https://github.com/cvxgrp/GGS

couple minor things I noticed

I've been using hyperopt and optuna for a while, very curious to see if any of the other optimizers do better, although honestly my use cases may be pretty simple and I suspect not that much difference.

would add requirements.txt or conda environment.yml to make it easier to set up, see below

in timemachines/optimizers/alloptimizers.py I see
https://github.com/microprediction/timemachines/blob/main/timemachines/optimizers/alloptimizers.py#L40

print(optimizer.__name__,(optimizer.__name__,optimizer(objective, n_trials=50, n_dim=5, with_count=True)))

I think this should use the loop parameters per below?

print(optimizer.__name__,(optimizer.__name__,optimizer(objective, n_trials=n_trials, n_dim=n_dim, with_count=True)))

alloptimizers.py seems to run single_threaded, would consider pool.map to run many optimizations concurrently (but I'm not sure best way to do that). if that makes sense let me know and I can take a crack at a pull request.
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map
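A minimal sketch of the `map` idea, under stated assumptions: `random_search` is a stand-in optimizer written for this example, not one from the repository, and `ThreadPool` is used in place of a process pool so the snippet runs anywhere without a main-module guard. For CPU-bound optimizers, `multiprocessing.Pool` offers the same `map` interface but should be called under `if __name__ == '__main__':`.

```python
from multiprocessing.pool import ThreadPool
import random

def objective(u):
    """Toy objective on the unit cube, minimised at u = [0.5, ...]."""
    return sum((ui - 0.5) ** 2 for ui in u)

def random_search(objective, n_trials, n_dim, seed=0):
    """Stand-in optimizer (hypothetical): returns (best_value, best_point)."""
    rng = random.Random(seed)
    candidates = ([rng.random() for _ in range(n_dim)] for _ in range(n_trials))
    return min((objective(u), u) for u in candidates)

def run_one(job):
    """Run a single (optimizer, n_trials, n_dim) combination."""
    name, optimizer, n_trials, n_dim = job
    best_value, _ = optimizer(objective, n_trials=n_trials, n_dim=n_dim)
    return name, n_trials, n_dim, best_value

# One job per combination, passing the loop parameters rather than constants
jobs = [("random_search", random_search, n_trials, n_dim)
        for n_trials in (20, 50) for n_dim in (2, 5)]

with ThreadPool(processes=4) as pool:
    results = pool.map(run_one, jobs)   # results come back in job order

for row in results:
    print(row)
```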

this is requirements.txt I have, can do a pull request, lmk, might need testing/adjusting,

# may want to install fbprophet via conda, needs compiler, pystan
ax-platform
deap
divinity
fbprophet
funcy
hyperopt
microconventions
momentum
nevergrad
numpy==1.19.5
optuna
platypus-opt
poap
pydlm
pymoo
pystan
pySOT
swarmlib

Add some classic time-series

Add some classic time-series in loops. But do it in a way that it takes the data a LONG time to exhaust completely. What I have in mind is the following:

Choose a long univariate series from Libra, UCI, UC whatever
Or take two, and hit with a 2 x 2 matrix.
Hold out the second half of the data set
Loop over the first half
On the next loop, include one data point (or a few) from the second half of the series
Etc

Maybe arrange it so that the extra point (or a few) enter at a specific time of day, so if anyone really cares, there could be an assessment completely out of sample.

Or maybe there are two streams, and one stream only ever had new points added.

Since these are usually public, we probably shouldn't obsess too much, since people can cheat if they really have their hearts set on it.
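The looping scheme described above might be sketched as a generator. The name `growing_loops` and the `step` parameter are assumptions for illustration; the time-of-day release and the two-stream variant are not modelled.

```python
def growing_loops(series, n_loops, step=1):
    """Yield one list per loop: the first half plus `step` more held-out points each time."""
    half = len(series) // 2
    for loop in range(n_loops):
        end = min(half + loop * step, len(series))   # never run past the end of the data
        yield series[:end]

series = list(range(10))                              # placeholder for a long univariate series
loops = list(growing_loops(series, n_loops=3))
for i, chunk in enumerate(loops):
    print(i, chunk)
```

With `step=1` and a series of length n, it takes about n/2 full loops before the held-out half is exhausted.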

More optimizers to include

https://github.com/gdikov/hypertunity (wraps gpyopt)

https://github.com/google/jax/blob/master/benchmarks/benchmark.py
https://github.com/davisking/dlib
scikit-optimize ... others
gpyopt
spearmint. but strange license!
https://docs.ray.io/en/latest/tune/
https://github.com/topics/grey-wolf-optimizer
https://github.com/AxeldeRomblay/MLBox
https://www.ml4aad.org/automl/bohb/
https://github.com/krzysztofarendt/modestga

Divinity breaking due to use of deprecated model path

raise NotImplementedError(ARIMA_DEPRECATION_ERROR)
NotImplementedError:
statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been removed in favor of statsmodels.tsa.arima.model.ARIMA

It's really a divinity bug reported at drds1/divinity#8
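For reference, code that depends on ARIMA could import it from the new location named in the error message. The guarded import below is only a sketch of that idea; `fit_arima` is a hypothetical helper and this is not the fix adopted by divinity.

```python
try:
    # Current location, per the error message above
    from statsmodels.tsa.arima.model import ARIMA
except ImportError:
    ARIMA = None   # statsmodels is missing, or too old to have the new path

def fit_arima(y, order=(1, 0, 0)):
    """Fit an ARIMA model via the new import path, if it is available."""
    if ARIMA is None:
        raise RuntimeError("statsmodels.tsa.arima.model.ARIMA is unavailable")
    return ARIMA(y, order=order).fit()
```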

Create new one-liner that bounces

Create a different shell script, or combination of two, similar to
this one that

Checks for a private key file
Saves the private key, if it is generated
Restarts when the python command crashes, which it eventually will
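The issue asks for a shell script; the same restart logic is sketched here in Python instead, purely for illustration. The function name `bounce`, the key file path, and the restart cap are assumptions. Saving a newly generated key is left out, since the issue does not say how the key is produced.

```python
import os
import subprocess
import sys
import time

def bounce(cmd, key_path, max_restarts=3, pause=0.0):
    """Re-run cmd until it exits cleanly or max_restarts is reached."""
    if os.path.exists(key_path):
        print("Found existing key file", key_path)
    else:
        print("No key file yet at", key_path)
    runs = 0
    code = None
    while runs <= max_restarts:
        runs += 1
        code = subprocess.call(cmd)      # blocks until the command exits
        if code == 0:
            break                        # clean exit, no restart needed
        time.sleep(pause)                # brief pause before restarting
    return runs, code

# Demonstration with a command that always crashes: one run plus two restarts
crash = [sys.executable, "-c", "import sys; sys.exit(1)"]
runs, code = bounce(crash, key_path="private_key.txt", max_restarts=2)
print(runs, code)
```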

TypeError: evaluate_mean_squared_error_with_sporadic_fit() got an unexpected keyword argument 'e'

Hi,

I am testing the examples and get the following error with this example code:


from timemachines.skatertools.tuning.hyperempirical import optimal_r_for_stream
from timemachines.skaters.proph.prophskaterssingular import fbprophet_univariate_r2
from humpday.optimizers.optunacube import optuna_tpe_cube
from timemachines.skaters.proph.prophparams import PROPHET_META, prophet_params
from pprint import pprint
from timemachines.skatertools.data.live import random_regular_stream_name

# Illustrates how to find the best hyper-parameter r in (0,1), and interpret this as two prophet hyper-parameters
# We use a random stream from https://www.microprediction.org/browse_streams.html
# You should expect this to take many hours. A time update is provided after the first function evaluation.


if __name__=='__main__':
    name, url = random_regular_stream_name(min_len=PROPHET_META['n_warm'], with_url=True)
    print('We will find the best fbprophet hyper-parameters for '+url)
    print("Prophet will be fit for most of them, after a burn_in, and for many different hyper-params. Don't hold your breathe.")

    best_r, best_value, info = optimal_r_for_stream(f=fbprophet_univariate_r2,name=name,k=10,optimizer=optuna_tpe_cube,
                                                    n_burn=PROPHET_META['n_warm']+20,n_trials=50,n_dim=2)
    pprint(info)
    params = prophet_params(r=best_r,dim=2)
    pprint(params)


We will find the best fbprophet hyper-parameters for https://www.microprediction.org/stream_dashboard.html?stream=finance-futures-corn-change
Prophet will be fit for most of them, after a burn_in, and for many different hyper-params. Don't hold your breathe.
Traceback (most recent call last):
  File "D:\Anaconda3\envs\aim\lib\site-packages\IPython\core\interactiveshell.py", line 3444, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-9-3596626c8db7>", line 5, in <module>
    best_r, best_value, info = optimal_r_for_stream(f=fbprophet_univariate_r2,name=name,k=10,optimizer=optuna_tpe_cube,
  File "D:\Anaconda3\envs\aim\lib\site-packages\timemachines\skatertools\tuning\hyperempirical.py", line 36, in optimal_r_for_stream
    return optimal_r(f=f,y=y,k=k, a=None,t=t,e=None,evaluator=evaluator,optimizer=optimizer,n_trials=n_trials,
  File "D:\Anaconda3\envs\aim\lib\site-packages\timemachines\skatertools\tuning\hyper.py", line 54, in optimal_r
    a_test = objective(u=[0.5]*n_dim)  # Fail fast with easier trace
  File "D:\Anaconda3\envs\aim\lib\site-packages\timemachines\skatertools\tuning\hyper.py", line 50, in objective
    return evaluator(f=f, y=y, k=k, a=a, t=t, e=r, r=r, n_burn=n_burn)
TypeError: evaluate_mean_squared_error_with_sporadic_fit() got an unexpected keyword argument 'e'
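The traceback suggests that the evaluator's signature has no `e` parameter while the caller passes one. The sketch below reproduces that kind of mismatch with a stand-in function and shows one defensive pattern, filtering keyword arguments against the callee's signature. Both `call_with_supported_kwargs` and `evaluator_without_e` are hypothetical; this is not the fix applied in the package.

```python
import inspect

def call_with_supported_kwargs(fn, **kwargs):
    """Call fn with only the keyword arguments its signature accepts."""
    params = inspect.signature(fn).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(**kwargs)              # fn takes **kwargs, so pass everything
    supported = {name: value for name, value in kwargs.items() if name in params}
    return fn(**supported)

def evaluator_without_e(f, y, k, a, t, r, n_burn):
    """Stand-in evaluator (hypothetical) whose signature lacks `e`."""
    return (k, n_burn)

kwargs = dict(f=None, y=[], k=10, a=None, t=None, e=None, r=0.5, n_burn=20)

# Direct call fails in the same way as the traceback above
try:
    evaluator_without_e(**kwargs)
    raised = False
except TypeError:
    raised = True

# Filtered call drops the unsupported `e` and succeeds
result = call_with_supported_kwargs(evaluator_without_e, **kwargs)
print(raised, result)
```

Silently dropping arguments can hide real bugs, so an explicit signature fix in the evaluator is usually preferable when the code is under your control.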

PLEASE READ CONTRIBUTE.md

Hi all, please read CONTRIBUTE.md if you need a quick lay of the land.

Add Facebook KATS

Create kts following the instructions provided here

(easy) Create notebook example for proust

Steps:

Create notebook example hello_proust of using proust package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

consider removing pandas dependency

Consider removing pandas dependency. Even though many packages need this, and it can be helpful to diagnose install issues by having it, I think we should ditch pandas from setup and deal with it gracefully.

Timemachines doesn't need this baggage.

Ensemble ideas

Non-linear stacking https://bmcresnotes.biomedcentral.com/articles/10.1186/s13104-020-4931-7

Gelman's paper. https://arxiv.org/abs/2101.08954

Orbit skater

New package from Uber, Orbit.

Ian is working on it.

Create notebook example for pygam

Steps:

Create notebook example hello_pygam of using pygam package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Write // version of precision weighted skater

Write a version of precision weighted skater so that it can track the accuracy of ensembles more efficiently. At present each individual model keeps its own copy of all history.
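As a sketch of the underlying idea only, precision weighting combines forecasts with weights proportional to the inverse of each model's empirical error variance, and that variance can be tracked incrementally instead of storing history. The names below are hypothetical and this is not the package's implementation.

```python
class RunningErrorVar:
    """Tracks mean squared error incrementally, without storing history."""

    def __init__(self):
        self.n = 0
        self.sse = 0.0

    def update(self, forecast, actual):
        self.n += 1
        self.sse += (forecast - actual) ** 2

    @property
    def value(self):
        return self.sse / self.n if self.n else 1.0

def precision_weighted(forecasts, error_vars):
    """Combine forecasts with weights proportional to 1/variance."""
    precisions = [1.0 / max(v, 1e-12) for v in error_vars]   # guard against zero variance
    total = sum(precisions)
    weights = [p / total for p in precisions]
    combined = sum(w * f for w, f in zip(weights, forecasts))
    return combined, weights

# Two hypothetical models scored against the same two observations
tracker_a, tracker_b = RunningErrorVar(), RunningErrorVar()
for fa, fb, actual in [(1.0, 2.0, 0.0), (4.0, 3.0, 5.0)]:
    tracker_a.update(fa, actual)      # errors of +1 and -1
    tracker_b.update(fb, actual)      # errors of +2 and -2

combined, weights = precision_weighted([10.0, 12.0], [tracker_a.value, tracker_b.value])
print(combined, weights)
```

Each tracker holds two numbers per model, so memory stays constant however long the stream runs.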

Create notebook example of hcrystalball

Steps:

Create notebook example hello_hcrystalball of using hcrystalball package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Create notebook example of cesium

Steps:

Create notebook example hello_cesium of using cesium package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Create notebook example of tsai

Steps:

Create notebook example hello_tsai of using tsai package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Create notebook example of timeseries

Steps:

Create notebook example hello_timeseries of using timeseries package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Create notebook example of cronston

Steps:

Create notebook example hello_cronston of using cronston package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

make sure greykite is in ratings

Create notebook example of auto_ts

Steps:

Create notebook example hello_auto_ts of using auto_ts package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

fix package loading so there is no http call slowing down initial module loading

Create notebook example for autogluon

Steps:

Create notebook example hello_autogluon of using autogluon package (similar to hello_divinity)
Make a pull request at timeseries-notebooks

Create bitcoin ETF bias stream

Create the ETF stream.
Create a stream that is ETF divided by bitcoin price.
Get fancy if you like if we know the cash holding of ETF

(Documentation) Note on mutability of state passed to skater

Hi, I am not sure why predictions are not consistent. I ran the following:

from timemachines.skaters.simple.thinking import thinking_slow_and_fast
import numpy as np
y = np.cumsum(np.random.randn(1000))
s = {}
x = list()
for yi in y:
    xi, x_std, s = thinking_slow_and_fast(y=yi, s=s, k=3)

Then, wanted to verify predictions using state variable "s" and the same y using the following simple loop code:

for i in range(0, 10):
    x_new, x_std_new, s_new = thinking_slow_and_fast(y=yi, s=s, k=3)
    print(x_new)

I was expecting to see the same predictions for each loop given that I am using the same yi and s variables (ie they haven't changed). But got different predictions:

[-1.0763699106402587, -1.0762804377907655, -1.0761909649412726]
[-1.0848981995166735, -1.08481700115846, -1.0847358028002463]
[-1.0930360369514371, -1.0929623509702981, -1.092888664989159]
[-1.1007996104897944, -1.1007327443804924, -1.1006658782711902]
[-1.1082046070063825, -1.108143931623388, -1.108083256240394]
[-1.115266209255507, -1.1152111531047109, -1.1151560969539147]
[-1.1219990952068173, -1.1219491392351129, -1.1218991832634084]
[-1.128417439789925, -1.1283721126322885, -1.1283267854746521]
[-1.1345349187114502, -1.1344937923578347, -1.1344526660042193]
[-1.1403647140440984, -1.1403273998910912, -1.140290085738084]

Shouldn't using the same state "s" and the same "y" give us the same prediction?
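The issue title points at mutability of the state. If a skater updates the state object in place, then `s` is changed by every call even when the returned state is bound to a different name. The toy example below reproduces that behaviour and shows that passing a deep copy gives repeatable forecasts. The function `mutating_skater` is hypothetical, and whether thinking_slow_and_fast behaves this way is an inference from the title, not something verified here.

```python
import copy

def mutating_skater(y, s, k):
    """Toy skater (hypothetical) that updates its state dict in place."""
    if not s:
        s["ema"] = y                               # mutates the caller's dict
    else:
        s["ema"] = 0.9 * s["ema"] + 0.1 * y        # also in place
    return [s["ema"]] * k, [1.0] * k, s

s = {}
for yi in [1.0, 2.0, 3.0]:
    x, x_std, s = mutating_skater(y=yi, s=s, k=1)

# Reusing s directly: each call changes s, so the forecasts drift
drift = [mutating_skater(y=3.0, s=s, k=1)[0][0] for _ in range(3)]

# Passing a fresh deep copy each time leaves the snapshot untouched
frozen = copy.deepcopy(s)
stable = [mutating_skater(y=3.0, s=copy.deepcopy(frozen), k=1)[0][0] for _ in range(3)]

print(drift)
print(stable)
```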

Create notebook example hello_gluonts of using gluonts package (similar to hello_divinity)
Make a pull request at timeseries-notebooks
