uber / orbit

A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

Home Page: https://orbit-ml.readthedocs.io/en/stable/

License: Other

Python 93.86% Stan 6.10% Dockerfile 0.05%
python forecasting bayesian exponential-smoothing pyro stan pystan probabilistic-programming probabilistic forecast

orbit's Introduction

Join Slack   |   Documentation   |   Blog - Intro   |   Blog - v1.1

Orbit banner



User Notice

The default branch of the repo is dev. To install the dev version, please check the section Installing from Dev Branch. If you are looking for a stable version, please refer to the master branch here.

Disclaimer

This project

  • is stable and being incubated for long-term support. It may contain new experimental code, for which APIs are subject to change.
  • requires cmdstanpy as one of the core dependencies for Bayesian sampling.

Orbit: A Python Package for Bayesian Forecasting

Orbit is a Python package for Bayesian time series forecasting and inference. It provides a familiar and intuitive initialize-fit-predict interface for time series tasks, while utilizing probabilistic programming languages under the hood.

For details, check out our documentation and tutorials.

Currently, it supports concrete implementations for the following models:

  • Exponential Smoothing (ETS)
  • Local Global Trend (LGT)
  • Damped Local Trend (DLT)
  • Kernel Time-based Regression (KTR)

It also supports the following sampling/optimization methods for model estimation/inference (a brief sketch follows the list):

  • Markov-Chain Monte Carlo (MCMC) as a full sampling method
  • Maximum a Posteriori (MAP) as a point estimate method
  • Variational Inference (VI) as a hybrid-sampling method on approximate distribution
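
The estimation method is typically selected when constructing the model. A minimal sketch, assuming the estimator argument accepts values such as 'stan-mcmc', 'stan-map', and 'pyro-svi' (check the documentation for the exact options in your version):

from orbit.models import DLT

# Illustrative only: the same model class fit with different estimators.
# The estimator strings below are assumptions; see the Orbit docs for your version.
dlt_mcmc = DLT(response_col='claims', date_col='week', estimator='stan-mcmc')  # full MCMC sampling
dlt_map = DLT(response_col='claims', date_col='week', estimator='stan-map')    # MAP point estimate
dlt_vi = DLT(response_col='claims', date_col='week', estimator='pyro-svi')     # variational inference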

Installation

Installing Stable Release

Install the library either from PyPI or from source with pip. Alternatively, you can also install it from the conda-forge channel with conda:

With pip

  1. Install from PyPI

    $ pip install orbit-ml
  2. Install from source

    $ git clone https://github.com/uber/orbit.git
    $ cd orbit
    $ pip install -r requirements.txt
    $ pip install .

With conda

The library can be installed from the conda-forge channel using conda.

$ conda install -c conda-forge orbit-ml

Installing from Dev Branch

$ pip install git+https://github.com/uber/orbit.git@dev

Quick Start with Damped-Local-Trend (DLT) Model

FULL Bayesian Prediction

from orbit.utils.dataset import load_iclaims
from orbit.models import DLT
from orbit.diagnostics.plot import plot_predicted_data

# log-transformed data
df = load_iclaims()
# train-test split
test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]

dlt = DLT(
  response_col='claims', date_col='week',
  regressor_col=['trend.unemploy', 'trend.filling', 'trend.job'],
  seasonality=52,
)
dlt.fit(df=train_df)

# outcomes data frame
predicted_df = dlt.predict(df=test_df)

plot_predicted_data(
  training_actual_df=train_df, predicted_df=predicted_df,
  date_col=dlt.date_col, actual_col=dlt.response_col,
  test_actual_df=test_df
)

full-pred (plot of the full Bayesian predictions against actuals)

Demo

Nowcasting with Regression in DLT:

Open in Colab

Backtest on M3 Data:

Open in Colab

More examples can be found under tutorials and examples.

Contributing

We welcome community contributors to the project. Before you start, please read our code of conduct and check out the contributing guidelines first.

Versioning

We document versions and changes in our changelog.

References

Presentations

Check out the ongoing deck for the scope and roadmap of the project. An older deck used in the July 2021 meet-up can also be found here.

Citation

To cite Orbit in publications, refer to the following whitepaper:

Orbit: Probabilistic Forecast with Exponential Smoothing

Bibtex:

@misc{ng2020orbit,
    title={Orbit: Probabilistic Forecast with Exponential Smoothing},
    author={Edwin Ng and Zhishi Wang and Huigang Chen and Steve Yang and Slawek Smyl},
    year={2020},
    eprint={2004.08492},
    archivePrefix={arXiv},
    primaryClass={stat.CO}
}

Papers

  • Bingham, E., Chen, J. P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karaletsos, T., Singh, R., Szerlip, P., Horsfall, P., and Goodman, N. D. Pyro: Deep universal probabilistic programming. The Journal of Machine Learning Research, 20(1):973–978, 2019.
  • Hoffman, M.D. and Gelman, A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res., 15(1), pp.1593-1623, 2014.
  • Hyndman, R., Koehler, A. B., Ord, J. K., and Snyder, R. D. Forecasting with exponential smoothing: the state space approach. Springer Science & Business Media, 2008.
  • Smyl, S. and Zhang, Q. Fitting and Extending Exponential Smoothing Models with Stan. International Symposium on Forecasting, 2015.

Related projects

orbit's People

Contributors

ariel77, changdaniel, dependabot[bot], edwinnglabs, freerealestate221, fritzo, gavinsteiningeruber, jeongyoonlee, juanitorduz, okroshiashvili, pochoi, ppstacy, steveyang90, sugatoray, swotai, szmark001, vincewu51, wangzhishi, xiaoyangzhou


orbit's Issues

Refactor run_group_backtest()

Currently one is forced to call bt_expand.run(..., date_col=date_col, response_col=response_col, regressor_col=regressor_col) to fit models like Prophet, which seems like a hacky solution.

Refactor to store `stan.sampling` directly to enable diagnostic methods

Currently we store stan.sampling.extract() as self.posterior_samples. However, we want to be able to retrieve chain-level information from stan.sampling directly to enable diagnostic methods, and we can't do that today because the attribute is never stored.

There are two proposed solutions:

  1. Store stan.sampling at the end of fit, and only call stan.sampling.extract() for downstream methods such as predict, plot, diagnostic.
  2. Store self.posterior_samples as-is, and additionally store stan.sampling.to_dataframe() to something like self.posterior_samples_chain.

The first may require refactoring other methods for cases where we don't use stan.sampling, for example if we fit using VI, MAP, or Pyro.

The second requires storing roughly double the information we would otherwise keep. An alternative approach to the second method is to parse the dataframe into the same state that our current self.posterior_samples is in, but this poses a challenge because matrix samples are stored as a single column in a dataframe.

Another alternative to the second method is to store only the chain info from the dataframe, but we'd have to guarantee order preservation between the dataframe and the arrays in stan.sampling.extract().
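
A minimal sketch of the first option, assuming a PyStan 2.x-style API where the fit object returned by sampling() is kept on the model (attribute and helper names here are hypothetical):

# Sketch of proposed option 1 (names are hypothetical; PyStan 2.x-style API assumed)
def fit(self, df):
    stan_data = self._set_input_data(df)                          # hypothetical helper
    self._stan_fit = self.stan_model.sampling(data=stan_data)     # keep the full fit object
    # extract only for downstream methods (predict / plot / diagnostics)
    self.posterior_samples = self._stan_fit.extract(permuted=True)

def get_sampling_diagnostics(self):
    # chain-level info is available because the fit object is retained
    return self._stan_fit.to_dataframe(permuted=False)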

More robust unit tests

  • Current unit tests should be parameterized instead of static (see the pytest sketch below)
  • Stronger assertions, with fixtures for expected values
  • Parameterization over more variants of init / fit / predict args
  • Speed up run time without compromising the tests themselves
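
A sketch of the parameterization idea above; the fixture and the estimator options are illustrative assumptions, not the actual test suite:

import pytest

from orbit.models import DLT

@pytest.mark.parametrize("seasonality", [None, 52])
@pytest.mark.parametrize("estimator", ["stan-mcmc", "stan-map"])  # assumed estimator options
def test_dlt_fit_predict(iclaims_train_test, seasonality, estimator):
    # iclaims_train_test is a hypothetical fixture providing a train/test split of iclaims
    train_df, test_df = iclaims_train_test
    dlt = DLT(response_col="claims", date_col="week",
              seasonality=seasonality, estimator=estimator)
    dlt.fit(df=train_df)
    predicted_df = dlt.predict(df=test_df)
    assert len(predicted_df) == len(test_df)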

Selective input of priors

Allow users to pass a dictionary for the regressor_beta_prior and regressor_sigma_prior args to selectively input priors.

For example, suppose we have feature1 through feature5, but only want custom priors for feature1 and feature4. Users could set the following args:

lgt = LGT(
    regressor_cols=['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    regressor_beta_prior={'feature1': 100, 'feature4': 1000},
    regressor_sigma_prior={'feature1': 20, 'feature4': 40}
)
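
A minimal sketch of how such a dictionary could be expanded into full prior lists internally (the helper name and default values are hypothetical):

def expand_prior_dict(regressor_cols, prior_dict, default):
    # hypothetical helper: fill in a default for every regressor without a custom prior
    return [prior_dict.get(col, default) for col in regressor_cols]

# e.g. with the example above (a default of 0 is an assumption):
beta_priors = expand_prior_dict(
    ['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    {'feature1': 100, 'feature4': 1000},
    default=0,
)
# -> [100, 0, 0, 1000, 0]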

Create public method to retrieve regression coefficients

Currently users need to call obj.aggregated_posteriors.get('median').get('rr_beta') and only get back an array.

A better public-facing method is needed that aggregates rr_beta and pr_beta, with column names, into a dataframe.
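
A minimal sketch of what such a method could look like; the attribute names for the regressor columns are assumptions based on the description above:

import pandas as pd

def get_regression_coefs(model):
    # hypothetical helper: combine regular and positive regressor betas into one dataframe
    med = model.aggregated_posteriors.get('median')
    coefs = list(med.get('rr_beta', [])) + list(med.get('pr_beta', []))
    names = model.regular_regressor_col + model.positive_regressor_col  # assumed attributes
    return pd.DataFrame({'regressor': names, 'coefficient': coefs})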

Reparameterization of sigma

Right now we use bounded Cauchy for both MAP and NUTS:

obs_sigma ~ cauchy(0, CAUCHY_SD) T[0,];

It works well for MAP. However, it may be related to the slowness in NUTS; the suggested reparameterization for NUTS is:

real<lower=0, upper=pi()/2> obs_sigma_unif_dummy;
obs_sigma = CAUCHY_SD * tan(obs_sigma_unif_dummy); 

We may need to split the Stan code to handle NUTS vs. the rest, or MAP vs. the rest.
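
The equivalence of the two parameterizations can be checked numerically; a quick sketch comparing tan-of-uniform draws against direct half-Cauchy draws (the CAUCHY_SD value is arbitrary):

import numpy as np
from scipy import stats

CAUCHY_SD = 5.0
rng = np.random.default_rng(42)

# reparameterized draws: u ~ Uniform(0, pi/2), sigma = CAUCHY_SD * tan(u)
u = rng.uniform(0.0, np.pi / 2, size=100_000)
sigma_reparam = CAUCHY_SD * np.tan(u)

# direct half-Cauchy(0, CAUCHY_SD) draws for comparison
sigma_direct = stats.halfcauchy(scale=CAUCHY_SD).rvs(size=100_000, random_state=42)

# both medians should be close to CAUCHY_SD
print(np.median(sigma_reparam), np.median(sigma_direct))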

Warning raised by calling plotting utils should be investigated

import os
import matplotlib

if os.environ.get('DISPLAY', '') == '':
    print('no display found. Using non-interactive Agg backend')
    matplotlib.use('Agg')

The above lines are raising a warning / exception in unit tests for test_backtest.py

Warning Output:

orbit/utils/utils.py:15
  /home/travis/build/uber/orbit/orbit/utils/utils.py:15: UserWarning: 
  This call to matplotlib.use() has no effect because the backend has already
  been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
  or matplotlib.backends is imported for the first time.
  
  The backend was *originally* set to 'TkAgg' by the following code:
    File "setup.py", line 66, in <module>
      'Programming Language :: Python :: 3.7',
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
      return distutils.core.setup(**attrs)
    File "/opt/python/3.7.1/lib/python3.7/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/command/test.py", line 237, in run
      self.run_tests()
    File "setup.py", line 39, in run_tests
      errcode = pytest.main(self.test_args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/config/__init__.py", line 79, in main
      return config.hook.pytest_cmdline_main(config=config)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 242, in pytest_cmdline_main
      return wrap_session(config, _main)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 209, in wrap_session
      session.exitstatus = doit(config, session) or 0
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 248, in _main
      config.hook.pytest_collection(session=session)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 258, in pytest_collection
      return session.perform_collect()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 485, in perform_collect
      items = self._perform_collect(args, genitems)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 524, in _perform_collect
      self.items.extend(self.genitems(node))
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 762, in genitems
      for x in self.genitems(subnode):
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 759, in genitems
      rep = collect_one_node(node)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 407, in collect_one_node
      rep = ihook.pytest_make_collect_report(collector=collector)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in pytest_make_collect_report
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 226, in from_call
      result = func()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in <lambda>
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 435, in collect
      self._inject_setup_module_fixture()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 447, in _inject_setup_module_fixture
      setup_module = _get_non_fixture_func(self.obj, "setUpModule")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 251, in obj
      self._obj = obj = self._getobj()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 432, in _getobj
      return self._importtestmodule()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 499, in _importtestmodule
      mod = self.fspath.pyimport(ensuresyspath=importmode)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/py/_path/local.py", line 668, in pyimport
      __import__(modname)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/assertion/rewrite.py", line 296, in load_module
      six.exec_(co, mod.__dict__)
    File "/home/travis/build/uber/orbit/tests/test_backtest.py", line 6, in <module>
      from orbit.utils.constants import BacktestFitColumnNames
    File "/home/travis/build/uber/orbit/orbit/utils/constants.py", line 4, in <module>
      from orbit.utils.utils import get_parent_path
    File "/home/travis/build/uber/orbit/orbit/utils/utils.py", line 6, in <module>
      import matplotlib.pyplot as plt
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
      from matplotlib.backends import pylab_setup
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/backends/__init__.py", line 17, in <module>
      line for line in traceback.format_stack()
  
  
    matplotlib.use('Agg')

auto_scale=True with regressors errors out

When auto_scale=True is used with regressors, the following error message pops up:


TypeError Traceback (most recent call last)
in
1 from sklearn.preprocessing import MinMaxScaler
2 regressor_min_max_scaler = MinMaxScaler(1, 2.719)
----> 3 df[regressor_col] = regressor_min_max_scaler.fit_transform(df[regressor_col])

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
569 if y is None:
570 # fit method of arity 1 (unsupervised transformation)
--> 571 return self.fit(X, **fit_params).transform(X)
572 else:
573 # fit method of arity 2 (supervised transformation)

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
337 # Reset internal state before fitting
338 self._reset()
--> 339 return self.partial_fit(X, y)
340
341 def partial_fit(self, X, y=None):

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y)
361 """
362 feature_range = self.feature_range
--> 363 if feature_range[0] >= feature_range[1]:
364 raise ValueError("Minimum of desired feature range must be smaller"
365 " than maximum. Got %s." % str(feature_range))

TypeError: 'int' object is not subscriptable
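
The failure comes from passing the feature range as two positional arguments, so scikit-learn ends up with feature_range=1; MinMaxScaler expects a single feature_range tuple. A minimal, self-contained sketch of the fix (the dataframe here is a stand-in for the real data):

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# stand-in data; the real code scales df[regressor_col] before fitting
df = pd.DataFrame({'trend.unemploy': [0.1, 0.5, 0.9], 'trend.filling': [0.2, 0.4, 0.6]})
regressor_col = ['trend.unemploy', 'trend.filling']

# pass the range as a single feature_range tuple, not two positional arguments
regressor_min_max_scaler = MinMaxScaler(feature_range=(1, 2.719))
df[regressor_col] = regressor_min_max_scaler.fit_transform(df[regressor_col])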

bug with pyro and MAP

I get the following when running predicted_df = lgt_map.predict(df=test_df):

RuntimeError Traceback (most recent call last)
in
----> 1 predicted_df = lgt_map.predict(df=test_df)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
566 # prediction
567 predicted_dict = self._predict(
--> 568 df=df, include_error=False, decompose=decompose
569 )
570

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
504 trend_forecast_matrix
505 = torch.zeros((num_sample, trend_forecast_length), dtype=torch.double)
--> 506 trend_component = torch.cat((local_global_trend_sums, trend_forecast_matrix), dim=1)
507
508 last_local_trend_level = local_trend_levels[:, -1]

RuntimeError: Expected object of scalar type Float but got scalar type Double for sequence element 1 in sequence argument at position #1 'tensors'
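
The error is a float vs. double mismatch between the two tensors passed to torch.cat. A minimal, self-contained sketch of the kind of fix (shapes are placeholders; which operand to cast is an assumption based on the traceback):

import torch

# stand-ins for the tensors in the traceback
local_global_trend_sums = torch.randn(100, 10)                     # float32 in the failing case
trend_forecast_matrix = torch.zeros((100, 5), dtype=torch.double)

# cast so both operands share a dtype before concatenating
trend_component = torch.cat(
    (local_global_trend_sums.double(), trend_forecast_matrix), dim=1
)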

Implement damped LGT class

Implement the concrete class for Damped LGT (DLT), which is now separate from the main LGT model.

Implement DLT / LGT with a multiplicative option

The current implementation is the additive form, but applying a log transformation before fit and an exp transformation after predict makes it a multiplicative model.

This can be integrated directly into the model classes.

Further, this allows backtesting to work as intended on the original scale of the data, without having to implement transformation function callbacks.
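
A minimal sketch of the current workaround done outside the model, reusing dlt, train_df, and test_df from the quick start above (the 'prediction' output column name is an assumption for your Orbit version, and the iclaims data is already log-transformed, so this is illustrative only):

import numpy as np

# fit on the log scale
train_log = train_df.copy()
train_log['claims'] = np.log(train_log['claims'])
dlt.fit(df=train_log)

# predict, then map back to the original (multiplicative) scale
predicted_df = dlt.predict(df=test_df)
predicted_df['prediction'] = np.exp(predicted_df['prediction'])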

Config of Pyro

Enhance control of the config inside Pyro, such as the number of steps, messages, etc.

Pyro Estimation with Regressors


RuntimeError Traceback (most recent call last)
in
1 # make prediction of past and future
----> 2 predicted_df = lgt_reg_map.predict(df=df, decompose=True)
3 predicted_df.head(5)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
567 # prediction
568 predicted_dict = self._predict(
--> 569 df=df, include_error=False, decompose=decompose
570 )
571

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
472 regressor_matrix = df[self.regressor_col].values
473 regressor_torch = torch.from_numpy(regressor_matrix)
--> 474 regressor_component = torch.matmul(regressor_torch, regressor_beta)
475 regressor_component = regressor_component.t()
476 else:

RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'mat2' in call to _th_mm
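
As in the MAP issue above, this is a float vs. double mismatch, here between the regressor tensor built with torch.from_numpy (float64) and the sampled betas (float32). A minimal, self-contained sketch of the kind of fix (shapes are placeholders):

import numpy as np
import torch

regressor_matrix = np.random.rand(52, 3)   # stand-in for df[self.regressor_col].values (float64)
regressor_beta = torch.randn(3, 100)       # stand-in for the sampled betas (float32)

# torch.from_numpy keeps float64; cast so both operands of matmul share a dtype
regressor_torch = torch.from_numpy(regressor_matrix).float()
regressor_component = torch.matmul(regressor_torch, regressor_beta).t()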

Include meta data in backtest.py

Can we also include two pieces of metadata:

  • in _predicted_df, append the train end date (if date_col is available in the splitter)
  • in _score_df, append the number of splits, as a reference so the user knows how many splits were conducted

Redesign Backtest Module

Redesign the backtest module so that Backtest is initialized only with the data.
Make the model an argument of _fit() instead of an attribute of Backtest.
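
A rough sketch of the redesigned interface described above (constructor arguments and method names are entirely hypothetical):

# hypothetical redesigned interface: Backtest holds only the data and split settings
bt = Backtest(df=data, min_train_len=380, incremental_len=20, forecast_len=20)
# the model is passed to the fit/scoring call instead of being stored on Backtest
score_df = bt.fit_score(model=dlt)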

Regression Coef Penalty

  • get a dataset for benchmarking/testing
  • L1/L2 Penalty
  • Total Coef Penalty
  • Set Positive Reg Coef as zero instead of rejecting sample
  • Variable selection with spike and slab

These are just some thoughts; we won't necessarily complete all of them.
