
An open source project from the Data to AI Lab at MIT.

Orion

A machine learning library for unsupervised time series anomaly detection.

Important Links
💻 Website Check out the Sintel Website for more information about the project.
📖 Documentation Quickstarts, User and Development Guides, and API Reference.
Tutorials Check out our notebooks.
Repository The link to the GitHub repository of this library.
📜 License The repository is published under the MIT License.
Community Join our Slack Workspace for announcements and discussions.

Overview

Orion is a machine learning library built for unsupervised time series anomaly detection. Given time series data, it provides a number of “verified” ML pipelines (a.k.a. Orion pipelines) that identify rare patterns and flag them for expert review.

The library makes use of a number of automated machine learning tools developed at the Data to AI Lab at MIT.

Read about using an Orion pipeline on the NYC taxi dataset in a blog series:

Part 1: Learn about unsupervised time series anomaly detection
Part 2: Learn how we use GANs to solve the problem
Part 3: How does one evaluate anomaly detection pipelines?

Notebooks: Discover Orion through Colab by launching our notebooks!

Quickstart

Install with pip

The easiest and recommended way to install Orion is using pip:

pip install orion-ml

This will pull and install the latest stable release from PyPI.

In the following example, we show how to use one of the Orion pipelines.

Fit an Orion pipeline

We will load demo data for this example:

from orion.data import load_signal

train_data = load_signal('S-1-train')
train_data.head()

The output should show a signal with timestamp and value columns:

    timestamp     value
0  1222819200 -0.366359
1  1222840800 -0.394108
2  1222862400  0.403625
3  1222884000 -0.362759
4  1222905600 -0.370746
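
To make the signal format more concrete, here is a minimal sketch that reads the five rows above using only the Python standard library. It assumes the timestamps are Unix epoch seconds in UTC, which the text above does not state explicitly, and the helper name `to_utc` is made up for this illustration.

```python
from datetime import datetime, timezone

# First rows of the demo signal shown above, as (timestamp, value) pairs.
rows = [
    (1222819200, -0.366359),
    (1222840800, -0.394108),
    (1222862400, 0.403625),
    (1222884000, -0.362759),
    (1222905600, -0.370746),
]


def to_utc(timestamp):
    """Interpret a timestamp as Unix epoch seconds in UTC (an assumption)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Human-readable view of the same rows.
readable = [(to_utc(ts).isoformat(), value) for ts, value in rows]

# Distinct gaps between consecutive timestamps, in seconds.
intervals = {later[0] - earlier[0] for earlier, later in zip(rows, rows[1:])}

for stamp, value in readable:
    print(stamp, value)
print("sampling interval(s) in seconds:", sorted(intervals))
```

Under that assumption, these five rows are evenly spaced; checking the spacing of your own signal in the same way before fitting can help catch gaps or unit mix-ups early.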

In this example, we use the aer pipeline and set some hyperparameters (in this case, 5 training epochs).

from orion import Orion

hyperparameters = {
    'orion.primitives.aer.AER#1': {
        'epochs': 5,
        'verbose': True
    }
}

orion = Orion(
    pipeline='aer',
    hyperparameters=hyperparameters
)

orion.fit(train_data)

Detect anomalies using the fitted pipeline

Once it is fitted, we are ready to use it to detect anomalies in our incoming time series:

new_data = load_signal('S-1-new')
anomalies = orion.detect(new_data)

⚠️ Depending on your system and the exact versions you have installed, some warnings may be printed. These can be safely ignored, as they do not interfere with the proper behavior of the pipeline.

The output of the previous command will be a pandas.DataFrame containing a table of detected anomalies:

        start         end  severity
0  1402012800  1403870400  0.122539
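
As a rough illustration of how to read this table, the sketch below turns the detected interval into readable dates and a duration. It uses only the standard library and again assumes the start and end values are Unix epoch seconds in UTC; the helper name `describe` is hypothetical. With the real output, rows of this shape can be obtained from the DataFrame with `anomalies.itertuples(index=False)`.

```python
from datetime import datetime, timezone

# The detected anomaly from the table above, as (start, end, severity).
anomalies = [(1402012800, 1403870400, 0.122539)]

SECONDS_PER_DAY = 86400


def describe(start, end, severity):
    """Summarize one anomaly, assuming epoch-second timestamps in UTC."""
    start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
    end_dt = datetime.fromtimestamp(end, tz=timezone.utc)
    return {
        "start": start_dt.isoformat(),
        "end": end_dt.isoformat(),
        "duration_days": (end - start) / SECONDS_PER_DAY,
        "severity": severity,
    }


report = [describe(*row) for row in anomalies]
for entry in report:
    print(entry)
```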

Leaderboard

In every release, we run the Orion benchmark. We maintain an up-to-date leaderboard with the current scoring of the verified pipelines according to the benchmarking procedure.

We run the benchmark on 11 datasets with known ground truth and record the score of each pipeline on each dataset. The leaderboard table shows the number of datasets on which each pipeline outperforms the ARIMA pipeline.

Pipeline                   Outperforms ARIMA
AER                        10
TadGAN                     7
LSTM Dynamic Thresholding  7
LSTM Autoencoder           6
Dense Autoencoder          6
VAE                        6
GANF                       6
Azure                      0
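
The win counts in the table above can be sketched as a simple comparison against a baseline. This is not the benchmark's actual code: the function name `count_wins`, the dataset and pipeline names, and all scores below are made up for illustration, and the sketch assumes that a higher score is better and that a tie does not count as a win.

```python
def count_wins(scores, baseline="ARIMA"):
    """Count, per pipeline, the datasets on which it scores above the baseline."""
    baseline_scores = scores[baseline]
    return {
        pipeline: sum(
            1
            for dataset, score in per_dataset.items()
            if score > baseline_scores[dataset]  # strict: ties are not wins
        )
        for pipeline, per_dataset in scores.items()
        if pipeline != baseline
    }


# Hypothetical scores, NOT real benchmark results.
example_scores = {
    "ARIMA": {"dataset_a": 0.50, "dataset_b": 0.40, "dataset_c": 0.60},
    "pipeline_x": {"dataset_a": 0.70, "dataset_b": 0.45, "dataset_c": 0.55},
    "pipeline_y": {"dataset_a": 0.30, "dataset_b": 0.40, "dataset_c": 0.65},
}

wins = count_wins(example_scores)
print(wins)
```

With 11 datasets, as in the benchmark described above, the maximum possible count for any pipeline would be 11.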

You can find the scores of each pipeline on every signal recorded in the details Google Sheets document. The summarized results can also be browsed in the following summary Google Sheets document.

Resources

Additional resources that might be of interest:

Citation

If you use AER for your research, please consider citing the following paper:

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni. AER: Auto-Encoder with Regression for Time Series Anomaly Detection.

@inproceedings{wong2022aer,
  title={AER: Auto-Encoder with Regression for Time Series Anomaly Detection},
  author={Wong, Lawrence and Liu, Dongyu and Berti-Equille, Laure and Alnegheimish, Sarah and Veeramachaneni, Kalyan},
  booktitle={2022 IEEE International Conference on Big Data (IEEE BigData)},
  pages={1152-1161},
  doi={10.1109/BigData55660.2022.10020857},
  organization={IEEE},
  year={2022}
}

If you use TadGAN for your research, please consider citing the following paper:

Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. TadGAN - Time Series Anomaly Detection Using Generative Adversarial Networks.

@inproceedings{geiger2020tadgan,
  title={TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks},
  author={Geiger, Alexander and Liu, Dongyu and Alnegheimish, Sarah and Cuesta-Infante, Alfredo and Veeramachaneni, Kalyan},
  booktitle={2020 IEEE International Conference on Big Data (IEEE BigData)},
  pages={33-43},
  doi={10.1109/BigData50022.2020.9378139},
  organization={IEEE},
  year={2020}
}

If you use Orion, which is part of the Sintel ecosystem, for your research, please consider citing the following paper:

Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni. Sintel: A Machine Learning Framework to Extract Insights from Signals.

@inproceedings{alnegheimish2022sintel,
  title={Sintel: A Machine Learning Framework to Extract Insights from Signals},
  author={Alnegheimish, Sarah and Liu, Dongyu and Sala, Carles and Berti-Equille, Laure and Veeramachaneni, Kalyan},
  booktitle={Proceedings of the 2022 International Conference on Management of Data},
  pages={1855-1865},
  numpages={11},
  publisher={Association for Computing Machinery},
  doi={10.1145/3514221.3517910},
  series={SIGMOD '22},
  year={2022}
}
