
Timeseries clustering

Timeseries clustering is an unsupervised learning task that aims to partition unlabeled timeseries objects into homogeneous groups/clusters. Timeseries in the same cluster are more similar to each other than to timeseries in other clusters.

This algorithm is able to:

  • Identify joint dynamics across the sequences
  • Eliminate lags (time-shifts) across sequences (usually called lag-invariance)
  • Generate interpretable features.

In general, timeseries clustering algorithms are of two types:

  1. Feature-based - transform the raw data using feature extraction, then run clustering on top of the generated features
  2. Raw-data-based - apply clustering directly to the timeseries vectors, without any space transformation

Variational Recurrent Auto-encoders (VRAE)

VRAE is a feature-based timeseries clustering algorithm; the raw-data-based approach suffers from the curse of dimensionality and is sensitive to noisy input data. The middle bottleneck layer serves as the feature representation for the entire input timeseries.

Architecture

(Figure: the network)

From here on, RNN refers to a recurrent neural network architecture, i.e. an LSTM or GRU block. Our model consists of the following blocks (a minimal code sketch follows this list):

  • The encoder: the sequence of input vectors is fed to the RNN; the last hidden state, h_end, is plucked from the RNN and passed to the next layer
  • The encoder-to-latent layer: h_end is mapped to a mean and a standard deviation using a linear layer
  • Reparameterization: given the mean and standard deviation, a latent vector is sampled during training from the distribution they define
  • The latent-to-decoder layer: the sampled latent vector is passed through a linear layer to obtain the initial states for the decoder RNN
  • The decoder: given its initial states, the decoder runs on inputs that are initialized to zero and updated by backpropagation; its outputs are passed on to the output layer
  • The output layer: a linear layer maps the decoder RNN outputs back to the input dimension over the full sequence length to obtain x_decoded
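
Below is a minimal PyTorch sketch of these blocks. It is illustrative rather than the repo's exact implementation; the class and attribute names are assumptions, and the latent spread is parameterized as a log-variance (a common variant of the mean/standard-deviation formulation, chosen for numerical stability):

```python
import torch
import torch.nn as nn

class VRAE(nn.Module):
    """Sketch of the encoder / latent / decoder blocks described above."""

    def __init__(self, n_features, hidden_size, latent_size, seq_len):
        super(VRAE, self).__init__()
        # The encoder: an RNN (here an LSTM) over the input sequence
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        # The encoder-to-latent layer: h_end -> mean and log-variance
        self.h_to_mean = nn.Linear(hidden_size, latent_size)
        self.h_to_logvar = nn.Linear(hidden_size, latent_size)
        # The latent-to-decoder layer: z -> initial state of the decoder RNN
        self.z_to_h = nn.Linear(latent_size, hidden_size)
        # Decoder inputs: initialized to zero, updated by backpropagation
        self.decoder_inputs = nn.Parameter(torch.zeros(seq_len, n_features))
        self.decoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        # The output layer: decoder outputs -> reconstructed sequence
        self.output = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        _, (h_end, _) = self.encoder(x)
        h_end = h_end[-1]                  # final hidden state of the last layer
        mean = self.h_to_mean(h_end)
        logvar = self.h_to_logvar(h_end)
        # Reparameterization: z = mean + std * eps, with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        z = mean + std * torch.randn_like(std)
        h_0 = self.z_to_h(z).unsqueeze(0)  # (1, batch, hidden_size)
        c_0 = torch.zeros_like(h_0)
        dec_in = self.decoder_inputs.unsqueeze(0).repeat(x.size(0), 1, 1)
        dec_out, _ = self.decoder(dec_in, (h_0, c_0))
        x_decoded = self.output(dec_out)
        return x_decoded, mean, logvar
```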

Loss function

The network is jointly trained on two loss functions:

  1. KL divergence between the distribution learned in the latent space and the standard normal distribution.
  2. An auto-encoder learns the identity function, so the input and output sequences should be similar. Here, the reconstruction error is the MSE loss between x_decoded and x; it measures how well the input is reconstructed. SmoothL1Loss is another option (see the sketch after this list).
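
A minimal sketch of this joint loss, assuming the `x_decoded`, `mean`, and `logvar` returned by the sketch above (the KL term below is the standard closed-form expression for a Gaussian posterior against N(0, I)):

```python
import torch
import torch.nn.functional as F

def vrae_loss(x_decoded, x, mean, logvar):
    # Reconstruction error: how well the input sequence is reconstructed
    # (swap in F.smooth_l1_loss for the SmoothL1Loss option)
    recon_loss = F.mse_loss(x_decoded, x, reduction='sum')
    # KL divergence between N(mean, std^2) and the standard normal N(0, I)
    kl_loss = -0.5 * torch.sum(1 + logvar - mean.pow(2) - logvar.exp())
    return recon_loss + kl_loss
```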

Optimizer

Given the gradients and the weights, Adam is used to update the weights. An option is provided to use stochastic gradient descent (SGD) instead.
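
For example, reusing the VRAE sketch above (the hyperparameter values here are illustrative, not necessarily the repo's defaults):

```python
import torch.optim as optim

model = VRAE(n_features=1, hidden_size=90, latent_size=20, seq_len=140)
optimizer = optim.Adam(model.parameters(), lr=1e-3)   # default choice
# optimizer = optim.SGD(model.parameters(), lr=1e-3)  # SGD alternative
```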

Why recurrent neural network in an auto-encoder?

  • The length of a timeseries may vary from sample to sample, while conventional techniques only work on fixed-size inputs.
  • Patterns in timeseries can have arbitrary time spans and be non-stationary; a recurrent neural network can learn patterns at arbitrary time scales (lag invariance).
  • The weight/linear layers in a vanilla auto-encoder grow in size as the length of the timeseries increases, eventually slowing down learning.

Exploratory data analysis is performed on the latent vectors in order to visualize the clusters formed. Since the dimension of z can be greater than 2, it has to be projected to a lower dimension to be visualized; PCA and t-SNE are used for this.
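
A sketch of that step with scikit-learn and matplotlib, assuming `z` is an (n_samples, latent_size) array of latent vectors and `labels` holds per-sample class ids (both names are assumptions):

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Project the latent vectors down to 2 dimensions, two ways
z_pca = PCA(n_components=2).fit_transform(z)
z_tsne = TSNE(n_components=2).fit_transform(z)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(z_pca[:, 0], z_pca[:, 1], c=labels, s=5)
axes[0].set_title('PCA')
axes[1].scatter(z_tsne[:, 0], z_tsne[:, 1], c=labels, s=5)
axes[1].set_title('t-SNE')
plt.show()
```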

Example - ECG data

NOTE: Please be wary of using this dataset. I developed this library for an internal client project and had to obfuscate the data using an open-source dataset, so you may find some errors in the way I use this data.

The network above is trained on a dataset of 8,500 ECGs and tested on 950 ECGs. Named ECG5000 on the UCR archive, this dataset has 5 classes; the labels are used only to assign colors in the PCA and t-SNE charts.

(Figure: a snapshot of the data)

Results

When the VRAE model is trained with the chosen hyperparameters, we obtain the PCA and t-SNE plots of the latent vectors. The figure shows that the latent space exhibits structure: the pink and green labels clearly cluster in different parts of the space.

Note that the entire algorithm is unsupervised; labels are used only to color and visually validate the results. If you don't have labels for your application, you can run k-means on top of the latent vectors to obtain labels, and use those labels to color individual data points.
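
For example (assuming `z` as above; choosing 5 clusters here simply mirrors the 5 classes in ECG5000):

```python
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=5, random_state=0).fit(z)
pseudo_labels = kmeans.labels_  # use these to color the PCA / t-SNE plots
```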

Application Areas

  1. Anomaly detection
  2. Data reduction
  3. Determining products with similar selling patterns
  4. Product cannibalization
  5. New product introduction
  6. Low selling items
  7. Items that are introduced and phased out quite frequently (can't estimate seasonality, data < 1yr)

Solution: Forecast items in groups (borrowed from here)

  1. Even though each item has a short/sparse life cycle, the clustered group has enough data
  2. Modeling the group as a whole is more robust to outliers and missing data.

Conclusion

We present a variational recurrent auto-encoder that learns the structure in timeseries data. Training is unsupervised. When we color the latent vectors with the actual labels, we show that the learned structure is meaningful.

Requirements

Repo works with:

  • python==3.5
  • torch==1.0.0
  • numpy==1.15.4
  • plotly==3.4.2
  • scikit-learn==0.20.1
  • matplotlib==3.0.2
  • scipy==1.1.0
