Chronicle-TimeSeries

Multi-Threaded Time Series library

Purpose

This library has four objectives

  • efficient storage of long sequences of data in a column-based database.
  • multi-threaded processing where possible.
  • integration with engine for lookup and management of the TimeSeries.
  • calculations on time series where the timings are in microseconds and each time series has its own timestamps, i.e. they don't have to be in sync or vectorized.

Enterprise edition

The enterprise version

  • has more multi-threaded implementations.
  • persists time series (via memory-mapped files).
  • provides remote access to time series (no need to have the time series locally).
  • distributes time series data so that data is processed locally, e.g. if you have N servers, each server can process 1/N of the work.
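As an illustration of the 1/N split only, here is a minimal sketch in plain Java of dividing a series of a given length into contiguous slices, one per server. It says nothing about how the enterprise edition actually distributes data; the class `WorkPartition` and its method are hypothetical names made up for this example.

```java
class WorkPartition {

    /**
     * Returns {startInclusive, endExclusive} for the slice owned by
     * server `index` (0-based) out of `servers`.
     * Assumes size * servers fits in a long, which holds for the sizes used here.
     */
    static long[] slice(long size, int servers, int index) {
        long start = size * index / servers;
        long end = size * (index + 1) / servers;
        return new long[]{start, end};
    }

    public static void main(String[] args) {
        long size = 600_000_000L;
        int servers = 16;
        for (int i = 0; i < servers; i++) {
            long[] s = slice(size, servers, i);
            System.out.printf("server %d: [%d, %d)%n", i, s[0], s[1]);
        }
    }
}
```

The slices are contiguous and cover the whole range exactly once, even when the length is not a multiple of the number of servers.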

Sample program

This program generates two time series, each holding the mid price of an instrument, and tests whether there is any correlation between them.

Note: the two time series have different times.

    long size = 600_000_000;

    // generate series 1
    TimeSeries ts = new InMemoryTimeSeries(null);
    ts.setLength(size);

    LongColumn time = ts.getTimestamp();
    time.setAll(Random::new, (c, i, r) -> c.set(i, 9 + (int) Math.pow(1e6, sqr(r.nextFloat()))));
    long sum = time.integrate(); // sum all the intervals

    System.out.printf("%.1f days%n", sum/86400e6);

    DoubleColumn mid = ts.acquireDoubleColumn("mid", BytesDoubleLookups.INT16_4);
    mid.generateBrownian(1, 2, 0.0005);

    // generate series 2
    TimeSeries ts2 = new InMemoryTimeSeries(null);
    ts2.setLength(size);

    LongColumn time2 = ts2.getTimestamp(); // timestamps of series 2, not series 1
    time2.setAll(Random::new, (c, i, r) -> c.set(i, 9 + (int) Math.pow(1e6, sqr(r.nextFloat()))));
    long sum2 = time2.integrate(); // sum all the intervals

    System.out.printf("%.1f days%n", sum2/86400e6);

    DoubleColumn mid2 = ts2.acquireDoubleColumn("mid", BytesDoubleLookups.INT16_4);
    mid2.generateBrownian(1, 2, 0.0005);

    
    // compare the correlation
    CorralationStatistic stats = PearsonsCorrelation.calcCorrelation(mid, mid2, Mode.AFTER_BOTH_CHANGE);

Generating and comparing two sets of 600 million data points takes around 30 seconds on a 16-core machine (notionally more than 250 business days, i.e. a year, of data).
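The "a year of data" figure can be sanity-checked without the library. The sketch below assumes that `sqr(x)` in the sample means `x * x` and that each generated interval is `9 + (int) 1e6^(u*u)` microseconds with `u` uniform on [0, 1), as `Random.nextFloat()` is. The class `IntervalEstimate` is a hypothetical name; it averages the interval with the midpoint rule and converts 600 million such intervals to days.

```java
class IntervalEstimate {

    /** Mean of 9 + (int) 1e6^(u*u) for u uniform on [0, 1), by the midpoint rule. */
    static double meanIntervalMicros(int steps) {
        double total = 0;
        for (int k = 0; k < steps; k++) {
            double u = (k + 0.5) / steps;
            total += 9 + (int) Math.pow(1e6, u * u);
        }
        return total / steps;
    }

    /** Expected span, in days, of `points` intervals with the given mean length in microseconds. */
    static double expectedDays(long points, double meanMicros) {
        return points * meanMicros / 86400e6;
    }

    public static void main(String[] args) {
        double mean = meanIntervalMicros(1_000_000);
        System.out.printf("mean interval %.0f us, %.1f days%n",
                mean, expectedDays(600_000_000L, mean));
    }
}
```

Under these assumptions the mean interval is a little under 38 milliseconds, so 600 million points span roughly 260 days of continuous time.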

When the spacing between events varies, there are many ways to compare two series for correlation.

You can sample when either series changes by a minimum amount, when one changes, or when both have changed. You might also prefer to sub-sample the data before performing the correlation to reduce noise.
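One possible reading of the "when both have changed" option is sketched below in plain Java. It is not the library's implementation: the class `BothChangedCorrelation` and its methods are hypothetical, plain arrays stand in for the library's columns, and timestamps are taken to be absolute and sorted. Events from both series are walked in time order, and a pair of latest values is recorded each time both series have updated since the previous pair.

```java
import java.util.ArrayList;
import java.util.List;

class BothChangedCorrelation {

    /** Emits {latestA, latestB} each time both series have updated since the last pair. */
    static List<double[]> sampleAfterBothChange(long[] timeA, double[] a,
                                                long[] timeB, double[] b) {
        List<double[]> pairs = new ArrayList<>();
        int i = 0, j = 0;
        boolean aChanged = false, bChanged = false;
        double lastA = Double.NaN, lastB = Double.NaN;
        while (i < a.length || j < b.length) {
            // consume the earlier event; ties go to series A
            if (j >= b.length || (i < a.length && timeA[i] <= timeB[j])) {
                lastA = a[i++];
                aChanged = true;
            } else {
                lastB = b[j++];
                bChanged = true;
            }
            if (aChanged && bChanged) {
                pairs.add(new double[]{lastA, lastB});
                aChanged = false;
                bChanged = false;
            }
        }
        return pairs;
    }

    /** Pearson correlation coefficient over the sampled pairs. */
    static double pearson(List<double[]> pairs) {
        int n = pairs.size();
        double meanX = 0, meanY = 0;
        for (double[] p : pairs) { meanX += p[0]; meanY += p[1]; }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (double[] p : pairs) {
            double dx = p[0] - meanX, dy = p[1] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }
}
```

Sampling only after both series have moved avoids pairing a fresh value in one series with a stale value in the other many times over, at the cost of discarding intermediate updates.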

Predictive testing

The ultimate purpose of the library is to find patterns which might have predictive power. To do this you need to estimate a forward movement in the metric you want to predict, and find inputs which can help estimate this forward.

The steps to do this are

  • estimate the time horizon (THZ) you need, e.g. 5 mins, an hour, 5 days.
  • generate a forward by taking the difference between the value at current + THZ and the current value.
  • find a correlation between the inputs and this forward.
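The forward in the second step can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's API: the class `ForwardSeries` is hypothetical, timestamps are absolute, sorted and in the same unit as the horizon, and "the value at current + THZ" is taken to be the last value observed at or before that time. Entries whose horizon runs past the end of the series are marked NaN.

```java
class ForwardSeries {

    /**
     * forward[i] = value observed at time[i] + horizon, minus value[i].
     * Uses the last observation at or before time[i] + horizon.
     */
    static double[] forward(long[] time, double[] value, long horizon) {
        int n = time.length;
        double[] fwd = new double[n];
        int j = 0; // index of the last observation at or before the target time
        for (int i = 0; i < n; i++) {
            long target = time[i] + horizon;
            if (target > time[n - 1]) {
                fwd[i] = Double.NaN; // the horizon is not covered by the data
                continue;
            }
            // targets never decrease, so j only ever moves forward
            while (j + 1 < n && time[j + 1] <= target) {
                j++;
            }
            fwd[i] = value[j] - value[i];
        }
        return fwd;
    }
}
```

For example, with times `{0, 10, 25, 40, 100}`, values `{1.0, 1.5, 1.25, 2.0, 3.0}` and a horizon of 30, the forwards are `{0.25, 0.5, 0.75, 0.0, NaN}`. The resulting array can then be correlated against candidate inputs, skipping the NaN entries.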

If you look at enough inputs you will find spurious correlations, so you need to take a view on which correlations are predictive and which are not (see http://tylervigen.com/spurious-correlations).
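The effect is easy to reproduce. The sketch below (plain Java, hypothetical class `SpuriousCorrelation`, synthetic Gaussian noise rather than real market data) correlates one random target against a growing number of unrelated random inputs and reports the largest absolute correlation found.

```java
import java.util.Random;

class SpuriousCorrelation {

    /** Pearson correlation coefficient of two equal-length arrays. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX, dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Independent Gaussian noise of the given length. */
    static double[] noise(Random r, int length) {
        double[] d = new double[length];
        for (int i = 0; i < length; i++) d[i] = r.nextGaussian();
        return d;
    }

    /** Largest |correlation| between one random target and `inputs` unrelated random series. */
    static double maxAbsCorrelation(int inputs, int length, long seed) {
        Random r = new Random(seed);
        double[] target = noise(r, length);
        double best = 0;
        for (int k = 0; k < inputs; k++) {
            best = Math.max(best, Math.abs(pearson(noise(r, length), target)));
        }
        return best;
    }

    public static void main(String[] args) {
        for (int inputs : new int[]{1, 10, 100, 1000}) {
            System.out.printf("%4d inputs: max |r| = %.2f%n",
                    inputs, maxAbsCorrelation(inputs, 20, 42L));
        }
    }
}
```

None of the inputs carries any information about the target, yet the best correlation found can only grow as more inputs are tried, which is why a correlation picked from a large search needs checking on data that was not used to find it.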
