patrick-kidger / neuralcde

Code for "Neural Controlled Differential Equations for Irregular Time Series" (NeurIPS 2020 Spotlight)

License: Apache License 2.0

machine-learning rough-paths neural-differential-equations time-series controlled-differential-equations deep-learning deep-neural-networks pytorch dynamical-systems differential-equations neural-networks

neuralcde's Introduction

Neural Controlled Differential Equations for Irregular Time Series
(NeurIPS 2020 Spotlight)
[arXiv, YouTube]

Building on the well-understood mathematical theory of controlled differential equations, we demonstrate how to construct models that:

  • Act directly on irregularly-sampled partially-observed multivariate time series.
  • May be trained with memory-efficient adjoint backpropagation - even across observations.
  • Demonstrate state-of-the-art performance.

They are straightforward to implement and evaluate using existing tools, in particular PyTorch and the torchcde library.


Library

See torchcde.

Example

We encourage looking at example.py, which demonstrates how to use the library to train a Neural CDE model to predict the chirality of a spiral.

Also see irregular_data.py, which demonstrates how to handle variable-length inputs, irregular sampling, and missing data, all of which can be handled easily, without changing the model; a brief sketch of the missing-data case follows.
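As a hedged illustration of the missing-data case (a sketch under the assumption that your torchcde version handles NaN entries, as documented; this is not code from irregular_data.py, and the shapes are made up):

import torch
import torchcde

# Three channels observed at five time points; NaN marks a missing observation.
x = torch.randn(1, 5, 3)
x[0, 2, 1] = float('nan')  # channel 1 unobserved at the third time point

# The spline coefficients are computed around the gap; the resulting path is
# still defined everywhere, so cdeint can integrate against it as usual.
coeffs = torchcde.natural_cubic_spline_coeffs(x)
X = torchcde.NaturalCubicSpline(coeffs)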

A short, self-contained example:

import torch
import torchcde

# Create some data
batch, length, input_channels = 1, 10, 2
hidden_channels = 3
t = torch.linspace(0, 1, length)
t_ = t.unsqueeze(0).unsqueeze(-1).expand(batch, length, 1)
x_ = torch.rand(batch, length, input_channels - 1)
x = torch.cat([t_, x_], dim=2)  # include time as a channel

# Interpolate it
coeffs = torchcde.natural_cubic_spline_coeffs(x)
X = torchcde.NaturalCubicSpline(coeffs)

# Create the Neural CDE system. The vector field must map the hidden state z,
# of shape (batch, hidden_channels), to a tensor of shape
# (batch, hidden_channels, input_channels), which cdeint contracts against dX/dt.
class F(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(hidden_channels,
                                      hidden_channels * input_channels)

    def forward(self, t, z):
        return self.linear(z).view(z.shape[0], hidden_channels, input_channels)

func = F()
z0 = torch.rand(batch, hidden_channels)

# Integrate it over the whole interval on which the spline is defined
z_T = torchcde.cdeint(X=X, func=func, z0=z0, t=X.interval)
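Here cdeint returns the solution at each time in t, as a tensor of shape (batch, len(t), hidden_channels); with t=X.interval this is just the hidden state at the start and end of the interval, so z_T[:, -1] would typically be passed through a final linear layer to produce a prediction.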

Reproducing experiments

Everything needed to reproduce the experiments of the paper can be found in the experiments folder; check there for details.

Results

As an example (taken from the paper; see there for similar results on other datasets):

[Results figure from the paper omitted.]

Citation

@article{kidger2020neuralcde,
    title={{N}eural {C}ontrolled {D}ifferential {E}quations for {I}rregular {T}ime {S}eries},
    author={Kidger, Patrick and Morrill, James and Foster, James and Lyons, Terry},
    journal={Advances in Neural Information Processing Systems},
    year={2020}
}

neuralcde's People

Contributors

patrick-kidger


neuralcde's Issues

Dealing with 2D data

Hi,

Thanks for the great work, which targets a real limitation of Neural ODEs.
I was wondering whether there is any example of dealing with 2D data, for example an image, where the architecture only uses Conv2d layers?

Thanks
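One possible construction (a hypothetical sketch, not code from this repository): cdeint only requires the vector field to map a hidden state of shape (batch, hidden_channels) to a tensor of shape (batch, hidden_channels, input_channels), so convolutional layers can be used by flattening a feature map into the hidden state. All names and sizes below are made up for illustration.

import torch

class ConvVectorField(torch.nn.Module):
    # Hypothetical vector field whose hidden state is a flattened
    # (hidden_channels, height, width) feature map.
    def __init__(self, hidden_channels, input_channels, height, width):
        super().__init__()
        self.hidden_channels = hidden_channels
        self.input_channels = input_channels
        self.height = height
        self.width = width
        self.conv = torch.nn.Conv2d(hidden_channels,
                                    hidden_channels * input_channels,
                                    kernel_size=3, padding=1)

    def forward(self, t, z):
        batch = z.shape[0]
        # Unflatten the hidden state into an image-shaped feature map.
        z = z.view(batch, self.hidden_channels, self.height, self.width)
        out = self.conv(z)  # (batch, hidden * input, height, width)
        out = out.view(batch, self.hidden_channels, self.input_channels,
                       self.height, self.width)
        # cdeint contracts the last axis against dX/dt, so move input_channels
        # to the end and flatten the feature map back into the hidden state.
        out = out.permute(0, 1, 3, 4, 2)
        return out.reshape(batch,
                           self.hidden_channels * self.height * self.width,
                           self.input_channels)

# The corresponding initial state would then be, e.g.:
# z0 = torch.zeros(batch, hidden_channels * height * width)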

How to run Neural CDE on time series with irregular intervals?

I tried to run Neural CDE on PhysioNet 2012, which is different from PhysioNet 2019 in that it is sampled with irregular intervals. In PhysioNet 2019, all records are sampled with one-hour intervals. However, in 2012 data, intervals are irregular (could be anything, e.g., 36 minutes, 1 hour and 21 minutes). This creates a mismatch between times and final_index in your code. I tried to modify the code but it didn't seem to work. Is there an easy way to do that? Thanks.
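A hedged sketch of the general approach (this is not the experiments code): with torchcde the irregular timestamps can be included as a channel of the path, and, in versions of the library that accept an explicit t argument, the knots of the spline can also be placed at the true observation times. The shapes below are made up.

import torch
import torchcde

# Irregularly spaced observation times for one record, in hours.
t = torch.tensor([0.0, 0.6, 1.95, 2.3, 4.0])
values = torch.randn(1, 5, 3)  # (batch, length, channels)

# Include time itself as the first channel, as in the paper.
t_ = t.unsqueeze(0).unsqueeze(-1)
x = torch.cat([t_, values], dim=2)

# If supported by your torchcde version, place the spline's knots at the
# true observation times instead of at 0, 1, 2, ...
coeffs = torchcde.natural_cubic_spline_coeffs(x, t=t)
X = torchcde.NaturalCubicSpline(coeffs, t=t)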

Low performance on various GPUs

Dear Professor,

I'm a third-year student interested in your work, and I'm starting with Neural Controlled Differential Equations for Irregular Time Series.

I have a question and I hope you have time to respond.

I wonder which GPUs you used for your experiments, and how many days you had to wait to achieve the results in the paper. I tried to run your code with some updates for compatibility with new versions of the libraries. Some of the methods used to preprocess the UEA datasets have since been deprecated, so I started with the Speech Commands dataset. I ran the code on three types of GPU:

1. 1x 1050 Ti 4GB (70-90 s / epoch)
2. 1x A100 40GB (120-144 s / epoch)
3. 1x 3090 Ti 12GB (40-50 s / epoch)

With the above GPUs I got quite bad results: around 0.57-0.58 accuracy on the validation set (and on the training set too) the first time. The second time I ran on the A100, and after 70 epochs the validation accuracy was 0.213. I just want to know which GPUs you used when you ran the code; maybe I went wrong somewhere.

[Screenshot omitted.]

Best regards,
Tien Dung

How to predict Time Series?

Hello,
I've been playing around with the example for a bit and cannot solve one issue. I want to predict a time series, for now a simple cosine function, irregularly sampled at 250 points.
What I have been doing so far is to split the time series into chunks of length 50, and fit a spline on each chunk with its corresponding timestamps rescaled to [0, 1].
I set the times for the model to linspace(0, 1, 50). This does not work. How should I go about setting everything up properly? Here is my code for reference (I hope it is not too much of a mess).
(Only the relevant part differs from the example. Additionally, I changed the solver to dopri8, available in the newest version of torchdiffeq; it gave better results than all the other solvers, at the cost of being slightly slower.)

import numpy as np
import torch
# (controldiffeq and the NeuralCDE model are assumed to be imported as in the example.)

def get_data():
    # some time series I had at hand
    d = np.load('/XXX/tsdata/300.npy')
    
    # the timestamps
    t = d[0]
    
    # generating the cos function
    Xfull = np.cos(t*10*np.pi)
    Xfull = (Xfull - np.mean(Xfull)) / np.std(Xfull)
    
    ts = []
    Xs = []
    ys = []
    
    tstest = []
    Xstest = []
    ystest = []
    
    # splitting the dataset into train and test
    for i in range(0, len(Xfull)-101):
        ts.append(t[i:i+50])
        ts[-1] = (ts[-1] - np.min(ts[-1])) / (np.max(ts[-1]) - np.min(ts[-1]))
        Xs.append(Xfull[i:i+50])
        ys.append(Xfull[i+50])

    for i in range(len(Xfull)-101, len(Xfull)-51):
        tstest.append(t[i:i+50])
        tstest[-1] = (tstest[-1] - np.min(tstest[-1])) / (np.max(tstest[-1]) - np.min(tstest[-1]))
        Xstest.append(Xfull[i:i+50])
        ystest.append(Xfull[i+50])
        
    return (torch.Tensor(ts), torch.Tensor(Xs), torch.Tensor(ys),
            torch.Tensor(tstest), torch.Tensor(Xstest), torch.Tensor(ystest))



def main(): 
    from collections import defaultdict
    train_t, train_X, train_y, test_t, test_X, test_y = get_data()


    device = torch.device('cuda')
    model = NeuralCDE(input_channels=1, hidden_channels=12, output_channels=1).to(device)
    optimizer = torch.optim.Adam(model.parameters())
    
    # getting the spline coefficients
    # done like this because the timestamps are different for each chunk
    train_coeffs = []
    for t, X in zip(train_t, train_X):
        train_coeffs.append(controldiffeq.natural_cubic_spline_coeffs(t, torch.unsqueeze(X, 0).unsqueeze(-1)))
    elements = defaultdict(list)

    final_coeffs = []
    for c in train_coeffs:
        for ix, el in enumerate(c):
            elements[ix].append(el)

    for i in elements.keys():
        final_coeffs.append(torch.stack(elements[i]).squeeze(1))

    train_coeffs = tuple(final_coeffs)
    train_y = train_y.to(device)
    
    # getting the spline coefficients for the test data
    test_coeffs = []
    for t, X in zip(test_t, test_X):
        test_coeffs.append(controldiffeq.natural_cubic_spline_coeffs(t, torch.unsqueeze(X, 0).unsqueeze(-1)))
    elements = defaultdict(list)

    final_coeffs = []
    for c in test_coeffs:
        for ix, el in enumerate(c):
            elements[ix].append(el)

    for i in elements.keys():
        final_coeffs.append(torch.stack(elements[i]).squeeze(1))

    test_coeffs = tuple(final_coeffs)
    test_y = test_y.to(device)
    
    train_dataset = torch.utils.data.TensorDataset(*train_coeffs, train_y)
    train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=256, shuffle=True)

    test_dataset = torch.utils.data.TensorDataset(*test_coeffs, test_y)
    test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=256, shuffle=False)

    criterion = torch.nn.MSELoss()

    ts = torch.Tensor(np.linspace(0, 1, train_t.shape[1])).to(device)

    test_predictions = []

    for epoch in range(500):
        for batch in train_dataloader:
            *batch_coeffs, batch_y = batch
            coeffs = []
            for c in batch_coeffs:
                coeffs.append(c.to(device))
            pred_y = model(ts, coeffs).squeeze(-1)
            loss = criterion(pred_y, batch_y)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

        model.eval()
        with torch.no_grad():
            test_predictions = [[], []]
            for batch in test_dataloader:
                *batch_coeffs, batch_y = batch
                coeffs = []
                for c in batch_coeffs:
                    coeffs.append(c.to(device))
                pred_y = model(ts, coeffs).squeeze(-1)
                test_predictions[0].append(pred_y.detach().cpu().numpy())
                test_predictions[1].append(batch_y.detach().cpu().numpy())
        model.train()
        
        np.save(f"/home/XXX/preds/{epoch}_p.npy", np.array(test_predictions))
            
        print('Epoch: {}   Training loss: {}'.format(epoch, loss.item()))

underflow error

I have been getting "underflow in dt 0.0" after a few epochs of training. I am using Adam with a learning rate of 1e-5 (decreased from 1e-3; still not working). Any idea why this is happening? Do you have any suggestions for avoiding this type of error?
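Not an authoritative diagnosis, but "underflow in dt" comes from torchdiffeq's adaptive step-size solvers shrinking the step below floating-point resolution, typically because the learnt vector field has become very stiff. Assuming cdeint forwards solver keyword arguments to torchdiffeq (as recent versions do), two things one could try, with X, func and z0 as in the README example:

# Fixed-step fourth-order Runge-Kutta: no adaptive steps, so no underflow.
z_T = torchcde.cdeint(X=X, func=func, z0=z0, t=X.interval,
                      method='rk4', options=dict(step_size=0.05))

# Or keep the adaptive solver but loosen its tolerances.
z_T = torchcde.cdeint(X=X, func=func, z0=z0, t=X.interval,
                      rtol=1e-3, atol=1e-5)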

Forecasting a time series

Hi,
Thanks for your work.

I was wondering about a case like this:
You have data which was recorded between 0.0 and 2.0 seconds.
You try to predict the next 2 seconds, i.e. from 2.0 to 4.0.
In your example:

  z_T = torchcde.cdeint(X=X,
                        z0=z0,
                        func=self.func,
                        t=torch.tensor([2.0, 2.5, 3.0, 3.5, 4.0]))

Is it valid to set t=torch.tensor([2.0, 2.5, 3.0, 3.5, 4.0]) even though the supplied observations are in the range 0.0 to 2.0 seconds?
Or do I have to solve the whole thing, i.e. supply an input from 0.0 to 4.0 with missing data points between 2.0 and 4.0?

Thanks
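A hedged sketch of the second option, padding the future with missing values (whether extrapolating beyond the observed data is sensible is a modelling question rather than an API one; the shapes and fill choices below are illustrative only, and func and z0 are as in the README example):

import torch
import torchcde

# Observations on [0, 2]; the prediction times on (2, 4] are appended with
# NaN values, which the interpolation treats as missing data.
t_obs = torch.linspace(0., 2., 21)
t_fut = torch.tensor([2.5, 3.0, 3.5, 4.0])
t = torch.cat([t_obs, t_fut])

x_obs = torch.randn(1, 21, 2)
x_fut = torch.full((1, 4, 2), float('nan'))
x = torch.cat([x_obs, x_fut], dim=1)

coeffs = torchcde.natural_cubic_spline_coeffs(x, t=t)
X = torchcde.NaturalCubicSpline(coeffs, t=t)

# [2.0, 4.0] now lies inside the spline's domain, so the solver can be asked
# for the hidden state at the prediction times.
z = torchcde.cdeint(X=X, func=func, z0=z0,
                    t=torch.tensor([2.0, 2.5, 3.0, 3.5, 4.0]))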

train on gpu

In the example, train_coeffs is wrapped in a Python list. How do I move this to the GPU so that the entire model can be trained on a GPU?
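A minimal sketch, assuming train_coeffs is a list or tuple of tensors as in the older API: move each tensor individually with .to, then rebuild the container.

device = torch.device('cuda')

# Move every coefficient tensor onto the GPU, keeping the container type.
train_coeffs = tuple(coeff.to(device) for coeff in train_coeffs)

# The model (and any tensors built later, e.g. z0) must move as well.
model = model.to(device)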
