Comments (16)

patrick-kidger commented on May 31, 2024

So the issue here is that you only have one training sample. The usual pattern in supervised learning is to have lots of training samples, and train the model on the whole collection. Neural CDEs aren't any different here - one way or another you need to try and get more training data. (Which can be by taking your one sample and cutting it up into pieces as you did originally - I now understand why you were doing that.)
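
As a rough illustration of the "cutting it up into pieces" idea, here is a minimal sketch in plain Python. The helper name `make_windows` and its parameters (`input_len`, `target_len`, `stride`) are made up for this example and are not part of any library; the resulting pairs would still need converting to tensors and interpolating before being passed to a Neural CDE.

```python
def make_windows(series, input_len, target_len, stride=1):
    """Cut one long series into (input window, target window) pairs.

    Hypothetical helper for illustration only.
    """
    samples = []
    last_start = len(series) - input_len - target_len
    for start in range(0, last_start + 1, stride):
        split = start + input_len
        x = series[start:split]               # what the model sees
        y = series[split:split + target_len]  # what it should forecast
        samples.append((x, y))
    return samples


series = list(range(10))  # stand-in for a real time series
pairs = make_windows(series, input_len=4, target_len=2, stride=2)
for x, y in pairs:
    print(x, "->", y)
```

A smaller `stride` gives more (but more strongly overlapping) samples, so the windows are far from independent; holding out the final part of the series for validation is probably still sensible.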

In terms of forecasting forwards, I'd suggest using a sequence-to-sequence architecture, with a Neural CDE as the encoder and, for example, a Neural ODE as the decoder. This naturally captures the structure of the problem.

Lastly - and I realise this isn't done in the example, which is something I should fix (EDIT: now fixed) - I'd strongly recommend appending time as a channel to your input. So you have a two-dimensional input: one channel is time and the other channel is your observations. (This is for complicated mathematical reasons: CDEs don't notice the speed at which you pass data unless you explicitly tell them to; this is called the reparameterisation invariance property.)
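
To make the reparameterisation point concrete, here is a small sketch in plain Python rather than with the library. It solves a toy scalar CDE with explicit Euler steps; the vector fields (`cos(z)` on the observation channel, `-z` on the time channel) and the two paths are arbitrary choices for illustration. Both paths trace the same curve, `sin`, but at different speeds.

```python
import math


def solve_cde(times, values, use_time):
    """Explicit Euler for a toy scalar CDE driven by increments of the input path."""
    z = 0.0
    for k in range(len(values) - 1):
        # Observation channel: only the increment of X enters, never the timestamps.
        dz = math.cos(z) * (values[k + 1] - values[k])
        if use_time:
            # Time channel: now the speed of traversal matters.
            dz += -z * (times[k + 1] - times[k])
        z += dz
    return z


n = 20000
times = [2.0 * k / n for k in range(n + 1)]
steady = [math.sin(t) for t in times]            # X(t) = sin(t)
warped = [math.sin(t * t / 2.0) for t in times]  # same curve, different speed

plain_steady = solve_cde(times, steady, use_time=False)
plain_warped = solve_cde(times, warped, use_time=False)
timed_steady = solve_cde(times, steady, use_time=True)
timed_warped = solve_cde(times, warped, use_time=True)

print("without time channel:", plain_steady, plain_warped)
print("with time channel:   ", timed_steady, timed_warped)
```

Without the time channel the two results agree up to discretisation error, so the model cannot tell the two inputs apart. With time appended they differ.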

from neuralcde.

he-ritter commented on May 31, 2024

Thanks for the suggestions! After implementing them, things started working out. I tried a NeuralODE decoder for Seq2Seq, but the results were actually worse than just using the NeuralCDE model. Can you maybe say what a reasonable approach to setting the initial z0 would be? In the example, it is set to all 0s, but I got better results when I set it to e.g. an encoding of the first X values of the input time series. Otherwise, if I understand correctly, it would be impossible for the CDE to capture trends in the time series correctly (as z0 + integral(f(z) * dX/dt) cannot be moved up/down if z0 is always 0).
I understand that the example is only set up for the particular problem presented there, and that the approach taken there is probably not the ideal one for every use case.
But so far, this package is doing wonders for time series prediction with a low number of samples. Other models never worked, whereas this finds a reasonable solution quite quickly. Impressive!

patrick-kidger commented on May 31, 2024

That's quite a lot of code! Try reducing things down to a minimal (10-20 line) example of your issue.

Additionally, providing a more thorough description of the error (tracebacks etc.) will help.

From what you're saying, I'd begin by asking why you're splitting things up into chunks? The typical pattern is to interpolate a spline through the whole time series, and then run a Neural CDE over the whole thing.

he-ritter commented on May 31, 2024

Yeah, it's a lot, sorry. Thanks for the advice though, that already helps! I am still kind of stuck in the mindset of 'regular' neural networks I guess. So if I understand correctly, I fit a network on a single example of a time series (i.e. 1 batch with 1 element), where I predict e.g. 10 output channels to look 10 steps ahead?

he-ritter commented on May 31, 2024

Hello,
I tried your suggestion of interpolating a spline through the whole time series, but this just led to it overfitting on the training data practically immediately. I think I am doing something majorly wrong. I tried to condense my example as much as possible. Thank you for the help!

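import numpy as np
import torch
import controldiffeq  # the package shipped in the NeuralCDE repository
# NeuralCDE is assumed to be the model class from the repository's example

def get_data():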
    # some time series I had at hand
    d = np.load('/XXX/tsdata/300.npy')
    
    # the timestamps (250 values)
    # just some irregularly spaced intervals
    t = d[0]
    
    # generating the cos function
    Xfull = np.cos(t*10*np.pi)
    Xfull = (Xfull - np.mean(Xfull)) / np.std(Xfull)
    
    ts = t[0:150]
    ts = (ts - np.min(ts)) / (np.max(ts) - np.min(ts))
    Xs = Xfull[0:150]
    ys = Xfull[150:200]
    
    tstest = t[150:200]
    tstest = (tstest - np.min(tstest)) / (np.max(tstest) - np.min(tstest))
    Xstest = Xfull[150:200]
    ystest = Xfull[200:250]
        
    return (torch.Tensor(ts), torch.Tensor(Xs).unsqueeze(0), torch.Tensor(ys).unsqueeze(0),
            torch.Tensor(tstest), torch.Tensor(Xstest).unsqueeze(0), torch.Tensor(ystest).unsqueeze(0))



def main():
    train_t, train_X, train_y, test_t, test_X, test_y = get_data()

    model = NeuralCDE(input_channels=1, hidden_channels=12, output_channels=50)
    optimizer = torch.optim.Adam(model.parameters())

    # getting the spline coefficients for the train data
    train_coeffs = controldiffeq.natural_cubic_spline_coeffs(train_t, train_X.unsqueeze(-1))

    # getting the spline coefficients for the test data
    test_coeffs = controldiffeq.natural_cubic_spline_coeffs(test_t, test_X.unsqueeze(-1))
    
    train_dataset = torch.utils.data.TensorDataset(*train_coeffs, train_y)
    train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=256)

    test_dataset = torch.utils.data.TensorDataset(*test_coeffs, test_y)
    test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=256)

    criterion = torch.nn.MSELoss()

    for epoch in range(100):
        for batch in train_dataloader:
            *batch_coeffs, batch_y = batch
            pred_y = model(train_t, batch_coeffs)
            loss = criterion(pred_y, batch_y)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

        model.eval()
        with torch.no_grad():
            for batch in test_dataloader:
                *batch_coeffs, batch_y = batch
                pred_y = model(test_t, batch_coeffs)
                loss = criterion(pred_y, batch_y)
        model.train()
        
        print('Epoch: {}   Validation loss: {}'.format(epoch, loss.item()))

if __name__ == '__main__':
    main()

andrewcztrack commented on May 31, 2024

Hi @patrick-kidger!!
Love the work.
In terms of predicting forward, how would this work in practice?

Would the repo below be a good template?
This seems like quite a challenging piece of code to write.

https://github.com/bentrevett/pytorch-seq2seq

patrick-kidger commented on May 31, 2024

That looks about right, yes. Seq2seq models aren't that difficult, though! Completely untested example:

import torch

class Seq2Seq(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.encoder = torch.nn.GRU(input_size, hidden_size)
        self.decoder = torch.nn.GRU(1, hidden_size)
        self.readout = torch.nn.Linear(hidden_size, output_size)

    def forward(self, seq):
        # seq is of shape (length, batch, input_size)
        _, hidden1 = self.encoder(seq)
        # hidden1 is of shape (1, batch, hidden_size), which is the shape
        # the decoder expects for its initial hidden state
        times = torch.linspace(0, 1, seq.size(0), device=seq.device)
        # the decoder input must be of shape (length, batch, 1)
        times = times.view(-1, 1, 1).repeat(1, seq.size(1), 1)
        hidden2, _ = self.decoder(times, hidden1)
        # hidden2 is of shape (length, batch, hidden_size)
        return self.readout(hidden2)  # of shape (length, batch, output_size)

andrewcztrack commented on May 31, 2024

@patrick-kidger Thank you! That's so generous!!
Although I'm not sure how to use it.
Would it take an n x d dataframe to predict an n x 1 dataframe?

patrick-kidger commented on May 31, 2024

This particular example takes an input of some length, and produces an output of the same length. (A simple choice just for illustrative purposes.) If you only want a length-1 output then you could replace the decoder in the above example with an MLP, for example.

I'd suggest having a read up on how these models work.

andrewcztrack commented on May 31, 2024

ah cool cool :). thank you! @patrick-kidger

andrewcztrack commented on May 31, 2024

this is so cool :)!!!

patrick-kidger commented on May 31, 2024

@he-ritter - you're completely correct that the initial z0 shouldn't be all zeros, and should instead be a function (e.g. a small MLP) of the first value of the time series, as otherwise the model is translation-invariant. This was a mistake with the example that I've now fixed!

I'm very glad things seem to be working out for you!
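
For anyone wanting to see the translation invariance directly, here is a toy sketch in plain Python, not the library's code. The scalar vector field `cos(z)` and the function `initial_state` (a single `tanh` unit standing in for a small MLP, with made-up `weight` and `bias`) are assumptions for illustration only.

```python
import math


def solve_cde(values, z0):
    """Explicit Euler for the toy scalar CDE dz = cos(z) dX."""
    z = z0
    for prev, curr in zip(values, values[1:]):
        z += math.cos(z) * (curr - prev)  # only increments of X are used
    return z


def initial_state(first_value, weight=0.5, bias=0.1):
    """Hypothetical stand-in for a small learned network applied to X(t0)."""
    return math.tanh(weight * first_value + bias)


series = [math.sin(0.1 * k) for k in range(50)]
shifted = [x + 3.0 for x in series]  # the same series moved up by a constant

zero_plain = solve_cde(series, 0.0)
zero_shifted = solve_cde(shifted, 0.0)
learned_plain = solve_cde(series, initial_state(series[0]))
learned_shifted = solve_cde(shifted, initial_state(shifted[0]))

print("z0 = 0:          ", zero_plain, zero_shifted)
print("z0 from X(t0):   ", learned_plain, learned_shifted)
```

With `z0 = 0` the shift is invisible to the model, because adding a constant to X leaves every increment unchanged (up to floating-point rounding). Once z0 depends on the first observation, the two inputs lead to different hidden states.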

lorrp1 commented on May 31, 2024

@andrewcztrack @he-ritter I'm wondering if you managed to compare it to other forecasting models.

 avatar commented on May 31, 2024

@he-ritter @andrewcztrack Could you please share a small working example for time series if you've implemented it successfully?

 avatar commented on May 31, 2024

@patrick-kidger Hi, when using the Seq2Seq model, will it replace the CDEFunc model inside the NeuralCDE?

patrick-kidger commented on May 31, 2024

No. That remains unchanged: you would use the final state of the CDE as the initial condition of a neural ODE.
I suggest familiarising yourself with how it would work with RNNs, and then making the analogy to CDEs/ODEs.
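
Purely as a sketch of how the pieces connect, here is the encoder/decoder hand-off in plain Python with a scalar hidden state and explicit Euler steps. None of this is the library's API: the vector fields, the linear readout, and the names `encode` and `decode` are all made up for illustration. A real model would use learned networks and a proper solver.

```python
import math


def encode(times, values, z0):
    """Toy CDE encoder over (time, observation) channels; returns the final state."""
    z = z0
    for k in range(len(values) - 1):
        dt = times[k + 1] - times[k]
        dx = values[k + 1] - values[k]
        z += -0.5 * z * dt + math.cos(z) * dx  # arbitrary toy vector fields
    return z


def decode(z_final, horizon, dt=0.1):
    """Toy ODE decoder dh/dt = g(h), started from the encoder's final state."""
    h = z_final
    outputs = []
    for _ in range(horizon):
        h += dt * math.tanh(1.0 - h)  # arbitrary toy vector field g
        outputs.append(2.0 * h)       # toy linear readout
    return outputs


times = [0.1 * k for k in range(30)]
values = [math.sin(t) for t in times]
z0 = math.tanh(values[0])  # initial state depends on the first observation

forecast = decode(encode(times, values, z0), horizon=5)
print(forecast)
```

The only coupling between the two halves is the single state passed from `encode` to `decode`, which mirrors how the final hidden state of an RNN encoder initialises an RNN decoder.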
