
Time channel unaltered after interpolation and automatic padding when missing values (torchcde issue, closed)


Comments (6)

patrick-kidger commented on June 1, 2024

What you're seeing isn't what you want to pass to an NCDE, I agree. It's not a bug, though -- you just need to switch a couple of operations around.

In what you've written here, you've appended NaNs and then appended time. Instead you should append time before appending NaNs. You've passed

[[0, NaN, NaN], [0.1667, value, value], ..., [0.8333, value, value], [1.0, NaN, NaN]]

(ignoring observational channels for now)
rather than

[[0.1667, NaN, NaN], [0.1667, value, value], ..., [0.8333, value, value], [0.8333, NaN, NaN]].

Looking at the irregular data example, the procedure for missing data should be performed before the procedure for variable length data. Incidentally I can appreciate the value of having an example that spells out everything explicitly, so I'll open an issue to track that.
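The effect of the ordering can be sketched with plain Python lists standing in for tensors. This is only an illustration: `forward_fill` is a hypothetical helper that mimics forward padding of missing values, not torchcde's implementation, and the timesteps and values are made up.

```python
import math

NAN = math.nan


def forward_fill(rows):
    """Replace each NaN with the most recent non-NaN value in the same channel.

    Every channel is treated identically; nothing here knows which one is time.
    """
    filled = []
    last = [NAN] * len(rows[0])
    for row in rows:
        last = [prev if math.isnan(x) else x for prev, x in zip(last, row)]
        filled.append(list(last))
    return filled


times = [0.0, 0.25, 0.5, 0.75, 1.0]   # illustrative time grid
values = [1.0, 2.0, 3.0]              # a shorter sequence: 3 of 5 steps observed
max_len = len(times)

# Order A: pad the values with NaN first, then append the full time channel.
padded_values = values + [NAN] * (max_len - len(values))
order_a = [[t, v] for t, v in zip(times, padded_values)]

# Order B: append time first, then pad every channel (time included) with NaN.
with_time = [[t, v] for t, v in zip(times, values)]
order_b = with_time + [[NAN, NAN] for _ in range(max_len - len(with_time))]

filled_a = forward_fill(order_a)
filled_b = forward_fill(order_b)

print([row[0] for row in filled_a])  # time keeps advancing over the padding
print([row[0] for row in filled_b])  # time stops at the last observed step
```

With order B the padded rows repeat the last observed row in every channel, time included, so the padded tail adds no further change to the path.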

Hopefully this behaviour isn't too surprising: natural_cubic_coeffs and linear_interpolation_coeffs don't distinguish channels. They've got no way to know which channel is the time channel, so they're not going to treat it specially. [1] Additionally, I'd note that the input to natural_cubic_coeffs and linear_interpolation_coeffs is pretty much the same as the input for an RNN, so there's little NCDE-specific going on here.

(To be precise: there are two very minor differences. (a) You need to replace NaN values with e.g. zeros for RNNs; here natural_cubic_coeffs and linear_interpolation_coeffs fill them in for you. (b) NCDEs have a time channel and a cumulative observation channel; RNNs typically use the first difference of that and have delta-time and observational mask channels.)
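The two differences in the parenthetical can be made concrete with a small sketch, again using plain lists in place of tensors. The channel layout, the variable names, and the convention of setting the first delta-time to zero are assumptions made for this illustration, not something torchcde prescribes.

```python
import math

times = [0.0, 0.5, 1.0, 2.0]                 # illustrative timestamps
values = [1.0, math.nan, 3.0, math.nan]      # one channel with missing values

# Observational mask: 1.0 where a value was observed, 0.0 where it is missing.
mask = [0.0 if math.isnan(v) else 1.0 for v in values]

# NCDE-style input: time, cumulative observation count, raw value (NaN kept,
# since the coefficient functions fill missing values themselves).
cumulative = []
running = 0.0
for m in mask:
    running += m
    cumulative.append(running)
ncde_rows = [[t, c, v] for t, c, v in zip(times, cumulative, values)]

# RNN-style input: delta-time, mask, value with NaN replaced by zero.
# Assumption: the first delta-time is set to 0.0.
delta_t = [0.0] + [b - a for a, b in zip(times, times[1:])]
rnn_rows = [
    [dt, m, 0.0 if math.isnan(v) else v]
    for dt, m, v in zip(delta_t, mask, values)
]

# The mask is the first difference of the cumulative observation channel.
first_diff = [b - a for a, b in zip([0.0] + cumulative, cumulative)]
print(first_diff == mask)
```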

[1] With one exception: linear_interpolation_coeffs(..., rectilinear=value), in which scenario you are explicitly specifying the time channel precisely because it can't be inferred.
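To see why a rectilinear path needs to be told which channel is time, here is a rough sketch assuming the usual meaning of "rectilinear": between two observations, time advances first while the other channels are held, and only then do the other channels jump to their new values. `rectilinear_path` is a hypothetical helper written for this illustration. It ignores missing values, and torchcde's actual implementation may differ in its details.

```python
def rectilinear_path(rows, time_index):
    """Interleave rows so that time moves first and the other channels follow."""
    path = [list(rows[0])]
    for prev, curr in zip(rows, rows[1:]):
        lagged = list(prev)
        lagged[time_index] = curr[time_index]  # time advances, values are held
        path.append(lagged)
        path.append(list(curr))                # values then jump at fixed time
    return path


rows = [[0.0, 1.0], [1.0, 5.0], [2.0, 2.0]]    # [time, value], illustrative data
path = rectilinear_path(rows, time_index=0)
for point in path:
    print(point)
```

A sequence of length L becomes a path with 2L - 1 points here, and choosing a different `time_index` gives a different path, which is why this channel cannot be left implicit.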


patrick-kidger commented on June 1, 2024

I've now updated the irregular data example -- how clear do you find it?


JoaquinGajardo commented on June 1, 2024

Thanks for the explanation, Patrick. The irregular data example is pretty clear; it's nice to have all the cases together in one go.

In my case the data has some batches that are shorter and are already filled with missing values, and there is no irregular sampling, so there is just one set of timesteps for all batches. In this case I can't really switch the operations around; I can only append my fixed timesteps tensor as a channel to each batch.

Anyway, what you suggest, based on the numbers you printed, is that I need to pad the time channel myself (depending on each batch's valid length) before appending it, because the interpolation can't infer which channel is the time channel. If so, that makes sense, and it has actually been my workaround so far.
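A minimal sketch of that workaround, assuming the valid length of each sequence is already known. Plain lists stand in for tensors, and `time_channel_for`, `valid_lengths`, and the data are hypothetical names and values chosen for the example.

```python
import math


def time_channel_for(times, valid_length):
    """Copy of `times` with every entry past `valid_length` set to NaN."""
    return [t if i < valid_length else math.nan for i, t in enumerate(times)]


times = [0.0, 0.25, 0.5, 0.75, 1.0]            # one fixed grid for all sequences
batch = [
    [1.0, 2.0, 3.0, 4.0, 5.0],                 # full-length sequence
    [1.0, 2.0, 3.0, math.nan, math.nan],       # shorter, already NaN-filled
]
valid_lengths = [5, 3]                         # assumed to be known in advance

# Append the per-sequence, NaN-padded time channel in front of the values.
batch_with_time = [
    [[t, v] for t, v in zip(time_channel_for(times, n), seq)]
    for seq, n in zip(batch, valid_lengths)
]

print(batch_with_time[1])
```

Because the time channel is now NaN wherever the values are, a channel-agnostic forward fill pads time and values together.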

I don't know if it's possible without losing generality, but I thought it would be nice if the interpolation could somehow take care of padding the time channel too. The time channel position would then need to be provided as an argument to the cubic and linear interpolation, which maybe you would prefer to avoid.


patrick-kidger commented on June 1, 2024

Glad to hear that it's resolved.

I'd note that the time channel is not treated any differently from the other channels when using linear interpolation or natural cubic splines. All of them are just padded forwards. What special behaviour were you expecting?


JoaquinGajardo commented on June 1, 2024

Thanks!

As I said, maybe it doesn't make sense in the general case, but what would have suited my case is for the coeffs functions in torchcde to detect the last (at least partially) valid timestep by looking at all the channels, and then pad the time channel along with the other channels. For instance, in the example that I put, it would know that the last timestep (1.0) has all the channels invalid, so it can safely pad the time channel too.
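The behaviour described here is not something torchcde provides, but it can be approximated before calling the coeffs functions. The sketch below uses plain lists in place of tensors. `last_valid_index` and `append_padded_time` are hypothetical helpers, and placing the time channel first is an arbitrary choice for the example.

```python
import math


def last_valid_index(rows):
    """Index of the last row with at least one non-NaN channel, or -1 if none."""
    for i in range(len(rows) - 1, -1, -1):
        if any(not math.isnan(x) for x in rows[i]):
            return i
    return -1


def append_padded_time(times, rows):
    """Put a time channel in front of `rows`, NaN after the last valid row."""
    cutoff = last_valid_index(rows)
    return [
        [t if i <= cutoff else math.nan] + list(row)
        for i, (t, row) in enumerate(zip(times, rows))
    ]


times = [0.0, 0.5, 1.0]
rows = [
    [1.0, math.nan],        # partially observed: still a valid timestep
    [math.nan, 2.0],        # partially observed: still a valid timestep
    [math.nan, math.nan],   # fully missing: treated as padding
]

with_time = append_padded_time(times, rows)
for row in with_time:
    print(row)
```

Only trailing rows that are missing in every channel are treated as padding. A fully missing row in the middle of a sequence keeps its timestamp.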

Again, I don't think this is crucial; in my case it would just save the need to manually append NaNs, for each batch, to the timestamps tensor that I append as a channel. Anyway, it's OK for me if we close the issue ;)


patrick-kidger commented on June 1, 2024

Alright then - closing.
