ddrummond commented on August 28, 2024

Hi! I'm applying dl4j to a time series regression problem with 2 output variables. I reviewed both the regression and RNN pages and code examples and cobbled together a working example. Training and prediction run without error, but I can't really tell whether it's working "right" because the regression output isn't very good. Before I charge ahead tuning the network and adding more input features, I'd like some confirmation that I've set the network up correctly for the problem. Can someone review my gist and answer my questions regarding RNN + regression on 2 output nodes? Thanks in advance.

Here is the gist of the code along with feature and label files for training and test phases: https://gist.github.com/ddrummond/5e05e54f6d79c900c8491b4ab8c1b34f

Problem Description:
Time series forecast of minor reversal points for a single security.
A "minor reversal point" is defined as either a period (day) with a high price greater than both the previous and next high prices, or a period with a low price lower than both the previous and next low prices.
This is a time series regression problem with 8 input features and 2 regression output variables.
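
To make the definition above concrete, here is a minimal sketch in plain Java (no dl4j) that flags minor reversal points from arrays of daily highs and lows. The class and method names are hypothetical and the prices in `main` are made up for illustration; they are not taken from the gist.

```java
import java.util.ArrayList;
import java.util.List;

public class ReversalPoints {
    /**
     * Returns the indices of periods whose high is above both neighbouring highs,
     * or whose low is below both neighbouring lows.
     */
    static List<Integer> minorReversalPoints(double[] high, double[] low) {
        List<Integer> points = new ArrayList<>();
        // The first and last periods have only one neighbour, so they are skipped.
        for (int i = 1; i < high.length - 1; i++) {
            boolean peak = high[i] > high[i - 1] && high[i] > high[i + 1];
            boolean trough = low[i] < low[i - 1] && low[i] < low[i + 1];
            if (peak || trough) {
                points.add(i);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Made-up prices, for illustration only.
        double[] high = {10, 12, 11, 11.5, 13, 12};
        double[] low = {9, 10, 9.5, 8, 11, 10.5};
        System.out.println(minorReversalPoints(high, low));
    }
}
```

Note that the last period can never be labelled with this rule, because the definition needs the next day's high and low.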

Input Features:

  1. period return r = (p2 - p1) / p1, descr: a linear return value, typically [-0.05,0.05]
  2. period volume, descr: standardized, z = (x - Mean) / SD
  3. High-Low spread, descr: standardized, z = (x - Mean) / SD
  4. overnight return ((open - previousClose) / previousClose), descr: a linear return value, typically [-0.05,0.05]
  5. open position in Range (open / (high - low)), descr: [0-1]
  6. close position in Range (close / (high - low)), descr: [0-1]
  7. return since last minor reversal point, descr: linear return value, typically [-0.10,0.10]
  8. periods (days) since last minor reversal point, descr: Standardized, z = (x - Mean) / SD

Regression Targets:

  1. linear return of next reversal point relative to current price, descr: linear return value, typically [-0.10,0.10]

  2. periods (days) until next reversal point, descr: Standardized, z = (x - Mean) / SD

    NOTE: Regarding regression target "periods (days)" we'll have to adjust the standardized output of "periods (days)" to the problem domain scale when using the prediction value (i.e. adjustedPeriods = prediction * SD + Mean)
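
The standardization and the conversion back to the problem domain described in the note can be sketched as follows. This is plain Java under stated assumptions: `Standardizer` is a hypothetical helper, it uses the population standard deviation, and the mean and SD are computed once on training data and then reused for predictions.

```java
public class Standardizer {
    final double mean;
    final double sd;

    /** Computes mean and SD from the training values only. */
    Standardizer(double[] training) {
        double sum = 0;
        for (double x : training) {
            sum += x;
        }
        mean = sum / training.length;
        double squared = 0;
        for (double x : training) {
            squared += (x - mean) * (x - mean);
        }
        // Population SD (divide by n); this choice is an assumption of the sketch.
        sd = Math.sqrt(squared / training.length);
    }

    /** z = (x - Mean) / SD */
    double standardize(double x) {
        return (x - mean) / sd;
    }

    /** adjusted = prediction * SD + Mean */
    double destandardize(double z) {
        return z * sd + mean;
    }

    public static void main(String[] args) {
        // Made-up "periods (days)" values, for illustration only.
        Standardizer s = new Standardizer(new double[]{2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println(s.standardize(9));
        System.out.println(s.destandardize(-3));
    }
}
```

With these made-up values the mean is 5 and the SD is 2, so a standardized prediction of -3 converts back to -1 days. Nothing in the inverse transform prevents a negative result; any prediction below -Mean/SD maps to a negative number of days.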

My Questions:
After reading the tutorials on RNN (http://deeplearning4j.org/usingrnns.html) and Regression (http://deeplearning4j.org/linear-regression) and reviewing the example code I have the following questions regarding timeseries regression with multiple outputs:

  1. Am I taking the right general approach for loading the data?
    a) I have 1 stock, so I used minibatch=1
    b) 1 file for the training feature set (8 columns), 1 file for the training label set (2 columns). 1 line for each day of history (228 days). The lines of each file match up.
    c) Out-of-sample test data in separate training and label files (same format as training files but only has 27 lines)
  2. I combined the net configuration from the GravesLSTMCharModellingExample and RegressionSum example. Do I have the correct understanding of the input and output shape?
    a) I have 1 stock, so I used minibatch=1
    b) INDArray dimensions for training are: miniBatchSize x (FeatureSize + LabelSize) x TrainingTimeHistoryLength, in this case: (1 x 10 x 228). Correct?
    c) INDArray dimensions for output are: miniBatchSize x LabelSize x TestTimeHistoryLength, in this case: (1 x 2 x 27). Is this correct?
    d) I was under the impression from the RNN tutorial that MultiLayerNetwork.rnnTimeStep(org.nd4j.linalg.api.ndarray.INDArray input) would predict 1 step at a time, but in my case it produced 1 value for each time step in the test data set. Is this the correct usage pattern, and does the network state update itself automatically between each time step under the covers?
    e) Surprisingly, I couldn't find a way to retrieve the output values from INDArray as a double[]; I had to retrieve each value individually, like so. Is this right? Is there an easier way?
                for (int i = 0; i < testSetLength; i++) {
                    predictionReturns[i] = prediction.getDouble(0, 0, i);
                    standardPredictionPeriods[i] = prediction.getDouble(0, 1, i);
                }
    f) Do you have an example of using a minibatchSize > 1 in a time series context? The description of Minibatch Size on this page (http://deeplearning4j.org/troubleshootingneuralnets) makes it sound like minibatch is purely a parallel processing consideration. However, when evaluating time series it seems like minibatch must equal the number of time series that you are trying to learn from. If that is true, then it seems to imply that IF I want to train my network on multiple examples (aka multiple stocks) over the same time span, I'd have to interleave multiple stock prices in the same file. For example, if I want to learn from the time series of 10 stocks and use a minibatchSize=10, I would have to write 10 price lines for t=0 (1 for each stock), then for t=1 write to lines 11-20, etc. Is this right? Is there a way to avoid merging all of the price files together?
    g) Is it appropriate to alternate calls to MultiLayerNetwork.fit() and MultiLayerNetwork.rnnTimeStep() if I want to update the net weights with new information, or is this implied as part of rnnTimeStep()?
  3. Can I train the net to produce non-standardized results, or is it better for the net to produce standardized results (as above) and then convert the values to the problem domain scale? (In the case of the period days output this would be (output * SD + Mean).) The latter approach causes me some problems, as described in the next bullet point.
  4. After I convert the "standardized" period (days) output to the domain-specific value by applying (output * SD + Mean), I sometimes get negative numbers. I realize this is the byproduct of minimizing error across two dimensions of standardized input/output, but raw negative numbers for this field are never encountered and are meaningless in the problem domain. Is there a way to force (or encourage) range constraint of outputs? In this case, one node would always output values in [-1, 1] and the other node values >= 0. Again, I'm talking about domain-specific values, so I'd love to get non-standardized values directly from the network rather than converting them. I imagine there are different ways to handle this problem. It's almost as if there is not enough weight placed on the period component when computing the score during backpropagation. Is there a way to create a custom error calculator (scorer thingy) that applies custom error weightings to the different output nodes? Alternatively, perhaps the mean and standard deviation computed during testing simply drifted too far from the values computed during training, resulting in negative conversion values. I suppose I could use a smaller training window, but this doesn't guarantee a solution and certainly promises to reduce the generalization of the network. Again, it seems like this could be avoided by getting non-standardized output directly from the network.
  5. It wasn't clear to me how to standardize the data starting from SequenceRecordReaderDataSetIterator or using Canova, so I just standardized it myself. Does it look OK for the standardization scheme to have some fields range over [0-1], others over [-0.1,0.1], and others be z scores?
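
For questions 2b and 2f, it may help to picture the minibatch as a plain 3D array. The sketch below is not dl4j code: it uses ordinary Java arrays, assumes a features-only layout of [miniBatchSize][featureSize][timeSteps], and the names `SequenceBatch` and `toBatch` are hypothetical. Each stock keeps its own table with one row per day, and the tables are stacked along the first dimension.

```java
public class SequenceBatch {
    /**
     * Input:  series[stock][timeStep][feature], one row per day as in a CSV file.
     * Output: batch[stock][feature][timeStep], the layout assumed in this sketch.
     */
    static double[][][] toBatch(double[][][] series) {
        int miniBatch = series.length;
        int timeSteps = series[0].length;
        int features = series[0][0].length;
        double[][][] batch = new double[miniBatch][features][timeSteps];
        for (int s = 0; s < miniBatch; s++) {
            if (series[s].length != timeSteps) {
                throw new IllegalArgumentException("all series must cover the same time span");
            }
            for (int t = 0; t < timeSteps; t++) {
                for (int f = 0; f < features; f++) {
                    batch[s][f][t] = series[s][t][f];
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        // Two made-up stocks, three days, two features each.
        double[][][] series = {
            {{0.01, 1.0}, {0.02, 1.1}, {-0.01, 0.9}},
            {{0.00, 2.0}, {0.03, 2.1}, {0.01, 2.2}}
        };
        double[][][] batch = toBatch(series);
        System.out.println(batch.length + " x " + batch[0].length + " x " + batch[0][0].length);
    }
}
```

In this picture, a minibatch of 10 stocks is ten separate tables stacked together, not ten interleaved lines per day in one table.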

from deeplearning4j-examples.

ivanstepanovftw commented on August 28, 2024

++

Tschigger commented on August 28, 2024

I'm a finance student thinking about writing a thesis about machine learning analysis for a specific type of derivative. I was able to set up the dl4j environment (great tutorials by the way), read a lot about recurrent neural networks, both on deeplearning4j and other sites, and played around a bit with the GravesLSTM example provided.

However, setting up just a very basic time-series example seems like an unfeasible step for me. I know developers here are busy, but providing a short example would make everyone able to get a quick hands-on and maybe even make them actively improve the code and/or provide further examples based on the one initially created.

Right now, the entry barrier for everyone trying to play around with time series in dl4j is just too high for "casual" programmers and I think a quick example (doesn't need to be complex, can be super-basic) would definitely change this.

AlexDBlack commented on August 28, 2024

No arguments there. We certainly need to make things easier here.

How about this: do some research and find me some data sets for this, maybe in the range of 10k to 1M time steps total (though smaller or larger might work too). Multi-variate input/output is fine. Data sets that have human-interpretable output would be better for an example.

If you can find something suitable, I'll try to put together a basic example in the next few days.

AlexDBlack commented on August 28, 2024

This might be useful:
http://mldata.org/
http://www.kdnuggets.com/datasets/index.html
http://www.datasciencecentral.com/profiles/blogs/big-data-sets-available-for-free
http://www.datasciencecentral.com/profiles/blogs/20-free-big-data-sources-everyone-should-check-out
http://www.datasciencecentral.com/profiles/blogs/great-sensor-datasets-to-prepare-your-next-career-move-in-iot-int?xg_source=activity
http://archive.ics.uci.edu/ml/datasets.html

Tschigger commented on August 28, 2024

This sounds great. I won't have time in the next two days, but I will get you something by the end of the week. The kind of data I am aiming for in my thesis will be pretty boring for the average user, but I have something else in mind which should be more interesting for people here:

I could provide a time series of bitcoin prices for the last 2 years at 1-minute intervals. That should be 1M time steps in total. I could also add the traded volume in each minute, so we get multi-variate input. I wouldn't be surprised if you get good results there, since bitcoin prices are pretty much only driven by supply/demand and trends. Other financial instruments are, in general, strongly influenced by real-world news, which makes training on past data much harder IMHO.

Tschigger commented on August 28, 2024

Hey,

I found time today to create the dataset. It runs from July 2013 until December 2015.
I increased the length of the time steps to a minimum of 10 minutes, because at shorter time steps the price was largely dependent on the last trade (last trade hitting an ask = high price, last trade hitting a bid = low price). BTW, the traded volume is measured in bitcoins, not USD.
http://www.filedropper.com/btcusd

I can also try to find some non-financial time series by the end of the week if you are not happy with this one. Don't put too much work into the results. The example can be super-basic; just giving people something to toy with will be wonderful already.

AlexDBlack commented on August 28, 2024

Thanks for the data.

I do have a few questions/concerns about this as an example though:

  • What specifically would we be predicting from this data set?
    • next price (or specifically price change)? Or price N steps ahead? Or price changes + volume at next step etc?
  • Would probably need a fair bit of preprocessing for this.
    • Can't predict prices directly (need normalized inputs to NN) -> price changes?
    • Presumably very different levels of (average) volume over the 2013-2015 period, could cause problems. Doing this properly would require a normalization scheme to handle this
    • A good example: would include time of day/day of week etc as inputs (presumably price changes depend on this)

Tschigger commented on August 28, 2024

A) Whatever you want. I think price 2-3 steps ahead seems the most intuitive

B)

  1. (Price normalization) Okay, wasn't completely sure if you have methods which automatically normalize the input. Will provide the log-returns. Should price movement itself be normalized too, so that starting price equals end price and expected return = 0?

  2. (Volume normalization) Should have checked that, my bad. Intuitively though, this might be hard to do. Volume in July 2013 was very low, in November 2013 (during the big bubble) it was high - even for today's standards. Will think about a way to do this.

  3. (Inputs) You are absolutely right, good idea. Will include day of the week and hour. Also, the number of ticks in the interval (number of different price levels) and maybe something else if I get creative until I come home.

Tschigger commented on August 28, 2024

Ok.

  1. Normalized prices. We have log-returns now (last column). This is what would be nice to predict.

  2. Volume normalized as far as that's possible. The problem is that at the beginning there are many time steps with zero volume, and I can't really change that. I did a basic linear normalization to get rid of the 'trend'.

  3. Added
    A) Day
    B) Hour
    C) Number of Ticks (number of bids/asks being hit)
    D) Variance of the Price in that time step
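
For readers unfamiliar with the log-returns mentioned in item 1, here is a minimal sketch in plain Java, assuming a simple array of prices ordered by time. The names are hypothetical and the prices are made up; this is not the script used to prepare the data set.

```java
public class LogReturns {
    /** r[t-1] = ln(price[t] / price[t-1]); the result has one element fewer than the input. */
    static double[] logReturns(double[] prices) {
        double[] returns = new double[prices.length - 1];
        for (int t = 1; t < prices.length; t++) {
            returns[t - 1] = Math.log(prices[t] / prices[t - 1]);
        }
        return returns;
    }

    public static void main(String[] args) {
        // Made-up prices, for illustration only.
        double[] r = logReturns(new double[]{100, 110, 99});
        System.out.println(r[0] + ", " + r[1]);
    }
}
```

One convenient property is that log-returns add up over time: the sum over several steps equals the log-return of the whole interval.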

If you have any questions, feel free to ask. Readme explains the structure. BTW, at end of August 2015 there are many blank rows. This is due to the server suspending trading. No clue what to do there. Problems start at Unix Time 1440394827. August 17th to August 24th, the market had problems in general. Maybe we should split the data here and define our training data as the data before 17th of August, and testing data as the data past August 24th?
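
The split proposed above (train on everything before the problem period, test on everything after it) could be sketched like this in plain Java. `GapSplit` and `Row` are hypothetical names, the two cutoffs are passed in as Unix timestamps, and the values in `main` are small made-up numbers, not the real dates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GapSplit {
    /** Hypothetical holder: a Unix timestamp plus the raw CSV line for that time step. */
    static final class Row {
        final long timestamp;
        final String line;

        Row(long timestamp, String line) {
            this.timestamp = timestamp;
            this.line = line;
        }
    }

    /** Training rows: everything strictly before the start of the gap. */
    static List<Row> train(List<Row> rows, long gapStart) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (r.timestamp < gapStart) {
                out.add(r);
            }
        }
        return out;
    }

    /** Test rows: everything at or after the end of the gap. */
    static List<Row> test(List<Row> rows, long gapEnd) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (r.timestamp >= gapEnd) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up timestamps; rows between 250 and 450 fall in the gap and are dropped.
        List<Row> rows = Arrays.asList(new Row(100, "a"), new Row(200, "b"),
                new Row(300, "c"), new Row(400, "d"), new Row(500, "e"));
        System.out.println(train(rows, 250).size() + " train, " + test(rows, 450).size() + " test");
    }
}
```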

http://www.filedropper.com/btcusd_1

Tschigger commented on August 28, 2024

August 15th - unix timestamp 1439682027
btc_10min line:
119.24418331969501,0,1,997.5949627008066,.000365909712118941,-.0026350476380051138
btc_60min line:
1626.9757215653303,0,1,5407.618989005082,.0022965163818520463,-.00932505282362648

August 25th - unix timestamp 1440582027
btc_10min line:
69.37682744664761,3,11,73.95588956557472,.00030399310005783063,-.003342472528277482
btc_60min line:
568.5045582433623,3,11,1181.636044151092,.0014780764785619163,-.007427372978020849

AlexDBlack commented on August 28, 2024

OK, thanks. I think I can work with that. I should be able to throw something basic together in the next few days. I'll keep you posted as to my progress.

Tschigger commented on August 28, 2024

If you don't get statistically significant results, it doesn't matter. It's just about a very basic example of how it's done, to get the whole thing rollin'.

AlexDBlack commented on August 28, 2024

Probably won't be able to get to this for a few more days. Had something important come up that takes priority.
I'll be using some new data loading stuff in 0.4-rc3.8 (which is as yet unreleased) so you wouldn't be able to use any example until the next dl4j release anyway.

Tschigger commented on August 28, 2024

No stress. Have exams coming up in January so I don't have time anyway. Thanks for the effort so far.

AlexDBlack commented on August 28, 2024

So I haven't looked at this until now. I've had higher priorities, sorry.
Link above to the data is dead, it seems.

If you still want an example like this, could you provide the data again?
Or, alternatively, take a look at this documentation that went up a little while ago: http://deeplearning4j.org/usingrnns.html
That page should have much of what you need to write your own version. 0.4-rc3.8 is out now, which has the data loading features needed for this.
(Note I was thinking of splitting up the data, possibly into weekly or two-weekly blocks. If you do that, loading data as per the linked page should be relatively straightforward)

Tschigger commented on August 28, 2024

Hey,

unfortunately I have final exams soon, starting at the end of January, so I don't have much time at the moment. I uploaded the data again for you: http://www.filedropper.com/btcusd (would be best if split training/testing at the dates specified in previous post).

By splitting the data up into weekly blocks, do you mean pre-preparation into multiple files?

AlexDBlack commented on August 28, 2024

Right, pre-preparation into multiple files. That would probably be easiest, given the current data import functionality we have.
Anyway, I've downloaded a local copy of that file. I'll try to get this up in the next couple of weeks.

Tschigger commented on August 28, 2024

Pre-prepared data into weekly blocks. Each file = one week. Each row = one time step. The output we want to forecast (log_return) is in the last column. Already split into training and testing data.
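
Since the link to the prepared files did not survive, here is a rough sketch of how such weekly blocks could be produced from timestamped rows. It is plain Java with hypothetical names, it only groups timestamps (writing one file per block is left out), and it treats a "week" as a fixed 7-day window counted from a chosen start time rather than a calendar week.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WeeklyBlocks {
    static final long SECONDS_PER_WEEK = 7L * 24 * 60 * 60;

    /**
     * Groups Unix timestamps (in seconds) into consecutive 7-day blocks.
     * Key: block number relative to `start`; value: the timestamps in that block.
     */
    static Map<Long, List<Long>> byWeek(long[] timestamps, long start) {
        Map<Long, List<Long>> blocks = new TreeMap<>();
        for (long ts : timestamps) {
            long week = Math.floorDiv(ts - start, SECONDS_PER_WEEK);
            blocks.computeIfAbsent(week, k -> new ArrayList<>()).add(ts);
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Made-up timestamps relative to start = 0.
        long[] ts = {0, 600, 604799, 604800, 1209600};
        System.out.println(byWeek(ts, 0));
    }
}
```

Each block would then become one sequence (one file), with one row per time step inside it.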

Here you go (download and save before link deactivates).

amo87 commented on August 28, 2024

+1

(link is deactivated)

eanie commented on August 28, 2024

For RNN data, would each file contain the examples (time steps) for one period of time? For example, file 1: 1-hour time steps for 24 hours; file 2: 1-hour time steps for the next 24 hours. Or would each file contain 1-hour time steps on a sliding window? When I train on a sliding-window set of data sources, I see huge spikes in error during training, as if the NN forgets what it has learnt.

tom2good commented on August 28, 2024

I have a similar request to Tschigger's.

Basically, I'd like to use RNN/LSTM to predict time series with multiple input and output variables. The variables are mostly real numbers; some are categorical. It would be nice if the new example showed how to handle such variables as well as time stamps.

tonythefreedom commented on August 28, 2024

Wow, I've been trying a very similar project to @tom2good's.
My data, including input and output, are all numbers.
I'm trying to predict a couple of the variable inputs in them.
I understand how to build the model so far, but I don't know how to reverse things to test variable inputs after training.
It would be very helpful if somebody coded this kind of time-series example.
Help me please, @AlexDBlack

Elroch commented on August 28, 2024

Note that for time series prediction, it may be possible to get predictions for multiple time steps "for free" by generating multiple stochastic predictions of the future: randomly sample the next time step and iterate, a sort of Monte Carlo simulation. This is essentially what is done in a superficially very different RNN example in DL4J (where samples are generated one character at a time, and future characters are sampled conditional on the previous ones).

To do this properly for a real-valued time series, it is necessary to model the probability distribution of each time step, not merely its mean. Predicting the mean and the variance allows a useful (if not precise) approximation of the next time step as a normal distribution, which can be used for Monte Carlo sampling as I mentioned.
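
The sampling idea above can be sketched as follows. This is plain Java, not dl4j: `OneStepModel` is a hypothetical stand-in for a trained network that returns a mean and a standard deviation for the next value, and for simplicity it conditions only on the last value, whereas a real RNN would also carry hidden state.

```java
import java.util.Random;

public class MonteCarloForecast {
    /** Hypothetical one-step model: returns {mean, standardDeviation} of the next value. */
    interface OneStepModel {
        double[] predict(double last);
    }

    /** Simulates `paths` possible futures of length `horizon`, starting from `start`. */
    static double[][] simulate(OneStepModel model, double start, int horizon, int paths, long seed) {
        Random rng = new Random(seed);
        double[][] out = new double[paths][horizon];
        for (int p = 0; p < paths; p++) {
            double x = start;
            for (int t = 0; t < horizon; t++) {
                double[] meanAndSd = model.predict(x);
                // Sample the next step from a normal approximation, then feed it back in.
                x = meanAndSd[0] + meanAndSd[1] * rng.nextGaussian();
                out[p][t] = x;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy model, for illustration only: mean reverts toward 0 with fixed noise.
        OneStepModel toy = last -> new double[]{0.9 * last, 0.1};
        double[][] futures = simulate(toy, 1.0, 5, 1000, 42L);
        double sum = 0;
        for (double[] path : futures) {
            sum += path[4];
        }
        System.out.println("mean of step 5 over all paths: " + sum / futures.length);
    }
}
```

Averaging (or taking quantiles) over the simulated paths at each step gives a multi-step forecast with an uncertainty band, at the cost of one model call per step per path.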

ddrummond commented on August 28, 2024

Elroch, yes, I saw that. I'll take another look to see if I can apply it here. In the meantime, I'm mostly trying to get some validation that I correctly understand the basics of how the RNN uses input and output, and that I'm handling the 2-variable regression correctly.

agibsonccc commented on August 28, 2024

This has been added. https://github.com/deeplearning4j/dl4j-0.4-examples/tree/master/src/main/java/org/deeplearning4j/examples/recurrent/basic

charlsjoseph commented on August 28, 2024

I tried the RNN examples. I'm getting the error below.

02:49:07.235 [main] WARN o.d.nn.conf.NeuralNetConfiguration - Layer not named l1 or l2 has been added to configuration but useRegularization is set to false.
Exception in thread "main" java.lang.UnsupportedOperationException: Illegal combination of indexes for vector
at org.nd4j.linalg.indexing.ShapeOffsetResolution.tryShortCircuit(ShapeOffsetResolution.java:113)
at org.nd4j.linalg.indexing.ShapeOffsetResolution.exec(ShapeOffsetResolution.java:325)
at org.nd4j.linalg.api.ndarray.BaseNDArray.get(BaseNDArray.java:3671)
at org.nd4j.linalg.api.ndarray.BaseNDArray.put(BaseNDArray.java:1787)
at org.deeplearning4j.nn.params.GravesLSTMParamInitializer.init(GravesLSTMParamInitializer.java:62)
at org.deeplearning4j.nn.layers.factory.DefaultLayerFactory.getParams(DefaultLayerFactory.java:107)
at org.deeplearning4j.nn.layers.factory.DefaultLayerFactory.create(DefaultLayerFactory.java:64)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.init(MultiLayerNetwork.java:361)
at org.deeplearning4j.examples.recurrent.basic.BasicRNNExample$.main(BasicRNNExample.scala:89)
at org.deeplearning4j.examples.recurrent.basic.BasicRNNExample.main(BasicRNNExample.scala)

manaswiv commented on August 28, 2024

Hi all,

Is there any improvement on this front? Has anyone uploaded a basic example of implementing an RNN on time series data with a moving window?

coreyauger commented on August 28, 2024

Hi,

It would appear to me that there is a bug.
https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/multilayer/MultiLayerNetwork.java#L2699

The input cannot be 2d (rank 2) and also have rank 3, can it?
If this is the case, a simple workaround is to take the returned value and call
out.tensorAlongDimension(0, 1, 0);

if (inputIs2d && input.rank() == 3 && layers[layers.length - 1].type() == Type.RECURRENT) {
    // Return 2d output with shape [miniBatchSize,nOut]
    // instead of 3d output with shape [miniBatchSize,nOut,1]
    return input.tensorAlongDimension(0, 1, 0);
}

RobAltena commented on August 28, 2024

@coreyauger If you think there is a bug in the DeepLearning4J code can you please create an issue in the deeplearning4j repository? Please provide a small sample to reproduce the bug.
This repository is for the deeplearning4J examples. And the issue in this thread was closed more than a year ago.

coreyauger commented on August 28, 2024

@RobAltena looks like this is not a bug.
deeplearning4j/deeplearning4j#4365

The confusion here stems from the fact that I want to pass a single time series sequence into my trained model and have it predict the label.

Currently, calling rnnTimeStep produces a matrix of probabilities (one set for each time step).
Using the above workaround at least produced a vector with size = nOut.

These probabilities sum to 1... I interpreted this to be the probability for each label.
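
One common way to turn such per-time-step probabilities into a single label is to take the most probable class at the final time step. The sketch below shows that on a plain Java array; it assumes the probabilities for one sequence are laid out as [label][timeStep] and that the last step is the one to read, and the names are hypothetical. It does not call any dl4j API.

```java
public class LastStepLabel {
    /**
     * probs[label][timeStep] holds the class probabilities for one sequence.
     * Returns the index of the most probable label at the final time step.
     */
    static int predictedLabel(double[][] probs) {
        int last = probs[0].length - 1;
        int best = 0;
        for (int k = 1; k < probs.length; k++) {
            if (probs[k][last] > probs[best][last]) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up probabilities for 3 labels over 2 time steps; each column sums to 1.
        double[][] probs = {
            {0.7, 0.2},
            {0.2, 0.5},
            {0.1, 0.3}
        };
        System.out.println(predictedLabel(probs));
    }
}
```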

Any guidance you can give to me would be a great help. Thanks :)

RobAltena commented on August 28, 2024

Check out the UCISequenceClassificationExample.

coreyauger commented on August 28, 2024

@RobAltena thanks. I based my code off this example to begin with. It works great to both train and evaluate on my test data. However when it comes time to use the model in a live setting.. I am not sure what I can use to pass a single time series in and get the predicted label out. It seems like this is the purpose of the model to begin with.. So I am not sure if I am missing something?

I am still pretty new to RNN so again any help with this would be great.

In summary.. I have a trained model and I simply want to do
rnn.predict( some_time_series_matrix ) and get back one of my labels
It seems like for RNN the correct way to go about this is to in fact call rnn.rnnTimeStep( .. )
correct?

Thanks again :)

RobAltena commented on August 28, 2024

Straight from the RNN documentation:
INDArray timeSeriesFeatures = ...;
INDArray timeSeriesOutput = myNetwork.output(timeSeriesFeatures);

coreyauger commented on August 28, 2024

Not sure how I missed that.. Thanks!

kgonia commented on August 28, 2024

If I have a set of historical data, and I then want predictions one step at a time while still improving my network, should the code look like this?

myModel.fit(historicalData);

// assuming newFeatures is an infinite stream; the for loop just represents further steps
for(INDArray feature: newFeatures){
    INDArray timeSeriesOutput = myNetwork.output(feature);

    if(iCanGetLabelsForNewData){
        // if I want to predict labels n steps ahead, I must wait n steps until I have the label for new data
        myModel.output(featureWithLabel - n, true);
    }
}

Is my approach correct?

treo commented on August 28, 2024

This issue isn't helping anyone anymore.

If you've got additional questions, please ask them on https://community.konduit.ai/.
