
tensorflow-lstm-sin's Introduction

TensorFlow LSTM sin(t) Example

Single- and multi-layer LSTM networks with no additional output nonlinearity, based on aymericdamien's TensorFlow examples and on Sequence prediction using recurrent neural networks.

Experiments with varying numbers of hidden units and LSTM cells, and with techniques such as gradient clipping, were conducted using both static_rnn and dynamic_rnn. All networks were optimized using Adam on the MSE loss function.

Experiment 1

Given a single LSTM cell with 100 hidden states, predict the next 50 timesteps given the last 100 timesteps.

The network is trained on a 1 Hz sine only, using random shifts, and thus fails to generalize to higher frequencies (2 Hz and 3 Hz in the image); in addition, the network should be able to simply memorize the shape of the input. It was optimized with a learning rate of 0.001 for 200000 iterations and batches of 50 examples.
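
The training batches can be produced on the fly. The repository's generator is not shown here; the following is a minimal sketch, where the helper name, shapes and sampling rate are assumptions rather than the actual code:

import numpy as np

# Hypothetical batch generator for this experiment: a fixed 1 Hz sine
# with a random time shift per example. Names, shapes and the sampling
# rate are assumptions, not the repository's actual code.
def generate_batch(batch_size=50, n_input=100, n_output=50,
                   freq_hz=1.0, sample_rate=50.0):
    t = np.arange(n_input + n_output) / sample_rate
    shifts = np.random.uniform(0.0, 2.0 * np.pi, size=(batch_size, 1))
    waves = np.sin(2.0 * np.pi * freq_hz * t[np.newaxis, :] + shifts)
    x = waves[:, :n_input]   # the last 100 timesteps as input
    y = waves[:, n_input:]   # the next 50 timesteps as target
    return x.astype(np.float32), y.astype(np.float32)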

Experiment 2

Given a single LSTM cell with 150 hidden states, predict the next 50 timesteps given the last 100 timesteps.

The network is trained on sines of random frequencies between 0.5 Hz and 4 Hz, using random shifts. Prediction quality is worse than in the 1 Hz-only experiment above, but the network generalizes to the 2 Hz and 3 Hz tests. It was optimized with a learning rate of 0.001 for 300000 iterations and batches of 50 examples.
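
For this experiment the same kind of generator can draw a random frequency per example; a sketch of the variation, again with assumed names:

# Variation for random frequencies in [0.5, 4.0] Hz (replaces the fixed
# freq_hz inside the generate_batch sketch above):
freqs = np.random.uniform(0.5, 4.0, size=(batch_size, 1))
waves = np.sin(2.0 * np.pi * freqs * t[np.newaxis, :] + shifts)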

At loss 0.614914, the prediction looks like this:

Experiment 3

Given a single LSTM cell with 150 hidden states, predict the next 50 timesteps given the last 25 timesteps.

The network is trained on sines of random frequencies between 0.5 Hz and 4 Hz, using random shifts. Prediction quality is worse than in the 1 Hz-only experiment above, but the network generalizes to the 2 Hz and 3 Hz tests. It was optimized with a learning rate of 0.0005 for 500000 iterations and batches of 50 examples.

The following image shows the output at loss 0.177742:

The following is the network trained to predict the next 100 timesteps given the previous 25 timesteps; the parameters are otherwise unchanged.

This is the result at loss 0.257725:

Experiment 4

Same as the previous experiment, but using 500 hidden states and gradient clipping in the optimizer, as described here:

# Compute gradients explicitly, clip each one to [-0.5, 0.5],
# then apply the clipped updates.
adam = tf.train.AdamOptimizer(learning_rate=learning_rate)
gradients = adam.compute_gradients(loss)
clipped_gradients = [(tf.clip_by_value(grad, -0.5, 0.5), var)
                     for grad, var in gradients]
optimizer = adam.apply_gradients(clipped_gradients)
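
One caveat, not handled in the snippet above: compute_gradients() returns (None, var) pairs for variables that receive no gradient, and tf.clip_by_value would fail on those. A defensive variant (an addition of mine, not the repository's code):

clipped_gradients = [(tf.clip_by_value(grad, -0.5, 0.5), var)
                     for grad, var in gradients if grad is not None]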

Losses get as low as 0.069027 within the given number of iterations, but vary wildly. This is the prediction at loss 0.422188:

Experiment 5

This time, the dynamic_rnn() function is used instead of rnn(), drastically improving the startup time. In addition, the single LSTM cell has been replaced with 4 stacked LSTM cells of 32 hidden states each.

from tensorflow.contrib import rnn

# One LSTM cell per layer; each cell owns its own variables.
lstm_cells = [rnn.LSTMCell(n_hidden, forget_bias=1.0)
              for _ in range(n_layers)]
stacked_lstm = rnn.MultiRNNCell(lstm_cells)
# x is batch-major: (batch_size, n_steps, n_input).
outputs, states = tf.nn.dynamic_rnn(stacked_lstm, x, dtype=tf.float32, time_major=False)

The output still uses linear regression:

# Transpose to time-major (n_steps, batch_size, n_hidden) and feed the
# last timestep's output through the linear output layer.
output = tf.transpose(outputs, [1, 0, 2])
pred = tf.matmul(output[-1], weights['out']) + biases['out']
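
The weight and bias variables are defined elsewhere in the script; a plausible sketch, assuming n_hidden LSTM units mapping onto n_output predicted timesteps:

# Hypothetical output layer parameters (names and initializers assumed).
weights = {'out': tf.Variable(tf.random_normal([n_hidden, n_output]))}
biases = {'out': tf.Variable(tf.random_normal([n_output]))}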

The network is trained with a learning rate of 0.001 for at least 300000 iterations, with the additional stopping criterion that the loss must be below 0.15.
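
Expressed as a loop, that stopping rule might look like this (a sketch; the session, placeholder and generator names are assumptions):

step, loss_value = 0, float('inf')
while step < 300000 or loss_value > 0.15:
    batch_x, batch_y = generate_batch()
    # dynamic_rnn expects (batch_size, n_steps, n_input); n_input is 1 here.
    _, loss_value = session.run([optimizer, loss],
                                feed_dict={x: batch_x[..., np.newaxis],
                                           y_true: batch_y})
    step += 1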

The following picture shows the performance at loss 0.138701 at iteration 375000.

When using only 10 hidden states, training takes much longer at a learning rate of 0.001, reaching a loss of about 0.5 after ~1200000 iterations, at which point convergence effectively stops.

The following run used 10 hidden states and a base learning rate of 0.005, combined with a step policy that reduced the learning rate by a factor of 0.1 every 250000 iterations. As in the previous experiment, optimization was stopped once at least 300000 iterations had passed and the loss was below 0.2.
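
Such a step policy can be written with tf.train.exponential_decay in staircase mode; a sketch (the script's actual implementation may differ):

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
    0.005,                 # base learning rate
    global_step,
    decay_steps=250000,    # drop every 250000 iterations
    decay_rate=0.1,        # multiply by 0.1 at each drop
    staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(
    loss, global_step=global_step)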

The picture shows the outcome after 510000 iterations at a loss of 0.180995:

Experiment 6

Much like the previous experiment, this one uses 10 hidden states per layer in a 4-layer deep recurrent structure. Instead of LSTM layers, however, it uses GRUs.
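
Swapping the cell type is a small change to the construction code from Experiment 5; a sketch, assuming the same contrib.rnn import:

# GRU cells instead of LSTM cells; variable names are assumptions.
gru_cells = [rnn.GRUCell(n_hidden) for _ in range(n_layers)]
stacked_gru = rnn.MultiRNNCell(gru_cells)
outputs, states = tf.nn.dynamic_rnn(stacked_gru, x, dtype=tf.float32, time_major=False)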

Because the loss did not go below 0.3, training was stopped after 1000000 iterations.

tensorflow-lstm-sin's People

Contributors

sunsided

tensorflow-lstm-sin's Issues

Issue running tf-recurrent-sin-5.1.py

"ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel, but specified shape (20, 40) and found shape (11, 40)."

on python 3.5 with tensorflow_gpu-1.3.0
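
This error typically means that a single cell instance was reused for every layer of the MultiRNNCell. In TensorFlow 1.1 and later, each layer must get its own cell object: with n_hidden = 10, the first layer sees an input of width 1 + 10 = 11 while deeper layers see 10 + 10 = 20, so one shared kernel cannot satisfy both shapes. A sketch of the usual fix (variable names assumed):

# Build a fresh cell per layer instead of reusing one instance,
# so each layer creates its own kernel variables.
lstm_cells = [rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
              for _ in range(n_layers)]
stacked_lstm = rnn.MultiRNNCell(lstm_cells)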

Issue running tf-recurrent-sin-5.py

I tried to run tf-recurrent-sin-5.py and ran into the following error:

python tf-recurrent-sin-5.py
Traceback (most recent call last):
File "tf-recurrent-sin-5.py", line 47, in
outputs, states = tf.nn.dynamic_rnn(stacked_lstm, x, dtype=tf.float32, time_major=False)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py", line 574, in dynamic_rnn
dtype=dtype)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py", line 737, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py", line 722, in _time_step
(output, new_state) = call_cell()
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py", line 708, in
call_cell = lambda: cell(input_t, state)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 180, in call
return super(RNNCell, self).call(inputs, state)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 441, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 916, in call
cur_inp, new_state = cell(cur_inp, cur_state)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 180, in call
return super(RNNCell, self).call(inputs, state)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 441, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 383, in call
concat = _linear([inputs, h], 4 * self._num_units, True)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1017, in _linear
initializer=kernel_initializer)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 360, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1405, in wrapped_custom_getter
*args, **kwargs)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in _rnn_get_variable
variable = getter(*args, **kwargs)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in _rnn_get_variable
variable = getter(*args, **kwargs)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
use_resource=use_resource)
File "/home/.../anaconda3/envs/time-series-1/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 669, in _get_single_variable
found_var.get_shape()))
ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel, but specified shape (64, 128) and found shape (33, 128).

I'm using Python 3.5 and tensorflow-gpu 1.2.0.

The following scripts run fine on my system:

tf-recurrent-sin.py to tf-recurrent-sin-4.py

Could you please let me know what went wrong?

Thanks a lot!
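
Note: the shapes in this report, (33, 128) versus (64, 128), point to the same cause as the issue above, here with n_hidden = 32: the first layer sees 1 + 32 = 33 inputs while deeper layers see 32 + 32 = 64. Constructing a fresh cell per layer, as sketched there, should resolve this as well.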
