
tensorflow-char-rnn's People

Contributors

brtknr, codeman38, crazydonkey200

tensorflow-char-rnn's Issues

Getting "UnboundLocalError: local variable 'cell_fn' referenced before assignment"

Running it with Anaconda on Windows. The full error is:

Traceback (most recent call last):
  File "train.py", line 379, in <module>
    main()
  File "train.py", line 249, in main
    train_model = CharRNN(is_training=True, use_batch=True, **params)
  File "H:\Development\ML\tensorflow-char-rnn\char_rnn_model.py", line 68, in __init__
    cell = cell_fn(
UnboundLocalError: local variable 'cell_fn' referenced before assignment

I'm not a Python programmer, so I have no clue how to solve this error.
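
For context, a minimal sketch (assuming a TF 1.x build with tf.contrib available; this is not the repository's exact code) of the pattern behind the error: cell_fn is only bound inside the branches of an if/elif chain keyed on the model name, so if no branch matches, the later cell = cell_fn(...) call raises exactly this UnboundLocalError. Adding an explicit else makes the real problem visible instead:

import tensorflow as tf

def pick_cell_fn(model):
    if model == 'rnn':
        cell_fn = tf.contrib.rnn.BasicRNNCell
    elif model == 'lstm':
        cell_fn = tf.contrib.rnn.BasicLSTMCell
    elif model == 'gru':
        cell_fn = tf.contrib.rnn.GRUCell
    else:
        # Without this else, cell_fn stays unassigned and the confusing
        # UnboundLocalError appears further down instead of a clear message.
        raise ValueError('Unknown model type: %s' % model)
    return cell_fn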

Python 3

It looks like TensorFlow only supports Python 3 on Windows :(
Please update the code to Python 3. I tried everything to make it work, but it's impossible.

ImportError: This module is deprecated. Use tf.nn.rnn_* instead.

The error occurred when testing the source code:

python train.py --data_file=data/tiny_shakespeare.txt --num_epochs=10 --test

Here is the full stack trace:

Traceback (most recent call last):
  File "train.py", line 10, in <module>
    from char_rnn_model import *
  File "/websail/nor/tf/tensorflow-char-rnn/char_rnn_model.py", line 5, in <module>
    from tensorflow.models.rnn import rnn
  File "/home/northanapon/.virtualenvs/tf/lib/python2.7/site-packages/tensorflow/models/rnn/rnn.py", line 21, in <module>
    raise ImportError("This module is deprecated.  Use tf.nn.rnn_* instead.")
ImportError: This module is deprecated.  Use tf.nn.rnn_* instead.

I am using the GPU version of TensorFlow 0.9.0rc0.
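
For what it's worth, a hedged sketch of the change the error message points at (assuming a TF 0.9-era build, with hypothetical sizes): drop the deprecated tensorflow.models.rnn import and call the equivalent helper under tf.nn directly. On later 1.x releases the same helper was renamed tf.contrib.rnn.static_rnn.

import tensorflow as tf

# Hypothetical sizes, for illustration only.
batch_size, num_unrollings, hidden_size = 20, 10, 128
cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
inputs = [tf.zeros([batch_size, hidden_size]) for _ in range(num_unrollings)]

# Deprecated import that raises the error above:
#   from tensorflow.models.rnn import rnn
#   outputs, final_state = rnn.rnn(cell, inputs, dtype=tf.float32)

# Equivalent call on TF 0.9:
outputs, final_state = tf.nn.rnn(cell, inputs, dtype=tf.float32)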

GPU mode?

Is there a way to activate GPU mode?

Never mind, GPU mode is working now.

Parameters

Thanks for sharing!
Do you have the parameters to reproduce the results in Karpathy's blog, for instance the LaTeX example and the Linux source code example?

Feeding the same initial_state to all layers

In the training phase, self.initial_state defaults to multi_cell.zero_state and the final_state is kept:

self.initial_state = create_tuple_placeholders_with_default(
    multi_cell.zero_state(batch_size, tf.float32),
    extra_dims=(None,),
    shape=multi_cell.state_size)
outputs, final_state = tf.contrib.rnn.static_rnn(
    multi_cell, sliced_inputs, initial_state=self.initial_state)
self.final_state = final_state

However, in the testing phase (def sample_seq()), it seems that all layers are fed with just the state of the last layer from the previous step, self.final_state:

state = session.run(self.final_state, {self.input_data: x, self.initial_state: state})

If I'm not wrong, the state of each layer should be kept and fed back into its corresponding layer at the following steps, rather than feeding the last layer's state to all layers.
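
For reference, a minimal sketch (hypothetical sizes, not the repository's code) of how a per-layer state can be carried across sampling steps. With a MultiRNNCell, the fetched final_state is a tuple holding one entry per layer, and feeding that whole nested structure back as initial_state preserves every layer's state, since a reasonably recent TF 1.x feed_dict accepts nested tuples of tensors as keys:

import numpy as np
import tensorflow as tf

num_layers, hidden_size, batch_size = 2, 128, 1
cells = [tf.contrib.rnn.BasicLSTMCell(hidden_size) for _ in range(num_layers)]
multi_cell = tf.contrib.rnn.MultiRNNCell(cells)

inputs = tf.placeholder(tf.float32, [batch_size, hidden_size])
initial_state = multi_cell.zero_state(batch_size, tf.float32)
output, final_state = multi_cell(inputs, initial_state)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    state = sess.run(initial_state)  # tuple of per-layer LSTMStateTuples
    for _ in range(5):
        x = np.zeros([batch_size, hidden_size], np.float32)
        # Feed the entire per-layer structure back in; TensorFlow matches the
        # nested structure of initial_state against the fetched final_state.
        state = sess.run(final_state, {inputs: x, initial_state: state})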

TypeError: __init__() got an unexpected keyword argument 'state_is_tuple'

Was trying to experiment with this in the TensorFlow Docker image and got this back after running python train.py --data_file=data/tiny_shakespeare.txt --test:

============================================================
All final and intermediate outputs will be stored in output/
All information will be logged to stdout
============================================================

03:43:22 INFO:Parameters are:
{
    "batch_size": 20, 
    "dropout": 0.0, 
    "embedding_size": 0, 
    "hidden_size": 128, 
    "input_dropout": 0.0, 
    "learning_rate": 0.002, 
    "max_grad_norm": 5.0, 
    "model": "lstm", 
    "num_layers": 2, 
    "num_unrollings": 10
}

03:43:22 INFO:Reading data from: data/tiny_shakespeare.txt
03:43:22 INFO:Number of characters: 1000
03:43:22 INFO:Creating train, valid, test split
03:43:22 INFO:Creating vocabulary
03:43:22 INFO:Vocabulary is saved in output/vocab.json
03:43:22 INFO:Vocab size: 46
03:43:22 INFO:Creating graph
Traceback (most recent call last):
  File "train.py", line 373, in <module>
    main()
  File "train.py", line 241, in main
    train_model = CharRNN(is_training=True, use_batch=True, **params)
  File "/src/char_rnn_model.py", line 66, in __init__
    **params)
TypeError: __init__() got an unexpected keyword argument 'state_is_tuple'
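
This usually means the installed cell class does not accept a state_is_tuple keyword, either because the TensorFlow version predates it or because the chosen cell type never takes it. Installing the TF version the README targets is the straightforward fix; purely as a hedged workaround sketch (not the project's code), the keyword can be passed only when the constructor actually accepts it:

import inspect
import tensorflow as tf

cell_fn = tf.nn.rnn_cell.BasicLSTMCell  # module location varies across TF versions
cell_kwargs = {'state_is_tuple': True}

# Drop keyword arguments the constructor does not know about.
accepted = inspect.getargspec(cell_fn.__init__).args
cell_kwargs = {k: v for k, v in cell_kwargs.items() if k in accepted}

cell = cell_fn(128, **cell_kwargs)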

'pretrained_shakespeare' cannot be sampled on TF version 1.2.1

Hi, I just tried to use the pretrained model and it does not seem to work on the latest version of TF. I wanted to quickly see what kind of results it gives, but something doesn't work. I suspect the pretrained model was exported with an older, incompatible version. Would anyone mind sending a pull request with a model saved from the latest version? My computer is a bit slow for such things at the moment. Here is the log from my OS X terminal:

 python sample.py --init_dir=pretrained_shakespeare --start_text="The meaning of life is" --length=10
Traceback (most recent call last):
  File "sample.py", line 114, in <module>
    main()
  File "sample.py", line 105, in main
    saver.restore(session, best_model)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1548, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel not found in checkpoint
	 [[Node: evaluation/checkpoint_saver/RestoreV2_4 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_evaluation/checkpoint_saver/Const_0_0, evaluation/checkpoint_saver/RestoreV2_4/tensor_names, evaluation/checkpoint_saver/RestoreV2_4/shape_and_slices)]]

Caused by op u'evaluation/checkpoint_saver/RestoreV2_4', defined at:
  File "sample.py", line 114, in <module>
    main()
  File "sample.py", line 88, in main
    saver = tf.train.Saver(name='checkpoint_saver')
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1139, in __init__
    self.build()
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1170, in build
    restore_sequentially=self._restore_sequentially)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 640, in restore_v2
    dtypes=dtypes, name=name)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/bharatkunwar/Anaconda3/envs/openai2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel not found in checkpoint
	 [[Node: evaluation/checkpoint_saver/RestoreV2_4 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_evaluation/checkpoint_saver/Const_0_0, evaluation/checkpoint_saver/RestoreV2_4/tensor_names, evaluation/checkpoint_saver/RestoreV2_4/shape_and_slices)]]
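
This looks like the LSTM variable renames made during the TF 1.x series: older checkpoints store names ending in things like weights/biases, while a TF 1.2 graph looks for .../basic_lstm_cell/kernel. A small diagnostic sketch, assuming the pretrained directory contains a standard checkpoint index (the path handling is illustrative), lists what the checkpoint actually stores so it can be compared with the names in the error:

import tensorflow as tf

ckpt_path = tf.train.latest_checkpoint('pretrained_shakespeare')  # directory from the command above
reader = tf.train.NewCheckpointReader(ckpt_path)
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape)

If I remember correctly, TF 1.x also shipped an RNN checkpoint conversion script (checkpoint_convert.py under tensorflow/contrib/rnn) for exactly this rename, but a re-exported pretrained model, as requested above, would be the cleaner fix.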

TypeError: int() argument must be a string or a number, not 'tuple'

$ python train.py

01:15:03 INFO:Reading data from: data/tiny_shakespeare.txt
01:15:03 INFO:Number of characters: 1115394
01:15:03 INFO:Creating train, valid, test split
01:15:03 INFO:Creating vocabulary
01:15:03 INFO:Vocabulary is saved in output/vocab.json
01:15:03 INFO:Vocab size: 65
01:15:03 INFO:Creating graph
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f9f8d4e4e10>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
01:15:03 WARNING:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f9f8d4e4e10>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f9f8d4e4250>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
01:15:03 WARNING:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f9f8d4e4250>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
Traceback (most recent call last):
  File "train.py", line 373, in <module>
    main()
  File "train.py", line 241, in main
    train_model = CharRNN(is_training=True, use_batch=True, **params)
  File "/root/tensorflow-char-rnn/char_rnn_model.py", line 90, in __init__
    'initial_state')
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1324, in placeholder
    shape = tensor_shape.as_shape(shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 815, in as_shape
    return TensorShape(shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 451, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 374, in as_dimension
    return Dimension(value)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 33, in __init__
    self._value = int(value)
TypeError: int() argument must be a string or a number, not 'tuple'
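
The tuple presumably comes from the cell's state_size (a nested tuple on newer TensorFlow) being handed straight to tf.placeholder's shape, where TensorShape cannot convert a tuple element to an int. A minimal sketch of the idea, using a hypothetical helper loosely in the spirit of create_tuple_placeholders_with_default, builds one placeholder per leaf of the nested state size instead:

import tensorflow as tf

def tuple_placeholders(state_size, batch_size=None):
    # Recurse through nested state sizes, preserving namedtuples such as
    # LSTMStateTuple, and create one placeholder per leaf size.
    if isinstance(state_size, tuple):
        children = [tuple_placeholders(s, batch_size) for s in state_size]
        if hasattr(state_size, '_fields'):  # namedtuple, e.g. LSTMStateTuple
            return type(state_size)(*children)
        return tuple(children)
    return tf.placeholder(tf.float32, [batch_size, state_size])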

How to know how confident the model is about a sample it generated?

Hi! I'm a newbie in neural networks. So pardon me if I'm asking stupid questions.

Is there a way to know how confident the model is about a sample it generated?
I mean, for example, when max_prob is FALSE, running sample.py several times generates several different samples, and the model might have a different confidence in each one.
So how do we measure the model's confidence in a particular generated sample?
Can the model produce something like a confidence score?
Or, which variable should I be looking at that already reveals this confidence?

Maybe it's easier to calculate the confidence if we set max_prob to TRUE, so that we get the same sample every time? And still, how would we calculate the confidence in that case?

I'm not sure if I expressed it right. Please let me know if I'm not clear enough.
Many thanks! :)
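
Not a stupid question at all. One common measure, sketched below with hypothetical variable names (this is not an existing option in sample.py), is the average log-probability the model assigned to the characters it actually emitted, often reported as a perplexity; the per-step softmax vectors are what the sampler already computes when it draws each character.

import numpy as np

def sequence_confidence(step_probs, sampled_ids):
    # step_probs: one softmax vector per generated character.
    # sampled_ids: the index of the character actually chosen at each step.
    log_probs = [np.log(p[i] + 1e-12) for p, i in zip(step_probs, sampled_ids)]
    avg_log_prob = float(np.mean(log_probs))
    perplexity = float(np.exp(-avg_log_prob))
    return avg_log_prob, perplexity

With max_prob set to TRUE the same numbers can be computed for the single greedy sample; a higher average log-probability (lower perplexity) means the model was more confident in that particular text.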

Problem with the train.py

Hi, I've just installed the project, and when I attempt the initial training exercise I get the error below.

I'm using Windows / Python 3.5.2 / the newest TensorFlow.

Traceback (most recent call last):
  File "train.py", line 379, in <module>
    main()
  File "train.py", line 249, in main
    train_model = CharRNN(is_training=True, use_batch=True, **params)
  File "C:\tensorflow-char-rnn\char_rnn_model.py", line 57, in __init__
    cell_fn = tf.contrib.rnn.BasicLSTMCell
AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'BasicLSTMCell'
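
The simplest fix is usually to install the TensorFlow 1.x release the README targets. As a hedged sketch of a version-tolerant lookup (not the project's code), the cell class can be searched for wherever the installed TensorFlow keeps it, since it has lived in different modules across releases:

import tensorflow as tf

def find_basic_lstm_cell():
    candidates = []
    if hasattr(tf, 'contrib'):
        candidates.append(tf.contrib.rnn)
    if hasattr(tf.nn, 'rnn_cell'):
        candidates.append(tf.nn.rnn_cell)
    for module in candidates:
        if hasattr(module, 'BasicLSTMCell'):
            return module.BasicLSTMCell
    raise AttributeError('No BasicLSTMCell in this TensorFlow build')

cell_fn = find_basic_lstm_cell()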

How to continue a finished training run

Hi. I am trying to train the char-rnn on a Wikipedia dump. It has been split into several files, and I feed it one file at a time. The first file trained without any errors, but when I start on the second one with this command (python train.py --data_file=your-data-file --init_dir=your-output-folder), it prints lots of "INFO:Unexpected char" errors. I am not sure what the problem is. I trained the first file with this configuration (a small diagnostic sketch follows it):
--data_file=./wiki2/enwiki-20180120-pages-articles-multistream.xml-0001.txt --hidden_size=512
--embedding_size=300 --encoding=utf8 --num_layers=3
--model=rnn --num_epochs=10 --dropout=0.5 --test

How to improve performance?

I have a 33MB corpus and I ran this command:

python3 train.py --data_file=data/creepypasta.txt --hidden_size=512 --num_layers=2 --num_unrollings=64 --dropout=0.5 --verbose=1

It outputs an additional +0.4% progress every 110 seconds or so, which indicates that it'll take more than a week to complete 25 epochs. How can I speed this up?
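
For a rough sanity check on that estimate (assuming the +0.4% is progress within a single epoch): one epoch is about 100 / 0.4 = 250 intervals of 110 s, i.e. roughly 7.6 hours, so 25 epochs come to about 8 days, which matches the week-plus figure above.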

Is it possible to set use_batch=true in sample.py?

With use_batch=false, it generates characters one by one while sampling. It seems that only the CPU is used and the GPU is idle, so sampling is quite slow.

Is it possible to set use_batch=true to speed up sampling?
