
textgenrnn's Introduction

textgenrnn


Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly train on a text using a pretrained model.

textgenrnn is a Python 3 module on top of Keras/TensorFlow for creating char-rnns, with many cool features:

  • A modern neural network architecture which utilizes new techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality.
  • Train on and generate text at either the character-level or word-level.
  • Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs.
  • Train on any generic input text file, including large files.
  • Train models on a GPU and then use them to generate text with a CPU.
  • Utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations.
  • Train the model using contextual labels, allowing it to learn faster and produce better results in some cases.
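To make the character-level versus word-level distinction in the feature list concrete, here is a minimal sketch of the two tokenization styles. The helper names and the regular expression are illustrative assumptions, not textgenrnn's own tokenizer.

```python
import re


def char_tokens(text):
    """Character-level: every character, including spaces, is one token."""
    return list(text)


def word_tokens(text):
    """Word-level: lowercase words and punctuation marks become separate tokens.

    The regex is an assumption made for illustration only.
    """
    return re.findall(r"\w+|[^\w\s]", text.lower())


sample = "Hello, world!"
print(char_tokens(sample))
print(word_tokens(sample))
```

A character-level model has a small vocabulary but long sequences; a word-level model has the opposite trade-off.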

You can play with textgenrnn and train on any text file with a GPU for free in this Colaboratory Notebook! Read this blog post or watch this video for more information!

Examples

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.generate()
[Spoiler] Anyone else find this post and their person that was a little more than I really like the Star Wars in the fire or health and posting a personal house of the 2016 Letter for the game in a report of my backyard.

The included model can easily be trained on new texts, and can generate appropriate text even after a single pass of the input data.

textgen.train_from_file('hacker_news_2000.txt', num_epochs=1)
textgen.generate()
Project State Project Firefox

The model weights are relatively small (2 MB on disk), and they can easily be saved and loaded into a new textgenrnn instance. As a result, you can play with models which have been trained on hundreds of passes through the data. (In fact, textgenrnn learns so well that you have to increase the temperature significantly for creative output!)

textgen_2 = textgenrnn('/weights/hacker_news.hdf5')
textgen_2.generate(3, temperature=1.0)
Why we got money “regular alter”

Urburg to Firefox acquires Nelf Multi Shamn

Kubernetes by Google’s Bern
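The temperature argument used above is commonly implemented by rescaling the predicted distribution before sampling. The following is a minimal pure-Python sketch of that general technique; the function names are hypothetical and this is not necessarily how textgenrnn implements it internally.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution.

    Temperatures below 1.0 sharpen the distribution (safer output);
    temperatures above 1.0 flatten it (more surprising output).
    """
    logits = [math.log(p + 1e-10) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, temperature=1.0, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    scaled = apply_temperature(probs, temperature)
    return rng.choices(range(len(scaled)), weights=scaled, k=1)[0]


probs = [0.7, 0.2, 0.1]
print(apply_temperature(probs, 0.2))  # most mass moves to the first entry
print(apply_temperature(probs, 2.0))  # mass spreads out
```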

You can also train a new model, with support for word-level embeddings and bidirectional RNN layers, by adding new_model=True to any train function.

Interactive Mode

It's also possible to get involved in how the output unfolds, step by step. Interactive mode suggests the top N options for the next char/word and lets you pick one.

When running textgenrnn in the terminal, pass interactive=True and top_n=N to generate. N defaults to 3.

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.generate(interactive=True, top_n=5)


This can add a human touch to the output; it feels like you're the writer! (reference)
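The "top N options" step of interactive mode can be pictured as ranking the model's predicted probabilities and keeping the best few. The helper below is a hypothetical sketch of that ranking, not code taken from the library.

```python
def top_n_options(probs, vocab, n=3):
    """Return the n most likely (token, probability) pairs, best first.

    probs and vocab are parallel lists: probs[i] belongs to vocab[i].
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(vocab[i], probs[i]) for i in ranked[:n]]


options = top_n_options([0.1, 0.6, 0.3], ["a", "b", "c"], n=2)
for position, (token, prob) in enumerate(options, start=1):
    print(position, repr(token), prob)
```

An interactive loop would print these options, read the user's choice, append the chosen token to the text, and ask the model for the next prediction.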

Usage

textgenrnn can be installed from pypi via pip:

pip3 install textgenrnn

For the latest textgenrnn, you must have a minimum TensorFlow version of 2.1.0.
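If you are unsure whether an environment meets that minimum, a small standard-library check along these lines may help. The helper names are hypothetical, and the version parsing is deliberately simple (it only looks at the leading digits of the first three components).

```python
from importlib import metadata


def parse_version(version):
    """Turn '2.1.0' or '2.4.0rc1' into a tuple of three integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="2.1.0"):
    """True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tensorflow")
if version is None:
    print("tensorflow is not installed")
else:
    print(version, "ok" if meets_minimum(version) else "too old")
```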

You can view a demo of common features and model configuration options in this Jupyter Notebook.

/datasets contains example datasets using Hacker News/Reddit data for training textgenrnn.

/weights contains further-pretrained models on the aforementioned datasets which can be loaded into textgenrnn.

/outputs contains examples of text generated from the above pretrained models.

Neural Network Architecture and Implementation

textgenrnn is based on the char-rnn project by Andrej Karpathy, with a few modern optimizations such as the ability to work with very small text sequences.

default model

The included pretrained model follows a neural network architecture inspired by DeepMoji. For the default model, textgenrnn takes in an input of up to 40 characters, converts each character to a 100-D character embedding vector, and feeds those into a 128-cell long short-term memory (LSTM) recurrent layer. Those outputs are then fed into another 128-cell LSTM. All three layers are then fed into an Attention layer to weight the most important temporal features and average them together (and since the embeddings + 1st LSTM are skip-connected into the attention layer, the model updates can backpropagate to them more easily and prevent vanishing gradients). That output is mapped to a probability, for each of up to 394 different characters (uppercase and lowercase letters, punctuation, and emoji), that it is the next character in the sequence. (If training a new model on a new dataset, all of the numeric parameters above can be configured.)
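The attention step described above can be sketched in plain Python as an attention-weighted average: each timestep gets a score, the scores are turned into weights with a softmax, and the feature vectors are averaged using those weights. This is a simplified illustration with hand-supplied weights; in the real model the weight vector is learned, and with the default sizes each timestep would carry 100 + 128 + 128 = 356 features if the three layers are concatenated.

```python
import math


def attention_weighted_average(timesteps, weights):
    """Collapse a sequence of feature vectors into a single vector.

    timesteps: list of T feature vectors, each of length D
    weights:   vector of length D (learned in a real model)
    Returns (pooled_vector, attention_weights).
    """
    # One scalar score per timestep: dot product with the weight vector.
    scores = [sum(x * w for x, w in zip(step, weights)) for step in timesteps]
    # Softmax over time, shifted by the max for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Weighted average of the feature vectors.
    dims = len(timesteps[0])
    pooled = [
        sum(a * step[d] for a, step in zip(attention, timesteps))
        for d in range(dims)
    ]
    return pooled, attention


pooled, attention = attention_weighted_average(
    [[1.0, 0.0], [3.0, 2.0]], weights=[0.0, 0.0]
)
print(pooled, attention)  # zero weights give a plain average
```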

context model

Alternatively, if context labels are provided with each text document, the model can be trained in a contextual mode, where the model learns the text given the context so the recurrent layers learn the decontextualized language. The text-only path can piggy-back off the decontextualized layers; in all, this results in much faster training and better quantitative and qualitative model performance than just training the model given the text alone.

The model weights included with the package are trained on hundreds of thousands of text documents from Reddit submissions (via BigQuery), from a very diverse variety of subreddits. The network was also trained using the decontextual approach noted above in order to both improve training performance and mitigate authorial bias.

When fine-tuning the model on a new dataset of texts using textgenrnn, all layers are retrained. However, since the original pretrained network has a much more robust "knowledge" initially, the new textgenrnn trains faster and more accurately in the end, and can potentially learn new relationships not present in the original dataset (e.g. the pretrained character embeddings include the context for the character for all possible types of modern internet grammar).

Additionally, the retraining is done with a momentum-based optimizer and a linearly decaying learning rate, both of which prevent exploding gradients and make it much less likely that the model diverges after training for a long time.
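A linearly decaying learning rate can be sketched as below. The base rate of 0.004 and the exact formula are assumptions chosen for illustration; the text above only states that the decay is linear.

```python
def linear_decay(base_lr, epoch, num_epochs):
    """Learning rate for a zero-based epoch index.

    Starts at base_lr and shrinks by an equal amount every epoch,
    approaching zero as training finishes.
    """
    return base_lr * (1 - epoch / num_epochs)


# Hypothetical base learning rate, for illustration only.
for epoch in range(5):
    print(epoch, linear_decay(0.004, epoch, num_epochs=5))
```

Smaller steps late in training make it less likely that a single bad batch pushes the weights far from a good solution.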

Notes

  • You will not get quality generated text 100% of the time, even with a heavily-trained neural network. That's the primary reason viral blog posts/Twitter tweets utilizing NN text generation often generate lots of texts and curate/edit the best ones afterward.

  • Results will vary greatly between datasets. Because the pretrained neural network is relatively small, it cannot store as much data as RNNs typically flaunted in blog posts. For best results, use a dataset with at least 2,000-5,000 documents. If a dataset is smaller, you'll need to train it for longer by setting num_epochs higher when calling a training method and/or training a new model from scratch. Even then, there is currently no good heuristic for determining a "good" model.

  • A GPU is not required to retrain textgenrnn, but it will take much longer to train on a CPU. If you do use a GPU, I recommend increasing the batch_size parameter for better hardware utilization.

Future Plans for textgenrnn

  • More formal documentation

  • A web-based implementation using tensorflow.js (works especially well due to the network's small size)

  • A way to visualize the attention-layer outputs to see how the network "learns."

  • A mode to allow the model architecture to be used for chatbot conversations (may be released as a separate project)

  • More depth toward context (positional context + allowing multiple context labels)

  • A larger pretrained network which can accommodate longer character sequences and a more in-depth understanding of language, creating better generated sentences.

  • Hierarchical softmax activation for word-level models (once Keras has good support for it).

  • FP16 for superfast training on Volta/TPUs (once Keras has good support for it).

Articles/Projects using textgenrnn

Articles

Projects

Tweets

Maintainer/Creator

Max Woolf (@minimaxir)

Max's open-source projects are supported by his Patreon. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.

Credits

Andrej Karpathy for the original proposal of the char-rnn via the blog post The Unreasonable Effectiveness of Recurrent Neural Networks.

Daniel Grijalva for contributing an interactive mode.

License

MIT

Attention-layer code used from DeepMoji (MIT Licensed)

textgenrnn's People

Contributors

akx, bafonso, cclauss, danielgrijalva, dkavraal, edwardbetts, fahadh4ilyas, irekrybark, minimaxir, netr, reedkavner, sisygoboom, torokati44, v01dma1n, yslai, zacharymcgee

textgenrnn's Issues

Unable to open largetext weights.

I successfully trained a model using train_from_largetext_file and saved the weights as 'weights/recipes.hdf5'.

However, I don't seem to be able to reopen the saved weights. When I try:

recipes.save('weights/recipes.hdf5')
recipes2 = textgenrnn('weights/recipes.hdf5')

I get the following error message when I try to reopen the weights:

Traceback (most recent call last):
File "", line 1, in
File "/home/janelle_shane/textgenrnn/textgenrnn/textgenrnn.py", line 65, in init
weights_path=weights_path)
File "/home/janelle_shane/textgenrnn/textgenrnn/model.py", line 38, in textgenrnn_model
model.load_weights(weights_path, by_name=True)
File "/home/janelle_shane/.pyenv/versions/3.5.5/lib/python3.5/site-packages/keras/engine/network.py", line 1177, in load_weights
reshape=reshape)
File "/home/janelle_shane/.pyenv/versions/3.5.5/lib/python3.5/site-packages/keras/engine/saving.py", line 1018, in load_weights_from_hdf5_group_by_name
str(weight_values[i].shape) + '.')
ValueError: Layer #1 (named "embedding"), weight <tf.Variable 'embedding_8/embeddings:0' shape=(465, 100) dtype=float32_ref> has shape (465, 100), but the saved weight has shape (106, 100).

This combo of commands works for models trained with regular train_from_file, so I wonder if there's some problem with train_from_largetext_file?
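One possible reading of the error above is that the model being built uses a different vocabulary from the one the weights were trained with, since the first dimension of the embedding matrix depends on the vocabulary size. The sketch below assumes the embedding has one row per vocabulary entry plus one reserved row; that relationship and the helper names are assumptions for illustration. Elsewhere on this page the constructor is shown accepting vocab_path and config_path alongside weights_path, so loading the vocabulary and config saved during training may be worth trying.

```python
def expected_embedding_rows(vocab):
    """Rows the embedding layer would need for this vocabulary.

    Assumes one extra row for a reserved index 0 (an assumption).
    """
    return len(vocab) + 1


def explain_mismatch(model_rows, saved_rows):
    """Describe whether the model and the saved weights agree."""
    if model_rows == saved_rows:
        return "shapes match"
    return (
        "model expects {} embedding rows but the file has {}; the vocabulary "
        "used to build the model probably differs from the training one"
    ).format(model_rows, saved_rows)


print(explain_mismatch(465, 106))
```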

TypeError: softmax() got an unexpected keyword argument 'axis'

Hi,

My environment is as follows:
Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.
1900 64 bit (AMD64)] on win32

I received the error below when I call textgenrnn():

textgen = textgenrnn()
2018-04-27 09:39:11.639683: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY
35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn
't compiled to use AVX instructions, but these are available on your machine and
could speed up CPU computations.
2018-04-27 09:39:11.640683: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY
35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn
't compiled to use AVX2 instructions, but these are available on your machine an
d could speed up CPU computations.
Traceback (most recent call last):
File "", line 1, in
File "C:\Program Files\Anaconda3\lib\site-packages\textgenrnn\textgenrnn.py",
line 59, in init
weights_path=weights_path)
File "C:\Program Files\Anaconda3\lib\site-packages\textgenrnn\model.py", line
29, in textgenrnn_model
output = Dense(num_classes, name='output', activation='softmax')(attention)
File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\topology.py",
line 619, in call
output = self.call(inputs, **kwargs)
File "C:\Program Files\Anaconda3\lib\site-packages\keras\layers\core.py", line
881, in call
output = self.activation(output)
File "C:\Program Files\Anaconda3\lib\site-packages\keras\activations.py", line
29, in softmax
return K.softmax(x)
File "C:\Program Files\Anaconda3\lib\site-packages\keras\backend\tensorflow_ba
ckend.py", line 2963, in softmax
return tf.nn.softmax(x, axis=axis)
TypeError: softmax() got an unexpected keyword argument 'axis'

Keras while_loop() error

I've trained this model successfully on Windows 10 and Ubuntu with no issues; however, when attempting to train on an Azure notebook server I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-4441bc1684a3> in <module>()
----> 1 textgen = textgenrnn("textgenrnn_weights.hdf5")
      2 textgen.generate()

~/anaconda3_501/lib/python3.6/site-packages/textgenrnn/textgenrnn.py in __init__(self, weights_path, vocab_path, config_path, name)
     63         self.model = textgenrnn_model(self.num_classes,
     64                                       cfg=self.config,
---> 65                                       weights_path=weights_path)
     66         self.indices_char = dict((self.vocab[c], c) for c in self.vocab)
     67 

~/anaconda3_501/lib/python3.6/site-packages/textgenrnn/model.py in textgenrnn_model(num_classes, cfg, context_size, weights_path, dropout, optimizer)
     27     for i in range(cfg['rnn_layers']):
     28         prev_layer = embedded if i is 0 else rnn_layer_list[-1]
---> 29         rnn_layer_list.append(new_rnn(cfg, i+1)(prev_layer))
     30 
     31     seq_concat = concatenate([embedded] + rnn_layer_list, name='rnn_concat')

~/anaconda3_501/lib/python3.6/site-packages/keras/layers/recurrent.py in __call__(self, inputs, initial_state, constants, **kwargs)
    498             additional_inputs += constants
    499             self.constants_spec = [InputSpec(shape=K.int_shape(constant))
--> 500                                    for constant in constants]
    501             self._num_constants = len(constants)
    502             additional_specs += self.constants_spec

~/anaconda3_501/lib/python3.6/site-packages/keras/engine/base_layer.py in __call__(self, inputs, **kwargs)

~/anaconda3_501/lib/python3.6/site-packages/keras/layers/recurrent.py in call(self, inputs, mask, training, initial_state)
   2110                   'recurrent_dropout': self.recurrent_dropout,
   2111                   'implementation': self.implementation}
-> 2112         base_config = super(LSTM, self).get_config()
   2113         del base_config['cell']
   2114         return dict(list(base_config.items()) + list(config.items()))

~/anaconda3_501/lib/python3.6/site-packages/keras/layers/recurrent.py in call(self, inputs, mask, training, initial_state, constants)
    607                 states = [states]
    608             else:
--> 609                 states = list(states)
    610             return [output] + states
    611         else:

~/anaconda3_501/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in rnn(step_function, inputs, initial_states, go_backwards, mask, constants, unroll, input_length)
   2955 
   2956 def sigmoid(x):
-> 2957     """Element-wise sigmoid.
   2958 
   2959     # Arguments

TypeError: while_loop() got an unexpected keyword argument 'maximum_iterations'

I found this issue, which appeared to be similar, and attempted the suggested fix. However, I then get the following warning: textgenrnn 1.3.1 has requirement keras>=2.1.5, but you'll have keras 2.1.2 which is incompatible.
The same error as above also persists.
I've tried a few other versions of TensorFlow and Keras, including the newest of each, but none work.

The code I'm attempting is the absolute minimal code for this library, and the issue persists:

from textgenrnn import textgenrnn
import pandas as pd
import numpy as np
textgen = textgenrnn()
textgen.generate()

"Dimension 0 in both shapes must be equal" when loading weights made from large data sets

I can successfully train and sample new models, but I am unable to load and continue training weights files saved after training with large data sets.

from textgenrnn import textgenrnn

tg=textgenrnn(name="dyk")
tg=textgenrnn(weights_path='dyk_weights.hdf5',
             vocab_path='dyk_vocab.json',
             config_path='dyk_config.json')

Running the above gives me the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 96 and 178. Shapes are [96,100] and [178,100]. for 'Assign_20' (op: 'Assign') with input shapes: [96,100], [178,100]

I'm unable to load the weights even by omitting vocab_path and config_path, as in

tg=textgenrnn('dyk_weights.hdf5')

"dyk.txt" is about 12.4 million characters, while other sets I've tried out are ~90k - 600k characters. Smaller sets load, sample, and continue just fine.

Typo in README

text after a graph

Alternatively, if context labels are provided with each text document, the model can be trained in a contextual mode, where the model learns the text given the context so the recurrent layers learn the decontextualized language. The text-only path can piggy-back off the decontextualized layers; in all, this results in much faster training and better quantitative and qualitative model performance than just training the model gien the text alone.

gien -> given

Gibberish with standard settings

First of all thanks for your tool!

After 10 epochs at standard settings (I'm using Colaboratory) I get mostly gibberish with the ice-cream list from [https://github.com/janelleshane/ice-cream].

Here is the output at temperature 0.5:

Cookil
Cheam Chocolate Chocolate Strip
Cheam Bripberry
Turnd Chocolate Chip
Turin Chocolate Checocan Chocolate Seram
Drocon Choocolat Chocolate Stream Coconut Coffee
Chocolate Chocolate
Brutter
Crocolate Coocolate andio
Conamon Chocolate Chocolate
Milk Chocolate Butter
Baney Chocolate

fee
Cracolate
Chocolate Coconut Crocolat Townica
May Swigo Cream
Toate Caramel Bunte Cheesecoke
Pipplich Caconut Bumple
Chocolate Chocolate Chip
Chocolate And Chocolate Chocolat Chocolate Chocolate Blocke Chicola
Chinna
Chocolate
Coffee
Chip
Blinger
Beach Chocolate Chocolate
Thippry
Swilk Carel Mon

e Anfla
Chocolate Chocolate Chocolate Pipli Chocolate Chocolate Cookie Chocolate Chippler Chocolate Strermon Cheemer
Chocolate Cheescocake
Blunn
Chocolate Chincamed Bered Crrenie
Toron
Chocolate Crunch Coconat Brown Swerry Cream
Caram
Borin Checon Boran And Blunge
Black
Cream
Coconut Butter Bluin Cr

Or perhaps I'm just expecting too much?

Missing accented letters

There are some letters in the Hungarian language which are not in the vocab file:

Í Ó ő Ő Ú ű Ű

Could these be added perhaps?
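A quick way to see which characters of a dataset are absent from a vocabulary is a set difference. This sketch assumes the vocabulary is a mapping from character to index (as a JSON vocab file would typically load); the tiny vocabulary below is made up for the example.

```python
def missing_characters(text, vocab):
    """Return the characters of text that have no entry in vocab, sorted."""
    return sorted(set(text) - set(vocab))


# Toy vocabulary for illustration; a real one would be loaded from a file.
toy_vocab = {"a": 1, "b": 2, " ": 3}
print(missing_characters("ab ő Ű", toy_vocab))
```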

random keyerrors

Hi,

When I train my models, I randomly get the following key error. Any idea what might be causing the problem?

Traceback (most recent call last):
File "train.py", line 39, in
word_level=cfg['model_config']['word_level'])
File "textgenrnn/textgenrnn.py", line 262, in train_new_model
**kwargs)
File "textgenrnn/textgenrnn.py", line 195, in train_on_texts
validation_steps=val_steps
File "miniconda2/envs/py36/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "miniconda2/envs/py36/lib/python3.6/site-packages/keras/engine/training.py", line 1426, in fit_generator
initial_epoch=initial_epoch)
File "miniconda2/envs/py36/lib/python3.6/site-packages/keras/engine/training_generator.py", line 229, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "miniconda2/envs/py36/lib/python3.6/site-packages/keras/callbacks.py", line 77, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "textgenrnn-s/textgenrnn/utils.py", line 174, in on_epoch_end
max_gen_length=self.max_gen_length)
File "textgenrnn-s/textgenrnn/textgenrnn.py", line 94, in generate_samples
self.generate(n, temperature=temperature, **kwargs)
File "textgenrnn-s/textgenrnn/textgenrnn.py", line 83, in generate
max_gen_length)
File "textgenrnn-s/textgenrnn/utils.py", line 69, in textgenrnn_generate
next_char = indices_char[next_index]
KeyError: 0
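The traceback ends in a dictionary lookup, indices_char[next_index], failing for key 0. Assuming vocabulary indices start at 1 so that index 0 has no character attached (an assumption, not confirmed here), the model occasionally sampling index 0 would produce exactly this error. A guarded lookup could look like the following sketch; lookup_char is a hypothetical helper.

```python
def build_indices_char(vocab):
    """Invert a char -> index vocabulary into index -> char."""
    return {index: char for char, index in vocab.items()}


def lookup_char(indices_char, next_index, fallback=""):
    """Return the character for an index, or a fallback if it is unmapped."""
    return indices_char.get(next_index, fallback)


indices_char = build_indices_char({"a": 1, "b": 2})
print(repr(lookup_char(indices_char, 2)))
print(repr(lookup_char(indices_char, 0)))  # unmapped index, no KeyError
```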

start_text?

Hi,

Does textgenrnn have the ability to provide a start_text like in char-rnn?

Thanks!

Incorporate Markov probabilities

Char-based Markov chain probabilities might work to hedge against predicting typos (e.g. calculate a weighted average between the output of the textgenrnn and the Markov probability of the next predicted char given the last char).

It would need to be loaded (and saved) as a separate file. For a ~400 char vocab, a full transition table represents ~160,000 values (400 × 400), which may be weighty.
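The idea could be prototyped roughly as follows: count character bigrams in the training text, then mix the resulting transition probabilities with the model's predicted distribution. All names and the default mixing weight are hypothetical, and the table is stored sparsely (only observed pairs) to keep it small.

```python
from collections import Counter, defaultdict


def bigram_probabilities(text):
    """Estimate P(next char | previous char) from observed bigrams."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {c: n / sum(following.values()) for c, n in following.items()}
        for prev, following in counts.items()
    }


def blend(model_probs, markov_probs, last_char, weight=0.5):
    """Weighted average of model and Markov probabilities, renormalized.

    model_probs: dict mapping candidate char -> model probability
    weight:      share given to the Markov estimate (0 disables it)
    """
    transitions = markov_probs.get(last_char, {})
    mixed = {
        c: (1 - weight) * p + weight * transitions.get(c, 0.0)
        for c, p in model_probs.items()
    }
    total = sum(mixed.values())
    return {c: p / total for c, p in mixed.items()} if total else dict(model_probs)


markov = bigram_probabilities("abab")
print(blend({"a": 0.5, "b": 0.5}, markov, last_char="a"))
```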

decontextualized explanation wanted

Please explain: "The network was also trained in such a way that the rnn layer is decontextualized in order to both improve training performance and mitigate authorial bias."

Idea: port to R/Keras?

Hi. I'm learning R Keras (emphasis on the learning) and I'd like to get this running under R rather than Python. Is it a big thing to port to R?

Weights are not saving

Also, how can I save weights so I can continue training? When I just start it like this:
from textgenrnn import textgenrnn
t = textgenrnn('textgenrnn_weights.hdf5')
t.train_from_file('input.txt', num_epochs=1)
the loss is reset again (for example, it was 0.6 and is now back at 1.5).

Non-standard / complete hdf5?

If I save an .hdf5 file from Colaboratory and try uploading it with:

import os
from google.colab import files
from textgenrnn import textgenrnn

uploaded = files.upload()
# Pick the most recently modified file in the working directory
all_files = [(name, os.path.getmtime(name)) for name in os.listdir()]
latest_file = sorted(all_files, key=lambda x: -x[1])[0][0]

and then

textgen = textgenrnn(latest_file)
textgen.generate(20, temperature=0.8)

Then it throws this error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 textgen = textgenrnn(latest_file)
2 textgen.generate(20, temperature=0.8)

/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py in init(self, weights_path, vocab_path, config_path, name)
63 self.model = textgenrnn_model(self.num_classes,
64 cfg=self.config,
---> 65 weights_path=weights_path)
66 self.indices_char = dict((self.vocab[c], c) for c in self.vocab)
67

/usr/local/lib/python3.6/dist-packages/textgenrnn/model.py in textgenrnn_model(num_classes, cfg, context_size, weights_path, dropout, optimizer)
36 model = Model(inputs=[input], outputs=[output])
37 if weights_path is not None:
---> 38 model.load_weights(weights_path, by_name=True)
39 model.compile(loss='categorical_crossentropy', optimizer=optimizer)
40

/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py in load_weights(self, filepath, by_name, skip_mismatch, reshape)
2662 load_weights_from_hdf5_group_by_name(
2663 f, self.layers, skip_mismatch=skip_mismatch,
-> 2664 reshape=reshape)
2665 else:
2666 load_weights_from_hdf5_group(

/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py in load_weights_from_hdf5_group_by_name(f, layers, skip_mismatch, reshape)
3462 ' weight(s), but the saved weights' +
3463 ' have ' + str(len(weight_values)) +
-> 3464 ' element(s).')
3465 # Set values.
3466 for i in range(len(weight_values)):

ValueError: Layer #2 (named "rnn_1") expects 3 weight(s), but the saved weights have 6 element(s).

Whereas if I upload the .hdf5 from here:

https://eblong.com/zarf/ftp/ifdb-titles-epoch50.hdf5

or use one of the predefined ones from the weights folder, it works fine.
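One way to interpret "expects 3 weight(s), but the saved weights have 6" is that the file was saved from a model with bidirectional RNN layers while the model being built is unidirectional. A Keras LSTM layer holds three weight arrays (kernel, recurrent kernel, bias), and a bidirectional wrapper holds two such layers. The sketch below only encodes that arithmetic; whether it explains this particular file is an assumption. If it does, supplying the config saved with the model through the constructor's config_path argument may help.

```python
def expected_weight_arrays(bidirectional):
    """Number of weight arrays in one LSTM layer.

    An LSTM has three arrays (kernel, recurrent kernel, bias);
    a bidirectional layer has one LSTM per direction.
    """
    per_direction = 3
    return per_direction * (2 if bidirectional else 1)


print(expected_weight_arrays(bidirectional=False))
print(expected_weight_arrays(bidirectional=True))
```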

'charmap' codec can't encode characters in position 0-1: character maps to <undefined>

Using textgenrnn with tweetgenerator, the following error occurred while printing the last of the 3rd-epoch examples during training:

####################
Temperature: 1.0
####################
Ahnksui! Chameraritom. Tito honge wnen Bormonick! #IHamU mavest. Sahen! Ge Ar-gieer overong.

Traceback (most recent call last):
File "tweet_generator.py", line 62, in
word_level=cfg['model_config']['word_level'])
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\textgenrnn\textgenrnn.py", line 261, in train_new_model
**kwargs)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\textgenrnn\textgenrnn.py", line 194, in train_on_texts
validation_steps=val_steps
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\keras\engine\training.py", line 1426, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\keras\engine\training_generator.py", line 229, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\keras\callbacks.py", line 77, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\textgenrnn\utils.py", line 174, in on_epoch_end
max_gen_length=self.max_gen_length)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\textgenrnn\textgenrnn.py", line 93, in generate_samples
self.generate(n, temperature=temperature, **kwargs)
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\site-packages\textgenrnn\textgenrnn.py", line 84, in generate
print("{}\n".format(gen_text))
File "C:\Users\Will\AppData\Local\conda\conda\envs\tweet generator\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-1: character maps to <undefined>
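The error comes from printing generated text to a console whose encoding (cp1252 here) cannot represent some characters, such as emoji. A possible workaround, sketched below with a hypothetical helper, is to replace unencodable characters before printing. This changes what is displayed, not what the model generates.

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by '?', so that print() does not raise UnicodeEncodeError."""
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


print(printable("caf\u00e9 \U0001F600", encoding="cp1252"))
```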

Python 2 UnicodeDecode Error

Most of the script works with Python 2, but it needs an uncomplicated way of handling UnicodeDecodeErrors when a Unicode character is chosen as the next_char.

Word-level enhancements

Word-level support was a last-minute addition, so it needs some more work:

  • Make sure the vocab abides by max_length.
  • Add a feature to collapse punctuation.

Bug in training

I'm getting this long error each time I try to train a model.
I tried copying the Colaboratory notebook to my Drive and training on multiple files, with and without changing parameters, but each time I get this error:

Training new model w/ 8-layer, 128-cell Bidirectional LSTMs
Training on 485,616 character sequences.
Epoch 1/10


ResourceExhaustedError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1321 try:
-> 1322 return fn(*args)
1323 except errors.OpError as e:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1306 return self._call_tf_sessionrun(
-> 1307 options, feed_dict, fetch_list, target_list, run_metadata)
1308

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1408 self._session, options, feed_dict, fetch_list, target_list,
-> 1409 run_metadata)
1410 else:

ResourceExhaustedError: OOM when allocating tensor with shape[1024,40,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: embedding_1/GatherV2 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/RMSprop/gradients/embedding_1/GatherV2_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_1/embeddings/read, embedding_1/Cast, rnn_8/ExpandDims_1/dim)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_1/mul/_325 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7279_loss_1/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

ResourceExhaustedError Traceback (most recent call last)
in ()
20 max_length=model_cfg['max_length'],
21 dim_embeddings=model_cfg['dim_embeddings'],
---> 22 word_level=model_cfg['word_level'])

/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py in train_from_largetext_file(self, file_path, new_model, **kwargs)
298 if new_model:
299 self.train_new_model(
--> 300 texts, single_text=True, **kwargs)
301 else:
302 self.train_on_texts(texts, single_text=True, **kwargs)

/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py in train_new_model(self, texts, context_labels, num_epochs, gen_epochs, batch_size, dropout, validation, **kwargs)
259 dropout=dropout,
260 validation=validation,
--> 261 **kwargs)
262
263 def save(self, weights_path="textgenrnn_weights_saved.hdf5"):

/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py in train_on_texts(self, texts, context_labels, batch_size, num_epochs, verbose, new_model, gen_epochs, train_size, max_gen_length, validation, dropout, via_new_model, **kwargs)
192 max_queue_size=2,
193 validation_data=gen_val,
--> 194 validation_steps=val_steps
195 )
196

/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your ' + object_name +
90 ' call to the Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
2228 outs = self.train_on_batch(x, y,
2229 sample_weight=sample_weight,
-> 2230 class_weight=class_weight)
2231
2232 if not isinstance(outs, list):

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight)
1881 ins = x + y + sample_weights
1882 self._make_train_function()
-> 1883 outputs = self.train_function(ins)
1884 if len(outputs) == 1:
1885 return outputs[0]

/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in call(self, inputs)
2480 session = get_session()
2481 updated = session.run(fetches=fetches, feed_dict=feed_dict,
-> 2482 **self.session_kwargs)
2483 return updated[:len(self.outputs)]
2484

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
898 try:
899 result = self._run(None, fetches, feed_dict, options_ptr,
--> 900 run_metadata_ptr)
901 if run_metadata:
902 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1133 if final_fetches or final_targets or (handle and feed_dict_tensor):
1134 results = self._do_run(handle, final_targets, final_fetches,
-> 1135 feed_dict_tensor, options, run_metadata)
1136 else:
1137 results = []

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1314 if handle is None:
1315 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1316 run_metadata)
1317 else:
1318 return self._do_call(_prun_fn, handle, feeds, fetches)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1333 except KeyError:
1334 pass
-> 1335 raise type(e)(node_def, op, message)
1336
1337 def _extend_graph(self):

ResourceExhaustedError: OOM when allocating tensor with shape[1024,40,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: embedding_1/GatherV2 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/RMSprop/gradients/embedding_1/GatherV2_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_1/embeddings/read, embedding_1/Cast, rnn_8/ExpandDims_1/dim)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_1/mul/_325 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7279_loss_1/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'embedding_1/GatherV2', defined at:
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in
app.launch_new_instance()
File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 22, in
word_level=model_cfg['word_level'])
File "/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py", line 300, in train_from_largetext_file
texts, single_text=True, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/textgenrnn/textgenrnn.py", line 242, in train_new_model
cfg=self.config)
File "/usr/local/lib/python3.6/dist-packages/textgenrnn/model.py", line 21, in textgenrnn_model
name='embedding')(input)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 619, in __call__
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/keras/layers/embeddings.py", line 138, in call
out = K.gather(self.embeddings, inputs)
File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 1215, in gather
return tf.gather(reference, indices)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 2666, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3141, in gather_v2
"GatherV2", params=params, indices=indices, axis=axis, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,40,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: embedding_1/GatherV2 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/RMSprop/gradients/embedding_1/GatherV2_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_1/embeddings/read, embedding_1/Cast, rnn_8/ExpandDims_1/dim)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: loss_1/mul/_325 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7279_loss_1/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
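
For scale, the tensor named in the error is small on its own. The back-of-the-envelope sketch below (the helper name `tensor_bytes` is made up for illustration, and 4 bytes per element assumes the `float` in the message means float32) suggests the GPU was already nearly full when this allocation failed. The shape presumably reads as [batch_size, max_length, dim_embeddings], so lowering `batch_size` from 1024, or shrinking the model, is the usual first thing to try.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Return the size in bytes of a dense tensor with the given shape."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape copied from the OOM message; the second line halves the first
# dimension, which is assumed here to be the batch size.
full = tensor_bytes([1024, 40, 100])
half = tensor_bytes([512, 40, 100])

print(f"batch 1024: {full / 2**20:.1f} MiB")
print(f"batch  512: {half / 2**20:.1f} MiB")
```

Every activation in the network scales with the batch size in the same way, so halving it roughly halves the activation memory, not just this one tensor.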

How could I use a pre-trained model?

I have pre-trained models from FastText and from AraVec, which is another source. The file formats are different from "hdf5"; for example, there are "bin" and "vec" files. How could I use these pre-trained models?

Colab trouble w/ training more than once?

I've been trying out the Colab notebook (awesome!) and think I'm seeing an issue where the training only works the first time. If I try to train again, even if I change datasets or re-upload the original dataset, it converges toward something that looks nothing like my original data.

Dimension 0 in both shapes must be equal

Hey, guys, I'm trying to continue training a model and am getting an error:
ValueError: Dimension 0 in both shapes must be equal, but are 465 and 56. Shapes are [465,100] and [56,100]. for 'Assign' (op: 'Assign') with input shapes: [465,100], [56,100].

What I'm doing is:

textgen = textgenrnn()
textgen.train_from_file('sentenses.txt', num_epochs=1, new_model=True)

and then

textgen = textgenrnn('textgenrnn_weights.hdf5')
textgen.train_from_file('sentenses.txt', num_epochs=100)
textgen.generate()

I'm pretty new to all this magic, so I might not get even basic things, so any help is highly appreciated.
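
The two sizes in the error look like vocabulary sizes: 465 rows for the embedding the constructor built and 56 rows in the saved weights. One plausible reading is that `textgenrnn('textgenrnn_weights.hdf5')` rebuilt the default model because no vocabulary or config file was passed. A minimal sketch, assuming that training with `new_model=True` also wrote `textgenrnn_vocab.json` and `textgenrnn_config.json` next to the weights (the helper names `saved_model_paths` and `missing_files` are hypothetical):

```python
import os


def saved_model_paths(name="textgenrnn", directory="."):
    """Build the three file paths a new model is assumed to have saved."""
    return {
        "weights_path": os.path.join(directory, f"{name}_weights.hdf5"),
        "vocab_path": os.path.join(directory, f"{name}_vocab.json"),
        "config_path": os.path.join(directory, f"{name}_config.json"),
    }


def missing_files(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths.values() if not os.path.exists(p)]


if __name__ == "__main__":
    paths = saved_model_paths()
    missing = missing_files(paths)
    if missing:
        print("Cannot reload the model, missing:", missing)
    else:
        # Imported lazily so the path check works without textgenrnn installed.
        from textgenrnn import textgenrnn

        textgen = textgenrnn(**paths)
        textgen.train_from_file('sentenses.txt', num_epochs=100)
        textgen.generate()
```

Passing all three files lets the reloaded model use the same vocabulary and layer sizes as the one that produced the weights.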

Make output longer?

How can I make the output longer? I am getting pretty short output, about 2 lines, and would like to make it longer.
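
One knob to try is the `max_gen_length` argument of `generate()`, which caps how many tokens are produced per sample. The wrapper below is only a sketch: `generate_longer` is a made-up helper, and the value 1000 is arbitrary.

```python
def generate_longer(textgen, n=1, max_gen_length=1000, temperature=0.5):
    """Request up to max_gen_length tokens per sample and return a list."""
    return textgen.generate(
        n,
        return_as_list=True,
        temperature=temperature,
        max_gen_length=max_gen_length,
    )
```

With a trained model this would be called as `generate_longer(textgen, n=3)`. Generation can still stop earlier than the cap if the model decides the text has ended, which is likely when it was trained on short lines.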

Using textgenrnn in French

I've tested the Colab script with French texts.
Of course, the result was quite messy at the beginning, because of retraining from English to French.
Do you think it may somehow work with more training steps, etc.?

error when train_on_texts instead of train_from_largetext_file: AttributeError: 'numpy.ndarray' object has no attribute 'lower'

Hi,
I tried to train your example at https://github.com/minimaxir/textgenrnn/blob/master/docs/textgenrnn-demo.ipynb

my code looks like:

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.reset()
texts = ['Never gonna give you up, never gonna let you down',
        'Never gonna run around and desert you',
        'Never gonna make you cry, never gonna say goodbye',
        'Never gonna tell a lie and hurt you']


textgen.train_on_texts(texts, num_epochs=2,  gen_epochs=2)

I get the following output:

Using TensorFlow backend.
2018-08-01 15:40:15.286776: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-01 15:40:15.356998: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-01 15:40:15.357329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 7.61GiB
2018-08-01 15:40:15.357343: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-08-01 15:40:17.171291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-01 15:40:17.171337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-08-01 15:40:17.171348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-08-01 15:40:17.171563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7343 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
Training on 174 character sequences.
Epoch 1/2
Traceback (most recent call last):
  File "test2.py", line 11, in <module>
    textgen.train_on_texts(texts, num_epochs=2,  gen_epochs=2)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/textgenrnn/textgenrnn.py", line 194, in train_on_texts
    validation_steps=val_steps
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras/engine/training.py", line 1415, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras/engine/training_generator.py", line 177, in fit_generator
    generator_output = next(output_generator)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras/utils/data_utils.py", line 793, in get
    six.reraise(value.__class__, value, value.__traceback__)
  File "/home/tom/.local/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras/utils/data_utils.py", line 658, in _data_generator_task
    generator_output = next(self._generator)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/textgenrnn/model_training.py", line 49, in generate_sequences_from_texts
    x = process_sequence([x], textgenrnn, new_tokenizer)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/textgenrnn/model_training.py", line 79, in process_sequence
    X = new_tokenizer.texts_to_sequences(X)
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras_preprocessing/text.py", line 274, in texts_to_sequences
    return list(self.texts_to_sequences_generator(texts))
  File "/home/tom/programming/tensorflow/venv/lib/python3.6/site-packages/keras_preprocessing/text.py", line 299, in texts_to_sequences_generator
    text = text.lower()
AttributeError: 'numpy.ndarray' object has no attribute 'lower'

The following does work though and I get the training step outputs:

from textgenrnn import textgenrnn
textgen = textgenrnn()
textgen.reset()

train_function = textgen.train_from_largetext_file

train_function(
    file_path='somevalidpath',
    new_model=True,)

It seems like it's an issue only when training from a list of strings directly instead of loading from a file.

PS: I am in a python virtual environment with tensorflow-gpu installed on latest Ubuntu 18.

how to create fixed-length documents?

I want to generate recipes using this, but I am not sure how to format the dataset. Should I combine each recipe into a single sentence (so each sentence is a recipe) and feed a single document to the code? Or is there another way to go about this?
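
One way to try this is to flatten each recipe onto its own line, on the assumption that `train_from_file` treats every line of the input file as a separate text. The sketch below is illustrative only: the function names and the ` | ` separator standing in for line breaks are made up.

```python
def recipes_to_lines(recipes, newline_token=" | "):
    """Collapse each multi-line recipe into one line, dropping blank lines."""
    lines = []
    for recipe in recipes:
        parts = [p.strip() for p in recipe.splitlines() if p.strip()]
        if parts:
            lines.append(newline_token.join(parts))
    return lines


def write_training_file(recipes, path, newline_token=" | "):
    """Write one flattened recipe per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        for line in recipes_to_lines(recipes, newline_token):
            f.write(line + "\n")
```

After generation, replacing the separator with a newline would restore the recipe layout.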

saving model file at certain epochs

Is there any functionality to save the model file every k epochs? The reason is that when I train past a certain point the model overfits, so I want to save the model under a different name after every, e.g., 100 epochs. I looked at the source code and found the following function:

self.model.save_weights()

but I am not sure where to call it.
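
A workaround that needs no changes to the library is to train in rounds of k epochs and call the `save(weights_path=...)` method (its signature appears in the traceback earlier on this page) after each round. This is a sketch under assumptions: `train_with_checkpoints` and the file-name pattern are made up, and `textgen` is taken to be an already-created model whose training continues from its current weights on each call.

```python
def train_with_checkpoints(textgen, file_path, total_epochs, every,
                           prefix="weights", **train_kwargs):
    """Train in rounds of `every` epochs, saving weights after each round.

    Do not pass new_model=True in train_kwargs: it would presumably
    restart training from scratch on every round.
    """
    if every <= 0:
        raise ValueError("every must be a positive number of epochs")

    saved = []
    done = 0
    while done < total_epochs:
        # The last round may be shorter if total_epochs is not a multiple.
        step = min(every, total_epochs - done)
        textgen.train_from_file(file_path, num_epochs=step, **train_kwargs)
        done += step
        path = f"{prefix}_epoch_{done}.hdf5"
        textgen.save(path)
        saved.append(path)
    return saved
```

If the library decays the learning rate over `num_epochs`, each round would restart that schedule, so results may differ somewhat from a single long run.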

Running out of memory with medium-sized datasets.

I am trying ~100 MB datasets and the process always gets killed.

This is most likely because the code loads all samples into memory. It would probably be better to load the file in batches and then use Keras' fit_generator.
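
The batching idea can be sketched with a plain Python generator that reads the file a piece at a time. This is not how textgenrnn is implemented; `iter_text_chunks` and the chunk size are illustrative, and turning each chunk into training sequences is left out.

```python
def iter_text_chunks(path, chunk_chars=1_000_000, encoding="utf-8"):
    """Yield successive pieces of a text file without loading it all at once."""
    with open(path, encoding=encoding) as f:
        while True:
            chunk = f.read(chunk_chars)
            if not chunk:
                break
            yield chunk
```

A generator like this keeps at most one chunk in memory, which is the property a `fit_generator`-style training loop relies on.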

Cyrillic support?

After running train_from_file on a file with Cyrillic text, textgen.generate() produces only spaces instead of words. Any hints on Cyrillic support?

UPDATE: Wait, maybe it's because of empty lines in file. Trying to retrain.

train_from_largetext only produces short chunks of output text

This is specifically regarding the train_from_largetext_file() option in character-level mode. The output text seems to be produced in chunks of around 300 characters with no continuity between them. This is in contrast to other char-rnns that produce text in one continuous flow.

Here's what I did:
I trained on a large text file that has thousands of recipes, using the following mostly-default command.
recipes.train_from_largetext_file('datasets/recipes.txt', new_model=True, num_epochs=1)

Then I generate output using this command:
recipes.generate(10, temperature=0.6)

The generated output, however, is produced a chunk of about 300 characters at a time, and each 300-character chunk doesn't have any continuity with a previous chunk.

Here's what I'd expect to get (a recipe from another char-rnn based on the same training text):

@@@@@ Now You're Cooking! Export Format

Cold Salad Crawfish

seafood

1 1/4 cup heavy cream
1 paprika
1 onion; finely diced
1 cup honey
2 tablespoon light cream

Preheat oven to 350 degrees F.

When the noodles have set, add the salt and pepper and soak
the fat away for 5 minutes. Spoon over the tomato puree.
Add the shrimp and bring to a boil. Reduce heat. Simmer, uncovered, for 10 minutes. Add the meat and stir-fry for 25 minutes, stirring well.
Remove from heat, add shrimp. Then add pork and cook, stirring, 10 to 12 min. Remove from heat. Add sliced beets and stir-fry for 3 minutes to the brandy. Add chili powder and rind of soy sauce. Cook 3 minutes, or untill set and browned, bring bread mixture to a boil. Reduce heat, cover and simmer 8 to 10 minutes until cooked through.

Yield: 1 servings

And here's what I actually get. Each time it generates a new chunk of output text, it starts in the middle of a different recipe.

occasionally, until ship three meat is thickens... Recipe by by a marinaded consistery or but dish. Cover and cook on a pan or pot and add the beans or milk. Spread in each thick to pieces are almosh the meat is
endiples.
Cook and finely chill on toppot stagh or shells on medium half the onions in

heese, shallow finely chopped
1 tablespoon powdered sugar
1 cup water

Place finely chopped cubes excess cookie crushed pineapples, smooth the cooking the butter and scallions are tender.

Add milk, halved slices, carrot and vanilla.

Yield: 6 servings

@@@@@ Now You're Cooking! Export Forma

're Cooking! Export Format

Caramelizable Soup Chili

fish, chicken, chicken, berries

2 tablespoon whites chicken breasts and remove brown black excepers and past and half boiling with woods until smooth and simmer for 6-3 hrsped pan. If mady be spices and salted base in a large to a medium and sp

ed sugar and butter and sugar and simmer until firm are tender. Spoon over the batter. Drained relish, stirring constantly, the chopped in a minute.
Cover and cover the cake the sides of the baking ingred excess of topping fat. Cover and let longer to stir to root. Cover and place in
melted blackly

inkle with beef ice cream to taste.

Noter: 5 x 9 inch soup pot of catsup

Combine the coramas, garlic in thick to simmer 1 to 1 hour, about 1 1/2 hours. Remove the pan or boils. Add to the spray or bubble off the cream excess finely salt and pepper.

Yield: 6 servings

Are there training/sampling parameters that would avoid this issue?

Name

File "I:\Software\Anaconda3.5\lib\site-packages\keras\engine\input_layer.py", line 172, in Input
dtype = K.floatx()

NameError: name 'K' is not defined

I'm a beginner. How do I resolve this? I googled but didn't find the answer.

I'm using Spyder (Python 3.6).

Dataset size and parameters for complete new training

Not an issue per se, merely a question. I try to train a new model for a particular issue, it's bureaucratic Russian. Luckily, there's no problem in obtaining as many such samples as I need. Could you please tell me your initial Reddit dataset size and training parameters? I suspect it will take a really long time, so any additional guidelines would be highly appreciated.

generated text is repeating the training data

I have a relatively small dataset: 10-15 lines per recipe, and about 400 recipes in total. I am trying to train textgenrnn to generate new recipes, and I trained for 1000 epochs with the following config:

num_epochs: 1000
gen_epochs: 10
batch_size: 128
prop_keep: 1.0
new_model: True

model_config:
rnn_layers: 2
rnn_size: 128
rnn_bidirectional: False
max_length: 40
dim_embeddings: 100
word_level: False

However, when I generate text with temperature ranging from 0.1 to 1.0, the generated text just copies the training data (sometimes there are slight differences, but most of the time it is just copy-paste). Is there any parameter I should pay attention to in order to avoid overfitting?
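
The `train_on_texts` signature in the traceback earlier on this page lists `dropout`, `validation` and `train_size`, which are the obvious candidates, together with far fewer epochs. The values below are guesses for illustration, not recommendations from the library, and `recipes.txt` is a placeholder file name.

```python
# Illustrative settings only; tune them against the validation loss.
train_kwargs = {
    "new_model": True,
    "num_epochs": 50,     # far fewer passes than 1000 on a 400-recipe dataset
    "gen_epochs": 10,     # print samples every 10 epochs to spot memorisation
    "batch_size": 128,
    "train_size": 0.8,    # hold out 20% of the data
    "validation": True,   # report loss on the held-out part
    "dropout": 0.2,       # assumed to randomly drop a share of input tokens
}

# With textgenrnn installed and a model created as `textgen`:
# textgen.train_from_file("recipes.txt", **train_kwargs)
```

If the validation loss starts rising while the training loss keeps falling, that is the point where further epochs mostly memorise the data.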

ValueError: Layer #1 (named "embedding")

Hi,

After I train the model with the following config:

num_epochs: 1000
gen_epochs: 10
batch_size: 128
prop_keep: 1.0
new_model: True
model_config:
rnn_layers: 2
rnn_size: 128
rnn_bidirectional: False
max_length: 40
dim_embeddings: 100
word_level: False

I am trying to generate some text with the following code:

from textgenrnn import textgenrnn
textgen_2 = textgenrnn('weights.hdf5')
textgen_2.generate(3, temperature=1.0)

But every time I get the following error (I changed the dataset and the number of epochs and trained a few times from scratch; the result is the same):

Traceback (most recent call last):
File "test.py", line 2, in
textgen_2 = textgenrnn('weights.hdf5')
File "textgenrnn/textgenrnn.py", line 66, in __init__
weights_path=weights_path)
File "textgenrnn/model.py", line 38, in textgenrnn_model
model.load_weights(weights_path, by_name=True)
File "lib/python3.6/site-packages/keras/engine/network.py", line 1177, in load_weights
reshape=reshape)
File "lib/python3.6/site-packages/keras/engine/saving.py", line 1018, in load_weights_from_hdf5_group_by_name
str(weight_values[i].shape) + '.')
ValueError: Layer #1 (named "embedding"), weight <tf.Variable 'embedding/embeddings:0' shape=(465, 100) dtype=float32_ref> has shape (465, 100), but the saved weight has shape (91, 100).

What might be the problem?
