Comments (8)
When I ran the example above and got garbage, I had trained another model in the meantime, which presumably overwrote textgenrnn_vocab.json and textgenrnn_config.json. I guess the .save() strategy always worked for regular shorttext training b/c _vocab.json and _config.json weren't changing. Probably if I went back and tried to load a shorttext model after training a longtext model, it wouldn't work either.
Solution: if I want to save a largetext model to come back to later, it looks like the strategy is to manually copy textgenrnn_weights.hdf5, textgenrnn_vocab.json and textgenrnn_config.json to a new folder before running another model.
I confirmed that this works by training a largetext model:
recipes_len20.train_from_largetext_file('datasets/recipes.txt', new_model=True, num_epochs=1, max_length=20)
manually copying textgenrnn_vocab.json, textgenrnn_config.json, and textgenrnn_weights.hdf5 to another folder, training another model, and then loading a new model from the original saved weights, vocab, and config:
recipes_len20_reload2 = textgenrnn(weights_path='weights/recipes_len20/textgenrnn_weights.hdf5', vocab_path='weights/recipes_len20/textgenrnn_vocab.json', config_path='weights/recipes_len20/textgenrnn_config.json')
Then I was able to sample from my saved model:
>>>recipes_len20_reload2.generate()
canned with a deveined tomatoes and crushed
salt. Add the sausage and cook on high speed and sugar to the shrimp. Seal the casserole in the tomato sauce, pat
each additional mixed
1 water
1 egg
1 tablespoon lemon juice
1 tablespoon curry powder
1/2 teaspoon crumbled cheddar cheese, shredded
1 c
It would be nice if .save() did all the copying-to-a-new-folder for me, but at least this seems to work.
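Until then, the manual copy step described above can be scripted. This is only a minimal sketch: snapshot_model is a hypothetical helper, not part of textgenrnn, and it assumes the three files sit in the current working directory under the default names mentioned in this thread.

```python
import shutil
from pathlib import Path

# Default file names that training leaves in the working directory,
# as reported in this thread.
MODEL_FILES = (
    "textgenrnn_weights.hdf5",
    "textgenrnn_vocab.json",
    "textgenrnn_config.json",
)


def snapshot_model(dest_dir, src_dir="."):
    """Copy the weights, vocab, and config files from src_dir into dest_dir.

    Hypothetical helper (not a textgenrnn API). Raises FileNotFoundError
    if any of the three files is missing, so a partial snapshot is never made.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    missing = [name for name in MODEL_FILES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {missing}")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in MODEL_FILES:
        shutil.copy2(src / name, dest / name)  # copy2 also keeps timestamps
        copied.append(dest / name)
    return copied
```

Calling snapshot_model('weights/recipes_len20') right after training, and before training anything else, would produce the folder layout used in the textgenrnn(weights_path=..., vocab_path=..., config_path=...) call above.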
from textgenrnn.
If you are using a custom model, you need to load with the config.json and the vocab.json that were also generated.
from textgenrnn.
When I saved my model, it looks like it only saved a .hdf5 file, not the .json files. The command I used was something like this:
recipes_old.save('weights/recipes.hdf5')
Then when I do this, after having trained some other models in the meantime:
recipes = textgenrnn(weights_path='weights/recipes.hdf5', vocab_path='textgenrnn_vocab.json', config_path='textgenrnn_config.json')
I don't get an error message, but the generated text looks like garbage.
recipes.generate()
) o D T wT f 99 95 u5555555555" "5''80g00'..t'' '-w...ssr0r0Bswoc3c- - 9- ""T0 0A5 p5-5F9G 5T '.'qr0 l - 0 00 J - 0 - 0 5 0 - 0 0 - 0 0 - 0 F C - 6 F -
Was I supposed to save using a different method?
from textgenrnn.
The "the saved weight has shape (106, 100)" part of the error message implies that a new model was trained (the 106 corresponds to the size of the vocabulary). Those files should have been generated before the model started training.
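One rough way to check whether a given vocab file belongs to a given set of weights is to compare the vocabulary size against the first dimension reported in the error message. The sketch below rests on two assumptions that are not confirmed in this thread: that the vocab file is a JSON object with one entry per token, and that the saved weight has one row per token plus one reserved index. expected_embedding_rows and reserved_rows are hypothetical names used only for illustration.

```python
import json


def expected_embedding_rows(vocab_path, reserved_rows=1):
    """Return the first weight dimension implied by a vocab JSON file.

    Assumption: the file holds a JSON object with one entry per token.
    Assumption: the model reserves `reserved_rows` extra rows (1 here);
    set it to 0 if that does not match your version.
    """
    with open(vocab_path, encoding="utf-8") as f:
        vocab = json.load(f)
    return len(vocab) + reserved_rows
```

If this number differs from the first value in the shape from the error message (106 above), the vocab file was most likely written by a different training run than the weights.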
Each training epoch saves the weights; there shouldn't be a reason to explicitly call .save(), and I should probably consider deprecating that.
I double-checked the code path of train_from_file, and this order of operations should be working correctly, although I'm not sure why garbage got generated.
from textgenrnn.
Saving to a new folder w/ the 3 files is a good use case to avoid deprecation. I'll look into it.
Thanks for the very detailed workflow! :)
from textgenrnn.
I ran some experiments after getting a similar error. I found that new_model=False works great, but new_model=True created broken weights (if the weights file is 1.2 MB, the weights are broken). So if you need to train a new model, my advice is to just not specify this parameter. In that case I tested a few times, and the library generated correct weights each time (1.9 MB).
Currently, using new_model=True generates broken weights every time for me.
Hope that helps someone!
from textgenrnn.
Adding this in case it helps people, because it's still possible to end up with a shape error with the workflow above.
textgenrnn_vocab.json and textgenrnn_config.json only update if training is run with new_model=True.
So if you initialize a model with my_model = textgenrnn() and then train, the training may still work, but textgenrnn_vocab.json and textgenrnn_config.json won't match your model (so you won't be able to go back to these results later) unless you also run it with new_model=True.
from textgenrnn.
Thanks for following up, Janelle!
I admit I haven't polished the transfer learning workflow; I'll take a look at some point.
from textgenrnn.