Comments (8)
You could try adding a print statement around lines 94-97 of seq2tfrec_onehot.py to make sure that a training set (rather than a test set) is being converted.
By the way, if you only want to reproduce 16S prediction using the seq2species model, the original implementation by Google might be helpful:
https://github.com/tensorflow/models/tree/master/research/seq2species
from deepmicrobes.
Yes, I already made sure it is converted as a training set with the convert_advance_file function, and that function correctly extracts the information.
It turns out that input_fn_train is chosen based on the --encode_method flag, which I failed to set. It defaults to kmer, which is of course wrong for one-hot data. Setting --encode_method to one_hot fixes the TFRecord parsing, and training starts successfully.
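For readers who hit the same pitfall, here is a minimal sketch of how a flag default can silently select the wrong record parser. It uses argparse and two placeholder parser functions (`parse_kmer_record`, `parse_one_hot_record`); these names are hypothetical and are not taken from the DeepMicrobes code. Only the flag name `--encode_method` and its values `kmer` and `one_hot` come from this thread.

```python
import argparse


def parse_kmer_record(record):
    # Hypothetical stand-in for a k-mer TFRecord parser.
    return ("kmer", record)


def parse_one_hot_record(record):
    # Hypothetical stand-in for a one-hot TFRecord parser.
    return ("one_hot", record)


# Map each flag value to the parser it should select.
PARSERS = {"kmer": parse_kmer_record, "one_hot": parse_one_hot_record}


def build_arg_parser():
    parser = argparse.ArgumentParser()
    # As described in the thread, leaving the flag unset selects "kmer".
    parser.add_argument("--encode_method", choices=sorted(PARSERS), default="kmer")
    return parser


def select_parser(argv):
    """Return the record parser selected by the command-line arguments."""
    args = build_arg_parser().parse_args(argv)
    return PARSERS[args.encode_method]
```

With no flag given, `select_parser([])` returns the k-mer parser, so one-hot records would be read with the wrong schema unless `--encode_method=one_hot` is passed explicitly.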
Calculating the loss seems to fail, however, and I am not sure what is causing it.
I am getting this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to explicitly squeeze dimension 1 but dimension was not 1: 0
[[Node: sparse_softmax_cross_entropy_loss/remove_squeezable_dimensions/Squeeze = Squeeze[T=DT_INT64, squeeze_dims=[-1], _device="/job:localhost/replica:0/task:0/device:CPU:0"](IteratorGetNext:1)]]
Any idea on how to fix this/what causes this?
P.S. I found the original paper and code to be very convoluted and difficult to work with, and I am also interested in trying the other models in this repo.
from deepmicrobes.
I'm not really sure about the solution, but I think the problem lies in the training data (e.g., the length of the DNA sequences) rather than the model. The model takes a --max_len flag whose default value is 150 bp. Try setting it to the max length of your full-length 16S data.
from deepmicrobes.
Oops, I see I forgot to add the command I ran:
DeepMicrobes.py --input_tfrec=combined_train_small.tfrec --model_name=seq2species --model_dir=seq2species_new_weights_small --max_len=400 --encode_method=one_hot
(I trimmed the sequences to 400bp).
So that should not be the problem. When I did forget to set --max_len, I got an error about padding to a smaller size than the original.
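To illustrate why a too-small --max_len produces a padding error, here is a small pure-Python sketch of one-hot encoding followed by zero-padding. The functions `one_hot_encode` and `pad_one_hot` are hypothetical illustrations, not the DeepMicrobes implementation, and the choice to encode unknown bases as all-zero rows is an assumption.

```python
BASES = "ACGT"


def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in BASES:
            row[BASES.index(base)] = 1
        # Assumption: any other character (e.g. N) stays an all-zero row.
        rows.append(row)
    return rows


def pad_one_hot(encoded, max_len):
    """Append all-zero rows until the read has exactly max_len rows."""
    if len(encoded) > max_len:
        # Padding can only make a read longer, never shorter.
        raise ValueError(
            "read length %d exceeds max_len %d" % (len(encoded), max_len)
        )
    padding = [[0, 0, 0, 0] for _ in range(max_len - len(encoded))]
    return encoded + padding
```

A 400 bp read padded with `max_len=150` would raise here, while `max_len=400` returns the read unchanged, which matches the behaviour described above.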
from deepmicrobes.
Try deleting the model_dir (rm -rf seq2species_new_weights_small) and running again.
from deepmicrobes.
Still no luck, unfortunately.
Log
This is my repo, in case you are puzzled by the print statements:
github.com/Bartvelp/DeepMicrobes_clone
from deepmicrobes.
You should set the --num_classes flag to your actual number of categories. The default value is --num_classes=2505 (I had 2505 species for the pre-trained model).
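A quick sanity check along these lines can catch a mismatch before training starts. Sparse softmax cross-entropy expects integer class ids in the range [0, num_classes), so any label outside that range points to a wrong --num_classes value or a mislabeled dataset. The helper `find_out_of_range_labels` below is a hypothetical sketch, not part of DeepMicrobes.

```python
def find_out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    return sorted({label for label in labels if not 0 <= label < num_classes})


def infer_min_num_classes(labels):
    """Smallest num_classes that covers every label, assuming ids start at 0."""
    return max(labels) + 1 if labels else 0
```

For example, labels `[0, 1, 2504]` are all valid with `num_classes=2505`, whereas a label of `2505` would be reported as out of range.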
from deepmicrobes.
Yes, thank you, I forgot that.
I figured it out: due to a weird bug or something similar, my TFRecord file did not contain the classes/labels. When I recreated the file, it all worked out of the box. Thanks a lot for your help!
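One plausible reading of the earlier error message ("Tried to explicitly squeeze dimension 1 but dimension was not 1: 0") is that the label batch had shape [batch, 0] instead of [batch, 1], which is what an empty label feature would produce. The sketch below mimics that squeeze step on plain Python lists; `squeeze_label_batch` is a hypothetical helper for illustration and is not TensorFlow's implementation.

```python
def squeeze_label_batch(label_rows):
    """Flatten [[3], [7]] into [3, 7], requiring exactly one label per row."""
    flat = []
    for index, row in enumerate(label_rows):
        if len(row) != 1:
            # Mirrors the failure mode: the last dimension is not of size 1.
            raise ValueError(
                "cannot squeeze row %d: expected 1 label, got %d" % (index, len(row))
            )
        flat.append(row[0])
    return flat
```

A batch such as `[[3], [7]]` flattens cleanly, while `[[], []]` (records with no label stored) fails in the same way the loss computation did.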
Closing
from deepmicrobes.