eisenjulian / nlp_estimator_tutorial

Educational material on using the TensorFlow Estimator framework for text classification

License: MIT License

Jupyter Notebook 87.76% Python 12.24%

estimator nlp tensorflow text-classification

nlp_estimator_tutorial's Introduction

Try it live in Colab here

nlp_estimator_tutorial's Issues

Why use 200 as the fixed sequence length?

Hello. In your blog post "Classifying text with TensorFlow Estimators", in the section "Loading the data", you set 200 as the fixed length of the sequences. I found that the sequences have a maximum length of 2494 and a minimum of 11. May I ask why you chose 200? Are there any references on this question?
Thank you very much.
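
A fixed length is usually a trade-off between memory and compute on one side and how many reviews get truncated on the other. The sketch below is not the author's code; it shows one way to pad or truncate to a fixed length and to measure what fraction of sequences fit without truncation. The helper names (`pad_or_truncate`, `coverage`), the toy data, and the choice to pad and truncate at the end are assumptions made for illustration.

```python
import numpy as np

def pad_or_truncate(sequences, maxlen, pad_value=0):
    """Return an int array of shape (len(sequences), maxlen)."""
    out = np.full((len(sequences), maxlen), pad_value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        kept = seq[:maxlen]          # drop tokens beyond maxlen (truncate at the end)
        out[row, :len(kept)] = kept  # remaining slots keep pad_value (pad at the end)
    return out

def coverage(lengths, maxlen):
    """Fraction of sequences that fit within maxlen without truncation."""
    lengths = np.asarray(lengths)
    return float(np.mean(lengths <= maxlen))

# Toy token ids standing in for tokenized reviews (hypothetical values).
toy = [[5, 8, 2], [7, 1, 4, 9, 3, 6], [11]]
padded = pad_or_truncate(toy, maxlen=4)
fits = coverage([len(s) for s in toy], maxlen=4)
```

Running `coverage` over the real review lengths for a few candidate values (100, 200, 500, ...) would show how much text each choice cuts off, which is one way to judge whether 200 is reasonable for a given model.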

Python 3.5 is causing the error: ValueError: Mismatched label shape. Classifier configured with n_classes=1. Received 254. Suggested Fix: check your n_classes argument to the estimator and/or the shape of your label.

The error comes from here

params = {'embedding_initializer': tf.random_uniform_initializer(-1.0, 1.0)}

# Creating an estimator for random embedding
cnn_classifier = tf.estimator.Estimator(model_fn=cnn_model_fn,
                                        model_dir=os.path.join(model_dir, 'cnn'),
                                        params=params)
train_and_evaluate(cnn_classifier)

This is the error.

ValueError: Mismatched label shape. Classifier configured with n_classes=1. Received 254. Suggested Fix: check your n_classes argument to the estimator and/or the shape of your label.

The code worked perfectly with Python 2.7.

Why does train_input_fn output 3 values?

This function builds its dataset from three values, [x_train, x_len_train, y_train]:

def train_input_fn():
    dataset = tf.data.Dataset.from_tensor_slices((x_train, x_len_train, y_train))
    dataset = dataset.shuffle(buffer_size=len(x_train_variable))
    dataset = dataset.batch(100)
    dataset = dataset.map(parser)
    dataset = dataset.repeat()
    iterator = dataset.make_one_shot_iterator()
    return iterator.get_next()

However, the Google documentation says:

The return value must be a two-element tuple organized as follows:

The first element must be a dict in which each input feature is a key, and then a list of values for the training batch.

The second element is a list of labels for the training batch.

So I don't really understand how a custom Estimator can work with a tuple of 3 values.
Thanks in advance.
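
One likely explanation is the `dataset.map(parser)` step: the three-element tuple only exists before the map, and `parser` can fold the first two elements into a feature dict so that each item becomes a (features, label) pair. The sketch below illustrates that idea in plain Python. The body of `parser` and the key names `"x"` and `"len"` are assumptions, since the issue does not quote the real function, and the list comprehension is only a stand-in for `dataset.map`.

```python
def parser(x, length, y):
    # Hypothetical parser: group the inputs into a feature dict and
    # return the two-element (features, label) structure Estimators expect.
    features = {"x": x, "len": length}
    return features, y

# Toy (tokens, length, label) triples standing in for the dataset elements.
triples = [([1, 2, 3], 3, 0), ([4, 5, 0], 2, 1)]

# Plain-Python stand-in for dataset.map(parser).
mapped = [parser(*triple) for triple in triples]
```

After the map, every element has two parts, so `iterator.get_next()` would return a (features, labels) pair rather than three values.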

Why use index_offset in index_to_text? x_train_variable contains the original indices, beginning from 1.

nlp_estimator_tutorial/nlp_estimators.py

Line 81 in 5bfec8c

print(index_to_text(x_train_variable[0]))
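
For context, Keras `imdb.load_data` shifts every word index by `index_from` (3 by default) and reserves 0 for padding, 1 for the start marker, and 2 for out-of-vocabulary words, while `imdb.get_word_index()` returns the unshifted ranks starting at 1. That would explain the offset. The sketch below uses a toy vocabulary rather than the real one, and the token strings and helper body are illustrative assumptions, not the repository's code.

```python
INDEX_OFFSET = 3                     # default index_from in Keras imdb.load_data
PAD_ID, START_ID, OOV_ID = 0, 1, 2   # ids reserved by the loader's defaults

# Toy stand-in for imdb.get_word_index(): word -> frequency rank, starting at 1.
raw_word_index = {"the": 1, "movie": 2, "great": 3}

# Shift every rank by the offset so it matches the ids in the loaded reviews.
index_to_word = {rank + INDEX_OFFSET: word for word, rank in raw_word_index.items()}
index_to_word.update({PAD_ID: "<PAD>", START_ID: "<START>", OOV_ID: "<OOV>"})

def index_to_text(indexes):
    """Decode a list of ids back into a space-separated string."""
    return " ".join(index_to_word.get(i, "<OOV>") for i in indexes)

encoded = [1, 4, 5, 2, 6]            # <START> the movie <OOV> great
decoded = index_to_text(encoded)
```

Without the offset, id 4 would be looked up as rank 4 instead of rank 1, so every decoded word would be wrong.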

How to extend this logic to a huge dataset

Thank you for this great example on input pipeline.

(x_train_variable, y_train), (x_test_variable, y_test) = imdb.load_data(num_words=vocab_size)

I have a dataset that is 2 GB in size. Unlike you, I cannot load the whole dataset into memory. How can this logic be modified to handle a multi-gigabyte dataset?
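
One common approach is to stream examples from disk instead of materializing arrays, for example with a Python generator that could then be wrapped by `tf.data.Dataset.from_generator`, or by reading lines with `tf.data.TextLineDataset`. The sketch below shows only the streaming part in plain Python. The file layout (one `label<TAB>token ids` line per example) and the helper names are assumptions made for illustration.

```python
import itertools
import os
import tempfile

def stream_examples(path):
    """Yield (token_ids, label) one line at a time, never loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            label, _, ids = line.rstrip("\n").partition("\t")
            yield [int(tok) for tok in ids.split()], int(label)

def batched(examples, batch_size):
    """Group a stream of examples into lists of at most batch_size items."""
    iterator = iter(examples)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch

# Tiny demo file standing in for the multi-gigabyte dataset (hypothetical format).
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("1\t4 8 15\n0\t16 23\n1\t42\n")
    demo_path = tmp.name

batches = list(batched(stream_examples(demo_path), batch_size=2))
os.remove(demo_path)
```

Only one batch is held in memory at a time, so the same pattern works regardless of file size; shuffling then has to be done with a bounded buffer rather than over the full dataset.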

KeyError when executing prediction

First, thank you for your excellent tutorial.

I hit a KeyError when running nlp_estimators.py at L140; it says the key 'logistic' cannot be found:

predictions = np.array([p['logistic'][0] for p in classifier.predict(input_fn=eval_input_fn)])

I'm using TensorFlow 1.9. Is it supposed to be probabilities?
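
Which keys appear in each prediction dict depends on the TensorFlow version and on how the model function builds its predictions, so a first step is to print the keys of one prediction. The sketch below is a defensive helper, not the repository's code: it prefers `'logistic'` and falls back to the last entry of `'probabilities'`, assuming that entry is the positive class in a binary classifier. The function name and the toy dicts are hypothetical.

```python
import numpy as np

def positive_probability(prediction):
    """Extract the positive-class probability from one prediction dict."""
    if "logistic" in prediction:
        return float(np.ravel(prediction["logistic"])[0])
    if "probabilities" in prediction:
        # Assumption: binary output ordered as [negative, positive].
        return float(np.ravel(prediction["probabilities"])[-1])
    raise KeyError("no probability-like key; available keys: %s"
                   % sorted(prediction))

# Toy dicts standing in for items yielded by classifier.predict(...).
with_logistic = {"logistic": np.array([0.75], dtype=np.float32)}
with_probabilities = {"probabilities": np.array([0.25, 0.75], dtype=np.float32)}

scores = np.array([positive_probability(p)
                   for p in (with_logistic, with_probabilities)])
```

With a real estimator, `positive_probability(p)` would replace `p['logistic'][0]` in the list comprehension from the issue.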
