
jiegzhan / multi-class-text-classification-cnn-rnn


Classify Kaggle San Francisco Crime Descriptions into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on TensorFlow.

Home Page: https://www.kaggle.com/c/sf-crime/data

License: Apache License 2.0

Python 100.00%
cnn text-classification kaggle tensorflow rnn embeddings lstm

multi-class-text-classification-cnn-rnn's Introduction

Project: Classify Kaggle San Francisco Crime Description

Highlights:

  • This is a multi-class text classification (sentence classification) problem.
  • The goal of this project is to classify Kaggle San Francisco Crime Description into 39 classes.
  • This model was built with CNN, RNN (LSTM and GRU) and Word Embeddings on TensorFlow.
  • Input: Descript

  • Output: Category

  • Examples:

    Descript                                      Category
    GRAND THEFT FROM LOCKED AUTO                  LARCENY/THEFT
    POSSESSION OF NARCOTICS PARAPHERNALIA         DRUG/NARCOTIC
    AIDED CASE, MENTAL DISTURBED                  NON-CRIMINAL
    AGGRAVATED ASSAULT WITH BODILY FORCE          ASSAULT
    ATTEMPTED ROBBERY ON THE STREET WITH A GUN    ROBBERY

Train:

  • Command: python3 train.py train_data.file train_parameters.json
  • Example: python3 train.py ./data/train.csv.zip ./training_config.json
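
For reference, training_config.json holds the model hyperparameters. A sketch of its shape, using the default values quoted in the issues further down (the values are illustrative, not a recommendation):

    {
      "batch_size": 256,
      "dropout_keep_prob": 0.5,
      "embedding_dim": 300,
      "evaluate_every": 100,
      "filter_sizes": "3,4,5",
      "hidden_unit": 300,
      "l2_reg_lambda": 0.0,
      "max_pool_size": 4,
      "non_static": false,
      "num_epochs": 1,
      "num_filters": 128
    }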

Predict:

  • Command: python3 predict.py ./trained_results_dir/ new_data.csv
  • Example: python3 predict.py ./trained_results_1478563595/ ./data/small_samples.csv

Reference:

multi-class-text-classification-cnn-rnn's People

Contributors

gustavomr, jiegzhan, prabh-me, rickyhan


multi-class-text-classification-cnn-rnn's Issues

How do I change the parameters for much longer text in Russian?

Thanks for your code, it's pretty much exactly what I was looking for.
But I need to classify longer texts (around 500 words per article), and they will be in Russian.
Can you advise how to adapt the code for this?

What do I need to change in the config file?
{
  "batch_size": 256,
  "dropout_keep_prob": 0.5,
  "embedding_dim": 300,
  "evaluate_every": 100,
  "filter_sizes": "3,4,5",
  "hidden_unit": 300,
  "l2_reg_lambda": 0.0,
  "max_pool_size": 4,
  "non_static": false,
  "num_epochs": 1,
  "num_filters": 128
}

What do I need to change for longer documents?
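
One hedged starting point (untested with this repo, and an assumption rather than advice from the author): raise num_epochs above 1, lower batch_size because 500-word sequences cost far more memory per example, and set non_static to true so the embeddings can adapt when no pretrained Russian vectors are supplied, for example:

    {
      "batch_size": 64,
      "non_static": true,
      "num_epochs": 10
    }

keeping the remaining keys as quoted above.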

Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

DataLossError (see above for traceback): unable to open table file .\trained_results_1509434162\best_model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
K:\users\Eduard\Downloads\multi-class-text-classification-cnn-rnn-master\multi-class-text-classification-cnn-rnn-master>

Can it work if "non_static" is set to "True" in the parameters?

Thanks for sharing the project. I found the "non_static" option in the config.

I see code like this:
with open(trained_dir + 'embeddings.pickle', 'wb') as outfile:
    pickle.dump(embedding_mat, outfile, pickle.HIGHEST_PROTOCOL)

"embedding_mat" is initialized with np.random.uniform, and when I train with "non_static": true the values are only updated inside the cnn_rnn model; when I save it, what gets written is still the initial np.random.uniform matrix.

I'm not sure I've described this clearly: if I want to use trainable embeddings, is it enough to set "non_static": true?

Thank you

Error while training

[root@bdl02node04 multi-class-text-classification-cnn-rnn-master]# python train.py
CRITICAL:root:The maximum length is 14
INFO:root:x_train: 711219, x_dev: 79025, x_test: 87805
INFO:root:y_train: 711219, y_dev: 79025, y_test: 87805
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 60, in train_cnn_rnn
l2_reg_lambda = params['l2_reg_lambda'])
File "/root/NN/multi-class-text-classification-cnn-rnn-master/text_cnn_rnn.py", line 34, in init
pad_prio = tf.concat(1, [self.pad] * num_prio)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1047, in concat
dtype=dtypes.int32).get_shape(
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
as_ref=False)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).name))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
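
This TypeError is the classic symptom of running the TF 0.x code on TF 1.x, where tf.concat swapped its arguments (the axis moved from the first position to the last). A minimal sketch of the change for the line quoted in the traceback; the same edit applies to the other tf.concat calls in text_cnn_rnn.py:

    # TensorFlow 0.x, as written in text_cnn_rnn.py:
    pad_prio = tf.concat(1, [self.pad] * num_prio)

    # TensorFlow 1.x: values first, axis second.
    pad_prio = tf.concat([self.pad] * num_prio, 1)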

create a SavedModel

Hi jiegzhan,
I want to create a SavedModel to deploy later on Google ML Engine, but your code only supports TensorFlow 0.9.0, and in that version I can't use the tf.train.Saver class to generate a SavedModel. Do you have any idea how I can fix this problem?
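
Assuming the graph is first migrated to a recent TF 1.x release (see the tf.concat / tf.split notes in the other issues), a minimal SavedModel export sketch could look like the following. The tensor names input_x, dropout_keep_prob and predictions are assumptions about the graph, not names verified against this repo:

    import tensorflow as tf

    def export_saved_model(sess, export_dir, input_x, dropout_keep_prob, predictions):
        # Write a SavedModel directory that Google ML Engine can serve (TF 1.x API).
        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={'input_x': input_x, 'dropout_keep_prob': dropout_keep_prob},
            outputs={'predictions': predictions})
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={'predict': signature})
        builder.save()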

when trying to run train.py

sidrah@sidrah-VirtualBox:/Downloads/multi-class-text-classification-cnn-rnn-master$ python3 train.py ./data/train.csv.zip ./training_config.json
Traceback (most recent call last):
File "train.py", line 8, in
import data_helper
File "/home/sidrah/Downloads/multi-class-text-classification-cnn-rnn-master/data_helper.py", line 13, in
from tensorflow.contrib import learn
File "/usr/local/lib/python3.4/dist-packages/tensorflow/init.py", line 23, in
from tensorflow.python import *
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/init.py", line 48, in
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in
_pywrap_tensorflow = swig_import_helper()
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
File "/usr/lib/python3.4/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
ImportError: /usr/local/lib/python3.4/dist-packages/tensorflow/python/_pywrap_tensorflow.so: invalid ELF header

Error while training: Unknown argument "syntax"

I get the following error when trying to train :
TypeError: __init__() got an unexpected keyword argument 'syntax'

Any idea where this comes from?

File "train.py", line 8, in
import data_helper
File "/Users/Nanous/Desktop/crime_classification/data_helper.py", line 13, in
from tensorflow.contrib import learn
File "/usr/local/lib/python2.7/site-packages/tensorflow/init.py", line 24, in
from tensorflow.python import *
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/init.py", line 54, in
from tensorflow.core.framework.graph_pb2 import *
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/graph_pb2.py", line 16, in
from tensorflow.core.framework import node_def_pb2 as tensorflow_dot_core_dot_framework_dot_node__def__pb2
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/node_def_pb2.py", line 16, in
from tensorflow.core.framework import attr_value_pb2 as tensorflow_dot_core_dot_framework_dot_attr__value__pb2
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/attr_value_pb2.py", line 16, in
from tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/tensor_pb2.py", line 16, in
from tensorflow.core.framework import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/resource_handle_pb2.py", line 22, in
serialized_pb=_b('\n/tensorflow/core/framework/resource_handle.proto\x12\ntensorflow"m\n\x0eResourceHandle\x12\x0e\n\x06\x64\x65vice\x18\x01 \x01(\t\x12\x11\n\tcontainer\x18\x02 \x01(\t\x12\x0c\n\x04name\x18\x03 \x01(\t\x12\x11\n\thash_code\x18\x04 \x01(\x04\x12\x17\n\x0fmaybe_type_name\x18\x05 \x01(\tB4\n\x18org.tensorflow.frameworkB\x13ResourceHandleProtoP\x01\xf8\x01\x01\x62\x06proto3')

TypeError: __init__() got an unexpected keyword argument 'syntax'

Extract Associated Probability

This is awesome code. I am fairly new to TF and have tried to google the answer myself, but can't figure it out. How do I also extract the probability associated with the label that the model predicts?
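
One hedged way to do it (a sketch, not code from this repo): apply tf.nn.softmax to the model's raw score tensor and read off the maximum per row. scores_tensor stands in for whatever logit tensor the graph exposes, which is an assumption to verify against the checkpoint:

    import numpy as np
    import tensorflow as tf

    def predictions_with_probabilities(sess, scores_tensor, feed_dict):
        # scores_tensor: raw logits of shape [batch_size, num_classes] (name is an assumption).
        probs = sess.run(tf.nn.softmax(scores_tensor), feed_dict=feed_dict)
        return np.argmax(probs, axis=1), np.max(probs, axis=1)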

How to deal with the imbalanced data problem?

I tried to apply the code to my own text classification data (47 classes in 42,000 records) and found that the classifier tends to choose the larger classes such as THEFT, ASSAULT and so forth. How do you deal with the imbalanced data to make the predictions more balanced?
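
The repo itself does not address imbalance. A common mitigation is to weight the cross-entropy loss by inverse class frequency (or to oversample the small classes when batching); a hedged sketch of the weighted loss, not the repo's implementation:

    import tensorflow as tf

    def class_weighted_loss(logits, one_hot_labels, class_counts):
        # class_counts: Python list with the number of training rows per class.
        total = float(sum(class_counts))
        weights = [total / (len(class_counts) * c) for c in class_counts]    # inverse frequency
        per_class_w = tf.constant(weights, dtype=tf.float32)                 # [num_classes]
        per_example_w = tf.reduce_sum(one_hot_labels * per_class_w, axis=1)  # weight of each row
        ce = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=one_hot_labels)
        return tf.reduce_mean(ce * per_example_w)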

TensorFlow 1 migration

Migrating to TensorFlow 1:
I changed tf.concat from (1, xxx) to (xxx, 1).
I changed tf.nn.rnn_ to tf.contrib.rnn.

But now I have this error:
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 60, in train_cnn_rnn
l2_reg_lambda = params['l2_reg_lambda'])
File "/home/administrator/django/demo/tempo/multi-class-text-classification-cnn-rnn/text_cnn_rnn.py", line 58, in init
inputs = [tf.squeeze(input_, [1]) for input_ in tf.split(1, reduced, pooled_concat)]
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/ops/array_ops.py", line 1203, in split
num = size_splits_shape.dims[0]
IndexError: list index out of range

Any ideas?
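
The IndexError comes from tf.split, whose signature also changed in TF 1.0: the quoted call still passes the axis first, so the new API treats pooled_concat as the axis and the scalar as the size splits. A sketch of the fix for the line in the traceback:

    # TensorFlow 0.x (as in text_cnn_rnn.py):
    inputs = [tf.squeeze(input_, [1]) for input_ in tf.split(1, reduced, pooled_concat)]

    # TensorFlow 1.x: value first, then the number of splits, then the axis.
    # If `reduced` is a NumPy scalar, casting it with int() may also be needed.
    inputs = [tf.squeeze(input_, [1]) for input_ in tf.split(pooled_concat, int(reduced), axis=1)]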

Issue regarding the training file, please help

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 60, in train_cnn_rnn
l2_reg_lambda = params['l2_reg_lambda'])
File "/home/akshata/keras_NN/multi-class-text-classification-cnn-rnn-master/text_cnn_rnn.py", line 34, in init
pad_prio = tf.concat(1, [self.pad] * num_prio)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1047, in concat
dtype=dtypes.int32).get_shape(
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
as_ref=False)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).name))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Saver not working

I believe train.py line 151, os.rename(path, trained_dir + 'best_model.ckpt'), needs to be updated for TF 1.2. The path variable is missing the file extension? Not sure. Is there any other way to fix it?

AND

predict.py lines 109, 110, and 111 need to be updated as well.

checkpoint_file = trained_dir + 'best_model.ckpt'
saver = tf.train.Saver(tf.all_variables())
saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file[:-5]))
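
A hedged workaround for TensorFlow >= 0.12, where saver.save() returns a prefix and writes model-XXXX.index / .meta / .data-* files instead of a single file, so os.rename on the bare prefix fails. The sketch below renames every file sharing that prefix; predict.py would then restore from the best_model.ckpt prefix, which is an assumption about how the restore side is written:

    import glob
    import os

    def rename_best_checkpoint(path, trained_dir):
        # path is the prefix returned by saver.save(), e.g. './checkpoints_.../model-2700'.
        for src in glob.glob(path + '.*'):
            suffix = src[len(path):]                     # '.index', '.meta', '.data-...'
            os.rename(src, trained_dir + 'best_model.ckpt' + suffix)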

Doesn't work with Python 2.7 and TensorFlow 0.9

Traceback (most recent call last):
File "/home/mjq/PycharmProjects/multi-class-text/multi-class-text-classification-cnn-rnn/train.py", line 167, in
train_cnn_rnn()
File "/home/mjq/PycharmProjects/multi-class-text/multi-class-text-classification-cnn-rnn/train.py", line 63, in train_cnn_rnn
l2_reg_lambda=params['l2_reg_lambda'])
File "/home/mjq/PycharmProjects/multi-class-text/multi-class-text-classification-cnn-rnn/text_cnn_rnn.py", line 56, in init
lstm_cell = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=self.dropout_keep_prob)
AttributeError: 'module' object has no attribute 'DropoutWrapper'

What is the reference for the RNN?

The reference in this README only covers the CNN, but what about the RNN?
Could you please give the reference paper or some other materials?

os.rename cannot find model-2600

I didn't change any file
Traceback (most recent call last):
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 151, in train_cnn_rnn
os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints_1504000981/model-2600' -> './trained_results_1504000981/best_model.ckpt'

train.py fails with best_model.ckpt not found

Hi,
I am trying the example with Python 3.6.1 and TensorFlow 1.2.1 on Windows 10.

I am getting the following error when I run "python train.py ./data/train.csv.zip ./training_config.json".

CRITICAL:root:Saved model ./checkpoints_1501717661/model-2700 at step 2700
CRITICAL:root:Best accuracy 0.997291996203733 at step 2700
CRITICAL:root:Training is complete, testing the best model on x_test and y_test
INFO:tensorflow:Restoring parameters from ./checkpoints_1501717661/model-2700
INFO:tensorflow:Restoring parameters from ./checkpoints_1501717661/model-2700
CRITICAL:root:Accuracy on test set: 0.9972894482090997
Traceback (most recent call last):
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 151, in train_cnn_rnn
os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [WinError 2] The system cannot find the file specified: './checkpoints_1501717661/model-2700' -> './trained_results_1501717661/best_model.ckpt'

I have run train.py a couple of times now, same error. Please help me solve this issue.

Thanks,
Hilmi.

Prediction fails using TensorFlow 1.3.0

env: tensorflow (1.3.0)

Using the demo data, executing predict.py failed.

(tf_env) [root@patsnap360svr multi-class-text-classification-cnn-rnn]# python predict.py ../trained_results_1506070488/ ../multi-class-text-classification-cnn-rnn-modified/data/small_samples.csv
2017-09-27 13:00:53.275348: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-27 13:00:53.275444: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-27 13:00:53.275473: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-27 13:00:53.275492: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-27 13:00:53.275510: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
WARNING:tensorflow:From predict.py:110: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Restoring parameters from ../trained_results_1506070488/best_model.ckpt
Traceback (most recent call last):
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
return fn(*args)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.6/contextlib.py", line 89, in exit
next(self.gen)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "conv-maxpool-3/b" not found in checkpoint files ../trained_results_1506070488/best_model.ckpt
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "predict.py", line 136, in
predict_unseen_data()
File "predict.py", line 112, in predict_unseen_data
saver.restore(sess, checkpoint_file)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "conv-maxpool-3/b" not found in checkpoint files ../trained_results_1506070488/best_model.ckpt
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]

Caused by op 'save/RestoreV2_1', defined at:
File "predict.py", line 136, in
predict_unseen_data()
File "predict.py", line 110, in predict_unseen_data
saver = tf.train.Saver(tf.all_variables())
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1140, in init
self.build()
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
dtypes=dtypes, name=name)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/root/.virtualenvs/tf_env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Tensor name "conv-maxpool-3/b" not found in checkpoint files ../trained_results_1506070488/best_model.ckpt
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]
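
A NotFoundError at restore time usually means the graph being rebuilt does not match the graph that produced the checkpoint (for example, variables created under different names or scopes after a code change). One hedged way to see what the checkpoint actually contains:

    import tensorflow as tf

    reader = tf.train.NewCheckpointReader('../trained_results_1506070488/best_model.ckpt')
    for name, shape in sorted(reader.get_variable_to_shape_map().items()):
        print(name, shape)   # compare these against the variables the prediction graph creates

If conv-maxpool-3/b is missing from that list, the checkpoint was written by a different version of the model code than the one predict.py is building.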

Tensorboard embedding

Has anyone tried to visualize text embeddings in tensorboard ? Any guidence on how to implement it ?

Thanks
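
Nothing in the repo does this; a minimal sketch with the TF 1.x projector API, assuming embedding_var is the embedding variable in the graph and metadata.tsv lists one vocabulary word per line in row order:

    import tensorflow as tf
    from tensorflow.contrib.tensorboard.plugins import projector

    def link_embedding_to_tensorboard(log_dir, embedding_var, metadata_path):
        config = projector.ProjectorConfig()
        emb = config.embeddings.add()
        emb.tensor_name = embedding_var.name
        emb.metadata_path = metadata_path
        writer = tf.summary.FileWriter(log_dir)
        projector.visualize_embeddings(writer, config)
        # A checkpoint containing embedding_var must also be saved into log_dir
        # (tf.train.Saver().save(sess, ...)) for the projector tab to find the data.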

AttributeError and IndexError

Hi, I'm getting some errors with Tensorflow 1.0

I ran the tensorflow upgrade script. This fixed the argument order error for concat.

However now I get:

AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'.

I understand that rnn_cell was moved to tf.contrib.

If I change rnn_cell to tf.contrib I get the following error:

IndexError: list index out of range

Unsuccessful TensorSliceReader constructor

Hi Jie, I'm getting the error below. Any idea how to rectify it?

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./checkpoints_1517652366/model-0
[[Node: save/RestoreV2_36 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_36/tensor_names, save/RestoreV2_36/shape_and_slices)]]

KeyError: 'RECOVERED VEHICLE'. What is small_samples.csv supposed to contain?

(tensorflow) F:\Postgraduate\KaggleLearning\multi-class-text-classification-cnn-rnn-master\multi-class-text-classification-cnn-rnn-master>python predict.py ./trained_results_1541818386/ ./data2/samples.csv
D:\Anaconda\anaconda\envs\tensorflow\lib\site-packages\gensim\utils.py:1212: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
File "predict.py", line 141, in
predict_unseen_data()
File "predict.py", line 68, in predict_unseen_data
x_, y_, df = load_test_data(test_file, labels)
File "predict.py", line 43, in load_test_data
y_ = df[select[1]].apply(lambda x: label_dict[x]).tolist()
File "D:\Anaconda\anaconda\envs\tensorflow\lib\site-packages\pandas\core\series.py", line 3194, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/src\inference.pyx", line 1472, in pandas.libs.lib.map_infer
File "predict.py", line 43, in
y_ = df[select[1]].apply(lambda x: label_dict[x]).tolist()
KeyError: 'RECOVERED VEHICLE'
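
The KeyError means ./data2/samples.csv contains a Category value ('RECOVERED VEHICLE') that was not in the label set saved at training time, so label_dict has no entry for it. A hedged workaround (a sketch, not the repo's code) is to drop rows with unseen labels before building y_ in load_test_data:

    # sketch of a guard inside predict.py's load_test_data(), before the label mapping
    known = df[select[1]].isin(set(label_dict))
    if not known.all():
        print('Dropping rows with labels unseen during training:',
              sorted(df.loc[~known, select[1]].unique()))
        df = df[known]
    y_ = df[select[1]].apply(lambda x: label_dict[x]).tolist()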

webserver for quick demo

Hey,

Hope you are all well !

Is it possible to have a demo web server that predicts classes from the trained model in the browser, for a single short text?

Cheers,
Richard
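
Nothing like this ships with the repo; a minimal Flask sketch of the idea is below. load_model and classify are hypothetical helpers that would wrap the graph and checkpoint restore logic already in predict.py:

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    model = load_model('./trained_results_1478563595/')   # hypothetical: restores session, graph, labels

    @app.route('/predict', methods=['POST'])
    def predict():
        text = request.get_json(force=True)['description']
        label, prob = classify(model, text)                # hypothetical wrapper around predict.py's logic
        return jsonify({'category': label, 'probability': float(prob)})

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)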

Training using TensorFlow 1.0 fails

I learned that you used TF 0.9 when you built this project. My TF version is 1.0. Some things are different, such as the location of rnn_cell.py. I've changed the source to adjust for the rnn_cell and tf.concat() problems. However, an IndexError: list index out of range exception is thrown when I run the code. Here is the traceback.

    Traceback (most recent call last):
      File "train.py", line 165, in <module>
        train_cnn_rnn()
      File "train.py", line 62, in train_cnn_rnn
        l2_reg_lambda=params['l2_reg_lambda'])
      File "/Users/jiechengwu/Downloads/mctccr/text_cnn_rnn.py", line 60, in __init__
        inputs = [tf.squeeze(input_, [1]) for input_ in tf.split(1, reduced, pooled_concat)]
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1203, in split
        num = size_splits_shape.dims[0]
    IndexError: list index out of range

The exception happens in the tf.split() function. I'm new to NN, please do reply soon.

Thanks.

Training fails

Hi, I'm having this issue when I run training:

python3 train.py ./data/train.csv.zip ./training_config.json

CRITICAL:root:Accuracy on test set: 0.9971641706053186
Traceback (most recent call last):
File "train.py", line 161, in
train_cnn_rnn()
File "train.py", line 151, in train_cnn_rnn
os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints_1486165230/model-2700' -> './trained_results_1486165230/best_model.ckpt'

I'll spend a bit of time tomorrow to see how to fix this problem.

Shape must be of rank 4 but is of rank 3

Hi,

While running train.py I am getting below error:

Shape must be rank 4 but is rank 3 for 'conv-maxpool-3/concat_2' (op: 'ConcatV2') with input shapes: [?,1,300,1], [?,548,1], [?,1,300,1], [].

This is happening in text_cnn_rnn.py at
conv = tf.nn.conv2d(emb_pad, W, strides=[1, 1, 1, 1], padding='VALID', name='conv')

Can anyone provide some help on this?

Can the data only be obtained from the zip?

Hi jiegzhan,
I'm working on using a C-LSTM for classification, but my task has only one label with binary classes, and my data is only in a CSV. How can I use the network on my data?
Thank you!

Training fails with small data set

I only have a small training data set (less than 20k). The training fails to learn, because it never completes a cycle of training.

What parameters could I change in the 'training_config.json' to resolve this issue?
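
A plausible cause, stated as an assumption about train.py's loop rather than a confirmed diagnosis: with fewer than 20k rows, batch_size 256 and num_epochs 1 give well under 100 training steps, so the evaluate_every = 100 checkpointing logic never fires and no best model is ever saved. A hedged set of values to try, keeping the rest of training_config.json unchanged:

    {
      "batch_size": 32,
      "evaluate_every": 25,
      "num_epochs": 20
    }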

Issue With Predictions

I have tried messing around with the code quite a bit but can't figure it out. How do I turn predict.py into more of an API that can be called? I am familiar with setting up APIs, but I have issues restoring the model.
