densenet-nlp's Introduction

Very Deep Convolutional Networks for Natural Language Processing in TensorFlow

This is the TensorFlow DenseNet implementation for the paper Do Convolutional Networks need to be Deep for Text Classification ?. In the paper we study the importance of depth in convolutional models for text classification, with either character or word inputs. We show on 5 standard text classification and sentiment analysis tasks that deep models do perform better than shallow networks when the text input is represented as a sequence of characters. However, a simple shallow-and-wide network outperforms deep models such as DenseNet with word inputs. Our shallow word model further establishes new state-of-the-art results on two datasets: Yelp Binary (95.9%) and Yelp Full (64.9%).

Paper:

Hoa T. Le, Christophe Cerisara, Alexandre Denis. Do Convolutional Networks need to be Deep for Text Classification ?. Association for the Advancement of Artificial Intelligence 2018 (AAAI-18) Workshop on Affective Content Analysis. (https://arxiv.org/abs/1707.04108)

@article{DBLP:journals/corr/LeCD17,
  author    = {Hoa T. Le and
               Christophe Cerisara and
               Alexandre Denis},               
  title     = {Do Convolutional Networks need to be Deep for Text Classification ?},  
  journal   = {CoRR},  
  year      = {2017}  
}

Results:

Reference source code: https://github.com/dennybritz/cnn-text-classification-tf

densenet-nlp's People

Contributors

lethienhoa

densenet-nlp's Issues

Adding dimensions error: Dimensions must be equal, but are 64 and 128 for 'res_unit_1_0/add' (op: 'Add') with input shapes: [?,1,129,64], [?,1,129,128]

Hi, I'm just running train.py, but it raises an error when calling cnn = VDCNN():

Parameters:
ALLOW_SOFT_PLACEMENT=True
BATCH_SIZE=128
CHECKPOINT_EVERY=1000
DROPOUT_KEEP_PROB=0.5
EVALUATE_EVERY=5000
L2_REG_LAMBDA=0.0
LOG_DEVICE_PLACEMENT=False
NUM_EPOCHS=50

Loading data...
Non-neutral instances processed: 10000
Non-neutral instances processed: 20000
Non-neutral instances processed: 30000
Non-neutral instances processed: 40000
Non-neutral instances processed: 50000
Non-neutral instances processed: 60000
Non-neutral instances processed: 70000
Non-neutral instances processed: 80000
Non-neutral instances processed: 90000
Non-neutral instances processed: 100000
Non-neutral instances processed: 110000
Non-neutral instances processed: 120000
Non-neutral instances processed: 130000
Non-neutral instances processed: 140000
Non-neutral instances processed: 150000
Non-neutral instances processed: 160000
Non-neutral instances processed: 170000
Non-neutral instances processed: 180000
Non-neutral instances processed: 190000
Loading done
x_char_seq_ind=(194544,)
y shape=(194544, 2)
Train/Dev split: 0/194544
2017-11-20 17:48:22.932910: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 129, 64)
(?, 1, 129, 128)
(?, 1, 129, 128)


Traceback (most recent call last):
  File "train.py", line 64, in <module>
    cnn = VDCNN()
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/model.py", line 59, in __init__
    h = resUnit(h, num_filters_per_size[i], cnn_filter_size, i, j)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/model.py", line 41, in resUnit
    output = input_layer + part6
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 894, in binary_op_wrapper
    return func(x, y, name=name)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 183, in add
    "Add", x=x, y=y, name=name)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
    set_shapes_for_outputs(ret)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 64 and 128 for 'res_unit_1_0/add' (op: 'Add') with input shapes: [?,1,129,64], [?,1,129,128].

Is there any way I can get rid of this? Is it a TensorFlow version issue or something else?
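
One possible reading of the error, offered as a sketch rather than as the repository's actual fix: the shortcut entering res_unit_1_0 (input_layer) still has 64 channels, while the convolutions inside the unit (part6) already produce 128, so the element-wise add cannot line the two tensors up. Two common remedies in residual networks are a 1x1 projection on the shortcut or zero-padding its channel axis. The plain-Python example below illustrates the zero-padding variant on a single [width][channels] feature map. The function names pad_shortcut_channels and residual_add are hypothetical and do not exist in model.py.

```python
def pad_shortcut_channels(shortcut, out_channels):
    """Zero-pad the channel axis of a [width][channels] feature map."""
    padded = []
    for position in shortcut:
        missing = out_channels - len(position)
        if missing < 0:
            raise ValueError("shortcut has more channels than the block output")
        # Append zeros so every position ends up with out_channels entries.
        padded.append(list(position) + [0.0] * missing)
    return padded


def residual_add(shortcut, block_output):
    """Add a shortcut to a block output after matching channel counts."""
    out_channels = len(block_output[0])
    shortcut = pad_shortcut_channels(shortcut, out_channels)
    return [
        [s + b for s, b in zip(s_pos, b_pos)]
        for s_pos, b_pos in zip(shortcut, block_output)
    ]


# Toy example: 2 positions, shortcut with 2 channels, block output with 4.
shortcut = [[1.0, 2.0], [3.0, 4.0]]
block_output = [[10.0, 10.0, 10.0, 10.0], [20.0, 20.0, 20.0, 20.0]]
print(residual_add(shortcut, block_output))
```

The padded channels pass the block output through unchanged, which keeps the shortcut parameter-free. A 1x1 convolution on the shortcut would be the learned alternative.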

Code error

Hi, thank you for your work.
There is a code error in model.py on line 29:
b = tf.Variable(tf.constant(0.1, shape=[num_filters_per_size]), name="b")
The shape is wrong; it should be:
b = tf.Variable(tf.constant(0.1, shape=[num_filters_per_size[0]]), name="b")

k-max pooling

The VDCNN paper's k-max pooling finds the top k values while preserving the order in which they appear in the original sequence. I don't think tensorflow's tf.nn.top_k does that.

This is what I get when I use tf.nn.top_k on a simple example:

import tensorflow as tf

def k_max_pool(value, k):
    sorted_tensor = tf.nn.top_k(value, k, sorted=True)
    unsorted_tensor = tf.nn.top_k(value, k, sorted=False)
    return sorted_tensor, unsorted_tensor

x = [[1,3,2],[4,5,6],[9,8,7]]
print 'input', x

sess = tf.Session()
X = tf.placeholder("int32")
sorted_tensor, unsorted_tensor = k_max_pool(X, 2)
sorted_value, unsorted_value = sess.run([sorted_tensor, unsorted_tensor], feed_dict={X: x})
print 'sorted', sorted_value
print 'unsorted', unsorted_value

Output:

input [[1, 3, 2], [4, 5, 6], [9, 8, 7]]
sorted TopKV2(values=array([[3, 2],
       [6, 5],
       [9, 8]], dtype=int32), indices=array([[1, 2],
       [2, 1],
       [0, 1]], dtype=int32))
unsorted TopKV2(values=array([[2, 3],
       [5, 6],
       [8, 9]], dtype=int32), indices=array([[2, 1],
       [1, 2],
       [1, 0]], dtype=int32))
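
For comparison, an order-preserving k-max pooling can be sketched in plain Python as follows. This is a minimal illustration of the behaviour described in the VDCNN paper, not code from this repository, and the name k_max_pool_ordered is hypothetical. It picks the positions of the k largest values in each row, then sorts those positions so that the selected values come out in their original sequence order.

```python
def k_max_pool_ordered(rows, k):
    """Keep the k largest values of each row in their original order."""
    pooled = []
    for row in rows:
        # Positions of the k largest values; ties go to the earliest position.
        top_idx = sorted(range(len(row)), key=lambda i: (-row[i], i))[:k]
        # Sorting the positions restores the original sequence order.
        pooled.append([row[i] for i in sorted(top_idx)])
    return pooled


x = [[1, 3, 2], [4, 5, 6], [9, 8, 7]]
print(k_max_pool_ordered(x, 2))
```

On the input above this yields [[3, 2], [5, 6], [9, 8]], which differs from both tf.nn.top_k outputs shown. In TensorFlow, the same effect could presumably be obtained by sorting the indices returned by tf.nn.top_k and gathering the values at those positions.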

Code error

Hi, I've noticed that the output dimension of Conv Block 64 is 1 x 72 x 64. How can you convolve it with a filter of shape 1 x 3 x 128 x 128? The dimensions are not equal!
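
The mismatch described here can be checked mechanically. The sketch below assumes TensorFlow's default layouts for a 2-D convolution, that is, an input of shape [batch, height, width, in_channels] and a filter of shape [filter_height, filter_width, in_channels, out_channels], and it assumes the filter in question expects 128 input channels. The helper conv_channels_match is hypothetical and is not part of model.py.

```python
def conv_channels_match(input_shape, filter_shape):
    """Return True if an NHWC input can be convolved with an HWIO filter."""
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError("expected 4-D input and filter shapes")
    input_channels = input_shape[3]       # last axis of the NHWC input
    filter_in_channels = filter_shape[2]  # third axis of the HWIO filter
    return input_channels == filter_in_channels


# Output of the 64-filter block as reported in this issue (batch unknown).
block_output = (None, 1, 72, 64)
print(conv_channels_match(block_output, (1, 3, 128, 128)))  # False
print(conv_channels_match(block_output, (1, 3, 64, 128)))   # True
```

Under these assumptions, the first convolution of the 128-filter block would need a filter with 64 input channels and 128 output channels; only the later convolutions in that block would take 128 channels in.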
