char-cnn-text-classification-tensorflow's Introduction

char-cnn-text-classification-tensorflow

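This repository records the model architecture and a reimplementation of the paper "Character-level Convolutional Networks for Text Classification". The paper had only just come out in April 2016; before it, the same authors published "Text Understanding from Scratch", and the two papers are essentially the same. See also my blog post: http://blog.csdn.net/liuchonge/article/details/70947995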

char-cnn-text-classification-tensorflow's People

Contributors

lc222

char-cnn-text-classification-tensorflow's Issues

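About the test files

Could you upload the code used for testing?
One more question: the paper has 9 layers in total, but in the code I only see 7 layers, counting the convolutional and fully connected ones?
I'm new to deep learning, so any guidance would be appreciated!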
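For reference on the layer count: the paper describes six convolutional layers followed by three fully connected layers, nine in total. The sketch below is only an illustration of that count and of how the sequence length shrinks through the convolutional stack. It assumes the kernel and pooling sizes published in the paper and an input length of 1014 characters; none of these values are read from this repository's `config`, and the names `CONV_LAYERS`, `NUM_FC_LAYERS` and `feature_length` are made up for the example.

```python
# (kernel_size, pool_size or None) for each convolutional layer, as published
# in the paper. Assumption: this repository's config may differ.
CONV_LAYERS = [(7, 3), (7, 3), (3, None), (3, None), (3, None), (3, 3)]
NUM_FC_LAYERS = 3  # two hidden fully connected layers plus the output layer


def feature_length(l0, conv_layers=CONV_LAYERS):
    """Length of the sequence axis after all convolution and pooling layers."""
    length = l0
    for kernel, pool in conv_layers:
        length = length - kernel + 1  # 'valid' convolution with stride 1
        if pool is not None:
            length //= pool  # non-overlapping max pooling
    return length


total_layers = len(CONV_LAYERS) + NUM_FC_LAYERS
print(total_layers, feature_length(1014))  # prints: 9 34
```

If the code in this repository really builds only seven layers, comparing its `conv_layers` and `fc_layers` settings against a list like the one above would show which layers were left out.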

Error: Argument must be a dense tensor:

ValueError: Tried to convert 'indices' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 69) - got shape [69], but wanted [].
Has the author run into this kind of problem?
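A possible explanation, offered as a guess rather than a confirmed diagnosis: under Python 3, `range(0, 69)` is a lazy `range` object rather than a list, and older TensorFlow 1.x releases could not convert it to a tensor. Materializing it with `list(...)` before passing it to the op usually avoids this message. The call site shown in the comment is an assumption, since the page does not show which line raised the error.

```python
# In Python 3, range() returns a lazy sequence object, not a list.
indices = range(0, 69)

# Hypothetical fix: build a real list before handing it to a TensorFlow 1.x op.
indices_list = list(indices)

# Assumed call site (not shown on this page), for illustration only:
#   tf.one_hot(indices_list, 69, 1.0, 0.0)
print(type(indices).__name__, type(indices_list).__name__, len(indices_list))
```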

Sudden increase of loss (from 0.1 to thousands) after ~4000 steps

Accuracy suddenly dropped from ~90% to below 20% after about 4000 training steps. An answer on Stack Overflow says this is a problem with Adam and suggests a higher epsilon as a fix.

Do you have any suggestions based on your training experience? Thanks!

My Stats:

...
2018-03-21T13:56:22.703288: step 3866, loss 0.43315, acc 0.90625
2018-03-21T13:56:34.078052: step 3867, loss 0.479325, acc 0.851562
2018-03-21T13:56:49.068364: step 3868, loss 0.256937, acc 0.9375
2018-03-21T13:57:06.444321: step 3869, loss 0.273636, acc 0.90625
2018-03-21T13:57:15.877391: step 3870, loss 0.232426, acc 0.929688
2018-03-21T13:57:23.503175: step 3871, loss 3.60959, acc 0.5625
2018-03-21T13:57:31.559549: step 3872, loss 8.54077, acc 0.46875
2018-03-21T13:57:41.647740: step 3873, loss 5.25972, acc 0.429688
2018-03-21T13:57:51.344488: step 3874, loss 322.297, acc 0.234375
2018-03-21T13:58:01.384422: step 3875, loss 21.8541, acc 0.453125
2018-03-21T13:58:10.741925: step 3876, loss 131.247, acc 0.242188
2018-03-21T13:58:19.820908: step 3877, loss 4977.22, acc 0.289062
2018-03-21T13:58:29.067122: step 3878, loss 155.665, acc 0.289062
2018-03-21T13:58:36.979536: step 3879, loss 1825.98, acc 0.226562
2018-03-21T13:58:44.365748: step 3880, loss 1705.01, acc 0.304688
2018-03-21T13:58:53.205122: step 3881, loss 1100.61, acc 0.25
2018-03-21T13:59:01.279594: step 3882, loss 12413.7, acc 0.234375
2018-03-21T13:59:10.017025: step 3883, loss 23248.9, acc 0.273438
...
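To illustrate why a larger epsilon is often suggested, here is a minimal sketch of the textbook Adam update in plain Python. It is not this repository's training code, and the gradient values are invented for the example. When gradients are tiny, the default epsilon of 1e-8 leaves the step close to the full learning rate, so near-zero noisy gradients can still move the weights by a full step, while a larger epsilon damps those steps.

```python
import math


def adam_update(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Size of the Adam step after seeing the gradients in `grads` in order."""
    m = v = 0.0
    t = 0
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    return lr * m_hat / (math.sqrt(v_hat) + eps)


tiny_grads = [1e-7] * 100  # made-up gradients, far smaller than the weights
step_default = adam_update(tiny_grads, eps=1e-8)
step_damped = adam_update(tiny_grads, eps=1e-3)
print(step_default / 1e-3, step_damped / 1e-3)  # step as a fraction of lr
```

In TensorFlow 1.x the corresponding knob is the `epsilon` argument of `tf.train.AdamOptimizer`. Lowering the learning rate or clipping gradients are other common things to try; which of these helps here is not something this thread confirms.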

accuracy

I want to restore the model and predict a single sentence. How should I do that?

Error: FailedPreconditionError (see above for traceback): Attempting to use uninitialized value conv_layer-1/w
[[Node: conv_layer-1/w/read = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]]
[[Node: output_layer/predictions/_5 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_157_output_layer/predictions", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]

Prediction code:

# coding=utf-8

import tensorflow as tf
from NN_Real_examples.char_CNN_text_classification_tf.data_helper import Dataset
import time
import os
from tensorflow.python import debug as tf_debug
from NN_Real_examples.char_CNN_text_classification_tf.charCNN import CharCNN
import datetime
from NN_Real_examples.char_CNN_text_classification_tf.config import config
import numpy as np

# Load data

print("Loading data...")
dev_data = Dataset(config.dev_data_source)

print("Got the 120000-dimensional doc_train, label_train")
print("Got the 9600-dimensional doc_dev, label_train")

with tf.Graph().as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=True,
        log_device_placement=False)
    sess = tf.Session(config=session_conf)
    # sess = tf_debug.LocalCLIDebugWrapperSession(sess)
    with sess.as_default():
        cnn = CharCNN(
            l0=config.l0,
            num_classes=config.nums_classes,
            conv_layers=config.model.conv_layers,
            fc_layers=config.model.fc_layers,
            l2_reg_lambda=0)

        saver = tf.train.Saver()
        module_file = tf.train.latest_checkpoint(
            'D:/workspace/myPython/NN_Real_examples/char_CNN_text_classification_tf/runs/1530092351/checkpoints/model-3000.data-00000-of-00001')  # look up the ckpt path
        sess = tf.Session()
        if module_file is not None:  # check that a ckpt path was found
            saver.restore(sess, save_path=module_file)  # load the saved model

        def dev_step(x_batch, y_batch):
            feed_dict = {
              cnn.input_x: x_batch,
              cnn.input_y: y_batch,
              cnn.dropout_keep_prob: 1.0
            }
            predictions = sess.run(
                [cnn.predictions],
                feed_dict)
            return predictions

        try:
            text = "Wall St. Bears Claw Back Into the Black (Reuters).Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again."
            embedding_w, embedding_dic = dev_data.onehot_dic_build()
            doc_vec = dev_data.doc_process(text, embedding_dic)
            doc_image = []
            label_image = []
            label_class = np.zeros(4, dtype='float32')
            label_class[0] = 1
            doc_image.append(doc_vec)
            label_image.append(label_class)
            batch_x = np.array(doc_image, dtype='int64')
            batch_y = np.array(label_image, dtype='float32')
            predictions = dev_step(batch_x, batch_y)
            label_pred = predictions[0].tolist()  # convert to a list
            print(label_pred)
        except EOFError:
            print("Closing session.")
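One likely cause, offered as a guess rather than a confirmed diagnosis: `tf.train.latest_checkpoint` expects the checkpoint directory, so when it is given a `.data-00000-of-00001` file it returns `None`, the `if` skips `saver.restore`, and the variables are never initialized. `saver.restore` itself expects the checkpoint prefix (for example `.../model-3000`), not one of the shard files. The sketch below derives both values from a shard path; `checkpoint_prefix` is a hypothetical helper, and the path is the tail of the one in the post.

```python
import os
import re


def checkpoint_prefix(path):
    """Strip a TF 1.x checkpoint file suffix (.data-XXXXX-of-XXXXX, .index, .meta)."""
    return re.sub(r"\.(data-\d{5}-of-\d{5}|index|meta)$", "", path)


data_file = "runs/1530092351/checkpoints/model-3000.data-00000-of-00001"
prefix = checkpoint_prefix(data_file)    # what saver.restore() expects
checkpoint_dir = os.path.dirname(prefix)  # what latest_checkpoint() expects

print(prefix)
print(checkpoint_dir)

# With TensorFlow 1.x, either of these would then be a reasonable restore call:
#   saver.restore(sess, prefix)
#   saver.restore(sess, tf.train.latest_checkpoint(checkpoint_dir))
```

It is also worth restoring into the session that later runs `cnn.predictions`, and raising an error instead of silently continuing when no checkpoint is found.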
