image-captioning-chinese's Issues

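Image caption results contain several overlapping words of the same kind

Hello, I am using the COCO dataset
with a two-layer LSTM model: one layer implements top-down attention and the other implements the language model.

I use jieba to segment the captions into words.
I take every word that occurs more than 3 times across all image captions as the dictionary file, which gives 14226 words in total.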
words = [w for w in word_freq.keys() if word_freq[w] > 3]
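
For readers who want to reproduce the vocabulary step, here is a minimal sketch of the frequency filter described above. It assumes the captions are already segmented into token lists; `build_vocab`, `tokenized_captions`, and `min_count` are hypothetical names, not taken from the repository.

```python
from collections import Counter


def build_vocab(tokenized_captions, min_count=3):
    """Return the words that occur more than `min_count` times (hypothetical helper)."""
    word_freq = Counter()
    for tokens in tokenized_captions:
        word_freq.update(tokens)
    # Same filter as in the post: keep words whose frequency is strictly above the threshold.
    return sorted(w for w, count in word_freq.items() if count > min_count)


# Toy data for illustration only; the real input would be the segmented COCO captions.
captions = [["一个", "女孩"], ["一个", "女孩"], ["一个", "电脑"], ["一个", "女孩"], ["女孩"]]
vocab = build_vocab(captions, min_count=3)
```

With the toy data, only the two words that occur four times pass the filter; the word that occurs once is dropped.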

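After training the model, the captions it generates contain several words of the same kind in a row, for example:
放在 床上 的 笔记 笔记本 笔记本电脑 电脑 (roughly: "placed / on the bed / [particle] / notes / notebook / laptop / computer")
一个 小女 小女孩 女孩 站 在 一起 (roughly: "a / little-gi / little girl / girl / stand / at / together")

How should I solve this problem?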
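
One possible workaround, sketched here under assumptions, is to post-process the generated token list and drop any token that is strictly contained in a neighbouring token. This is not the repository's method, and `drop_overlapping_tokens` is a hypothetical helper. The overlapping tokens in the examples resemble what an overlapping segmentation of the training captions would produce, so checking how the captions were segmented may be the more fundamental fix, but that is a guess.

```python
def drop_overlapping_tokens(tokens):
    """Drop tokens that are a strict substring of the token directly before or after them."""
    kept = []
    for i, tok in enumerate(tokens):
        # Compare against the original neighbours, not the already filtered list.
        neighbors = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
        if any(tok != n and tok in n for n in neighbors):
            continue
        kept.append(tok)
    return kept


caption_1 = drop_overlapping_tokens("放在 床上 的 笔记 笔记本 笔记本电脑 电脑".split())
caption_2 = drop_overlapping_tokens("一个 小女 小女孩 女孩 站 在 一起".split())
```

This heuristic can remove legitimate words when a short word happens to sit next to a longer word that contains it, so it is only a stopgap.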

dataset

I can't download the dataset; the link won't open.
Is there another way to download it?

lstm_attention.py training problem

Hello, when I run your code, execution reaches
out, attn, alpha = tf.split(lstm_net.outputs, [n_hidden, d_local, a_local ** 2], axis=2)
and fails at this step with: tensorflow.python.framework.errors_impl.InvalidArgumentError: Sum of output sizes must match the size of the original Tensor along the split dimension or the sum of the positive sizes must be less if it contains a -1 for 'split_1' (op: 'SplitV') with input shapes: [?,?,512], [3], [] and with computed input tensors: input[1] = <512 512 49>, input[2] = <2>.
My understanding is that lstm_net.outputs is a tensor of shape [?,?,512], which cannot be split into out, attn, and alpha of sizes 512, 512, and 49.
I don't know whether this is a problem in the source code or with my TensorFlow version.
Does lstm_net in the original network output [?,?,1073] or [?,?,512]?
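
The arithmetic behind the error can be checked without TensorFlow. The sketch below only verifies the size constraint quoted in the message: the split sizes must sum to the length of the split axis. `check_split` is a hypothetical helper, and `a_local = 7` is inferred from the reported size 49, not taken from the repository.

```python
def check_split(axis_len, sizes):
    """Return (ok, total): whether the split sizes sum to the length of the split axis."""
    total = sum(sizes)
    return total == axis_len, total


# Values from the error message: sizes <512 512 49> along axis 2.
n_hidden, d_local, a_local = 512, 512, 7  # a_local is an inference: 7 ** 2 == 49
sizes = [n_hidden, d_local, a_local ** 2]

ok_512, total = check_split(512, sizes)   # shape reported in the error
ok_1073, _ = check_split(1073, sizes)     # shape the split would need
print(total, ok_512, ok_1073)
```

The sizes sum to 1073, so the split can only succeed if the last dimension of lstm_net.outputs is 1073 rather than 512.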

lstm_attention.py run error

lstm.py runs successfully.
There is an error when running lstm_attention.py at this line: train_op = tf.train.AdamOptimizer().minimize(loss, var_list=train_vars)
Error: Shapes (?, ?, 1073) and (?, ?, 512) are not compatible
My configuration: Python 3.5 + TensorFlow 1.1 + GPU
