
Comments (7)

coopie commented on August 12, 2024

You need to set state_is_tuple to False, so that the cell state and hidden state are concatenated into a single tensor.

i.e.:

# Create RNN cells using the TensorFlow RNN library
char_cell = td.ScopedLayer(
    tf.contrib.rnn.BasicLSTMCell(num_units=16, state_is_tuple=False),
    'char_cell'
)
word_cell = td.ScopedLayer(
    tf.contrib.rnn.BasicLSTMCell(num_units=32, state_is_tuple=False),
    'word_cell'
)

The example should work. I'll leave it up to you to figure out what the output of the eval actually represents :)
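
As a rough illustration of what that flag changes, here is a plain-Python sketch (no TensorFlow needed). StateTuple and final_state are hypothetical stand-ins, not TensorFlow APIs; the c-then-h ordering mirrors the field order of LSTMStateTuple.

```python
from collections import namedtuple

# Hypothetical stand-in that mirrors the (c, h) field order of LSTMStateTuple.
StateTuple = namedtuple("StateTuple", ["c", "h"])


def final_state(c, h, state_is_tuple):
    """Package cell state c and hidden state h the way each setting does."""
    if state_is_tuple:
        return StateTuple(c=c, h=h)
    # One flat vector: cell state followed by hidden state.
    return list(c) + list(h)


num_units = 16
c = [0.0] * num_units  # dummy cell state
h = [1.0] * num_units  # dummy hidden state

flat = final_state(c, h, state_is_tuple=False)
pair = final_state(c, h, state_is_tuple=True)

print(len(flat))    # width of the concatenated state
print(len(pair.h))  # width of the hidden state alone
```

With num_units=16 the concatenated state is 32 wide, which is the vector the word-level cell would receive per word in this setup.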

from fold.

benjaminderei commented on August 12, 2024


pklfz commented on August 12, 2024

According to section 3.3 on hierarchical LSTMs, word_lstm takes the hidden outputs of char_lstm as its inputs. So I think the correct way to construct word_lstm when char_cell uses state_is_tuple=True (the default) is as follows:

import tensorflow as tf
import tensorflow_fold as td

# Create RNN cells using the TensorFlow RNN library
char_cell = td.ScopedLayer(tf.contrib.rnn.BasicLSTMCell(num_units=16), 'char_cell')
word_cell = td.ScopedLayer(tf.contrib.rnn.BasicLSTMCell(num_units=32), 'word_cell')

# character LSTM converts a string to a word vector
char_lstm = (td.InputTransform(lambda s: [ord(c) for c in s]) >>
             td.Map(td.Scalar('int32') >>
                    td.Function(td.Embedding(128, 8))) >>
             td.RNN(char_cell))
# word LSTM converts a sequence of word vectors to a sentence vector.
word_lstm = td.Map(char_lstm >> td.GetItem(1) >> td.GetItem(1)) >> td.RNN(word_cell)

sess = tf.InteractiveSession()

word_lstm.eval(["bon","ben"])

I added one more >> td.GetItem(1) after the existing >> td.GetItem(1), so that word_lstm receives only the hidden output of each char_lstm instead of its whole (cell_state, hidden_output) tuple.
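
To make the double indexing concrete, here is a plain-Python sketch under the assumption (implied by this thread) that the RNN block yields a pair of (per-step outputs, final state) and that the final state is a (c, h) tuple. get_item, compose, StateTuple, and the mocked rnn_result are hypothetical stand-ins, not part of Fold.

```python
from collections import namedtuple

# Hypothetical stand-in for LSTMStateTuple (fields c, then h).
StateTuple = namedtuple("StateTuple", ["c", "h"])


def get_item(i):
    """Plain-Python analogue of td.GetItem(i): select element i."""
    return lambda value: value[i]


def compose(*fns):
    """Apply functions left to right, similar to chaining blocks with >>."""
    def run(value):
        for fn in fns:
            value = fn(value)
        return value
    return run


# Mocked RNN result: (outputs for each character, final state).
rnn_result = (
    ["out_b", "out_o", "out_n"],
    StateTuple(c="cell_state", h="hidden_output"),
)

one_get = compose(get_item(1))(rnn_result)                # whole (c, h) tuple
two_gets = compose(get_item(1), get_item(1))(rnn_result)  # hidden output only

print(one_get)
print(two_gets)
```

The first selection picks the final state out of the RNN result; the second picks h out of that state, which is what the word-level cell should consume.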

Additionally, test_hierarchical_rnn in blocks_test.py also needs to be revised.

Hope this helps!


benjaminderei commented on August 12, 2024

Will try :)

But are you sure? If I remember correctly, char_lstm_output[1] is just 32 floats long...
Anyway, I will check!

Yeah, you're right, but that is with state_is_tuple=True! I'm dumb 8~|
:)


pklfz commented on August 12, 2024

@benjaminderei
No. char_lstm_output[1] is an instance of LSTMStateTuple if you set state_is_tuple=True; otherwise, it is a NumPy array of 32 floats.
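
If code has to cope with both forms, one option is a small helper like the sketch below. It is plain Python with lists standing in for arrays; hidden_from_state and StateTuple are hypothetical names, and the slice assumes the concatenated layout is cell state first, hidden state second.

```python
from collections import namedtuple

# Hypothetical stand-in for LSTMStateTuple (fields c, then h).
StateTuple = namedtuple("StateTuple", ["c", "h"])


def hidden_from_state(state, num_units):
    """Return the hidden part of a state in either representation."""
    # Check for the tuple form first, since a (c, h) pair also has a length.
    if isinstance(state, StateTuple):
        return list(state.h)
    if len(state) != 2 * num_units:
        raise ValueError("expected a concatenated state of width 2 * num_units")
    # Assumed layout: [c (num_units values), h (num_units values)].
    return list(state[num_units:])


num_units = 16
c = [0.1] * num_units
h = [0.2] * num_units

from_tuple = hidden_from_state(StateTuple(c=c, h=h), num_units)
from_flat = hidden_from_state(c + h, num_units)

print(from_tuple == from_flat)
```

Both calls recover the same 16 hidden values, whether the state arrives as a tuple or as a 32-wide vector.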


benjaminderei commented on August 12, 2024

I have edited my previous post :)


pklfz commented on August 12, 2024

@benjaminderei glad to help you.

