Comments (7)
The Seq2seq framework includes a ready-made attention model which does the same.
Yes it does have one!
I added a TimeDistributed(Dense()) to fit input_dim, but how do I fit the time_step? My model structure is shown below.
I get this error when I run it:
ValueError: Error when checking target: expected activation_1 to have shape (None, 6, 30) but got array with shape (5, 9, 30)
@Opdoop can you please share the minimal code to reproduce your error?
@philipperemy
The data preparation is:
from functools import reduce  # Python 3: reduce lives in functools

# tokenize() and vectorize_stories() are helpers not shown in the comment;
# a minimal sketch of both is given after this block.
input_text = ['** 的 首都 是 北京'      # "**'s capital is Beijing"
              , '日本 的 首都 是 东京'    # "Japan's capital is Tokyo"
              , '美国 的 首都 是 华盛顿'  # "America's capital is Washington"
              , '英国 的 首都 是 伦敦'    # "Britain's capital is London"
              , '德国 的 首都 是 柏林']   # "Germany's capital is Berlin"
tar_text = ['Beijing is the capital of China'
            , 'Tokyo is the capital of Japan'
            , 'Washington is the capital of the United States'
            , 'London is the capital of England'
            , 'Berlin is the capital of Germany']
input_list = []
tar_list = []
END = ' EOS'
for tmp_input in input_text:
    tmp_input = tmp_input + END
    input_list.append(tokenize(tmp_input))
for tmp_tar in tar_text:
    tmp_tar = tmp_tar + END
    tar_list.append(tokenize(tmp_tar))
vocab = sorted(reduce(lambda x, y: x | y, (set(tmp_list) for tmp_list in input_list + tar_list)))
vocab_size = len(vocab) + 1  # the Keras Embedding layer needs len(vocab) + 1
input_maxlen = max(map(len, input_list))
tar_maxlen = max(map(len, tar_list))
output_dim = vocab_size
hidden_dim = 1000
INPUT_DIM = hidden_dim
TIME_STEPS = input_maxlen
word_to_idx = dict((c, i + 1) for i, c in enumerate(vocab))  # index 0 is reserved for padding
idx_to_word = dict((i + 1, c) for i, c in enumerate(vocab))
inputs_train, tars_train = vectorize_stories(input_list, tar_list, word_to_idx, input_maxlen, tar_maxlen, vocab_size)
# inputs_train shape is (5, 6) and tars_train shape is (5, 9, 30)
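A minimal hypothetical implementation of those two helpers, consistent with the reported shapes ((5, 6) integer inputs, (5, 9, 30) one-hot targets), might look like this; it is not from the thread:

import numpy as np

def tokenize(sent):
    # both corpora are already space-separated, so a whitespace split is enough
    return sent.split()

def vectorize_stories(input_list, tar_list, word_to_idx,
                      input_maxlen, tar_maxlen, vocab_size):
    # inputs: zero-padded word-index sequences -> (n_samples, input_maxlen)
    X = np.zeros((len(input_list), input_maxlen), dtype='int32')
    for i, sent in enumerate(input_list):
        for t, w in enumerate(sent):
            X[i, t] = word_to_idx[w]
    # targets: one-hot vector per time step -> (n_samples, tar_maxlen, vocab_size)
    Y = np.zeros((len(tar_list), tar_maxlen, vocab_size), dtype='float32')
    for i, sent in enumerate(tar_list):
        for t, w in enumerate(sent):
            Y[i, t, word_to_idx[w]] = 1.0
    return X, Y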
The model is:
from keras.layers import Input, Embedding, LSTM, Dense, TimeDistributed, Activation
from keras.models import Model
# attention_3d_block is the attention block from this repository's example code

def model_attention_applied_before_lstm():
    inputs = Input(shape=(input_maxlen,))
    embed = Embedding(input_dim=vocab_size,
                      output_dim=620,
                      input_length=input_maxlen)(inputs)
    attention_mul = LSTM(hidden_dim, return_sequences=True)(embed)
    attention_mul = attention_3d_block(attention_mul)
    attention_mul = LSTM(500, return_sequences=True)(attention_mul)
    output = TimeDistributed(Dense(output_dim, activation='sigmoid'),
                             input_shape=(tar_maxlen, output_dim))(attention_mul)
    # output = TimeDistributed(Dense(output_dim))(output)
    # output = Dense(output_dim, activation='sigmoid')(attention_mul)
    # output = RepeatVector(tar_maxlen)(output)
    # output = Permute((tar_maxlen, output_dim), name='reshapeLayer')(output)
    output = Activation('softmax')(output)
    model = Model(inputs=[inputs], outputs=output)
    return model
And I run the model as:
m = model_attention_applied_before_lstm()
m.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
print(m.summary())
m.fit(inputs_train, tars_train, epochs=10, batch_size=3, validation_split=0.1)
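For context: both LSTMs in this model return sequences over the input's 6 time steps, so the output shape is (None, 6, 30), while the one-hot targets have 9 time steps, hence the ValueError. A minimal sketch (not from the thread) of the classic RepeatVector-based encoder-decoder, which makes the output length equal tar_maxlen, reusing the variables defined above:

from keras.layers import Input, Embedding, LSTM, RepeatVector, TimeDistributed, Dense
from keras.models import Model

def model_fixed_time_steps():
    inputs = Input(shape=(input_maxlen,))
    embed = Embedding(input_dim=vocab_size, output_dim=620,
                      input_length=input_maxlen)(inputs)
    # collapse the source sentence into a single vector ...
    encoded = LSTM(hidden_dim)(embed)             # (batch, hidden_dim)
    # ... and repeat it once per *target* time step
    repeated = RepeatVector(tar_maxlen)(encoded)  # (batch, tar_maxlen, hidden_dim)
    decoded = LSTM(500, return_sequences=True)(repeated)
    output = TimeDistributed(Dense(output_dim, activation='softmax'))(decoded)
    return Model(inputs=inputs, outputs=output)   # output shape: (batch, 9, 30)

The trade-off is that the whole source sentence is squeezed into one fixed vector, which is exactly the bottleneck attention was introduced to remove.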
In that case, what you need is sequence-to-sequence attention. This project does not support that. Have a look at these resources (a rough sketch of the idea follows the links):
- http://distill.pub/2016/augmented-rnns/
- https://www.tensorflow.org/tutorials/seq2seq
- https://github.com/harvardnlp/seq2seq-attn
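For illustration, a minimal sketch of Luong-style (dot-product) sequence-to-sequence attention with teacher forcing, assuming the Keras 2.x functional API and the variables defined above; the wiring is a generic textbook pattern, not this repository's API:

from keras.layers import Input, Embedding, LSTM, Dense, Activation, dot, concatenate
from keras.models import Model

units = 128

enc_in = Input(shape=(input_maxlen,))   # source token ids
dec_in = Input(shape=(tar_maxlen,))     # target ids shifted right (teacher forcing)
enc_emb = Embedding(vocab_size, units)(enc_in)
enc_seq, h, c = LSTM(units, return_sequences=True, return_state=True)(enc_emb)
dec_emb = Embedding(vocab_size, units)(dec_in)
dec_seq = LSTM(units, return_sequences=True)(dec_emb, initial_state=[h, c])
# score every decoder step against every encoder step, normalise over encoder steps
scores = dot([dec_seq, enc_seq], axes=[2, 2])   # (batch, tar_maxlen, input_maxlen)
weights = Activation('softmax')(scores)
context = dot([weights, enc_seq], axes=[2, 1])  # (batch, tar_maxlen, units)
combined = concatenate([context, dec_seq])
outputs = Dense(vocab_size, activation='softmax')(combined)
model = Model(inputs=[enc_in, dec_in], outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')

Note the output now has tar_maxlen time steps, matching the (5, 9, 30) targets, because the decoder (not the encoder) sets the output length.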
Ooooooooh. Thanks a lot.
Do you know of any sequence-to-sequence attention implementation in Keras?
I'm not sure about this. There is a project that is the most famous seq2seq implementation in Keras; maybe there's an attention mechanism in there. To be checked.