
deep_learning_nlp's People

Contributors

tixierae


deep_learning_nlp's Issues

han_my_functions.py --> TypeError: float() argument must be a string or a number, not 'NoneType'

Hi,

First let me thank you for the detailed and really well explained HAN example! I was looking for days for such a source to get up and running with attention visualisation in NLP.

I have prepared my data as in the description, and everything runs smoothly until I get to training with han.fit_generator(...), which stops and throws:

[Screenshot from 2021-01-26 showing the traceback ending in: TypeError: float() argument must be a string or a number, not 'NoneType']

I've noticed that it has something to do with the metrics, but I couldn't figure out what to try next.
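The error class itself is reproducible in plain Python, with no Keras involved: float() refuses None, which suggests a metric value (or something coming out of the generator) is None before Keras tries to cast it. A minimal reproduction:

```python
# Minimal reproduction (plain Python, no Keras): metric values are cast
# with float(), so a None slipping through a custom metric or the data
# generator produces exactly this class of error.
try:
    float(None)
except TypeError as e:
    print(e)  # e.g. "float() argument must be a string or a number, not 'NoneType'"
```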

By the way, is there a specific version of Keras and TensorFlow I should use to run this example? I'm currently on TensorFlow 2.4.1 and Keras 2.4.3 (probably the latest of both).

Thanks!!

Getting Attention coefficients with saved models

Dear Antoine,

I want to thank you for your great NLP GitHub repository, which is always a fount of inspiration. I'm working on text classification; in the past I used the 1D CNN, and now the HAN, both of which you explain so clearly in your notebooks.

I have a question about getting the attention coefficients in order to show the "important" words and sentences. I've been able to get and display them by putting the code in the same script that does the training (as you did in your notebook, where you show them in the same notebook that creates and fits the model).

But I want to write a "Predictor" class that loads the saved model, does the prediction, and shows the attention. For the attention coefficients I need:

get_sent_att_coeffs = Model(sent_ints,sent_att_coeffs) # coeffs over the words in a sentence
get_doc_attention_coeffs = Model(doc_ints,doc_att_coeffs) # coeffs over the sentences in a document

and Python of course complains that sent_ints, sent_att_coeffs, etc. are not defined if I don't put the whole definition of the two models (sentence encoder and document encoder) in the class. I didn't want to rewrite the whole definition there (which worked, but is a quick-and-dirty solution); I want instead to load the models from files.

I then tried this:

    # reload the sentence encoder (architecture + weights)
    json_file = open('sentencoder_newsgroups_model.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    sent_encoder = model_from_json(loaded_model_json,
                                   custom_objects={'AttentionWithContext': AttentionWithContext})
    sent_encoder.load_weights('sentencoder_newsgroups_weights.h5')

    # reload the full HAN (architecture + weights)
    json_file = open('han_newsgroups_model.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    han = model_from_json(loaded_model_json,
                          custom_objects={'AttentionWithContext': AttentionWithContext})
    han.load_weights('han_newsgroups_weights.h5')

    reshaped_sentences = self._get_sequence()
    reshaped_sentences_tensor = _to_tensor(reshaped_sentences, dtype='float32')

    # coefficients over the words in a sentence
    get_sent_att_coeffs = Model(sent_encoder.input,
                                sent_encoder.get_layer('attention_with_context_1').output[1])

    # coefficients over the sentences in a document
    get_doc_attention_coeffs = Model(han.input,
                                     han.get_layer('attention_with_context_2').output[1])

but at get_sent_att_coeffs I get an error:

ValueError: Output tensors to a Model must be the output of a Keras Layer (thus holding past layer metadata). Found: Tensor("strided_slice_1:0", shape=(200,), dtype=float32)

Then I found that if I print sent_encoder.summary() and han.summary() right after the model definition, the attention layers (correctly) have 2 outputs, whereas if I print the same summaries after loading the models from file, the second output, which gives me the coefficients, is gone. By the way, I think there is an error in compute_output_shape in the AttentionWithContext class: the dimensions of the coefficients are wrong. They should be:

def compute_output_shape(self, input_shape):
    if self.return_coefficients:
        return [(input_shape[0], input_shape[-1]), (input_shape[0], input_shape[1], 1)]

instead of:

def compute_output_shape(self, input_shape):
    if self.return_coefficients:
        return [(input_shape[0], input_shape[-1]), (input_shape[0], input_shape[-1], 1)]

i.e. the second dimension of the coefficients should be the number of steps, input_shape[1] (the number of words or sentences: there is one coefficient per word or sentence), and not input_shape[-1] (the number of features).
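To make the shape claim concrete, here is a standalone toy check (plain Python, no Keras; the example shapes are made up) of the corrected logic:

```python
def compute_output_shape(input_shape, return_coefficients=True):
    # input_shape = (batch, steps, features); one coefficient per step
    if return_coefficients:
        return [(input_shape[0], input_shape[-1]),   # attended vector
                (input_shape[0], input_shape[1], 1)] # coefficients
    return (input_shape[0], input_shape[-1])

# e.g. a batch of 32 sentences of 30 words with 200 features each:
print(compute_output_shape((32, 30, 200)))
# [(32, 200), (32, 30, 1)]  -- 30 coefficients, one per word
```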

Anyway, I'm still looking into how to print the attentions for a document when the model is simply loaded from a file (either as a full model, or as JSON + weights), as it would be in a "production" environment.
If the second output of a layer is not saved with the model, maybe we need to call the attention twice, with the two different parameter values (return_coefficients=False and return_coefficients=True), modifying AttentionWithContext so that it has two different outputs depending on return_coefficients.

Thank you
Francesco

Predictive text regions

Thank you for the very clear introduction to CNN for NLP!

I have a question about the predictive text regions. You write "we want to identify the n_show regions of each branch that are associated with the highest weights in the corresponding feature maps", but in the code you only take into account the activations of the feature maps. I wonder whether we should also weight them by the weights of the dense layer.

I'm trying to apply the same idea to multi-class, multi-label problems (the model is essentially the same as in https://github.com/inspirehep/magpie; we only need a dense layer with more output neurons), and I'd like to identify the regions associated with the different labels. In this case, the feature maps of a document are the same for all labels, but the weights of the Dense layer are of course important for activating or deactivating a specific output neuron.
What do you think?
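Concretely, what I have in mind is something like this toy sketch (plain Python, made-up numbers, not the notebook's actual code): score each region per label as its pooled activation times the corresponding Dense weight, then pick the top region per label.

```python
# Hypothetical toy numbers: 3 filters/regions, 2 labels.
activations = [0.9, 0.1, 0.5]  # max-pooled activation per feature map
dense_w = [[0.2, 0.8],         # Dense kernel: row = filter, column = label
           [0.9, 0.1],
           [0.4, 0.4]]

# per-label region score = activation * dense weight
scores = [[a * w for w in row] for a, row in zip(activations, dense_w)]

# most predictive region for each label
best = [max(range(len(activations)), key=lambda i: scores[i][j])
        for j in range(len(dense_w[0]))]
print(best)  # [2, 0]: region 2 drives label 0, region 0 drives label 1
```

With activations alone, region 0 would win everywhere; weighting by the Dense kernel lets different regions surface for different labels.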

Thanks,
Francesco
