
deep_learning_nlp's People

Contributors

tixierae


deep_learning_nlp's Issues

han_my_functions.py --> TypeError: float() argument must be a string or a number, not 'NoneType'

Hi,

First let me thank you for the detailed and really well explained HAN example! I was looking for days for such a source to get up and running with attention visualisation in NLP.

I have prepared my data as in the description, and everything runs smoothly until I get to training with han.fit_generator(...), which stops and throws:

[Screenshot from 2021-01-26 showing the traceback ending in: TypeError: float() argument must be a string or a number, not 'NoneType']

I've noticed that it has something to do with the metrics, but I couldn't figure out what to try next.
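The error class itself is reproducible in plain Python, with no Keras involved: float() refuses None, which suggests a metric value (or something coming out of the generator) is None before Keras tries to cast it. A minimal reproduction:

```python
# Minimal reproduction (plain Python, no Keras): metric values are cast
# with float(), so a None slipping through a custom metric or the data
# generator produces exactly this class of error.
try:
    float(None)
except TypeError as e:
    print(e)  # e.g. "float() argument must be a string or a number, not 'NoneType'"
```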

By the way, is there a specific version of Keras and TensorFlow I should use to run this example? I'm currently on TensorFlow 2.4.1 and Keras 2.4.3 (probably the latest of both).

Thanks!!

Getting Attention coefficients with saved models

Dear Antoine,

I want to thank you for your great NLP GitHub repository, which is always a fount of inspiration. I'm working on text classification; in the past I used the 1D CNN, and now the HAN, both of which you explain so clearly in your notebooks.

I have a question about getting the attention coefficients in order to show the "important" words and sentences. I've been able to get and display them by putting the code in the same script that does the training (as you did in your notebook, where you show them in the same notebook that creates and fits the model).

But I want to write a "Predictor" class that loads the saved model, does the prediction, and shows the attention. For the attention coefficients I need:

get_sent_att_coeffs = Model(sent_ints,sent_att_coeffs) # coeffs over the words in a sentence
get_doc_attention_coeffs = Model(doc_ints,doc_att_coeffs) # coeffs over the sentences in a document

and Python of course complains that sent_ints, sent_att_coeffs, etc. are not defined if I don't put the whole definition of the two models (sentence encoder and document encoder) in the class. I didn't want to rewrite the whole definition there (which worked, but is a quick-and-dirty solution); I want instead to load the models from files.

I then tried this:

    # reload the sentence encoder (architecture + weights)
    json_file = open('sentencoder_newsgroups_model.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    sent_encoder = model_from_json(loaded_model_json,
                                   custom_objects={'AttentionWithContext': AttentionWithContext})
    sent_encoder.load_weights('sentencoder_newsgroups_weights.h5')

    # reload the full HAN (architecture + weights)
    json_file = open('han_newsgroups_model.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    han = model_from_json(loaded_model_json,
                          custom_objects={'AttentionWithContext': AttentionWithContext})
    han.load_weights('han_newsgroups_weights.h5')

    reshaped_sentences = self._get_sequence()
    reshaped_sentences_tensor = _to_tensor(reshaped_sentences, dtype='float32')

    # coefficients over the words in a sentence
    get_sent_att_coeffs = Model(sent_encoder.input,
                                sent_encoder.get_layer('attention_with_context_1').output[1])

    # coefficients over the sentences in a document
    get_doc_attention_coeffs = Model(han.input,
                                     han.get_layer('attention_with_context_2').output[1])

but at get_sent_att_coeffs I get an error:

ValueError: Output tensors to a Model must be the output of a Keras Layer (thus holding past layer metadata). Found: Tensor("strided_slice_1:0", shape=(200,), dtype=float32)

Then I found that if I print sent_encoder.summary() and han.summary() right after the model definition, the attention layers (correctly) have 2 outputs, whereas if I print the same summaries after loading the models from file, the second output, which gives me the coefficients, is gone. By the way, I think there is an error in compute_output_shape in the AttentionWithContext class: the dimensions of the coefficients are wrong. They should be:

def compute_output_shape(self, input_shape):
    if self.return_coefficients:
        return [(input_shape[0], input_shape[-1]), (input_shape[0], input_shape[1], 1)]

instead of:

def compute_output_shape(self, input_shape):
    if self.return_coefficients:
        return [(input_shape[0], input_shape[-1]), (input_shape[0], input_shape[-1], 1)]

i.e. the second dimension of the coefficients should be the number of steps, input_shape[1] (the number of words or sentences: there is one coefficient per word or sentence), and not input_shape[-1] (the number of features).
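To make the shape claim concrete, here is a standalone toy check (plain Python, no Keras; the example shapes are made up) of the corrected logic:

```python
def compute_output_shape(input_shape, return_coefficients=True):
    # input_shape = (batch, steps, features); one coefficient per step
    if return_coefficients:
        return [(input_shape[0], input_shape[-1]),   # attended vector
                (input_shape[0], input_shape[1], 1)] # coefficients
    return (input_shape[0], input_shape[-1])

# e.g. a batch of 32 sentences of 30 words with 200 features each:
print(compute_output_shape((32, 30, 200)))
# [(32, 200), (32, 30, 1)]  -- 30 coefficients, one per word
```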

Anyway, I'm still looking into how to print the attentions for a document when the model is simply loaded from a file (either as a full model, or as JSON + weights), as it would be in a "production" environment.
If the second output of a layer is not saved with the model, maybe we need to call the attention twice, with the two different parameter values (return_coefficients=False and return_coefficients=True), modifying AttentionWithContext so that it has two different outputs depending on return_coefficients.

Thank you
Francesco

Predictive text regions

Thank you for the very clear introduction to CNN for NLP!

I have a question about the predictive text regions. You write "we want to identify the n_show regions of each branch that are associated with the highest weights in the corresponding feature maps", but in the code you only take into account the activations of the feature maps. I wonder whether we should also weight them by the weights of the dense layer.

I'm trying to apply the same idea to multi-class, multi-label problems (the model is essentially the same as in https://github.com/inspirehep/magpie; we only need a dense layer with more output neurons), and I'd like to identify the regions associated with the different labels. In this case, the feature maps of a document are the same for all labels, but the weights of the Dense layer are of course important for activating or deactivating a specific output neuron.
What do you think?
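Concretely, what I have in mind is something like this toy sketch (plain Python, made-up numbers, not the notebook's actual code): score each region per label as its pooled activation times the corresponding Dense weight, then pick the top region per label.

```python
# Hypothetical toy numbers: 3 filters/regions, 2 labels.
activations = [0.9, 0.1, 0.5]  # max-pooled activation per feature map
dense_w = [[0.2, 0.8],         # Dense kernel: row = filter, column = label
           [0.9, 0.1],
           [0.4, 0.4]]

# per-label region score = activation * dense weight
scores = [[a * w for w in row] for a, row in zip(activations, dense_w)]

# most predictive region for each label
best = [max(range(len(activations)), key=lambda i: scores[i][j])
        for j in range(len(dense_w[0]))]
print(best)  # [2, 0]: region 2 drives label 0, region 0 drives label 1
```

With activations alone, region 0 would win everywhere; weighting by the Dense kernel lets different regions surface for different labels.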

Thanks,
Francesco
