Comments (4)

TheophileBlard commented on June 8, 2024

Hi @emiliepicardcantin, I believe you are assuming that because the model was trained on a binary classification task, its output is a single neuron with a sigmoid activation. In fact, this model has two output neurons, to which we apply a softmax activation in order to get (pseudo) probabilities. Because the two probabilities sum to 1, one of them is always > 0.5 (barring an exact tie): if the "negative" probability is > 0.5, the predicted label is "negative", and if the "positive" probability is > 0.5, the predicted label is "positive". The pipeline reports the score of the predicted label, which is why the output scores in your examples are always > 0.5. If you pass return_all_scores=True to the pipeline object, you will get the (probability) score for both outputs.
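To make the point above concrete, here is a minimal NumPy sketch (not the model's actual code) of a two-output softmax head. The logits are hypothetical values chosen for illustration, and the label order (index 0 = NEGATIVE, index 1 = POSITIVE) is an assumption based on the pipeline output shown elsewhere in this thread. It also checks that a two-way softmax is equivalent to a sigmoid applied to the difference of the two logits, which is how the single-neuron view relates to this one.

```python
import numpy as np

# Assumed label order (index 0 = NEGATIVE, index 1 = POSITIVE).
LABELS = ["NEGATIVE", "POSITIVE"]

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(logits):
    """Return (label, score) the way a top-1 classifier would."""
    probs = softmax(logits)
    idx = int(np.argmax(probs))
    return LABELS[idx], float(probs[idx])

# Hypothetical logits, for illustration only.
for logits in ([2.0, -1.0], [-0.1, 0.1], [-3.0, 3.0]):
    probs = softmax(logits)
    label, score = predict(logits)
    # P(POSITIVE) from softmax equals sigmoid(logit_pos - logit_neg).
    assert np.isclose(probs[1], sigmoid(logits[1] - logits[0]))
    # The reported score is the larger of two numbers that sum to 1.
    assert score >= 0.5
    print(label, round(score, 4))
```

The reported score is therefore a confidence in the predicted label, not a position on a negative-to-positive scale.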

That said, the predicted labels seem OK in your examples (the 3rd example is arguable), but I'd advise you to fine-tune the model for your task, as the model was only trained on movie review data.

from french-sentiment-analysis-with-bert.

emiliepicardcantin commented on June 8, 2024

Hello!
Thank you for this model.
I am having some problems with the labelling process. The returned score does not seem to match the labels in my case, and I don't know why. Here are some examples:

Je suis tres satisfaite du service . Prix correct. Assurance que je recommande a tous. En cas de souci très réactif et a l'écoute des clients. Pas déçue .
[{'label': 'POSITIVE', 'score': 0.9916808009147644}]

L'établissement et la réalisation contrat sont faciles à mettre en place, tout comme les appels à la plateforme. Plus qu'à voir en cas de sinistre,mais je ne suis pas pressé !!
[{'label': 'POSITIVE', 'score': 0.5242959260940552}]

Le prix m'a convenu pour un jeune conducteur. Il est par contre excessif si l'on rajoute un conducteur secondaire. Le service téléphonique est plutôt assez rapide à répondre.
[{'label': 'NEGATIVE', 'score': 0.9145129919052124}]

Un assureur qui assure tant qu'il n'y a pas de sinistre... Un assureur qui résilie le contrat et qui ferme les accès aux informations personnelles avant la fin du contrat en indiquant au 24/09/2021 le message suivant "votre contrat est résilié depuis le 3/12/2021" :-) Donc plus de 2 mois avant l'échéance, le client n'a plus accès à son contrat, ni à la liste des sinistres enregistrés sur son compte, ne serait-ce que pour vérifier qu'il n'y a pas d'erreur... Le service client explique qu'effectivement, l'espace personnel est fermé à partir du jour d'envoi du courrier de résiliation, et non pas à la date de fin de contrat.
[{'label': 'NEGATIVE', 'score': 0.9356728196144104}]

TheophileBlard commented on June 8, 2024

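If you bypass the pipeline and use the model directly, you can inspect its raw outputs (logits):

# Assumes `tokenizer` and `model` (the TensorFlow model) are already loaded.
review = "J'aime le camembert"
inputs = tokenizer(review, return_tensors="tf")
model_outputs = model(inputs)
# One row per input; take the first (and only) row: two logits, one per class.
outputs = model_outputs["logits"][0]
print(outputs) # => tf.Tensor([-0.6336924   0.65147054], shape=(2,), dtype=float32)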

You can then apply the softmax manually:

import numpy as np

def softmax(_outputs):
    maxes = np.max(_outputs, axis=-1, keepdims=True)
    shifted_exp = np.exp(_outputs - maxes)
    return shifted_exp / shifted_exp.sum(axis=-1, keepdims=True)

scores = softmax(outputs)
print(scores) # => [0.21667267 0.7833273 ]

You will get the same results as with the pipeline:

from transformers import pipeline

nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
result = nlp(review, return_all_scores=True)
print(result)
# => [[{'label': 'NEGATIVE', 'score': 0.2166726142168045},
#      {'label': 'POSITIVE', 'score': 0.7833273410797119}]]
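If what you want is a single number that tracks sentiment (rather than the confidence of whichever label won), one option is to collapse the two scores yourself. The sketch below is only an illustration: `positive_probability` and `polarity` are hypothetical helper names, not part of transformers or of this project, and the input is the pipeline result shown above, pasted in as a literal so the snippet runs without loading the model.

```python
def positive_probability(all_scores):
    """Pick the POSITIVE score out of one all-scores result (a list of dicts)."""
    by_label = {item["label"]: item["score"] for item in all_scores}
    return by_label["POSITIVE"]

def polarity(all_scores):
    """Map P(POSITIVE) from [0, 1] to a signed score in [-1, 1]."""
    return 2.0 * positive_probability(all_scores) - 1.0

# Pipeline output copied from the example above (one inner list per input text).
result = [[{'label': 'NEGATIVE', 'score': 0.2166726142168045},
           {'label': 'POSITIVE', 'score': 0.7833273410797119}]]

p = positive_probability(result[0])
print(p)                    # probability of the POSITIVE class
print(polarity(result[0]))  # > 0 means positive, < 0 means negative
```

With this convention, a value near 0 means the model is undecided, which is easier to threshold than a top-label score that never drops below 0.5.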

emiliepicardcantin commented on June 8, 2024

Thank you!
