Comments (5)

matteo-grella commented on August 25, 2024

These APIs don't work quite the way you guessed. My bad, I haven't described them in the README yet.

Predict

predict performs a prediction based on a trained Masked Language Model (MLM).
In short, MLM is a fill-in-the-blank task, where the objective is to use the context words surrounding a [MASK] token to try to predict what that [MASK] word should be.

To try the predict API, the underlying model must contain all the required neural layers. My advice is to start with the base English BERT model distributed by Hugging Face (exact name for the import: bert-base-cased).

Once you have imported the model and the server is listening, try this:

curl -k -d '{"text": "Sorry I did not get a chance to answer your [MASK] earlier."}' -H "Content-Type: application/json" "http://127.0.0.1:1987/predict?pretty"

It should print:

{
    "tokens": [
        {
            "text": "question",
            "start": 44,
            "end": 50,
            "label": "PREDICTED"
        }
    ],
    "took": 416
}
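For anyone calling the endpoint from Go rather than curl, here is a minimal sketch of a client. The type and function names (Token, Response, parseResponse, predict) are made up for this example and are not part of spago; the field names simply mirror the JSON printed above, and the unit of "took" is not assumed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Token mirrors one entry of the "tokens" array in the response above.
// Hypothetical type for this sketch, not spago's own.
type Token struct {
	Text  string `json:"text"`
	Start int    `json:"start"`
	End   int    `json:"end"`
	Label string `json:"label"`
}

// Response mirrors the top-level JSON object returned by /predict.
type Response struct {
	Tokens []Token `json:"tokens"`
	Took   int     `json:"took"`
}

// parseResponse decodes a raw /predict response body.
func parseResponse(body []byte) (Response, error) {
	var r Response
	err := json.Unmarshal(body, &r)
	return r, err
}

// predict posts the text to a running server and decodes the answer.
func predict(endpoint, text string) (Response, error) {
	payload, err := json.Marshal(map[string]string{"text": text})
	if err != nil {
		return Response{}, err
	}
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(payload))
	if err != nil {
		return Response{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return Response{}, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return Response{}, err
	}
	return parseResponse(body)
}
```

With the server listening as above, predict("http://127.0.0.1:1987/predict", "Sorry I did not get a chance to answer your [MASK] earlier.") should yield the same tokens as the curl call.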

You can experiment with more [MASK] tokens, and the model will generate the most likely substitution for each. Keep in mind that the more tokens you mask, the less context remains, so accuracy may drop.

You can even mix several languages in the same sentence using a multilingual model (exact name for the import: bert-base-multilingual-cased).

For example:

curl -k -d '{"text": "Io sono italiano quindi parlo [MASK] , but as soon as I am with my German colleagues I switch to [MASK] ."}' -H "Content-Type: application/json" "http://127.0.0.1:1987/predict?pretty"

It should print:

{
    "tokens": [
        {
            "text": "italiano",
            "start": 30,
            "end": 36,
            "label": "PREDICTED"
        },
        {
            "text": "English",
            "start": 97,
            "end": 103,
            "label": "PREDICTED"
        }
    ],
    "took": 469
}
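In both responses the start and end values line up with the position of each [MASK] in the input text, so they can be used to splice the predictions back into the sentence. A rough sketch, assuming the offsets are byte positions with an exclusive end (for ASCII text, bytes and characters coincide); Token and fillMasks are hypothetical names for this example, not spago's API:

```go
package main

import (
	"fmt"
	"sort"
)

// Token holds the fields of a predicted token needed for substitution.
// Hypothetical type for this sketch.
type Token struct {
	Text  string
	Start int
	End   int
}

// fillMasks replaces each span text[Start:End] with the predicted text.
// Offsets are assumed to be byte positions into the original input and
// spans are assumed not to overlap.
func fillMasks(text string, tokens []Token) (string, error) {
	// Work on a copy so the caller's slice order is left untouched.
	sorted := append([]Token(nil), tokens...)
	// Substitute from right to left so that earlier offsets stay valid
	// even when a prediction is longer or shorter than "[MASK]".
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start > sorted[j].Start })
	for _, t := range sorted {
		if t.Start < 0 || t.End > len(text) || t.Start > t.End {
			return "", fmt.Errorf("span [%d,%d) out of range for text of length %d", t.Start, t.End, len(text))
		}
		text = text[:t.Start] + t.Text + text[t.End:]
	}
	return text, nil
}
```

Applied to the mixed-language example with the two predicted tokens, this gives "Io sono italiano quindi parlo italiano , but as soon as I am with my German colleagues I switch to English ."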

Cool, isn't it?

Discriminate

There's something in the tests I'm doing that doesn't feel right. I'll come back to this later.

from spago.

bonedaddy commented on August 25, 2024

Woah that's really cool :O

bonedaddy commented on August 25, 2024

I'm trying out the prediction now and for some reason am getting a result like this:

&{Tokens:[{Text:[PAD] Start:44 End:50 Label:PREDICTED}] Took:390}

Text I tried predicting:

Sorry I did not get a chance to answer your [MASK] earlier.

model used: deepset/bert-base-cased-squad2

matteo-grella commented on August 25, 2024

This is because the model deepset/bert-base-cased-squad2 is fine-tuned for question answering on the SQuAD2.0 dataset.

As a consequence of fine-tuning, some layers used during the pre-training phase, in which the BERT model is trained to learn language patterns (in other words, to become a super-parrot), are discarded.

Masked token prediction relies on those layers, which are no longer available in the fine-tuned SQuAD model:

// BERT Model
type Model struct {
	Config          Config
	Vocabulary      *vocabulary.Vocabulary
	Embeddings      *Embeddings
	Encoder         *Encoder
	Predictor       *Predictor // <---- This layer is only available in the basic "pre-trained" models
	Discriminator   *Discriminator
	Pooler          *Pooler
	SeqRelationship *linear.Model
	SpanClassifier  *SpanClassifier
}
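On the client side, a [PAD] coming back as the prediction (as in the output reported above) is a hint that the imported model lacks this layer. A small heuristic sketch for flagging that case; the function names are hypothetical, and treating any bracketed token as "special" is an assumption based on BERT's usual vocabulary conventions, not something spago exposes:

```go
package main

import "strings"

// isSpecialToken reports whether s looks like a BERT special token
// such as [PAD], [MASK] or [UNK]. This is a heuristic, not a spago API.
func isSpecialToken(s string) bool {
	return len(s) > 2 && strings.HasPrefix(s, "[") && strings.HasSuffix(s, "]")
}

// suspiciousPredictions returns the predicted texts that are special
// tokens, which suggests the model has no usable masked-LM layer.
func suspiciousPredictions(predicted []string) []string {
	var out []string
	for _, p := range predicted {
		if isSpecialToken(p) {
			out = append(out, p)
		}
	}
	return out
}
```

If suspiciousPredictions returns anything for a plain English sentence, it is worth checking which model was imported before debugging further.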

As I said in my first answer, I suggest using the following models:

  • bert-base-cased

  • bert-base-multilingual-cased

These are the exact names of the models to import; unlike the deepset models, they have no subdirectory prefix.

Please note that the server CLI has changed here: #28

matteo-grella commented on August 25, 2024

It's working like it's supposed to.
