FastText Serving

FastText Serving is a simple and efficient serving system for fastText models. Inspired by TensorFlow Serving, it provides the missing piece in the microservice puzzle to connect your business logic with basic Natural Language Processing (NLP). The idea of this project is to provide an elegant and reusable implementation for managing several fastText models, allowing concurrent predictions across multiple models. The API of the service is based on gRPC to reduce network latency and deliver higher throughput. For instance, you can run millions of predictions in around one second using just a single CPU.

The service has been developed in Python, making use of Facebook's fastText library for running predictions over pieces of text (words, sentences, paragraphs, etc.). The fastText API is used through the Python bindings provided in the official project. Since the fastText library is compiled native code, clients of the service can boost their performance by grouping multiple sentences into batches within the same request.

The models to serve are determined by reading the contents of a configuration file. These models are cached in memory depending on the amount of memory available and the size of each model. Every request is dispatched to the model specified in the body of that request. In addition, models are reloaded when a newer version is published or the file contents change on disk, thanks to the watchdog library.
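
For reference, a minimal configuration file could look like the following sketch, which mirrors the sample config.yaml used in the Quick Start and in the issue reports below (model files live in versioned directories such as sample/models/dbpedia/1/dbpedia.ftz):

# List of models to serve
models_path: sample/models
models:
  - base_path: dbpedia
    name: dbpedia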

Features

These are the most interesting features of this project:

  • Concurrent management and serving of different models
  • Model versioning, allowing A/B testing with concurrent requests to different versions
  • Hot model serving, loading the new model as soon as a new version is detected in the storage
  • Both bag of words and skip-gram models are supported
  • gRPC API

Quick Start

# Clone the repository
git clone https://github.com/nielsen-oss/fasttext-serving
cd fasttext-serving

# Build the Docker image
IMAGE_NAME=fasttext-serving
docker image build -t ${IMAGE_NAME} .

# Start serving some example models
docker run -p 50051:50051 \
  -v ${PWD}/sample/models:/models \
  -v ${PWD}/sample/config.yaml:/etc/fts/config.yaml \
  -e SERVICE_CONFIG_PATH=/etc/fts/config.yaml \
  ${IMAGE_NAME}

# You can download pretrained models from fasttext webpage
# https://fasttext.cc/docs/en/supervised-models.html
# Do not forget to include the model in the models section of the config
wget https://dl.fbaipublicfiles.com/fasttext/supervised-models/dbpedia.ftz -P sample/models/dbpedia/1/

# Install requirements
pip3 install -r requirements.txt

# Compile protocol buffers (required by the client)
pip3 install .

# Make predictions using the example client
python3 sample/client.py

API

The gRPC API exposes a set of methods for performing model management and predictions with fastText. More specifically, the service provides the following functionality:

  • Classify a sentence
  • Get the word vectors for a set of words
  • Get currently loaded models
  • Load a list of models
  • Reload the models in the configuration file
  • Get the status of a given model:
    • UNKNOWN: The model is not defined in the configuration file
    • LOADED: The model is cached in memory and ready to make predictions
    • AVAILABLE: The model is defined but not loaded, due to resource constraints
    • FAILED: The model could not be loaded due to an internal error

The complete specification can be found in the protocol buffer definition in the protos directory.
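
As an illustration, a minimal Python client could query and reload models as sketched below. This mirrors the client code shown in the issue reports further down; the exact message and method names are defined in protos/service.proto, so check that file for the authoritative definitions.

import grpc

from fts.protos import service_pb2, service_pb2_grpc

# Connect to the serving endpoint exposed by the Docker container
channel = grpc.insecure_channel("localhost:50051")
stub = service_pb2_grpc.FastTextStub(channel)

# Ask the service which models are currently loaded in memory
response = stub.GetLoadedModels(service_pb2.LoadModelsRequest())
print([model.name for model in response.models])

# Reload the models listed in the configuration file
stub.ReloadConfigModels(service_pb2.ReloadModelsRequest())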

Troubleshooting

  • Newer versions of the model are not loaded.

    Check that the model file has the extension .ftz or .bin and verify the path where the file has been uploaded. Also review your configuration file to check that the model is listed in the models section.

  • Predictions are too slow.

    Group predictions for the same model into bigger batches. You can also increase the maximum number of concurrent workers in the service configuration.

Contact

You can open an issue in this project or just email your questions or comments to Francisco Delgado or Javier Tovar.

Contribute

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.

We recommend working in an isolated virtual environment:

git clone https://github.com/nielsen-oss/fasttext-serving
cd fasttext-serving
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
export SERVICE_CONFIG_PATH="sample/config.yaml"
python3 -m fts
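
With the server running locally, you can point the example client from the Quick Start (python3 sample/client.py) at localhost:50051 to send some predictions.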

And do not forget to run the tests and add your own:

python3 test/test_suite.py

License

This project is released under the terms of the Apache 2.0 License.

Acknowledgements

Big thanks to the third-party projects used by FastText Serving, among them Facebook's fastText, gRPC, and watchdog.

Issues

Serve word vector model

Hi.
Can your service only be used with a supervised (classification) model? I don't have a classification model. All I have is word vectors (.bin and .txt). How can I serve them?

slow memory leak

There is a slow memory leak somewhere in the code base. The python process starts off small (~1GB), and over the course of several hours of processing many batch prediction requests, the resident memory of the process creeps up to 4GB, 5GB, etc, until finally the process is killed by the OOM killer. I am using code from git commit 8aae32727a19, which is fairly recent. If there is more information I can provide, let me know. I know very little about python, but I can gather information if required.

Trouble configuring client

Hello.

First off, thanks for your work!

Been trying to use this to vectorize text for an ML clustering algorithm. I have set up setup.py on the client side by running "python setup.py install" in the src folder of the ML app project, with the fts/protos folder inside the project folder, and it does in fact build the protos folder into it. Still, I'm having trouble, as running the app raises the following error:

TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "service.proto":
service.proto: A file with this name is already in the pool.

Any idea what I'm doing wrong? Clearly I don't understand enough of gRPC, for starters... ;)

Thanks again

Pedro

Reload config file problem

Hi, I have been using TensorFlow Serving for a few months, and now I have to serve fastText models.
I really like the feature in TF Serving where you can reload models by updating your config file. I could not find my way to do this with fasttext-serving.

Here is my code to try and reload the config file, to serve a second model.

import os
import grpc

from fts.protos import service_pb2, service_pb2_grpc

if __name__ == "__main__":

    # Generate GRPC stub
    channel = grpc.insecure_channel("localhost:50051")
    stub = service_pb2_grpc.FastTextStub(channel)

    request = service_pb2.ReloadModelsRequest()
    response = stub.ReloadConfigModels(request)
    print(response)
    request = service_pb2.LoadModelsRequest()
    response = stub.GetLoadedModels(request)
    print(response)
    loaded_models = [model.name for model in response.models]
    print(loaded_models)

I am trying with "yelp_review_polarity" and "dbpedia". Here is my config.yaml file. I tried loading the server with just "yelp_review_polarity", then adding "dbpedia" to config.yaml and sending a request using the previous client code.

# List of models to serve
models_path: sample/models
models:
  - base_path: yelp_review_polarity
    name: yelp_review_polarity
  - base_path: dbpedia
    name: dbpedia

What am I doing wrong?

Error while executing docker run.

I was trying to follow the Quick Start (https://github.com/nielsen-oss/fasttext-serving#quick-start) and I get an error while executing the "docker run" command. The traceback is as follows:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/src/app/fts/__main__.py", line 84, in <module>
    serve()
  File "/usr/src/app/fts/__main__.py", line 58, in serve
    servicer = FastTextServicer()
  File "/usr/src/app/fts/server/server.py", line 23, in __init__
    self._fasttext_service = FastTextService()
  File "/usr/src/app/fts/service/fasttext_service.py", line 56, in __init__
    self._observer.start()
  File "/usr/local/lib/python3.8/site-packages/watchdog/observers/api.py", line 253, in start
    emitter.start()
  File "/usr/local/lib/python3.8/site-packages/watchdog/utils/__init__.py", line 110, in start
    self.on_thread_start()
  File "/usr/local/lib/python3.8/site-packages/watchdog/observers/inotify.py", line 121, in on_thread_start
    self._inotify = InotifyBuffer(path, self.watch.is_recursive)
  File "/usr/local/lib/python3.8/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
    self._inotify = Inotify(path, recursive)
  File "/usr/local/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 200, in __init__
    self._add_dir_watch(path, recursive, event_mask)
  File "/usr/local/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 387, in _add_dir_watch
    raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), path)
NotADirectoryError: [Errno 20] Not a directory: b'sample/models'

I am using:

  • Docker version 20.10.18, build b40c2f6
  • Ubuntu 18.04

No module named fts while executing sample/client.py

While running the final step of the Quick Start, i.e. python3 sample/client.py, I am getting the following error.
Traceback (most recent call last):
  File "sample/client.py", line 19, in <module>
    import fts.protos.service_pb2_grpc
ModuleNotFoundError: No module named 'fts'
I tried to find a solution in protobuf/issues/1491, but did not succeed.
