
hironsan / bertsearch


Elasticsearch with BERT for advanced document search.

Home Page: https://towardsdatascience.com/elasticsearch-meets-bert-building-search-engine-with-elasticsearch-and-bert-9e74bf5b4cf2

License: MIT License

Python 53.61% HTML 40.69% Dockerfile 4.70% Shell 1.01%
bert elasticsearch machine-learning natural-language-processing search-engine

bertsearch's Introduction

Hi there, I'm Hiroki 👋

I'm interested in creating software related to machine learning and natural language processing.

Also, I like writing and translating books in Japanese. Here are some examples:

Author

Translator

If you want to support me, see GitHub Sponsors❤️

bertsearch's People

Contributors

arnipluseinn, ecemal, hironsan


bertsearch's Issues

'docker-compose up' version error

Hi,
An error occurs at the very beginning with docker-compose.
It looks like the Compose file version in the repository is older than the current default, 3.8.
Is there a chance that version could be used here?

Thanks a lot in advance!

How to reproduce the problem

docker-compose up
ERROR: Version in "./docker-compose.yaml" is unsupported. You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the services key, or omit the version key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/

My Environment

  • Operating System: Ubuntu 18
  • docker version: 19.03.13
  • docker-compose version: 1.17.1

search is not working in Flask app

@Hironsan
I updated the following versions and ran the application.

  • Updated to bert-serving-client==1.10.0 from 1.9.8 in requirements.txt
  • Updated image: docker.elastic.co/elasticsearch/elasticsearch:7.7.1 from 7.3.2 in docker-compose.yaml
  • Updated version: '3.8' from 3.7 in docker-compose.yaml

I am able to load the URL - localhost:5000. When I search in the app, search is not working as expected.

Bert_Search_Web Fails to start

I followed the steps provided and made the changes suggested in the reported issues, but I am getting the following error.
Can you please look into this?

arch_bertserving_1
unifidata@bertsearch-unifidata-ubuntu:~$ sudo docker logs bertsearch_web_1
Traceback (most recent call last):
  File "app.py", line 4, in <module>
    from flask import Flask, render_template, jsonify, request
  File "/usr/local/lib/python3.8/site-packages/flask/__init__.py", line 14, in <module>
    from jinja2 import escape
ImportError: cannot import name 'escape' from 'jinja2' (/usr/local/lib/python3.8/site-packages/jinja2/__init__.py)

Your Environment

  • Operating System: Ubuntu 20.04.4
  • Python Version: 3.8.10

Cannot create_documents

@Hironsan I have an issue similar to a previous ticket: the create_documents.py command hangs for a long time and never creates documents.jsonl.

I have confirmed that my Docker containers are running correctly and bert-server is up.

I have tried debugging the script to no avail. Could you confirm that it is indeed functional?

Could you add the LICENSE file?

First of all, thanks for creating and sharing bertsearch!
Could you add the LICENSE file?

I hope you choose a permissive license like MIT.
Thanks!

error on docker-compose up

I followed the guide and the images built OK, but there is an issue when launching; a reproducible example is below.

(base) PS P:\project\Fun\bertsearch> $env:PATH_MODEL
./cased_L-12_H-768_A-12
(base) PS P:\project\Fun\bertsearch> ls


    Directory: P:\project\Fun\bertsearch


Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----        3/29/2020   7:20 PM                bertserving
d-----        11/1/2018   8:25 AM                cased_L-12_H-768_A-12
d-----        3/29/2020   7:20 PM                docs
d-----        3/29/2020   7:20 PM                example
d-----        3/29/2020   7:20 PM                web
-a----        3/29/2020   7:20 PM           1896 .gitignore
-a----        3/29/2020   8:32 PM      404261442 cased_L-12_H-768_A-12.zip
-a----        3/29/2020   7:20 PM            668 docker-compose.yaml
-a----        3/29/2020   7:20 PM           4757 README.md


(base) PS P:\project\Fun\bertsearch> $env:INDEX_NAME
jobsearch
(base) PS P:\project\Fun\bertsearch> docker-compose up
WARNING: The Docker Engine you're using is running in swarm mode.

Compose does not use swarm mode to deploy services to multiple nodes in a swarm. All containers will be scheduled on the current node.

To deploy your application across the swarm, use `docker stack deploy`.

Building bertserving
Step 1/8 : FROM tensorflow/tensorflow:1.12.0-py3
 ---> 39bcb324db83
Step 2/8 : RUN pip install -U pip --trusted-host pypi.org
 ---> Running in de5b48040c8d
Collecting pip
  Downloading https://files.pythonhosted.org/packages/54/0c/d01aa759fdc501a58f431eb594a17495f15b88da142ce14b5845662c13f3/pip-20.0.2-py2.py3-none-any.whl (1.4MB)
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-20.0.2
Removing intermediate container de5b48040c8d
 ---> 0db4eeea7dab
Step 3/8 : RUN pip install --no-cache-dir bert-serving-server --trusted-host pypi.org
 ---> Running in e517c2f9690f
Collecting bert-serving-server
  Downloading bert_serving_server-1.10.0-py3-none-any.whl (61 kB)
Requirement already satisfied: pyzmq>=17.1.0 in /usr/local/lib/python3.5/dist-packages (from bert-serving-server) (17.1.2)
Collecting GPUtil>=1.3.0
  Downloading GPUtil-1.4.0.tar.gz (5.5 kB)
Requirement already satisfied: six in /usr/local/lib/python3.5/dist-packages (from bert-serving-server) (1.11.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.5/dist-packages (from bert-serving-server) (1.15.4)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.5/dist-packages (from bert-serving-server) (1.1.0)
Building wheels for collected packages: GPUtil
  Building wheel for GPUtil (setup.py): started
  Building wheel for GPUtil (setup.py): finished with status 'done'
  Created wheel for GPUtil: filename=GPUtil-1.4.0-py3-none-any.whl size=7411 sha256=e89f5d7b312e40f33852c4bc014fea79181c53cf27a010de8514780988fb954e
  Stored in directory: /tmp/pip-ephem-wheel-cache-ttp1jr48/wheels/f6/e6/51/5492c820fa91b191504b98eb341456b52786eea95ad2b21230
Successfully built GPUtil
Installing collected packages: GPUtil, bert-serving-server
Successfully installed GPUtil-1.4.0 bert-serving-server-1.10.0
Removing intermediate container e517c2f9690f
 ---> ca338baebe86
Step 4/8 : COPY ./ /app
 ---> 5052373dca60
Step 5/8 : COPY ./entrypoint.sh /app
 ---> db1145fafa52
Step 6/8 : WORKDIR /app
 ---> Running in 7eedd9f9c55c
Removing intermediate container 7eedd9f9c55c
 ---> 262156bb1421
Step 7/8 : ENTRYPOINT ["/app/entrypoint.sh"]
 ---> Running in bbdce76d4652
Removing intermediate container bbdce76d4652
 ---> 40f113a0af22
Step 8/8 : CMD []
 ---> Running in aea46aa3f6f9
Removing intermediate container aea46aa3f6f9
 ---> 358956debd64

Successfully built 358956debd64
Successfully tagged bertsearch_bertserving:latest
WARNING: Image for service bertserving was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Building web
Step 1/7 : FROM python:3
 ---> f88b2f81f83a
Step 2/7 : COPY . /app
 ---> b7270a84884f
Step 3/7 : WORKDIR /app
 ---> Running in 4e2e13aa5f5f
Removing intermediate container 4e2e13aa5f5f
 ---> fd37fd293639
Step 4/7 : RUN pip install -U pip --trusted-host pypi.org
 ---> Running in c9e324e43463
Requirement already up-to-date: pip in /usr/local/lib/python3.8/site-packages (20.0.2)
Removing intermediate container c9e324e43463
 ---> db5526a56f99
Step 5/7 : RUN pip install -r requirements.txt --trusted-host pypi.org
 ---> Running in 7c3fb6cad0f8
Collecting bert-serving-client==1.9.8
  Downloading bert_serving_client-1.9.8-py2.py3-none-any.whl (28 kB)
Collecting elasticsearch==7.0.4
  Downloading elasticsearch-7.0.4-py2.py3-none-any.whl (83 kB)
Collecting Flask==1.1.1
  Downloading Flask-1.1.1-py2.py3-none-any.whl (94 kB)
Collecting numpy
  Downloading numpy-1.18.2-cp38-cp38-manylinux1_x86_64.whl (20.6 MB)
Collecting pyzmq>=17.1.0
  Downloading pyzmq-19.0.0-cp38-cp38-manylinux1_x86_64.whl (1.1 MB)
Collecting urllib3>=1.21.1
  Downloading urllib3-1.25.8-py2.py3-none-any.whl (125 kB)
Collecting click>=5.1
  Downloading click-7.1.1-py2.py3-none-any.whl (82 kB)
Collecting Werkzeug>=0.15
  Downloading Werkzeug-1.0.0-py2.py3-none-any.whl (298 kB)
Collecting Jinja2>=2.10.1
  Downloading Jinja2-2.11.1-py2.py3-none-any.whl (126 kB)
Collecting itsdangerous>=0.24
  Downloading itsdangerous-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting MarkupSafe>=0.23
  Downloading MarkupSafe-1.1.1-cp38-cp38-manylinux1_x86_64.whl (32 kB)
Installing collected packages: numpy, pyzmq, bert-serving-client, urllib3, elasticsearch, click, Werkzeug, MarkupSafe, Jinja2, itsdangerous, Flask
Successfully installed Flask-1.1.1 Jinja2-2.11.1 MarkupSafe-1.1.1 Werkzeug-1.0.0 bert-serving-client-1.9.8 click-7.1.1 elasticsearch-7.0.4 itsdangerous-1.1.0 numpy-1.18.2 pyzmq-19.0.0 urllib3-1.25.8
Removing intermediate container 7c3fb6cad0f8
 ---> 88138be02664
Step 6/7 : ENTRYPOINT ["python"]
 ---> Running in 48f011d8f704
Removing intermediate container 48f011d8f704
 ---> d6aa422486ce
Step 7/7 : CMD ["app.py"]
 ---> Running in de0af515288b
Removing intermediate container de0af515288b
 ---> 776a579b975c

Successfully built 776a579b975c
Successfully tagged bertsearch_web:latest
WARNING: Image for service web was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Creating bertsearch_bertserving_1   ... error
Creating bertsearch_elasticsearch_1 ...

Creating bertsearch_elasticsearch_1 ... done

ERROR: for bertserving  Cannot create container for service bertserving: Access is denied.
ERROR: Encountered errors while bringing up the project.

facing errors with docker-compose up

error:

WARNING: The PATH_MODEL variable is not set. Defaulting to a blank string.
Starting bertsearch-master_elasticsearch_1 ...
Creating bertsearch-master_bertserving_1 ... error

ERROR: for bertsearch-master_bertserving_1 Cannot create container for service Starting bertsearch-master_elasticsearch_1 ... done
phanumeric characters

ERROR: for bertserving Cannot create container for service bertserving: create .: volume name is too short, names should be at least two alphanumeric characters
ERROR: Encountered errors while bringing up the project.
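
The first warning in this log says PATH_MODEL is unset, so Compose substitutes a blank string, and the "volume name is too short" error plausibly follows from that (assuming the compose file mounts PATH_MODEL as a volume, which is not shown here). A minimal pre-flight check, as a sketch only: the function name `check_model_path` is hypothetical, and the only file it looks for is bert_config.json, which appears in the bert-serving logs quoted elsewhere on this page.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def check_model_path(env: Mapping[str, str]) -> Optional[str]:
    """Return an error message if PATH_MODEL is unusable, or None if it looks fine."""
    raw = env.get("PATH_MODEL", "").strip()
    if not raw:
        return "PATH_MODEL is not set"
    model_dir = Path(raw)
    if not model_dir.is_dir():
        return f"PATH_MODEL does not point to a directory: {model_dir}"
    # bert_config.json is the config file named in the bert-serving startup log.
    if not (model_dir / "bert_config.json").is_file():
        return f"bert_config.json not found in {model_dir}"
    return None


# Check the current shell environment before running docker-compose up.
problem = check_model_path(os.environ)
print(problem or "PATH_MODEL looks usable")
```

Running this in the same shell that will run docker-compose up shows whether the variable is visible there.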

Error with docker compose

I am able to run the app on my local Windows machine, but I am facing an issue when running it on a Linux server.
This is the error I am getting:
ERROR: for deb7f8db0841_bertsearch-master_bertserving_1 Cannot start service bertserving: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: "/app/entrypoint.sh": permission denied": unknown

ERROR: for bertserving Cannot start service bertserving: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: "/app/entrypoint.sh": permission denied": unknown
I have tried adding RUN ["chmod", "+x", "/entrypoint.sh"] before the entrypoint in the Dockerfile, but it's still not working.
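
Note that the error names /app/entrypoint.sh, while the attempted chmod targets /entrypoint.sh, so that RUN line may not touch the file Docker actually executes. One alternative, sketched below under the assumption that the script sits at bertserving/entrypoint.sh on the host and that the file mode is carried into the image when it is copied, is to set the execute bit on the host before rebuilding. The helper name `make_executable` is hypothetical.

```python
import stat
from pathlib import Path


def make_executable(script: Path) -> bool:
    """Add execute permission for user, group, and others.

    Returns True if the mode was changed, False if it was already executable.
    """
    mode = script.stat().st_mode
    new_mode = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if new_mode == mode:
        return False
    script.chmod(new_mode)
    return True


# Assumed location of the script, relative to the repository root.
entrypoint = Path("bertserving/entrypoint.sh")
if entrypoint.exists():
    print("mode updated" if make_executable(entrypoint) else "already executable")
else:
    print(f"{entrypoint} not found; run this from the repository root")
```

After changing the mode, the image needs to be rebuilt (for example with docker-compose up --build) for the change to take effect.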

Permission denied in docker-compose up

`Traceback (most recent call last):
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1032, in _send_output
self.send(msg)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 972, in send
self.connect()
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/transport/unixconn.py", line 43, in connect
sock.connect(self.unix_socket)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/connectionpool.py", line 727, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/util/retry.py", line 403, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
raise value.with_traceback(tb)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 1032, in _send_output
self.send(msg)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/http/client.py", line 972, in send
self.connect()
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/transport/unixconn.py", line 43, in connect
sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/api/client.py", line 205, in _retrieve_server_version
return self.version(api_version=False)["ApiVersion"]
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/api/daemon.py", line 181, in version
return self._result(self._get(url), json=True)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/utils/decorators.py", line 46, in inner
return f(self, *args, **kwargs)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/api/client.py", line 228, in _get
return self.get(url, **self._set_request_timeout(kwargs))
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/requests/sessions.py", line 543, in get
return self.request('GET', url, **kwargs)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/heqi/anaconda3/envs/bertsearch/bin/docker-compose", line 8, in <module>
sys.exit(main())
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/main.py", line 67, in main
command()
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/main.py", line 123, in perform_command
project = project_from_options('.', options)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/command.py", line 69, in project_from_options
environment_file=environment_file
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/command.py", line 132, in get_project
verbose=verbose, version=api_version, context=context, environment=environment
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/docker_client.py", line 43, in get_client
environment=environment, tls_version=get_tls_version(environment)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/compose/cli/docker_client.py", line 170, in docker_client
client = APIClient(**kwargs)
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/api/client.py", line 188, in __init__
self._version = self._retrieve_server_version()
File "/home/heqi/anaconda3/envs/bertsearch/lib/python3.7/site-packages/docker/api/client.py", line 213, in _retrieve_server_version
'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
`

How can I solve it?

unable to run docker-compose

I am getting the following error. Am I missing a step?

WARNING: The PATH_MODEL variable is not set. Defaulting to a blank string.
Building bertserving
Traceback (most recent call last):
File "bin/docker-compose", line 6, in <module>
File "compose/cli/main.py", line 71, in main
File "compose/cli/main.py", line 127, in perform_command
File "compose/cli/main.py", line 1085, in up
File "compose/cli/main.py", line 1081, in up
File "compose/project.py", line 527, in up
File "compose/service.py", line 360, in ensure_image_exists
File "compose/service.py", line 1084, in build
File "site-packages/docker/api/build.py", line 260, in build
File "site-packages/docker/api/build.py", line 307, in _set_auth_headers
File "site-packages/docker/auth.py", line 310, in get_all_credentials
File "site-packages/docker/auth.py", line 262, in _resolve_authconfig_credstore
File "site-packages/docker/auth.py", line 287, in _get_store_instance
File "site-packages/dockerpycreds/store.py", line 25, in __init__
dockerpycreds.errors.InitializationError: docker-credential-gcloud not installed or not available in PATH
[18955] Failed to execute script docker-compose

inputs as pdf

How should I go about feeding in PDFs and searching through them?

Can I search with a sentence?

To: Hironsan
I just followed your steps, and everything worked fine. I have two questions:
(1) If I ask a question about a topic, can it return the specific sentence rather than the whole paragraph?
(2) Can it be run on the Internet (like Google Search) instead of on localhost?

Jonathan

How much time does docker-compose take?

Hi. After successfully cloning the repo, I ran the docker-compose command, but it is taking a long time. Can someone explain how long it should take? CPU usage is currently high.

(A screenshot showing docker-compose running was attached here.)

How to debug in vscode

This code uses Docker.
I want to run and debug it in VS Code.
Does anyone know how to debug this code in VS Code?

Large size files indexing

Hello everyone!

Is there any way to index a document if its length is longer than max_seq_len?
As I understand it, everything beyond that length is currently cut from the BERT input, so the embedding does not represent the whole file.
Is it possible to build an average embedding (compute several for different parts of the document and average them afterwards) or something similar?

Thank you!
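
The averaging idea in this question can be sketched as follows. This is an illustration under stated assumptions, not how bertsearch works today: `chunk_tokens` and `average_embedding` are hypothetical helpers, the document is split on whitespace (BERT's max_seq_len counts WordPiece tokens plus special tokens, so word counts are only an approximation), and `encode` stands in for whatever function returns one vector per chunk.

```python
from typing import Callable, List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most `size` tokens."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def average_embedding(
    tokens: Sequence[str],
    encode: Callable[[List[str]], Sequence[float]],
    chunk_size: int,
) -> List[float]:
    """Encode each chunk separately and return the element-wise mean vector."""
    chunks = chunk_tokens(tokens, chunk_size)
    if not chunks:
        raise ValueError("document has no tokens")
    vectors = [encode(chunk) for chunk in chunks]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def toy_encode(chunk: List[str]) -> List[float]:
    """Stand-in encoder for demonstration: [chunk length, count of the word 'bert']."""
    return [float(len(chunk)), float(chunk.count("bert"))]


words = "elasticsearch meets bert for document search with bert".split()
print(average_embedding(words, toy_encode, chunk_size=3))
```

A plain mean weights a short final chunk the same as a full one; weighting each vector by its chunk length is one possible refinement.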

No output in the html file

Hello, Hironsan. We followed your manual, but we are stuck at webapi.py; the error is an index problem like this:

Traceback (most recent call last):
web_1 | File "app.py", line 8, in <module>
web_1 | INDEX_NAME = os.environ['INDEX_NAME']
web_1 | File "/usr/local/lib/python3.8/os.py", line 673, in __getitem__
web_1 | raise KeyError(key) from None
web_1 | KeyError: 'INDEX_NAME'

There is also no output in the HTML page. Can you please help us?
Thank you!
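
The KeyError above is what `os.environ['INDEX_NAME']` raises when the variable is missing from the container's environment. A small sketch of a lookup that fails with a more descriptive message is shown here; `require_env` is a hypothetical helper, not part of app.py.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of an environment variable, or raise with a clear hint."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "set it in the shell that runs docker-compose up"
        )
    return value


# Example with an explicit mapping instead of the real environment.
print(require_env("INDEX_NAME", {"INDEX_NAME": "jobsearch"}))
```

The index name "jobsearch" is taken from the PowerShell session quoted in another issue on this page, where INDEX_NAME is set before running docker-compose up.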

docker-compose up fails with an error

When I run docker-compose up, I get an error. The output is as follows:

xujh@DESKTOP-C9I74F8:/mnt/f/github_download/semantic_search/bertsearch$ docker-compose up
[+] Running 3/3
⠿ Container bertsearch_elasticsearch_1 Running 0.0s
⠿ Container bertsearch_bertserving_1 Started 3.5s
⠿ Container bertsearch_web_1 Running 0.0s
Attaching to bertserving_1, elasticsearch_1, web_1
bertserving_1 | standard_init_linux.go:228: exec user process caused: no such file or directory
bertserving_1 exited with code 1

Any help would be much appreciated.
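
One frequent cause of "no such file or directory" for a script that clearly exists is Windows (CRLF) line endings, which make the kernel look for an interpreter whose name ends in a carriage return. That is only a guess here, since the issue does not show the file, although the /mnt/f path suggests a checkout on a Windows drive. A sketch for checking and converting a script, with hypothetical helper names:

```python
from pathlib import Path


def has_crlf(script: Path) -> bool:
    """Return True if the file contains any Windows-style CRLF line endings."""
    return b"\r\n" in script.read_bytes()


def convert_to_lf(script: Path) -> int:
    """Rewrite CRLF line endings as LF in place; return how many were replaced."""
    data = script.read_bytes()
    count = data.count(b"\r\n")
    if count:
        script.write_bytes(data.replace(b"\r\n", b"\n"))
    return count


# Assumed location of the script, relative to the repository root.
entrypoint = Path("bertserving/entrypoint.sh")
if entrypoint.exists():
    print(f"replaced {convert_to_lf(entrypoint)} CRLF line endings")
else:
    print(f"{entrypoint} not found; run this from the repository root")
```

If any line endings were replaced, the image has to be rebuilt before the container will pick up the corrected script.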

Arguments in entrypoint.sh aren't being passed to the underlying container when running docker-compose up

We're looking to use a different configuration of bert-serving-server than the default one.
For that we've changed a couple of arguments in bertserving/entrypoint.sh. For example:

bert-serving-start -num_worker=1 -model_dir /model

was changed to:

bert-serving-start -num_worker=4 -max_seq_len 512 -model_dir /model
(also tried -max_seq_len=512)

However, docker-compose seems to ignore these completely: the logging shows that the configuration has remained the same (namely num_worker=1, max_seq_len=25), and document retrieval performance is correspondingly inferior.
(A screenshot of the log was attached here.)

bertserving only "partial-complete" services set-up in docker-compose

Hi Hironsan,

Thanks for your wonderful work!

I found that when starting both bertserving and Elasticsearch (ES), bert-as-service hangs at the "freeze" stage.

/usr/local/bin/bert-serving-start -num_worker=1 -model_dir /model
                 ARG   VALUE
__________________________________________________
           ckpt_name = bert_model.ckpt
         config_name = bert_config.json
                cors = *
                 cpu = False
          device_map = []
       do_lower_case = True
  fixed_embed_length = False
                fp16 = False
 gpu_memory_fraction = 0.5
       graph_tmp_dir = None
    http_max_connect = 10
           http_port = None
        mask_cls_sep = False
      max_batch_size = 256
         max_seq_len = 25
           model_dir = /model
no_position_embeddings = False
    no_special_token = False
          num_worker = 1
       pooling_layer = [-2]
    pooling_strategy = REDUCE_MEAN
                port = 5555
            port_out = 5556
       prefetch_size = 10
 priority_batch_size = 16
show_tokens_to_client = False
     tuned_model_dir = None
             verbose = False
                 xla = False

I:VENTILATOR:[__i:__i: 67]:freeze, optimize and export graph, could take a while...
I:GRAPHOPT:[gra:opt: 53]:model config: /model/bert_config.json
I:GRAPHOPT:[gra:opt: 56]:checkpoint: /model/bert_model.ckpt
I:GRAPHOPT:[gra:opt: 60]:build graph...
I:GRAPHOPT:[gra:opt:132]:load parameters from checkpoint...
I:GRAPHOPT:[gra:opt:136]:optimize...
I:GRAPHOPT:[gra:opt:144]:freeze...

By contrast, with any of the following setups:

  • comment out ES in the docker-compose file and run "docker-compose up --build"
  • run bert-as-service locally (Mac, installed with pip)
  • run bert-as-service as a standalone Docker container

the output looks like this:

/usr/local/bin/bert-serving-start -num_worker=1 -model_dir /model
                 ARG   VALUE
__________________________________________________
           ckpt_name = bert_model.ckpt
         config_name = bert_config.json
                cors = *
                 cpu = False
          device_map = []
       do_lower_case = True
  fixed_embed_length = False
                fp16 = False
 gpu_memory_fraction = 0.5
       graph_tmp_dir = None
    http_max_connect = 10
           http_port = None
        mask_cls_sep = False
      max_batch_size = 256
         max_seq_len = 25
           model_dir = /model
no_position_embeddings = False
    no_special_token = False
          num_worker = 1
       pooling_layer = [-2]
    pooling_strategy = REDUCE_MEAN
                port = 5555
            port_out = 5556
       prefetch_size = 10
 priority_batch_size = 16
show_tokens_to_client = False
     tuned_model_dir = None
             verbose = False
                 xla = False

I:VENTILATOR:[__i:__i: 67]:freeze, optimize and export graph, could take a while...
I:GRAPHOPT:[gra:opt: 53]:model config: /model/bert_config.json
I:GRAPHOPT:[gra:opt: 56]:checkpoint: /model/bert_model.ckpt
I:GRAPHOPT:[gra:opt: 60]:build graph...
I:GRAPHOPT:[gra:opt:132]:load parameters from checkpoint...
I:GRAPHOPT:[gra:opt:136]:optimize...
I:GRAPHOPT:[gra:opt:144]:freeze...
I:GRAPHOPT:[gra:opt:149]:write graph to a tmp file: /tmp/tmpannxv8fg
I:VENTILATOR:[__i:__i: 75]:optimized graph is stored at: /tmp/tmpannxv8fg
I:VENTILATOR:[__i:_ru:129]:bind all sockets
I:VENTILATOR:[__i:_ru:133]:open 8 ventilator-worker sockets
I:VENTILATOR:[__i:_ru:136]:start the sink
I:SINK:[__i:_ru:306]:ready
I:VENTILATOR:[__i:_ge:222]:get devices
W:VENTILATOR:[__i:_ge:246]:no GPU available, fall back to CPU
I:VENTILATOR:[__i:_ge:255]:device map:
		worker  0 -> cpu
I:WORKER-0:[__i:_ru:531]:use device cpu, load graph from /tmp/tmpannxv8fg
I:WORKER-0:[__i:gen:559]:ready and listening!
I:VENTILATOR:[__i:_ru:164]:all set, ready to serve request!

This causes create_documents.py to hang forever.
Could this be related to a memory or network issue in docker-compose?

Looking forward to your update ~
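
To make a hang like this visible instead of waiting indefinitely, a client-side probe with a timeout can help. The sketch below is hedged: `server_responds` is a hypothetical helper, and it assumes that the bert-serving client raises the built-in `TimeoutError` when its `timeout` argument (in milliseconds) expires, which is my understanding of bert-serving-client but is not confirmed by anything on this page.

```python
from typing import Any, Callable


def server_responds(make_client: Callable[[], Any], probe: str = "ping") -> bool:
    """Return True if a client can be created and one encode call returns in time.

    `make_client` is any zero-argument callable returning an object with an
    `encode(list_of_str)` method that raises TimeoutError when no reply arrives.
    """
    try:
        client = make_client()
        client.encode([probe])
    except TimeoutError:
        return False
    return True


# Hypothetical usage with bert-serving-client (not executed here):
#
#   from bert_serving.client import BertClient
#   ok = server_responds(lambda: BertClient(ip="localhost", timeout=10000))
#   print("server ready" if ok else "no reply; check the bertserving logs")
```

If the probe fails while the server log stops at "freeze...", the server has not finished exporting its graph, which matches the first log in this issue.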

This does not work

When I follow the steps in the readme, I get the following error:

Traceback (most recent call last):
web_1 | File "/app/app.py", line 5, in <module>
web_1 | from elasticsearch import Elasticsearch
web_1 | File "/usr/local/lib/python3.9/site-packages/elasticsearch/__init__.py", line 24, in <module>
web_1 | from .client import Elasticsearch
web_1 | File "/usr/local/lib/python3.9/site-packages/elasticsearch/client/__init__.py", line 4, in <module>
web_1 | from ..transport import Transport
web_1 | File "/usr/local/lib/python3.9/site-packages/elasticsearch/transport.py", line 5, in <module>
web_1 | from .connection import Urllib3HttpConnection
web_1 | File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/__init__.py", line 2, in <module>
web_1 | from .http_requests import RequestsHttpConnection
web_1 | File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_requests.py", line 3, in <module>
web_1 | from base64 import decodestring
web_1 | ImportError: cannot import name 'decodestring' from 'base64' (/usr/local/lib/python3.9/base64.py)
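
For context, `base64.decodestring` was a deprecated alias that Python 3.9 removed, and the traceback shows the web container running Python 3.9. The standard-library replacement is `base64.decodebytes`, shown in the short sketch below. The credential string is made up for illustration, and `decode_b64` is a hypothetical wrapper; the practical remedy is more likely pinning the base image to an older Python or upgrading the elasticsearch client, neither of which has been verified here.

```python
import base64


def decode_b64(data: bytes) -> bytes:
    """Decode base64 data with the function that replaced base64.decodestring."""
    return base64.decodebytes(data)


# Round-trip a made-up value to show the replacement behaves as expected.
encoded = base64.encodebytes(b"user:password")
print(decode_b64(encoded))

# On Python 3.9 and later the old alias is absent, which is what the import fails on.
print("decodestring available:", hasattr(base64, "decodestring"))
```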
