openvinotoolkit / model_server Goto Github PK

A scalable inference server for models optimized with OpenVINO™

Home Page: https://docs.openvino.ai/2024/ovms_what_is_openvino_model_server.html

License: Apache License 2.0

Dockerfile 0.08% Makefile 0.56% Python 11.57% Shell 0.74% Starlark 1.34% Groovy 0.09% C++ 83.95% C 0.71% CMake 0.04% Go 0.49% Java 0.43%

openvino inference ai edge cloud deep-learning serving dag kubernetes machine-learning

model_server's People

Contributors

Stargazers

Watchers

Forkers

qindj wangjinzhu8013 xe1gyq baishalichaudhury kkonradi colinsongf mypopydev gavinljj sunilsivadas chenqi1997 omar16100 topgun666 hi-yan forwhaat zhangxiaoli73 ramakrishnapallala ericustc emilyhutson ashahba rahul24-06 tomholmisto xiaoyubing tebogonakampe wkkyle jhrobres awesome-archive myzha0 flavio58it judytong doken-tokuyama taross-f llandis subashp namptiter piruo7 bretagne-peiqi arcral leafleftout cuongdv1 stefanruan naoufelabs nitin-mane psfoley patrykmatyjasek eunjeesung louie-tsai rasapala hah1504 janekmichalik deepmd-io schoi-habana hesitationer frisky1985 pnijhara zerocurve tkhred kei-1986 panda259 gnomonsis kthumar1997 ojjsaw spfuentes jerinmax smcope anilmur dhenisdj gonzalopixart liuyicai kylehuangcb gentletorch mandar-opc citron rrcoco jsxyhelu shreyas-chaudhari lsdace30095 toydogcat dtrawins samerhjr clever-scientist mjkambex leo-xukang xiaming9880 wzhen12 beosro ivost pandinosaurus ramalg miss-bug prathyusha6 sshyran danielschulz nyurguhun mzegla skypow2012 yzyw0702 ryanloney dinhtanpham cshubhamrao 1000milesaway

model_server's Issues

Suggest improve document to get them consistent to help ease the run of example_client scripts

According to the home page, I can successfully run model server and play the face_detection.py script.

But when I try to run other examples scripts under example_client, it does not work.
I think this is because they require different configurations when launching server

model path should follow the requirements of TensorFlow Serving REST API? --model_path /v1/models/face-detection
document should highlight the method to enable REST API, --rest_port, I think this will help save newbies a lot of time.
It would be better if have some introductions on the meaning of each docker command parameters.

how to get libcpu_extension_avx2.so?

plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
if args.cpu_extension and 'CPU' in args.device:
plugin.add_cpu_extension(args.cpu_extension)

and parser.add_argument("-l", "--cpu_extension",
help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
"impl.", type=str, default='C:\Users\HHH\Documents\Intel\OpenVINO\inference_engine_samples_build\intel64\Release')
i can't find libcpu_extension_avx2.so file.
C:\Users\HHH\Documents\Intel\OpenVINO\inference_engine_samples_build\intel64\Release?

Kubernetes usage with NCS

Is openvino-model-server usage with NCSv2 compatible with Kubernetes? I can get a private image of openvino-model-server to work under docker with ncsv2 acceleration, but it cannot seem to be able to get it to work with Kubernetes, despite using docker runtime.

I have tried to recompile libusb to remove udev support as per: https://docs.openvinotoolkit.org/latest/_docs_install_guides_installing_openvino_docker_linux.html, but that does not seem to help.

With Kubernetes, I am seeing the following error:

Content error: Can not init Myriad device: NC_ERROR.

I am starting the container as privileged and using host network.

Dockerfile uses hard-coded -j4 option

The hard coded value 4 works well for CPUs with 4 threads. On a CPU with more than that, which is found in many modern computers, the CPU is under used with the -j4 setting. Using e.g. the outcome of nproc instead would be more efficient and shorten build times on CPUs capable of more than 4 threads.

use yolov3 erro

when i use
docker pull intelaipg/openvino-model-server
docker run --rm -d -it -v /home/xs/openvino_model/models:/opt/ml/:ro -p 9001:9001 intelaipg/openvino-model-server:latest /ie-serving-py/start_server.sh ie_serving model --model_path /opt/ml/yolov3 --model_name yolov3 --port 9001
I accept
2019-07-13 05:18:05,096 - ie_serving.main - INFO - Log level set: INFO
2019-07-13 05:18:05,096 - ie_serving.models.model - INFO - Server start loading model: yolov3
2019-07-13 05:18:05,098 - ie_serving.models.model - INFO - Creating inference engine object for version: 1
2019-07-13 05:18:06,459 - ie_serving.models.ir_engine - INFO - Matched keys for model: {'inputs': {'Placeholder': 'Placeholder'}, 'outputs': {'detector/yolo-v3/Conv_14/BiasAdd/YoloRegion': 'detector/yolo-v3/Conv_14/BiasAdd/YoloRegion', 'detector/yolo-v3/Conv_22/BiasAdd/YoloRegion': 'detector/yolo-v3/Conv_22/BiasAdd/YoloRegion', 'detector/yolo-v3/Conv_6/BiasAdd/YoloRegion': 'detector/yolo-v3/Conv_6/BiasAdd/YoloRegion'}}
2019-07-13 05:18:06,459 - ie_serving.models.model - INFO - List of available versions for yolov3 model: [1]
2019-07-13 05:18:06,459 - ie_serving.models.model - INFO - Default version for yolov3 model is 1
2019-07-13 05:18:06,465 - ie_serving.server.start - INFO - Server listens on port 9001 and will be serving models: ['yolov3']
but when i use
python3 jpeg_classification.py --grpc_port 9001 --input_name Placeholder --output_name outputs --model_name yolov3 --size 416
I accept
Start processing:
Model name: yolov3
Images list file: input_images.txt
images/airliner.jpeg (1, 3, 416, 416) ; data range: 0.0 : 255.0
Invalid output name outputs
Available outputs:
detector/yolo-v3/Conv_22/BiasAdd/YoloRegion
detector/yolo-v3/Conv_14/BiasAdd/YoloRegion
detector/yolo-v3/Conv_6/BiasAdd/YoloRegion
please help

Unable to load person detection model with config.json

My config.json,
{
"model_config_list":[
{
"config":{
"name":"face_detection",
"base_path":"s3://bucket/2020/4/face_detection",
"batch_size": "auto",
"model_version_policy": {"specific": { "versions":[3] }},
"shape": "auto"
}
},
{
"config":{
"name":"person_detection",
"base_path":"s3://buckets/2020/4/person_detection",
"batch_size": "auto",
"model_version_policy": {"specific": { "versions":[1] }},
"shape": "auto"
}
}
]
}

The problem is the face detection model load and infers without error. But the person detection model is causing the following error.

ie_serving.models.model - ERROR - Error occurred while loading model version: {'xml_file': '/tmp/person-detection-0106.xml', 'bin_file': '/tmp/person-detection-0106.bin', 'mapping_config': None, 'version_number': 3, 'batch_size_param': None, 'shape_param': '(1,3,800,1344)', 'num_ireq': 1, 'target_device': 'CPU', 'plugin_config': None}

ie_serving.models.model - ERROR - Content error: IShapeInferExtension wasn't registered for node 668 with type ExperimentalDetectronPriorGridGenerator

Unexpected results when processing batches with face-detection-retail-0004

I already have a working solution with the model by processing input data with the shape (1, 3, 300, 300) multiple times, but I am looking to increase the performance of my solution using batch processing.

My issue is as follows:
If I, for example, process an input with the shape (4, 3, 300, 300) it seems like only the first of the batch returns correct values. The others will return confidence, image_id and label values (that look potentially correct) and a 0 as each of the other values (xmin, xmax, ymin, ymax).
Below is an example of part of the output.

[1.0, 1.0, 0.999951243401, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.99983048439, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.999551355839, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.998133003712, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.998106598854, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.966695904732, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.948006510735, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.897149503231, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.670906364918, 0.0, 0.0, 0.0, 0.0] , [1.0, 1.0, 0.59589445591, 0.0, 0.0, 0.0, 0.0] ,

I have used batch processing for other models and it has worked as expected but I'm not sure what is happening with face-detection-retail-0004.

Any help would be appreciated.

intelaipg/openvino-model-server:latest with --target_device MYRIAD fails

I am trying to use the official model server with the MYRIAD device (NCSv2) and it fails as follows:

2020-01-28 21:01:53,455 - ie_serving.main - INFO - Log level set: INFO
2020-01-28 21:01:53,456 - ie_serving.models.model - INFO - Server start loading model: squeezenet1.1
2020-01-28 21:01:53,457 - ie_serving.models.model - INFO - Creating inference engine object for version: 1
2020-01-28 21:01:53,457 - ie_serving.models.model - ERROR - Error occurred while loading model version: {'xml_file': '/opt/ml/squeezenet/1/squeezenet1.1.xml', 'bin_file': '/opt/ml/squeezenet/1/squeezenet1.1.bin', 'mapping_config': None, 'version_number': 1, 'batch_size_param': None, 'shape_param': None, 'num_ireq': 1, 'target_device': 'MYRIAD', 'plugin_config': None}
2020-01-28 21:01:53,457 - ie_serving.models.model - ERROR - Content error: Cannot find plugin to use :
2020-01-28 21:01:53,457 - ie_serving.models.model - INFO - List of available versions for squeezenet1.1 model: []
2020-01-28 21:01:53,457 - ie_serving.models.model - INFO - Default version for squeezenet1.1 model is -1
2020-01-28 21:01:53,461 - ie_serving.server.start - INFO - gRPC server listens on port 9001 and will be serving models: ['squeezenet1.1']
2020-01-28 21:01:54,463 - ie_serving.models.model - INFO - Server will start updating model: squeezenet1.1
2020-01-28 21:01:54,463 - ie_serving.models.model - INFO - Creating inference engine object for version: 1
2020-01-28 21:01:54,464 - ie_serving.models.model - ERROR - Error occurred while loading model version: {'xml_file': '/opt/ml/squeezenet/1/squeezenet1.1.xml', 'bin_file': '/opt/ml/squeezenet/1/squeezenet1.1.bin', 'mapping_config': None, 'version_number': 1, 'batch_size_param': None, 'shape_param': None, 'num_ireq': 1, 'target_device': 'MYRIAD', 'plugin_config': None}
2020-01-28 21:01:54,464 - ie_serving.models.model - ERROR - Content error: Cannot find plugin to use :
2020-01-28 21:01:54,464 - ie_serving.models.model - INFO - List of available versions after update for squeezenet1.1 model: []
2020-01-28 21:01:54,464 - ie_serving.models.model - INFO - Default version after update for squeezenet1.1 model is -1
2020-01-28 21:01:55,466 - ie_serving.models.model - INFO - Server will start updating model: squeezenet1.1
2020-01-28 21:01:55,466 - ie_serving.models.model - INFO - Creating inference engine object for version: 1

...

and so forth in a loop. Note that if I remove the --target_device MYRIAD from docker command line, all works as expected. I can however use the NCS device just fine when I run demos from the toolkit package directly on the host or from a container. So it would seem to be an issue between MYRIAD and the official container.

Converting espnet fastspeech model onnx to openvino, Got bin file size 0 without any error

Hi, I trying to convert espnet fastspeech model pytorch to openvino model to check how much inference faster than pytorch.
So Using onnx converted model put in openvino tool. There is no error. just success log message. but I got a bin file size 0.
I checked model unsupported function have. It is clear.
What part can I check in my model?

Thanks.

Dynamically change configuration file

How can we dynamically add or delete new models without restarting the service ?

Looking for model server to support 2019.R2 models

It seems that the latest version (2019.1.1) only supports version "4" of the OpenVino models. Is there any guidance on how to host the most recent models (version=6) in the model server?

gs://intelai_public_models not found

gsutil ls gs://intelai_public_models
BucketNotFoundException: 404 gs://intelai_public_models bucket does not exist.

We were using the above for some examples in Seldon Core. Is there a new location?

Grpc-java-client takes a lot of time.

openvino model_server can support tensorflow serving,but when I use grcp-java-client,it takes a lot of time.
when I use python,input deserialization need 0.8 ms.
when I use java,input deserialization need 38 ms.

Can a single server instance use multiple NCS devices?

Will a single server instance utilize multiple NCS devices to accelerate a single model?

Is the openvino model server docker image compatible with a Raspberry Pi 4?

I'm trying to run the openvino model server on a Raspberry Pi 4 and after trying to run the docker image I ran into this error:

exec user process caused "exec format error"

Is it possible to run openvino model server on a raspberry pi? What steps would I need to take to make it work?

Thanks in advance for any help.

When will it support kaldi model? Many thanks.

gRPC client examples fails

Build steps Pass:
cp (download path)/l_openvino_toolkit_p_2019.1.094_online.tgz .
make docker_build_bin http_proxy=$http_proxy https_proxy=$https_proxy

Placed ResNet-50 model under /opt/ml/model1/1 folder

Starting Docker Container with a Single Model with:
docker run --rm -d -v /models/:/opt/ml:ro -p 9001:9001 ie-serving-py:latest
/ie-serving-py/start_server.sh ie_serving model --model_path /opt/ml/model1 --model_name my_model --port 9001

The following command is failing with below log:
python grpc_serving_client.py --grpc_port 9001 --images_numpy_path imgs.npy --input_name data --output_name prob --transpose_input False --labels_numpy lbs.npy

Log:
E0604 18:41:13.744116590 24051 http_proxy.cc:62] 'https' scheme not supported in proxy URI
('Image data range:', 0.0, ':', 255.0)
Start processing:
Model name: resnet
Iterations: 10
Images numpy path: imgs.npy
Images in shape: (10, 3, 224, 224)

Traceback (most recent call last):
File "grpc_serving_client.py", line 105, in
result = stub.Predict(request, 10.0) # result includes a dictionary with all model outputs
File "/usr/lib64/python2.7/site-packages/grpc/_channel.py", line 565, in call
return _end_unary_response_blocking(state, call, False, None)
File "/usr/lib64/python2.7/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.NOT_FOUND
details = "Servable not found for request: Specific(resnet, 0)"
debug_error_string = "{"created":"@1559653873.758416316","description":"Error received from peer ipv6:[::1]:9001","file":"src/core/lib/surface/call.cc","file_line":1046,"grpc_message":"Servable not found for request: Specific(resnet, 0)","grpc_status":5}"

@dtrawins How can I fix this issue? is it a known issue?

Kubeflow demo error

Hi!

Just tried to run Kubeflow pipelines demo and caught this error: Internal compiler error: Found unresolved PipelineParam.

Ubuntu 18.04, kubeflow pipelines v0.1.40

OpenVINO-model-server/example_kubeflow_pipelines/ovms_deployer$ /usr/local/bin/dsl-compile --py deployer.py --output deployer.tar.gz

/usr/lib/python3/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
/usr/local/lib/python3.6/dist-packages/kfp/dsl/_metadata.py:67: UserWarning: Explicit creation of `kfp.dsl.PipelineParam`s by the users is deprecated. The users should define the parameter type and default values using standard pythonic constructs: def my_func(a: int = 1, b: str = "default"):
  warnings.warn('Explicit creation of `kfp.dsl.PipelineParam`s by the users is deprecated. The users should define the parameter type and default values using standard pythonic constructs: def my_func(a: int = 1, b: str = "default"):')
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=model-export-path}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=model-export-path}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=server-name}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=server-name}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=log-level}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=log-level}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=batch-size}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=batch-size}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=model-version-policy}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=model-version-policy}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=replicas}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=replicas}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=evaluation-images-list}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=evaluation-images-list}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=image-path-prefix}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=image-path-prefix}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=model-input-name}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=model-input-name}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=model-output-name}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=model-output-name}}".
  serialized_value),
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:133: UserWarning: Missing type name was inferred as "PipelineParam" based on the value "{{pipelineparam:op=;name=model-input-size}}".
  warnings.warn('Missing type name was inferred as "{}" based on the value "{}".'.format(type_name, str(value)))
/usr/local/lib/python3.6/dist-packages/kfp/components/_data_passing.py:154: UserWarning: There are no registered serializers from type "PipelineParam" to type "PipelineParam", so the value will be serializers as string "{{pipelineparam:op=;name=model-input-size}}".
  serialized_value),
Traceback (most recent call last):
  File "/usr/local/bin/dsl-compile", line 11, in <module>
    load_entry_point('kfp==0.1.40', 'console_scripts', 'dsl-compile')()
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/main.py", line 123, in main
    compile_pyfile(args.py, args.function, args.output, not args.disable_type_check)
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/main.py", line 112, in compile_pyfile
    _compile_pipeline_function(pipeline_funcs, function_name, output_path, type_check)
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/main.py", line 71, in _compile_pipeline_function
    kfp.compiler.Compiler().compile(pipeline_func, output_path, type_check)
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/compiler.py", line 879, in compile
    package_path=package_path)
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/compiler.py", line 942, in _create_and_write_workflow
    self._write_workflow(workflow, package_path)
  File "/usr/local/lib/python3.6/dist-packages/kfp/compiler/compiler.py", line 897, in _write_workflow
    'Internal compiler error: Found unresolved PipelineParam. '

!Error:Failed to pick subchannel

I can't find the reason.Seek help,thanks.

NOTE:
[root@VM-0-15-centos ~]# python3 get_serving_meta.py --grpc_port 9000 --model_name semantic-segmentation-adas --model_version 1
2020-10-18 23:21:10.053573: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-10-18 23:21:10.053619: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Getting model metadata for model: semantic-segmentation-adas
Traceback (most recent call last):
File "get_serving_meta.py", line 97, in
result = stub.GetModelMetadata(request, 10.0) # result includes a dictionary with all model outputs
File "/usr/local/lib64/python3.6/site-packages/grpc/_channel.py", line 690, in call
return _end_unary_response_blocking(state, call, False, None)
File "/usr/local/lib64/python3.6/site-packages/grpc/_channel.py", line 592, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1603034471.856134639","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3934,"referenced_errors":[{"created":"@1603034471.856132122","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":393,"grpc_status":14}]}"

Dockerfile_binary_openvino can benefit from ADD instead of COPY

ADD has the build-in unpacking of tar archives, which would be beneficial in Dockerfile_binary_openvino where we could then skip having a separate tar command.

gRPC client example jpeg_classification.py passes RBG image while RESNET expects RGB

Line 48 of jpeg_classification.py transposes BGR-->RBG. RESNET expects RGB format.

Proposed change:
Line 48: img = img.transpose(2,1,0).reshape(1,3,size,size)

Duplicate package entries for the Dockerfile apt-get commands

curl, ca-certificates and libgtk2.0-dev all have duplicate entries in the Dockerfiles' apt-get commands. By having the package list alphabetically sorted, duplicates can easily be spotted by humans and be avoided. Also, humans can then quickly see if a packages is missing or not. Avoiding duplicates keeps the list as short as possible which increases ease of maintainability.

ERROR - Content error: Cannot create ShapeOf layer PriorBoxClustered_0/0_port id:519

I converted model from SSD object detection API (ssd_inception_v2_coco_2018_01_28). followed from https://docs.openvinotoolkit.org/2020.3/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_Object_Detection_API_Models.html
When loading model in openvino model server. I had an error ERROR - Content error: Cannot create ShapeOf layer PriorBoxClustered_0/0_port id:519.
I tried some models SSD with backbones such as mobilenetv2, resnet50 and both of them had the same error.
With model face_detection in examples, it 's ok.

Can I use it under Windows?

Hi ,Admin

How can i use this model server under windows without docker?

thx

Download specific OpenVINO version (2020.1.023)

The 2020.1.023 version was installed in my host machine (ubuntu 16.04), now I want to build an OpenVINO docker image using the same OpenVINO version with HDDL device support, but this link ( Intel Distribution of OpenVINO binary package) https://registrationcenter-download.intel.com/akdlm/irc_nas/17062/l_openvino_toolkit_p_2021.0.023.tgz is not accessible:

docker build -f Dockerfile --build-arg DLDT_PACKAGE_URL=https://registrationcenter-download.intel.com/akdlm/irc_nas/17062/l_openvino_toolkit_p_2021.0.023.tgz -t ie-serving-py:2020r1 .

How and where I can download a specificOpenVINO version. Thanks for your time.

Reproduce OpenVINO efficiency

Hi!

Cannot reproduce the performance of plain OpenVINO with Model Server. Benchmarking tool gives more than 12FPS however the best efficiency from OMS is only 8.4 FPS. Can you guide me if I'm using OMS in a wrong way? Shall I try to run multiple instances of docker / clients to achieve performance close to target?

benchmark_app.py:

$ python3 /opt/intel/openvino/deployment_tools/tools/benchmark_tool/benchmark_app.py -m graph.xml -l /opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_avx2.so

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading Inference Engine
[ INFO ] CPU extensions is loaded /opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_avx2.so
[ INFO ] InferenceEngine:
         API version............. 2.1.custom_releases/2019/R3_cb6cad9663aea3d282e0e8b3e0bf359df665d5d0
[ INFO ] Device info
         CPU
         MKLDNNPlugin............ version 2.1
         Build................... 30677

[Step 3/11] Reading the Intermediate Representation network
[Step 4/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1, precision: MIXED
[Step 5/11] Configuring input of the model
[Step 6/11] Setting device configuration
[Step 7/11] Loading the model to the device
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'input_ids' precision FP32, dimensions (NC): 1 128
[ INFO ] Network input 'input_type_ids' precision FP32, dimensions (NC): 1 128
[ INFO ] Network input 'input_mask' precision FP32, dimensions (NC): 1 128
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'input_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_type_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_mask' with random values (some binary data is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'input_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_type_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_mask' with random values (some binary data is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'input_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_type_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_mask' with random values (some binary data is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'input_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_type_ids' with random values (some binary data is expected)
[ INFO ] Fill input 'input_mask' with random values (some binary data is expected)
[Step 10/11] Measuring performance (Start inference asyncronously, 4 inference requests using 4 streams for CPU, limits: 60000 ms duration)
[Step 11/11] Dumping statistics report
Count:      744 iterations
Duration:   60606.23 ms
Latency:    322.86 ms
Throughput: 12.28 FPS

Client script (tried to use 16 async requests):

import grpc
import numpy as np
import tensorflow as tf
import time
import argparse

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel('127.0.0.1:9001')

stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()

request.model_spec.name = 'bert'

input_ids = np.random.randint(0, 255, (1, 128))
input_mask = np.random.randint(0, 255, (1, 128))
input_type_ids = np.random.randint(0, 255, (1, 128))

request.inputs['input_ids'].CopyFrom(tf.make_tensor_proto(input_ids))
request.inputs['input_mask'].CopyFrom(tf.make_tensor_proto(input_mask))
request.inputs['input_type_ids'].CopyFrom(tf.make_tensor_proto(input_type_ids))

nireq = 16

futures = [stub.Predict.future(request, 5.0) for i in range(nireq)]
for future in futures:
    future.result()


start = time.time()
n = 64
for i in range(n // nireq):
    futures = [stub.Predict.future(request, 5.0) for i in range(nireq)]
    for future in futures:
        future.result()

print((time.time() - start) / n)

time.sleep(5)

CPU_THROUGHPUT_NUMA and CPU_THREADS_NUM

docker run --rm -d  -v /path/to/models/:/opt/ml:ro -p 9001:9001 -p 8001:8001 ie-serving-py:latest /ie-serving-py/start_server.sh ie_serving model --model_path /opt/ml/bert --model_name bert --port 9001 --rest_port 8001 --nireq 4 --grpc_workers 8 --plugin_config "{\"CPU_THROUGHPUT_STREAMS\": \"CPU_THROUGHPUT_NUMA\",\"CPU_THREADS_NUM\": \"4\"}"

7.35 FPS

CPU_THROUGHPUT_AUTO

docker run --rm -d  -v /path/to/models/:/opt/ml:ro -p 9001:9001 -p 8001:8001 ie-serving-py:latest /ie-serving-py/start_server.sh ie_serving model --model_path /opt/ml/bert --model_name bert --port 9001 --rest_port 8001 --nireq 4 --grpc_workers 8 --plugin_config "{\"CPU_THROUGHPUT_STREAMS\": \"CPU_THROUGHPUT_AUTO\"}"

8.33 FPS

CPU_THROUGHPUT_AUTO and nireq = 8 in the Python script

7.63 FPS

CPU_THROUGHPUT_AUTO and nireq = 4 in the Python script

6.53 FPS

Model: BERT base uncased, graph.bin, graph.xml

Thanks in advance!

Improve Dockerfile build step

Using Docker's WORKDIR and its directory creation functionality can make the build steps more readable and get rid of some mkdir foo && cd foo stuff. With cleaner build steps, debugging in intermediate containers is improved when there are compilation failures.

error found in home page demo TypeError: Expected Ptr<cv::UMat> for argument 'img'

OS: Ubuntu 16.04
Failed to run the last step in https://github.com/openvinotoolkit/model_server#running-the-server
Below change is needed to fix it
-img_out = cv2.rectangle(img_out,(x_min,y_min),(x_max,y_max),(0,0,255),1)
+img_out = cv2.rectangle(cv2.UMat(img_out),(x_min,y_min),(x_max,y_max),(0,0,255),1)

python face_detection.py --batch_size 1 --width 600 --height 400 --input_images_dir images --output_dir results
['people1.jpeg']
Start processing 1 iterations with batch size 1

Request shape (1, 3, 400, 600)
Response shape (1, 1, 200, 7)
image in batch item 0 , output shape (3, 400, 600)
detection 0 [[[0. 1. 1. 0.55241716 0.3024692 0.59122956
0.39170963]]]
x_min 331
y_min 120
x_max 354
y_max 156
Traceback (most recent call last):
File "face_detection.py", line 110, in
img_out = cv2.rectangle(img_out,(x_min,y_min),(x_max,y_max),(0,0,255),1)
TypeError: Expected Ptrcv::UMat for argument 'img'

Using the correct configurations to get the best performance

Hi,

I'm using openvino model server to run inference on multiple models. I've read the documentation but I'm not completely sure how I should set up the config.json. The target hardware is an Intel Xeon Silver 4216 (16 cores, 32 threads).

Below is what I have been using.

{
   "model_config_list":[
      {
         "config":{
            "name":"face-detection-retail-0004",
            "base_path":"/opt/ml/face-detection-retail-0004",
            "shape": "auto",
            "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
      },
      {
         "config":{
            "name":"age-gender-recognition-retail-0013",
            "base_path":"/opt/ml/age-gender-recognition-retail-0013",
            "batch_size": "auto",
            "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
      },
      {
         "config":{
            "name":"emotions-recognition-retail-0003",
            "base_path":"/opt/ml/emotions-recognition-retail-0003",
            "batch_size": "auto",
            "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
      },
      {
         "config":{
            "name":"head-pose-estimation-adas-0001",
            "base_path":"/opt/ml/head-pose-estimation-adas-0001",
            "batch_size": "auto",
            "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
      },
     {
        "config":{
           "name":"person-detection-retail-0013",
           "base_path":"/opt/ml/person-detection-retail-0013",
           "batch_size": "auto",
           "shape": "auto",
           "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
     },
     {
        "config":{
           "name":"person-reidentification-retail-0079",
           "base_path":"/opt/ml/person-reidentification-retail-0079",
           "batch_size": "auto",
           "nireq": 8
         },
         "plugin_config": {"CPU_THROUGHPUT_STREAMS": 8, "CPU_THREADS_NUM": 32}
     }
   ]
}

I'm using:
"CPU_THROUGHPUT_STREAMS": 8 because this is what the benchmark app determined was the optimal setup.
"CPU_THREADS_NUM": 32 because the hardware has 32 threads.
"nireq": 8 because that would be the maximum requests we would send per model.

I have a few questions regarding the config.json file:

Is there anything that looks incorrect?
Should the throughput streams and threads be set per model and should the values all be the same?
Should I include the grpc_workers property in each model's config or in the docker run command?
If grpc_workers should be in the config.json, is setting it to the same value as nireq fine or would I get better performance with a higher grpc_worker value?

I also have another question regarding sending batches to OVMS (not sure if I should make another issue for this). I have noticed that the fps is lower when sending larger batches. For example, I made a container with just 1 model loaded onto it. I then sent 500 images into it (asynchronously) in batches of 1, 4, 10 and 50. Using batches of 4 processed the images the fastest.
It is my understanding that using higher batches should produce a higher throughput, is this not the case when processing asynchronously?

Any help would be be appreciated.

Erro environment variable LD_LIBRARY_PATH in docker image

I use target device HDDL on k8s and image is openvino/ubuntu18_model_server:latest
The docker images environment:
docker inspect -f {{.Config.Env}} openvino/ubuntu18_model_server:latest

[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PYTHON=python3.6 INTEL_OPENVINO_DIR=/opt/intel/openvino PYTHONPATH=:/opt/intel/openvino/python/python3.6 LD_LIBRARY_PATH=:/opt/intel/openvino/deployment_tools/inference_engine/external/tbb/lib:/opt/intel/openvino/deployment_tools/inference_engine/external/mkltiny_lnx/lib:/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64:/opt/intel/openvino/deployment_tools/ngraph/lib]

IF run HDDL need:

LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/intel/openvino/deployment_tools/inference_engine/external/hddl/lib
HDDL_INSTALL_DIR=/opt/intel/openvino/deployment_tools/inference_engine/external/hddl

Is it possible to run with --target GPU

Does anyone know if it is possible to build the docker version of this with support for acceleration via intel GPUs?

Simplify the pwd | grep -> check result part of Dockerfile_binary_openvino

The

pwd | grep -q openvino_toolkit_p ;
if [ $? = 0 ];then sed -i 's [...]

parts of Dockerfile_binary_openvino are not, IMO, easily read. But what they do is to determine from pwd what toolkit (p/fpga) is at hand and selects components in silent.cfg accordingly. I reckon such a check is a typical job for the shell's built-in case statement, which would not only increase readability but also be easier to maintain. (If, say, the components would differ between toolkit versions, handling of multiple versions is easily added as case statements.)

make docker_build_clearlinux fails to build

I am trying to rebuild the container image using: make docker_build_clearlinux and it fails with:

Step 6/9 : RUN pip --no-cache-dir install -r requirements_clearlinux.txt
 ---> Running in 6f46c85b84ae
ERROR: Could not find a version that satisfies the requirement tensorflow==1.14.0 (from -r requirements_clearlinux.txt (line 1)) (from versions: none)
ERROR: No matching distribution found for tensorflow==1.14.0 (from -r requirements_clearlinux.txt (line 1))
The command '/bin/sh -c pip --no-cache-dir install -r requirements_clearlinux.txt' returned a non-zero code: 1
Makefile:113: recipe for target 'docker_build_clearlinux' failed

Received message larger than max ERROR

Have set "--grpc_channel_arguments grpc.max_receive_message_length=104857600",but still report Error“ debug_error_string = "{"created":"@1602672141.715481155","description":"Received message larger than max (8388653 vs. 4194304)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":190,"grpc_status":8}"”It seems that the restriction has not been removed。

Should I install OpenVINO first?

or I only need to deploy OpenVINO optimized model?

thx

Failed to deploy on Heroku.

Hi, I am trying to deploy ovms on heroku but getting this error. I also tried assigning --rest_port=$PORT but
cannot resolve the issue. I get this error Web process failed to bind to $PORT within 60 seconds of launch

Dockerfile

FROM openvino/model_server:latest as build_image

# Set where models should be stored in the container
ENV MODEL_BASE_PATH /opt/ml
ENV MODEL_NAME facedetect

COPY models/facedetect /opt/ml/facedetect

# Run this command to start ovms
COPY --chown=admin:admin ovms_entrypoint.sh /usr/bin/ovms_entrypoint.sh
ENTRYPOINT []
CMD ["/usr/bin/ovms_entrypoint.sh"]

ovms_entrypoint.sh

#!/bin/bash

echo $PORT
/ovms/bin/ovms --model_name=${MODEL_NAME} --model_path=${MODEL_BASE_PATH}/${MODEL_NAME} --log_level DEBUG "$@"

Log file from Heroku

2020-10-27T20:32:02.458949+00:00 heroku[web.1]: Starting process with command `/usr/bin/ovms_entrypoint.sh`
2020-10-27T20:32:05.229729+00:00 app[web.1]: 20344
2020-10-27T20:32:05.363836+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] CLI parameters passed to ovms server
2020-10-27T20:32:05.363844+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] model_path: /opt/ml/facedetect
2020-10-27T20:32:05.363865+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] model_name: facedetect
2020-10-27T20:32:05.363881+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] batch_size: 0
2020-10-27T20:32:05.363881+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] shape: 
2020-10-27T20:32:05.363881+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] model_version_policy: 
2020-10-27T20:32:05.363882+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] nireq: 0
2020-10-27T20:32:05.363885+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] target_device: CPU
2020-10-27T20:32:05.363885+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] plugin_config: 
2020-10-27T20:32:05.363888+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] gRPC port: 9178
2020-10-27T20:32:05.363932+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] REST port: 0
2020-10-27T20:32:05.363932+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] REST workers: 24
2020-10-27T20:32:05.363933+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] gRPC workers: 1
2020-10-27T20:32:05.363933+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] gRPC channel arguments: 
2020-10-27T20:32:05.363937+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] log level: DEBUG
2020-10-27T20:32:05.363937+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] log path: 
2020-10-27T20:32:05.364030+00:00 app[web.1]: [2020-10-27 20:32:05.363] [serving] [debug] Batch size set: false, shape set: false
2020-10-27T20:32:05.364185+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [debug] Currently registered versions count:0
2020-10-27T20:32:05.364189+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] Getting model from /opt/ml/facedetect
2020-10-27T20:32:05.364189+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] Model downloaded to /opt/ml/facedetect
2020-10-27T20:32:05.364214+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] Will add model: facedetect; version: 1 ...
2020-10-27T20:32:05.364311+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] Loading model: facedetect, version: 1, from path: /opt/ml/facedetect/1, with target device: CPU ...
2020-10-27T20:32:05.364321+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] STATUS CHANGE: Version 1 of model facedetect status change. New status: ( "state": "START", "error_code": "OK" )
2020-10-27T20:32:05.364325+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [debug] setLoading: facedetect - 1 (previous state: START) -> error: OK
2020-10-27T20:32:05.364326+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [info] STATUS CHANGE: Version 1 of model facedetect status change. New status: ( "state": "LOADING", "error_code": "OK" )
2020-10-27T20:32:05.364329+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [debug] Getting model files from path:/opt/ml/facedetect/1
2020-10-27T20:32:05.364808+00:00 app[web.1]: [2020-10-27 20:32:05.364] [serving] [debug] Try reading model file:/opt/ml/facedetect/1/face-detection-0200.xml
2020-10-27T20:32:05.595846+00:00 app[web.1]: [2020-10-27 20:32:05.594] [serving] [debug] Network shape - (1,3,256,256); Final shape - (1,3,256,256)
2020-10-27T20:32:05.597750+00:00 app[web.1]: [2020-10-27 20:32:05.594] [serving] [info] Input name: image; mapping_name: ; shape: 1 3 256 256 ; precision: FP32, layout:NCHW
2020-10-27T20:32:05.597752+00:00 app[web.1]: [2020-10-27 20:32:05.594] [serving] [debug] model: facedetect, version: 1; reshaping inputs is not required
2020-10-27T20:32:05.597753+00:00 app[web.1]: [2020-10-27 20:32:05.594] [serving] [info] Output name: detection_out ; mapping name: ; shape: 1 1 200 7  ; precision: FP32, layout:NCHW
2020-10-27T20:33:02.683257+00:00 heroku[web.1]: Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch
2020-10-27T20:33:02.738447+00:00 heroku[web.1]: Stopping process with SIGKILL
2020-10-27T20:33:03.255409+00:00 heroku[web.1]: Process exited with status 137
2020-10-27T20:33:03.303212+00:00 heroku[web.1]: State changed from starting to crashed

Support for intel Atom based processor(non AVX instruction set)

Currently the model server fails to run on hardware which doesn't contain AVX instructions.

tensorflow==1.13.1 --> This build is compiled on AVX instructions.

Error
2019-07-18 21:30:52.201867: F tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.
/models/inference_engine_entrypoint.sh: line 25: 148 Aborted (core dumped) /ie-serving-py/start_server.sh ie_serving config --config_path /models/models.conf --port 8500 --rest-port 8501

Model optimizer with constant node

Hi,
I do realize that openvino automatically removes a constant node that does nothing to the computation. However, for my project I would need a constant node that records information about the network, e.g. having attributes of lists of strings. Is there anyway I could embed a constant node in the network?

Thank you .

Hi,When I run hddldaemon and BurnInTest in the same time for a stress test,hddldaemon reported an error

When I run hddldaemon and BurnInTest in the same time for a stress test,hddldaemon reported an error and lost Device.Do you have any suggest for this issue?

using openvino server in async mode

I have one model which I want to use maximally on my CPU (8 cores).

sudo docker run -d -v /home/open_vino_model:/models/detection_model/1 -e LOG_LEVEL=DEBUG -p 9000:9000 openvino/ubuntu18_model_server /ie-serving-py/start_server.sh ie_serving model --model_path /models/detection_model --model_name detection_model --port 9000 --shape auto --grpc_workers 25 --rest_workers 25 --nireq 10 --plugin_config "{\"CPU_THROUGHPUT_STREAMS\": \"5\", \"CPU_THREADS_NUM\": \"10\"}"

I don't understand how can I speed up this most efficiently. Because right now this command without any parameters specification gives me about 2.1 fps. While all the variants I try are maximum 1.5 or less.

How to use model server in maximally async mode and get the most performance gain?

`docker build .` fails

Here is where it gets stuck:

Step 18/25 : COPY start_server.sh setup.py requirements.txt version /ie-serving-py/
COPY failed: stat /var/lib/docker/tmp/docker-builder606115979/version: no such file or directory

In order to get docker build . to run, I need to remove version from line 65

COPY start_server.sh setup.py requirements.txt version /ie-serving-py/

Is this supposed to be here?

Thanks,

Ryan

Can't start docker container with NCS

Plugin for Intel® Movidius™ Neural Compute Stick is distributed only in a binary form, so loading models on NCS is possible only with binary built docker image.

What does this mean?
So can't I use your built-in docker image?

I tried with ssd_inception_v2 model and got this error:

I was able to inference the model using this sample code directly:
https://gist.github.com/kodamap/9a9c971f43bffdc457a628c4313947fe

Any idea?

Cannot import opencv after installing intel-openvino-dev-ubuntu18-2019.3.344

I am not sure if here is the right place for this issue...
I've installed intel-openvino-dev-ubuntu18-2019.3.344 on Ubuntu 18.04 as your Dockerfile did:

curl -o GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo echo "deb https://apt.repos.intel.com/openvino/2019/ all main" > /etc/apt/sources.list.d/intel-openvino-2019.list
sudo apt-get update && sudo apt-get install -y intel-openvino-dev-ubuntu18-2019.3.344
export PYTHONPATH="/opt/intel/openvino/python/python3.6"
export LD_LIBRARY_PATH="/opt/intel/openvino/deployment_tools/inference_engine/external/tbb/lib:/opt/intel/openvino/deployment_tools/inference_engine/external/mkltiny_lnx/lib:/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64"

But after that, I can't import cv2, which is supposed?

Here is my output:

root@balena:~# printenv PYTHONPATH
/opt/intel/openvino/python/python3.6
root@balena:~# printenv LD_LIBRARY_PATH
/opt/intel/openvino/deployment_tools/inference_engine/external/tbb/lib:/opt/intel/openvino/deployment_tools/inference_engine/external/mkltiny_lnx/lib:/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64
root@balena:~# python3
Python 3.6.9 (default, Jul 24 2019, 11:25:17) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/opt/intel/openvino/python/python3.6', '/usr/local/lib/python36.zip', '/usr/local/lib/python3.6', '/usr/local/lib/python3.6/lib-dynload', '/usr/local/lib/python3.6/site-packages']
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cv2'
>>>

Version: no such file or directory

When I build dockerfile from master branch the builder returned follows error:

Step 12/15 : RUN virtualenv -p python3 .venv && . .venv/bin/activate && pip3 --no-cache-dir install -r requirements.txt
---> Using cache
---> a02618f3f829
Step 13/15 : COPY start_server.sh setup.py version /ie-serving-py/
COPY failed: stat /var/lib/docker/tmp/docker-builder232822774/version: no such file or directory

If remove the version file from line
COPY start_server.sh setup.py version /ie-serving-py/
then build is done successfully

tensorflow preventing container to run without AVX

When trying to run the container on a QNAP TS-251, having a Celeron J1800, the container would not start because of missing AVX instructions :

[tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.

I've been running inference with OpenVino docker image on this NAS without any problem, why should TensorFlow prevent running the whole OpenVino server while it's only there for the protobuf API format right ?

error Inputs expected to be in the set {['0']}

I start a server with person-attributes-recognition-crossroad model，and I send a image use gprc， but I got
status = StatusCode.INVALID_ARGUMENT
details = "input tensor alias not found in signature: ['inputs']. Inputs expected to be in the set{['0']}."

this is my client:
image = cv2.imread('1.png')
image = image.transpose(2,0,1)
image = np.expand_dims(image, 1)

request.inputs['inputs'].CopyFrom(tf.contrib.util.make_tensor_proto(image, shape=(image.shape)))

this is info after start a server：
ie_serving.models.ir_engine - INFO - Matched keys for model: {'inputs': {'0': '0'}, 'outputs': {'453': '453', '455': '455', '457': '457'}}

the server means he need a dict format image，so how should I send my image

Model Server Hangs with --target_device=MYRIAD

So this issue is similar to other posted involving NCS2 devices and the Myriad plugin in docker, but i think the specifics here may be unique. After many attempts i was finally able to build an image based on this guide, your Dockerfile_binary_openvino and the Open Vino Workbench dockerfile in which i could access my NCS2 device. I can confirm access to the MYRIAD device by using the demo_squeezenet_download_convert_run.sh script from the openvino installation with a MYRIAD target. however when i attempt to launch the model server, whether or not I have run the demo script within the container, it always hangs without completely starting up. If set the LOG_LEVEL=DEBUG the last line printed before it hangs is
ie_serving.models.ir_engine - DEBUG - [Model: model_1, version: 3] --- effective batch size - auto
I have tried with batch size set to 1, with the same result. Additionally if i set the target device to CPU the model server start up normally. At this point I am once again stuck, any help would be appreciated.

My Dockerfile

FROM ubuntu:18.04
USER root
WORKDIR /

RUN useradd -ms /bin/bash openvino && \
    chown openvino -R /home/openvino

ARG INSTALLDIR=/opt/intel/openvino
ARG TEMP_DIR=/tmp/openvino_installer
ARG DL_INSTALL_DIR="$INSTALLDIR/deployment_tools"

ARG DEPENDENCIES="autoconf \
                  automake \
                  build-essential \
                  cmake \
                  cpio \
                  curl \
                  gnupg2 \
                  libdrm2 \
                  libglib2.0-0 \
                  lsb-release \
                  libgtk-3-0 \
                  libtool \
#                  python3-pip \
                  udev \
                  unzip"

ARG OTHER_DEPENDENCIES="virtualenv  \
			sudo \
			python3.5-dev \
                       "

RUN apt-get update && \
    apt-get install -y --no-install-recommends software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    rm -rf /var/lib/apt/lists/*

RUN apt-get update && \
    apt-get install -y --no-install-recommends ${DEPENDENCIES} ${OTHER_DEPENDENCIES} && \
    rm -rf /var/lib/apt/lists/*

ADD l_openvino_toolkit*.tgz $TEMP_DIR/

RUN cd $TEMP_DIR/l_openvino_toolkit* && \
    sed -i 's/decline/accept/g' silent.cfg && \
    ./install.sh -s silent.cfg --ignore-signature && \
    rm -Rf $TEMP_DIR

#FOR VPU
RUN usermod -aG users openvino
RUN cp ${INSTALLDIR}/deployment_tools/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/ && \
    ldconfig

WORKDIR /opt
RUN curl -L https://github.com/libusb/libusb/archive/v1.0.22.zip --output v1.0.22.zip && \
    unzip v1.0.22.zip && cd libusb-1.0.22 && \
    ./bootstrap.sh && \
    ./configure --disable-udev --enable-shared && \
    make -j4

RUN apt-get update && \
    apt-get install -y --no-install-recommends libusb-1.0-0-dev && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /opt/libusb-1.0.22/libusb
RUN /bin/mkdir -p '/usr/local/lib' && \
    /bin/bash ../libtool   --mode=install /usr/bin/install -c   libusb-1.0.la '/usr/local/lib' && \
    /bin/mkdir -p '/usr/local/include/libusb-1.0' && \
    /usr/bin/install -c -m 644 libusb.h '/usr/local/include/libusb-1.0' && \
    /bin/mkdir -p '/usr/local/lib/pkgconfig' && \
    cd  /opt/libusb-1.0.22/ && \
    /usr/bin/install -c -m 644 libusb-1.0.pc '/usr/local/lib/pkgconfig' && \
    ldconfig

ENV HDDL_INSTALL_DIR="$DL_INSTALL_DIR/inference_engine/external/hddl"
ENV PYTHONPATH="$INSTALLDIR/python/python3.5"
ENV LD_LIBRARY_PATH="$DL_INSTALL_DIR/inference_engine/external/tbb/lib:$DL_INSTALL_DIR/inference_engine/external/mkltiny_lnx/lib:$DL_INSTALL_DIR/inference_engine/external/hddl/lib:$DL_INSTALL_DIR/inference_engine/lib/intel64"

WORKDIR /ie-serving-py

COPY start_server.sh setup.py version requirements.txt /ie-serving-py/
RUN virtualenv -p python3.5 .venv && . .venv/bin/activate && pip3 install -r requirements.txt

COPY ie_serving /ie-serving-py/ie_serving

RUN . .venv/bin/activate && pip3 install .

start_server.sh is bash-specific but could be POSIX compliant

There is nothing in start_server.sh that needs bash-specific functionality.
Hence it would be better off as POSIX compliant.

Can you tell me how to use many models in once post?

like face-detection-retail,face-reidentification-retail,landmarks-regression-retail,l want to use it in once post ,so l want to ust custom bankgrand.

Unnecessary curl save to file in Dockerfiles

In the Dockerfiles, we see that curl downloads to file, then tar unpacks the file and finally the file is removed. Readability and maintainability would be improved if curl would pipe its downloaded data directly to tar (perhaps some performance would be gained too, but I reckon the prior benefits mentioned are more important) thus eliminating the cleanup/rm step.

openvinotoolkit / model_server Goto Github PK

model_server's People

Contributors

Stargazers

Watchers

Forkers

model_server's Issues

plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir) if args.cpu_extension and 'CPU' in args.device: plugin.add_cpu_extension(args.cpu_extension)

Recommend Projects

Recommend Topics

Recommend Org

plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
if args.cpu_extension and 'CPU' in args.device:
plugin.add_cpu_extension(args.cpu_extension)