Comments (2)
@harshbafna this should really be working. Can we please look into this?
from serve.
The issue is with the curl command which sends the images in two separate requests.
curl -v -X POST http://127.0.0.1:8080/predictions/resnet-152-batch -T "{kitten.jpg,dog.jpg}"
* Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> POST /predictions/resnet-152-batch HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.65.3
> Accept: */*
> Content-Length: 110969
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
< x-request-id: 4fb625c8-cfcc-45e4-a389-73c39007fb74
< Pragma: no-cache
< Cache-Control: no-cache; no-store, must-revalidate, private
< Expires: Thu, 01 Jan 1970 00:00:00 UTC
< content-length: 32
< connection: keep-alive
<
[
"n02123159",
"tiger_cat"
* Connection #0 to host 127.0.0.1 left intact
]* Found bundle for host 127.0.0.1: 0x7f8db9f00320 [serially]
* Re-using existing connection! (#0) with host 127.0.0.1
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> POST /predictions/resnet-152-batch HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.65.3
> Accept: */*
> Content-Length: 78949
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
< x-request-id: 0f46f9e1-eebe-4fe6-8689-64b73f5b8ed6
< Pragma: no-cache
< Cache-Control: no-cache; no-store, must-revalidate, private
< Expires: Thu, 01 Jan 1970 00:00:00 UTC
< content-length: 41
< connection: keep-alive
<
[
"n02099712",
"Labrador_retriever"
* Connection #0 to host 127.0.0.1 left intact
]
Also note that the default handlers do not support batch inferencing. We have already added an example for batch inferencing using resnet152 model in fix for #66
from serve.
Related Issues (20)
- TorchServe linux aarch64 plan
- Serve multiple models with both CPU and GPU HOT 3
- How to modify torchserve’s Python runtime from 3.8.0 to 3.10 HOT 1
- TorchServe crashes in production with `WorkerThread - IllegalStateException error' HOT 4
- Unable to use build_image.sh to build the cu102 version of the image HOT 2
- Metrics collector crashes when NVIDIA MIGs are present HOT 1
- Server crashes in production with `WorkerThread - IllegalStateException error' HOT 1
- Whether the pre- and post-processing operations of batch processing are parallel HOT 1
- Update cpp/llamacpp to Llama 3 HOT 1
- Update LLM/llama2 to Llama3
- Update large_models/inferentia2/llama2 to Llama3
- Update large_models/tp_llama to llama3
- Update large_models/gpt_fast to llama3
- How to pass parameters from preprocessing to postprocessing when using micro-batch operations HOT 4
- Load model failed - error: Worker died HOT 5
- Docker regression failure: test_handler_traceback_logging.py
- Exchange Llama2 against Llama3 in HF_accelerate example
- CUDA out of Memory with low Memory Utilization (CUDA error: device-side assert triggered) HOT 4
- If micro_batch_size of micro-batch is set to 1, then model inference is still batch processing? HOT 1
- question to model inference optimization HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from serve.