Comments (7)
@IonBoleac Yes, these are Python processes. If you have a multi-GPU instance, you can assign each GPU to a worker. On an Intel CPU-based instance, you can use core pinning to distribute the physical cores evenly among the workers.
from serve.
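As a rough sketch, both of these are typically controlled through TorchServe's `config.properties`. The key names below are taken from the TorchServe docs as I remember them, and the values are placeholders, so treat this as an illustration rather than a verified config:

```properties
# Sketch of a config.properties for worker placement (values are placeholders)
number_of_gpu=4              # expose up to 4 GPUs; workers get assigned round-robin
cpu_launcher_enable=true     # enable the CPU launcher for core pinning on Intel CPUs
cpu_launcher_args=--use_logical_core
```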
@agunapal thanks for your answer. In that case, do the workers of a model work in parallel on one request? If the server receives an inference request, can that request be divided among all the workers assigned to the model, so that it is processed in parallel by all of them? Or does a single worker handle a single request, even if the request is very large?
I have this doubt because the workers are processes, so in theory they don't share resources with each other and each worker process runs on one core. But they are managed by the frontend, which consists of Java threads that can run in parallel across multiple cores. So in theory the workers can work in parallel. Still, do workers ever cooperate with each other to serve a single request, or can they? Can you help me clarify this?
@agunapal thanks very much for any help you can give
@IonBoleac No, each worker handles one request from the frontend at a time. That request may or may not be dynamically batched, depending on the config.
If multiple clients send requests at the same time, having multiple workers lets those requests be processed in parallel instead of sequentially.
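The one-request-per-worker model can be sketched in plain Python. This is only an illustration using `multiprocessing`, not TorchServe's actual frontend code:

```python
from multiprocessing import Pool

def handle_request(request):
    # Each worker process handles exactly one request at a time;
    # a single request is never split across workers.
    return f"handled {request}"

if __name__ == "__main__":
    requests = [f"req-{i}" for i in range(4)]
    # Four worker processes let four concurrent requests proceed in
    # parallel, mirroring multiple TorchServe workers serving clients.
    with Pool(processes=4) as pool:
        results = pool.map(handle_request, requests)
    print(results)
```

Each request goes to exactly one worker; parallelism comes only from having several requests in flight at once.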
@agunapal thanks very much. I have another question: what is the point of minWorkers and maxWorkers if there is no autoscaling of the workers? The only way to increase or decrease the number of workers is through the management API.
@IonBoleac It is expensive to auto-scale worker processes. Currently TorchServe only supports increasing or decreasing the number of workers via the management API.
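For reference, scaling workers through the management API (default port 8081) is a `PUT` on the model endpoint. The sketch below only builds the request with the standard library; the model name `mnist` and the worker counts are placeholders, and actually sending it requires a running TorchServe instance:

```python
import urllib.request

# Build a PUT request against TorchServe's management API to set the
# worker range for a registered model (name and counts are placeholders).
url = "http://localhost:8081/models/mnist?min_worker=2&max_worker=4"
req = urllib.request.Request(url, method="PUT")

# Sending it needs a live TorchServe instance, e.g.:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
print(req.get_method(), req.full_url)
```

The equivalent curl would be a `PUT` to the same URL.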
@lxning thanks for your help...
from serve.
Related Issues (20)
- https://github.com/pytorch/serve/issues/2870 - New Release Required for this Fix HOT 1
- Warning in regression test: test_gRPC_inference_api.py
- Warning in regression test: test_install_dependencies_to_target_directory_with_requirements HOT 1
- enable test_install_dependencies_to_venv_with_requirements in docker regression HOT 2
- Cannot run the text_classification example HOT 4
- CPP build failed with errors
- Open Inference Protocol with nightly build not working HOT 14
- CPP backend debugging and troubleshooting
- KServe wrapper default configuration is faulty HOT 5
- Update documentation on deprecating mac x86 support
- It seems like `metrics.yaml` doesn't apply HOT 1
- Config to disable gpu system metrics collection HOT 1
- torch.compile benchmark nightlies failing because of dependency of simpy
- '503 Service Unavailable' for url 'http://0.0.0.0:8085/v1/models/mnist:predict' HOT 2
- Update token authentication doc with maven link for downloading prebuilt plugin HOT 1
- KServe nightly tests are failing
- Incomplete example about emitting metrics HOT 3
- Broken example for a custom Counter metrics HOT 3
- Metrics REST API doesn't return custom metric HOT 3
- Model results are inconsistent between preheating and after preheating HOT 1