
Comments (7)

agunapal avatar agunapal commented on May 31, 2024

@IonBoleac Yes, these are Python processes. If you have a multi-GPU instance, you can assign each GPU to a worker. On an Intel CPU-based instance, you can use core pinning to distribute the physical cores evenly across all the workers.
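As a rough sketch of the setup described above (key names follow TorchServe's documented `config.properties` options, but treat the exact values as placeholders):

```properties
# config.properties -- illustrative fragment, not taken from this thread.

# Let TorchServe use 4 GPUs; workers are assigned to GPUs round-robin.
number_of_gpu=4
default_workers_per_model=4

# On an Intel CPU instance, enable the CPU launcher so physical cores
# are pinned and distributed evenly across workers.
cpu_launcher_enable=true
```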

from serve.

IonBoleac avatar IonBoleac commented on May 31, 2024

@agunapal thanks for your answer. In that case, do the workers of a model work in parallel? If the server receives an inference request, can that request be split across all the workers assigned to the model, so that it is processed in parallel by all of them? Or does one worker handle exactly one request, even if the request is very large? Is that right?
I'm unsure because the workers are processes, so in theory they don't share resources with each other and each worker process runs on one core. But they are managed by the frontend, which consists of Java threads, and those can run in parallel on multiple cores. So in theory the workers can run in parallel. But do workers ever cooperate with other workers to serve a single request, or can they? Can you help me clarify this?

IonBoleac avatar IonBoleac commented on May 31, 2024

@agunapal thanks very much in advance for any help

agunapal avatar agunapal commented on May 31, 2024

@IonBoleac No, each worker handles a single request from the frontend at a time. Depending on the config, that request may or may not be dynamically batched.
If multiple clients are sending requests at the same time, having multiple workers lets these be processed in parallel instead of sequentially.
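A hedged sketch of the dynamic batching mentioned above: when registering a model through the management API, you can pass `batch_size` and `max_batch_delay` so the frontend aggregates concurrent requests into one batch before handing it to a single worker (parameter names follow TorchServe's documented management API; the model URL, name, and port are placeholders, and a TorchServe instance must be running):

```shell
# Register a model with dynamic batching: the frontend collects up to
# 8 concurrent requests (waiting at most 50 ms) and passes the batch
# to ONE worker as a single call.
curl -X POST "http://localhost:8081/models?url=my_model.mar&batch_size=8&max_batch_delay=50&initial_workers=2"
```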

IonBoleac avatar IonBoleac commented on May 31, 2024

@agunapal thanks very much. I have another question: what is the point of minWorker and maxWorker if there is no autoscaling of the workers? The only way to increase or decrease the number of workers is through the management API.

lxning avatar lxning commented on May 31, 2024

@IonBoleac It is expensive for worker processes to auto scale. Currently TorchServe only supports increasing or decreasing the number of workers via the management API.
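For reference, scaling workers through the management API looks roughly like this (`min_worker` and `max_worker` are the documented query parameters; "my_model" and port 8081 are placeholders, and this requires a running TorchServe instance):

```shell
# Scale the number of workers for an already-registered model at runtime.
curl -X PUT "http://localhost:8081/models/my_model?min_worker=2&max_worker=4&synchronous=true"

# Inspect the model to confirm the new worker count.
curl "http://localhost:8081/models/my_model"
```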

IonBoleac avatar IonBoleac commented on May 31, 2024

@lxning thanks for your help...
