Comments (12)
@fbbradheintz: `batch_size` and `max_batch_delay` configuration is not supported through `config.properties`. For details on the supported `config.properties` parameters, please refer to the configuration documentation.

These parameters can be configured through the management API while registering the model, as documented here. The default values for these params are:

- batchSize - 1
- maxBatchDelay - 100 (milliseconds)

These parameters are only used in the case of batch inferencing. For a usage example, refer to: Batch inferencing example with Resnet 152 model
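A minimal sketch of that registration call, assuming TorchServe is running locally with the management API on its default port 8081 and a `resnet-152.mar` in the model store:

```python
import requests

# Register a model and set its batch parameters at registration time.
# Assumes the management API on the default port 8081 and resnet-152.mar
# present in the model store.
resp = requests.post(
    "http://localhost:8081/models",
    params={
        "url": "resnet-152.mar",
        "initial_workers": 1,
        "batch_size": 8,        # batchSize (default: 1)
        "max_batch_delay": 50,  # maxBatchDelay in ms (default: 100)
    },
)
print(resp.status_code, resp.text)
```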
If we can't put the batch config info per model into a configuration file, that's going to be a problem for the server saving its state. I've flagged this as v1.0, but there should be a discussion about this soon.
@fbbradheintz: While saving the snapshot for registered models, we do save the `batchSize` and `maxBatchDelay` params for every model version, and the models are restored with their corresponding `batchSize` and `maxBatchDelay` when we restart TorchServe with a snapshot config.

However, these parameters are not available as generic config parameters, and the default values are used for:

- models passed in the `models` param (the list of models to be loaded while starting TorchServe) in the config file
- models registered through the management API without these params in the argument

Note that if the config file supplied with the `--ts-config` param while starting TorchServe is a snapshot config file, then the `models` param is ignored and the models in the `model_snapshot` parameter are registered.
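To make the snapshot behavior concrete, here is an illustrative (not verbatim) sketch of the per-model entry persisted under `model_snapshot`, parsed with Python; the `batchSize`/`maxBatchDelay` fields follow the comment above, while the surrounding structure and values are assumptions:

```python
import json

# Illustrative model_snapshot entry from a snapshot config file:
# batchSize and maxBatchDelay are persisted per model version and
# restored on restart.
snapshot = json.loads("""
{
  "name": "startup.cfg",
  "modelCount": 1,
  "models": {
    "resnet-152": {
      "1.0": {
        "defaultVersion": true,
        "marName": "resnet-152.mar",
        "minWorkers": 1,
        "maxWorkers": 1,
        "batchSize": 8,
        "maxBatchDelay": 50
      }
    }
  }
}
""")
print(snapshot["models"]["resnet-152"]["1.0"]["batchSize"])  # -> 8
```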
I understand that the option is available in the command line and Management API; the concern is that it is not available in config files, and generally we don't have parity among the multiple configuration methods.
This doesn't block launch, but I'm leaving open for discussion.
Batch inferencing is not supported by TorchServe default handlers and is only supported through custom handlers. If we make this configurable through `config.properties` at the global level, it will apply to models using default handlers as well, and they will break. Thus, it makes sense to keep this configurable at the model level only, at the time of model registration.

However, TorchServe should also provide a way to update these parameters through the API after registration.

@mycpuorg, @fbbradheintz, @dhaniram-kshirsagar thoughts?
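For context on why this is handler-level, a minimal sketch of a batch-aware custom handler entry point, assuming the module-level `handle(data, context)` style where `data` arrives as a list of up to `batch_size` requests:

```python
# Minimal batch-aware custom handler sketch (module-level entry point).
# In batch mode TorchServe passes a list of up to batch_size requests and
# expects one response per request, in the same order.
def handle(data, context):
    if data is None:  # worker warm-up call, nothing to process
        return None
    # Each element of `data` is one request in the batch.
    inputs = [row.get("data") or row.get("body") for row in data]
    # ... run the model once over the whole batch here ...
    return [f"received {len(x) if x is not None else 0} bytes" for x in inputs]
```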
Agreed with @harshbafna that batch params should be set at the model level. The point here is consistency and parity among the multiple configuration channels.

Are we limited here by the `.properties` format?

Also, can we agree on a definition for this feature before writing more code for it?
It may be a good idea to discuss the approach taken and its pros/cons. Some of my thoughts are as follows:

- Ideally we shouldn't add model-level configuration to `config.properties`. We could look into adding these options to `MANIFEST.json` in MAR files, or look into having a separate configuration file (analogous to `log4j.properties`) which contains all the model-specific configuration. The concerns with both options are:
  - Adding all the model-level configuration to a single file could make that file hard to modify manually. We may have to look into tooling to generate this configuration.
  - If we go the MANIFEST route, we will be mandating that customers who want to use this feature MUST provide a MAR file. And every time customers want to change this configuration, they would have to regenerate their MAR files. This might not be feasible, as model creation and deployment might not happen in the same pipeline.
- I don't think we need a new API for this. We currently support `PUT /models`, which already modifies the number of workers (see the sketch after this list). Why can't we add options to this API?
- We could take the environment-variables route. But defining a good namespace and not polluting the environment variables might be a discussion of its own.
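A minimal sketch of the existing worker-scaling call referenced above, assuming a local management API on port 8081 and an already-registered `resnet-152` model; the proposal is to accept batch params on this same endpoint:

```python
import requests

# Existing management call that already modifies a registered model's
# workers; the suggestion above is to extend it with batch parameters.
resp = requests.put(
    "http://localhost:8081/models/resnet-152",
    params={"min_worker": 2, "max_worker": 4},
)
print(resp.status_code, resp.text)
```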
> I don't think we need a new API for this. We currently support `PUT /models`, which already modifies the number of workers. Why can't we add options to this API?
I like #2 above, i.e. modify the PUT API to support batch params.
What about checking whether there is a `MODELNAME.properties` file in the model store during start? If so, load `batch_size` and `max_batch_delay` from it per model.
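A rough sketch of that proposal; `load_model_batch_config` and the per-model file layout are hypothetical, not an existing TorchServe API:

```python
from pathlib import Path

# Hypothetical startup hook for the proposal above: look for a
# <model_name>.properties file next to each MAR in the model store and
# read batch_size / max_batch_delay from it.
def load_model_batch_config(model_store: str) -> dict:
    config = {}
    for props in Path(model_store).glob("*.properties"):
        values = {}
        for line in props.read_text().splitlines():
            if "=" in line and not line.lstrip().startswith("#"):
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
        config[props.stem] = {
            "batch_size": int(values.get("batch_size", 1)),
            "max_batch_delay": int(values.get("max_batch_delay", 100)),
        }
    return config
```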
Model-level config was added in v0.4.1.
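For reference, an illustrative sketch of the shape this takes: `config.properties` accepts a JSON value for its `models` key, which can carry per-version batch settings (the model name and values here are assumptions):

```python
import json

# Illustrative value for the `models` key in config.properties (written
# there as a JSON string); per-version batch settings can be declared
# alongside the MAR name.
models_value = json.loads("""
{
  "resnet-152": {
    "1.0": {
      "defaultVersion": true,
      "marName": "resnet-152.mar",
      "batchSize": 8,
      "maxBatchDelay": 50
    }
  }
}
""")
assert models_value["resnet-152"]["1.0"]["batchSize"] == 8
```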
@harshbafna If batchSize and max_batch_delay can only be configured through the management API, what is the recommendation from the TorchServe team for configuring these values on container start/restart when using multiple replicas in Kubernetes?
I found an example for that in the TorchServe GitHub repo: https://github.com/pytorch/serve/blob/master/kubernetes/EKS/config.properties

I hope the above link is helpful.