Here is an example HTTP request using only curl, sending a POST request with a JSON body.
```bash
curl -X POST "http://eubic2023.external.msaid.io:8501/v2/models/Prosit_2019_intensity_ensemble/infer" \
  --data-raw '
{
  "id": "LGGNEQVTR_GAGSSEPVTGLDAK",
  "inputs": [
    {"name": "peptides_in_str:0", "shape": [2,1], "datatype": "BYTES", "data": ["LGGNEQVTR","GAGSSEPVTGLDAK"]},
    {"name": "collision_energy_in:0", "shape": [2,1], "datatype": "FP32", "data": [25,25]},
    {"name": "precursor_charge_in_int:0", "shape": [2,1], "datatype": "INT32", "data": [1,2]}
  ]
}
'
```
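The same request can be sent from Python with only the standard library. The helper names `build_payload` and `infer` below are illustrative, not part of dlomix-serving; the payload structure follows the KServe v2 inference protocol shown in the curl example above.

```python
import json
import urllib.request

def build_payload(peptides, collision_energies, charges):
    """Build a KServe v2 inference payload for the Prosit intensity model.

    All inputs use shape [batch, 1], so each list must have one entry
    per peptide.
    """
    assert len(peptides) == len(collision_energies) == len(charges)
    n = len(peptides)
    return {
        "id": "_".join(peptides),
        "inputs": [
            {"name": "peptides_in_str:0", "shape": [n, 1],
             "datatype": "BYTES", "data": peptides},
            {"name": "collision_energy_in:0", "shape": [n, 1],
             "datatype": "FP32", "data": collision_energies},
            {"name": "precursor_charge_in_int:0", "shape": [n, 1],
             "datatype": "INT32", "data": charges},
        ],
    }

def infer(url, payload):
    """POST the payload to the model's /infer endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_payload(["LGGNEQVTR", "GAGSSEPVTGLDAK"], [25, 25], [1, 2])
# result = infer("http://localhost:8501/v2/models/Prosit_2019_intensity_ensemble/infer", payload)
```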
See the examples in the corresponding documentation folder.
TODO
dlomix-serving depends on Docker and the NVIDIA Container Toolkit.
You can find an Ansible script that installs all dependencies here.
After installing the dependencies, you can pull the Docker image and run it with:
```bash
docker run \
  --gpus all \
  --shm-size 2G \
  -p 8500:8500 \
  -p 8501:8501 \
  -d \
  ghcr.io/wilhelm-lab/dlomix-serving:latest
```
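The container can take a moment to load its models. One way to wait for it is to poll Triton's HTTP readiness endpoint (`/v2/health/ready`); the function below is a standalone sketch, not part of dlomix-serving.

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url, timeout=60.0, interval=2.0):
    """Poll /v2/health/ready until the server answers 200 or the timeout expires.

    Returns True once the server reports ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url + "/v2/health/ready", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    return False

# wait_until_ready("http://localhost:8501")
```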
- Install dependencies (Ansible script)
- (Suggested) Install docker-compose
- Clone the repo
- Download existing models with `./getModels.sh`
- Start the server with `docker-compose up`
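The repository ships its own docker-compose.yml; for orientation, an equivalent file mirroring the `docker run` flags above could look roughly like this (image tag and GPU settings are assumptions to adjust for your setup):

```yaml
services:
  dlomix-serving:
    image: ghcr.io/wilhelm-lab/dlomix-serving:latest
    shm_size: 2gb
    ports:
      - "8500:8500"
      - "8501:8501"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```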
This step depends on the framework you used to train your model. For detailed instructions on the format in which your model needs to be provided, check out this documentation.
You can find examples for TensorFlow, PyTorch and XGBoost in our model repository. The model files themselves need to be downloaded from Zenodo.
Triton supports models written in pure Python. If your model requires pre- and/or post-processing, you can implement it as a "standalone" Python model.
There are numerous examples in this repository; a low-complexity one can be found here.
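To give a feel for what such a step does, here is the kind of pre-processing logic a Python model might wrap, shown as a plain function without the Triton backend boilerplate. The function names, the residue check, and the collision-energy scaling are illustrative assumptions, not dlomix-serving's actual code.

```python
# Hypothetical pre-processing step: clean peptide strings and scale
# collision energies before they are handed to the prediction model.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def preprocess(peptides, collision_energies):
    cleaned = []
    for pep in peptides:
        pep = pep.strip().upper()
        if not set(pep) <= VALID_RESIDUES:
            raise ValueError(f"unexpected residue in {pep!r}")
        cleaned.append(pep)
    # Scale collision energy into [0, 1], e.g. 25 -> 0.25 (assumed convention).
    normalized_ce = [ce / 100.0 for ce in collision_energies]
    return cleaned, normalized_ce
```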
Ensemble models contain no code of their own; they only route tensors between other models. This makes them ideal for combining your various pre- and post-processing steps with your main model into a single model/workflow.
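An ensemble is declared entirely in its config.pbtxt via an `ensemble_scheduling` block. The sketch below chains a hypothetical pre-processing model into a prediction model; all model names, tensor names, and dimensions are made up for illustration.

```
name: "intensity_ensemble"
platform: "ensemble"
input [
  { name: "peptides_in_str:0", data_type: TYPE_STRING, dims: [ -1, 1 ] }
]
output [
  { name: "intensities", data_type: TYPE_FP32, dims: [ -1, 174 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"
      model_version: -1
      input_map  { key: "raw_peptides", value: "peptides_in_str:0" }
      output_map { key: "encoded_peptides", value: "encoded" }
    },
    {
      model_name: "predict"
      model_version: -1
      input_map  { key: "peptides_encoded", value: "encoded" }
      output_map { key: "intensities_out", value: "intensities" }
    }
  ]
}
```

Each `input_map`/`output_map` entry connects a tensor name inside a step's model (`key`) to a tensor name in the ensemble's dataflow (`value`).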