ServiceX, a component of the IRIS-HEP DOMA group's iDDS, is an experiment-agnostic service that enables on-demand data delivery. It follows the concepts originally developed for ATLAS, but is specifically tailored to nearly-interactive, high-performance, array-based, and pythonic analysis contexts. It provides uniform backend interfaces to data storage services and frontend (client-facing) service endpoints for multiple data formats and organizational structures.
ServiceX can retrieve and deliver data from data lakes as those systems and models evolve, and it depends on Rucio to find and access the data. The service performs on-the-fly data transformations to deliver data in a variety of formats, including streams of ROOT data, small ROOT files, HDF5, and Apache Arrow buffers. In addition, ServiceX includes pre-processing functionality for event data and preparation for multiple clustering frameworks (e.g. Spark). Eventually, it will be able to automatically unpack compressed formats, potentially using hardware-accelerated techniques, and can prefilter events so that only useful data is transmitted to the user.
The entire ServiceX stack can be installed using the Helm chart contained in this repo. The chart has been published to the ssl-hep Helm repo. The default values for the chart have been set up so you can run the service against a public xAOD file without any grid credentials.
There is also an included Jupyter notebook for driving the system and getting some simple plots back.
ServiceX requires a Kubernetes installation. If you have access to a cluster, you can use that; otherwise, you can enable a single-node Kubernetes cluster on your desktop using Docker Desktop.
The service is published as a Helm chart, so you will need a working Helm installation on your desktop. The chart has been tested with Helm 3.
ServiceX will use your CERN grid certificates to communicate with Rucio. In this repo we've included a set of demo values that will just use a public xAOD file so you can try out ServiceX with minimal fuss.
% helm repo add ssl-hep https://ssl-hep.github.io/ssl-helm-charts/
% helm repo update
% helm install -g -f demo-values.yaml ssl-hep/servicex
The notes printed at the end of the install give you some helpful tips on interacting with the pods in your deployment. It takes about one minute for RabbitMQ to complete its initialization, after which the other pods are able to register and start up. Note that it is normal for some pods to enter a CrashLoopBackOff state during this time.
You can check on the status of the deployment with the command
% kubectl get pod
Eventually your setup should resemble this:
NAME READY STATUS RESTARTS AGE
servicex-1579021789-code-gen-7cd998d5b6-nwtgv 1/1 Running 0 49m
servicex-1579021789-did-finder-7c5cbb4575-52wxf 1/1 Running 0 7m53s
servicex-1579021789-minio-78b55bfdf8-mbmmf 1/1 Running 0 49m
servicex-1579021789-preflight-b748b4dfd-qqt89 1/1 Running 4 49m
servicex-1579021789-rabbitmq-0 1/1 Running 0 49m
servicex-1579021789-servicex-app-98779c79c-cvmqx 1/1 Running 3 49m
servicex-1579021789-x509-secrets-74f4bcc8bb-5kqvb 1/1 Running 0 49m
The name of your release will be different.
The default values for the ServiceX helm chart deploy the following services:
Service | Purpose |
---|---|
servicex-app | This is the REST service that receives your requests and manages interactions with the other services. It also uses the Kubernetes API to launch transformer jobs |
rabbitmq | This popular message queueing system provides a transaction mechanism for distributing work to the asynchronous components of the service |
minio | An open-source implementation of Amazon's S3 object store. It is not required for ServiceX, but makes it easy to deliver results from smaller transformations as downloadable files. |
did-finder | This service accepts DIDs and consults Rucio to find replicas of the constituent ROOT files. It sends these files back to the REST server so the transformers can access them |
code-gen | This service generates transformation code from your qastle query expressions |
preflight | This service accepts the first ROOT file found by the DID Finder and does a quick validation of your request to make sure the columns are valid. Only after passing this test are the transformer jobs started |
x509-secrets | This service uses your provided grid cert and credentials to obtain an X509 proxy from VOMS. This proxy is stored in the cluster as a secret that the DID Finder and transformers can mount to connect to resources on your behalf |
Kubernetes offers some sophisticated ways to expose deployed services for external connections. For this demo, we will just use the simplest possible way to expose the services to your desktop: port forwarding.
We will need to expose port 5000 of the REST server so transform requests can be submitted. We will also expose port 9000 of the Minio server so we can browse and download generated files.
The notes at the end of the Helm deployment provide some generic commands to do this. You can also look at the list of pods and issue commands like the following to tell Kubernetes to forward their ports. These examples are based on pod names from an earlier deployment; your pod names will be different:
% kubectl port-forward exegetical-mouse-servicex-app-6fdd5bf7b6-dcwwd 5000:5000
% kubectl port-forward exegetical-mouse-minio-57cbd595b5-77426 9000:9000
Note that the port-forward command doesn't return when you run it; it blocks in the terminal as long as the port is being forwarded. You can stop it at any time with ^C. You will need to run each of these commands in a different terminal, since both ports must stay forwarded during the demo.
We have a notebook that submits a sample request to your ServiceX, waits for the transformation to complete, and downloads the results. The notebook is found in the examples folder of this repo.
First, create a Python virtualenv. Note that the notebooks (and package requirements) are currently configured to work with Python 3.
% virtualenv ~/.virtualenvs/servicex_demo
% source ~/.virtualenvs/servicex_demo/bin/activate
Now install the dependent Python packages:
% pip install -r examples/requirements.txt
Then you can start the Jupyter server:
% jupyter notebook
Once that launches, it should open a browser window for you. Visit the examples folder of this repo and open the ServiceXDemo notebook.
Now that you have tried out the demo, you will certainly want to take the training wheels off and run ServiceX against a real dataset.
The DID Finder will need to talk to CERN's Rucio service which requires grid certificates and a passcode to unlock them. If you are a member of the ATLAS experiment, you can follow these helpful instructions on obtaining proper grid certificates.
You install the certs into the cluster as a global Kubernetes secret. This is easily done with the ServiceX Command Line Interface (CLI).
- Install the CLI with `pip install servicex-cli`
- When this completes, you can copy the certs into the cluster with `servicex init`
By default, the script will pick up your grid certs from the `.globus` directory in your home directory. You can override this with the `--cert-dir` command line option. Also, the CLI will create the secret in the `default` namespace. If you are using a different namespace, you can override this with the `--namespace` option.
ServiceX can deliver transformed datasets to an object store service (Minio) that is optionally installed with this Helm chart. An alternative delivery mechanism is streamed Apache Arrow tables using Kafka. Full instructions and sample config files are in the kafka directory.
The following table lists the configurable parameters of the ServiceX chart and their default values. Note that you may wish to change some of the default parameters for the dependent RabbitMQ or Minio charts.
Parameter | Description | Default |
---|---|---|
`app.image` | ServiceX App image name | `sslhep/servicex_app` |
`app.tag` | ServiceX image tag | `latest` |
`app.pullPolicy` | ServiceX image pull policy | `IfNotPresent` |
`app.rabbitmq.retries` | Number of times to retry connecting to RabbitMQ on startup | `12` |
`app.rabbitmq.retry_interval` | Number of seconds to wait between RabbitMQ retries on startup | `10` |
`app.replicas` | Number of App pods to start. Experimental! | `1` |
`app.resources` | Kubernetes pod resource spec passed to the deployment to change CPU and memory | `{ }` |
`didFinder.image` | DID Finder image name | `sslhep/servicex-did-finder` |
`didFinder.tag` | DID Finder image tag | `latest` |
`didFinder.pullPolicy` | DID Finder image pull policy | `IfNotPresent` |
`didFinder.site` | Tier 2 site that the DID Finder should prefer. If blank, a random replica is returned from Rucio | |
`didFinder.rucio_host` | URL for the Rucio service | `https://voatlasrucio-server-prod.cern.ch:443` |
`didFinder.auth_host` | URL to obtain Rucio authentication | `https://voatlasrucio-auth-prod.cern.ch:443` |
`didFinder.threads` | Number of threads for pulling replicas out of Rucio | `5` |
`preflight.image` | Preflight image name | `sslhep/servicex-transformer` |
`preflight.tag` | Preflight image tag | `latest` |
`preflight.pullPolicy` | Preflight image pull policy | `IfNotPresent` |
`codeGen.enabled` | Enable deployment of the code generator service? | `true` |
`codeGen.image` | Code Gen image name | `sslhep/servicex_code_gen_funcadl_xaod` |
`codeGen.tag` | Code Gen image tag | `latest` |
`codeGen.pullPolicy` | Code Gen image pull policy | `IfNotPresent` |
`x509Secrets.image` | X509 Secret Service image name | `sslhep/x509-secrets` |
`x509Secrets.tag` | X509 Secret Service image tag | `latest` |
`x509Secrets.pullPolicy` | X509 Secret Service image pull policy | `IfNotPresent` |
`x509Secrets.vomsOrg` | Which VOMS org to contact for a proxy? | `atlas` |
`rbacEnabled` | Specify whether RBAC is enabled in your cluster | `true` |
`hostMount` | Optional path to mount in transformers as `/data` | - |
`gridAccount` | CERN user account name to access Rucio | - |
`rabbitmq.password` | Override the generated RabbitMQ password | `leftfoot1` |
`objectstore.enabled` | Deploy a Minio object store with ServiceX? | `true` |
`postgres.enabled` | Deploy a Postgres database into the cluster? If not, a SQLite db is used | `false` |
`minio.accessKey` | Access key to log into Minio | `miniouser` |
`minio.secretKey` | Secret key to log into Minio | `leftfoot1` |
`transformer.pullPolicy` | Pull policy for transformer pods (image name is specified in the REST request) | `IfNotPresent` |
`elasticsearchLogging.enabled` | Set to `true` to enable writing of reports to an external Elasticsearch system | `false` |
`elasticsearchLogging.host` | Hostname for the external Elasticsearch server | |
`elasticsearchLogging.port` | Port for the external Elasticsearch server | `9200` |
`elasticsearchLogging.user` | Username to connect to the external Elasticsearch server | |
`elasticsearchLogging.password` | Password to connect to the external Elasticsearch server | |
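As an illustration, a minimal values override file might look like the following. All values here are placeholders (the account name and site are invented for the example); the parameter names come from the table above:

```yaml
# my-values.yaml -- illustrative overrides only; adjust for your own setup
gridAccount: myCernUsername   # placeholder CERN account name
didFinder:
  site: MWT2                  # example Tier 2 site preference
objectstore:
  enabled: true
postgres:
  enabled: true               # use Postgres instead of the default SQLite
```

Pass the file to Helm at install time with `-f my-values.yaml`, just as the demo does with `demo-values.yaml`.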
You can access the REST service on your desktop with
% kubectl port-forward <app pod> 5000:5000
You can access a minio browser with:
% kubectl port-forward <minio pod> 9000:9000
Log in to this as `miniouser`; the password is `leftfoot1`.
Once you have exposed port 5000 of the REST app, you can use the included Postman collection to submit a transformation request and check on the status of a running job. You can import the collection from the ServiceX REST App repo.
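If you prefer the command line over Postman, a request can be sketched with curl. The endpoint path and JSON field names below are assumptions based on this chart's demo setup and may differ between ServiceX versions; check the ServiceX REST App repo for the current schema:

```shell
# Sketch only: the endpoint path and field names are assumptions and may
# differ between ServiceX versions -- consult the ServiceX REST App repo.
cat > /tmp/transform_request.json <<'EOF'
{
  "did": "<your dataset DID>",
  "columns": "Electrons.pt(), Electrons.eta()",
  "image": "sslhep/servicex-transformer:latest",
  "result-destination": "object-store",
  "result-format": "arrow",
  "workers": 1
}
EOF

# Validate the JSON before submitting
python3 -m json.tool /tmp/transform_request.json > /dev/null && echo "JSON OK"

# Submit to the port-forwarded REST app (run only once port 5000 is forwarded):
# curl -X POST http://localhost:5000/servicex/transformation \
#      -H "Content-Type: application/json" \
#      -d @/tmp/transform_request.json
```

The request is kept in a file so you can tweak the dataset DID and columns between submissions without retyping the whole payload.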
We are still experimenting with various configurations for deploying a scaled-up ServiceX.
It's certainly helpful to beef up the RabbitMQ deployment:
rabbitmq:
resources:
requests:
memory: 256Mi
cpu: 100m
replicas: 3
Tip: list all releases using `helm list`.
To uninstall/delete the `my-release` deployment:
% helm delete my-release
The command removes all the Kubernetes components associated with the chart and deletes the release.
ServiceX is a community effort. The architecture lends itself to orchestrating different types of workflows and generating different outputs. Fork us on github and make a contribution!
Please submit issues for bugs and feature requests to this repo.
We manage project priorities with a ZenHub board.
We coordinate effort on the IRIS-HEP slack. Come join this intellectual hub.
Microservice architectures can be difficult to test and debug. Here are some helpful hints to make this easier.
- Instead of relying on the DID Finder to locate some particular datafile, you can mount one of your local directories into the transformer pod and then instruct the DID Finder to always offer up the path to that file regardless of the submitted DID. Use the `hostMount` value to have a local directory mounted into each transformer pod under `/data`, and the `didFinder.staticFile` value to instruct the DID Finder to offer up a file from that directory.
- You can use port-forwarding to expose port 15672 from the RabbitMQ pod to your laptop and log into the RabbitMQ admin console with the username `user` and password `leftfoot1`. From there you can monitor the queues, purge old messages, and inject your own messages.
This project is supported by the National Science Foundation under Cooperative Agreement OAC-1836650. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.