Stream Engine

About

Stream Engine is the back-end query engine for the OOI CyberInfrastructure. It exposes a JSON-based HTTP interface for querying full-resolution or subsampled data from the data store and for running data product algorithms on that data. Synchronous requests return results as a JSON data structure; asynchronous requests return them as NetCDF4, CSV or TSV.

Prerequisites

Clone the repository
Clone all submodules (git submodule update --init)
Create a conda virtual environment with the necessary packages:

conda env create -f conda_env.yml

Configuration

The default Stream Engine configuration can be found in config/default.py. Overrides can be entered into config/local.py. Gunicorn-specific configuration parameters are set in gunicorn_config.py.
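The default-plus-override arrangement described above can be pictured with a small sketch. It illustrates the general pattern only, not how Stream Engine actually loads config/default.py and config/local.py. The setting names and the apply_overrides helper are hypothetical; check config/default.py for the real names.

```python
# Hypothetical sketch of layering local overrides on top of defaults.
# The setting names are made up for illustration; they are not
# Stream Engine's real configuration keys.

DEFAULTS = {
    "EXAMPLE_TIMEOUT_SECONDS": 30,
    "EXAMPLE_OUTPUT_DIR": "/tmp/example",
}

# Stand-in for what a site might place in config/local.py.
LOCAL_OVERRIDES = {
    "EXAMPLE_TIMEOUT_SECONDS": 120,
}


def apply_overrides(defaults, overrides):
    """Return a new dict in which values from `overrides` replace defaults."""
    merged = dict(defaults)  # copy so the defaults themselves stay untouched
    merged.update(overrides)
    return merged


settings = apply_overrides(DEFAULTS, LOCAL_OVERRIDES)
```

Keeping site-specific values in a separate override layer means the defaults can be replaced by a new release without losing local changes.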

Running

The manage-streamng script starts, stops, restarts, reloads and checks the status of Stream Engine. The restart option combines the stop and start options. The reload option sends a HUP signal to gunicorn, which terminates all idle workers and restarts them; any busy worker continues until its current request is complete. The status option returns the process ID (PID) of the gunicorn parent process.

./manage-streamng start
./manage-streamng stop
./manage-streamng restart
./manage-streamng reload
./manage-streamng status

Note that the stop behavior is similar to the reload behavior: any active workers continue until their current task is complete. New requests are rejected, but the master gunicorn process keeps running until all workers have shut down. Stopping Stream Engine should generally be avoided unless necessary.
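For readers who want to see what the reload option amounts to at the process level, here is a minimal sketch that sends SIGHUP to a gunicorn master. It is not the manage-streamng implementation. The PID file and the read_pid and reload_gunicorn helpers are assumptions made for illustration; where gunicorn records its PID, if at all, depends on gunicorn_config.py.

```python
import os
import signal


def read_pid(pid_file):
    """Read an integer PID from a text file that contains only that number."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def reload_gunicorn(pid_file, kill=os.kill):
    """Send SIGHUP to the gunicorn master whose PID is stored in pid_file.

    `kill` is injectable so the function can be exercised without
    signalling a real process. Returns the PID that was signalled.
    """
    pid = read_pid(pid_file)
    kill(pid, signal.SIGHUP)  # gunicorn treats HUP as a graceful reload
    return pid
```

A call would look like reload_gunicorn("/path/to/gunicorn.pid"), where the path is a placeholder. For day-to-day use, ./manage-streamng reload remains the documented entry point.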

Running on Test Server

Our current test server is uframe-3-test under user asadev. Activate the Stream Engine conda environment (engine) and start the service. Run Stream Engine from the logs directory, because the logs are written to the current working directory:

source activate engine
cd ~/miniconda/envs/engine/stream_engine/logs
../manage-streamng start

Logs

The following logs are generated in the logs folder:

stream_engine.error.log - General data retrieval, product creation logs
stream_engine.access.log - Gunicorn access logs

Updating a Stream Engine installation to a newer release

Updating to a new release of Stream Engine is simple: grab the update, update your conda environment and the preload database submodule, then reload Stream Engine.

git pull # or git fetch / git checkout <tag>
git submodule update
conda env update -f conda_env.yml
./manage-streamng reload

Creating a new Stream Engine release

Update preload database submodule (if needed)
Update conda_env.yml with any desired library updates
Update config/default.py with the new version
Update RELEASE_NOTES with the new version
Commit the above changes
Tag the commit with the new version

git tag -a vX.X.X

You can then push the commit and the tag to the upstream repo(s):

git push gerrit master
git push gerrit master --tags
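Because the checklist above requires the tag and the version in config/default.py to agree, a small pre-tag check can catch mismatches. The sketch below assumes the configured version is stored as three dot-separated integers without the leading "v"; that storage format is an assumption, and tag_matches_version is a hypothetical helper, not part of the repository.

```python
import re

# Tags follow the vX.X.X shape used in the tagging step above.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def tag_matches_version(tag, version):
    """Return True if `tag` is well formed and equals 'v' + `version`.

    `version` is assumed to be a string such as '1.2.3' (no leading 'v').
    """
    if TAG_PATTERN.match(tag) is None:
        return False
    return tag[1:] == version
```

For example, tag_matches_version("v1.2.3", "1.2.3") is true, while a tag without the "v" prefix or with a different number is rejected.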

Updating the preload-database submodule (usually only to satisfy unit tests)

Within the stream_engine root, change directory to preload_database
Ensure any local changes you have are discarded or saved off
Run the following commands

git fetch origin # assuming "origin" points to the source URL
git rebase origin/master
cd ..  # to stream_engine root
git add preload_database
git commit -m "Issue #nnnnn <message>"
git push origin HEAD:nnnnn

Updating the ooi-metadata-service-api submodule

Within the stream_engine root, change directory to util/metadata_service/metadata_service_api
Ensure any local changes you have are discarded or saved off
Run the following commands

git fetch origin # assuming "origin" points to the source URL
git rebase origin/master
cd ../../..  # to stream_engine root
git add util/metadata_service/metadata_service_api
git commit -m "Issue #nnnnn <message>"
git push origin HEAD:nnnnn

Creating test NetCDF files (used in test_stream_request.py, among others)

NOTE: this was used for 13182, 14654 (may be applicable elsewhere). Ensure up-to-date data has been ingested for the NC files you want to create, then temporarily modify the stream_engine code as follows to create the files:

In util/netcdf_generator.py's _filter_parameters, change default_params to add: sci_water_pressure

In util/netcdf_generator.py's _create_files:
a) comment out the line: ds = rename_glider_lat_lon(stream_key, ds)
b) after the following code snippet (as of 9/3/2020)

                for external_stream_key in self.stream_request.external_includes:
                    for parameter in self.stream_request.external_includes[external_stream_key]:
                        long_parameter_name = external_stream_key.stream_name + "-" + parameter.name

  add the following code snippet to ensure these parameters are retained in the output

                        if parameter.name in ('m_gps_lat', 'm_gps_lon', 'm_lat', 'm_lon', 'interp_lat', 'interp_lon'):
                            params_to_include.append(long_parameter_name)
                            continue

Once this is done, run a data request against the data to produce the NC files, then back out these changes.
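To see in isolation what the added branch in _create_files does, the sketch below reproduces only that branch using stand-in objects. StreamKey, Parameter and retained_external_parameters are hypothetical stand-ins, not Stream Engine's own classes or functions, and the real loop contains further logic that is omitted here.

```python
from collections import namedtuple

# Stand-ins for Stream Engine's own objects (hypothetical, illustration only).
StreamKey = namedtuple("StreamKey", ["stream_name"])
Parameter = namedtuple("Parameter", ["name"])

# Parameter names that the temporary patch forces into the output.
RETAINED_NAMES = ('m_gps_lat', 'm_gps_lon', 'm_lat', 'm_lon',
                  'interp_lat', 'interp_lon')


def retained_external_parameters(external_includes):
    """Return the long names of external parameters kept by the patch."""
    params_to_include = []
    for external_stream_key in external_includes:
        for parameter in external_includes[external_stream_key]:
            long_parameter_name = external_stream_key.stream_name + "-" + parameter.name
            if parameter.name in RETAINED_NAMES:
                params_to_include.append(long_parameter_name)
    return params_to_include
```

Given an external stream named example_stream with parameters m_lat and temperature, only example_stream-m_lat is retained by this branch.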
