Giter VIP home page Giter VIP logo

metaflow-service's Introduction

Metaflow Metadata Service

Metadata service implementation for Metaflow. This provides a thin wrapper around a database and keeps track of metadata associated with metaflow entities such as Flows, Runs, Steps, Tasks, and Artifacts.

For more information, see Metaflow's website

Migration Service

The Migration service is a tool to help users manage underlying DB migrations and launch the most recent compatible version of the metadata service

Note that it is possible to run the two services independently and a Dockerfile is supplied for each service. However the default Dockerfile combines the two services.

Also note that at runtime the migration service and the metadata service are completely disjoint and do not communicate with each other

Getting Started

The service depends on the following Environment Variables to be set:

  • MF_METADATA_DB_HOST [defaults to localhost]
  • MF_METADATA_DB_PORT [defaults to 5432]
  • MF_METADATA_DB_USER [defaults to postgres]
  • MF_METADATA_DB_PSWD [defaults to postgres]
  • MF_METADATA_DB_NAME [defaults to postgres]

Optionally you can also overrider the host and port the service runs on

  • MF_METADATA_PORT [defaults to 8080]
  • MF_MIGRATION_PORT [defaults to 8082]
  • MF_METADATA_HOST [defaults to 0.0.0.0]
pip3 install ./
python3 -m services.metadata_service.server

Swagger UI: http://localhost:8080/api/doc

Using docker-compose

docker-compose up -d

to access the container run

docker exec -it metadata_service /bin/bash

within the container curl the service directly

curl localhost:8080/ping

Using published image on DockerHub

Latest release of the image is available on dockerhub

docker pull netflixoss/metaflow_metadata_service

Be sure to set the proper env variables when running the image

docker run -e MF_METADATA_DB_HOST='<instance_name>.us-east-1.rds.amazonaws.com' \
-e MF_METADATA_DB_PORT=5432 \
-e MF_METADATA_DB_USER='postgres' \
-e MF_METADATA_DB_PSWD='postgres' \
-e MF_METADATA_DB_NAME='metaflow' \
-it -p 8082:8082 -p 8080:8080 metaflow_metadata_service

Migrating to the latest db schema

Note may need to do a rolling restart to get latest version of the image if you don't have it already

You can manage the migration either via the api provided or with the utility cli provided with migration_tools.py

  • check status and note version you are on
    • Api: /db_schema_status
    • cli: python3 migration_tools.py db-status
  • see if there are migrations to be run
    • if there are any migrations to be run is_up_to_date should be false and a list of migrations to be applied will be shown under unapplied_migrations
  • take backup of db
    • in case anything goes wrong it is a good idea to take a back up of the db
  • migrations may cause downtime depending on what is being run as part of the migration
  • Note concurrent updates are not supported. it may be advisable to reduce your cluster size to a single node
  • upgrade db schema
    • Api: /upgrade
    • cli: python3 migration_tools.py upgrade
  • check status again to verify you are on up to date version
    • Api: /db_schema_status
    • cli: python3 migration_tools.py db-status
    • Note that is_up_to_date should be set to True and migration_in_progress should be set to False
  • do a rolling restart of the metadata service cluster
    • In order for the migration to be effective a full restart of the containers is required
  • latest available version of service should be ready
    • cli: python3 migration_tools.py metadata-service-version
  • If you had previously scaled down your cluster it should be safe to return it to the desired number of containers

Under the Hood: What is going on in the Docker Container

Within the published metaflow_metadata_service image the migration service is packaged along with the latest version of the metadata service compatible with every version of the db. This means that multiple versions of the metadata service comes bundled with the image, each is installed under a different virtual env.

When the container spins up, the migration service is launched first and determines what virtualenv to activate depending on the schema version of the DB. This will determine which version of the metadata service will run.

Get in Touch

There are several ways to get in touch with us:

metaflow-service's People

Contributors

ferras avatar alau avatar crk-codaio avatar msavela avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.