data-ingestion-agent

Ad Astra Docker agent code base for cloud integration without VPN tunnels

Pre-requisites

Docker version 18.02 or greater (Community Edition or any Enterprise Edition)

Oracle

When connecting to an Oracle database, the specified database user must have read/execute access to the following:

  • DBMS_METADATA.GET_DDL (function)
  • ALL_TABLES (view)
  • ALL_CONS_COLUMNS (view)
  • ALL_CONSTRAINTS (view)
  • All tables referenced by this agent (see 'Query Preview' section below)
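As a rough sketch of how these privileges might be granted, the shell function below prints candidate GRANT statements for a DBA to review. The function name `print_agent_grants` and the `SYS.` schema qualifiers are assumptions, not part of the agent. `EXECUTE` is granted on the `DBMS_METADATA` package because Oracle grants privileges on packages rather than on individual functions. Some installations already expose these objects to PUBLIC, in which case some statements may be unnecessary. Grants on the integration-specific tables are not covered here.

```shell
# Hypothetical helper: prints GRANT statements for review.
# It does not connect to Oracle or change anything.
print_agent_grants() {
  user="$1"
  if [ -z "$user" ]; then
    echo "usage: print_agent_grants <oracle_user>" >&2
    return 1
  fi
  # GET_DDL lives in the DBMS_METADATA package, so EXECUTE is granted on the package.
  printf 'GRANT EXECUTE ON SYS.DBMS_METADATA TO %s;\n' "$user"
  # Read access to the dictionary views listed above.
  for view in ALL_TABLES ALL_CONS_COLUMNS ALL_CONSTRAINTS; do
    printf 'GRANT SELECT ON SYS.%s TO %s;\n' "$view" "$user"
  done
}
```

For example, `print_agent_grants AGENT_USER > grants.sql` would produce a file that a DBA can inspect before running it.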

Resource Requirements

Docker Host

Memory:

  • 8GB (Recommended)

Container

Memory:

  • 4GB (Recommended)
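A quick way to compare a host against the recommendation is sketched below. The helper name `meets_recommended_memory` is hypothetical. It takes a byte count, such as the value reported by `docker info --format '{{.MemTotal}}'`, and compares it with a threshold in decimal gigabytes (8 by default); treating "GB" as decimal is an assumption.

```shell
# Hypothetical helper: succeeds when total_bytes is at least min_gb decimal gigabytes.
meets_recommended_memory() {
  total_bytes="$1"
  min_gb="${2:-8}"
  [ "$total_bytes" -ge $((min_gb * 1000 * 1000 * 1000)) ]
}

# Example (requires a running Docker daemon):
#   meets_recommended_memory "$(docker info --format '{{.MemTotal}}')" && echo "host OK"
```

Passing `4` as the second argument applies the same check to the container recommendation.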

Install

docker pull adastradev/data-ingestion-agent:latest

Run

# See Resource Requirements above for agent resource requirements
PROCESS_MAX_MEMORY_SIZE_MB=4096

docker run -d -t \
-m $PROCESS_MAX_MEMORY_SIZE_MB'M' \
-e PROCESS_MAX_MEMORY_SIZE_MB=$PROCESS_MAX_MEMORY_SIZE_MB \
-e ASTRA_CLOUD_USERNAME=<your_username> \
-e ASTRA_CLOUD_PASSWORD=<your_password> \
-e ORACLE_ENDPOINT=hostname:port/service_name \
-e ORACLE_USER=user \
-e ORACLE_PASSWORD=password \
--network=bridge \
adastradev/data-ingestion-agent:<tag>
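Passing credentials with `-e` on the command line leaves them in shell history and process listings. One alternative, sketched here, writes them to a file that `docker run --env-file` can read. The helper name `write_agent_env_file` and the file path are illustrative assumptions; the variable names are the ones used above.

```shell
# Hypothetical helper: writes the agent's variables from the current shell
# environment to an env file. A newly created file is readable only by the
# current user because of the umask set in the subshell.
# Assumes all six variables are meant to be set (not demo mode).
write_agent_env_file() {
  env_file="$1"
  (
    umask 077
    {
      printf 'PROCESS_MAX_MEMORY_SIZE_MB=%s\n' "${PROCESS_MAX_MEMORY_SIZE_MB:-4096}"
      printf 'ASTRA_CLOUD_USERNAME=%s\n' "${ASTRA_CLOUD_USERNAME:-}"
      printf 'ASTRA_CLOUD_PASSWORD=%s\n' "${ASTRA_CLOUD_PASSWORD:-}"
      printf 'ORACLE_ENDPOINT=%s\n' "${ORACLE_ENDPOINT:-}"
      printf 'ORACLE_USER=%s\n' "${ORACLE_USER:-}"
      printf 'ORACLE_PASSWORD=%s\n' "${ORACLE_PASSWORD:-}"
    } > "$env_file"
  )
}

# Example:
#   write_agent_env_file ./agent.env
#   docker run -d -t -m 4096M --env-file ./agent.env --network=bridge \
#     adastradev/data-ingestion-agent:<tag>
```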

To see a demo of the agent without connecting it to any data source, omit the ORACLE_* environment variables. In demo mode, the agent can verify connectivity to the Astra Cloud and push a mock dataset into S3.

The docker agent also supports the following optional arguments:

# [Demo, Banner, PeopleSoft, Colleague]
-e INTEGRATION_TYPE=Banner \
# [dev, prod]
-e DEFAULT_STAGE=prod \
-e AWS_REGION=us-east-1 \
-e CONCURRENT_CONNECTIONS=5 \
# [error, warn, info, verbose, debug, silly]
-e LOG_LEVEL=info
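The optional arguments are passed alongside the required ones in the same `docker run` call. The sketch below shows one possible combination. The wrapper name `run_agent_with_options`, the `AGENT_TAG` variable, and the `DOCKER` override (useful for a dry run with `DOCKER=echo`) are assumptions for illustration. The fallback values are the example values listed above, not confirmed agent defaults.

```shell
# Hypothetical wrapper: starts the agent with required and optional settings
# taken from the current shell environment.
# Set DOCKER=echo to print the command instead of running it.
run_agent_with_options() {
  ${DOCKER:-docker} run -d -t \
    -m "${PROCESS_MAX_MEMORY_SIZE_MB:-4096}M" \
    -e PROCESS_MAX_MEMORY_SIZE_MB="${PROCESS_MAX_MEMORY_SIZE_MB:-4096}" \
    -e ASTRA_CLOUD_USERNAME="${ASTRA_CLOUD_USERNAME:-}" \
    -e ASTRA_CLOUD_PASSWORD="${ASTRA_CLOUD_PASSWORD:-}" \
    -e ORACLE_ENDPOINT="${ORACLE_ENDPOINT:-}" \
    -e ORACLE_USER="${ORACLE_USER:-}" \
    -e ORACLE_PASSWORD="${ORACLE_PASSWORD:-}" \
    -e INTEGRATION_TYPE="${INTEGRATION_TYPE:-Banner}" \
    -e DEFAULT_STAGE="${DEFAULT_STAGE:-prod}" \
    -e AWS_REGION="${AWS_REGION:-us-east-1}" \
    -e CONCURRENT_CONNECTIONS="${CONCURRENT_CONNECTIONS:-5}" \
    -e LOG_LEVEL="${LOG_LEVEL:-info}" \
    --network=bridge \
    "adastradev/data-ingestion-agent:${AGENT_TAG:-latest}"
}
```

Running `DOCKER=echo LOG_LEVEL=debug run_agent_with_options` prints the assembled command so it can be checked before the container is started.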

Configure Network Access

The Data Ingestion Agent requires outbound internet access over HTTPS to Amazon Web Services (*.amazonaws.com). In general, give the agent outbound internet access by attaching it to a bridge network, as shown above. If running through an internet proxy, configure the proxy at docker run time with an environment variable, for example --env HTTPS_PROXY="https://127.0.0.1:3001". For more information, see Docker's "Configure Docker to use a proxy server" documentation.

No inbound access to the agent is required.

See Getting started with HTTPS proxies for more information.
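As an illustration of the proxy setting, the sketch below adds `HTTPS_PROXY` to the container environment only when it is set on the host. The helper name `proxy_env_args` is hypothetical.

```shell
# Hypothetical helper: prints the docker run arguments for the proxy,
# or nothing when HTTPS_PROXY is unset or empty on the host.
proxy_env_args() {
  if [ -n "${HTTPS_PROXY:-}" ]; then
    printf '%s %s\n' '--env' "HTTPS_PROXY=$HTTPS_PROXY"
  fi
}

# Example (word splitting is intentional; assumes the proxy URL contains no spaces):
#   docker run -d -t $(proxy_env_args) --network=bridge \
#     adastradev/data-ingestion-agent:latest
```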

Query Preview

Note: To see required table/field access for each integration type, see the following documentation:

Before sending any data, you can run the following docker command to log each query to the console for review. This command sends no data to the destination.

docker run -i \
-m $PROCESS_MAX_MEMORY_SIZE_MB'M' \
-e PROCESS_MAX_MEMORY_SIZE_MB=$PROCESS_MAX_MEMORY_SIZE_MB \
-e ASTRA_CLOUD_USERNAME=<your_username> \
-e ASTRA_CLOUD_PASSWORD=<your_password> \
--network=bridge \
adastradev/data-ingestion-agent:latest \
preview

Adhoc Ingestion

To begin the ingestion process immediately, run the following with the 'ingest' argument. The container terminates once the process has completed.

# See Resource Requirements above for agent resource requirements
PROCESS_MAX_MEMORY_SIZE_MB=4096

docker run -i \
-m $PROCESS_MAX_MEMORY_SIZE_MB'M' \
-e PROCESS_MAX_MEMORY_SIZE_MB=$PROCESS_MAX_MEMORY_SIZE_MB \
-e ASTRA_CLOUD_USERNAME=<your_username> \
-e ASTRA_CLOUD_PASSWORD=<your_password> \
-e ORACLE_ENDPOINT=hostname:port/service_name \
-e ORACLE_USER=user \
-e ORACLE_PASSWORD=password \
--network=bridge \
adastradev/data-ingestion-agent:latest \
ingest

Uninstall

The data ingestion agent is a long-running process that may be performing work when an uninstall occurs. To reduce the negative side effects of stopping the agent abruptly, always stop the container with a grace period, as shown below. Using docker kill directly is discouraged.

If multiple versions of the ingestion agent exist, be sure to specify the optional tag when removing an image.

docker stop --time 10 <container_name_or_id>
docker rm <container_name_or_id>
docker rmi <image>:<tag>
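The three steps can be wrapped in a small function so that the image is only removed after the container has stopped and been removed cleanly. This is a sketch: the name `uninstall_agent` and the `DOCKER` override (set `DOCKER=echo` for a dry run) are assumptions, not part of the agent.

```shell
# Hypothetical wrapper around the uninstall steps.
# Each step runs only if the previous one succeeded.
# Set DOCKER=echo to print the commands instead of running them.
uninstall_agent() {
  container="$1"
  image="$2"
  grace_seconds="${3:-10}"
  ${DOCKER:-docker} stop --time "$grace_seconds" "$container" &&
    ${DOCKER:-docker} rm "$container" &&
    ${DOCKER:-docker} rmi "$image"
}
```

For example, `uninstall_agent <container_name_or_id> <image>:<tag> 30` would allow a 30 second grace period instead of 10.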

Administration

Root Access to a running agent container

After starting the agent and confirming a healthy status, you can use the container's name or ID to open a command line (bash) inside the container as follows:

docker exec -it <container_id_or_name> /bin/bash

Container Health Monitoring

The data ingestion agent periodically informs docker of its current health. Using docker inspect, you can get a general idea of the application's state.

docker inspect --format='{{json .State.Health.Status}}' <container_name_or_id>
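When scripting around the agent, it can help to poll this status until the container reports healthy. The sketch below is one way to do that; the function name `wait_for_healthy`, its default attempt count and delay, and the `DOCKER` override are assumptions. Docker reports the health status as `starting`, `healthy`, or `unhealthy`.

```shell
# Hypothetical helper: polls the container's health status.
# Returns 0 once the status is "healthy"; returns 1 if the status is
# "unhealthy" or if the attempts run out first.
wait_for_healthy() {
  container="$1"
  attempts="${2:-30}"
  delay_seconds="${3:-2}"
  while [ "$attempts" -gt 0 ]; do
    status="$(${DOCKER:-docker} inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null)"
    case "$status" in
      healthy) return 0 ;;
      unhealthy) return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "$delay_seconds"
  done
  return 1
}
```

A typical call might be `wait_for_healthy <container_name_or_id> && docker exec -it <container_name_or_id> /bin/bash`.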

To monitor container resource usage run the following:

docker stats <container_name_or_id>

View agent logs

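# These commands assume the container is named "dia"; substitute your container's name or ID
# View console output from the container host
docker logs dia
# Copy/export logs from the container to the host (create the destination's parent directory first)
mkdir -p /tmp/log
docker cp dia:/var/log/dia /tmp/log/dia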

Development

See the Development guide for Data Ingestion Agent
