
scarf-postgres-exporter's Introduction

Scarf -> PostgreSQL Exporter

This script pulls down your raw Scarf data and sends it into a PostgreSQL DB.

This script is intended to be run as a daily batch job.

On an empty DB, the last 31 days of data (or BACKFILL_DAYS, if set) are backfilled, not including today. Subsequent runs import whatever is still missing, from the earliest missing day through the end of yesterday.
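The date-window rule above can be sketched roughly as follows. This is an illustration only, not the exporter's actual code: the helper name computeImportRange, the lastImported argument, and the choice of UTC day boundaries are all assumptions made for the example.

```typescript
// Hypothetical sketch of the import window; names and UTC handling are assumptions.
const DAY_MS = 24 * 60 * 60 * 1000;

interface ImportRange {
  start: Date; // first day to import (inclusive)
  end: Date;   // last day to import (inclusive), always yesterday
}

// Truncate a timestamp to midnight UTC of the same day.
function startOfUtcDay(d: Date): Date {
  return new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
}

// lastImported is the most recent day already in the DB, or null for an empty DB.
function computeImportRange(
  now: Date,
  lastImported: Date | null,
  backfillDays: number = 31
): ImportRange | null {
  const today = startOfUtcDay(now);
  const end = new Date(today.getTime() - DAY_MS); // yesterday; today is never imported
  const start =
    lastImported === null
      ? new Date(today.getTime() - backfillDays * DAY_MS) // empty DB: full backfill
      : new Date(startOfUtcDay(lastImported).getTime() + DAY_MS); // day after last import
  // Nothing to do when the DB is already current through yesterday.
  return start.getTime() > end.getTime() ? null : { start, end };
}
```

With the default of 31 days, an empty DB yields a window of exactly 31 days ending yesterday, and a DB that is already current yields null.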

Getting started

Ensure these environment variables are set:

SCARF_API_TOKEN=<your api token>
SCARF_ENTITY_NAME=<Scarf username or org name>
PSQL_CONN_STRING=<PSQL connection string>

You can optionally set:

BACKFILL_DAYS=31 #defaults to 31 if not set
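As an illustration of how these variables might be validated at startup, here is a minimal sketch. The function loadConfig and the ExporterConfig shape are hypothetical and not taken from the exporter's source; only the variable names and the default of 31 come from this README.

```typescript
// Hypothetical config loader; only the env var names and the default of 31 are from the README.
interface ExporterConfig {
  apiToken: string;
  entityName: string;
  connString: string;
  backfillDays: number;
}

function loadConfig(env: Record<string, string | undefined>): ExporterConfig {
  const required = ["SCARF_API_TOKEN", "SCARF_ENTITY_NAME", "PSQL_CONN_STRING"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // BACKFILL_DAYS is optional and falls back to 31 when unset or empty.
  const raw = env["BACKFILL_DAYS"];
  const backfillDays = raw === undefined || raw === "" ? 31 : Number.parseInt(raw, 10);
  if (!Number.isInteger(backfillDays) || backfillDays <= 0) {
    throw new Error(`BACKFILL_DAYS must be a positive integer, got: ${raw}`);
  }

  return {
    apiToken: env["SCARF_API_TOKEN"] as string,
    entityName: env["SCARF_ENTITY_NAME"] as string,
    connString: env["PSQL_CONN_STRING"] as string,
    backfillDays,
  };
}
```

In a Node.js program this would typically be called once as loadConfig(process.env), so that a missing variable fails fast with a clear message before any download starts.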

Note: the psql command is not installed by this project; it must already be available in your environment.

Then run:

$ npm i
$ npm run buildAndRun

Docker

$ docker run \
    -e SCARF_API_TOKEN=<> \
    -e SCARF_ENTITY_NAME=<> \
    -e PSQL_CONN_STRING=<> \
    -e BACKFILL_DAYS=<> \
    scarf.docker.scarf.sh/scarf-sh/scarf-postgres-exporter

Configuring on GitHub Actions

You can use GitHub Actions cron functionality to run this exporter periodically in your GitHub repo with an action like this:

name: Export Scarf data
on:
  schedule:
    - cron: '0 0 * * *'

jobs:
  export-scarf-data:
    runs-on: ubuntu-latest
    steps:
      - uses: docker://scarf.docker.scarf.sh/scarf-sh/scarf-postgres-exporter:latest
        env:
          SCARF_API_TOKEN: ${{ secrets.SCARF_API_TOKEN }}
          SCARF_ENTITY_NAME: {Your Scarf user or org name}
          PSQL_CONN_STRING: ${{ secrets.PSQL_CONN_STRING }}

Contributing

Code contributions are more than welcome! Please open an issue first to discuss your change before getting started. Feel free to jump into Scarf's community Slack if you'd like to chat with us directly.

Publishing the Docker container

The container is published to GHCR and distributed via the scarf.docker.scarf.sh endpoint.

Ensure you are building the container for architectures besides your own, like so:

$ docker buildx build --platform linux/amd64,linux/arm64 --push -t ghcr.io/scarf-sh/scarf-postgres-exporter .

License

Apache 2.0


scarf-postgres-exporter's Issues

table-def.sql missing column origin_state

Hey,
After trying it out, it was giving me an error:

Download Completed
csv downloaded
importing CSV into postgres
/root/test/scarf/index.ts:45
        reject(new Error(errorMessage));
        ^

Error: Command failed with code 1: ERROR: column "origin_state" of relation "scarf_events_raw" does not exist
    at ChildProcess.<anonymous> (/root/test/scarf/index.ts:45:16)
    at ChildProcess.emit (node:events:514:28)
    at maybeClose (node:internal/child_process:1105:16)
    at ChildProcess._handle.onexit (node:internal/child_process:305:5)

Node.js v20.5.1

Added origin_state text to table-def.sql.

Publish as a Docker container

It may be helpful to publish a container that has all of the dependencies pre-loaded. Currently, you need npm and psql in the environment.

Forward-compatibility with new columns

Currently, this script relies on the table schema matching the CSV. Adding new columns to the CSV breaks the importer, which is unnecessary. Let's ensure we're specifying the exact set of known columns we want to pull from the CSV and ignoring the extra columns.
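One way the idea in this issue could look is sketched below. It is an illustration under assumptions, not the project's implementation: projectColumns is a hypothetical helper, the CSV is assumed to be already parsed into a header row plus string rows, and the list of known columns would in practice come from table-def.sql.

```typescript
// Hypothetical helper: keep only the known columns, in a fixed order,
// and ignore any extra columns that appear in the CSV.
function projectColumns(
  header: string[],
  rows: string[][],
  knownColumns: string[]
): string[][] {
  // Resolve each known column to its position in this particular CSV.
  const indexes = knownColumns.map((name) => {
    const i = header.indexOf(name);
    if (i === -1) {
      // A missing known column is still an error; only extra columns are tolerated.
      throw new Error(`CSV is missing expected column: ${name}`);
    }
    return i;
  });
  // Rebuild every row using only the resolved positions; short rows are padded with "".
  return rows.map((row) => indexes.map((i) => row[i] ?? ""));
}
```

For example, with a header of id, origin_state, brand_new_column and known columns id and origin_state (id being a made-up column name), each output row contains just the first two values, so a new upstream column no longer breaks the import.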
