Bach

Orchestrate a cluster of preemptible virtual machines on Google Compute Engine.

Prerequisites

  • Node.js
    • Installing Node.js and npm
    • Running as a command line tool
  • Docker
    • Setting up Docker to run on a local machine
    • Setting up a Docker host on a local subnet
  • Google Cloud CLI
    • Get started with Google Cloud
  • Slave Docker/VM image

Installation

Install via npm

npm i <tbc> -g

Install from source

Linking allows changes made to the source code to be reflected in the tool immediately.

git clone https://github.com/conorturner/bach.git && \
cd bach && \
npm link

Usage

The bachfile

Applications are defined using a 'bachfile'. It specifies the location of the binary file to be run in the computation and defines the hardware requirements for each slave node.

Map Reduce

This use case supports a basic map and collect phase reading from any HTTP storage supporting the 'range' header. Documentation is available here.
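As an illustration of how the 'range' header lets several nodes read disjoint parts of one file, here is a minimal sketch that splits a file into contiguous byte ranges and formats each as a Range header value. The helpers `splitRanges` and `rangeHeader` are hypothetical and are not part of bach; how bach actually partitions its input is not described here.

```javascript
// Hypothetical helpers for illustration only; not bach's actual implementation.

// Split `totalBytes` into at most `parts` contiguous, non-overlapping ranges.
// Offsets are inclusive, matching the HTTP Range header convention.
function splitRanges(totalBytes, parts) {
  if (!Number.isInteger(totalBytes) || totalBytes <= 0) {
    throw new RangeError("totalBytes must be a positive integer");
  }
  if (!Number.isInteger(parts) || parts <= 0) {
    throw new RangeError("parts must be a positive integer");
  }
  const count = Math.min(parts, totalBytes); // never create empty ranges
  const base = Math.floor(totalBytes / count);
  const extra = totalBytes % count; // the first `extra` ranges get one more byte
  const ranges = [];
  let start = 0;
  for (let i = 0; i < count; i++) {
    const size = base + (i < extra ? 1 : 0);
    ranges.push({ start, end: start + size - 1 });
    start += size;
  }
  return ranges;
}

// Format one range as the value of an HTTP Range request header.
function rangeHeader({ start, end }) {
  return `bytes=${start}-${end}`;
}

const headers = splitRanges(1000, 3).map(rangeHeader);
console.log(headers);
// → [ 'bytes=0-333', 'bytes=334-666', 'bytes=667-999' ]
```

Each value would be sent as the `Range` header of a GET request against the storage URL. A split on raw byte offsets can cut a record in half, so a real map phase would also need a rule for handling records that straddle a boundary.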

Stream Processing

Documentation is available here.

Interesting Datasets

Good source of datasets: https://registry.opendata.aws/

US IRS filings https://registry.opendata.aws/irs990/ https://s3.amazonaws.com/irs-form-990/index_20xx.json

Massive web crawl database https://registry.opendata.aws/commoncrawl/

NEXRAD weather radar data https://docs.opendata.aws/noaa-nexrad/readme.html Data can be searched by prefix, for example https://noaa-nexrad-level2.s3.amazonaws.com/?prefix=2019/01/19
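A small sketch of building that listing URL for a given day, assuming the prefix is the UTC date in year/month/day form as in the example URL. The function name `nexradListingUrl` is hypothetical.

```javascript
// Hypothetical helper: build the bucket listing URL for one UTC day,
// following the prefix pattern of the example URL (YYYY/MM/DD).
function nexradListingUrl(date) {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `https://noaa-nexrad-level2.s3.amazonaws.com/?prefix=${yyyy}/${mm}/${dd}`;
}

console.log(nexradListingUrl(new Date(Date.UTC(2019, 0, 19))));
// → https://noaa-nexrad-level2.s3.amazonaws.com/?prefix=2019/01/19
```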

GDELT, a database covering a subset of the 'events' that occur around the world, presumably scraped from the internet. https://www.gdeltproject.org/#intro A smaller 1.1 GB version of the dataset: http://data.gdeltproject.org/events/GDELT.MASTERREDUCEDV2.1979-2013.zip

Headers for the 30 GB taxi dataset http://www.debs2015.org/call-grand-challenge.html
