Ferry: big data development engine

Ferry lets you define, run, and deploy big data stacks on your local machine using Docker.

Ferry currently supports Hadoop/Yarn, GlusterFS/OpenMPI, and Cassandra, with more planned. With Ferry, developers can start building big data applications right away, without the pain of installing and configuring all the complex backend software.

Big Data in small places

Big data technologies are designed to operate and scale over many machines and usually consist of multiple functional parts. Developers interested in creating a Hadoop application, for example, must first download the appropriate packages, configure them to run in a single-machine environment (or across multiple machines for operational environments), and configure other required services (e.g., PostgreSQL).

Fortunately, Ferry and Docker vastly simplify this by capturing the entire setup in a set of lightweight Linux containers. This lets developers quickly stand up a big data stack and attach connectors/clients with zero manual configuration. Because Docker containers are so lightweight, you can even test multiple big data stacks with minimal overhead.

Getting started

Ferry is a Python application that runs on your local machine. To get started, make sure Docker is installed, then run pip install -U ferry. Afterwards, you can start creating your big data application. Here's an example stack:

{
  "backend": [
    {
      "storage": {
        "personality": "gluster",
        "instances": 2
      },
      "compute": [
        {
          "personality": "mpi",
          "instances": 2
        }
      ]
    }
  ],
  "connectors": [
    {"personality": "mpi-client"}
  ]
}

This stack consists of two GlusterFS data nodes and two OpenMPI compute nodes. There's also a Linux client that automatically connects to those backend components. To create this stack, type ferry start openmpi. Once the stack is created, you can log in by typing ferry ssh sa-0.
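To make the structure of the stack definition concrete, here is a minimal Python sketch that parses the example above and counts the instances requested for each personality. It is only an illustration of the file layout, not part of Ferry: summarize_stack is a hypothetical helper, and treating a connector with no "instances" field as a single instance is an assumption.

```python
import json

# The example stack definition from above, inlined so the sketch is self-contained.
STACK_JSON = """
{
  "backend": [
    {
      "storage": {"personality": "gluster", "instances": 2},
      "compute": [{"personality": "mpi", "instances": 2}]
    }
  ],
  "connectors": [{"personality": "mpi-client"}]
}
"""


def summarize_stack(stack):
    """Return a {personality: instance count} mapping for a stack definition.

    Hypothetical helper, not part of Ferry. Entries without an
    "instances" field are assumed to count as one instance.
    """
    counts = {}
    for backend in stack.get("backend", []):
        # Each backend has one storage layer and a list of compute layers.
        storage = backend.get("storage")
        layers = ([storage] if storage else []) + backend.get("compute", [])
        for layer in layers:
            name = layer["personality"]
            counts[name] = counts.get(name, 0) + layer.get("instances", 1)
    for connector in stack.get("connectors", []):
        name = connector["personality"]
        counts[name] = counts.get(name, 0) + connector.get("instances", 1)
    return counts


summary = summarize_stack(json.loads(STACK_JSON))
print(summary)  # {'gluster': 2, 'mpi': 2, 'mpi-client': 1}
```

The summary matches the description above: two GlusterFS storage nodes, two OpenMPI compute nodes, and one client connector.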

More detailed installation instructions and examples can be found here.

Under the hood

Ferry leverages some awesome open source projects:

  • Docker simplifies the management of Linux containers
  • Python is the programming language Ferry is written in
  • Hadoop is a general-purpose big data storage and processing framework
  • GlusterFS is a parallel filesystem actively developed by Red Hat
  • OpenMPI is a scalable MPI implementation focused on modeling & simulation
  • Cassandra is a highly scalable column store
  • PostgreSQL is a popular relational database
