
blaze's Introduction


Blaze extends the usability of NumPy and Pandas to distributed and out-of-core computing. Blaze provides an interface similar to that of the NumPy ND-Array or Pandas DataFrame but maps these familiar interfaces onto a variety of other computational engines like Postgres or Spark.

Example

Blaze separates the computations that we want to perform:

>>> from blaze import Symbol, compute

>>> accounts = Symbol('accounts', 'var * {id: int, name: string, amount: int}')

>>> deadbeats = accounts[accounts.amount < 0].name

from the representation of data:

>>> L = [[1, 'Alice',   100],
...      [2, 'Bob',    -200],
...      [3, 'Charlie', 300],
...      [4, 'Denis',   400],
...      [5, 'Edith',  -500]]

Blaze enables users to solve data-oriented problems:

>>> list(compute(deadbeats, L))
['Bob', 'Edith']

But the separation of expression from data allows us to switch between different backends.

Here we solve the same problem using Pandas instead of pure Python.

>>> from pandas import DataFrame

>>> df = DataFrame(L, columns=['id', 'name', 'amount'])

>>> compute(deadbeats, df)
1      Bob
4    Edith
Name: name, dtype: object

Blaze doesn't compute these results itself. Instead, it drives other projects to compute them. These projects range from simple pure Python iterators to powerful distributed Spark clusters. Blaze is built to be extended to new systems as they evolve.
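The idea of one expression driving several backends can be sketched without Blaze at all. The example below is only an illustration under stated assumptions, not Blaze's implementation or API: `NamesWhereAmountBelow` and `evaluate` are hypothetical names, and dispatch on the data's type is done with the standard library's `functools.singledispatch`. The same expression object is evaluated against a plain Python list and against an in-memory SQLite table.

```python
import sqlite3
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class NamesWhereAmountBelow:
    """Hypothetical expression: names of rows whose amount is below a threshold."""
    table: str
    threshold: int


@singledispatch
def evaluate(data, expr):
    """Pick a backend from the type of `data`; unknown types are rejected."""
    raise NotImplementedError(f"no backend for {type(data).__name__}")


@evaluate.register(list)
def _evaluate_list(data, expr):
    # Pure Python backend: each row is [id, name, amount].
    return [name for _id, name, amount in data if amount < expr.threshold]


@evaluate.register(sqlite3.Connection)
def _evaluate_sqlite(data, expr):
    # SQL backend: the expression is translated to a query and SQLite does the work.
    # The table name is interpolated directly, so it must come from trusted code.
    query = f"SELECT name FROM {expr.table} WHERE amount < ? ORDER BY id"
    return [row[0] for row in data.execute(query, (expr.threshold,))]


rows = [[1, 'Alice', 100],
        [2, 'Bob', -200],
        [3, 'Charlie', 300],
        [4, 'Denis', 400],
        [5, 'Edith', -500]]

# The expression is defined once, independent of where the data lives.
deadbeats = NamesWhereAmountBelow(table='accounts', threshold=0)

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE accounts (id INTEGER, name TEXT, amount INTEGER)')
conn.executemany('INSERT INTO accounts VALUES (?, ?, ?)', rows)

from_list = evaluate(rows, deadbeats)
from_sql = evaluate(conn, deadbeats)
```

Supporting another system in this sketch means registering one more function for its data type; the expression itself stays unchanged.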

Usable Abstractions

Blaze includes a rich set of computational and data primitives useful in building and communicating between computational systems. Blaze primitives can help with consistent and robust data migration, as well as remote execution.

Blaze aims to be a foundational project that allows users of other PyData projects (Pandas, Theano, Numba, SciPy, Scikit-Learn) to interoperate at both the application level and the library level, with the goal of lifting their existing functionality into a distributed context.

Getting Started

Development installation instructions are available here. A quick usage guide is available here.

Blaze is in development. We reserve the right to break the API.

Blaze needs your help. Blaze needs users with interesting problems. Blaze needs developers with expertise in new data formats and computational backends. Blaze needs core developers to tie everything together. Please e-mail the mailing list.

Source code for the latest development version of Blaze can be obtained from GitHub.

Documentation

Documentation is available at blaze.pydata.org/

License

Blaze development is sponsored by Continuum Analytics.

Released under BSD license. See LICENSE.txt for details.

