Giter VIP home page Giter VIP logo

CI Slack | Docs home | Free account | Data platform comparison reference | Email list

Build millisecond-latency, scalable, future-proof data pipelines in minutes.

Estuary Flow is a DataOps platform that integrates all of the systems you use to produce, process, and consume data.

Flow unifies today's batch and streaming paradigms so that your systems – current and future – are synchronized around the same datasets, updating in milliseconds.

With a Flow pipeline, you:

  • πŸ“· Capture data from your systems, services, and SaaS into collections: millisecond-latency datasets that are stored as regular files of JSON data, right in your cloud storage bucket.

  • 🎯 Materialize a collection as a view within another system, such as a database, key/value store, Webhook API, or pub/sub service.

  • 🌊 Derive new collections by transforming from other collections, using the full gamut of stateful stream workflow, joins, and aggregations β€” in real time.

Workflow Overview

Using Flow

Flow combines a low-code UI for essential workflows and a CLI for fine-grain control over your pipelines. Together, the two interfaces comprise Flow's unified platform. You can switch seamlessly between them as you build and refine your pipelines, and collaborate with a wider breadth of data stakeholders.

➑️ Sign up for a free Flow account here.

See the BSL license for information on using Flow outside the managed offering.

Resources

Support

The best (and fastest) way to get support from the Estuary team is to join the community on Slack.

You can also email us.

Connectors

Captures and materializations use connectors: plug-able components that integrate Flow with external data systems. Estuary's in-house connectors focus on high-scale technology systems and change data capture (think databases, pub-sub, and filestores).

Flow can run Airbyte community connectors using airbyte-to-flow, allowing us to support a greater variety of SaaS systems.

See our website for the full list of currently supported connectors.

If you don't see what you need, request it here.

How does it work?

Flow builds on a real-time streaming broker created by the same founding team called Gazette.

Because of this, Flow collections are both a batch dataset – they're stored as a structured "data lake" of general-purpose files in cloud storage – and a stream, able to commit new documents and forward them to readers within milliseconds. New use cases read directly from cloud storage for high-scale backfills of history, and seamlessly transition to low-latency streaming on reaching the present.

What makes Flow so fast?

Flow mixes a variety of architectural techniques to achieve great throughput without adding latency:

  • Optimistic pipelining, using the natural back-pressure of systems to which data is committed.
  • Leveraging reduce annotations to group collection documents by key wherever possible, in memory, before writing them out.
  • Co-locating derivation states (registers) with derivation compute: registers live in an embedded RocksDB that's replicated for durability and machine re-assignment. They update in memory and only write out at transaction boundaries.
  • Vectorizing the work done in external Remote Procedure Calls (RPCs) and even process-internal operations.
  • Marrying the development velocity of Go with the raw performance of Rust, using a zero-copy CGO service channel.

Estuary Technologies, Inc's Projects

Estuary Technologies, Inc doesn’t have any public repositories yet.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.