
dakshinka / weaverbird

This project forked from toucantoco/weaverbird


A visual data pipeline builder with various backends

Home Page: https://weaverbird.toucantoco.dev

License: BSD 3-Clause "New" or "Revised" License

Shell 0.04% JavaScript 0.16% Python 27.60% TypeScript 50.86% Makefile 0.05% HTML 0.20% Vue 20.92% Dockerfile 0.04% SCSS 0.13%

Weaverbird

Weaverbird is Toucan Toco's data pipeline toolkit. It contains:

  • a pipeline Data Model, currently supporting more than 40 transformation steps
  • a friendly User Interface for building those pipelines without writing any code, made with TypeScript, VueJS & VueX
  • a set of back-ends to run those pipelines:
    • the MongoDB Translator, which generates Mongo queries, written in TypeScript
    • the Pandas Executor, which computes the result using Pandas dataframes, written in Python
    • the Snowflake SQL translator, written in Python
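
To make the idea of a pipeline concrete, here is a minimal sketch in plain Python. It assumes a pipeline is an ordered list of step objects applied one after the other. The step names and fields (`domain`, `filter`, `rename`), the `DATASETS` store and the `run_pipeline` helper are all hypothetical and only illustrate the concept; they are not Weaverbird's actual data model or executor, which are described in the documentation.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical in-memory datastore: domain name -> rows.
DATASETS: Dict[str, List[Row]] = {
    "sales": [
        {"city": "Paris", "amount": 10},
        {"city": "Lyon", "amount": 3},
        {"city": "Paris", "amount": 7},
    ]
}

# A pipeline is an ordered list of steps (names and fields are assumptions).
PIPELINE: List[Dict[str, Any]] = [
    {"name": "domain", "domain": "sales"},
    {"name": "filter", "column": "city", "value": "Paris"},
    {"name": "rename", "old": "amount", "new": "total"},
]


def run_pipeline(pipeline: List[Dict[str, Any]], datasets: Dict[str, List[Row]]) -> List[Row]:
    """Apply each step in order and return the resulting rows."""
    rows: List[Row] = []
    for step in pipeline:
        if step["name"] == "domain":
            # Select the source dataset (copy rows so the store is not mutated).
            rows = [dict(row) for row in datasets[step["domain"]]]
        elif step["name"] == "filter":
            # Keep only the rows whose column equals the given value.
            rows = [row for row in rows if row[step["column"]] == step["value"]]
        elif step["name"] == "rename":
            # Rename one column, leaving the others untouched.
            rows = [
                {(step["new"] if key == step["old"] else key): value for key, value in row.items()}
                for row in rows
            ]
        else:
            raise ValueError(f"unknown step: {step['name']}")
    return rows


if __name__ == "__main__":
    print(run_pipeline(PIPELINE, DATASETS))
```

A back-end plays the role of `run_pipeline` here: a translator would turn the same list of steps into a query, while an executor would compute the result directly.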

For in-depth user and technical documentation, have a look at weaverbird.toucantoco.dev or at the documentation's source files in the docs directory.

Last but not least, you can play with Weaverbird on our online playground!


Project setup

yarn install

See the Dockerfile for the supported Node version.

Compiles target library

yarn build-bundle

This will generate an importable JS weaverbird library in the dist directory.

Important note: While we do our best to embrace semantic versioning, we do not guarantee full backward compatibility until version 1.0.0 is released.

Run your tests

The basic command to run all tests is:

yarn test

Lints and fixes files

yarn format:fix
yarn lint --fix

Build the API documentation

yarn build-doc

This will run typedoc on the src/ directory and generate the corresponding documentation in the dist/docs directory.

Build and run the documentation website

The web documentation is powered by Jekyll.

You can find all the sources in the docs folder.

To build and serve the documentation locally, you need Ruby and gem installed. Then:

# install ruby
sudo apt install ruby ruby-dev

# install bundler
gem install bundler

# install the dependencies, then run Jekyll with a local server:
cd docs
bundle install
bundle exec jekyll serve

Enrich it!

Put your .md file into the docs folder. You can also add a sub-folder for better organization.

In your .md file, don't forget to declare this front matter at the beginning of the file:

---
title: your title doc name
permalink: /docs/your-page-doc-name/
---

Finally, to make your page appear in the documentation navigation, add it to `_data/docs.yml`.

Example:

- title: Technical documentation
  docs:
  - steps
  - stepforms
  - your-page-doc-name
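
As a small illustration of the convention above, the following sketch generates the front matter for a new page and the matching navigation entry. The helper names `front_matter` and `nav_entry` are hypothetical and not part of the repository; the sketch only assumes that the permalink follows the `/docs/<page-name>/` pattern shown above.

```python
def front_matter(title: str, page_name: str) -> str:
    """Return the front matter block expected at the top of a docs page."""
    return f"---\ntitle: {title}\npermalink: /docs/{page_name}/\n---\n"


def nav_entry(page_name: str) -> str:
    """Return the line to add under `docs:` in _data/docs.yml."""
    return f"  - {page_name}"


if __name__ == "__main__":
    print(front_matter("your title doc name", "your-page-doc-name"))
    print(nav_entry("your-page-doc-name"))
```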

Run the storybook

Storybook uses the bundled lib, so all showcased components must be in the public API.

yarn storybook

This will run storybook, displaying the stories (use cases) of UI components.

Stories are defined in the stories/ directory.

Publication

This library is published on npm under the name weaverbird automatically each time a release is created in GitHub.

Create a release (frontend)

  • Define new version using semantic versioning

  • Create a new local branch release/X.Y.Z from master

    ex: release/0.20.0

  • Update the version property in package.json and in sonar-project.properties

  • Review the differences between the last release and the current state, and update CHANGELOG.md accordingly

    • Delete the ##changes title at the start of CHANGELOG.md, if present

    • Add the version and date at the start of CHANGELOG.md, following this convention

      [X.Y.Z] - YYYY-MM-DD
      

      ex: [0.20.0] - 2020-08-03

    • At the end of CHANGELOG.md, add a compare link from the previous version to this one

      [X.Y.Z]: https://github.com/ToucanToco/weaverbird/compare/voldX.oldY.oldZ...vX.Y.Z
      

      ex: [0.20.0]: https://github.com/ToucanToco/weaverbird/compare/v0.19.2...v0.20.0

  • Commit changes with version number

    ex: v0.20.0

  • Push branch

  • Create a pull request into master from your branch

  • When the pull request is merged, create a release with the version number as both tag version and title (no description needed)

    ex: v0.20.0

  • Click the "Publish release" button (this automatically creates the tag and triggers the package publication)
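
The naming conventions in the steps above can be derived from two version numbers and a date. The following is a minimal sketch; the function name `release_names` and its return structure are hypothetical, and the repository URL is taken from the compare link shown above.

```python
from datetime import date
from typing import Dict

REPO_URL = "https://github.com/ToucanToco/weaverbird"


def release_names(previous: str, new: str, released_on: date) -> Dict[str, str]:
    """Build the branch, tag and CHANGELOG.md strings for a front-end release."""
    return {
        "branch": f"release/{new}",
        "tag": f"v{new}",
        # Heading placed at the start of CHANGELOG.md.
        "changelog_heading": f"[{new}] - {released_on.isoformat()}",
        # Compare link placed at the end of CHANGELOG.md.
        "changelog_link": f"[{new}]: {REPO_URL}/compare/v{previous}...v{new}",
    }


if __name__ == "__main__":
    for key, value in release_names("0.19.2", "0.20.0", date(2020, 8, 3)).items():
        print(f"{key}: {value}")
```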

Create a release (backend)

  • Create a new local branch chore/bump-server-version-x-x-x

  • Edit server/pyproject.toml & increment the version in [tool.poetry] section

  • Push branch

  • Create a pull request into master from your branch

  • Once the PR is approved and merged into master, publish the release on PyPI with make build and make upload
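
The version bump itself can be scripted. The sketch below rewrites the `version` key of the `[tool.poetry]` section in the text of a pyproject.toml file. The helper names are hypothetical, the sample file content is made up for illustration, and the branch name assumes that `x-x-x` stands for the new version with dots replaced by dashes.

```python
import re

# Matches the [tool.poetry] header, any lines up to the next section,
# and the opening of the version string inside that section.
_VERSION_RE = re.compile(r'(\[tool\.poetry\][^\[]*?^version\s*=\s*")[^"]*(")', re.MULTILINE)


def bump_poetry_version(pyproject_text: str, new_version: str) -> str:
    """Return the pyproject.toml text with the [tool.poetry] version replaced."""
    updated, count = _VERSION_RE.subn(
        lambda match: match.group(1) + new_version + match.group(2),
        pyproject_text,
        count=1,
    )
    if count != 1:
        raise ValueError("no version found in the [tool.poetry] section")
    return updated


def bump_branch_name(new_version: str) -> str:
    """Assumed reading of the chore/bump-server-version-x-x-x convention."""
    return "chore/bump-server-version-" + new_version.replace(".", "-")


if __name__ == "__main__":
    # Made-up sample content, not the real server/pyproject.toml.
    sample = '[tool.poetry]\nname = "example"\nversion = "0.1.0"\n\n[tool.other]\nversion = "9.9.9"\n'
    print(bump_poetry_version(sample, "0.2.0"))
    print(bump_branch_name("0.2.0"))
```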

Usage as library

Without any module bundler

<!-- Import styles (file name is an assumption: use the CSS file actually emitted in dist/) -->
<link rel="stylesheet" href="weaverbird/dist/weaverbird.css" />

<!-- Import scripts -->
<script src="vue.js"></script>
<script src="weaverbird/dist/weaverbird.umd.min.js"></script>

With an ES module bundler (typically webpack, vite or rollup)

import { Pipeline } from "weaverbird";

By default, the CommonJS module is imported. If you prefer the ES module version, import dist/weaverbird.esm.js.

API

Modules

See the documentation generated in dist/docs directory

Styles

TODO: document here the Sass variables that can be overridden

Playground

The /playground directory hosts a demo application with a small server that showcases how to integrate the exported components and API. To run it, use the provided Dockerfile:

docker build -t weaverbird-playground .
docker run -p 5000:5000 --rm -d weaverbird-playground

which is basically a shortcut for the following steps:

cd ui
# install front-end dependencies
yarn
# build the front-end bundle
yarn build-bundle

cd ../server
# install the backend dependencies
pip install -e ".[playground,all]"
# run the server
QUART_APP=playground QUART_ENV=development quart run
# note: in the dockerfile, a production-ready webserver is used instead of a development one

Once the server is started, you should be able to open http://localhost:5000 in your favorite browser and enjoy!

When developing the UI, use yarn vite, which provides its own dev server.

Mongo back-end

The default back-end for the playground is a small server passing queries to MongoDB. Connect the playground to a running MongoDB instance with the environment variables:

  • MONGODB_CONNECTION_STRING (defaults to localhost:27017)
  • MONGODB_DATABASE_NAME (defaults to 'data')
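
For illustration, this is how such environment variables are typically read with their defaults in Python. It is a sketch only: the `mongo_settings` helper is hypothetical and is not taken from the playground server's code.

```python
import os
from typing import Dict, Mapping, Optional


def mongo_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the MongoDB settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "connection_string": env.get("MONGODB_CONNECTION_STRING", "localhost:27017"),
        "database_name": env.get("MONGODB_DATABASE_NAME", "data"),
    }


if __name__ == "__main__":
    print(mongo_settings())
```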

Run Weaverbird + MongoDB or PostgreSQL with docker-compose

If you want to test the playground with a populated MongoDB or PostgreSQL instance, you can use docker-compose:

# At the directory root
# For MongoDB
docker-compose up -d weaverbird mongodb
# For PostgreSQL
docker-compose up -d weaverbird postgres

Pandas back-end

An alternative back-end for the playground is a small server written in Python that executes pipelines with pandas. Add ?backend=pandas to the URL to see it in action.

Athena back-end

In order to run the playground with AWS Athena, make sure the following environment variables are set before starting the docker-compose stack:

  • ATHENA_REGION
  • ATHENA_SECRET_ACCESS_KEY
  • ATHENA_ACCESS_KEY_ID
  • ATHENA_DATABASE
  • ATHENA_OUTPUT
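
Before starting the stack, a quick check can confirm that none of these variables is missing. The sketch below is a hypothetical pre-flight helper, not part of the repository; it treats an empty value as missing.

```python
import os
from typing import List, Mapping, Optional

ATHENA_VARIABLES = (
    "ATHENA_REGION",
    "ATHENA_SECRET_ACCESS_KEY",
    "ATHENA_ACCESS_KEY_ID",
    "ATHENA_DATABASE",
    "ATHENA_OUTPUT",
)


def missing_athena_variables(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of the Athena variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in ATHENA_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_athena_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Athena environment variables are set.")
```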

BigQuery back-end

In order to run the playground with Google BigQuery, download the JSON file containing the credentials for your service account, place it at the root of the weaverbird repo and name it bigquery-credentials.json. It will be mounted inside of the playground container.

Use your own data files

CSVs from playground/datastore are available to use in the playground with pandas. You can override this folder when running the container by adding a volume parameter: -v /path/to/your/folder/with/csv:/weaverbird/playground/datastore.

Use another back-end

You can point a front-end to another API by using a query parameter: ?api=http://localhost:5000. This is particularly useful for front-end development, with yarn vite.

To avoid CORS issues when front-end and back-end are on different domains, whitelist your front-end domain using ALLOW_ORIGIN=<front-end domain> or ALLOW_ORIGIN=*.
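
The `backend` and `api` query parameters mentioned above can be combined into a playground URL programmatically. This is a minimal sketch using the standard library; the `playground_url` helper is hypothetical, and the front-end address `http://localhost:8080/` is an assumed example, not a documented default.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def playground_url(base: str, backend: Optional[str] = None, api: Optional[str] = None) -> str:
    """Add the `backend` and/or `api` query parameters to a playground URL."""
    parts = urlsplit(base)
    query = dict(parse_qsl(parts.query))
    if backend:
        query["backend"] = backend
    if api:
        # urlencode percent-encodes the API address so it survives as one value.
        query["api"] = api
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(playground_url("http://localhost:5000/", backend="pandas"))
    # Assumed front-end dev server address, pointing at a back-end on port 5000.
    print(playground_url("http://localhost:8080/", api="http://localhost:5000"))
```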

Contributors

davinov, alice-sevin, vdestraitt, dependabot[bot], lukapeschke, adimasci, sanix-darker, dalanir, luc-leonard, raphaelvignes, fspot, austil, zegonz, erenard, prettywood, bloodstorms, adimascio, ninofiliu, f-bb-toucantoco, jgundermann, rhuille, jeremypinhel, andreamouraud, charlesrngrd, aklom, hachichaud, chrismeyersfsu, jacobtoucantoco, toucantokar, tarekmkl
