garudata's Introduction

Garudata

Garudata is a simplified showcase of various tools coming together to build an end-to-end data platform.

It is designed to streamline data ingestion, transformation, access, and sharing, allowing data users to easily understand data throughout its journey.

Demo

Visit https://71182141.xyz/ to see how the workflow management and dashboards work.

Technology

The data platform will be built on top of the following:

All tools (except for Nginx) will be deployed in containers. The host OS is Ubuntu Server 22.04.

Notice of change

It seems that Apache Superset does not support non-aggregated values in metrics (apache/superset#5570, apache/superset#19182). Because the weather data project needs this feature, the business intelligence tool will likely have to be replaced.

I am currently exploring Metabase as the replacement tool.

Usage

Requirements

  1. Install Docker and Compose.
  2. Set up a Docker network so that the containers can reach each other over a shared network. This project uses garudanet on 10.10.17.0/24:
    docker network create -d bridge --subnet 10.10.17.0/24 --gateway 10.10.17.1 garudanet
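
Running the command above a second time fails because the network already exists. A minimal sketch of an idempotent variant follows. It reuses the network name, subnet, and gateway from the command above. The function name `ensure_network` and the variable names are hypothetical and not part of this project.

```shell
#!/bin/sh
# Sketch: create the project's Docker network only if it is missing.
# The values mirror the command above; ensure_network is a hypothetical name.

NETWORK_NAME="garudanet"
NETWORK_SUBNET="10.10.17.0/24"
NETWORK_GATEWAY="10.10.17.1"

ensure_network() {
    # `docker network inspect` exits non-zero when the network does not exist
    # (or when Docker is unreachable, in which case the create step reports the error).
    if docker network inspect "$NETWORK_NAME" >/dev/null 2>&1; then
        echo "Network $NETWORK_NAME already exists; nothing to do."
    else
        docker network create -d bridge \
            --subnet "$NETWORK_SUBNET" \
            --gateway "$NETWORK_GATEWAY" \
            "$NETWORK_NAME"
    fi
}
```

Source the file and call `ensure_network` before starting the containers. Wrapping the check in a function keeps the setup safe to re-run, for example from a provisioning script.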
    

Roadmap

The list is not exhaustive and may change along the way:

  • Design end-to-end data platform architecture
  • Set up the server and the components
  • Set up Apache Spark
  • Develop a data journey use case (Note: Refer to Merpati project)
  • Design data model (Note: Refer to Merpati project)
  • Develop data extraction script (Note: Refer to Merpati project)
  • Deploy workflow using Airflow (Note: Refer to Jalak project)
  • Design simple dashboards
  • Manage metadata
  • Other improvements along the way

License

The data platform is a self-learning project, shared under the MIT License.

All included applications follow their respective licenses.

garudata's People

Contributors

stndn

garudata's Issues

Replace Superset with Metabase

With reference to apache/superset#5570 and apache/superset#19182, Superset currently cannot display metrics in charts with their as-is values, without aggregation.

Since non-aggregated values are important to some of the charts, we should explore another BI tool to accomplish our goals.

I am currently looking at Metabase as the alternative business intelligence tool.
