# Data infra build

As part of the "Daas (Data as a service) repo", this project shows how to build DS/DE environments with Docker from scratch. It focuses on:

1) System design through practical use cases
2) Docker, package, and library environment setup
3) Test, staging, and production develop/deploy workflows (CI/CD style; see the sketch below)
Related repos:
- Daas (Data as a service) repo: Data infra build -> ETL build -> DS application demo
- Airflow Heroku demo: airflow-heroku-dev
- MLflow Heroku demo: mlflow-heroku-dev
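To make the develop/deploy workflow concrete, here is a minimal CI sketch that builds one sub-project's image and pushes it to Docker Hub, in the spirit of the repo's `deploy_dockerhub.sh`. The workflow file, secrets, context path, and image tag are all assumptions for illustration, not part of this repo:

```yaml
# .github/workflows/deploy.yml -- hypothetical CI sketch; names, secrets, and tags are assumptions
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      # check out the repo so the Docker build context is available
      - uses: actions/checkout@v4
      # log in to Docker Hub using repository secrets
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      # build one sub-project's image and push it to the registry
      - uses: docker/build-push-action@v5
        with:
          context: ./airflow_in_docker_compose
          push: true
          tags: your-dockerhub-user/airflow-in-docker-compose:latest
```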
# Main projects
```
├── airflow_in_docker_compose
├── celery_redis_flower_infra
├── deploy_dockerhub.sh
├── hadoop_yarn_spark
├── kafka-zookeeper
├── kafka_zookeeper_redis_infra
├── mysql-master-slave
```
## Hadoop
- hadoop_yarn_spark (batch)
- hadoop_yarn_spark (stream)
- hadoop namenode, datanode (see the sketch below)
- hadoop_yarn_flink
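For the namenode/datanode piece, a minimal single-node HDFS sketch in docker-compose, assuming the community bde2020 Hadoop images; image tags and ports are assumptions, not pinned by this repo:

```yaml
# docker-compose.yml -- minimal HDFS cluster sketch (image tags/ports are assumptions)
version: "3"
services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    environment:
      - CLUSTER_NAME=demo-cluster
      - CORE_CONF_fs_defaultFS=hdfs://namenode:9000
    ports:
      - "9870:9870"   # Hadoop 3 namenode web UI
    volumes:
      - namenode:/hadoop/dfs/name
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    environment:
      - CORE_CONF_fs_defaultFS=hdfs://namenode:9000
    depends_on:
      - namenode
    volumes:
      - datanode:/hadoop/dfs/data
volumes:
  namenode:
  datanode:
```

After `docker-compose up -d`, the namenode web UI should be reachable on `localhost:9870`; YARN and Spark services would be added as further services on top of this.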
## Kafka
- Kafka producer, consumer, zk (see the sketch below)
- Kafka mirror
- Kafka-ELK-DB
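A hedged single-broker sketch of the "producer, consumer, zk" layout, assuming the Confluent community images; the listener and port settings are assumptions for a local demo:

```yaml
# docker-compose.yml -- single-broker Kafka + ZooKeeper sketch (images/ports are assumptions)
version: "3"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:7.4.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # advertise the host-mapped listener so clients outside the compose network can connect
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      # single broker, so internal topics cannot have replication factor > 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

Producers and consumers on the host can then target `localhost:9092`; a second cluster plus MirrorMaker would extend this toward the Kafka mirror setup above.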
## Airflow
- airflow app in docker compose (see the sketch below)
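A minimal sketch of Airflow in docker-compose, assuming the official `apache/airflow` image in its dev-only `standalone` mode; a production-style compose would split the webserver and scheduler into separate services backed by Postgres:

```yaml
# docker-compose.yml -- single-container Airflow sketch (image tag and paths are assumptions)
version: "3"
services:
  airflow:
    image: apache/airflow:2.7.3
    command: standalone          # dev-only: webserver + scheduler + SQLite in one container
    ports:
      - "8080:8080"              # Airflow web UI
    environment:
      AIRFLOW__CORE__LOAD_EXAMPLES: "false"
    volumes:
      - ./dags:/opt/airflow/dags # mount local DAG files into the container
```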
## DB
- DB sharding (partition)
- DB replica
- DB master-follower (see the sketch below)
- DB master-master
- DB binlog stream (Kafka) to BigQuery/DW
- DB binlog stream to ELK
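Matching the `mysql-master-slave` directory above, a sketch of a two-node MySQL source/replica pair; versions, passwords, and server ids are assumptions, and the replication user plus the `CHANGE REPLICATION SOURCE TO` step (usually done via init scripts) is omitted:

```yaml
# docker-compose.yml -- MySQL source/replica sketch (versions, passwords, ids are assumptions)
version: "3"
services:
  mysql-master:
    image: mysql:8.0
    # binlog must be enabled on the source for replication
    command: --server-id=1 --log-bin=mysql-bin --binlog-format=ROW
    environment:
      MYSQL_ROOT_PASSWORD: example
      MYSQL_DATABASE: demo
    ports:
      - "3306:3306"
  mysql-slave:
    image: mysql:8.0
    # replicas need a distinct server-id; read_only guards against accidental writes
    command: --server-id=2 --relay-log=relay-bin --read-only=1
    environment:
      MYSQL_ROOT_PASSWORD: example
    ports:
      - "3307:3306"
    depends_on:
      - mysql-master
```

Once both nodes are up, a replication user is created on the source and the replica is pointed at it with `CHANGE REPLICATION SOURCE TO`; the same pair of binlog-enabled nodes is also the starting point for the binlog-to-Kafka streams listed above.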
## Microservice