awslabs / data-on-eks

DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS

Home Page: https://awslabs.github.io/data-on-eks/

License: Apache License 2.0

Languages: HCL 45.37%, Python 10.53%, Shell 14.14%, JavaScript 1.22%, CSS 0.28%, TypeScript 0.51%, PLpgSQL 13.71%, Dockerfile 0.96%, Jupyter Notebook 13.25%, MDX 0.02%
Topics: aws-eks, eks, jupyterhub, kubeflow, kubernetes, ml, mlflow, ray, spark, terraform

data-on-eks's Introduction

Data on EKS

(pronounced Do.eks)

Build, Scale, and Optimize Data & AI/ML Platforms on Amazon EKS 🚀

Welcome to the Data on EKS repository, a comprehensive resource for scaling your data and machine learning workloads on Amazon EKS and unlocking the power of Gen AI. Harness the capabilities of AWS Trainium, AWS Inferentia and NVIDIA GPUs to scale and optimize your Gen AI workloads with ease.

This open-source tool offers a collection of Terraform blueprints that follow industry best practices for deploying end-to-end solutions on Amazon EKS with advanced logging and observability. It includes practical examples of running AI/ML workloads on EKS with Apache Spark, PyTorch, TensorFlow, XGBoost, and more, along with benchmark reports and expert guidance for optimizing your data solutions. The blueprints show how to create robust clusters for Amazon EMR on EKS, Apache Spark, Apache Flink, Apache Kafka, and Apache Airflow, and how to run machine learning platforms such as Ray, Kubeflow, and JupyterHub on EKS with NVIDIA GPUs, AWS Trainium, and AWS Inferentia.

Note: DoEKS is actively being developed for various patterns. To see what features are in progress, please check out the issues section of our repository.

🏗️ Architecture

The diagram below showcases the wide array of open-source data tools, Kubernetes operators, and frameworks supported by DoEKS. It also highlights the seamless integration of AWS Data Analytics managed services with the powerful capabilities of DoEKS open-source tools.

[Image: DoEKS architecture diagram]

🌟 Features

The Data on EKS (DoEKS) solution is organized into the following focus areas.

🎯 Data Analytics on EKS

🎯 AI/ML on EKS

🎯 Streaming Platforms on EKS

🎯 Scheduler Workflow Platforms on EKS

🎯 Distributed Databases & Query Engine on EKS

🏃‍♀️ Getting Started

In this repository, you'll find a variety of deployment blueprints for creating Data/ML platforms with Amazon EKS clusters. These examples are just a small selection of the available blueprints - visit the DoEKS website for the complete list of options.

🚀 JupyterHub on EKS 👈 This blueprint deploys a self-managed JupyterHub on EKS with Amazon Cognito authentication.

🚀 Ray on EKS 👈 This blueprint deploys the Ray Operator on EKS with sample scripts.

🚀 Trainium/Inferentia with TorchX and Volcano on EKS 👈 This blueprint deploys a Gen AI blueprint on EKS with sample training scripts.

🚀 EMR-on-EKS with Karpenter 👈 Start here if you are new to EMR on EKS. This blueprint deploys an EMR on EKS cluster and uses Karpenter to scale Spark jobs.

🚀 Spark Operator with Apache YuniKorn on EKS 👈 This blueprint deploys an EKS cluster and uses Spark Operator and Apache YuniKorn to run self-managed Spark jobs.

🚀 Self-managed Airflow on EKS 👈 This blueprint sets up a self-managed Apache Airflow on an Amazon EKS cluster, following best practices.

🚀 Argo Workflows on EKS 👈 This blueprint sets up self-managed Argo Workflows on an Amazon EKS cluster, following best practices.

🚀 Kafka on EKS 👈 This blueprint deploys a self-managed Kafka on EKS using the popular Strimzi Kafka operator.

🗂️ Documentation

For instructions on how to deploy Data on EKS patterns and run sample tests, visit the DoEKS website.

🏆 Motivation

Kubernetes is a widely adopted system for orchestrating containerized software at scale. As more users migrate their data and machine learning workloads to Kubernetes, they often face the complexity of managing the Kubernetes ecosystem and selecting the right tools and configurations for their specific needs.

At AWS, we understand the challenges users encounter when deploying and scaling data workloads on Kubernetes. To simplify the process and enable users to quickly conduct proof-of-concepts and build production-ready clusters, we have developed Data on EKS (DoEKS). DoEKS offers opinionated open-source blueprints that provide end-to-end logging and observability, making it easier for users to deploy and manage Spark on EKS, Kubeflow, MLFlow, Airflow, Presto, Kafka, Cassandra, and other data workloads. With DoEKS, users can confidently leverage the power of Kubernetes for their data and machine learning needs without getting overwhelmed by its complexity.

🤝 Support & Feedback

DoEKS is maintained by AWS Solution Architects and is not an AWS service. Support is provided on a best-effort basis by the Data on EKS Blueprints community. If you have feedback, feature ideas, or wish to report bugs, please use the Issues section of this GitHub repository.

🔐 Security

See CONTRIBUTING for more information.

💼 License

This library is licensed under the Apache 2.0 License.

🙌 Community

We welcome all individuals who are enthusiastic about data on Kubernetes to become a part of this open source community. Your contributions and participation are invaluable to the success of this project.

Built with ❤️ at AWS.

data-on-eks's People

Contributors

5cp, alanty, alyibrahim, askulkarni2, asmacdo, bbgu1, bryantbiggs, codesometech, dalbhanj, dependabot[bot], github-actions[bot], jaradtke-aws, jihed, lmouhib, lusoal, melodyyangaws, nabuskey, ovaleanu, rajarshighosal, ratnopamc, raykrueger, rbarcia, sanjeevrg89, senkinnar, srikaanthpenugonda, vara-bonthu, victorgu-github, wahab-io, yarikoptic, youngjeong46

data-on-eks's Issues

[Feature] Self-managed KubeFlow on EKS

Highly scalable deployment pattern for running KubeFlow on Amazon EKS with Terraform

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Unable to download helm charts for `analytics/terraform/spark-k8s-operator`

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.

If your request is for a new feature, please use the Feature request template.

  • ✋ I have searched the open/closed issues and my issue is not listed.

⚠️ Note

Before you submit an issue, please perform the following for Terraform examples:

  1. Remove the local .terraform directory (only if state is stored remotely, which is the recommended practice): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check whether the issue still persists
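
The three steps above can be wrapped in a small shell helper. This is a minimal sketch and not part of the project: the function name `refresh_modules`, the `FORCE` override, and the check for a local `terraform.tfstate` file are assumptions added for illustration, the last one as a rough guard for the remote-state caveat in step 1.

```shell
# Hypothetical helper: reset the local module cache of one Terraform root
# and re-run init and plan. Usage: refresh_modules <terraform-root-dir>
refresh_modules() {
  (
    cd "${1:-.}" || exit 1

    # Assumption: a terraform.tfstate file in the root suggests local state,
    # so refuse to continue unless the caller sets FORCE=1.
    if [ -f terraform.tfstate ] && [ "${FORCE:-0}" != "1" ]; then
      echo "local terraform.tfstate found; set FORCE=1 to reset anyway" >&2
      exit 1
    fi

    # Step 1: remove the cached providers and modules.
    # Step 2: re-initialize to pull the modules again.
    # Step 3: re-attempt the plan.
    rm -rf .terraform/ &&
      terraform init &&
      terraform plan
  )
}
```

Calling `refresh_modules analytics/terraform/spark-k8s-operator` from the repository root (the path comes from the issue title) would reset that blueprint and re-run `terraform init` and `terraform plan`.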

Versions

$ git rev-parse HEAD
4c1ec44dd90a90d3ca974e225255aa689116bf15
$ terraform -version
Terraform v1.3.7
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.52.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.8.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.17.0
+ provider registry.terraform.io/hashicorp/local v2.3.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.3.2
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
  • Provider version(s):

Reproduction Code [Required]

Steps to reproduce the behavior:

No

Yes

Expected behavior

Actual behavior

╷
│ Error: could not download chart: failed to download "https://github.com/grafana/helm-charts/releases/download/grafana-6.43.1/grafana-6.43.1.tgz" at version "6.43.1"
│
│   with module.eks_blueprints_kubernetes_addons.module.grafana[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/metrics-server-3.8.2.tgz" at version "3.8.2"
│
│   with module.eks_blueprints_kubernetes_addons.module.metrics_server[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://Hyper-Mesh.github.io/spark-history-server/spark-history-server-1.0.0.tgz" at version "1.0.0"
│
│   with module.eks_blueprints_kubernetes_addons.module.spark_history_server[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes/autoscaler/releases/download/cluster-autoscaler-chart-9.15.0/cluster-autoscaler-9.15.0.tgz" at version "9.15.0"
│
│   with module.eks_blueprints_kubernetes_addons.module.cluster_autoscaler[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://charts.fairwinds.com/stable/vpa-1.4.0.tgz" at version "1.4.0"
│
│   with module.eks_blueprints_kubernetes_addons.module.vpa[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://aws.github.io/eks-charts/aws-for-fluent-bit-0.1.21.tgz" at version "0.1.21"
│
│   with module.eks_blueprints_kubernetes_addons.module.aws_for_fluent_bit[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/apache/yunikorn-release/releases/download/v1.1.0/yunikorn-1.1.0.tgz" at version "1.1.0"
│
│   with module.eks_blueprints_kubernetes_addons.module.yunikorn[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes-sigs/cluster-proportional-autoscaler/releases/download/helm-chart-cluster-proportional-autoscaler-1.0.1/cluster-proportional-autoscaler-1.0.1.tgz" at version "1.0.1"
│
│   with module.eks_blueprints_kubernetes_addons.module.aws_coredns[0].module.cluster_proportional_autoscaler[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/releases/download/spark-operator-chart-1.1.26/spark-operator-1.1.26.tgz" at version "1.1.26"
│
│   with module.eks_blueprints_kubernetes_addons.module.spark_k8s_operator[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://aws.github.io/eks-charts/aws-cloudwatch-metrics-0.0.7.tgz" at version "0.0.7"
│
│   with module.eks_blueprints_kubernetes_addons.module.aws_cloudwatch_metrics[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/prometheus-community/helm-charts/releases/download/prometheus-15.16.1/prometheus-15.16.1.tgz" at version "15.16.1"
│
│   with module.eks_blueprints_kubernetes_addons.module.prometheus[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│

The links in the errors (e.g. https://github.com/prometheus-community/helm-charts/releases/download/prometheus-15.16.1/prometheus-15.16.1.tgz) are valid, so I'm not sure why the addon is having trouble finding the charts.
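
To narrow this down, it can help to pull the failing chart URLs out of the Terraform output and probe each one from the same machine that runs Terraform. The sketch below assumes the output was saved to a file and that `curl` is installed; both function names are hypothetical. A reachable URL only shows that the network path works. It does not by itself explain the provider error.

```shell
# Print "<url> <version>" for every chart the Helm provider failed to download.
# Reads Terraform output on stdin.
list_failed_charts() {
  sed -n 's/.*failed to download "\([^"]*\)" at version "\([^"]*\)".*/\1 \2/p'
}

# Probe each failed chart URL with curl and report the result.
# oci:// references are skipped because curl cannot fetch them.
probe_failed_charts() {
  list_failed_charts | while read -r url version; do
    case "$url" in
      http://*|https://*)
        # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
        if curl -fsSL -o /dev/null "$url" </dev/null; then
          echo "reachable $url $version"
        else
          echo "unreachable $url $version"
        fi
        ;;
      *)
        echo "skipped $url $version"
        ;;
    esac
  done
}
```

For example, save the output with `terraform apply -no-color 2>&1 | tee apply.log` and then run `probe_failed_charts < apply.log` (the log file name is arbitrary).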

Oracle on EKS with EBS as Persistent Volume

EMR on EKS base templates

  • Create EMR on EKS base templates for PoCs. These templates will be showcased on the EMR on EKS marketing website
  • Most of the content already exists. We need to do some cleanup and cover generic use cases. Unique use cases can be categorized into separate folders
  • The generic template should include most best practices from EMR on EKS and should be integrated into e2e testing
  • Customers may choose to use these templates for production

[Feature] Add Fargate profiles to EMR on EKS patterns

  • Add a Fargate profile to the existing EMR on EKS examples
  • Trigger EMR on EKS jobs to run on a specific Fargate profile using labels

[Feature] DASK Cluster on EKS

  • Terraform templates to deploy a production-ready Dask cluster on EKS
  • Create a new K8s Helm add-on for Dask clusters in the EKS Blueprints repo
  • Examples for running Dask jobs
  • Website doc for the new pattern

[Feature] Add mini blogs for EMR on EKS Benchmark reports

[Feature] MLFlow on Amazon EKS

Highly scalable deployment pattern for running MLFlow on Amazon EKS with Terraform

[Feature] MongoDB on Amazon EKS

Create a Terraform template to build and deploy a scalable MongoDB platform on Amazon EKS

  • Terraform Template for deployment
  • If you need new Kubernetes add-ons then use EKS Blueprints to add a new add-on
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

Getting errors when I install EMR on EKS with Karpenter templates

Description

I am receiving these errors when I install EMR on EKS with the Karpenter templates. I have tried two different regions, and the errors are reproducible in both.

Here are the errors:

│ Error: could not download chart: failed to download "oci://public.ecr.aws/karpenter/karpenter" at version "v0.18.1"
│
│   with module.eks_blueprints_kubernetes_addons.module.karpenter[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "oci://public.ecr.aws/kubecost/cost-analyzer" at version "1.97.0"
│
│   with module.eks_blueprints_kubernetes_addons.module.kubecost[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│
╵
╷
│ Error: cannot re-use a name that is still in use
│
│   with module.eks_blueprints_kubernetes_addons.module.coredns_autoscaler[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│

Versions

  • Terraform version: Terraform v1.2.2
  • Provider version(s):
$ terraform providers -version
Terraform v1.2.2
on darwin_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.46.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.1
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Steps to reproduce the behavior:

Follow the steps for deploying the solution
cd data-on-eks/analytics/terraform/emr-eks-karpenter
terraform init
export AWS_REGION="us-west-2"
terraform plan
terraform apply

I have also tried the us-east-2 region.

Expected behavior

I expect all resources to be created without issues. I think Kubecost is not ready yet, and I can send a PR to disable it until it's ready. Karpenter and the other resources should install without hiccups.
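
One commonly reported cause of `oci://public.ecr.aws/...` chart download failures is a stale cached login for `public.ecr.aws` in the local Helm or Docker credential store. Nothing in this report confirms that this is the cause here, so treat the following as a diagnostic sketch only. The helper name is hypothetical. It logs out of the registry with whichever of the two CLIs is installed, so that the next pull is anonymous.

```shell
# Hypothetical helper: drop cached public.ecr.aws credentials so that the
# next OCI chart pull is anonymous. Tools that are not installed are skipped.
clear_public_ecr_logins() {
  registry="public.ecr.aws"
  for tool in helm docker; do
    command -v "$tool" >/dev/null 2>&1 || continue
    case "$tool" in
      # A failed logout (for example, when not logged in) is not treated as an error.
      helm)   helm registry logout "$registry" || true ;;
      docker) docker logout "$registry" || true ;;
    esac
  done
}
```

After running `clear_public_ecr_logins`, re-running `terraform apply` shows whether the Karpenter and Kubecost chart downloads still fail. The `cannot re-use a name that is still in use` error is a separate problem and is not addressed by this sketch.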

[Feature] - Use AWS serverless workflow service step functions from EKS

What is the outcome that you are trying to reach?

AWS Step Functions, a serverless workflow service, can integrate with Amazon EMR on EKS and Amazon EventBridge to build event-driven workflows. After installing AWS Controllers for Kubernetes (ACK) in EKS, you can provision and configure serverless AWS resources (Amazon EventBridge and AWS Step Functions) from EKS. The team can run the whole data operation without leaving the Kubernetes platform and only needs to maintain the EKS cluster, since all the other components are serverless.

The Airflow infrastructure is too heavy for some users. A data pipeline built on serverless services offloads much of the admin work, while ACK controllers let users stay in EKS to control and configure those serverless services.

Describe the solution you would like

Describe alternatives you have considered

Airflow, Argo Workflows

Additional context

SQL Server on EKS with EBS as Persistent Volume

[Feature] Cassandra on Amazon EKS

Create a Terraform template to build and deploy a scalable Cassandra platform on Amazon EKS

  • Terraform Template for deployment
  • If you need new Kubernetes add-ons then use EKS Blueprints to add a new add-on
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

[Feature] EMR on EKS with Karpenter Autoscaler

New EMR on EKS deployment pattern with Karpenter, following best practices

  • New Terraform templates for deploying EMR on EKS with Karpenter Autoscaler
  • Core managed node groups use the Cluster Autoscaler for critical add-ons
  • Karpenter add-on and Node Termination Handler
  • ON_DEMAND Karpenter provisioner for Spark drivers
  • Spot Karpenter provisioners for Spark executors
  • Custom user data and AMI configuration to format and mount NVMe disks. Use https://karpenter.sh/preview/aws/user-data/
  • Prometheus and Grafana for monitoring
  • Build a PySpark example to show Karpenter scaling
  • Add a new docs page in MDX format under https://github.com/awslabs/data-on-eks/tree/main/website/docs/amazon-emr-on-eks. Refer to the existing examples when writing the doc. Feel free to add screenshots of test results

Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg

[Feature] EMR on EKS with Volcano Scheduler

Part 1 of the PR

1/ Add this add-on deployment to the internal Terraform modules here: https://github.com/awslabs/data-on-eks/tree/main/workshop/modules/terraform-aws-eks-data-addons

2/ Add this add-on to the emr-eks-karpenter pattern with a create_volcano variable that defaults to false. Users will enable either Volcano or YuniKorn, but not both

3/ Add an example under https://github.com/awslabs/data-on-eks/tree/main/analytics/terraform/emr-eks-karpenter/examples/nvme-ssd to show Volcano with gang scheduling

4/ Update the website docs to explain the execution process and the results

Part 2 of the PR

1/ Add this add-on to the spark-k8s-operator pattern with a create_volcano variable that defaults to false. Users will enable either Volcano or YuniKorn, but not both

2/ Add an example under https://github.com/awslabs/data-on-eks/tree/main/analytics/terraform/spark-k8s-operator/examples/karpenter to show Volcano with gang scheduling

3/ Update the website docs to explain the execution process and the results

  • New EMR on EKS deployment pattern with custom scheduler - Volcano
  • Add it as an optional add-on

Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg

[Feature] Kubeflow on Amazon EKS

Kubeflow is a Kubernetes-based MLOps platform. We (Amazon) have an AWS distribution of Kubeflow (https://awslabs.github.io/kubeflow-manifests/) that already has a Terraform-based deployment option with detailed documentation, for example https://awslabs.github.io/kubeflow-manifests/docs/deployment/vanilla/guide-terraform/

Create guides for customers landing on DoEKS that relate to Kubeflow on AWS. Can you provide samples of what needs to be done for this?

[Bug] Getting error when deploying Airflow on EKS

I get the error below when deploying EKS and RDS for Airflow with an existing VPC.

Error: Unsupported attribute
│
│    in module "db":
│  vpc_security_group_ids = [module.security_group.security_group_id]
│     ├────────────────
│     │ module.security_group is a list of object
│
│ Can't access attributes on a list of objects. Did you mean to access attribute "security_group_id" for a specific element of the list, or across all elements of the list?
╵

Versions

  • Terraform version:
    v1.3.6
  • Provider version(s):
terraform providers --version
Terraform v1.3.6
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.51.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.8.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.17.0
+ provider registry.terraform.io/hashicorp/local v2.3.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Steps to reproduce the behavior:
cd /data-on-eks/schedulers/terraform/self-managed-airflow
Disable the VPC module
Provide the custom inputs for VPC, EKS private subnets and database subnets group name
terraform init
terraform plan
terraform apply -var vpc_id=<vpc_id> -var private_subnet_ids=<EKS_private_subnet_ID> -var db_subnet_group_name=<RDS_subnet_group_name>

Expected behavior

EKS with RDS deployment should be successful.

Actual behavior

The EKS deployment succeeds, but the RDS deployment fails because the security group for RDS is not set up.
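
The error message says that module.security_group is a list of objects, which is what Terraform produces when a module block is declared with count. A minimal sketch of one possible fix follows. It assumes the security group module is created with count and that at least one instance exists; the other arguments of module "db" are elided here and are not taken from the blueprint.

```hcl
module "db" {
  # ... the remaining arguments of the existing module "db" block stay unchanged ...

  # module.security_group is declared with `count`, so it is a list.
  # Select one instance explicitly before reading its attribute.
  vpc_security_group_ids = [module.security_group[0].security_group_id]

  # Alternative (use instead of the line above): a splat expression collects
  # the ID from every instance and yields an empty list when count = 0.
  # vpc_security_group_ids = module.security_group[*].security_group_id
}
```

Indexing with [0] fails at plan time if count evaluates to 0, so the splat form may be the safer choice when the security group is optional.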

ACK EMR on EKS Controller pattern and example

EMR EKS ACK Controller doc

Migrate Terraform Data on EKS templates to V5 EKS blueprints.

What is the outcome that you are trying to reach?

The EKS Blueprints repo will soon deprecate the sub-modules that create EKS node groups, Fargate, KMS, IRSA, and EMR on EKS resources, with the intention of leveraging open-source AWS modules instead.

As part of this move, all DoEKS Terraform blueprints will be migrated to the latest v5 approach.

The major change affects only the creation of EKS clusters, node groups, and the EMR on EKS module. Kubernetes add-ons will still use the EKS Blueprints sub-module.

SAP on EKS with EBS as Persistent Volume

[Feature] Prefect Scheduler on EKS

  • Terraform templates to deploy a production-ready Prefect Scheduler on EKS
  • Add a new Helm add-on for the Prefect Scheduler to the EKS Blueprints repo
  • Examples for running Dask jobs with Prefect
  • Website doc for the new pattern

[Feature] AWS CDK Blueprint for EMR on EKS deployment

[Feature] dbt on EMR on EKS

Describe the solution you would like

  • Extend the existing Spark Thrift Server class to run indefinitely and deploy as a Spark Job
  • Deploy a service with a load balancer for external connection
  • Set up a dbt project with the dbt-spark adapter where connection is made via the Spark Thrift Server.

Describe alternatives you have considered

  • Possibly using Apache Kyuubi to externalize the Spark Thrift Server, but that would be too much.

Additional context

  • I'm happy to contribute to this feature. I just need a bit of help.

[Feature] EMR Studio with EMR on EKS using managed endpoint

[Feature] Opensearch on EKS with EBS as persistent volume

Kafka on EKS with EBS as Persistent Volume

[Feature] EMR on EKS with Apache YuniKorn

New EMR on EKS deployment pattern with custom scheduler (Apache YuniKorn)

  • [TF] -> Terraform template
  • A new Terraform template for deploying EMR on EKS with Cluster Autoscaler
  • Managed node group 1: (Multi-AZ) core managed node group for critical add-ons, using the Cluster Autoscaler
  • Managed node group 2: (Single AZ) ON_DEMAND node group for Spark drivers
  • Managed node group 3: (Single AZ) SPOT node group for Spark executors (same AZ as the drivers)
  • Deploy the Apache YuniKorn add-on
  • Prometheus and Grafana for monitoring
  • Create multiple queues, one for each data team, using YuniKorn
  • Run two sample Spark jobs that submit to two different queues
  • Add a new docs page in MDX format under https://github.com/awslabs/data-on-eks/tree/main/website/docs/amazon-emr-on-eks. Refer to the existing examples when writing the doc. Feel free to add screenshots of test results

Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg
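
The three node groups listed above could be expressed roughly as follows. This is a sketch only: it assumes the open-source terraform-aws-modules/eks module with its v19 input names, and the variables, cluster name, Kubernetes version, instance types, sizes, and labels are illustrative placeholders rather than values from the DoEKS blueprint.

```hcl
# Assumed inputs; the variable names are hypothetical.
variable "vpc_id" {
  type = string
}

variable "private_subnet_ids" {
  description = "Private subnets, one per Availability Zone"
  type        = list(string)
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name    = "emr-eks-yunikorn" # placeholder
  cluster_version = "1.27"             # placeholder

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids

  eks_managed_node_groups = {
    # Node group 1: spans all AZs and hosts critical add-ons.
    core = {
      subnet_ids     = var.private_subnet_ids
      capacity_type  = "ON_DEMAND"
      instance_types = ["m5.xlarge"]
      min_size       = 3
      max_size       = 9
      desired_size   = 3
    }

    # Node group 2: single AZ, on-demand capacity for Spark drivers.
    spark_driver = {
      subnet_ids     = [var.private_subnet_ids[0]]
      capacity_type  = "ON_DEMAND"
      instance_types = ["r5.xlarge"]
      min_size       = 0
      max_size       = 10
      desired_size   = 1
      labels         = { NodeGroupType = "spark-driver" }
    }

    # Node group 3: Spot capacity for Spark executors, same AZ as the drivers.
    spark_executor = {
      subnet_ids     = [var.private_subnet_ids[0]]
      capacity_type  = "SPOT"
      instance_types = ["r5.xlarge", "r5d.xlarge"]
      min_size       = 0
      max_size       = 20
      desired_size   = 1
      labels         = { NodeGroupType = "spark-executor" }
    }
  }
}
```

Pinning drivers and executors to the same subnet keeps shuffle traffic within one AZ, which is the reason the list above asks for single-AZ groups.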

[Feature] CockroachDB on Amazon EKS

Build a Terraform template to deploy a scalable CockroachDB platform on Amazon EKS

  • Terraform Template for deployment
  • If you need new Kubernetes add-ons then use EKS Blueprints to add a new add-on
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

[CI] E2E tests for all deployment examples

  • Build a GitHub workflow for testing all the deployment patterns
  • Build a Workflow for PRs to run a TF plan on all examples

MySQL on EKS with EBS as Persistent Volume

[Feature] Trigger EMR on EKS jobs with Apache Airflow (MWAA)

Feature to show how to configure and run Airflow jobs for EMR on EKS

  • Add an example to this folder that triggers EMR on EKS jobs using Airflow
  • Write a blog post under the blog section of the website that deploys both the EMR on EKS and MWAA examples and triggers a sample Spark job

[E2E Test] E2E test for all the blueprints

Currently, an external AWS account is configured to run terraform plan on all examples when someone raises a PR.
We want to extend this to run terraform apply on any merge to main, for the updated pattern only.

  • Write an E2E test that identifies the modified files and runs the E2E test for the affected pattern
  • terraform destroy fails most of the time due to dangling resources, so write a cleanup script to avoid that issue
  • Apply the same strategy to the CDK patterns as well

[Feature] Self-managed JupyterHub on EKS

Highly scalable deployment pattern for running JupyterHub on Amazon EKS with Terraform

Add AWS Analytics Reference Architecture CDK Constructs for EMR on EKS

What is the outcome that you are trying to reach?

CDK constructs for easy deployment of EMR on EKS, and of EMR Studio with EMR on EKS, already exist in the AWS Analytics Reference Architecture. The constructs automate EKS best practices for workload isolation, autoscaling, Spot usage, local shuffle performance optimization, and more.
We should reference this content to help customers who use CDK to bootstrap EMR on EKS.

Describe the solution you would like

Add a README in the analytics folder of this repository that references the CDK constructs, or add a page to the website that references them.

PostgreSQL on EKS with EBS as Persistent Volume

[Feature] EMR on EKS with EBS as Persistent Volume

  • Create a new template that deploys an EKS cluster and two node groups with only a root volume
  • Install the EBS CSI driver
  • Show examples of running Spark jobs with EBS PVCs
  • Show shuffle data recovery with Spark jobs. See this issue for more details: https://issues.apache.org/jira/browse/SPARK-35593
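
The EBS CSI driver step could look roughly like the fragment below. It is a sketch under stated assumptions: it installs the driver as an EKS managed add-on and defines a gp3 StorageClass for PVCs. The two variables are hypothetical, and a configured kubernetes provider plus an existing IAM role for the driver's service account are assumed.

```hcl
# Hypothetical inputs: an existing cluster and an IAM role (IRSA) for the driver.
variable "cluster_name" {
  type = string
}

variable "ebs_csi_role_arn" {
  type = string
}

# Install the Amazon EBS CSI driver as an EKS managed add-on.
resource "aws_eks_addon" "ebs_csi" {
  cluster_name             = var.cluster_name
  addon_name               = "aws-ebs-csi-driver"
  service_account_role_arn = var.ebs_csi_role_arn
}

# StorageClass that PVCs can reference to get dynamically provisioned gp3 volumes.
resource "kubernetes_storage_class_v1" "gp3" {
  metadata {
    name = "gp3"
  }

  storage_provisioner    = "ebs.csi.aws.com"
  reclaim_policy         = "Delete"
  volume_binding_mode    = "WaitForFirstConsumer" # create the volume in the pod's AZ
  allow_volume_expansion = true

  parameters = {
    type = "gp3"
  }

  depends_on = [aws_eks_addon.ebs_csi]
}
```

WaitForFirstConsumer delays volume creation until a pod is scheduled, which avoids provisioning an EBS volume in an AZ where the pod cannot run.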

Trigger jobs with Argo Workflows

  • Install Argo Workflows in EKS
  • Use Argo Workflows to trigger Kubernetes jobs
  • Use Argo Workflows to trigger Spark jobs
  • Use Argo Workflows for EMR on EKS?
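
One possible way to cover the installation step with Terraform is a Helm release of the community argo-workflows chart. This is only a sketch: it assumes a configured helm provider, and the release name and namespace are placeholders, not choices made by the DoEKS blueprints.

```hcl
# Installs Argo Workflows from the Argo project's Helm repository.
resource "helm_release" "argo_workflows" {
  name             = "argo-workflows" # placeholder release name
  repository       = "https://argoproj.github.io/argo-helm"
  chart            = "argo-workflows"
  namespace        = "argo-workflows" # placeholder namespace
  create_namespace = true
}
```

Pinning the chart with the version argument would make deployments reproducible; it is left out here because no specific version is named in the issue.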

[Feature] New dedicated Terraform module for EMR on EKS service

What is the outcome that you are trying to reach?

EMR on EKS resources are currently integrated into the EKS Blueprints core module, which does not provide isolation for customers who want to use only the EMR on EKS service in an existing EKS cluster.

This new feature will create a dedicated EMR on EKS module (https://github.com/aws-ia/terraform-aws-emr-containers, currently a private repo). All existing blueprints will be migrated to this new approach.

Add Kubecost add-on to EMR on EKS examples

  • Add the Kubecost add-on to the EMR on EKS and open-source Spark examples
  • Update the docs to show Kubecost with screenshots

[Feature] Apache Pinot on Amazon EKS

What is the outcome that you are trying to reach?

  • Deployment pattern and configuration for Apache Pinot on EKS
  • Running Pinot workloads on Amazon EKS with example
  • Supporting public blog post

Describe the solution you would like

Apache Pinot is a real-time distributed OLAP datastore designed to answer OLAP queries with low latency. Pinot is typically used in customer-facing analytics.

[Feature] - DataHub on EKS (open source data catalog)

What is the outcome that you are trying to reach?

Ability to deploy DataHub on EKS

Describe the solution you would like

Data-on-EKS support for DataHub (https://datahubproject.io/)

Describe alternatives you have considered

N/A

[Feature] Apache Kafka on Amazon EKS

Build a Terraform template to deploy Apache Kafka on EKS

  • Terraform Template for deployment
  • If you need new Kubernetes add-ons then use EKS Blueprints to add a new add-on
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

Cassandra on EKS with EBS Persistent Volume

[Feat] Observability on Spark on EKS

  • Provide guidance on the Spark History Server
  • Monitor Spark applications with Prometheus (or AMP)

[Feature] Trino on Amazon EKS

Build a Terraform template to deploy a scalable Trino platform on Amazon EKS

  • Terraform Template for deployment
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

[Feature] Apache Flink on Amazon EKS

Build a Terraform template to deploy Apache Flink on EKS

  • Terraform Template for deployment
  • If you need new Kubernetes add-ons then use EKS Blueprints to add a new add-on
  • README with deployment steps
  • Add a new Website Doc with design diagram, deployment (refer to README here for deployment steps) and how to run sample tests etc.

[Feature] Airbyte on EKS

What is the outcome that you are trying to reach?
ETL tool deployed to EKS

Describe the solution you would like
Ability to deploy Airbyte on EKS

Additional context
https://docs.airbyte.com/deploying-airbyte/on-kubernetes
https://airbyte.com/

[Feature] Apache NiFi on EKS

What is the outcome that you are trying to reach?

  • Deployment pattern for Apache NiFi on EKS
  • Running Apache NiFi workload on EKS

Describe the solution you would like

Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
