awslabs / data-on-eks
DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
Home Page: https://awslabs.github.io/data-on-eks/
License: Apache License 2.0
Apache Pinot is a real-time distributed OLAP datastore, designed to answer OLAP queries with low latency. Pinot is typically used in customer-facing analytics.
1/ Add this add-on deployment to the internal TF modules here: https://github.com/awslabs/data-on-eks/tree/main/workshop/modules/terraform-aws-eks-data-addons
2/ Add this add-on to the emr-eks-karpenter pattern with a create_volcano variable that defaults to false. Users will enable either Volcano or YuniKorn, but not both
3/ Add an example under https://github.com/awslabs/data-on-eks/tree/main/analytics/terraform/emr-eks-karpenter/examples/nvme-ssd to show Volcano with gang scheduling
4/ Update the Website Docs to explain the execution process and the results
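A minimal sketch of how the toggle described above might look in a pattern, assuming Terraform 1.2 or later (for `precondition`) and the `hashicorp/null` provider. Only `create_volcano` comes from the request; the `create_yunikorn` variable name and the guard resource are hypothetical and only illustrate the "either Volcano or YuniKorn, but not both" rule.

```hcl
# Toggle requested in the issue: Volcano is off by default.
variable "create_volcano" {
  description = "Deploy the Volcano scheduler add-on"
  type        = bool
  default     = false
}

# Hypothetical companion toggle for YuniKorn (name is an assumption).
variable "create_yunikorn" {
  description = "Deploy the Apache YuniKorn scheduler add-on"
  type        = bool
  default     = false
}

# Guard resource: planning fails if both schedulers are enabled at once.
resource "null_resource" "scheduler_guard" {
  lifecycle {
    precondition {
      condition     = !(var.create_volcano && var.create_yunikorn)
      error_message = "Enable either Volcano or YuniKorn, not both."
    }
  }
}
```

With this in place, `terraform plan -var create_volcano=true -var create_yunikorn=true` would stop with the error message above, while enabling only one of the two proceeds normally.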
1/ Add this add-on to the spark-k8s-operator pattern with a create_volcano variable that defaults to false. Users will enable either Volcano or YuniKorn, but not both
2/ Add an example under https://github.com/awslabs/data-on-eks/tree/main/analytics/terraform/spark-k8s-operator/examples/karpenter to show Volcano with gang scheduling
3/ Update the Website Docs to explain the execution process and the results
Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg
Ability to deploy DataHub on EKS
Data-on-EKS support for [DataHub](https://datahubproject.io/)
N/A
Build a Terraform template to build and deploy Apache Flink on EKS
emr-eks-ack-controller
EMR EKS ACK Controller doc
Build a Terraform template to build and deploy scalable MongoDB platform on Amazon EKS
AWS Step Functions, a serverless workflow service, can integrate with Amazon EMR on EKS and Amazon EventBridge to build event-driven workflows. After installing AWS Controllers for Kubernetes (ACK) in EKS, you can provision and configure serverless AWS resources, such as Amazon EventBridge and AWS Step Functions, from EKS. The team can run the whole data operation without leaving the Kubernetes platform and only needs to maintain the EKS cluster, since all the other components are serverless.
The Airflow infrastructure is too heavy for some users. A data pipeline built on serverless services offloads much of the admin work, while ACK controllers let users stay in EKS to control and configure those serverless services.
airflow, argo workflows
Build a Terraform template to build and deploy scalable CockroachDB platform on Amazon EKS
Is it possible to deploy a Flink job on EMR on EKS?
Kubeflow is a Kubernetes-based MLOps platform. We (Amazon) have an AWS distribution of Kubeflow, https://awslabs.github.io/kubeflow-manifests/, which already has a Terraform-based deployment option with detailed documentation, e.g. https://awslabs.github.io/kubeflow-manifests/docs/deployment/vanilla/guide-terraform/
Create guides for customers landing on DoEKS related to Kubeflow on AWS. Can you provide samples of what needs to be done for this?
Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.
If your request is for a new feature, please use the Feature request template.
Before you submit an issue, please perform the following for Terraform examples:
1/ Remove the local .terraform directory (ONLY if state is stored remotely, which is the recommended practice): rm -rf .terraform/
2/ Re-initialize the project root: terraform init
$ git rev-parse HEAD
4c1ec44dd90a90d3ca974e225255aa689116bf15
$ terraform -version
Terraform v1.3.7
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.52.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.8.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.17.0
+ provider registry.terraform.io/hashicorp/local v2.3.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.3.2
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
Steps to reproduce the behavior:
No
Yes
╷
│ Error: could not download chart: failed to download "https://github.com/grafana/helm-charts/releases/download/grafana-6.43.1/grafana-6.43.1.tgz" at version "6.43.1"
│
│ with module.eks_blueprints_kubernetes_addons.module.grafana[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/metrics-server-3.8.2.tgz" at version "3.8.2"
│
│ with module.eks_blueprints_kubernetes_addons.module.metrics_server[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://Hyper-Mesh.github.io/spark-history-server/spark-history-server-1.0.0.tgz" at version "1.0.0"
│
│ with module.eks_blueprints_kubernetes_addons.module.spark_history_server[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes/autoscaler/releases/download/cluster-autoscaler-chart-9.15.0/cluster-autoscaler-9.15.0.tgz" at version "9.15.0"
│
│ with module.eks_blueprints_kubernetes_addons.module.cluster_autoscaler[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://charts.fairwinds.com/stable/vpa-1.4.0.tgz" at version "1.4.0"
│
│ with module.eks_blueprints_kubernetes_addons.module.vpa[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://aws.github.io/eks-charts/aws-for-fluent-bit-0.1.21.tgz" at version "0.1.21"
│
│ with module.eks_blueprints_kubernetes_addons.module.aws_for_fluent_bit[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/apache/yunikorn-release/releases/download/v1.1.0/yunikorn-1.1.0.tgz" at version "1.1.0"
│
│ with module.eks_blueprints_kubernetes_addons.module.yunikorn[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/kubernetes-sigs/cluster-proportional-autoscaler/releases/download/helm-chart-cluster-proportional-autoscaler-1.0.1/cluster-proportional-autoscaler-1.0.1.tgz" at version "1.0.1"
│
│ with module.eks_blueprints_kubernetes_addons.module.aws_coredns[0].module.cluster_proportional_autoscaler[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/releases/download/spark-operator-chart-1.1.26/spark-operator-1.1.26.tgz" at version "1.1.26"
│
│ with module.eks_blueprints_kubernetes_addons.module.spark_k8s_operator[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://aws.github.io/eks-charts/aws-cloudwatch-metrics-0.0.7.tgz" at version "0.0.7"
│
│ with module.eks_blueprints_kubernetes_addons.module.aws_cloudwatch_metrics[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "https://github.com/prometheus-community/helm-charts/releases/download/prometheus-15.16.1/prometheus-15.16.1.tgz" at version "15.16.1"
│
│ with module.eks_blueprints_kubernetes_addons.module.prometheus[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
The links in the error, e.g. https://github.com/prometheus-community/helm-charts/releases/download/prometheus-15.16.1/prometheus-15.16.1.tgz are valid, so I'm not sure why the addon is having trouble trying to find the chart.
New EMR on EKS deployment pattern with Karpenter with best practices
Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg
Highly scalable deployment pattern for running Kubeflow on Amazon EKS with Terraform
Highly scalable deployment pattern for running MLflow on Amazon EKS with Terraform
Highly scalable deployment pattern for running JupyterHub on Amazon EKS with Terraform
What is the outcome that you are trying to reach?
ETL tool deployed to EKS
Describe the solution you would like
Ability to deploy Airbyte on EKS
Additional context
https://docs.airbyte.com/deploying-airbyte/on-kubernetes
https://airbyte.com/
Currently, an external AWS account is configured to run terraform plan on all examples when someone raises a PR.
We want to extend this to run terraform apply on any merge to main, for only the updated pattern.
Build a Terraform template to build and deploy scalable Trino platform on Amazon EKS
New EMR on EKS deployment pattern with custom scheduler (Apache YuniKorn)
Use this example as a template and build on top of that.
https://github.com/awslabs/data-on-eks/tree/main/analytics/emr-eks-amp-amg
The EKS Blueprints repo will soon deprecate the sub-modules that create EKS node groups, Fargate profiles, KMS keys, IRSA, and EMR on EKS resources, with the intention of leveraging open source AWS modules instead.
As part of this move, all the DoEKS Terraform blueprints will be migrated to the latest v5 approach.
The major change affects only the creation of EKS clusters, node groups, and the EMR on EKS module. Kubernetes add-ons will still use the EKS Blueprints sub-module.
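A rough sketch of what the split described above could look like: the cluster and node groups come from the open source `terraform-aws-modules/eks/aws` module, while add-ons still come from the EKS Blueprints `kubernetes-addons` sub-module. The version pin, Git tag, cluster name, Kubernetes version, node group sizing, and input variable names are all assumptions for illustration, not values taken from the DoEKS blueprints.

```hcl
# Hypothetical inputs for an existing network.
variable "vpc_id" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

# Cluster and managed node groups from the open source EKS module.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0" # assumed version pin

  cluster_name    = "doeks-example" # hypothetical name
  cluster_version = "1.24"          # assumed Kubernetes version

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids

  eks_managed_node_groups = {
    core = {
      instance_types = ["m5.xlarge"] # assumed sizing
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }
  }
}

# Kubernetes add-ons still use the EKS Blueprints sub-module.
module "eks_blueprints_kubernetes_addons" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.32.1" # assumed tag

  eks_cluster_id       = module.eks.cluster_name
  eks_cluster_endpoint = module.eks.cluster_endpoint
  eks_oidc_provider    = module.eks.oidc_provider
  eks_cluster_version  = module.eks.cluster_version

  enable_metrics_server = true
}
```

The add-ons module only needs the cluster's name, endpoint, OIDC provider, and version, which is why it can keep working after the cluster itself moves to a different module.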
Build a Terraform template to build and deploy scalable Cassandra platform on Amazon EKS
Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
Feature to show how to configure and run Airflow jobs for EMR on EKS
Build a Terraform template to build and deploy Apache Kafka on EKS
I am receiving these errors when I install EMR on EKS with the Karpenter templates. I have tried in two different regions and the errors are reproducible in both.
Here are the errors:
│ Error: could not download chart: failed to download "oci://public.ecr.aws/karpenter/karpenter" at version "v0.18.1"
│
│ with module.eks_blueprints_kubernetes_addons.module.karpenter[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: could not download chart: failed to download "oci://public.ecr.aws/kubecost/cost-analyzer" at version "1.97.0"
│
│ with module.eks_blueprints_kubernetes_addons.module.kubecost[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
╵
╷
│ Error: cannot re-use a name that is still in use
│
│ with module.eks_blueprints_kubernetes_addons.module.coredns_autoscaler[0].module.helm_addon.helm_release.addon[0],
│ on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│ 1: resource "helm_release" "addon" {
│
$ terraform providers -version
Terraform v1.2.2
on darwin_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.46.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.1
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
Steps to reproduce the behavior:
Follow the steps for deploying the solution
cd data-on-eks/analytics/terraform/emr-eks-karpenter
terraform init
export AWS_REGION="us-west-2"
terraform plan
terraform apply
I have tried in us-east-2 region as well
Expect to see all resources created without issues. I think Kubecost is not ready yet, and I can send a PR to disable it until it's ready. Karpenter and the other resources should install without hiccups.
Getting the error below when deploying EKS and RDS for Airflow with an existing VPC.
Error: Unsupported attribute
│
│ in module "db":
│ vpc_security_group_ids = [module.security_group.security_group_id]
│ ├────────────────
│ │ module.security_group is a list of object
│
│ Can't access attributes on a list of objects. Did you mean to access attribute "security_group_id" for a specific element of the list, or across all elements of the list?
╵
terraform providers --version
Terraform v1.3.6
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.51.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.8.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.17.0
+ provider registry.terraform.io/hashicorp/local v2.3.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
Steps to reproduce the behavior:
cd /data-on-eks/schedulers/terraform/self-managed-airflow
Disable the VPC module
Provide the custom inputs for VPC, EKS private subnets and database subnets group name
terraform init
terraform plan
terraform apply \
  -var vpc_id=<vpc_id> \
  -var private_subnet_ids=<EKS_private_subnet_ID> \
  -var db_subnet_group_name=<RDS_subnet_group_name>
EKS with RDS deployment should be successful.
EKS deployment is successful, but RDS deployment fails because the security group for RDS is not set up.
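Terraform reports "is a list of object" when a module is declared with `count`, because each reference must then select an instance. A hedged fragment of one possible fix follows, assuming the `security_group` module in this pattern uses `count` and exposes a `security_group_id` output as the error message suggests; the local name is hypothetical and the remaining `db` arguments are elided.

```hcl
locals {
  # Splat across all instances: yields an empty list when count = 0
  # and a one-element list when count = 1.
  rds_security_group_ids = module.security_group[*].security_group_id

  # Alternative, valid only when exactly one instance always exists:
  # rds_security_group_ids = [module.security_group[0].security_group_id]
}

module "db" {
  # ... source and other arguments unchanged ...

  vpc_security_group_ids = local.rds_security_group_ids
}
```

The splat form keeps the plan valid even when the security group module is disabled, although RDS would then fall back to whatever default security group applies, which may not be what an existing-VPC deployment needs.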
Install Argo Workflows in EKS
Use Argo Workflows to trigger Kubernetes jobs
Use Argo Workflows to trigger Spark jobs
Use Argo Workflows for EMR on EKS?
CDK Constructs for easy deployment of EMR on EKS and EMR Studio with EMR on EKS already exist in the AWS Analytics Reference Architecture. The CDK Constructs include automation on EKS to implement best practices related to workload isolation, autoscaling, Spot, local shuffle performance optimization...
We should reference this content to help customers using CDK to bootstrap EMR on EKS.
Add a readme in the analytics folder of this repository to reference the CDK Constructs, or add a page in the website to reference them.
EMR on EKS resources are currently integrated in the EKS Blueprints core module, which doesn't provide isolation for customers who want to use only the EMR on EKS service in an existing EKS cluster.
This new feature will create a new dedicated EMR on EKS module (https://github.com/aws-ia/terraform-aws-emr-containers, currently a private repo). All the existing blueprints will be migrated to this new approach.