gveerashekar

followers: 0 following: 1 repos: 57 gists: 1

Type: User

gveerashekar's Projects

90_python_examples

The best way to learn Python is by practicing examples. This repository contains examples of basic Python concepts. Use them as references and try them on your own.

ansible-best-practises

A project structure that outlines some best practices for using Ansible

ansible-for-devops

Ansible for DevOps examples.

ansible-tuto

Ansible tutorial

anz_llm_bootcamp

athena2pyspark

A very simple library for consuming AWS Athena from Spark or Lambda services

aws-glue-pyspark-etl-job

aws-glue-pyspark-etl-job-1

A PySpark job that handles upserts, converts data to Parquet, and creates partitions on S3

aws-pyspark-etl-job

This module performs statistical analysis on the vendor INFY and SBIN dataset

awspractice

cicd_with_databricks

cloud9-spark-emr-musichistorydata_project

Implements a data lake using S3 and Spark on an EMR cluster from an AWS Cloud9 environment, with an ETL pipeline that extracts data from S3, processes it using Spark, and loads it back into S3 as a set of dimensional tables.

data-engineering-case

Redshift-based ETL workflow automated with AWS Step Functions.

data-lake-using-aws-emr-pyspark-and-s3

Building an ETL pipeline that extracts data from S3, processes it using Spark, and loads the data back into S3 as a set of dimensional tables.

dataengineer-transformations-python

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

dbdemos

Demos to implement your Databricks Lakehouse

dda-devops-build

Common library for DevOps builds using Python's PyBuilder, Terraform, Docker, and gopass

devops-exercises

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

dlt-meta

This is a metadata-driven, DLT-based framework for bronze/silver pipelines

eks

AWS EKS - Kubernetes project

etl_docker

ETL - Docker with Apache PySpark

etl_spark_aws_s3

ETL pipeline that extracts data from an AWS S3 bucket, transforms it in PySpark, and loads it back into an S3 bucket.

etl_with_pyspark

This project builds an ETL pipeline using AWS Data Lake and PySpark for the music streaming app Sparkify. It is part of the Udacity Data Engineering Nanodegree program.

examples

Examples for the Apache Oozie book

frank-kanes-taming-big-data-with-apache-spark-and-python

Frank Kane's Taming Big Data with Apache Spark and Python, published by Packt

infrastructure-as-code-tutorial

Infrastructure As Code Tutorial. Covers Packer, Terraform, Ansible, Vagrant, Docker, Docker Compose, Kubernetes

infrastructure-playground

Docker, Kubernetes, Terraform, and full-stack web apps

introtopyspark

Quick and easy setup of Amazon EMR (Elastic MapReduce) with PySpark using a persistent Jupyter Notebook, including a walk-through of a basic ETL scenario.

kubespray

Deploy a Production Ready Kubernetes Cluster
