
Sagar Lakshmipathy's Projects

airflow-tweet-loader

This repo contains data pipeline code written in Python and scheduled with Apache Airflow. It monitors, fetches, and cleans tweets, stores them in HDFS, and finally loads them into a Hive database.

collaborative-authoring

This project lets users merge two analyses together and create a new analysis or even update the target analysis in place.

hudi

Upserts, Deletes And Incremental Processing on Big Data.

hudi-on-databricks

This repository serves as a guide to working with Hudi tables in a Databricks environment.

incubator-xtable

OneTable is an omni-directional converter for table formats that facilitates interoperability across data processing systems and query engines.

lex-prediction-pyspark

Performed Regression Analysis using PySpark to predict Life Expectancy based on World Population data.

ml-api

Guide on creating an API for serving your ML model

ml-pyspark-customer-churn

This code was written for the Big Data Infrastructure final capstone project to predict customer churn for a telecom company. The data was sourced from Kaggle, but the code can run on Databricks independent of supporting documents/datasets. It includes techniques such as hyperparameter tuning for feature engineering and model evaluation. The Random Forest Classifier model gave the best accuracy, at 71%.

movie-analysis-pig-latin

The code in this repository is written in Pig Latin. It finds the best movies sorted by time and the worst movies sorted by the number of times they were rated. The repository includes the code for the analysis and two datasets: 1. the "u.data" file (rating info) and 2. the "u.item" file (metadata).

py4j

A Simple Py4J implementation
