srishtisingh3895
Name: Srishti Singh
Type: User
Company: University of Rochester
Location: Washington, USA
This project creates a data lake using AWS S3, EMR and Spark to build an ETL pipeline for a music database.
Modeled song data using Apache Cassandra and designed an ETL pipeline.
This project was done as a part of Udacity's Data Engineering Nanodegree Program.
Created the DAGs and designed an ETL pipeline using custom operators to perform tasks such as staging the data, filling the data warehouse, and running checks on the data as the final step. This project was completed as a part of Udacity's Data Engineering Nanodegree.
This project builds a data warehouse using Amazon Redshift and S3, and was completed as a part of Udacity's Data Engineering Nanodegree program.
Comparative study of frequent pattern mining algorithms on Adult Census Data
Data and details for University of Rochester Biomedical Data Science Hackathon
In this project we created a music database that could serve as part of a larger online music streaming application. The database records all songs and their properties as well as all artists and their details. It also keeps track of all users, their playlists, and the songs in those playlists. The idea is to capture each user's taste in music by storing details such as song name, genre, artist, and the number of times the user listened to a song, so that analysts can use this data to design a recommender system and to improve the song base. The amount of data that can be collected for a music library is quite large. For this project, we used two Kaggle datasets of Spotify top tracks and artists to create the database, and generated synthetic data for user details, playlists, etc. A database management system is necessary for this application because the data takes up a large amount of space and many users access it at the same time from various locations. The database is administered by admins, who can add or delete songs as well as users from their region. Thus there are two login pages, one for users and one for admins, with different functions associated with each account type. Admin functions include inserting new songs, deleting old songs, and deleting and monitoring users. User functions include viewing or searching songs and artists, creating playlists, and deleting playlists.
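The entities described above (songs, artists, users, playlists, and per-user listen counts) could be laid out roughly as follows. This is only a minimal sketch using Python's built-in `sqlite3` module; every table name, column name, and sample row is a hypothetical illustration, not the project's actual schema or data, and the project may use a different database system.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions made
# for illustration; they are not taken from the project itself.
SCHEMA = """
CREATE TABLE artists (
    artist_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE songs (
    song_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    genre     TEXT,
    artist_id INTEGER NOT NULL REFERENCES artists(artist_id)
);
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    region   TEXT
);
CREATE TABLE playlists (
    playlist_id INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    name        TEXT NOT NULL
);
CREATE TABLE playlist_songs (
    playlist_id INTEGER NOT NULL REFERENCES playlists(playlist_id),
    song_id     INTEGER NOT NULL REFERENCES songs(song_id),
    PRIMARY KEY (playlist_id, song_id)
);
CREATE TABLE listens (
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    song_id    INTEGER NOT NULL REFERENCES songs(song_id),
    play_count INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, song_id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO artists VALUES (1, 'Example Artist')")
    conn.execute("INSERT INTO songs VALUES (1, 'Example Song', 'pop', 1)")
    conn.execute("INSERT INTO users VALUES (1, 'demo_user', 'US')")
    conn.execute("INSERT INTO listens VALUES (1, 1, 7)")
    conn.commit()
    return conn


def top_genres(conn, user_id):
    """Return (genre, total plays) pairs for one user, most played first."""
    rows = conn.execute(
        """
        SELECT s.genre, SUM(l.play_count) AS plays
        FROM listens AS l
        JOIN songs AS s ON s.song_id = l.song_id
        WHERE l.user_id = ?
        GROUP BY s.genre
        ORDER BY plays DESC
        """,
        (user_id,),
    )
    return rows.fetchall()
```

A query like `top_genres(build_demo_db(), 1)` aggregates listen counts by genre, which is the kind of signal an analyst might feed into a recommender system.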
Analysis of the popularity of the top 100 Spotify songs based on their musical attributes
The project creates a data lake for US immigration data and develops an ETL pipeline to build this data lake using data from various sources. The project was completed as a part of Udacity's Data Engineering Nanodegree program.