Samples of my work from UC Berkeley's Data Science 100 course (Principles and Techniques of Data Science), taught by Sandrine Dudoit and John DeNero.
In this class, we explored key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making.
This repo includes work on a variety of data science topics, including but not limited to:
- Pandas library functionality
- Exploratory data analysis (EDA) techniques
- Data visualization libraries including Matplotlib and Seaborn
- Advanced SQL techniques
- Dimensionality reduction and PCA
- Regression models (Linear, Logistic, Tree)
- Gradient descent, regularization, cross-validation
- Distributed computing