This is the master repository containing my work for the Coursera Data Science Specialization. Each subdirectory corresponds to a different course and is self-contained. The official repository for the specialization can be found here.
The first course of the specialization orients you to the definition of a data scientist proposed by the three instructors (Brian Caffo, Jeff Leek, and Roger Peng). Further information on their general philosophies can be found on their personal websites and in a repository found here. The objectives for the Data Scientist's Toolbox were to set expectations for the Coursera specialization and to gain familiarity with GitHub and R Markdown.
The course in R Programming familiarized you with R and RStudio. It included two minor programming assignments to demonstrate R programming. My primary files for the course can be found here. The second programming assignment is part of a fork and can be found here.
The course in Getting and Cleaning Data involves learning methods to generate "tidy" datasets. The exercises for this course can be found here.
The course in Exploratory Data Analysis gives an overview of the base R plotting functions, the lattice package, and ggplot2.
The course in Reproducible Research introduces the use of RPubs. My RPubs report for this class can be found here.
The course in Statistical Inference gave a brief overview of statistics and their corresponding assumptions. This course was graded using quizzes.
The course in Regression Models gave an overview of possible statistical models for various analyses. My course project can be found here.
The course in Practical Machine Learning provided an overview of how to design machine learning experiments and demonstrated the capabilities of the caret package in R. My associated project for this course can be found here.
The course in Developing Data Products introduces Rpresenter, shiny, and slidify. My final project for this course involved building a Shiny app with an accompanying presentation. The code for this project can be found here.
The specialization finished with the Data Science Capstone. This capstone project involved handling a large data set of mined text (~1.5 GB) and building a word prediction application. A Milestone Report was produced partway through development of the program. The capstone was completed with a final presentation and a working application. The corresponding code can be found here.