datasciencectacontent's Introduction

Data Science Specialization Community Mentor Content Repository

Author: Len Greski

This repository contains content I developed as a student and as a Community Mentor in the Data Science Specialization from Johns Hopkins University, which is offered on Coursera. A number of people have developed content to help students work through the nine courses in the specialization. The main index for this content is datasciencespecialization.github.io.

Repository Contents

As a participant and Community Mentor in courses in the curriculum, I have seen students run into the same issues repeatedly. Migrating the content to GitHub makes it easier to repost it to new runs of courses within the curriculum. Students get access to the experiences of prior students without my having to cut and paste the content into the Discussion Forums, which are the primary mechanism for communication among students and with TAs.

| File | Description |
|------|-------------|
| /markdown | Directory containing markdown files, the primary form of documentation for the content in the repository. |
| /markdown/images | Directory containing portable network graphics files, which are used to illustrate the narrative content in other documentation. |
| README.md | File explaining the purpose and contents of the repository, with a listing of links to specific content by course. |

The remainder of this document serves as a directory of the content, aligning individual documents with the course(s) for which the content is relevant.

Course 1: Data Scientist's Toolbox

  1. Configuring RStudio to work with git / github - Mac OSX
  2. Configuring RStudio to work with git / github - Windows 7, 8, and 10
  3. Using Editor Modes in Discussion Forum Posts
  4. Buying a Computer for Data Science

Course 2: R Programming

General commentary about the course, R programming in general, and R in relation to other statistics packages.

  1. Commercial Statistics Packages: An Historical Perspective
  2. Configuring RStudio to work with git / github - Mac OSX
  3. A Data Frame is Also a List
  4. Forms of the Assignment Operator
  5. Forms of the Extract Operator
  6. S Objects, R Objects, and Lexical Scoping
  7. Thinking in R versus Thinking in SAS
  8. Strategy for the Programming Assignments
  9. Why is R More Difficult than SAS?
  10. R Onboarding for SAS Users
  11. References for R Programming: Provides a list of references for R programming, ranging from beginning to advanced topics.

Posts regarding specifics of programming assignments

  1. Assignment 1: Breaking Down Pollutantmean
  2. Assignment 1: A SAS Version of Pollutantmean
  3. Assignment 2: Demystifying makeVector
  4. Assignment 2: makeCacheMatrix as an Object
  5. Assignment 2: Grading the SHA-1 Hash Code
  6. Assignment 3: Functions to Sort Data Frames

Miscellaneous Code Examples and Instructions

  1. Common R Mistakes: Overwriting R Functions with Output Variables
  2. Permanently Setting R Working Directory: Link to an R-bloggers.com article that explains how to set your working directory permanently in R itself (rather than in RStudio).
  3. Creative Use of R: Downloading Course Lectures: Article illustrating how to use R to automate the download of lectures from Data Science Specialization courses, such as R Programming. The techniques used in this article are helpful for making research reproducible, as required for courses like Getting and Cleaning Data and Reproducible Research.

Interesting R News and Blog Articles

  1. R vs. Python: 2016 Survey of Software used for Data Science: Overview of results from a 2016 KDnuggets Software Poll, written by Gregory Piatetsky. The follow-up article with expanded analysis is "What Big Data, Data Science, Deep Learning software goes together", also on kdnuggets.com.
  2. Scaling R for Data Science: August 2016 article by Federico Castanedo explaining three ways to scale R.
  3. Lexical Scoping and Statistical Computing: Article by Robert Gentleman and Ross Ihaka at the University of Auckland describing how lexical scoping works, and why it is valuable in statistical computing.
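
As a brief illustration of the lexical scoping idea covered in the last article above, the following is a minimal sketch in R. It is not taken from the article; the names `make_counter`, `counter_a`, and `counter_b` are hypothetical and chosen only for this example. It shows that a function created inside another function keeps access to the variables of the environment in which it was defined.

```r
# make_counter() returns a function that remembers `count`, because R
# resolves free variables in the environment where a function was defined.
make_counter <- function() {
  count <- 0
  function() {
    # `<<-` assigns to `count` in the enclosing (defining) environment
    count <<- count + 1
    count
  }
}

# Each call to make_counter() creates a separate environment,
# so the two counters do not share state.
counter_a <- make_counter()
counter_b <- make_counter()

counter_a()  # returns 1
counter_a()  # returns 2
counter_b()  # returns 1
```

The same mechanism underlies the caching functions in Assignment 2, where the returned functions share state stored in their defining environment.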

Course 3: Getting and Cleaning Data

  1. Real World Example: Reading American Community Survey data
  2. Strategy for Reading Files & APIs / Quiz 2

Course 5: Reproducible Research

  1. Assignment 2 Checklist

Course 6: Statistical Inference

  1. Reference Materials for Statistical Inference: Start here if you're looking for help on the statistical techniques taught in this course.
  2. Using MathJax with Discussion Forums, R Markdown, and Github Pages
  3. Power Calculations: Optimal Sample size
  4. Permutation Tests Explained

Articles Related to the Course Project

  1. Exponential Distribution / Central Limit Theorem - Assignment Checklist
  2. ToothGrowth Analysis - Assignment Checklist
  3. Exploratory Data Analysis in ToothGrowth Assignment: Explains the exploratory data analysis requirement for students who have not taken the Exploratory Data Analysis course prior to taking Statistical Inference.
  4. Accessing R Code from an Appendix in Knitr
  5. Theoretical Variance of Sampling Distribution of the Mean
  6. Kable Tables with Data Frames: Illustrates how to display a custom table in a knitr document by creating a data frame to contain the information to be rendered with kable().
  7. Installing MiKTeX on Windows 10 / Generating a PDF with knitr
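
As a rough sketch of the data frame plus kable() technique mentioned in the list above, the example below summarizes R's built-in ToothGrowth data set into a data frame and renders it with kable(). It is an illustration under stated assumptions, not the code from the linked article: it assumes the knitr package is installed, and the names `summary_table` and `rendered` as well as the column labels and caption are hypothetical.

```r
library(knitr)

# Build a data frame holding the information to display:
# mean tooth length for each supplement and dose combination.
summary_table <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Render the data frame as a table; inside an R Markdown chunk,
# the result prints as a formatted table in the knitted document.
rendered <- kable(
  summary_table,
  digits = 2,
  col.names = c("Supplement", "Dose", "Mean length"),  # illustrative labels
  caption = "Mean tooth length by supplement and dose"
)
rendered
```

Because the table content lives in an ordinary data frame, any values computed earlier in the report can be placed in it before rendering.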

Course 7: Regression Models

  1. Why does sum of errors * X equal 0?
  2. Using MathJax with Discussion Forums, R Markdown, and Github Pages

Course 8: Practical Machine Learning

  1. Week 4: Combining Predictors Math Explained
  2. Course Project - gh-pages Setup with RStudio
  3. Course Project - Improving Runtime Performance of Random Forest Models with caret::train()
  4. Course Project - Predicting Test Scores based on Training Model Accuracy

Course 9: Developing Data Products

  1. Configuring shinyapps.io Application Timeout

Content for Community Mentors

  1. Tips for New Community Mentors: A list of tips for new mentors supporting the Data Science Specialization, ranging from when to direct students to paid / professional resources such as the Coursera Learner Help Center, to how to optimize the value of content that is posted by mentors.
