Giter VIP home page Giter VIP logo

module-3-project-houston-ds-042219's Introduction

Module 3 - Final Project Specifications

Introduction

In this lesson, we'll review all the guidelines and specifications for the final project for Module 3.

Houston DS 042219 Pairings

  • Nick & Max
  • Lindsey & Francois
  • Harrison & Joe

Objectives

  • Understand all required aspects of the Final Project for Module 3
  • Understand all required deliverables
  • Understand what constitutes a successful project

Final Project Summary

Another module down--you're half way there!

For the culmination of Module 3, you just need to complete the final project!

The Project

For this project, you'll be working with the Northwind database--a free, open-source dataset created by Microsoft containing data from a fictional company. You probably remember the Northwind database from our section on Advanced SQL. Here's the schema for the Northwind database:

The goal of this project is to test your ability to gather information from a real-world database and use your knowledge of statistical analysis and hypothesis testing to generate analytical insights that can be of value to the company.

The Deliverables

The goal of your project is to query the database to get the data needed to perform a statistical analysis. In this statistical analysis, you'll need to perform a hypothesis test (or perhaps several) to answer the following question:

Does discount amount have a statistically significant effect on the quantity of a product in an order? If so, at what level(s) of discount?

In addition to answering this question with a hypothesis test, you will also need to come up with at least 3 other hypotheses to test on your own. These can by anything that you think could be imporant information for the company.

For this hypothesis, be sure to specify both the null hypothesis and the alternative hypothesis for your question. You should also specify if this is one-tail or a two-tail test.

To complete this project, you will need to turn in the following 3 deliverables:

  1. A Jupyter Notebook containing any code you've written for this project.
  2. A Blog Post explaining your process, methodology, and findings.
  3. An "Executive Summary" PowerPoint Presentation that explains the hypothesis tests you ran, your findings, and their relevance to company stakeholders.

Jupyter Notebook Must-Haves

For this project, your jupyter notebook should meet the following specifications:

Organization/Code Cleanliness

  • The notebook should be well organized, easy to follow, and code is commented where appropriate.

* Level Up: The notebook contains well-formatted, professional looking markdown cells explaining any substantial code. All functions have docstrings that act as professional-quality documentation.
* The notebook is written to technical audiences with a way to both understand your approach and reproduce your results. The target audience for this deliverable is other data scientists looking to validate your findings.
* Any SQL code written to source data should also be included.

Findings

  • Your notebook should clearly show how you arrived at your results for each hypothesis test, including how you calculated your p-values.

* You should also include any other statistics that you find relevant to your analysis, such as effect size.

Blog Post Must-Haves

Your blog post should include everything from how you identified what tables contained the information you need, to how you retrieved it using SQL (and any challenges you ran into while doing so), as well as your methodology and results for your hypothesis tests.

NOTE: This blog post is your way of showcasing the work you've done on this project--chances are it will soon be read by a recruiter or hiring manager! Take the time to make sure that you craft your story well, and clearly explain your process and findings in a way that clearly shows both your technical expertise and your ability to communicate your results!

Executive Summary Must-Haves

Your presentation should:

  • Contain between 5-10 professional quality slides detailing:

* A high-level overview of your methodology
* The results of your hypothesis tests
* Any real-world recommendations you would like to make based on your findings (ask yourself--why should the executive team care about what you found? How can your findings help the company?)
* Take no more than 5 minutes to present
* Avoid technical jargon and explain results in a clear, actionable way for non-technical audiences.

module-3-project-houston-ds-042219's People

Contributors

xiupan avatar

Watchers

James Cloos avatar Lindsey avatar

Forkers

fleralle

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.