Giter VIP home page Giter VIP logo

dsc-3-32-12-section-recap's Introduction

Section Recap

Introduction

This short lesson summarizes key takeaways from section 32

Objectives

You will be able to:

  • Understand and explain what was covered in this section
  • Understand and explain why this section will help you become a data scientist

Key Takeaways

The key takeaways from this section include:

  • The delphi technique suggests that multiple independent estimates will be more consistently accurate than any single estimate
  • Because of this, ensemble techniques are a powerful way for improving the quality of your models
  • Sometimes you'll use model stacking or meta-ensembles where you use of combination of different types of model for your ensemble
  • It's also common to have multiple similar models in an ensemble - e.g. a bunch of decision trees
  • Bagging (Bootstrap AGGregation) is a technique that leverages Bootstrap Resampling and Aggregation
  • Bootstrap resampling uses multiple smaller samples from the test data set to create independent estimates, and aggregation is the combining of those estimates to make predictions
  • A random forest is an ensemble method for decision trees using Bagging and the Subspace Sampling Method to create variance among the trees
  • With a random forest, for each tree, we sample using 2/3 of the training data and the remaining third is used to calculate the Out-of-Bag Error
  • In addition, the Subspace Sampling Method is used to further increase variability by randomly selecting the subset of features to use as predictors for training any given tree
  • GridsearchCV is an exhaustive search technique for finding optimal combinations of hyper parameters by calculating for every combination of parameter values you put into the search
  • Gradient boosting leverages an ensemble of weak learners (weak models) to create a strong combined model
  • Boosting (when compared to random forests) is an interative rather than independent process, using each model to strengthen the weaknesses of the previous ones
  • Two of the most common algorithms for Boosting are Adaboost (Adaptive Boosting) and Gradient Boosted Trees
  • Adaboost creates new classifiers by continually influencing the distribution of the data sampled to train each successive tree
  • Gradient Boosted Trees are a more advanced boosting algorithm that makes use of Gradient Descent
  • XGBoost (eXtreme Gradient Boosting) is one of the top gradient boosting algorithms currently in use
  • XGBoost is a stand-alone library that implements popular gradient boosting algorithms in the fastest, most performant way possible

dsc-3-32-12-section-recap's People

Contributors

loredirick avatar peterbell avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.