nbaplayerclustering

Clustering NBA Players Based on Advanced Stats from Basketball-Reference.com and FiveThirtyEight's DRAYMOND data.

The goal here is to group players effectively so that conclusions made about one player or style can be applied to a group of similar players. The aim is to cluster players according to their skill sets, play style, and success on the court. In other words, the goal is to identify which players play like which other players.

The player clustering algorithm generates an arbitrary number of groupings of similar NBA players for a given target, based on a 13-dimensional advanced-stat dataset pulled from Basketball-Reference.com and FiveThirtyEight's DRAYMOND data. The dataset was scaled; reduced a first time with PCA (Principal Component Analysis); scaled again; reduced a second time with LDA (Linear Discriminant Analysis), which preserved the clusters from the first clustering; and finally scaled once more to produce the 2D output. There is a graph of the results at the end of section 5.
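The scale, PCA, scale, LDA, scale sequence described above might look roughly like the following scikit-learn sketch. It is an illustration under stated assumptions, not the repository's actual code: the text does not say which clustering algorithm produced the first set of clusters, so KMeans is assumed here, and the function name `reduce_players`, the number of PCA components, the number of clusters, and the random input matrix are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler


def reduce_players(features, n_clusters=4, n_pca_components=5, seed=0):
    """Scale -> PCA -> scale -> cluster -> LDA -> scale, ending in 2D.

    KMeans and all default parameter values are assumptions made for
    this sketch; the original project may use different choices.
    """
    # Step 1: put every stat on a comparable scale.
    scaled = StandardScaler().fit_transform(features)

    # Step 2: first reduction with PCA, then rescale the components.
    pca_out = PCA(n_components=n_pca_components, random_state=seed).fit_transform(scaled)
    pca_scaled = StandardScaler().fit_transform(pca_out)

    # Step 3 (assumed): cluster in PCA space to get labels for LDA.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(pca_scaled)

    # Step 4: second reduction with LDA, supervised by the cluster labels,
    # so the 2D projection keeps those clusters separated.
    lda = LinearDiscriminantAnalysis(n_components=2)
    lda_out = lda.fit_transform(pca_scaled, labels)

    # Step 5: final scaling of the 2D coordinates.
    coords = StandardScaler().fit_transform(lda_out)
    return coords, labels


# Placeholder input: 200 fake "players" with 13 random stats each.
rng = np.random.default_rng(0)
fake_stats = rng.normal(size=(200, 13))
coords, labels = reduce_players(fake_stats)
```

Note that LDA can produce at most one fewer component than there are classes, so a 2D output needs at least three clusters.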

Take, for example, a player like LeBron James. Which NBA player approaches the game most similarly to him? The short answer is not many, because he's LeBron. Nevertheless, there are players who are skilled scorers, passers, and defenders, and they will compare best to LeBron. Note: this isn't only about matching play styles; how well a player performs in that style matters too.

Consider a player like Andre Iguodala, who approaches the game of basketball almost identically to LeBron: he's a serviceable scorer, decent passer, decent rebounder, and a great defender. He's also similar in body size and plays the same position. In terms of how well-rounded Iguodala's game is, he's a dead ringer for LeBron. The problem is that he's only relatively decent at all the things LeBron excels at. So although the two would compare favorably, Iguodala still might not be the best comp for LeBron.

On the other hand, someone like Kevin Durant, whose game has aspects that are a bit different from LeBron's (Durant's a better shooter, for one), could be a better comp. Durant adds similarly high value as an all-around player, has a high usage rate, is a good defender, and can hold his own passing the ball.

That's the problem this algorithm is trying to solve: who is most similar to whom?

Advanced metrics considered in this algorithm: true shooting %, three-point attempt rate, free throw rate, offensive rebound %, assist %, turnover %, usage %, steal %, block %, defensive box plus/minus, and DRAYMOND. DRAYMOND is FiveThirtyEight's proprietary metric for measuring the ability to disrupt opponents' shots without blocking them, a crucial skill that's absent from standard box score info and even many advanced stats. Position and age are also considered, primarily to give the algorithm a base idea of the physical build and energy level of players. Together, these eleven metrics plus position and age make up the 13 dimensions of the dataset.
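These metrics sit on very different scales (a shooting percentage near 0.6 versus a usage rate near 30), which is why scaling matters before any distance-based comparison. The sketch below shows the idea using only the standard library: z-score each column, then rank players by Euclidean distance to a target. This is a plain nearest-neighbor lookup for illustration, not the repository's clustering method. The helper names, the three chosen stats, and every number are made up, and the players are placeholders rather than real stat lines.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) std 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def closest_comps(names, rows, target, k=2):
    """Return the k players nearest to `target` in scaled stat space."""
    scaled = dict(zip(names, zscore_columns(rows)))
    target_vec = scaled[target]
    distances = {
        name: sqrt(sum((a - b) ** 2 for a, b in zip(vec, target_vec)))
        for name, vec in scaled.items()
        if name != target
    }
    return sorted(distances, key=distances.get)[:k]


# Made-up values for illustration: (true shooting %, usage %, assist %).
names = ["Player A", "Player B", "Player C", "Player D"]
rows = [
    [0.60, 30.0, 35.0],
    [0.59, 29.0, 33.0],
    [0.52, 15.0, 10.0],
    [0.65, 18.0, 8.0],
]
comps = closest_comps(names, rows, "Player A", k=2)
```

Without the scaling step, the assist and usage columns would dominate the distance simply because their raw values are larger.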

The algorithm scrapes its data from basketball-reference.com and also directly imports its DRAYMOND data from FiveThirtyEight.

Contributors

seanmcalevey
