Soccer xG

A Python package for training and analyzing expected goals (xG) models in soccer.




About

This repository contains the code and models for our series on the analysis of xG models.

In particular, it contains code for experimenting with an exhaustive set of features and machine learning pipelines for predicting xG values from soccer event stream data. Since we rely on the SPADL language as input format, soccer_xg currently supports event streams provided by Opta, Wyscout, and StatsBomb.
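To give a sense of what such features look like, the sketch below computes two features commonly used in xG models from a shot location: the distance to goal and the angle subtended by the goal mouth. It is an illustration only and not necessarily the feature set implemented in soccer_xg. It assumes SPADL's 105 m x 68 m pitch with the attacking goal centred at (105, 34) and a regulation 7.32 m goal width, and the function name `shot_features` is hypothetical.

```python
import math

# Assumptions (not taken from soccer_xg itself): SPADL-style coordinates on a
# 105 m x 68 m pitch, attacking left to right, regulation 7.32 m wide goal.
PITCH_LENGTH = 105.0
PITCH_WIDTH = 68.0
GOAL_WIDTH = 7.32


def shot_features(x: float, y: float) -> dict:
    """Return distance (m) and goal-mouth angle (rad) for a shot at (x, y)."""
    goal_x = PITCH_LENGTH
    goal_y = PITCH_WIDTH / 2
    dx = goal_x - x
    dy = goal_y - y
    distance = math.hypot(dx, dy)
    # Angle between the lines from the shot location to the two goal posts.
    left_post = math.atan2(goal_y + GOAL_WIDTH / 2 - y, dx)
    right_post = math.atan2(goal_y - GOAL_WIDTH / 2 - y, dx)
    angle = abs(left_post - right_post)
    return {"distance": distance, "angle": angle}


# Example: a shot from the penalty spot, 11 m in front of the goal centre.
features = shot_features(94.0, 34.0)
print(features)
```

A classifier such as logistic regression or gradient-boosted trees would then map features like these to a scoring probability.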

Getting started

The recommended way to install soccer_xg is with pip:

$ pip install soccer-xg

Subsequently, a basic xG model can be trained and applied with the code below:

from itertools import product
from soccer_xg import XGModel, DataApi

# load the data
provider = 'wyscout_opensource'
leagues = ['ENG', 'ESP', 'ITA', 'GER', 'FRA']
seasons = ['1718']
api = DataApi([
    f"data/{provider}/spadl-{provider}-{league}-{season}.h5"
    for league, season in product(leagues, seasons)
])
# load the default pipeline
model = XGModel()
# train the model
model.train(api, training_seasons=[('ESP', '1718'), ('ITA', '1718'), ('GER', '1718')])
# validate the model
model.validate(api, validation_seasons=[('ENG', '1718')])
# predict xG values
model.estimate(api, game_ids=[2500098])
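Before constructing the DataApi, it can help to confirm that the expected data files are actually present. The following sketch only mirrors the file-naming pattern used above and relies on the standard library; the helper names `spadl_paths` and `missing_files` are hypothetical and not part of soccer_xg.

```python
from itertools import product
from pathlib import Path


def spadl_paths(provider, leagues, seasons, root="data"):
    """Build one path per (league, season) pair, following the pattern above."""
    return [
        Path(root) / provider / f"spadl-{provider}-{league}-{season}.h5"
        for league, season in product(leagues, seasons)
    ]


def missing_files(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not p.is_file()]


paths = spadl_paths(
    "wyscout_opensource", ["ENG", "ESP", "ITA", "GER", "FRA"], ["1718"]
)
for p in missing_files(paths):
    print(f"missing: {p}")
```

Calling `p.as_posix()` on each path yields the same strings as the list comprehension in the example above.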

Although this default pipeline is suitable for computing xG, it is by no means the best possible model. The notebook 4-creating-custom-xg-pipelines illustrates how you can train your own xG models. Alternatively, you can use one of the four pipelines from our blog post series, which can be loaded with:

XGModel.load_model('openplay_logreg_basic')
XGModel.load_model('openplay_xgboost_basic')
XGModel.load_model('openplay_logreg_advanced')
XGModel.load_model('openplay_xgboost_advanced')

Note that these models are meant to estimate xG for shots from open play only. To compute xG values for all shot types, you will have to combine them with pipelines for penalties and free kicks:

from soccer_xg import xg

openplay_model = xg.XGModel.load_model('openplay_xgboost_advanced')  # custom pipeline for open play shots
penalty_model = xg.PenaltyXGModel() # default pipeline for penalties
freekick_model = xg.FreekickXGModel() # default pipeline for free kicks

model = xg.XGModel()
model.model = [openplay_model, penalty_model, freekick_model]
model.train(api, training_seasons=...)
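Conceptually, combining pipelines means routing each shot to the estimator that matches its type. The sketch below illustrates that idea with plain Python and toy stand-in estimators. It does not reflect how soccer_xg implements the combination internally, and every name and number in it (`route_shot`, the `type` key, the constant return values) is a hypothetical placeholder.

```python
from typing import Callable, Dict

Shot = Dict[str, str]
Estimator = Callable[[Shot], float]


def make_constant(value: float) -> Estimator:
    """Toy stand-in for a trained pipeline: always returns the same value."""
    return lambda shot: value


# Placeholder values for illustration only, not fitted xG estimates.
estimators: Dict[str, Estimator] = {
    "open_play": make_constant(0.10),
    "penalty": make_constant(0.75),
    "free_kick": make_constant(0.05),
}


def route_shot(shot: Shot) -> float:
    """Send a shot to the estimator registered for its type."""
    shot_type = shot["type"]
    if shot_type not in estimators:
        raise ValueError(f"no pipeline registered for shot type {shot_type!r}")
    return estimators[shot_type](shot)


shots = [{"type": "penalty"}, {"type": "open_play"}, {"type": "free_kick"}]
values = [route_shot(s) for s in shots]
print(values)
```

Raising an error for an unregistered shot type, rather than falling back to the open-play estimator, makes a missing pipeline visible early.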

For developers

Create venv and install deps

make init

Install git precommit hook

make precommit_install

Run linters, autoformat, tests etc.

make pretty lint test

Bump the version (major, minor, or patch)

make bump_major
make bump_minor
make bump_patch

Research

If you make use of this package in your research, please use the following citation:

@inproceedings{robberechts2020data,
  title={How data availability affects the ability to learn good xG models},
  author={Robberechts, Pieter and Davis, Jesse},
  booktitle={International Workshop on Machine Learning and Data Mining for Sports Analytics},
  pages={17--27},
  year={2020},
  organization={Springer}
}

License

Copyright (c) DTAI - KU Leuven – All rights reserved.
Licensed under the Apache License, Version 2.0
Written by Pieter Robberechts, 2020
