fifa-world-cup-prediction's Introduction

Project Description

Objective:

  • Predict the outcome of international matches. The prediction target is either "Win / Lose / Draw" or the goal difference.
  • Apply the model to predict the result of FIFA world cup 2018.
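
The two prediction targets named above can be derived from a final score. The following is a minimal sketch, not the project's own labelling code; the function name and the lowercase label strings are assumptions made for illustration.

```python
def match_labels(goals_team1, goals_team2):
    """Return (outcome, goal_difference) from team_1's point of view.

    The label strings "win" / "lose" / "draw" are illustrative; the
    project's encoded labels may differ.
    """
    diff = goals_team1 - goals_team2
    if diff > 0:
        outcome = "win"
    elif diff < 0:
        outcome = "lose"
    else:
        outcome = "draw"
    return outcome, diff


print(match_labels(2, 1))  # ('win', 1)
```

The outcome serves as the classification target, and the signed goal difference serves as the second target.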

Data: The data are assembled from multiple sources. Most come from Kaggle; the rest come from the FIFA website and EA games.

Feature Engineering: To determine which team is more likely to win a match, I came up with 4 main groups of features, based on my own knowledge:

  1. head-to-head match history between 2 teams
  2. recent performance of each team (10 recent matches), aka "form"
  3. betting odds before the match
  4. squad strength (from FIFA video game)

The feature list reflects these four groups.
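
Groups 1 and 2 can be computed from a plain list of past results. The sketch below is an illustration under stated assumptions: the `Match` record layout and the helper names are hypothetical, and only the output names (h_win_diff, h_draw, form_diff_*) come from the feature list.

```python
from collections import namedtuple

# Hypothetical record layout; the project's real columns may differ.
Match = namedtuple("Match", "date team_1 team_2 goals_1 goals_2")


def _goals_for_against(match, team):
    """Return (goals scored, goals conceded) by `team` in `match`."""
    if match.team_1 == team:
        return match.goals_1, match.goals_2
    return match.goals_2, match.goals_1


def head_to_head(history, a, b):
    """Return (h_win_diff, h_draw) over past meetings between a and b."""
    wins_a = wins_b = draws = 0
    for m in history:
        if {m.team_1, m.team_2} != {a, b}:
            continue
        goals_a, goals_b = _goals_for_against(m, a)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a < goals_b:
            wins_b += 1
        else:
            draws += 1
    return wins_a - wins_b, draws


def form(history, team, n=10):
    """Summarise the n most recent matches played by `team`."""
    played = [m for m in history if team in (m.team_1, m.team_2)]
    recent = sorted(played, key=lambda m: m.date)[-n:]
    stats = {"goalF": 0, "goalA": 0, "win": 0, "draw": 0}
    for m in recent:
        gf, ga = _goals_for_against(m, team)
        stats["goalF"] += gf
        stats["goalA"] += ga
        stats["win"] += int(gf > ga)
        stats["draw"] += int(gf == ga)
    return stats


def form_diff(history, a, b, n=10):
    """Return form differences (team a minus team b)."""
    fa, fb = form(history, a, n), form(history, b, n)
    return {f"form_diff_{key}": fa[key] - fb[key] for key in fa}
```

To avoid leaking the result into the features, `history` should contain only matches played before the match being predicted.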

Report

Check the Full Report to gain more insight about this Project. The report contains:

  • Exploratory Data Analysis: correlations, importance of features to the results, and interesting hypotheses
  • Methodology: how I carried out this project and which experiments I ran
  • Models: baseline model, logistic regression, random forest, gradient boosting tree, AdaBoost tree, neural network
  • Evaluation Criteria: F1 score, 10-fold cross-validation accuracy
  • Results and Conclusion
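
The evaluation criteria listed above can be illustrated without any external library. This is a sketch under assumptions: the report does not say which F1 averaging was used, so macro averaging is assumed here, and the majority-class baseline is a stand-in, since the README does not define its baseline model. All function names are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging mode)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_validate(X, y, fit, predict, k=10):
    """Return a list of (accuracy, macro F1), one pair per held-out fold."""
    results = []
    for held_out in k_fold_indices(len(y), k):
        test = set(held_out)
        train = [i for i in range(len(y)) if i not in test]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in held_out]
        truth = [y[i] for i in held_out]
        accuracy = sum(p == t for p, t in zip(preds, truth)) / len(truth)
        results.append((accuracy, macro_f1(truth, preds)))
    return results


# Stand-in baseline: always predict the most frequent training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model
```

Any classifier exposed as a `fit` / `predict` pair of functions can be passed to `cross_validate` in place of the baseline, which gives a floor for comparing the other models.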

Project Structure

  1. EDA: Exploratory Data Analysis
  2. LE: saved Label Encoder model
  3. data: the complete dataset
  4. save_model: Machine Learning models saved after training

Data

Data Source

The dataset covers all international matches from 2000 to 2018: results, betting odds, rankings, and squad strengths.

  1. FIFA World Cup 2018
  2. International match 1872 - 2018
  3. FIFA Ranking through Time
  4. Bet Odd
  5. Bet Odd 2
  6. Squad Strength - Sofia
  7. Squad Strength - FIFA index

Feature List

  • *difference: team1 - team2
  • *form: performance in 10 recent matches

Feature Name            | Description                                                  | Source
------------------------|--------------------------------------------------------------|-------
team_1                  | Nation Code (e.g. US, NZ)                                    | 1 & 2
team_2                  | Nation Code (e.g. US, NZ)                                    | 1 & 2
date                    | Date of match (yyyy-mm-dd)                                   | 1 & 2
tournament              | Friendly, EURO, AFC, FIFA WC                                 | 1 & 2
h_win_diff              | Head2Head: win difference                                    | 2
h_draw                  | Head2Head: number of draws                                   | 2
form_diff_goalF         | Form: difference in "Goal For"                               | 2
form_diff_goalA         | Form: difference in "Goal Against"                           | 2
form_diff_win           | Form: difference in number of wins                           | 2
form_diff_draw          | Form: difference in number of draws                          | 2
odd_diff_win            | Betting Odd: difference in bet rate for win                  | 4 & 5
odd_draw                | Betting Odd: bet rate for draw                               | 4 & 5
game_diff_rank          | Squad Strength: difference in FIFA Rank                      | 3
game_diff_ovr           | Squad Strength: difference in Overall Strength               | 6
game_diff_attk          | Squad Strength: difference in Attack Strength                | 6
game_diff_mid           | Squad Strength: difference in Midfield Strength              | 6
game_diff_def           | Squad Strength: difference in Defense Strength               | 6
game_diff_prestige      | Squad Strength: difference in prestige                       | 6
game_diff_age11         | Squad Strength: difference in age of 11 starting players     | 6
game_diff_ageAll        | Squad Strength: difference in age of all players             | 6
game_diff_bup_speed     | Squad Strength: difference in Build Up Play Speed            | 6
game_diff_bup_pass      | Squad Strength: difference in Build Up Play Passing          | 6
game_diff_cc_pass       | Squad Strength: difference in Chance Creation Passing        | 6
game_diff_cc_cross      | Squad Strength: difference in Chance Creation Crossing       | 6
game_diff_cc_shoot      | Squad Strength: difference in Chance Creation Shooting       | 6
game_diff_def_press     | Squad Strength: difference in Defense Pressure               | 6
game_diff_def_aggr      | Squad Strength: difference in Defense Aggression             | 6
game_diff_def_teamwidth | Squad Strength: difference in Defense Team Width             | 6
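
The "difference: team1 - team2" convention for the squad-strength features can be sketched as a subtraction over two per-team rating records. The function name, the dictionary layout, and the sample numbers below are all hypothetical; only the game_diff_ prefix and the attribute suffixes come from the feature list.

```python
def squad_diff(team1_ratings, team2_ratings, prefix="game_diff_"):
    """Return team_1 minus team_2 for every attribute both teams have."""
    shared = team1_ratings.keys() & team2_ratings.keys()
    return {
        prefix + name: team1_ratings[name] - team2_ratings[name]
        for name in sorted(shared)
    }


# Made-up ratings, for illustration only.
team_1 = {"ovr": 85, "attk": 86, "def": 83}
team_2 = {"ovr": 80, "attk": 84, "def": 81}
print(squad_diff(team_1, team_2))
```

Attributes missing for either team are skipped rather than filled in, so gaps in the crawled data stay visible downstream.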

How to Run:

python experiment1-W-D-L.py
python experiment2-GoalDiff.py
python experiment3-WorldCup.py

Task List

Complete

  • Add prediction for Matchday 2
  • Add feature Importance
  • Add feature of squad and player info
  • Build a web crawler for each team's squad
  • Build a web crawler for FIFA game players
  • Add a simple classification based on "bet odd".
  • Add feature group 1
    • Add h_win_diff, h_draw
    • Add rank_diff, title_diff
  • Add features group 2
  • Add features group 3
  • Simple EDA and a small story
  • Add features group 4
  • Prepare framework for running classifiers
  • Add evaluation metrics and plot
    • Add accuracy, precision, recall, F1
    • Add ROC curves
  • Build a dataset without player ratings and squad values
  • Generate data and perform prediction for EURO 2016; now my story is more interesting
  • Create more data: "teamA vs teamB -> win" is equivalent to "teamB vs teamA -> lose"

fifa-world-cup-prediction's People

Contributors

mrthlinh

fifa-world-cup-prediction's Issues

Wrong home_team information?

Hi,

I noticed that the home_team column in files data_ex1 to 3 has incorrect information. Your feature list, however, doesn't contain that column. Do you have any correct home_team data? That would help increase accuracy.

Also, why does the data in the data_ex1 file stop in 2015 instead of 2018?

And finally, do you have any updates on this project?

Thank you very much.

goal diff of win and lose

I notice that there are more than three classes in your label class, like win1, win2, etc. However, I didn't find the source code that creates them. Can you tell me which folder the code is in, or could you share the code for this part on GitHub?

Source Code Structure?

Hi,

Would you please write a quick overview of the code structure if you have time? For example, what is each file used for? "The crawler is for crawling info from the FIFA video game, xxx.py is used for xxx", something like this. A brief introduction to the code structure would help.

Thank you.

Crawler does not work

Hi,
When I run crawler.py, it gives me this error:

    Traceback (most recent call last):
      File "crawler.py", line 98, in <module>
        version_id = int(sys.argv[1]) - 5
    IndexError: list index out of range

How can I fix this?
Thank you.
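
The traceback shows that line 98 reads `sys.argv[1]`, so the script expects one integer command-line argument and fails when it is run without one. A minimal sketch of a guard around that line follows; the function name and usage message are hypothetical, and what the number means (beyond being reduced by 5) is not stated in the repository text shown here.

```python
import sys


def parse_version_id(argv):
    """Mirror the failing line, `int(sys.argv[1]) - 5`, with a usage check."""
    if len(argv) < 2:
        # Exit with a readable message instead of an IndexError.
        raise SystemExit("usage: python crawler.py <version>")
    return int(argv[1]) - 5


# Inside crawler.py this would replace the failing line:
#     version_id = parse_version_id(sys.argv)
```

Running the script with an integer argument after the file name should get past this line.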

Issues about squad_crawler.py

When I ran python3 squad_crawler.py, I got this error:

    France - Didier Deschamps
    Traceback (most recent call last):
      File "squad_crawler.py", line 64, in <module>
        player_name = row.select("th > a")[0].get_text()
    IndexError: list index out of range

Would you please tell me how to fix this?

Thx.

Question about squad strength

Hi,
May I ask a question about squad strength? Since the members of a team change constantly, how can we calculate the squad strength of each team for every match?
Thank you.

ps: I found it
