qatar_2022_prediction's Introduction

World Cup QATAR 2022 Prediction: Project Overview

This project aims to predict the results of the Qatar 2022 World Cup from three sources: the international matches played since the 1990s, the ratings of the teams in their most recent matches, and the potential of each team.

Resources Used

  • Python Version: 3.7
  • Packages: pandas, NumPy, scikit-learn, TensorFlow, and seaborn.
  • Data:
    • international_matches.csv - This dataset provides an overview of international soccer matches played since the 1990s. It also describes the strength of each team, using actual FIFA rankings and player strengths taken from the EA Sports FIFA video game.
    • players_22.csv - This dataset contains the player data from FIFA 22 Career Mode.

Data preparation and dataset creation

  • Both datasets, international_matches.csv and players_22.csv, were prepared for analysis and for building the training dataset of the machine learning model. The preparation consists of handling missing (NA) values and removing the records of teams that will not participate in the cup.

  • From international_matches.csv, two datasets are created: the training dataset training.csv and the inference dataset last_team_scores.csv. training.csv contains the names of the two teams facing each other, the FIFA ranking of each team, and the defense, midfield, and offense ratings of both teams. The inference dataset contains the ratings of each team as of its most recent FIFA match date.
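The two preparation steps above (dropping non-participating teams and handling missing values) can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names, the toy rows, the participants subset, and the choice of median imputation are all assumptions. It also assumes a match is kept only when both teams take part in the cup.

```python
import pandas as pd

# Hypothetical miniature of international_matches.csv (made-up values;
# the real column names may differ).
matches = pd.DataFrame({
    "home_team": ["Brazil", "France", "Italy", "Qatar"],
    "away_team": ["Argentina", "Brazil", "Brazil", "France"],
    "home_team_fifa_rank": [1, 4, 6, 50],
    "away_team_fifa_rank": [3, 1, 1, 4],
    "home_team_offense_score": [86.0, None, 84.0, 70.0],
    "away_team_offense_score": [84.0, 86.0, 86.0, 85.0],
})

# Illustrative subset of the teams taking part in the cup.
participants = {"Brazil", "Argentina", "France", "Qatar"}


def prepare(matches: pd.DataFrame, participants: set) -> pd.DataFrame:
    """Drop matches involving non-participants and fill missing ratings."""
    # Assumption: keep a match only if both teams play in the tournament.
    mask = (
        matches["home_team"].isin(participants)
        & matches["away_team"].isin(participants)
    )
    kept = matches.loc[mask].copy()
    # Assumption: missing ratings are replaced by the column median.
    score_cols = [c for c in kept.columns if c.endswith("_score")]
    kept[score_cols] = kept[score_cols].fillna(kept[score_cols].median())
    return kept.reset_index(drop=True)


training = prepare(matches, participants)
print(training)
```

Other strategies for missing ratings, such as dropping incomplete rows or imputing per team, would fit the same structure; only the `fillna` line changes.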

EDA

Using the datasets international_matches.csv and players_22.csv, the notebooks QATAR22_EDA+Data_Preparation.ipynb and Getting_Squads_Stats.ipynb answer the questions listed below. The answers give an idea of which teams are the statistical favorites to win the cup.

  • Which national soccer teams have the best offense?

download-12

  • Which national soccer teams have the best defense?

download-8

  • Which national soccer teams have the best midfield?

download-11

  • Which teams have the highest winning percentage?

download-6

  • Who are the best players in Qatar 2022?

download-2

  • What are the most promising teams?

download-5

  • Is there any advantage to being the home team?

This question is fundamental. The graph below shows that home teams win more than 50% of their home games. Several factors contribute, e.g., familiarity with the pitch, the travel required of the visiting team, a sense of territoriality, and the support of the crowd. For example, when the Colombian national team visits the Maracana stadium to play Brazil, it tends to lose or draw; when Colombia plays at home in the Metropolitano stadium, it tends to draw or win. For this reason, to predict match results with a machine learning model, I must define the home team and the visiting team for each match.

download-7
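The home-advantage figure comes down to the share of matches that end in each outcome from the home team's point of view. A minimal sketch of that calculation is shown here; the outcome labels and the toy list are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Toy outcomes from the home team's point of view (made-up values,
# not real match results).
home_results = ["win", "draw", "win", "loss", "win"]


def outcome_shares(results):
    """Return the fraction of matches ending in each outcome."""
    counts = Counter(results)
    total = len(results)
    return {outcome: n / total for outcome, n in counts.items()}


shares = outcome_shares(home_results)
print(shares)
```

Applied to the real match history, a "win" share above 0.5 would correspond to the pattern described above.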

Modeling and Tuning

The Modeling+Tuning.ipynb notebook trains the machine learning models that predict the outcome of the World Cup matches. It selects one model for the group stage matches and another for the knockout stage. The difference is that a group stage match can end in a loss, a draw, or a win, whereas a knockout match can only end in defeat or victory. The best model for each stage is chosen among the following algorithms:

  • Random Forest
  • AdaBoost Classifier
  • XGBoost
  • Neural Networks

The XGBoost model performs best in both stages. It is therefore tuned, validated, and exported as a pipeline so that inference is straightforward.
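The selection step can be sketched as a cross-validated comparison of candidate classifiers, with the winner wrapped in a pipeline. This is only an illustration under several assumptions: the data is synthetic rather than training.csv, scikit-learn's GradientBoostingClassifier stands in for XGBoost so that the sketch depends on a single library, the neural network is omitted, and the scaler, hyperparameters, and accuracy metric are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for training.csv: 8 numeric features and 3 classes
# (loss, draw, win), as in the group stage.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# Candidate models; GradientBoostingClassifier is a stand-in for XGBoost.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "ada_boost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 3-fold cross-validated accuracy for each candidate.
scores = {}
for name, model in candidates.items():
    pipeline = Pipeline([("scale", StandardScaler()), ("model", model)])
    scores[name] = cross_val_score(pipeline, X, y, cv=3, scoring="accuracy").mean()

# Refit the best candidate on all data as a single pipeline object.
best_name = max(scores, key=scores.get)
best_pipeline = Pipeline(
    [("scale", StandardScaler()), ("model", candidates[best_name])]
).fit(X, y)

print(scores)
print("best:", best_name)
```

For the knockout stage, the same comparison would run on a two-class target (defeat or victory). A fitted pipeline like `best_pipeline` can be saved with a tool such as joblib and reloaded for inference.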

  • Confusion matrix of the tuned and validated group stage model

download-10

  • Confusion matrix of the tuned and validated knockout stage model

download-9

Predictions

Finally, the Predictions.ipynb notebook uses the inference datasets and the trained models to predict each World Cup match and thus find the tournament winner. Because the models need a home team and a visiting team, the dataset squad_stats.csv, which provides the potential of each team, is used to assign these roles: in each match, the team with the greater potential is treated as the home team.
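The home-team rule can be written as a small helper. The sketch below is hypothetical: the function name, the team labels, and the potential values are invented for illustration, and the tie-breaking behavior (the first team stays home) is an assumption the source does not specify.

```python
def assign_home_away(team_a, team_b, potential):
    """Return (home, away); the team with the higher potential is home.

    Assumption: on a tie, the first team listed keeps the home slot.
    """
    if potential[team_b] > potential[team_a]:
        return team_b, team_a
    return team_a, team_b


# Made-up potential values, standing in for squad_stats.csv.
potential = {"Team A": 80.0, "Team B": 84.5, "Team C": 80.0}

home, away = assign_home_away("Team A", "Team B", potential)
print(home, "hosts", away)
```

The resulting (home, away) pair would then decide which team's ratings fill the home columns and which fill the away columns of the row passed to the trained model.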

qatar_2022_prediction's People

Contributors

davidcamilo0710