Giter VIP home page Giter VIP logo

nfl_database's Introduction

NFL_database

Installing the application

1.) Clone this repository by running the following command,

git clone https://github.com/JamesChapmanNV/NFL_database

2.) Navigate to /src/python/ExampleConfig.ini and modify according to user's postgresql server credentials. Rename the file as 'Config.ini'

3.) Navigate to /src/sql/ and connect to the same postgresql server and database as mentioned in 'Config.ini'

4.) Run the following command to build the tables and function.

\i build.sql

Running the application

1.) Ensure that the python modules mentioned in /src/python/requirements.txt are installed

2.) Navigate to /src/ and execute the following command to display the help and possible options in using the application.

python3 python -h

Work in progress …

Current tables are as follows:

Teams

column data_type constraints
team_name VARCHAR(45) PRIMARY KEY
abbreviation VARCHAR(3) UNIQUE
location VARCHAR(20)
venue_name VARCHAR(55) FOREGIN KEY(Venues.name)
primary_color CHAR(6)
secondary_color CHAR(6)

Positions

column data_type constraints
position_name VARCHAR(20) PRIMARY KEY
abbreviation VARCHAR(2) UNIQUE
platoon VARCHAR(13)

Athletes

column data_type constraints
athlete_id BIGINT PRIMARY KEY
first_name VARCHAR(45)
last_name VARCHAR(45)
dob DATE
height NUMERIC
weight NUMERIC
birth_city VARCHAR(20)
birth_state VARCHAR(20)

Rosters

column data_type constraints
game_id BIGINT FOREIGN KEY(Games.game_id)
team_name VARCHAR(45) FOREIGN KEY(Teams.team_name)
athlete_id BIGINT FOREIGN KEY(Athletes.athlete_id)
position_name VARCHAR(20) FOREIGN KEY(Positions.position_name)
played BOOLEAN
game_id, team_name, athlete_id PRIMARY KEY

Venues

column data_type constraints
venue_name VARCHAR(45) PRIMARY KEY
capacity INT
city VARCHAR(20)
state CHAR(2)
grass BOOLEAN
indoor BOOLEAN

Games

column data_type constraints
game_id BIGINT PRIMARY KEY
date DATE NOT NULL FOREIGN KEY(Season_Dates.date)
attendance INT
home_team VARCHAR(45) FOREIGN KEY(Teams.team_name)
away_team VARCHAR(45) FOREIGN KEY(Teams.team_name)
venue_name VARCHAR(45) FOREIGN KEY(Venues.venue_name)
utc_time TIME

Season_Dates

column data_type constraints
date DATE PRIMARY KEY
season_year NOT NULL NOT NULL
season_type VARCHAR(20) NOT NULL
week INT NOT NULL

Plays

column data_type constraints
play_id BIGINT PRIMARY KEY
quarter INT NOT NULL
yards INT NOT NULL
score_value INT NOT NULL
play_type VARCHAR(45) NOT NULL
text TEXT
seconds_remaining INT
start_down INT NOT NULL
end_down INT NOT NULL

Player_Plays

column data_type constraints
play_id BIGINT FOREIGN KEY(Plays.play_id)
player_id BIGINT FOREIGN KEY(Athletes.athlete_id)
game_id BIGINT NOT NULL
type VARCHAR(20) NOT NULL
play_id, player_id, type PRIMARY KEY

Linescores

column data_type constraints
team_name VARCHAR(45) FOREIGN KEY(Teams.team_name)
game_id BIGINT FOREIGN KEY(Games.game_id)
quarter INT FOREIGN KEY(Games.quarter)
score INT NOT NULL
team_name, game_id, quarter PRIMARY KEY
Open In Colab

nfl_database's People

Contributors

cazumbaugh3 avatar jameschapmannv avatar vishnuteja97 avatar

Stargazers

 avatar

Watchers

 avatar  avatar

Forkers

cazumbaugh3

nfl_database's Issues

player_id

at some point it might be nice to replace player_plays.player_id -> athlete_id everywhere that it shows up

Missing games in player_plays data

There are just over 300 games in the player plays table that do not have an associated game in the games table. For now, these were just removed and the game_ids are saved in the missing_games.csv file.

Play ID in player_plays is missing from plays

There is at least one record in player_plays that does not have a matching play_id in the plays table. The play_id is 40055421436, and this is causing the import to fail due to a foreign key violation.

Implement favoriting

Users can favorite teams and athletes. This needs to be implemented, along with the ability to change their favorite.

Issues/To-Dos

3/21/2024

1.) Venues table - Drop zipcode, combine city and state into a single field
2.) Rosters table - Remove active. change didnotplay -> played and flip the values
3.) Teams table - Remove teamid. Use teamname as primary key (simplifying assumption that team name is unique identifier)
4.) Games table - Remove name and shortname.
5.) Athletes table - Remove drafted_bool
6.) Positions table, Venue table - Remove numeric id with names as keys
7.) Teams table - Remove AFC, NFC (Irvin, Rice maybe as well)
8.) Plays table - Add start_down, end_down

Some athletes have multiple jersey numbers

Some of the athletes have multiple jersey numbers, presumably from playing on multiple teams (see below). Should we move the jersey number to the rosters table?

 firstname | lastname  |   birth_place   |   birth_place   | jersey | jersey 
-----------+-----------+-----------------+-----------------+--------+--------
 De'Angelo | Henderson | Summerville, SC | Summerville, SC |     32 |     33
 De'Angelo | Henderson | Summerville, SC | Summerville, SC |     33 |     32
 Tom       | Johnson   | Moss Point, MS  | Moss Point, MS  |     92 |     96
 Tom       | Johnson   | Moss Point, MS  | Moss Point, MS  |     96 |     92
 Brandon   | Scherff   | Denison, IA     | Denison, IA     |     68 |     75
 Brandon   | Scherff   | Denison, IA     | Denison, IA     |     75 |     68
 Willie    | Smith     | Smithfield, NC  | Smithfield, NC  |     71 |     69
 Willie    | Smith     | Smithfield, NC  | Smithfield, NC  |     69 |     71

Athletes that are in player_plays are missing from the athletes table

We have ~9,000 athletes that appear in the player plays data but are not present in the athletes data. The file 'data/missing_athletes.csv' contains a list of their IDs that can be used to pull info on them. For now, these players were excluded from the player_plays data. Once we get their data we can add it back.

Venues need capacity

Need to add capacity for each venue in the venues table. Currently listed as 0 for all.

Duplicate data in athletes table

The athletes table contains duplicate entries for several athletes. Needs fix in the following tables:

  • Athletes
  • Rosters
  • Player_Plays

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.