A case study of a 1000-movie IMDB dataset
For this project I used the 'IMDB-Movie-Data.csv' dataset from Kaggle. It contains features such as Title, Genre, Description, Director, and Rating.
First, I loaded the necessary libraries, then loaded the data using pandas. In this project I explore several insights, such as classifying movies by rating (Excellent, Good, and Average), finding the average rating of movies by year, and displaying the top 10 highest-revenue movie titles.
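The loading step described above can be sketched roughly as follows. This is a minimal sketch, assuming the CSV sits in the working directory under the file name given above; the helper name `load_movies` is hypothetical and not taken from the project itself.

```python
import pandas as pd


def load_movies(path="IMDB-Movie-Data.csv"):
    """Read the Kaggle CSV into a pandas DataFrame.

    `path` may be a file path or any file-like object that
    pandas.read_csv accepts. The default file name is the one
    mentioned in this README; its location is an assumption.
    """
    return pd.read_csv(path)
```

A typical call would be `data = load_movies()`, after which `data.head(10)` shows the first 10 rows.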
Steps performed in this project:
- Display Top 10 Rows of The Dataset
- Check Last 10 Rows of The Dataset
- Find Shape of Our Dataset (Number of Rows And Number of Columns)
- Getting Information About Our Dataset Like Total Number Rows, Total Number of Columns, Datatypes of Each Column And Memory Requirement
- Check Missing Values In The Dataset
- Drop All The Missing Values
- Check For Duplicate Data
- Get Overall Statistics About The DataFrame
- Display Title of The Movie Having Runtime Greater Than or equal to 180 Minutes
- In Which Year There Was The Highest Average Voting?
- In Which Year There Was The Highest Average Revenue?
- Find The Average Rating For Each Director
- Display Top 10 Lengthy Movies Title and Runtime
- Display Number of Movies Per Year
- Find Most Popular Movie Title (Highest Revenue)
- Display Top 10 Highest Rated Movie Titles And its Directors
- Display Top 10 Highest Revenue Movie Titles
- Find Average Rating of Movies Year Wise
- Does Rating Affect The Revenue?
- Classify Movies Based on Ratings [Excellent, Good, and Average]
- Count Number of Action Movies
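
Several of the steps listed above could be expressed with pandas along the following lines. This is a sketch under stated assumptions, not the project's actual notebook code: the column names `Year`, `Votes`, `Runtime (Minutes)`, and `Revenue (Millions)` are assumed from the usual layout of this Kaggle file, the function names are hypothetical, and the rating thresholds (7.0 and 6.0) are illustrative choices that the project description does not specify.

```python
import pandas as pd

# Assumed column names; adjust them if the CSV uses different headers.
RUNTIME = "Runtime (Minutes)"
REVENUE = "Revenue (Millions)"


def inspect_and_clean(df):
    """Print a quick overview, then return a copy without missing values."""
    print(df.head(10))            # first 10 rows
    print(df.tail(10))            # last 10 rows
    print(df.shape)               # (number of rows, number of columns)
    df.info()                     # dtypes, non-null counts, memory usage
    print(df.isnull().sum())      # missing values per column
    print(df.duplicated().any())  # True if any row is duplicated
    print(df.describe())          # overall statistics for numeric columns
    return df.dropna(axis=0)      # drop every row with a missing value


def long_movies(df, minutes=180):
    """Titles of movies whose runtime is at least `minutes`."""
    return df.loc[df[RUNTIME] >= minutes, "Title"]


def best_year_by(df, column):
    """Year with the highest mean of `column` (e.g. "Votes" or REVENUE)."""
    return df.groupby("Year")[column].mean().idxmax()


def director_ratings(df):
    """Average rating per director, highest first."""
    return df.groupby("Director")["Rating"].mean().sort_values(ascending=False)


def top_n(df, by, n=10, columns=("Title",)):
    """Top `n` rows ranked by column `by`, showing `columns` plus `by`."""
    return df.nlargest(n, by)[list(columns) + [by]]


def movies_per_year(df):
    """Number of movies released in each year."""
    return df["Year"].value_counts()


def rating_revenue_corr(df):
    """Pearson correlation between rating and revenue.

    One possible way to approach "does rating affect revenue";
    correlation alone does not establish cause.
    """
    return df["Rating"].corr(df[REVENUE])


def classify_rating(rating):
    """Label a rating; the 7.0 and 6.0 cut-offs are illustrative assumptions."""
    if rating >= 7.0:
        return "Excellent"
    if rating >= 6.0:
        return "Good"
    return "Average"


def count_genre(df, genre="Action"):
    """Count movies whose Genre string mentions `genre`."""
    return int(df["Genre"].str.contains(genre, case=False, na=False).sum())
```

For example, `top_n(data, "Rating", columns=("Title", "Director"))` would list the 10 highest-rated titles with their directors, `data.groupby("Year")["Rating"].mean()` gives the year-wise average rating, and `data["Rating"].apply(classify_rating)` produces the Excellent/Good/Average labels.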