Investigate_TMDb_Movies


Project introduction

In this project, you will analyze a dataset and then communicate your findings about it. You will use the Python libraries NumPy, pandas, and Matplotlib to make your analysis easier.



What do I need to install?

pandas
NumPy
Matplotlib
csv (part of the Python standard library, so no installation is needed)

   pip install pandas numpy matplotlib
   

Overview 👈

In this project, you'll go through the data analysis process and see how everything fits together. Later Nanodegree projects will focus on individual pieces of the data analysis process.

You'll use the Python libraries NumPy, pandas, and Matplotlib, which make writing data analysis code in Python a lot easier! Not only that, these skills are sought after by employers!


What will I learn?

After completing the project, you will:
  • Know all the steps involved in a typical data analysis process

  • Be comfortable posing questions that can be answered with a given dataset and then answering those questions

  • Know how to investigate problems in a dataset and wrangle the data into a format you can use

  • Have experience communicating the results of your analysis

  • Be able to use vectorized operations in NumPy and pandas to speed up your data analysis code

  • Be familiar with pandas' Series and DataFrame objects, which let you access your data more conveniently

  • Know how to use Matplotlib to produce plots showing your findings
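
As a small illustration of the vectorized operations and DataFrame objects mentioned in the list above, here is a minimal sketch. The budget and revenue figures are invented for the example and do not come from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical budget and revenue figures, invented for illustration.
budget = np.array([10.0, 25.0, 40.0])
revenue = np.array([30.0, 20.0, 100.0])

# Vectorized: one expression operates on every element at once.
profit = revenue - budget

# Equivalent explicit loop, shown only for comparison.
profit_loop = np.array([r - b for r, b in zip(revenue, budget)])

# The same idea with a pandas DataFrame: column arithmetic is vectorized too.
movies = pd.DataFrame({"budget": budget, "revenue": revenue})
movies["profit"] = movies["revenue"] - movies["budget"]

print(movies)
```

The vectorized form is shorter and, on large arrays, usually much faster than the loop.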


Steps of the project


Dataset

The dataset link opens a document with links and information about the datasets that you can investigate for this project. You must choose one of these datasets to complete the project.

This project uses the TMDb_Movies dataset.
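
Once a dataset is chosen, a first look at it might go roughly as follows. This is only a sketch: the file name `tmdb-movies.csv` and the column names are assumptions, and a tiny in-memory sample stands in for the real file so that the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the real CSV file; the rows and column names are invented.
sample_csv = io.StringIO(
    "id,original_title,budget,revenue,release_date\n"
    "1,Movie A,100,250,2015-06-09\n"
    "2,Movie B,50,,2014-11-05\n"
)

# With the real file this would be something like
# pd.read_csv("tmdb-movies.csv"), where the file name is an assumption.
df = pd.read_csv(sample_csv)

print(df.shape)           # number of rows and columns
print(df.head())          # first few rows
df.info()                 # column types and non-null counts
print(df.isnull().sum())  # missing values per column
```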

Data cleaning

  • drop duplicate rows
   df.drop_duplicates(inplace=True)
  • drop rows with missing values
   df.dropna(inplace=True)

  • fix the data format
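
The cleaning steps above can be sketched on a small invented DataFrame. The column names are assumptions. This sketch fills missing numeric values with the column mean using `fillna`; calling `dropna` instead would remove those rows. Which one is appropriate depends on the column and on how much data is missing.

```python
import numpy as np
import pandas as pd

# Invented sample with one duplicate row and one missing budget.
df = pd.DataFrame({
    "original_title": ["Movie A", "Movie A", "Movie B", "Movie C"],
    "budget": [100.0, 100.0, np.nan, 300.0],
    "release_date": ["2015-06-09", "2015-06-09", "2014-11-05", "2013-01-20"],
})

# 1. Drop exact duplicate rows and renumber the index.
df = df.drop_duplicates().reset_index(drop=True)

# 2. Fill the missing numeric value with the column mean.
df["budget"] = df["budget"].fillna(df["budget"].mean())

# 3. Fix the data format: parse date strings into datetime values.
df["release_date"] = pd.to_datetime(df["release_date"])

print(df)
print(df.dtypes)
```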

Analyze Your Data

Brainstorm some questions you could answer using the dataset you chose, then start answering those questions. You can find some questions in the dataset options to help you get started.

Try to suggest questions that promote looking at relationships between multiple variables. You should aim to analyze at least one dependent variable and three independent variables in your investigation. Make sure you use NumPy and pandas where they are appropriate!
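
As one possible way to look at relationships between variables, the sketch below treats `revenue` as the dependent variable and `budget`, `popularity`, and `release_year` as independent variables. The column names and all values are invented for illustration, and correlation is only one of several reasonable starting points.

```python
import pandas as pd

# Invented sample data; the column names are assumptions.
df = pd.DataFrame({
    "release_year": [2013, 2013, 2014, 2014, 2015],
    "budget": [10.0, 20.0, 30.0, 40.0, 50.0],
    "popularity": [1.0, 2.0, 2.5, 3.0, 5.0],
    "revenue": [20.0, 40.0, 60.0, 80.0, 100.0],
})

# Correlation of each independent variable with the dependent variable.
independent = ["budget", "popularity", "release_year"]
correlations = df[independent].corrwith(df["revenue"])
print(correlations)

# Average revenue per release year, using a vectorized groupby.
mean_revenue_by_year = df.groupby("release_year")["revenue"].mean()
print(mean_revenue_by_year)
```

A strong correlation here only suggests a relationship worth investigating; it does not show that one variable causes the other.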


end 🙋

Contributors

ahmed-hassan97
