Scraping Netflix - The Unofficial Way!

A Python class that scrapes information about Netflix movies that are available for streaming. Used by [www.tomatoflix.com][1].

Warning: Netflix's APIs have changed and this package has not yet been updated to follow those changes.

Why?

This project started because I wanted to create [Tomatoflix][1], an interactive website that helps lazy people like me find random Netflix movies to watch. I was surprised to find out that Netflix had made its API private, so I took matters into my own hands and decided to forge a Netflix API of my own.

Requirements

  • Python 3
  • Modules:
    • BeautifulSoup
    • Requests
    • Fuzzywuzzy (Not very impressed with this one... Open to fuzzy matching alternatives. But it'll do for now.)
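
For anyone weighing fuzzy matching alternatives, Python's standard library ships `difflib`, which can rank titles by similarity without an extra dependency. The following is a minimal sketch, not what this package does internally; `suggest_titles` and the sample title list are hypothetical names made up for illustration.

```python
from difflib import get_close_matches


def suggest_titles(query, titles, limit=3, cutoff=0.6):
    """Return up to `limit` titles that resemble `query`, best match first.

    Hypothetical helper: matching is case-insensitive, and `cutoff` is the
    minimum similarity ratio (0 to 1) a title needs to be suggested.
    """
    # Map lowercased titles back to their original spelling.
    by_lower = {title.lower(): title for title in titles}
    matches = get_close_matches(query.lower(), list(by_lower), n=limit, cutoff=cutoff)
    return [by_lower[match] for match in matches]


# Made-up sample data; a real list might come from all_titles().
sample_titles = ["Jerry Maguire", "Jurassic Park", "The Matrix"]
print(suggest_titles("jerry mcguire", sample_titles))
```

Raising `cutoff` makes the suggestions stricter; lowering it returns looser partial matches.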

Installation

Clone this repository to your local computer and it should be good to go. I'm currently working on making it installable via pip.

Instructions

from netflix import Netflix

# Insert netflix ID as a raw string
# To find Netflix ID:
# Sign into Netflix > Chrome Developer Tools > Resources > Cookies > www.netflix.com > NetflixId
netflix_id = r'INSERT NETFLIX ID HERE'

movie = Netflix(netflix_id)

# Initialization only has to be done once.
# This method creates the JSON files for all of the major genres; data is pulled from these files later.
movie.initialize()

"""
>Genres were successfully downloaded as JSON files
"""

# search() checks whether the movie is available on Netflix streaming.
# Other methods are chained to search() and return specific information about the movie.
movie.search('Jerry Maguire').duration()
movie.search('Jerry Maguire').netflix_rating()

"""
>Movie was found
>2hr 18m
>3.6 stars
"""

Check out the example.py file. E-mail any specific questions to [email protected]
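
The sample output above shows a duration such as `2hr 18m` and a rating such as `3.6 stars` coming back as strings. If you want numbers for sorting or filtering, a small parser helps. This is only a sketch that assumes those exact formats (the only ones shown here); `duration_to_minutes` and `rating_to_float` are hypothetical helpers, not part of the package, and anything else, such as a season count for a TV show, yields `None`.

```python
import re


def duration_to_minutes(text):
    """Convert a string like '2hr 18m' to total minutes, or None if it does not match."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*hr)?\s*(?:(\d+)\s*m)?\s*", text)
    # Reject strings where neither the hours nor the minutes part was present.
    if not match or not any(match.groups()):
        return None
    hours, minutes = (int(part) if part else 0 for part in match.groups())
    return hours * 60 + minutes


def rating_to_float(text):
    """Convert a string like '3.6 stars' to a float, or None if it does not match."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s*stars?", text)
    return float(match.group(1)) if match else None


print(duration_to_minutes("2hr 18m"))
print(rating_to_float("3.6 stars"))
```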

All Available Functions

[1]: http://www.tomatoflix.com
| Function | Return Data Type | Description |
| --- | --- | --- |
| `initialize(netflix_id_as_string)` | None | Creates a JSON file for each movie, organized by genre. This method has to be run __only once__ and __should not be run after the JSON files have been pulled successfully__. This will minimize your chances of getting "caught" by Netflix (as if they don't know what we're up to...). |
| `all_titles()` | List | Returns a list of every title that's available for streaming on Netflix. Loop through this list to get information about every movie. |
| `search(movie_string)` | None | Checks whether the string matches a movie that is currently available on Netflix. Reports one of the following messages: `Movie was found` or `Movie could not be found. Did you mean any of the following movies?`. If the movie is not found, a list of movies that partially matched the search string is printed to the console. __To get specific information about a movie, the search must first find a match.__ |
| `movie_number()` | Int | Returns the Netflix movie ID number. |
| `genres()` | List | Returns a list of the genres the movie belongs to on Netflix. |
| `title()` | String | Returns the title of the movie. |
| `tv_show()` | String | Returns _"Y"_ if the title is considered a TV show and _"N"_ if it is only a movie. |
| `synopsis()` | String | Returns the synopsis for the movie. |
| `year()` | Int | Returns the year the movie was made. NOTE: This year does not always match the year listed on other movie websites like Rotten Tomatoes. |
| `netflix_rating()` | String | Returns the average Netflix rating for the movie. |
| `cert_rating()` | String | Returns the maturity rating for the movie. |
| `actors_list()` | List | Returns a list of the prominent actors in the movie. |
| `actors_string()` | String | Returns a string of the prominent actors in the movie. |
| `url()` | String | Returns the URL for the movie that is friendly to non-Netflix members. |
| `duration()` | String | Returns the duration of the movie or TV show (hours and minutes, or number of seasons). |
| `box_art()` | String | Returns the URL for the small box art of the movie. |
| `large_box_art()` | String | Returns the URL for the large box art of the movie. NOTE: Because Netflix movie pages use different layouts, this method does not always work. |
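
Because `initialize()` should run only once, it can help to guard the call with a check for JSON files already on disk. The sketch below assumes the genre files are `.json` files sitting in a single directory; this README does not say where they are written, so `data_dir` and the helper name `needs_initialize` are assumptions, not part of the package.

```python
from pathlib import Path


def needs_initialize(data_dir="."):
    """Return True when no .json files exist in `data_dir` (an assumed location)."""
    return not any(Path(data_dir).glob("*.json"))


# Usage (assumes `movie` is a Netflix instance created as shown in Instructions):
# if needs_initialize():
#     movie.initialize()
```

If the directory holds unrelated JSON files, a stricter check on specific file names would be needed.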
