imdbphp

PHP library for retrieving film and TV information from IMDb. Retrieve most of the information you can see on IMDb including films, TV series, TV episodes, people. Search for titles on IMDb, including filtering by type (film, tv series, etc). Download film posters and actor images.

Quick Start

$title = new \Imdb\Title(335266);
$rating = $title->rating();
$plotOutline = $title->plotoutline();

# Find out about the director
$person = new \Imdb\Person($title->director()[0]['imdb']);
$name = $person->name();
$photo = $person->photo();

Installation

This library scrapes imdb.com, so changes to their site can cause parts of this library to fail. You will probably need to update a few times a year. Keep this in mind when choosing how to install/configure.

Get the files with one of:

Requirements

  • PHP >= 5.6
  • PHP cURL extension

Configuration

imdbphp needs no configuration by default but can cache imdb lookups, store images and change languages if configured.

Configuration is done by the \Imdb\Config class in src/Imdb/Config.php which has detailed explanations of all the config options available. You can alter the config by creating the object, modifying its properties then passing it to the constructor for imdb.

$config = new \Imdb\Config();
$config->language = 'de-DE,de,en';
$imdb = new \Imdb\Title(335266, $config);
$imdb->title(); // Lost in Translation - Zwischen den Welten
$imdb->orig_title(); // Lost in Translation

If you're using a git clone you might prefer to configure IMDbPHP by putting an ini file in the conf folder. 900_localconf.sample has some sample settings.

The cache folder is ./cache by default. Requests from imdb will be cached there for a week (by default) to speed up future requests.

Advanced Configuration

Replacing the default cache (disk cache)

You can replace the caching mechanism that ImdbPHP uses with any PSR-16 (simple cache) cache by passing one into the constructor of any ImdbPHP class.

The only piece of imdbphp config that will be used with your cache is the TTL which is set by \Imdb\Config::$cache_expire and defaults to 1 week.

$cache = new \Cache\Adapter\PHPArray\ArrayCachePool();
// Search results will be cached
$search = new \Imdb\TitleSearch(null /* config */, null /* logger */, $cache);
$firstResultTitle = $search->search('The Matrix')[0];
// $firstResultTitle, an \Imdb\Title will also be using $cache for caching any page requests it does
$cache = new \Cache\Adapter\PHPArray\ArrayCachePool();
$title = new \Imdb\Title(335266, null /* config */, null /* logger */, $cache);

Replacing the default logger (which echoes coloured html, and is disabled by default)

The logger will mostly tell you about http requests that failed at error level, each http request at info and some stuff like cache hits at debug.

$logger = new \Monolog\Logger('name');
$title = new \Imdb\Title(335266, null /* config */, $logger);

Searching for a film

// include "bootstrap.php"; // Load the class in if you're not using an autoloader
$search = new \Imdb\TitleSearch(); // Optional $config parameter
$results = $search->search('The Matrix', array(\Imdb\TitleSearch::MOVIE)); // Optional second parameter restricts types returned

// $results is an array of Title objects
// The objects will have title, year and movietype available
// immediately, but any other data will have to be fetched from IMDb
foreach ($results as $result) { /* @var $result \Imdb\Title */
    echo $result->title() . ' ( ' . $result->year() . ')';
}

Searching for a person

// include "bootstrap.php"; // Load the class in if you're not using an autoloader
$search = new \Imdb\PersonSearch(); // Optional $config parameter
$results = $search->search('Forest Whitaker');

// $results is an array of Person objects
// The objects will have name and imdbid available, everything else must be fetched from IMDb
foreach ($results as $result) { /* @var $result \Imdb\Person */
    echo $result->name();
}

Demo site

The demo site gives you a quick way to make sure everything's working, some sample code and lets you easily see some of the available data.

From the demo folder in the root of this repository, start PHP's built-in web server and browse to http://localhost:8000

php -S localhost:8000

Gotchas / Help

SSL certificate problem: unable to get local issuer certificate

Windows

The cURL library either hasn't come bundled with the root SSL certificates or they're out of date. You'll need to set them up:

  1. Download cacert.pem
  2. Store it somewhere in your computer.
    C:\php\extras\ssl\cacert.pem
  3. Open your php.ini and add the following under [curl]
    curl.cainfo = "C:\php\extras\ssl\cacert.pem"
  4. Restart your webserver.

Linux

By default cURL uses the certificate authority file that ships with your Linux distribution, so that file is probably out of date. Look for instructions for your OS to update the CA file, or update your distro.

Configure languages

IMDb is sometimes unsure which language to use if you only specify a single language and territory code (de-DE). The example below includes de-DE (German, Germany), de (German) and en (English). If IMDb can't find anything matching German (Germany), you will get German results instead, or English if there is no German translation.

$config = new \Imdb\Config();
$config->language = 'de-DE,de,en';
$imdb = new \Imdb\Title(335266, $config);
$imdb->title(); // Lost in Translation - Zwischen den Welten
$imdb->orig_title(); // Lost in Translation

Please use the Unicode Consortium Language-Territory Information database to find your language and territory codes.

Language  Code  Territory    Code
German    de    Germany {O}  DE

After you have found your language and territory codes you need to combine them. Start with the language code (de), add a separator (-) and then your territory code (DE): de-DE. Next add the bare language code (de): de-DE,de. Finally add English (en): de-DE,de,en.
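
The combination steps above can be sketched as a small helper. This is only an illustration: the function name buildLanguageList is hypothetical and not part of imdbphp, and the resulting string is what you would assign to \Imdb\Config::$language.

```php
<?php
/**
 * Build a language list such as "de-DE,de,en".
 * Hypothetical helper, not part of imdbphp.
 */
function buildLanguageList($language, $territory)
{
    $language = strtolower($language);
    $parts = array(
        $language . '-' . strtoupper($territory), // most specific first
        $language,                                // then the bare language
        'en',                                     // English as the last fallback
    );
    // array_unique avoids "en-US,en,en" when the language is already English
    return implode(',', array_unique($parts));
}

echo buildLanguageList('de', 'DE'), "\n"; // de-DE,de,en
```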

imdbphp's People

Contributors

amirsasani, athornstrom, buborh, duck7000, eugenedan, fossil01, izzysoft, jetrosuni, jreklund, mam4dali, miaadp, nhellfire, paxter, polakosz, ppardalj, romansixty, sebastian-king, sebastienaubry, sruell, tboothman

imdbphp's Issues

Error for wrong iMDB id

I have several mistyped iMDB ids in my DB. When these ids are used with imdbPHP I get an error saying
Imdb\Exception\Http: Uncaught exception 'Imdb\Exception\Http' with message 'Failed to retrieve url [http://akas.imdb.com/title/tt1223602/]. Status code [404]' in /'path_to_imdbPHP'/src/Imdb/Pages.php

This error breaks my script. It would be very useful if there was a small check function in imdbPHP which could be used to verify that the iMDB id we are using corresponds to a valid page. Is this something you would consider incorporating?
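
Until such a check exists, one possible workaround is to wrap the lookup and treat an exception as "invalid id". The sketch below is an assumption, not library code: fetchOrNull is a hypothetical helper, and it catches the generic \Exception so that it runs on its own. With imdbphp you would probably narrow the catch to \Imdb\Exception\Http, the class named in the error message.

```php
<?php
/**
 * Run a lookup and return null instead of letting an exception end the script.
 * Hypothetical helper, not part of imdbphp.
 */
function fetchOrNull(callable $fetch)
{
    try {
        return $fetch();
    } catch (\Exception $e) {
        // With imdbphp you would probably narrow this to \Imdb\Exception\Http.
        return null;
    }
}

// Usage sketch (needs imdbphp and network access, so it is left commented out):
// $name = fetchOrNull(function () {
//     $title = new \Imdb\Title(1223602);
//     return $title->title();
// });
// if ($name === null) { /* treat the id as invalid */ }
```

Note that this costs one page request per id, so it is better suited to a one-off clean-up of the database than to a check on every page view.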

New iMDB layout is back!!!

The new iMDB layout is back and this time it seems to have replaced the old one..!!
Some info such as the poster url and the number of voters cannot be parsed correctly.

Series Writing Credits

Since a few weeks ago, the writing() method does not work anymore with TV series and returns an empty string: for those, IMDb updated the string to parse from "Writing Credits" to "Series Writing Credits".

Thanks!

Request: method Alternate Versions

My apologies if this is not the right way to ask

Is it possible to get a method for Alternate Versions?
It seems to be not in the class afaik

Thanks in advance
Ed

Recommendations year last entry not found

If i want the recommendations of the movie 1408 http://www.imdb.com/title/tt0450385/
Then the last entry (the movie Mirrors) has no year because of a small change in that title.

All titles are like this: Name (year) but the last one looks like this: Mirrors I (2008)

Source code looks like this:

<div class="rec_details">
         <div class="rec-info">

           <div class="rec-jaw-upper">  

     <div class="rec-title">
       <a href="/title/tt0790686/?ref_=tt_rec_tt"><b>Mirrors</b></a>
        <span>I</span>
            <span class="nobr">(2008)</span>
   </div> 

Thanks
Ed
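
One way to cope with the extra span is to make it optional in the pattern. The sketch below is written against the markup quoted in this issue only; parseRecTitle is a hypothetical name and the pattern is not taken from the library.

```php
<?php
/**
 * Pull id, title and year out of a "rec-title" block.
 * Returns null when the markup does not match.
 */
function parseRecTitle($html)
{
    $pattern = '!<div class="rec-title">\s*'
        . '<a href="/title/tt(\d+)/[^"]*"><b>(.*?)</b></a>\s*'
        . '(?:<span>[^<]*</span>\s*)?'            // optional "I", "II", ... marker
        . '<span class="nobr">\((\d{4})\)</span>!s';
    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }
    return array('imdbid' => $m[1], 'title' => $m[2], 'year' => $m[3]);
}

$sample = '<div class="rec-title">
       <a href="/title/tt0790686/?ref_=tt_rec_tt"><b>Mirrors</b></a>
        <span>I</span>
            <span class="nobr">(2008)</span>
   </div>';

print_r(parseRecTitle($sample));
```

The same pattern still matches entries without the numeral, because the non-capturing group is optional.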

Photo regex requires increased pcre.backtrack_limit

The photo() calls weren't working on my system. When I increased:

ini_set('pcre.backtrack_limit',1000000);

They would work. The default is 100000 and the imdb page must be longer than that.

The ".*" in the thumbphoto regex was the cause.

Changing to ".*?" (non greedy) worked.

imdb.class.php:

protected function thumbphoto() {
    $this->getPage("Title");
    # preg_match('!id="img_primary">.*?<img [^>]+src="(.+?)".*<td id="overview-top"!ims',$this->page["Title"],$match);
    preg_match('!id="img_primary">.*?<img [^>]+src="(.+?)".*?<td id="overview-top"!ims',$this->page["Title"],$match);
    if (empty($match[1])) return FALSE;
    $this->main_thumb = $match[1];
    if ( preg_match('|(.*\._V1).*|iUs',$match[1],$mo) ) {
        $this->main_photo = $mo[1];
        return true;
    }
    else return FALSE;
}
```
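
A related point is that preg_match() returns false (not 0) when PCRE itself gives up, so a limit problem can be told apart from "no photo". The sketch below uses the lazy pattern from this issue on a made-up page fragment; extractPrimaryImage is a hypothetical name, not a library method.

```php
<?php
/**
 * Extract the first image URL after id="img_primary" using lazy quantifiers.
 * Throws when PCRE itself fails (for example on a backtrack limit), so a
 * limit problem is not mistaken for "no photo".
 */
function extractPrimaryImage($html)
{
    $pattern = '!id="img_primary">.*?<img [^>]+src="(.+?)".*?<td id="overview-top"!ims';
    $result = preg_match($pattern, $html, $match);
    if ($result === false) {
        throw new \RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result === 1 ? $match[1] : null;
}

// Made-up page fragment followed by a lot of padding, to mimic a long page.
$html = '<td id="img_primary"><a href="#"><img alt="x" src="http://example.com/p.jpg" /></a></td>'
    . '<td id="overview-top">...</td>'
    . str_repeat('<p>filler</p>', 20000);

echo extractPrimaryImage($html), "\n";
```

Because both wildcards are lazy, the match ends at the first overview-top cell and never scans the padding that follows it.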

Error when following 301 redirects

The function call in getWebPage() in mdb_base.class.php#130 is using $target retrieved from the HTTP header field location. But the location does not contain the full URL, only "/title/tt#######/". Below is the header for http://www.imdb.com/title/tt2768262/

301 Moved Permanently
content-length:0
vary:User-Agent
server:Server
location:/title/tt2386868/
date:Thu, 09 Oct 2014 20:06:09 GMT
x-frame-options:SAMEORIGIN
p3p:policyref="http://i.imdb.com/images/p3p.xml",CP="CAO DSP LAW CUR ADM IVAo IVDo CONo OTPo OUR DELi PUBi OTRi BUS PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA HEA PRE LOC GOV OTC "
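
A possible fix is to resolve the Location value against the URL that was requested before following it. The sketch below is an assumption about how that could look, not the code in mdb_base.class.php; resolveLocation is a hypothetical name and it only handles absolute URLs and host-relative paths such as the one in the header above.

```php
<?php
/**
 * Turn a Location header value into an absolute URL.
 * Only handles absolute URLs and host-relative paths ("/title/...");
 * it is not a full RFC 3986 resolver.
 */
function resolveLocation($requestUrl, $location)
{
    if (preg_match('!^https?://!i', $location)) {
        return $location; // already absolute
    }
    $parts = parse_url($requestUrl);
    $base = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $base .= ':' . $parts['port'];
    }
    return $base . '/' . ltrim($location, '/');
}

echo resolveLocation('http://www.imdb.com/title/tt2768262/', '/title/tt2386868/'), "\n";
```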

YUM install imdbphp without Apache

Hi.
My server runs on Centos 7 with NGiNX & PHP-FPM
I tried installing imdbphp2 via the IzzySoft repo but I noticed that I was also getting httpd & httpd-tools installed.

Is there a way to use the repository to install and keep updated imdbphp without having to install httpd & httpd-tools?

Thanks!

Remove html entities from results

While most people will use this data and dump it into a web document ... the library should be agnostic to that. It should turn any html entities into their utf8 equivalent.
Pretty sure that IMDb is entirely utf8 now... need to make sure though.
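
For illustration, PHP's html_entity_decode() can do this conversion. The wrapper name decodeEntities is hypothetical, and the sketch assumes the scraped text is already UTF-8.

```php
<?php
/**
 * Decode HTML entities into UTF-8 text.
 * ENT_QUOTES also converts &#039; and &quot;.
 */
function decodeEntities($text)
{
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo decodeEntities('8 wins &amp; 17 nominations.'), "\n";
```

Callers that print the result into a web page would then need to escape it again with htmlspecialchars().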

Limit array

Hello, how can I limit the number of directors and actors returned
when looking up a movie or series
($imdb->director(), $imdb->cast())?
They return a very large list of names.
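
Since these methods return plain arrays, one simple option is to trim the result with array_slice(). The sketch below uses a stand-in array rather than a live lookup; the third entry is made up for the example.

```php
<?php
// Stand-in for the array returned by cast(); the keys follow the
// 'imdb' / 'name' / 'role' layout shown elsewhere on this page.
$cast = array(
    array('imdb' => '0000132', 'name' => 'Claire Danes', 'role' => 'Carrie Mathison'),
    array('imdb' => '0001597', 'name' => 'Mandy Patinkin', 'role' => 'Saul Berenson'),
    array('imdb' => '0000000', 'name' => 'Example Person', 'role' => 'Example Role'), // made up
);

// Keep only the first two entries; array_slice copes with shorter arrays too.
$topCast = array_slice($cast, 0, 2);

foreach ($topCast as $member) {
    echo $member['name'], "\n";
}
```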

Add ability to replace cache

  1. I have checked out briefly PSR-6 cache interfaces and they seem over-complicated. Can we have our simple interface instead? Something like that:
interface CacheInterface
{
    public function get($key);

    public function set($key, $value);
}
  2. Can we use a dependency injection container for both the cache and the logger?
    Right now we have method setLogger, maybe we can have similar method setCache as well?
    The only drawback in this case is that we have to code the following steps: a) set log for MdbBase (i.e. Title), b) set log for cache c) set cache for Title.
    When using dependency injection container we have to set dependency only once, and we can have, for example, 2 additional config variables: cachetype and logtype which are class names for cache and logger

I can work on those points and make a pull request soon
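
To make the proposal concrete, here is a minimal in-memory implementation of the interface suggested in this issue. It is a sketch only: ArrayCache is a hypothetical class, the cache key is made up, and the README section on replacing the cache describes PSR-16 rather than this interface.

```php
<?php
// The interface proposed above, reproduced so the sketch runs on its own.
interface CacheInterface
{
    public function get($key);

    public function set($key, $value);
}

// Hypothetical in-memory implementation; data lives only for the current request.
class ArrayCache implements CacheInterface
{
    private $items = array();

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }

    public function set($key, $value)
    {
        $this->items[$key] = $value;
    }
}

$cache = new ArrayCache();
$cache->set('title.0335266', '<html>...</html>');
echo $cache->get('title.0335266'), "\n";
```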

Additional info in roles

I noticed this in the changelog:

Fix cast() parsing of role. The role field no longer contains anything other than the name of the role played

So, am I reading it right in that this is intentional, or is this an unfixed bug? I really think that this is pretty important information...

The poster and images variables are empty

The poster and images variables are empty.
I think it is unrelated to the cache, although I have set up the cache dir correctly as described in the documentation.

print_r ($movie);
echo $movie->photo_localurl().$movie->title().$movie->photo().$movie->thumbphoto();exit;

The following calls returned empty:
$movie->photo_localurl().$movie->title().$movie->photo().$movie->thumbphoto();

Please advise

movies_actor() returns all types of titles (not just movies)

It would be awesome if we could have movies_actor() just for actual movies, and tvshows_actor() for instance for tvshows. Alternatively, a title_actor() method that would include a field indicating the type of the title (Movie, TvShow or TvEpisode).

I believe this is feasible, since by just looking at a person page (e.g. http://www.imdb.com/name/nm0933988/) we can deduce the type of each title that person had a role in. My point is that it does not seem to require a further page request for each title entry in the Person filmography.

Would it be possible for you to consider implementing this?

Default config throws exception when used

Default config should work so it's very easy to get started.

Disable caching by default? (will people use caching then? will people notice their caching which maybe used to work now doesn't?)
Add a cache folder to the repo? (will probably cause a notice when using composer install/update as the folder has changed ... but they should change their config anyway)

PHP error if no cast

Minor issue, but if a movie has no cast available, the script throws up an error.

Warning: Invalid argument supplied for foreach() in imdb.class.php on line 1185
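
This warning usually means a value that is not an array reached foreach. A common guard is to have the parsing function always return an array. The sketch below shows the idea with a hypothetical parseCast() and invented markup; it is not the code at line 1185.

```php
<?php
// Hypothetical stand-in for a parser that found no cast table.
function parseCast($html)
{
    if (!preg_match_all('!<td class="name">(.*?)</td>!s', $html, $matches)) {
        return array(); // always return an array, even when nothing matched
    }
    return $matches[1];
}

foreach (parseCast('<p>No cast listed</p>') as $name) {
    echo $name, "\n"; // never reached for an empty cast
}
```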

Undefined constant errors

When setting a custom configuration undefined constant errors are thrown.

require './vendor/autoload.php';
$config = new \mdb_config();

Gives Notice: Use of undefined constant NO_ACCESS - assumed 'NO_ACCESS'

Top250 not working

Hi there

I'm trying to get the top250 but I think IMDb changed its site again because it's not working.

On their site it is now listed as "Top Rated Movies #number" if the movie is on the top 250 list.
I'm not very good at PHP preg_match but I think I'm heading in the right direction with this?

@preg_match('!<a href="[^"]*/chart/top\?(.*?)><strong> Top Rated Movies #(\d+)\s*</strong></a>!si'

But it doesn't work at all

I used this movie as example: http://www.imdb.com/title/tt0066921/
The source code of the imdb title page (This is the full <div> where the info is listed) looks like this:

<div class="article highlighted" id="titleAwardsRanks">
          <strong>
<a href="/chart/top?ref_=tt_awd"
> Top Rated Movies #80
</a>          </strong>
          |

    <span itemprop="awards">
      <b>
        Nominated for
        4
        Oscars.
      </b>
    </span>
    <span itemprop="awards">
        Another
      8 wins &amp; 17 nominations.
    </span>
    <span class="see-more inline">
<a href="/title/tt0066921/awards?ref_=tt_awd"
class="btn-full" >See more awards</a>&nbsp;&raquo;    </span>
      </div>

I need some help to get this fixed
Thanks
Ed

PS: I commented on this in an earlier issue but later realized that it was closed.
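
In the quoted markup the strong tag wraps the link rather than sitting inside it, and the closing bracket of the link is on its own line, which is why the attempt above finds nothing. A pattern along the following lines matches the sample; it is a sketch against this one fragment, and parseTopRank is a hypothetical name.

```php
<?php
/**
 * Return the "Top Rated Movies" rank from a title page, or 0 when absent.
 */
function parseTopRank($html)
{
    // \s* before and after ">" allows for the line break inside the link tag.
    $pattern = '!<a href="/chart/top\?[^"]*"\s*>\s*Top Rated Movies #(\d+)!i';
    return preg_match($pattern, $html, $m) === 1 ? (int) $m[1] : 0;
}

$sample = '<div class="article highlighted" id="titleAwardsRanks">
          <strong>
<a href="/chart/top?ref_=tt_awd"
> Top Rated Movies #80
</a>          </strong>';

echo parseTopRank($sample), "\n"; // 80
```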

Error if cache folder is not created

If the cache folder does not exist, the application crashes.
Therefore I've added this little piece of code to the project where I'm using your great work.

/*
 * Check if the cache folder exists in the imdbphp api.
 * if it doesn't - then create it!
 */
if (!file_exists('../includes/api/imdbphp/cache'))
{
    mkdir('../includes/api/imdbphp/cache', 0777, true);
}

Anyway, Thanks for sharing this!
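
A slightly more defensive variant of the same idea is sketched below. The helper name ensureDirectory and the temporary path are assumptions for the example; the extra is_dir() check covers two requests trying to create the folder at the same moment.

```php
<?php
/**
 * Create a directory if it is missing. Hypothetical helper name.
 */
function ensureDirectory($dir)
{
    // The second is_dir() covers the case where another request created
    // the directory between the first check and mkdir().
    if (!is_dir($dir) && !@mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new \RuntimeException('Could not create directory: ' . $dir);
    }
}

$cacheDir = sys_get_temp_dir() . '/imdbphp-cache-example';
ensureDirectory($cacheDir);
var_dump(is_dir($cacheDir));
```

Mode 0775 is used here instead of 0777 so the folder is not world-writable; adjust it to whatever your web server user needs.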

v3.0.0 inconsistencies

The Makefile hasn't been updated to reflect the huge structural changes implemented with v3.0.0, hence:

  • .deb/.rpm builds are impossible for the release tag, hence
  • the Prod demo (reflecting the latest release) will not be up-to-date

Additional issues due to the restructuring:

  • the Dev demo (reflecting the current master branch) will be broken
  • applications using IMDBPHP won't be able to easily use the new version (no compatibility layer; for examples how those could look like, see e.g. MarkdownExtra: we could have corresponding files in the "old locations" taking care for that, such as imdb.class.php having a pseudo imdb class extending the real one), which especially affects those currently unmaintained (as e.g. phpVideoPro), where such a "compatibility layer" had easily fixed it.

Sorry, @tboothman – but that's why I said we should communicate when you're done with restructuring, but before you merge that into master 😷

Slashes in role

Good job fixing #12 - much appreciated.

However, this does introduce one new bug (or it was a recent bug that I missed). Some roles have slashes in them to separate multiple characters that an actor played in that movie, but any role that has a slash in it gets truncated.

Example, see: http://www.imdb.com/title/tt0462538/fullcredits

role = Marge Simpson / Selma / Patty
gets truncated to
role = Marge Simpson /

no photo available!

version 3.2.0 gets no image/photo out of imdb (eg. /demo/movie.php?mid=0212985). is this a problem of configuration? or a bug?

Error on searching for a person: redirecting to inexistent file imdb.php?mid=0361748

How to recreate the bug:

  1. main interface, search name: Brad pitt, type: name;
  2. press search button;
  3. you get a page like this one:
    http://www.luigiusai.it/moviedatabase/search.php?name=Brad+pitt&searchtype=nm&engine=imdb&mid=
  4. on the "Person Details", press the "Bastardi senza gloria" link, which is similar to this one:
    http://www.luigiusai.it/moviedatabase/imdb.php?mid=0361748

This is the error: the file imdb.php does not exist.

Best regards to all
Luigi, Verona

Title movie in english (2) re-open

Re-opening this topic.
Something does not work in the "person" script
(I set the configuration as in the screenshots).
When I download the filmography, not all of the film titles are in English;
some are in Russian.
I would like them all with the original title in English.
I should point out that I live in Russia.

[Screenshots: ss screen capture 069 to 073]

People - death locations

I couldn't find out an exact reason why, but occasionally, death location would not be returned properly. For an example, see:

http://akas.imdb.com/name/nm0662730/bio

His death location shows up as expected on the page, but the script will not return it.

Fix... imdb_person.class.php, line 417-418, replace with:

        if (!preg_match('|/search/name\?death_place=.*?"\s*>(.*?)<|ims',$match[1],$dloc))
          preg_match('|/search/name\?death_place=.*?"\s*>(.*?)<|ims',$match[1],$dloc);

line 420, replace:

"place"=>@trim(strip_tags($dloc[3]))

with:

"place"=>@trim(strip_tags($dloc[1]))

Method quotes returns nothing

This method seems to be broken, probably due to site changes.

page source looks like this:

<div id="qt0396883" class="quote soda sodavote odd" ><div class="sodatext">
<p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
[<span class="fine">Olin gives Enslin the room key</span>]
Most hotels have switched to magnetics. An actual key. That's a nice touch, it's antiquey. </p>

the preg_match like this

preg_match_all('!class="quote soda (odd|even)"\s*><p>\s*(.*?)\s*</p>\s*<div class=!ims',str_replace("\n"," ",$this->page["Quotes"]),$matches)

I tried to fix it myself but my skills with preg_match are slim.
obviously the class name is wrong but that is as far as i get

Thanks
Ed
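
In the page source quoted above, the class attribute is now "quote soda sodavote odd" and the paragraph sits inside a "sodatext" div, so the old pattern cannot match. The sketch below is written against that fragment only; parseQuotes is a hypothetical name and the pattern is not the one the library ended up using.

```php
<?php
/**
 * Extract quotes as plain text from the "quote soda" blocks.
 */
function parseQuotes($html)
{
    $pattern = '!<div id="qt\d+" class="quote soda[^"]*"\s*>\s*'
        . '<div class="sodatext">\s*<p>\s*(.*?)\s*</p>!is';
    if (!preg_match_all($pattern, $html, $matches)) {
        return array();
    }
    $quotes = array();
    foreach ($matches[1] as $raw) {
        // Drop the markup, then collapse line breaks and runs of spaces.
        $quotes[] = trim(preg_replace('/\s+/', ' ', strip_tags($raw)));
    }
    return $quotes;
}

$sample = <<<'HTML'
<div id="qt0396883" class="quote soda sodavote odd" ><div class="sodatext">
<p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
[<span class="fine">Olin gives Enslin the room key</span>]
Most hotels have switched to magnetics. An actual key. That's a nice touch, it's antiquey. </p>
HTML;

print_r(parseQuotes($sample));
```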

imdb_trailers is broken

Throws all sorts of errors .. doesn't produce a single useable URL

PHP Notice:  Undefined index: fmt_url_map in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 149

Notice: Undefined index: fmt_url_map in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 149
PHP Notice:  Undefined variable: fmt in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 151

Notice: Undefined variable: fmt in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 151
PHP Notice:  Undefined index: Location in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 153

Notice: Undefined index: Location in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 153
PHP Notice:  Undefined offset: 0 in C:\Users\Tom\code\imdbphp\imdb_trailers.class.php on line 156

TV Series cast() not correctly parsed

The html character

&nbsp;

is added to the 'role' index.

example from 'Homeland':

0 => 
    array (size=5)
      'imdb' => string '0000132' (length=7)
      'name' => string 'Claire Danes' (length=12)
      'role' => string '&nbsp;Carrie Mathison
                  (48 episodes, 2011-2014)' (length=64)
      'thumb' => string 'http://ia.media-imdb.com/images/M/MV5BMTMyMzQ1Mjk3M15BMl5BanBnXkFtZTcwNzk3ODMxNw@@._V1_SY44_CR1,0,32,44_AL_.jpg' (length=111)
      'photo' => string 'http://ia.media-imdb.com/images/M/MV5BMTMyMzQ1Mjk3M15BMl5BanBnXkFtZTcwNzk3ODMxNw@@._V1_SY44_CR1,0,32,44_AL_.jpg' (length=111)
  1 => 
    array (size=5)
      'imdb' => string '0001597' (length=7)
      'name' => string 'Mandy Patinkin' (length=14)
      'role' => string '&nbsp;Saul Berenson
                  (48 episodes, 2011-2014)' (length=62)
      'thumb' => string 'http://ia.media-imdb.com/images/M/MV5BMjAzNjU1NTE3NF5BMl5BanBnXkFtZTcwNjIyMzcyNw@@._V1_SY44_CR0,0,32,44_AL_.jpg' (length=111)
      'photo' => string 'http://ia.media-imdb.com/images/M/MV5BMjAzNjU1NTE3NF5BMl5BanBnXkFtZTcwNjIyMzcyNw@@._V1_SY44_CR0,0,32,44_AL_.jpg' (length=111)
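
Until the parser is fixed, callers can clean the value themselves. The sketch below is a workaround under the assumption that the role string looks like the dump above; cleanRole is a hypothetical helper, not a library method.

```php
<?php
/**
 * Normalise a scraped role string: decode entities, then collapse
 * whitespace (including non-breaking spaces) into single spaces.
 */
function cleanRole($role)
{
    $decoded = html_entity_decode($role, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // \x{00A0} is the non-breaking space that &nbsp; decodes to.
    return trim(preg_replace('/[\s\x{00A0}]+/u', ' ', $decoded));
}

echo cleanRole("&nbsp;Carrie Mathison\n                  (48 episodes, 2011-2014)"), "\n";
```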

Title movie english

How do I configure this code to save the titles in English?
I live in Russia, and these titles are saved in Russian.
Thank you

/**
 * IMDB server to use.
 * choices are www.imdb.<lang> with <lang> being one of
 * de|es|fr|it|pt, uk.imdb.com, and akas.imdb.com - the localized ones are
 * only qualified to find the movies IMDB ID (with the imdbsearch class;
 * akas.imdb.com will be the best place to search as it has all AKAs) -- but
 * parsing (with the imdb class) for most of the details will fail for
 * most of the details.
 * @var string imdbsite
 */
public $imdbsite = "akas.imdb.com";

/**
 * Tell IMDB which is the preferred language.
 * Any valid language code can be used here (e.g. en-US, de, pt-BR).
 * If this option is specified, the Accept-Language header with this value
 * will be included in the requests.
 * @var string
 */
public $language = "en-US";

Rewrite using xpath

Any thoughts on updating the script to use xpath, which is a much more robust way of parsing HTML? I had already written some of the functionality using xpath before discovering this library, and it is much easier to maintain. Any interest in this?
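
To give an idea of what that could look like, the sketch below parses the "rec-title" markup quoted in the recommendations issue above with PHP's DOMDocument and DOMXPath. The function name parseRecommendations is hypothetical, and this is not how the library is implemented.

```php
<?php
/**
 * Parse recommendation entries with DOMXPath instead of regular expressions.
 */
function parseRecommendations($html)
{
    $doc = new \DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new \DOMXPath($doc);
    $results = array();
    foreach ($xpath->query('//div[@class="rec-title"]') as $node) {
        $href = $xpath->evaluate('string(a/@href)', $node);
        preg_match('!/title/tt(\d+)/!', $href, $id);
        $results[] = array(
            'imdbid' => isset($id[1]) ? $id[1] : null,
            'title'  => trim($xpath->evaluate('string(a)', $node)),
            // The year sits in its own span, so an extra "I" span is harmless.
            'year'   => trim($xpath->evaluate('string(span[@class="nobr"])', $node), "() \n\t"),
        );
    }
    return $results;
}

$sample = '<div class="rec-title">
       <a href="/title/tt0790686/?ref_=tt_rec_tt"><b>Mirrors</b></a>
        <span>I</span>
            <span class="nobr">(2008)</span>
   </div>';

print_r(parseRecommendations($sample));
```

The queries address elements by name and class rather than by their exact position in the text, so added whitespace or an extra sibling element does not break them.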
