

phpFileDownload - file_get_contents replacement for spiders

file_get_contents works well for static pages, but spider scripts that gather data from real websites need more than it offers.

phpFileDownload has been written to make those tasks convenient:

  • Automatic cookie management
  • Easy submission of POST data
  • Handling of Location redirects
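
For comparison, here is a rough sketch of what a single POST request involves with plain file_get_contents and a stream context. The form fields, cookie value and URL are placeholders invented for illustration. The request itself is left commented out so the snippet runs without a network connection. Note that the Cookie header has to be written by hand, and nothing updates it from the response.

```php
<?php
// Hypothetical form fields, not taken from any real site.
$fields = ['username' => 'alice', 'password' => 'p&ss w%rd'];

// http_build_query URL-encodes each value so '&' and '%' survive the trip.
$body = http_build_query($fields);

$context = stream_context_create([
    'http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n"
                   . "Cookie: session=abc123\r\n", // placeholder cookie, managed by hand
        'content' => $body,
        'follow_location' => 1,
        'max_redirects'   => 5,
    ],
]);

// Inspect what would be sent instead of performing the request.
$options = stream_context_get_options($context);
echo $options['http']['method'], ' ', $options['http']['content'], "\n";

// $html = file_get_contents('http://example.com/login.php', false, $context);
```

Every request in a login sequence would need this setup repeated with an updated Cookie header, which is the bookkeeping the class is meant to hide.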

Prototype

The class is extremely simple. Its get() method works like file_get_contents, with an optional second parameter that carries the POST content. The cookies are stored in a simple associative array, exposed as the public $cookies property.

class FileDownload {
	public function get($url, $post_content = NULL);
	public $cookies;
}
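
To make the associative-array idea concrete, the following is a minimal sketch of how Set-Cookie response headers could be folded into a name => value array and turned back into a Cookie request header. The helpers collectCookies() and cookieHeader() are hypothetical names chosen for this illustration and are not part of phpFileDownload. The actual class may handle cookies differently, and attributes such as path or expires are simply ignored here.

```php
<?php
// Hypothetical helper: merge Set-Cookie response headers into a name => value array.
function collectCookies(array $headerLines, array $cookies = []): array
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive; skip anything that is not Set-Cookie.
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only "name=value"; attributes after the first ';' are dropped.
        $pair = trim(explode(';', substr($line, strlen('Set-Cookie:')), 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = $value;
    }
    return $cookies;
}

// Hypothetical helper: serialise the array back into a Cookie request header.
function cookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Sample response headers, invented for the example.
$headers = [
    'HTTP/1.1 302 Found',
    'Set-Cookie: sessionhash=abc123; path=/; HttpOnly',
    'Set-Cookie: lastvisit=1700000000; expires=Fri, 01-Jan-2100 00:00:00 GMT',
    'Location: /index.php',
];

$cookies = collectCookies($headers);
echo cookieHeader($cookies), "\n";
```

Keeping the cookies in a plain array is what makes them easy to inspect or overwrite between requests.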

Example

A common task for a spider is to log in to a forum and access its private pages. This takes only two function calls.

$fd = new FileDownload();

// Login Phase: 
// It logs in by following the many redirects
// and by setting all the required cookies

$fd->get('http://some-forum.com/login.php?do=login', // URL
		 'vb_login_username=vjeux&vb_login_password=' . urlencode('!@#$%^&*')); // POST Data (values must be URL-encoded)

// Optional:
// You can view / edit the cookies using the variable $fd->cookies

print_r($fd->cookies);

// Spider Phase:
// You are now logged in, get any page you need

$page = $fd->get('http://some-forum.com/search.php?do=finduser&userid=12345');
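
The login phase above relies on redirects being followed while cookies accumulate along the way. The sketch below illustrates that loop under stated assumptions: fakeFetch() is a hypothetical stand-in for a real HTTP request that serves canned responses, forum.test is an invented host, and getFollowingRedirects() is an illustrative name. None of this is the library's actual implementation.

```php
<?php
// Hypothetical stand-in for a real HTTP request: returns [status, headers, body]
// from a canned table so the loop can run without a network connection.
function fakeFetch(string $url): array
{
    $responses = [
        'http://forum.test/login.php' => [302, [
            'Location'   => 'http://forum.test/redirect.php',
            'Set-Cookie' => 'userid=42',
        ], ''],
        'http://forum.test/redirect.php' => [302, [
            'Location'   => 'http://forum.test/index.php',
            'Set-Cookie' => 'sessionhash=abc123',
        ], ''],
        'http://forum.test/index.php' => [200, [], 'Welcome back'],
    ];
    return $responses[$url] ?? [404, [], ''];
}

// Follow Location headers until a non-redirect response arrives,
// recording every cookie seen on the way into $cookies.
function getFollowingRedirects(string $url, array &$cookies, int $maxRedirects = 10): string
{
    for ($i = 0; $i <= $maxRedirects; $i++) {
        [$status, $headers, $body] = fakeFetch($url);

        if (isset($headers['Set-Cookie'])) {
            [$name, $value] = explode('=', $headers['Set-Cookie'], 2);
            $cookies[$name] = $value;
        }

        if ($status >= 300 && $status < 400 && isset($headers['Location'])) {
            $url = $headers['Location'];
            continue;
        }
        return $body;
    }
    throw new RuntimeException('Too many redirects');
}

$cookies = [];
$page = getFollowingRedirects('http://forum.test/login.php', $cookies);
echo $page, "\n";
print_r($cookies);
```

The redirect cap guards against loops between two pages, a situation a spider can run into when a login fails.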

Contributors

  • vjeux
