Giter VIP home page Giter VIP logo

spynner's Introduction

= Introduction =

Spynner is a stateful programmatic web browser module for Python. It is based upon [http://www.qtsoftware.com/ Qt] and [http://webkit.org/ WebKit], so it supports Javascript, AJAX, and every other technology that !WebKit is able to handle (Flash, SVG, ...). Spynner takes advantage of [http://jquery.com jQuery], a powerful Javascript library that makes the interaction with pages and event simulation really easy.  

Using Spynner you would able to simulate a web browser with no GUI (though a browsing window can be opened for debugging purposes), so it may be used to implement crawlers or acceptance testing tools.

= Dependencies =

  * [http://www.python.org Python] (>=2.5)
  * [http://www.riverbankcomputing.co.uk/software/pyqt/download PyQt] (>=4.4.3): Python wrappers for the [http://www.qtsoftware.com/ Qt] framework.

= Install =

  * A [http://code.google.com/p/spynner/downloads/list release] version (recommended):

{{{
$ wget http://spynner.googlecode.com/files/spynner-VERSION.tgz
$ tar xvzf spynner-VERSION.tgz
$ cd spynner-VERSION
$ sudo python setup.py install
}}}

or 

{{{
$ sudo easy_install spynner
}}}

  * The bleeding edge version:

{{{
$ svn checkout http://spynner.googlecode.com/svn/trunk/ spynner-trunk
$ cd spynner-trunk
$ sudo python setup.py install
}}}

= API =

http://tokland.freehostia.com/googlecode/spynner/api/

You can generate the API locally (will create docs/api directory):

$ python setup.py gen_doc

= Usage =

A basic example:

{{{
import spynner

browser = spynner.Browser()
browser.load("http://www.wordreference.com")
browser.runjs("console.log('I can run Javascript')")
browser.runjs("console.log('I can run jQuery: ' + jQuery('a:first').attr('href'))")
browser.select("#esen")
browser.fill("input[name=enit]", "hola")
browser.click("input[name=b]")
browser.wait_page_load()
print browser.url, browser.html
browser.close()
}}}

Sometimes you'll want to see what is going on:

{{{
browser = spynner.Browser()
browser.debug_level = spynner.DEBUG
browser.create_webview()
browser.show()
...
}}}

See more examples in the repository: 

http://code.google.com/p/spynner/source/browse/#svn/trunk/examples

= Running Javascript = 

Spynner uses jQuery to make Javascript interface easier. By default, two modules are injected to every loaded page:

  * [http://docs.jquery.com/Downloading_jQuery jQuery core]: Amongst other things, it adds the powerful [http://docs.jquery.com/Selectors jQuery selectors], which are used internally by some Spynner methods (_fill_, _select_, _click_, _check_, ...). Of course you can also use jQuery when you inject your own code into a page.

  * [http://code.google.com/p/jqueryjs/source/browse/trunk/plugins/simulate jQuery _simulate_ plugin]: Makes it possible to simulate mouse and keyboard events (for now spynner uses it only in the _click_ action). Look up the library code to see which kind of events you can fire.

Note that you must use __jQuery(...)_ instead of _jQuery(...)_  or the common shortcut _$(...)_. That prevents name clashing with the jQuery library used by the page.

= Cook your soup: parsing the HTML = 

You can parse the HTML of a webpage with your favorite parsing library ([http://www.crummy.com/software/BeautifulSoup BeautifulSoup], [http://codespeak.net/lxml/ lxml], ...). Since we are already using Jquery for Javascript, it feels just natural to work with [http://pypi.python.org/pypi/pyquery pyquery], its Python counterpart:

{{{
import spynner
import pyquery

browser = spynner.Browser()
...
d = pyquery.Pyquery(browser.html)
d.make_links_absolute(browser.get_url())
href = d("#somelink").attr("href")
browser.download(href, open("/path/outputfile", "w"))
}}}

= Running Spynner without X11 ==

Spynner needs a X11 server to run. If you are running it in a server without X11 you must install the virtual [http://en.wikipedia.org/wiki/Xvfb Xvfb server]. Debian users can use the small wrapper (xvfb-run). If you are not using Debian, you can download it here:

http://www.mail-archive.com/[email protected]/msg69632/x-run

{{{
$ xvfb-run python myscript_using_spynner.py
}}}

= Feedback =

Open an [http://code.google.com/p/spynner/issues/list issue] to report a bug or request a new feature. Other comments and suggestions can be directly emailed to me: [mailto://[email protected] [email protected]].

spynner's People

Contributors

kiorky avatar

Stargazers

 avatar

Watchers

 avatar James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.