d1on / spynner Goto Github PK
View Code? Open in Web Editor NEWThis project forked from picklingjar/spynner
Automated web browsing with javascript / cookies support.
License: GNU General Public License v3.0
This project forked from picklingjar/spynner
Automated web browsing with javascript / cookies support.
License: GNU General Public License v3.0
= Introduction = Spynner is a stateful programmatic web browser module for Python. It is based upon [http://www.qtsoftware.com/ Qt] and [http://webkit.org/ WebKit], so it supports Javascript, AJAX, and every other technology that !WebKit is able to handle (Flash, SVG, ...). Spynner takes advantage of [http://jquery.com jQuery], a powerful Javascript library that makes the interaction with pages and event simulation really easy. Using Spynner you would able to simulate a web browser with no GUI (though a browsing window can be opened for debugging purposes), so it may be used to implement crawlers or acceptance testing tools. = Dependencies = * [http://www.python.org Python] (>=2.5) * [http://www.riverbankcomputing.co.uk/software/pyqt/download PyQt] (>=4.4.3): Python wrappers for the [http://www.qtsoftware.com/ Qt] framework. = Install = * A [http://code.google.com/p/spynner/downloads/list release] version (recommended): {{{ $ wget http://spynner.googlecode.com/files/spynner-VERSION.tgz $ tar xvzf spynner-VERSION.tgz $ cd spynner-VERSION $ sudo python setup.py install }}} or {{{ $ sudo easy_install spynner }}} * The bleeding edge version: {{{ $ svn checkout http://spynner.googlecode.com/svn/trunk/ spynner-trunk $ cd spynner-trunk $ sudo python setup.py install }}} = API = http://tokland.freehostia.com/googlecode/spynner/api/ You can generate the API locally (will create docs/api directory): $ python setup.py gen_doc = Usage = A basic example: {{{ import spynner browser = spynner.Browser() browser.load("http://www.wordreference.com") browser.runjs("console.log('I can run Javascript')") browser.runjs("console.log('I can run jQuery: ' + jQuery('a:first').attr('href'))") browser.select("#esen") browser.fill("input[name=enit]", "hola") browser.click("input[name=b]") browser.wait_page_load() print browser.url, browser.html browser.close() }}} Sometimes you'll want to see what is going on: {{{ browser = spynner.Browser() browser.debug_level = spynner.DEBUG browser.create_webview() browser.show() ... }}} See more examples in the repository: http://code.google.com/p/spynner/source/browse/#svn/trunk/examples = Running Javascript = Spynner uses jQuery to make Javascript interface easier. By default, two modules are injected to every loaded page: * [http://docs.jquery.com/Downloading_jQuery jQuery core]: Amongst other things, it adds the powerful [http://docs.jquery.com/Selectors jQuery selectors], which are used internally by some Spynner methods (_fill_, _select_, _click_, _check_, ...). Of course you can also use jQuery when you inject your own code into a page. * [http://code.google.com/p/jqueryjs/source/browse/trunk/plugins/simulate jQuery _simulate_ plugin]: Makes it possible to simulate mouse and keyboard events (for now spynner uses it only in the _click_ action). Look up the library code to see which kind of events you can fire. Note that you must use __jQuery(...)_ instead of _jQuery(...)_ or the common shortcut _$(...)_. That prevents name clashing with the jQuery library used by the page. = Cook your soup: parsing the HTML = You can parse the HTML of a webpage with your favorite parsing library ([http://www.crummy.com/software/BeautifulSoup BeautifulSoup], [http://codespeak.net/lxml/ lxml], ...). Since we are already using Jquery for Javascript, it feels just natural to work with [http://pypi.python.org/pypi/pyquery pyquery], its Python counterpart: {{{ import spynner import pyquery browser = spynner.Browser() ... d = pyquery.Pyquery(browser.html) d.make_links_absolute(browser.get_url()) href = d("#somelink").attr("href") browser.download(href, open("/path/outputfile", "w")) }}} = Running Spynner without X11 == Spynner needs a X11 server to run. If you are running it in a server without X11 you must install the virtual [http://en.wikipedia.org/wiki/Xvfb Xvfb server]. Debian users can use the small wrapper (xvfb-run). If you are not using Debian, you can download it here: http://www.mail-archive.com/[email protected]/msg69632/x-run {{{ $ xvfb-run python myscript_using_spynner.py }}} = Feedback = Open an [http://code.google.com/p/spynner/issues/list issue] to report a bug or request a new feature. Other comments and suggestions can be directly emailed to me: [mailto://[email protected] [email protected]].
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.