Blinkr

A broken page and link checker for websites. Optionally uses phantomjs to render pages to check resource loading, links created by JS, and report any JS page load errors.

Links are checked using typhoeus, which can execute up to 200 parallel requests and cache the results.

Installation

Add this line to your application's Gemfile:

gem 'blinkr'

And then execute:

$ bundle

Or install it yourself as:

$ gem install blinkr

If you wish to use phantomjs, install it for your platform from http://phantomjs.org/download.html

Usage

Blinkr determines which pages to load from your sitemap.xml. To run blinkr against your site, checking every a[href] link on every page:

blinkr -u http://www.jboss.org
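As a rough illustration of the sitemap step, the sketch below pulls every <loc> entry out of a sitemap document using REXML. It is not blinkr's own code: the helper name sitemap_urls and the sample sitemap are assumptions made for this example only.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of blinkr): returns the text of every
# <loc> element found anywhere in the sitemap XML.
def sitemap_urls(xml)
  doc = REXML::Document.new(xml)
  urls = []
  doc.root.each_recursive do |element|
    urls << element.text.strip if element.name == 'loc' && element.text
  end
  urls
end

# Made-up sitemap used only to exercise the helper.
SAMPLE_SITEMAP = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url><loc>http://www.example.com/</loc></url>
    <url><loc>http://www.example.com/about/</loc></url>
  </urlset>
XML

puts sitemap_urls(SAMPLE_SITEMAP)
```

Each URL returned this way corresponds to one page that would be loaded and checked.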

If you want to customize blinkr, create a config file blinkr.yaml. For example:

# Links and pages not to check (may be a regexp or a string)
skips:
  - !ruby/regexp /^\/video\/((?!91710755).)*\/$/
  - !ruby/regexp /^\/quickstarts\/((?!eap\/kitchensink).)*\/*$/
  - !ruby/regexp /^\/boms\/((?!eap\/jboss-javaee-6_0).)*\/*$/
  - !ruby/regexp /^\/archetypes\/((?!eap\/jboss-javaee6-webapp-archetype).)*\/*$/

# Errors to ignore when generating the output. Each ignore should be a hash
# containing a url (may be a regexp or a string), an error code (integer) and an
# error message (may be a regexp or a string)
ignores:
  - url: http://www.acme.com/foo
    message: Not Found
  - url: !ruby/regexp /^https?:\/\/(www\.)?acme\.com\/bar\/
    code: 500

# The output file to write the report to
report: _tmp/blinkr.html

# The URL to check (often specified on the command line)
base_url: http://www.jboss.org

# Specify the URL to the sitemap to use, rather than the default <base_url>/sitemap.xml
sitemap: http://www.jboss.org/my_sitemap.xml

# Specify the 'browser' used to load each page from the sitemap. By default 
# typhoeus is used, which will fetch the sources of each page in parallel 
# (fast). 
# Alternatively, you can use phantomjs, which will process the javascript and
# CSS. This allows any links generated by javascript as well as any resources
# loaded by the page/javascript to be checked. Additionally, any JS errors are
# reported. To use phantomjs, you must make sure the native binary is available
# on your path.
browser: phantomjs

# The number of times to try reloading a link, if the server doesn't respond or
# refuses the connection. If the retry limit is exceeded, it will be reported as
# 'Server timed out' in the report. By default 3.
max_retrys: 3

# The number of times to try reloading a page. You may want to increase this if you
# find errors in the console that a page cannot be loaded
max_page_retrys: 3

# Allows blinkr to ignore fragments (#foo) which can reduce the number of URLs
# to check. By default false.
ignore_fragments: true

# Control the number of threads used to run phantomjs. By default 8.
phantomjs_threads: 8

# Export the report to phantomjs
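To make the ignores rules above more concrete, here is a minimal sketch of how an error could be matched against them. It assumes that every field present in a rule must match, that strings are compared exactly, and that regexps are tested against the value; blinkr's actual matching may differ. SampleError, pattern_match? and ignored? are hypothetical names introduced for this example.

```ruby
# Hypothetical stand-in for a reported error; only the fields used here.
SampleError = Struct.new(:url, :code, :message, keyword_init: true)

# A pattern may be a Regexp or a plain value. A field omitted from the
# rule (nil) is treated as matching anything (an assumption of this sketch).
def pattern_match?(pattern, value)
  return true if pattern.nil?
  return pattern.match?(value.to_s) if pattern.is_a?(Regexp)

  pattern.to_s == value.to_s
end

# True when any rule matches the error on all of the fields it specifies.
def ignored?(error, ignores)
  ignores.any? do |rule|
    pattern_match?(rule['url'], error.url) &&
      pattern_match?(rule['code'], error.code) &&
      pattern_match?(rule['message'], error.message)
  end
end

# The two rules from the example config, written as Ruby hashes.
IGNORES = [
  { 'url' => 'http://www.acme.com/foo', 'message' => 'Not Found' },
  { 'url' => %r{^https?://(www\.)?acme\.com/bar/}, 'code' => 500 }
].freeze

error = SampleError.new(url: 'https://acme.com/bar/baz', code: 500, message: 'Server Error')
puts ignored?(error, IGNORES)
```

Under these assumptions, a 404 on a URL under /bar/ would still be reported, because the second rule only covers code 500.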

You can specify a custom config file on the command line:

blinkr -c my_blinkr.yaml

If you want to see more details about the URLs blinkr is checking, you can use the -v option:

blinkr -u http://www.jboss.org -v

If you need to debug why blinkr reports a particular URL as bad when it works in your web browser, you can load that single URL using typhoeus:

blinkr -c my_blinkr.yaml -s http://www.acme.com/corp

Additionally, you can specify the -w option to tell libcurl to run in verbose mode (this is very verbose, so normally used with -s):

blinkr -c my_blinkr.yaml -s http://www.acme.com/corp -w

Extending Blinkr

Blinkr is based around a pipeline. Issues with the pages are collected, analysed, and then passed to the report for transformation and rendering. Additional sections may be appended to the report.

To add extensions to blinkr, you need to define a custom pipeline. The pipeline is defined in a Ruby file (e.g. blinkr.rb):

require 'acme/spellcheck'

Blinkr::Extensions::Pipeline.new do |config|
  # define the default extensions
  extension Blinkr::Extensions::Links.new config
  extension Blinkr::Extensions::JavaScript.new config
  extension Blinkr::Extensions::Resources.new config

  # define custom extensions
  extension ACME::Extensions::SpellCheck.new config
end

NOTE: You must add the default extensions to a custom pipeline for them to be executed.

The pipeline file is then referenced from blinkr.yaml:

# Use a custom pipeline
pipeline: blinkr.rb

An extension is just a standard Ruby class. It should declare an initialize(config) method, and may declare one or more of:

  • collect(page)
  • analyze(context, typhoeus)
  • transform(page, error, default_html)
  • append(context)

Each method is called as the pipeline progresses. Arguments passed are:

  • page - an object containing the typhoeus response, the page body (as a Nokogiri HTML document), an array of errors for the page, any resource_errors which occurred when the page was loaded, and any javascript_errors which occurred when the page was loaded
  • context - a map of url => pages which are being analysed. After the analyze phase, and before the transform phase, any pages with no errors are removed from the context
  • typhoeus - a wrapper around typhoeus, defining a process method and a process_all method, both of which take a url and a retry limit, and accept a block to execute when a response is returned.
  • error - an individual error, consisting of a type, a url, a title, a code, a message, a detail, a snippet and a Font Awesome icon class
  • default_html - the default HTML used to display the error

transform should return the HTML used to display the error. append should return any HTML to be appended to the report. A templating language, such as slim or haml, may be used to generate the HTML.
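The following is a minimal sketch of an extension that implements transform and append as described above. It runs without blinkr installed by using stand-in structs, so SamplePage, SampleIssue and ACME::Extensions::ErrorSummary are hypothetical, and it assumes the context behaves like a Hash of url => page whose pages respond to errors.

```ruby
# Hypothetical stand-ins for blinkr's page and error objects.
SamplePage  = Struct.new(:errors)
SampleIssue = Struct.new(:type, :url, :message)

module ACME
  module Extensions
    # Illustrative extension: wraps each error's HTML and appends a summary.
    class ErrorSummary
      def initialize(config)
        @config = config
      end

      # Returns the HTML used to display the error, here the default
      # markup wrapped in a div carrying the error type as a CSS class.
      def transform(_page, error, default_html)
        %(<div class="acme-error acme-#{error.type}">#{default_html}</div>)
      end

      # Returns extra HTML to append to the report: a count of the
      # errors across all pages still present in the context.
      def append(context)
        total = context.values.sum { |page| page.errors.size }
        "<p>#{total} error(s) across #{context.size} page(s)</p>"
      end
    end
  end
end

extension = ACME::Extensions::ErrorSummary.new({})
issue = SampleIssue.new('links', 'http://www.acme.com/foo', 'Not Found')
context = { 'http://www.acme.com/' => SamplePage.new([issue]) }

puts extension.transform(context.values.first, issue, '<li>Not Found</li>')
puts extension.append(context)
```

In a real pipeline file the class would be registered with a line such as extension ACME::Extensions::ErrorSummary.new config, in the same way as the SpellCheck example.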

The built-in extensions, in lib/blinkr/extensions, are good examples of how extensions can perform broken link analysis, or collect and format resource loading and JavaScript execution errors.

Contributing

  1. Fork it ( http://github.com/pmuir/blinkr/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request
