NAME
    Imager::Search - Find images within other images

SYNOPSIS
      use Imager::Search ();
      
      # Load the pattern to search for
      my $pattern = Imager::Search::Pattern->new(
          driver => 'Imager::Search::Driver::HTML24',
          file   => 'pattern.bmp',
      );
      
      # Load the image to search in
      my $image = Imager::Search::Image->new(
          driver => 'Imager::Search::Driver::HTML24',
          file   => 'target.bmp',
      );
      
      # Execute the search
      my @matches = $image->find( $pattern );
      print "Found " . scalar(@matches) . " matches\n";

DESCRIPTION
    The regular expression engine provided with Perl has demonstrated itself
    to be both fully featured and extremely fast for tasks involving
    searching for patterns within a string.

    The CPAN module Imager has demonstrated robust functionality and
    excellent speed across all common operating system platforms for tasks
    involving working with images.

    Imager::Search takes the best features of Imager and the regular
    expression engine and combines them to produce a simple pure Perl
    image recognition engine for systems in which the images are pixel
    perfect.

    Equally importantly, Imager::Search does this very quickly.

    Benchmarking a simple program that continuously monitors a 1024x768
    display for a single target image on a cheap 1.5GHz Windows machine
    demonstrated a monitoring rate of 5 frames per second using the default
    BMP24 driver.

    That is, 0.2 seconds to capture the screenshot, convert it into a
    searchable string, generate a search regexp, execute the regexp and then
    convert the results into match objects.

    Finally, Imager::Search itself is pure Perl, and should work quite
    simply on any platform that the Imager module supports, which at time of
    writing includes Windows, Mac OS X and most other forms of Unix.

  Use Cases
    Imager::Search is intended to be useful for a range of tasks involving
    images from computing systems and the digital world in general.

    The range of potential applications includes monitoring screenshots from
    kiosk and advertising systems for evidence of crashes or embarrassing
    popup messages, automating interactions with graphics-intense desktop or
    website applications that would be otherwise intractable for traditional
    automation methods, and simple text recognition in systems with fonts
    that register to fixed pixel patterns.

    For example, by storing captured image fragments of a sample set of
    playing cards, a program might conceptually be able to look at a
    solitaire-type game and establish the position and identity of all the
    cards on the screen, populating a model of the current game state and
    then allowing the automation of the playing of the game.

    Imager::Search is NOT intended to be useful for tasks such as facial
    recognition or any other tasks involving real world images.

  Methodology
    Regular expressions are domain-specific Nondeterministic Finite
    Automaton (NFA) programs designed to detect patterns within strings.

    Given the problem of locating a smaller "search image" one or more times
    inside a larger "target image", we compile the target image into a
    suitable string and compile the search image into a suitable regular
    expression.

    By executing the search regular expression on the target string, and
    translating the results of the run back into image terms, we can
    determine the specific location of all instances of the search image
    inside the target image with relative ease.

    By decomposing the image recognition task into a regular expression
    task, the problem then becomes how to define a series of transforms that
    can generate a suitable search expression, generate a suitable target
    string, and derive the match locations in pixel terms from match
    locations in character/byte terms.
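
    The three transforms can be sketched in a few lines of plain Perl. This
    is only an illustration, not the code used by any shipped driver: it
    assumes every pixel is already encoded as a seven-character "#RRGGBB"
    string, and the helper names image_string, pattern_regexp and find_all
    are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten rows of "#RRGGBB" pixels into one string.
sub image_string {
    my @rows = @_;
    return join '', map { join '', @$_ } @rows;
}

# Hypothetical helper: build a regexp from the pattern rows. Between two
# pattern rows we skip the pixels that lie outside the pattern on the
# target row, which is (target width - pattern width) pixels of 7 chars.
sub pattern_regexp {
    my ( $pattern, $target_width ) = @_;
    my $gap    = 7 * ( $target_width - scalar @{ $pattern->[0] } );
    my $source = join ".{$gap}", map { quotemeta join '', @$_ } @$pattern;

    # A lookahead consumes nothing, so overlapping matches are also found.
    return qr/(?=$source)/s;
}

# Hypothetical helper: return [x, y] for every place the pattern occurs.
sub find_all {
    my ( $target, $pattern ) = @_;
    my $width   = scalar @{ $target->[0] };
    my $p_width = scalar @{ $pattern->[0] };
    my $string  = image_string(@$target);
    my $regexp  = pattern_regexp( $pattern, $width );

    my @found;
    while ( $string =~ /$regexp/g ) {
        # Translate the character offset back into pixel coordinates.
        my $pixel = int( pos($string) / 7 );
        my $x     = $pixel % $width;
        my $y     = int( $pixel / $width );

        # Reject matches that would wrap around the right-hand edge.
        push @found, [ $x, $y ] if $x + $p_width <= $width;
    }
    return @found;
}

my ( $R, $G, $B, $W ) = ( '#FF0000', '#00FF00', '#0000FF', '#FFFFFF' );
my @target = (
    [ $R, $B, $W, $W, $W ],
    [ $G, $W, $W, $R, $B ],
    [ $W, $W, $W, $G, $W ],
    [ $W, $W, $W, $W, $W ],
);
my @pattern = (
    [ $R, $B ],
    [ $G, $W ],
);

print "match at x=$_->[0], y=$_->[1]\n" for find_all( \@target, \@pattern );
# prints "match at x=0, y=0" and then "match at x=3, y=1"
```

    Because every pixel code starts with "#" and that character never
    appears inside the hex digits, a match in this sketch can only begin on
    a pixel boundary. Note that the regexp depends on the width of the
    target image, so a compiled pattern is only reusable across targets of
    the same width.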

  The Driver API
    While it is fairly easy to conceive of what a potential solution might
    look like, implementing any solution is complicated by the need for all
    the code surrounding the regular expression execution to be fast as
    well.

    For example, a 0.01 second regular expression search time is of no value
    if compiling the search and target images takes several seconds.

    It may also be viable to achieve a shorter total processing time by
    storing the target image in a format which is inherently searchable
    (such as Windows BMP) and using a slower and more complex search
    expression.

    Different implementations may be superior in cases where compiled search
    expressions are cached and applied to many target images, versus cases
    where compiled target images are cached and searched over by many search
    expressions.

    Imager::Search responds to this ambiguity by not imposing a single
    solution, but instead defining a driver API for the transforms, so that
    a number of different implementations can be used with the same API in
    various situations.
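
    To make the idea of a driver API concrete, the following sketch shows
    one way such an interface could be laid out in plain Perl. It is an
    assumption for illustration only: the Sketch::Driver package and its
    method names (target_string, search_regexp, match_position) are
    hypothetical and are not taken from Imager::Search::Driver, and the toy
    subclass stores one character per pixel rather than a real pixel
    encoding.

```perl
use strict;
use warnings;

# Hypothetical abstract base class. The names are illustrative only.
package Sketch::Driver;

sub new {
    my $class = shift;
    die "Sketch::Driver is abstract, use a subclass\n"
        if $class eq __PACKAGE__;
    return bless {@_}, $class;
}

# The three transforms that every concrete driver has to supply.
sub target_string  { die ref( $_[0] ) . " must implement target_string\n" }
sub search_regexp  { die ref( $_[0] ) . " must implement search_regexp\n" }
sub match_position { die ref( $_[0] ) . " must implement match_position\n" }

# A toy concrete driver in which each pixel is a single character.
package Sketch::Driver::Char;
our @ISA = ('Sketch::Driver');

# Target rows (strings) become one long string.
sub target_string {
    my ( $self, $rows ) = @_;
    return join '', @$rows;
}

# Pattern rows become a regexp that skips the rest of each target row.
sub search_regexp {
    my ( $self, $rows, $target_width ) = @_;
    my $gap    = $target_width - length $rows->[0];
    my $source = join ".{$gap}", map { quotemeta($_) } @$rows;
    return qr/(?=$source)/s;
}

# A character offset becomes an (x, y) pixel position.
sub match_position {
    my ( $self, $offset, $target_width ) = @_;
    return ( $offset % $target_width, int( $offset / $target_width ) );
}

package main;

my $driver = Sketch::Driver::Char->new;
my $string = $driver->target_string( [ '--ab-', '--cd-', '-----' ] );
my $regexp = $driver->search_regexp( [ 'ab', 'cd' ], 5 );

while ( $string =~ /$regexp/g ) {
    my ( $x, $y ) = $driver->match_position( pos($string), 5 );
    print "x=$x y=$y\n";    # prints "x=2 y=0"
}
```

    The calling code only ever uses the three methods, so a driver with a
    different string encoding could be substituted without changing it. A
    real driver would also have to reject matches that wrap around the
    right-hand edge of the target, which this toy example leaves out.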

  The HTML24 Driver
    A default "HTML24" implementation is provided with the module. This is a
    reference driver that encodes each pixel as a 24-bit HTML "#RRGGBB"
    colour code.
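
    As a rough illustration of this encoding, the sketch below turns red,
    green and blue channel values into "#RRGGBB" codes and joins them into
    a single string. The helper names pixel_code and encode_rows are
    hypothetical, and the use of upper-case hex digits is an assumption
    rather than something confirmed from the driver itself.

```perl
use strict;
use warnings;

# Hypothetical helper: encode one pixel, given as [red, green, blue] with
# each channel in the range 0..255, as a 7-character "#RRGGBB" code.
sub pixel_code {
    my ( $red, $green, $blue ) = @{ $_[0] };
    return sprintf '#%02X%02X%02X', $red, $green, $blue;
}

# Hypothetical helper: encode an image, given as a list of rows of
# pixels, into one long string with no separators between rows.
sub encode_rows {
    my @rows = @_;
    return join '', map {
        join '', map { pixel_code($_) } @$_
    } @rows;
}

my @image = (
    [ [ 255, 0, 0 ], [ 0, 255, 0 ] ],
    [ [ 0, 0, 255 ], [ 255, 255, 255 ] ],
);

print encode_rows(@image), "\n";
# prints "#FF0000#00FF00#0000FF#FFFFFF"
```

    At seven characters per pixel, a 1024x768 screenshot encoded this way
    becomes a string of 1024 * 768 * 7 = 5,505,024 characters.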

    This driver demonstrates fast search times and a simple match
    resolution, but has an extremely slow method for generating the target
    images (as slow as 10 gigacycles for a typical 1024x768 pixel
    screenshot).

    Faster drivers are currently being pursued.

USAGE
    This new second-generation incarnation of Imager::Search is still in
    flux, so while the APIs for the individual classes are relatively stable,
    there is not yet a top level convenience API in the Imager::Search
    namespace itself, and the driver API is still being substantially
    changed in response to the differing needs of different styles of
    driver.

    However, a typical (if verbose) usage can be demonstrated that should
    continue to work for a while...

  1. Load the Target Image
      # An image loaded from a file
      use Imager::Search::Image ();
      my $image = Imager::Search::Image->new(
          driver => 'Imager::Search::Driver::HTML24',
          file   => 'target.bmp',
      );
      
      # An image captured from a screenshot
      use Imager::Search::Screenshot ();
      my $screen = Imager::Search::Screenshot->new(
          driver => 'Imager::Search::Driver::HTML24',
      );

  2. Load the Search Pattern
      # A pattern loaded from a file
      use Imager::Search::Pattern ();
      my $pattern = Imager::Search::Pattern->new(
          driver => 'Imager::Search::Driver::HTML24',
          file   => 'pattern.bmp',
      );

  3. Execute the Search
      # Find the first match
      my $first = $image->find_first( $pattern );
      
      # Find all matches
      my @matches = $image->find( $pattern );

CLASSES
    The following is the complete list of classes provided by the main
    Imager-Search distribution.

  Imager::Search::Image
    Imager::Search::Image implements an image that will be searched
    within.

  Imager::Search::Screenshot
    Imager::Search::Screenshot is an Imager::Search::Image subclass that
    captures an image from the currently active window.

  Imager::Search::Pattern
    Imager::Search::Pattern provides compiled search pattern objects.

  Imager::Search::Match
    Imager::Search::Match provides objects that represent locations in
    images where a pattern was found.

  Imager::Search::Driver
    Imager::Search::Driver is the abstract driver interface. It cannot be
    instantiated directly, but it describes (in both code and documentation)
    what any driver needs to implement.

  Imager::Search::Driver::HTML24
    Imager::Search::Driver::HTML24 is a 24-bit reference driver that uses
    HTML colour codes (#RRGGBB) to represent each pixel.

  Imager::Search::Driver::BMP24
    Imager::Search::Driver::BMP24 is a high performance 24-bit driver that
    uses the Windows BMP file format natively for the image string format.

SUPPORT
    No support is available for this module.

    However, bug reports may be filed at the following URI.

    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Imager-Search>

AUTHOR
    Adam Kennedy <[email protected]>

COPYRIGHT
    Copyright 2007 - 2011 Adam Kennedy.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included
    with this module.
