
abot's People

Contributors

sjdirect

abot's Issues

Create Logo

Contact a graphic designer to have a logo created.


Original issue reported on code.google.com by [email protected] on 28 Oct 2012 at 9:09

Add crawl recovery

Add crawl recovery that reloads the pages already crawled, the pages still to be
crawled, and other context, so that a crawl can pick up where it left off. This
may also need a way to stop a crawl in order to work properly.
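
As a rough illustration of the recovery idea, the sketch below persists the crawled set and the pending queue to a JSON file and reloads them on startup. It is written in Python rather than abot's own language, and the names `save_state` and `load_state` and the file layout are hypothetical, not part of abot.

```python
import json
from collections import deque
from pathlib import Path


def save_state(path, crawled, to_crawl):
    """Persist crawl progress (hypothetical helper, not an abot API)."""
    state = {"crawled": sorted(crawled), "to_crawl": list(to_crawl)}
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(state), encoding="utf-8")
    # Rename over the old file so a crash mid-write does not leave
    # a half-written state file behind.
    tmp.replace(path)


def load_state(path):
    """Return (crawled, to_crawl); both are empty when no state file exists."""
    p = Path(path)
    if not p.exists():
        return set(), deque()
    state = json.loads(p.read_text(encoding="utf-8"))
    return set(state["crawled"]), deque(state["to_crawl"])
```

A crawler built this way would call `save_state` when it is stopped (or periodically) and `load_state` before it starts, which is why a clean stop mechanism matters for this feature.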

Original issue reported on code.google.com by [email protected] on 16 Nov 2012 at 5:15

Consider accepting lambda expressions for crawl decisions

Consider accepting lambda expressions for crawl decisions.

Pros: 
-Allows users to determine crawl behavior on the fly
-No classes or interfaces to implement and plug in

Cons:
-Not easy to test compound crawl behaviors
-Users must set these values on every instance (lots of copy and paste)
-Hard to group related behaviors together
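
To make the trade-off concrete, here is a minimal Python sketch of lambda-based crawl decisions. It is not abot's API: `Decision`, `all_of`, and `select_pages` are hypothetical names, and `example.com` is a placeholder host. The `all_of` helper shows one possible way to soften the grouping and testing concerns, since a composed rule can be named, reused, and tested on its own.

```python
from typing import Callable, Iterable, List
from urllib.parse import urlparse

# A crawl decision is any callable that takes a URL and returns True to crawl it.
Decision = Callable[[str], bool]


def all_of(*rules: Decision) -> Decision:
    """Combine several decisions into one that requires every rule to pass."""
    return lambda url: all(rule(url) for rule in rules)


def select_pages(urls: Iterable[str], should_crawl: Decision) -> List[str]:
    """Keep only the URLs the supplied decision approves."""
    return [u for u in urls if should_crawl(u)]


# Decisions defined on the fly, with no class or interface to implement.
same_host = lambda url: urlparse(url).hostname == "example.com"
not_pdf = lambda url: not urlparse(url).path.lower().endswith(".pdf")

# Related decisions grouped under one reusable name.
default_rule = all_of(same_host, not_pdf)
```

For example, `select_pages(candidates, default_rule)` filters a list of candidate URLs down to non-PDF pages on the placeholder host.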

Original issue reported on code.google.com by [email protected] on 29 Oct 2012 at 4:57

  • Merged into: #26

Create a PoliteWebCrawler

Create a PoliteWebCrawler.

-Add throttling
-Add a manual crawl delay
-Add respect for the robots.txt crawl delay
-Add respect for the robots.txt disallow directive
-Add respect for meta robots noindex and nofollow
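
The robots.txt and delay items above could fit together roughly as follows. This is a Python sketch built on the standard library's `urllib.robotparser`, not abot's implementation; `PolitenessPolicy` and its methods are hypothetical names, and it assumes the larger of the manual delay and the robots.txt crawl delay should win. Meta robots handling is not covered.

```python
from urllib.robotparser import RobotFileParser


class PolitenessPolicy:
    """Hypothetical helper combining robots.txt rules with a crawl delay."""

    def __init__(self, robots_lines, user_agent="*", manual_delay=0.0):
        self.user_agent = user_agent
        self.parser = RobotFileParser()
        self.parser.parse(robots_lines)  # lines of an already fetched robots.txt
        robots_delay = self.parser.crawl_delay(user_agent) or 0
        # Assumption: use whichever delay is longer.
        self.delay = max(float(manual_delay), float(robots_delay))
        self._last_request = None

    def is_allowed(self, url):
        """True unless robots.txt disallows this URL for the user agent."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait_time(self, now):
        """Seconds still to wait before the next request may be sent."""
        if self._last_request is None:
            return 0.0
        return max(0.0, self.delay - (now - self._last_request))

    def record_request(self, now):
        """Remember when the most recent request was sent."""
        self._last_request = now
```

A crawler would check `is_allowed(url)` before each request, sleep for `wait_time(time.monotonic())`, and call `record_request(time.monotonic())` right after sending. Passing the clock value in keeps the policy easy to test.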

Original issue reported on code.google.com by [email protected] on 27 Sep 2012 at 11:47

  • Blocking: #74

Add crawltimeout

Add a crawltimeout setting so that the crawl ends once the configured timeout has elapsed.
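
One simple way to picture this is a deadline checked before each page is fetched. The Python sketch below is illustrative only: `crawl`, `fetch`, and `timeout_seconds` are hypothetical names, and it assumes the timeout is checked between pages rather than interrupting a request in flight.

```python
import time
from collections import deque


def crawl(seed_urls, fetch, timeout_seconds, clock=time.monotonic):
    """Crawl breadth-first until the queue is empty or the timeout elapses.

    `fetch(url)` is a caller-supplied function returning the links found on
    that page. Returns (crawled, still_pending).
    """
    deadline = clock() + timeout_seconds
    queue = deque(seed_urls)
    crawled = []
    while queue:
        if clock() >= deadline:
            break  # timeout reached: stop before starting another page
        url = queue.popleft()
        queue.extend(fetch(url))
        crawled.append(url)
    return crawled, list(queue)
```

Returning the pending URLs alongside the crawled ones means a timed-out crawl does not silently lose its remaining work.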

Original issue reported on code.google.com by [email protected] on 15 Nov 2012 at 3:21

Update documentation/Downloads

-Link to the latest stable release instead of sending users to the Downloads tab
-Add a Fiddler .saz file to replay
-Add an Abot vs Arachnode vs NCrawler section
-Add an FAQ page
-Split the quickstart out onto its own page
-Add more detail on running the tests with Fiddler, etc.

Original issue reported on code.google.com by [email protected] on 19 Nov 2012 at 3:30
