crawljax / crawljax
License: Apache License 2.0
Original author: [email protected] (September 10, 2010 12:16:25)
At the end of a crawling session, long exception messages are logged, for example about browsers that could not be closed properly.
This usually happens when a maximum timeout or maximum number of states is defined in the crawl specification.
I think it is best to catch such exceptions and log more user-friendly messages.
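A hedged sketch of what such friendlier logging could look like; ShutdownHelper and the message wording are illustrative names, not existing Crawljax code.

```java
// Hedged sketch (not existing Crawljax code): reduce a shutdown exception to
// one friendly line instead of a full stack trace.
class ShutdownHelper {
    static String friendlyMessage(Exception e) {
        // Keep only the first line of the exception message.
        String msg = e.getMessage();
        String firstLine = (msg == null) ? "(no details)" : msg.split("\n", 2)[0];
        return "Could not close the browser cleanly: " + firstLine
                + " (usually harmless after a timeout or max-states stop)";
    }

    static void closeBrowserQuietly(AutoCloseable browser) {
        try {
            browser.close();
        } catch (Exception e) {
            System.err.println(friendlyMessage(e));
        }
    }
}
```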
Original issue: http://code.google.com/p/crawljax/issues/detail?id=33
Original author: [email protected] (January 06, 2011 11:09:45)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
It should give at most 2 states; however, it gives me 3 states when I have the following line:
"<a> Test</a>"
Manually browsing the application yields only 2 states, since the anchor tag has no "href" attribute.
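The check the reporter expects could look like this hedged sketch: an anchor with neither an href nor an onclick handler normally cannot trigger a state change. AnchorFilter is a hypothetical name, and real candidate extraction works on DOM elements rather than an attribute map.

```java
import java.util.Map;

// Illustrative check only: real candidate extraction in Crawljax works on DOM
// elements, not an attribute map, and AnchorFilter is a hypothetical name.
class AnchorFilter {
    // An <a> with neither href nor onclick normally cannot change state,
    // so it should not be counted as a clickable candidate.
    static boolean isClickCandidate(Map<String, String> attributes) {
        return attributes.containsKey("href") || attributes.containsKey("onclick");
    }
}
```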
What version of the product are you using? On what operating system?
Crawljax 1.9
Windows XP
IE 7
Please provide any additional information below.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=40
Original author: [email protected] (May 29, 2010 18:55:10)
Trying to run Crawljax 1.9 with a Java class on the web page "www.google.it", with 'clickDefaultElements' set, I got this error:
The error is returned after it fires on the element 'iGoogle'.
My main class:
package Test;

import com.crawljax.browser.EmbeddedBrowser.BrowserType;
import com.crawljax.core.CrawljaxController;
import com.crawljax.core.CrawljaxException;
import com.crawljax.core.configuration.CrawlSpecification;
import com.crawljax.core.configuration.CrawljaxConfiguration;

public final class Test1_9 {
    public static void main(String[] args) {
        CrawlSpecification crawler = new CrawlSpecification("http://www.google.it");
        crawler.clickDefaultElements();
        crawler.setRandomInputInForms(true);
        crawler.setMaximumStates(40);
        crawler.setDepth(1);
        CrawljaxConfiguration config = new CrawljaxConfiguration();
        config.setCrawlSpecification(crawler);
        config.setBrowser(BrowserType.firefox);
        try {
            CrawljaxController crawljax = new CrawljaxController(config);
            crawljax.run();
        } catch (org.apache.commons.configuration.ConfigurationException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (CrawljaxException e) {
            e.printStackTrace();
            System.exit(1);
        }
    }
}
Best regards,
Barbara Farina
Original issue: http://code.google.com/p/crawljax/issues/detail?id=25
Original author: frankgroeneveld (February 03, 2010 13:49:50)
I keep having problems like this:
There is a page with a link <a href="#">something</a>
So, I do this in my configuration:
crawler.lookFor("a").withAttribute("href", "#");
The log looks like this:
Why is there nothing printed behind "TAG"? If I change my code to:
crawler.lookFor("a").withAttribute("href", "%");
The links are found just fine and this is the log:
I've been trying to debug this, but I can't find the problem.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=7
Original author: [email protected] (December 22, 2010 09:20:36)
What steps will reproduce the problem?
import com.crawljax.browser.EmbeddedBrowser.BrowserType;
import com.crawljax.core.CrawljaxController;
import com.crawljax.core.configuration.CrawlSpecification;
import com.crawljax.core.configuration.CrawljaxConfiguration;
public class TestCrawljax {
    public static void main(String[] args) {
        CrawlSpecification crawler = new CrawlSpecification("http://crawljax.com");
        crawler.clickDefaultElements();
        crawler.setRandomInputInForms(true);
        crawler.setWaitTimeAfterEvent(5000);
        crawler.setWaitTimeAfterReloadUrl(5000);
        CrawljaxConfiguration config = new CrawljaxConfiguration();
        config.setCrawlSpecification(crawler);
        config.setBrowser(BrowserType.ie);
        try {
            CrawljaxController crawljax = new CrawljaxController(config);
            crawljax.run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
What is the expected output? What do you see instead?
It should crawl http://crawljax.com properly in IE; however, in IE it does not crawl, whereas it works perfectly in Firefox.
What version of the product are you using? On what operating system?
Crawljax 1.0
Windows XP
IE 6.0
Please provide any additional information below.
Here is the Crawljax output:
Starting Crawljax...
Used plugins:
No plugins loaded because CrawljaxConfiguration is empty
Embedded browser implementation: ie
Crawl depth: 0
Crawljax initialized!
Running PreCrawlingPlugins...
Loading Page http://crawljax.com
Running OnUrlLoadPlugins...
Running OnNewStatePlugins...
Start crawling with 4 crawl elements
Looking in state: index for candidate elements with
TAG: A
Found new candidate element: A: href=http://crawljax.com/download/ xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[1]
Found new candidate element: A: href=http://crawljax.com/documentation/changes/ xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[2]
Found new candidate element: A: href=documentation/writing-plugins/ xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/UL[1]/LI[5]/STRONG[1]/A[1]
Found new candidate element: A: href=/download/ valign=middle xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[1]
Found new candidate element: A: href=/download/ xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[2]
Found new candidate element: A: href=/documentation/ valign=middle xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[1]
Found new candidate element: A: href=/documentation/ xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[2]
Found new candidate element: A: href=http://crawljax.com/wp-login.php rel=nofollow xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[2]/DIV[1]/UL[1]/LI[1]/A[1]
Found new candidate element: A: href=# id=gotop onclick=MGJS.goTop();return false; xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[4]/A[1]
TAG: BUTTON
TAG: INPUT: type="submit"
TAG: INPUT: type="button"
Found 9 new candidate elements to analyze!
Running PreStateCrawlingPlugins...
Executing click on element: "crawljax-2.0 release" A: href="http://crawljax.com/download/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[1]; State: index
Could not fire eventable: "crawljax-2.0 release" A: href="http://crawljax.com/download/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[1]
Running OnFireEventFailedPlugins...
Executing click on element: "improvements" A: href="http://crawljax.com/documentation/changes/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[2]; State: index
Could not fire eventable: "improvements" A: href="http://crawljax.com/documentation/changes/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/P[1]/A[2]
Running OnFireEventFailedPlugins...
Executing click on element: "plugin" A: href="documentation/writing-plugins/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/UL[1]/LI[5]/STRONG[1]/A[1]; State: index
Could not fire eventable: "plugin" A: href="documentation/writing-plugins/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/DIV[2]/UL[1]/LI[5]/STRONG[1]/A[1]
Running OnFireEventFailedPlugins...
Executing click on element: A: href="/download/" valign="middle" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[1]; State: index
Could not fire eventable: A: href="/download/" valign="middle" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[1]
Running OnFireEventFailedPlugins...
Executing click on element: "Download now!" A: href="/download/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[2]; State: index
Could not fire eventable: "Download now!" A: href="/download/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[1]/A[2]
Running OnFireEventFailedPlugins...
Executing click on element: A: href="/documentation/" valign="middle" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[1]; State: index
Could not fire eventable: A: href="/documentation/" valign="middle" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[1]
Running OnFireEventFailedPlugins...
Executing click on element: "Documentation" A: href="/documentation/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[2]; State: index
Could not fire eventable: "Documentation" A: href="/documentation/" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[1]/DIV[1]/DIV[1]/P[2]/A[2]
Running OnFireEventFailedPlugins...
Executing click on element: "Log in" A: href="http://crawljax.com/wp-login.php" rel="nofollow" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[2]/DIV[1]/UL[1]/LI[1]/A[1]; State: index
Could not fire eventable: "Log in" A: href="http://crawljax.com/wp-login.php" rel="nofollow" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[2]/DIV[2]/DIV[1]/UL[1]/LI[1]/A[1]
Running OnFireEventFailedPlugins...
Executing click on element: "Top" A: href="#" id="gotop" onclick="MGJS.goTop();return false;" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[4]/A[1]; State: index
Could not fire eventable: "Top" A: href="#" id="gotop" onclick="MGJS.goTop();return false;" click xpath /HTML[1]/BODY[1]/DIV[2]/DIV[1]/DIV[1]/DIV[4]/A[1]
Running OnFireEventFailedPlugins...
Finished executing
All Crawlers finished executing, now shutting down
CrawlerExecutor terminated
Closing the browser...
Total Crawling time(11985ms) ~= 0 min, 16 sec
EXAMINED ELEMENTS: 9
CLICKABLES: 0
STATES: 1
Dom average size (byte): 13848
Running PostCrawlingPlugins...
DONE!!!
Original issue: http://code.google.com/p/crawljax/issues/detail?id=37
Original author: [email protected] (February 01, 2010 19:13:19)
Crawljax should not be restricted only to the "click" event type.
WebDriver has also a "hover" functionality, which we should support.
The way to proceed would be to link each candidate element to its
EventType. This requires, however, a radical refactoring in the way
candidate elements are created and examined.
So eventually we will have:
crawler.click("a");
crawler.dontClick("a").withText("logout");
crawler.hover("div");
crawler.dontHover("a").withText("I have a mouseoverEvent");
Original issue: http://code.google.com/p/crawljax/issues/detail?id=6
Original author: frankgroeneveld (March 17, 2010 13:48:01)
Find out if Helper.getXpathExpression can be replaced with XPathHelper.getSpecificXpathExpression without breaking any core functionality. I believe some tests will need to be modified for this to happen.
One downside of this change is that this method is a little slower.
Please comment on this whether or not we should do this.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=13
Original author: [email protected] (August 18, 2010 07:37:48)
What steps will reproduce the problem?
Checking for "://" instead of checking the link's prefix would be more fault-tolerant.
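The suggested check could be as simple as this sketch (LinkUtil is a hypothetical name; real link handling would also need to consider protocol-relative and malformed URLs):

```java
// A minimal sketch of the suggested check; LinkUtil is a hypothetical name.
class LinkUtil {
    // A link is treated as absolute when it contains "://", regardless of
    // scheme, instead of matching a fixed prefix such as "http://localhost".
    static boolean isAbsoluteUrl(String href) {
        return href != null && href.contains("://");
    }
}
```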
Thank you.
This is super awesome project.
I love this too much, already.
Keep up the miraculous work!
Original issue: http://code.google.com/p/crawljax/issues/detail?id=31
Original author: [email protected] (April 20, 2010 22:10:07)
What steps will reproduce the problem?
1. Every isEquivalent() function of the comparators
2. The compare(String originalDom, String newDom, EmbeddedBrowser browser) function is always called with the originalDom string empty.
Please provide any additional information below.
I found this issue while using EditDistance. Every time EditDistance is called, the first string is empty.
I have also used other comparators, and the result is the same.
I think the problem is in the compare() function of the StateComparator class.
Can you tell me if that is right?
If not, can you tell me why?
Thank you
Original issue: http://code.google.com/p/crawljax/issues/detail?id=19
Original author: frankgroeneveld (February 23, 2010 10:34:28)
When running mvn test, testCorrectNamesMultiThread fails with the following error:
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.352 sec <<< FAILURE!
testCorrectNamesMultiThread(com.crawljax.core.CrawlerExecutorTest) Time elapsed: 0.141 sec <<< FAILURE!
junit.framework.AssertionFailedError: Thread 1 ok
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at com.crawljax.core.CrawlerExecutorTest.testCorrectNamesMultiThread(CrawlerExecutorTest.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)
Original issue: http://code.google.com/p/crawljax/issues/detail?id=10
Original author: [email protected] (January 06, 2011 17:11:30)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
Let's say we have the following anchor tag (fcn14736 is my machine name and 10.34.6.181 is my IP address) in our HTML page:
<a href="http://fcn14736:8080/sampleApp/HelloWorld.html">HelloWorld</a>
Or
<a href="http://10.34.6.181:8080/sampleApp/HelloWorld.html">HelloWorld</a>
Then Crawljax does not crawl that link in either IE or Firefox; however, with the following link it crawls properly:
<a href="http://localhost:8080/sampleApp/HelloWorld.html">HelloWorld</a>
What version of the product are you using? On what operating system?
Crawljax 1.9
Windows XP
IE 7
Firefox 3.0
Please provide any additional information below.
I have also tried it directly with WebDriver, where it works fine.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=41
Original author: [email protected] (December 21, 2010 06:08:20)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
It should open an IE window and crawl the website; however, it does not open the IE window as long as another IE window is already open. If the previously opened IE window is closed, it starts working perfectly. This does not happen with Firefox and Chrome.
What version of the product are you using? On what operating system?
Crawljax 2.0
Windows XP
IE 6.0
Firefox 3.0
Chrome 8.0
Please provide any additional information below.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=35
Original author: [email protected] (February 28, 2011 11:58:20)
What steps will reproduce the problem?
1. Crawled using the URL http://my.fp055.qa.ebay.com/ws/eBayISAPI.dll?MyEbayBeta&MyeBay=&guest=1
2. User login credentials given: username us_seller_bsc, password "password"
3. It crawls a few pages
What is the expected output? What do you see instead?
Expected it to click on the found events, but even after examining some 6 elements it reports CLICKABLES: 0.
What version of the product are you using? On what operating system?
2.0
Please provide any additional information below.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=46
Original author: [email protected] (December 21, 2010 06:19:12)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
It should open an IE window and crawl the website without closing the previously opened IE window; however, while crawling, Crawljax closes the previously opened IE window. Because of that we cannot run multiple sessions of Crawljax on the same machine. This does not happen with Firefox.
What version of the product are you using? On what operating system?
Crawljax 1.9
Windows XP
IE 6.0
Firefox 3.0
Please provide any additional information below.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=36
Original author: [email protected] (February 05, 2010 19:21:12)
I think it is a good idea to include more information in CrawljaxException.
Things like the OS, browser, version, and number of threads can be very useful for debugging.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=9
Original author: [email protected] (January 28, 2010 08:13:13)
Original issue: http://code.google.com/p/crawljax/issues/detail?id=4
Original author: frankgroeneveld (June 15, 2010 12:16:24)
getBrowser on a session always returns null in an OnNewStatePlugin (or at least when going from the initial state to the first state).
Original issue: http://code.google.com/p/crawljax/issues/detail?id=26
Original author: [email protected] (March 28, 2010 22:17:56)
I have a simple question for you...
I don't understand how (and what parts of) two DOMs are compared during crawling...
Can you help me?
Thank you
Original issue: http://code.google.com/p/crawljax/issues/detail?id=15
Original author: frankgroeneveld (May 19, 2010 13:31:55)
Element names in XPath expressions shouldn't need to be in uppercase.
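A sketch of one way to accept lowercase names: normalize simple absolute XPath expressions before comparison, assuming paths of the form /name[i]/name[j] with positional predicates only (no attributes or functions). XPathUtil is a hypothetical name, not the Crawljax helper.

```java
import java.util.Locale;

// Sketch under the assumption of simple absolute paths of the form
// /name[i]/name[j]... (no attributes, no functions). XPathUtil is a
// hypothetical name, not the Crawljax helper.
class XPathUtil {
    static String normalize(String xpath) {
        StringBuilder sb = new StringBuilder();
        for (String step : xpath.split("/")) {
            if (step.isEmpty()) {
                continue; // the leading slash produces an empty first segment
            }
            int bracket = step.indexOf('[');
            String name = bracket >= 0 ? step.substring(0, bracket) : step;
            String predicate = bracket >= 0 ? step.substring(bracket) : "";
            sb.append('/').append(name.toUpperCase(Locale.ROOT)).append(predicate);
        }
        return sb.toString();
    }
}
```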
Original issue: http://code.google.com/p/crawljax/issues/detail?id=21
Original author: [email protected] (January 24, 2011 15:57:44)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
Expected it to click all the anchor tags in the page, but only a specific set of links is clicked.
What version of the product are you using? On what operating system?
I am currently using Crawljax 2.0 on Windows XP, running the crawl in Firefox. (Later I observed the same result in IE as well.)
Please provide any additional information below.
Attached is the output of the states captured using the CrawlOverview plugin.
Let me know if any more information is required to debug/reproduce the issue.
Thanks
Sharath.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=43
Original author: frankgroeneveld (May 19, 2010 15:12:41)
Check whether dropdown boxes are set by text or value
Original issue: http://code.google.com/p/crawljax/issues/detail?id=23
Original author: [email protected] (December 27, 2010 06:42:32)
What steps will reproduce the problem?
import com.crawljax.browser.EmbeddedBrowser.BrowserType;
import com.crawljax.core.CrawljaxController;
import com.crawljax.core.configuration.CrawlSpecification;
import com.crawljax.core.configuration.CrawljaxConfiguration;
public class TestCrawljax {
    public static void main(String[] args) {
        CrawlSpecification crawler = new CrawlSpecification("http://localhost:8080/iframe/index.html");
        crawler.clickDefaultElements();
        crawler.setRandomInputInForms(true);
        crawler.setWaitTimeAfterEvent(1000);
        crawler.setWaitTimeAfterReloadUrl(1000);
        crawler.dontClick("input");
        CrawljaxConfiguration config = new CrawljaxConfiguration();
        config.setCrawlSpecification(crawler);
        //config.setBrowser(BrowserType.ie);
        try {
            CrawljaxController crawljax = new CrawljaxController(config);
            crawljax.run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
What is the expected output? What do you see instead?
It should not click on input buttons; however, it does click on them.
What version of the product are you using? On what operating system?
Crawljax 1.9
Windows XP
Please provide any additional information below.
Am I missing anything here?
Original issue: http://code.google.com/p/crawljax/issues/detail?id=39
Original author: [email protected] (April 21, 2010 08:48:32)
The website should be updated so that it is clear for external users how
they can use the API to add and use Invariants and Comparators, with some
examples for each.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=20
Original author: [email protected] (December 24, 2010 04:57:38)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
Crawljax should successfully crawl that external site and connect to the internet using the Firefox proxy settings.
However, it opens the Firefox window and displays a "Page not found" error. When I stopped Crawljax in between and checked the proxy settings, I found that the window does not have the proxy settings present in my Firefox configuration. If I open a Firefox window separately, it does have the proxy settings.
What version of the product are you using? On what operating system?
Windows 7 (64 bit)
Crawljax 1.9
Firefox 3.0
Please provide any additional information below.
Do I need to do anything else so that Crawljax will use the proxy settings in Firefox? I have not faced this issue in IE 8.0.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=38
Original author: [email protected] (January 31, 2011 08:52:23)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
The crawling threads need to be killed and control needs to return to the calling program. Instead, only the thread where the exception was raised is killed, and the rest of the spawned threads stay in the running state forever, so control never returns to the caller.
What version of the product are you using? On what operating system?
Currently using Crawljax 2.0 on Windows XP.
Please provide any additional information below.
The log files are attached.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=44
Original author: [email protected] (January 10, 2011 15:19:26)
What steps will reproduce the problem?
import com.crawljax.browser.EmbeddedBrowser.BrowserType;
import com.crawljax.core.CrawljaxController;
import com.crawljax.core.configuration.CrawlSpecification;
import com.crawljax.core.configuration.CrawljaxConfiguration;

public class TestCrawljax {
    public static void main(String[] args) {
        CrawlSpecification crawler = new CrawlSpecification("http://localhost:8080/sampleApp");
        crawler.setClickOnce(false);
        crawler.setRandomInputInForms(false);
        crawler.clickDefaultElements();
        crawler.setWaitTimeAfterEvent(1000);
        crawler.setWaitTimeAfterReloadUrl(1000);
        CrawljaxConfiguration config = new CrawljaxConfiguration();
        config.setCrawlSpecification(crawler);
        //config.setBrowser(BrowserType.ie);
        try {
            CrawljaxController crawljax = new CrawljaxController(config);
            crawljax.run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
What is the expected output? What do you see instead?
It should click the single link on the index page. However, in Firefox it does not click that link, whereas in IE it does.
What version of the product are you using? On what operating system?
Crawljax 2.0
Windows XP
Firefox 3.0
IE 7
Please provide any additional information below.
I have also tried the same thing with WebDriver using the following program, where it worked fine.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class TestWebDriver {
    public static void main(String[] args) throws Exception {
        WebDriver driver = new FirefoxDriver();
        driver.get("http://localhost:8080/sampleApp");
        driver.findElement(By.linkText("HelloWorld")).click();
        driver.close();
    }
}
Here is the Crawljax output for both Firefox & IE.
Firefox output:
Starting Crawljax...
Used plugins:
No plugins loaded because CrawljaxConfiguration is empty
Embedded browser implementation: firefox
Number of threads: 1
Crawl depth: 0
Crawljax initialized!
Start crawling with 4 crawl elements
Starting new Crawler: Thread 1 Crawler 1 (initial)
Running PreCrawlingPlugins...
Running OnBrowserCreatedPlugins...
Loading Page http://localhost:8080/sampleApp
Running OnUrlLoadPlugins...
Running OnNewStatePlugins...
Looking in state: index for candidate elements with
TAG: A
Found new candidate element: A: href=HelloWorld.html xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]
TAG: BUTTON
TAG: INPUT: type="submit"
TAG: INPUT: type="button"
Found 1 new candidate elements to analyze!
Starting preStateCrawlingPlugins...
Running PreStateCrawlingPlugins...
Executing click on element: "HelloWorld" A: href="HelloWorld.html" click xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]; State: index
Could not fire eventable: "HelloWorld" A: href="HelloWorld.html" click xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]
Running OnFireEventFailedPlugins...
Finished executing
All Crawlers finished executing, now shutting down
CrawlerExecutor terminated
Running PostCrawlingPlugins...
Total Crawling time(14390ms) ~= 0 min, 14 sec
EXAMINED ELEMENTS: 1
CLICKABLES: 0
STATES: 1
Dom average size (byte): 346
DONE!!!
IE output:
Starting Crawljax...
Used plugins:
No plugins loaded because CrawljaxConfiguration is empty
Embedded browser implementation: ie
Number of threads: 1
Crawl depth: 0
Crawljax initialized!
Start crawling with 4 crawl elements
Starting new Crawler: Thread 1 Crawler 1 (initial)
Running PreCrawlingPlugins...
Running OnBrowserCreatedPlugins...
Loading Page http://localhost:8080/sampleApp
Running OnUrlLoadPlugins...
Running OnNewStatePlugins...
Looking in state: index for candidate elements with
TAG: A
Found new candidate element: A: href=HelloWorld.html xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]
TAG: BUTTON
TAG: INPUT: type="submit"
TAG: INPUT: type="button"
Found 1 new candidate elements to analyze!
Starting preStateCrawlingPlugins...
Running PreStateCrawlingPlugins...
Executing click on element: "HelloWorld" A: href="HelloWorld.html" click xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]; State: index
Dom is Changed!
Correcting state name from state2 to state1
State state1 added to the StateMachine.
StateMachine's Pointer changed to: state1
StateMachine's Pointer changed to: state1 FROM index
Running OnNewStatePlugins...
Running GuidedCrawlingPlugins...
RECURSIVE Call crawl; Current DEPTH= 1
Looking in state: state1 for candidate elements with
TAG: A
TAG: BUTTON
TAG: INPUT: type="submit"
TAG: INPUT: type="button"
Found 0 new candidate elements to analyze!
StateMachine's Pointer changed to: index
Finished executing
All Crawlers finished executing, now shutting down
CrawlerExecutor terminated
Running PostCrawlingPlugins...
Interaction Element= "HelloWorld" A: href="HelloWorld.html" click xpath /HTML[1]/BODY[1]/TABLE[1]/FORM[1]/TBODY[1]/TR[1]/TD[1]/A[1]
Total Crawling time(12375ms) ~= 0 min, 12 sec
EXAMINED ELEMENTS: 1
CLICKABLES: 1
STATES: 2
Dom average size (byte): 289
DONE!!!
Original issue: http://code.google.com/p/crawljax/issues/detail?id=42
Original author: [email protected] (February 22, 2011 09:17:15)
When I give the URL "http://my.qa.ebay.com/ws/eBayISAPI.dll?MyEbay&gbh=1" with specific login input for username and password, it throws the error given below after crawling 2 to 3 pages.
Exception in thread "Thread 1 Crawler 1 (initial)" java.lang.NoSuchMethodError: org.apache.commons.lang.builder.HashCodeBuilder.reflectionHashCode(Ljava/lang/Object;[Ljava/lang/String;)I
at com.crawljax.core.state.Eventable.hashCode(Eventable.java:136)
at java.util.HashMap.getEntry(Unknown Source)
at java.util.HashMap.containsKey(Unknown Source)
at org.jgrapht.graph.AbstractBaseGraph.containsEdge(Unknown Source)
at org.jgrapht.graph.AbstractBaseGraph.addEdge(Unknown Source)
at com.crawljax.core.state.StateFlowGraph.addEdge(StateFlowGraph.java:147)
at com.crawljax.core.state.StateMachine.addStateToCurrentState(StateMachine.java:136)
at com.crawljax.core.state.StateMachine.update(StateMachine.java:171)
at com.crawljax.core.Crawler.clickTag(Crawler.java:313)
at com.crawljax.core.Crawler.crawlAction(Crawler.java:384)
at com.crawljax.core.Crawler.crawl(Crawler.java:450)
at com.crawljax.core.Crawler.run(Crawler.java:578)
at com.crawljax.core.InitialCrawler.run(InitialCrawler.java:98)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Original issue: http://code.google.com/p/crawljax/issues/detail?id=45
Original author: [email protected] (July 06, 2010 12:52:22)
When running a (large) CrawlSpec with a MaxRuntime constraint, the Crawler is not terminated directly when the MaxRuntime is reached.
Basically what happens is: the current Crawler is terminated, then all waiting Crawlers get executed and start back-tracking, and only once each is back in its previous state does the crawl run terminate that Crawler.
This should be changed so that when the MaxRuntime is reached, the current Crawler is terminated, all other running Crawlers are signalled to stop, and the queue is shut down and emptied.
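The proposed shutdown could be sketched with java.util.concurrent primitives; RuntimeLimiter and enforceMaxRuntime are hypothetical names, and the real CrawlerExecutor needs more bookkeeping than this.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the proposed behavior; RuntimeLimiter is a
// hypothetical name, not a Crawljax class.
class RuntimeLimiter {
    // Wait up to maxMillis for the crawl to finish; when the budget runs
    // out, drop the queued crawlers and interrupt the running ones instead
    // of letting them back-track first.
    static void enforceMaxRuntime(ExecutorService crawlers, long maxMillis) {
        try {
            if (!crawlers.awaitTermination(maxMillis, TimeUnit.MILLISECONDS)) {
                crawlers.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            crawlers.shutdownNow();
        }
    }
}
```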
Original issue: http://code.google.com/p/crawljax/issues/detail?id=27
Original author: frankgroeneveld (May 19, 2010 15:44:27)
Support CrawlSpecification.click(somexpath)
Original issue: http://code.google.com/p/crawljax/issues/detail?id=24
Original author: [email protected] (September 15, 2010 10:21:30)
When a browser crashes for whatever reason, crawling continues because the exception is caught and suppressed. As the browser is then 'dead' and not reachable, crawling cannot continue with the current browser.
What happens now when running with multiple browsers is that the crashed browser object stays within Crawljax and calls to it continue to be made, which results in I/O exceptions etc., slowly dying and killing all the crawlers in the queue: a crawler tries to open the index -> exception -> the Crawler dies -> the next one starts, and so on.
The correct behavior is still up for discussion:
should we restart the browser? I think that is safe to do, while restarting the Crawler could cause an infinite loop in which the browser crashes over and over again.
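One way to make the browser-restart option safe is a bounded retry counter, so a persistently crashing browser cannot loop forever. This is an illustrative sketch; BrowserSupervisor is a hypothetical name, not a Crawljax class.

```java
// Sketch of a bounded restart policy; BrowserSupervisor is a hypothetical
// name, not a Crawljax class.
class BrowserSupervisor {
    private final int maxRestarts;
    private int restarts = 0;

    BrowserSupervisor(int maxRestarts) {
        this.maxRestarts = maxRestarts;
    }

    // Returns true if the caller should create a fresh browser and retry;
    // false once the limit is hit, so a reproducible crash cannot loop forever.
    synchronized boolean shouldRestart() {
        if (restarts >= maxRestarts) {
            return false;
        }
        restarts++;
        return true;
    }
}
```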
Original issue: http://code.google.com/p/crawljax/issues/detail?id=34
Original author: [email protected] (March 20, 2010 11:54:16)
What steps will reproduce the problem?
What is the expected output?
Complete the test; normally it runs through to completion on a site like http://google.com.
What do you see instead?
What version of the product are you using? On what operating system?
crawljax-1.8 System info: os.name: 'Linux', os.arch: 'i386', os.version:
'2.6.31.6-rt19.2811209', java.version: '1.6.0_18'
Driver info: driver.version: firefox
Please provide any additional information below.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=14
Original author: frankgroeneveld (February 01, 2010 13:56:46)
Today, I finished the PropertiesFile class. This class allows you to read
in a very simple property file (no support for database) and sets all the
settings on a CrawljaxConfiguration object. This means that we can remove
the PropertyHelper and use CrawljaxConfiguration objects everywhere.
However, we need to decide how we are going to access the
CrawljaxConfiguration object.
I propose this:
add a static final CrawljaxConfiguration config to CrawljaxController which
is set in the CrawljaxController constructor.
Add a static getConfig method to CrawljaxConfiguration to get the instance.
This means that the configuration object is always consistent, because you
can't start the controller without a config. (So you can't use an old
config like with the PropertyHelper because you forgot to call
PropertyHelper.init()).
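The proposed pattern can be sketched as follows. This is only an illustration of the idea: the class and member names mirror the proposal above and are not the actual Crawljax API.

```java
// Sketch of the proposed pattern: the controller installs the configuration
// once in its constructor, and a static getter exposes it afterwards.
// All names here are illustrative, not the real Crawljax classes.
public class ConfigSketch {

    static class CrawljaxConfiguration {
        private static CrawljaxConfiguration instance;

        static void setInstance(CrawljaxConfiguration config) {
            instance = config;
        }

        static CrawljaxConfiguration getConfig() {
            if (instance == null) {
                // Unlike PropertyHelper.init(), forgetting this is impossible:
                // the controller constructor is the only place the config is set.
                throw new IllegalStateException("No configuration set: start the controller first");
            }
            return instance;
        }
    }

    static class CrawljaxController {
        CrawljaxController(CrawljaxConfiguration config) {
            // A running controller therefore always implies a consistent config.
            CrawljaxConfiguration.setInstance(config);
        }
    }

    public static void main(String[] args) {
        new CrawljaxController(new CrawljaxConfiguration());
        System.out.println(CrawljaxConfiguration.getConfig() != null); // prints true
    }
}
```

The key property is the one stated above: you cannot obtain a config without having started the controller, so there is no stale-config failure mode.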
Original issue: http://code.google.com/p/crawljax/issues/detail?id=5
Original author: frankgroeneveld (February 05, 2010 08:01:57)
This is due to:
/**
Which obviously gives problems if crawlPath.size() == 0
Original issue: http://code.google.com/p/crawljax/issues/detail?id=8
Original author: [email protected] (January 18, 2010 19:24:23)
While crawling, it should be possible to automatically click OK on the following pop-ups: alert, confirm, and prompt
In some web applications, the crawler just keeps waiting for the OK (or
cancel) button to be clicked.
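One common way to achieve this, sketched below, is to inject JavaScript that replaces alert, confirm, and prompt with non-blocking stubs that behave as if OK was clicked. This is an assumption about how the feature could work, not Crawljax's actual implementation, and the `executeJavaScript` integration point mentioned in the comment is hypothetical.

```java
// A minimal sketch: JavaScript that, when injected into each new state,
// auto-accepts the three blocking pop-up types so the crawler never hangs.
public class PopupStub {
    static final String HANDLE_POPUPS_JS =
        "window.alert = function(msg) {};"                   // alert: dismissed, no value
      + "window.confirm = function(msg) { return true; };"   // confirm: as if OK was clicked
      + "window.prompt = function(msg) { return ''; };";     // prompt: OK with empty input

    public static void main(String[] args) {
        // In a real crawl this string would be passed to something like
        // browser.executeJavaScript(HANDLE_POPUPS_JS) before firing events
        // (the method name is an assumption for illustration).
        System.out.println(HANDLE_POPUPS_JS.contains("window.confirm"));
    }
}
```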
Original issue: http://code.google.com/p/crawljax/issues/detail?id=2
Original author: [email protected] (January 18, 2010 18:52:35)
It would be useful to specify the number of times certain elements have to
be examined to see if they cause a state change.
Currently each candidate clickable is only clicked once.
Possible solutions could look like:
crawler.click("a").withText("next").nrOfTimes(5);
crawler.click("a").withText("add").randomNrOfTimes();
crawler.click("a").withText("more").asLongAsExists();
Original issue: http://code.google.com/p/crawljax/issues/detail?id=1
Original author: [email protected] (July 23, 2011 02:15:49)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
test failure
What version of the product are you using? On what operating system?
firefox 5
Please provide any additional information below.
Updating the POM file to replace all selenium dependencies with the following makes the tests pass.
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<version>2.1.0</version>
</dependency>
Original issue: http://code.google.com/p/crawljax/issues/detail?id=48
Original author: [email protected] (January 18, 2010 20:38:40)
Frames are currently ignored by Crawljax.
There should be an algorithm that switches into and crawls each frame
automatically. Each frame could be seen either as a new state or as part of
the initial DOM state that includes the frame.
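The second option, treating frame content as part of the containing state, can be modeled with a toy recursion over a frame tree. The `Frame` class and `mergedState` method are illustrative stand-ins only, not Crawljax code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the "part of the initial DOM state" option: depth-first,
// every (i)frame's DOM is concatenated into one combined state string.
public class FrameCrawl {
    static class Frame {
        String dom;
        List<Frame> children = new ArrayList<>();
        Frame(String dom) { this.dom = dom; }
    }

    static String mergedState(Frame root) {
        StringBuilder sb = new StringBuilder(root.dom);
        for (Frame child : root.children) {
            sb.append(mergedState(child)); // recurse into nested frames
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Frame top = new Frame("<body>");
        top.children.add(new Frame("<iframe-dom>"));
        System.out.println(mergedState(top)); // prints <body><iframe-dom>
    }
}
```

The alternative, each frame as its own state, would instead create a new state node per frame and an edge for the switch into it.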
Original issue: http://code.google.com/p/crawljax/issues/detail?id=3
Original author: frankgroeneveld (March 05, 2010 14:07:20)
Apparently the "handlePopups" function is invalid for chrome. The error
console gives me this:
"Uncaught SyntaxError: Unexpected token ILLEGAL".
Furthermore, when running the largecrawltest for Chrome (with handlepopups
disabled), it will result in a "hang" or broken connection between Crawljax
and Chrome when executing the precrawling plugins.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=11
Original author: [email protected] (April 16, 2010 15:36:13)
I have two ideas for speeding up backtracking:
1. Example: Crawljax opens test.com, clicks About -> test.com/about.php -> clicks some element which does some Ajax stuff. When backtracking, go to test.com/about.php instead of test.com.
2. When backtracking, Crawljax knows which link it wants to click, so it should click on the element immediately after finding it, instead of waiting a fixed amount of time.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=18
Original author: [email protected] (August 16, 2011 08:12:30)
What steps will reproduce the problem?
1. Saving the states into a database or flat files.
2. Rerunning the session from the last state.
3. If any crash happens, the state is saved automatically.
What is the expected output? What do you see instead?
A database handler that manages the states, supporting reading them back and resuming the crawler.
What version of the product are you using? On what operating system?
2.0
Please provide any additional information below.
I'm developing a GUI for crawler configuration; when I finish it, I can give you the source if you need it.
Good luck
Original issue: http://code.google.com/p/crawljax/issues/detail?id=49
Original author: frankgroeneveld (April 16, 2010 09:22:09)
For the next release, we need more coverage and better tests.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=17
Original author: [email protected] (August 04, 2010 15:55:09)
When using one or more wait conditions during crawling, and the first wait condition takes a long time (> timeout) but succeeds, an IndexOutOfBoundsException is thrown, due to a bug:
In WaitCondition line 98 the index is increased, and the later log event uses the increased index to retrieve the WaitCondition (line 110).
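The reported off-by-one can be reconstructed roughly as below. The two lookup methods are hypothetical stand-ins for the real WaitCondition code, written only to show the failure mode and the fix.

```java
import java.util.Arrays;
import java.util.List;

public class WaitBug {
    // Buggy shape: the index is incremented (as at WaitCondition:98) before
    // the log statement (as at line 110) reuses it to fetch the condition.
    static String buggyLogLookup(List<String> conditions) {
        int index = 0;
        index++; // wait for conditions.get(0) succeeded, move to the next
        return conditions.get(index); // throws if the last condition just ran
    }

    // Fixed shape: capture the condition that actually ran before incrementing.
    static String fixedLogLookup(List<String> conditions) {
        int index = 0;
        String current = conditions.get(index);
        index++;
        return current; // log the condition that was just waited on
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("waitForElement");
        try {
            buggyLogLookup(one);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("buggy lookup threw IndexOutOfBoundsException");
        }
        System.out.println("fixed: " + fixedLogLookup(one));
    }
}
```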
Original issue: http://code.google.com/p/crawljax/issues/detail?id=30
Original author: [email protected] (October 27, 2011 15:14:22)
What steps will reproduce the problem?
What is the expected output?
Get a state-flow graph where state A and B can reach C
What do you see instead?
A state-flow graph where only state A or B has an outgoing transition to C, depending on which state was crawled first.
What version of the product are you using? On what operating system?
2.1-SNAPSHOT
Please provide any additional information below.
The reason why the behavior is different, is because Crawljax checks if potential candidate elements have been checked before, in order to prevent duplicate work (see com.crawljax.core.CandidateElementExtractor, lines 282 and 169). While it is indeed true that these transitions should not be /followed/ (since it is already known where they lead to), they should still be /added/ to the sfg.
Workaround:
Removing the optimization by ignoring the checkedElements manager yields good results, albeit at the cost of some extra work.
In com.crawljax.core.CandidateElementExtractor, change lines 282 to 284 to
if (matchesXpath && isElementVisible(dom, element)
&& !filterElement(attributes, element)) {
And in the same file, the if condition on line 169 becomes obsolete, i.e., change it to
if (!clickOnce || true) {
Original issue: http://code.google.com/p/crawljax/issues/detail?id=50
Original author: frankgroeneveld (March 17, 2010 13:45:32)
Find out if Helper.getContents can be replaced with
Helper.getContentsWithLineEndings without breaking any core functionality.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=12
Original author: [email protected] (July 08, 2010 00:23:49)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
The test exits immediately, saying that zero elements were examined ("EXAMINED ELEMENTS: 0"), which is obviously incorrect.
What version of the product are you using? On what operating system?
1.91 from sources on Windows 7
Please provide any additional information below.
I'm running on older versions of chrome that I compiled from source to avoid running version 5.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=28
Original author: [email protected] (April 05, 2010 18:00:50)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
I use this with dontClick, and Crawljax ignores the first result while
continuing to crawl the rest of the results, even though it shouldn't.
What version of the product are you using? On what operating system?
Latest version of crawljax on Windows 7
Please provide any additional information below.
I believe this problem occurs because
EventableConditionChecker.checkXpathStartsWithXpathEventableCondition uses
XPathHelper.getXpathForXPathExpression, which returns only the first matching
node; if all matching nodes were returned and checked, it would work
properly.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=16
Original author: [email protected] (August 30, 2010 11:35:23)
The session.getExactEventPath() method returns an incorrect list of clickables.
This list should always contain the exact clickable elements from the index state to the current state.
getExactEventPath().get(getExactEventPath().size() - 1) should return the last clickable item. The best way to check that this functionality is broken is to add an OnNewState plugin:
{{{
crawljaxConfiguration.addPlugin(new OnNewStatePlugin() {
    public void onNewState(CrawlSession session) {
        List<Eventable> events = session.getExactEventPath();
        if (events.size() > 0) {
            Eventable lastEvent = events.get(events.size() - 1);
            LOGGER.info(lastEvent);
        }
    }
});
}}}
Original issue: http://code.google.com/p/crawljax/issues/detail?id=32
Original author: [email protected] (May 18, 2011 04:15:12)
What steps will reproduce the problem?
What is the expected output? What do you see instead?
Even after you set crawlWaitEvent, it doesn't get set.
Because of the assignment order in the constructors, the two variables get each other's values.
What version of the product are you using? On what operating system?
2.1-SNAPSHOT
Please provide any additional information below.
Just make crawlWaitEvent come before crawlWaitReload in all the constructors.
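This class of bug, same-typed constructor arguments assigned to the wrong fields, can be illustrated with a minimal sketch. The class and field names only mirror the issue; the real CrawlSpecification code may differ.

```java
public class SwapBug {
    // Buggy shape: the two int arguments are assigned to the wrong fields,
    // so the values silently swap and the compiler cannot catch it.
    static class BuggySpec {
        final int crawlWaitReload;
        final int crawlWaitEvent;
        BuggySpec(int waitEvent, int waitReload) {
            this.crawlWaitReload = waitEvent;  // bug: swapped
            this.crawlWaitEvent = waitReload;  // bug: swapped
        }
    }

    // Fixed shape: each argument goes to its matching field.
    static class FixedSpec {
        final int crawlWaitReload;
        final int crawlWaitEvent;
        FixedSpec(int waitEvent, int waitReload) {
            this.crawlWaitEvent = waitEvent;
            this.crawlWaitReload = waitReload;
        }
    }

    public static void main(String[] args) {
        BuggySpec b = new BuggySpec(500, 1000);
        System.out.println(b.crawlWaitEvent);  // prints 1000, not the 500 you set
        FixedSpec f = new FixedSpec(500, 1000);
        System.out.println(f.crawlWaitEvent);  // prints 500
    }
}
```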
Original issue: http://code.google.com/p/crawljax/issues/detail?id=47
Original author: frankgroeneveld (May 19, 2010 14:42:27)
Invariant checks should be executed during back-tracking as well, because
you might find failures then as well.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=22
Original author: [email protected] (July 20, 2010 14:22:44)
Enable Crawljax to exclude certain parts of iframes.
This should be specified in one of the following ways:
The problem with 1 is that all the source must be loaded to know which parts of the iframe must be included, while with 2 the decision to descend into an iframe can be taken before actually descending into it.
Original issue: http://code.google.com/p/crawljax/issues/detail?id=29