Comments (18)
and in Source.java srcData.get(Name) seems to be a Long, so
protected int getSrcDataInt(String name) {
    if (srcData.containsKey(name)) return ((Integer)srcData.get(name)).intValue();
    String value = getSrcDataString(name); //.replace(".0", "");
    return Integer.parseInt(value);
}
must be
protected int getSrcDataInt(String name) {
    if (srcData.containsKey(name)) return ((Long)srcData.get(name)).intValue();
    String value = getSrcDataString(name); //.replace(".0", "");
    return Integer.parseInt(value);
}
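A type-tolerant variant may be safer than swapping one hard cast for another. The sketch below casts to java.lang.Number, which covers both Integer and Long. The class name SrcDataSketch, the put helper, and the stored values are assumptions for illustration only; this is not the project's actual Source.java.

```java
import java.util.HashMap;
import java.util.Map;

public class SrcDataSketch {
    // Hypothetical stand-in for the srcData map used in Source.java.
    private final Map<String, Object> srcData = new HashMap<String, Object>();

    public void put(String name, Object value) {
        srcData.put(name, value);
    }

    // Accepts Integer, Long, or any other Number; falls back to parsing text.
    public int getSrcDataInt(String name) {
        Object value = srcData.get(name);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value == null) {
            throw new IllegalArgumentException("No value for " + name);
        }
        return Integer.parseInt(value.toString().trim());
    }

    public static void main(String[] args) {
        SrcDataSketch s = new SrcDataSketch();
        s.put("asLong", Long.valueOf(7L));        // illustrative values
        s.put("asInteger", Integer.valueOf(2));
        s.put("asText", "10");
        System.out.println(s.getSrcDataInt("asLong"));
        System.out.println(s.getSrcDataInt("asInteger"));
        System.out.println(s.getSrcDataInt("asText"));
    }
}
```

Note that intValue() silently truncates a Long that does not fit in an int, so this only suits values known to be small.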
from crawl-anywhere.
What is the issue with the Long to Integer cast?
Is there an issue with one of the source setting parameters? I don't think so!
from crawl-anywhere.
At least one source parameter is a Long, so the cast to Integer fails, perhaps only in our environment.
from crawl-anywhere.
Can you provide the xml export of your source setting (export function) ?
from crawl-anywhere.
This issue got the label "bug". Do you still need the xml export of the source to investigate the Long/Integer cast in Source.java?
from crawl-anywhere.
Yes please.
from crawl-anywhere.
(See attached file: 525288c536c04.xml)
from crawl-anywhere.
Please send the file to [email protected]
from crawl-anywhere.
Hi,
I can't reproduce these issues, even with your export. Can you provide the exact scenario needed to reproduce each of these 2 issues? Which source parameter is a Long?
Which version of mongodb are you using? Is it a 64-bit version?
Regards.
from crawl-anywhere.
$ mongod --version
db version v2.2.3, pdfile version 4.5
Fri Oct 18 11:38:20 git version: nogitversion
64 bit
from crawl-anywhere.
Fri Oct 18 13:29:11 CEST 2013 - =================================
Fri Oct 18 13:29:11 CEST 2013 - Crawler starting (version: 4.0.0)
Fri Oct 18 13:29:11 CEST 2013 - Simultaneous sources crawled : 3
Fri Oct 18 13:29:11 CEST 2013 - account : 1
Fri Oct 18 13:29:11 CEST 2013 -
Fri Oct 18 13:29:11 CEST 2013 - =================================
Fri Oct 18 13:29:11 CEST 2013 -
Fri Oct 18 13:29:11 CEST 2013 - Sources to be crawled : 1
Fri Oct 18 13:29:11 CEST 2013 - Pushing source : 4
Fri Oct 18 13:29:11 CEST 2013 - Source data key-name: id_target
Fri Oct 18 13:29:11 CEST 2013 - Source data key-class: class java.lang.Long
Fri Oct 18 13:29:11 CEST 2013 - java.lang.Long cannot be cast to java.lang.Integer
Fri Oct 18 13:29:11 CEST 2013 - >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Fri Oct 18 13:29:11 CEST 2013 - >>>> Error = java.lang.Long cannot be cast to java.lang.String
Fri Oct 18 13:29:11 CEST 2013 - = java.lang.Thread.run(Thread.java:662)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.connectors.Source.getSrcDataString(Source.java:142)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.connectors.Source.getSrcDataInt(Source.java:124)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.connectors.Source.getTargetId(Source.java:205)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.connectors.Connector.initializeInternal(Connector.java:50)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.connectors.web.WebConnector.initialize(WebConnector.java:79)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.ProcessorSource.call(ProcessorSource.java:55)
Fri Oct 18 13:29:11 CEST 2013 - fr.eolya.crawler.ProcessorSource.call(ProcessorSource.java:20)
Fri Oct 18 13:29:11 CEST 2013 - java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
Fri Oct 18 13:29:11 CEST 2013 - java.util.concurrent.FutureTask.run(FutureTask.java:138)
Fri Oct 18 13:29:11 CEST 2013 - java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
Fri Oct 18 13:29:11 CEST 2013 - java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
Fri Oct 18 13:29:11 CEST 2013 - java.lang.Thread.run(Thread.java:662)
This log was produced with the following code snippet:
try {
    if (srcData.containsKey(name)) return ((Integer)srcData.get(name)).intValue();
}
catch(Exception e) {
    logger.log("Source data key-name: " +name);
    logger.log("Source data key-class: " +srcData.get(name).getClass());
    logger.log(e.getMessage());
}
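The trace also reports a second failure, "java.lang.Long cannot be cast to java.lang.String", raised from getSrcDataString. The body of that method is not shown in this thread, so the following is only a sketch under the assumption that it casts the stored value to String directly. The class name SrcStringSketch and the put helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SrcStringSketch {
    // Hypothetical stand-in for the srcData map; values may be String, Integer or Long.
    private final Map<String, Object> srcData = new HashMap<String, Object>();

    public void put(String name, Object value) {
        srcData.put(name, value);
    }

    // Converts the stored value to text instead of casting it to String.
    public String getSrcDataString(String name) {
        Object value = srcData.get(name);
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        SrcStringSketch s = new SrcStringSketch();
        s.put("id_target", Long.valueOf(1L)); // illustrative value
        System.out.println(s.getSrcDataString("id_target")); // prints 1
    }
}
```

Calling toString() on the value works for any stored type, whereas a (String) cast throws ClassCastException as soon as the database returns a number.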
from crawl-anywhere.
Hi,
Thank you for this trace.
Did you set up anything specific for the target? Did you create a target? Did you change the target for your source?
Dominique
from crawl-anywhere.
Please email me your Source.java file.
from crawl-anywhere.
I tried various things, but I still cannot reproduce it. Can you provide an export of your mongodb database (without the pages* collections)?
from crawl-anywhere.
Hello,
We have the same problem. The installation seems to be fine, and we entered our sources (~140), but crawling never starts and fails with the cast exception mentioned in this issue.
Did you find any solution or workaround?
Thanks
from crawl-anywhere.
Can you provide me a mongodb export ?
from crawl-anywhere.
Thanks for your quick answer. I just sent the export to contact at crawl-anywhere.com.
from crawl-anywhere.
Fixed
from crawl-anywhere.