Comments (6)
Hi John,
boilerpipe comes with a patched version of NekoHTML. So unless you are using a
different SAX parser or you have an unpatched NekoHTML on your classpath,
you should not see this error at all.
Could you please give me a URL that I can check against my local installation
of boilerpipe?
Best,
Christian
Original comment by ckkohl79
on 6 Feb 2012 at 5:33
from boilerpipe.
I'm running into this issue with
http://heraldnews.suntimes.com/news/10325442-418/voting-map-redrawn-by-those-in-power-with-hope-of-keeping-it.html
Original comment by [email protected]
on 29 Feb 2012 at 8:09
from boilerpipe.
Hi carylee,
Thanks for this feedback. This page works fine here with the latest version
from trunk as well as with the previous version on
http://boilerpipe-web.appspot.com/
Could you please check out that version from SVN and try again?
I am pretty sure that this is a classpath issue. Please ensure that you really
have the patched versions of NekoHTML's HTMLElements and HTMLTagBalancer (which
come with boilerpipe-core) included in your classpath *before* the original
nekohtml-1.9.13.jar.
Original comment by ckkohl79
on 21 Mar 2012 at 9:18
from boilerpipe.
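One way to check the classpath ordering described above is to ask the JVM where it actually loaded the two classes from. The following is only a minimal diagnostic sketch, not part of boilerpipe: the class name `ClasspathCheck` and the helper `locate` are hypothetical, and the package name `org.cyberneko.html` is assumed from NekoHTML itself rather than stated in this thread.

```java
import java.net.URL;
import java.security.CodeSource;
import java.security.ProtectionDomain;

public class ClasspathCheck {

    /** Returns the location a class was loaded from, or a short note if it is unknown. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            ProtectionDomain pd = c.getProtectionDomain();
            CodeSource src = (pd == null) ? null : pd.getCodeSource();
            URL url = (src == null) ? null : src.getLocation();
            return (url == null) ? "(no code source; JDK/bootstrap class)" : url.toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified names of the two classes mentioned in the comment above
        String[] names = {
            "org.cyberneko.html.HTMLElements",
            "org.cyberneko.html.HTMLTagBalancer"
        };
        for (String n : names) {
            System.out.println(n + " -> " + locate(n));
        }
    }
}
```

If the printed location points at nekohtml-1.9.13.jar rather than at the boilerpipe-core jar or classes directory, the unpatched classes are being picked up first and the classpath order probably needs to change.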
[deleted comment]
from boilerpipe.
[deleted comment]
from boilerpipe.
Original comment by ckkohl79
on 27 Jun 2012 at 4:19
- Changed state: Fixed
from boilerpipe.