Comments (6)
It appears that the broken XML prevented the script from finding links. Links are found now, but I never see any exceptions, even when I deliberately break the code.
from crawl-anywhere.
I would consider never getting any exceptions a problem. How else can you debug your script?
Can you provide me with a URL to a "broken" page?
The page has been fixed and is now in the production environment. For testing, though, you can take any web page and break the script somewhere. It should then throw an exception, but I'm not getting one.
"Break the script somewhere"? Which script?
The script that parses the URL.