Comments (3)
Smaller websites can be crawled normally, but large websites produce the error shown above.
from crawlergo.
Perhaps you only need to set the timeout parameters and use a proxy with a good network connection.
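A minimal sketch of that suggestion, assuming crawlergo is driven from a Python wrapper. The `-c`, `-t`, `--tab-run-timeout`, and `--wait-dom-content-loaded-timeout` flags are the ones used elsewhere in this thread; the `--request-proxy` flag, the helper names `build_crawlergo_cmd` and `run_crawl`, and the default values are assumptions for illustration, so check `crawlergo --help` for the options your build actually supports.

```python
import subprocess
from typing import List, Optional


def build_crawlergo_cmd(
    target: str,
    chromium_path: str = "/usr/bin/chromium-browser",
    max_tabs: int = 2,
    tab_run_timeout: str = "60s",
    dom_loaded_timeout: str = "60s",
    proxy: Optional[str] = None,
    binary: str = "./crawlergo",
) -> List[str]:
    """Build the crawlergo argument list with explicit timeout settings."""
    cmd = [
        binary,
        "-c", chromium_path,
        "-t", str(max_tabs),
        f"--tab-run-timeout={tab_run_timeout}",
        f"--wait-dom-content-loaded-timeout={dom_loaded_timeout}",
    ]
    if proxy:
        # Assumption: --request-proxy is the proxy option in your crawlergo build.
        cmd += ["--request-proxy", proxy]
    cmd.append(target)
    return cmd


def run_crawl(cmd: List[str], overall_timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run the crawl; raises subprocess.TimeoutExpired if the whole process hangs."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=overall_timeout)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without crawlergo installed.
    print(" ".join(build_crawlergo_cmd("https://www.amazon.com/", tab_run_timeout="120s")))
```

The overall `timeout` passed to `subprocess.run` is independent of crawlergo's own per-tab timeouts; it only guards against the wrapper waiting forever if the process stalls.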
from crawlergo.
I set up a server on AWS with 8 CPUs and 32 GB of RAM to retest.
The internet download speed is 500 Mbit/s. I ran the following command:
./crawlergo -c /usr/bin/chromium-browser -t 2 --tab-run-timeout=60s --wait-dom-content-loaded-timeout=60s https://www.amazon.com/
Result: navigate timeout.
It starts crawling at first, but once the timeout period elapses it fails with a "navigate timeout" error.
The timeout also appears in the screenshot you shared. If the screenshot showed more of the output, I'm sure we would see more timeout errors.
Your suggested solution didn't solve the problem, but thank you for your interest.
from crawlergo.