Comments (5)
Then, after it had scraped a little over a hundred IPs, it stopped on its own:
haipproxy_1 | 2018/03/10 09:55:07| Closing HTTP port [::]:3128
haipproxy_1 | 2018/03/10 09:55:07| storeDirWriteCleanLogs: Starting...
haipproxy_1 | 2018/03/10 09:55:07| Finished. Wrote 0 entries.
haipproxy_1 | 2018/03/10 09:55:07| Took 0.00 seconds ( 0.00 entries/sec).
haipproxy_1 | Aborted (core dumped)
haipproxymaster_haipproxy_1 exited with code 134
A while later, requesting IPs returned 0 again.
from haipproxy.
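Please check the following:
(1) Look at the state of the IPs in redis; redisdesktopmanager is recommended for this.
(2) Check whether your squid has access control configured. If your squid is exposed to the public internet and you have not set up access control for it, then congratulations: your server has almost certainly been found by port scanners and is being used as a zombie host. In that case, look into setting access permissions for squid.
(3) It is indeed possible for the pool to run out of IPs, but this is very rare. When it happens, py_cli lowers its IP filtering requirements, but it seems the squid code forgot to lower the criteria in the same way. That is why the number of proxies fetched is sometimes 0.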
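To make point (3) concrete, here is a minimal sketch of the kind of "lower the criteria when nothing qualifies" fallback described above. It is not haipproxy's actual code: the function name `select_proxies`, the `(url, score, response_time)` record layout, and the threshold values are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (proxy_url, score, response_time_in_seconds)
Proxy = Tuple[str, float, float]


def select_proxies(pool: Iterable[Proxy],
                   min_score: float = 7.0,
                   max_time: float = 5.0,
                   relax_steps: int = 3) -> List[str]:
    """Return proxies that meet the thresholds, relaxing them if none do."""
    candidates = list(pool)
    for _ in range(relax_steps + 1):
        chosen = [url for url, score, rtt in candidates
                  if score >= min_score and rtt <= max_time]
        if chosen:
            return chosen
        # Nothing qualified: lower the bar and try again.
        min_score -= 1.0
        max_time *= 2
    # Last resort: return whatever is in the pool rather than nothing.
    return [url for url, _, _ in candidates]
```

With a fallback like this, a consumer only gets an empty result when the pool itself is empty, which is the behaviour the squid side apparently lacks.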
from haipproxy.
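I'll look into the first point in a moment.
As for the second point, I had already commented out the following lines in the Dockerfile: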
RUN apt install squid -yq
RUN sed -i 's/http_access deny all/http_access deny all/g' /etc/squid/squid.conf
RUN cp /etc/squid/squid.conf /etc/squid/squid.conf.backup
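Those are the three squid-related lines; however, squid-update.py in Run.sh is not commented out.
I don't use squid and it doesn't really fit my needs, so I'd like to ask for an official way to disable squid, or to not install it, in the Docker setup. Thanks.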
from haipproxy.
Isn't the log you posted squid's log?
haipproxy_1 | 2018/03/10 09:55:07| Closing HTTP port [::]:3128
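Port 3128 is clearly squid's port.
In theory, once you have commented out the squid install commands, squid should not be able to start at all.
If you don't want to install squid, then besides deleting the relevant parts of the Dockerfile, also change the contents of run.sh to: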
#!/bin/bash
nohup python crawler_booter.py --usage crawler common > crawler.log 2>&1 &
nohup python scheduler_booter.py --usage crawler common > crawler_scheduler.log 2>&1 &
nohup python crawler_booter.py --usage validator init > init_validator.log 2>&1 &
nohup python crawler_booter.py --usage validator https > https_validator.log 2>&1&
python scheduler_booter.py --usage validator https
Something along these lines.
Then check with telnet or docker exec; with this setup squid should not be present.
After that, see whether your problem still occurs.
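If telnet is not at hand, the same check can be sketched in Python with the standard library. This is only an illustration: the helper name `port_is_open` is made up here, and the host and port you pass in depend on how your container ports are mapped.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, etc.
        return False


if __name__ == "__main__":
    # 3128 is the squid port seen in the log above.
    print("squid port reachable:", port_is_open("127.0.0.1", 3128))
```

A result of `False` only means nothing accepted the connection from where the script ran; run it from outside the server as well to see whether the port is exposed publicly.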
from haipproxy.
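I tested it: even with the squid parts removed from both run.sh and the Dockerfile, the squid service still gets started and the machine still ends up acting as a proxy, which is very strange.
So I still hope you can add an option that makes installing squid optional. Otherwise, if the port is left unchanged, I suspect many people who deploy this IP pool are having their servers used for other purposes.
Also, a reminder to other users: remember to change the host port for splash as well.
Thanks for all your work.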
from haipproxy.