wechat_article's Introduction

WeChat_Article

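Scrape articles from WeChat Official Accounts.

Bilibili video demo: https://www.bilibili.com/video/BV1vN411D7Y3/

Note: unless you want to resume an interrupted download, delete conf.ini and url.json from the program directory before starting!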
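
For anyone running from source, the cleanup in the note above can be scripted. This is a minimal sketch, not part of the project: the file names come from the note, while the function name `clear_saved_state` and the `app_dir` parameter are hypothetical.

```python
from pathlib import Path

# File names taken from the note above; everything else here is illustrative.
STATE_FILES = ("conf.ini", "url.json")


def clear_saved_state(app_dir="."):
    """Delete leftover state files so the next run starts fresh.

    Returns the names of the files that were actually removed.
    """
    removed = []
    for name in STATE_FILES:
        path = Path(app_dir) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed

# Usage: call clear_saved_state("path/to/program/dir") before launching,
# and skip the call when you do want to resume an interrupted download.
```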

Usage:

1. Download and extract Chrome.rar
2. Run main.exe
3. Fill in the required information and click "Start".
4. To modify the UI, install Qt Designer.


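Background:

When writing an article in the Official Account editor, you can search for articles published by other Official Accounts. This tool uses that search feature to crawl all articles of a specified Official Account.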


How it works:

Log in through selenium to obtain the token and cookie, then crawl and download the articles automatically.
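
The post-login step might look roughly like the sketch below. It assumes the token appears as a `token` query parameter in the page URL after login, and that cookies arrive in the list-of-dicts shape returned by selenium's `driver.get_cookies()`. The helper names and the example URL are made up for illustration and are not taken from main.py.

```python
from urllib.parse import urlparse, parse_qs


def extract_token(url):
    """Return the value of the `token` query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("token")
    return values[0] if values else None


def cookies_to_dict(cookie_list):
    """Flatten selenium-style cookies (a list of dicts) into {name: value}."""
    return {c["name"]: c["value"] for c in cookie_list}


# Made-up inputs. With selenium, `driver.current_url` and
# `driver.get_cookies()` would supply the real values after login.
url = "https://example.com/home?lang=zh_CN&token=123456"
cookies = [{"name": "sid", "value": "abc"}, {"name": "uid", "value": "42"}]

print(extract_token(url))        # 123456
print(cookies_to_dict(cookies))  # {'sid': 'abc', 'uid': '42'}
```

The resulting dict can be passed to an HTTP client as the cookie jar, so later requests do not need the browser.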


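Changelog:

  1. Download article text to txt
  2. Download article images
  3. Save the HTML file, with image links pointing to the local copies
  4. Add download by time range
  5. Add cookie login; fall back to selenium browser login only if it fails
  6. Add a remember-password feature
  7. Fix some issues, such as requests hanging
  8. Add download by keyword
  9. Use multithreading to speed up downloads
  10. Add resumable downloads (may be buggy; not recommended)
  11. Planned: backup Official Account feature (not finished yet)
  12. Download in PDF format
  13. Chrome no longer needs to be downloaded manually; it is downloaded automatically at startup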
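
Items 4 and 8 (download by time range and by keyword) amount to filtering the article list before downloading. Below is a minimal sketch of one way to do that. The record fields `title` and `published` and the function `filter_articles` are hypothetical, and the project's actual data structures may differ.

```python
from datetime import datetime


def filter_articles(articles, start=None, end=None, keyword=None):
    """Keep articles published within [start, end] whose title contains keyword.

    Any filter left as None is skipped.
    """
    kept = []
    for article in articles:
        if start is not None and article["published"] < start:
            continue
        if end is not None and article["published"] > end:
            continue
        if keyword and keyword not in article["title"]:
            continue
        kept.append(article)
    return kept


# Made-up records for illustration.
articles = [
    {"title": "Python tips", "published": datetime(2023, 1, 5)},
    {"title": "Qt notes", "published": datetime(2023, 3, 1)},
    {"title": "Python and Qt", "published": datetime(2023, 6, 20)},
]

selected = filter_articles(
    articles,
    start=datetime(2023, 2, 1),
    end=datetime(2023, 12, 31),
    keyword="Python",
)
print([a["title"] for a in selected])  # ['Python and Qt']
```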

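Running from source:

Create a virtual environment

conda create -n wechat python=3.9 -y

Activate the virtual environment

conda activate wechat

Install the third-party libraries

pip install -r requirements.txt

On macOS, installing pyqt5 may fail. In that case you can try the following (adjust the version and paths for your machine):

brew install pyqt@5
cp -r /opt/homebrew/Cellar/pyqt@5/5.15.7_2/lib/python3.9/site-packages/* /Users/songxf/miniconda3/envs/wechat/lib/python3.9/site-packages/

After that, the import should work:

import PyQt5

Run the script

python main.py

Package as an exe (generated under dist)

pyinstaller -F -w -i icon.ico main.py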

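Other notes:

  • If requests are sent too quickly, you are likely to hit "access too frequent" (访问频繁) or "freq_control". When that happens, delete cookie.json, restart the program, and continue with a different account;
  • The Qt build is really too large once packaged. Could anyone port it to Tkinter?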
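
One way to lower the chance of hitting the frequency limit is to pause a random interval between requests and stop as soon as a rate-limit reply shows up. The sketch below illustrates that idea. It assumes the two markers quoted above appear verbatim in the response body; `fetch` is a hypothetical callable, and the 3 to 8 second delay is an arbitrary choice, not a value from the project.

```python
import random
import time

# Assumption: one of these strings appears verbatim in a rate-limited response.
RATE_LIMIT_MARKERS = ("freq_control", "访问频繁")


def is_rate_limited(response_text):
    """Return True if the response body looks like a rate-limit reply."""
    return any(marker in response_text for marker in RATE_LIMIT_MARKERS)


def polite_sleep(min_s=3.0, max_s=8.0, sleep=time.sleep):
    """Pause for a random interval between requests and return the delay."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def crawl(urls, fetch, sleep=time.sleep):
    """Fetch urls in order, stopping at the first rate-limited response.

    `fetch` takes a URL and returns the response text.
    Returns (pages fetched so far, whether a rate limit was hit).
    """
    pages = []
    for url in urls:
        text = fetch(url)
        if is_rate_limited(text):
            return pages, True
        pages.append(text)
        polite_sleep(sleep=sleep)
    return pages, False
```

When `crawl` reports a rate limit, the pages collected so far are still usable, and the remaining URLs can be retried later or with another account as described above.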

Follow the WeChat Official Account: xfxuezhang


Donate

If this project helped you, feel free to buy me a cola 👏🏻

wechat_article's People

Contributors

1061700625, songxf1024


wechat_article's Issues

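After logging in through Chromium and clicking Confirm, nothing happens?

On Win11, I double-clicked main.exe, entered the target Official Account ID, and clicked Start:

  • A black console window appears (with no output)
  • Chromium launches and connects to the Official Accounts Platform, and a dialog says "Click Confirm after logging in". I scanned the QR code, logged in, and clicked Confirm, but nothing further happened.
  • The main window shows no new output either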

Problem encountered

After starting the program I get the error selenium.common.exceptions.SessionNotCreatedException: Message: Unable to find a matching set of capabilities. Could you help me with this?

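Searching Official Accounts from an Official Account

When searching Official Accounts from an Official Account, crawling too much gets you blocked with the message "Too many operations, try again later" (操作太频繁,稍后再试). At first it recovered within an hour or two; now a full day has passed and it still has not recovered.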

Issue at main.py line 98

Sometimes there are more than 20 items, so wouldn't range(len(list_xxx)) be better than range(20)?
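
To illustrate the suggestion: iterating over the list itself, with `enumerate` when an index is needed, handles any number of items. This is a generic sketch, not the code at main.py line 98; `items` and `collect_titles` are hypothetical names.

```python
def collect_titles(items):
    """Return a numbered title for every item, however many there are."""
    titles = []
    for index, item in enumerate(items):
        titles.append(f"{index + 1}. {item['title']}")
    return titles


# With a hard-coded range(20), items 21 to 25 below would be skipped,
# and a list shorter than 20 would raise IndexError.
items = [{"title": f"article {n}"} for n in range(25)]
print(len(collect_titles(items)))  # 25
```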

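Question about packaging

Hi, I use Chrome. I tried replacing it with google, and running the program file directly works fine.
But how do I package it? Packaging with your bat file failed. Any guidance would be appreciated. Thanks.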
