Comments (5)

yutiansut commented on July 4, 2024

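Haha, it's nowhere near as complicated as scrapy~~

Most people's needs are actually quite simple: the target site either has anti-scraping measures or it doesn't. I plan to standardize three sets of code: one that scrapes directly, with no login and no restrictions of any kind; one that adds simulated login and a random User-Agent; and a third that also rotates the session ID and cookies.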
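The three tiers described above could be sketched roughly as follows. This is only an illustration built on the Python standard library, not the code from the project: the class names, the `USER_AGENTS` pool, the `rotate_every` parameter, and the `make_fetcher` helper are all hypothetical, and the login step is left site-specific.

```python
import random
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical User-Agent pool; a real one would be larger and kept up to date.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]


class PlainFetcher:
    """Tier 1: no login, no restrictions; just fetch the URL."""

    def __init__(self):
        self.opener = urllib.request.build_opener()

    def headers(self):
        return {}

    def build_request(self, url):
        return urllib.request.Request(url, headers=self.headers())

    def fetch(self, url, timeout=10):
        with self.opener.open(self.build_request(url), timeout=timeout) as resp:
            return resp.read()


class LoginFetcher(PlainFetcher):
    """Tier 2: keeps login cookies and sends a random User-Agent per request."""

    def __init__(self):
        self.cookies = CookieJar()
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookies)
        )

    def headers(self):
        return {"User-Agent": random.choice(USER_AGENTS)}

    def login(self, login_url, form_bytes, timeout=10):
        # Site-specific: POST the already-encoded login form so the
        # cookie jar picks up the session cookie.
        req = urllib.request.Request(
            login_url, data=form_bytes, headers=self.headers()
        )
        with self.opener.open(req, timeout=timeout) as resp:
            return resp.status


class RotatingFetcher(LoginFetcher):
    """Tier 3: additionally drops the session after a fixed number of requests."""

    def __init__(self, rotate_every=20):
        super().__init__()
        self.rotate_every = rotate_every
        self.count = 0

    def build_request(self, url):
        if self.count and self.count % self.rotate_every == 0:
            # Discard session id / cookies; the caller may log in again.
            self.cookies.clear()
        self.count += 1
        return super().build_request(url)


# Hypothetical mapping from anti-scraping level to fetcher tier.
FETCHERS = {0: PlainFetcher, 1: LoginFetcher, 2: RotatingFetcher}


def make_fetcher(anti_scraping_level):
    """Pick a fetcher by the target site's anti-scraping level (0, 1 or 2)."""
    return FETCHERS[anti_scraping_level]()
```

A caller would pick a tier with something like `make_fetcher(1).fetch(url)`. Using inheritance keeps each tier a small delta over the previous one, which is one way to get the shared-code reuse the comment is after.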

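Then standardize the output, mainly data and public-sentiment information, similar to your Mod concept. I think ricequant's mod idea is great.
That way everyone can pick a framework themselves based on the needs of the site they want to scrape and the target site's anti-scraping level, and because the results are stored in the database in a standardized form, they can easily be integrated into the original framework.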
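As a rough sketch of what "standardized output stored in the database" might look like: the `Record` fields, the `records` table, and the choice of SQLite here are all assumptions made for illustration; the thread does not say which schema or database is intended.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict


@dataclass
class Record:
    """Hypothetical standardized output shared by every scraper."""
    source: str      # site the item came from
    kind: str        # "data" or "sentiment"
    fetched_at: str  # ISO 8601 timestamp
    payload: dict    # site-specific fields, stored as JSON text


def init_db(conn):
    """Create the (assumed) shared table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "source TEXT, kind TEXT, fetched_at TEXT, payload TEXT)"
    )


def save(conn, record):
    """Serialize one record and insert it using named parameters."""
    row = asdict(record)
    row["payload"] = json.dumps(row["payload"], ensure_ascii=False)
    conn.execute(
        "INSERT INTO records VALUES (:source, :kind, :fetched_at, :payload)",
        row,
    )
    conn.commit()
```

Whichever tier of scraper produced an item, it would be wrapped in a `Record` and passed to `save`, so downstream code only ever reads one table shape.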

It's a bit like a scaffold, which makes it easier to reuse shared code.

from quantaxis.

yutiansut commented on July 4, 2024

I'm planning to combine the scraper frameworks, standardize them, and turn them into a scraper scaffold.

wh1100717 commented on July 4, 2024

@yutiansut Haha, nice! So you're planning to build your own scrapy?

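wh1100717 commented on July 4, 2024

@yutiansut Right, but scrapy's pipeline mechanism is still worth borrowing from. Each person's needs may be fairly simple, but their business scenarios are not necessarily the same, so building a general-purpose scaffold is still rather hard.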
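To illustrate the pipeline idea in isolation, here is a minimal sketch that is only loosely inspired by scrapy's item pipelines and does not use scrapy's API. The `Pipeline` class and the two stage functions are hypothetical: each stage takes an item and returns it (possibly modified), or returns `None` to drop it.

```python
class Pipeline:
    """Run an item through a sequence of stages, stopping if one drops it."""

    def __init__(self, *stages):
        self.stages = stages

    def process(self, item):
        for stage in self.stages:
            item = stage(item)
            if item is None:  # a stage dropped the item
                return None
        return item


def strip_title(item):
    """Hypothetical stage: return a copy with the title whitespace trimmed."""
    return {**item, "title": item.get("title", "").strip()}


def drop_empty(item):
    """Hypothetical stage: drop items whose title is empty."""
    return item if item["title"] else None


pipeline = Pipeline(strip_title, drop_empty)
kept = pipeline.process({"title": "  headline  "})
dropped = pipeline.process({"title": "   "})
```

Because stages are plain callables, each user can keep the shared fetching code and swap in their own stages, which is one way to handle differing business scenarios without forking the scaffold.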

better319 commented on July 4, 2024

That mod pattern really is quite good.
