m3u8 spider platform

Introduction

A crawler for sites that serve video as m3u8. It works on nearly all small and mid-sized m3u8 sites.

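Tech Stack

For generality, the stack is selenium + mitmproxy:

  • Stealthy.
  • Only the core code needs writing. Traffic is handled directly at the API level, which avoids parsing messy HTML.
  • Decoupled. selenium handles crawling; mitmproxy handles stealth and downloading.
  • selenium provides generality. Small sites and pirate sites almost all hotlink their media, their front-end anti-scraping is weak, and they rarely fingerprint selenium, so selenium gets past them easily.
  • mitmproxy provides flexibility. Some more careful sites do fingerprint selenium; in that case, intercept the responsible JS with mitmproxy and disable it.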
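
The mitmproxy side of this split can be sketched as a small addon. This is a minimal illustration, not the project's actual code: the class name `M3U8Sniffer`, the `BLOCKED_SCRIPT_PATTERNS` list, and the `anti-bot.js` URL pattern are all hypothetical. It relies only on mitmproxy's `response` hook and the `flow.request.pretty_url` / `flow.response.text` attributes.

```python
import re

# Hypothetical URL patterns for scripts that fingerprint selenium.
# Real sites differ; these would be filled in per target site.
BLOCKED_SCRIPT_PATTERNS = [re.compile(r"/anti-bot\.js(\?|$)")]


class M3U8Sniffer:
    """Sketch of a mitmproxy addon: disable detection JS, record playlists."""

    def __init__(self, blocked_patterns=BLOCKED_SCRIPT_PATTERNS):
        self.blocked_patterns = list(blocked_patterns)
        self.playlists = {}  # playlist URL -> playlist text

    def response(self, flow):
        # mitmproxy calls this hook once per completed HTTP response.
        url = flow.request.pretty_url
        if any(p.search(url) for p in self.blocked_patterns):
            # Replace the detection script with a harmless comment.
            flow.response.text = "/* neutralized by proxy */"
            return
        # Ignore the query string when checking the file extension.
        if url.split("?", 1)[0].endswith(".m3u8"):
            self.playlists[url] = flow.response.text


# mitmproxy picks up this list when the file is loaded with `mitmdump -s`.
addons = [M3U8Sniffer()]
```

With the file saved as, say, `sniffer.py`, it could be loaded with `mitmdump -s sniffer.py`, and the selenium-driven browser pointed at the proxy address. The browser then only has to open the page; everything else happens on the proxy side.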

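Note

Some especially cautious sites (or ones with a sloppy backend) return an m3u8 media playlist in which some segment resources are invalid; the m3u8 file has to be refreshed and the requests repeated. This is also why videos downloaded with many browser-extension-based sniffers (such as IDM) end up with missing content.

To handle this, I added a validation loop that repeats until every segment has been downloaded.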
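
One way such a validation loop might look is sketched below. It is an illustration under stated assumptions, not the project's implementation: `parse_segments`, `download_all`, the injected `fetch_text` / `fetch_bytes` callables, and the `max_rounds` limit are all hypothetical names. It also assumes a VOD playlist whose segment order is stable across refreshes, so segments are tracked by position rather than by URL.

```python
from urllib.parse import urljoin


def parse_segments(playlist_text, base_url):
    """Return absolute segment URLs from a media playlist.

    In an m3u8 file, lines starting with '#' are tags or comments;
    every other non-empty line is a URI.
    """
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(urljoin(base_url, line))
    return urls


def download_all(playlist_url, fetch_text, fetch_bytes, max_rounds=5):
    """Refresh the playlist and retry until no segment is missing.

    fetch_text(url) -> str and fetch_bytes(url) -> bytes are supplied by
    the caller; fetch_bytes is expected to raise OSError on a dead segment.
    """
    done = {}  # segment index -> downloaded bytes
    total = 0
    for _ in range(max_rounds):
        segments = parse_segments(fetch_text(playlist_url), playlist_url)
        total = len(segments)
        for index, url in enumerate(segments):
            if index in done:
                continue  # already fetched in an earlier round
            try:
                done[index] = fetch_bytes(url)
            except OSError:
                pass  # retry next round with the refreshed playlist
        if len(done) == total:
            return [done[i] for i in range(total)]
    raise RuntimeError(f"{total - len(done)} of {total} segments still missing")
```

Passing the fetch functions in as parameters keeps the loop independent of the HTTP client, so the same logic works whether the bytes come from a direct request or from responses captured by the proxy.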

License

MIT

ffmpeg is not used, so that the project can be released under the MIT license.

m3u8_spider_platform's People

Contributors

obgnail
