bilibilitool's Introduction

- Bilibili Tool -

                      //
          \\         //
           \\       //
    ##DDDDDDDDDDDDDDDDDDDDDD##
    ## DDDDDDDDDDDDDDDDDDDD ##
    ## hh                hh ##
    ## hh    //    \\    hh ##
    ## hh   //      \\   hh ##
    ## hh                hh ##
    ## hh      wwww      hh ##
    ## hh                hh ##
    ## MMMMMMMMMMMMMMMMMMMM ##
    ##MMMMMMMMMMMMMMMMMMMMMM##
         \/            \/
    ________   ___   ___        ___   ________   ___   ___        ___
   |\   __  \ |\  \ |\  \      |\  \ |\   __  \ |\  \ |\  \      |\  \
   \ \  \|\ /_\ \  \\ \  \     \ \  \\ \  \|\ /_\ \  \\ \  \     \ \  \
    \ \   __  \\ \  \\ \  \     \ \  \\ \   __  \\ \  \\ \  \     \ \  \
     \ \  \|\  \\ \  \\ \  \____ \ \  \\ \  \|\  \\ \  \\ \  \____ \ \  \
      \ \_______\\ \__\\ \_______\\ \__\\ \_______\\ \__\\ \_______\\ \__\
       \|_______| \|__| \|_______| \|__| \|_______| \|__| \|_______| \|__|

🛠️ A low-grade (read: useless) toolbox for Bilibili (B站)

updataState.py

Reports Bilibili's current upload status, for example whether uploads are crowded or completely full.

state.png

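Scraping requires you to supply the cookie string manually. Do not close the session those cookies belong to; in other words, keep the web page you copied the cookie from open.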

danMu.py

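Scrapes the danmaku (bullet comments) of a single video and, given the right arguments, saves them to a directory of your choice. No user cookie is needed, only the URL of the BV video.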
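Scripts that take a BV video URL need to pull the BV id out of it first. The following is a minimal sketch of that step, not the code from this repository. The function name `extract_bvid` is hypothetical, and the pattern assumes a BV id is "BV" followed by 10 alphanumeric characters.

```python
import re
from typing import Optional

# Assumption: a BV id is "BV" followed by 10 alphanumeric characters.
_BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(url: str) -> Optional[str]:
    """Return the first BV id found in the URL, or None if there is none."""
    match = _BV_PATTERN.search(url)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(extract_bvid("https://www.bilibili.com/video/BV1xx411c7mD?p=2"))
```

Returning None instead of raising lets the caller decide how to report a URL that does not look like a video page.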

Comment.py

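Scrapes the comments of a single video and, given the right arguments, saves them to a directory of your choice. No user cookie is needed, only the URL of the BV video.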

currentWatch.py

Gets the number of people currently watching the video. No user cookie is needed, only the URL of the BV video.

pageInfo.py

Gets information about each part (P) of the video, including its name and duration. No user cookie is needed, only the URL of the BV video.

videoExtraInfo.py

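Gets extra information about the video, such as aid, bvid, view (play count), danmaku (total danmaku count), reply (comment count), share (share count), like (like count), favorite (favorite count), and coin (coin count). No user cookie is needed, only the URL of the BV video.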
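As a rough illustration of the post-processing step for these fields, the sketch below picks them out of an already-decoded JSON response. It is not the repository's code. The field names come from the list above, but the nesting under "data" and "stat", the function name `extract_stats`, and the sample payload are all assumptions made up for this example.

```python
from typing import Any, Dict

# Field names from the description above. Where they sit in the response
# ("data" -> "stat") is an assumption about the layout.
STAT_FIELDS = ("view", "danmaku", "reply", "share", "like", "favorite", "coin")


def extract_stats(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the ids and counters into one flat dict; missing keys become None."""
    data = payload.get("data", {})
    stat = data.get("stat", {})
    result: Dict[str, Any] = {"aid": data.get("aid"), "bvid": data.get("bvid")}
    for name in STAT_FIELDS:
        result[name] = stat.get(name)
    return result


# Hypothetical payload, invented purely for illustration.
SAMPLE_PAYLOAD = {
    "data": {
        "aid": 123456,
        "bvid": "BV1xx411c7mD",
        "stat": {"view": 1000, "danmaku": 50, "reply": 20, "share": 5,
                 "like": 80, "favorite": 30, "coin": 10},
    }
}

if __name__ == "__main__":
    print(extract_stats(SAMPLE_PAYLOAD))
```

Using `.get` throughout means a response with a missing counter yields None for that field rather than a KeyError.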

Some takeaways

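Writing code to scrape this kind of thing is extraordinarily dull. The only interesting part is finding the URL; everything else is just importing libraries, downloading, reading the result, post-processing, and saving.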

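Most websites expose these resources (danmaku, comments, and so on) at URLs that return XML or JSON. For crawlers as simple as the ones in this repository, the hard part is finding the URL; the rest depends on how solid your Python basics are.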

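I hope you can track down the different identifier strings yourself, such as BVid, aid, oid (there should be two of these), and cid, and work out which resource each one corresponds to.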

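For danmaku scraping, I hope you work out for yourself what each attribute actually means, guessing from how the danmaku look on screen. (That said, advanced danmaku did not exist a year ago, and I do not know whether that code handles them properly.)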
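To have something concrete to guess from, here is a small sketch that reads a danmaku XML document and splits each entry's attribute string into raw fields without naming them. It is not the repository's code. It assumes each danmaku is a `<d>` element whose `p` attribute is a comma-separated list, and the sample XML is invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def parse_danmaku(xml_text: str) -> List[Tuple[List[str], str]]:
    """Return (raw attribute fields, comment text) for every <d> element."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("d"):
        # The fields are deliberately left unnamed: working out what each
        # position means is the exercise.
        fields = node.get("p", "").split(",")
        items.append((fields, node.text or ""))
    return items


# Made-up sample in the assumed layout.
SAMPLE_XML = """<i>
  <d p="12.5,1,25,16777215,1600000000,0,abcdef12,1">first comment</d>
  <d p="30.0,5,25,255,1600000100,0,12345678,2">second comment</d>
</i>"""

if __name__ == "__main__":
    for fields, text in parse_danmaku(SAMPLE_XML):
        print(fields, text)
```

Printing the fields next to the text for a handful of real danmaku, then comparing against how each one appears in the player, is one way to work out the meaning of each position.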

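I hope you notice how redundant my code is and then wrap it up in a class. Coming back to this code a year later, I still think the same: there is too much duplication, and a class would fix it. Read BVid, aid, oid, and so on at initialization, then use whichever one you need.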
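One possible shape for such a class is sketched below. It is only an illustration of the idea, not a refactoring of the actual scripts: the class name `BiliVideo`, its fields, and the `require` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BiliVideo:
    """Holds the identifiers for one video so each scraper can pick what it needs."""

    bvid: str
    aid: Optional[int] = None
    oid: Optional[int] = None
    cid: Optional[int] = None

    def require(self, name: str) -> int:
        """Return the named identifier, or fail loudly if it was never filled in."""
        value = getattr(self, name)
        if value is None:
            raise ValueError(f"{name} has not been set for {self.bvid}")
        return value


if __name__ == "__main__":
    video = BiliVideo(bvid="BV1xx411c7mD", aid=123456)
    print(video.require("aid"))
```

A danmaku scraper would then call something like `video.require("cid")`, while a statistics scraper would only touch `bvid`, so the lookup logic lives in one place instead of being repeated in every script.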

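I am still a beginner at writing crawlers and can only find simple URLs. Going any deeper probably involves reverse engineering. I have been stuck for two or three years now without much progress, haha.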

bilibilitool's People

Contributors

drryanhuang
