Weibo_Spider

^_^ Crawl Weibo topics and users ^_^

About The Project

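This project uses Python to crawl Weibo topics and users, and uses FastAPI to expose the results through an API.

The configuration file is config.json.

  • headers: no changes needed
  • cookies: to obtain them, open the Weibo website (https://weibo.com/), log in, press F12 to open the developer tools, select the Network tab, press Ctrl+R to reload, then click the first weibo.com entry in the Name column; the cookies are listed there
  • page (int): the number of pages to crawl
  • user_id (int): the ID of the user to crawl
  • query: the topic to search for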
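
The fields above could be checked before a crawl starts with a small loader like the sketch below. This is a minimal illustration, not the project's own code: the helper names `validate_config` and `load_config` are hypothetical, and the assumed types for `headers` (object) and `cookies` (string) are guesses, since the README only states the types of `page` and `user_id`.

```python
import json

# Field names come from the list above. The types for headers, cookies
# and query are assumptions; only page and user_id are documented as int.
REQUIRED_FIELDS = {
    "headers": dict,
    "cookies": str,
    "page": int,
    "user_id": int,
    "query": str,
}


def validate_config(config: dict) -> dict:
    """Raise if a documented field is missing or has an unexpected type."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in config:
            raise KeyError(f"config.json is missing the field '{name}'")
        if not isinstance(config[name], expected_type):
            raise TypeError(
                f"'{name}' should be {expected_type.__name__}, "
                f"got {type(config[name]).__name__}"
            )
    if config["page"] < 1:
        raise ValueError("'page' must be at least 1")
    return config


def load_config(path: str = "config.json") -> dict:
    """Read config.json from disk and validate it."""
    with open(path, encoding="utf-8") as f:
        return validate_config(json.load(f))
```

Calling `load_config()` from the repository root would then fail early with a readable message if, for example, `page` was entered as a string.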

Getting Started

Prerequisites

First install the required libraries. For FastAPI, installing all optional dependencies and the features they enable is recommended.

  • BeautifulSoup
    pip install beautifulsoup4
  • FastAPI
    pip install "fastapi[all]"

Installation

Clone the repo

git clone https://github.com/CUTEPKQ/Web-Spider.git

Usage

  1. Set the parameters in the config.json file

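  2. Run

    • Run main.py to get the crawled data (comment content, comment time)
      • Crawled topic records have the form [topic content, time, user nickname, topic URL]
      • Crawled user records have the form [topic content, time]
  3. API service (user crawling is not supported yet; topics only)

    • Run api.py to start the API service (the default host is localhost and the port is 9394; make sure the port is not in use before starting)
    • Run api_test.py to verify the API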
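
Since the API service expects port 9394 to be free, a quick check before running api.py can save a confusing startup error. The sketch below is one way to do that with the standard library only; the function name `port_is_free` is hypothetical and not part of this repository.

```python
import socket


def port_is_free(host: str = "localhost", port: int = 9394) -> bool:
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError when another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

If `port_is_free()` returns `False`, stop the process that is using port 9394 before starting the service.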
