autohome_crawler's Introduction

### Project overview

This project demonstrates crawler development with requests and BeautifulSoup. Each collected item has the following format:

{
    "外观颜色": "晨露白,布里奇沃特青铜,马达加斯加橙,鲜绿,塞勒涅青铜,深蓝色,栗子黑", 
    "name": "Vanquish", 
    "url": "http://car.autohome.com.cn/price/brand-35.html", 
    "brand": "阿斯顿·马丁", 
    "车身结构": "硬顶跑车", 
    "变速箱": "自动", 
    "发动机": "6.0L", 
    "级别": "跑车", 
    "price": "526.88-628.00万"
}
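
Because the item mixes Chinese and English keys, it can help to see how such a record might be serialized. The following is a minimal sketch, not the project's actual output code: the helper name `dump_item` is hypothetical, and the item is an abbreviated copy of the sample above. It assumes the items are plain dictionaries and uses `ensure_ascii=False` so that the Chinese text stays readable instead of being escaped.

```python
import json

# Abbreviated copy of the sample item shown above.
item = {
    "name": "Vanquish",
    "url": "http://car.autohome.com.cn/price/brand-35.html",
    "brand": "阿斯顿·马丁",
    "车身结构": "硬顶跑车",
    "变速箱": "自动",
    "发动机": "6.0L",
    "级别": "跑车",
    "price": "526.88-628.00万",
}


def dump_item(item):
    """Serialize one item as JSON (hypothetical helper, not part of the project)."""
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes.
    return json.dumps(item, ensure_ascii=False, indent=4)


if __name__ == "__main__":
    print(dump_item(item))
```

When writing the result to a file, opening it with `encoding="utf-8"` avoids platform-dependent encoding errors.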

Usage

  1. Clone this project

    # git clone https://github.com/William-Sang/autohome_crawler.git

  2. Install the dependencies

    # cd autohome_crawler
    # pip install -r requirements
    
  3. Edit the configuration (if needed)

    # vim setting.py
    
  4. Run the crawl. By default, results are downloaded to the requests directory

    # python app.py
    

Planned improvements

  1. Download retries: http://www.coglib.com/~icordasc/blog/2014/12/retries-in-requests.html
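
One common way to add download retries with requests is to mount an `HTTPAdapter` configured with a urllib3 `Retry` policy. The sketch below is an assumption about how this could look, not code from the project: the function name `build_session` and the retry values (3 attempts, 0.5 backoff factor, the listed status codes) are illustrative choices.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total=3, backoff_factor=0.5):
    """Return a Session that retries failed downloads (hypothetical helper)."""
    retry = Retry(
        total=total,                    # maximum number of retries per request
        backoff_factor=backoff_factor,  # sleep grows between successive attempts
        status_forcelist=[500, 502, 503, 504],  # also retry on these responses
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Apply the retry policy to both plain and TLS connections.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


if __name__ == "__main__":
    session = build_session()
    response = session.get(
        "http://car.autohome.com.cn/price/brand-35.html", timeout=10
    )
    print(response.status_code)
```

Passing a `timeout` to each request matters here, since the retry policy only takes effect once an attempt actually fails.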

Known issues

  1. When scraping the details of a specific car model, the colors occasionally fail to be scraped.
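
Since the failure is intermittent, one possible mitigation is to check each item after scraping and queue incomplete ones for another attempt. The sketch below only illustrates that check: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the choice of required keys is an assumption based on the sample item format.

```python
# Keys assumed to be required, taken from the sample item format.
REQUIRED_FIELDS = ("name", "brand", "price", "外观颜色")


def missing_fields(item, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or empty in item."""
    return [key for key in required if not item.get(key)]


if __name__ == "__main__":
    # An item whose colors were not scraped.
    incomplete = {
        "name": "Vanquish",
        "brand": "阿斯顿·马丁",
        "price": "526.88-628.00万",
    }
    print(missing_fields(incomplete))
```

Items with a non-empty result could be logged together with their `url` and fetched again later.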

autohome_crawler's People

Contributors

william-sang
