flipkart-web-scrapper's Introduction

Web Scraping Flipkart

Scrapy is a Python library that helps you scrape website data with ease.

Installation

Use the package manager pip to install Scrapy.

pip install scrapy

What I Did

  • With the help of a webinar by Prateek Sir of CodingBlocks, I was able to scrape data from the e-commerce website "Flipkart".
  • I took a listing page and extracted product_price, product_name, and product_rating for every product, across up to 5 pages.
  • The scraped data is then stored in a .json, .csv, or .xml file and can be used for any purpose.
  • Scrapy does all the hard work for us, while we just type a few lines of code.
  • The code consists of the URL of the website we want to fetch data from and the CSS selectors that match the elements containing our data.
  • Then, using the Scrapy command-line tools, we can start scraping the website.
  • More details on how Scrapy works can be found here.

Applications

  • Get all your classmates' semester marks in one click (if your university uploads marks to a website portal).
  • Scrape Amazon for a specific category of books.
  • This method is used by popular search engines to find pages relevant to your search.

Thanks to-

flipkart-web-scrapper's People

Contributors

asishraju

flipkart-web-scrapper's Issues

Getting an error!

I am getting the following error:


ValueError Traceback (most recent call last)
in
1 # -- coding: utf-8 --
2 import scrapy
----> 3 from ..items import FlipkartItem
4
5 class flipkart_Spider(scrapy.Spider):

ValueError: attempted relative import beyond top-level package

Also, how do we get the description, price and the other attributes like processor, weights etc.?
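
One plausible reading of the error, assuming the spider file was executed from a location where its `spiders` folder is treated as the top-level package (for example from a notebook or from inside the project package directory): `..items` then has no parent package to climb into. The sketch below reproduces that situation with a throwaway package. The names `demo_project` and `try_import` are made up for the demonstration, and the exception type differs by Python version (`ValueError` on older releases, `ImportError` on newer ones).

```python
import importlib
import sys
import tempfile
from pathlib import Path

SPIDER_SOURCE = "from ..items import FlipkartItem\n"
ITEMS_SOURCE = "class FlipkartItem(dict):\n    pass\n"


def build_demo_project(root: Path) -> None:
    """Create demo_project/items.py and demo_project/spiders/flipkart_spider.py."""
    spiders = root / "demo_project" / "spiders"
    spiders.mkdir(parents=True)
    (root / "demo_project" / "__init__.py").write_text("")
    (root / "demo_project" / "items.py").write_text(ITEMS_SOURCE)
    (spiders / "__init__.py").write_text("")
    (spiders / "flipkart_spider.py").write_text(SPIDER_SOURCE)


def try_import(search_path: Path, module_name: str) -> str:
    """Import module_name with search_path first on sys.path; report the outcome."""
    sys.path.insert(0, str(search_path))
    try:
        importlib.import_module(module_name)
        return "ok"
    except (ImportError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"
    finally:
        sys.path.remove(str(search_path))
        # Forget the demo modules so the next attempt starts clean.
        for name in list(sys.modules):
            if name.split(".")[0] in {"demo_project", "spiders"}:
                del sys.modules[name]
        importlib.invalidate_caches()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_project(root)
    # Project root on the path: "spiders" sits inside the demo_project package.
    inside_package = try_import(root, "demo_project.spiders.flipkart_spider")
    # Package directory itself on the path: "spiders" becomes the top level,
    # so ".." has nowhere to go.
    as_top_level = try_import(root / "demo_project", "spiders.flipkart_spider")

print(inside_package)
print(as_top_level)
```

If this is indeed the cause, starting the crawl with `scrapy crawl` from the project root (the folder containing `scrapy.cfg`) lets the relative import resolve, because the spider is then loaded as part of the project package.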
