tducret / amazon-scraper-python
Non-official client to get some info about products sold on Amazon
License: MIT License
Hi @tducret, it would be nice to have the price as well.
Hi @tducret,
I downloaded all your files and ran python setup.py install. Next, I tried running amazon2csv.py, following the instructions at https://github.com/tducret/amazon-scraper-python, which say "You can also pass a search url (if you added complex filters for example), and save it to a file".
However, when I run the command amazon2csv.py --url="https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=python+scraping" > output.csv in a Windows command prompt session, output.csv is created but contains no data. Any advice on which step I missed that leaves output.csv empty (i.e. no data extracted from that URL)?
See the screenshot below for reference.
Please help me get customer reviews for Seba med body wash / soap.
product_list_tag
Here I am getting an empty list. Can anyone please help me resolve this?
It would be awesome if we could get the images of a product (links to the images) into the CSV file. If possible all of them (they could go in the same CSV field, separated by a delimiter), or at least the first (main) image link.
EDIT:
I took a closer look: you could collect all the thumbnail photos from the side of the page and then change each URL to get the full-size image. For example, for this item (https://www.amazon.com/gp/product/B0040EGNIU?pf_rd_p=1581d9f4-062f-453c-b69e-0f3e00ba2652&pf_rd_r=SB6K7AFYBC9WW0PTX71F), the first thumbnail has the link https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SS40_.jpg. If we change SS40 to SX522, we get a full-size image (https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SX522_.jpg).
Thumbnail links can be found like this: $("div#altImages ul li.item.imageThumbnail img").
Hope I helped a bit. Thanks!
EDIT 2:
I just found out that we can change the size token in the link to SL1500, which gives an even better picture.
Maybe there could even be a way to choose the resolution we want, although we can also do this by changing the links ourselves afterwards.
Thanks again!
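The link rewriting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not something the package provides: the function name `resize_image_url` and the regular expression are assumptions, and the pattern only covers size tokens shaped like `_SS40_` or `_SX522_` as seen in the example link.

```python
import re

# Matches the size token that sits between two dots before the extension,
# e.g. "._SS40_." in ".../41%2B1lH%2BGP0L._SS40_.jpg".
# Assumption: tokens are two capital letters followed by digits.
_SIZE_TOKEN = re.compile(r"\._[A-Z]{2}\d+_\.")


def resize_image_url(thumbnail_url, size="SL1500"):
    """Return thumbnail_url with its size token replaced by `size`.

    `resize_image_url` is a hypothetical helper, not part of amazonscraper.
    URLs without a recognizable size token are returned unchanged.
    """
    return _SIZE_TOKEN.sub("._{}_.".format(size), thumbnail_url, count=1)


thumb = "https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SS40_.jpg"
print(resize_image_url(thumb, "SX522"))  # medium-size variant
print(resize_image_url(thumb))           # defaults to SL1500
```

Passing the size as a parameter would also cover the "choose which resolution we want" suggestion.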
I'm running this code in PyCharm:
`import amazonscraper
results = amazonscraper.search("coffee", max_product_nb=2)
for result in results:
    print("{}".format(result.title))
    print(" - ASIN : {}".format(result.asin))
    print(" - {} out of 5 stars, {} customer reviews".format(result.rating, result.review_nb))
    print(" - {}".format(result.url))
    print(" - Image : {}".format(result.img))
    print()
print("Number of results : %d" % (len(results)))`
and the output looks like this:
`C:\Users\bordi\PycharmProjects\amazon\venv\Scripts\python.exe C:/Users/bordi/PycharmProjects/amazon/Scrapper.py
Number of results : 0
Process finished with exit code 0`
Please help me with this problem.
Does this work with product variants? Please show an example if so.
Can I use this script with Lambda?
When I used the latest version of pip to install the package, I encountered the following error. pip changed its internal implementation recently, which is probably why this broke.
pip3 --version
pip 21.2.4 from /.../lib/python3.9/site-packages/pip (python 3.9)
pip3 install -U amazonscraper
Collecting amazonscraper
Downloading amazonscraper-0.1.2.tar.gz (8.6 kB)
ERROR: Command errored out with exit status 1:
command: /Users/liuxiaolu/Work/amazon/analyzer/env/bin/python3.9 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py'"'"'; __file__='"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-pip-egg-info-t8errgvi
cwd: /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/
Complete output (7 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py", line 22, in <module>
requirements = [str(ir.req) for ir in install_reqs]
File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py", line 22, in <listcomp>
requirements = [str(ir.req) for ir in install_reqs]
AttributeError: 'ParsedRequirement' object has no attribute 'req'
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/c0/15/bab4563fe795fadce45ac42eea0ed0988d7f478dc9c4ba8d845f0c2d2d4a/amazonscraper-0.1.2.tar.gz#sha256=b683d98fabe0f0548a28707bf399a1e32840fdd4f6117fba8152c6bbd4dc6bc5 (from https://pypi.org/simple/amazonscraper/) (requires-python:>=3). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Downloading amazonscraper-0.1.1.tar.gz (8.3 kB)
ERROR: Command errored out with exit status 1:
command: /Users/liuxiaolu/Work/amazon/analyzer/env/bin/python3.9 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py'"'"'; __file__='"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-pip-egg-info-kj603xnr
cwd: /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/
Complete output (7 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py", line 22, in <module>
requirements = [str(ir.req) for ir in install_reqs]
File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py", line 22, in <listcomp>
requirements = [str(ir.req) for ir in install_reqs]
AttributeError: 'ParsedRequirement' object has no attribute 'req'
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/71/5f/16139dbe286630c2aeda864c3974da01cf016731f50d3f4108dda5102831/amazonscraper-0.1.1.tar.gz#sha256=1254df358f7329d3d8e6c5d66a44b362d7cd3c699e1e5dd2848a76b4f05c4d39 (from https://pypi.org/simple/amazonscraper/) (requires-python:>=3). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
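The traceback shows setup.py reading the `.req` attribute of objects that newer pip versions return as `ParsedRequirement`, which has no such attribute. One way to avoid depending on pip internals at all is to parse the requirements file directly. The following is a minimal sketch under assumptions: the file name `requirements.txt`, the helper names, and the sample package names are illustrative and not taken from the project.

```python
from pathlib import Path


def parse_requirements_text(text):
    """Return the requirement specifiers found in requirements-file text.

    Blank lines, comment lines and pip options (lines starting with "-",
    such as "-r other.txt") are skipped.
    """
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.split(" #", 1)[0].strip()  # drop inline comments
        if line and not line.startswith(("#", "-")):
            requirements.append(line)
    return requirements


def read_requirements(path="requirements.txt"):
    """Read and parse the requirements file at `path` (assumed file name)."""
    return parse_requirements_text(Path(path).read_text(encoding="utf-8"))


# In setup.py this would stand in for the failing list comprehension:
#     setup(..., install_requires=read_requirements())
# The sample below uses made-up contents purely to demonstrate the parsing.
sample = "requests>=2.0\n# parsing\nbeautifulsoup4  # HTML parser\n\n-r extra.txt\nclick\n"
print(parse_requirements_text(sample))
```

Until the package itself is changed, installing from a local clone with a patched setup.py is one possible workaround.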
When crawling, you sometimes need to select a region before the content is displayed. Can you tell me how to deal with this?
Hi - very cool project.
I'm just throwing this out there - it would be very useful if the product's comments/reviews could be scraped. Then, from the CSV file, I could run some text analysis or a word cloud, to see what the customers are saying about the product.
I'm not a skilled enough programmer to attempt to add this myself but would be excited to see it added.
Thanks -Ian
I am a C# developer and I am looking for a way to use this project. Last month I worked with the FFmpeg binary: I called the exe with parameters, it did the work, and I read the result from the console output.
I would like to use this project in a similar way: call an exe with parameters and get the output. @tducret, please let me know if there is a better way to use it.
I have used Python, but I would rather not translate the code to C#. Is there any other way to use it from C#?
Thanks
Sorry, but I'm new to the Python world.
I'm using Jupyter Notebook and I can't get the review text for a product. I don't even know if it is possible with this package.
Can you help me, please?
Thank you very much!
I pip installed 'click' and it appears in my pip list as version 7.1.2, but I keep getting the following error:
Traceback (most recent call last):
File "/VSCode/amazon2csv/amazon2csv.py", line 3, in <module>
import click
ModuleNotFoundError: No module named 'click'
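A common cause of this symptom is that `pip` installed the package for a different Python interpreter than the one running the script (for example, the editor uses another environment). That is an assumption about this report, not a confirmed diagnosis. A small check like the one below, run with the same interpreter that runs amazon2csv.py, shows which interpreter is active and whether it can see `click`; the helper name `module_is_importable` is made up for this sketch.

```python
import importlib.util
import sys


def module_is_importable(name):
    """Return True if the running interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# The interpreter actually executing this script.
print("Interpreter:", sys.executable)

if module_is_importable("click"):
    print("click is visible to this interpreter")
else:
    # Installing through this exact interpreter avoids a pip/python mismatch.
    print('click is missing; try: "{}" -m pip install click'.format(sys.executable))
```

If the printed interpreter path differs from the one `pip --version` reports, installing with `python -m pip` through the printed interpreter should line the two up.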
Hi there, I am using Crawlera proxy rotation and I need to edit settings.py. However, I don't see any option in this scraper to plug in proxy credentials. Can you please help me with this?
Many Thanks
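Another report on this page shows the scraper raising `requests.exceptions.TooManyRedirects`, so it presumably makes its HTTP calls with `requests`. By default, `requests` picks up proxies from the `HTTP_PROXY` / `HTTPS_PROXY` environment variables, so one possible approach that needs no change to the package is to set those before searching. This is a sketch under that assumption: the helper names are hypothetical, the host, port and credentials are placeholders rather than real Crawlera values, and whether the package's session honors the environment has not been verified.

```python
import os
from urllib.parse import quote
from urllib.request import getproxies


def build_proxy_url(host, port, username="", password=""):
    """Build an http:// proxy URL, percent-encoding any credentials."""
    credentials = ""
    if username:
        credentials = quote(username, safe="") + ":" + quote(password, safe="") + "@"
    return "http://{}{}:{}".format(credentials, host, port)


def export_proxy(url):
    """Expose `url` through the conventional proxy environment variables."""
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = url


# Placeholder values only; substitute the details from your proxy provider.
proxy = build_proxy_url("proxy.example.com", 8010, "my-user", "p@ss")
export_proxy(proxy)

# The standard library reads the same variables, which makes a quick check easy.
print(getproxies()["https"])
```

Call `export_proxy` before `amazonscraper.search(...)` so that any session created afterwards can see the variables.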
Hi,
I have been searching for Amazon scrapers, and this project is the most effective and efficient one I have found, especially in its ability to work with a keyword search. Great work!
I forked it and tried to make some changes so that it works with Amazon in my region (Amazon.com.au and Amazon.co.jp), but unfortunately it turned out that I do not have the skills to do so.
It would be very good to have a way to change parameters like this.
I also shared the same thought in the other issue thread regarding "Price", and I was thinking of getting the "Seller" and "Stock level" data as well, for a more thorough analysis.
Thanks so much again for this fascinating project.
Kenneth
Can custom information be added, such as whether a product is a Sponsored Ad?
I am only able to get 15 products when I perform a search, even though I have set max_product_nb to a value greater than 15. Performing a manual search on the amazon.com website with the same keyword returns over 100 products. I'd appreciate any comments. Thank you.
Thank you very much, this app is very easy to use and powerful.
I have a small question: why does the information I grabbed from the German Amazon site contain a lot of garbled "?" characters?
How can I solve this? @tducret
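Stray "?" characters usually mean that text containing characters such as "ü" or "ß" was written with an encoding that cannot represent them. That is only a likely cause here, not a confirmed one. The sketch below first reproduces the effect and then shows writing CSV rows with an explicit UTF-8 encoding; the function names and the sample title are made up for the illustration.

```python
import csv


def to_ascii_lossy(text):
    """Force text into ASCII, replacing every unsupported character with "?"."""
    return text.encode("ascii", errors="replace").decode("ascii")


def write_csv_utf8(path, rows):
    """Write rows to `path` as UTF-8 with a BOM, which Excel tends to detect."""
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        csv.writer(handle).writerows(rows)


# Each umlaut is lost when the target encoding has no code for it.
print(to_ascii_lossy("Kaffeemühle für Espresso"))  # Kaffeem?hle f?r Espresso
```

When the output is redirected with `>` on Windows, setting the environment variable `PYTHONIOENCODING=utf-8` before running the script is another option worth trying, since it makes Python write UTF-8 to the redirected stream.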
I don't have time to investigate at the moment, but when using a search_url like https://smile.amazon.com/s/field-keywords=,%207.5%20Fluid%20Ounces,%206%20Pack
it raises requests.exceptions.TooManyRedirects: Exceeded 30 redirects.