
amazon-scraper-python's Introduction

amazon-scraper-python


Description

This package allows you to search for products on Amazon and extract useful information about them (rating, number of customer reviews).

I wrote a French blog post about it here

Requirements

  • Python 3
  • pip3

Installation

pip3 install -U amazonscraper

Command line tool amazon2csv.py

After the package installation, you can use the amazon2csv.py command in the terminal.

Pass it a search request (and an optional maximum number of products) and it returns the results as CSV:

amazon2csv.py --keywords="Python programming" --maxproductnb=2
Product title,Rating,Number of customer reviews,Product URL,Image URL,ASIN
"Python Crash Course: A Hands-On, Project-Based Introduction to Programming",4.5,370,https://www.amazon.com/Python-Crash-Course-Hands-Project-Based/dp/1593276036,https://images-na.ssl-images-amazon.com/images/I/51F48HFHq6L.jpg,1593276036
"A Smarter Way to Learn Python: Learn it faster. Remember it longer.",4.7,384,https://www.amazon.com/Smarter-Way-Learn-Python-Remember-ebook/dp/B077Z55G3B,https://images-na.ssl-images-amazon.com/images/I/51fNZfTUPXL.jpg,B077Z55G3

You can also pass a search URL (for example if you added complex filters) and save the output to a file:

amazon2csv.py --url="https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=python+scraping" > output.csv

You can then open it with your favorite spreadsheet editor (and play with the filters):

[Screenshot: amazon2csv output opened in a spreadsheet]

More information about the command is available in the help:

amazon2csv.py --help

Using the amazonscraper Python package

# -*- coding: utf-8 -*-
import amazonscraper

results = amazonscraper.search("Python programming", max_product_nb=2)

for result in results:
    print("{}".format(result.title))
    print("  - ASIN : {}".format(result.asin))
    print("  - {} out of 5 stars, {} customer reviews".format(result.rating, result.review_nb))
    print("  - {}".format(result.url))
    print("  - Image : {}".format(result.img))
    print()

print("Number of results : %d" % (len(results)))

Which will output:

Python Crash Course: A Hands-On, Project-Based Introduction to Programming
  - ASIN : 1593276036
  - 4.5 out of 5 stars, 370 customer reviews
  - https://www.amazon.com/Python-Crash-Course-Hands-Project-Based/dp/1593276036
  - Image : https://images-na.ssl-images-amazon.com/images/I/51F48HFHq6L.jpg

A Smarter Way to Learn Python: Learn it faster. Remember it longer.
  - ASIN : B077Z55G3B
  - 4.7 out of 5 stars, 384 customer reviews
  - https://www.amazon.com/Smarter-Way-Learn-Python-Remember-ebook/dp/B077Z55G3B
  - Image : https://images-na.ssl-images-amazon.com/images/I/51fNZfTUPXL.jpg

Number of results : 2

Attributes of the Product object

  • title: Product title
  • rating: Rating of the product (a number between 0 and 5, False if missing)
  • review_nb: Number of customer reviews (False if missing)
  • url: Product URL
  • img: Image URL
  • asin: Product ASIN (Amazon Standard Identification Number)
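
Since rating and review_nb fall back to False when the information is missing, it is worth guarding before using them; a minimal sketch (the search keywords are just an example):

# -*- coding: utf-8 -*-
import amazonscraper

results = amazonscraper.search("Python programming", max_product_nb=5)

for product in results:
    # rating and review_nb are False when Amazon does not expose them
    rating = product.rating if product.rating else "n/a"
    reviews = product.review_nb if product.review_nb else "n/a"
    print("{} | rating: {} | reviews: {}".format(product.title, rating, reviews))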

Docker

You can use the amazon2csv tool with the Docker image.

You may execute:

docker run -it --rm thibdct/amazon2csv --keywords="Python programming" --maxproductnb=2

🤘 The easy way 🤘

I also built a bash wrapper to execute the Docker container easily.

Install it with:

curl -s https://raw.githubusercontent.com/tducret/amazon-scraper-python/master/amazon2csv \
> /usr/local/bin/amazon2csv && chmod +x /usr/local/bin/amazon2csv

You may replace /usr/local/bin with another folder that is in your $PATH.

Check that it works:

On the first execution, the script will download the Docker image, so please be patient.

amazon2csv --help
amazon2csv --keywords="Python programming" --maxproductnb=2

You can upgrade the app with:

amazon2csv --upgrade

and even uninstall it with:

amazon2csv --uninstall

TODO

  • If no product is found with the CSS selectors, the page may use a new Amazon layout: change the user agent, fetch the page again, and loop over all known user agents while rechecking every CSS selector (see the sketch after this list)
  • Find a way to extract the products without relying on CSS selectors
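
A minimal sketch of the first idea; the user agents and CSS selectors below are illustrative placeholders, not the values the package actually uses:

import requests
from bs4 import BeautifulSoup

# Illustrative placeholders; the real package maintains its own lists
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]
PRODUCT_SELECTORS = ["div.s-result-item", "li.s-result-item"]

def fetch_products(search_url):
    # Try every user agent; for each page returned, try every known selector
    for user_agent in USER_AGENTS:
        html = requests.get(search_url, headers={"User-Agent": user_agent}).text
        soup = BeautifulSoup(html, "html.parser")
        for selector in PRODUCT_SELECTORS:
            products = soup.select(selector)
            if products:
                return products  # this page style matches one of the known selectors
    return []  # no user agent / selector combination worked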

amazon-scraper-python's People

Contributors

andreabisello, bitofbreeze, jpeacock29, kevinl95, tducret, tducretcnes


amazon-scraper-python's Issues

No module named 'click'

I pip-installed 'click' and it is in my pip list (version 7.1.2), but I keep getting the following error:

Traceback (most recent call last):
File "/VSCode/amazon2csv/amazon2csv.py", line 3, in
import click
ModuleNotFoundError: No module named 'click'
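
A frequent cause of this error is that click was installed for a different interpreter than the one running amazon2csv.py; a purely illustrative check is to print the interpreter path and install click for that exact interpreter:

import sys

# Shows which Python actually runs the script; then install click for it:
#     <printed path> -m pip install click
print(sys.executable)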

How to obtain all the text reviews of a product

Sorry, but I'm new to the Python world.

I'm using Jupyter Notebook and I can't obtain the text reviews for a product; I don't even know if it is possible with this package.

Can you help me, please?

Thank you very much!

amazon2csv.py unable to generate output.csv with data extracted from Amazon

Hi author/ tducret,

I downloaded all your files and ran python setup.py install. Next, I tried running amazon2csv.py following the instructions at https://github.com/tducret/amazon-scraper-python, which say "You can also pass a search url (if you added complex filters for example), and save it to a file".

However, when I run the command " amazon2csv.py --url="https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=python+scraping" > output.csv " in a Windows command prompt session, output.csv is generated, but there is no data inside it. Any advice on which step I missed that resulted in the empty output.csv (i.e. no data extracted from https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=python+scraping)?

See the attached screenshot as proof / reference.

Is there a way to capture this info?

Hi,

I have been searching for Amazon scrapers, and this project is the most effective and efficient, especially its ability to work with keyword search. Great work!

I forked this and was trying to make some changes so that it works with Amazon in my region (Amazon.com/au & Amazon.co.jp), but unfortunately it turned out that I do not have the skill to do so.
It would be very good to have a way to alter parameters like this.

I also shared the same thought in the other issue thread regarding "Price", and I was also thinking of getting "Seller" and "Stock level" data as well for a more thorough analysis.

Thanks so much again for this fascinating project.

Kenneth

pip3 install broken

When I used the latest version of pip to install, I encountered the following error. Pip changed its internal implementation recently, which is probably why this broke:

pip3 --version

pip 21.2.4 from /.../lib/python3.9/site-packages/pip (python 3.9)

pip3 install -U amazonscraper

Collecting amazonscraper
  Downloading amazonscraper-0.1.2.tar.gz (8.6 kB)
    ERROR: Command errored out with exit status 1:
     command: /Users/liuxiaolu/Work/amazon/analyzer/env/bin/python3.9 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py'"'"'; __file__='"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-pip-egg-info-t8errgvi
         cwd: /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/
    Complete output (7 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py", line 22, in <module>
        requirements = [str(ir.req) for ir in install_reqs]
      File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_f415d4529b4844dcb21f497a53af1347/setup.py", line 22, in <listcomp>
        requirements = [str(ir.req) for ir in install_reqs]
    AttributeError: 'ParsedRequirement' object has no attribute 'req'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/c0/15/bab4563fe795fadce45ac42eea0ed0988d7f478dc9c4ba8d845f0c2d2d4a/amazonscraper-0.1.2.tar.gz#sha256=b683d98fabe0f0548a28707bf399a1e32840fdd4f6117fba8152c6bbd4dc6bc5 (from https://pypi.org/simple/amazonscraper/) (requires-python:>=3). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Downloading amazonscraper-0.1.1.tar.gz (8.3 kB)
    ERROR: Command errored out with exit status 1:
     command: /Users/liuxiaolu/Work/amazon/analyzer/env/bin/python3.9 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py'"'"'; __file__='"'"'/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-pip-egg-info-kj603xnr
         cwd: /private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/
    Complete output (7 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py", line 22, in <module>
        requirements = [str(ir.req) for ir in install_reqs]
      File "/private/var/folders/_1/9b6tfd017bx3878sxkcsyhk40000gn/T/pip-install-8a51jct1/amazonscraper_2dd4acea14d8428e8993c5ab3b4911e3/setup.py", line 22, in <listcomp>
        requirements = [str(ir.req) for ir in install_reqs]
    AttributeError: 'ParsedRequirement' object has no attribute 'req'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/71/5f/16139dbe286630c2aeda864c3974da01cf016731f50d3f4108dda5102831/amazonscraper-0.1.1.tar.gz#sha256=1254df358f7329d3d8e6c5d66a44b362d7cd3c699e1e5dd2848a76b4f05c4d39 (from https://pypi.org/simple/amazonscraper/) (requires-python:>=3). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Japanese Amazon Review Extraction

The review parser for Amazon JP parses the rating values incorrectly, because the number comes after the phrase "5つ星のうち". As a result, every parsed rating is returned as "5つ星のうち".

[Screenshot: parsed rating values all equal to "5つ星のうち"]
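
A minimal sketch of pulling the numeric value out of that label with a regular expression (the sample string is illustrative, not taken from the package):

import re

label = "5つ星のうち4.2"  # illustrative sample of the Amazon JP rating text
match = re.search(r"5つ星のうち\s*([0-9.]+)", label)
rating = float(match.group(1)) if match else None
print(rating)  # 4.2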

German Amazon character issue

Thank you very much, this app is very easy to use and powerful.

I have a small question: why does the German Amazon information I grabbed contain a lot of garbled "?" characters?

How can I solve this? @tducret

Carlos
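
The "?" characters usually mean the response was decoded with the wrong character set; a minimal sketch of one possible workaround with requests (the URL is illustrative, and this is an assumption about the cause, not a confirmed fix for the package):

import requests

response = requests.get("https://www.amazon.de/s?k=kaffeemaschine")  # illustrative URL
# Use the encoding requests detects from the page body instead of the HTTP header
response.encoding = response.apparent_encoding
html = response.text
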
del

variant, Lambda

Does this work with product variants? Please show an example if so.

Also, can I use this script with Lambda?

Getting no products when searched

I'm running this code in PyCharm:

import amazonscraper

results = amazonscraper.search("coffee", max_product_nb=2)

for result in results:
    print("{}".format(result.title))
    print("  - ASIN : {}".format(result.asin))
    print("  - {} out of 5 stars, {} customer reviews".format(result.rating, result.review_nb))
    print("  - {}".format(result.url))
    print("  - Image : {}".format(result.img))
    print()

print("Number of results : %d" % (len(results)))

and the output looks like this:

C:\Users\bordi\PycharmProjects\amazon\venv\Scripts\python.exe C:/Users/bordi/PycharmProjects/amazon/Scrapper.py
Number of results : 0

Process finished with exit code 0

Please help me with this problem.

Images

It would be awesome if we could get a product's images (links) into the CSV file. If possible all of them (they could share a single CSV field, separated by a delimiter), or at least the first (main) image link.

EDIT:
I looked at it a bit more: you could get (all) the thumbnail photos from the side and just change the URL afterwards to get the full-size image. For example, for this item (https://www.amazon.com/gp/product/B0040EGNIU?pf_rd_p=1581d9f4-062f-453c-b69e-0f3e00ba2652&pf_rd_r=SB6K7AFYBC9WW0PTX71F), the first thumbnail has the link https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SS40_.jpg; if we change the SS40 to SX522, we get a full-sized image: https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SX522_.jpg

Thumbnail links can be found like this: $("div#altImages ul li.item.imageThumbnail img").
Hope I helped a bit. Thanks!

EDIT 2:
I just found out that we can change the link to SL1500, which gives us an even better picture.
Maybe even implement a way to choose which resolution we want, although we can also do this by changing the links ourselves later.
Thanks again!
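
A minimal sketch of the URL rewrite described above (the helper name is illustrative; the size tokens come from the comments in this thread):

import re

def resize_amazon_image(url, size="SL1500"):
    # Replace the size token (e.g. _SS40_) in an Amazon image URL with the requested one
    return re.sub(r"\._[A-Z]{2}\d+_\.", "._{}_.".format(size), url)

thumb = "https://images-na.ssl-images-amazon.com/images/I/41%2B1lH%2BGP0L._SS40_.jpg"
print(resize_amazon_image(thumb))  # ...41%2B1lH%2BGP0L._SL1500_.jpg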

Doesn't work for smile.amazon.com

Don't have time to investigate at the moment, but when using a search_url like https://smile.amazon.com/s/field-keywords=,%207.5%20Fluid%20Ounces,%206%20Pack it gives requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

IP change

Hi there, I am using Crawlera proxy rotation and need to edit settings.py. However, I don't see any option in this scraper to plug in proxy credentials. Can you please help me with this?
Many thanks

Is it possible to scrape the Product Comments?

Hi - very cool project.

I'm just throwing this out there - it would be very useful if the product's comments/reviews could be scraped. Then, from the CSV file, I could run some text analysis or a word cloud, to see what the customers are saying about the product.

I'm not a skilled enough programmer to attempt to add this myself but would be excited to see it added.

Thanks -Ian

Is it possible to obtain results for more than 15 products?

I am only able to get results for 15 products when I perform a search, even though I have set max_product_nb to a value greater than 15. A manual search with the same keyword returns over 100 products on the amazon.com website. I'd appreciate any comments. Thank you.

Is there a way I can compile this project into a .exe or a .dll and call it from C# code?

I am a C# developer and I am looking for a way to use this. Last month I worked with the FFmpeg binary: call the exe with parameters and it does the work and returns the result (reading the console output).

I am looking to use this project in a similar way: call the exe with parameters and get the output. @tducret, please check if there is a better way to use it.

I have used Python but don't want to translate the code to C#. Is there any other way to use it from C#?

Thanks
