
simple_image_download's People

Contributors

koubae, riddlerq

simple_image_download's Issues

Progress Bar

I made a progress bar that shows progress across the keywords, because I was downloading about a thousand pictures. I think it would be a great addition to your program.
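The progress bar from the report is not shown, so here is only a minimal sketch of what a per-keyword text bar could look like. The function name `render_progress` and the keyword list are hypothetical and are not part of simple_image_download.

```python
def render_progress(done: int, total: int, width: int = 20) -> str:
    """Return a text bar such as '[##------] 1/4' (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))  # clamp so the bar never overflows
    filled = width * done // total
    return "[" + "#" * filled + "-" * (width - filled) + f"] {done}/{total}"


# Illustrative keywords only; a real script would download images
# for each keyword before printing the updated bar.
keywords = ["cars", "green apples", "eastern cottontail"]
for index, keyword in enumerate(keywords, start=1):
    print(render_progress(index, len(keywords)), keyword)
```

Reporting progress per keyword rather than per image keeps the sketch independent of how the library downloads each file.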

problem with exception handling, keywords input and headers dict

I've been testing the library and found three important problems.
1- Unhandled exceptions stop the program. This could be fixed by handling the possible exceptions raised by the requests and urllib libraries.
As a temporary solution, in simple_image_download.py, under def check_webpage(url):, below
    try:
        request = requests.get(url, allow_redirects=True, timeout=10)
        if 'html' not in str(request.content):
            checked_url = request
add this handler:
    except requests.exceptions.RequestException as e:
        print("requests exception:", url)
        pass
This lets the code continue its execution even if an error occurs while fetching pages or downloading files.

2- Passing keywords that contain spaces to the download function makes the program use only the first word before the space as the keyword and drop the rest of the string.
As a temporary solution, in simple_image_download.py, under def generate_search_url(keywords):, change the line keywords_to_search = [str(item).strip() for item in keywords.split(',')][0].split() to keywords_to_search = keywords.split(',')

3- Google does not respond to some requests. After some testing, I found that it helps to change the contents of the HEADERS = { dictionary in simple_image_download.py to 'User-Agent': "Mozilla/5.0 (Windows; U; Windows NT 6.1; WOW64) AppleWebKit/602.42 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36", "Accept-Encoding": "*", "Connection": "keep-alive"
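For point 2, a minimal sketch of comma-only splitting that keeps multi-word phrases intact. The helper name `split_keywords` is hypothetical. Unlike the one-line workaround above, it also strips surrounding whitespace and drops empty entries, which is an assumption about the desired behavior.

```python
from typing import List


def split_keywords(keywords: str) -> List[str]:
    """Split a comma-separated string into phrases, keeping inner spaces."""
    phrases = (part.strip() for part in keywords.split(","))
    # Skip empty entries produced by stray commas such as "a,,b".
    return [phrase for phrase in phrases if phrase]


print(split_keywords("green apples, eastern cottontail ,cars"))
```

Splitting on commas alone means "green apples" stays one search phrase instead of being cut at the first space.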

Goes into infinite loop if the google image search "did not match any image results"

If the search query does not return any results on Google Images, Python goes into an infinite loop at line 37.

` 35 try:
36 new_line = raw_html.find('"https://', end_object + 1)
---> 37 end_object = raw_html.find('"', new_line + 1)
38
39 buffor = raw_html.find('\\', new_line + 1, end_object)

KeyboardInterrupt: `

Maybe search the HTML for the string "did not match any image results" first and raise an error?
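One possible shape for that check, as a sketch only: `str.find` returns -1 when nothing matches, so a scanning loop has to stop on -1 instead of searching again from the start. The function name `extract_quoted_urls` and the marker constant are hypothetical, and this is not the library's actual parsing code.

```python
from typing import List

# Phrase quoted in the report; the wording Google actually serves may differ.
NO_RESULTS_MARKER = "did not match any image results"


def extract_quoted_urls(raw_html: str, limit: int) -> List[str]:
    """Collect up to `limit` double-quoted https URLs, always terminating."""
    if NO_RESULTS_MARKER in raw_html:
        raise ValueError("search returned no image results")
    urls: List[str] = []
    position = 0
    while len(urls) < limit:
        start = raw_html.find('"https://', position)
        if start == -1:  # no further candidates: stop instead of looping
            break
        end = raw_html.find('"', start + 1)
        if end == -1:  # unterminated quote: stop as well
            break
        urls.append(raw_html[start + 1:end])
        position = end + 1
    return urls
```

Both exit conditions matter: without them, a page with fewer links than `limit` keeps the loop running forever.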

cant download.

HTTPSConnectionPool(host='upload.wikimedia.org', port=443): Max retries exceeded with url: /wikipedia/commons/thumb/7/71/2010-kodiak-bear-1.jpg/1200px-2010-kodiak-bear-1.jpg (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001A335181C88>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

Kindly help me out with this issue.
Thank you.

No support for gif

When I run the code below, it only downloads files like this one:

cars_1

But sometimes when I download a lot of images without typing '.gif', some gifs are downloaded with no problem.

from simple_image_download import simple_image_download as simp
response = simp.simple_image_download
response().download('cars', 10, '.gif')

Will repeatedly download the same images

This code seems to download the same n images over and over for any search term. That is, x_{i} = x_{i+nk} for any natural number k. For example, when I search for "eastern cottontail", I only get 84 unique images. This is a showstopper for me. Ideally, the library would detect duplicate images and skip them.
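A minimal sketch of one way to skip duplicates, assuming the downloaded images are available as bytes: hash each payload and keep only the first occurrence. The helper name `unique_by_content` is hypothetical. It only catches byte-identical files, not resized or re-encoded copies of the same picture.

```python
import hashlib
from typing import Iterable, List


def unique_by_content(blobs: Iterable[bytes]) -> List[bytes]:
    """Keep the first occurrence of each byte-identical payload, in order."""
    seen = set()
    unique: List[bytes] = []
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        if digest in seen:  # same bytes already kept: skip the duplicate
            continue
        seen.add(digest)
        unique.append(blob)
    return unique
```

Deduplicating on content rather than on URL also covers the case where the same image is served from several addresses.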

AttributeError: 'simple_image_download' object has no attribute 'search_urls'

Hi, thanks for this wonderful library. I tried downloading images and retrieving their URLs, but I got the error below:
`---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in
1 from simple_image_download import simple_image_download as simp
2 response = simp.simple_image_download()
----> 3 response.search_urls('Circular economy', limit=5)
4 for url in response.cache:
5 print(url)

AttributeError: 'simple_image_download' object has no attribute 'search_urls'`

I tried on Google Colab and on Windows, and both give the same error. Can you please help me fix this? Thanks in advance.

Extensions cannot accept more than one entry

There is a bug in the extensions parameter. I cannot pass more than one extension, because the code does set([value]); if I pass a list of extensions as value, an error is thrown.
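For context, `set([value])` fails with a list because lists are unhashable and cannot be set members. A minimal sketch of a normalizer that accepts either a single string or an iterable of strings follows. The name `normalize_extensions` is hypothetical, and lowercasing and adding a leading dot are assumptions about the desired behavior.

```python
from typing import Iterable, Set, Union


def normalize_extensions(value: Union[str, Iterable[str]]) -> Set[str]:
    """Return a set of lowercase, dot-prefixed extensions."""
    if isinstance(value, str):
        value = [value]  # a bare string is treated as one extension
    result: Set[str] = set()
    for item in value:
        ext = item.strip().lower()
        if not ext:
            continue  # ignore empty entries
        result.add(ext if ext.startswith(".") else "." + ext)
    return result


print(sorted(normalize_extensions([".jpg", "PNG"])))
```

Checking for `str` first matters because a string is itself iterable and would otherwise be split into single characters.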

Only 33/100 queries returned

`from simple_image_download import simple_image_download as simp

response = simp.simple_image_download
queries = candidates['scientificName'].tolist()
for query in queries:
    response().download(query, 1)
    print(response().urls(query, 1))`

Does Google block an IP after so many calls? I have a for loop that attempts to get pictures from 100 queries, but only 33 are returned. Is there something I am missing?

Download stuck at 0%

The code worked fine under macOS 10 with Python 3.8, but under Windows 10 with the same Python version the download always gets stuck at 0%.

File size

Could you please add an option to set the size of images to be downloaded?

Unable to use "extensions" argument on linux.

I get the error TypeError: download() got an unexpected keyword argument 'extensions' when running the script on Linux (RPi OS). The error does not occur when running the same code on Windows 11.

cannot search for "green apples"

The code splits the search string into words and creates a separate URL and search for every word. Searching for "green apples" gives a folder with images of "green" and a separate folder of (red) "apples".

You would think it could be solved by quoting, "'green apples'", but that causes the package to create a URL to search for every character in that phrase: ', g, r, e, ...

This needs to be fixed so that the image search accepts anything that can be searched on images.google. E.g., " +'green apples' clipart ".
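A minimal sketch of encoding the whole phrase as one query, using `urllib.parse.quote_plus` from the standard library. The helper name `build_search_url` is hypothetical, and the URL template is an assumption about the search address, not taken from the library's source.

```python
from urllib.parse import quote_plus

# Assumed URL shape for an image search; adjust to whatever the library uses.
SEARCH_URL_TEMPLATE = "https://www.google.com/search?q={query}&tbm=isch"


def build_search_url(phrase: str) -> str:
    """Encode the entire phrase as a single query parameter."""
    return SEARCH_URL_TEMPLATE.format(query=quote_plus(phrase.strip()))


print(build_search_url("green apples"))
print(build_search_url(" +'green apples' clipart "))
```

Because the phrase is never split, spaces, quotes, and operators such as `+` reach the search engine as one query instead of one search per word or character.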

Downloads corrupted and sometimes wrong images

Half of the images I try to download end up unreadable, and many of them are not even in the search results. For example, I tried downloading on a day when Google had a special banner, and it kept downloading the banner instead of the image. Is there any fix for this? Maybe updating the code?
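A possible partial mitigation for the unreadable files, sketched under the assumption that the downloaded bytes can be inspected before saving: check the leading bytes against the well-known JPEG, PNG, and GIF signatures and skip anything else (for example an HTML error page). The helper name `sniff_image_type` is hypothetical. This does not address wrong-but-valid images such as the banner.

```python
from typing import Optional

# Leading "magic" bytes of common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return the detected format name, or None if no signature matches."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


print(sniff_image_type(b"<html><body>not an image</body></html>"))
```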
