gurugaurav / bing_image_downloader

Python library to download images in bulk from Bing.com

Home Page: https://pypi.org/project/bing-image-downloader/

License: MIT License

Python 100.00%

bing-image-downloader image-downloader image-downloader-python image-scraper image-scrapping bing-image-scrapping python-image-webcrawler python-imagesearch python-image-download python-image-downloader

bing_image_downloader's Issues

Not working anymore

Hello,
I have been using the Python package for months without problems, but for the past few weeks the downloaded images have had no relation to the search term, and the downloads repeat cyclically.

For example, if I search for "box" and try to download 50 images, many of the images are unrelated to boxes, and they repeat every 8 images.

During handling of the above exception, another exception occurred:

Do you know how to fix this error?

I was hoping to download 50k images but only 7000 images got downloaded.

[!!]Indexing page: 56

Traceback (most recent call last):
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1319, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1414, in connect
    super().connect()
  File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/home/mona/anaconda3/lib/python3.7/socket.py", line 707, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/home/mona/anaconda3/lib/python3.7/socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "download_images_from_bing.py", line 2, in <module>
    downloader.download('gun', limit=50000, adult_filter_off=True, force_replace=False)
  File "/home/mona/anaconda3/lib/python3.7/site-packages/bing_image_downloader/downloader.py", line 34, in download
    Bing().bing(query, limit, adult)
  File "/home/mona/anaconda3/lib/python3.7/site-packages/bing_image_downloader/bing.py", line 63, in bing
    response = urllib.request.urlopen(request)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1362, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1321, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>

from bing_image_downloader import downloader
downloader.download('cat', limit=50000, adult_filter_off=True, force_replace=False)
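
The traceback above ends in a DNS failure (`Temporary failure in name resolution`) raised from `urllib.request.urlopen`, which is usually a transient network problem rather than a bug in the query. A minimal sketch of retrying with exponential backoff follows. The helper name `urlopen_with_retry` and its parameters are hypothetical and not part of the package; the `opener` and `sleep` arguments exist only so the behaviour can be checked without a network.

```python
import time
import urllib.error
import urllib.request


def urlopen_with_retry(request, retries=5, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Call opener(request), retrying on URLError with exponential backoff.

    Hypothetical helper: not part of bing_image_downloader.
    """
    for attempt in range(retries):
        try:
            return opener(request)
        except urllib.error.URLError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Whether this would be enough to reach 50k images is unknown; it only keeps a single failed lookup from aborting the whole run.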

Os is not used here

bing_image_downloader/bing_image_downloader/downloader.py

Line 1 in 62e39b2

import os, sys

It seems that the os module is not used in the code. It would be better to remove the import.

Search food, but download many menu images

I tried to download pizza images with this code:
downloader.download("pizza", limit=100, output_dir="photos", adult_filter_off=True, force_replace=False, timeout=5)

However, the downloader gave me many menu images from zmenu.com.

[`Formatting`]: Format the code using Black

Hi There,

It seems that the code is not formatted with Black.

Projects like Django and Flask are formatted with Black.

It would be best if we follow a common coding pattern :)

Image options

Could you please add the option to choose image resolution and format?

Hi, if I use bing.py and downloader.py outside the package, they work perfectly. But if I import the package as described in the instructions, it raises ImportError: cannot import name downloader from partially initialized module bing_image_downloader.

Error:: URL can't contain control characters. (found at least ' ')

When using the download() function, I came across multiple control-character errors. They are caused by URLs that contain a space.

I temporarily solved it by changing these two lines in bing.py to replace " " with "%20":
self.seen.add(link.replace(" ", "%20"))
self.download_image(link.replace(" ", "%20"))
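
A slightly more general variant of this workaround, as a sketch only: `urllib.parse.quote` percent-encodes spaces along with any other characters that are not allowed in a URL. The helper name `escape_url` is hypothetical, and the `safe` set is an assumption chosen to leave URL delimiters and existing `%XX` escapes untouched.

```python
from urllib.parse import quote


def escape_url(link):
    """Percent-encode unsafe characters (such as spaces) in a URL string.

    Hypothetical helper: delimiters and '%' are listed as safe so that
    the URL structure and already-escaped sequences are left as they are.
    """
    return quote(link, safe=":/?#[]@!$&'()*+,;=%")
```

With this, the two edited lines would call `escape_url(link)` instead of `link.replace(" ", "%20")`.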

Photo search with txt list

Congratulations on your script! It works very well!
For a project, I need to find 50 photos for each entry in a long list of plants and flowers. How can I search for photos using a txt list or similar?
Thanks and Merry Christmas!
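
One way to do this without changing the package is to loop over the lines of the text file and call the downloader once per line. The sketch below assumes one query per line; `read_queries` and `download_all` are hypothetical names, and the `download` callable is passed in so that `downloader.download` (with the `limit` and `output_dir` keywords seen in other issues here) can be supplied by the caller.

```python
from pathlib import Path


def read_queries(list_path):
    """Return the non-empty, stripped lines of a text file (one query per line)."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def download_all(list_path, download, limit=50, output_dir="dataset"):
    """Call download(query, limit=..., output_dir=...) for every query in the file."""
    for query in read_queries(list_path):
        download(query, limit=limit, output_dir=output_dir)
```

Usage would look like `download_all("plants.txt", downloader.download)` after `from bing_image_downloader import downloader`, where `plants.txt` is a placeholder file name.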

Getting original URLs

Is it possible to save the URLs along with the images?
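
As an illustration of what such a feature might produce, here is a sketch that writes a CSV manifest mapping each saved file name to its source URL. The function name `write_manifest` and the column names are assumptions; the package itself is not known to expose the URLs this way.

```python
import csv


def write_manifest(manifest_path, records):
    """Write (file_name, source_url) pairs to a CSV file with a header row.

    Hypothetical helper: 'records' is any iterable of 2-tuples collected
    by the caller while images are being saved.
    """
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file_name", "source_url"])
        writer.writerows(records)
```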

Force Replace option not functioning due to typo.

On line 24 in downloader.py there is a missing '_'.

if Path.isdir(image_dir): should be if Path.is_dir(image_dir):
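
For context, `pathlib.Path` has no `isdir` attribute (that spelling belongs to `os.path.isdir`), so the original line fails before any replacement can happen. A minimal sketch of the intended behaviour, assuming `force_replace` is meant to delete an existing query folder before downloading; `prepare_dir` is a hypothetical name and not the package's own function.

```python
import shutil
from pathlib import Path


def prepare_dir(image_dir, force_replace=False):
    """Create image_dir, first deleting it when force_replace is set.

    Hypothetical helper illustrating the corrected is_dir() check.
    """
    image_dir = Path(image_dir)
    if force_replace and image_dir.is_dir():
        shutil.rmtree(image_dir)  # drop previously downloaded files
    image_dir.mkdir(parents=True, exist_ok=True)
    return image_dir
```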

Error:: 'ascii' codec can't encode character '\xf1' in position 46: ordinal not in range(128)

I tried this yesterday with a couple of queries and it worked fine. But today it is showing an error:

[%] Downloading Image #4 from http://www.zacatecasalminuto.com/wp-content/uploads/2020/05/cierra_mina_peñoles_morelos_zac.png
[!] Issue getting: http://www.zacatecasalminuto.com/wp-content/uploads/2020/05/cierra_mina_peñoles_morelos_zac.png
[!] Error:: 'ascii' codec can't encode character '\xf1' in position 46: ordinal not in range(128)
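
The failing URL has a non-ASCII character (`ñ`) in its path, and the HTTP request line must be ASCII. A possible workaround, sketched under the assumption that only the path and query need fixing, is to percent-encode those components as UTF-8 before the request is made. `percent_encode_path` is a hypothetical helper, not part of the package, and it leaves the host name untouched.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_path(url):
    """Percent-encode non-ASCII and unsafe characters in the path and query.

    Hypothetical helper: '%' is kept safe so existing escapes survive,
    and the scheme, host, and fragment are passed through unchanged.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))
```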

Path problems

Hello! When I set a destination folder, the script tries to create a folder named after the path I gave. For example, if I pass "C:" as the path, it does not save the images there but tries to create a folder called C:\, which generates an error. Is there a way to solve this?

Also, is there a way to rename the downloaded files, and to keep them all in one folder instead of creating a folder named after each search?

Thank you very much!

Circular import

I can't run the code because of an AttributeError.

Duplicate Images

I am trying to create a food dataset. However, when I try to scrape from Bing using this library, I am getting a lot of duplicate images. Please assist.

Thank you
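
Until the repeated results are fixed at the source, duplicates can be removed after the download. The sketch below hashes every file in a folder and deletes files whose bytes match an earlier one. It only catches byte-identical copies (not resized or re-encoded versions), and `remove_duplicates` is a hypothetical helper, not part of the package.

```python
import hashlib
from pathlib import Path


def remove_duplicates(folder):
    """Delete byte-identical files in folder, keeping the first by name.

    Returns the names of the files that were removed.
    """
    seen = set()
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            path.unlink()  # same content as an earlier file
            removed.append(path.name)
        else:
            seen.add(digest)
    return removed
```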

option to change image names to query_string?

like, dog_01.png not Image_01.png

There should be an option to use an absolute path as out_dir

It would be really useful if out_dir wasn't always appended to os.getcwd().
I think there should be a simple boolean absolute_path for people who want to use their own absolute path.
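
A separate boolean may not be necessary: when `pathlib` joins a base path with an absolute path, the base is discarded, so one expression can cover both cases. A sketch under that assumption, with `resolve_output_dir` as a hypothetical name:

```python
from pathlib import Path


def resolve_output_dir(output_dir, base=None):
    """Resolve output_dir against base (default: current working directory).

    If output_dir is absolute, pathlib drops the base and returns it as is.
    """
    base = Path(base) if base is not None else Path.cwd()
    return base / output_dir
```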

Broken, results do not resemble same query via web

E.g.: people wearing masks in public -> lots of Donald Trump images
Results as expected when used on the website image search.

Any query seems to result in repeating images too.
This lib is currently broken and not fit for purpose.

Allow to specify output folder for downloading images

Currently the script downloads the images to a dataset folder in the repo.
This is not always desirable.

Modify the signature of downloader.download to specify the output_dir for image downloads.

Why do quotation marks not work?

Hi
This is my code

from bing_image_downloader import downloader

query_string = 'langeek dictionary definition "Go"'
downloader.download(query_string, limit=3,  output_dir='dataset', adult_filter_off=True, 
                    force_replace=False, timeout=60, verbose=True)

But the quotation marks have no effect, and the results are not the same as on Bing.

option to keep original image names

Is it possible to download images and keep their original names, rather than indexed names such as 'index_1.jpg', 'index_2.jpg', etc.?

Will be stuck forever if unable to download an image

This program is great. However, if I am unable to access an image, I want to skip it and move on to the next one.
This program will try endlessly if it fails to download a file.
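
The behaviour requested here can be sketched as a loop that counts failures and moves on instead of retrying the same link. Everything below is illustrative: `download_with_skips` and `max_failures` are hypothetical names, and `fetch` stands in for whatever function actually downloads one image (for example, a wrapper around `urllib.request.urlopen(link, timeout=5)`).

```python
def download_with_skips(links, fetch, limit, max_failures=20):
    """Fetch links in order, skipping any that raise, until limit is reached.

    Stops early once max_failures links have failed, so the loop
    cannot run forever on a list of unreachable images.
    """
    saved = []
    failures = 0
    for link in links:
        if len(saved) >= limit or failures >= max_failures:
            break
        try:
            saved.append(fetch(link))
        except Exception as exc:  # any download error: log it and move on
            failures += 1
            print(f"[!] Skipping {link}: {exc}")
    return saved
```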

It simply hangs

Hello,

I just keep getting the messages:

[!!]Indexing page: 320

[%] Indexed 10 Images on Page 320.

Any solution?

Why is the search query joined to the download path? This causes issues when I try to open the images.

bing_image_downloader/bing_image_downloader/downloader.py

Line 21 in 62e39b2

image_dir = Path(output_dir).joinpath(query).absolute()

Can we pass an image license type filter, since Bing search offers that option?

Typo error in the documentation page on pypi.org

There is a typo in the description of the adult_filter_off parameter on pypi.org: it reads (Enable for disable adult filteration) instead of (Enable or disable adult filteration).

add filters support

Hi,
Thanks for the nice work.
I suggest adding filters as an optional input on this line:

bing_image_downloader/build/lib/bing_image_downloader/downloader.py

Line 34 in 0cad2bd

bing = Bing(query, limit, output_dir, adult, timeout)

Worked fine yesterday, but showing error for me today.

I've tried this yesterday with a couple of queries and it worked completely fine. But today it is showing some error -

Traceback (most recent call last):
  File "E:\Pyhon\bing.py", line 1, in <module>
    from bing_image_downloader import downloader
  File "C:\Users\Admin\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\bing_image_downloader\downloader.py", line 6, in <module>
    from bing import Bing
  File "E:\Pyhon\bing.py", line 2, in <module>
    downloader.download('apricot', limit=100,  output_dir='dataset', adult_filter_off=True, force_replace=False, timeout=60, verbose=True)
AttributeError: partially initialized module 'bing_image_downloader.downloader' has no attribute 'download' (most likely due to a circular import)

Process finished with exit code 1
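
The traceback shows the cause: the user's own script is named `bing.py`, and `downloader.py` runs `from bing import Bing`. Because the script's directory comes first on the import path, that import picks up the user's script instead of the package's module, which produces the circular import. Renaming the script should resolve it. The sketch below, with the hypothetical helper `find_shadowing_scripts`, checks a directory for such a clash; the list of names is taken only from what the traceback shows.

```python
from pathlib import Path

# Module names the package imports by bare name, per the traceback above.
SHADOWED_NAMES = ("bing",)


def find_shadowing_scripts(directory, names=SHADOWED_NAMES):
    """Return paths of .py files in directory that would shadow those imports."""
    directory = Path(directory)
    return [directory / f"{name}.py"
            for name in names
            if (directory / f"{name}.py").is_file()]
```

Running `find_shadowing_scripts(".")` from the script's folder returns a non-empty list when a rename is needed.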

File names manipulation

This is a useful package, but imagine using several such packages to download images of cats from multiple sources such as Bing, Google, and others. It would be nice to have an option to prefix file names, like this:

downloader.download(
    label=search_str,
    output_dir=output_dir,
    photo="photo", #choose from [line, photo, clipart, gif, transparent]
    limit=5,
    name_prefix='bing',
    main_name='cat',
    name_sep='_'
)

so that all cat images can be saved in one folder, like this:
bing_cat_001.jpg
google_cat_001.jpg
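
The naming scheme requested above can be sketched as a small formatting function. The arguments mirror the proposed `name_prefix`, `main_name`, and `name_sep` options, which are the requester's suggestion and not existing parameters of `downloader.download`; `build_image_name` is a hypothetical name.

```python
def build_image_name(prefix, main_name, index, ext, sep="_", width=3):
    """Build a file name such as 'bing_cat_001.jpg'.

    index is zero-padded to 'width' digits; ext may be given with or
    without a leading dot.
    """
    return f"{prefix}{sep}{main_name}{sep}{index:0{width}d}.{ext.lstrip('.')}"
```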
