gurugaurav / bing_image_downloader
Python library to download bulk of images from Bing.com
Home Page: https://pypi.org/project/bing-image-downloader/
License: MIT License
Hello,
I have been using the Python package for months without problems, but for the past few weeks the downloaded images have had no relation to the search term, and the downloads repeat cyclically.
For example, if I search for "box" and try to download 50 images, many of the images are unrelated to boxes, and they repeat every 8 images.
Do you know how to fix this error?
I was hoping to download 50k images but only 7000 images got downloaded.
[!!]Indexing page: 56
Traceback (most recent call last):
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1319, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1252, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1298, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1247, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1026, in _send_output
self.send(msg)
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 966, in send
self.connect()
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 1414, in connect
super().connect()
File "/home/mona/anaconda3/lib/python3.7/http/client.py", line 938, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/home/mona/anaconda3/lib/python3.7/socket.py", line 707, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/home/mona/anaconda3/lib/python3.7/socket.py", line 752, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "download_images_from_bing.py", line 2, in <module>
downloader.download('gun', limit=50000, adult_filter_off=True, force_replace=False)
File "/home/mona/anaconda3/lib/python3.7/site-packages/bing_image_downloader/downloader.py", line 34, in download
Bing().bing(query, limit, adult)
File "/home/mona/anaconda3/lib/python3.7/site-packages/bing_image_downloader/bing.py", line 63, in bing
response = urllib.request.urlopen(request)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 543, in _open
'_open', req)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1362, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/home/mona/anaconda3/lib/python3.7/urllib/request.py", line 1321, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
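A long run like this can be interrupted by a single temporary network failure. One possible workaround, sketched here under assumptions, is to retry the failing call a bounded number of times. The helper name `call_with_retries` and its parameters are hypothetical and are not part of bing_image_downloader.

```python
import time
import urllib.error


def call_with_retries(func, attempts=3, delay=1.0, sleep=time.sleep):
    """Call func(); on URLError wait and retry, up to `attempts` tries in total."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except urllib.error.URLError:
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            sleep(delay * attempt)  # wait a little longer after each failure
```

It could wrap a request such as `call_with_retries(lambda: urllib.request.urlopen(request))`. Because the number of attempts is bounded, a URL that never responds is eventually given up on instead of being retried forever.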
from bing_image_downloader import downloader
downloader.download('cat', limit=50000, adult_filter_off=True, force_replace=False)
It seems that the os module is not used in the code. It would be better to remove it.
I attempted to download pizza images using this code:
downloader.download("pizza", limit=100, output_dir="photos", adult_filter_off=True, force_replace=False, timeout=5)
However, the downloader gave me many menu images from zmenu.com.
Could you please add the option to choose image resolution and format?
Hi, if I use bing.py and downloader.py outside the package, they work perfectly. But if I import the package as described in the instructions, it raises ImportError: cannot import name downloader from partially initialized module bing_image_downloader.
When using the download() function, I came across multiple control character errors. These are caused by a URL containing a blank space.
I temporarily solved it by changing these two lines to replace " " with "%20" inside bing.py:
self.seen.add(link.replace(" ", "%20"))
self.download_image(link.replace(" ", "%20"))
Congratulations on your script! It works very well!
For a project, I need to search for 50 photos for each item in a long list of plants and flowers. How can I search for photos using a txt list or similar?
Thanks and Merry Christmas!
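One way to do this is to read the queries from a text file, one per line, and call the downloader once per query. The sketch below is only an illustration: `read_queries` and `download_all` are hypothetical helper names, and the download function is passed in as an argument so that `downloader.download` can be supplied by the caller.

```python
from pathlib import Path


def read_queries(list_path):
    """Return the non-empty, stripped lines of a text file, one query per line."""
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def download_all(queries, download, limit=50, output_dir="dataset"):
    """Call download(query, ...) once per query and return the queries processed."""
    done = []
    for query in queries:
        download(query, limit=limit, output_dir=output_dir)
        done.append(query)
    return done
```

With the package installed, this could be used as `download_all(read_queries("plants.txt"), downloader.download)`, where `plants.txt` is an assumed file name.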
Is it possible to save the URLs along with the images?
On line 24 in downloader.py there is a missing '_'.
if Path.isdir(image_dir):
should be if Path.is_dir(image_dir):
I tried this yesterday with a couple of queries and it worked fine. But today it is showing an error:
[%] Downloading Image #4 from http://www.zacatecasalminuto.com/wp-content/uploads/2020/05/cierra_mina_peñoles_morelos_zac.png
[!] Issue getting: http://www.zacatecasalminuto.com/wp-content/uploads/2020/05/cierra_mina_peñoles_morelos_zac.png
[!] Error:: 'ascii' codec can't encode character '\xf1' in position 46: ordinal not in range(128)
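The error above comes from a non-ASCII character ('ñ') in the URL path. A possible workaround, sketched here with the standard library only, is to percent-encode the path and query before requesting the URL; the same encoding turns a blank space into "%20". The helper name `encode_url` is hypothetical, and the sketch does not handle non-ASCII host names.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched: URL delimiters plus "%" so that
# existing percent-escapes are not encoded a second time.
_SAFE = "/%:@!$&'()*+,;=?"


def encode_url(url):
    """Percent-encode spaces and non-ASCII characters in the path and query."""
    parts = urlsplit(url)
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))
```

The encoded string could then be passed to the request in place of the raw link.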
Hello! When I specify a destination folder, the script tries to create a folder named after the path I gave. For example, if I set the path to "C:", it does not save the images there but tries to create a folder called C:\, which raises an error. Is there a way to solve this?
Is there also a way to rename the downloaded files, and to keep them all in one folder instead of creating a folder named after each search?
Thank you very much!
I am trying to create a food dataset. However, when I try to scrape from Bing using this library, I am getting a lot of duplicate images. Please assist.
Thank you
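Until the duplicate results are fixed in the library, one workaround is to remove duplicates after downloading. The sketch below finds files whose bytes are identical by comparing SHA-256 digests. The function name `find_duplicates` is hypothetical, and the approach only catches exact copies, not resized or re-encoded versions of the same picture.

```python
import hashlib
from pathlib import Path


def find_duplicates(image_dir):
    """Return paths whose bytes are identical to an earlier file in image_dir."""
    seen = {}  # SHA-256 digest -> first path seen with that content
    duplicates = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates
```

The returned paths can be reviewed and then deleted, for example with `path.unlink()`.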
For example, dog_01.png rather than Image_01.png.
It would be really useful if out_dir wasn't always appended to os.getcwd().
I think there should be a simple boolean absolute_path for people who want to use their own absolute path.
E.g.: people wearing masks in public -> lots of Donald Trump images
Results as expected when used on the website image search.
Any query seems to result in repeating images too.
This lib is currently broken and not fit for purpose.
Currently the script downloads the images to a dataset folder in the repo.
This is not always desirable.
Modify the signature of downloader.download to specify the output_dir for image downloads.
Hi
This is my code
from bing_image_downloader import downloader
query_string = 'langeek dictionary definition "Go"'
downloader.download(query_string, limit=3, output_dir='dataset', adult_filter_off=True,
force_replace=False, timeout=60, verbose=True)
But the quotation marks do not work, and the results are not the same as on Bing.
Is it possible to download images and keep their original names, rather than indexed names such as 'index_1.jpg', 'index_2.jpg', etc.?
This program is great. However, if I am unable to access an image, I want to skip it and move on to the next one.
This program will try endlessly if it fails to download a file.
Hello,
I just keep getting the messages:
[!!]Indexing page: 320
[%] Indexed 10 Images on Page 320.
Any solution?
There is a typo in the description of the adult_filter_off parameter on the pypi.org page: it says (Enable for disable adult filteration) instead of (Enable or disable adult filteration).
Hi,
Thanks for the nice work.
I suggest adding filters as an optional input in this line:
I've tried this yesterday with a couple of queries and it worked completely fine. But today it is showing some error -
Traceback (most recent call last):
File "E:\Pyhon\bing.py", line 1, in <module>
from bing_image_downloader import downloader
File "C:\Users\Admin\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\bing_image_downloader\downloader.py", line 6, in <module>
from bing import Bing
File "E:\Pyhon\bing.py", line 2, in <module>
downloader.download('apricot', limit=100, output_dir='dataset', adult_filter_off=True, force_replace=False, timeout=60, verbose=True)
AttributeError: partially initialized module 'bing_image_downloader.downloader' has no attribute 'download' (most likely due to a circular import)
Process finished with exit code 1
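The traceback suggests that the user's own script is named bing.py, so the package's `from bing import Bing` appears to pick up that script instead of the package's module, which would explain the circular import. A minimal sketch for checking a script folder for such name clashes follows; the helper name `shadowing_files` and the default list of module names are assumptions made for illustration.

```python
from pathlib import Path


def shadowing_files(script_dir, module_names=("bing", "downloader", "bing_image_downloader")):
    """List .py files in script_dir whose names match the given module names."""
    directory = Path(script_dir)
    candidates = [directory / f"{name}.py" for name in module_names]
    return [path for path in candidates if path.is_file()]
```

If the function returns any paths, renaming those files (for example bing.py to download_images.py) is a likely fix.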
Useful package, but imagine you use several such packages to download images of cats from several sources such as Bing, Google, and others. It would be nice to have an option to prefix the names, like this:
downloader.download(
label=search_str,
output_dir=output_dir,
photo="photo", #choose from [line, photo, clipart, gif, transparent]
limit=5,
name_prefix='bing',
main_name='cat',
name_sep='_'
)
so you can save all cat images in one folder, like this:
bing_cat_001.jpg
google_cat_001.jpg
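Until such an option exists, a similar result can be had by renaming the files after the download finishes. The sketch below is an assumption-laden illustration: `rename_with_prefix` is a hypothetical helper, its parameter names merely mirror the ones proposed above, and it assumes no file in the folder already uses one of the target names.

```python
from pathlib import Path


def rename_with_prefix(image_dir, name_prefix, main_name, name_sep="_"):
    """Rename files to <prefix><sep><main><sep><NNN><ext>; return the new names."""
    new_names = []
    # Build the full list first so renaming does not disturb the iteration.
    files = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    for index, path in enumerate(files, start=1):
        new_name = f"{name_prefix}{name_sep}{main_name}{name_sep}{index:03d}{path.suffix}"
        path.rename(path.with_name(new_name))
        new_names.append(new_name)
    return new_names
```

Files renamed this way can then be moved into one shared folder alongside images from other sources.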