Comments (3)
A temporary poor man's solution could be to identify the offset i of the last successfully downloaded sample (in my case 196) and manually rerun the script starting at offset i+2 (in my case 198), thus skipping the problematic sample.
This would be fine, but it does not work: filenames do not seem to correspond exactly to the requested offset and number of samples (I'm not exactly sure what's going on), so some samples are saved under the same filename as previously downloaded samples and overwrite them.
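One way to sidestep the overwriting, sketched below, would be to name each file after the sample's own ID rather than its position in the requested batch. This assumes every sample carries a unique ID, which the script may or may not expose. The helpers `target_path` and `plan_downloads` and the example IDs are hypothetical and are not part of the downloader.

```python
from pathlib import Path


def target_path(out_dir, image_id, ext=".jpg"):
    """Build a filename from the sample's own ID (hypothetical helper).

    Because the name does not depend on the offset or batch size,
    rerunning from a different offset cannot reuse an existing name
    for a different sample.
    """
    return Path(out_dir) / f"{image_id}{ext}"


def plan_downloads(image_ids, out_dir, skip=()):
    """Return (image_id, path) pairs, leaving out any IDs listed in `skip`."""
    skipped = set(skip)
    return [
        (image_id, target_path(out_dir, image_id))
        for image_id in image_ids
        if image_id not in skipped
    ]


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    ids = ["sample_196", "sample_197", "sample_198"]
    for image_id, path in plan_downloads(ids, "images", skip=["sample_197"]):
        print(image_id, "->", path)
```

With this scheme the problematic sample is skipped by ID, so the rerun does not need to guess an offset.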
from isic-archive-downloader.
For what it's worth, I've run the script from various locations through my VPN (i.e. from different IP addresses) and even asked a friend to run it for me. I have also tried running with --p 1, thinking that maybe all those processes were spamming the API too much. But the results are always the same: it hangs at #160 for a while and then at #197 for much longer. So I don't think the API is rate limiting us or anything like that.
You should be able to reproduce it if you run this right now:
python download_archive.py --num-images 250 --filter malignant
Do you understand why this is happening?
Hey! Thank you for this great idea! :)
And regarding the issues that you mention - I think some images are not downloadable for some reason.
You can even try to download them using the link in your browser, and that will fail too.
So I guess image #197 is one of these images. On the other hand I can't think of a reason yet for the hanging on image #160.
I will try to reproduce it myself soon, when I have the time.
Btw, to skip the problematic samples in a more elegant way, we could use the max_tries parameter, which is already present in some of the download functions in "download_single_item.py", and add it to the functions that don't have it.
I put that parameter in some of the download functions with a default value of infinite tries, but never gave the user a way to specify another value for problematic images such as the ones you described.
So I guess that, when I have more time, that will be the way to implement this request.
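Roughly, the idea could look like the sketch below. It is not the actual code in download_single_item.py: `download_with_retries`, `fetch`, and `delay` are hypothetical names, and only `max_tries` comes from the existing functions. It also assumes a failed download raises an exception; a download that hangs without raising would additionally need a timeout on the underlying request, which is not shown here.

```python
import time


def download_with_retries(fetch, max_tries=3, delay=0.0):
    """Call fetch() until it succeeds or max_tries attempts have failed.

    `fetch` is any zero-argument callable that performs one download
    attempt and raises an exception on failure (an assumption of this
    sketch). Returns fetch()'s result, or None once all attempts have
    failed, so the caller can skip the sample and move on.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception as exc:
            print(f"attempt {attempt}/{max_tries} failed: {exc}")
            if attempt < max_tries:
                time.sleep(delay)
    return None


if __name__ == "__main__":
    def always_fails():
        raise IOError("image not downloadable")

    # After three failed attempts the sample is skipped instead of retried forever.
    if download_with_retries(always_fails, max_tries=3) is None:
        print("skipping problematic sample")
```

Exposing `max_tries` as a command-line option would then let the user choose a finite value when a sample such as #197 never downloads.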