imagenet-datasets-downloader's Introduction

ImageNet Downloader

This is an ImageNet dataset downloader. You can create new datasets from subsets of ImageNet by specifying how many classes you need and how many images you need per class. This is done using the image URLs provided by the ImageNet API.

I describe how and why I wrote the tool in more detail in this blog post, which also includes a short analysis of the current state of the ImageNet URLs.

This software is written in Python 3.

Usage

The following command randomly selects 100 ImageNet classes that each have at least 200 images and starts downloading:

python ./downloader.py \
    -data_root /data_root_folder/imagenet \
    -number_of_classes 100 \
    -images_per_class 200
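
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the code in downloader.py: the function name `pick_classes` and the `url_counts` mapping (class ID to number of available image URLs) are assumptions made for the example, and the counts are made up.

```python
import random


def pick_classes(url_counts, number_of_classes, images_per_class, seed=None):
    """Randomly pick classes that have at least `images_per_class` URLs.

    `url_counts` is a hypothetical dict mapping a class ID to the number
    of image URLs known for that class.
    """
    # Keep only classes that can supply the requested number of images.
    eligible = sorted(c for c, n in url_counts.items() if n >= images_per_class)
    if len(eligible) < number_of_classes:
        raise ValueError(
            f"only {len(eligible)} classes have >= {images_per_class} URLs"
        )
    # A seeded generator makes the random choice reproducible when needed.
    rng = random.Random(seed)
    return rng.sample(eligible, number_of_classes)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    counts = {"n09858165": 800, "n01539573": 150, "n03405111": 1100}
    print(pick_classes(counts, number_of_classes=2, images_per_class=200, seed=0))
```

Filtering before sampling means a class with too few URLs is never selected, so the request fails early instead of producing an undersized class.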

The following command downloads 500 images from each of the selected classes:

python ./downloader.py \
    -data_root /data_root_folder/imagenet \
    -use_class_list True \
    -class_list n09858165 n01539573 n03405111 \
    -images_per_class 500

You can find the class list in this CSV, where I list every class that appears in ImageNet along with its total number of URLs and its total number of Flickr URLs.
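
One way to turn such a CSV into a `-class_list` argument is sketched below. The column names (`synid`, `total_urls`, `flickr_urls`) and the counts in `SAMPLE_CSV` are assumptions made up for this example, and the real file may be laid out differently. Only the three class IDs come from the command above.

```python
import csv
import io

# Hypothetical layout and made-up counts; check the real CSV header first.
SAMPLE_CSV = """synid,total_urls,flickr_urls
n09858165,1200,800
n01539573,900,450
n03405111,1500,1100
"""


def classes_with_enough_urls(csv_text, images_per_class, column="flickr_urls"):
    """Return the class IDs whose `column` count is at least `images_per_class`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["synid"] for row in reader if int(row[column]) >= images_per_class]


if __name__ == "__main__":
    ids = classes_with_enough_urls(SAMPLE_CSV, images_per_class=500)
    # Space-separated, matching the form passed to -class_list above.
    print(" ".join(ids))
```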

Multiprocessing workers

I've implemented parallel request processing and added a multiprocessing_workers parameter, which defaults to 8. You can raise it, but I haven't yet tested the limits of Flickr's allowed bandwidth, so use it with care; you will have to find the limits yourself if you want maximum speed.

You can do something like this:

python ./downloader.py \
    -data_root /data_root_folder/imagenet \
    -number_of_classes 1000 \
    -images_per_class 500 \
    -multiprocessing_workers 24
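
The idea behind a worker-count parameter can be sketched as below. This is not the tool's implementation: it uses the thread-backed `multiprocessing.pool.ThreadPool` (same `map` API as a process pool) to keep the example self-contained, and `download_all` and `fake_fetch` are hypothetical names. `fake_fetch` stands in for a real HTTP request so the sketch runs offline.

```python
from multiprocessing.pool import ThreadPool


def download_all(urls, fetch, workers=8):
    """Apply `fetch` to every URL using `workers` parallel workers.

    Returns (url, result) pairs in input order. A failed download is
    recorded as None instead of aborting the whole batch.
    """
    def safe_fetch(url):
        try:
            return url, fetch(url)
        except Exception:
            # Dead or blocked URLs are expected; skip them and carry on.
            return url, None

    with ThreadPool(processes=workers) as pool:
        return pool.map(safe_fetch, urls)


def fake_fetch(url):
    """Stand-in for a real request: 'downloads' the URL's length."""
    if url.endswith("missing.jpg"):
        raise IOError("not found")
    return len(url)


if __name__ == "__main__":
    urls = ["http://example.com/a.jpg", "http://example.com/missing.jpg"]
    for url, result in download_all(urls, fake_fetch, workers=4):
        print(url, "->", result)
```

Raising `workers` increases the number of requests in flight at once, which is why a higher setting puts more load on the image host.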
