OID-ImageClassification

A collection of scripts to download data and to train and evaluate an image classifier on Open Images using TensorFlow.

Features

  • Create a list of all classes by image count
  • Download images for custom lists of classes (using parallelization)
  • Delete corrupt images
  • Train a model of choice on the downloaded image dataset
  • Evaluate the performance of the model (includes per-class accuracies)
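
To make the "per-class accuracies" feature concrete, here is a minimal sketch of how such a metric can be computed: for each class, the fraction of its samples that were predicted correctly. The function name `per_class_accuracy` and the example labels are hypothetical and are not taken from the evaluation script in this repository.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: fraction of that class's samples predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correct predictions per true class
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Hypothetical labels, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]

accuracies = per_class_accuracy(y_true, y_pred)
for name, accuracy in sorted(accuracies.items()):
    print(f"{name}: {accuracy:.2f}")
```

Per-class numbers are useful on a noisy dataset because a good overall accuracy can hide a class that is almost always misclassified.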

Dependencies

Python 3.6 or higher

Package          Version
Pillow           7.0.0
numpy            1.18.5
requests         2.22.0
tensorflow       2.3.1
tensorflow-hub   0.9.0
sklearn          0.23.2

Other package versions may work too.
All packages can be installed from requirements.txt.
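
Since other versions may work too, it can help to see how an environment differs from the pinned versions. The following is a minimal sketch, not part of the repository. It assumes Python 3.8 or higher (for `importlib.metadata`) and assumes that sklearn is installed under its PyPI distribution name `scikit-learn`. The names `PINNED` and `compare_versions` are hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8 onwards

# Versions copied from the table above.
PINNED = {
    "Pillow": "7.0.0",
    "numpy": "1.18.5",
    "requests": "2.22.0",
    "tensorflow": "2.3.1",
    "tensorflow-hub": "0.9.0",
    "scikit-learn": "0.23.2",  # assumption: distribution name for sklearn
}


def compare_versions(pinned):
    """Return {package: (pinned_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(PINNED).items():
    status = "missing" if installed is None else installed
    print(f"{name}: pinned {wanted}, installed {status}")
```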

Workflow

  1. Download the Image IDs, Image labels, Boxes and Class Names from https://storage.googleapis.com/openimages/web/download.html
    (Train, Validation and Test of "Subset with Image-Level Labels" and Bounding Boxes of "Subset with Bounding Boxes")

  2. Put them in a folder structure like this:
    (image: inputFolder.png)

  3. Create folders named out and processing

  4. Run the script 1_create_class_id_to_image_ids.py
    Output:
    (image: script1.png)

  5. Run the script 2_create_class_list_by_image_count.py
    Output:
    (image: script2.png)

  6. Choose class names to train your classifier on from out/class_list_by_image_count and put them into a .txt file inside in/class_lists
    Example:
    (image: script1.png)

  7. Adjust all options in config.py under # image download to your liking

  8. Run the script 3_download_images.py
    Example Output:
    (image: script3.png)

  9. Run the script 4_delete_corrupt_images.py

  10. Adjust all options in config.py under # model training to your liking

  11. Run the script 5_train_model.py
    Output:
    (image: script5.png)
    Now you have a TensorFlow image classifier at out/saved_model

  12. If you killed the previous script because it took too long, run 6_extract_model_from_checkpoint.py

  13. Run the script 7_evaluate_model.py
    Output:
    (image: script7.png)

  14. DONE
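
The parallel image download in step 8 can be pictured roughly as follows. This is a minimal sketch using a thread pool from the standard library. It is not the code of 3_download_images.py, which may work differently (for example, it lists requests as a dependency). The names `fetch_url`, `download_all`, `jobs` and `max_workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(jobs, out_dir, fetch=fetch_url, max_workers=8):
    """Download (file_name, url) pairs in parallel into out_dir.

    Returns two sorted lists of file names: (saved, failed).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each pending download back to the file name it belongs to.
        futures = {pool.submit(fetch, url): name for name, url in jobs}
        for future in as_completed(futures):
            name = futures[future]
            try:
                (out_dir / name).write_bytes(future.result())
                saved.append(name)
            except Exception:
                # A dead link or timeout should not stop the other downloads.
                failed.append(name)
    return sorted(saved), sorted(failed)
```

A call such as `download_all([("0001.jpg", "https://example.com/0001.jpg")], "out/images")` would then write each successfully fetched file into the target folder; the URL and folder here are placeholders. Threads suit this task because the work is dominated by waiting on the network rather than by computation.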

Recommendations

  • The dataset is very noisy, so you might have to manually delete images that do not fit the label
  • Make sure you have enabled GPU support (see https://www.tensorflow.org/install/gpu)
  • Place your dataset on an SSD (500 MB/s should be enough) for faster training
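
To check whether GPU support is actually active before starting a long training run, a quick test like the following can help. It is a minimal sketch that relies on `tf.config.list_physical_devices`, which exists in TensorFlow 2.1 and later. The helper name `visible_gpus` is hypothetical.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif gpus:
    print("GPUs visible to TensorFlow:", ", ".join(gpus))
else:
    print("No GPU visible; training will run on the CPU")
```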

Contributors

lischilpp