
👗 Fashion MNIST

Introduction

Goal

Classify images of clothing from a dataset of Zalando's articles (source)

Dataset

Fashion-MNIST is a dataset of Zalando's article images, intended as a drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms.

  • training set (60,000 examples).
  • test set (10,000 examples).

Each example is a 28x28 grayscale image, associated with a label from 10 classes.

dataset_example (You can generate your own plots using app.utils.data_utils.plot_rand_images)

Content

Each image has 784 pixels in total (28 pixels in height × 28 pixels in width). Each pixel has a single pixel value indicating its lightness or darkness (higher numbers mean darker). The pixel value is an integer between 0 and 255. Each training and test example is assigned one of the following labels:

| Number | Label |
| ------ | ----- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
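
For quick reference in code, this mapping can be kept as a plain Python list (a small convenience sketch; the name class_names is illustrative and not part of the repository):

```python
# Index i gives the human-readable name of label i.
class_names = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]
```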

Used methods

The repository contains two machine learning algorithms used to build the image classification tool:

KNN (K-Nearest Neighbor)

KNN is a non-parametric classification algorithm. The basic idea behind KNN is simple: given a test vector or image to classify, find the k vectors or images in the training set that are "closest" to it. Those k neighbors carry k labels; assign the most frequent of those labels to the test example.

For KNN, all algorithm methods and the utilities for plotting images and manipulating data were written from scratch by myself (the TensorFlow library is used here only to download the training data).

Data preprocessing:

  • splitting the training data into "training" and "validation" sets
  • flattening the training data from shape (60000, 28, 28) to (60000, 784)
  • data normalization (dividing by 255 so that all dimensions are on approximately the same scale) - see the sketch after this list
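
A minimal sketch of these preprocessing steps, assuming the data is fetched with tf.keras.datasets.fashion_mnist (the variable names are illustrative):

```python
import numpy as np
import tensorflow as tf

# Download Fashion-MNIST (TensorFlow is used only to fetch the data).
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()

# Flatten each 28x28 image into a 784-dimensional vector: (60000, 784).
train_images = train_images.reshape(len(train_images), -1)

# Normalize pixel values from [0, 255] to [0, 1].
train_images = train_images.astype(np.float32) / 255.0

# Hold out 25% of the training data as a "validation" set for the k search.
split = int(0.75 * len(train_images))  # 45000 "train" / 15000 "validation"
x_train, x_val = train_images[:split], train_images[split:]
y_train, y_val = train_labels[:split], train_labels[split:]
```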

Examples:

normalization_example (Normalized data plotted with app.utils.data_utils.plot_rand_images)

Used metrics to measure “closeness”:

| Attempt_No | Algorithm |
| ---------- | --------- |
| 1 | Manhattan Distance |
| 2 | Hamming Distance |
| 3 | Euclidean Distance (L2) |
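
All three metrics can be written as vectorized NumPy functions (a sketch; the repository's own from-scratch implementations may differ in detail):

```python
import numpy as np

def manhattan(a, b):
    # L1 distance: sum of absolute pixel differences.
    return np.abs(a - b).sum(axis=-1)

def hamming(a, b):
    # Hamming distance: number of positions where the pixel values differ.
    return (a != b).sum(axis=-1)

def euclidean(a, b):
    # L2 distance: square root of the summed squared differences.
    return np.sqrt(((a - b) ** 2).sum(axis=-1))
```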

KNN is an exception to the general workflow for building and testing supervised machine learning models: no model is actually trained, so there is no conventional training and validation phase. All we can do is select the best k and "closeness" metric. To find them, we split our data into makeshift "training" and "validation" sets, which are then used by the distance calculation method.

  • Splitting proportion: 25%
  • "Train" images qty: 45000
  • "Validation" images qty: 15000

Example k search log:

k_search_log

(All logs can be found in /app/knn/results/logs/)

Finally, we found the best parameter to be k=7.

Both to test the algorithm on the test data and to search for the best k value, we first need to split the data into batches (in our case, 2000-2500 images each). KNN is very memory-hungry; without batching, the matrix calculations would require roughly 15-25 GB of free RAM.
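
A minimal sketch of batched k-NN prediction with the Euclidean (L2) metric, assuming the flattened, normalized arrays from the preprocessing sketch above (the function name and default batch size are illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(x_train, y_train, x_test, k=7, batch_size=2000):
    """Predict labels for x_test in memory-friendly batches."""
    preds = []
    for start in range(0, len(x_test), batch_size):
        batch = x_test[start:start + batch_size]
        # Pairwise squared L2 distances, shape (batch, n_train), via
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, which avoids building
        # a huge (batch, n_train, 784) difference tensor.
        d2 = ((batch ** 2).sum(axis=1, keepdims=True)
              + (x_train ** 2).sum(axis=1)
              - 2.0 * batch @ x_train.T)
        # Indices of the k nearest training images for each test image.
        nearest = np.argpartition(d2, k, axis=1)[:, :k]
        # Majority vote among the k neighbor labels.
        preds.extend(Counter(y_train[row]).most_common(1)[0][0]
                     for row in nearest)
    return np.array(preds)
```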


CNN (Convolutional Neural Networks)

CNN image classification takes an input image, processes it, and classifies it under certain categories. During training and testing, each input image passes through a series of convolution layers with filters (kernels), pooling layers, and fully connected (FC) layers; a final softmax function assigns probabilistic values between 0 and 1 to each class.

Data preprocessing:

  • splitting the training data into training and validation sets
  • reshaping the training data to 3D: from (60000, 28, 28) to (60000, 28, 28, 1)
  • data normalization (dividing by 255 so that all dimensions are on approximately the same scale)
  • data augmentation

Data augmentation:

This encompasses a wide range of techniques used to generate "new" training samples from the original ones by transforming the data in various ways.

Used augmentation techniques:

| Attempt_No | Transformations |
| ---------- | --------------- |
| 1 | rotation_range=90, horizontal_flip, vertical_flip |
| 2 | rotation_range=5, horizontal_flip, vertical_flip, zoom_range=0.1 |
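
A sketch of the second augmentation setup using Keras' ImageDataGenerator (the batch size and variable names are illustrative):

```python
import tensorflow as tf

(train_images, train_labels), _ = tf.keras.datasets.fashion_mnist.load_data()

# Reshape to (N, 28, 28, 1) and scale pixel values to [0, 1].
x_train = train_images.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# Augmentation attempt 2: small rotations, flips, and a slight zoom.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=5,
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=0.1,
)

# Yields endless batches of randomly transformed training images.
train_gen = datagen.flow(x_train, train_labels, batch_size=64)
```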

Example augmented data:

augmentated_data (You can generate your own augmented-image plots using app.utils.data_utils.plot_rand_images_from_gen)

Validation set to training set proportions: 1/4

Used models:

Model 1:

model_1

| Layers | Description |
| ------ | ----------- |
| Conv2D | 2D convolutional layer |
| MaxPooling2D | Max pooling operation for spatial data |
| Flatten | Flattens the input |
| Dense | Regular, fully-connected NN layer |

Model 2:

model_2

| Layers | Description |
| ------ | ----------- |
| Conv2D | 2D convolutional layer |
| MaxPooling2D | Max pooling operation for spatial data |
| Flatten | Flattens the input |
| Dropout | Reduces overfitting |
| Dense | Regular, fully-connected NN layer |

Model 3:

model_3

| Layers | Description |
| ------ | ----------- |
| Conv2D | 2D convolutional layer |
| MaxPooling2D | Max pooling operation for spatial data |
| BatchNormalization | Normalizes its input by re-centering and re-scaling |
| Flatten | Flattens the input |
| Dropout | Reduces overfitting |
| Dense | Regular, fully-connected NN layer |
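
For illustration, a Keras sketch of a model with this layer mix; the filter counts, kernel sizes, and dropout rate are assumptions, not the repository's exact architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical Model 3-style architecture: convolutions with batch
# normalization and pooling, then a dropout-regularized dense head.
model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.3),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 clothing classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```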

Training attempts:

| Attempt_no | Model_no | Batch size | Epochs | Augmentation_no | Time |
| ---------- | -------- | ---------- | ------ | --------------- | ---- |
| 1 | 1 | 64 | 150 | 1 | 3:28:06 |
| 2 | 2 | 64 | 120 | 2 | 2:25:54 |
| 3 | 2 | 2048 | 150 | 2 | 1:37:43 |
| 4 | 3 | 32 | 15 | 2 | 2:01:06 |

(All logs can be found in /app/cnn/results/logs/)
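
Tying the pieces together, a hedged sketch of how an attempt such as number 4 (model 3, batch size 32, 15 epochs) could be launched, assuming the datagen and model sketches above plus a held-out validation split (x_val, y_val):

```python
# Train on augmented batches; the 25% hold-out serves as validation data.
history = model.fit(
    datagen.flow(x_train, y_train, batch_size=32),
    epochs=15,
    validation_data=(x_val, y_val),
)
# history.history holds the per-epoch accuracy/loss used in the plots.
```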

Results

Example predictions:

example_pred_knn

(You can generate your own predictions using app.utils.data_utils.plot_image_with_predict_bar)

KNN

Manhattan Distance:

  • Accuracy: 10.08%
  • k: 7
  • Distance calculation method: Manhattan Distance
  • Train images qty: 45000
  • Total calculation time: 0:05:13
  • Total k searching time: 0:12:21

Hamming Distance:

  • Accuracy: 36.77%
  • k: 7
  • Distance calculation method: Hamming Distance
  • Train images qty: 45000
  • Total calculation time: 0:14:31
  • Total k searching time: 0:14:31

Euclidean Distance - Best result:

  • Accuracy: 84.77%
  • k: 7
  • Distance calculation method: Euclidean distance (L2)
  • Train images qty: 45000
  • Total calculation time: 0:05:05
  • Total k searching time: 0:18:51

Benchmark


As we can see, compared to the benchmark our result is quite good, especially considering the relatively short training time.

(benchmark_source)


CNN

First attempt:

Test name: 1_epoch150_batch64

  • Prediction accuracy: 84.61%
  • Model number: 1
  • Batch size: 64
  • Epochs: 150
  • Starting training data qty: 45000
  • Prediction loss: 0.44
  • Total calculation time: 3:28:06

(All models and logs you can find in: /app/cnn/results/models/)

(Accuracy and Losses plots)

As we can see, further increasing the number of epochs doesn't make much sense, because the accuracy and loss curves are flattening out more and more. We change our model and, knowing that the test images are oriented mostly upright, we reduce the rotation range to 5 and add a slight zoom to the augmentation.


Second attempt:

Test name: 2_epoch120_batch64

  • Prediction accuracy: 88.60%
  • Model number: 2
  • Batch size: 64
  • Epochs: 120
  • Starting training data qty: 45000
  • Prediction loss: 0.33
  • Total calculation time: 2:25:54

(Accuracy and Losses plots)

We achieve better results. Let's see if we can gain even more from this model by increasing the batch size and slightly increasing the number of epochs too.


Third attempt:

Test name: 2_epoch150_batch2048

  • Prediction accuracy: 85.22%
  • Model number: 2
  • Batch size: 2048
  • Epochs: 150
  • Starting training data qty: 45000
  • Prediction loss: 0.42
  • Total calculation time: 1:37:43

(Accuracy and Losses plots)

It's better than the first attempt, but worse than the last one. Our augmented data remains unchanged. We replace our model with a new one and return to small batches. We also reduce the number of epochs, because in the new model one epoch costs much more time than in the others (about 8 minutes per epoch).


Fourth attempt - best result:

Test name: 3_epoch15_batch32

  • Prediction accuracy: 90.49%
  • Model number: 3
  • Batch size: 32
  • Epochs: 15
  • Starting training data qty: 45000
  • Prediction loss: 0.26
  • Total calculation time: 2:01:06

(Accuracy and Losses plots)

Finally, we achieve the best result of 90.49%, which is relatively good (the best accuracy ever reported for Fashion-MNIST was 96.7%) (benchmark_source). Looking at the graphs, we can see that increasing the number of epochs even further could perhaps gain about one more percent. If you have a lot of time and you are curious about the results, you can check this yourself 😉

Usage

Software requirements

python 3.+, TensorFlow, Keras, numpy, matplotlib, IPython, scikit-learn

How to use it?

You just need to run cnn_main.py or knn_main.py, depending on which algorithm you want to test. You don't need to download the Fashion-MNIST data - it is downloaded automatically by the TensorFlow library.

python3 cnn_main.py

or

python3 knn_main.py

Expected output steps for main files

CNN_main:

  • making, compiling, fitting, and measuring accuracy for a model with default parameters, model, and data (you can change the epochs, batch size, or model as you want)
  • plotting graph with history of training accuracy and losses
  • plotting example training images
  • plotting example images with predictions

KNN_main:

  • searching for the best k value
  • making predictions for the test data and calculating accuracy for the k found in the previous step
  • plotting example training images
  • plotting example images with predictions

Both in the cnn and knn folders:

  • All plots are saved to .png files inside results/models.
  • Logs are written to .txt files inside the results/logs directory.

📫 Contact

Created by

[email protected] - feel free to contact me! ✊
