Image classifier for detecting cats and dogs

Pipeline

Dataset

To pull and process the dataset, run the following command:

make data

Version control

The dataset is version controlled with DVC. There are three tags for the dataset:

all_data: raw and processed dataset of 279 cats and 278 dogs for training and 70 cats and 70 dogs for testing
raw_only: raw dataset of 279 cats and 278 dogs for training and 70 cats and 70 dogs for testing
expanded_dataset: expands the dataset with ~25k pictures of dogs and cats with a ~80/20 train/test split
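
As a quick sanity check on the numbers above, the counts for the all_data and raw_only tags can be added up. This is only an illustrative calculation (the variable names are made up for the example); it shows that the small dataset holds 697 images in total, matching the size of the Kaggle dataset named in the project plan, and that its split is also close to 80/20.

```python
# Image counts for the all_data and raw_only tags, taken from the list above.
counts = {
    "train": {"cats": 279, "dogs": 278},
    "test": {"cats": 70, "dogs": 70},
}

train_total = sum(counts["train"].values())  # 279 + 278 = 557
test_total = sum(counts["test"].values())    # 70 + 70 = 140
total = train_total + test_total             # 557 + 140 = 697
train_fraction = train_total / total

print(f"train={train_total} test={test_total} total={total}")
print(f"train fraction={train_fraction:.3f}")  # prints "train fraction=0.799"
```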

To select a specific dataset, run the following command before make data, replacing tag with one of the tag names above:

git checkout tag data.dvc

Docker

Training can be containerized with Docker. To build the Docker image from the included Dockerfile (trainer.dockerfile), run the following from the root folder:

docker build -f trainer.dockerfile . -t trainer:latest

The following arguments can be passed to this Docker image:

-lr: Learning rate of the model. Default = 1e-4
-e: Number of epochs to train for. Default = 5
-bs: Batch size to use for the dataloader. Default = 16
-o: Optimizer to use in training. Default = Adam
-pt: Whether or not to use a pretrained ResNet50 CNN as the backbone. Default = True
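
For illustration, the flags above could be parsed along the following lines. This is a minimal sketch using Python's standard argparse module, not the project's actual training script: the function names build_parser and str2bool and the attribute names (epochs, batch_size, optimizer, pretrained) are assumptions made for the example. Only the flag names and defaults come from the list above. The -pt flag is parsed from text because the documented invocation passes it as -pt True.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style text into a bool for the -pt flag."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: flags and defaults mirror the README list,
    # while the dest names are illustrative assumptions.
    parser = argparse.ArgumentParser(description="Train the cat/dog classifier")
    parser.add_argument("-lr", dest="lr", type=float, default=1e-4,
                        help="learning rate of the model")
    parser.add_argument("-e", dest="epochs", type=int, default=5,
                        help="number of epochs to train for")
    parser.add_argument("-bs", dest="batch_size", type=int, default=16,
                        help="batch size for the dataloader")
    parser.add_argument("-o", dest="optimizer", type=str, default="Adam",
                        help="optimizer to use in training")
    parser.add_argument("-pt", dest="pretrained", type=str2bool, default=True,
                        help="use a pretrained ResNet50 backbone")
    return parser


# Parse the same arguments as the documented example invocation.
args = build_parser().parse_args(
    ["-lr", "0.0001", "-e", "5", "-bs", "16", "-o", "Adam", "-pt", "True"]
)
print(args)
```

Parsing -pt through a converter rather than type=bool matters because bool("False") is True in Python, so a plain bool type would silently ignore -pt False.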

The training script logs and reports performance to wandb. Make sure you are logged in to wandb by running wandb login. Then, when running the Docker image, launch it through wandb docker-run, e.g.:

wandb docker-run --name experiment5 trainer:latest -lr 0.0001 -e 5 -bs 16 -o Adam -pt True

Project plan

The project is carried out by five members: Abdulrahman Ramadan, Cristina Ailoaei, Jakob Ryttergaard Poulsen, Roza Hasso, Teakosheen Joulak

Dataset: Cats and Dogs image classification (https://www.kaggle.com/datasets/samuelcortinhas/cats-and-dogs-image-classification), which consists of 697 images of cats and dogs.
Project goal: Classify whether a given image contains a cat or a dog. We want to create a structured repository for training a neural network model, logging the results and performance, with reproducible experiments.
Framework: We will use PyTorch Image Models (timm), because it includes the necessary classes and code for initializing the neural network model.
Deep learning model: We will use a neural network (NN) to classify images of cats and dogs; the training script can optionally use a pretrained ResNet50 CNN as the backbone.

The tentative project plan is to use the following tools:

Code structure and versioning

Cookiecutter for a structured repository template
Git for version control of code
DVC for version control of data

Reproducibility

Docker for system configuration
Conda for Python environment configuration

Experiment logging and monitoring

Hydra for hyperparameter specification
Wandb for experiment logging and model performance

Code performance and structure

Snakeviz for inspecting code performance
flake8 for checking PEP 8 compliance in our code
isort for import ordering
