🍚 RiceGen

🌾 Diffusion Model applied to rice grain generation. 🌾

This is a simple academic project that aims to generate rice grain images using an unconditional diffusion model implemented from scratch. Although the model is unconditional, it is trained exclusively on similar rice images, so in practice it is constrained to generate rice grains.

Warning

Most of the image labels are incorrect; a mistake was made when generating them. The only image with correct labels is BS-64_E-120_IMGSIZE-16x16. For the others, assume the first image's label applies to all of them. This will be fixed as soon as possible.

Dataset and pre-processing

The chosen dataset was the Rice Image Dataset, which provides images of five rice varieties (Arborio, Basmati, Ipsala, Jasmine and Karacadag), 15k images each. As this project ran in a Google Colab Jupyter Notebook, the dataset was reduced to 1k images per variety using the following command:

ls | head -n 1000 | xargs -I {} mv {} ../<variety>-1k/

Once the data was shrunk, it became feasible to upload it to Drive. Inside Colab, after extracting the zip file containing all 5k images, the labeled PIL images had to be converted into a PyTorch Dataset structure using the CustomDataset class and then shuffled. 4k images (80%) were used to train the model and the remaining 1k (20%) for validation.
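The shuffle and 80/20 split can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the helper name `shuffled_split`, the seed, and the file names are hypothetical stand-ins.

```python
import random

def shuffled_split(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of `items` and split it into (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for the 5k labeled images: (file name, variety) pairs.
varieties = ["Arborio", "Basmati", "Ipsala", "Jasmine", "Karacadag"]
samples = [(f"{v}_{i}.jpg", v) for v in varieties for i in range(1000)]

train, val = shuffled_split(samples)
print(len(train), len(val))
```

With a PyTorch Dataset, `torch.utils.data.random_split` performs the same kind of random partition directly on the dataset object.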

The architecture

It is worth noting that this project is heavily inspired by and based on Diffusion Models - Live Coding Tutorial and Diffusion Models | Paper Explanation | Math Explained, using their code as a foundation, since a from-scratch implementation requires a certain level of expertise in math, mainly statistics.

The main idea is to gradually add noise to the images using a Markov chain; this is the forward process. Once the model is trained, it can perform the reverse process, turning noise into a coherent image. The model itself is a U-Net, an encoder-decoder architecture similar in shape to a Variational Autoencoder (VAE).
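The forward process can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the project's code: it uses a linear noise schedule with 1000 steps and variances from 1e-4 to 0.02 (the values from the DDPM paper listed in the references), and the function names are hypothetical. It relies on the closed form that lets any step t be sampled directly from the clean image.

```python
import math
import random

def linear_beta_schedule(steps=1000, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, evenly spaced (assumed DDPM defaults)."""
    return [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for all s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_image(pixels, t, alpha_bars, rng):
    """Sample x_t from x_0: sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
             for x, e in zip(pixels, eps)]
    return noisy, eps

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule())
# Stand-in for a flattened 16x16 grayscale image scaled to [-1, 1].
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16 * 16)]
for t in (0, 499, 999):
    x_t, eps = noise_image(x0, t, alpha_bars, rng)
    print(t, round(alpha_bars[t], 4))
```

As `alpha_bars[t]` shrinks toward zero, less of the original image survives and `x_t` approaches pure Gaussian noise. In DDPM-style training the network learns to predict `eps` from `x_t` and `t`, which is what makes the reverse process possible.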

NOTE: I will not cover the underlying math in depth, only apply the concepts studied. If you want to go further, feel free to read and watch the references.

Parameters

Several tests were run with different parameter values (batch size, image size, learning rate and number of epochs). You can see the results inside 5k-samples. Among these options, the image generated with the parameters BS-64_E-120_IMGSIZE-16x16 was the most similar to a real grain. BS-64_E-120_IMGSIZE-32x32_LR-00003 (learning rate = 0.0003) also gave a good result.
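The result names encode the run parameters, so they can be decoded mechanically. The sketch below is a guess at the convention based only on the two names above: the helper `parse_run_name` is hypothetical, and reading the LR digits as "first digit, decimal point, remaining digits" is an assumption inferred from LR-00003 meaning 0.0003.

```python
def parse_run_name(name):
    """Split a run name such as 'BS-64_E-120_IMGSIZE-16x16' into its parameters."""
    # Each underscore-separated part looks like KEY-VALUE.
    fields = dict(part.split("-", 1) for part in name.split("_"))
    width, height = (int(n) for n in fields["IMGSIZE"].split("x"))
    params = {
        "batch_size": int(fields["BS"]),
        "epochs": int(fields["E"]),
        "image_size": (width, height),
        "learning_rate": None,  # stays None when the name has no LR part
    }
    if "LR" in fields:
        digits = fields["LR"]
        # Assumed encoding: the decimal point was dropped, so '00003' means 0.0003.
        params["learning_rate"] = float(digits[0] + "." + digits[1:])
    return params

print(parse_run_name("BS-64_E-120_IMGSIZE-32x32_LR-00003"))
```

Names without an LR part, such as BS-64_E-120_IMGSIZE-16x16, do not record the learning rate, so the sketch leaves it as `None` instead of guessing a default.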

Results

Although limited by the Colab runtime resources, the model produced good images. The image below has the following parameters: Batch size = 64, Epochs = 120, Resolution = 16x16 and Learning Rate = 0.0003.

When the resolution is increased, the model clearly becomes unstable and tries to generate more grains, which sometimes leads to an undesired image.

Here's how the loss function behaved in the first image (16x16 resolution).

References

dtransposed. (2023, February 6). Diffusion Models - Live Coding Tutorial [Video]. YouTube. Retrieved March 8, 2024, from https://www.youtube.com/watch?v=S_il77Ttrmg.
Outlier. (2022, June 6). Diffusion Models | Paper Explanation | Math Explained [Video]. YouTube. Retrieved March 8, 2024, from https://www.youtube.com/watch?v=HoKDTa5jHvg.
Jia-Bin Huang. (2024, January 8). How I Understand Diffusion Models [Video]. YouTube. Retrieved March 8, 2024, from https://www.youtube.com/watch?v=i2qSxMVeVLI.
Luo, C. (2022). Understanding Diffusion Models: A Unified Perspective. arXiv. https://arxiv.org/pdf/2208.11970.pdf
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. arXiv. https://arxiv.org/pdf/2006.11239.pdf
