
Comments (6)

GantMan avatar GantMan commented on May 16, 2024

Hi!

You are correct. Training on the entire dataset would be most impressive, as I currently have more than 30,000 images per class. Additionally, I've increased the batch size to 32, which means 16,000 images are pulled in each epoch. Since I'm batching and using Stochastic Gradient Descent, I've found this to be a powerful method for continuously refining the model without overfitting.
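As a rough check on those numbers: 16,000 images per epoch at a batch size of 32 works out to 500 steps per epoch. The step count and the five-class total used below are assumptions for illustration, and neither is stated in this thread. The helper names are hypothetical as well.

```python
import math


def images_per_epoch(batch_size, steps_per_epoch):
    """Number of images drawn in one epoch of batched training."""
    return batch_size * steps_per_epoch


def epochs_to_cover(dataset_size, batch_size, steps_per_epoch):
    """Epochs needed before as many images as the dataset holds have been drawn."""
    return math.ceil(dataset_size / images_per_epoch(batch_size, steps_per_epoch))


# Assumption: 500 steps per epoch (inferred from 16,000 / 32, not stated in the thread).
per_epoch = images_per_epoch(batch_size=32, steps_per_epoch=500)

# Assumption: 5 classes at roughly 30,000 images each (class count not stated here).
total_images = 5 * 30_000
needed = epochs_to_cover(total_images, batch_size=32, steps_per_epoch=500)

print(per_epoch)  # 16000
print(needed)     # 10
```

Under those assumptions, each epoch touches only about a tenth of the data, so it takes on the order of ten epochs to draw as many images as the dataset contains.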

I also apply random perturbation to the images (noise, rotation, and cropping), which makes it practically impossible for the exact same image to be used twice.
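To illustrate the idea, here is a minimal pure-Python sketch of random perturbation on a grayscale image stored as a list of rows with pixel values in [0, 1]. It is not the pipeline used for this model: the function names, the crop size, and the noise level are all hypothetical, and rotation is simplified to multiples of 90 degrees.

```python
import random


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random offset."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - crop_h)
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def random_rot90(img, rng):
    """Rotate clockwise by a random multiple of 90 degrees (a simplification)."""
    for _ in range(rng.randrange(4)):
        img = [list(row) for row in zip(*img[::-1])]
    return img


def add_noise(img, sigma, rng):
    """Add Gaussian noise to every pixel and clamp the result to [0, 1]."""
    return [[min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in img]


def perturb(img, crop=(6, 6), sigma=0.05, rng=None):
    """Crop, rotate, then add noise; each call gives a different variant."""
    rng = rng or random.Random()
    out = random_crop(img, crop[0], crop[1], rng)
    out = random_rot90(out, rng)
    return add_noise(out, sigma, rng)


# Hypothetical 8x8 mid-gray image, perturbed twice with different seeds.
image = [[0.5] * 8 for _ in range(8)]
first = perturb(image, rng=random.Random(1))
second = perturb(image, rng=random.Random(2))
```

Because the crop offset, the rotation, and the per-pixel noise are all drawn at random, two passes over the same source image will almost never produce identical inputs.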

After some serious re-training/refining I'd love for you to re-test my latest model. I'm getting around 93% accuracy. This was trained longer on an even larger dataset.

Side note:

You say you're not familiar with Keras. If you use some other method, I'd love for you to contribute. I'm planning on writing a TensorFlow.js training version. It would be entertaining to see which ML framework performs best.

from nsfw_model.

misterDDF avatar misterDDF commented on May 16, 2024

Thanks for your reply.

Yes, I'm trying to reimplement this model with PyTorch, but for now the model accuracy only reaches about 83%. I think I should retrain it more seriously.


GantMan avatar GantMan commented on May 16, 2024

Here's a blog post I'm working on for how I trained the model:
https://medium.com/@gantlaborde/howto-ai-nsfw-detection-229a9725829c


devinhee avatar devinhee commented on May 16, 2024

Hi!
I retrained this model with Keras, but for now the model accuracy only reaches 89%. I guess something might be wrong with my dataset: I cannot get enough data for the sexy class and the drawings class. Where did you get the data for these two classes?


GantMan avatar GantMan commented on May 16, 2024

@devinhee - what's your data categorization error rate? If you did a basic pull from Reddit etc., you might have some significant misclassifications that are holding your model back.


devinhee avatar devinhee commented on May 16, 2024

> @devinhee - what's your data categorization error rate? If you did a basic pull from Reddit etc., you might have some significant misclassifications that are holding your model back.

The categorization error rate is roughly 20% to 25%. I did some basic data cleaning: I deleted bad images and removed duplicates. But I did not check every single image in each category.
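Since checking every image is not practical, one option is to audit a random sample and estimate the error rate from it. The sketch below is only an illustration of that idea: the helper names, the sample size of 200, and the count of 45 mislabeled images are hypothetical, and the interval uses a simple normal approximation.

```python
import math
import random


def pick_audit_sample(paths, k, seed=0):
    """Choose up to k distinct files at random for manual label checking."""
    rng = random.Random(seed)
    items = list(paths)
    return rng.sample(items, min(k, len(items)))


def estimate_error_rate(n_checked, n_wrong, z=1.96):
    """Return (rate, margin) using a normal-approximation 95% interval."""
    if n_checked <= 0:
        raise ValueError("n_checked must be positive")
    rate = n_wrong / n_checked
    margin = z * math.sqrt(rate * (1 - rate) / n_checked)
    return rate, margin


# Hypothetical file names; in practice these would be the images of one class.
all_paths = [f"img_{i}.jpg" for i in range(1000)]
to_review = pick_audit_sample(all_paths, k=200)

# Hypothetical audit result: 45 of the 200 reviewed images were mislabeled.
rate, margin = estimate_error_rate(n_checked=200, n_wrong=45)
print(f"estimated error rate: {rate:.1%} +/- {margin:.1%}")
```

Running the audit per class would also show whether the mislabeled images are concentrated in the classes that were hardest to collect.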
