Hi, I'm deeply impressed by your project and want to devote your model to my PyTorch p

Reproduce deepdanbooru_v3 in PyTorch about deepdanbooru HOT 8 CLOSED

kichangkim commented on August 27, 2024

Reproduce deepdanbooru_v3 in PyTorch

from deepdanbooru.

Comments (8)

KichangKim commented on August 27, 2024

Hi. I originally implemented DeepDanbooru using Microsoft's CNTK. TensorFlow behaves differently from CNTK in some respects.

So I tested some parameter changes for TensorFlow:

Increased learning rate: x1000 to x5000
Changed learning algorithm: Adam -> SGD

I do not have experience with PyTorch, but I think you should check PyTorch's default behaviour for internal network components such as conv, pooling, initializers, losses and so on. Their default parameters differ from library to library.
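One concrete default mismatch, offered as an illustration and not as something this thread confirms was the cause: Keras' `BatchNormalization` defaults to `momentum=0.99` and `epsilon=1e-3`, while PyTorch's `BatchNorm2d` defaults to `momentum=0.1` and `eps=1e-5`, and the two momentum values follow opposite conventions. The pure-Python sketch below (all helper names are hypothetical) compares the two running-statistic update rules.

```python
def keras_style_update(running, batch_stat, momentum=0.99):
    """Keras convention: momentum is the weight kept on the OLD running value."""
    return momentum * running + (1.0 - momentum) * batch_stat


def torch_style_update(running, batch_stat, momentum=0.1):
    """PyTorch convention: momentum is the weight given to the NEW batch value."""
    return (1.0 - momentum) * running + momentum * batch_stat


def to_torch_momentum(keras_momentum):
    """Convert a Keras-style momentum to the equivalent PyTorch-style value."""
    return 1.0 - keras_momentum


# Toy values (assumptions for illustration): running mean 0, batch mean 1.
running_mean, batch_mean = 0.0, 1.0
keras_default = keras_style_update(running_mean, batch_mean)
torch_default = torch_style_update(running_mean, batch_mean)
torch_matched = torch_style_update(
    running_mean, batch_mean, momentum=to_torch_momentum(0.99)
)
```

With the library defaults, the PyTorch-style running mean moves ten times faster than the Keras-style one; passing the converted momentum makes the two updates agree.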

Also, I noticed that you said "one-hot labels", but DeepDanbooru needs multi-label inputs, not multi-class ones. DeepDanbooru's label vector should contain multiple ones (if the image has multiple tags).
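To make the multi-label versus multi-class distinction concrete, here is a minimal pure-Python sketch (function names and the toy logits are assumptions, not DeepDanbooru code). A multi-label model scores each tag independently with a sigmoid and binary cross-entropy, whereas a softmax forces the tags to compete for a single unit of probability. In PyTorch these two setups roughly correspond to `torch.nn.BCEWithLogitsLoss` and `torch.nn.CrossEntropyLoss`.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over tags, each tag scored independently."""
    eps = 1e-12  # guards against log(0)
    losses = []
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        losses.append(-(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


# Toy example: two tags are present, one is absent.
logits = [3.0, 2.5, -4.0]
targets = [1, 1, 0]

per_tag = [sigmoid(z) for z in logits]  # independent probabilities
competing = softmax(logits)             # probabilities that sum to 1
loss = multilabel_bce(logits, targets)
```

With the sigmoid, both present tags can be above 0.5 at the same time; with the softmax, the second tag is pushed below 0.5 because the first one takes most of the probability mass.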

tellurion-kanata commented on August 27, 2024

Hi, thanks for your timely reply.
I think there is a small misunderstanding; sorry for my ambiguous description. I label each image with a paired vector with 7000 channels, setting 1 for each tag that is present and 0 for all others (this is what I meant by "one-hot"). I think we are doing the same thing on this point, according to your post on Reddit two years ago.
Example:
1girl, red_hair, black_hair,...1000 other tags...,blah
1,1,0,...,0
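The encoding described above can be sketched in a few lines of Python. The four-tag vocabulary and the `encode_tags` helper are hypothetical stand-ins for the 7000-tag list mentioned in the comment.

```python
def encode_tags(image_tags, vocabulary):
    """Return a 0/1 list aligned with `vocabulary`; unknown tags are ignored."""
    present = set(image_tags)
    return [1 if tag in present else 0 for tag in vocabulary]


# Toy vocabulary (assumption); the real one would hold about 7000 tags.
vocabulary = ["1girl", "red_hair", "black_hair", "blah"]
label = encode_tags(["red_hair", "1girl", "not_in_vocabulary"], vocabulary)
```

Here `label` has a 1 in the positions for `1girl` and `red_hair` and a 0 elsewhere, so one image can carry several positive entries at once.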

So you didn't make any specific improvements to the network or training strategy, such as assigning larger weights to low-frequency tags, adopting a loss function that down-weights negative samples (like focal loss), or resampling the dataset to make the distribution more balanced?
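For reference, a minimal sketch of the binary focal loss mentioned in the question, written in plain Python. Nothing in this thread indicates that DeepDanbooru uses it; the defaults `gamma=2.0` and `alpha=0.25` follow the values commonly quoted from the focal loss paper, and the example logits are made up.

```python
import math


def binary_focal_loss(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one tag: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def binary_cross_entropy(logit, target):
    """Plain binary cross-entropy for one tag, for comparison."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p
    return -math.log(p_t)


# An easy negative (confidently absent tag) and a hard positive (missed tag).
easy_negative = (-4.0, 0)
hard_positive = (-2.0, 1)

# Ratio of focal loss to plain cross-entropy for each case.
easy_ratio = binary_focal_loss(*easy_negative) / binary_cross_entropy(*easy_negative)
hard_ratio = binary_focal_loss(*hard_positive) / binary_cross_entropy(*hard_positive)
```

The easy negative keeps only a tiny fraction of its cross-entropy loss, while the hard positive keeps a much larger share, which is how the loss limits the influence of the many absent tags.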

I will check the differences in default settings, thank you!

KichangKim commented on August 27, 2024

1girl, red_hair, black_hair,...1000 other tags...,blah 1,1,0,...,0

Oh, it looks fine.

Additionally, I filtered the training dataset, using only images that have 20 or more general tags. Images with fewer than 20 general tags are simply ignored.
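A minimal sketch of this filter, assuming each image is represented as a dict with a `general_tags` list. That record layout and the helper names are hypothetical; only the threshold of 20 comes from the comment.

```python
MIN_GENERAL_TAGS = 20  # threshold stated in the comment above


def keep_image(record, min_general_tags=MIN_GENERAL_TAGS):
    """True if the record has at least `min_general_tags` general tags."""
    return len(record["general_tags"]) >= min_general_tags


def filter_dataset(records, min_general_tags=MIN_GENERAL_TAGS):
    """Keep only the records that pass the general-tag threshold."""
    return [r for r in records if keep_image(r, min_general_tags)]


# Toy records with 25, 19 and 20 general tags.
records = [
    {"id": 1, "general_tags": [f"tag_{i}" for i in range(25)]},
    {"id": 2, "general_tags": [f"tag_{i}" for i in range(19)]},
    {"id": 3, "general_tags": [f"tag_{i}" for i in range(20)]},
]
kept_ids = [r["id"] for r in filter_dataset(records)]
```

The image with 19 general tags is dropped, and the one with exactly 20 is kept.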

tellurion-kanata commented on August 27, 2024

Thank you for your kind help!
I will apply the same filtering to my dataset before my next try.

koke2c95 commented on August 27, 2024

@ydk-tellurion you may want to check out this paper: https://arxiv.org/abs/2108.01819

Code and dataset preparation are released at: ShuhongChen/bizarre-pose-estimator

A ResNet50 trained on this subset classifies better than RF5's

It proposes a cleaner Danbooru multi-label task specific to anime characters, with a processing guide for danbooru2019
Uninformative target tags: cleaned up (positive tags, under-tagged images, non-contextual relevance, etc.)
Severe class imbalance: a weighting trick to reduce class-imbalance/long-tailed issues
Training: a data augmentation strategy that avoids certain lossy transformations
Related image tasks can be solved using this backbone's features
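The list above does not spell out the weighting scheme, so the following is only one common option and not necessarily what the paper does: weight each tag's positive examples by the ratio of negatives to positives. This is the kind of value typically passed as `pos_weight` to PyTorch's `BCEWithLogitsLoss`. The helper name and the toy label matrix are assumptions.

```python
def positive_weights(label_matrix):
    """Per-tag weight = (#negatives / #positives); tags never seen get 1.0."""
    n_images = len(label_matrix)
    n_tags = len(label_matrix[0])
    weights = []
    for j in range(n_tags):
        positives = sum(row[j] for row in label_matrix)
        negatives = n_images - positives
        weights.append(negatives / positives if positives else 1.0)
    return weights


# Toy multi-hot labels: rows are images, columns are tags.
labels = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
weights = positive_weights(labels)
```

The frequent first tag gets a weight below 1, and the two rare tags get a weight of 3, so rare positives count more in the loss.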

tellurion-kanata commented on August 27, 2024

@YHJ2c95 Hi, thanks for the information!
I will test it in the coming days.

koke2c95 commented on August 27, 2024

@KichangKim @ydk-tellurion

After surveying Danbooru's tags,
I think multi-label classification is not a good approach.

Tags themselves carry semantics, but for humans; as a dataset, it is just an image bucket/collection

Concepts that cannot be described, or that are not present in the tags, have a serious effect: they lead to poorly trained models that support few downstream tasks,
or even to nothing being learned at all.
Perhaps adding pseudo-labels from unsupervised clustering could give a huge improvement.

Tags for non-original characters are biased, i.e. toward character traits
There are very few valid tags (hence the Danbooru limit of 2: there are many images, and two tags are enough to search)
There is no information about the components of the tag, for example
The tag set is not strongly delineated and there is repetition of meaning
New tags are difficult to synchronise with earlier images

web image classification

I think the Danbooru tagging task is web image classification, which is very different from ImageNet/CAPTCHA.
First, ImageNet/CAPTCHA images are closely tied to objects, whereas web images are not.
Second, if you literally remove the tags, the whole dataset is just a bucket/collection containing sets of different themes at different intensities.

Extend this further to groups, which can be subsetted and merged.
Then apply contrastive learning: instead of contrasting views of a single image, turn it into a group-to-group contrast.

The inspiration is a game (given a bunch of images, guess the tag) and a paper

KichangKim commented on August 27, 2024

Yes, simple multi-label classification has many limitations for semantic recognition. That is why I removed "copyright tags" from the training data.
