
mood's Introduction

MOOD

• 🤗 Model • 🐱 Code • 📃 MOODv1 • 📃 MOODv2

[framework figure]

MOODv1: Rethinking Out-of-Distribution Detection: Masked Image Modeling is All You Need (CVPR 2023)

The core of out-of-distribution (OOD) detection is to learn an in-distribution (ID) representation that is distinguishable from OOD samples. Previous work applied recognition-based methods to learn the ID features, which tend to learn shortcuts instead of comprehensive representations. In this work, we find, surprisingly, that simply using reconstruction-based methods can boost the performance of OOD detection significantly. We explore the main contributors to OOD detection in depth and find that reconstruction-based pretext tasks have the potential to provide a generally applicable and efficacious prior, which helps the model learn the intrinsic data distribution of the ID dataset. Specifically, we take Masked Image Modeling as the pretext task for our OOD detection framework (MOOD). Without bells and whistles, MOOD outperforms the previous SOTA on one-class OOD detection by 5.7%, on multi-class OOD detection by 3.0%, and on near-distribution OOD detection by 2.1%. It even surpasses 10-shot-per-class outlier-exposure OOD detection, although we do not use any OOD samples for our detection.

[MOODv1 figure]
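
As a quick illustration of the masked-image-modeling pretext task described above, here is a toy sketch (not the paper's training code; the shapes and masking ratio are illustrative): random patches are masked, the model reconstructs them, and the loss is measured only on the masked positions.

  # Toy masked-image-modeling step: mask random patches, then score
  # reconstruction on masked positions only. Shapes and the 40% masking
  # ratio are illustrative, not the paper's settings.
  import numpy as np

  rng = np.random.default_rng(0)
  patches = rng.normal(size=(196, 768))   # 14x14 grid of patch embeddings
  mask = rng.random(196) < 0.4            # mask roughly 40% of patches
  corrupted = patches.copy()
  corrupted[mask] = 0.0                   # replace masked patches with a mask token

  reconstruction = corrupted              # stand-in for the model's output
  loss = ((reconstruction[mask] - patches[mask]) ** 2).mean()  # loss on masked patches only
  print(loss)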

MOODv2: Masked Image Modeling for Out-of-Distribution Detection

The crux of effective out-of-distribution (OOD) detection lies in acquiring a robust in-distribution (ID) representation that is distinct from OOD samples. While previous methods predominantly leaned on recognition-based techniques for this purpose, they often resulted in shortcut learning and lacked comprehensive representations. In our study, we conducted a comprehensive analysis across distinct pre-training tasks and various OOD score functions. The results highlight that feature representations pre-trained through reconstruction yield a notable enhancement and narrow the performance gap among score functions, suggesting that even simple score functions can rival complex ones when leveraging reconstruction-based pretext tasks. Because reconstruction-based pretext tasks adapt well to various score functions, they hold promising potential for further expansion. Our OOD detection framework, MOODv2, employs the masked image modeling pretext task. Without bells and whistles, MOODv2 improves AUROC by 14.30% to reach 95.68% on ImageNet and achieves 99.98% on CIFAR-10.

[results table]
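
To make the term "score function" concrete, here is a minimal sketch of the maximum softmax probability (MSP) score, one of the simplest scores the abstract's claim covers; the logits are stand-in values, not output from this repo.

  # Minimal sketch of the maximum softmax probability (MSP) OOD score.
  # Illustrative only: the logits are made-up values for one image.
  import numpy as np

  def msp_score(logits):
      # Softmax over class logits; the max probability serves as the ID-ness score.
      z = logits - logits.max()
      p = np.exp(z) / np.exp(z).sum()
      return p.max()

  logits = np.array([2.0, 0.5, -1.0])  # stand-in classifier logits
  print(msp_score(logits))             # low values suggest the input may be OOD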

mood's People

Contributors

harshnandwana, julietlewis0110, julietljy


mood's Issues

MoCov3 checkpoint

Thanks for your excellent work! Would it be possible to release the MoCov3 checkpoint pre-trained on ImageNet-22k reported in Table 1? In the original MoCo paper, the checkpoints are obtained by pre-training on ImageNet-1k.

Problem with visualizer package

Hi, I tried to run the model but got an error related to the visualizer package, which is not in the requirements file. I installed it using pip but get this error:

ImportError: cannot import name 'get_local' from 'visualizer' (/opt/conda/lib/python3.8/site-packages/visualizer.py)

Is there a specific version of visualizer that needs to be installed?

Thanks for your help.
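
The get_local symbol in this traceback matches the attention-visualization helper from the luo3300612/Visualizer project on GitHub, not the unrelated visualizer package on PyPI; that attribution is an assumption, not confirmed by the repo. A minimal usage sketch under that assumption:

  # Sketch assuming visualizer is https://github.com/luo3300612/Visualizer,
  # installed from source (not from PyPI); toy_attention is a hypothetical example.
  import numpy as np
  from visualizer import get_local

  get_local.activate()  # must run before the decorated code is imported

  @get_local('attention_map')  # caches the local variable 'attention_map'
  def toy_attention(q, k):
      attention_map = q @ k.T  # toy stand-in for an attention computation
      return attention_map

  toy_attention(np.eye(2), np.eye(2))
  print(get_local.cache)  # cached locals, keyed by function/variable name

If that is the right package, installing it from the GitHub source rather than from PyPI should resolve the ImportError.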

The results of ID CIFAR-100 -> OOD CIFAR-10

After fine-tuning on CIFAR-100 with the model that was self-supervised pre-trained and then intermediate fine-tuned on ImageNet-22k, I got 87.169 AUROC for ID CIFAR-100 -> OOD CIFAR-10, which differs significantly from the 98.3 AUROC reported in the paper. How can I reproduce the reported results?
The command line I ran and the results are shown below:

Command line:

OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=2 run_class_finetuning.py --model beit_base_patch16_224 --data_path /home/ubuntu/code/open-set/MOOD --data_set cifar100 --nb_classes 100 --disable_eval_during_finetuning --finetune /home/ubuntu/code/open-set/MOOD/beit_base_patch16_224_pt22k_ft22k.pth --output_dir logs_cifar100_test --batch_size 128 --lr 1.5e-3 --update_freq 1 --warmup_epochs 5 --epochs 90 --layer_decay 0.65 --drop_path 0.2 --weight_decay 0.05 --layer_scale_init_value 0.1 --clip_grad 3.0

Results:

[results screenshot]

What is the difference between "fine-tuning" and "intermediate fine-tuning"?

Q1: Are the following statements right?

[1] Intermediate fine-tuning is performed on ImageNet-21k?

[2] Fine-tuning is performed on the one-class ID dataset?

Q2: Where is the related code?

The paper says:

  1. Pre-train the Masked Image Modeling ViT on ImageNet-21k.
  2. Apply intermediate fine-tuning of the ViT on ImageNet-21k.
  3. Apply fine-tuning of the pre-trained ViT on the ID dataset.

The code contains:

a. Pretrained models
b. Fine-tuning on In-Distribution Dataset

In my view, 'a' corresponds to '1' and 'b' corresponds to '3'. Where is the code for '2'?
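
For what it's worth, the checkpoint name used elsewhere in these issues, beit_base_patch16_224_pt22k_ft22k.pth, suggests that step 2 is already baked into the released weights: pt22k plausibly marks MIM pre-training on ImageNet-22k (step 1) and ft22k the supervised intermediate fine-tuning on ImageNet-22k (step 2), leaving only step 3 to run in this repo. That reading is an assumption. A stub-level sketch of the three stages, with every function hypothetical:

  # Hypothetical three-stage pipeline; the stage functions are illustrative
  # stubs, not the repo's actual API.
  def pretrain_mim(model, dataset):
      # Stage 1: self-supervised masked image modeling pre-training.
      return model  # stub

  def finetune_cls(model, dataset):
      # Supervised classification fine-tuning (used for stages 2 and 3).
      return model  # stub

  vit = object()                           # placeholder for a ViT backbone
  vit = pretrain_mim(vit, "imagenet21k")   # 1. MIM pre-training
  vit = finetune_cls(vit, "imagenet21k")   # 2. intermediate fine-tuning
  vit = finetune_cls(vit, "cifar100")      # 3. fine-tuning on the ID dataset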

Can't find training script for MOODv2

Hey team,
I can't find the method to train or fine-tune the model. I also want to use this model as an API: if I pass an image, the model should tell whether it is OOD or not. Is there a prediction script that reports whether an input is an outlier?
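
The src/demo.py command shown in the next issue is the closest thing in the repo to a single-image prediction entry point. As a conceptual illustration only, an OOD decision usually reduces to thresholding a score; the following sketch uses a nearest-neighbor cosine score with made-up data and a made-up threshold:

  # Conceptual sketch: flag an image as OOD by thresholding a score.
  # The cosine nearest-neighbor score and the threshold are illustrative.
  import numpy as np

  def is_ood(feature, id_bank, threshold):
      f = feature / np.linalg.norm(feature)
      bank = id_bank / np.linalg.norm(id_bank, axis=1, keepdims=True)
      score = float((bank @ f).max())  # similarity to the closest ID feature
      return score < threshold         # low similarity -> likely an outlier

  rng = np.random.default_rng(0)
  bank = rng.normal(size=(100, 768))   # stand-in ID training features
  query = rng.normal(size=768)         # stand-in feature of a test image
  print(is_ood(query, bank, threshold=0.5))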

Mahalanobis distance calculation is not working and throws these errors

python src/demo.py --img_path 1000_F_26800115_YlmErNLIVZeNZXPzUc3z4GAD1gGkVABu.jpg --cfg /app/mmpretrain/work_dirs/beitv2_beit-base-p16_8xb256-amp-coslr-300e_in1k/beitv2_beit-base-p16_8xb256-amp-coslr-300e_in1k.py --checkpoint pretrain/epoch_19.pth --fc outputs/fc.pkl --id_train_feature outputs/imagenet_train.pkl --id_val_feature outputs/imagenet_train.pkl --methods Mahalanobis
=> Loading model
/root/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180588308/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Loads checkpoint by local backend from path: pretrain/epoch_19.pth
=> Loading image
=> Extracting feature
Extracted Feature: (1, 768)
w.shape=(8192, 768), b.shape=(8192,)
image path: 1000_F_26800115_YlmErNLIVZeNZXPzUc3z4GAD1gGkVABu.jpg
=> Loading features
feature_id_train.shape=(91710, 768), feature_id_val.shape=(91710, 768)
=> Computing logits...
=> Computing softmax...
Computing classwise mean feature: 0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "src/demo.py", line 305, in
main()
File "src/demo.py", line 263, in main
fs = feature_id_train[train_labels == i]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 91710 but corresponding boolean dimension is 200000
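
One plausible reading of this traceback (an assumption, not a confirmed diagnosis): the label list has 200,000 entries, matching the datalists/imagenet2012_train_random_200k.txt list used in the next issue, while only 91,710 features were extracted, so the boolean mask and the feature array disagree along dimension 0. Note also that the command passes the same outputs/imagenet_train.pkl for both --id_train_feature and --id_val_feature. A sketch of the failing step with a consistency check, using the shapes from the traceback:

  # Sketch of the class-wise mean step where the IndexError occurs; with the
  # traceback's shapes the assertion below fires, which is the diagnostic.
  import numpy as np

  feature_id_train = np.zeros((91710, 768))   # extracted ID training features
  train_labels = np.zeros(200000, dtype=int)  # labels parsed from the image list

  assert train_labels.shape[0] == feature_id_train.shape[0], (
      "feature/label count mismatch: re-extract features with the same "
      "--img_list that produced train_labels")

  # Reached only when the counts match: per-class means for Mahalanobis.
  means = [feature_id_train[train_labels == i].mean(axis=0) for i in range(1000)]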

Problems while running src/extract_feature_vit.py

If I run either of these commands:

  1. python3 src/extract_feature_vit.py $IMAGENET_PATH --cfg configs/beit-base-p16_224px.py --checkpoint pretrain/epoch_19.pth --fc_save_path output/fc.pkl
  2. python src/extract_feature_vit.py $IMAGENET_PATH --out_file outputs/imagenet_train.pkl --cfg configs/beit-base-p16_224px.py --checkpoint pretrain/beitv2-base.pth --img_list datalists/imagenet2012_train_random_200k.txt

I get the error below:

Traceback (most recent call last):
File "src/extract_feature_vit.py", line 12, in
from .list_dataset import ImageFilelist
ImportError: attempted relative import with no known parent package
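
This ImportError is standard Python behavior rather than a repo-specific bug: a file executed directly (python src/extract_feature_vit.py) has no parent package, so the relative import from .list_dataset fails. Two usual workarounds, assuming list_dataset.py sits next to extract_feature_vit.py in src/:

  # Workaround 1: make the import absolute inside src/extract_feature_vit.py.
  # When the script runs directly, src/ is on sys.path, so this resolves.
  from list_dataset import ImageFilelist  # instead of: from .list_dataset import ImageFilelist

Workaround 2 is to keep the relative import, add an __init__.py to src/, and run the file as a module from the repo root, e.g. python -m src.extract_feature_vit $IMAGENET_PATH with the same flags as above.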
