
AnomalyGPT: Detecting Industrial Anomalies using Large Vision-Language Models


🌐 Project Page • 🤗 Online Demo • 📃 Paper • 🤖 Model • 📹 Video

Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang


Catalogue:

  • 1. Introduction
  • 2. Running AnomalyGPT Demo
  • 3. Train Your Own AnomalyGPT
  • 4. Examples
  • License
  • Citation
  • Acknowledgments


1. Introduction: [Back to Top]


AnomalyGPT is the first Large Vision-Language Model (LVLM) based Industrial Anomaly Detection (IAD) method that can detect anomalies in industrial images without the need for manually specified thresholds. Existing IAD methods only provide anomaly scores and require manual threshold setting, while existing LVLMs cannot detect anomalies in images. AnomalyGPT can not only indicate the presence and location of anomalies but also provide information about the image.

[Figure: AnomalyGPT]

We leverage a pre-trained image encoder and a Large Language Model (LLM) to align IAD images with their corresponding textual descriptions via simulated anomaly data. We employ a lightweight, visual-textual feature-matching-based image decoder to obtain localization results, and design a prompt learner that provides fine-grained semantics to the LLM, fine-tuning the LVLM with prompt embeddings. Our method can also detect anomalies in previously unseen items when only a few normal samples are provided.
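
To make the data flow concrete, here is a minimal runnable sketch of the pipeline described above. Every component is a dummy stand-in (random tensors, and names such as image_encoder, image_decoder, and prompt_learner are hypothetical); it illustrates only how features move between the modules, not the repository's actual code.

# Illustrative sketch of the AnomalyGPT data flow; all components are dummies.
import torch

def image_encoder(image):                      # stand-in for the frozen ImageBind encoder
    return torch.randn(16 * 16, 1024)          # 16x16 patches, 1024-d features (made-up dims)

def image_decoder(patch_feats, text_feats):
    # Visual-textual feature matching: cosine similarity of each patch against
    # an "abnormal" text feature yields per-patch anomaly scores, so no
    # manually specified threshold is needed downstream.
    sims = torch.nn.functional.cosine_similarity(
        patch_feats, text_feats["abnormal"].expand_as(patch_feats), dim=-1)
    return sims.reshape(16, 16)                # coarse pixel-level localization result

def prompt_learner(anomaly_map):
    # Turns the localization result into prompt embeddings that inject
    # fine-grained semantics into the LLM's input sequence.
    return anomaly_map.flatten()[:8]           # fake prompt embeddings

image = torch.randn(3, 224, 224)
text_feats = {"abnormal": torch.randn(1024)}
patches = image_encoder(image)
anomaly_map = image_decoder(patches, text_feats)
prompts = prompt_learner(anomaly_map)          # would be fed to the LLM
print(anomaly_map.shape, prompts.shape)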


2. Running AnomalyGPT Demo [Back to Top]

2.1 Environment Installation

Build the Anaconda environment:

apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6 git-lfs
wget https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
bash Anaconda3-2024.02-1-Linux-x86_64.sh


During installation, the installer asks whether to initialize conda for your shell:

You can undo this by running `conda init --reverse $SHELL`? [yes|no]
[no] >>> yes

Answer yes. In this walkthrough Anaconda is installed to /work/cyh_anomaly/3; the installer lists the shell files it checked and finishes by reporting:

modified      /home/peter/.bashrc

Add the Anaconda installation to your PATH:

export PATH="/work/cyh_anomaly/3/bin:$PATH"
echo 'export PATH="/work/cyh_anomaly/3/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

If you accidentally break the PATH variable, you can reset it like this:

source /work/cyh_anomaly/3/envs/agpt/bin/activate
export PATH=/work/cyh_anomaly/3/envs/agpt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
# check that the PATH is now clean
echo $PATH

To update the PATH variable permanently:

nano ~/.bashrc

# Add this line at the bottom of the file:
# Set PATH for the agpt virtual environment
export PATH=/work/cyh_anomaly/3/envs/agpt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
source ~/.bashrc

conda create -n agpt python=3.10
conda activate agpt

Clone the repository locally:

git clone https://github.com/CASIA-IVA-Lab/AnomalyGPT.git
cd AnomalyGPT

Install the required packages:

pip install -r requirements.txt

2.2 Prepare ImageBind Checkpoint:

You can download the pre-trained ImageBind model using this link. After downloading, put the downloaded file (imagebind_huge.pth) in the ./pretrained_ckpt/imagebind_ckpt/ directory.
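
As an optional sanity check (our own snippet, not part of the repository), you can confirm the checkpoint file downloaded intact and loads with plain PyTorch:

# Optional sanity check: confirm the ImageBind checkpoint loads (not part of the repo).
import torch

ckpt_path = "./pretrained_ckpt/imagebind_ckpt/imagebind_huge.pth"
state = torch.load(ckpt_path, map_location="cpu")
print(f"loaded {len(state)} entries from {ckpt_path}")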

2.3 Prepare Vicuna Checkpoint:

To prepare the pre-trained Vicuna model, please follow the instructions provided [here].

2.4 Prepare Delta Weights of AnomalyGPT:

We use the pre-trained parameters from PandaGPT to initialize our model. You can get the weights of PandaGPT trained with different strategies from the table below. In our experiments and online demo, we use Vicuna-7B and openllmplayground/pandagpt_7b_max_len_1024 due to limited computational resources. Better results are expected when switching to Vicuna-13B.

Base Language Model       Maximum Sequence Length   Huggingface Delta Weights Address
Vicuna-7B (version 0)     512                       openllmplayground/pandagpt_7b_max_len_512
Vicuna-7B (version 0)     1024                      openllmplayground/pandagpt_7b_max_len_1024
Vicuna-13B (version 0)    256                       openllmplayground/pandagpt_13b_max_len_256
Vicuna-13B (version 0)    400                       openllmplayground/pandagpt_13b_max_len_400

Please put the downloaded 7B/13B delta weights file (pytorch_model.pt) in the ./pretrained_ckpt/pandagpt_ckpt/7b/ or ./pretrained_ckpt/pandagpt_ckpt/13b/ directory.

After that, you can download AnomalyGPT weights from the table below.

Setup and Datasets                                            Weights Address
Unsupervised on MVTec-AD                                      AnomalyGPT/train_mvtec
Unsupervised on VisA                                          AnomalyGPT/train_visa
Supervised on MVTec-AD, VisA, MVTec-LOCO-AD and CrackForest   AnomalyGPT/train_supervised

After downloading, put the AnomalyGPT weights in the ./code/ckpt/ directory.
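
Before moving on, you may want to verify that every checkpoint sits where the code expects it. The helper below is our own addition, not part of the repository; the PandaGPT path shown assumes the 7B delta weights:

# Verify checkpoint placement (helper script, not part of the repository).
import os

expected = [
    "./pretrained_ckpt/imagebind_ckpt/imagebind_huge.pth",
    "./pretrained_ckpt/pandagpt_ckpt/7b/pytorch_model.pt",  # or .../13b/pytorch_model.pt
    "./code/ckpt/",                                         # AnomalyGPT weights directory
]
for path in expected:
    print(("OK     " if os.path.exists(path) else "MISSING"), path)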

In our online demo, we use the supervised setting as the default model for an enhanced user experience. You can also try the other weights locally.

2.5 Deploying Demo

Upon completion of the previous steps, you can run the demo locally:

cd ./code/
python web_demo.py

3. Train Your Own AnomalyGPT [Back to Top]

Prerequisites: Before training the model, make sure the environment is properly installed and the checkpoints of ImageBind, Vicuna, and PandaGPT are downloaded.

3.1 Data Preparation:

You can download the MVTec-AD dataset from [this link] and VisA from [this link]. You can also download the pre-training data of PandaGPT from [here]. After downloading, put the data in the [./data] directory.

The [./data] directory should look like:

data
|---pandagpt4_visual_instruction_data.json
|---images
|-----|-- ...
|---mvtec_anomaly_detection
|-----|-- bottle
|-----|-----|----- ground_truth
|-----|-----|----- test
|-----|-----|----- train
|-----|-- capsule
|-----|-- ...
|---VisA
|-----|-- split_csv
|-----|-----|--- 1cls.csv
|-----|-----|--- ...
|-----|-- candle
|-----|-----|--- Data
|-----|-----|-----|----- Images
|-----|-----|-----|--------|------ Anomaly 
|-----|-----|-----|--------|------ Normal 
|-----|-----|-----|----- Masks
|-----|-----|-----|--------|------ Anomaly 
|-----|-----|--- image_anno.csv
|-----|-- capsules
|-----|-----|----- ...
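
To confirm the layout matches the tree above before training, you can run a short helper script (our own addition; it spot-checks a few representative paths only):

# Check that ./data matches the expected layout (helper, not part of the repo).
import os

required = [
    "data/pandagpt4_visual_instruction_data.json",
    "data/images",
    "data/mvtec_anomaly_detection/bottle/train",
    "data/mvtec_anomaly_detection/bottle/test",
    "data/mvtec_anomaly_detection/bottle/ground_truth",
    "data/VisA/split_csv/1cls.csv",
    "data/VisA/candle/Data/Images/Normal",
]
missing = [p for p in required if not os.path.exists(p)]
print("layout OK" if not missing else "missing: " + ", ".join(missing))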

3.2 Training Configurations

The table below shows the training hyperparameters used in our experiments. The hyperparameters were selected based on the constraints of our computational resources, i.e., 2 × RTX 3090 GPUs.

Base Language Model   Epoch Number   Batch Size   Learning Rate   Maximum Length
Vicuna-7B             50             16           1e-3            1024

3.3 Training AnomalyGPT

To train AnomalyGPT on the MVTec-AD dataset, please run the following commands:

cd ./code
bash ./scripts/train_mvtec.sh

The key arguments of the training script are as follows:

  • --data_path: The data path for the json file pandagpt4_visual_instruction_data.json.
  • --image_root_path: The root path for training images of PandaGPT.
  • --imagebind_ckpt_path: The path of ImageBind checkpoint.
  • --vicuna_ckpt_path: The directory that saves the pre-trained Vicuna checkpoints.
  • --max_tgt_len: The maximum sequence length of training instances.
  • --save_path: The directory which saves the trained delta weights. This directory will be automatically created.
  • --log_path: The directory which saves the log. This directory will be automatically created.

Note that the epoch number can be set via the epochs argument in the ./code/config/openllama_peft.yaml file, and the learning rate can be set in ./code/dsconfig/openllama_peft_stage_1.json.
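
For convenience, the sketch below reads both settings. The key locations (epochs at the top level of the YAML, lr under optimizer.params in the DeepSpeed JSON) are assumptions based on common config layouts; verify them against your local files:

# Inspect the epoch number and learning rate (sketch; the key paths are
# assumptions -- check your local config files).
import json
import yaml  # pip install pyyaml

with open("./code/config/openllama_peft.yaml") as f:
    cfg = yaml.safe_load(f)
print("epochs:", cfg.get("epochs"))

with open("./code/dsconfig/openllama_peft_stage_1.json") as f:
    ds = json.load(f)
print("lr:", ds.get("optimizer", {}).get("params", {}).get("lr"))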


4. Examples

  • An image of concrete with a crack.
  • A cracked capsule.
  • An image of a cut hazelnut.
  • A damaged bottle.
  • A photo of a normal carpet.
  • A photo of a piece of wood with a defect.
  • A piece of normal fabric.


License

AnomalyGPT is licensed under the CC BY-NC-SA 4.0 license.


Citation:

If you found AnomalyGPT useful in your research or applications, please kindly cite using the following BibTeX:

@article{gu2023anomalygpt,
  title={AnomalyGPT: Detecting Industrial Anomalies using Large Vision-Language Models},
  author={Gu, Zhaopeng and Zhu, Bingke and Zhu, Guibo and Chen, Yingying and Tang, Ming and Wang, Jinqiao},
  journal={arXiv preprint arXiv:2308.15366},
  year={2023}
}

Acknowledgments:

We borrow some code and the pre-trained weights from PandaGPT. Thanks for their wonderful work!
