lipurple / grounded-diffusion

Open-vocabulary Object Segmentation with Diffusion Models

Home Page: https://lipurple.github.io/Grounded_Diffusion/

Python 8.82% Jupyter Notebook 91.10% Shell 0.07% Dockerfile 0.01% Makefile 0.01% CSS 0.01% Batchfile 0.01%

grounded-generation text-to-image-diffusioin-model

grounded-diffusion's Introduction

Open-vocabulary Object Segmentation with Diffusion Models

This repository contains the official PyTorch implementation of grounded diffusion: https://arxiv.org/abs/2301.05221.

Requirements

A suitable conda environment named grounded-diffusion can be created and activated with:

conda env create -f environment.yaml
conda activate grounded-diffusion

Model Zoo

https://drive.google.com/drive/folders/1HlagN6jVhmC_UbrOAy133LkN4Qgf2Scv?usp=sharing

Train

Before training, please download the checkpoint of the off-the-shelf detector into a folder called mmdetection/checkpoint/.

python train.py --class_split 1 --train_data random --save_name pascal_1_random

Inference

python test.py --sd_ckpt 'xxx/stable_diffusion.ckpt' \
--grounding_ckpt 'xxx/grounding_module.pth' \
--prompt "a photo of a lion on a mountain top at sunset" \
--category "lion"

Citation

If you use this code for your research or project, please cite:

@inproceedings{li2023grounded,
  title     = {Open-vocabulary Object Segmentation with Diffusion Models},
  author    = {Li, Ziyi and Zhou, Qinye and Zhang, Xiaoyun and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2023}
}

Acknowledgements

Many thanks to the code bases of Stable Diffusion, CLIP, and taming-transformers.

grounded-diffusion's Issues

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

python test.py --sd_ckpt 'xxx/stable_diffusion.ckpt' \
--grounding_ckpt 'xxx/grounding_module.pth' \
--prompt "a photo of a lion on a mountain top at sunset" \
--category "lion"

Hello, running this command produces the error in the title. The stable_diffusion.ckpt file itself may be corrupted.
Could you please help confirm? Thank you.
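A possible first check for this error, offered as a sketch and not as the authors' fix: checkpoints written by recent versions of torch.save are zip archives, and "failed finding central directory" typically means the file begins like a zip archive but is truncated, for example after an interrupted download. The helper name checkpoint_looks_intact is hypothetical, and it only uses Python's standard zipfile module. A checkpoint saved in the older non-zip format would also return False here, so treat the result as a hint.

```python
import zipfile
from pathlib import Path


def checkpoint_looks_intact(path):
    """Return True if `path` exists and has a readable zip end-of-archive record.

    Hypothetical helper (not part of the repository). It does not load the
    weights; it only checks that the zip container is complete.
    """
    p = Path(path)
    # A missing or empty file cannot be a valid checkpoint.
    if not p.is_file() or p.stat().st_size == 0:
        return False
    # is_zipfile looks for the end-of-central-directory record, which is
    # exactly what a truncated download is missing.
    return zipfile.is_zipfile(str(p))
```

If this returns False for stable_diffusion.ckpt or grounding_module.pth, downloading the file again and comparing its size with the size shown at the download source is a reasonable next step.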

teaser image

Hello, thank you for your excellent work. I am very interested in the input images used in the teaser figure.

Could you post them?

How to get models/ldm/stable-diffusion-v1/model.ckpt?

When I run

python train.py --class_split 1 --train_data random --save_name pascal_1_random

it fails with: No such file or directory: 'models/ldm/stable-diffusion-v1/model.ckpt'

Can you give me some advice?

How to evaluate the checkpoint after training?

I followed the README and ran python train.py --class_split 1 --train_data random --save_name pascal_1_random to train the model and generate the checkpoints. How do I evaluate them now? I can't find the evaluation code in your project.
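As a starting point while no evaluation script is provided, intersection-over-union (IoU) is a standard way to compare a predicted mask with a reference mask. The sketch below is pure Python and works on nested lists of 0/1 values. The function name mask_iou, the input format, and the choice to return 1.0 when both masks are empty are all assumptions made for illustration, and whether this matches the authors' evaluation protocol is likewise an assumption.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equally shaped nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the repository.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Two empty masks agree perfectly; this convention is an assumption.
    return intersection / union if union else 1.0
```

Averaging this value over categories would give a mean IoU; how the images and reference masks are loaded depends on the dataset and is left out here.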

The confusing definition of open-vocabulary segmentation

Thanks for your excellent work!

I am confused about the definition of open-vocabulary segmentation in two respects:

I note that the segmentation model (i.e., MaskFormer in the paper) is trained on the full set of PASCAL VOC and COCO categories, while the data are synthesized by Stable Diffusion.
Can an open-vocabulary segmentation protocol access the complete set of categories during training? In my opinion, the names of unseen (novel) classes should only be available at test time, not at training time. Otherwise, it is not really open-vocabulary.

I hope the authors can help me better understand this paper!

Thanks!

torch version

Which version of torch is used?

Inference Speed

Thanks a lot for your great work! May I know what the inference speed is for generating grounded images?

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

When I run inference,

python test.py --sd_ckpt 'lipurple/stable_diffusion.ckpt' \
--grounding_ckpt 'lipurple/grounding_module.pth' \
--prompt "a photo of a lion on a mountain top at sunset" \
--category "lion"

I get: RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory.

Can you give me some advice?

Thanks

How long does it take to retrain this model?

How long does it take to retrain this model?
Is there a way to train this model in parallel? Training on a single GPU seems to take a very long time.

model cannot be found in train.py

When I run

python train.py --class_split 1 --train_data random --save_name pascal_1_random

I get: FileNotFoundError: mmdetection/checkpoint/mask_rcnn_swin-s-p4-w7_fpn_fp16_ms-crop-3x_coco_20210903_104808-b92c91f1.pth can not be found.
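A small pre-flight check can surface missing files before train.py starts. The two paths below are copied from the error messages quoted in the issues on this page; whether train.py needs any other files is not known from this page, so the list is an assumption. The names REQUIRED_CHECKPOINTS and missing_checkpoints are hypothetical.

```python
from pathlib import Path

# Paths taken from the error messages reported in the issues; the list may be
# incomplete (assumption).
REQUIRED_CHECKPOINTS = [
    "models/ldm/stable-diffusion-v1/model.ckpt",
    "mmdetection/checkpoint/mask_rcnn_swin-s-p4-w7_fpn_fp16_ms-crop-3x_coco_20210903_104808-b92c91f1.pth",
]


def missing_checkpoints(root=".", required=REQUIRED_CHECKPOINTS):
    """Return the relative paths in `required` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the repository root to list files that still need downloading.
    for rel in missing_checkpoints():
        print(f"missing: {rel}")
```

Running this from the repository root prints one line per file that still has to be downloaded and placed at the listed location.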

Release of COCO training script

Thanks for the great work!

At the moment, the provided train.py seems to be hardwired to train on the Pascal VOC dataset. Is there a plan to release the COCO training script that can be used to reproduce results in the paper?
