showlab / visorgpt


[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT

License: MIT License

Python 100.00%

controlnet gpt image-generation diffusion-models


visorgpt's Issues

Questions about Kullback-Leibler divergence calculation in Table 4.

I really appreciate your impressive work.

How is the KL divergence in Table 4 calculated? Is it the average of the per-category KL divergences, or is it computed once across all categories as a whole? Also, am I right that only 6400 generated samples are used for COCO?

Thanks in advance for your kind response.
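To make the two readings in this question concrete, here is a minimal sketch of both. It is not the authors' evaluation code: the function names, the smoothing constant `eps`, and the idea of comparing histograms over a quantized quantity (`bins`) are all assumptions made for illustration, and at least one category is assumed to appear in both inputs.

```python
import math
from collections import Counter

def histogram(values, bins, eps=1e-8):
    """Smoothed, normalized histogram of `values` over the fixed list `bins`."""
    counts = Counter(values)
    total = sum(counts[b] for b in bins) + eps * len(bins)
    return [(counts[b] + eps) / total for b in bins]

def kl(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def per_category_mean_kl(real, generated, bins):
    """Reading 1: one KL per category, then an unweighted mean over categories."""
    cats = sorted(set(real) & set(generated))
    total = sum(kl(histogram(real[c], bins), histogram(generated[c], bins)) for c in cats)
    return total / len(cats)

def pooled_kl(real, generated, bins):
    """Reading 2: pool every category into one histogram, then a single KL."""
    def pool(per_category):
        return [v for values in per_category.values() for v in values]
    return kl(histogram(pool(real), bins), histogram(pool(generated), bins))
```

The two readings can disagree sharply. If category "a" and category "b" swap their distributions between the real and generated sets, the pooled histograms are identical and the pooled KL is zero, while the per-category mean is large. That is why the distinction asked about here matters when comparing numbers.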

Does VISORGPT support generating multiple instances with different sizes in an image?

Can VISORGPT generate objects of different sizes in the same image, for example one large building and many small windows?

What does the generated_sentence.txt generated after training represent?

Hi, I followed the steps you provided and trained for 200,000 steps. When I ran inference, the generated_sentence.txt I got was different from the output sequence shown in the paper. With "box; multiple instances; medium; 4; 0; apple, apple, cake, knife;" in beginning.txt, I get "[CLS] box; multiple instances; medium; 4; 0; apple, apple, cake, knife; [ ] 176 ymin 188 xmax 236 ymax 426 ] [SEP] banana xmin 112 ymin 181 xmax 167 ymax 429 ] [SEP] ##r xmin 138 ymin 189 xmax 180 ymax 427 ] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] cell phone xmin 83 ymin 197 xmax 143 ymax 448 ] [SEP] [SEP] [SEP] 94 ymin 202 xmax 139 ymax 422 ] [SEP] [SEP] [SEP] [SEP] [SEP] [ SEP] xmin 144 ymin 182 xmax 230 ymax 420 ] [SEP] [SEP] [SEP] 185 ] [SEP] [SEP] [ xmin [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] xmin . ...". What does [SEP] mean here?
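For context, [CLS] and [SEP] look like the BERT-style special tokens (sequence start and separator) from the tokenizer vocabulary, and "##r" looks like a WordPiece continuation piece; that reading is an assumption based on the token names, not something confirmed in this thread. Under that assumption, a minimal sketch for pulling the complete boxes out of such a line is shown below. The function name `extract_boxes` is hypothetical and is not part of the VisorGPT code.

```python
import re

# Matches one complete box: "xmin <int> ymin <int> xmax <int> ymax <int>".
BOX_PATTERN = re.compile(
    r"xmin\s+(\d+)\s+ymin\s+(\d+)\s+xmax\s+(\d+)\s+ymax\s+(\d+)"
)

def extract_boxes(sequence):
    """Return every complete (xmin, ymin, xmax, ymax) tuple found in `sequence`.

    Special tokens such as [CLS] and [SEP], and fragments that lack one of the
    four coordinates, never match the pattern and are therefore skipped.
    """
    return [tuple(int(g) for g in m.groups()) for m in BOX_PATTERN.finditer(sequence)]
```

Applied to the output quoted above, this keeps only the well-formed boxes and ignores fragments such as "94 ymin 202 xmax 139 ymax 422", which has no xmin. Counting how many complete boxes survive is a quick way to see how far the generated line is from the format shown in the paper.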

.cache/torch_extensions/py38_cu117/utils/utils.so

Thanks for sharing your excellent work!

When I ran the training command, I got the following error.

Loading extension module utils...
Traceback (most recent call last):
  File "pretrain.py", line 121, in <module>
    main()
  File "pretrain.py", line 117, in main
    trainer.train_and_validate(args)
  File "/mnt/data-1/data/jiagang.zhu/VisorGPT/train/tencentpretrain/trainer.py", line 56, in train_and_validate
    worker(args.local_rank, None, args, model_for_training, model_for_dataloader)
  File "/mnt/data-1/data/jiagang.zhu/VisorGPT/train/tencentpretrain/trainer.py", line 593, in worker
    model_for_training, optimizer, _, scheduler = deepspeed.initialize(
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/__init__.py", line 125, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 336, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1284, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1533, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer(
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 165, in __init__
    util_ops = UtilsBuilder().load()
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 485, in load
    return self.jit_load(verbose)
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 520, in jit_load
    op_module = load(
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1534, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/mnt/data-1/data/jiagang.zhu/miniconda3/envs/visorgpt/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1936, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
  File "<frozen importlib._bootstrap>", line 556, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1166, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /home/user001/.cache/torch_extensions/py38_cu117/utils/utils.so: cannot open shared object file: No such file or directory

Have you encountered this problem before? Thank you.
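One possible cause, offered as an assumption since the thread does not confirm it, is a stale JIT build directory: an earlier compile of the DeepSpeed `utils` op was interrupted, so the directory exists but `utils.so` was never written, and PyTorch tries to import a file that is not there. A minimal sketch for clearing such a directory so the next run rebuilds it is shown below. The helper name `clear_stale_build` is hypothetical, and the path in the usage note is taken from the ImportError above.

```python
import shutil
from pathlib import Path

def clear_stale_build(build_dir, lib_name="utils.so"):
    """Remove `build_dir` if it exists but does not contain `lib_name`.

    Returns True when the directory was removed, False otherwise
    (directory missing, or the compiled library is already present).
    """
    build_dir = Path(build_dir).expanduser()
    if build_dir.is_dir() and not (build_dir / lib_name).exists():
        shutil.rmtree(build_dir)
        return True
    return False
```

For the error above the call would be `clear_stale_build("~/.cache/torch_extensions/py38_cu117/utils")`, followed by rerunning the training command. If the rebuild fails again, the compiler output printed during the JIT build is the next place to look, since a missing compiler or a CUDA toolkit mismatch would produce the same ImportError.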

Error while loading Gligen

Hello,

While loading GLIGEN, I got the following error. Are these the right weights?

  File "gradio_demo.py", line 35, in <module>
    g_config, g_grounding_tokenizer_input = build_gligen_model(ckpt=gligen_model_path)
  File "/home/VisorGPT/demo/GLIGEN/gligen/gligen_inference_box.py", line 229, in build_gligen_model
    model, autoencoder, text_encoder, diffusion, config = load_ckpt(ckpt)
  File "/home/VisorGPT/demo/GLIGEN/gligen/gligen_inference_box.py", line 99, in load_ckpt
    text_encoder.load_state_dict( saved_ckpt["text_encoder"]  )
  File "/opt/conda/envs/visorgpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for FrozenCLIPEmbedder:
        Unexpected key(s) in state_dict: "transformer.text_model.embeddings.position_ids".
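An "Unexpected key(s)" error naming only `position_ids` often means the checkpoint was saved with a library version that stored that buffer while the installed version no longer registers it, rather than that the weights are wrong. That diagnosis is an assumption here. A minimal sketch of one workaround, filtering the extra key before loading, is shown below. The helper name `drop_unexpected_keys` is hypothetical and is not part of GLIGEN or PyTorch.

```python
# Key reported as unexpected in the RuntimeError above.
POSITION_IDS_KEY = "transformer.text_model.embeddings.position_ids"

def drop_unexpected_keys(state_dict, unexpected=(POSITION_IDS_KEY,)):
    """Return a shallow copy of `state_dict` without the keys listed in `unexpected`."""
    return {key: value for key, value in state_dict.items() if key not in unexpected}
```

In `load_ckpt` this would be used as `text_encoder.load_state_dict(drop_unexpected_keys(saved_ckpt["text_encoder"]))`. Passing `strict=False` to `load_state_dict` is a blunter alternative, because it also hides any other missing or unexpected keys.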

The original training data file

Hi Sierkinhane,
Very nice work. Could you provide the original training data file so we can understand how your data is organized, and explain how to process it into visorgpt_dagger_train_seq.bin?

Thanks.

Some issues in training

Thank you for your excellent work. While trying to reproduce the training process, I ran into the following problem. Have you encountered it before?

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/embeddings/word_embedding.py", line 27, in forward
    emb = self.embedding(src)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/embeddings/embedding.py", line 27, in forward
    emb = embedding(src, seg)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/models/model.py", line 33, in forward
    emb = self.embedding(src, seg)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/trainer.py", line 160, in forward_propagation
    loss_info = model(src, tgt, seg)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/trainer.py", line 110, in train
    loss = self.forward_propagation(batch, model)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/trainer.py", line 638, in worker
    trainer.train(args, gpu_id, rank, train_loader, model_for_training, optimizer, scheduler)
  File "/storage/zhaoliuqing/code/VisorGPT/train/tencentpretrain/trainer.py", line 56, in train_and_validate
    worker(args.local_rank, None, args, model_for_training, model_for_dataloader)
  File "/storage/zhaoliuqing/code/VisorGPT/train/pretrain.py", line 117, in main
    trainer.train_and_validate(args)
  File "/storage/zhaoliuqing/code/VisorGPT/train/pretrain.py", line 121, in <module>
    main()
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

Looking forward to your reply!
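The message says the index tensor passed to the embedding lookup and the embedding weights live on different devices, so either the batch or the model was not moved to `cuda:0`. Which of the two applies here is not stated in the thread. Assuming it is the batch, a minimal sketch of moving every tensor in a batch before the forward pass is shown below. The helper name `move_batch_to_device` is hypothetical, and it relies only on each element exposing a `.to(device)` method, as PyTorch tensors do.

```python
def move_batch_to_device(batch, device):
    """Return a tuple where every element that has a `.to` method is moved to `device`.

    Elements without `.to` (for example plain ints) are passed through unchanged.
    """
    return tuple(item.to(device) if hasattr(item, "to") else item for item in batch)
```

Before the `model(src, tgt, seg)` call in the traceback this would read `src, tgt, seg = move_batch_to_device((src, tgt, seg), "cuda:0")`. If the batch is already on the GPU, the mirror-image check is whether the model itself was moved there before training started.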
