
lggan's Introduction



Local and Global GAN

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
Hao Tang, Dan Xu, Yan Yan, Philip H.S. Torr, Nicu Sebe.
In CVPR 2020.
The repository offers the official implementation of our paper in PyTorch.

In the meantime, check out our related ACM MM 2020 paper Dual Attention GANs for Semantic Image Synthesis, and TIP 2021 paper Layout-to-Image Translation with Double Pooling Generative Adversarial Networks.

Framework

Cross-View Image Translation Results on Dayton and CVUSA

Semantic Image Synthesis Results on Cityscapes and ADE20K

Generated Segmentation Maps on Cityscapes

Generated Segmentation Maps on ADE20K

Generated Feature Maps on Cityscapes

Creative Commons License
Copyright (C) 2020 University of Trento, Italy.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use, please contact [email protected].

Cross-View Image Translation

Please refer to the cross_view_translation folder for more details.

Semantic Image Synthesis

Please refer to the semantic_image_synthesis folder for more details.

Acknowledgments

The source code for cross-view image translation is inspired by SelectionGAN, and the source code for semantic image synthesis is inspired by GauGAN/SPADE.

Related Projects

SelectionGAN | ECGAN | DPGAN | DAGAN | PanoGAN | Guided-I2I-Translation-Papers

Citation

If you use this code for your research, please cite our papers.

LGGAN

@article{tang2022local,
  title={Local and Global GANs with Semantic-Aware Upsampling for Image Generation},
  author={Tang, Hao and Shao, Ling and Torr, Philip HS and Sebe, Nicu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2022}
}

@inproceedings{tang2019local,
  title={Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Torr, Philip HS and Sebe, Nicu},
  booktitle={CVPR},
  year={2020}
}

SelectionGAN

@article{tang2022multi,
  title={Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation},
  author={Tang, Hao and Torr, Philip HS and Sebe, Nicu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2022}
}

@inproceedings{tang2019multi,
  title={Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation},
  author={Tang, Hao and Xu, Dan and Sebe, Nicu and Wang, Yanzhi and Corso, Jason J and Yan, Yan},
  booktitle={CVPR},
  year={2019}
}

ECGAN

@inproceedings{tang2023edge,
  title={Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis},
  author={Tang, Hao and Qi, Xiaojuan and Sun, Guolei and Xu, Dan and Sebe, Nicu and Timofte, Radu and Van Gool, Luc},
  booktitle={ICLR},
  year={2023}
}

DPGAN

@article{tang2021layout,
  title={Layout-to-image translation with double pooling generative adversarial networks},
  author={Tang, Hao and Sebe, Nicu},
  journal={IEEE Transactions on Image Processing (TIP)},
  volume={30},
  pages={7903--7913},
  year={2021}
}

DAGAN

@inproceedings{tang2020dual,
  title={Dual Attention GANs for Semantic Image Synthesis},
  author={Tang, Hao and Bai, Song and Sebe, Nicu},
  booktitle={ACM MM},
  year={2020}
}

PanoGAN

@article{wu2022cross,
  title={Cross-View Panorama Image Synthesis},
  author={Wu, Songsong and Tang, Hao and Jing, Xiao-Yuan and Zhao, Haifeng and Qian, Jianjun and Sebe, Nicu and Yan, Yan},
  journal={IEEE Transactions on Multimedia (TMM)},
  year={2022}
}

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Hao Tang ([email protected]).

Collaborations

I'm always interested in meeting new people and hearing about potential collaborations. If you'd like to work together or get in contact with me, please email [email protected]. Some of our projects are listed here.


If you really want to do something, you'll find a way. If you don't, you'll find an excuse.


lggan's Issues

About the arXiv paper images

Hello, I saw your paper on arXiv:
https://arxiv.org/pdf/1912.12215.pdf

I have a question: in Fig. 1, Fig. 16, and Fig. 17, the Global and Global+Local images look very similar.

If the Local branch affects the Global+Local result, I would expect Local pixels in the white areas of the Local weight map to appear in the output, but there is no such tendency. Am I misunderstanding something?

Where is the SAU module?

Could you please point to the part of the code that corresponds to the SAU module from Sec. 3.2, "Semantic-Aware Upsampling"?

Links to the pretrained models for Semantic Image Synthesis are broken

Hey!

Thank you for making the source code available.
The links to the pretrained models are broken. Can you fix this?
Thanks!

disi.unitn.it/~hao.tang/uploads/models/LGGAN/cityscapes_pretrained.tar.gz
disi.unitn.it/~hao.tang/uploads/models/LGGAN/ade_pretrained.tar.gz

Size mismatch for conv weights when running test_ade.sh

Hi,
Thanks for sharing your work. When I tried to reproduce the results using the ADE20K pretrained checkpoint, I ran into the following error. I hope you can take a look:

```
LGGAN/semantic_image_synthesis$ sh test_ade.sh
----------------- Options ---------------
aspect_ratio: 1.0
batchSize: 1 [default: 2]
cache_filelist_read: False
cache_filelist_write: False
checkpoints_dir: ./checkpoints
contain_dontcare_label: True
crop_size: 256
dataroot: ./datasets/ade20k [default: ./datasets/cityscapes/]
dataset_mode: ade20k [default: coco]
display_winsize: 256
gpu_ids: 0 [default: 0,1]
how_many: inf
init_type: xavier
init_variance: 0.02
isTrain: False [default: None]
label_nc: 150
load_from_opt_file: False
load_size: 256
max_dataset_size: 9223372036854775807
model: pix2pix
nThreads: 0
name: LGGAN_ade [default: label2coco]
nef: 16
netG: lggan
ngf: 64
no_flip: True
no_instance: True
no_pairing_check: False
norm_D: spectralinstance
norm_E: spectralinstance
norm_G: spectralspadesyncbatch3x3
num_upsampling_layers: normal
output_nc: 3
phase: test
preprocess_mode: resize_and_crop
results_dir: ./results [default: ./results/]
serial_batches: True
use_vae: False
which_epoch: 200 [default: latest]
z_dim: 256
----------------- End -------------------
dataset [ADE20KDataset] of size 2000 was created
Network [LGGANGenerator] was created. Total number of parameters: 114.6 million. To see the architecture, do print(network).
Traceback (most recent call last):
  File "test_ade.py", line 20, in <module>
    model = Pix2PixModel(opt)
  File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 25, in __init__
    self.netG, self.netD, self.netE = self.initialize_networks(opt)
  File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 121, in initialize_networks
    netG = util.load_network(netG, 'G', opt.which_epoch, opt)
  File "/home/you/Work/LGGAN/semantic_image_synthesis/util/util.py", line 208, in load_network
    net.load_state_dict(weights)
  File "/home/you/anaconda3/envs/torch1.4-py36-cuda10.1-tf1.14/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LGGANGenerator:
Unexpected key(s) in state_dict: "deconv5_35.weight", "deconv5_35.bias", "deconv5_36.weight", "deconv5_36.bias", "deconv5_37.weight", "deconv5_37.bias", "deconv5_38.weight", "deconv5_38.bias", "deconv5_39.weight", "deconv5_39.bias", "deconv5_40.weight", "deconv5_40.bias", "deconv5_41.weight", "deconv5_41.bias", "deconv5_42.weight", "deconv5_42.bias", "deconv5_43.weight", "deconv5_43.bias", "deconv5_44.weight", "deconv5_44.bias", "deconv5_45.weight", "deconv5_45.bias", "deconv5_46.weight", "deconv5_46.bias", "deconv5_47.weight", "deconv5_47.bias", "deconv5_48.weight", "deconv5_48.bias", "deconv5_49.weight", "deconv5_49.bias", "deconv5_50.weight", "deconv5_50.bias", "deconv5_51.weight", "deconv5_51.bias".
size mismatch for conv1.weight: copying a param with shape torch.Size([64, 151, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 36, 7, 7]).
size mismatch for deconv9.weight: copying a param with shape torch.Size([3, 156, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 105, 3, 3]).
size mismatch for fc2.weight: copying a param with shape torch.Size([51, 64]) from checkpoint, the shape in current model is torch.Size([35, 64]).
size mismatch for fc2.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([35]).

```
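Errors like this usually mean the model was constructed with different label options than the checkpoint was trained with: the checkpoint's `conv1.weight` expects 151 input channels (likely ADE20K's 150 labels plus a "don't care" channel), while the instantiated model has 36. A quick way to diagnose such failures before calling `load_state_dict` is to compare parameter shapes between the model and the checkpoint. Below is a minimal, framework-agnostic sketch; `diff_state_dicts` is an illustrative helper (not part of this repository), and the dicts stand in for `model.state_dict()` and the loaded checkpoint, using shapes taken from the error above:

```python
def diff_state_dicts(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings and report differences.

    Returns three sorted lists: parameters missing from the checkpoint,
    parameters only present in the checkpoint, and parameters whose
    shapes disagree between the two.
    """
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    mismatched = sorted(
        name for name in set(model_shapes) & set(ckpt_shapes)
        if model_shapes[name] != ckpt_shapes[name]
    )
    return missing, unexpected, mismatched


# Shapes mimicking the error above: the checkpoint expects 151 input
# channels, the instantiated model only 36.
model = {"conv1.weight": (64, 36, 7, 7), "fc2.bias": (35,)}
ckpt = {
    "conv1.weight": (64, 151, 7, 7),
    "fc2.bias": (51,),
    "deconv5_35.weight": (64, 64, 3, 3),
}

missing, unexpected, mismatched = diff_state_dicts(model, ckpt)
print(unexpected)   # parameters only in the checkpoint
print(mismatched)   # parameters whose shapes disagree
```

In PyTorch the same comparison can be made with `{k: tuple(v.shape) for k, v in model.state_dict().items()}` against the loaded checkpoint dict; if only extra keys (not shape mismatches) are reported, `load_state_dict(weights, strict=False)` will ignore them, but shape mismatches mean the options used to build the network (e.g. the dataset's label count) must match those used at training time.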

Cannot access the pretrained datasets and models

Sorry to bother you, but I have a question: I cannot access your pretrained datasets and models. Could you please share them?
