
vdvae's Introduction

Very Deep VAEs

Repository for the paper "Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images" (https://arxiv.org/abs/2011.10650)

Some model samples and a visualization of how it generates them: [image]

This repository is tested with PyTorch 1.6, CUDA 10.1, Numpy 1.16, Ubuntu 18.04, and V100 GPUs.

Setup

Several additional packages are required, including NVIDIA Apex:

pip install imageio
pip install mpi4py
pip install sklearn
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

You will also need to download the dataset for whichever configuration you want to run:

./setup_cifar10.sh
./setup_imagenet.sh imagenet32
./setup_imagenet.sh imagenet64
./setup_ffhq256.sh
./setup_ffhq1024.sh  /path/to/images1024x1024  # this one depends on you first downloading the `images1024x1024` subfolder from https://github.com/NVlabs/ffhq-dataset on your own

Training models

Hyperparameters all reside in hps.py. We use 2 GPUs for our CIFAR-10 runs and 32 for the rest of the models. A lower batch size is also possible; it results in slower learning and may also require a lower learning rate.

The mpiexec arguments you use for runs with more than 1 node depend on the configuration of your system, so please adapt accordingly.

mpiexec -n 2 python train.py --hps cifar10
mpiexec -n 32 python train.py --hps imagenet32
mpiexec -n 32 python train.py --hps imagenet64
mpiexec -n 32 python train.py --hps ffhq256
mpiexec -n 32 python train.py --hps ffhq1024

Restoring saved models

For convenience, we have included training checkpoints which can be restored in order to confirm performance, continue training, or generate samples.

ImageNet 32

# 119M parameter model, trained for 1.7M iters (about 2.5 weeks on 32 V100)
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/imagenet32-iter-1700000-log.jsonl
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/imagenet32-iter-1700000-model.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/imagenet32-iter-1700000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/imagenet32-iter-1700000-opt.th
python train.py --hps imagenet32 --restore_path imagenet32-iter-1700000-model.th --restore_ema_path imagenet32-iter-1700000-model-ema.th --restore_log_path imagenet32-iter-1700000-log.jsonl --restore_optimizer_path imagenet32-iter-1700000-opt.th --test_eval
# should give 2.6364 nats per dim, which is 3.80 bpd
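
For reference, the two numbers in the comment differ only by a factor of ln 2 (bits per dim = nats per dim / ln 2). A quick standalone check in plain Python (not part of the repo):

import math

nats_per_dim = 2.6364                       # test ELBO reported above
print(f"{nats_per_dim / math.log(2):.2f}")  # -> 3.80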

ImageNet 64

# 125M parameter model, trained for 1.6M iters (about 2.5 weeks on 32 V100)
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-log.jsonl
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-model.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-opt.th
python train.py --hps imagenet64 --restore_path imagenet64-iter-1600000-model.th --restore_ema_path imagenet64-iter-1600000-model-ema.th --restore_log_path imagenet64-iter-1600000-log.jsonl --restore_optimizer_path imagenet64-iter-1600000-opt.th --test_eval
# should be 2.44 nats, or 3.52 bits per dim

FFHQ-256

# 115M parameters, trained for 1.7M iterations (or about 2.5 weeks) on 32 V100
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-log.jsonl
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-model.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq256-iter-1700000-opt.th
python train.py --hps ffhq256 --restore_path ffhq256-iter-1700000-model.th --restore_ema_path ffhq256-iter-1700000-model-ema.th --restore_log_path ffhq256-iter-1700000-log.jsonl --restore_optimizer_path ffhq256-iter-1700000-opt.th --test_eval
# should be 0.4232 nats, or 0.61 bits per dim

FFHQ-1024

# 115M parameters, trained for 1.7M iterations (or about 2.5 weeks) on 32 V100
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-log.jsonl
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-model.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-opt.th
python train.py --hps ffhq1024 --restore_path ffhq1024-iter-1700000-model.th --restore_ema_path ffhq1024-iter-1700000-model-ema.th --restore_log_path ffhq1024-iter-1700000-log.jsonl --restore_optimizer_path ffhq1024-iter-1700000-opt.th --test_eval
# should be 1.678 nats, or 2.42 bits per dim

CIFAR-10

# 39M parameters, trained for ~1M iterations with early stopping (a little less than a week on 2 GPUs)
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/cifar10-seed0-iter-900000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/cifar10-seed1-iter-1050000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/cifar10-seed2-iter-650000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/cifar10-seed3-iter-1050000-model-ema.th
python train.py --hps cifar10 --restore_ema_path cifar10-seed0-iter-900000-model-ema.th --test_eval
python train.py --hps cifar10 --restore_ema_path cifar10-seed1-iter-1050000-model-ema.th --test_eval
python train.py --hps cifar10 --restore_ema_path cifar10-seed2-iter-650000-model-ema.th --test_eval
python train.py --hps cifar10 --restore_ema_path cifar10-seed3-iter-1050000-model-ema.th --test_eval
# seeds 0, 1, 2, 3 should give 2.879, 2.842, 2.898, 2.864 bits per dim, for an average of 2.87 bits per dim.
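
The average quoted above is just the mean of the four seed results; a quick standalone check (plain Python, not part of the repo):

seed_bpds = [2.879, 2.842, 2.898, 2.864]           # seeds 0-3 from the comment above
print(round(sum(seed_bpds) / len(seed_bpds), 2))   # -> 2.87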

vdvae's People

Contributors

adityaramesh, rewonc


vdvae's Issues

CUDA error: invalid device ordinal

I am using Slurm with two nodes, each having one GPU. However, the mpiexec command does not work and I get CUDA error: invalid device ordinal. I have tried the srun command as well to make sure resources are allocated to each node correctly, but I get the same error.

Traceback (most recent call last):
  File "train.py", line 136, in <module>
    main()
  File "train.py", line 126, in main
    H, logprint = set_up_hyperparams()
  File "/lustre06/project/6054857/mehranag/orig/train_helpers.py", line 115, in set_up_hyperparams
    setup_mpi(H)
  File "/lustre06/project/6054857/mehranag/orig/train_helpers.py", line 78, in setup_mpi
    torch.cuda.set_device(H.local_rank)
  File "/home/mehranag/stuff/venv/lib/python3.8/site-packages/torch/cuda/__init__.py", line 261, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal

I am able to run on one node without MPI, though.
Does anyone have ideas on how to solve this?

plateau during training

Hi Rewon. Cool work!

I tested the model with your checkpoint and everything works perfectly.
Now I am training VDVAE on CIFAR-10 from scratch using one GPU (reducing the batch size from 32 to 16 and the lr from 2e-4 to 1e-4).
The model starts training without problems and then gets stuck in a plateau around ~4.7 nats/dim for a long time.
I found a similar plateau with other configurations (smaller lr, smaller model).

Did you experience this plateau during training?

Thanks!

How to check bits per dim from the logs

Hi authors!

Thank you for your sharing!

I am wondering how I can check the bits per dim from the logs. From a quick review of the code, distortion seems to be that value, but when I ran it, the distortion was too small (about 1) to be the BPD.

Could you let me know?

Bests, Jaesik.
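
One way to sanity-check this from a log file: if the logged values (e.g. elbo, distortion) are in nats per dim, as the README's restore comments suggest, then dividing by ln 2 gives bits per dim. A minimal sketch, treating the log filename and field names as assumptions:

    import json
    import math

    # e.g. a .jsonl log like the ones downloaded in "Restoring saved models"
    with open("imagenet32-iter-1700000-log.jsonl") as f:
        entries = [json.loads(line) for line in f if line.strip()]

    last = entries[-1]
    # "elbo", "distortion", and "rate" are assumed field names based on the
    # stats mentioned in these issues; adjust to whatever keys your log contains
    for key in ("elbo", "distortion", "rate"):
        value = last.get(key)
        if isinstance(value, (int, float)):
            print(f"{key}: {value:.4f} nats/dim = {value / math.log(2):.4f} bits/dim")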

FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple.

When I ran VDVAE, the following warning message was shown:

data.py:151: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
  trX = np.vstack(data['data'] for data in tr_data)

Therefore, I submitted a pull request (#18).

Which output in the code is NLL in the paper?

I wonder which output in the code corresponds to the NLL reported in the paper.
Is it stats['distortion']?
However, I got stats['distortion'] < 1.5 on CIFAR-10, which does not correspond to the 2.87 bits per dim in the paper.

Question about params in decoder

Hi, thanks for sharing your great work! It's so amazing!
I have a question about the decoder; I wonder if you can help me out a bit.
Can you please explain what these three params are for? (bias_xs, gain, and bias)

vdvae/vae.py, lines 186 to 190 (commit ea35b49):

self.bias_xs = nn.ParameterList([nn.Parameter(torch.zeros(1, self.widths[res], res, res)) for res in self.resolutions if res <= H.no_bias_above])
self.out_net = DmolNet(H)
self.gain = nn.Parameter(torch.ones(1, H.width, 1, 1))
self.bias = nn.Parameter(torch.zeros(1, H.width, 1, 1))
self.final_fn = lambda x: x * self.gain + self.bias
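
Reading just the snippet above: gain and bias define a learned per-channel affine (scale and shift) that final_fn applies to the decoder's final feature map, and bias_xs is a list of learned bias tensors, one per resolution up to H.no_bias_above. A minimal standalone sketch of what final_fn computes (not the repo's code; the width value is only an example):

    import torch

    width = 512                                 # stands in for H.width
    gain = torch.ones(1, width, 1, 1)           # like decoder.gain
    bias = torch.zeros(1, width, 1, 1)          # like decoder.bias
    final_fn = lambda x: x * gain + bias        # per-channel scale and shift,
                                                # broadcast over batch and spatial dims

    x = torch.randn(2, width, 8, 8)
    print(final_fn(x).shape)                    # torch.Size([2, 512, 8, 8])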

gpu training

Hello dear authors,

I am wondering if I can modify the structure to have fewer parameters, so that I can perform inference on my PC after training on the server.
Thank you!

FusedAdam requires cuda extensions

I have built the apex module following the procedure explained above, but when trying to train the model on cifar10, I get:

/lustre03/project/6054857/mehranag/vdvae/data.py:147: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
  trX = np.vstack(data['data'] for data in tr_data)
Traceback (most recent call last):
  File "train.py", line 144, in <module>
    main()
  File "train.py", line 140, in main
    train_loop(H, data_train, data_valid_or_test, preprocess_fn, vae, ema_vae, logprint)
  File "train.py", line 59, in train_loop
    optimizer, scheduler, cur_eval_loss, iterate, starting_epoch = load_opt(H, vae, logprint)
  File "/lustre03/project/6054857/mehranag/vdvae/train_helpers.py", line 180, in load_opt
    optimizer = AdamW(vae.parameters(), weight_decay=H.wd, lr=H.lr, betas=(H.adam_beta1, H.adam_beta2))
  File "/home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/optimizers/fused_adam.py", line 79, in __init__
    raise RuntimeError('apex.optimizers.FusedAdam requires cuda extensions')
RuntimeError: apex.optimizers.FusedAdam requires cuda extensions

I understand that this is an apex-related issue since I get the following error when trying to run examples/simple/distributed in the apex repo:

Warning:  multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.  Original ImportError was: ImportError("/lib64/libm.so.6: version `GLIBC_2.29' not found (required by /home/mehranag/anaconda3/envs/env/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/amp_C.cpython-36m-x86_64-linux-gnu.so)",)
final loss =  tensor(0.5392, device='cuda:0', grad_fn=<MseLossBackward>)

I have tried many things to fix this issue, but no luck. I have two questions:

  • Does anybody know why I get FusedAdam requires cuda extensions even though I built apex with the --global-option="--cpp_ext" --global-option="--cuda_ext" options? (A quick import check is sketched below.)
  • How can I avoid using apex? I am only trying to test some things on cifar10 and don't need the distributed training feature, considering that I'm getting some weird errors!
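
A quick way to probe the first question, given that the warning above names the compiled amp_C extension: try importing it directly. A minimal sketch, not specific to this repo:

    # If this import fails, apex was installed without its compiled CUDA/C++
    # extensions, which is consistent with the
    # "FusedAdam requires cuda extensions" error above.
    try:
        import amp_C  # compiled extension present when apex is built with --cuda_ext
        print("apex compiled extensions available")
    except ImportError as err:
        print("apex compiled extensions missing:", err)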

Training a smaller model

To reduce the model size for a single GPU, is there anything more to do after reducing the number of residual blocks in vae.py?

What is samples-*.png?

Could you explain what we're seeing in samples-*.png?

[image: samples-*.png output]

For example, this is a 256x320 png output when training using CIFAR10:

python train.py --hps cifar10

It's 8x10 but which are the originals? Why do I see 7 of the same image top to bottom and then 3 new ones?

Not all stochastic layers are Gaussians?

At the beginning of Section 4 "AN ARCHITECTURE FOR VERY DEEP VAES" it says:
"This VAE consists only of convolutions, nonlinearities, and Gaussian stochastic layers."

But the code also uses a discretized mixture of logistics (DMoL) for the output distribution (see the DmolNet class in vae_helpers.py).

Aside from quantitative model performance (e.g. FID scores), what is the difference between using only Gaussian stochastic layers and using the DMoL output, with respect to reconstruction and sampling quality?

Almost perfect reconstructions but random samples

I tried to reproduce this paper using a custom dataset, but after just 1 epoch I already see almost perfect reconstructions while sampling from the normal distributions gives random noisy results. It would seem like either the model is ignoring all encodings except the ones at higher resolution, or the model fails to encode things into a normal distribution, which doesn't seem the case since the KL loss is low and decreasing...
[image omitted]
Is this something to be expected? How would I go about improving these results?
Thanks!

Fig 4 cumulative percentage of latent variables at a given resolution

Dear Adityaramesh,

Thank you for sharing with us this wonderful paper and implementation.

Would you mind telling me how to generate Fig 4 in the paper (cumulative percentage of latent variables at a given resolution) using this implementation? I am new to VAEs; sorry for asking silly questions.

Thank you for your help.

Best Wishes,

Alex

input data shift and scale

Hi authors!

I have a question about input image shift and scale.

The shift and scale for the output data are reasonable ([0,255] -> [-1,1]), but you used various numbers for the input shift and scale for the different datasets.

Could you tell me the reference for these values, or were they found through tuning?

Bests, Jaesik.
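
For reference, the output mapping mentioned above ([0,255] -> [-1,1]) can be written as x / 127.5 - 1, and the input shift/scale presumably plays the same affine role with per-dataset values. A minimal sketch of both transforms; the input mean/std below are hypothetical placeholders, not values taken from hps.py:

    import numpy as np

    x = np.arange(0, 256, dtype=np.float32)   # pixel values 0..255

    # output/target mapping described in the question: [0, 255] -> [-1, 1]
    x_out = x / 127.5 - 1.0
    print(x_out.min(), x_out.max())           # -1.0 1.0

    # input transform as a generic shift-and-scale standardization;
    # MEAN and STD are hypothetical placeholders, not the repo's values
    MEAN, STD = 120.0, 64.0
    x_in = (x - MEAN) / STD
    print(round(float(x_in.mean()), 2))       # ~0.12 for this toy range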

Does your ResBlock lack an activation function?

https://github.com/tensorflow/tpu/blob/d4daff70a8a17625cb43386b2a564cb0e0e0e130/models/official/resnet/resnet_model.py#L497
Compared with the above implementation, your code is the pre-activation version, where the input to the ResBlock should be passed through an activation function first. However, in your code the skip input is added directly to the branch output.

    def forward(self, x):
        xhat = self.c1(F.gelu(x))
        xhat = self.c2(F.gelu(xhat))
        xhat = self.c3(F.gelu(xhat))
        xhat = self.c4(F.gelu(xhat))
        out = x + xhat if self.residual else xhat
        if self.down_rate is not None:
            out = F.avg_pool2d(out, kernel_size=self.down_rate, stride=self.down_rate)
        return out

Should it be

    def forward(self, x):
        x = F.gelu(x)
        xhat = self.c1(x)
        xhat = self.c2(F.gelu(xhat))
        xhat = self.c3(F.gelu(xhat))
        xhat = self.c4(F.gelu(xhat))
        out = x + xhat if self.residual else xhat
        if self.down_rate is not None:
            out = F.avg_pool2d(out, kernel_size=self.down_rate, stride=self.down_rate)
        return out

error when loading model

Dear OpenAI team,

Thank you for sharing with us this great implementation.

When I try to load the FFHQ-1024 model using these commands:

wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-log.jsonl
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-model.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-model-ema.th
wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets/ffhq1024-iter-1700000-opt.th
python train.py --hps ffhq1024 --restore_path ffhq1024-iter-1700000-model.th --restore_ema_path ffhq1024-iter-1700000-model-ema.th --restore_log_path ffhq1024-iter-1700000-log.jsonl --restore_optimizer_path ffhq1024-iter-1700000-opt.th --test_eval

It outputs:

Missing key(s) in state_dict: "encoder.enc_blocks.68.c1.weight", "encoder.enc_blocks.68.c1.bias", "encoder.enc_blocks.68.c2.weight", "encoder.enc_blocks.68.c2.bias", "encoder.enc_blocks.68.c3.weight", "encoder.enc_blocks.68.c3.bias", "encoder.enc_blocks.68.c4.weight", "encoder.enc_blocks.68.c4.bias", "encoder.enc_blocks.69.c1.weight", "encoder.enc_blocks.69.c1.bias", "encoder.enc_blocks.69.c2.weight", "encoder.enc_blocks.69.c2.bias", "encoder.enc_blocks.69.c3.weight", "encoder.enc_blocks.69.c3.bias", "encoder.enc_blocks.69.c4.weight", "encoder.enc_blocks.69.c4.bias", "decoder.dec_blocks.66.enc.c1.weight", "decoder.dec_blocks.66.enc.c1.bias", "decoder.dec_blocks.66.enc.c2.weight", "decoder.dec_blocks.66.enc.c2.bias", "decoder.dec_blocks.66.enc.c3.weight", "decoder.dec_blocks.66.enc.c3.bias", "decoder.dec_blocks.66.enc.c4.weight", "decoder.dec_blocks.66.enc.c4.bias", "decoder.dec_blocks.66.prior.c1.weight", "decoder.dec_blocks.66.prior.c1.bias", "decoder.dec_blocks.66.prior.c2.weight", "decoder.dec_blocks.66.prior.c2.bias", "decoder.dec_blocks.66.prior.c3.weight", "decoder.dec_blocks.66.prior.c3.bias", "decoder.dec_blocks.66.prior.c4.weight", "decoder.dec_blocks.66.prior.c4.bias", "decoder.dec_blocks.66.z_proj.weight", "decoder.dec_blocks.66.z_proj.bias", "decoder.dec_blocks.66.resnet.c1.weight", "decoder.dec_blocks.66.resnet.c1.bias", "decoder.dec_blocks.66.resnet.c2.weight", "decoder.dec_blocks.66.resnet.c2.bias", "decoder.dec_blocks.66.resnet.c3.weight", "decoder.dec_blocks.66.resnet.c3.bias", "decoder.dec_blocks.66.resnet.c4.weight", "decoder.dec_blocks.66.resnet.c4.bias", "decoder.dec_blocks.67.enc.c1.weight", "decoder.dec_blocks.67.enc.c1.bias", "decoder.dec_blocks.67.enc.c2.weight", "decoder.dec_blocks.67.enc.c2.bias", "decoder.dec_blocks.67.enc.c3.weight", "decoder.dec_blocks.67.enc.c3.bias", "decoder.dec_blocks.67.enc.c4.weight", "decoder.dec_blocks.67.enc.c4.bias", "decoder.dec_blocks.67.prior.c1.weight", "decoder.dec_blocks.67.prior.c1.bias", "decoder.dec_blocks.67.prior.c2.weight", "decoder.dec_blocks.67.prior.c2.bias", "decoder.dec_blocks.67.prior.c3.weight", "decoder.dec_blocks.67.prior.c3.bias", "decoder.dec_blocks.67.prior.c4.weight", "decoder.dec_blocks.67.prior.c4.bias", "decoder.dec_blocks.67.z_proj.weight", "decoder.dec_blocks.67.z_proj.bias", "decoder.dec_blocks.67.resnet.c1.weight", "decoder.dec_blocks.67.resnet.c1.bias", "decoder.dec_blocks.67.resnet.c2.weight", "decoder.dec_blocks.67.resnet.c2.bias", "decoder.dec_blocks.67.resnet.c3.weight", "decoder.dec_blocks.67.resnet.c3.bias", "decoder.dec_blocks.67.resnet.c4.weight", "decoder.dec_blocks.67.resnet.c4.bias", "decoder.dec_blocks.68.enc.c1.weight", "decoder.dec_blocks.68.enc.c1.bias", "decoder.dec_blocks.68.enc.c2.weight", "decoder.dec_blocks.68.enc.c2.bias", "decoder.dec_blocks.68.enc.c3.weight", "decoder.dec_blocks.68.enc.c3.bias", "decoder.dec_blocks.68.enc.c4.weight", "decoder.dec_blocks.68.enc.c4.bias", "decoder.dec_blocks.68.prior.c1.weight", "decoder.dec_blocks.68.prior.c1.bias", "decoder.dec_blocks.68.prior.c2.weight", "decoder.dec_blocks.68.prior.c2.bias", "decoder.dec_blocks.68.prior.c3.weight", "decoder.dec_blocks.68.prior.c3.bias", "decoder.dec_blocks.68.prior.c4.weight", "decoder.dec_blocks.68.prior.c4.bias", "decoder.dec_blocks.68.z_proj.weight", "decoder.dec_blocks.68.z_proj.bias", "decoder.dec_blocks.68.resnet.c1.weight", "decoder.dec_blocks.68.resnet.c1.bias", "decoder.dec_blocks.68.resnet.c2.weight", "decoder.dec_blocks.68.resnet.c2.bias", "decoder.dec_blocks.68.resnet.c3.weight", 
"decoder.dec_blocks.68.resnet.c3.bias", "decoder.dec_blocks.68.resnet.c4.weight", "decoder.dec_blocks.68.resnet.c4.bias", "decoder.dec_blocks.69.enc.c1.weight", "decoder.dec_blocks.69.enc.c1.bias", "decoder.dec_blocks.69.enc.c2.weight", "decoder.dec_blocks.69.enc.c2.bias", "decoder.dec_blocks.69.enc.c3.weight", "decoder.dec_blocks.69.enc.c3.bias", "decoder.dec_blocks.69.enc.c4.weight", "decoder.dec_blocks.69.enc.c4.bias", "decoder.dec_blocks.69.prior.c1.weight", "decoder.dec_blocks.69.prior.c1.bias", "decoder.dec_blocks.69.prior.c2.weight", "decoder.dec_blocks.69.prior.c2.bias", "decoder.dec_blocks.69.prior.c3.weight", "decoder.dec_blocks.69.prior.c3.bias", "decoder.dec_blocks.69.prior.c4.weight", "decoder.dec_blocks.69.prior.c4.bias", "decoder.dec_blocks.69.z_proj.weight", "decoder.dec_blocks.69.z_proj.bias", "decoder.dec_blocks.69.resnet.c1.weight", "decoder.dec_blocks.69.resnet.c1.bias", "decoder.dec_blocks.69.resnet.c2.weight", "decoder.dec_blocks.69.resnet.c2.bias", "decoder.dec_blocks.69.resnet.c3.weight", "decoder.dec_blocks.69.resnet.c3.bias", "decoder.dec_blocks.69.resnet.c4.weight", "decoder.dec_blocks.69.resnet.c4.bias". 
	size mismatch for encoder.in_conv.weight: copying a param with shape torch.Size([512, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 3, 3, 3]).
	size mismatch for encoder.in_conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.0.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([4, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.0.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.0.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([4, 4, 3, 3]).
	size mismatch for encoder.enc_blocks.0.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.0.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([4, 4, 3, 3]).
	size mismatch for encoder.enc_blocks.0.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.0.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 4, 1, 1]).
	size mismatch for encoder.enc_blocks.0.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.1.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([4, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.1.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.1.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([4, 4, 3, 3]).
	size mismatch for encoder.enc_blocks.1.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.1.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([4, 4, 3, 3]).
	size mismatch for encoder.enc_blocks.1.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([4]).
	size mismatch for encoder.enc_blocks.1.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 4, 1, 1]).
	size mismatch for encoder.enc_blocks.1.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.2.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 32, 1, 1]).
	size mismatch for encoder.enc_blocks.2.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.2.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.2.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.2.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.2.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.2.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 8, 1, 1]).
	size mismatch for encoder.enc_blocks.2.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([32]).
	size mismatch for encoder.enc_blocks.3.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 32, 1, 1]).
	size mismatch for encoder.enc_blocks.3.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.3.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.3.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.3.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.3.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.3.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 8, 1, 1]).
	size mismatch for encoder.enc_blocks.3.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([32]).
	size mismatch for encoder.enc_blocks.4.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 32, 1, 1]).
	size mismatch for encoder.enc_blocks.4.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.4.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.4.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.4.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.4.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.4.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 8, 1, 1]).
	size mismatch for encoder.enc_blocks.4.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([32]).
	size mismatch for encoder.enc_blocks.5.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 32, 1, 1]).
	size mismatch for encoder.enc_blocks.5.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.5.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.5.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.5.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
	size mismatch for encoder.enc_blocks.5.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
	size mismatch for encoder.enc_blocks.5.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 8, 1, 1]).
	size mismatch for encoder.enc_blocks.5.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([32]).
	size mismatch for encoder.enc_blocks.6.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.6.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.6.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.6.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.6.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.6.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.6.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.6.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.7.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.7.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.7.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.7.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.7.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.7.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.7.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.7.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.8.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.8.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.8.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.8.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.8.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.8.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.8.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.8.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.9.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.9.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.9.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.9.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.9.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.9.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.9.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.9.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.10.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.10.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.10.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.10.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.10.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.10.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.10.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.10.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.11.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for encoder.enc_blocks.11.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.11.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.11.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.11.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for encoder.enc_blocks.11.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for encoder.enc_blocks.11.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for encoder.enc_blocks.11.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for encoder.enc_blocks.64.c2.weight: copying a param with shape torch.Size([128, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
	size mismatch for encoder.enc_blocks.64.c3.weight: copying a param with shape torch.Size([128, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
	size mismatch for encoder.enc_blocks.65.c2.weight: copying a param with shape torch.Size([128, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
	size mismatch for encoder.enc_blocks.65.c3.weight: copying a param with shape torch.Size([128, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
	size mismatch for decoder.gain: copying a param with shape torch.Size([1, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 16, 1, 1]).
	size mismatch for decoder.bias: copying a param with shape torch.Size([1, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 16, 1, 1]).
	size mismatch for decoder.dec_blocks.65.enc.c1.weight: copying a param with shape torch.Size([128, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 128, 1, 1]).
	size mismatch for decoder.dec_blocks.65.enc.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.enc.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.enc.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.enc.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.enc.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.enc.c4.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
	size mismatch for decoder.dec_blocks.65.prior.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for decoder.dec_blocks.65.prior.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.prior.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.prior.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.prior.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.prior.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.prior.c4.weight: copying a param with shape torch.Size([544, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 16, 1, 1]).
	size mismatch for decoder.dec_blocks.65.prior.c4.bias: copying a param with shape torch.Size([544]) from checkpoint, the shape in current model is torch.Size([96]).
	size mismatch for decoder.dec_blocks.65.z_proj.weight: copying a param with shape torch.Size([512, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for decoder.dec_blocks.65.z_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for decoder.dec_blocks.65.resnet.c1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 64, 1, 1]).
	size mismatch for decoder.dec_blocks.65.resnet.c1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.resnet.c2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.resnet.c2.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.resnet.c3.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
	size mismatch for decoder.dec_blocks.65.resnet.c3.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([16]).
	size mismatch for decoder.dec_blocks.65.resnet.c4.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 16, 1, 1]).
	size mismatch for decoder.dec_blocks.65.resnet.c4.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for decoder.out_net.out_conv.weight: copying a param with shape torch.Size([100, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([20, 16, 1, 1]).
	size mismatch for decoder.out_net.out_conv.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([20]).

It seems like the model and weights are not compatible. Could you tell me how to solve this problem?

Thank you very much for your help.

Best Wishes,

Alex
