bcmi / dci-vton-virtual-try-on


378.0 378.0 55.0 5.92 MB

[ACM Multimedia 2023] Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow.

Home Page: https://arxiv.org/abs/2308.06101

License: MIT License

Python 99.80% Shell 0.20%


dci-vton-virtual-try-on's Issues

about mask boundary

In the example figure, the generated image shows visible fusion traces along the mask boundary, whereas the GAN-based methods do not. Is this caused by the fusion method, or is it a limitation of diffusion models?

request for tutorials

Hi, could you provide tutorials on how to use this repo? It would help a lot.

Getting cloth-warp images with my own dataset

When I use the warp model provided by the authors to generate cloth-warp images, the results are poor. I therefore need to train a warp model on my own dataset with train_VITON.sh, but how do I obtain the cloth-warp images for training? Are the ground-truth cloth-warp images manually annotated?

Deepfashion dataset used in ablation study

Hello authors, great work! How did you find relevant data in the DeepFashion dataset, given that it cannot be used directly for virtual try-on? Could you please share the method you used to shortlist the relevant samples?

Thanks.

Problem with setting up the person/clothes pairs.

Thank you so much for sharing this great work!

I am having trouble setting up the person/clothes pairs.
No matter how I set up test_pairs.txt and run inference in unpaired mode (e.g. with the pair below), the try-on result is not what I expect (the person is wearing different clothes).
00055_00.jpg 01967_00.jpg

target clothes

try-on result

I would really appreciate it if anyone could help or share some insights. Thank you!
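
For readers hitting the same pairing problem, here is a minimal sketch for sanity-checking test_pairs.txt before running inference. It assumes the usual VITON-HD convention of one pair per line, person image first and clothing image second, separated by whitespace. `read_pairs` and `missing_files` are hypothetical helpers, not part of this repository, and the directory arguments are placeholders for wherever the images live.

```python
from pathlib import Path
from typing import List, Tuple


def read_pairs(path: str) -> List[Tuple[str, str]]:
    """Parse a pairs file: one '<person> <cloth>' entry per non-empty line."""
    pairs = []
    for line_no, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"{path}:{line_no}: expected 2 file names, got {len(fields)}")
        pairs.append((fields[0], fields[1]))
    return pairs


def missing_files(pairs: List[Tuple[str, str]], image_dir: str, cloth_dir: str) -> List[str]:
    """Return every referenced file that does not exist on disk."""
    wanted = [Path(image_dir) / person for person, _ in pairs]
    wanted += [Path(cloth_dir) / cloth for _, cloth in pairs]
    return [str(f) for f in wanted if not f.is_file()]
```

If `missing_files` returns an empty list and the result still shows the wrong garment, the pair file itself is probably not the culprit; checking which pre-warped cloth files the loader actually reads would be a reasonable next step.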

how to limit the degree of warp

Hi, thanks for your great work.
I have a question about how to limit the degree of warping: I have observed that a skirt gets warped and stretched until it looks like a long dress.

No uniformity in any of the modules.

How difficult is it to provide a clear and straightforward user-centric inference pipeline?
Git repo owners should work on their inference pipeline.

Training code release

Will the training code be released?

Regarding Texture

How can I improve texture quality for custom images? I retrained the models, but the texture quality did not improve much.

Also, the sleeve length does not follow the cloth image; instead, the sleeve is adjusted to the user's input image. Could you suggest how I can improve the results?

poor evaluation metrics score

Hi,

First of all, thanks for your work!

I ran inference with your checkpoint and evaluated all the metrics mentioned in the paper, but I got very poor results. Could you please provide the code for evaluating the performance?

Can this model be used for full-body images and pants?

Is the model you provide trained on DressCode?
Can I use the current model for full-body photos that show both a shirt and pants?
Thanks in advance.

the result of self.sample_hijack

What does the output z of sample_hijack look like after it passes through the decoder? Is it close to the GT?

How much gpu memory is required to train the diffusion model?

Hi thanks for your work.
I want to train a 1024 resolution diffusion model. Could you tell me how much GPU memory is needed to train this model?
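
A rough lower bound can be worked out from the parameter count alone. The sketch below is only an estimate under stated assumptions: fp32 training, an Adam-style optimizer with two moment buffers, and every parameter trainable. The 859.54 M figure is the DiffusionWrapper parameter count printed in a log quoted in another issue on this page; `training_state_gib` is a hypothetical helper.

```python
def training_state_gib(n_params: float, bytes_per_value: int = 4, copies: int = 4) -> float:
    """Memory for weights, gradients and two Adam moment buffers, in GiB.

    copies=4 assumes fp32 weights + gradients + Adam first/second moments.
    Activations are NOT included and usually dominate at high resolution.
    """
    return n_params * bytes_per_value * copies / 2**30


# 859.54 M parameters, as reported for DiffusionWrapper in a log on this page.
print(f"{training_state_gib(859.54e6):.2f} GiB before activations")
```

Under these assumptions roughly 12.8 GiB is consumed before any activation is stored. Activation memory grows with the number of latent positions, and 1024x768 has four times as many as 512x384, so the real requirement depends heavily on batch size, gradient checkpointing and mixed precision.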

Failed to load warp_viton.pth checkpoint

Hi,

Thanks for your work!

I am trying to run inference. The pre-trained model at https://drive.google.com/drive/folders/11BJo59iXVu2_NknKMbN0jKtFV06HTn5K fails to load because it is a zip archive; the error is below:


Traceback (most recent call last):
  File "eval_PBAFN_viton.py", line 37, in <module> 
    load_checkpoint(warp_model, opt.warp_checkpoint)
  File "/home/paperspace/PF-AFN/PF-AFN_test/models/networks.py", line 178, in load_checkpoint
    checkpoint = torch.load(checkpoint_path)
  File "/home/paperspace/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/paperspace/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 560, in _load
    raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))

If I use the pre-trained checkpoint from https://drive.google.com/file/d/1_a0AiN8Y_d_9TNDhHIcRlERz3zptyYWV/view which is linked to from https://github.com/geyuying/PF-AFN then I get the following error:

Traceback (most recent call last):
  File "eval_PBAFN_viton.py", line 37, in <module> 
    load_checkpoint(warp_model, opt.warp_checkpoint)
  File "/home/paperspace/PF-AFN/PF-AFN_test/models/networks.py", line 183, in load_checkpoint
    model.load_state_dict(checkpoint_new)
  File "/home/paperspace/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for AFWM:
        size mismatch for cond_features.encoders.0.0.block.0.weight: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([16]).
        size mismatch for cond_features.encoders.0.0.block.0.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([16]).
        size mismatch for cond_features.encoders.0.0.block.0.running_mean: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([16]).
        size mismatch for cond_features.encoders.0.0.block.0.running_var: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([16]).
        size mismatch for cond_features.encoders.0.0.block.2.weight: copying a param with shape torch.Size([64, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 16, 3, 3]).

How do I go about fixing this?
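
The second traceback can be read mechanically. A convolution weight has shape (out_channels, in_channels, kH, kW), so (64, 3, 3, 3) versus (64, 16, 3, 3) means the upstream PF-AFN checkpoint was saved for a 3-channel conditional input while the model built here expects 16 channels; the two belong to different network configurations. The sketch below lists such mismatches from plain name-to-shape dictionaries. `shape_mismatches` is a hypothetical helper, and the shapes are copied from the traceback above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def shape_mismatches(ckpt: Dict[str, Shape], model: Dict[str, Shape]) -> List[str]:
    """List parameters whose shapes differ, plus keys present on one side only."""
    report = []
    for name in sorted(set(ckpt) | set(model)):
        if name not in model:
            report.append(f"unexpected in checkpoint: {name}")
        elif name not in ckpt:
            report.append(f"missing from checkpoint: {name}")
        elif ckpt[name] != model[name]:
            report.append(f"{name}: checkpoint {ckpt[name]} vs model {model[name]}")
    return report


# Shapes copied from the traceback above.
ckpt_shapes = {"cond_features.encoders.0.0.block.2.weight": (64, 3, 3, 3)}
model_shapes = {"cond_features.encoders.0.0.block.2.weight": (64, 16, 3, 3)}
for line in shape_mismatches(ckpt_shapes, model_shapes):
    print(line)
```

With PyTorch available, the two dictionaries can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from the loaded checkpoint and from `model.state_dict()`.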

A problem about input

Thanks for your great work!
When I ran ‘sh test_VITON.sh’,
it displayed the error message ‘can not find the file path dataset/test_pairs.txt’.
After I put test_pairs.txt and the test directory from VITON-HD in place, it displayed the error message ‘RuntimeError: running_mean should contain 16 elements not 23’. This looks like a common PyTorch input problem, but I cannot solve it. Could you suggest a solution or point out what else I am doing wrong? Thanks very much.

the error message ‘RuntimeError: running_mean should contain 16 elements not 23’

HELP! EOFError: Ran out of input

I am sure that I have set everything up according to the authors' instructions and put the viton512.ckpt file in the checkpoint folder. The file is not empty. Can you help me solve this problem?

Only getting noise?

I followed your instructions exactly, but the visualizations during training contain only noise. Is this normal?

Could you describe how to use cp_dataset_v2.py and what is the selection strategy update?

Hi authors,

Could you describe how to use cp_dataset_v2.py and what is the selection strategy update?

Looking forward to your reply

Inconsistency between Inference Results and Paper Examples and some failed cases in VITON-HD.

Thank you for your contribution once again. We have been working on a VTON (Virtual Try-On) project and have run your inference on VITON-HD unpaired test set. However, we have noticed some inconsistencies between the inference results and the examples shown in the paper.

In one of the examples, the generated neckline looks a little strange compared to the image appended in the paper, where it looks good. Below are the details of the example:

Image: 00654_00.jpg
Clothing item: 05838_00.jpg

Additionally, I followed the guidelines mentioned in the README.md file. The warped clothes were downloaded from the provided link:

DCI-VTON-Virtual-Try-On/README.md

Line 47 in 20f75c6

 2. Download pre-warped cloth image/mask from [Google Drive](https://drive.google.com/drive/folders/15cBiA0AoSCLSkg3ueNFWSw4IU3TdfXbO?usp=sharing) or [Baidu Cloud](https://pan.baidu.com/s/1ss8e_Fp3ZHd6Cn2JjIy-YQ?pwd=x2k9) and put it under your VITON-HD dataset 

The checkpoint used was from the following link:

DCI-VTON-Virtual-Try-On/README.md

Line 72 in 20f75c6

 Please download the pretrained model from [Google Drive](https://drive.google.com/drive/folders/11BJo59iXVu2_NknKMbN0jKtFV06HTn5K?usp=sharing) or [Baidu Cloud](https://pan.baidu.com/s/13Rp_-Fbp1NUN41q0U6S4gw?pwd=6bfg). 

The only changes I made were to the test.sh file:

I modified the checkpoint/data/output directories.
I decreased the n_samples from 8 to 1, as I only needed one inference due to time constraints. (It took us approximately 4 hours and 40 minutes to complete the inference on the unpaired test set using an A100 server.)

I also found some failed cases - When changing from short sleeves to long sleeves, the wrong shape is generated unexpectedly.

Image: 00071_00.jpg
Clothing item: 02151_00.jpg

I added some lines in test.py to save inputs of pipeline and it looks okay.

Could you please help me verify if the committed code is correct or if the uploaded checkpoints are correct? Any assistance would be greatly appreciated.
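
To put the reported runtime in perspective, the per-image cost can be computed directly. The sketch assumes the unpaired run covered the standard VITON-HD test split of 2,032 pairs; that count is an assumption brought in here, not something stated in the issue.

```python
def seconds_per_image(hours: int, minutes: int, n_images: int) -> float:
    """Average wall-clock seconds spent per generated image."""
    return (hours * 3600 + minutes * 60) / n_images


TEST_PAIRS = 2032  # assumption: size of the standard VITON-HD test split
print(f"{seconds_per_image(4, 40, TEST_PAIRS):.1f} s per image")
```

That comes to roughly 8.3 seconds per image with n_samples set to 1.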

Any plans to release code for the Dress Code dataset?

Hi authors,

Do you plan to release the code for the Dress Code dataset?
If so, when will you release it?

Looking forward to your reply.
Best

Failure to load warp_viton.pth checkpoints

I have already installed the dependencies following the instructions in the DCI-VTON-Virtual-Try-On and PF-AFN projects.
My system environment is WSL2 with Ubuntu 22.04.
However, in the warping module, when I run the command "test_VITON.sh", some errors occur.

------------ Options -------------
batchSize: 32
data_type: 32
dataroot: ./dataset/VITON-HD
display_winsize: 512
fineSize: 512
gen_checkpoint: checkpoints/PFAFN/gen_model_final.pth
gpu_ids: [0]
input_nc: 3
isTrain: False
label_nc: 13
loadSize: 512
max_dataset_size: inf
nThreads: 1
name: cloth-warp
no_flip: False
norm: instance
output_nc: 3
phase: test
resize_or_crop: none
serial_batches: False
tf_log: False
unpaired: True
use_dropout: False
verbose: False
warp_checkpoint: checkpoints/warp_viton.pth
-------------- End ----------------
#training images = 64
Traceback (most recent call last):
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'torch._u'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 2299, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 556, in _load
return legacy_load(f)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 467, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 1591, in open
return func(name, filemode, fileobj, **kwargs)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 1621, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 1484, in __init__
self.firstmember = self.next()
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/tarfile.py", line 2311, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "eval_PBAFN_viton.py", line 37, in <module>
load_checkpoint(warp_model, opt.warp_checkpoint)
File "/home/wyfx/projects/PF-AFN/PF-AFN_test/models/networks.py", line 178, in load_checkpoint
checkpoint = torch.load(checkpoint_path)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/wyfx/anaconda3/envs/tryon/lib/python3.6/site-packages/torch/serialization.py", line 560, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: checkpoints/warp_viton.pth is a zip archive (did you mean to use torch.jit.load()?)

The problem is similar to issue #11. I tried "pip install cupy-cuda11x", but it could not be installed; the error is as follows:

ERROR: Cannot unpack file /tmp/pip-unpack-suqln0bl/simple.htm (downloaded from /tmp/pip-req-build-_upoi04a, content-type: text/html); cannot detect archive format ERROR: Cannot determine archive format of /tmp/pip-req-build-_upoi04a
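
The "is a zip archive" message is what older PyTorch releases print when asked to load a file written in the zip-based format that `torch.save` has used by default since PyTorch 1.6. The Python 3.6 paths in the traceback are consistent with such an older install. A quick way to confirm which container a checkpoint uses, with only the standard library, is sketched below; `checkpoint_format` is a hypothetical helper.

```python
import zipfile


def checkpoint_format(path: str) -> str:
    """Classify a checkpoint file by its container format."""
    if zipfile.is_zipfile(path):
        # Zip container: the default torch.save format since PyTorch 1.6.
        return "zip"
    # Anything else: a legacy-format checkpoint, or possibly a broken download.
    return "other"
```

If the result is "zip", two options are to upgrade PyTorch in the environment, or to load the file once in a newer environment and re-save it with `torch.save(obj, path, _use_new_zipfile_serialization=False)` so that the old version can read it.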

problem with the details around the neck/collar

Thank you for your work. I have run your code to produce some results. I have noticed that the collar/neck area is showing the skin instead of the clothes. Could you explain what happened and is there any way to resolve this?

Database

Hi, any opinions on using https://github.com/sergeywong/cp-vton instead of VITON-HD? I saw that a similar project chose CP-VTON over VITON-HD because of copyright issues.

Training code of diffusion model release

Hi, thanks for your good work! I am quite interested in it.
I noticed that you have uploaded most of the code except the training part of the diffusion model.
Do you plan to release the training code for the diffusion model? If yes, when? Thank you!

Where can I get VITON-HD 512?

Thank you for your great work.

I followed the tutorial and downloaded the VITON-HD dataset from the link you provided (https://github.com/shadow2496/VITON-HD); however, the dataset images are 1024*768.
Your pretrained model is for 512, so could you tell me where to get the 512 dataset?
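
One way to obtain a 512-pixel-high copy is to downscale the 1024*768 files offline. The sketch below is an assumption about how that could be done, not a description of how the authors prepared their data (the loader may also resize on the fly). `scaled_size` and `resize_file` are hypothetical helpers, and `resize_file` needs Pillow.

```python
from typing import Tuple


def scaled_size(width: int, height: int, target_height: int) -> Tuple[int, int]:
    """Return (width, height) scaled to target_height, keeping the aspect ratio."""
    return round(width * target_height / height), target_height


def resize_file(src: str, dst: str, target_height: int = 512, is_label: bool = False) -> None:
    """Resize one image; Pillow is imported lazily so scaled_size has no dependency."""
    from PIL import Image

    with Image.open(src) as img:
        size = scaled_size(img.width, img.height, target_height)
        # Label maps and masks must not be interpolated, so use nearest-neighbour for them.
        resample = Image.NEAREST if is_label else Image.BICUBIC
        img.resize(size, resample).save(dst)


print(scaled_size(768, 1024, 512))
```

For the VITON-HD images (768 wide, 1024 high) this gives 384x512. Parsing maps and masks should go through the `is_label=True` path so that class indices are not blended.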

1024x768 training setting

Hello, thanks for your great contribution!
Can you let me know the training setting for 1024x768 resolution?
Did you use the same setting specified in viton512_v2.yaml?
Thank you

P.S.
I tried training on an A100 40GB and got an out-of-memory (OOM) error.
Could you share your 1024 checkpoint if possible?

Did you use pre-trained model?

Hello, thanks for your great contribution!
I'm curious whether you trained your model from scratch or started from a pre-trained model (the Stable Diffusion inpainting model, perhaps)?

Queries Regarding Custom Dataset Testing in This Project

Hello, I've completed the preprocessing steps for my custom dataset following HR-VITON/issues/45, but I've encountered the following challenges:

When I apply the warp module directly to my preprocessed custom dataset, I observe undesirable distortion results. Is it possible to achieve improved results by training the warp module specifically on my dataset?
I've stored the data produced by the warp module within my customized dataset and tested the diffusion module. Surprisingly, I noticed that areas other than the clothing exhibit unfavorable redrawn effects. This outcome seems inconsistent with your paper, which suggests that only the clothing region should undergo redrawn effects. Could you please clarify the reason for this discrepancy, and can I enhance the results by training the model on my data?
Lastly, the images generated by the diffusion module currently have a resolution of 512x384. I'm interested in generating larger images with a resolution of 1024x768. How can I achieve this?

Thank you for your assistance.

warp model

Will you release the warp model as well? Also does your experiment include results for 1024x768 resolution?

Release date for models and code

Hi,

Once again, great paper. When do you plan to release the remaining pieces, such as the inference script, pretrained model, and training scripts?

Can't Run Warping Inference

> sh test_VITON.sh
------------ Options -------------
batchSize: 32
data_type: 32
dataroot: /home/ec2-user/repos/VITON-HD
display_winsize: 512
fineSize: 512
gen_checkpoint: checkpoints/PFAFN/gen_model_final.pth
gpu_ids: [0]
input_nc: 3
isTrain: False
label_nc: 13
loadSize: 512
max_dataset_size: inf
nThreads: 1
name: cloth-warp
no_flip: False
norm: instance
output_nc: 3
phase: test
resize_or_crop: none
serial_batches: False
tf_log: False
unpaired: True
use_dropout: False
verbose: False
warp_checkpoint: checkpoints/warp_viton.pth
-------------- End ----------------
#training images = 64
/home/ec2-user/.conda/envs/dci-vton/lib/python3.8/site-packages/torchvision/transforms/transforms.py:332: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  warnings.warn(
/home/ec2-user/.conda/envs/dci-vton/lib/python3.8/site-packages/cupy/cuda/compiler.py:464: UserWarning: cupy.cuda.compile_with_cache has been deprecated in CuPy v10, and will be removed in the future. Use cupy.RawModule or cupy.RawKernel instead.
  warnings.warn(
/home/ec2-user/.conda/envs/dci-vton/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/ec2-user/.conda/envs/dci-vton/lib/python3.8/site-packages/torch/nn/functional.py:4193: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn(
/home/ec2-user/repos/PF-AFN/PF-AFN_test/data/cp_dataset.py:67: RuntimeWarning: invalid value encountered in divide
  pose_data[9] = point + (pose_data[9] - point) / length_b * length_a
/home/ec2-user/repos/PF-AFN/PF-AFN_test/data/cp_dataset.py:68: RuntimeWarning: invalid value encountered in divide
  pose_data[12] = point + (pose_data[12] - point) / length_b * length_a

Script just breaks after writing all these warnings. I have no clue how the code should be updated to solve these warnings...
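
Most of the messages above are deprecation warnings, which do not stop a script on their own. The two RuntimeWarnings are different in kind: NumPy reports "invalid value encountered in divide" when a division yields NaN, for example 0/0, so `length_b` is presumably zero for at least one sample. A guarded version of the quoted formula could look like the sketch below. It uses plain floats, `rescale_from` is a hypothetical helper, and what `length_a` and `length_b` measure is not shown in the log, so the guard is the only point being illustrated.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rescale_from(point: Point, keypoint: Point, length_a: float, length_b: float,
                 eps: float = 1e-6) -> Point:
    """Compute point + (keypoint - point) / length_b * length_a, guarding the division.

    If length_b is near zero or not finite, the keypoint is returned unchanged
    instead of producing NaN or inf.
    """
    if not math.isfinite(length_b) or abs(length_b) < eps:
        return keypoint
    scale = length_a / length_b
    return (point[0] + (keypoint[0] - point[0]) * scale,
            point[1] + (keypoint[1] - point[1]) * scale)
```

Whether the NaN values are what actually stops the run is not clear from the log; the real failure may be elsewhere.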

RuntimeError: The size of tensor a (6) must match the size of tensor b (8) at non-singleton dimension 0

Hi,

Thank you very much for the project. I like the project very much but I am getting an error while running it. Can you help me? I followed the steps in the readme file. But I am getting this error.

Run Code:

python test.py --plms --gpu_id 0 --ddim_steps 100 --outdir results/viton --config configs/viton512.yaml --dataroot VITON-HD --ckpt viton512.ckpt --n_samples 8 --seed 23 --scale 1 --H 512 --W 512 --unpaired

Error Message:

  File "...\DCI-VTON-Virtual-Try-On\ldm\modules\diffusionmodules\openaimodel.py", line 273, in _forward
    h = h + emb_out
RuntimeError: The size of tensor a (6) must match the size of tensor b (8) at non-singleton dimension 0
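
One possible cause, not confirmed anywhere in this thread, is a short final batch. If `--n_samples` is used as the data loader's batch size and the number of test pairs is not a multiple of 8, the last batch holds fewer items (here 6) while some other tensor is still built for 8. The sketch shows the arithmetic; `batch_sizes` is a hypothetical helper and the dataset size of 22 is chosen purely for illustration.

```python
from typing import List


def batch_sizes(n_items: int, batch_size: int) -> List[int]:
    """Sizes of the batches a loader yields when the last partial batch is kept."""
    full, rest = divmod(n_items, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


print(batch_sizes(22, 8))  # illustrative dataset size, not taken from the issue
```

If this is the cause, choosing an `--n_samples` value that divides the number of pairs (or 1) should make the error disappear, which is a cheap hypothesis to test.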

VGG and L1 loss.

Thanks in advance, and thanks for your great work.

I have several questions about the training code and implementation of VGG loss and L1 loss.

From the lines of code below, training seems to use both an L1 loss and a VGG loss at the same time, but as of 26/09/2023 I have not found any mention of the L1 loss in the paper. May I ask whether the L1 loss is an ad hoc/experimental setting, or whether it has been shown to bring an improvement?

DCI-VTON-Virtual-Try-On/ldm/models/diffusion/ddpm.py

Lines 1697 to 1706 in 107c2d3

    loss_l1_weight = 1e-1
    loss_vgg_weight = 1e-3
    x_samples = self.differentiable_decode_first_stage(x_pred)
    loss_l1 = self.get_loss(x_samples, gt, mean=True)
    loss += loss_l1 * loss_l1_weight
    loss_dict.update({'train/loss_l1': loss_l1 * loss_l1_weight})
    loss_vgg = self.get_vgg_loss(x_samples, gt)
    loss += loss_vgg * loss_vgg_weight
    loss_dict.update({'train/loss_vgg': loss_vgg * loss_vgg_weight})

How did you decide the weights of each VGG layer?

DCI-VTON-Virtual-Try-On/ldm/models/diffusion/ddpm.py

Lines 1709 to 1716 in 107c2d3

    def get_vgg_loss(self, pred, gt):
        pred_feat = self.vgg(pred, ["r12", "r22", "r32", "r42", "r52"], preprocess=True)
        gt_feat = self.vgg(gt, ["r12", "r22", "r32", "r42", "r52"], preprocess=True)
        loss_feat = 0
        weights = [1.0 / 32, 1.0 / 16, 1.0 / 8, 1.0 / 4, 1.0]
        for i in range(len(pred_feat)):
            loss_feat += weights[i] * F.l1_loss(pred_feat[i], gt_feat[i].detach())
        return loss_feat

Also, how did you balance the three loss terms:

L2 of latent noise.
L1 of pixels.
VGG of pixels.

I would appreciate it if you could share more insights here 🤗
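
Read literally, the two snippets quoted in this issue combine the terms as written below. This is a transcription of the quoted code, not a statement from the paper. The weight of 1 on the noise term is an assumption, since that part of the loss is not visible in the snippet, and whether the pixel term is truly an L1 norm depends on how `get_loss` is configured, which is also not shown. Here $\hat{x}$ is the decoded prediction `x_samples`, $x$ is `gt`, and $\phi_i$ denotes the VGG features at layers r12, r22, r32, r42 and r52.

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\text{noise}}
  + \lambda_{\text{pix}}\,\mathcal{L}_{\text{pix}}(\hat{x}, x)
  + \lambda_{\text{vgg}}\,\mathcal{L}_{\text{vgg}}(\hat{x}, x),
  \qquad \lambda_{\text{pix}} = 10^{-1},\quad \lambda_{\text{vgg}} = 10^{-3} \\
\mathcal{L}_{\text{vgg}}(\hat{x}, x) &= \sum_{i=1}^{5} w_i \,
  \operatorname{mean}\bigl|\phi_i(\hat{x}) - \phi_i(x)\bigr|,
  \qquad (w_1,\dots,w_5) = \left(\tfrac{1}{32}, \tfrac{1}{16}, \tfrac{1}{8}, \tfrac{1}{4}, 1\right)
\end{aligned}
```

The layer weights double with depth, so the deepest feature map contributes the most per unit of error.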

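Details of diffusion model training based on Paint-by-Example

Diffusion Model
1. I tried to prepare the warped-cloth and Image-agnostic inputs as shown in your paper and train with them (corresponding to the Reconstruction Branch), but ran into some problems:
a. The VAE encoding/decoding of the Paint-By-Example pretrained model deforms the face.
Looking forward to your reply.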

How to train 1024 resolution diffusion model?

We utilize the pretrained Paint-by-Example as initialization, please download the pretrained models from Google

But Paint-by-Example does not have a 1024-resolution model.
Which model can I use as initialization to train a 1024-resolution diffusion model?
Thanks

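Question about how well the diffusion model preserves clothing details

Thanks to the authors for the excellent work. I tested the model on the VITON test set: I preprocessed the data following HR-VITON-issues45 and then ran warping and diffusion inpainting as instructed by the authors. Figure 1 below is the result I obtained, and Figure 2 is the result shown in the paper:

From left to right, Figure 1 shows the input clothing image, the person pose image, the warped clothing image, and the final try-on image. As shown, the patterns and text on the clothing are distorted. Is there a problem with my processing pipeline?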

Why does updated warp code give cloth masks instead of body/person aligned warp masks?

I have had this issue with the warp dir in this repo. I am attaching two images: the warp image I am getting and the one I should get. The 00017_00 file is the one I am getting.

Custom Image

Hi, thank you for the code. Could you let us know how to test on a custom image? What are the requirements, and what changes do we need to make in the code?

Questions about your work

Hi, thanks for your contribution. I have some questions about the work, as I'm confused by the training details in the paper.

For L_simple: is this a normal L2 reconstruction loss as in Paint-by-Example, where you only modify the UNet input layer to add more channels? Does this mean you need to train the UNet input layer too? I also wonder whether you tried the Palette method of denoising instead.
For L_vgg when training the diffusion model: do you decode the latents to compute the VGG loss? How many steps did you take to decode? I would expect this to be very slow at every step.
Are both losses trained together, with no two-stage process? What lambda weights did you give each loss?
Is I_gt something that is already available, meaning you don't need any paired data to train the model? So is the training self-supervised?
Do you train the whole model or just some layers? How long does training take?

Warping model:

Do you use a pretrained warping model, or is this something you train from scratch? If you trained it from scratch, how long did it take and how many data points did you use?
What is the main difference between this warping model and the models from older papers?

Thanks for your work again

This error occurs in Google Colab:

/content/DCI-VTON-Virtual-Try-On
2023-12-05 14:49:45.898989: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-05 14:49:45.899048: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-05 14:49:45.899086: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-12-05 14:49:47.701705: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Global seed set to 23
Loading model from /content/DCI-VTON-Virtual-Try-On/viton512.ckpt
Global Step: 58240
LatentTryOnDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.54 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 64, 64) = 16384 dimensions.
making attention of type 'vanilla' with 512 in_channels
Downloading: "https://github.com/DagnyT/hardnet/raw/master/pretrained/train_liberty_with_aug/checkpoint_liberty_with_aug.pth" to /root/.cache/torch/hub/checkpoints/checkpoint_liberty_with_aug.pth
100% 5.10M/5.10M [00:00<00:00, 64.6MB/s]
config.json: 100% 4.52k/4.52k [00:00<00:00, 18.9MB/s]
model.safetensors: 100% 1.71G/1.71G [00:24<00:00, 68.9MB/s]
^C

Why does this error occur in Google Colab? I'm using a V100 GPU.

Unable to test the warping module

I am having an issue with the cupy module, which is not part of the conda env yaml file.

When running sh test_VITON.sh, I get this:

Also, my GPU and CUDA version are:

Let me know how you got it working for yourself. The README seems to be missing this bit.

Tackle the Occlusion problem

Many virtual try-on models suffer from occlusion: images are not generated properly in regions occluded by the hands or in challenging poses. Does this model handle such cases (the paper's results show no such problem), and how can the problem be tackled?

Where is the base.py?

When I run train.sh or main.py to train a diffusion model, I get an error:
File "main.py", line 21, in <module>
from ldm.data.base import Txt2ImgIterableBaseDataset
ModuleNotFoundError: No module named 'ldm.data.base'

I found that there is no base.py file under ./ldm/data/; maybe you forgot to upload it? Can I use the file from the Paint-by-Example project directly, or did you modify it?
