victorca25 / trainner
traiNNer: Deep learning framework for image and video super-resolution, restoration and image-to-image translation, for training and testing.
License: Apache License 2.0
21-08-20 15:26:58.678 - INFO: Dataset [SingleDataset - seta] is created.
21-08-20 15:26:58.678 - INFO: Number of test_1 images in [seta]: 100
21-08-20 15:26:58.678 - INFO: Dataset [SingleDataset - setb] is created.
21-08-20 15:26:58.678 - INFO: Number of test_2 images in [setb]: 100
21-08-20 15:26:58.709 - INFO: AMP library available
21-08-20 15:27:03.014 - INFO: Loading pretrained model for G [C:\Users\User\Desktop\traiNNer-master\traiNNer-master\codes\experiments\pretrained_models\4x_RRDB_ESRGAN.pth]
21-08-20 15:27:03.400 - INFO: Network G structure: DataParallel - RRDBNet, with parameters: 16,697,987
21-08-20 15:27:03.400 - INFO: Model [SRModel] created.
21-08-20 15:27:03.400 - INFO:
Testing [seta]...
Traceback (most recent call last):
File "test.py", line 253, in <module>
main()
File "test.py", line 249, in main
test_loop(model, opt, dataloaders, data_params)
File "test.py", line 120, in test_loop
for data in dataloader:
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
return self._get_iterator()
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
w.start()
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_totensor..'
C:\Users\User\Desktop\traiNNer-master\traiNNer-master\codes>
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
The above error message comes up when I run "python test.py -opt options/sr/test_sr.yml".
I modified the yml to specify the image and model paths, but the error still appeared.
How can I run test.py? Why does this error message appear? I don't know how to run traiNNer...
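For what it's worth, the first traceback ends in `popen_spawn_win32.py`, so the DataLoader workers are being started with the Windows "spawn" method, which pickles the dataset and its transforms before handing them to each worker. A function defined inside another function cannot be pickled, which is what `Can't pickle local object` is complaining about. A minimal sketch of the difference, using only the standard library (the helper names are hypothetical and not part of traiNNer):

```python
import functools
import operator
import pickle


def make_local_transform(scale):
    # A function created inside another function is a "local object",
    # similar in kind to the object named in the traceback.
    return lambda x: x * scale


def make_picklable_transform(scale):
    # functools.partial over an importable function is pickled by reference.
    return functools.partial(operator.mul, scale)


def can_pickle(obj):
    # Older Python versions raise AttributeError here, newer ones PicklingError.
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


print(can_pickle(make_local_transform(2)))
print(can_pickle(make_picklable_transform(2)))
```

If that is indeed the cause, loading data in the main process (zero workers) avoids the pickling step entirely. The training options elsewhere on this page use `n_workers` for that setting; whether the test options accept the same key is an assumption.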
Traceback (most recent call last):
File "D:\PycharmProjects\traiNNer\codes\train.py", line 500, in <module>
main()
File "D:\PycharmProjects\traiNNer\codes\train.py", line 496, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "D:\PycharmProjects\traiNNer\codes\train.py", line 224, in fit
for n, train_data in enumerate(dataloaders['train'], start=1):
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\dataloader.py", line 1346, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\dataloader.py", line 1372, in _process_data
data.reraise()
File "D:\venvs\AI\Lib\site-packages\torch\_utils.py", line 722, in reraise
raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
^^^^^^^^^^^^^^^^^^^^
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\venvs\AI\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
~~~~~~~~~~~~^^^^^
File "D:\PycharmProjects\traiNNer\codes\data\aligned_dataset.py", line 126, in __getitem__
A_transform = get_transform(
^^^^^^^^^^^^^^
File "D:\PycharmProjects\traiNNer\codes\dataops\augmentations.py", line 573, in get_transform
transform_list.append(transforms.Resize(osize, method))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\PycharmProjects\traiNNer\codes\dataops\augmennt\augmennt\transforms.py", line 175, in __init__
elif isinstance(size, collections.Iterable) and len(size) == 2:
^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'collections' has no attribute 'Iterable'
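The last frame points at `collections.Iterable`, an alias that was removed from the `collections` module in Python 3.10; the abstract base classes now live only in `collections.abc`. A minimal sketch of the same kind of check written against `collections.abc` (the function name is hypothetical; the real check sits inside the `Resize` constructor shown in the traceback):

```python
from collections.abc import Iterable


def is_size_pair(size):
    # Same test as in the traceback, but using the collections.abc location.
    return isinstance(size, Iterable) and len(size) == 2


print(is_size_pair((256, 256)))
print(is_size_pair(256))
```

Changing the reference in the affected file, or running under Python 3.9 or older, should get past this particular error.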
There is a better way to define torch's requirement in requirements.txt. The current way installs the CPU build by default.
I know how to do this better. I will include this in a PR later.
By default the video learning rate is 0.001, which is far too high when replacing the SR component with RRDB (for sofvsr). In general this is still pretty high. I recommend setting the LR to 0.0001 and then increasing the ofr weights instead.
I will make a PR for this.
On Colab, restarting training results in the following error:
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/BasicSR/codes/models/modules/architectures/block.py", line 428, in forward
sampled_noise = self.noise.repeat(*x.size()).normal_() * scale
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.78 GiB total capacity; 14.32 GiB already allocated; 20.75 MiB free; 14.46 GiB reserved in total by PyTorch)
The only way to restart training was to reduce the batch size all the way from 64 to 3.
I've tried running the following commands, to no avail:
gc.collect()
torch.cuda.empty_cache()
There seems to be a resolution here:
https://discuss.pytorch.org/t/how-can-we-release-gpu-memory-cache/14530/27
I tried train_ppon.py and got an import error on
from models.modules.LPIPS import compute_dists as lpips
I tried commenting it out, along with the usages of lpips, and there seemed to be other import errors.
I've not tried test_ppon.py, but maybe that has errors too.
It uses about 3 GB when training and about 5.2 GB when validation starts, then it crashes.
My training data is 512x512 jpg files, one frame: https://imgur.com/a/MulGiTE
GPU: 2070 super
CPU: 5600x
cuda 11 installed
torch==1.9.1+cu111 torchvision==0.10.1+cu111
Complete powershell output:
export CUDA_VISIBLE_DEVICES=0
Path already exists. Rename it to [D:\Code\GitHub\BasicSR\experiments\debug_001_template_archived_210928-091440]
21-09-28 09:14:40.677 - INFO: name: debug_001_template
use_tb_logger: True
model: srragan
scale: 4
gpu_ids: [0]
use_amp: False
use_swa: False
datasets:[
train:[
name: DIV2K
mode: LRHRC
dataroot_HR: ..\..\train\hr
dataroot_LR: ..\..\train\lr
subset_file: None
use_shuffle: True
znorm: False
n_workers: 6
batch_size: 8
virtual_batch_size: 8
HR_size: 128
image_channels: 3
dataroot_kernels: ../training/kernels/results/
lr_downscale: True
lr_downscale_types: [1, 2, 777]
use_flip: True
use_rot: True
hr_rrot: False
lr_blur: False
lr_blur_types: ['gaussian', 'clean', 'clean', 'clean']
noise_data: ../noise_patches/normal/
lr_noise: False
lr_noise_types: ['gaussian', 'JPEG', 'clean', 'clean', 'clean', 'clean']
lr_noise2: False
lr_noise_types2: ['dither', 'dither', 'clean', 'clean']
hr_noise: False
hr_noise_types: ['gaussian', 'clean', 'clean', 'clean', 'clean']
phase: train
scale: 4
data_type: img
]
val:[
name: val_set14_part
mode: LRHROTF
dataroot_HR: ..\..\val\hr
dataroot_LR: ..\..\val\lr
znorm: False
lr_downscale: False
lr_downscale_types: [1, 2]
phase: val
scale: 4
data_type: img
]
]
path:[
strict: False
root: D:\Code\GitHub\BasicSR
pretrain_model_G: ..\experiments\pretrained_models\1xPSNR.pth
experiments_root: D:\Code\GitHub\BasicSR\experiments\debug_001_template
models: D:\Code\GitHub\BasicSR\experiments\debug_001_template\models
training_state: D:\Code\GitHub\BasicSR\experiments\debug_001_template\training_state
log: D:\Code\GitHub\BasicSR\experiments\debug_001_template
val_images: D:\Code\GitHub\BasicSR\experiments\debug_001_template\val_images
]
network_G:[
strict: False
which_model_G: RRDB_net
norm_type: None
mode: CNA
nf: 64
nb: 23
nr: 3
in_nc: 3
out_nc: 3
gc: 32
group: 1
convtype: Conv2D
net_act: leakyrelu
gaussian: True
plus: False
scale: 4
]
network_D:[
strict: True
which_model_D: discriminator_vgg
norm_type: batch
act_type: leakyrelu
mode: CNA
nf: 64
in_nc: 3
nlayer: 3
num_D: 3
]
train:[
lr_G: 0.0001
weight_decay_G: 0
beta1_G: 0.9
lr_D: 0.0001
weight_decay_D: 0
beta1_D: 0.9
lr_scheme: MultiStepLR
lr_gamma: 0.5
swa_start_iter: 375000
swa_lr: 0.0001
swa_anneal_epochs: 10
swa_anneal_strategy: cos
pixel_criterion: l1
pixel_weight: 0.01
feature_criterion: l1
feature_weight: 1
gan_type: vanilla
gan_weight: 0.005
manual_seed: 0
niter: 500000.0
val_freq: 8
metrics: psnr,ssim,lpips
overwrite_val_imgs: None
val_comparison: None
lr_decay_iter: 10
lr_steps: [50000, 100000, 200000, 300000]
]
logger:[
print_freq: 2
save_checkpoint_freq: 8
overwrite_chkp: False
]
is_train: True
21-09-28 09:14:40.678 - INFO: Random seed: 0
21-09-28 09:14:41.321 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-09-28 09:14:41.322 - INFO: Number of train images: 63,792, iters: 7,974
21-09-28 09:14:41.323 - INFO: Total epochs needed: 63 for iters 500,000
21-09-28 09:14:41.324 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-09-28 09:14:41.324 - INFO: Number of val images in [val_set14_part]: 5
21-09-28 09:14:41.558 - INFO: AMP library available
21-09-28 09:14:42.583 - INFO: Initialization method [kaiming]
21-09-28 09:14:42.799 - INFO: Initialization method [kaiming]
21-09-28 09:14:42.891 - INFO: Loading pretrained model for G [..\experiments\pretrained_models\1xPSNR.pth] ...
21-09-28 09:14:43.753 - INFO: Network G structure: DataParallel - RRDBNet, with parameters: 16,697,987
21-09-28 09:14:43.754 - INFO: Network D structure: DataParallel - Discriminator_VGG, with parameters: 14,502,281
21-09-28 09:14:43.756 - INFO: Model [SRRaGANModel] is created.
21-09-28 09:14:43.757 - INFO: Start training from epoch: 0, iter: 0
21-09-28 09:14:52.560 - INFO: <epoch: 0, iter: 2, lr:1.000e-04, t:-1.0000s, td:3.0840s, eta:0.0000h> pix-l1: 1.6838e-03 fea-vgg19-l1: 1.5493e+00 l_g_gan: 6.9997e-03 l_d_real: 3.2938e-01 l_d_fake: 3.4658e-01 D_real: 5.9246e-01 D_fake: -4.6950e-01
21-09-28 09:14:53.462 - INFO: <epoch: 0, iter: 4, lr:1.000e-04, t:-1.0000s, td:0.0000s, eta:0.0000h> pix-l1: 2.4982e-03 fea-vgg19-l1: 1.7615e+00 l_g_gan: 1.9201e-02 l_d_real: 5.7227e-02 l_d_fake: 6.6596e-02 D_real: 1.0418e+00 D_fake: -2.7365e+00
21-09-28 09:14:54.274 - INFO: <epoch: 0, iter: 6, lr:1.000e-04, t:0.9020s, td:0.0000s, eta:125.2761h> pix-l1: 2.0472e-03 fea-vgg19-l1: 1.7822e+00 l_g_gan: 3.1084e-02 l_d_real: 6.8650e-03 l_d_fake: 3.2773e-03 D_real: 1.3466e+00 D_fake: -4.8651e+00
21-09-28 09:14:55.233 - INFO: <epoch: 0, iter: 8, lr:1.000e-04, t:0.8125s, td:0.0000s, eta:112.8456h> pix-l1: 2.5441e-03 fea-vgg19-l1: 1.4662e+00 l_g_gan: 2.8835e-02 l_d_real: 1.3615e-02 l_d_fake: 5.1962e-03 D_real: 1.5751e+00 D_fake: -4.1826e+00
21-09-28 09:14:55.669 - INFO: Models and training states saved.
Setting up Perceptual loss...
Loading model from: J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\modules\LPIPS\lpips_weights\v0.1\squeeze.pth
...[net-lin [squeeze]] initialized
...Done
Traceback (most recent call last):
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\train.py", line 416, in <module>
main()
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\train.py", line 412, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\train.py", line 289, in fit
model.test() # run inference
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\SRRaGAN_model.py", line 387, in test
self.forward(CEM_net=CEM_net)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\SRRaGAN_model.py", line 254, in forward
self.fake_H = self.netG(self.var_L) # G(LR)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\parallel\data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\modules\architectures\RRDBNet_arch.py", line 49, in forward
x = self.model(x)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
input = module(input)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\modules\architectures\block.py", line 195, in forward
output = x + self.sub(x)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
input = module(input)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\modules\architectures\RRDBNet_arch.py", line 93, in forward
out = self.RDB3(out)
File "C:\Program Files\Python39\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "J:\Videos\ESRGAN\DATASET\traiNNer-2.0\codes\models\modules\architectures\RRDBNet_arch.py", line 159, in forward
x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
RuntimeError: CUDA out of memory. Tried to allocate 1.48 GiB (GPU 0; 8.00 GiB total capacity; 2.59 GiB already allocated; 332.74 MiB free; 5.41 GiB reserved in total by PyTorch)
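The allocation fails inside `model.test()`, i.e. during validation, where the whole validation image appears to go through the generator at once rather than a small training crop. One common workaround, sketched here under the assumption that the image can be processed in pieces and stitched back together, is to split it into overlapping tiles. This is not traiNNer's own API; `tile_boxes` and its parameters are hypothetical, and only the box arithmetic is shown:

```python
def tile_boxes(height, width, tile=256, overlap=16):
    """Return (top, left, bottom, right) boxes that together cover the image."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap  # distance between the starts of neighbouring tiles
    boxes = []
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            # Tiles on the bottom and right edges are clipped to the image.
            boxes.append((top, left, min(top + tile, height), min(left + tile, width)))
    return boxes


for box in tile_boxes(512, 512):
    print(box)
```

Each box would be cropped from the low-resolution input, upscaled on its own, and pasted into the output at the box coordinates multiplied by the scale. Using smaller validation images has the same effect with no code changes.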
I got the error 'lam < 0 or lam contains NaNs' on some images at https://github.com/victorca25/BasicSR/blob/14aced7d1049a283761c145f3cf300a94c6ac4b9/codes/dataops/augmentations.py#L786
I just modified it to fall back to another degradation on the images that raise the error (the message mentions Gaussian noise, but the fallback code here applies JPEG compression):
try:
    noise_img = np.random.poisson(img_LR * vals) / float(vals)
except ValueError:
    print("Poissoning Err,Just Gaussing it")
    compression = int(np.random.uniform(10, 50))  # randomize quality between 10 and 50%
    encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), compression]  # encoding parameters
    # encode
    is_success, encimg = cv2.imencode('.jpg', img_LR, encode_param)
    # decode
    noise_img = cv2.imdecode(encimg, 1)
    noise_img = noise_img.astype(np.uint8)
I was training a model with PPON (192) + MultiScale + Diffaug, and I received the following error when moving to Phase 2:
I have AMP disabled because my GPU doesn't support it.
error.log
21-01-27 11:26:52.449 - INFO: Random seed: 0
21-01-27 11:26:52.647 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-01-27 11:26:52.647 - INFO: Number of train images: 37,933, iters: 2,371
21-01-27 11:26:52.647 - INFO: Total epochs needed: 43 for iters 100,000
21-01-27 11:26:52.648 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-01-27 11:26:52.648 - INFO: Number of val images in [val_set14_part]: 1
21-01-27 11:26:52.650 - INFO: AMP library available
21-01-27 11:26:52.827 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.127 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.185 - INFO: Loading pretrained model for G [../experiments/pretrained_models/PPON_G.pth] ...
21-01-27 11:26:55.276 - INFO: Network G structure: DataParallel - PPON, with parameters: 17,267,657
21-01-27 11:26:55.277 - INFO: Network D structure: DataParallel - MultiscaleDiscriminator, with parameters: 8,296,899
21-01-27 11:26:55.277 - INFO: Model [PPONModel] is created.
21-01-27 11:26:55.277 - INFO: Start training from epoch: 0, iter: 0
21-01-27 11:26:55.991 - INFO: Switching to phase: p2, step: 1
Traceback (most recent call last):
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 382, in <module>
main()
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 378, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 221, in fit
model.optimize_parameters(virtual_step) # calculate loss functions, get gradients, update network weights
File "/mnt/ext4-storage/Training/BasicSR/codes/models/ppon_model.py", line 199, in optimize_parameters
l_g_total.backward()
AttributeError: 'float' object has no attribute 'backward'
I was reading about the GAN types (Vanilla, LSGAN, and WGAN-GP) already included in BasicSR, and I found a new type that may bring a sizable performance increase to the discriminators used in upscaling methods like ESRGAN and PPON.
https://arxiv.org/abs/1807.00734
This paper outlines the idea behind a relativistic discriminator and showcases new variants of existing GANs that were created to use this approach.
There is also source code available:
https://www.github.com/AlexiaJM/RelativisticGAN
The one that stood out to me was RaLSGAN.
It performs better than the other variants in most tests involving generating images that are 128x128 or less. When it comes to SGAN (Standard GAN), it outperforms this variant by a large margin.
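For reference, a minimal sketch of the RaLSGAN objective as I understand it from the paper, written in plain Python on lists of raw critic outputs C(x). Each score is measured relative to the mean score of the opposite class, and a least-squares target of +1 or -1 is applied. The function name is hypothetical, and this is not code from BasicSR or from the linked repository:

```python
from statistics import mean


def ralsgan_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for raw critic outputs."""
    mean_real = mean(real_scores)
    mean_fake = mean(fake_scores)
    # Relativistic average terms: each score minus the opposite class mean.
    real_rel = [r - mean_fake for r in real_scores]
    fake_rel = [f - mean_real for f in fake_scores]
    # The discriminator pushes real_rel towards +1 and fake_rel towards -1.
    d_loss = mean((v - 1) ** 2 for v in real_rel) + mean((v + 1) ** 2 for v in fake_rel)
    # The generator uses the same terms with the targets swapped.
    g_loss = mean((v - 1) ** 2 for v in fake_rel) + mean((v + 1) ** 2 for v in real_rel)
    return d_loss, g_loss


print(ralsgan_losses([1.0, 1.0], [-1.0, -1.0]))
```

Unlike the standard formulation, the generator loss here depends on the real samples as well, which is the main practical difference when wiring it into a training loop.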
Interested to hear your thoughts on this,
N0man
Hey,
When trying to train, I got the following error. Does anyone know how to fix it?
Traceback (most recent call last):
File "train.py", line 65, in configure_loggers
tb_logger = SummaryWriter(log_dir=log_dir)
NameError: name 'log_dir' is not defined
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 379, in <module>
main()
File "train.py", line 351, in main
loggers = configure_loggers(opt)
File "train.py", line 69, in configure_loggers
tb_logger = SummaryWriter(logdir=log_dir)
NameError: name 'log_dir' is not defined
E:\BasicSR-master\codes>
Hi,
How can I use your framework to train a super-resolution model with the ideas of Real-ESRGAN (two-stage degradation)?
I can't tell whether it's possible to train something like EGVSR or BasicVSR. Is that an upcoming feature?
From what I can tell, when using multiple folders to specify training data, they must have the same prefix path. If that is not done, training gives a confusing "image too large" error.
If the prefix requirement is intended, it would perhaps be good to document it in the example file and/or detect it and give a more specific error message.
D:\traiNNer\codes\models\base_model.py:921: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
self.grad_clip(
C:\Python39\lib\site-packages\torch\optim\lr_scheduler.py:129: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
How do I fix this?
I think your ETA calculation is off by a factor of 100...
Through both my own manual calculations and visually inspecting the ETA while training, it seems that the decimal separator is placed in the wrong spot.
Example:
23-07-26 02:16:57.959 - INFO: <epoch: 33, iter: 167,900, lr:6.250e-06, t:31.6040s, td:0.0001s, eta:281.8024h> pix-l1: 4.2061e-04 fea-vgg19-l1: 7.0643e-01 l_g_gan: 1.4697e-02 l_d_real: 5.6621e-02 l_d_fake: 5.6456e-02 D_real: 4.0469e+01 D_fake: 3.7594e+01
Here we see that the ETA is 281.8024h(rs), when in fact it is much closer to 2.81hrs. This is consistent with every training run I've done so far, no matter how big or small the ETA actually is.
Even though I've gotten used to it now, I thought I might raise an issue here and let you know.
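The numbers in the logs posted further up this page are consistent with the ETA being computed as `t * remaining_iters / 3600`, where `t` is the time for one logging interval rather than for one iteration. For example, with `niter: 500000` and `print_freq: 2`, the line at iter 6 shows `t:0.9020s` and `eta:125.2761h`. If that reading is right, dividing by the number of iterations per interval gives the corrected value. A minimal sketch (the function and parameter names are hypothetical, not traiNNer's):

```python
def eta_hours(interval_seconds, remaining_iters, iters_per_interval=1):
    # Convert the time per logging interval into time per iteration first.
    seconds_per_iter = interval_seconds / iters_per_interval
    return seconds_per_iter * remaining_iters / 3600.0


# Values taken from the earlier log line at iter 6 (niter 500000, print_freq 2).
as_logged = eta_hours(0.9020, 500000 - 6)
corrected = eta_hours(0.9020, 500000 - 6, iters_per_interval=2)
print(round(as_logged, 4), round(corrected, 4))
```

For the line quoted above, a logging interval of 100 iterations would turn 281.8h into roughly 2.8h, which matches the observed factor of 100. The interval of 100 is an inference, since the config for that run is not shown.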
I've noticed that if you update niter mid-training, it displays the correct new 'lr steps', but it does not correct the current learning rate in accordance with the new steps. It looks like it just keeps the old learning rate from the 'latest.state' file.
Example:
When I changed niter from 100,000 to 500,000, the following were the logs:
Here are the settings in the config file:
lr_steps_rel: [0.1, 0.2, 0.4, 0.6]
lr_G: 0.0001
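To make the expected behaviour concrete, here is a minimal sketch of how the relative steps map to absolute iterations and what learning rate a multi-step schedule would give at a certain iteration. It assumes a decay factor of 0.5 per step (the `lr_gamma: 0.5` that appears in other configs on this page) and uses hypothetical helper names:

```python
from bisect import bisect_right


def absolute_steps(niter, rel_steps):
    # Relative steps are fractions of the total number of iterations.
    return [round(niter * r) for r in rel_steps]


def expected_lr(base_lr, gamma, steps, current_iter):
    # The rate is multiplied by gamma once for every step already reached.
    return base_lr * gamma ** bisect_right(steps, current_iter)


rel = [0.1, 0.2, 0.4, 0.6]
for niter in (100000, 500000):
    steps = absolute_steps(niter, rel)
    print(niter, steps, expected_lr(0.0001, 0.5, steps, 45000))
```

With niter at 100,000 the steps are 10,000 / 20,000 / 40,000 / 60,000, so at iteration 45,000 the rate has been halved three times (1.25e-05). With niter at 500,000 the first step moves to 50,000, so the same iteration should be back at 1e-04 if the rate were recomputed from the new steps.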
This is caused by the code assuming that, since the LR and HR datasets are the same size, it should generate LR on the fly.
This should be an easy fix, I'll make a PR for it later.
I used create_lmdb.py to create both my LR and HR datasets, and I was wondering how I should configure my options file.
Do the settings differ from using HR/LR image folders?
I think crop size is a bit misleading for people that are new to training. Many times I've had to explain how training a 1x model with a crop size of 128 is equivalent to training a 4x model at 512. It seems silly to set the crop size to 32 for a 1x model, but it might seem less silly if every model by default used the same LR crop size of 32 regardless of scale.
At the very least I think a comment or something explaining this concept would suffice. This might be a case of us just needing better documentation rather than adding extra hand holding.
Thoughts?
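The relation can be stated as one line of arithmetic: the network always sees the LR crop, which is the HR crop size divided by the scale. A tiny sketch (the function name is hypothetical):

```python
def lr_crop_size(hr_crop_size, scale):
    # The low-resolution patch fed to the network is the HR crop divided by the scale.
    return hr_crop_size // scale


# A 1x model at crop size 128 and a 4x model at crop size 512 see the same LR patch.
print(lr_crop_size(128, 1), lr_crop_size(512, 4))
# A 4x model at crop size 128 and a 1x model at crop size 32 also match.
print(lr_crop_size(128, 4), lr_crop_size(32, 1))
```

Seen this way, setting the crop size to 32 for a 1x model just keeps the LR patch the same as the usual 4x setup with a crop size of 128.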
Hi!
Is it possible to initialize pix2pix to work with a 3-channel input image and a 1-channel output?
Trying with the following options:
name: 001_pix2pix_test
use_tb_logger: true
model: pix2pix
scale: 1
gpu_ids: [0]
use_amp: true
use_swa: false
# Dataset options:
datasets:
  train:
    name: test
    mode: aligned
    outputs: AB
    dataroot_B: '../datasets/test/B'
    dataroot_A: '../datasets/test/A'
    use_shuffle: true
    n_workers: 8
    batch_size: 2
    virtual_batch_size: 2
    preprocess: none
    crop_size: 256
    input_nc: 3
    output_nc: 1
    image_channels: 3
# Generator options:
network_G:
  strict: false
  which_model_G: unet_net
# Discriminator options:
network_D:
  strict: true
  which_model_D: patchgan
  in_nc: 4
And got this a few seconds after training started:
2024-01-03 11:07:26.963466: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
24-01-03 11:07:34.462 - WARNING: From c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.
24-01-03 11:07:35.310 - INFO: Random seed: 0
24-01-03 11:07:35.677 - INFO: Dataset [AlignedDataset - test] is created.
24-01-03 11:07:35.677 - INFO: Number of train images: 3,159, epoch iters: 1,579
24-01-03 11:07:35.678 - INFO: Total epochs needed: 32 for iters 50,000
24-01-03 11:07:35.910 - INFO: AMP library available
24-01-03 11:07:36.229 - INFO: Initialization method [kaiming]
24-01-03 11:07:36.900 - INFO: Initialization method [kaiming]
24-01-03 11:07:36.927 - INFO: GAN enabled
24-01-03 11:07:36.929 - INFO: AMP enabled
24-01-03 11:07:36.930 - INFO: Network G structure: DataParallel - UnetGenerator, with parameters: 54,413,955
24-01-03 11:07:36.930 - INFO: Network D structure: DataParallel - NLayerDiscriminator, with parameters: 2,766,657
24-01-03 11:07:36.931 - INFO: Model [Pix2PixModel] created.
24-01-03 11:07:36.931 - INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
File "f:\GIT\traiNNer\codes\train.py", line 500, in <module>
main()
File "f:\GIT\traiNNer\codes\train.py", line 496, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "f:\GIT\traiNNer\codes\train.py", line 238, in fit
model.optimize_parameters(virtual_step) # calculate loss functions, get gradients, update network weights
File "f:\GIT\traiNNer\codes\models\pix2pix_model.py", line 219, in optimize_parameters
self.backward_D()
File "f:\GIT\traiNNer\codes\models\pix2pix_model.py", line 146, in backward_D
self.log_dict = self.backward_D_Basic(
File "f:\GIT\traiNNer\codes\models\base_model.py", line 871, in backward_D_Basic
l_d_total, gan_logs = self.adversarial(
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "f:\GIT\traiNNer\codes\models\losses.py", line 595, in forward
return self.conditional_discriminator(
File "f:\GIT\traiNNer\codes\models\losses.py", line 530, in conditional_discriminator
return self.regular_discriminator(
File "f:\GIT\traiNNer\codes\models\losses.py", line 536, in regular_discriminator
pred_d_fake, pred_d_real = self.get_predictions_dis(
File "f:\GIT\traiNNer\codes\models\losses.py", line 475, in get_predictions_dis
pred_d_fake = netD(fake.detach())
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\parallel\data_parallel.py", line 183, in forward
return self.module(*inputs[0], **module_kwargs[0])
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "f:\GIT\traiNNer\codes\models\modules\architectures\discriminators.py", line 579, in forward
return self.model(x)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\container.py", line 215, in forward
input = module(input)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
return self._conv_forward(input, self.weight, self.bias)
File "c:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [64, 4, 4, 4], expected input[2, 6, 1024, 1024] to have 4 channels, but got 6 channels instead
I double-checked: all A images have 3 channels and all B images have only 1 grayscale channel.
What is going on, and is this possible at all?
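Reading the error message: the first discriminator layer was built for 4 input channels (`in_nc: 4`, i.e. 3 + 1), but it received 6. In pix2pix the conditional discriminator is normally fed the input and output images concatenated along the channel axis, so 6 suggests that both tensors have 3 channels at that point, for instance because B is loaded as 3 channels (`image_channels: 3`) or because the generator still outputs 3 channels. That is an inference from the shapes, not something confirmed in the code. The arithmetic, with hypothetical helper names:

```python
def conditional_d_in_channels(input_nc, output_nc):
    # The conditional discriminator sees A and B stacked along the channel axis.
    return input_nc + output_nc


def implied_output_channels(received_channels, input_nc):
    # Work backwards from the channel count reported in the error message.
    return received_channels - input_nc


print(conditional_d_in_channels(3, 1))
print(implied_output_channels(6, 3))
```

So the discriminator setting itself looks consistent with a 3-to-1 setup; the thing to check would be where the second image or the generator output ends up with 3 channels.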
Training a 1x model with Pixel Unshuffle (using the supplied pretrained model) yields this error:
[Python] RuntimeError: Given groups=1, weight of size [64, 48, 3, 3], expected input[1, 4, 297, 397] to have 48 channels, but got 4 channels instead
[ESRGAN] Upscaling Error: Index was outside the bounds of the array.
at Cupscale.PreviewMerger.Merge()
at Cupscale.Main.Upscale.<Run>d__8.MoveNext()
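The shapes hint at what happened: 48 = 3 x 4^2 is what a 3-channel image becomes after a pixel unshuffle by a factor of 4, while the tensor that reached the first convolution still has 4 channels at full resolution. So the unshuffle step apparently did not run on the inference side, and the input seems to carry a fourth channel (possibly alpha). The factor of 4 is inferred from the weight shape. A minimal sketch of the shape arithmetic, with a hypothetical function name:

```python
def pixel_unshuffle_shape(channels, height, width, factor):
    # Pixel unshuffle moves factor x factor spatial blocks into the channel axis.
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by the factor")
    return channels * factor * factor, height // factor, width // factor


print(pixel_unshuffle_shape(3, 296, 396, 4))
```

Note that 297 x 397 is not divisible by 4 either, so the image would need padding or cropping before an unshuffle by that factor could be applied.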
The ability to change which augmentation preset is being used at different points in training would be great. For example, at 10k iterations, resrgan_blur could be used, but at 30k it's automatically switched to bsrgan_blur.
This was discussed in the #trainner channel on the GU Discord server
Edit: A possible expansion on this idea, augmentation preset strengths. I'm not sure how it'd function, but I figured I'd bring it up
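To illustrate the request, a minimal sketch of how such a schedule could be looked up at each iteration. Nothing like this exists in traiNNer as far as I know; the function name and the schedule format are hypothetical, and only the preset names come from the example above:

```python
from bisect import bisect_right


def preset_at(schedule, current_iter, default=None):
    """schedule is a list of (start_iter, preset_name) sorted by start_iter."""
    starts = [start for start, _ in schedule]
    # Index of the last entry whose start iteration has been reached.
    index = bisect_right(starts, current_iter) - 1
    return schedule[index][1] if index >= 0 else default


schedule = [(10000, "resrgan_blur"), (30000, "bsrgan_blur")]
for it in (0, 10000, 29999, 30000):
    print(it, preset_at(schedule, it))
```

Preset strengths could follow the same pattern, with a number stored next to each preset name.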
python3 train.py -opt train_sr.yml
Traceback (most recent call last):
File "/home/nickdbts2022/Desktop/traiNNer/codes/train.py", line 500, in <module>
main()
File "/home/nickdbts2022/Desktop/traiNNer/codes/train.py", line 466, in main
opt = parse_options()
File "/home/nickdbts2022/Desktop/traiNNer/codes/train.py", line 25, in parse_options
opt = options.parse(args.opt, is_train=is_train)
File "/home/nickdbts2022/Desktop/traiNNer/codes/options/options.py", line 552, in parse
raise ValueError("Configuration file {} not found.".format(opt_path))
ValueError: Configuration file options/train/train_sr.yml not found.
This error shows up after starting train.py with the configuration that came with traiNNer. It is a fresh install with nothing modified (except train_sr.yml).
23-05-31 19:09:48.945 - INFO: Random seed: 0
23-05-31 19:09:49.264 - INFO: Dataset [AlignedDataset - DIV2K] is created.
23-05-31 19:09:49.266 - INFO: Number of train images: 14,361, epoch iters: 1,795
23-05-31 19:09:49.266 - INFO: Total epochs needed: 279 for iters 500,000
23-05-31 19:09:49.266 - INFO: Dataset [AlignedDataset - val_set14_part] is created.
23-05-31 19:09:49.267 - INFO: Number of val images in [val_set14_part]: 1
23-05-31 19:09:49.624 - INFO: AMP library available
23-05-31 19:09:51.252 - INFO: Initialization method [kaiming]
23-05-31 19:09:51.547 - INFO: Initialization method [kaiming]
23-05-31 19:09:51.670 - INFO: Loading pretrained model for G [..\experiments\pretrained_models\RRDB_PSNR_x4.pth]
23-05-31 19:09:52.916 - INFO: GAN enabled
23-05-31 19:09:52.922 - INFO: AMP enabled
23-05-31 19:09:52.923 - INFO: norm gradient clip enabled. Clip value: 0.1.
23-05-31 19:09:52.935 - INFO: Network G structure: DataParallel - RRDBNet, with parameters: 16,697,987
23-05-31 19:09:52.936 - INFO: Network D structure: DataParallel - Discriminator_VGG, with parameters: 14,502,281
23-05-31 19:09:52.936 - INFO: Model [SRModel] created.
23-05-31 19:09:52.936 - INFO: Start training from epoch: 0, iter: 0
E:\nn\trainner\codes\models\base_model.py:921: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
self.grad_clip(
C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\lr_scheduler.py:129: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
Traceback (most recent call last):
File "E:\nn\trainner\codes\train.py", line 500, in <module>
main()
File "E:\nn\trainner\codes\train.py", line 496, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "E:\nn\trainner\codes\train.py", line 224, in fit
for n, train_data in enumerate(dataloaders['train'], start=1):
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
data = self._next_data()
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
data.reraise()
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_utils.py", line 425, in reraise
raise self.exc_type(msg)
cv2.error: Caught error in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "E:\nn\trainner\codes\data\aligned_dataset.py", line 117, in __getitem__
img_A, img_B = paired_imgs_check(
File "E:\nn\trainner\codes\dataops\augmentations.py", line 1388, in paired_imgs_check
img_A, img_B = shape_change_fn(
File "E:\nn\trainner\codes\dataops\augmentations.py", line 1141, in shape_change_fn
img_A = transforms.Resize((int(h/scale), int(w/scale)),
File "E:\nn\trainner\codes\dataops\augmennt\augmennt\transforms.py", line 192, in __call__
return F.resize(img, self.size, self.interpolation)
File "E:\nn\trainner\codes\dataops\augmennt\augmennt\common.py", line 211, in wrapped_function
result = func(img, *args, **kwargs)
File "E:\nn\trainner\codes\dataops\augmennt\augmennt\functional.py", line 187, in resize
output = cv2.resize(img, dsize=(size[1], size[0]), interpolation=_cv2_str2interpolation[interpolation])
cv2.error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\resize.cpp:4065: error: (-215:Assertion failed) inv_scale_x > 0 in function 'cv::resize'
Very cool fork...
According to this video, they achieved better (quality) performance than DAIN+EDVR, so it would be great to have this implemented in standard-motion too:
https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020
Hope that inspires!
Hello. I am training on Colab and I get the following error.
export CUDA_VISIBLE_DEVICES=0
20-12-30 03:59:48.658 - INFO: name: ftrainer
use_tb_logger: False
model: srragan
scale: 8
batch_multiplier: 1
gpu_ids: [0]
datasets:[
train:[
name: Dataset
mode: LRHROTF
dataroot_HR: ['/content/datasets/set0/train/hr', '/content/datasets/set1/train/hr', '/content/datasets/set2/train/hr']
dataroot_LR: ['/content/datasets/set0/train/lr', '/content/datasets/set1/train/lr', '/content/datasets/set2/train/lr']
subset_file: None
use_shuffle: True
n_workers: 4
batch_size: 100
HR_size: 128
phase: train
scale: 8
data_type: img
virtual_batch_size: 100
]
val:[
name: Validation
mode: LRHROTF
dataroot_HR: ['/content/datasets/set0/val/hr', '/content/datasets/set1/val/hr', '/content/datasets/set2/val/hr']
dataroot_LR: ['/content/datasets/set0/val/lr', '/content/datasets/set1/val/lr', '/content/datasets/set2/val/lr']
phase: val
scale: 8
data_type: img
]
]
path:[
root: /content/BasicSR/
pretrain_model_G: ../experiments/pretrained_models/Restart.pth
experiments_root: /content/BasicSR/experiments/ftrainer
models: /content/BasicSR/experiments/ftrainer/models
training_state: /content/BasicSR/experiments/ftrainer/training_state
log: /content/BasicSR/experiments/ftrainer
val_images: /content/BasicSR/experiments/ftrainer/val_images
]
network_G:[
which_model_G: RRDB_net
norm_type: None
mode: CNA
nf: 64
nb: 23
in_nc: 3
out_nc: 3
gc: 32
group: 1
convtype: Conv2D
net_act: leakyrelu
scale: 8
]
network_D:[
which_model_D: discriminator_vgg
norm_type: batch
act_type: leakyrelu
mode: CNA
nf: 64
in_nc: 3
]
train:[
lr_G: 0.0001
lr_D: 0.0001
use_frequency_separation: False
lr_scheme: MultiStepLR
lr_steps: [50000, 100000, 200000, 300000]
lr_gamma: 0.5
pixel_criterion: l1
pixel_weight: 0.01
feature_criterion: l1
feature_weight: 1
gan_type: vanilla
gan_weight: 0.005
manual_seed: 0
niter: 500000.0
val_freq: 100
overwrite_val_imgs: None
val_comparison: None
]
logger:[
print_freq: 100
save_checkpoint_freq: 100.0
backup_freq: 100
overwrite_chkp: None
]
is_train: True
20-12-30 03:59:48.658 - INFO: Random seed: 0
20-12-30 03:59:48.716 - INFO: Dataset [LRHRDataset - Dataset] is created.
20-12-30 03:59:48.716 - INFO: Number of train images: 1,307, iters: 14
20-12-30 03:59:48.716 - INFO: Total epochs needed: 35715 for iters 500,000
20-12-30 03:59:48.719 - INFO: Dataset [LRHRDataset - Validation] is created.
20-12-30 03:59:48.719 - INFO: Number of val images in [Validation]: 358
20-12-30 03:59:48.752 - INFO: AMP library available
Traceback (most recent call last):
File "train.py", line 256, in <module>
main()
File "train.py", line 98, in main
model = create_model(opt)
File "/content/BasicSR/codes/models/__init__.py", line 26, in create_model
m = M(opt)
File "/content/BasicSR/codes/models/SRRaGAN_model.py", line 51, in __init__
self.netG = networks.define_G(opt).to(self.device) # G
File "/content/BasicSR/codes/models/networks.py", line 160, in define_G
finalact=opt_net['finalact'], gaussian_noise=opt_net['gaussian'], plus=opt_net['plus'], nr=opt_net['nr'])
File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 26, in __init__
gaussian_noise=gaussian_noise, plus=plus) for _ in range(nb)]
File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 26, in <listcomp>
gaussian_noise=gaussian_noise, plus=plus) for _ in range(nb)]
File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 86, in __init__
gaussian_noise=gaussian_noise, plus=plus) for _ in range(nr)]
TypeError: 'NoneType' object cannot be interpreted as an integer
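The last frames of this traceback end in `range(nr)`, and the printed network_G options contain no `nr` entry, so `opt_net['nr']` may have evaluated to None. The sketch below only reproduces that failure mode in isolation. `build_blocks` and `default_nr` are hypothetical names, not part of the repository, and the fallback of 3 is an assumption chosen for illustration.

```python
def build_blocks(nb, nr=None, default_nr=3):
    """Hypothetical stand-in for a constructor that builds nb blocks of nr sub-blocks.

    default_nr is an assumed fallback, used only when nr was left unset.
    """
    if nr is None:
        nr = default_nr
    return [[f"block{b}_sub{r}" for r in range(nr)] for b in range(nb)]


# Passing None straight to range() raises the TypeError seen in the log.
try:
    [None for _ in range(None)]
except TypeError as exc:
    message = str(exc)

# With a fallback in place the same call pattern succeeds.
blocks = build_blocks(nb=2)
print(message)
print(len(blocks), len(blocks[0]))
```

If this is indeed the cause, setting `nr` explicitly under network_G in the options file would be the equivalent fix on the configuration side.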
I'm using the test.py as per the instruction and my results are blue. How can I run upscaling to produce proper results?
Why are the installation instructions so complicated? I don't understand this shit.
I was looking to try this out to train an upscaling model but thought to try one of my test images first, and found that downscaling was being done in sRGB gamma. Most images are encoded as sRGB (~188 is half as bright as 255), but downscaling algorithms, where it's especially relevant, assume they're taking linear RGB as input (~127 is half as bright as 255).
I used this image as my input (this isn't a good image for training an upscaler but it does demonstrate the problem) and manually ran it through resize from imresize.py, the same way it is done in generate_mod_LR_bic.py. It's best to open this image in a program that does not perform any scaling, since your browser might be doing some.
What I got out of it at 1/4 scale was a uniform grey square.
But I can fix this by converting it to and from linear RGB using the methods you already have in colors.py (the functions are named incorrectly, rgb2srgb should be srgb2rgb and vice versa):
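import cv2
import numpy as np
import torch
import torchvision
# resize comes from imresize.py; rgb2srgb and srgb2rgb come from colors.py

img = cv2.imread('gamma.jpg')
img = img * 1.0 / 255
# BGR -> RGB, HWC -> CHW
img = torch.from_numpy(np.transpose(img[:, :, [2, 1, 0]], (2, 0, 1))).float()
img = rgb2srgb(img)  # despite its name, converts sRGB to linear RGB
rlt = resize(img, 1/4)
rlt = srgb2rgb(rlt)  # despite its name, converts linear RGB back to sRGB
torchvision.utils.save_image(
    (rlt * 255).round() / 255, 'rlt.png', nrow=1, padding=0, normalize=False)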
This code snippet gives me the expected result:
While this is an artificial example that exaggerates the effect, the colour distortion is going to happen to a varying degree on any image that is transformed in non-linear gamma. I believe this decreases the accuracy of the trained models, since they learn to reverse this colour distortion, which can cause a noticeable colour shift when upscaling images that were not produced by sRGB-space downscaling.
I get this error:
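The effect can be reproduced without the repository code. The sketch below is a minimal NumPy illustration, assuming the standard piecewise sRGB transfer function and a plain box filter in place of the `resize` from imresize.py; the helper names are made up for this example. It downsamples a black/white checkerboard both ways.

```python
import numpy as np


def srgb_to_linear(s):
    """Decode sRGB-encoded values in [0, 1] to linear light."""
    s = np.asarray(s, dtype=np.float64)
    return np.where(s <= 0.04045, s / 12.92, ((s + 0.055) / 1.055) ** 2.4)


def linear_to_srgb(l):
    """Encode linear-light values in [0, 1] as sRGB."""
    l = np.asarray(l, dtype=np.float64)
    return np.where(l <= 0.0031308, l * 12.92, 1.055 * np.power(l, 1 / 2.4) - 0.055)


def box_downscale(img, factor):
    """Average non-overlapping factor x factor blocks (a simple box filter)."""
    h, w = img.shape[:2]
    h2, w2 = h // factor, w // factor
    img = img[:h2 * factor, :w2 * factor]
    return img.reshape(h2, factor, w2, factor, *img.shape[2:]).mean(axis=(1, 3))


# 8x8 checkerboard of pure black (0.0) and pure white (1.0) pixels.
board = (np.indices((8, 8)).sum(axis=0) % 2).astype(np.float64)

# Averaging the encoded values directly gives a mid grey that is too dark.
naive = box_downscale(board, 4) * 255

# Averaging in linear light and re-encoding preserves perceived brightness.
gamma_aware = linear_to_srgb(box_downscale(srgb_to_linear(board), 4)) * 255

print(naive[0, 0], gamma_aware[0, 0])
```

The naive result lands at 127.5 while the linear-light result lands near 188, which matches the two "half as bright" values quoted earlier in this report.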
File "C:\ManduScale\Train\codes\train.py", line 500, in <module>
main()
File "C:\ManduScale\Train\codes\train.py", line 487, in main
dataloaders, data_params = get_dataloaders(opt)
File "C:\ManduScale\Train\codes\train.py", line 134, in get_dataloaders
dataset = create_dataset(dataset_opt)
File "C:\ManduScale\Train\codes\data\__init__.py", line 79, in create_dataset
dataset = D(dataset_opt)
File "C:\ManduScale\Train\codes\data\aligned_dataset.py", line 41, in __init__
self.A_paths, self.B_paths = get_dataroots_paths(self.opt, strict=False, keys_ds=self.keys_ds)
File "C:\ManduScale\Train\codes\data\base_dataset.py", line 235, in get_dataroots_paths
paths_A, paths_B = read_dataroots(opt, keys_ds=keys_ds)
File "C:\ManduScale\Train\codes\data\base_dataset.py", line 168, in read_dataroots
paths_A, paths_B = paired_dataset_validation(A_images_paths, B_images_paths,
File "C:\ManduScale\Train\codes\data\base_dataset.py", line 99, in paired_dataset_validation
A_paths = get_image_paths(data_type, paths[0], max_dataset_size) # get image paths
File "C:\ManduScale\Train\codes\dataops\common.py", line 82, in get_image_paths
paths = sorted(_get_paths_from_images(dataroot, max_dataset_size=max_dataset_size))
File "C:\ManduScale\Train\codes\dataops\common.py", line 43, in _get_paths_from_images
assert images, '{:s} has no valid image file'.format(path)
AssertionError: C:\ManduScale\OPScale\DataSet\FourthSet\LR1.lmdb has no valid image file
My config is:
dataroot_HR: ['C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
'C:\ManduScale\OPScale\DataSet\FourthSet\HR',
]
dataroot_LR: ['C:\ManduScale\OPScale\DataSet\FourthSet\LR1.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR2.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR3.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR4.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR5.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR6.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR7.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR8.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR9.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR10.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR11.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR12.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR13.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR14.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR15.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR16.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR17.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR18.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR19.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR20.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR22.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR23.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR24.lmdb',
'C:\ManduScale\OPScale\DataSet\FourthSet\LR25.lmdb',
]
Can anyone help me fix this issue?
The first error is that it forces the image size to a multiple of 4 even when the scale is 1.
Secondly, even though it has cropped/expanded the image, it still gives this error, as it does not scale the corresponding LR image.
LOGS:
The image size needs to be a multiple of 4. The loaded image size was (817, 398), so it was adjusted to (816, 400). This adjustment will be done to all images whose sizes are not multiples of 4.
The image size needs to be a multiple of 4. The loaded image size was (476, 485), so it was adjusted to (476, 484). This adjustment will be done to all images whose sizes are not multiples of 4.
Traceback (most recent call last):
File "train.py", line 417, in <module>
main()
File "train.py", line 413, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "train.py", line 215, in fit
for n, train_data in enumerate(dataloaders['train'], start=1):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 517, in __next__
data = self._next_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
data.reraise()
File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 429, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in <dictcomp>
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 400, 816] at entry 0 and [3, 560, 464] at entry 1
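The two size adjustments in the log above are consistent with rounding each dimension to the nearest multiple of 4. The sketch below reproduces just that arithmetic. It is an assumption inferred from the two logged pairs, not the repository's actual code, and `nearest_multiple` and `adjusted_size` are hypothetical names.

```python
def nearest_multiple(value, multiple=4):
    """Round value to the nearest multiple (ties round up; assumed behaviour)."""
    return (value + multiple // 2) // multiple * multiple


def adjusted_size(size, multiple=4):
    """Apply the rounding independently to each dimension of a size tuple."""
    return tuple(nearest_multiple(v, multiple) for v in size)


# Original and adjusted sizes exactly as they appear in the log.
logged = {(817, 398): (816, 400), (476, 485): (476, 484)}
results = {size: adjusted_size(size) for size in logged}

for size, new_size in results.items():
    print(size, "->", new_size)
```

Note that the adjustment is applied per image, so two images can still end up with different shapes. That is what the final RuntimeError reports: tensors of size [3, 400, 816] and [3, 560, 464] cannot be stacked into one batch.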
When using nearest_aligned the output is noticeably shifted down and to the right. This affects my models severely and causes noticeable warping in their output.
I used this code in augmentations.py to produce the output images:
if __name__ == '__main__':
    img = cv2.imread('test.png')
    img_A, _ = Scale(img=img, scale=4, algo=997, ds_kernel=None, img_type='cv2')
    cv2.imwrite('output.png', img_A)
Output from nearest_aligned as per the above code:
Output from convert test.png -interpolate Average -filter point -resize 25% magick-nearest.png:
Explicitly sampling the top left corner closely matches the offset from nearest_aligned:
convert test.png -define sample:offset=0%x0% -sample 25% magick-sampled-top-left.png
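The size of the offset that top-left sampling introduces can be worked out on a toy grid. This is a minimal NumPy sketch under the assumption that the downscaler keeps exactly one source pixel per 4x4 block; it does not call the repository's `Scale` function, and `nearest_downscale` is a hypothetical helper.

```python
import numpy as np


def nearest_downscale(img, factor, offset):
    """Keep one pixel per factor x factor block, taken at `offset` inside the block."""
    return img[offset::factor, offset::factor]


# Each pixel stores its own column index, so the sampled positions are visible.
cols = np.tile(np.arange(16), (16, 1))

top_left = nearest_downscale(cols, 4, offset=0)  # samples columns 0, 4, 8, 12
centered = nearest_downscale(cols, 4, offset=2)  # samples columns 2, 6, 10, 14

# True centre of each 4-pixel block, in source pixel coordinates.
block_centers = np.arange(4) * 4 + 1.5

shift_top_left = block_centers - top_left[0]
shift_centered = block_centers - centered[0]

print(top_left[0], shift_top_left)
print(centered[0], shift_centered)
```

Sampling the top-left pixel reads 1.5 source pixels before each block centre, so image content appears displaced down and to the right in the output. Sampling near the centre leaves only a half-pixel error, which may explain why the centred ImageMagick result looks aligned.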