As far as we have determined so far, SPADE + decoder works reasonably well, but the non-local block that generates the attention performs very poorly. And …


Train the correspondence subnet independently (spade_colorization issue, 3 comments, closed)

ThisIsIsaac commented on July 20, 2024

Train the correspondence subnet independently

from spade_colorization.

Comments (3)

ThisIsIsaac commented on July 20, 2024

The following transforms from torchvision.transforms should be enough:

RandomRotation
RandomAffine
RandomCrop
Resize
ColorJitter

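Every filter / interpolation option must be set to PIL.Image.NEAREST. Nearest picks a single closest pixel instead of blending neighbours, and that is what keeps the pixel mapping recoverable. Source: the torchvision.transforms documentation.

Add a small amount of random Gaussian noise on top of this and we are done!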

Q. Can both the target and the ref be augmented at the same time?

Yes. We first augment the reference. Then we create the target by applying additional augmentation to the augmented ref.

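We currently apply bilinear interpolation at 64x64. Is that not a problem as well?

In that case the mapping has to be patch-wise rather than pixel-wise. For example, if the input is a 256x256 image and the warping matrix is 64x64, split the 256x256 image into 4x4 patches and give every pixel in a patch the same index value. Augment this index tensor as described above, then collapse each patch to a single value (e.g. every pixel of a 4x4 patch holds the same number, so the patch can simply be replaced by that one number).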
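The patch-wise index tensor described above could be sketched as follows. This is only an illustration in NumPy, not the project's code: the helper names `make_patch_index` and `collapse_patches` are made up here, patch ids are assumed to be flat integers in row-major order, and a horizontal flip stands in for the augmentation step.

```python
import numpy as np

def make_patch_index(size=256, grid=64):
    """Index map of shape (size, size): every pixel of a patch holds that patch's id."""
    patch = size // grid                                  # 4 pixels per patch side
    ids = np.arange(grid * grid).reshape(grid, grid)      # assumed row-major patch ids
    # np.kron repeats each id over a (patch x patch) block
    return np.kron(ids, np.ones((patch, patch), dtype=ids.dtype))

def collapse_patches(index_map, grid=64):
    """Replace each patch by its single value, checking that the patch is constant."""
    patch = index_map.shape[0] // grid
    blocks = index_map.reshape(grid, patch, grid, patch)
    if not (blocks == blocks[:, :1, :, :1]).all():
        raise ValueError("a patch contains more than one index value")
    return blocks[:, 0, :, 0]

index_map = make_patch_index()          # (256, 256), constant inside each 4x4 patch
augmented = index_map[:, ::-1]          # stand-in augmentation: horizontal flip
gt_index = collapse_patches(augmented)  # (64, 64) ground-truth patch ids
```

The consistency check in `collapse_patches` fails as soon as an augmentation mixes two ids inside one patch, which is a cheap way to detect an augmentation that is not patch-aligned.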

---

Shuffle the patches around within the same photo


ThisIsIsaac commented on July 20, 2024

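Training the correspondence matrix part separately

It seems worthwhile to train the correspondence subnet entirely on its own. If data augmentation is enough for it to generalize, a partially trained subnet could later be attached to the main network and trained further, which may move things in the direction we want.

Data generation

Augment the original image to create the ref.
Create the Index RGB.
Augment the ref to create the target. Augment the Index RGB in exactly the same way as when creating the target to obtain the GT Index RGB.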
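The three steps above can be sketched as below. Several things are assumptions made for illustration: the Index RGB is taken to store the row in channel 0 and the column in channel 1 (the thread does not specify the encoding), only lossless 90-degree rotations and horizontal flips are used as augmentations, and all function names are hypothetical.

```python
import numpy as np

def make_index_rgb(h, w):
    """Assumed encoding: channel 0 = row, channel 1 = column, channel 2 unused."""
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([rows, cols, np.zeros_like(rows)], axis=-1)

def sample_params(rng):
    """Draw one set of augmentation parameters."""
    return {"k": int(rng.integers(0, 4)), "flip": bool(rng.integers(0, 2))}

def apply_aug(arr, params):
    """Rotate by k * 90 degrees, then optionally flip horizontally."""
    out = np.rot90(arr, k=params["k"], axes=(0, 1))
    if params["flip"]:
        out = out[:, ::-1]
    return out

def make_training_triplet(original, rng):
    ref = apply_aug(original, sample_params(rng))   # step 1: original -> ref
    index_rgb = make_index_rgb(*ref.shape[:2])      # step 2: Index RGB of the ref
    params = sample_params(rng)                     # step 3: one draw, applied twice
    target = apply_aug(ref, params)
    gt_index = apply_aug(index_rgb, params)
    return ref, target, gt_index

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
ref, target, gt_index = make_training_triplet(original, rng)
# Looking up the ref at the GT coordinates reproduces the target
recovered = ref[gt_index[..., 0], gt_index[..., 1]]
```

The key point is that the parameters are drawn once and reused, so the image and the index map undergo exactly the same transform.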

augmentation

Key points:

Every pixel in the target must also exist in the ref.
Without nearest interpolation, a one-to-one mapping at the pixel level is impossible. Nearest can hurt image quality, however, so we will avoid using it where possible.
The network has to learn all of the following ref-to-target pixel mappings: one-to-one (every target pixel has exactly one corresponding ref pixel), one-to-many (one ref pixel corresponds to several target pixels), and one-to-none (some ref pixel corresponds to no target pixel at all).
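To check which of the three mapping types a generated sample actually contains, one could count how often each ref pixel is referenced by the ground-truth index map. The sketch below assumes the GT index stores one flat ref-pixel id per target pixel; `mapping_stats` is a hypothetical helper, not something from the repository.

```python
import numpy as np

def mapping_stats(gt_index, num_ref_pixels):
    """Count ref pixels by how many target pixels point at them."""
    hits = np.bincount(gt_index.ravel(), minlength=num_ref_pixels)
    return {
        "one_to_none": int((hits == 0).sum()),   # ref pixels never used
        "one_to_one": int((hits == 1).sum()),    # ref pixels used exactly once
        "one_to_many": int((hits > 1).sum()),    # ref pixels used several times
    }

# 2x2 ref (ids 0..3); the target shows ref pixel 0 three times and ref pixel 3 once
gt_index = np.array([[0, 0],
                     [0, 3]])
stats = mapping_stats(gt_index, num_ref_pixels=4)
```

A dataset in which `one_to_none` and `one_to_many` are always zero only ever teaches the one-to-one case.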

augmentation methods:

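Color distortion
90-degree rotation (with any other angle, content can end up outside the actual image frame)
vertical / horizontal flipping
random cropping
resize(256, 256): @DongHwanJang @kyumaze this part needs discussion. When training the subnet separately, do the target and the ref really have to be the same size? If not, we gain meaningful freedom in augmentation.
tiling: the augmentations above can only produce one-to-one mappings. To get past that, tiling seems essential. Tiling means splitting the ref image into several tiles and building the target image from various combinations of those tiles. For example, split a 256x256 image into four 128x128 tiles, then either pick one tile and paste it four times to form the target (one-to-many and one-to-none mappings), or shuffle the four tiles to form the target.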
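A minimal sketch of the tiling idea, assuming a 256x256 ref split into a 2x2 grid (four tiles) and a flat integer index map as ground truth. The helper names `split_tiles` and `assemble` are made up for this example, and tile borders are not treated in any special way.

```python
import numpy as np

def split_tiles(arr, n=2):
    """Split an array into an n x n grid of tiles, listed in row-major order."""
    h, w = arr.shape[:2]
    th, tw = h // n, w // n
    return [arr[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(n) for c in range(n)]

def assemble(tiles, order, n=2):
    """Build an image whose grid cell i holds tiles[order[i]]."""
    rows = [np.concatenate([tiles[i] for i in order[r * n:(r + 1) * n]], axis=1)
            for r in range(n)]
    return np.concatenate(rows, axis=0)

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
index = np.arange(256 * 256).reshape(256, 256)   # flat ref-pixel ids

img_tiles, idx_tiles = split_tiles(ref), split_tiles(index)

# Variant 1: paste one tile four times (one-to-many and one-to-none)
repeat_order = [1, 1, 1, 1]
target_repeat = assemble(img_tiles, repeat_order)
gt_repeat = assemble(idx_tiles, repeat_order)

# Variant 2: shuffle the four tiles (still one-to-one, but no longer smooth)
shuffle_order = rng.permutation(4).tolist()
target_shuffle = assemble(img_tiles, shuffle_order)
gt_shuffle = assemble(idx_tiles, shuffle_order)
```

Applying the same `order` to the image tiles and the index tiles keeps the ground truth aligned with the target.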

losses

pixel-wise softmax
discriminator: target vs fake
VGG perceptual: target vs fake
L1/L2: target GT vs fake
smoothness loss
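One possible reading of the "pixel-wise softmax" item is a cross-entropy in which every target pixel classifies over all ref pixels, with the GT index as the label. The NumPy sketch below illustrates that interpretation only; the thread does not confirm it, and `pixelwise_softmax_loss` and the `(num_target, num_ref)` score layout are assumptions.

```python
import numpy as np

def pixelwise_softmax_loss(scores, gt_index):
    """Mean cross-entropy over target pixels.

    scores:   (num_target, num_ref) similarity logits, one row per target pixel
    gt_index: (num_target,) id of the correct ref pixel for each target pixel
    """
    shifted = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(gt_index)), gt_index]
    return float(-picked.mean())

gt_index = np.array([0, 3, 5, 7])

# Uninformative scores: every ref pixel is equally likely
uniform_loss = pixelwise_softmax_loss(np.zeros((4, 8)), gt_index)

# Confident and correct scores: the loss should be close to zero
confident_loss = pixelwise_softmax_loss(50.0 * np.eye(8)[gt_index], gt_index)
```

With uniform scores the loss equals the log of the number of ref pixels, which gives a useful reference value when monitoring training.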


ThisIsIsaac commented on July 20, 2024

Baseline after the Feb 9 meeting:

Augmentation

scaling w/ nearest
cropping (since we can scale to the same size)
horizontal flipping (Not vertical flipping because most photos have gravitational bias)
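The baseline could look roughly like the sketch below: crop, scale back to a common size with nearest-neighbour lookup, then optionally flip horizontally, with the same parameters applied to the image and to a flat index map. The function names, the parameter dictionary, and the concrete crop values are illustrative assumptions.

```python
import numpy as np

def scale_nearest(arr, out_h, out_w):
    """Nearest-neighbour resize by index lookup; no new pixel values are created."""
    h, w = arr.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return arr[rows[:, None], cols[None, :]]

def baseline_aug(arr, params, out_size):
    """Crop, scale back to out_size with nearest, then optionally flip horizontally."""
    top, left, size = params["top"], params["left"], params["size"]
    out = arr[top:top + size, left:left + size]
    out = scale_nearest(out, out_size, out_size)
    if params["hflip"]:
        out = out[:, ::-1]
    return out

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
index = np.arange(64 * 64).reshape(64, 64)       # flat ref-pixel ids

params = {"top": 8, "left": 16, "size": 32, "hflip": True}   # example values
target = baseline_aug(ref, params, out_size=64)
gt_index = baseline_aug(index, params, out_size=64)
```

In this example the 32x32 crop is upscaled by a factor of 2, so each ref pixel inside the crop appears four times in the target and every ref pixel outside the crop appears nowhere.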

Tiling is on hold because the borders will be too difficult to handle

Todo

What is a good scaling factor?
Try tiling?
