
unsupdis-pytorch's Introduction

The official implementation, written in TensorFlow 1.x, is here. The stitching pipeline follows UnsupDIS, and the networks and code organization build on YOLOv5. Both are excellent works.

This repo lets you finish the whole training process (alignment and reconstruction) within one day. It also makes real-time inference feasible.

Results

[Figure: results]

Pretrained Checkpoints

Model            COCO PSNR  COCO SSIM  COCO RMSE  UDIS PSNR  UDIS SSIM  Params(M)  GFLOPs
align-origin.tf  -          -          2.0239     23.80      0.7929     180.0      14.3
align-origin     33.95      0.9481     2.0695     26.34      0.8589     180.0      14.3
align-yolo       36.64      0.9657     1.7241     26.53      0.8641     15.0       14.5
align-variant    37.33      0.9704     1.7614     26.53      0.8622     9.7        12.3
fuse-origin      -          -          -          -          -          8.0        605.3
fuse-yolo        -          -          -          -          -          4.4        74.8

* The original model size exceeds GitHub's release limit (2GB). You are free to train a model with the provided commands.

Installation

Python>=3.6 is required, with everything in requirements.txt installed, including PyTorch>=1.7:

python3 -m pip install -r requirements.txt

Data Preparation

Download UDIS-D and WarpedCOCO (code: 1234), and create soft links to the data directories:

ln -sf /path/to/UDIS-D UDIS-D
ln -sf /path/to/WarpedCOCO WarpedCOCO

Make sure the images are organized as follows:

UDIS-D/train/input1/000001.jpg  UDIS-D/train/input2/000001.jpg  UDIS-D/test/input1/000001.jpg  UDIS-D/test/input2/000001.jpg
WarpedCOCO/training/input1/000001.jpg  WarpedCOCO/training/input2/000001.jpg  WarpedCOCO/testing/input1/000001.jpg  WarpedCOCO/testing/input2/000001.jpg
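
A quick way to confirm the layout before training is to check that every split has both input1/ and input2/, and that each image has a partner with the same file name. The sketch below is not part of this repository: the function names are hypothetical, and it assumes that paired images share a file name, as the 000001.jpg examples above suggest.

```python
from pathlib import Path

# Split names as given in the layout above.
LAYOUT = {
    "UDIS-D": ("train", "test"),
    "WarpedCOCO": ("training", "testing"),
}


def jpg_names(directory):
    """File names of the .jpg images in `directory` (empty set if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return set()
    return {p.name for p in directory.glob("*.jpg")}


def check_layout(root="."):
    """Collect human-readable problems; an empty list means the layout looks right."""
    problems = []
    for dataset, splits in LAYOUT.items():
        for split in splits:
            base = Path(root) / dataset / split
            for sub in ("input1", "input2"):
                if not (base / sub).is_dir():
                    problems.append(f"missing directory: {base / sub}")
            # Assumption: a pair is input1/<name> and input2/<name>.
            unpaired = jpg_names(base / "input1") ^ jpg_names(base / "input2")
            for name in sorted(unpaired):
                problems.append(f"unpaired image in {base}: {name}")
    return problems
```

Calling check_layout() from the repository root after creating the soft links returns a list of problems, which is empty when nothing is missing.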

Training, Testing, and Inference

Run the commands below to go through the whole process of unsupervised deep image stitching. Some alternative commands are listed in main.sh.

Download the pretrained backbones (YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x) and put them in the weights/ directory first. You can modify depth_multiple and width_multiple in models/*.yaml to choose which backbone to use.
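
For reference, the sketch below lists the two multiples used by the upstream YOLOv5 configs and shows how upstream YOLOv5 applies them to a layer's repeat count and channel width. Whether models/*.yaml in this repo follows exactly the same convention is an assumption, and the helper names are hypothetical.

```python
import math

# (depth_multiple, width_multiple) from the upstream YOLOv5 model configs.
YOLOV5_MULTIPLES = {
    "yolov5s": (0.33, 0.50),
    "yolov5m": (0.67, 0.75),
    "yolov5l": (1.00, 1.00),
    "yolov5x": (1.33, 1.25),
}


def scaled_depth(n, depth_multiple):
    """Repeat count for a block declared with `n` repeats (single blocks stay single)."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scaled_width(channels, width_multiple, divisor=8):
    """Output channels after scaling, rounded up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor


if __name__ == "__main__":
    for name, (depth, width) in YOLOV5_MULTIPLES.items():
        print(name, scaled_depth(9, depth), scaled_width(1024, width))
```

The multiples in the YAML file should match the checkpoint passed to --weights (for example yolov5x.pt in Step 1 and yolov5m.pt in Step 5); otherwise the layer shapes are unlikely to line up with the pretrained weights.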

Step 1 (Alignment): Unsupervised pre-training on Stitched MS-COCO

python3 train.py --data data/warpedcoco.yaml --hyp data/hyp.align.scratch.yaml --cfg models/align_yolo.yaml --weights weights/yolov5x.pt --batch-size 16 --img-size 128 --epochs 150 --adam --device 0 --mode align
mv runs/train/exp weights/align/warpedcoco

Step 2 (Alignment): Unsupervised finetuning on UDIS-D

python3 train.py --data data/udis.yaml --hyp data/hyp.align.finetune.udis.yaml --cfg models/align_yolo.yaml --weights weights/align/warpedcoco/weights/best.pt --batch-size 16 --img-size 128 --epochs 50 --adam --device 0 --mode align
mv runs/train/exp weights/align/udis

Step 3 (Alignment): Evaluating and visualizing the alignment results

# RMSE on WarpedCOCO
python3 inference_align.py --source data/warpedcoco.yaml --weights weights/align/warpedcoco/weights/best.pt --task val --rmse
# PSNR on WarpedCOCO
python3 test.py --data data/warpedcoco.yaml --weights weights/align/warpedcoco/weights/best.pt --batch-size 64 --img-size 128 --task val --device 0 --mode align
# PSNR on UDIS-D
python3 test.py --data data/udis.yaml --weights weights/align/udis/weights/best.pt --batch-size 64 --img-size 128 --task val --device 0 --mode align
# Visualization on UDIS-D
python3 inference_align.py --source data/udis.yaml --weights weights/align/udis/weights/best.pt --task val --visualize
rm -r runs/infer/ runs/test/
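
To make the PSNR columns easier to interpret, here is the textbook definition of PSNR for 8-bit pixel values, written with the standard library only. It is a generic sketch, not the code in test.py: how this repo restricts the comparison to the overlapping region or normalizes pixel values is not covered here, and the function name is hypothetical.

```python
import math


def psnr(a, b, max_value=255.0):
    """PSNR in dB between two equally sized, flat sequences of pixel values."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher is better: halving the per-pixel error raises PSNR by about 6 dB.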

Step 4 (Alignment): Generating the coarsely aligned image pairs

python3 inference_align.py --source data/udis.yaml --weights weights/align/udis/weights/best.pt --task train
python3 inference_align.py --source data/udis.yaml --weights weights/align/udis/weights/best.pt --task test
mkdir UDIS-D/warp
mv runs/infer/exp UDIS-D/warp/train
mv runs/infer/exp2 UDIS-D/warp/test
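
The two mv commands assume the train run landed in runs/infer/exp and the test run in runs/infer/exp2, which only holds if runs/infer/ was empty beforehand (the cleanup at the end of Step 3 takes care of that). If you are unsure, the sketch below lists the run directories in creation order. It assumes the exp, exp2, exp3, ... naming visible in the commands above, and the helper name is hypothetical.

```python
import re
from pathlib import Path


def run_dirs(root):
    """Return run directories under `root`, ordered exp, exp2, exp3, ..."""
    found = []
    for path in Path(root).iterdir():
        match = re.fullmatch(r"exp(\d*)", path.name)
        if path.is_dir() and match:
            # A bare "exp" is the first run; "expN" is the N-th.
            index = int(match.group(1)) if match.group(1) else 1
            found.append((index, path))
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```

For example, run_dirs("runs/infer")[-2:] gives the two most recent runs, which should be the train and test outputs if Step 4 was run in the order shown.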

Step 5 (Reconstruction): Training the reconstruction model on UDIS-D

python3 train.py --data data/udis.yaml --hyp data/hyp.fuse.scratch.yaml --cfg models/fuse_yolo.yaml --weights weights/yolov5m.pt --batch-size 4 --img-size 640 --epochs 30 --adam --device 0 --mode fuse --reg-mode crop
mv runs/train/exp weights/fuse/udis

Step 6 (Reconstruction): Generating the finally stitched results

python3 inference_fuse.py --weights weights/fuse/udis/weights/best.pt --source data/udis.yaml --task test --half --img-size 640 --reg-mode crop

TODO

  • FP16 Compatibility: FP16 data-type may cause strange values in division operations.
  • Fuse Optimization: the hyperparameters and loss functions are not yet well tuned.
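
One common source of odd values in FP16 division, offered here only as a possible cause and not as a diagnosis of this repo, is a small epsilon in the denominator that underflows to zero in half precision. The sketch below reproduces that with the standard library alone, using the half-precision format of the struct module. Both helper names and the choice of 1e-4 as epsilon are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def safe_divide_fp16(num, den, eps=1e-4):
    """Divide after clamping |den| to an epsilon that survives the cast to FP16."""
    den = to_fp16(den)
    eps = to_fp16(eps)
    if abs(den) < eps:
        den = eps if den >= 0 else -eps
    return to_fp16(num) / den


if __name__ == "__main__":
    # An epsilon of 1e-8 is below the FP16 subnormal range and becomes 0.0,
    # so "x / (y + 1e-8)" no longer protects against y == 0.
    print(to_fp16(1e-8))
    print(to_fp16(1e-4))
    print(safe_divide_fp16(1.0, 0.0))
```

Keeping the division itself in FP32 and casting the result back is another option when the denominator can be very small.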

Comparison and Discussion

To improve flexibility and speed up training, we reimplemented the method in PyTorch. We also adjusted the networks, loss functions, data augmentation, and a considerable number of hyperparameters. Taking the network as an example, in the alignment phase we replaced the CostVolume module with a simple Concat module, because the former is either time-consuming or memory-consuming and can even lead to divergence. The alignment training process may still break for unknown reasons. This repository is far from perfect, and I hope you can help me complete the project. Contact me at [email protected] if you have any problems or suggestions.

Contributors

liudakai2, tgjjj
