
dped's Introduction

DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks



The provided code implements the paper, which presents an end-to-end deep learning approach for translating ordinary photos from smartphones into DSLR-quality images. The learned model can be applied to photos of arbitrary resolution, and the methodology itself generalizes to any type of digital camera. More visual results can be found here.

2. Prerequisites

Python with NumPy and SciPy, and TensorFlow (the code uses the TensorFlow 1.x API). An Nvidia GPU with CUDA/cuDNN is recommended for training; testing can also be run on the CPU (see the use_gpu parameter below).

3. First steps

  • Download the pre-trained VGG-19 model Mirror and put it into vgg_pretrained/ folder
  • Download DPED dataset (patches for CNN training) and extract it into dped/ folder.
    This folder should contain three subfolders: sony/, iphone/ and blackberry/ (the expected layout is shown below)
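
After these two steps, the project root is assumed to look roughly like this (only the items mentioned above are listed; each phone folder additionally contains the downloaded training patches):

dped/
    blackberry/
    iphone/
    sony/
vgg_pretrained/
    imagenet-vgg-verydeep-19.mat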

4. Train the model

python train_model.py model=<model>

Obligatory parameters:

model: iphone, blackberry or sony

Optional parameters and their default values:

batch_size: 50   -   batch size [smaller values can lead to unstable training]
train_size: 30000   -   the number of training patches randomly loaded every eval_step iterations
eval_step: 1000   -   every eval_step iterations the model is saved and the training data is reloaded
num_train_iters: 20000   -   the number of training iterations
learning_rate: 5e-4   -   learning rate
w_content: 10   -   the weight of the content loss
w_color: 0.5   -   the weight of the color loss
w_texture: 1   -   the weight of the texture [adversarial] loss
w_tv: 2000   -   the weight of the total variation loss
dped_dir: dped/   -   path to the folder with DPED dataset
vgg_dir: vgg_pretrained/imagenet-vgg-verydeep-19.mat   -   path to the pre-trained VGG-19 network

Example:

python train_model.py model=iphone batch_size=50 dped_dir=dped/ w_color=0.7
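
The four loss weights above balance the individual terms of the training objective. As a minimal sketch, assuming the documented weights simply combine into a weighted sum of the loss terms (illustrative Python, not the exact code in train_model.py):

def total_generator_loss(loss_content, loss_texture, loss_color, loss_tv,
                         w_content=10, w_texture=1, w_color=0.5, w_tv=2000):
    # weighted sum of the individual loss terms, using the default weights listed above
    return (w_content * loss_content + w_texture * loss_texture
            + w_color * loss_color + w_tv * loss_tv)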

5. Test the provided pre-trained models

python test_model.py model=<model>

Obligatory parameters:

model: iphone_orig, blackberry_orig or sony_orig

Optional parameters:

test_subset: full,small   -   all 29 or only 5 test images will be processed
resolution: orig,high,medium,small,tiny   -   the resolution of the test images [orig means original resolution]
use_gpu: true,false   -   run models on GPU or CPU
dped_dir: dped/   -   path to the folder with DPED dataset

Example:

python test_model.py model=iphone_orig test_subset=full resolution=orig use_gpu=true

6. Test the obtained models

python test_model.py model=<model>

Obligatory parameters:

model: iphone, blackberry or sony

Optional parameters:

test_subset: full,small   -   all 29 or only 5 test images will be processed
iteration: all or <number>   -   get visual results for all iterations or for the specific iteration,
               <number> must be a multiple of eval_step
resolution: orig,high,medium,small,tiny   -   the resolution of the test images [orig means original resolution]
use_gpu: true,false   -   run models on GPU or CPU
dped_dir: dped/   -   path to the folder with DPED dataset

Example:

python test_model.py model=iphone iteration=13000 test_subset=full resolution=orig use_gpu=true

7. Folder structure

dped/   -   the folder with the DPED dataset
models/   -   logs and models that are saved during the training process
models_orig/   -   the provided pre-trained models for iphone, sony and blackberry
results/   -   visual results for small image patches that are saved while training
vgg_pretrained/   -   the folder with the pre-trained VGG-19 network
visual_results/   -   processed [enhanced] test images

load_dataset.py   -   python script that loads training data
models.py   -   architecture of the image enhancement [resnet] and adversarial networks
ssim.py   -   implementation of the ssim score
train_model.py   -   implementation of the training procedure
test_model.py   -   applying the pre-trained models to test images
utils.py   -   auxiliary functions
vgg.py   -   loading the pre-trained vgg-19 network
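
As a hedged illustration of the data vgg.py works with (a sketch, not the repository's actual loading code), the MatConvNet VGG-19 .mat file can be opened with SciPy; treating the 'layers' field as a flat list of layer records is an assumption about the MatConvNet export format:

import scipy.io

data = scipy.io.loadmat('vgg_pretrained/imagenet-vgg-verydeep-19.mat')
layers = data['layers'][0]   # assumed: one record per VGG-19 layer (conv / relu / pool / fc)
print(len(layers))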


8. Problems and errors

What if I get an error: "OOM when allocating tensor with shape [...]"?

   Your GPU does not have enough memory. If this happens during the training process:

  • Decrease the size of the training batch [batch_size]. Note however that smaller values can lead to unstable training.

   If this happens while testing the models:

  • Run the model on CPU (set the parameter use_gpu to false). Note that this can take up to 5 minutes per image.
  • Use cropped images, set the parameter resolution to:

high   -   center crop of size 1680x1260 pixels
medium   -   center crop of size 1366x1024 pixels
small   -   center crop of size 1024x768 pixels
tiny   -   center crop of size 800x600 pixels

   The lower the resolution, the smaller the part of the image that will be processed (see the example command below).
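
For instance, to test the provided iPhone model on the CPU with a medium-sized center crop, using only the parameters documented above:

python test_model.py model=iphone_orig test_subset=small resolution=medium use_gpu=false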


9. Citation

@inproceedings{ignatov2017dslr,
  title={DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks},
  author={Ignatov, Andrey and Kobyshev, Nikolay and Timofte, Radu and Vanhoey, Kenneth and Van Gool, Luc},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={3277--3285},
  year={2017}
}

10. Any further questions?

Please contact Andrey Ignatov ([email protected]) for more information

dped's People

Contributors

aiff22


dped's Issues

Cudnn handle

When I run the test code on the GPU, I get the following error:

Testing original sony model, processing image 17.jpg
2019-07-11 01:37:25.413701: E tensorflow/stream_executor/cuda/cuda_dnn.cc:353] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-07-11 01:37:25.413830: E tensorflow/stream_executor/cuda/cuda_dnn.cc:361] Possibly insufficient driver version: 384.183.0
Segmentation fault (core dumped)

I use tensorflow-gpu 1.11.0 with CUDA 9.0.
I changed my cuDNN version from 7.4.2 to 7.0.5, but neither version worked.
Thanks for any suggestions!

How to port resnet(image) onto android devices.

Please share some insight on how to port resnet(image) from models.py onto Android devices.
I have trained and saved a model, but the model requires the output of resnet(image) as input, and I am not sure how to produce that on Android.

Thanks in advance.

A brilliant idea!

Super cool. Thank you for creating this, I am testing it now, will let you know if I have any questions 👍

Comparison with APE that is source-ignorant?

Thank you for sharing great work.
I'd like to ask a question about the comparison with APE for evaluation.

APE is a general purpose enhancement tool that does not know from what camera source the image was taken.
In my opinion, the better quality of DPED partly comes from the fact that it was trained only for specific camera sources.

Would you give me your thoughts on my opinion?
And could you share additional evaluation results if you have trained a network with all the camera sources together?

Do you have the '.meta' file?

Hi @aiff22
Thank you for sharing your work.
I want to transfer your models, but the provided pre-trained models lack the '.meta' file. Do you have it?

bug in vgg.py ?

Hi:
I downloaded the training data and the VGG network and ran train_model.py with the given command,
but I encountered the following error:

kernels, bias = weights[i][0][0][0][0]
ValueError: too many values to unpack

Am I doing something wrong with the code?

Thank you very much !

discrim_predictions

Hello, I was wondering if you could elaborate on the discriminator predictions variable. It wasn't clear to me what the line
discrim_predictions = models.adversarial(adversarial_image) returns. Is it a vector of predictions over a batch of patches, or a prediction for a single patch?
Thanks!

The order of leaky_relu layer and BN layer for D-net

Hi there,

I am looking at your code and trying to reproduce your WESPE paper. I found that in the function _conv_layer() in models.py, the BN layer comes after the leaky_relu layer, which is different from what is usually done (BN first, then ReLU). Is there a special reason for this?

By the way, will the code for the WESPE paper be released?

Thanks a lot.

Best,
Yvonne

consult for the dataset

Hello, thank you for sharing your work. I downloaded the dataset from the link you gave, but found that it is not aligned. The test set full_size_test_images only has sony, iphone and blackberry, but no canon. Could you provide full-size aligned data for training and testing? If it is convenient, it could be sent to my mailbox [email protected]. Thank you.

Quite slow on my mac machine for given models.

I am using the command mentioned.
python test_model.py model=iphone_orig test_subset=full resolution=orig use_gpu=false

But a single image takes around 24 seconds. Is there any way to improve the speed?

Error when testing the model

I ran the command below, but got an error:
python test_model.py model=iphone_orig test_subset=full resolution=orig use_gpu=true
The error is shown below; what should I do?
Testing original iphone model, processing image 1.jpg
Traceback (most recent call last):
File "test_model.py", line 50, in
image = np.float16(misc.imresize(misc.imread(test_dir + photo), res_sizes[phone])) / 255
AttributeError: 'module' object has no attribute 'imresize'
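
A likely cause, offered as an assumption rather than the repository's official fix: scipy.misc.imresize and scipy.misc.imread were removed from recent SciPy releases, so the test script fails with newer SciPy versions. A minimal substitute using Pillow (res_sizes is assumed to hold (height, width) pairs, matching the traceback):

import numpy as np
from PIL import Image

def imresize(image, size):
    # size is (height, width); PIL's Image.resize expects (width, height)
    return np.array(Image.fromarray(image).resize((size[1], size[0])))

misc.imread can similarly be replaced by imageio.imread, or an older SciPy release that still ships scipy.misc can be installed instead.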

too slow when training (not using GPU)

When training the model, i.e. executing python train_model.py model=iphone, the training seems to use the CPU instead of the GPU, even though my machine has an Nvidia 2080 Ti.
I have tried CUDA_VISIBLE_DECIVES=0 python train_model.py model=iphone, I have tried adding os.environ['CUDA_VISIBLE_DEVICES']='0' in the code, and I have also tried adding config=tf.compat.v1.ConfigProto(device_count={'GPU':0}) in the code. None of them worked.
Could anyone help me, please?

Quality improvement query

Hi,

I ran the iphone, sony and blackberry pre-trained models on some sunset and other visually attractive images.

The iphone model produces decent outputs, but the other two remove the sharpness and the output is almost completely smooth.

Could you suggest a few hyper-parameter settings I could try when training the model, so that the output has good texture and the input image is not degraded for photos taken in decent lighting conditions?

Hyper parameters of pre-trained model

Hi,

I wish to know the hyper-parameter settings of the provided pre-trained models.
I trained the network with the default parameters, but the output quality does not match that of the pre-trained model.

I've tried the following coefficients of loss function and trained the network:
Content:10.0 , tv:2000 , texture: 1.0, color:0.5
as well as
content: 1.0, tv : 400 , texture: 0.4 , color: 0.1
However, both settings produce output that is degraded compared to that of the pre-trained model.

Moreover, the paper suggests pre-training the discriminator, but the provided code does not use a pre-trained discriminator.

Please help with the above 2 queries.

Awaiting your help,

porting to mxnet problem

Hi @aiff22,

I am trying to port your code to MXNet, but there is an issue with the loss function definition:
for example, I cannot find an op corresponding to tf.image.rgb_to_grayscale in MXNet.

Maybe you can guide me. Thanks

Confusion about texture loss

The texture loss is defined as follows:

loss_discrim = -tf.reduce_sum(discrim_target * tf.log(tf.clip_by_value(discrim_predictions, 1e-10, 1.0)))
loss_texture = -loss_discrim

In this case loss_texture is a negative value, and I wonder if this is a bug. I think the correct texture loss is:

loss_discrim = -tf.reduce_sum(discrim_target * tf.log(tf.clip_by_value(discrim_predictions, 1e-10, 1.0)))
loss_texture = loss_discrim

Not good results after training

I have trained the model with your training code and dataset, using a batch size of 100 and 12000 training iterations. But after training, the results are not good: the output image has glitches and there is no significant difference between the input and output images. Kindly reply.
I have attached one output image for reference.

[attached image: blackberry_5_iteration_12000_enhanced]

How to stitch enhanced patches into a single image?

Please tell me how to stitch enhanced patches into a single image. What do you use for the stitching? Did you run into any problems with combining or stitching, or any artifacts?

Image size becomes larger

When I test images with resolution=orig, the enhanced images become larger than the original images,
e.g. the original file size is 295 KB while the enhanced one is 1.1 MB. Why do they become so large?

Another question: inference takes 60 s per image on my MacBook Pro. Is it supposed to be so slow?

One small issue after running the code

Is a TensorBoard file saved in any existing folder? I am studying TensorFlow, and the graph and statistics in TensorBoard really help me a lot. Thanks very much!

Getting NoneType in models.py

2020-03-13 00:28:09.773043: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-03-13 00:28:09.786026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-13 00:28:09.794261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]
WARNING:tensorflow:From C:\Python3\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1635: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Traceback (most recent call last):
File "train_model.py", line 57, in
enhanced = models.resnet(phone_image)
File "C:\Users\rkarnat\Python\DPED-master\DPED-master\models.py", line 13, in resnet
c2 = tf.nn.relu(_instance_norm(conv2d(c1, W2) + b2))
File "C:\Users\rkarnat\Python\DPED-master\DPED-master\models.py", line 114, in _instance_norm
batch, rows, cols, channels = [i.value for i in net.get_shape()]
File "C:\Users\rkarnat\Python\DPED-master\DPED-master\models.py", line 114, in
batch, rows, cols, channels = [i.value for i in net.get_shape()]
AttributeError: 'NoneType' object has no attribute 'value'
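
A hedged observation rather than an official fix: the failing line relies on the TensorFlow 1.x TensorShape API, where each dimension is a Dimension object with a .value attribute; with TensorFlow 2.x shape behavior, iterating a shape yields plain ints or None, hence the AttributeError. Two workarounds that may be worth trying:

import tensorflow as tf

# option 1: restore the TF 1.x-style TensorShape behavior globally (run before building the graph)
tf.compat.v1.disable_v2_tensorshape()

# option 2, inside _instance_norm in models.py, read the dimensions as a plain list:
# batch, rows, cols, channels = net.get_shape().as_list()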

collecting data

Hello @aiff22, do you think it is necessary for the different cameras to capture images at the same location when collecting data?

A single line of OpenCV seems to give a more noticeable improvement

cv2.normalize(img,dst=None,alpha=350,beta=10,norm_type=cv2.NORM_MINMAX)
You can test it yourselves; in my tests, under normal conditions OpenCV and DPED do not differ much,
but in extreme low-light conditions OpenCV's contrast stretching is better than DPED.
[attached comparison image: left original, middle OpenCV, right DPED]

CUDA_ERROR_OUT_OF_MEMORY when training the network on an Nvidia 1080 Ti

totalMemory: 11.00GiB freeMemory: 9.09GiB
2018-06-26 11:36:14.357681: I C:\Users\User\Source\Repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1312] Adding visible gpu devices: 0
2018-06-26 11:36:15.811224: I C:\Users\User\Source\Repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8806 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Initializing variables
Training network
2018-06-26 11:36:33.432391: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-26 11:36:33.590026: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-26 11:36:33.747158: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-26 11:36:33.907955: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.92G (3131030784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-26 11:36:34.064080: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.62G (2817927680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-26 11:36:34.223302: E C:\Users\User\Source\Repos\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.36G (2536134912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

How can I solve this? Thanks a lot.
