Comments (6)
https://github.com/gordicaleksa/pytorch-nst-feedforward I've implemented the paper without the tanh activation and it works like a charm!
Could you maybe link this repo as a PyTorch implementation of Johnson's original paper? I've documented the differences from the original paper in the transformer_net.py file.
from fast-neural-style.
I would think it is connected to the practice back then of using the 0..255 pixel range directly in models. E.g., the original neural-style used a VGG model that way.
This issue seems to support my guess:
#112
I'm aware that Caffe used BGR images in the 0..255 range, normalized by subtracting ImageNet's mean ([123.675, 116.28, 103.53]), to train VGG models.
PyTorch, which I'm using, instead uses RGB images in the 0..1 range, normalized with both ImageNet's mean ([0.485, 0.456, 0.406]) and std ([0.229, 0.224, 0.225]), to train VGG nets. The means are the same numbers, just scaled depending on whether you use the 0..1 or 0..255 range.
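For what it's worth, the two sets of means line up exactly; a quick sanity check in plain Python (numbers taken from the comments above):

```python
# The PyTorch-style ImageNet mean (0..1 range) times 255 reproduces the
# Caffe-style mean (0..255 range); channel order aside, same numbers.
pytorch_mean = [0.485, 0.456, 0.406]    # RGB, 0..1 range
caffe_mean = [123.675, 116.28, 103.53]  # listed in RGB order here for comparison

scaled = [round(m * 255, 3) for m in pytorch_mean]
print(scaled == caffe_mean)  # True
```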
But that still doesn't explain the magic number 150. 127.5 would make more sense: then you could simply shift the model output from [-127.5, 127.5] to [0, 255].
What I think happened here is that Johnson tried to emulate (using this 150*tanh output activation) what would happen if you took a [0, 255] image and subtracted ImageNet's mean from it. That way you can directly feed the output of the transformer net into the perceptual net (VGG).
Namely, if you take a 0..255 image and subtract [123.675, 116.28, 103.53], the biggest possible value is 151.47 and the smallest is -123.675 (assuming we have 255s and 0s in the right channels). So that's kinda -150, 150...
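The arithmetic behind that range, as a quick check (plain Python; the mean values are from the discussion above):

```python
# Extremes of a mean-subtracted 0..255 pixel: a 0 in the channel with the
# largest mean gives the minimum; a 255 in the channel with the smallest
# mean gives the maximum.
mean = [123.675, 116.28, 103.53]

lo = round(min(0 - m for m in mean), 3)
hi = round(max(255 - m for m in mean), 3)
print(lo, hi)  # -123.675 151.47
```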
That way you can directly feed the output of the transformer net into the perceptual net (VGG).
That's about what I was thinking too. But 150 may also just have been a value that appeared to work. Note that it is in fact a command-line option.
Do we have a way of pinging Johnson about this? It'd be nice to confirm the meaning of that magic number and whether our hypothesis is true. @jcjohnson
It seems to me that he hasn't been active here for a long time.
The logical place to start is to consult the paper https://cs.stanford.edu/people/jcjohns/papers/eccv16/JohnsonECCV16.pdf .
From section 3.1:
"All nonresidual convolutional layers are followed by batch normalization [50] and ReLU nonlinearities with the exception of the output layer, which instead uses a scaled tanh to ensure that the output has pixels in the range [0, 255]."
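Note that the paper's wording ("pixels in the range [0, 255]") and the 150*tanh discussed above (which yields roughly [-150, 150], i.e. mean-subtracted pixels) describe slightly different conventions. A minimal sketch of the scaled-tanh idea itself, as a hypothetical stand-in rather than the repo's actual code (the scale would come from the command-line option mentioned above):

```python
import math

def scaled_tanh(x, scale=150.0):
    # tanh squashes any activation into (-1, 1); multiplying by the scale
    # stretches that to (-scale, scale), roughly matching the mean-subtracted
    # 0..255 pixel values the perceptual net (VGG) expects.
    return scale * math.tanh(x)

for x in (-10.0, 0.0, 10.0):
    y = scaled_tanh(x)
    assert -150.0 <= y <= 150.0  # saturation keeps outputs inside the range
print(scaled_tanh(10.0))  # just under 150.0, since tanh saturates near 1
```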