
deoldify's Introduction

DeOldify

Quick Start: The easiest way to colorize images using open source DeOldify (for free!) is here: DeOldify Image Colorization on DeepAI

Desktop: Want to run open source DeOldify for photos and videos on the desktop?

The most advanced version of DeOldify image colorization is available here, exclusively. Try a few images for free! MyHeritage In Color

Huggingface Web Demo: Integrated into Hugging Face Spaces with Gradio. See demo: Hugging Face Spaces

Replicate: Image: | Video:


Image (artistic) Colab for images | Video Colab for video

Having trouble with the default image colorizer, aka "artistic"? Try the "stable" one below. It generally won't produce colors that are as interesting as "artistic", but the glitches are noticeably reduced.

Image (stable) Colab for stable model

Instructions on how to use the Colabs above have been kindly provided in video tutorial form by Old Ireland in Colour's John Breslin. It's great! Click video image below to watch.

DeOldify Tutorial

Get more updates on Twitter.

About DeOldify

Simply put, the mission of this project is to colorize and restore old images and film footage. We'll get into the details in a bit, but first let's see some pretty pictures and videos!

New and Exciting Stuff in DeOldify

  • Glitches and artifacts are almost entirely eliminated
  • Better skin (less zombies)
  • More highly detailed and photorealistic renders
  • Much less "blue bias"
  • Video - it actually looks good!
  • NoGAN - a new and weird but highly effective way to do GAN training for image to image.

Example Videos

Note: Click images to watch

Facebook F8 Demo

DeOldify Facebook F8 Movie Colorization Demo

Silent Movie Examples

DeOldify Silent Movie Examples

Example Images

"Migrant Mother" by Dorothea Lange (1936)

Migrant Mother

Woman relaxing in her livingroom in Sweden (1920)

Sweden Living Room

"Toffs and Toughs" by Jimmy Sime (1937)

Class Divide

Thanksgiving Maskers (1911)

Thanksgiving Maskers

Glen Echo Madame Careta Gypsy Camp in Maryland (1925)

Gypsy Camp

"Mr. and Mrs. Lemuel Smith and their younger children in their farm house, Carroll County, Georgia." (1941)

Georgia Farmhouse

"Building the Golden Gate Bridge" (est 1937)

Golden Gate Bridge

Note: While this render looks cool, you might be wondering whether the colors are accurate. The original photo certainly makes it look like the towers of the bridge could be white. We looked into this and it turns out the answer is no - the towers were already covered in red primer by this time. So that's something to keep in mind- historical accuracy remains a huge challenge!

"Terrasse de café, Paris" (1925)

Cafe Paris

Norwegian Bride (est late 1890s)

Norwegian Bride

Zitkála-Šá (Lakota: Red Bird), also known as Gertrude Simmons Bonnin (1898)

Native Woman

Chinese Opium Smokers (1880)

Opium Real

Stuff That Should Probably Be In A Paper

How to Achieve Stable Video

NoGAN training is crucial to getting the kind of stable and colorful images seen in this iteration of DeOldify. NoGAN training combines the benefits of GAN training (wonderful colorization) while eliminating the nasty side effects (like flickering objects in video). Believe it or not, video is rendered using isolated image generation without any sort of temporal modeling tacked on. The process performs 30-60 minutes of the GAN portion of "NoGAN" training, using 1% to 3% of ImageNet data once. Then, as with still image colorization, we "DeOldify" individual frames before rebuilding the video.
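
A minimal sketch of that frame-by-frame idea, using OpenCV and a stand-in colorize_frame callable for whatever still-image colorizer you use (the actual video notebook handles frame extraction, rebuilding, and audio for you):

import cv2

def colorize_video(src_path, dst_path, colorize_frame):
    # Each frame is colorized in isolation - no temporal modeling at all.
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(colorize_frame(frame))
    cap.release()
    out.release()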

In addition to improved video stability, there is an interesting thing going on here worth mentioning. It turns out the models I run, even different ones and with different training structures, keep arriving at more or less the same solution. That's even the case for the colorization of things you may think would be arbitrary and unknowable, like the color of clothing, cars, and even special effects (as seen in "Metropolis").

Metropolis Special FX

My best guess is that the models are learning some interesting rules about how to colorize based on subtle cues present in the black and white images that I certainly wouldn't expect to exist. This result leads to nicely deterministic and consistent results, and that means you don't have to track model colorization decisions because they're not arbitrary. Additionally, they seem remarkably robust, so that even in moving scenes the renders are very consistent.

Moving Scene Example

Other ways to stabilize video add up as well. First, generally speaking, rendering at a higher resolution (higher render_factor) will increase the stability of colorization decisions. This stands to reason because the model has higher fidelity image information to work with and will have a greater chance of making the "right" decision consistently. Closely related to this is the use of resnet101 instead of resnet34 as the backbone of the generator- objects are detected more consistently and correctly with the bigger backbone. This is especially important for getting good, consistent skin rendering. It can be particularly visually jarring if you wind up with "zombie hands", for example.

Zombie Hand Example
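
Since render_factor comes up a lot, here is a rough sketch of sweeping it to find a stable setting, assuming the get_image_colorizer()/plot_transformed_image() helpers used in the image colorization notebooks (higher render_factor gives the model more image information, usually at the cost of GPU memory):

from deoldify import device
from deoldify.device_id import DeviceId
device.set(device=DeviceId.GPU0)  # pick a GPU before importing the visualizer, as the notebooks do
from deoldify.visualize import get_image_colorizer

colorizer = get_image_colorizer(artistic=True)
for render_factor in (15, 25, 35, 45):
    # Render the same image at several render_factor values and compare the outputs.
    colorizer.plot_transformed_image('test_images/example.jpg', render_factor=render_factor)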

Additionally, gaussian noise augmentation during training appears to help, but at this point the conclusions as to just how much are a bit more tenuous (I just haven't formally measured this yet). This is loosely based on work done in style transfer video, described here: https://medium.com/element-ai-research-lab/stabilizing-neural-style-transfer-for-video-62675e203e42.
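
A loose sketch of that noise augmentation idea (an illustration, not the project's exact transform): perturb the grayscale input during training so the generator's color decisions become robust to small per-frame variations, which helps temporal stability at inference time.

import torch

def add_gaussian_noise(x, std=0.05):
    # x: image tensor scaled to [0, 1]; returns a noisy copy clamped back to range.
    return (x + torch.randn_like(x) * std).clamp(0.0, 1.0)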

Special thanks go to Rani Horev for his contributions in implementing this noise augmentation.

What is NoGAN?

This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. It provides the benefits of GAN training while spending minimal time doing direct GAN training. Instead, most of the training time is spent pretraining the generator and critic separately with more straightforward, fast and reliable conventional methods. A key insight here is that those more "conventional" methods generally get you most of the results you need, and that GANs can be used to close the gap on realism. During the very short amount of actual GAN training, the generator not only gets the full realistic colorization capabilities that used to take days of progressively resized GAN training, but it also doesn't accrue nearly as much of the artifacts and other ugly baggage of GANs. In fact, you can eliminate glitches and artifacts almost entirely, depending on your approach. As far as I know this is a new technique. And it's incredibly effective.

Original DeOldify Model

Before Flicker

NoGAN-Based DeOldify Model

After Flicker

The steps are as follows: First, train the generator in a conventional way by itself with just the feature loss. Next, generate images from that, and train the critic on distinguishing between those outputs and real images as a basic binary classifier. Finally, train the generator and critic together in a GAN setting (starting right at the target size of 192px in this case). Now for the weird part: all the useful GAN training here only takes place within a very small window of time. There's an inflection point where it appears the critic has transferred everything it can that is useful to the generator. Past this point, image quality oscillates between the best that you can get at the inflection point, or bad in a predictable way (orangish skin, overly red lips, etc). There appears to be no productive training after the inflection point. And this point lies within training on just 1% to 3% of the ImageNet data! That amounts to about 30-60 minutes of training at 192px.
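
A schematic of those three phases (an illustration, not the project's training notebooks; pretrain_generator, pretrain_critic and gan_train are hypothetical callables standing in for conventional fastai training loops):

def nogan_training(pretrain_generator, pretrain_critic, gan_train):
    # Phase 1: generator alone, trained conventionally with just the feature (perceptual) loss.
    generator = pretrain_generator()
    # Phase 2: critic trained as a plain binary classifier on real vs. generated images.
    critic = pretrain_critic(generator)
    # Phase 3: brief joint GAN training at the target size (192px here). Only 1-3% of
    # ImageNet is needed, and checkpoints are saved very frequently (every 0.1% of data)
    # so the inflection point can be located afterwards.
    checkpoints = gan_train(generator, critic, data_pct=0.03, checkpoint_every_pct=0.001)
    return checkpoints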

The hard part is finding this inflection point. So far, I've accomplished this by making a whole bunch of model save checkpoints (every 0.1% of data iterated on) and then just looking for the point where images look great before they go totally bonkers with orange skin (always the first thing to go). Additionally, generator rendering starts immediately getting glitchy and inconsistent at this point, which is no good particularly for video. What I'd really like to figure out is what the tell-tale sign of the inflection point is that can be easily automated as an early stopping point. Unfortunately, nothing definitive is jumping out at me yet. For one, it's happening in the middle of training loss decreasing- not when it flattens out, which would seem more reasonable on the surface.
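
A sketch of that manual search (load_generator and render_image are hypothetical helpers standing in for the project's model-loading and inference code): render the same test images from every checkpoint and eyeball where quality peaks before the orange-skin failure mode appears.

from pathlib import Path

def render_checkpoint_grid(checkpoint_dir, test_images, load_generator, render_image,
                           out_dir='inflection_search'):
    # Render identical test images from every checkpoint so they can be compared side by side.
    Path(out_dir).mkdir(exist_ok=True)
    for ckpt in sorted(Path(checkpoint_dir).glob('*.pth')):
        generator = load_generator(ckpt)
        for img_path in test_images:
            result = render_image(generator, img_path)  # assumed to return a PIL image
            result.save(Path(out_dir) / f'{ckpt.stem}_{Path(img_path).stem}.png')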

Another key thing about NoGAN training is you can repeat pretraining the critic on generated images after the initial GAN training, then repeat the GAN training itself in the same fashion. This is how I was able to get extra colorful results with the "artistic" model. But this does come at a cost currently- the output of the generator becomes increasingly inconsistent and you have to experiment with render resolution (render_factor) to get the best result. But the renders are still glitch free and way more consistent than I was ever able to achieve with the original DeOldify model. You can do about five of these repeat cycles, give or take, before you get diminishing returns, as far as I can tell.

Keep in mind- I haven't been entirely rigorous in figuring out what all is going on in NoGAN- I'll save that for a paper. That means there's a good chance I'm wrong about something. But I think it's definitely worth putting out there now because I'm finding it very useful- it's solving most of the remaining problems I had in DeOldify.

This builds upon a technique developed in collaboration with Jeremy Howard and Sylvain Gugger for Fast.AI's Lesson 7 in version 3 of Practical Deep Learning for Coders Part I. The particular lesson notebook can be found here: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-superres-gan.ipynb

Why Three Models?

There are now three models to choose from in DeOldify. Each of these has key strengths and weaknesses, and so each has different use cases. Video is for video of course. But stable and artistic are both for images, and sometimes one will do images better than the other.

More details:

  • Artistic - This model achieves the highest quality results in image coloration, in terms of interesting details and vibrance. The most notable drawback, however, is that it's a bit of a pain to fiddle around with to get the best results (you have to adjust the rendering resolution, or render_factor, to achieve this). Additionally, the model does not do as well as stable in a few key common scenarios- nature scenes and portraits. The model uses a resnet34 backbone on a UNet with an emphasis on depth of layers on the decoder side. This model was trained with 5 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 32% of ImageNet data trained once (12.5 hours of direct GAN training).

  • Stable - This model achieves the best results with landscapes and portraits. Notably, it produces fewer "zombies"- where faces or limbs stay gray rather than being colored in properly. It generally has fewer weird miscolorations than artistic, but it's also less colorful in general. This model uses a resnet101 backbone on a UNet with an emphasis on width of layers on the decoder side. This model was trained with 3 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 7% of ImageNet data trained once (3 hours of direct GAN training).

  • Video - This model is optimized for smooth, consistent and flicker-free video. This is definitely the least colorful of the three models, but it's honestly not too far off from "stable". The model is the same as "stable" in terms of architecture, but differs in training. It's trained on a mere 2.2% of ImageNet data once at 192px, using only the initial generator/critic pretrain/GAN NoGAN training (1 hour of direct GAN training).

Because the training of the artistic and stable models was done before the "inflection point" of NoGAN training described in "What is NoGAN?" was discovered, I believe this amount of training on them can be knocked down considerably. As far as I can tell, the models were stopped at "good points" that were well beyond where productive training was taking place. I'll be looking into this in the future.

Ideally, eventually these three models will be consolidated into one that unifies all these desirable qualities. I think there's a path there, but it's going to require more work! So for now, the most practical solution appears to be to maintain multiple models.

The Technical Details

This is a deep learning based model. More specifically, what I've done is combined the following approaches:

Self-Attention Generative Adversarial Network. Except the generator is a pretrained U-Net, and I've just modified it to have the spectral normalization and self-attention. It's a pretty straightforward translation.

Two Time-Scale Update Rule. This is also very straightforward – it's just one-to-one generator/critic iterations and a higher critic learning rate. This is modified to incorporate a "threshold" critic loss that makes sure that the critic is "caught up" before moving on to generator training. This is particularly useful for the "NoGAN" method described below.
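
An illustrative sketch of that alternating scheme (an assumption about its shape, not the repo's trainer): one-to-one generator/critic steps, a higher critic learning rate configured on its optimizer, and a threshold check that keeps training the critic until it has caught up.

import torch

def gan_step(generator, critic, g_opt, c_opt, batch_fn, critic_loss_fn, gen_loss_fn,
             critic_threshold=0.5, max_critic_steps=5):
    # Train the critic until its loss drops below the threshold (or we give up),
    # so the generator always trains against a critic that is "caught up".
    for _ in range(max_critic_steps):
        real, bw_input = batch_fn()
        with torch.no_grad():
            fake = generator(bw_input)
        c_loss = critic_loss_fn(critic(real), critic(fake))
        c_opt.zero_grad()
        c_loss.backward()
        c_opt.step()
        if c_loss.item() < critic_threshold:
            break
    # One generator step against the caught-up critic.
    real, bw_input = batch_fn()
    g_loss = gen_loss_fn(generator(bw_input), real, critic)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()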

NoGAN

There's no paper here! This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. The gist is that you get the benefits of GAN training while spending minimal time doing direct GAN training. More details are in the What is NoGAN? section (it's a doozy).

Generator Loss

Loss during NoGAN learning is two parts: one is a basic Perceptual Loss (or Feature Loss) based on VGG16 – this just biases the generator model to replicate the input image. The second is the loss score from the critic. For the curious – Perceptual Loss isn't sufficient by itself to produce good results. It tends to just encourage a bunch of brown/green/blue – you know, cheating the test, basically, which neural networks are really good at doing! The key thing to realize here is that GANs are essentially learning the loss function for you – which is really one big step toward the ideal that we're shooting for in machine learning. And of course you generally get much better results when you get the machine to learn something you were previously hand coding. That's certainly the case here.
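
A rough sketch of that two-part generator loss (weights and layer cutoff are illustrative, not the project's exact values): a frozen VGG16 feature loss keeping the output tied to the input image, plus the critic's score pushing toward realistic color.

import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG16 feature extractor for the perceptual (feature) loss.
vgg_features = vgg16(pretrained=True).features[:23].eval()
for p in vgg_features.parameters():
    p.requires_grad = False

def generator_loss(generated, target, critic, critic_weight=1.0):
    # Part 1: feature loss biases the generator toward reproducing the input image.
    feat_loss = F.l1_loss(vgg_features(generated), vgg_features(target))
    # Part 2: the critic's score rewards realism; negate so that lower is better.
    realism_loss = -critic(generated).mean()
    return feat_loss + critic_weight * realism_loss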

Of note: there's no longer any "Progressive Growing of GANs" type training going on here. It's just not needed given the superior results obtained by the "NoGAN" technique described above.

The beauty of this model is that it should be generally useful for all sorts of image modification, and it should do it quite well. What you're seeing above are the results of the colorization model, but that's just one component in a pipeline that I'm developing with the exact same approach.

This Project, Going Forward

So that's the gist of this project – I'm looking to make old photos and film look reeeeaaally good with GANs, and more importantly, make the project useful. In the meantime though this is going to be my baby and I'll be actively updating and improving the code over the foreseeable future. I'll try to make this as user-friendly as possible, but I'm sure there's going to be hiccups along the way.

Oh and I swear I'll document the code properly...eventually. Admittedly I'm one of those people who believes in "self documenting code" (LOL).

Getting Started Yourself

Easiest Approach

The easiest way to get started is to go straight to the Colab notebooks:

Image Colab for images | Video Colab for video

Special thanks to Matt Robinson and María Benavente for their image Colab notebook contributions, and Robert Bell for the video Colab notebook work!

Your Own Machine (not as easy)

Hardware and Operating System Requirements

  • (Training Only) BEEFY graphics card. I'd really like to have more memory than the 11 GB in my GeForce 1080 Ti. You'll have a tough time with less. The generators and critic are ridiculously large.
  • (Colorization Alone) A decent graphics card. Approximately 4GB+ memory video cards should be sufficient.
  • Linux. I'm using Ubuntu 18.04, and I know 16.04 works fine too. Windows is not supported and any issues brought up related to this will not be investigated.

Easy Install

You should now be able to do a simple install with Anaconda. Here are the steps:

Open the command line and navigate to the root folder you wish to install. Then type the following commands

git clone https://github.com/jantic/DeOldify.git DeOldify
cd DeOldify
conda env create -f environment.yml

Then start running with these commands:

source activate deoldify
jupyter lab

From there you can start running the notebooks in Jupyter Lab, via the url they provide you in the console.

Note: You can also now do "conda activate deoldify" if you have the latest version of conda and in fact that's now recommended. But a lot of people don't have that yet so I'm not going to make it the default instruction here yet.

Alternative Install: User daddyparodz has kindly created an installer script for Ubuntu, and in particular Ubuntu on WSL, that may make things easier: https://github.com/daddyparodz/AutoDeOldifyLocal

Note on test_images Folder

The images in the test_images folder have been removed because they were using Git LFS and that costs a lot of money when GitHub actually charges for bandwidth on a popular open source project (they had a billing bug for a while that was recently fixed). The notebooks that use them (the image test ones) still point to images in that directory that I (Jason) have personally, and I'd like to keep it that way because, after all, I'm by far the primary and most active developer. But they won't work for you. Still, those notebooks are a convenient template for making your own tests if you're so inclined.

Typical training

The notebook ColorizeTrainingWandb has been created to log and monitor results through Weights & Biases. You can find a description of typical training by consulting W&B Report.

Pretrained Weights

To start right away on your own machine with your own images or videos without training the models yourself, you'll need to download the "Completed Generator Weights" listed below and drop them in the /models/ folder.

The colorization inference notebooks should be able to guide you from here. The notebooks to use are named ImageColorizerArtistic.ipynb, ImageColorizerStable.ipynb, and VideoColorizer.ipynb.
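
For reference, a rough sketch of driving that inference code outside the notebooks, once the completed generator weights are in ./models/. The helper names come from the notebooks; treat exact method and argument names (especially on the video side) as assumptions.

from deoldify import device
from deoldify.device_id import DeviceId
device.set(device=DeviceId.GPU0)
from deoldify.visualize import get_image_colorizer, get_video_colorizer

# artistic=True loads the "artistic" generator weights, artistic=False the "stable" ones.
image_colorizer = get_image_colorizer(artistic=False)
image_colorizer.plot_transformed_image('test_images/example.jpg', render_factor=35, compare=True)

# The video colorizer uses the "video" weights; method name as used in VideoColorizer.ipynb.
video_colorizer = get_video_colorizer()
video_colorizer.colorize_from_file_name('example.mp4', render_factor=21)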

Completed Generator Weights

Completed Critic Weights

Pretrain Only Generator Weights

Pretrain Only Critic Weights

Want the Old DeOldify?

We suspect some of you are going to want access to the original DeOldify model for various reasons. We have that archived here: https://github.com/dana-kelley/DeOldify

Want More?

Follow #DeOldify on Twitter.

License

All code in this repository is under the MIT license as specified by the LICENSE file.

The model weights listed in this readme under the "Pretrained Weights" section are trained by ourselves and are released under the MIT license.

A Statement on Open Source Support

We believe that open source has done a lot of good for the world.  After all, DeOldify simply wouldn't exist without it. But we also believe that there needs to be boundaries on just how much is reasonable to be expected from an open source project maintained by just two developers.

Our stance is that we're providing the code and documentation on research that we believe is beneficial to the world.  What we have provided are novel takes on colorization, GANs, and video that are hopefully somewhat friendly for developers and researchers to learn from and adopt. This is the culmination of well over a year of continuous work, free for you. What wasn't free was shouldered by us, the developers.  We left our jobs, bought expensive GPUs, and had huge electric bills as a result of dedicating ourselves to this.

What we haven't provided here is a ready to use free "product" or "app", and we don't ever intend on providing that.  It's going to remain a Linux based project without Windows support, coded in Python, and requiring people to have some extra technical background to be comfortable using it.  Others have stepped in with their own apps made with DeOldify, some paid and some free, which is what we want! We're instead focusing on what we believe we can do best- making better commercial models that people will pay for. Does that mean you're not getting the very best for free?  Of course. We simply don't believe that we're obligated to provide that, nor is it feasible! We compete on research and sell that.  Not a GUI or web service that wraps said research- that part isn't something we're going to be great at anyways. We're not about to shoot ourselves in the foot by giving away our actual competitive advantage for free, quite frankly.

We're also not willing to go down the rabbit hole of providing endless, open ended and personalized support on this open source project.  Our position is this:  If you have the proper background and resources, the project provides more than enough to get you started. We know this because we've seen plenty of people using it and making money off of their own projects with it.

Thus, if you have an issue come up and it happens to be an actual bug that having it be fixed will benefit users generally, then great- that's something we'll be happy to look into.

In contrast, if you're asking about something that really amounts to asking for personalized and time consuming support that won't benefit anybody else, we're not going to help. It's simply not in our interest to do that. We have bills to pay, after all. And if you're asking for help on something that can already be derived from the documentation or code?  That's simply annoying, and we're not going to pretend to be ok with that.

deoldify's People

Contributors

0xflotus, 645775992, aiemmu, ak391, akashpalrecha, alexandrevicenzi, alexlloyd0, benswinney, bhumijgupta, bitplane, borisdayma, dana-kelley, dependabot[bot], erjanmx, iblech, imgbotapp, jahaynes, jantic, johnbreslin, jqueguiner, kirushyk, lavanyashukla, lnicola, macbre, mariabg, mc-robinson, mgrankin, naereen, schlunsen, strickvl


deoldify's Issues

RuntimeError: cuda runtime error (2) : out of memory at ..\aten\src\THC\THCGeneral.cpp:663

this is my config
Collecting environment information...
PyTorch version: 0.3.1.post2
Is debug build: No
CUDA used to build PyTorch: 9.0
OS: Microsoft Windows 10 Pro
GCC version: Could not collect
CMake version: Could not collect
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: GPU 0: GeForce GTX 1070
Nvidia driver version: 417.35
cuDNN version: Probably one of the following:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin\cudnn64_7.dll
Versions of relevant libraries:
[pip] Could not collect
[conda] blas 1.0 mkl
[conda] mkl 2019.1 144
[conda] mkl_fft 1.0.6 py36h6288b17_0
[conda] mkl_random 1.0.2 py36h343c172_0
[conda] pytorch 0.3.1 py36_cuda90_cudnn7he774522_2 [cuda90] peterjc123
[conda] torchtext 0.3.1
[conda] torchvision 0.2.1

This is the output of collect_env.py.
This is the error I get if I try to use ColorizeVisualization in a Jupyter notebook:
RuntimeError: cuda runtime error (2) : out of memory at ..\aten\src\THC\THCGeneral.cpp:663

So does this mean my 1070 is running out of memory? Any suggestions in this matter, please?
What GPU did you people use to run this?
Thanks

Add colorization hinting

Would be useful to manually correct some of the colorization glitches, or make the output more historically accurate.

Ideally, a secondary image with user-defined colour hints could be provided to both the generator and critic. Generating training hints that look like user brush-strokes might be a bit difficult, although I'm not sure how much that would matter.

Another, easier option would be to specify only a palette:
http://openaccess.thecvf.com/content_cvpr_2017_workshops/w12/papers/Cho_PaletteNet_Image_Recolorization_CVPR_2017_paper.pdf

Awesome project by the way, can't wait to see how it develops!

Error transforming image - possibly because outside notebook

I'm on a machine without a GUI, so I'd prefer to just run everything from the command line like a boss and do some general tinkering on a range of images and weights.

I've basically just copied the relevant bits from the notebooks

import multiprocessing
import os
from torch import autograd
from fastai.transforms import TfmType
from fasterai.transforms import *
from fastai.conv_learner import *
from fasterai.images import *
from fasterai.dataset import *
from fasterai.visualize import *
from fasterai.callbacks import *
from fasterai.loss import *
from fasterai.modules import *
from fasterai.training import *
from fasterai.generators import *
from fasterai.filters import *
from fastai.torch_imports import *
torch.backends.cudnn.benchmark=True
torch.cuda.set_device(0)

colorizer_path = 'colorize_gen_192.h5'
render_factor=42
weights_path = "/mnt/disks/400gb_ssd/DeOldify/colorize_gen_192.h5"
results_dir="/mnt/disks/400gb_ssd/DeOldify/result_images"
filters = [Colorizer34(gpu=0, weights_path=weights_path)]
vis = ModelImageVisualizer(filters, render_factor=render_factor, results_dir=results_dir)
vis.plot_transformed_image("test_images/1852GatekeepersWindsor.jpg")

It's unfortunately resulting in the below error. I don't doubt for a second the issue is with me

Traceback (most recent call last):
  File "/mnt/disks/400gb_ssd/DeOldify/fastai/dataset.py", line 234, in open_image
    im = cv2.imread(str(fn), flags).astype(np.float32)/255
AttributeError: 'NoneType' object has no attribute 'astype'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "test.py", line 28, in <module>
    vis.plot_transformed_image("test_images/1852GatekeepersWindsor.jpg")
  File "/mnt/disks/400gb_ssd/DeOldify/fasterai/visualize.py", line 31, in plot_transformed_image
    result = self._get_transformed_image_ndarray(path, render_factor)
  File "/mnt/disks/400gb_ssd/DeOldify/fasterai/visualize.py", line 49, in _get_transformed_image_ndarray
    orig_image = open_image(str(path))
  File "/mnt/disks/400gb_ssd/DeOldify/fastai/dataset.py", line 238, in open_image
    raise OSError('Error handling image at: {}'.format(fn)) from e
OSError: Error handling image at: test_images/1852GatekeepersWindsor.jpg

Which in itself is odd because you'd expect the filecheck at the top of open_image() to have dealt with that. I'm guessing this is not a DeOldify issue per se as it's all inside fastai but wonder if you've any thoughts.

Worked! Observations from a newbie. (Win 10 install)

Such an interesting, outstanding project. Kudos. Also appreciate the obvious care you and the community are putting into documentation of this, which is no easy task.

Two main comments:

First, for Windows users trying it out locally, I noticed an out-of-memory error on the "Color Visualization" notebook, where memory doesn't seem to be automatically released. I was able to resolve it with explicit memory cleanup before each visualization. Please see this thread for a workaround: #49

Second -- a couple general observations:

  1. This is amazing work. Really fun to see photos come to life.

  2. In my test trials, medium head shots (i.e., waist up) seem to do much better overall than, say, full shots set on a larger landscape. And my own anecdotal tests are right in line with your observation about a blue clothing bias -- it seems to want to bias toward blue for many articles of clothing.

This got me wondering: given the relatively higher accuracy of medium head shots (if my anecdotal observation actually holds), I was wondering if one optimization for the generator during training might be to "heavily weight flesh tone of a generated medium shot" -- i.e., try a face detect first, get the largest face in the picture, build a "medium shot" of it by cropping, then bias heavily toward those weights? I don't know at all if or how this would map to your existing code, just thought I'd throw it out there in case it sparks any ideas.

Awesome! But I have some bad cases.

This project is so awesome! I have tested several pictures, and the results are very good.
But there are some bad cases.
First, as shown in the picture: Lei Feng, a very famous figure in China. I used this picture to test your wonderful project, but the result turns out a little wrong. As far as I know, the clothes Lei Feng is wearing should be green, but in the result they turn out blue.
3
Second, as shown in the picture below: a red scarf is a symbol of Chinese youth, but the little girl wears a blue scarf. Another point is that their hands are a bit scary (blue, dark). 😂
1
So are there any optimization suggestions for these problems?
Looking forward to your reply~
Best wishes.
Thank you

About how to get the weights

Hello, I want to ask how to download colorize34_gen_192.h5 and bwdefade3_gen_160.h5. Where could I find the file "data/imagenet/ILSVRC/Data/CLS-LOC/train"? Extremely grateful.

Add Docker file

Can a Docker file and a Docker image with a pretrained model be provided for use with nvidia-docker? The idea is to make it possible to colorize easily.

Create install/guidance for cpu rendering

CPU rendering is desirable to get around the limitations of memory in GPUs when wanting to render at high quality. Additionally, of course- not everybody has a GPU with a decent amount of memory. This issue will track creating both an install and instructions/best practices for CPU rendering.

Usage howto

Can you please add a small howto in README.md on how to use DeOldify on own set of images? It's not that obvious :)

Optimize memory usage.

Memory usage is way too high- barely practical. I suspect simple model tweaks will make a big difference.

Narrow down an optimized training schedule

Hopefully knocking down the time to train and improving quality of results in the process. I'm pretty sure the current training regime in the notebooks is not ideal.

'async' is a reserved word in Python >= 3.7

flake8 testing of https://github.com/jantic/DeOldify on Python 3.7.1

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./fastai/core.py:37:32: E999 SyntaxError: invalid syntax
    if cuda: a = to_gpu(a, async=True)
                               ^
./fastai/models/inceptionresnetv2.py:316:20: F821 undefined name 'pretrained_settings'
        settings = pretrained_settings['inceptionresnetv2'][pretrained]
                   ^
./fastai/models/inceptionresnetv2.py:321:17: F821 undefined name 'InceptionResNetV2'
        model = InceptionResNetV2(num_classes=1001)
                ^
./fastai/models/inceptionresnetv2.py:337:17: F821 undefined name 'InceptionResNetV2'
        model = InceptionResNetV2(num_classes=num_classes)
                ^
./fastai/models/cifar10/main_dxy.py:179:32: E999 SyntaxError: invalid syntax
      target = target.cuda(async=True)
                               ^
./fastai/models/cifar10/utils.py:114:32: F821 undefined name 'random'
  return string + '-{}'.format(random.randint(1, 10000))
                               ^
./fastai/models/cifar10/utils_kuangliu.py:17:18: F821 undefined name 'torch'
    dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=True, num_workers=2)
                 ^
./fastai/models/cifar10/utils_kuangliu.py:18:12: F821 undefined name 'torch'
    mean = torch.zeros(3)
           ^
./fastai/models/cifar10/utils_kuangliu.py:19:11: F821 undefined name 'torch'
    std = torch.zeros(3)
          ^
2     E999 SyntaxError: invalid syntax
7     F821 undefined name 'random'
9

Problem when training: Somehow training_orig_images become colored

Firstly, thank you for your tremendous work on colorization. It gives me a lot of new ideas.

Here is my problem: I have used your training code for my own training process. I noticed that it generates scaled images like 128x128 or 256x256 before the training process, and that takes a lot of time. But when I train on my own dataset, the BlackAndWhiteTransform() transformation is not working. The input images become colorful, so the whole training becomes meaningless.

And I did add BlackAndWhiteTransform() to all my progressive GANs.

x_tfms = [BlackAndWhiteTransform()]
scheds.extend(GANTrainSchedule.generate_schedules(szs=[64, 64],
                                                  bss=[64, 64],
                                                  path=IMAGENET,
                                                  x_tfms=x_tfms,
                                                  extra_aug_tfms=extra_aug_tfms,
                                                  keep_pcts=[1.0,1.0],
                                                  save_base_name=proj_id,
                                                  c_lrs=c_lrs,
                                                  g_lrs=g_lrs,
                                                  lrs_unfreeze_factor=lrs_unfreeze_factor,
                                                  gen_freeze_tos=gen_freeze_tos))

I don't know what the problem is, please help me. Thank you very much.

image

Thank you! It worked! (Where are you supposed to put the .h5?)

I was shocked when I was able to colorize my own photos today.

I followed the directions in the readme and I'll be damned, it WORKED.

It's my first time using a Jupyter Notebook so there was a lot of room for error.

I wasn't able to figure out where exactly to put the .h5 file, however. I didn't see an empty set of folders to drop it into. Perhaps more instructions on how to create that folder structure would be good. I ended up manually creating the folders under c:\user\mike\data....

Thanks for creating and posting a working application. SO many times I've met with dead-ends in these machine learning projects (edges for cats comes to mind).

output path doesn't cast as Path object correctly

Had another minor issue when running from command line

Traceback (most recent call last):
  File "test.py", line 28, in <module>
    vis.plot_transformed_image("/tmp/Alfred-James-Boyce.jpg")
  File "/mnt/disks/400gb_ssd/DeOldify/fasterai/visualize.py", line 37, in plot_transformed_image
    self._save_result_image(path, result)
  File "/mnt/disks/400gb_ssd/DeOldify/fasterai/visualize.py", line 46, in _save_result_image
    result_path = self.results_dir/source_path.name
AttributeError: 'str' object has no attribute 'name'

Looked like the output path wasn't being handled correctly?

In fasterai/visualize.py plot_transformed_image() accepts path as a string and converts it to a Path object but by the time we get to calling self._save_result_image() it's reverted back to a string. Maybe the casting on line 31 is by reference and not by value - I don't know enough about internals to know if that's the case.

Anyway, I just re-cast in the call to _save_result_image() and we're all hunky dory again
self._save_result_image(Path(path), result)

EDIT:
Ubuntu Xenial 16.04
Python 3.6.6 :: Anaconda, Inc.

ResolvePackageNotFound cuda90

~/C/DeOldify ❯❯❯ conda env create -f environment.yml

Solving environment: failed

ResolvePackageNotFound:
  - cuda90

Outputs are not saved in the drive folder

From the last commit, colored image files are not showing/saving in /content/drive/My Drive/deOldifyImages/results.
vis.plot_transformed_image(img_path) works correctly.

No module named 'fastai.transforms'

ModuleNotFoundError                       Traceback (most recent call last)
in
      2 import os
      3 from torch import autograd
----> 4 from fastai.transforms import TfmType
      5 from fasterai.transforms import *
      6 from fastai.conv_learner import *

ModuleNotFoundError: No module named 'fastai.transforms'

Memory Problem

I was just playing around with some of the example photos given, but I noticed that as I convert an image, the CUDA memory gets used and is never deallocated.

Basically if I convert a couple images my memory gets used up and I get the CUDA out of memory error.

I noticed if I restart the kernal or kill the python process I get the memory back and can continue trying to convert a different image.

Shouldn't the memory be deallocated after the image is converted and saved?

(I was running the project on windows 10 with an NVIDIA GTX970 graphics card. I used the weights linked to in your article:
https://blog.floydhub.com/colorizing-and-restoring-old-images-with-deep-learning/ )
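
For what it's worth, a minimal sketch of the kind of explicit cleanup that tends to help in this situation (not an official fix): drop references to the finished results and ask PyTorch to release its cached CUDA memory between conversions.

import gc
import torch

def free_gpu_memory():
    # Call after each conversion, once you've deleted references to the result tensors.
    gc.collect()
    torch.cuda.empty_cache()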

Greatly increase max resolution output by taking advantage of this chrominance optimization

Source: MayeulC on HackerNews, thread:

https://news.ycombinator.com/item?id=18363870#18369410

"Now, there seems to be a distinct loss of details in the restored images. The network being resolution-limited, is the black-and-white image displayed at full resolution besides the restored one?

What I would like to see is the output of the network to be treated as chrominance only.

Take the YUV transform of both the input and output images, scale back the UV matrix of the restored one to match the input, and replace the original channels. I'd be really curious to look at the output (and would do it myself if I was not on a smartphone)!"
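
A quick sketch of what that suggestion looks like in practice (an illustration with OpenCV, not code from the repo): keep full-resolution luminance from the original black-and-white photo and take only the chrominance from the lower-resolution colorized output, upscaled to match.

import cv2

def merge_chrominance(original_gray_path, colorized_path, out_path):
    original = cv2.imread(original_gray_path)      # full resolution, BGR
    colorized = cv2.imread(colorized_path)         # possibly smaller, BGR
    h, w = original.shape[:2]
    colorized = cv2.resize(colorized, (w, h), interpolation=cv2.INTER_CUBIC)
    y_orig = cv2.cvtColor(original, cv2.COLOR_BGR2YUV)[:, :, 0]
    yuv_col = cv2.cvtColor(colorized, cv2.COLOR_BGR2YUV)
    yuv_col[:, :, 0] = y_orig                      # keep the original luminance detail
    cv2.imwrite(out_path, cv2.cvtColor(yuv_col, cv2.COLOR_YUV2BGR))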

How to run "ColorizeVisualization"

Hello, I am sorry to bother you again. I have tried your case several times, but it still has problems:
My computer environment is Win10, CPU only. I have seen the CPU implementation you mentioned in the issue, and it mentions modifying the command. But if I follow 'conda env create -f environment.yml' at the beginning, the following error occurs:

image

So I tried to download the CPU versions of modules such as PyTorch, then tried to use the weights you have provided and run "ColorizeVisualization" directly. But when I load the module, the following problem occurs.

image

Then I found the file "\fastai\torch_imports.py" along the path and saw this syntax error.

image

I have problems loading the module, so Python can't continue working.
I don't understand the meaning of the two folders "fastai" and "fasterai". Are these your own modules? Why not have them installed via conda install/pip install? Could you give me some guiding advice? Thank you very much!!

I am trying my best to reproduce your case, because I am impressed by it and want to promote it to more friends around me. But I have not been successful.
I sincerely hope to get your help.

Red hand in example

These colorizations look amazing! Not sure if you noticed, but the hand in this image on the main page stuck out like a … sore thumb?
image

Keep up the great work!

Segmentation fault, is it because of RAM shortage (4GB) ?

I'm trying to run the code on a Linux machine with a video card.
So I started a file with only 4 lines of code:

import os
import multiprocessing
from torch import autograd
from fastai.transforms import TfmType

And I got a Segmentation fault error.
It happens when I try to import something from fast.ai.

Does it mean that I have a shortage of RAM (which is 4 GB at the moment)?
It's quite strange, because I just imported the library and that's it.

Upload discriminator weights, to allow fine-tune training

Thanks for making this! I had no problems getting it working on Windows 10, and /colou?ri[zs]ing/ some images on my 980Ti.

The image I want to use it for (two boys on a row-boat) doesn't come out great, so I'd love to be able to fine-tune on a bunch of similar images so I can get a good result. You've kindly uploaded the generator weights, but I'd like the discriminator weights too (DCCritic / GANTrainer.netD).

No way I can train this all the way myself on my 980Ti 😅 but maybe just a few more examples is possible...
