illustration2vec's Introduction

Illustration2Vec

illustration2vec (i2v) is a simple library for estimating a set of tags and extracting semantic feature vectors from given illustrations. For details, please see our main paper.

Requirements

  • Pre-trained models (i2v uses Convolutional Neural Networks. Please download several pre-trained models from here, or execute get_models.sh in this repository).
  • numpy and scipy
  • PIL (Python Imaging Library) or its alternatives (e.g., Pillow)
  • skimage (Image processing library for python)

In addition to the above libraries and the pre-trained models, i2v requires either the caffe or the chainer library. If you are not familiar with deep learning libraries, we recommend chainer, which can be installed with pip.
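Before loading a model, it can help to confirm that at least one of the two backends is importable. The following is a minimal sketch using only the standard library; the helper name `available_backends` is hypothetical and not part of i2v.

```python
import importlib.util


def available_backends(candidates=("chainer", "caffe")):
    """Return the names in `candidates` that can be imported here."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


found = available_backends()
if found:
    print("Available backends: " + ", ".join(found))
else:
    print("Neither chainer nor caffe is importable; try `pip install chainer`.")
```

If the list is empty, importing i2v is likely to fail until one of the two libraries is installed in the same Python environment.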

How to use

In this section, we show two simple examples -- tag prediction and feature vector extraction -- using the following illustration [1].

[1] Hatsune Miku (初音ミク), © Crypton Future Media, INC., http://piapro.net/en_for_creators.html. This image is licensed under the Creative Commons - Attribution-NonCommercial, 3.0 Unported (CC BY-NC).

Tag prediction

i2v estimates a number of semantic tags from given illustrations in the following manner.

import i2v
from PIL import Image

illust2vec = i2v.make_i2v_with_chainer(
    "illust2vec_tag_ver200.caffemodel", "tag_list.json")

# In the case of caffe, please use i2v.make_i2v_with_caffe instead:
# illust2vec = i2v.make_i2v_with_caffe(
#     "illust2vec_tag.prototxt", "illust2vec_tag_ver200.caffemodel",
#     "tag_list.json")

img = Image.open("images/miku.jpg")
illust2vec.estimate_plausible_tags([img], threshold=0.5)

estimate_plausible_tags() returns a list with one dictionary per input image. Each dictionary maps a tag category to a list of (tag, confidence) pairs.

[{'character': [(u'hatsune miku', 0.9999994039535522)],
  'copyright': [(u'vocaloid', 0.9999998807907104)],
  'general': [(u'thighhighs', 0.9956372380256653),
   (u'1girl', 0.9873462319374084),
   (u'twintails', 0.9812833666801453),
   (u'solo', 0.9632901549339294),
   (u'aqua hair', 0.9167950749397278),
   (u'long hair', 0.8817108273506165),
   (u'very long hair', 0.8326570987701416),
   (u'detached sleeves', 0.7448858618736267),
   (u'skirt', 0.6780789494514465),
   (u'necktie', 0.5608364939689636),
   (u'aqua eyes', 0.5527772307395935)],
  'rating': [(u'safe', 0.9785731434822083),
   (u'questionable', 0.020535090938210487),
   (u'explicit', 0.0006299660308286548)]}]

These tags fall into four categories: general tags, which represent general attributes of an image; copyright tags, which give the name of the copyrighted work; character tags, which give the names of the characters; and rating tags, which represent the content rating (safe, questionable, or explicit).
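As an illustration of how the returned structure can be consumed, the sketch below works on a shortened copy of the output shown above, so it runs without the models. The helpers `top_rating` and `tags_above` are hypothetical names introduced here, not part of i2v.

```python
# Shortened copy of the estimate_plausible_tags() output shown above.
result = [{
    'character': [('hatsune miku', 0.9999994039535522)],
    'copyright': [('vocaloid', 0.9999998807907104)],
    'general': [('thighhighs', 0.9956372380256653),
                ('1girl', 0.9873462319374084),
                ('twintails', 0.9812833666801453)],
    'rating': [('safe', 0.9785731434822083),
               ('questionable', 0.020535090938210487),
               ('explicit', 0.0006299660308286548)],
}]


def top_rating(tags):
    """Return the (rating, confidence) pair with the highest confidence."""
    return max(tags['rating'], key=lambda pair: pair[1])


def tags_above(tags, category, threshold):
    """Return the tag names in `category` whose confidence is >= threshold."""
    return [tag for tag, conf in tags[category] if conf >= threshold]


for tags in result:  # one dictionary per input image
    print(top_rating(tags)[0])                # safe
    print(tags_above(tags, 'general', 0.99))  # ['thighhighs']
```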

If you want to focus on several specific tags, use estimate_specific_tags() instead.

illust2vec.estimate_specific_tags([img], ["1girl", "blue eyes", "safe"])
# -> [{'1girl': 0.9873462319374084, 'blue eyes': 0.01301183458417654, 'safe': 0.9785731434822083}]

Feature vector extraction

i2v can extract a semantic feature vector from an illustration.

import i2v
from PIL import Image

# For feature vector extraction, you do not need to specify the tag list.
illust2vec = i2v.make_i2v_with_chainer("illust2vec_ver200.caffemodel")

# illust2vec = i2v.make_i2v_with_caffe(
#     "illust2vec.prototxt", "illust2vec_ver200.caffemodel")

img = Image.open("images/miku.jpg")

# extract a 4,096-dimensional feature vector
result_real = illust2vec.extract_feature([img])
print("shape: {}, dtype: {}".format(result_real.shape, result_real.dtype))
print(result_real)

# i2v also supports a 4,096-bit binary feature vector
result_binary = illust2vec.extract_binary_feature([img])
print("shape: {}, dtype: {}".format(result_binary.shape, result_binary.dtype))
print(result_binary)

The output is the following:

shape: (1, 4096), dtype: float32
[[ 7.47459459  3.68610668  0.5379501  ..., -0.14564702  2.71820974
   7.31408596]]
shape: (1, 512), dtype: uint8
[[246 215  87 107 249 190 101  32 187  18 124  90  57 233 245 243 245  54
  229  47 188 147 161 149 149 232  59 217 117 112 243  78  78  39  71  45
  235  53  49  77  49 211  93 136 235  22 150 195 131 172 141 253 220 104
  163 220 110  30  59 182 252 253  70 178 148 152 119 239 167 226 202  58
  179 198  67 117 226  13 204 246 215 163  45 150 158  21 244 214 245 251
  124 155  86 250 183  96 182  90 199  56  31 111 123 123 190  79 247  99
   89 233  61 105  58  13 215 159 198  92 121  39 170 223  79 245  83 143
  175 229 119 127 194 217 207 242  27 251 226  38 204 217 125 175 215 165
  251 197 234  94 221 188 147 247 143 247 124 230 239  34  47 195  36  39
  111 244  43 166 118  15  81 177   7  56 132  50 239 134  78 207 232 188
  194 122 169 215 124 152 187 150  14  45 245  27 198 120 146 108 120 250
  199 178  22  86 175 102   6 237 111 254 214 107 219  37 102 104 255 226
  206 172  75 109 239 189 211  48 105  62 199 238 211 254 255 228 178 189
  116  86 135 224   6 253  98  54 252 168  62  23 163 177 255  58  84 173
  156  84  95 205 140  33 176 150 210 231 221  32  43 201  73 126   4 127
  190 123 115 154 223  79 229 123 241 154  94 250   8 236  76 175 253 247
  240 191 120 174 116 229  37 117 222 214 232 175 255 176 154 207 135 183
  158 136 189  84 155  20  64  76 201  28 109  79 141 188  21 222  71 197
  228 155  94  47 137 250  91 195 201 235 249 255 176 245 112 228 207 229
  111 232 157   6 216 228  55 153 202 249 164  76  65 184 191 188 175  83
  231 174 158  45 128  61 246 191 210 189 120 110 198 126  98 227  94 127
  104 214  77 237  91 235 249  11 246 247  30 152  19 118 142 223   9 245
  196 249 255   0 113   2 115 149 196  59 157 117 252 190 120  93 213  77
  222 215  43 223 222 106 138 251  68 213 163  57  54 252 177 250 172  27
   92 115 104 231  54 240 231  74  60 247  23 242 238 176 136 188  23 165
  118  10 197 183  89 199 220  95 231  61 214  49  19  85  93  41 199  21
  254  28 205 181 118 153 170 155 187  60  90 148 189 218 187 172  95 182
  250 255 147 137 157 225 127 127  42  55 191 114  45 238 228 222  53  94
   42 181  38 254 177 232 150  99]]
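The binary feature has shape (1, 512) with dtype uint8 because the 4,096 bits are stored eight per byte (512 x 8 = 4,096). A common way to compare such codes is the Hamming distance, i.e. the number of differing bits. The sketch below uses random stand-in vectors rather than real i2v output, and the helper `hamming_distance` is a hypothetical name, not part of i2v.

```python
import numpy as np


def hamming_distance(a, b):
    """Count the differing bits between two packed uint8 vectors."""
    a = np.asarray(a, dtype=np.uint8)
    b = np.asarray(b, dtype=np.uint8)
    # XOR marks the differing bits; unpackbits expands each byte to 8 bits.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Stand-in data: 512 random bytes instead of a real extracted feature.
rng = np.random.default_rng(0)
feat_a = rng.integers(0, 256, size=512, dtype=np.uint8)
feat_b = feat_a.copy()
feat_b[0] = feat_b[0] ^ np.uint8(0b00000101)  # flip two bits in the first byte

print(np.unpackbits(feat_a).shape)       # (4096,)
print(hamming_distance(feat_a, feat_b))  # 2
```

With real output, each row of result_binary would be passed in place of the stand-in vectors. The distance does not depend on the bit order within a byte, as long as both vectors were produced the same way.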

License

The pre-trained models and the other files we have provided are licensed under the MIT License.

illustration2vec's People

Contributors

elarnon, hiroshiba, rezoo

illustration2vec's Issues

Model design benchmarks

Is it possible to compare the VGG model in the paper with other architectures such as Inception, ResNet, or NASNet? (Reference images were attached.)

pypi

Hi, I accidentally uploaded the repo to PyPI (instead of test.pypi): https://pypi.org/project/illustration2vec/

I considered deleting it, but on reflection I will keep it in case anybody needs it, and remove it if rezoo does not want it uploaded.

The version I uploaded includes:

  • #14 from @Katsuya-Ishiyama
  • a CLI command, untested and undocumented
  • package requirements that use chainer instead of caffe
  • project metadata taken from the repo and rezoo's GitHub page
  • edit: the wrong version number; it should be something like 2.0.1
  • edit 2: a broken image link

Train network

A basic view of how the network was originally trained would be very helpful. I have a larger dataset than was originally used and I'm hoping to evaluate your network and compare the results to Google's Inception network.

Without knowing what kinds of preprocessing were used (this can be partially inferred from the code provided) and how exactly the network was trained, this is difficult though.

Adding the solver prototxt that was used to train may be enough.

KeyError: 'conv6_4'

Hi,

On Mac, I downloaded the models, but I can't load them. Seems like the loaded model does not have any layers, even if I use an old version of chainer (< 2.0).

Is it a Mac issue?

KeyError: 'encode1'

I can run the estimate_plausible_tags function just fine, so the model seems to be working, but calling extract_feature or extract_binary_feature raises KeyError: 'encode1'.

Looking into it a little, the error seems to be raised by:
feature = self._extract(imgs, layername='encode1')
and in the illust2vec model the layers are only variations of convX_Y, reluX_Y, and poolX.
Passing actual layer names does not get rid of the error or produce any results, so I'm lost.
I am on Windows with chainer v5.2.

Why does the probability of the label of the same picture change?

I found that every time I estimate the label probability of the same image, the result is different.

That is, for the same picture (for example, the miku.jpg you provided), I ran the program twice and got different results each time (two screenshots of the output were attached).

Is this normal, and why does it happen? (Maybe I was not careful enough and missed a random function in the program.)

Thanks for this project.

ImportError: i2v requires caffe or chainer package

I am trying to run the sample code after downloading the models. I have installed all the dependencies, but I always get the error ImportError: i2v requires caffe or chainer package. I have tried the same on Windows 10, Ubuntu 16.04, and Docker.

Checksums to verify downloaded models

Hello. Is it possible to place checksums for models from illustration2vec.net somewhere on the site?
The models are up to ~900 MB, and redownloading them for verification would be a waste of time and traffic.

MD5 or SHA-1, SHA-256 etc.
Thanks in advance.
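Even without published checksums, a digest can be computed locally after the first download and compared later or across machines. The following is a minimal sketch that streams a file in chunks, so a ~900 MB model does not need to fit in memory; the helper name `file_sha256` is hypothetical, and the demo uses a small temporary file as a stand-in for a model file.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a small temporary file (stand-in for a downloaded model).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example model bytes")
print(file_sha256(tmp.name))
os.remove(tmp.name)
```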

Pre-processing images

Before using the Illustration2Vec model on a dataset of images, what sort of pre-processing should I do to match your methods? In the appendix, it seems you subtracted the mean of the images. Is that it? Did you scale the data to fit in [-1, 1], for example?

Service unavailable

I am trying to visit the demo website, but the site is inaccessible and returns 'The server is temporarily unable to service your request due to maintenance downtime or capacity problems'.
Is there anything wrong with the site?

Models missing

curl -vvvv http://illustration2vec.net/models/illust2vec_tag.prototxt
*   Trying 130.34.54.33...
* Connected to illustration2vec.net (130.34.54.33) port 80 (#0)
> GET /models/illust2vec_tag.prototxt HTTP/1.1
> Host: illustration2vec.net
> User-Agent: curl/7.48.0
> Accept: */*
>
< HTTP/1.1 502 Proxy Error
< Date: Sun, 01 May 2016 23:18:58 GMT
< Server: Apache/2.4.7 (Ubuntu)
< Content-Length: 524
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/models/illust2vec_tag.prototxt">GET&nbsp;/models/illust2vec_tag.prototxt</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at illustration2vec.net Port 80</address>
</body></html>
* Connection #0 to host illustration2vec.net left intact

Big CPU cost

When I import i2v, the CPU cost slowly climbs to 10G and then drops to 2G.
Any idea what is happening?
Could this be related to the GPU? How do I use a GPU with i2v?

Some advice about license compliance

Hello, this repository has helped me a lot, and it is very kind of you to make it open source!

Question
There are some possible legal issues with the license of your repository when it combines numerous third-party packages.
For instance, numpy and scipy, which you import, are both licensed under the BSD License.
However, the MIT License of your repository is less strict than the licenses of those packages, which violates license compatibility in your repository and may bring legal and financial risks.

Advice
You could select a more suitable license for your repository, or write a custom license with a license exception if some license terms cannot be reconciled.

Best wishes!
