Comments (24)
I also wanted to play around with the pre-trained weights of the holistic model, so I downloaded 'vgg16_weights_th_dim_ordering_th_kernels_Holistic_91.11.h5'.
I used Keras with TensorFlow and assumed that using VGG16 from keras.applications should work:
from keras import applications
vgg = applications.VGG16(include_top=True, weights='PATH_TO_WEIGHTSFILE', classes=16)
Turns out you don't even have to convert the weights from theano to tensorflow on your own, since keras does this internally in model.load_weights (which is called inside vgg16 if you provide a weightsfile).
Initializing the model and loading the weights seems to work; I didn't get any errors.
I then used a few examples from RVL-CDIP to test everything. Sadly, every image tested was classified as memo.
Being suspicious about the weight conversion, I set up a new project and installed Keras with Theano. Again, loading the model with weights worked, but all test images were classified as memo.
In 'IV-B Preprocessing' of the paper it is said that:
Following the resizing, all datasets were standardized
Can someone clarify what "standardized" means? Mean pixel subtraction? Rescaling?
I would appreciate if someone could confirm that the provided weights actually work.
from document-image-classification-tl-sg.
For anyone looking to run this with Tensorflow 2.0, the following will work.
Install dependencies:
pip install tensorflow
pip install keras
pip install pillow (used for inference later)
Download a weights file, e.g. vgg16_weights_th_dim_ordering_th_kernels_Holistic_91.11.h5, from Google Drive.
Download the convert script from this repo.
Open the convert script, and make the following changes:
- Set the model_weights array at the top to point to the weight file(s) you have downloaded
- Replace K.set_image_dim_ordering('th') with K.common.set_image_dim_ordering('th').
Run python Weight_conversion_th_to_tf_Keras2.py from a terminal/command prompt.
A new folder (tf-kernels-channels-last-dim-ordering) is created and contains the converted weights file.
Open the folder and create a new file called test.py with the following code:
from keras import applications
vgg = applications.VGG16(include_top=True, weights='./vgg16_weights_th_dim_ordering_th_kernels_Holistic_91.11.h5', classes=16)
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
class_map = ['letter', 'form', 'email', 'handwritten', 'advertisement',
'scientific report', 'scientific publication', 'specification', 'file folder',
'news article', 'budget', 'invoice', 'presentation', 'questionnaire',
'resume', 'memo']
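def test(path):
    # Load the image at VGG16's input size
    img = image.load_img(path, target_size=(224, 224))
    img = image.img_to_array(img)
    # Keras' stock VGG16 preprocessing (ImageNet channel means), as used in this guide
    img = preprocess_input(img)
    # Add the batch dimension expected by predict()
    x = np.expand_dims(img, 0)
    y = vgg.predict(x)
    print(y)
    idx = np.argmax(y)
    print('predicted class: {}'.format(class_map[idx]))

test('../form.jpg')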
Now run the code with python test.py and it will print out the predicted class of the image.
@martinnormark Hey thanks for the guide. I've added a link to this on the main Readme.
Thanks @lgaida, I successfully loaded the trained weights as you suggested. My input is preprocessed as follows:
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
img = image.load_img('my_image', target_size=(224, 224))
img = image.img_to_array(img)
img = preprocess_input(img)
x = np.expand_dims(img, 0)
But when I tried to predict with the holistic model, I had the same problem as you:
y = vgg.predict(x)
np.argmax(y) # always end up at id 8 (which is file folder)
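When every image lands on the same label, it can help to look at a whole batch of predictions rather than one image at a time. The following is only a diagnostic sketch: summarize_predictions and the toy probabilities are hypothetical stand-ins for the output of vgg.predict, not part of the repository.

```python
import numpy as np

def summarize_predictions(probs):
    """Return the predicted label per sample and how often each class was predicted."""
    probs = np.asarray(probs)
    labels = np.argmax(probs, axis=1)
    counts = np.bincount(labels, minlength=probs.shape[1])
    return labels, counts

# Hypothetical softmax outputs for 4 images over 3 classes (toy numbers).
toy_probs = np.array([
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.3, 0.3, 0.4],
    [0.1, 0.1, 0.8],
])

labels, counts = summarize_predictions(toy_probs)
# The predictions have "collapsed" if only one class is ever predicted.
collapsed = np.count_nonzero(counts) == 1
print(labels, counts, collapsed)
```

If the raw probabilities are also nearly identical across very different images, the input preprocessing or the loaded weights are more likely at fault than the individual images.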
@hiepph too bad 😢
Can someone clarify what "standardized" means? Mean pixel subtraction? Rescaling?
By "standardized", we mean subtract the mean and divide by the standard deviation.
Regarding the data loading issues, I can try to look into our old code and configurations and try to elaborate.
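As a minimal sketch of "subtract the mean and divide by the standard deviation": the example below uses random toy data in place of RVL-CDIP and assumes the statistics are computed per pixel across samples, which this comment does not specify.

```python
import numpy as np

def standardize(x, mean, std):
    """Subtract the mean and divide by the standard deviation."""
    return (x - mean) / std

# Toy stand-in for a stack of grayscale images: (samples, height, width).
rng = np.random.default_rng(0)
images = rng.uniform(0, 255, size=(8, 4, 4)).astype(np.float32)

# Assumption: statistics are taken per pixel, across the sample axis.
pixel_mean = images.mean(axis=0)  # shape (4, 4)
pixel_std = images.std(axis=0)    # shape (4, 4)

standardized = standardize(images, pixel_mean, pixel_std)
print(standardized.mean(axis=0).round(3))  # approximately 0 at every pixel
print(standardized.std(axis=0).round(3))   # approximately 1 at every pixel
```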
I can confirm however that everything we did was using theano as the backend.
So the input dimensions as well as the weights are in theano ordering. If you are using tensorflow as the backend, then you have to either switch backends to theano or change weight orderings for everything to work I think.
Turns out you don't even have to convert the weights from theano to tensorflow on your own, since keras does this internally in model.load_weights (which is called inside vgg16 if you provide a weightsfile).
I can, however, neither confirm nor deny this, since I have not worked with that functionality myself.
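To illustrate only the input side of the ordering difference: Theano ordering puts channels first, (samples, channel, height, width), while TensorFlow's default puts them last. The sketch below moves a toy batch between the two layouts with NumPy; it does not convert kernel weights, which is a separate step handled by the conversion script in the repo.

```python
import numpy as np

# Toy batch in Theano-style channels-first layout: (samples, channel, height, width).
batch_cf = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)

# Move the channel axis to the end: (samples, height, width, channel).
batch_cl = np.transpose(batch_cf, (0, 2, 3, 1))

print(batch_cf.shape)  # (2, 3, 4, 5)
print(batch_cl.shape)  # (2, 4, 5, 3)
```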
Hi @saikat-roy thank you for replying 👍
It would be fantastic if you could peek at your code again, maybe providing some code snippets. Playing around with dim-ordering is fine, but guessing and assuming preprocessing is way harder.
Hey @lgaida. We apologize for not replying sooner but the source code of the project was never really written for what you might call, public consumption (also known as, it's an absolute mess) so we are scrambling to dig it out of storage.
It would be fantastic if you could peek at your code again, maybe providing some code snippets. Playing around with dim-ordering is fine, but guessing and assuming preprocessing is way harder.
# X is the main data matrix organized as (samples,channel,height,width) formatting
# Initially X has been created with 3 channels to match original VGG16 input but
# since RVL-CDIP images are grayscale, we simply copy the 1st channel onto the
# 2nd and 3rd channel. but after standardization as you will see below.
_mean = X[:,0,:,:].mean(axis=0)
_std = X[:,0,:,:].std(axis=0)
_jmp = 1000 # We essentially do the standardization in mini-batches
# of size '_jmp' due to memory constraints
for i in range(0,X.shape[0],_jmp):
    end = min(i+_jmp,X.shape[0])
    X[i:end,0,:,:] = (X[i:end,0,:,:]-_mean)/_std # batch standardization
    X[i:end,1,:,:] = X[i:end,0,:,:] # batch copying to channel 2
    X[i:end,2,:,:] = X[i:end,0,:,:] # batch copying to channel 3
I am digging through our old files, and this is the preprocessing snippet that I found we had used. I should warn you, however, that the _mean and _std calculations we used are the naive versions and will consume a ridiculous amount of memory; without a very large amount of RAM they will probably lead to crashes. We used AWS EC2 instances (and we were being a bit lazy), so it wasn't a problem for us, but I would recommend modifying it in some way (maybe computing them manually in mini-batches) to suit smaller hardware configurations.
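One possible way to compute the statistics in mini-batches is to accumulate per-pixel sums and sums of squares. This is a sketch of that idea, not the code the authors ran; batched_mean_std and the toy array are hypothetical, and float64 accumulators are an assumption made to limit rounding error.

```python
import numpy as np

def batched_mean_std(x, batch_size=1000):
    """Per-pixel mean and std over axis 0, accumulated one mini-batch at a time."""
    n = x.shape[0]
    total = np.zeros(x.shape[1:], dtype=np.float64)
    total_sq = np.zeros(x.shape[1:], dtype=np.float64)
    for start in range(0, n, batch_size):
        # Only one mini-batch is converted to float64 at a time.
        chunk = x[start:start + batch_size].astype(np.float64)
        total += chunk.sum(axis=0)
        total_sq += (chunk ** 2).sum(axis=0)
    mean = total / n
    # Var = E[x^2] - E[x]^2; clip tiny negative values caused by rounding.
    var = np.maximum(total_sq / n - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Toy stand-in for X[:, 0, :, :]: (samples, height, width).
rng = np.random.default_rng(0)
toy = rng.uniform(0, 255, size=(25, 6, 6)).astype(np.float32)

mean, std = batched_mean_std(toy, batch_size=10)
print(np.allclose(mean, toy.mean(axis=0), atol=1e-3))
print(np.allclose(std, toy.std(axis=0), atol=1e-3))
```

The result matches the population standard deviation (NumPy's default, ddof=0), which is what X[:,0,:,:].std(axis=0) computes in the snippet above.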
Thanks for replying so quickly. I'm going to play around with your code snippet, and I'm currently implementing something very similar on my own.
To remove a few more assumptions:
X in your code snippet represents the train images of rvl-cdip, and you normalize the test samples with the mean & std of this X (=train samples), right?
Or is X the whole rvl-cdip including train, test, validation?
X in your code snippet represents the train images of rvl-cdip, and you normalize the test samples with the mean & std of this X (=train samples), right?
Or is X the whole rvl-cdip including train, test, validation?
While the first case you suggested might be more experimentally sound, we actually ran this snippet separately for train, test and validation sets, standardizing each dataset with their own mean and standard deviation.
Hello again,
I installed Theano and tested both my own and your implementation of the normalization. I'm still not able to make good predictions 👎
If you don't want to publish the code, is there any chance I might get it? I would try to come up with a publishable code snippet, providing a small example of how to use the weights for prediction.
I installed Theano and tested both my own and your implementation of the normalization. Still not able to make good predictions 👎
That's odd. I'm guessing you already made the changes in the keras.json configuration file by setting "backend" and "image_data_format". Strange that it wouldn't work.
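For reference, the two settings mentioned here could look like the sketch below. The values "theano" and "channels_first" are what a Theano-ordered setup implies, not something confirmed from the authors' machine. Keras normally reads this file from ~/.keras/keras.json; the sketch writes to a temporary directory instead so that it does not overwrite an existing configuration.

```python
import json
import os
import tempfile

# Assumed values for a Theano backend with channels-first image ordering.
config = {
    "backend": "theano",
    "image_data_format": "channels_first",
}

# Written to a temp directory here; the real file lives at ~/.keras/keras.json.
config_path = os.path.join(tempfile.mkdtemp(), "keras.json")
with open(config_path, "w") as f:
    json.dump(config, f, indent=4)

# Read it back to confirm what was written.
with open(config_path) as f:
    loaded = json.load(f)
print(loaded)
```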
If you don't want to publish the code, any chances i might get it? I would try to come up with a publishable code snippet, providing a small example on how to use the weights for prediction.
Sure. Give us a little time, like a day or so, and we'll give you the version of the code that we had used.
I installed Theano and tested both my own and your implementation of the normalization. Still not able to make good predictions
Hey @lgaida. I was digging around our code and I saw something. I know the last version I gave you didn't have a NaN guard for the standardization. Did your version have one?
_jmp = 1000
eps = 0.0001
for i in range(0,X.shape[0],_jmp):
    end = min(i+_jmp,X.shape[0])
    X[i:end,0,:,:] = (X[i:end,0,:,:]-_mean)/(_std+eps) # batch standardization
    X[i:end,1,:,:] = X[i:end,0,:,:] # batch copying to channel 2
    X[i:end,2,:,:] = X[i:end,0,:,:] # batch copying to channel 3
Hey @lgaida. I was digging around our code and I saw something. I know the last version I gave you didn't have a NaN guard for the standardization. Did your version have one?
Kind of, I initialized the array with zeroes.
Sure. Give us a little time, like a day or so, and we'll give you the version of the code that we had used.
Sounds great 👍 I'll be waiting until then :) Feel free to contact me via github or email (see github-profile)
Kind of, I initialized the array with zeroes.
I mean to say that (as far as I remember) the std of X is 0 in some places, so you would be getting NaNs in the standardized input in some places. Do we mean the same thing? It was an issue for us, if I am remembering correctly. Try adding a small value like 0.0001 to the _std as above and run the examples again, if you haven't specifically guarded against this yet.
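The effect described here can be reproduced on a toy array in which one pixel has the same value in every sample, so its standard deviation is exactly 0. The data below is made up for illustration; the guard value 0.0001 is the one suggested in this thread.

```python
import numpy as np

# 3 samples, 2 pixels; pixel 0 is constant across samples, so its std is 0.
x = np.array([
    [0.0, 10.0],
    [0.0, 20.0],
    [0.0, 30.0],
])
mean = x.mean(axis=0)
std = x.std(axis=0)
eps = 0.0001

# Without a guard, the constant pixel becomes 0 / 0 = NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    unguarded = (x - mean) / std

# With the guard, the constant pixel becomes 0 / eps = 0.
guarded = (x - mean) / (std + eps)

print(np.isnan(unguarded).any())  # True
print(np.isnan(guarded).any())    # False
```

Initializing the array with zeros does not help here, because the NaN comes from the division itself rather than from uninitialized memory.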
Hello @lgaida , thanks for your interest in our work and reaching out to us.
Kind of, I initialized the array with zeroes.
I would repeat what @saikat-roy mentioned: even though the array was initialized with all zeros, that unfortunately doesn't guarantee that you won't get NaNs. Please consider adding this safeguard to your code and let us know.
Hello @lgaida , thanks for your interest in our work and reaching out to us.
Kind of, I initialized the array with zeroes.
I would repeat what @saikat-roy mentioned: even though the array was initialized with all zeros, that unfortunately doesn't guarantee that you won't get NaNs. Please consider adding this safeguard to your code and let us know.
I just added the guard but still get label 8 for every tested sample 😢
Hi @saikat-roy, can you provide the mean and std values of your training set so I can standardize my inputs before forwarding them through the trained model?
Hey sorry for the late reply.
Hi @saikat-roy, can you provide the mean and std values of your training set so I can standardize my inputs before forwarding them through the trained model?
I'm really sorry but we don't have the computational environment setup, that we had set up for processing the dataset, available currently.
I just added the guard but still get Label 8 for every tested sample
We will however, be looking into releasing more of our code and testing the model weights ourselves since it is disturbing to hear the model weights do not load as expected. While we cannot do it immediately, we do plan to try it in a week or two.
So I would request your patience for a while longer and hopefully we can get back to you with better news than "we don't know what's wrong, this shouldn't be happening".
Just want to remind you that i could also take a look at the code 👋
Hi Saikat, Arindam,
First of all, thanks for writing this great article.
I am also getting everything predicted as 8. Below is my code. Kindly assist and let us know how to get this resolved.
from keras import applications
vgg = applications.VGG16(include_top=True, weights='F:/Doc_Image_Classification/vgg16_weights_th_dim_ordering_th_kernels_Holistic_91.11.h5', classes=16)
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
img = image.load_img('F:/Doc_Image_Classification/images/pic1.png', target_size=(224, 224))
img = image.img_to_array(img)
img = preprocess_input(img)
x = np.expand_dims(img, 0)
y = vgg.predict(x)
np.argmax(y)
Okay, so first and foremost, we are sincerely sorry about the ridiculously late updates to this issue. Unfortunately, as we mentioned, we have since stopped working on this project and have literally no hardware or software setup available to test the models any more. I know it's frustrating to have your queries go unanswered, but we have had little to no time to really go through the code for this bug. We have thought a lot about it and, simply put, it did NOT exist when we worked on it.
The reason I am writing this update is to mention that we recently went through multiple issues on the keras forums regarding issues with model.save and model.load in keras. From our end, the code should run fine if the data is simply standardized as I had mentioned earlier, which everyone seems to be doing as well - so if you are still using our code I gently urge you to look into whether the keras bugs for serialization are to blame here. We will go over it ourselves if we can but without a proper hardware setup, we sincerely can't promise anything in terms of time.
I thank you for being patient with us and again we sincerely apologize for not actively helping out with the issue. To anyone who needs our code, we will attempt to simply just release the .py files with some minor cleaning soon - since we can't help out actively this is the least we can do at this point.
An attempt to solve the weight loading has been added to the readme. So we'll be closing this issue.