hiarindam / document-image-classification-tl-sg


Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

Home Page: https://arxiv.org/abs/1801.09321

License: MIT License

Python 100.00%
deep-convolutional-neural-networks deep-learning document-classification document-image-classification image-classification structure-learning training-strategies transfer-learning

document-image-classification-tl-sg's Issues

train code

I'd like to use this algorithm for a different ticket classification task. Could you share the training code?

Source code

Besides the pretrained weights, are you also planning to provide the reference source code (preprocessing, meta-classifier, etc.) used for the article?

Thanks

For evaluation, is the test set normalized in the same manner as the train set?

I computed the mean and std for RVL-CDIP to be 0.9919 and 0.1853, so these are the transforms I'm using for the train set:
dataset = RvlCdipDataset(
    'labels/train.txt',
    'images/',
    transform=transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.Normalize(0.9199, 0.1853),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1)),
    ]),
)

Do I use the same transforms for the test set?
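
The usual convention (not confirmed by the repository authors in this thread) is to compute the mean and std on the train split only and apply exactly the same deterministic transforms to the test split. The sketch below shows one way to share a single transform builder between both splits. It assumes torchvision and Pillow are installed and that the dataset yields grayscale PIL images, which is why it adds a `ToTensor()` step; `build_transform` is a hypothetical helper name, not part of the repository.

```python
def build_transform(mean, std, size=(224, 224)):
    """Build one deterministic transform to reuse for train and test.

    mean/std are the single-channel statistics computed on the TRAIN
    split. Assumption: inputs are grayscale PIL images.
    """
    # Imported lazily so the module loads even without torchvision.
    from torchvision import transforms

    return transforms.Compose([
        transforms.Resize(size),
        transforms.ToTensor(),                   # PIL -> float tensor in [0, 1]
        transforms.Normalize([mean], [std]),     # train-split statistics
        transforms.Lambda(lambda x: x.repeat(3, 1, 1)),  # 1 -> 3 channels
    ])
```

With this helper, `train_tf = build_transform(train_mean, train_std)` and `test_tf = build_transform(train_mean, train_std)` receive the same arguments, so the test images go through the same scaling the network saw during training.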

Concatenation of the base class softmax predictions

Dear author,

Please allow me to clarify some details regarding the implementation. To run inference on an image using your proposed "MLNN based stacking of holistic & region-based models with inter and intra-domain weights transfer" method, I would first run each of the 5 base models (with your given weights) on the image and get 5 different "base class softmax predictions". I would then concatenate those 5 softmax predictions. Since each softmax prediction has 16 scores, wouldn't their concatenation have a total of 5 * 16 = 80 scores? Afterwards, I am supposed to feed those 80 scores to an MLNN to get the final prediction. Is that right? Do you also have the weights for the MLNN, or do I have to train it on the validation set myself?

Please help me as I would like to be able to reproduce your results (92.21% accuracy) for my research project.

Yours sincerely,
Gordon
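
The shape arithmetic in this question can be checked with a small sketch. It follows the reading described in the issue (5 base models, 16 classes each, predictions concatenated in a fixed model order), which the authors have not confirmed here. The function names and the dummy logits are hypothetical, and the meta-classifier itself is not shown.

```python
import math

NUM_BASE_MODELS = 5   # as described in the issue
NUM_CLASSES = 16      # scores per softmax prediction

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stack_features(base_predictions):
    """Concatenate per-model softmax vectors into one feature vector."""
    if len(base_predictions) != NUM_BASE_MODELS:
        raise ValueError("expected one prediction per base model")
    features = []
    for probs in base_predictions:
        if len(probs) != NUM_CLASSES:
            raise ValueError("expected one score per class")
        features.extend(probs)
    return features

# Dummy logits stand in for the outputs of the five base models.
dummy_logits = [[float((m + c) % 7) for c in range(NUM_CLASSES)]
                for m in range(NUM_BASE_MODELS)]
meta_input = stack_features([softmax(row) for row in dummy_logits])
print(len(meta_input))  # 80
```

The order of the base models in the concatenation has to be the same at training and inference time, since the meta-classifier learns a weight for each of the 80 positions.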

How to load pre-trained weights?

I see you posted a link to the pre-trained weights, but there is no tutorial on how to use them.

I tried with Keras:

from keras.models import load_model
load_model('vgg16_weights_th_dim_ordering_th_kernels_Holistic_91.11.h5')

but it failed with ValueError: No model found in config file.

How do I load these h5 weights? Can you provide the model source so I can use model.load_weights(...)?
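
Some background on the error: `load_model` expects a file written by `model.save()`, which stores the architecture in a `model_config` attribute, whereas a weights-only file has no such attribute. The sketch below first inspects the file with h5py, then builds an architecture in code and calls `load_weights`. It is written against the Keras 2.x API and rests on assumptions the authors have not confirmed: that the file matches the stock Keras VGG16 topology with a 16-class head, and that "th_dim_ordering" in the file name means channels-first input. Both function names are hypothetical. If the layer shapes differ, `load_weights` will raise an error.

```python
def describe_h5(path):
    """Report whether an HDF5 file looks like a full model or weights only."""
    import h5py  # lazy import; requires h5py

    with h5py.File(path, "r") as f:
        return {
            # Present only in files written by model.save().
            "has_model_config": "model_config" in f.attrs,
            # For weights-only files these are typically layer names.
            "top_level_groups": list(f.keys()),
        }

def load_vgg16_with_weights(weights_path, num_classes=16):
    """Build a VGG16 in code, then load weights into it.

    Assumption: the file matches the stock Keras VGG16 layout.
    """
    from keras import backend as K
    from keras.applications.vgg16 import VGG16

    # "th_dim_ordering" suggests (channels, height, width) inputs.
    K.set_image_data_format("channels_first")
    model = VGG16(weights=None, include_top=True,
                  input_shape=(3, 224, 224), classes=num_classes)
    model.load_weights(weights_path)
    return model
```

Calling `describe_h5` on the downloaded file first shows whether `has_model_config` is false and which layer groups the file contains, which helps when matching the architecture by hand.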
