
Comments (8)

pavitrakumar78 commented on May 31, 2024

Please check the link here:
https://github.com/pavitrakumar78/Street-View-House-Numbers-SVHN-Detection-and-Classification-using-CNN/tree/master/cnn_models

I have included all the files required to load the CNNs: the *.h5 files are the weights and the *.json files are the model architecture files.
If you don't want to know how the training was done, you can ignore the train_digit_classification.py and train_digit_detection.py files. The datasets used to train both CNNs are constructed in construct_datasets.py.

Taking a single image as input and getting the bounding box is handled by the find_box_and_predict_digit() function in the combi_models.py file (after loading the pre-trained CNN models). It follows exactly the detection pipeline given in the README.
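For reference, loading one of these architecture/weights pairs with the standard Keras API looks roughly like this (a minimal sketch; json_path and weights_path are placeholders, not the repo's actual file names):

```python
# Minimal sketch of loading a Keras model from a .json architecture file
# plus a .h5 weights file; json_path/weights_path are placeholders.
from keras.models import model_from_json

def load_cnn(json_path, weights_path):
    with open(json_path, 'r') as f:
        model = model_from_json(f.read())   # rebuild the architecture
    model.load_weights(weights_path)        # restore the trained weights
    return model
```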

Look through the code and let me know if anything needs explanation.


TharakaMadhusanka commented on May 31, 2024

Hi!

  1. How long does this take to process one input in find_box_and_predict_digit()?

I ran combi_models.py after setting a single input for find_box_and_predict_digit(). The application runs without errors, but it produces no output and never exits: no predictions, no errors.

  2. Can't we directly input an image, find the number area, and return what the number is? What is the purpose of the .h5 files for the training image set?

Thank you


pavitrakumar78 commented on May 31, 2024

How long does this take to process one input, in find_box_and_predict_digit()?
It runs almost instantly, since it is only the prediction step.

I would not recommend running the full combi_models.py, as it also performs tests which can take a long time depending on your PC.

Can't we directly input an image, find the number area, and return what the number is?

Yes, you can. Just call find_box_and_predict_digit() with a 3D numpy array of the image as input. If you only need this function, you can remove lines 95 to 197 in combi_models.py. If you want to input your own image, you need to add the code for that yourself; at the moment, I only take images from the test_data dataframe, which contains processed versions of all the test images in the dataset.
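For example, feeding your own image could look something like this (a rough sketch; I'm assuming the function only needs the image array as described above, so check the actual signature in combi_models.py, and my_house_number.png is a placeholder file name):

```python
# Rough sketch of passing your own image; the file name is a placeholder and
# the call assumes the function takes a 3D (H, W, 3) numpy array as input.
import numpy as np
from PIL import Image

img = np.array(Image.open('my_house_number.png').convert('RGB'))
print(img.shape)  # e.g. (H, W, 3)

# After loading the pre-trained CNNs, call the detection/classification function:
# prediction = find_box_and_predict_digit(img)
```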

What is the purpose of the .h5 files for the training image set?

.h5 is just a file format for storing data; it can hold anything you want. Instead of reading the images one by one every single time, I read them once, process them, put them all into a dataframe, and dump that as a .h5 file. This makes it easier to move the data around and to reproduce results.
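For illustration, the dump/reload pattern with pandas looks something like this (a sketch with placeholder data; the actual column names, preprocessing, and file names in the repo may differ):

```python
# Sketch of caching processed images in a .h5 file with pandas;
# the arrays and labels below are placeholders, not the real SVHN data.
import numpy as np
import pandas as pd

processed_images = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(3)]
labels = [12, 345, 6]

# Read and process the images once, then dump the dataframe as .h5 ...
df = pd.DataFrame({'img': processed_images, 'label': labels})
df.to_hdf('test_data_processed.h5', key='data')

# ... so later runs can reload everything instead of re-reading each image file.
test_data = pd.read_hdf('test_data_processed.h5', key='data')
print(len(test_data))
```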

If you run combi_models.py with the data files in the proper paths, this is the output you should get:

2018-06-23 18:07:32.073412: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 660
major: 3 minor: 0 memoryClockRate (GHz) 1.0845
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.65GiB
2018-06-23 18:07:32.081087: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2018-06-23 18:07:32.089619: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0:   Y
2018-06-23 18:07:32.098400: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 660, pci bus id: 0000:01:00.0)
box prediction absolute difference 247418.05103462934
box prediction absolute difference (64x64) 247418.05103462934
box prediction absolute difference (orig size) 512364.54729372263
class prediction accuracy 0.9650237257002908
full digit prediction accuracy 0.510867901423542
individual digit accuracies: [0.70924537 0.65192102 0.84517067 0.9846931 ]
Predicted digit: 1522
Predicted digit: 135
Predicted digit: 861
Predicted digit: 348
Predicted digit: 114
Predicted digit: 23
Predicted digit: 863
Predicted digit: 6
Predicted digit: 8
Predicted digit: 1
Predicted digit: 1410
Predicted digit: 27
test ID 4416
Predicted digit: 16
test ID 11328
Predicted digit: 2
test ID 9766
Predicted digit: 251
test ID 5966
Predicted digit: 210
test ID 2699
Predicted digit: 3
test ID 5680
Predicted digit: 122
test ID 12293
Predicted digit: 1
test ID 10218
Predicted digit: 112
test ID 11298
Predicted digit: 2
test ID 7803
Predicted digit: 74

The first few lines are library-related info printed automatically by TensorFlow; you can ignore them. Before the code prints the "Predicted digit" line for each case, a matplotlib window will pop up showing the image and the bounding box.


TharakaMadhusanka commented on May 31, 2024

Thanks for keeping in touch! My only requirement is to input an image and find the number in it, so, as you say, running the whole combi_models.py is a waste. Can you please tell me which parts of the code I need for my requirement? Also, it took about 20 minutes before I got the first line of output.


pavitrakumar78 commented on May 31, 2024

You can remove lines 95 to 197.


TharakaMadhusanka commented on May 31, 2024

Hi! I followed your advice, though it still takes around 20 minutes to return output. I do get the number-detection plot, but the process consumes too much time. Can you briefly explain how the code works? :)


pavitrakumar78 commented on May 31, 2024

I can't do anything about the runtime. Feel free to make a pull request to optimize the code.

As I mentioned before, the code follows the pipeline described in the README here. It is very straightforward, and I recommend adding print statements in between if you are confused about what is happening, but please don't expect a line-by-line explanation.


TharakaMadhusanka commented on May 31, 2024

Thank You Bro ! KIT !


