Comments (8)
The Watson logo being scored can be corrected in a custom classifier by adding the image to the negative class.
So, let's say I have 200 images for each document type I want to classify, such as driver's license, passport, and tax ID card. In addition to these three classes, I include 100 images of logos and other things that belong to none of the three classes. When testing my classifier, if I find images that classify incorrectly, I can add them to the negative class to improve the results.
The reason the Watson logo gets classified is that it must have some feature that can also be found in the dalmatian class. If I had to guess, it's the dots in the image. I would also say that the 0.522 score is pretty low, and increasing the threshold will help weed out poorly classified images.
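As a rough illustration of the threshold idea, here is a minimal sketch that drops low-scoring classes on the client side. The result shape (`class` and `score` fields), the `filterByThreshold` helper, the second sample entry, and the 0.75 cutoff are all assumptions made for the example; they are not taken from the service's documented response format.

```javascript
// Hypothetical result shape: [{ class: "dalmatian", score: 0.522 }, ...].
// This is an assumption for illustration, not the service's documented output.
function filterByThreshold(classes, threshold) {
  // Keep only the classes whose score reaches the threshold.
  return classes.filter((c) => c.score >= threshold);
}

const scored = [
  { class: "dalmatian", score: 0.522 }, // the low score discussed above
  { class: "beagle", score: 0.91 },     // made-up high-confidence result
];

// 0.75 is an arbitrary illustrative cutoff; tune it on your own test images.
const accepted = filterByThreshold(scored, 0.75);
console.log(accepted.map((c) => c.class)); // logs [ 'beagle' ]
```

A stricter threshold trades missed matches for fewer false positives, so it is worth checking the cutoff against images you have labeled by hand.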
from visual-recognition-nodejs.
The Watson image used was from https://i.ytimg.com/vi/8o44asJt8ZA/maxresdefault.jpg
Yes... Any image which is not a dog provides similar results.
Thanks
JF
Hi @jflevi, thanks for the feedback.
This is due to the relatively small training image set that the demo uses. In particular, the Non-Dogs are all 4-legged critters like cats and tigers and such, so when you include that in the training data, the service tries to match whatever random image you give it to either a particular dog breed or else cats and such.
If you want to recognize arbitrary images, you'll need to either create a much larger training set (with the Non-Dogs part full of random things like company logos) or use the default classifier that the "Try" page of the demo uses.
That said, we do appreciate the feedback and we’re continually working on improving the service, so we’ll take this into account for future updates.
Nathan,
Thanks for your response, but I don't think adding more images will help in my use case, which is the following (adapted to dogs).
I have a set of images with unknown content. I want to identify which ones are dogs and which are not.
The real use case I'm trying to address with this service is this: the bank has a set of stored customer documents (70 million documents) and wants to classify and identify ID documents (passport or ID card). Other documents don't need to be classified. The default classifier is not able to recognize any ID cards.
Would you have any recommendation to address my requirements?
Thanks
Cordialement - Kind regards
Jean-Francois LEVI
Client Technical Advisor - Société Générale
Phone/Fax: +(33) 1 58 75 28 77
Mobile: +(33) 6 75 07 85 00
Email: [email protected]
From: Nathan Friedly [email protected]
To: watson-developer-cloud/visual-recognition-nodejs
[email protected]
Cc: Jean-Francois Levi/France/IBM@IBMFR, Mention
[email protected]
Date: 02/06/2016 22:05
Subject: Re: [watson-developer-cloud/visual-recognition-nodejs]
Strange behavior of Demo Classifier (#104)
That actually sounds like a very fitting use case: choose a random selection of images out of the 70 million and split them into two groups, ID cards and other. Per the documentation, you'll want at least 150-200 images in each group, and you should see some benefit all the way up to 5000 images total. (You'll likely have to shrink the images to fit within the 100 MB-per-zip limit, but 320px and larger is good.)
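The sampling and sizing guidance above could be sketched roughly as follows. The helper names (`randomSample`, `checkTrainingPlan`), the document IDs, and the choice of 150 as the per-class minimum (the lower end of the 150-200 range) are assumptions for illustration. Labeling each sampled image as "ID card" or "other" still has to be done by a person.

```javascript
const MIN_PER_CLASS = 150;     // lower end of the "150-200" guidance above
const MAX_USEFUL_TOTAL = 5000; // benefit is said to taper off around here

// Draw k items uniformly at random without replacement (partial Fisher-Yates).
function randomSample(items, k, rng = Math.random) {
  const pool = items.slice();
  const n = Math.min(k, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Check per-class image counts against the guidance; returns warnings only.
function checkTrainingPlan(counts) {
  const warnings = [];
  let total = 0;
  for (const [label, n] of Object.entries(counts)) {
    total += n;
    if (n < MIN_PER_CLASS) {
      warnings.push(`${label}: ${n} images, below ${MIN_PER_CLASS}`);
    }
  }
  if (total > MAX_USEFUL_TOTAL) {
    warnings.push(`total ${total} exceeds ${MAX_USEFUL_TOTAL}`);
  }
  return { total, warnings };
}

// Hypothetical document IDs standing in for the real archive.
const documentIds = Array.from({ length: 10000 }, (_, i) => `doc-${i}`);
const toLabel = randomSample(documentIds, 400);
console.log(toLabel.length); // 400

const plan = checkTrainingPlan({ idCards: 200, other: 120 });
console.log(plan.warnings); // one warning, for the "other" group
```

Sampling at random, rather than taking the first few hundred documents, helps the "other" group reflect the mix of content the classifier will actually see.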
Beyond that, you can also require human verification for images with a score below a given threshold, say 0.75, and then perhaps add those images to the training set so that it further improves over time. (Note: each classifier instance is immutable, so you can't add new training images to an existing one, but you can just replace it with a new one.)
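One way to picture this review loop is the sketch below. The prediction shape (`image`, `label`, `score`), the `isIdCard` field recorded by the reviewer, and both helper names are hypothetical. Actually creating the replacement classifier is left out, since that depends on the service API.

```javascript
const REVIEW_THRESHOLD = 0.75; // the example value suggested above

// Split predictions into auto-accepted ones and ones needing a human check.
function routePredictions(predictions, threshold = REVIEW_THRESHOLD) {
  const accepted = [];
  const needsReview = [];
  for (const p of predictions) {
    (p.score >= threshold ? accepted : needsReview).push(p);
  }
  return { accepted, needsReview };
}

// Build the training set for the NEXT classifier. An existing classifier
// cannot be extended, so this returns a new object instead of mutating.
function nextTrainingSet(current, reviewed) {
  const next = {
    positive: [...current.positive],
    negative: [...current.negative],
  };
  for (const r of reviewed) {
    (r.isIdCard ? next.positive : next.negative).push(r.image);
  }
  return next;
}

// Made-up predictions for illustration.
const predictions = [
  { image: "scan-1.jpg", label: "idcard", score: 0.93 },
  { image: "logo-7.jpg", label: "idcard", score: 0.52 },
];
const routed = routePredictions(predictions);
console.log(routed.needsReview.length); // 1

// A reviewer marks the logo as "not an ID card", so it joins the negatives.
const current = { positive: ["id-1.jpg"], negative: ["letter-1.jpg"] };
const updated = nextTrainingSet(current, [
  { image: "logo-7.jpg", isIdCard: false },
]);
console.log(updated.negative.length); // 2
```

Low-scoring images are often the ones the classifier finds hardest, so feeding the reviewed ones into the next training set targets its weak spots.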
Nathan,
Thanks, this is exactly what I'm currently testing, and it works fine, except that images which are not ID cards (like logos) are sometimes classified as ID cards. So I tested with the Dogs example and found the same problem. To me there is a bug somewhere. Try the following with Dogs:
Select only 3 dog breeds and try to classify the Watson logo or any other image. The result is no match, which is fine; this is the expected result.
If you select all of them, the Watson logo is wrongly classified.
That's my point, and I don't understand why it behaves like this.
Thanks a lot for your feedback.
JF
From: Nathan Friedly [email protected]
To: watson-developer-cloud/visual-recognition-nodejs
[email protected]
Cc: Jean-Francois Levi/France/IBM@IBMFR, Mention
[email protected]
Date: 03/06/2016 14:44
Subject: Re: [watson-developer-cloud/visual-recognition-nodejs]
Strange behavior of Demo Classifier (#104)
Jean-Francois wrote:
To me there is a bug somewhere. Try the following with Dogs:
Select only 3 dog breeds and try to classify the Watson logo or any other image. The result is no match, which is fine; this is the expected result.
If you select all of them, the Watson logo is wrongly classified.
That's my point, and I don't understand why it behaves like this.
I understand that is counter-intuitive. In your example, adding more training data (all dogs instead of just 3 breeds) leads to a misclassification. One of the deep problems with some machine learning techniques (including the ones we use) is that we cannot explain "why" a mistake (or right answer) was given. You might find this paper interesting and surprising: https://arxiv.org/abs/1312.6199
We do know that counter-intuitive results like this are possible, especially when the test images (Watson logo) come from a different distribution than the training images (dogs). We are actively working on ways to identify whether a classifier is "appropriate" for a particular test set, given what it was trained on. Our department has some initial results, but nothing deployed yet.
@kognate's advice above, about using large training sets (hundreds or thousands of examples per class) and augmenting your negative set with the logo images, is the best practice we can recommend at this time.
Matt
Sr Software Engineer
IBM Research - Visual Recognition