
Comments (18)

shivaz560 commented on June 10, 2024

Thank you very much ahmetozlu

from code.

shervinemami commented on June 10, 2024

Have you tried reading chapters 2 & 3 of this book?


flankechen commented on June 10, 2024

@shervinemami
Yep, I read that, and I understand that image detection and tracking is a really hard problem. But the performance and the overall experience are just bad.
I guess I should start with the metaio SDK if I'm going to build something.


scotshinderman commented on June 10, 2024

I don't have the code in front of me, but IIRC there are some debugging flags
that you might want to try. When I used them, they showed how the
feature correspondences were behaving. The tunable parameters in
RANSAC and the size of the image might also be worth experimenting with.
hth



shervinemami commented on June 10, 2024

@flankechen just making sure, since some people find this code on the web and don't realize they are expected to read the book chapter before looking at the code.

I think you should be aware that Ch3 (and every chapter of this book) is intended to teach the basics of markerless AR for educational purposes, so a beginner can learn how it works, put together a working demo themselves, and decide whether to improve it using more complex methods later. Commercial tools, by contrast, are intended to get the best possible results, even if the code is extremely large and complex, took many person-years to develop, and requires PhDs to understand the maths. So there is a fundamental difference between the projects in this book and commercial projects.

If you want a black box that gives the best results, then you should look for a commercial solution. If you want to create your own AR system from scratch, you can start from the code of this book, read conference papers such as those from CVPR and ICCV at "www.cvpapers.com", and implement some more complex algorithms. A third alternative is to try an open-source/cheaper library such as ARToolKit ("http://www.hitl.washington.edu/artoolkit/").


flankechen commented on June 10, 2024

@shervinemami
Thanks for the advice. I did read ch2 and ch3 and learned a lot about how AR actually works. Though I am not new to CV, this has been a really interesting experience.
I am trying to build something with commercial products for demo purposes, while keeping an eye on open-source code and papers.
Thanks again.


BloodAxe commented on June 10, 2024

Hello! Thanks for your feedback @flankechen. As the author of ch2 and ch3, I feel uneasy that you had problems running it. Due to the limitations of the book's size, I had to cut it down to a minimal viable prototype, omitting many important things. The initial idea was to give readers a feel for the basics of Augmented Reality.

Here's how it's supposed to look: https://www.youtube.com/watch?v=CvQLy3Z89ro

There are a few places where it can break:

  1. Wrong camera intrinsic matrix. It's critical to have a calibration matrix that matches your camera; otherwise pose estimation will produce an incorrect pose no matter how good the tracking was. Please refer to the camera calibration tutorial in OpenCV.
  2. Poor homography estimation due to a small number of matches. This is likely to be your problem. Since the OpenCV library keeps changing, there is a chance that the default ORB constructor now returns too few feature points. I suggest trying SURF or AKAZE features to check this. They are generally slower than ORB, but for testing purposes that is fine.
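On point 1, the intrinsic matrix has the standard pinhole form. A small numpy sketch (the focal lengths and principal point are placeholders, not values from the book) shows how K maps camera-space points to pixels, so a wrong K shifts every projection and hence every estimated pose:

```python
import numpy as np

# Pinhole intrinsic matrix. fx, fy, cx, cy are placeholders -- they must
# come from calibrating YOUR camera, not copied from the book's sample.
fx, fy = 800.0, 800.0   # focal lengths in pixels
cx, cy = 320.0, 240.0   # principal point (roughly image center for 640x480)
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Project a 3D point in camera coordinates to pixel coordinates.
X = np.array([0.1, -0.05, 2.0])   # a point 2 m in front of the camera
u, v, w = K @ X
px, py = u / w, v / w             # perspective divide
print(px, py)                     # 360.0 220.0 for this K
```

Swap in a K from a different camera and the same 3D point lands on different pixels, which is why pose estimation silently goes wrong with a borrowed calibration.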

Sorry for the late reply; I did not have a chance to answer earlier because I was on vacation. Let's keep in touch, and I hope you will get nice tracking.


AugustoAngeletti commented on June 10, 2024

Hi BloodAxe, your video is impressive. Can you explain how you achieved it?
I suppose the basic tracking system is based on KLT, right?

Thanks
Augusto


BloodAxe commented on June 10, 2024

@AugustoAngeletti you're right. The inter-frame tracking is done using the KLT algorithm to get point movement from the previous frame to the next. However, using only KLT hurts tracking precision and introduces keypoint drift. So, after tracking feature points with KLT, the next step is to compute a homography Ha and warp the current frame with it to obtain an approximation of the reference image (this is how the system 'sees' the target). The output should be an image that looks almost identical to the reference target.
Now we can estimate an additional transformation Hr between the reference image and the warped current frame. This can be done with template matching. The final step is to compute the refined homography, which is the combination of the approximate homography Ha and Hr.

  1. Initialise tracking of the reference image (R) and the current homography H.
    For each next frame (I):
  2. Run KLT to get tracks of feature points between the two frames.
  3. Compute the inter-frame homography Hi from the feature tracks.
  4. Update H = H * Hi.
  5. Warp I using the inverse of H into R'.
  6. Template-match keypoints from R onto R' to obtain tracks of feature points between R and R'.
  7. Compute the homography Hr from the tracks in step (6).
  8. Compute the refined homography: H = H * Hr.
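The homography bookkeeping in steps 4-8 above can be checked numerically. A minimal numpy sketch (the matrices are made up for illustration: a KLT estimate with a small drift error, and the refinement that cancels it):

```python
import numpy as np

# Step 1: start tracking with the identity homography.
H = np.eye(3)

# Step 3: the inter-frame estimate from KLT says "shift 2 px right",
# but carries a small 0.1 px drift error in y.
Hi = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, 0.1],
               [0.0, 0.0, 1.0]])
H = H @ Hi          # step 4: accumulated (drifting) estimate

# Steps 5-7: warping the frame with inv(H) and template-matching against
# the reference yields the residual transform Hr that cancels the drift.
Hr = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, -0.1],
               [0.0, 0.0, 1.0]])
H = H @ Hr          # step 8: refined homography -- a clean 2 px shift
```

The point of the refinement is visible in the result: the drift term introduced by Hi is gone from the final H, which is why the KLT-only version jitters and this version does not.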


AugustoAngeletti commented on June 10, 2024

@BloodAxe thank you a lot for the reply. Now everything is much clearer.
I have a couple of questions.
When you say template matching, do you mean using a matcher to get kNN matches?
About local feature detection and descriptor extraction: in my project (a port of the markerless example to Unity) I'm using ORB; would it be better to use AKAZE?
I'm using OpenCV for Unity (a wrapper of OpenCV Java), but I have a problem with matrix multiplication. In OpenCV C++ you can compute the product as you wrote, H = H * Hr, but this operator is not implemented in the Java version. Some people say to use the function gemm(), others H.mul(Hr). I tested both with very bad results. Do you have any advice?

In my example the corners generated by the homography (after homography refinement) flicker a lot. I'm thinking of using a Kalman filter to smooth the position. Do you think that is a good idea?
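On the gemm() vs mul() question above: cv::Mat's mul() is per-element multiplication, not a matrix product, which would explain the bad results; gemm() (Core.gemm in the Java bindings) computes the true matrix product needed to compose homographies. A quick numpy illustration with two made-up translation homographies shows how different the two operations are:

```python
import numpy as np

# Two 3x3 translation homographies (made up for illustration).
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, 7.0],
              [0.0, 0.0, 1.0]])

elementwise = A * B     # what Mat.mul() computes -- wrong for composition
composed    = A @ B     # a true matrix product -- what gemm() computes

print(elementwise[0, 2], composed[0, 2])   # 10.0 7.0
```

The composed matrix is the translation (7, 10), as expected when chaining the two shifts; the element-wise product is not a valid homography at all.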

Thanks
Augusto


OriginLive commented on June 10, 2024

Hello, I'm trying to do the exact same thing @AugustoAngeletti is, but I'm getting really poor results. Have you managed to get anything with decent speed to work with multiple images?


BloodAxe commented on June 10, 2024

Yes, the second edition of this book will contain an updated chapter on a simultaneous multi-target tracking algorithm.


OriginLive commented on June 10, 2024

@BloodAxe Hello, in ch3 there's a comment:
// Then we add vector of descriptors (each descriptors matrix describe one image).
// This allows us to perform search across multiple images:

I've added multiple descriptor matrices, each from its own image. How can I know which image was found, based on its descriptors? Basically I would like to know which image was detected.

Edit: Got it by checking the DMatch. However, it is super slow with a mere 100 images. Any ideas how to speed it up?

Another update: Managed to get it to work. For 200 pictures it takes about 700 ms, so you need to match only a portion each frame. You put the different descriptors in the training set, and you can save either the descriptors or the trained matcher so it doesn't take as much time. The best I could do was 8 pictures per frame to keep 30 fps.


shivaz560 commented on June 10, 2024

@BloodAxe @shervinemami I have just started working with your program and am trying to test some of the features. I would like to use a live feed from the camera, but whenever I try running the demo with pattern.png from the command line, it always says it cannot open the video source device. I passed the arguments exactly as in the book. Could you tell me whether I have missed something?


ahmetozlu commented on June 10, 2024

processVideo(patternImage, calibration, cv::VideoCapture()); // line 62

You should change this line if you use a webcam:

processVideo(patternImage, calibration, cv::VideoCapture(0));

If you use a second camera, change it to:

processVideo(patternImage, calibration, cv::VideoCapture(1));

That's all!


BloodAxe commented on June 10, 2024

@OriginLive, answering your question regarding multi-image matching: cv::DMatch has an imgIdx field that corresponds to the zero-based index of the image at the training stage.

As a rule of thumb, you should limit the maximum number of keypoints, and the size of the image targets should be similar to the video size. If your video is 640x480, there is no sense in training the matcher on 4K image targets.

For 200 images I'd go for an LSH index tree (available in OpenCV's FLANN matcher). If the number of images were 10,000+, I'd suggest a Bag-of-Words approach.


shivaz560 commented on June 10, 2024

@BloodAxe @shervinemami I tried out the markerless AR project and it works pretty well. I know that your approach is based on a mono camera. Have you extended your work to stereo vision, as I think it might give a better result? Is there example code or a toolbox based on stereo vision and markerless AR?


BloodAxe commented on June 10, 2024

@shivaz560 I don't see what benefits stereo vision could give in this case.
Answering your question: no, there is no code for stereo.

