
Comments (9)

Parskatt commented on May 30, 2024

@cvbird

  1. In the current version (a new preprint will be released soon) we train either on MegaDepth only (like LoFTR), or on a combination of MegaDepth and a synthetic dataset that we made ad hoc. For MegaDepth the reference warp is constructed as in LoFTR (we use the warp_kpts function https://github.com/zju3dv/LoFTR/blob/2122156015b61fbb650e28b58a958e4d632b1058/src/loftr/utils/geometry.py#L5 with the input created from a meshgrid so that the warp is dense). The depth is provided in MegaDepth, so no additional depth-estimation method is needed. Please refer to LoFTR (or perhaps even D2-Net) for how to download and process MegaDepth. The synthetic dataset is very similar to the one presented in PDC-Net or PDC-Net+, although we have our own implementation. There the flow is computed by a set of homographies, and the confidence comes from covisibility (which is easy to compute given that we generate the warps).

  2. See above. Note that we found that using the confidence loss exclusively on MegaDepth yields better estimation results; i.e., we use the confidence in the synthetic dataset only to zero out the regression loss in certain regions.

  3. I don't remember L_swap; if you mean L_warp, then it is simply computed by running warp_kpts on the grid as described above and comparing the result with the estimated dense warp. This gives an error for each pixel; we take the 2-norm in each pixel, multiply by the reference confidence (so as to remove the loss from non-matching pairs), and then take the mean. For the confidence loss we simply use binary cross-entropy in each pixel.
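Concretely, points 1 and 3 above can be sketched as follows. This is a minimal PyTorch sketch under assumed tensor shapes, not DKM's actual code: `dense_grid` and `matching_losses` are hypothetical names, and in the real pipeline the grid would be passed through LoFTR's warp_kpts to produce the reference warp and covisibility mask.

```python
import torch
import torch.nn.functional as F

def dense_grid(b, h, w):
    # Dense pixel grid flattened to (B, H*W, 2), the keypoint format that a
    # warp_kpts-style function expects; (x, y) ordering as in LoFTR.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).float()  # (H, W, 2)
    return grid.reshape(1, -1, 2).expand(b, -1, -1)

def matching_losses(warp_est, warp_ref, conf_logits, conf_ref):
    # warp_est, warp_ref: (B, H, W, 2) estimated / reference dense warps.
    # conf_logits:        (B, H, W) predicted confidence logits.
    # conf_ref:           (B, H, W) binary reference confidence (covisibility).
    #
    # L_warp: per-pixel 2-norm of the warp error, multiplied by the reference
    # confidence so non-matching pixels contribute nothing, then averaged.
    pixel_err = torch.norm(warp_est - warp_ref, dim=-1)  # (B, H, W)
    l_warp = (pixel_err * conf_ref).mean()
    # L_conf: per-pixel binary cross-entropy against the reference confidence.
    l_conf = F.binary_cross_entropy_with_logits(conf_logits, conf_ref)
    return l_warp, l_conf
```

In practice the regression term would only be averaged over valid pixels, and on the synthetic dataset only the masking (not the BCE term) would be used, per point 2 above.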

from dkm.

Parskatt commented on May 30, 2024

@KakueiTanaka
I updated the codebase now, and there is some (hopefully working) training code.
Note that the code we provide here is adapted from our internal training framework that is quite messy, hence there might have been some translation errors. Let me know if there are some issues!

Parskatt commented on May 30, 2024

We do not plan to release the training code in the near future; however, if you have any questions regarding the training, I'm happy to answer. The training code will probably be released sometime after summer.

I'll leave this issue open in case others have the same question.

cvbird commented on May 30, 2024

Hi, Parskatt. Thank you for the excellent work. I'm trying to train it from scratch, but I still cannot understand the loss functions. As mentioned in Sec. 3.4, "The reference warps can come from projected depths like in (Sarlin et al., 2020; Sun et al., 2021) or from synthetic homographies, and the reference confidence p indicates, e.g., covisibility or consistent depth.", so:

  1. Which is used in this paper: depth or synthetic homographies? It seems the strategy differs between datasets. If depth is used, which method is used to generate the depth maps?
  2. If the reference confidence p indicates covisibility, is it a scalar or a tensor with the same shape as the reference warps? How is p obtained?
  3. Would you please give a detailed explanation of the two loss functions L_swap and L_conf?

Thank you again.

cvbird commented on May 30, 2024

Thank you for your patient illustration.
Looking forward to the new version. :)

KakueiTanaka commented on May 30, 2024

Hello, Parskatt.
Could you tell me the size of the images used when training the model?

Parskatt commented on May 30, 2024

@KakueiTanaka Hi, we use images of height = 384, width = 512. We will release preliminary training code, loss functions, etc. in a few days :)

KakueiTanaka commented on May 30, 2024

Thanks a lot!

Parskatt commented on May 30, 2024

I realized that in our internal training we actually freeze the batch-norm statistics in the ResNet backbone, which is not done here. This might cause some small discrepancies when trying to reproduce results. I'll make some updates to this code when I get back from vacation :)
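For reference, freezing the batch-norm statistics amounts to switching the BN layers to eval mode. A minimal PyTorch sketch, where `freeze_bn_stats` is a hypothetical helper and not part of the released code:

```python
import torch.nn as nn

def freeze_bn_stats(backbone: nn.Module):
    # Put every BatchNorm layer in eval mode so its running mean/var stop
    # updating. Note that model.train() flips them back to training mode,
    # so this must be called again after each model.train().
    for m in backbone.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.eval()
```

If the affine BN weights should also stay fixed, one would additionally set `requires_grad = False` on those layers' parameters.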
