
Comments (3)

alexd314 commented on August 18, 2024

Hello,

I am afraid this will not be a very easy process: we have not documented any of the details yet, so it will require going through the source code to understand what is going on.

Here, I will only try to give you a rough outline of one or two of the most important aspects. However, keep in mind that there will certainly be other parts that require additional attention.

Regarding the background augmentation: in order to be able to use the model in real-world cases, during training we synthetically "blend" the 3D replica of the structure with artificial backgrounds taken from public and in-house RGB-D datasets (from which we only use the depth maps, discarding color). In main.py, which constitutes the training script, the command line arguments --vcl_path, --intnet_path and --corbs_path point to local paths of 3 such datasets. We have not released the datasets we used; however, in principle any depth dataset, either public or captured in-house, would do. A folder containing depth map files would suffice. Note that the datasets do not need to be annotated in any way; we just want real-world depth captures of real-world scenes. That's all. For a reference implementation of how we load our background datasets, check here. An important remark is the scale parameter of the ImageBackgroundSamplerParams instance, which applies additional scaling when loading the depth map (e.g. to account for the case where the 3D model of the structure is in meters but your depth maps are in millimeters).
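
Just to illustrate the idea of that blending step (this is a hypothetical helper, not the repository's actual implementation): wherever the rendered structure has no depth, the pixel is filled from a real background depth map.

```python
import numpy as np

def blend_with_background(render_depth, background_depth):
    """Composite a synthetic depth render over a real background capture.

    Illustrative sketch only: pixels where the rendered structure is
    absent (depth == 0) take their value from the background depth map.
    Both arrays must be in the same unit (e.g. meters).
    """
    return np.where(render_depth > 0, render_depth, background_depth)

# Toy example: a 2x2 render with a single structure pixel at 1.5 m.
render = np.array([[0.0, 1.5], [0.0, 0.0]], dtype=np.float32)
background = np.array([[3.0, 3.0], [2.5, 2.0]], dtype=np.float32)
blended = blend_with_background(render, background)
# blended -> [[3.0, 1.5], [2.5, 2.0]]
```

The scale parameter mentioned above would be applied to `background_depth` before this step, so both maps share one unit.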

Most importantly, apart from the backgrounds needed for augmentation, in order to retrain the model you need a 3D replica of the structure authored in a 3D authoring tool like Blender, Maya, 3DSMax or any equivalent. How you build the 3D replica is actually not a concern, as long as it is exported in .obj format, potentially with annotations (obj "o" elements) carrying the label of each box side. If your structure is comprised of simple boxes like ours, the main thing to take care of is being precise about the length measurements of the boxes' sides and their relative placement. We believe this is more easily accomplished in a 3D authoring tool as mentioned before, but if you don't want to use one and you have the box lengths, you could even script an .obj exporter in any programming language you like.
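
If you go the scripting route, a minimal .obj exporter for a single axis-aligned box could look like the sketch below. The function name and side labels are made up for illustration; match the "o" element names to whatever your loader expects.

```python
def export_box_obj(path, sx, sy, sz):
    """Write an axis-aligned box of size (sx, sy, sz) to a Wavefront .obj
    file, with one 'o' element per side so every face carries a label.
    Illustrative sketch only; side names are arbitrary."""
    # 8 corner vertices, in (x, y, z) order
    verts = [(x, y, z) for x in (0.0, sx) for y in (0.0, sy) for z in (0.0, sz)]
    # each side as a quad of 1-based vertex indices
    sides = {
        "side_x_min": (1, 2, 4, 3),
        "side_x_max": (5, 7, 8, 6),
        "side_y_min": (1, 5, 6, 2),
        "side_y_max": (3, 4, 8, 7),
        "side_z_min": (1, 3, 7, 5),
        "side_z_max": (2, 6, 8, 4),
    }
    with open(path, "w") as f:
        for v in verts:
            f.write("v %.6f %.6f %.6f\n" % v)
        for name, quad in sides.items():
            f.write("o %s\n" % name)          # label for this box side
            f.write("f %d %d %d %d\n" % quad)

# A 40 x 30 x 20 cm box, with lengths given in meters.
export_box_obj("box.obj", 0.4, 0.3, 0.2)
```

For a multi-box structure you would offset the vertex positions (and indices) per box, keeping the measured relative placement exact.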

In theory, if the structure you use is comprised of the same number of boxes as ours (4 in this case), just replacing the .obj file in data/asymmetric_box.obj would probably do. If you are customizing beyond our "template" structure, you need to deal with stuff in src/io/box_model_loader.py and src/dataset/rendering/box_renderer.py to make sure everything is compatible with your model.

I know that this is only a fraction of the information required to reach your goal, but complete documentation is not available at the moment. Apart from the code itself, the best documentation we have on the methodology we follow is our publication. While we cannot guarantee support under all circumstances, if you find yourself struggling with a specific issue, feel free to leave another comment or open another issue. We will try to assist in any case. Good luck!

from structurenet.

ababilinski commented on August 18, 2024

Hi @alexd314, thank you so much for that information 🙂 It definitely puts me on the right track.

Just so I better understand:

  • Does the data have to be arranged in any particular way, or is it enough to have a folder of, say, 1-100 RGB-D depth frames captured from different perspectives?
  • Does the data structure differ between --vcl_path, --intnet_path and --corbs_path, or can each point to any RGB-D depth data as long as the datasets are distinct?

Thank you for your time,


alexd314 commented on August 18, 2024

Hi again,

(1) No structure is needed for the files. A single folder containing all your *.png, *.pgm (or other file formats, see below) depth maps is sufficient. Please use only depth maps; no color files should exist inside the folder.
(2) The data structure is exactly the same across datasets, as described in (1). The code applies per-dataset hyperparameters, giving slightly more weight to some datasets than others. (For a detailed understanding, look here, here, here, here and the context around those links.) Simplifying things: in theory you could even have a single folder with your depth maps and point all 3 dataset paths at it. That would also work.
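
To illustrate the per-dataset weighting idea (the weights here are invented for the example; the actual hyperparameters live in the code linked above), background sampling can be thought of as a weighted draw over the dataset folders:

```python
import random

# Hypothetical sampling weights: favor one dataset slightly over the others.
datasets = ["vcl", "intnet", "corbs"]
weights = [0.5, 0.3, 0.2]

random.seed(0)  # reproducible for the example
picks = random.choices(datasets, weights=weights, k=1000)
# Roughly half of the sampled backgrounds come from "vcl".
```

Pointing all three paths at the same folder simply makes the three entries identical, which is why that degenerate setup still works.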

The code for loading the depth data is exactly here. In principle, any depth image format that OpenCV can read with the IMREAD_ANYDEPTH flag is compatible (i.e. single-channel PNG, PGM, or similar).
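
As a concrete example, here is how a 16-bit binary PGM depth map stored in millimeters can be loaded and brought to meters with a scale of 0.001. The snippet parses PGM by hand just to stay dependency-free; in practice `cv2.imread(path, cv2.IMREAD_ANYDEPTH)` does the same job for PNG, PGM and other formats.

```python
import struct

def load_pgm_depth(path, scale=0.001):
    """Read a binary 16-bit PGM depth map and apply a unit scale
    (e.g. 0.001 to convert millimeter depths to meters).
    Minimal parser for illustration only."""
    with open(path, "rb") as f:
        assert f.readline().strip() == b"P5"
        width, height = map(int, f.readline().split())
        maxval = int(f.readline())
        assert maxval == 65535, "expected 16-bit depth"
        raw = f.read(width * height * 2)
    # PGM stores 16-bit samples big-endian
    values = struct.unpack(">%dH" % (width * height), raw)
    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return [[v * scale for v in row] for row in rows]

# Write a tiny 2x2 depth map (millimeters) and load it back in meters.
with open("tiny.pgm", "wb") as f:
    f.write(b"P5\n2 2\n65535\n")
    f.write(struct.pack(">4H", 1000, 1500, 2000, 2500))

depth_m = load_pgm_depth("tiny.pgm")  # [[1.0, 1.5], [2.0, 2.5]]
```

The `scale` argument plays the same role as the scale parameter of ImageBackgroundSamplerParams mentioned earlier.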

I hope this helps.

