
Comments (20)

seva100 commented on June 8, 2024

@ttsesm, I've trained the network with --crop_size 256x256, and now the black regions are gone:
[image: redwood_bedroom_npbg_crop_256_short1]

I've trained for 39 epochs, and with such a crop size more epochs might be needed to achieve better results. You can also try the 2nd way I suggested (512x512 crop size and less aggressive zoom augmentations). That said, the results already seem quite acceptable to me.

seva100 commented on June 8, 2024

OK, it looks like --init-view does not work as intended for now. Please try the following script instead: https://gist.github.com/seva100/4fe57ab17ebd943fa7614cb0d4d7f982
This script allows you to render either the XYZ point coordinates visualized as RGB or the network outputs. It should work out of the box by executing:
python generate_dataset.py --config downloads/redwood_bedroom.yaml --inputs xyz_p1
and will produce .png renderings of the XYZ point coordinates in the rendered folder of the project root. In this mode the script does not require any trained network, so no weight checkpoints need to be specified in the config.

For example, when executed with the downloads/livingroom.yaml config, the script produces the following image for camera #1:

[image: portfolio_view]

and we can see that the points project to the same places as in the respective ground truth image. This should be a working debugging procedure for the view matrices.
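As a side note, the same check can be sketched in a few lines of NumPy. This is not the gist above: it assumes world2camera poses in a plain computer-vision convention (+Z forward) and a pixel-space intrinsics matrix K, and render_xyz_as_rgb is just a hypothetical helper name (note that NPBG itself expects OpenGL-style matrices, as discussed further down in this thread).

import numpy as np

# xyz: (N, 3) world-space points from the .ply; pose: 4x4 world2camera matrix;
# K: 3x3 intrinsics in pixels (all assumptions, see the note above)
def render_xyz_as_rgb(xyz, pose, K, width=640, height=480):
    # move the points into the camera frame and project with the pinhole model
    cam = (pose[:3, :3] @ xyz.T + pose[:3, 3:4]).T
    front = cam[:, 2] > 0
    uv = (K @ cam[front].T).T
    uv = (uv[:, :2] / uv[:, 2:3]).round().astype(int)

    # color every projected point by its normalized world XYZ (R=X, G=Y, B=Z)
    rgb = (xyz[front] - xyz.min(0)) / (xyz.max(0) - xyz.min(0))

    image = np.zeros((height, width, 3))
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < width) & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    image[uv[ok, 1], uv[ok, 0]] = rgb[ok]
    return image

If the colored points land on the same objects as in the corresponding RGB frame, the view matrix for that camera is consistent with the point cloud and the intrinsics.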

ttsesm commented on June 8, 2024

@seva100 I can confirm that both approaches, i.e. --crop_size 256x256 and switching to less aggressive zoom augmentations, solve the issue with the black spots.

Thanks for the help; hopefully this info will be helpful for others as well.

seva100 commented on June 8, 2024

@ttsesm Great to hear!
By the way, if the format of your target image names is different from the default one, you can avoid changing this line

target_list = [os.path.join(config['target_path'], target_name_func(i)) for i in camera_labels]

and just change target_name_func in the paths config. For example, I used the following paths_example.yaml for the Redwood bedroom:

datasets:
    "redwood_bedroom":
        scene_path: data/redwood_bedroom.yaml
        target_path: data/scenes/redwood_bedroom/images/
        target_name_func: "lambda i: f'{int(i):06}.jpg'"
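For reference, a minimal sketch of how such a lambda ends up building the target paths, assuming NPBG simply evaluates the target_name_func string from the paths config and applies it per camera label (this mirrors the target_list line quoted above; the labels below are made up):

import os

target_path = 'data/scenes/redwood_bedroom/images/'
target_name_func = eval("lambda i: f'{int(i):06}.jpg'")   # the string from the config above

camera_labels = [0, 1, 42]   # hypothetical camera labels
target_list = [os.path.join(target_path, target_name_func(i)) for i in camera_labels]
print(target_list)   # ['data/scenes/redwood_bedroom/images/000000.jpg', ..., '.../000042.jpg']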

seva100 commented on June 8, 2024

We have not tried the datasets available at your link, but it seems like Redwood comes with everything you need: RGB images (you can ignore the depth), .ply reconstructions, and camera poses. It seems the .log trajectory files are the view matrices, already converted to the format required for Neural Point-Based Graphics (sometimes it's necessary to invert some axes or some matrices; in that case, you can try writing to the dataset authors about the exact format of their camera poses). Next, you'll need to make a paths file and a scene config -- there is a tutorial in the readme and some comments in this issue.

If you succeed in running this data, we would love to hear your feedback here - this would be really helpful for others!

ttsesm commented on June 8, 2024

@seva100 does the .ply file need to have color information, or not necessarily? The one provided by Redwood is colorless.

In principle, if I need color, I could recreate it from the color images with Agisoft/COLMAP.

seva100 commented on June 8, 2024

No, the .ply file does not need to have color information. It only needs to contain the XYZ coordinates of the points.
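For instance, one quick way to check which per-vertex fields a .ply actually stores (plyfile is just one option here, and the path is only an example; adjust it to your own file):

from plyfile import PlyData

# list the vertex properties; NPBG only needs x, y and z, so missing
# red/green/blue fields are not a problem
ply = PlyData.read('data/scenes/redwood_bedroom/bedroom.ply')
print([p.name for p in ply['vertex'].properties])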

ttsesm commented on June 8, 2024

Hi @seva100, I've managed to train the network with the Redwood dataset, but the output does not seem correct. I've tried two approaches, since with the default parameters I was getting a CUDA out-of-memory RuntimeError: in the first one I reduced the crop size, and in the second the batch size:

python train.py --config configs/train_example.yaml --pipeline npbg.pipelines.ogl.TexturePipeline --dataset_names redwood_bedroom --crop_size 256x256

python train.py --config configs/train_example.yaml --pipeline npbg.pipelines.ogl.TexturePipeline --dataset_names redwood_bedroom --batch_size 4 --batch_size_val 4

For the first case I got a VAL LOSS of 926.9789898726192 and for the second 872.1955623209042, which from what I understand is high. Could you please confirm whether these values are considered bad or good?

Meanwhile, the viewer output of course does not look that good either:

[image]

[image]

Whereas I should be getting something similar to this:

[image]

Any idea what I might be doing wrong?

seva100 commented on June 8, 2024

Hi @ttsesm, I think most likely you'll need to play with the camera poses, as they are often provided in different ways. The format we expect is the same as the one given by Agisoft; you can find a good reference here (the only exception is that we invert the 2nd and 3rd columns of the R matrix afterwards). In short, [R t | 0 1] should be a world2camera transformation matrix, so if for Redwood it is camera2world, you'll need to invert this matrix. Also, check that your intrinsic matrix is correct (it should be of the form [f 0 cx | 0 f cy | 0 0 1], where f is expressed in pixels, and (cx, cy) should be close to the image center).
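A rough sketch of that conversion, assuming the poses come as camera2world 4x4 [R t | 0 1] matrices (this only mirrors the recipe just described, not code from the repository; the identity matrix is a stand-in so the snippet runs, and note that later in this thread only the sign flip turned out to be needed for Redwood):

import numpy as np

cam2world = np.eye(4)   # stand-in for one pose loaded from the trajectory

world2cam = np.linalg.inv(cam2world)   # skip this step if the poses are already world2camera
world2cam[:3, 1:3] *= -1               # flip the sign of the 2nd and 3rd columns of R
print(world2cam)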

About the high values of the VGG loss: I think this depends on the dataset; we had numbers of 700-800 in cases of good convergence too. Also, I would try decreasing the batch size to 1 for both training and validation (this should not significantly affect the quality) and setting a 512x512 crop size.

ttsesm commented on June 8, 2024

I see, yup, it seems you are right: Redwood provides a camera2world transformation, according to the trajectory description here. I will try to invert the matrices and try again.

My camera intrinsics shouldn't be a problem since they are correctly defined, as you can see below:

525.0000 0.000000 319.5000 0.000000
0.000000 525.0000 239.5000 0.000000
0.000000 0.000000 1.000000 0.000000
0.000000 0.000000 0.000000 1.000000
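For reference, a quick sanity check of these numbers against the [f 0 cx | 0 f cy | 0 0 1] form mentioned above (assuming 640x480 images, which is also what the viewer reports below):

import numpy as np

# 3x3 part of the intrinsics posted above
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0, 0.0, 1.0]])
w, h = 640, 480   # assumed image resolution
assert K[0, 0] == K[1, 1] and K[0, 0] > 0                       # single focal length f, in pixels
assert abs(K[0, 2] - w / 2) < 1 and abs(K[1, 2] - h / 2) < 1    # (cx, cy) near the image center
print('intrinsics look reasonable')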

seva100 commented on June 8, 2024

Yes, please also try to change the sign of the 2nd and 3rd columns. In the end, the matrices should correspond to the OpenGL coordinate system, which has the +X axis headed right, -Y up, and -Z forward.

Perhaps the following trick can help to validate that your matrices are correct. Once you have applied some transformation to the view matrices and trained something (at least for 1 epoch), you can open viewer.py and pass the index of any view as --init-view 1 on the command line (you can use any other view index instead of "1"). Then, when the window appears, press X. This will switch the viewer from showing the network output to displaying the XYZ coordinates of the points (visualized as RGB: R=X, G=Y, B=Z). If your view matrices are correct, you will see the points where they should be when looking from camera #1 (or from a different camera if you selected another index in --init-view).

ttsesm commented on June 8, 2024

I've tried to use the --init-view 1 option as you suggested but I am getting the following error:

$ python viewer.py --config downloads/redwood_bedroom.yaml --checkpoint data/logs/08-04_14-35-04___batch_size^4__dataset_names^redwood_bedroom/checkpoints/PointTexture_stage_0_epoch_2_redwood_bedroom.pth --init-view 1

loading pointcloud...
=== 3D model ===
VERTICES:  4893993
EXTENT:  [-2.4639  -0.99316 -4.4403 ] [4.3096 5.4167 3.0381]
================
new viewport size  (640, 480)
Traceback (most recent call last):
  File "viewer.py", line 434, in <module>
    my_app = MyApp(args)
  File "viewer.py", line 157, in __init__
    self.trackball = Trackball(init_view, self.viewport_size, 1, rotation_mode=args.rmode)
UnboundLocalError: local variable 'init_view' referenced before assignment

ttsesm commented on June 8, 2024

Hmm, inverting the pose matrices and then changing the sign of the 2nd and 3rd columns did not work either. I must be doing something wrong somewhere else.

Is there a way to do some debugging on the fly, or at least to check that what I am using for the training is correct?

ttsesm commented on June 8, 2024

@seva100 with the generate_dataset.py script I was able to identify the issue and fix the poses (I just needed to change the sign of the 2nd and 3rd columns; there was no need to invert the pose matrices). So I am also getting images similar to the one you posted, as you can see below:

[image: redwood_bedroom_000005]

and my loss dropped as well, to 491.04269130734633, which I guess is also a good sign.

I tried to visualize the output with the viewer:

python viewer.py --config downloads/redwood_bedroom.yaml --checkpoint data/logs/08-10_18-48-08___batch_size^4__dataset_names^redwood_bedroom/checkpoints/PointTexture_stage_0_epoch_20_redwood_bedroom.pth --viewport 2000,1328

and I am getting the following output:

[image: npbg_output5]

What I've noticed is that there are a lot of black areas in the view; is this expected, considering that my RGB images capture the whole scene quite well?

seva100 commented on June 8, 2024

@ttsesm great to hear that the problem with the view matrices was resolved; there is always some hassle with poses coming from various external sources.
Can you please show what your downloads/bedroom.yaml file looks like? I suppose that the path to net_ckpt was not provided in this file, but I can't be sure.

ttsesm commented on June 8, 2024

Hi Artem, indeed I had net_ckpt and texture_ckpt commented out, since at that point I had copied them from livingroom.yaml and they were pointing to wrong paths. I've uncommented them, specifying where to find PointTexture_stage_0_epoch_39_redwood_bedroom.pth and UNet_stage_0_epoch_39_net.pth respectively, but the output is more or less the same as above.

seva100 commented on June 8, 2024

@ttsesm it's hard to say at this point what the reason could be. Can you please post here:

  • the train and viewer commands you used,
  • the content of all the configs (used for training and for the viewer),
  • a link to the .ply file and the images you used, if there is one provided by the dataset authors?

I also have a hypothesis that some regions are missing in the .ply file you use. This might actually cause these black regions.

ttsesm commented on June 8, 2024

For training I used the following command:
python train.py --config configs/train_example.yaml --pipeline npbg.pipelines.ogl.TexturePipeline --dataset_names redwood_bedroom --crop_size 512x512 --batch_size 4 --batch_size_val 4

while for viewing:
python viewer.py --config downloads/redwood_bedroom.yaml --checkpoint data/logs/08-10_18-48-08___batch_size^4__dataset_names^redwood_bedroom/checkpoints/PointTexture_stage_0_epoch_39_redwood_bedroom.pth --viewport 2000,1328

If you want to train the scene yourself, you can download all the configs and data that I've used from the following link (I have set the folders up in the same structure as you are using). You will need, though, to comment out the if/elif statement at the line

if 'ply_raw' in data.metadata:
and replace it with just model['normals'] = data.vertex_normals, since apparently the provided .ply does not contain the nx, ny and nz variables, and apply a small change to the line
target_list = [os.path.join(config['target_path'], target_name_func(i)) for i in camera_labels]
replacing it with target_list = [os.path.join(config['target_path'], target_name_func(str(i).zfill(6))) for i in camera_labels] since the image names come with zero-padded prefixes.

seva100 commented on June 8, 2024

Thank you for providing the data. It seems that you are doing everything correctly: the point cloud stored in bedroom.ply looks quite OK, and the images indeed cover everything, so these black regions should not be present in the rendered output.

I tried training just like you did and can confirm the presence of the black regions. What I noticed is that the ground truth has some black parts -- see the following random screenshot from TensorBoard (top: rendered by NPBG; bottom: ground truth):
[image]

Most likely, this happens because the ground truth image is 640x480, the crop size is 512x512, and we also use random zoom augmentations with factors in the range [0.5, 2.0]. So in the extreme case of a 0.5x zoom-out, the ground truth image gets resized to 320x240, which is smaller than the 512x512 crop, and this results in the black padding. Since we have always trained NPBG on larger images (1080p and above), this explains why we had not encountered this before.
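For reference, the arithmetic is easy to check (sizes taken from the explanation above; the two zoom factors are the extremes of the range):

# 640x480 ground truth, 512x512 crop, random zoom factors in [0.5, 2.0]
gt_w, gt_h, crop = 640, 480, 512
for zoom in (0.5, 2.0):
    zw, zh = int(gt_w * zoom), int(gt_h * zoom)
    status = 'fits the crop' if min(zw, zh) >= crop else 'needs black padding'
    print(f'zoom {zoom}: ground truth resized to {zw}x{zh} -> {status}')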

Can you please try to train it with --crop_size 256x256? In this case, the resized ground truth image fits into the crop, and there should be no black padding (a few more epochs might be needed though).
Another way is to retain the 512x512 crop size but change line 24, random_zoom: [0.5, 2.0], to e.g. random_zoom: [1.0, 2.0], in the train_example.yaml config.
As soon as I have a chance, I will try to do the same and report the result.

ttsesm commented on June 8, 2024

Hi Artem, thanks a lot for the feedback. Indeed, --crop_size 256x256 seems to work nicely. I will also try your other suggestion and report back here.
