
ldif's Issues

Visualization with qview: scaling issue

Hi, we're trying to visualize ellipsoids overlaid on a mesh (as in Fig. 5 of the paper), but the scale of the mesh that we put into qview is not consistent with the scale of the ellipsoids. How can we scale the mesh so the two match?

For example, when visualizing ldif/ldif2mesh/test-ldif.txt with ldif/ldif2mesh/test-ldif-output.ply, we notice their scales are inconsistent. In fact, the ellipsoids are much smaller than the mesh (only appearing as a dot on the left):
[screenshot]

When visualizing ldif/ldif2mesh/test-ldif.txt with ldif/ldif2mesh/test-ldif.ply, however, they are correctly displayed at the same scale:
[screenshot]

We've tried normalizing test-ldif-output.ply to a unit sphere, but the scales are still inconsistent:
[screenshot]

I would imagine ldif/ldif2mesh/test-ldif.ply was scaled/translated from ldif/ldif2mesh/test-ldif-output.ply. Where can I find the script that does this?
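For reference, the preprocessing pipeline normalizes meshes with msh2msh -scale_by_pca -translate_by_centroid -scale 0.25 and writes the transform to orig_to_gaps.txt (see the process_mesh_local.sh command quoted in the bus-error issue further down this page), so one route is to apply that stored matrix to whichever mesh is still in its original frame. A minimal sketch, assuming orig_to_gaps.txt holds a plain-text 4x4 row-major homogeneous transform and that trimesh is available; the filenames are placeholders:

# Sketch: move a mesh into the GAPS/SIF frame using the matrix that
# process_mesh_local.sh writes via `-debug_matrix orig_to_gaps.txt`.
# Assumption: the file is a plain-text 4x4 row-major homogeneous transform.
import numpy as np
import trimesh

mesh = trimesh.load('mesh_in_original_frame.ply')            # placeholder filename
orig_to_gaps = np.loadtxt('orig_to_gaps.txt').reshape(4, 4)  # assumed 4x4 matrix
mesh.apply_transform(orig_to_gaps)                           # vertices now in the SIF frame
mesh.export('mesh_in_gaps_frame.ply')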

test or val issue

I've already trained the model, but when I try to test it, it fails every time.

CalledProcessError Traceback (most recent call last)
~/ldif/ldif/inference/predict.py in _grid_eval_cuda(self, sif_vector, resolution, extent)
845 try:
--> 846 cmd_result = sp.check_output(cmd, shell=True)
847 log.info(cmd_result.decode('utf-8').replace('\n', ''))

~/anaconda3/lib/python3.7/subprocess.py in check_output(timeout, *popenargs, **kwargs)
410 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
--> 411 **kwargs).stdout
412

~/anaconda3/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
511 raise CalledProcessError(retcode, process.args,
--> 512 output=stdout, stderr=stderr)
513 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command 'CUDA_VISIBLE_DEVICES=1 /home/gemengyuan/ldif/ldif/ldif2mesh/ldif2mesh /tmp/tmp0e9rowkf/ldif.txt /home/gemengyuan/ldif/ldif/ldif2mesh/extracted.occnet /tmp/tmp0e9rowkf/grid.grd -resolution 256' returned non-zero exit status 35.

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
1 embedding = encoder.run_example(e)
----> 2 mesh = decoder.extract_mesh(embedding, resolution=256)
3 gaps_util.mshview(mesh)

~/ldif/ldif/inference/predict.py in extract_mesh(self, sif_vectors, resolution, extent, return_success, world2local)
959 extent,
960 extract_parts=False,
--> 961 world2local=world2local)
962 grid_out_time = time.time()
963 log.verbose(f'Grid eval time: {grid_out_time - extract_start_time}')

~/ldif/ldif/inference/predict.py in _grid_eval(self, sif_vector, resolution, extent, extract_parts, world2local)
886 log.verbose('Evaluating SDF grid for mesh.')
887 if self.use_inference_kernel and not extract_parts:
--> 888 return self._grid_eval_cuda(sif_vector, resolution, extent)
889 if extract_parts or world2local:
890 log.warning('Part extraction and world2local are not supported with the'

~/ldif/ldif/inference/predict.py in _grid_eval_cuda(self, sif_vector, resolution, extent)
867 'possible.')
868 else:
--> 869 raise ValueError(f'Unrecognized error code {e.returncode} occurred'
870 f' during inference kernel evaluation: {e.output}')
871

ValueError: Unrecognized error code 35 occurred during inference kernel evaluation: b'GPUCheckOk Failure: CUDA driver version is insufficient for CUDA runtime version ldif2mesh.cu 985\n'


What might be causing these errors? SOS!!!
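The message at the bottom of the traceback ("CUDA driver version is insufficient for CUDA runtime version") points at the prebuilt ldif2mesh CUDA binary being newer than the NVIDIA driver installed for the selected GPU, so the real fix is updating the driver or rebuilding ldif2mesh against the local CUDA toolkit. As a stopgap, the traceback shows predict.py branching on use_inference_kernel before calling the kernel, so something like the following should fall back to the slower non-kernel path (whether that attribute is meant to be toggled externally is an assumption):

# Hedged stopgap, not a fix for the driver/runtime mismatch itself.
# `decoder` and `embedding` are the same objects as in the snippet above;
# the attribute name is taken from the traceback.
decoder.use_inference_kernel = False
mesh = decoder.extract_mesh(embedding, resolution=256)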

Segmentation fault when training the net

python train.py --batch_size 24 --experiment_name shapenet-ldif \
  --model_directory $models --model_type "ldif" \
  --dataset_directory $dataset
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
INFO: Making dataset...
INFO: Optimized dataset detected at ./shapenet/optimized
INFO: Mapping...
INFO: is_invalid vs lower_coords: [24, 32, 1] vs [24, 32, 3]
INFO: Post-where lower_coords: [24, 32, 3]
INFO: is_invalid vs sdf coords: [24, 32, 1] vs [24, 32, 1]
INFO: In-out image summaries have been removed.
INFO: The 0-th GPU has 22390 MB free.
INFO: TensorFlow can use up to 93.1397945511389% of the total GPU memory.
INFO: Initializing variables...
INFO: No previous checkpoint detected, training from scratch.
Fatal Python error: Segmentation fault

Thread 0x00007fd78cff9700 (most recent call first):
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 302 in wait
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/queue.py", line 170 in get
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 932 in _bootstrap_inner
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007fd9b5258340 (most recent call first):
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1441 in _call_tf_sessionrun
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1349 in _run_fn
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1365 in _do_call
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1358 in _do_run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1179 in _run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 955 in run
File "train.py", line 263 in main
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/absl/app.py", line 312 in run
File "train.py", line 283 in
./reproduce_shapenet_autoencoder.sh: line 50: 1295263 Segmentation fault (core dumped) python train.py --batch_size 24 --experiment_name shapenet-ldif --model_directory $models --model_type "ldif" --dataset_directory $dataset

I generated the dataset from the raw ShapeNetCoreV1/03001627 models by converting the .obj files to .ply and then generating watertight .ply files using the GAPS tools. After that I used the command in the reproduce_shapenet_autoencoder.sh script to make the dataset, and everything completed successfully. But when I tried to train the net on that dataset, it failed with the log shown above.

BTW, my environment: Ubuntu 20.04 with an RTX 3090, CUDA version 11.1, and I run the code on TensorFlow 1.15.
Could you give me some advice on this issue?
Thank you!
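One hedged observation, not confirmed in this thread: stock TensorFlow 1.15 wheels were built against CUDA 10.0, which predates Ampere GPUs such as the RTX 3090, and a segfault right after variable initialization is consistent with that mismatch; NVIDIA's nvidia-tensorflow 1.15 builds for CUDA 11 are one commonly suggested route. A quick diagnostic sketch using standard TF 1.x APIs:

# Diagnostic sketch: report what the installed TensorFlow build expects.
import tensorflow as tf

print('TF version:', tf.__version__)
print('Built with CUDA support:', tf.test.is_built_with_cuda())
print('Visible GPU device:', tf.test.gpu_device_name() or 'none')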

/process_mesh_local.sh: line 47: 13856 Bus error (core dumped) ${gaps}/msh2msh $mesh_in $mesh -scale_by_pca -translate_by_centroid -scale 0.25 -debug_matrix ${outdir}/orig_to_gaps.txt

I have run ./build_gaps.sh successfully, and /gaps_is_installed.sh prints out "Ready to go!".
But I still get this bus error
"/process_mesh_local.sh: line 47: 13856 Bus error (core dumped) ${gaps}/msh2msh $mesh_in $mesh -scale_by_pca -translate_by_centroid -scale 0.25 -debug_matrix ${outdir}/orig_to_gaps.txt"
when running

python meshes2dataset.py --mesh_directory [path/to/dataset_root] \
  --dataset_directory [path/to/nonexistent_output_directory]

Does anyone know why?
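Since the crash happens inside the GAPS msh2msh binary, one way to narrow it down is to rerun that exact line-47 command on a single mesh and look at its output directly, outside of meshes2dataset.py. A sketch with placeholder paths; the gaps bin directory is an assumption about where build_gaps.sh leaves the compiled binaries:

# Sketch: reproduce the failing msh2msh call on one mesh to isolate the crash.
import subprocess

gaps = 'ldif/gaps/bin/x86_64'                 # assumption: GAPS binary directory
mesh_in = '/path/to/one/mesh.ply'             # placeholder: one mesh from your dataset
cmd = (f'{gaps}/msh2msh {mesh_in} /tmp/normalized.ply -scale_by_pca '
       f'-translate_by_centroid -scale 0.25 -debug_matrix /tmp/orig_to_gaps.txt')
result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
print(result.returncode, result.stdout.decode())

A bus error on a single mesh often points at the input file (a corrupt or extremely large mesh) or at the machine running out of memory or disk, rather than at the Python pipeline itself.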

The result of unit_test.sh

Hi,

What is the result of unit_test.sh supposed to be? I only got a completely black OpenGL window. Is that normal?

Fix issues while running ./build_gaps.sh

Platform: Ubuntu 18.04.3 LTS

./build_gaps.sh
fatal error: GL/osmesa.h: No such file or directory

Fix for this error

apt-get install libosmesa6-dev

Second error while running

./build_gaps.sh
In file included from scn2cam.cpp:13:0:
/usr/include/GL/osmesa.h:124:1: error: ‘GLAPI’ does not name a type; did you mean ‘GLEWAPI’?
 GLAPI OSMesaContext GLAPIENTRY

Fix for this error
Near the end of glew.h is the line:
#undef GLAPI
Delete it
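If you prefer to script that edit, a small sketch; the glew.h path varies by setup, so it is left as a placeholder:

# Sketch: remove the `#undef GLAPI` line from glew.h, as described above.
# The path is a placeholder; point it at the glew.h that scn2cam actually includes.
from pathlib import Path

glew = Path('/path/to/glew.h')
glew.write_text(''.join(
    line for line in glew.read_text().splitlines(keepends=True)
    if line.strip() != '#undef GLAPI'))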

Ready to go!

Error when unit testing

I tried to run unit_test.sh and got the following output:

INFO: Creating directories...
2it [00:00, 6631.31it/s]
INFO: Making dataset...
100%|█████████████████████████████████████████████| 2/2 [00:00<00:00, 56.32it/s]
Program was not compiled with mesa.  Recompile with make mesa.

/home/origin/codes/ldif-master/ldif/scripts/process_mesh_local.sh: line 59: 47657 Aborted (core dumped) ${gaps}/scn2img $mesh $dodeca_path $depth_dir -capture_depth_images $mesa -width 224 -height 224
Program was not compiled with mesa.  Recompile with make mesa.

/home/origin/codes/ldif-master/ldif/scripts/process_mesh_local.sh: line 59: 47659 Aborted (core dumped) ${gaps}/scn2img $mesh $dodeca_path $depth_dir -capture_depth_images $mesa -width 224 -height 224
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 431, in _process_worker
    r = call_item()
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 285, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/parallel.py", line 253, in __call__
    for func, args, kwargs in self.items]
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/parallel.py", line 253, in <listcomp>
    for func, args, kwargs in self.items]
  File "meshes2dataset.py", line 104, in process_one
    f'{dataset_directory}/{split}/{synset}/{name}/', skip_existing, log_level)
  File "/home/origin/codes/ldif-master/ldif/scripts/make_example.py", line 63, in mesh_to_example
    shell=True)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '/home/origin/codes/ldif-master/ldif/scripts/process_mesh_local.sh /tmp/tmp.CfwvYtBPnT/input_meshes/train/animal/blub.ply /tmp/tmp.CfwvYtBPnT/output_dataset/train/animal/blub/ /home/origin/codes/ldif-master/ldif' returned non-zero exit status 134.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "meshes2dataset.py", line 203, in <module>
    app.run(main)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "meshes2dataset.py", line 151, in main
    FLAGS.skip_existing, FLAGS.log_level) for f in tqdm.tqdm(files))
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/parallel.py", line 1042, in __call__
    self.retrieve()
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/parallel.py", line 921, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/home/origin/anaconda3/envs/ldif/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
subprocess.CalledProcessError: Command '/home/origin/codes/ldif-master/ldif/scripts/process_mesh_local.sh /tmp/tmp.CfwvYtBPnT/input_meshes/train/animal/blub.ply /tmp/tmp.CfwvYtBPnT/output_dataset/train/animal/blub/ /home/origin/codes/ldif-master/ldif' returned non-zero exit status 134.
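The "Program was not compiled with mesa. Recompile with make mesa." lines come from the GAPS scn2img binary itself, so the rebuild it asks for has to happen in the GAPS checkout rather than in this repo's Python code. A sketch, assuming build_gaps.sh checked GAPS out under ldif/gaps; adjust the path if yours lives elsewhere:

# Sketch: rebuild GAPS with OSMesa support, as the scn2img error message asks.
# The cwd value is an assumption about where build_gaps.sh put the GAPS sources.
import subprocess

subprocess.run('make mesa', shell=True, cwd='ldif/gaps', check=True)

After the rebuild, rerunning meshes2dataset.py (or unit_test.sh) should get past the depth-image capture step, assuming libosmesa6-dev is installed as noted in the build_gaps.sh issue above.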

Error in data preprocessing

Hi. Thanks for sharing the code of this nice work!

I am trying to go through the whole training process following the instructions. I have already obtained the watertight meshes using the TSDF fusion method provided by occnet, and saved all meshes as .ply files.

However, when I run meshes2dataset.py to process those meshes, I get an error like this:

subprocess.CalledProcessError: Command '/home/disk/diske/ldif/ldif/scripts/process_mesh_local.sh dataset/train/02958343/1005ca47e516495512da0dbf3c68e847.ply processed_data/train/02958343/1005ca47e516495512da0dbf3c68e847/ /home/disk/diske/ldif/ldif' returned non-zero exit status 255.

It seems that something went wrong in the .sh file used to generate SDF values and depth images. I am quite confused because there is no other information telling me what caused this error. I hope you can help me.

Really appreciated.
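Exit status 255 by itself does not say much, because the shell script's own error output is easy to lose inside the joblib workers; rerunning the exact command from the exception by hand usually surfaces the real message. A sketch using the command string copied from the error above:

# Sketch: rerun the failing preprocessing command and surface its output.
import subprocess

cmd = ('/home/disk/diske/ldif/ldif/scripts/process_mesh_local.sh '
       'dataset/train/02958343/1005ca47e516495512da0dbf3c68e847.ply '
       'processed_data/train/02958343/1005ca47e516495512da0dbf3c68e847/ '
       '/home/disk/diske/ldif/ldif')
result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
print(result.returncode)
print(result.stdout.decode())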

Crash on train

I have set up all the dependencies and my dataset generation script ran without any errors. But when I run train.py, I get a crash. Here's the complete log I am getting:
python train.py --dataset_directory data/Dataset --experiment_name ldif_default --model_type ldif --model_directory ./trained_models

INFO: Making dataset...
INFO: Mapping...
INFO: In-out image summaries have been removed.
INFO: The 0-th GPU has 11158 MB free.
INFO: TensorFlow can use up to 86.23409213120631% of the total GPU memory.
INFO: Initializing variables...
INFO: No previous checkpoint detected, training from scratch.
ERROR:tensorflow:Session failed to close after 30 seconds. Continuing after this point may leave your program in an undefined state.
E0802 08:03:47.381119 139974498080576 session.py:1637] Session failed to close after 30 seconds. Continuing after this point may leave your program in an undefined state.
Traceback (most recent call last):
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
(0) Out of range: End of sequence
[[{{node IteratorGetNext}}]]
(1) Out of range: End of sequence
[[{{node IteratorGetNext}}]]
[[IteratorGetNext/_3]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 275, in
app.run(main)
File "/home/tintash-rnd/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/tintash-rnd/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "train.py", line 256, in main
[model_config.train_op, summary_op, model_config.loss])
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
(0) Out of range: End of sequence
[[node IteratorGetNext (defined at /home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Out of range: End of sequence
[[node IteratorGetNext (defined at /home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[IteratorGetNext/_3]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'IteratorGetNext':
File "train.py", line 275, in
app.run(main)
File "/home/tintash-rnd/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/tintash-rnd/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "train.py", line 187, in main
split=FLAGS.split)
File "/mnt/DeepLearningResearch/ldif/ldif/datasets/local_inputs.py", line 137, in make_dataset
dataset_items = tf.compat.v1.data.make_one_shot_iterator(dataset).get_next()
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
name=name)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/home/tintash-rnd/miniconda3/envs/ldif/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in init
self._traceback = tf_stack.extract_stack()`
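One hedged guess: an OutOfRangeError ("End of sequence") on the very first training step typically means the input pipeline found no usable examples under --dataset_directory, rather than anything wrong in the model. A quick sanity check, using the {dataset_directory}/{split}/{synset}/{name}/ layout that meshes2dataset.py writes (visible in the traceback of the unit-test issue above):

# Sanity-check sketch: count what train.py will actually see.
from pathlib import Path

dataset_directory = Path('data/Dataset')   # same value passed to train.py above
examples = [p for p in dataset_directory.glob('train/*/*') if p.is_dir()]
print(f'{len(examples)} training example directories found')

optimized = dataset_directory / 'optimized'  # written by meshes2dataset.py --optimize
print('optimized dataset present' if optimized.exists()
      else 'no optimized dataset; train.py falls back to the raw example files')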

Is it possible to share your preprocessed data and pretrained model of LDIF?

Hi Kyle:

Thanks for your great work and code on Local Deep Implicit Functions for 3D Shape!

I am trying to reproduce your results. However, I noticed that the pretrained model checkpoint is on your TODO list, and the preprocessed data has not been released either.

I'd like to ask whether it is possible to share your preprocessed data. Also, when do you plan to release the pretrained model checkpoint?

Best wishes
Cheng

Abnormal chamfer metric reproduced on ShapeNet

Hi Kyle:

I'm trying to reproduce the results with the original settings (model type: ldif, batch size: 24) on ShapeNet. Watertight mesh generation was done using code from OccNet before preprocessing with your make_dataset.py code. The data is randomly split into train, val, and test sets as in the paper. The sizes of the meshes are unchanged; they fit within a cube of side length 2.

However, after 200k iterations of training (less than the 1M in the paper), I got the following results on the val split:
mean: IoU=81.78, Chamfer=0.03, F-Score=90.25
Compared to the results in your paper (mean: IoU=90.00, Chamfer=0.4, F-Score=92.20) and other works, the IoU and F-Score metrics seem alright considering the shorter training. However, the Chamfer metric is abnormally low.

I can't find any mistake in my data preprocessing. It would be kind if you could give some hints about the process. It would be even better if you could publish the pretrained model, the watertight mesh generation code, and the preprocessed data.

Best wishes
Cheng
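A Chamfer distance scales with the linear size of the meshes (and with the square of it if squared distances are averaged), and papers differ in point counts and reporting multipliers, so a 0.03-vs-0.4 gap can easily be a units or normalization mismatch rather than a modeling problem. For a sanity check, a generic symmetric Chamfer sketch, not claimed to match the repo's evaluation code:

# Generic symmetric Chamfer-L2 sketch for sanity-checking units.
# Point count, squared-vs-unsquared distances, and any reporting multiplier
# all change the number you get.
import numpy as np
import trimesh
from scipy.spatial import cKDTree

def chamfer(mesh_a_path, mesh_b_path, n_points=100000):
    a = trimesh.load(mesh_a_path).sample(n_points)
    b = trimesh.load(mesh_b_path).sample(n_points)
    d_ab, _ = cKDTree(b).query(a)   # nearest-neighbor distances a -> b
    d_ba, _ = cKDTree(a).query(b)   # nearest-neighbor distances b -> a
    return np.mean(d_ab**2) + np.mean(d_ba**2)

# Example: compare a predicted mesh against its ground truth.
# print(chamfer('pred.ply', 'gt.ply'))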

IO Performance Issues

Previously (See #3) a user reported poor IO performance that was bottlenecking training. In response to this, a recent commit 801f5b1 added new flags to meshes2dataset.py "--optimize" and "--optimize_only". These flags generate a sharded and compressed tfrecords dataset for reduced IO overhead. The files are written to a subdirectory inside the dataset_directory path. The train.py script looks for that directory, and if available uses it for training rather than the existing files (which remain because they are useful for interactive visualization and the evaluation scripts).

Commit f78dbc4 enables this behavior by default. If you are an existing user experiencing less than 100% GPU utilization, please Ctrl+C training, git pull, rerun the meshes2dataset.py script with the flags --optimize and --optimize_only (the latter flag skips the first part of dataset creation, which was already run), and rerun the training command (no change to the training command is required; it will resume using the new tfrecords data). Unfortunately this new meshes2dataset.py step can take several hours on ShapeNet, and it also consumes ~3 MB of extra disk space per dataset element (totaling 129 GB extra on ShapeNet). However, in the tested cases it has resulted in 100% GPU utilization. With this change, I experience ~3.5 steps/sec with a batch size of 24 on a V100, and ~2 steps/sec with a batch size of 24 on a P100.
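For anyone who wants the sequence spelled out, a sketch with placeholder paths; substitute whatever you originally passed to meshes2dataset.py and train.py:

# Sketch of the re-optimization step described above; flag names are taken
# from this post, and both directory paths are placeholders.
import subprocess

subprocess.run(
    'python meshes2dataset.py '
    '--mesh_directory /path/to/dataset_root '
    '--dataset_directory /path/to/existing_output_directory '
    '--optimize --optimize_only',
    shell=True, check=True)
# Afterwards, rerun the original train.py command unchanged; it picks up the
# tfrecords written inside the dataset directory automatically.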

The size of the shards and their contents could be further optimized, but without a failing example I'm not sure what the optimal settings are. If you experience less than 100% GPU utilization after this change, please comment below and I will do my best to address your issue. Similarly, if you can confirm 100% utilization on a networked HDD, that would be highly appreciated, since I can't easily test on that setup.

One other minor note is that a byproduct of this change is that the 10K points per sample are no longer randomly chosen from 100K each time a mesh is seen; instead the same 10K points are used each time. Because those 10K points are never seen by the network directly, but rather used to generate local pointclouds based on the SIF elements, I anticipate no effect from this change, unless the dataset is extremely small. However I will verify this quantitatively before closing the issue.
