
im-net-pytorch's Issues

IM-GAN for pytorch version

Hello, I'm interested in your latent GAN, but I don't see IM-GAN in this PyTorch version. Is there a corresponding PyTorch implementation of IM-GAN?

How to get the ground truth models?

Hi Zhiqin,

Thanks for the great work.
Can you tell me how to get the ground truth 256^3 voxel models and the point cloud models?
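In case it helps to be concrete, here is what I would naively try; please correct me if your pipeline is different (surface point clouds via trimesh sampling, and voxels via the binvox command quoted in another issue, here with -d 256):

import subprocess
import trimesh

# Hypothetical mesh path from ShapeNet; this is my guess, not the authors' actual pipeline.
mesh = trimesh.load("model.obj", force="mesh")
points = mesh.sample(4096)                     # (4096, 3) surface point cloud

# Voxelize with binvox (writes model.binvox next to the .obj).
subprocess.run(["binvox", "-bb", "-0.5", "-0.5", "-0.5", "0.5", "0.5", "0.5",
                "-d", "256", "-e", "model.obj"], check=True)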

Best regards,
czsc

training loss keeps increasing for SVR task

Dear authors,
Thanks for sharing the code of your wonderful work. I followed the instructions in README.txt to run the SVR experiments, but found that the training loss keeps increasing and then saturates at a large value. Please check the log below. Could you please help me figure out the problem? Thanks.


2021-07-22 09:53:21,611 - Model - INFO - PARAMETER ...
2021-07-22 09:53:21,611 - Model - INFO - Namespace(ae=False, beta1=0.5, checkpoint_dir='checkpoint_v0', data_dir='./data/all_vox256_img/', dataset='all_vox256_img', end=16, epoch=1000, getz=False, iteration=0, learning_rate=5e-05, sample_dir='samples/all_vox256_img1_v0', sample_vox_size=64, start=0, svr=True, train=True)
2021-07-22 09:53:21,611 - Model - INFO -

----------net summary----------
2021-07-22 09:53:21,611 - Model - INFO - training samples 35019
2021-07-22 09:53:21,611 - Model - INFO - -------------------------------

2021-07-22 09:54:16,795 - Model - INFO - Epoch: [ 0/1000] time: 55.0997, loss: 0.01976688
2021-07-22 09:55:14,028 - Model - INFO - Epoch: [ 1/1000] time: 112.3317, loss: 0.01464148
2021-07-22 09:56:11,519 - Model - INFO - Epoch: [ 2/1000] time: 169.8244, loss: 0.01332228
2021-07-22 09:57:08,826 - Model - INFO - Epoch: [ 3/1000] time: 227.1289, loss: 0.01255551
2021-07-22 09:58:06,521 - Model - INFO - Epoch: [ 4/1000] time: 284.8244, loss: 0.01195736
2021-07-22 09:59:03,822 - Model - INFO - Epoch: [ 5/1000] time: 342.1306, loss: 0.01152219
2021-07-22 10:00:01,491 - Model - INFO - Epoch: [ 6/1000] time: 399.7950, loss: 0.01113618
2021-07-22 10:00:58,809 - Model - INFO - Epoch: [ 7/1000] time: 457.1126, loss: 0.01082771
2021-07-22 10:01:56,299 - Model - INFO - Epoch: [ 8/1000] time: 514.6026, loss: 0.01057385
2021-07-22 10:02:53,894 - Model - INFO - Epoch: [ 9/1000] time: 572.1980, loss: 0.01034075
2021-07-22 10:03:50,362 - Model - INFO - Epoch: [10/1000] time: 628.6637, loss: 0.04968767
2021-07-22 10:04:51,521 - Model - INFO - Epoch: [11/1000] time: 689.7292, loss: 0.05001253
2021-07-22 10:06:41,706 - Model - INFO - Epoch: [12/1000] time: 799.8922, loss: 0.05001420
2021-07-22 10:08:32,360 - Model - INFO - Epoch: [13/1000] time: 910.5690, loss: 0.05001350
2021-07-22 10:10:23,517 - Model - INFO - Epoch: [14/1000] time: 1021.7215, loss: 0.05001152

Visualizing voxel grid

Hello @czq142857,
Thanks for sharing the code.
I would like to visualize the input and output voxel grids that are used to train the IM-NET autoencoder, preferably as renderings of all voxel grids. Could you please tell me what Python package you used to do this?
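Something like the following works for me on small grids (a sketch of my own, not assumed to be what you used); for full 256^3 grids it is far too slow, hence the question:

import numpy as np
import matplotlib.pyplot as plt

vox = np.load("voxels_32.npy").astype(bool)   # hypothetical small occupancy grid
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.voxels(vox, facecolors="tab:blue", edgecolors="k", linewidth=0.1)
plt.savefig("voxels.png", dpi=200)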

Thanks

About testing

Hi,
I have followed your advice and used the rendered views from 3D-R2N2.
I was wondering whether you used the ShapeNet rendered images (http://cvgl.stanford.edu/data2/ShapeNetRendering.tgz) directly; my results are not so good.

I can't understand some of the code

hi, zhiqin

I'm having some trouble reading your code; it has bothered me for days.

  • I read it for a long time and still can't understand what this function is doing.

[image: screenshot of the function in question]

  • Or, where should I look for information to understand the above code?

I'm new to the field.
I sincerely hope to get your answer here; I will be grateful!

Once again, a heartfelt thank you

Orientation and reconstruction quality

Hi,
Thank you for the good work and for providing the data and trained models; it helps a lot! We tried to test the reconstruction on our dataset with the provided pre-trained model and got poor results.
[image: our voxelization / ground truth mesh / reconstruction]
The leftmost part is our voxelization of the ground truth mesh (shown in the center); the rightmost part is the reconstruction.
We have not trained or fine-tuned on our dataset. I was wondering what the potential issue could be. Could it be the orientation, or something else?

A concern about the loss function.

Thanks for the codes!

But the loss function used here is confusing to me.
I notice that a (leaky) clamp operation is applied to the output of the generator:
l7 = torch.max(torch.min(l7, l7*0.01+0.99), l7*0.01)
Then, MSE Loss is used.

Here is the problem. Take a positive sample whose target is 1, one prediction for it in the range [0, 1), and another that is negative. Naturally, we expect the loss of the former to be less than that of the latter. But with the loss function used here, that is not guaranteed: there will be a local minimum in the negative region. Do you think this makes it difficult to train?
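To make the concern concrete, here is a small sketch with hypothetical values, using the clamp quoted above; once the raw output leaves (0, 1), the gradient through the MSE is scaled by 0.01:

import torch

def leaky_clamp(x):
    # The clamp quoted above: identity on (0, 1), slope 0.01 outside that range.
    return torch.max(torch.min(x, x * 0.01 + 0.99), x * 0.01)

target = torch.tensor([1.0])                    # a positive sample
for raw in [0.5, -5.0]:                         # hypothetical raw generator outputs
    p = torch.tensor([raw], requires_grad=True)
    loss = torch.nn.functional.mse_loss(leaky_clamp(p), target)
    loss.backward()
    print(f"raw={raw:+.1f}  loss={loss.item():.4f}  grad={p.grad.item():+.4f}")
# raw=+0.5  loss=0.2500  grad=-1.0000
# raw=-5.0  loss=1.1025  grad=-0.0210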

By the way, is there any reason why you don't use BCE or NLL loss here? Or do you have any advice about the loss function?

Looking forward to your reply.

About the original voxel models from HSP

Hello Zhiqin,

Thanks a lot for sharing the code and your great work!

I'm reading 2_gather_256vox_16_32_64.py. If it is convenient, could you please tell me the meaning of the data stored in the HSP .mat files? For example, I don't understand the meaning of the b or b1 vectors.
With that, I could gain a deeper understanding of how the sampling points (point-value pairs) in the voxels are obtained.

Thanks,
nzdbgyx

Question on data preprocessing step

Hello @czq142857,

Sorry to bother you again, but I have a few questions about the pre-processing step.
I want to generate the data required to run IM-NET (voxels and point-value pairs) from the object meshes in ShapeNetCore v1. I did the following steps:

  • I used the binvox command suggested in this comment to convert the .obj files to .binvox files.
  • Then I used the sampling code given here to create the voxels and point-value pairs.
  • I used the following code to visualize the voxels and sampled points at each resolution:
def save_vox(vox, filename):
    import numpy as np
    from PIL import Image
    # Max-projection of the voxel grid along each axis, scaled to 0-255.
    # (Cast to uint8 so Image.fromarray accepts the array regardless of the voxel dtype.)
    a = Image.fromarray((np.max(vox, axis=0) * 255).astype(np.uint8))
    b = Image.fromarray((np.max(vox, axis=1) * 255).astype(np.uint8))
    c = Image.fromarray((np.max(vox, axis=2) * 255).astype(np.uint8))
    # Place the three projections side by side in a single image.
    img = Image.new('RGB', (a.width + b.width + c.width, a.height))
    img.paste(a, (0, 0))
    img.paste(b, (a.width, 0))
    img.paste(c, (a.width + b.width, 0))
    img.save(filename + ".jpg")

Output:
[image: projections of my voxels and sampled points]

I also visualized the voxels and point-value pairs in the ready-to-use data that you have provided here. I used the above code to visualize the sampled points for the same 3 examples, and the visualizations are different.
[image: projections of the provided ready-to-use data]

The main difference in the HSP-sampled data is that even though the points are sampled at 16, 32, and 64 resolutions, the actual point coordinates are in 256^3 voxel space, whereas that is not the case with the earlier sampling that uses the code given here.

Could you please tell me why there is a difference between the pre-processed data, even though I am using the same dataset (ShapeNetCore.v1)? Which one is correct? And if I want to get the correct data required to run IM-NET from the mesh files, how can I obtain it?

Thank you,
Supriya

Training with the ready-to-use data

Hey and thanks for sharing this repository.
I've been trying to train the AE with the ready-to-use data that you shared.
However, I'm not quite sure about the train-test split or how to acquire the all_vox256 data.
In the link you shared (https://drive.google.com/open?id=1ykE6MB2iW1Dk5t4wRx85MgpggeoyAqu3), I found several zip files, each related to a different type of object.
Inside the data folder there are a _vox.hdf5 file and a _vox_z.hdf5 file, but it is unclear what the model takes as input. The txt files (table.txt, for instance) are empty.
Could you elaborate on these matters?

Thanks a lot,
Tal

About test

Hi,
I want to input only a single image, without its voxels, for reconstruction. Is that feasible?

Reproducing for Point Cloud to Mesh Reconstruction

Hi,

I am looking to convert a point cloud to mesh using IMNet.

I am able to reproduce the paper's results on the ShapeNet dataset. However, when training from scratch or applying transfer learning on my own dataset, I am unable to reproduce the same quality of reconstruction.

Steps taken so far:

  1. Convert Point Cloud to Voxel Grid
  2. Use the point sampling script to sample the point-value pairs and voxels
  3. Train the model

My hypothesis is that my point-cloud-to-voxel conversion is not as good and smooth as the original ShapeNet voxels you used. Would you happen to have any idea how to properly voxelize a point cloud and get the point-value samples from it?
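For context, a naive voxelization of the kind I am describing looks roughly like this (a sketch, not the repo's pipeline); note that it produces only a hollow surface shell, which may differ from the filled solids in the original ShapeNet voxels:

import numpy as np

def voxelize_points(points, dim=64):
    # points: (N, 3) array; returns a dim^3 occupancy grid.
    pts = points - points.min(axis=0)
    pts = pts / pts.max()                                     # longest side scaled to 1
    idx = np.clip((pts * (dim - 1)).round().astype(int), 0, dim - 1)
    vox = np.zeros((dim, dim, dim), dtype=np.uint8)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return vox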

inconsistency with AE training config and data

IM-NET-pytorch/modelAE.py

Lines 152 to 163 in 31b9cb2

if self.sample_vox_size==16:
    self.load_point_batch_size = 16*16*16
    self.point_batch_size = 16*16*16
    self.shape_batch_size = 32
elif self.sample_vox_size==32:
    self.load_point_batch_size = 16*16*16
    self.point_batch_size = 16*16*16
    self.shape_batch_size = 32
elif self.sample_vox_size==64:
    self.load_point_batch_size = 16*16*16*4
    self.point_batch_size = 16*16*16
    self.shape_batch_size = 32

The numbers here seem inconsistent with the data (16**3 * 2 points are sampled for 32^3 voxels, and 32**3 points for 64^3 voxels), which may lead to incomplete training.
point_batch_num = int(self.load_point_batch_size/self.point_batch_size)

Also, can I simply set self.load_point_batch_size to data_points.shape[1]?
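A minimal sketch of what I mean (the shapes are assumptions based on the point counts quoted above):

import numpy as np

# Placeholder standing in for the points array loaded from the hdf5 file
# (assumed shape: [num_shapes, num_points, 3]).
data_points = np.zeros((35019, 32 ** 3, 3), dtype=np.float32)

load_point_batch_size = data_points.shape[1]                   # 32768 for 64^3 sampling
point_batch_size = 16 * 16 * 16
point_batch_num = load_point_batch_size // point_batch_size    # 8, so every point is used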

Pretrained model

Is there a pretrained model that can be evaluated directly? Thank you so much for the neat work.

Question on Inference

Hello @czq142857,
Thanks for sharing the code.

I want to test the model using an external image.
Could you please walk me through the procedure? I cannot see any way to pass an image to the model.

About AE results and training loss

Hi, Zhiqin

I tested the AE model on 02828884_bench and found that its output may not be symmetrical. Besides, thin structures such as the bottom legs may disappear. Could you give me an explanation for this? In my opinion, the small ratio of sampled points on thin structures makes the legs disappear, but I could not find an explanation for the asymmetry in the horizontal direction.

Here are some results.
https://jacksun.top/index.php/s/gkABL7Hq9XpKyRa
https://jacksun.top/index.php/s/dqiHdSdSnyoHtB5
https://jacksun.top/index.php/s/PRR4DHaESnDsNt2

Another question: the positive points (value = 1) account for only about 5%~10% of the samples, so is a naive MSE loss a good choice for such an optimization?

Best,
sun

Code for latent GAN

Hi Zhiqin,
Thanks for the great work and code.

Sorry, but I did not find the code for the latent GAN in this repo, like the one in the original repo and the improved TF repo.

Thanks :)

The issue of infinite loop here

Dear Zhiqin Chen,
Thank you for sharing your code, and I'm sorry to bother you.

I'm a Master's student in South Korea.
I have an issue when testing the IM_AE:
the error 'infinite loop here!' appears in my cmd window.
I cannot find the reason for this issue.
Could you please clarify what this error means and why it occurs?

Issues with reconstruction / Discrepancy in metric (CD)

Hello,

First of all, thank you for open-sourcing the code and the amazing work! I am attempting to decode the trained latent vectors (as well as generate latent vectors from the pre-generated voxels) provided by you to reconstruct the shapes (by passing them through the decoder). In my experiments, I only consider the Chair class of ShapeNet. However, I observe some imperfect reconstructions similar to the one attached (the attached reconstruction is for model ce2ff5c3a103b2c17ad11050da24bb12). I also attempted to compute the Chamfer Distance using the implementation here, sampling 10000 points on the mesh, and observe a score of 6.1 (multiplied by 10^3). Isn't this higher than the value reported in the paper? Is the evaluation done differently from DeepSDF's computation of the Chamfer Distance? Furthermore, I also observe a similar difference in the CD reported in Table 1 of the PQ-Net paper (https://arxiv.org/pdf/1911.10949.pdf). While I understand that the PQ-Net comparison is over the PartNet dataset whereas the IM-Net paper uses ShapeNet, isn't the difference still large?
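For reference, a minimal sketch of the symmetric Chamfer Distance variant I have in mind (the linked implementation may use a different convention; sum vs. mean and squared vs. unsquared distances alone can shift the reported numbers considerably):

import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p1, p2):
    # Symmetric Chamfer Distance: mean squared nearest-neighbour distance, both ways.
    d12, _ = cKDTree(p2).query(p1)
    d21, _ = cKDTree(p1).query(p2)
    return np.mean(d12 ** 2) + np.mean(d21 ** 2)

# e.g. with 10000 points sampled from each mesh:
# cd = chamfer_distance(points_pred, points_gt) * 1e3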

[image: reconstruction of model ce2ff5c3a103b2c17ad11050da24bb12]

I observe similarly imperfect reconstructions for relatively complicated chairs, such as the ones with wheels underneath. Is this due to a shortage of such examples in the training data?
Thank you in advance for your time and clarification.

Regards,
Ramana

Train imnet using other data

Dear Zhiqin Chen,
Thank you for sharing your code, and I'm sorry to bother you.
I'm sorry for my poor English; if my questions are hard to read, I can email you in Chinese.
I met some problems when preparing training data.
I want to use the IMNet-pytorch version to train my model on the PartNet V0 dataset (https://shapenet.org/download/parts), and I ran into some problems when preprocessing the models and pictures.
I have seen the earlier questions about acquiring binvox files in your project, and used the command:
"binvox -bb -0.5 -0.5 -0.5 0.5 0.5 0.5 -d 64 -e input.obj"
to get the binvox files, then used vox2pointvaluepair_from_binvox to get the h5 voxel files, because I don't know how to get voxels at 256^3 resolution. (I have also looked at the HSP data; its 'b' voxels seem meaningless to me, so I have no idea how to create my own mat files.)
Then I used 3_gather_img to gather the rendered views of my models (although every model has only one view, and I hope that won't affect training).
So here are my questions:
  1. Can I just use 64^3-resolution binvox files to train the PyTorch version of IMNet?
  2. When preparing the rendered views, I found that my png files have no alpha channel. Can I use these png files, and how should I rewrite the code? If not, how can I get the right png files for training? (My current guess is sketched below.)
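My current guess for handling the missing alpha channel (a sketch of my own, independent of the 3_gather_img script): composite over white when alpha exists, otherwise use the RGB image directly, then convert to grayscale.

from PIL import Image

img = Image.open("render_00.png")                 # path is illustrative
if img.mode == "RGBA":
    # Composite the render over a white background using its alpha channel.
    bg = Image.new("RGBA", img.size, (255, 255, 255, 255))
    img = Image.alpha_composite(bg, img)
gray = img.convert("L")                           # single-channel image
gray.save("render_00_gray.png")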
Thank you, and I'm sorry for bothering you again.
Thanks
Zh T

Question of removing all skip connections

Hi Zhiqin,
Thanks for your great work and codes.

I found that you removed all skip connections of the decoder in the IM-NET improved PyTorch implementation, which is also mentioned in section 3.2 of your paper: "They can be removed when the feature vector is long, so as to prevent the model from becoming too large."
I wonder whether this is necessary, and how the performance changes, since I consider the skip-connected structure an important part of IM-NET.

Looking forward to your reply. Thanks in advance!
