Giter VIP home page Giter VIP logo

im-net-pytorch's Introduction

IM-NET-pytorch

PyTorch 1.2 implementation for paper "Learning Implicit Fields for Generative Shape Modeling", Zhiqin Chen, Hao (Richard) Zhang.

Improvements

In short, this repo is an implementation of IM-NET with the framework provided by BSP-NET-pytorch.

The improvements over the original implementation is the same as IM-NET (improved TensorFlow1 implementation):

Encoder:

  • In IM-AE (autoencoder), changed batch normalization to instance normalization.

Decoder (=generator):

  • Changed the first layer from 2048-1024 to 1024-1024-1024.
  • Changed latent code size from 128 to 256.
  • Removed all skip connections.
  • Changed the last activation function from sigmoid to clip ( max(min(h, 1), 0) ).

Training:

  • Trained one model on the 13 ShapeNet categories as most Single-View Reconstruction networks do.
  • For each category, sort the object names and use the first 80% as training set, the rest as testing set, same as AtlasNet.
  • Reduced the number of sampled points by half in the training set. Points were sampled on 2563 voxels.
  • Removed data augmentation (image crops), same as Occupancy Networks.
  • Added coarse-to-fine sampling for inference to speed up testing.
  • Added post-processing to make the output mesh smoother. To enable, find and uncomment all "self.optimize_mesh(vertices,model_z)".

Citation

If you find our work useful in your research, please consider citing:

@article{chen2018implicit_decoder,
  title={Learning Implicit Fields for Generative Shape Modeling},
  author={Chen, Zhiqin and Zhang, Hao},
  journal={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}

Dependencies

Requirements:

Our code has been tested on Ubuntu 16.04 and Windows 10.

Datasets and pre-trained weights

The original voxel models are from HSP.

The rendered views are from 3D-R2N2.

Since our network takes point-value pairs, the voxel models require further sampling.

For data preparation, please see directory point_sampling.

We provide the ready-to-use datasets in hdf5 format, together with our pre-trained network weights.

Backup links:

Usage

Please use the provided scripts train_ae.sh, train_svr.sh, test_ae.sh, test_svr.sh to train the network on the training set and get output meshes for the testing set.

To train an autoencoder, use the following commands for progressive training.

python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_16 --sample_vox_size 16
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_32 --sample_vox_size 32
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_64 --sample_vox_size 64

The above commands will train the AE model 200 epochs in 163 resolution, then 200 epochs in 323 resolution, and finally 200 epochs in 643 resolution. Training on the 13 ShapeNet categories takes about 3 days on one GeForce RTX 2080 Ti GPU.

After training, you may visualize some results from the testing set.

python main.py --ae --sample_dir samples/im_ae_out --start 0 --end 16

You can specify the start and end indices of the shapes by --start and --end.

To train the network for single-view reconstruction, after training the autoencoder, use the following command to extract the latent codes:

python main.py --ae --getz

Then use the following commands to train the SVR model:

python main.py --svr --train --epoch 1000 --sample_dir samples/all_vox256_img1

After training, you may visualize some results from the testing set.

python main.py --svr --sample_dir samples/im_svr_out --start 0 --end 16

License

This project is licensed under the terms of the MIT license (see LICENSE for details).

im-net-pytorch's People

Contributors

czq142857 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.