Semi-parametric Object Synthesis

Rotating cars

Abstract

We present a new semi-parametric approach to synthesize novel views of an object from a single monocular image. First, we exploit man-made object symmetry and piece-wise planarity to integrate rich a-priori visual information into the novel viewpoint synthesis process. An Image Completion Network (ICN) then leverages 2.5D sketches rendered from a 3D CAD as guidance to generate a realistic image. In contrast to concurrent works, we do not rely solely on synthetic data but leverage instead existing datasets for 3D object detection to operate in a real-world scenario. Differently from competitors, our semi-parametric framework allows the handling of a wide range of 3D transformations. Thorough experimental analysis against state-of-the-art baselines shows the efficacy of our method both from a quantitative and a perceptive point of view.

Paper

  
@article{palazzi2019semi,
  title={Semi-parametric Object Synthesis},
  author={Palazzi, Andrea and Bergamini, Luca and Calderara, Simone and Cucchiara, Rita},
  journal={arXiv preprint arXiv:1907.10634},
  year={2019}
}

Code

Install

Run the following in a fresh Python 3.6 environment to install all dependencies:

pip install -r requirements.txt

The code was tested on Ubuntu Linux only (16.04 and 17.04).

How to run

To run our demo code, you need to download the following:

  • Pascal3D+ vehicles dataset (.zip file here)
  • 3D CADs (.zip file here)
  • Pre-trained weights (.pth file here)

Extract both archives into a location of your choice, <data_root>, and move the pre-trained weights file there too.

The entry point is run_rotate.py. The script takes three mandatory arguments: the car dataset directory, the pre-trained weights file, and the CAD directory.

It can be run as follows:

python run_rotate.py <data_root>/pascal_car <data_root>/weights.pth <data_root>/cad --device cpu
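Before launching the demo, it can help to confirm that the data root has the layout the command above expects. The following is a minimal sketch, not part of the repository: the helper name `check_data_root` is hypothetical, and the entry names `pascal_car`, `weights.pth` and `cad` are simply taken from the example command, so adjust them if your extracted archives are named differently.

```python
from pathlib import Path


def check_data_root(data_root):
    """Return the names of expected entries that are missing under data_root.

    The entry names mirror the example command and are an assumption about
    how the archives were extracted.
    """
    root = Path(data_root)
    expected = {
        "pascal_car": root / "pascal_car",    # extracted Pascal3D+ vehicles dataset
        "weights.pth": root / "weights.pth",  # pre-trained weights file
        "cad": root / "cad",                  # extracted 3D CADs
    }
    return [name for name, path in expected.items() if not path.exists()]
```

Calling `check_data_root("<data_root>")` with your actual path returns an empty list when all three entries are in place.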

Description and usage

If everything went well, you should see a GUI like the following:

Viewport

The GUI is composed of two windows: the viewport and the output window.

While the focus is on the viewport, you can use the keyboard to move around the object in spherical coordinates. The full list of commands is provided here. As you move, the output window shows both the Image Completion Network (ICN) inputs (2.5D sketches and appearance prior) and the network prediction. Please refer to Sec. 3 of the paper for details.
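To make "moving in spherical coordinates" concrete, the sketch below converts a radial distance, azimuth and elevation into a Cartesian camera position on a sphere centred on the object. It is an illustration only: the function name and the axis convention (z up, azimuth measured in the x-y plane) are assumptions, and the repository may use a different convention.

```python
import math


def spherical_to_cartesian(radius, azimuth_deg, elevation_deg):
    """Camera position on a sphere centred on the object.

    Convention assumed here (not taken from the repository): z points up,
    azimuth is measured in the x-y plane from the x axis, and elevation is
    measured from the x-y plane towards z.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)
    return x, y, z
```

Under this convention, changing only the radius zooms in or out, changing the azimuth orbits around the object, and changing the elevation raises or lowers the camera.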

Note: when the program starts, Open3D may not render anything. This is an initialization issue. If it happens, focus on the viewport and press the spacebar a few times until both windows render properly.

Supplementary Material

Other classes

Rotating cars

Extreme viewpoint transformations (see Sec. 4)

Due to its semi-parametric nature, our method is much more robust to extreme viewpoint changes than competing methods.

Here are some examples:

Zoom gif
Manipulation of radial distance.

Elevation gif
Manipulation of elevation.

Rototranslation gif
Arbitrary rototranslation.

Data augmentation (see Sec. 4.4)

Additional examples generated synthetically using our model are shown below.

Each row is generated as follows. Given an image from Pascal3D+, other examples in the same pose are randomly sampled from the dataset. Then, our method is used to transfer the appearance of the latter to the pose of the first. Finally, the generated vehicles are stitched onto the original image. For seamless collaging, we apply a small Gaussian blur at the mask border.
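The stitching step can be sketched as alpha compositing with a feathered mask. This is one common way to soften a mask border and is an assumption, not necessarily the exact procedure used to produce the images: the function names, the `sigma` value and the NumPy-only blur are all illustrative.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 sigma, normalised to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()


def blur_mask(mask, sigma=2.0):
    """Separable Gaussian blur of a 2D mask, with edge padding."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(mask.astype(np.float64), pad, mode="edge")
    # Blur along rows, then along columns; 'valid' removes the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def composite(background, generated, mask, sigma=2.0):
    """Paste `generated` onto `background` using a feathered mask.

    background, generated: float arrays of shape (H, W, 3).
    mask: array of shape (H, W), 1 inside the object and 0 outside.
    """
    alpha = blur_mask(mask, sigma)[..., None]  # soft alpha, shape (H, W, 1)
    return alpha * generated + (1.0 - alpha) * background
```

Blurring the binary mask leaves pixels well inside and well outside the object unchanged, so only a thin band around the border is blended between the generated vehicle and the original image.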

Generated data

Percentage of Correct Keypoints (PCK) logged in TensorBoard during training (see Sec. 4.4)

PCK graph
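For readers unfamiliar with the metric, the sketch below computes PCK in its usual form: a predicted keypoint counts as correct when it lies within `alpha * max(h, w)` of the ground truth, where `h` and `w` are the object bounding box height and width. The function name and the default `alpha=0.1` are assumptions for illustration; see Sec. 4.4 of the paper for the exact protocol.

```python
import math


def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of keypoints predicted within alpha * max(h, w) of the ground truth.

    pred, gt: sequences of (x, y) keypoints in the same order.
    bbox_size: (height, width) of the object bounding box.
    alpha: tolerance factor; 0.1 is a common choice and is assumed here.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of keypoints")
    if not gt:
        return 0.0
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)
```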
