Trending in 3D Vision

I first became fascinated by the beauty of 3D vision in 2015. Since then, so many new and wonderful ideas and works have been brought into this field that it is hard to keep up with such a fast-evolving area. This is the main motivation behind this paper reading list: to get a sense of current SOTA methods and an overview of research trends in 3D vision, mainly with deep learning.

From this list, you may conclude that diverse applications, multiple data modalities, and powerful neural backbones are the major workhorses; or that the boom in neural radiance fields and differentiable rendering has inspired many new methods and tasks; or that self-supervision and data-efficient learning are the critical keys. Different people may have different opinions, but this list is about what is already possible in 3D vision, to which you may say 'wow, this is even possible' or 'aha, I never imagined such a method'.

Note that this repo started as a self-collected paper list based on my own interests, so it may reflect some bias. Some papers may not be precisely categorized; if you spot one, you can raise an issue or send a pull request.

[Chen et al. (ARXIV '22)] Vision-based Large-scale 3D Semantic Mapping for Autonomous Driving Applications

[Avraham et al. (ARXIV '22)] Nerfels: Renderable Neural Codes for Improved Camera Pose Estimation

[Hughes et al. (ARXIV '22)] Hydra: A Real-time Spatial Perception Engine for 3D Scene Graph Construction and Optimization [Video]

[Zhu et al. (CVPR '22)] NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [Project] [Code] [Video]

[Teed et al. (NeurIPS '21)] DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras [Code] [Video]

[Yang et al. (3DV '21)] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [Project] [Code] [Video]

[Lin et al. (ARXIV '21)] R3LIVE: A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package [Code] [Video]

[Duzceker et al. (CVPR '21)] DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion [Code] [Video]

[Teed et al. (CVPR '21)] Tangent Space Backpropagation for 3D Transformation Groups [Code]

[Sun et al. (CVPR '21)] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video [Project] [Code]

[Murthy J. et al. (ICRA '20)] ∇SLAM: Automagically differentiable SLAM [Project] [Code]

[Schops et al. (CVPR '19)] BAD SLAM: Bundle Adjusted Direct RGB-D SLAM [Project] [Code]

Human avatars

[Su et al. (ARXIV '22)] DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks [Project]

[Jiang et al. (CVPR '22)] SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video [Project] [Code]

[Weng et al. (CVPR '22)] HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video [Project] [Code] [Video]

[Yu et al. (CVPR '21)] Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors [Project] [Data] [Video]

Animal capture

[Yang et al. (CVPR '22)] BANMo: Building Animatable 3D Neural Models from Many Casual Videos [Project] [Code] [Video]

[Wu et al. (NeurIPS '21)] DOVE: Learning Deformable 3D Objects by Watching Videos [Project] [Video]

Human-object interaction

[Jiang et al. (CVPR '22)] NeuralHOFusion: Neural Volumetric Rendering under Human-object Interactions [Project] [Video]

[Hasson et al. (CVPR '21)] Towards unconstrained joint hand-object reconstruction from RGB videos [Project] [Code]

Scene-level 3D dynamics

[Grauman et al. (CVPR '22)] Ego4D: Around the World in 3,000 Hours of Egocentric Video [Project] [Code]

[Li et al. (CVPR '22)] Neural 3D Video Synthesis From Multi-View Video [Project] [Video] [Data]

[Zhang et al. (SIGGRAPH '21)] Consistent Depth of Moving Objects in Video [Project] [Code] [Video]

[Teed et al. (CVPR '21)] RAFT-3D: Scene Flow using Rigid-Motion Embeddings [Code]

[Lu et al. (CVPR '21)] Omnimatte: Associating Objects and Their Effects in Video [Project] [Code] [Video]

[Hasselgren et al. (ARXIV '22)] Shape, Light & Material Decomposition from Images using Monte Carlo Rendering and Denoising

[Gkioxari et al. (ARXIV '22)] Learning 3D Object Shape and Layout without 3D Supervision [Project] [Video]

[Boss et al. (ARXIV '22)] SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections [Project] [Video]

[Wei et al. (SIGGRAPH '22)] Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search [Project] [Code] [Video]

[Vicini et al. (SIGGRAPH '22)] Differentiable Signed Distance Function Rendering [Project] [Video]

[Or-El et al. (CVPR '22)] StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation [Project] [Code] [Demo]

[Girdhar et al. (CVPR '22)] Omnivore: A Single Model for Many Visual Modalities [Project] [Code]

[Noguchi et al. (CVPR '22)] Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects [Project] [Code] [Video]

[Gong et al. (CVPR '22)] PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision [Code]

[Wu et al. (CVPR '22)] Toward Practical Monocular Indoor Depth Estimation [Project] [Code] [Video] [Data]

[Wei et al. (CVPR '22)] Self-supervised Neural Articulated Shape and Appearance Models [Project] [Video]

[Chan et al. (CVPR '22)] EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks [Project] [Code] [Video]

[Rombach et al. (ICCV '21)] Geometry-Free View Synthesis: Transformers and no 3D Priors

[Sajjadi et al. (CVPR '22)] Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations [Project] [Code] [Video]

[Harley et al. (CVPR '21)] Track, Check, Repeat: An EM Approach to Unsupervised Tracking [Project] [Code]

[Watson et al. (CVPR '21)] The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth [Code] [Video]

[Nicolet et al. (SIGGRAPH Asia '21)] Large Steps in Inverse Rendering of Geometry [Project] [Code] [Video]

[Wu et al. (CVPR '20)] Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild [Project] [Code] [Video]

Topology-aware

[Palafox et al. (CVPR '22)] SPAMs: Structured Implicit Parametric Models [Project] [Video]

[Park et al. (SIGGRAPH Asia '21)] A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields [Project] [Code] [Video]

Additional priors

[Guo et al. (CVPR '22)] Neural 3D Scene Reconstruction with the Manhattan-world Assumption [Project] [Code] [Video]

Faster, memory-efficient

[Chen et al. (ARXIV '22)] TensoRF: Tensorial Radiance Fields [Project] [Code]

[Müller et al. (ARXIV '22)] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [Project] [Code] [Video]

[Schwarz et al. (ARXIV '22)] VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids

[Takikawa et al. (SIGGRAPH '22)] Variable Bitrate Neural Fields [Project] [Video]

[Sun et al. (CVPR '22)] Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction [Project] [Code] [Video] [DVGOv2]

[Yu et al. (CVPR '22)] Plenoxels: Radiance Fields without Neural Networks [Project] [Code] [Video]

[Xu et al. (CVPR '22)] Point-NeRF: Point-based Neural Radiance Fields [Project] [Code]

[Deng et al. (CVPR '22)] Depth-Supervised NeRF: Fewer Views and Faster Training for Free [Project] [Code] [Video]

[Takikawa et al. (CVPR '21)] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [Project] [Code] [Video]

[Garbin et al. (CVPR '21)] FastNeRF: High-Fidelity Neural Rendering at 200FPS [Project] [Video]

Dynamic

[Fang et al. (ARXIV '22)] TiNeuVox: Fast Dynamic Radiance Fields with Time-Aware Neural Voxels [Project] [Code] [Video]

[Wang et al. (CVPR '22)] Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time [Project] [Video]

[Gao et al. (ICCV '21)] Dynamic View Synthesis from Dynamic Monocular Video [Project] [Code] [Video]

[Li et al. (CVPR '21)] Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [Project] [Code] [Video]

Editable

[Zhang et al. (ARXIV '22)] ARF: Artistic Radiance Fields [Project] [Code & Data]

[Kobayashi et al. (ARXIV '22)] Decomposing NeRF for Editing via Feature Field Distillation [Project]

[Benaim et al. (ARXIV '22)] Volumetric Disentanglement for 3D Scene Manipulation [Project]

[Lazova et al. (CVPR '22)] Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation

[Yuan et al. (CVPR '22)] NeRF-Editing: Geometry Editing of Neural Radiance Fields

Generalizable

[Yu et al. (ARXIV '22)] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [Project]

[Rebain et al. (CVPR '22)] LOLNeRF: Learn from One Look [Project]

[Chen et al. (ICCV '21)] MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo [Project] [Code] [Video]

[Yu et al. (CVPR '21)] pixelNeRF: Neural Radiance Fields from One or Few Images [Project] [Code] [Video]

Large-scale

[Tancik et al. (ARXIV '22)] Block-NeRF: Scalable Large Scene Neural View Synthesis [Project] [Video]

[Zhang et al. (CVPR '22)] NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction [Project]

Sparse input

[Long et al. (ARXIV '22)] SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse views [Project]

[Suhail et al. (CVPR '22)] Light Field Neural Rendering [Project] [Code]

[Niemeyer et al. (CVPR '22)] RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs [Project] [Code] [Video]

Datasets

[Downs et al. (ARXIV '22)] Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items [Blog] [Data]

[Wiersma et al. (SIGGRAPH '22)] DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds [Project] [Code] [Supp.]

[Ran et al. (CVPR '22)] Surface Representation for Point Clouds [Code]

[Mittal et al. (CVPR '22)] AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation [Project] [Code]

[Chen et al. (CVPR '22)] The Devil is in the Pose: Ambiguity-free 3D Rotation-invariant Learning via Pose-aware Convolution

[Jakab et al. (CVPR '21)] KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control [Project] [Code] [Video]

[Yang et al. (CVPR '22)] ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis

[Yin et al. (CVPR '22)] FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering [Project] [Code]

[Sun et al. (CVPR '22)] OnePose: One-Shot Object Pose Estimation without CAD Models [Project] [CodeSoon] [Supp]

[Deng et al. (NeurIPS '21)] Revisiting 3D Object Detection From an Egocentric Perspective

[Li et al. (NeurIPS '21)] Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation [Project] [Code]

[Lu et al. (ICCV '21)] Geometry Uncertainty Projection Network for Monocular 3D Object Detection [Code]

[Ahmadyan et al. (CVPR '21)] Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations [Project] [Code]

[Murphy et al. (ICML '21)] Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [Project] [Code] [Video] [Data]

[Kim et al. (ARXIV '22)] Conditional Motion In-betweening [Project]

[He et al. (ARXIV '22)] NeMF: Neural Motion Fields for Kinematic Animation

[Ianina et al. (CVPR '22)] BodyMap: Learning Full-Body Dense Correspondence Map [Project] [Supp]

[Muralikrishnan et al. (CVPR '22)] GLASS: Geometric Latent Augmentation for Shape Spaces [Project] [Code] [Video]

[Taheri et al. (CVPR '22)] GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping [Project] [Video] [Code]

[Aigerman et al. (SIGGRAPH '22)] Neural Jacobian Fields: Learning Intrinsic Mappings of Arbitrary Meshes

[Raab et al. (SIGGRAPH '22)] MoDi: Unconditional Motion Synthesis from Diverse Data [Project(need fix)] [Video]

[Li et al. (SIGGRAPH '22)] GANimator: Neural Motion Synthesis from a Single Sequence [Project] [Code] [Video]

[Wang et al. (NeurIPS '21)] MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images [Project] [Code] [Video]

[Henter et al. (SIGGRAPH Asia '20)] MoGlow: Probabilistic and controllable motion synthesis using normalising flows [Project] [Code] [Video]

[Driess et al. (ARXIV '22)] Reinforcement Learning with Neural Radiance Fields [Project] [Video]

[Gao et al. (CVPR '22)] ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer [Project] [Code] [Video]

[Ortiz et al. (RSS '22)] iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [Project] [Code] [Video]

[Wi et al. (ICRA '22)] VIRDO: Visio-tactile Implicit Representations of Deformable Objects [Project] [Code]

[Adamkiewicz et al. (ICRA '22)] Vision-Only Robot Navigation in a Neural Radiance World [Project] [Code] [Video] [Data]

[Li et al. (CoRL '21)] 3D Neural Scene Representations for Visuomotor Control [Project] [Video]

[Ichnowski et al. (CoRL '21)] Dex-NeRF: Using a Neural Radiance Field to Grasp Transparent Objects [Project] [Dataset] [Video]

[Tevet et al. (ARXIV '22)] MotionCLIP: Exposing Human Motion Generation to CLIP Space [Project] [Code]

[Wang et al. (CVPR '22)] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields [Project] [Code] [Video]

[Michel et al. (ARXIV '21)] Text2Mesh: Text-Driven Neural Stylization for Meshes [Project] [Code]

Volume Rendering

[Sawhney et al. (SIGGRAPH '22)] Grid-free Monte Carlo for PDEs with spatially varying coefficients [Project] [Code]

More resources

trending-in-3d-vision's People

Contributors

dragonlong

trending-in-3d-vision's Issues

Recommended papers on 3D human reconstruction

Hi Xiaolong,

Thanks for making this great repo for 3D computer vision!

I would just like to mention three of our recent works:

  1. "3D human pose, shape and texture from low-resolution images and videos" (TPAMI 2021, repo: https://github.com/xuxy09/RSC-Net) for robust 3D human reconstruction through contrastive learning;
  2. "3D Human Texture Estimation from a Single Image with Transformers" (ICCV 2021 Oral, repo: https://github.com/xuxy09/Texformer) for self-supervised 3D human reconstruction;
  3. "Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering" (ECCV 2022, repo: https://github.com/sail-sg/GP-Nerf) for efficient and accurate human rendering.

It would be great if these works could be added to your paper list and shared with a wider audience.
