Direct Stereo Semi-Dense Visual Odometry and 3D Reconstruction

This was a course project for the 3D Scanning and Motion Capture lecture at the Technical University of Munich. It implements direct semi-dense image alignment for tracking a stereo camera (it currently only works on the KITTI dataset). It is only a re-implementation and combination of existing algorithms. Some of the highlights:

  • Real-time: on a single Intel CPU core (>= 4th generation)
  • Customized Optimization: all optimization procedures are implemented from scratch, no ceres-solver or other optimization frameworks are used
  • Algorithm Dependencies: all important algorithms are implemented from scratch, such as disparity search, depth/geometry optimization, pose optimization, etc
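
For a rectified stereo pair, the disparity search listed above feeds depth estimation through the standard pinhole relation z = f * b / d. The sketch below is not taken from the project's code: the function name `DepthFromDisparity` and its parameters are illustrative assumptions (focal length in pixels, baseline in meters, disparity in pixels).

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not from the project): converts a disparity in pixels
// to metric depth for a rectified stereo pair, using z = f * b / d.
// Returns +infinity for non-positive disparities (no valid match).
inline float DepthFromDisparity(float focal_px, float baseline_m,
                                float disparity_px) {
  assert(focal_px > 0.0f && baseline_m > 0.0f);
  if (disparity_px <= 0.0f) {
    return std::numeric_limits<float>::infinity();
  }
  return focal_px * baseline_m / disparity_px;
}
```

With made-up values of a 700 px focal length and a 0.5 m baseline, a 35 px disparity corresponds to a depth of 10 m. Depth resolution degrades quickly for small disparities, which is one reason accurate stereo matching matters.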

The course project itself is closed since the lecture has ended, but development continues as a personal project. The plan is to implement an RGB-D odometry using the existing code blocks. Stereo odometry requires non-trivial calibration and accurate stereo matching, both of which are complicated to implement, especially when real-time performance on a single CPU is needed. The RGB-D odometry takes the monocular RGB and depth outputs of the sensor (TUM RGB-D dataset or Intel RealSense) and outputs camera trajectories as well as reconstructed 3D geometry. The benefit of using the sensor's depth output directly is that geometry estimation becomes much simpler and easier to implement. If a live RGB-D sensor is used, the depth and RGB sensors still need to be calibrated against each other; for offline use, the calibration provided with the dataset (TUM RGB-D dataset) is used instead.
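
To illustrate why sensor depth simplifies geometry estimation: with a depth value per pixel, a 3D point follows directly from the pinhole model, with no disparity search. This is a minimal sketch under assumptions, not the project's code. The types `PinholeIntrinsics` and `Point3f` and both function names are hypothetical. The default scale of 5000 raw units per meter is the convention of the TUM RGB-D depth images; other sensors use different scales.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not part of the project's code.
struct PinholeIntrinsics {
  float fx, fy;  // focal lengths in pixels
  float cx, cy;  // principal point in pixels
};

struct Point3f {
  float x, y, z;  // camera-frame coordinates in meters
};

// Converts a raw 16-bit depth value to meters. The default assumes the
// TUM RGB-D convention (5000 raw units per meter).
inline float RawDepthToMeters(std::uint16_t raw,
                              float units_per_meter = 5000.0f) {
  return static_cast<float>(raw) / units_per_meter;
}

// Back-projects pixel (u, v) with metric depth into the camera frame.
inline Point3f BackProject(const PinholeIntrinsics& K, float u, float v,
                           float depth_m) {
  assert(depth_m > 0.0f);
  return Point3f{(u - K.cx) * depth_m / K.fx,
                 (v - K.cy) * depth_m / K.fy,
                 depth_m};
}
```

For example, with assumed intrinsics fx = fy = 500, cx = 320, cy = 240, the pixel (570, 240) at a raw depth of 10000 back-projects to the point (1, 0, 2) in meters.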

Related/Referenced Papers

  • Robust Odometry Estimation for RGB-D Cameras, C. Kerl, J. Sturm, D. Cremers, In Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 2013.
  • Dense Visual SLAM for RGB-D Cameras, C. Kerl, J. Sturm, D. Cremers, In Proc. of the Int. Conf. on Intelligent Robot Systems (IROS), 2013.
  • LSD-SLAM: Large-Scale Direct Monocular SLAM, J. Engel, T. Schöps, D. Cremers, ECCV '14.
  • Semi-Dense Visual Odometry for a Monocular Camera, J. Engel, J. Sturm, D. Cremers, ICCV '13.
  • DSO: Direct Sparse Odometry, J. Engel, V. Koltun, D. Cremers, In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018.

Requirements and Dependencies

  • CMake >= 3.8
  • Clang >= 7.0 (other compilers such as GCC or Intel also work; do not use Microsoft compilers)
  • C++14 standard (mainly for safety reasons, especially for pointers)
  • C++ standard libraries, Eigen >= 3.3, OpenCV >= 3.4
  • Sophus (camera motions as Lie groups and Lie algebras; only needed for estimating camera pose and scene/map reconstruction)
  • Boost (only for multi-threading)
  • (Optional) Pangolin (only for visualization)
  • (Optional) Kalibr (only for camera calibration)

Build and Compile Conventions

  • Always build every .cpp source file as a static library in the lib directory
  • Always separate .cpp and .h files into their corresponding folders
  • Always put third-party libraries or code into the third_party folder if CMake cannot find them automatically
  • DO NOT git commit build-related files (including executables) such as *.a, *.o, etc. Use .gitignore.
  • EXPLICITLY enable SIMD vectorization and CPU architecture optimizations when compiling

Code Style and Conventions

We follow Google C++ Style in general.

Datatype Rules/Notes

  • AVOID raw data types such as C arrays and raw pointers. Use the following instead:
    • Eigen::Array for (large) multi-dimensional data types
    • std::vector for contiguous (small-medium) sequence data types
    • use references for passing function arguments and returning results; always use smart pointers instead of raw pointers
    • smart pointers:
      • std::unique_ptr, std::make_unique
      • std::shared_ptr, std::make_shared
    • DO NOT use auto pointers, to avoid confusion
    • use raw pointers VERY CAREFULLY, and only if you must
  • USE Eigen::Matrix only for linear algebra related storage and operations
  • USE OpenCV cv::Mat only for image/camera related data storage and operations
  • USE 32-bit float for all floating-point data; use 64-bit double only if you must
  • USE 32-bit int for all (contiguous) integer data
  • DO NOT use unsigned int unless it is needed for interfacing with other libraries (the use of unsigned integer types is widely regarded as a design flaw in the C++ standard)
  • Aliasing in Eigen: be careful when the same Eigen object appears on both sides of an expression.
  • Alignment in Eigen: only an issue if you have fixed-size vectorizable objects.
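
The ownership and integer rules above can be sketched with the standard library alone. This is an illustrative example, not code from the project: the `Frame` type and the functions `MakeFrame`, `MeanIntensity` and `LastIndexAbove` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical frame type for illustration; not from the project.
struct Frame {
  std::int32_t id = 0;
  std::vector<float> intensities;  // contiguous 32-bit float storage
};

// Ownership is explicit: the caller receives the only owner of the frame.
inline std::unique_ptr<Frame> MakeFrame(std::int32_t id,
                                        std::int32_t num_pixels) {
  assert(num_pixels >= 0);
  auto frame = std::make_unique<Frame>();
  frame->id = id;
  frame->intensities.assign(static_cast<std::size_t>(num_pixels), 0.0f);
  return frame;
}

// Read-only access goes through a const reference: no copy and no
// ownership transfer.
inline float MeanIntensity(const Frame& frame) {
  if (frame.intensities.empty()) return 0.0f;
  float sum = 0.0f;
  for (float v : frame.intensities) sum += v;
  return sum / static_cast<float>(frame.intensities.size());
}

// A signed index makes the reverse loop terminate. With an unsigned index,
// `i >= 0` would always be true and `--i` would wrap around at zero.
inline std::int32_t LastIndexAbove(const std::vector<float>& values,
                                   float threshold) {
  for (std::int32_t i = static_cast<std::int32_t>(values.size()) - 1; i >= 0;
       --i) {
    if (values[static_cast<std::size_t>(i)] > threshold) return i;
  }
  return -1;  // not found
}
```

The casts at the std::vector boundary are the price of interfacing with the standard library's unsigned size type. Keeping them at the edges lets the rest of the arithmetic stay signed.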

Performance Concerns

  • Generally use a 32-bit aligned contiguous memory layout for all image matrices and (medium-large) data arrays (use 128-bit or 256-bit aligned memory if certain parallel operations can be performed with SSE/AVX registers)
  • USE fixed-size Eigen::Matrix or Eigen::Vector only for small matrices (up to 16 elements in total)
    • Optionally, make sure the number of elements is divisible by 4; if not, pad with additional zeros
  • USE dynamic-sized Eigen::Matrix or Eigen::Vector for all medium-large matrices (Eigen aligns memory automatically for large dynamic matrices)
  • OpenCV automatically allocates 32-bit aligned contiguous memory for cv::Mat, but still check it to be sure
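
The alignment and padding advice above can be checked with a few lines of standard C++. This is a sketch under assumptions, not the project's code: `IsAligned`, `PaddedSize` and `PaddedVec6` are hypothetical names. Alignments are given in bytes here, so 128-bit and 256-bit registers correspond to 16 and 32 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if `ptr` is aligned to `alignment_bytes`, which must be a
// power of two.
inline bool IsAligned(const void* ptr, std::size_t alignment_bytes) {
  assert(alignment_bytes != 0 &&
         (alignment_bytes & (alignment_bytes - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment_bytes - 1)) == 0;
}

// Rounds `n` up to the next multiple of `multiple`, e.g. 4 floats per
// 128-bit SSE register.
inline std::int32_t PaddedSize(std::int32_t n, std::int32_t multiple) {
  assert(n >= 0 && multiple > 0);
  return (n + multiple - 1) / multiple * multiple;
}

// A 6-element vector padded with zeros to 8 floats and aligned to 32 bytes,
// so it fills one 256-bit AVX register or two 128-bit SSE registers.
struct alignas(32) PaddedVec6 {
  float data[8] = {0.0f};
};
```

The same pointer check could be applied to the `data` pointer of a cv::Mat, and `cv::Mat::isContinuous()` reports whether the rows are stored without gaps.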

TODOs

  • Asynchronous Queue for camera/tracking interface
  • Camera tracking speed-up: 30 ms -> 10 ms
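
One possible shape for the asynchronous queue in the first TODO item is a blocking producer/consumer queue. This sketch is an assumption about the design, not existing project code: the class name `AsyncQueue` and its methods are hypothetical. It uses the C++ standard threading primitives instead of the Boost dependency listed above.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical blocking queue for handing frames from a camera thread to a
// tracking thread.
template <typename T>
class AsyncQueue {
 public:
  // Called by the producer (e.g. the camera thread).
  void Push(T item) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push_back(std::move(item));
    }
    not_empty_.notify_one();  // wake one waiting consumer
  }

  // Called by the consumer (e.g. the tracking thread). Blocks until an item
  // is available.
  T Pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !items_.empty(); });
    T item = std::move(items_.front());
    items_.pop_front();
    return item;
  }

  // Non-blocking variant: returns false if the queue is empty.
  bool TryPop(T& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return false;
    out = std::move(items_.front());
    items_.pop_front();
    return true;
  }

 private:
  std::mutex mutex_;
  std::condition_variable not_empty_;
  std::deque<T> items_;
};
```

The queue is unbounded. A real-time tracker would probably also need a policy for a lagging consumer, such as dropping stale frames, which this sketch leaves out.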

License

The source code is licensed under the GNU General Public License Version 3 (GPLv3), see http://www.gnu.org/licenses/gpl.html.

Contributors

  • wangyutum
