
octattention's Introduction

OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression. AAAI 2022 Paper.

Branches

There are two branches, obj and lidar, which implement object and LiDAR point cloud coding respectively. They share the same network. Note that the checkpoint file is saved separately in the corresponding branch. The model for LiDAR compression is here.

Requirements

  • python 3.7
  • PyTorch 1.9.0+cu102
  • file/environment.sh is provided to help you set up this environment

Download and Prepare Training and Testing Data

  • Download data

    For LiDAR compression

    SemanticKITTI (80G)
    23201/20351 frames in 00-10/11-21 folders for training/testing.

    For Object compression

    MPEG 8iVFBv2 (5.5GB)
    300/300 frames in soldier10 and longdress10 for training.
    300/300 frames in loot10 and redandblack10 for testing.

    MPEG 8iVSLF (100M)
    1/1/1/1 frame in Boxer9/10 and Thaidancer9/10 (quantized from 12-bit data) for testing.
    Please cite: Maja Krivokuća, Philip A. Chou, and Patrick Savill, “8i Voxelized Surface Light Field (8iVSLF) Dataset,” ISO/IEC JTC1/SC29 WG11 (MPEG) input document m42914, Ljubljana, July 2018.

    JPEG MVUB (8GB)
    318/216/207 frames in andrew10, david10 and sarah10 for training.
    245/245/216/216 frames in Phil9/10 and Ricardo9/10 for testing.
    (Note: We rotated the MVUB data to make it consistent with MPEG 8i. Please set rotation=True in the dataPrepare function when processing MVUB data in training and testing.)

  • Prepare data

Please set oriDir in dataPrepare.py first, then run:

python dataPrepare.py

This prepares the training and test data and generates *.mat files in the Data directory.
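
The exact contents of dataPrepare.py differ between the obj and lidar branches; as a rough sketch (the path below is a placeholder), the configuration step amounts to something like:

# dataPrepare.py (sketch; only oriDir and the rotation flag are taken from this README)
oriDir = '/path/to/SemanticKITTI/sequences/'   # placeholder: root folder of the raw point clouds
# For JPEG MVUB data, pass rotation=True to the dataPrepare function so the clouds are
# rotated to match the MPEG 8i orientation, e.g. dataPrepare(..., rotation=True)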

Train

python octAttention.py 

You should set the network parameters expName, DataRoot, etc. in networkTool.py. Training will output checkpoints in the expName folder, e.g. Exp/Kitti. (Note: for a new dataset you should run DataFolder.calcdataLenPerFile() in dataset.py, and you can comment it out once you have obtained the dataLenPerFile parameter.)
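
A minimal sketch of that note follows. The values below are placeholders, except 17456051, which is the SemanticKITTI value used in the repository; the DataFolder call mirrors the one quoted in the issues further down.

import dataset

trainDataRoot = 'Data/train/'   # placeholder: folder of *.mat files from dataPrepare.py
TreePoint = 4096                # placeholder: the real value is set in networkTool.py

# For a new dataset, run DataFolder.calcdataLenPerFile() (see dataset.py) once to obtain
# dataLenPerFile, then hard-code the result and comment the call out:
train_set = dataset.DataFolder(root=trainDataRoot, TreePoint=TreePoint,
                               transform=None, dataLenPerFile=17456051)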

Encode and Decode

You may need to run the following command to give pc_error and tmc13v14_r (release version) execute permission.

chmod +x file/pc_error file/tmc13v14_r 
  • Encode

python encoder.py  

This will output binary codes in .bin format under Exp(expName)/data and generate *.mat data in the directory Data/testPly.

  • Decode

python decoder.py 

This will load the *.mat data for checking and calculate the PSNR using pc_error.
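
For reference, a minimal sketch of how the pc_error check can be invoked from Python. The file paths and the resolution come from the tool output quoted in the issues below, but the flag names (--fileA, --fileB, --resolution) are an assumption based on the common mpeg-pcc-dmetric build and may differ for the binary shipped in file/.

import subprocess

cmd = [
    './file/pc_error',
    '--fileA=temp/data/pt0pcRefer.ply',   # reference point cloud
    '--fileB=temp/data/pt0pc.ply',        # decoded point cloud
    '--resolution=1023',                  # peak value used for PSNR (10-bit geometry)
]
# Flag names are assumed; check the binary's help output if they are rejected.
print(subprocess.run(cmd, capture_output=True, text=True).stdout)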

Test TMC

We provide the test code for TMC13 v14 (G-PCC) for Object and LiDAR point cloud compression.

python testTMC.py
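
For reference, a sketch of the kind of calls testTMC.py issues. The parameter names and paths are taken from the "Effective configuration parameters" log quoted in the issues below; passing them as --name=value options is an assumption about this tmc13 build.

import subprocess

encode = [
    './file/tmc13v14_r',
    '--mode=0',                                   # 0 = encode
    '--uncompressedDataPath=./temp/tmc/p.ply',
    '--compressedStreamPath=./temp/tmc/tmc.bin',
    '--disableAttributeCoding=1',                 # geometry only, as in the quoted log
]
decode = [
    './file/tmc13v14_r',
    '--mode=1',                                   # 1 = decode
    '--compressedStreamPath=./temp/tmc/tmc.bin',
    '--reconstructedDataPath=./temp/tmc/recPt.ply',
]
subprocess.run(encode, check=True)
subprocess.run(decode, check=True)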

Citation

If this work is useful for your research, please consider citing:

@article{OctAttention,
  title   = {OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression},
  author  = {Fu, Chunyang and Li, Ge and Song, Rui and Gao, Wei and Liu, Shan},
  journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume  = {36},
  number  = {1},
  pages   = {625--633},
  year    = {2022},
  month   = {Jun.},
  url     = {https://ojs.aaai.org/index.php/AAAI/article/view/19942},
  doi     = {10.1609/aaai.v36i1.19942}
}

octattention's People

Contributors

zb12138


octattention's Issues

Question about decoding result.

Thanks so much for your sharing!
I have run the original encoder.py and decoder.py, using either the checkpoint you provided or a checkpoint I trained myself, but I get the results below.

  1. Use infile1 (A) as reference, loop over A, use normals on B. (A->B).
    mse1 (p2point): 0
    mse1,PSNR (p2point): inf
  2. Use infile2 (B) as reference, loop over B, use normals on A. (B->A).
    mse2 (p2point): 0
    mse2,PSNR (p2point): inf
  3. Final (symmetric).
    mseF (p2point): 0
    mseF,PSNR (p2point): inf

I couldn't find the problem after debugging. The visualized decoded point cloud has the correct contour but no texture information at all. Have you ever encountered this situation?

Question about 'dataLenPerFile'

Thank you so much for sharing a great project.

I have converted 23201 *.bin scan files from SemanticKITTI to *.mat files using dataPrepare.py,
and calculated 'dataLenPerFile' using DataFolder.calcdataLenPerFile() method as you mentioned.

But from the code below, your 'dataLenPerFile' seems to be 17456051, whereas in my case 121484 is obtained.
I wonder if this is a value that can vary depending on the environment.

train_set = dataset.DataFolder(root=trainDataRoot, TreePoint=TreePoint,transform=None,dataLenPerFile= 17456051) # you should run 'dataLenPerFile' in dataset.py to get this num (17456051.4)

Thank You.

Question about trained weight on SemanticKITTI

Thanks for sharing the great work!

I found that there is a trained weight located in the folder modelsave/obj, and it was trained on object point clouds.

Would it be convenient for you to upload the .pth files trained on SemanticKITTI? Thanks!

OSError: [WinError 193] %1 is not a valid Win32 application.

I am running dataPrepare.py.
This line: lib = cdll.LoadLibrary(os.path.dirname(os.path.abspath(__file__))+'/Octree_python_lib.so') # class level loading lib
raises an error at runtime: OSError: [WinError 193] %1 is not a valid Win32 application.
So I want to ask: what operating system are you using?

Some Questions about TreePoint

Hello, while reviewing and organizing your code, I noticed the variable "TreePoint" in the data preparation code for training. However, even after reading your comments, I still have some doubts about the meaning of this variable. Could you please explain it to me? Thank you very much!

question about tmc3_v14

Firstly, thank you for your great work! I have a question about using tmc3.
'/bin/sh: 1: ./tmc3_v14: Permission denied' is shown while testing G-PCC. Are there any solutions?

Nvidia GeForce RTX 3090 - 1 Year Training

Hi,
you state that it took 2 days to train on KITTI for 8 epochs on a Titan Xp.
Then why will it take me 1 year on a GeForce RTX 3090 while its utilization is almost 100%?
Is this the actual implementation used in your paper?

Question about TMC13

Hi,
I have another question about the file testTMC.py for lidar point clouds.
Before compression, the coordinates of the LiDAR point clouds are shifted by the mean value and normalized by the max value.

I am curious why the mean value and the max value are not stored and counted as part of the bitstream.
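
To make the question concrete, here is a small sketch of the normalization being described (not the repository's exact code):

import numpy as np

pts = np.random.rand(100000, 3).astype(np.float32) * 50.0   # stand-in for one LiDAR frame

mean = pts.mean(axis=0)          # per-axis mean
centered = pts - mean
scale = np.abs(centered).max()   # maximum absolute coordinate
normalized = centered / scale

# Inverting the transform at the decoder requires 'mean' (3 floats) and 'scale' (1 float),
# i.e. roughly 16 bytes per frame, which is what the question asks about.
reconstructed = normalized * scale + mean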

Question about figure 5

Hi, I was impressed by your paper and I want to try the visualization. Could you please share the code for Figure 5? Thanks a lot.

Question about TMC13

First of all, thank you for sharing the project. Why is there such a significant difference in the encoding and decoding time compared to the results in your paper?

~/code/OctAttention$ python testTMC.py
TMC V14
2023-11-07:15:47:02
dataset/MPEG 8iVFBv2/soldier/soldier/Ply/soldier_vox10_0536.ply
__________________________________________________ encode __________________________________________________
MPEG PCC tmc3 version unconfirmed-release-14.0

  • Effective configuration parameters
    mode : 0
    firstFrameNum : 0
    frameCount : 1
    reconstructedDataPath : ""
    uncompressedDataPath : "./temp/tmc/p.ply"
    compressedStreamPath : "./temp/tmc/tmc.bin"
    postRecolorPath : ""
    preInvScalePath : ""
    convertPlyColourspace : 1
    outputBinaryPly : 1
    outputUnitLength : 1
    outputScaling : 1 (External)
    outputPrecisionBits : -1
    srcUnitLength : 1
    srcUnit : 0 (Dimensionless)
    inputScale : 1
    codingScale : 1
    sequenceScale : 1
    positionQuantizationScale : 1
    externalScale : 1
    sortInputByAzimuth : 0
    geometry_axis_order : 1 (xyz)
    autoSeqBbox : 1
    seqOrigin : 0, 0, 0
    seqSizeWhd : 0, 0, 0
    mergeDuplicatedPoints : 1
    partitionMethod : 4 (UniformSquare)
    partitionOctreeDepth : 1
    sliceMaxPoints : 1100000
    sliceMinPoints : 550000
    tileSize : 0
    cabac_bypass_stream_enabled_flag : 0
    entropyContinuationEnabled : 0
    disableAttributeCoding : 1
    enforceLevelLimits : 1
    geomTreeType : 0
    qtbtEnabled : 1
    maxNumQtBtBeforeOt : 4
    minQtbtSizeLog2 : 0
    numOctreeEntropyStreams : 0
    bitwiseOccupancyCoding : 1
    neighbourAvailBoundaryLog2 : 7
    inferredDirectCodingMode : 0
    jointTwoPointIdcm : 1
    adjacentChildContextualization : 1
    intra_pred_max_node_size_log2 : 6
    planarEnabled : 1
    planarModeThreshold0 : 77
    planarModeThreshold1 : 99
    planarModeThreshold2 : 113
    planarModeIdcmUse : -1
    trisoupNodeSizeLog2 :
    trisoup_sampling_value : 0
    positionQuantisationEnabled : 0
    positionQuantisationMethod : 0 (Uniform)
    positionQpMultiplierLog2 : 0
    positionBaseQp : 4
    positionIdcmQp : 0
    positionSliceQpOffset : 0
    positionQuantisationOctreeSizeLog2 : -1
    positionQuantisationOctreeDepth : -1
    positionBaseQpFreqLog2 : 8
    positionSliceQpFreqLog2 : -8
    angularEnabled : 0
    lidarHeadPosition : 0, 0, 0
    numLasers : 0
    lasersTheta :
    lasersZ :
    lasersNumPhiPerTurn :
    planarBufferDisabled : 0
    predGeomAzimuthQuantization : 1
    positionAzimuthScaleLog2 : 5
    positionAzimuthSpeed : 362
    positionRadiusInvScaleLog2 : 0
    predGeomSort : 1 (Morton)
    predGeomAzimuthSortPrecision : 0
    predGeomTreePtsMax : 1100000
    pointCountMetadata : 0
    recolourSearchRange : 1
    recolourNumNeighboursFwd : 8
    recolourNumNeighboursBwd : 1
    recolourUseDistWeightedAvgFwd : 1
    recolourUseDistWeightedAvgBwd : 1
    recolourSkipAvgIfIdenticalSourcePointPresentFwd : 1
    recolourSkipAvgIfIdenticalSourcePointPresentBwd : 0
    recolourDistOffsetFwd : 4
    recolourDistOffsetBwd : 4
    recolourMaxGeometryDist2Fwd : 1000
    recolourMaxGeometryDist2Bwd : 1000
    recolourMaxAttributeDist2Fwd : 1000
    recolourMaxAttributeDist2Bwd : 1000

Slice number: 1
positions bitstream size 136014 B (1.0267 bpp)
positions processing time (user): 2.349 s
Total frame size 136049 B
Total bitstream size 136049 B
Processing time (wall): 4.252 s
Processing time (user): 3.595 s
ptNum: 1059810
binsize(b): 1088392
tmc bpip: 1.0269689850067465
__________________________________________________ decode __________________________________________________
MPEG PCC tmc3 version unconfirmed-release-14.0

  • Effective configuration parameters
    mode : 1
    firstFrameNum : 0
    frameCount : 1
    reconstructedDataPath : "./temp/tmc/recPt.ply"
    uncompressedDataPath : ""
    compressedStreamPath : "./temp/tmc/tmc.bin"
    postRecolorPath : ""
    preInvScalePath : ""
    convertPlyColourspace : 1
    outputBinaryPly : 1
    outputUnitLength : 1
    outputScaling : 1 (External)
    outputPrecisionBits : -1
    skipOctreeLayers : 0
    decodeMaxPoints : 0

positions bitstream size 136014 B
positions processing time (user): 0.889 s

Total bitstream size 136049 B
Processing time (wall): 1.051 s
Processing time (user): 0.956 s
__________________________________________________ pc_error __________________________________________________
PCC quality measurement software, version 0.13.5

infile1: temp/data/pt0pcRefer.ply
infile2: temp/data/pt0pc.ply
normal1:
singlePass: 0
hausdorff: 0
color: 0
lidar: 0
resolution: 1023
dropDuplicates: 2
neighborsProc: 1
averageNormals: 1
mseSpace: 1
nbThreads: 1

Verifying if the data is loaded correctly.. The last point is: 261 992 256
Reading file 1 done.
Verifying if the data is loaded correctly.. The last point is: 363 775 345
Reading file 2 done.
Imported intrinsic resoluiton: 1023
Peak distance for PSNR: 1023
Point cloud sizes for org version, dec version, and the scaling ratio: 1059810, 1059810, 1
Normals prepared.

  1. Use infile1 (A) as reference, loop over A, use normals on B. (A->B).
    mse1 (p2point): 0
    mse1,PSNR (p2point): inf
  2. Use infile2 (B) as reference, loop over B, use normals on A. (B->A).
    mse2 (p2point): 0
    mse2,PSNR (p2point): inf
  3. Final (symmetric).
    mseF (p2point): 0
    mseF,PSNR (p2point): inf
    Job done! 1.731 seconds elapsed (excluding the time to load the point clouds).

Source code of OctreeCpp

Thanks for open-sourcing your work; we hope to develop new research based on it. Could we get the source code of OctreeCpp?
