
openpointclass's Introduction

OpenPointClass - Fast Semantic Segmentation of 3D Point Clouds

A fast, memory-efficient, free and open source point cloud classifier. It generates an AI model from a set of labeled input point clouds and can then use that model to classify new datasets.


With the default parameters it can classify 15 million points in less than 2 minutes on a 4-core Intel i5, which is faster than any other freely available software we've tried.

It generalizes well to point clouds of varying density and includes local smoothing regularization methods.

It supports all point cloud formats supported by PDAL. When built without PDAL, it supports only a subset of the PLY format, optimized for speed.

Install

If you're on Windows, you have two choices:

  • Purchase the program for Windows and support the project ❤️
  • Build the program from sources (see instructions below)

If you're on macOS/Linux you currently need to build from sources (see instructions below).

Build

Dependencies:

  • Intel TBB
  • Eigen
  • PDAL (optional for LAS/LAZ support)

Linux

mkdir build
cd build
cmake .. [-DWITH_GBT=ON]
make -j$(nproc)
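
The optional -DWITH_GBT=ON flag enables the gradient boosted trees classifier (see Classifier Types below). For example:

cmake .. -DWITH_GBT=ON
make -j$(nproc)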

Windows

You will need Visual Studio, CMake and VCPKG.

Install required packages with VCPKG:

vcpkg install eigen3 tbb pdal

Replace <VCPKG_PATH> with the path to your VCPKG installation in the following commands:

mkdir build
cd build
cmake [-DWITH_GBT=ON] -DCMAKE_TOOLCHAIN_FILE=<VCPKG_PATH>/scripts/buildsystems/vcpkg.cmake ..
cmake --build . --config Release --target ALL_BUILD -- /maxcpucount:14

Usage

./pctrain ./ground_truth.ply
./pcclassify ./dataset.ply ./classified.ply [model.bin]

We provide access to a pre-trained model if you don't have access to labeled data. Please note the model was generated using a limited number of samples, so it might not work well with all datasets.

Training classes are assumed to follow the ASPRS 1.4 Classification and to be stored in either a label, class or classification property.
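
If you are unsure which property your dataset uses, PDAL (when available) can list the dimensions stored in a file:

pdal info --schema ./ground_truth.laz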

You can re-map classification codes by creating a <FILE>.json in the same directory as <FILE>.ply:

{
    "source": "https://url-to-your-data",
    "classification": {
        "0": "ground",
        "1": "building",
        "2": "low_vegetation",
        "3": "medium_vegetation",
        "4": "high_vegetation",
        "17": "ground"
    }
}
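
For example, if the mapping above is saved as ground_truth.json alongside ground_truth.ply, it should be picked up automatically when training:

./pctrain ./ground_truth.ply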

You can also use the --classes flag to limit training to a subset of the ASPRS classes (here ground, high vegetation and buildings; see the table below):

./pctrain ./ground_truth.laz --classes 2,5,6

Supported Classes

Class                       Number
unclassified                     1
ground                           2
low_vegetation                   3
medium_vegetation                4
high_vegetation                  5
building                         6
low_point                        7
water                            9
rail                            10
road_surface                    11
wire_guard                      13
wire_conductor                  14
transmission_tower              15
wire_structure_connector        16
bridge_deck                     17
high_noise                      18
overhead_structure              19
ignored_ground                  20
snow                            21
temporal_exclusion              22
human_made_object               64

Evaluation

You can check a model's accuracy by using the --eval argument:

./pctrain ./ground_truth.ply --eval test.ply

You can use PDAL to conveniently split a dataset into two (one for training, one for evaluation):

pdal split [--capacity numpoints] input.ply input_split.ply
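
For example, to split a dataset into chunks of at most 5 million points each (the capacity value here is arbitrary):

pdal split --capacity 5000000 input.ply input_split.ply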

Color Output

You can output the results of classification as a colored point cloud by using the --color option:

./pcclassify ./dataset.ply ./classified.ply --color

Classifier Types

pctrain can generate AI models using either random forests (default) or gradient boosted trees:

./pctrain -c gbt [...]
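
For example, to train a gradient boosted trees model on the ground truth file from the Usage section:

./pctrain -c gbt ./ground_truth.ply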

Advanced Options

See ./pctrain --help.

Docker

You can build a Docker image with the following command:

docker build -t uav4geo/openpointclass:latest .

Run the image with the following command:

docker run -it --rm -v /dataset-path:/data uav4geo/openpointclass:latest bash

Where /dataset-path is the path to the directory containing the dataset files and the model.bin file.

You will be presented with a bash prompt inside the container. You can then run pctrain and pcclassify as described above.

Otherwise, you can use the commands directly with the following syntax:

docker run -it --rm -v /dataset-path:/data uav4geo/openpointclass:latest pctrain /data/ground_truth.ply
docker run -it --rm -v /dataset-path:/data uav4geo/openpointclass:latest pcclassify /data/dataset.ply /data/classified.ply /data/model.bin

Known Issues

  • We only support a subset of the PLY format (for performance reasons), and certain less common variations of the format might cause trouble. Most importantly, the X/Y/Z coordinates must be float values when using binary PLY, not double or float64. We recommend using LAS/LAZ if higher precision coordinates are needed; one possible conversion is sketched below.
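
If your binary PLY files store double-precision coordinates, one possible workaround (assuming PDAL is available) is to convert them to LAZ first:

pdal translate ./dataset.ply ./dataset.laz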

License

The software is released under the terms of the AGPLv3.

Made with ❤️ by UAV4GEO

openpointclass's People

Contributors: hedo88th, pierotofy, zfb132

openpointclass's Issues

Segmentation fault during creation of second pass

Hi,

I get a segfault for a .laz file that I am trying to classify using Docker. It looks like it fails during the creation of the second pass.

./pcclassify /data/21A027_628_58_0000.laz ./classified.ply /data/model.bin

I am uncertain whether it is a bug; it may instead be an older LAS specification version or something similar that is causing the issue. You have specified that version 1.4 is required for training; is the same true for pcclassify? If that is the case, do you have any recommendations for conversion? I tried to convert it without any success. The laser data is located in the southern part of Sweden.
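
One possible conversion sketch, assuming PDAL is installed (whether this resolves the crash is unverified), is to rewrite the file as LAS 1.4:

pdal translate 21A027_628_58_0000.laz converted_14.laz --writers.las.minor_version=4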

Please let me know if I need to provide you with any additional information.

Output of lasinfo:
Heading '21A027_628_58_0000.laz' with 11317852 points
lasinfo (240220) report for '21A027_628_58_0000.laz'
reporting all LAS header entries:
file signature: 'LASF'
file source ID: 0
global_encoding: 1
project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
version major.minor: 1.2
system identifier: ''
generating software: 'LASzip DLL 2.4 r0 (150731)'
file creation day/year: 336/2021
header size: 227
offset to point data: 227
number var. length records: 0
point data format: 1
point data record length: 28
number of point records: 11317852
number of points by return: 7718162 1749178 983754 571157 295601
scale factor x y z: 0.01 0.01 0.01
offset x y z: 535000 6715000 0
min x y z: 580000.00 6280000.00 -390.66
max x y z: 582499.99 6282499.99 341.04
LASzip compression (version 2.4r0 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
X 4500000 4749999
Y -43500000 -43250001
Z -39066 34104
intensity 382 65535
return_number 1 5
number_of_returns 1 5
edge_of_flight_line 0 0
scan_direction_flag 0 1
classification 1 18
scan_angle_rank -18 18
user_data 0 0
point_source_ID 13603 13605
gps_time 300525420.164837 300527088.064599
number of first returns: 7718162
number of intermediate returns: 1849949
number of last returns: 7719279
number of single returns: 5969538
overview over number of returns of given pulse: 5969538 1531136 1237623 1102336 1477219 0 0
histogram of classification of points:
4987784 unclassified (1)
6310052 ground (2)
8978 noise (7)
2528 water (9)
8420 bridge deck (17)
90 Reserved for ASPRS Definition (18)

Segmentation fault

Input files: https://hub.dronedb.app/r/hedo88/err-opc

Command: ./pcclassify ../point_cloud.ply out.ply ../model.bin

Loading ../model.bin
Reading 1060056 points
Starting resolution: 0.2
Init scale 0 at 0.2 ...
Init scale 1 at 0.2 ...
Init scale 3 at 0.8 ...
Init scale 5 at 3.2 ...
Init scale 4 at 1.6 ...
Init scale 2 at 0.4 ...
Building scale 1 (160105 points) ...
Building scale 2 (37145 points) ...
Building scale 3 (10084 points) ...
Building scale 4 (2694 points) ...
Building scale 5 (731 points) ...
Features: 105
Classifying...
Local smoothing...
Segmentation fault

with valgrind:

Classifying...
Local smoothing...
==656== Thread 3:
==656== Invalid write of size 1
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  Address 0x15912 is not stack'd, malloc'd or (recently) free'd
==656== 
==656== 
==656== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==656==  Access not within mapped region at address 0x15912
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  If you believe this happened as a result of a stack
==656==  overflow in your program's main thread (unlikely but
==656==  possible), you can try to increase the size of the
==656==  main thread stack using the --main-stacksize= flag.
==656==  The main thread stack size used in this run was 8388608.
==656== 
==656== HEAP SUMMARY:
==656==     in use at exit: 205,653,345 bytes in 1,223,477 blocks
==656==   total heap usage: 16,633,178 allocs, 15,409,701 frees, 1,342,197,731 bytes allocated
==656== 
==656== Searching for pointers to 1,223,477 not-freed blocks
==656== Checked 399,635,328 bytes
==656== 
==656== LEAK SUMMARY:
==656==    definitely lost: 415 bytes in 3 blocks
==656==    indirectly lost: 0 bytes in 0 blocks
==656==      possibly lost: 11,408 bytes in 23 blocks
==656==    still reachable: 205,641,522 bytes in 1,223,451 blocks
==656==         suppressed: 0 bytes in 0 blocks
==656== Rerun with --leak-check=full to see details of leaked memory
==656== 
==656== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==656== 
==656== 1 errors in context 1 of 1:
==656== Invalid write of size 1
==656==    at 0x17F432: rf::classify(PointSet&, liblearning::RandomForest::RandomForest<liblearning::RandomForest::NodeGini<liblearning::RandomForest::AxisAlignedSplitter> >*, std::vector<Feature*, std::allocator<Feature*> > const&, std::vector<Label, std::allocator<Label> > const&, rf::Regularization, bool, bool) [clone ._omp_fn.3] (in /build/pcclassify)
==656==    by 0x487CB9D: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==656==    by 0x5396B42: start_thread (pthread_create.c:442)
==656==    by 0x5427BB3: clone (clone.S:100)
==656==  Address 0x15912 is not stack'd, malloc'd or (recently) free'd
==656== 
==656== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

It crashes both on my Linux system (KDE neon 5.27) and in the Linux Docker image (Ubuntu 22.04).

-DWITH_GBT=ON fails

#9 0.554 [  5%] Creating directories for 'lightgbm'
#9 0.619 [ 10%] Performing download step (git clone) for 'lightgbm'
#9 0.637 Cloning into 'lightgbm'...
#9 3.799 HEAD is now at e4231205 [python-package] use keyword arguments in predict() calls (#5755)
#9 3.824 Submodule 'include/boost/compute' (https://github.com/boostorg/compute) registered for path 'external_libs/compute'
#9 3.825 Submodule 'eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'external_libs/eigen'
#9 3.826 Submodule 'external_libs/fast_double_parser' (https://github.com/lemire/fast_double_parser.git) registered for path 'external_libs/fast_double_parser'
#9 3.826 Submodule 'external_libs/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'external_libs/fmt'
#9 3.831 Cloning into '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/compute'...
#9 5.099 Cloning into '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/eigen'...
#9 5.522 error: RPC failed; HTTP 403 curl 22 The requested URL returned error: 403
#9 5.522 fatal: error reading section header 'shallow-info'
#9 5.525 fatal: clone of 'https://gitlab.com/libeigen/eigen.git' into submodule path '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/eigen' failed
#9 5.530 Failed to clone 'external_libs/eigen'. Retry scheduled
#9 5.534 Cloning into '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/fast_double_parser'...
#9 6.091 Cloning into '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/fmt'...
#9 8.341 Cloning into '/opc/build/lightgbm-prefix/src/lightgbm/external_libs/eigen'...
#9 8.840 error: RPC failed; HTTP 403 curl 22 The requested URL re

When using the Dockerfile, the commit SHA it points to is no longer valid.

Support for distributed feature extraction / training

  • Modify pctrain by adding an --extract-features <path>.opcfeat.bin parameter. When set, execution should stop at https://github.com/uav4geo/OpenPointClass/blob/main/randomforest.cpp#L30 and https://github.com/uav4geo/OpenPointClass/blob/main/gbm.cpp#L45
  • Serialize the required vectors (for RF these are gt and ft; GBT populates its structures similarly, although not identically). It might also be possible to serialize a single format regardless of RF or GBT by creating a new function that only performs the serialization (like train, but stopping after the features are created). Encoding the various scale, radius, treeDepth, etc. parameters into the serialized output would avoid repeating them manually and would allow validating serialized outputs against each other: the parameters of all serialized outputs from different processes need to match. A possible layout is sketched after this list.
  • Modify pctrain to check for the .opcfeat.bin input extension; if all files passed as input are .opcfeat.bin, read the features directly instead of computing them, adapting the rf::train and gbt::train functions. If the scale, radius, etc. parameters have been serialized, they can be read from the files instead of being passed manually.
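
A minimal sketch of what such a serializer might look like (the struct layout, field names and parameter set are hypothetical; the actual gt/ft containers in randomforest.cpp and gbm.cpp would need to be adapted):

// Hypothetical .opcfeat.bin writer sketch; not the project's actual format.
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

struct FeatureDump {
    double startResolution;    // assumed training parameters, stored so that
    int numScales;             // dumps from different processes can be
    double radius;             // validated against each other before merging
    int treeDepth;
    std::uint64_t numFeatures; // feature values per sample in ft (row-major)
    std::vector<int> gt;       // one ground truth label per sample
    std::vector<float> ft;     // gt.size() * numFeatures feature values
};

// Write the parameter header first, then the label and feature arrays.
void writeDump(const std::string &path, const FeatureDump &d) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char *>(&d.startResolution), sizeof(d.startResolution));
    out.write(reinterpret_cast<const char *>(&d.numScales), sizeof(d.numScales));
    out.write(reinterpret_cast<const char *>(&d.radius), sizeof(d.radius));
    out.write(reinterpret_cast<const char *>(&d.treeDepth), sizeof(d.treeDepth));
    out.write(reinterpret_cast<const char *>(&d.numFeatures), sizeof(d.numFeatures));
    std::uint64_t n = d.gt.size();
    out.write(reinterpret_cast<const char *>(&n), sizeof(n));
    out.write(reinterpret_cast<const char *>(d.gt.data()),
              static_cast<std::streamsize>(n * sizeof(int)));
    out.write(reinterpret_cast<const char *>(d.ft.data()),
              static_cast<std::streamsize>(n * d.numFeatures * sizeof(float)));
}

A matching reader would load the header, verify that the parameters match across all input files, and concatenate the gt/ft arrays before calling the adapted train functions.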

Failed to set up (Access is denied)

I am following the install instructions. When I use the command

"cmake [-DWITH_GBT=ON] -DCMAKE_TOOLCHAIN_FILE=<C:\Researchteam\vcpkg\vcpkg>/scripts/buildsystems/vcpkg.cmake .."

it shows "Access is denied." I am already opening the Anaconda Prompt as administrator. Please, someone help; thank you so much.
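
The square brackets around -DWITH_GBT=ON and the angle brackets around <VCPKG_PATH> in the build instructions are placeholders rather than literal syntax. With a vcpkg installation at C:\Researchteam\vcpkg (the path from the report above), the configure command would look something like:

cmake -DCMAKE_TOOLCHAIN_FILE=C:/Researchteam/vcpkg/scripts/buildsystems/vcpkg.cmake ..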

Point Cloud Train with non fully ASPRS 1.4 Classification

We are working to train many point clouds previously well classified by Leica Cyclone 3DR. They have a different classification, such as curbs (74), sidewalks (73), and walls (100), among others. The request is: can the code train more than the 23 ASPRS 1.4 classifications? We are happy to use the standard ones plus train our own classifications. The speed of the code is amazing. The other comment is that Potree doesn't support more classifications either :(( but one problem at a time.

Here are some very well trained sets. We are working to share more. My idea is to create different scenarios, such as city, road, farm, and town, among others, that could perhaps be integrated into the WebODM platform so one of them can be chosen when processing photogrammetric data.

passenger_terminal_05_r0_10_s20.zip
marina_05_r2.zip
espigon_05_r2.zip

Why is the result so strange?

Output is:

map.ply ./map.ply ./model.bin 
Model: Random Forest
Loading ./model.bin
Reading 1496024 points
Starting resolution: 0.01
Init scale 0 at 0.01 ...
Init scale 1 at 0.01 ...
Init scale 2 at 0.02 ...
Init scale 3 at 0.04 ...
Init scale 4 at 0.08 ...
Init scale 5 at 0.16 ...
Building scale 1 (3 points) ...
Building scale 2 (3 points) ...
Building scale 3 (3 points) ...
Building scale 4 (3 points) ...
Building scale 5 (3 points) ...
Features: 105
Classifying...
Local smoothing...
Wrote ./map.ply
