gwihlidal / meshopt-rs

Rust ffi and idiomatic wrapper for zeux/meshoptimizer, a mesh optimization library that makes indexed meshes more GPU-friendly.

License: Apache License 2.0

Rust 99.62% C 0.38%

meshopt-rs's Issues

Add wrappers for new methods

Add wrappers for the following new bindings:

    pub fn meshopt_simplifyPoints(
        destination: *mut ::std::os::raw::c_uint,
        vertex_positions: *const f32,
        vertex_count: usize,
        vertex_positions_stride: usize,
        target_vertex_count: usize,
    ) -> usize;

    pub fn meshopt_spatialSortRemap(
        destination: *mut ::std::os::raw::c_uint,
        vertex_positions: *const f32,
        vertex_count: usize,
        vertex_positions_stride: usize,
    );

    pub fn meshopt_spatialSortTriangles(
        destination: *mut ::std::os::raw::c_uint,
        indices: *const ::std::os::raw::c_uint,
        index_count: usize,
        vertex_positions: *const f32,
        vertex_count: usize,
        vertex_positions_stride: usize,
    );

Simplifier should read the entire vertex, not just positions

Simplifier actually reads the entire vertex, not just position data - it has preferential treatment for position data in the vertex, but it uses other bytes to figure out where UV collapses happen.

It's not a major issue currently, but this should be addressed later (possibly with some interface changes on the C side to make things nicer in this area).

Create a minimalistic 100% optimal tool example

Currently, the only example is demo, which is a monolithic feature matrix of the entire API. It would be very helpful to create a minimalistic tool example that shows only the necessary calls to optimize a mesh 100% for the GPU (and, optionally, to apply the CPU packing/encoding improvements).

Likely this would be a simple docopt/clap command-line tool that takes an obj/gltf2 file, optimizes it, and saves out the optimized version.

Bonus points for preserving information from the original mesh (morph targets, materials, etc.) so the output from the tool could be usable in actual game or demo pipelines, instead of only writing out vertices and indices for the result.

Incorrect buffer size

This code is incorrectly giving back a slice to invalid memory. It allocates 12 u8s (12 bytes) and creates a slice from it of 12 f32s (48 bytes).

meshopt-rs/src/utilities.rs

Lines 181 to 186 in 16a3046

    let mut scratch = [0u8; 12];
    self.reader.read_exact(&mut scratch)?;
    let position =
        unsafe { std::slice::from_raw_parts(scratch.as_ptr().cast::<f32>(), 12) };
    self.reader.set_position(reader_pos);
    Ok(position)

https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html

The len argument is the number of elements, not the number of bytes.

Also if I'm not mistaken, it's returning a slice to the buffer that is on the stack.
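One possible shape for a fix, as a sketch only: copy the 12 bytes into an owned [f32; 3] and return it by value, so no slice points at the stack buffer and no element count is confused with a byte count. The function name peek_position, the Cursor<&[u8]> reader type, and the little-endian byte order are assumptions made for illustration, not taken from the crate.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: reads three f32 values at the cursor's current
/// position and then restores the position, so the read acts as a peek.
/// Assumes the data is stored little-endian.
fn peek_position(reader: &mut Cursor<&[u8]>) -> io::Result<[f32; 3]> {
    let reader_pos = reader.position();

    // 12 bytes = 3 * size_of::<f32>()
    let mut scratch = [0u8; 12];
    let result = reader.read_exact(&mut scratch);

    // Restore the position whether or not the read succeeded.
    reader.set_position(reader_pos);
    result?;

    // Decode each 4-byte chunk into an owned f32; nothing borrows `scratch`
    // after this function returns.
    let mut position = [0.0f32; 3];
    for (value, bytes) in position.iter_mut().zip(scratch.chunks_exact(4)) {
        *value = f32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    }
    Ok(position)
}
```

Returning the array by value costs a 12-byte copy and needs no unsafe code.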

optimize_vertex_cache_in_place takes immutable indices

optimize_vertex_cache_in_place takes indices as a &[u32], not a &mut [u32].

meshopt-rs/src/optimize.rs

Line 26 in 37dc884

pub fn optimize_vertex_cache_in_place(indices: &[u32], vertex_count: usize) {

Bevy mesh support

Hey! Nice library! Hope to utilize it to its potential!
Is there a possibility to support Bevy mesh formats?

In the following format for example:

    // (inside a struct)
    pub(crate) positions: Vec<[f32; 3]>,
    pub(crate) normals: Vec<[f32; 3]>,
    pub(crate) uvs: Vec<[f32; 2]>,

    // (inside the struct implementation)
    let mut mesh = Mesh::new(PrimitiveTopology::TriangleList);
    mesh.insert_attribute(Mesh::ATTRIBUTE_UV_0, std::mem::take(&mut self.uvs));
    mesh.insert_attribute(
        Mesh::ATTRIBUTE_POSITION,
        std::mem::take(&mut self.positions),
    );
    mesh.insert_attribute(Mesh::ATTRIBUTE_NORMAL, std::mem::take(&mut self.normals));
    mesh.set_indices(Some(mesh::Indices::U32(std::mem::take(
        &mut self.triangle_indices,
    ))));

The difference between this and Vec<Vertex> is that the data is stored in standalone arrays, not interleaved as in the Vertex Vec.
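Until such support exists, one workaround might be to interleave the standalone arrays on the caller's side. The sketch below assumes a Vertex layout with p, n, and t fields like the one in the crate's packing.rs; the interleave function itself is hypothetical and not part of meshopt-rs or Bevy.

```rust
/// Interleaved vertex with position, normal, and one UV set.
#[derive(Default, Debug, Copy, Clone, PartialEq)]
#[repr(C)]
pub struct Vertex {
    pub p: [f32; 3],
    pub n: [f32; 3],
    pub t: [f32; 2],
}

/// Hypothetical helper: zips per-attribute arrays into one interleaved buffer.
/// Returns None if the attribute arrays disagree on length.
pub fn interleave(
    positions: &[[f32; 3]],
    normals: &[[f32; 3]],
    uvs: &[[f32; 2]],
) -> Option<Vec<Vertex>> {
    if positions.len() != normals.len() || positions.len() != uvs.len() {
        return None;
    }
    Some(
        positions
            .iter()
            .zip(normals)
            .zip(uvs)
            .map(|((p, n), t)| Vertex { p: *p, n: *n, t: *t })
            .collect(),
    )
}
```

This copies every attribute once, so a native standalone-array path would still be cheaper for large meshes.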

Replace deprecated `failure` crate

The failure crate used for error handling is deprecated and unmaintained according to a RUSTSEC advisory, so it would be good to replace the error handling in this crate, for example with thiserror and its enum-based derive errors, which are nice.

warning[A003]: failure is officially deprecated/unmaintained
    ┌─ /workspaces/ark/Cargo.lock:212:1
    │
212 │ failure 0.1.8 registry+https://github.com/rust-lang/crates.io-index
    │ ------------------------------------------------------------------- unmaintained advisory detected
    │
    = ID: RUSTSEC-2020-0036
    = Advisory: https://rustsec.org/advisories/RUSTSEC-2020-0036
    = The `failure` crate is officially end-of-life: it has been marked as deprecated
      by the former maintainer, who has announced that there will be no updates or
      maintenance work on it going forward.
      
      The following are some suggested actively developed alternatives to switch to:
      
      - [`anyhow`](https://crates.io/crates/anyhow)
      - [`eyre`](https://crates.io/crates/eyre)
      - [`fehler`](https://crates.io/crates/fehler)
      - [`snafu`](https://crates.io/crates/snafu)
      - [`thiserror`](https://crates.io/crates/thiserror)
    = Announcement: https://github.com/rust-lang-nursery/failure/pull/347
    = Solution: No safe upgrade is available!
    = failure v0.1.8
      └── meshopt v0.1.9

We could do this work and open a PR?
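To give a rough idea of the target shape, here is a std-only sketch of an enum-based error type. The type name MeshoptError and its variants are purely illustrative and are not the crate's actual error cases; with thiserror, the Display and From impls written out by hand here would instead come from derive attributes.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the variants are illustrative only.
#[derive(Debug)]
pub enum MeshoptError {
    /// An allocation or buffer-size problem.
    Memory,
    /// Input data that could not be parsed.
    Parse(String),
    /// An underlying I/O failure.
    Io(std::io::Error),
}

impl fmt::Display for MeshoptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MeshoptError::Memory => write!(f, "memory error"),
            MeshoptError::Parse(msg) => write!(f, "parse error: {}", msg),
            MeshoptError::Io(err) => write!(f, "I/O error: {}", err),
        }
    }
}

impl Error for MeshoptError {
    // Expose the wrapped I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MeshoptError::Io(err) => Some(err),
            _ => None,
        }
    }
}

// Lets `?` convert std::io::Error into MeshoptError automatically.
impl From<std::io::Error> for MeshoptError {
    fn from(err: std::io::Error) -> Self {
        MeshoptError::Io(err)
    }
}
```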

More exhaustive validation in some sanity check modes

Some functions pass both an array of indices and vertex count, and there’s an implicit assumption that all indices are in range. In debug this would be “validated” with assertions, but this should be validated when sanity checks are enabled even in release, and safely recover (informing the caller).

We want to protect against all memory safety issues and return a Result error on invalid index data. In addition, encoding.rs has a few assert! macros that should return a Result error instead.
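A minimal sketch of what such a check could look like, run before any index data is handed to the native library. The names validate_indices and IndexError are assumptions for illustration; the crate may choose a different error type and different checks.

```rust
/// Hypothetical error describing why an index buffer was rejected.
#[derive(Debug, PartialEq, Eq)]
pub enum IndexError {
    /// The index count is not a multiple of three.
    NotTriangles { index_count: usize },
    /// An index refers to a vertex beyond `vertex_count`.
    OutOfRange {
        position: usize,
        index: u32,
        vertex_count: usize,
    },
}

/// Checks a triangle-list index buffer against a vertex count.
/// Runs in both debug and release builds, unlike a debug assertion.
pub fn validate_indices(indices: &[u32], vertex_count: usize) -> Result<(), IndexError> {
    if indices.len() % 3 != 0 {
        return Err(IndexError::NotTriangles {
            index_count: indices.len(),
        });
    }
    // Report the first offending index so the caller can locate it.
    match indices.iter().position(|&i| i as usize >= vertex_count) {
        Some(position) => Err(IndexError::OutOfRange {
            position,
            index: indices[position],
            vertex_count,
        }),
        None => Ok(()),
    }
}
```

The scan is linear in the index count, which is why it may make sense to gate it behind the sanity-check mode rather than run it unconditionally.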

derive(Debug) on generated structs

Would you mind adding a derive_debug() to the bindgen chain so it's easier to println!() some of the structs?

Evaluate pure rust compression crates

An evaluation should be done for using pure rust compression crates, such as miniz_oxide.

The current implementation uses miniz_oxide_c_api, which, aside from being an unsafe cc crate, has some very weird nuances, such as the compression bound type having different sizes on Windows and Linux/OSX.

Current workaround hack in demo.rs:

// GW-TODO: Wow, on Windows the bound type is u32, and on OSX the bound type is u64 (fix me)
#[cfg(windows)]
type BoundsType = u32;

#[cfg(not(windows))]
type BoundsType = u64;

let compress_bound = miniz_oxide_c_api::mz_compressBound(input_size as BoundsType);

Demo build failure

Trying to run the example with this command:

cargo run --release --example demo

Here is the end of the error:

cargo:warning=c++: error: vendor/src/allocator.cpp: No such file or directory
cargo:warning=c++: fatal error: no input files
cargo:warning=compilation terminated.
exit code: 1

--- stderr

error occurred: Command "c++" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-I" "src" "-Wall" "-Wextra" "-std=c++11" "-o" "/home/jlb6740/workloads/meshopt-rs/target/release/build/meshopt-1cbb12cbae4a23de/out/vendor/src/allocator.o" "-c" "vendor/src/allocator.cpp" with args "c++" did not execute successfully (status code exit code: 1).

New release on crates.io

Hey, would be really neat to have a new crates.io release at some time :)

More idiomatic rust wrappers, and optimize some poorly performing translations

The current API has many nice aspects over the native FFI, but there is still a lot of room for improvement.

Improvements to:

Support adapters to allow for 32 and 16 bit indices
Easier to plug in custom vertex packing formats
In-place mutation of slices needs to return the final element count, and the Vec needs to be resized at the call site. This is easy to mess up and leave garbage values at the end of the array.
Remove redundant copies and resizes with zero-fill (slow!)
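One way to remove the resize hazard from the list above is for the wrapper to take the Vec itself and truncate it, so the caller never sees the stale tail. The sketch below uses a stand-in in-place routine (dropping degenerate triangles) rather than a real meshoptimizer call; both function names are hypothetical.

```rust
/// Stand-in for an in-place routine that compacts `indices` and returns the
/// number of valid elements left at the front. Here it drops degenerate
/// triangles (triangles that repeat a vertex index).
fn remove_degenerate_in_place(indices: &mut [u32]) -> usize {
    let mut write = 0;
    for tri in 0..indices.len() / 3 {
        let read = tri * 3;
        let (a, b, c) = (indices[read], indices[read + 1], indices[read + 2]);
        if a != b && b != c && a != c {
            indices[write] = a;
            indices[write + 1] = b;
            indices[write + 2] = c;
            write += 3;
        }
    }
    write
}

/// Wrapper that owns the resize step: after the call, the Vec holds only
/// valid elements, so no garbage values remain at the end.
pub fn remove_degenerate(indices: &mut Vec<u32>) {
    let count = remove_degenerate_in_place(indices);
    indices.truncate(count);
}
```

Truncating does not reallocate, so this keeps the in-place performance benefit while making the misuse impossible at the call site.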

Current performance comparison on a Threadripper 1950X with 64 GB DDR4 (3 GHz). Most metrics are quite similar between native and Rust, but the outliers are likely due to the DecodePosition trait and other redundant work:

Rust:

=== tessellated plane: 40401 vertices, 80000 triangles
Original : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.00 msec
Random   : ACMR 2.999275 ATVR 5.939012 (NV 5.938665 AMD 5.939407 Intel 5.925942) Overfetch 8.998045 Overdraw 1.000000 in 2.63 msec
Cache    : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.146308 Overdraw 1.000000 in 25.75 msec
CacheFifo: ACMR 0.606637 ATVR 1.201233 (NV 1.625851 AMD 1.357565 Intel 1.191357) Overfetch 1.107151 Overdraw 1.000000 in 4.30 msec
Overdraw : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 4.27 msec
Fetch    : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.18 msec
FetchMap : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.77 msec
Complete : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.000025 Overdraw 1.000000 in 30.07 msec
Stripify : ACMR 0.650700 ATVR 1.288483 (NV 1.653598 AMD 1.565159 Intel 1.249771); 154826 strip indices (64.5%) in 5.68 msec
IdxCodec : 9.1 bits/triangle (post-deflate 0.2 bits/triangle); encode 2.44 msec, decode 0.58 msec (1.55 GB/s)
VtxPack  : 128.0 bits/vertex (post-deflate 33.3 bits/vertices)
VtxCodec : 34.9 bits/vertex (post-deflate 2.9 bits/vertex); encode 2.07 msec, decode 0.47 msec (0.32 GB/s)
VtxCodec0: 34.4 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.53 msec, decode 0.35 msec (0.43 GB/s)
Simplify : 80000 triangles => 5 LOD levels down to 19207 triangles in 34.20 msec, optimized in 87.34 msec
           ACMR 0.633500...0.656584 Overfetch 1.347244..1.054782 Codec VB 34.6 bits/vertex IB 15.3 bits/triangle

=== "examples/pirate.obj": 3152 vertices, 5010 triangles
Original : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.00 msec
Random   : ACMR 2.984631 ATVR 4.743972 (NV 4.740165 AMD 4.749048 Intel 4.590102) Overfetch 1.000000 Overdraw 1.292420 in 0.19 msec
Cache    : ACMR 0.749301 ATVR 1.190990 (NV 1.387373 AMD 1.283947 Intel 1.039023) Overfetch 1.000000 Overdraw 1.430061 in 1.87 msec
CacheFifo: ACMR 0.756088 ATVR 1.201777 (NV 1.445114 AMD 1.308058 Intel 1.071383) Overfetch 1.000000 Overdraw 1.435101 in 0.44 msec
Overdraw : ACMR 2.579641 ATVR 4.100254 (NV 4.099302 AMD 4.149429 Intel 3.115799) Overfetch 1.000000 Overdraw 1.210810 in 0.40 msec
Fetch    : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.19 msec
FetchMap : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.24 msec
Complete : ACMR 0.773254 ATVR 1.229061 (NV 1.429886 AMD 1.318528 Intel 1.134518) Overfetch 1.000000 Overdraw 1.338693 in 2.14 msec
Stripify : ACMR 0.777246 ATVR 1.235406 (NV 1.471764 AMD 1.394353 Intel 1.038706); 8551 strip indices (56.9%) in 0.44 msec
IdxCodec : 9.3 bits/triangle (post-deflate 5.9 bits/triangle); encode 0.20 msec, decode 0.04 msec (1.26 GB/s)
VtxPack  : 128.0 bits/vertex (post-deflate 90.1 bits/vertices)
VtxCodec : 78.1 bits/vertex (post-deflate 71.3 bits/vertex); encode 0.19 msec, decode 0.04 msec (0.26 GB/s)
VtxCodec0: 70.0 bits/vertex (post-deflate 64.2 bits/vertex); encode 0.15 msec, decode 0.03 msec (0.36 GB/s)
Simplify : 5010 triangles => 5 LOD levels down to 1202 triangles in 3.65 msec, optimized in 5.73 msec
           ACMR 0.752096...0.887687 Overfetch 1.000000..1.000987 Codec VB 73.1 bits/vertex IB 14.9 bits/triangle

C++:

=== tessellated plane: 40401 vertices, 80000 triangles
Original : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.00 msec
Random   : ACMR 2.999312 ATVR 5.939085 (NV 5.939061 AMD 5.939407 Intel 5.924383) Overfetch 9.001263 Overdraw 1.000000 in 3.22 msec
Cache    : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.146308 Overdraw 1.000000 in 11.41 msec
CacheFifo: ACMR 0.606637 ATVR 1.201233 (NV 1.625851 AMD 1.357565 Intel 1.191357) Overfetch 1.107151 Overdraw 1.000000 in 3.36 msec
Overdraw : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 4.07 msec
Fetch    : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.84 msec
FetchMap : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.03 msec
Complete : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.000025 Overdraw 1.000000 in 18.59 msec
Stripify : ACMR 0.650700 ATVR 1.288483 (NV 1.653598 AMD 1.565159 Intel 1.249771); 154826 strip indices (64.5%) in 5.44 msec
IdxCodec : 9.1 bits/triangle (post-deflate 0.2 bits/triangle); encode 2.35 msec, decode 0.52 msec (1.73 GB/s)
VtxPack  : 128.0 bits/vertex (post-deflate 33.3 bits/vertex)
VtxCodec : 34.9 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.94 msec, decode 0.45 msec (1.34 GB/s)
VtxCodecO: 34.4 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.47 msec, decode 0.36 msec (1.27 GB/s)
Simplify : 80000 triangles => 5 LOD levels down to 19207 triangles in 30.13 msec, optimized in 48.58 msec
           ACMR 0.633500...0.656584 Overfetch 1.347244..1.054782 Codec VB 34.6 bits/vertex IB 15.3 bits/triangle

=== "examples/pirate.obj": 3152 vertices, 5010 triangles
Original : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.00 msec
Random   : ACMR 2.983034 ATVR 4.741434 (NV 4.738896 AMD 4.745241 Intel 4.585342) Overfetch 1.000000 Overdraw 1.322785 in 0.13 msec
Cache    : ACMR 0.749301 ATVR 1.190990 (NV 1.387373 AMD 1.283947 Intel 1.039023) Overfetch 1.000000 Overdraw 1.430061 in 1.33 msec
CacheFifo: ACMR 0.756088 ATVR 1.201777 (NV 1.445114 AMD 1.308058 Intel 1.071383) Overfetch 1.000000 Overdraw 1.435101 in 0.31 msec
Overdraw : ACMR 2.579641 ATVR 4.100254 (NV 4.099302 AMD 4.149429 Intel 3.115799) Overfetch 1.000000 Overdraw 1.210810 in 0.29 msec
Fetch    : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.09 msec
FetchMap : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.08 msec
Complete : ACMR 0.773254 ATVR 1.229061 (NV 1.429886 AMD 1.318528 Intel 1.134518) Overfetch 1.000000 Overdraw 1.338693 in 1.64 msec
Stripify : ACMR 0.777246 ATVR 1.235406 (NV 1.471764 AMD 1.394353 Intel 1.038706); 8551 strip indices (56.9%) in 0.44 msec
IdxCodec : 9.3 bits/triangle (post-deflate 5.9 bits/triangle); encode 0.22 msec, decode 0.04 msec (1.28 GB/s)
VtxPack  : 128.0 bits/vertex (post-deflate 90.1 bits/vertex)
VtxCodec : 78.1 bits/vertex (post-deflate 71.3 bits/vertex); encode 0.20 msec, decode 0.04 msec (1.13 GB/s)
VtxCodecO: 70.0 bits/vertex (post-deflate 64.2 bits/vertex); encode 0.14 msec, decode 0.03 msec (1.09 GB/s)
Simplify : 5010 triangles => 5 LOD levels down to 1202 triangles in 3.35 msec, optimized in 4.31 msec
           ACMR 0.752096...0.887687 Overfetch 1.000000..1.000987 Codec VB 73.1 bits/vertex IB 14.9 bits/triangle

Any plans for updating to newer versions of the meshoptimizer library?

I need the newer meshlet API from the meshoptimizer library. Are there any plans to update the library, or, if not, would you accept a pull request for an update?

Upgrade to latest meshoptimizer

Now that meshoptimizer has added a meshlet clusterizer, it would be a good time to upgrade to the latest version and provide bindings for all the new stuff.

Add high-level bindings for meshopt_simplifyScale()

Evaluate using half-rs for half quantization routines

utilities.rs contains a quantize_half routine that could likely be replaced by a well-tested crate like half-rs.
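For context, the sketch below shows the kind of bit manipulation a float-to-half quantization routine typically performs. It is an illustration modeled on common implementations, not a copy of the crate's code, and its rounding (round half up on the mantissa) may differ from what a dedicated crate does. With the half crate, the candidate replacement would be along the lines of half::f16::from_f32(v).to_bits(); whether the results match bit for bit is exactly what the evaluation would need to check.

```rust
/// Illustrative f32 -> IEEE 754 half-precision conversion.
/// Flushes values below the smallest normal half to zero, maps overflow to
/// infinity, and maps every NaN to a quiet NaN.
pub fn quantize_half(v: f32) -> u16 {
    let ui = v.to_bits();
    // Sign bit moved into the half-precision position.
    let s = ((ui >> 16) & 0x8000) as i32;
    // Exponent and mantissa without the sign.
    let em = (ui & 0x7fff_ffff) as i32;

    // Rebias the exponent (127 - 15 = 112) and round the mantissa.
    let mut h = (em - (112 << 23) + (1 << 12)) >> 13;

    // Underflow: anything below 2^-14 is flushed to zero.
    if em < (113 << 23) {
        h = 0;
    }
    // Overflow: 2^16 and above (including infinity) becomes infinity.
    if em >= (143 << 23) {
        h = 0x7c00;
    }
    // NaN: exponent all ones with a non-zero mantissa.
    if em > (255 << 23) {
        h = 0x7e00;
    }

    (s | h) as u16
}
```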

Supporting additional vertex attributes

It appears that the Rust wrapper is limited in the vertex attributes it supports: position, normal, and a single texture coordinate set:

meshopt-rs/src/packing.rs

Lines 81 to 87 in 37dc884

    #[derive(Default, Debug, Copy, Clone, PartialOrd)]
    #[repr(C)]
    pub struct Vertex {
        pub p: [f32; 3],
        pub n: [f32; 3],
        pub t: [f32; 2],
    }

I'm interested in using the library for processing glTF files, which can have considerably more vertex attributes, and even application-specific custom attributes (prefixed with _, like _GAME_DATA). See:

https://github.com/KhronosGroup/glTF/blob/master/specification/2.0/README.md#meshes

I assume that the native meshoptimizer library must support these cases, because the gltfpack tool it provides supports them. Would it be possible to allow more attributes in the meshopt-rs library, or are there other obstacles here? Thanks!

Properly publish v0.2.1 (and v0.2.0?) on GitHub

I just published 0.2.1 on crates.io. It includes the simplify_scale (by @BeastLe9enD) and simplify_with_locks API functions.

I think we should do a proper release on GitHub, but I noticed that we haven't done that for 0.2.0 either. What do you think, should we just ignore 0.2.0 or still do that for history's sake? :)

Either way, we need a good summary of all the changes. Is the changelog complete?

Write unit tests

Self-explanatory ;)

There should be tests for all components of the library

analyze
clusterize
encoding
optimize
packing
remap
shadow
simplify
stripify
utilities

Documentation coverage of the API

Many routines and types are missing code comments; this should be addressed.

optimize_vertex_cache_in_place is unsound (&[u32] instead of &mut [u32])

optimize_vertex_cache_in_place currently takes a &[u32], which is then passed as a mutable pointer to the FFI function, which mutates it. This is obviously unsound.
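A sketch of the sound pattern: the wrapper takes &mut [u32] and derives the raw pointer with as_mut_ptr, so the borrow checker knows the data may change. The function fake_ffi_reorder below is a stand-in written in Rust for illustration (it simply reverses the buffer); it is not the real meshoptimizer binding.

```rust
/// Stand-in for a C function that writes through a raw mutable pointer.
/// Here it just reverses the buffer so the mutation is observable.
unsafe fn fake_ffi_reorder(destination: *mut u32, count: usize) {
    let slice = unsafe { std::slice::from_raw_parts_mut(destination, count) };
    slice.reverse();
}

/// Sound wrapper: requiring `&mut [u32]` guarantees exclusive access, so no
/// other reference can observe the buffer while the callee mutates it.
pub fn reorder_in_place(indices: &mut [u32]) {
    // SAFETY: pointer and length come from the same live, exclusively
    // borrowed slice.
    unsafe { fake_ffi_reorder(indices.as_mut_ptr(), indices.len()) }
}
```

With a &[u32] parameter, the compiler is free to assume the contents never change across the call, which is what makes mutating through a pointer cast from it undefined behavior.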

Simplify does not affect the mesh (but simplify_sloppy does)

Hi, I'm encountering a bug. Here is some information:

Working code:

    let mut lod: Vec<u32>;
    {
        let src = &mesh.indices;
        lod = meshopt::simplify_sloppy(
            src,
            vertex_adapter,
            ::std::cmp::min(src.len(), target_index_count),
            target_error,
            None,
        );
    }

    println!("finalizing lod {}...", display_id);
    meshopt::optimize_vertex_cache_in_place(&mut lod, vertex_adapter.vertex_count);
    meshopt::optimize_overdraw_in_place(&mut lod, &vertex_adapter, 1f32);

This snippet uses simplify_sloppy and works well. But for my project, I want to use the simplify function to preserve mesh borders and textures, so here is the new code:

    let mut lod: Vec<u32>;
    {
        let src = &mesh.indices;
        lod = meshopt::simplify(
            src,
            vertex_adapter,
            ::std::cmp::min(src.len(), target_index_count),
            target_error,
            SimplifyOptions::LockBorder,
            None,
        );
    }

    println!("finalizing lod {}...", display_id);
    meshopt::optimize_vertex_cache_in_place(&mut lod, vertex_adapter.vertex_count);
    meshopt::optimize_overdraw_in_place(&mut lod, &vertex_adapter, 1f32);

Basically the same thing, except that I'm changing the function to simplify and adding a SimplifyOptions::LockBorder parameter.

But with the same input settings, simplify_sloppy creates a nice LOD, while simplify barely changes the mesh. See:

simplify_sloppy: [screenshot]

simplify: [screenshot]

As you can see, simplify is barely affecting the mesh. I don't know if it's a bug or if I'm doing something wrong. I tried a ton of parameters and can't make it work.
Thank you for any help!
