gwihlidal / meshopt-rs
Rust FFI and idiomatic wrapper for zeux/meshoptimizer, a mesh optimization library that makes indexed meshes more GPU-friendly.
License: Apache License 2.0
Add wrappers for the following new bindings:
pub fn meshopt_simplifyPoints(
destination: *mut ::std::os::raw::c_uint,
vertex_positions: *const f32,
vertex_count: usize,
vertex_positions_stride: usize,
target_vertex_count: usize,
) -> usize;
pub fn meshopt_spatialSortRemap(
destination: *mut ::std::os::raw::c_uint,
vertex_positions: *const f32,
vertex_count: usize,
vertex_positions_stride: usize,
);
pub fn meshopt_spatialSortTriangles(
destination: *mut ::std::os::raw::c_uint,
indices: *const ::std::os::raw::c_uint,
index_count: usize,
vertex_positions: *const f32,
vertex_count: usize,
vertex_positions_stride: usize,
);
The simplifier actually reads the entire vertex, not just the position data. It gives preferential treatment to position data in the vertex, but it uses the other bytes to figure out where UV collapses happen.
It's not a major issue currently, but this should be addressed later (possibly with some interface changes on the C side to make things nicer in this area).
Currently, the only example is demo, which is a monolithic feature matrix of the entire API. It would be very helpful to create a minimalistic tool example that shows only the calls necessary to fully optimize a mesh for the GPU (and, optionally, CPU packing/encoding improvements).
Likely this would be a simple docopt or clap command line tool that takes an obj/gltf2 file, optimizes it, and saves out the optimized version.
Bonus points for preserving information from the original mesh (morph targets, materials, etc.) so the output from the tool is usable in actual game or demo pipelines, instead of only writing out vertices and indices for the result.
This code is incorrectly giving back a slice to invalid memory. It allocates 12 u8s (12 bytes) and creates a slice from it of 12 f32s (48 bytes).
Lines 181 to 186 in 16a3046
https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html
The len argument is the number of elements, not the number of bytes.
Also if I'm not mistaken, it's returning a slice to the buffer that is on the stack.
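The referenced lines are not reproduced here, so the following is only a minimal sketch of the two points above, not the crate's actual code. The function names `f32_len_for_bytes` and `bytes_to_position` are hypothetical. It derives the element count from the byte count and returns an owned array rather than a slice that borrows from a local buffer.

```rust
use std::mem::size_of;

/// Number of whole `f32` elements that fit in `byte_len` bytes.
/// This is the value `slice::from_raw_parts` expects as `len`:
/// an element count, not a byte count.
fn f32_len_for_bytes(byte_len: usize) -> usize {
    byte_len / size_of::<f32>()
}

/// Decodes 12 bytes into three `f32` values and returns them by value,
/// so the caller never receives a reference into this function's stack frame.
fn bytes_to_position(bytes: &[u8; 12]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    let chunks = bytes.chunks_exact(size_of::<f32>());
    for (dst, chunk) in out.iter_mut().zip(chunks) {
        // Each chunk is exactly 4 bytes because of `chunks_exact`.
        *dst = f32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    out
}
```

Copying through `from_ne_bytes` also sidesteps the alignment requirement that a direct `*const u8` to `*const f32` cast would impose.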
optimize_vertex_cache_in_place takes indices as a &[u32], not a &mut [u32].
Line 26 in 37dc884
Hey! Nice library! Hope to utilize it to its potential!
Is there a possibility to support Bevy mesh formats?
In the following format for example:
// ( inside a struct )
pub(crate) positions: Vec<[f32; 3]>,
pub(crate) normals: Vec<[f32; 3]>,
pub(crate) uvs: Vec<[f32; 2]>,
// inside the struct implementation
let mut mesh = Mesh::new(PrimitiveTopology::TriangleList);
mesh.insert_attribute(Mesh::ATTRIBUTE_UV_0, std::mem::take(&mut self.uvs));
mesh.insert_attribute(
Mesh::ATTRIBUTE_POSITION,
std::mem::take(&mut self.positions),
);
mesh.insert_attribute(Mesh::ATTRIBUTE_NORMAL, std::mem::take(&mut self.normals));
mesh.set_indices(Some(mesh::Indices::U32(std::mem::take(
&mut self.triangle_indices,
))));
The difference between this and Vec<Vertex> is that the data is stored in standalone arrays, not interleaved as in the Vertex Vec.
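Until such support exists, one possible workaround is to convert between the two layouts by hand. The sketch below is an assumption-laden illustration: the `Vertex` struct and the `interleave`/`deinterleave` helpers are hypothetical stand-ins defined here, not types or functions from meshopt-rs or Bevy.

```rust
/// Hypothetical interleaved vertex: position, normal, texture coordinate.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
#[repr(C)]
struct Vertex {
    p: [f32; 3],
    n: [f32; 3],
    t: [f32; 2],
}

/// Builds interleaved vertices from separate attribute arrays.
/// Returns `None` if the arrays do not all have the same length.
fn interleave(
    positions: &[[f32; 3]],
    normals: &[[f32; 3]],
    uvs: &[[f32; 2]],
) -> Option<Vec<Vertex>> {
    if positions.len() != normals.len() || positions.len() != uvs.len() {
        return None;
    }
    Some(
        positions
            .iter()
            .zip(normals)
            .zip(uvs)
            .map(|((p, n), t)| Vertex { p: *p, n: *n, t: *t })
            .collect(),
    )
}

/// Splits interleaved vertices back into separate attribute arrays.
fn deinterleave(vertices: &[Vertex]) -> (Vec<[f32; 3]>, Vec<[f32; 3]>, Vec<[f32; 2]>) {
    let positions = vertices.iter().map(|v| v.p).collect();
    let normals = vertices.iter().map(|v| v.n).collect();
    let uvs = vertices.iter().map(|v| v.t).collect();
    (positions, normals, uvs)
}
```

The cost is an extra copy in each direction, which is why direct support for separate attribute streams would still be preferable.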
The failure crate used for error handling is deprecated and unmaintained according to a RUSTSEC advisory, so it would be good to update this crate to replace the error handling, for example with thiserror and its enum-based derived errors, which are nice.
warning[A003]: failure is officially deprecated/unmaintained
┌─ /workspaces/ark/Cargo.lock:212:1
│
212 │ failure 0.1.8 registry+https://github.com/rust-lang/crates.io-index
│ ------------------------------------------------------------------- unmaintained advisory detected
│
= ID: RUSTSEC-2020-0036
= Advisory: https://rustsec.org/advisories/RUSTSEC-2020-0036
= The `failure` crate is officially end-of-life: it has been marked as deprecated
by the former maintainer, who has announced that there will be no updates or
maintenance work on it going forward.
The following are some suggested actively developed alternatives to switch to:
- [`anyhow`](https://crates.io/crates/anyhow)
- [`eyre`](https://crates.io/crates/eyre)
- [`fehler`](https://crates.io/crates/fehler)
- [`snafu`](https://crates.io/crates/snafu)
- [`thiserror`](https://crates.io/crates/thiserror)
= Announcement: https://github.com/rust-lang-nursery/failure/pull/347
= Solution: No safe upgrade is available!
= failure v0.1.8
└── meshopt v0.1.9
We could do this work and open a PR for it?
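As a rough idea of what an enum-based replacement could look like, here is a std-only sketch. The `MeshoptError` name and its variants are hypothetical and not taken from the crate; the `Display` and `Error` impls are written by hand here, whereas thiserror would derive them from attributes.

```rust
use std::fmt;

/// Hypothetical error type; the variants are illustrative only.
#[derive(Debug, PartialEq)]
enum MeshoptError {
    /// An allocation or buffer-size problem.
    Memory,
    /// Input data could not be parsed.
    Parse(String),
    /// A non-zero status code returned by the native library.
    Native(i32),
}

impl fmt::Display for MeshoptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MeshoptError::Memory => write!(f, "memory allocation failed"),
            MeshoptError::Parse(msg) => write!(f, "parse error: {}", msg),
            MeshoptError::Native(code) => write!(f, "native error: {}", code),
        }
    }
}

// Implementing `std::error::Error` lets the type work with `?`,
// `Box<dyn Error>`, and crates such as anyhow on the caller's side.
impl std::error::Error for MeshoptError {}
```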
Some functions take both an array of indices and a vertex count, with an implicit assumption that all indices are in range. In debug builds this is "validated" with assertions, but when sanity checks are enabled it should also be validated in release builds, with a safe recovery path that informs the caller.
We want to protect against all memory safety issues by returning a Result error on invalid index data. In addition, encoding.rs has a few assert! macros that should return a Result error instead.
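The check described above could be sketched as follows. The `IndexError` type and the `validate_indices` function are hypothetical names introduced for illustration; the crate's real error type and call sites may differ.

```rust
/// Hypothetical error describing why an index buffer was rejected.
#[derive(Debug, PartialEq)]
enum IndexError {
    /// The index count is not a multiple of three.
    NotTriangleList { index_count: usize },
    /// An index refers to a vertex beyond `vertex_count`.
    OutOfRange {
        position: usize,
        index: u32,
        vertex_count: usize,
    },
}

/// Validates a triangle-list index buffer before it is handed to native code.
/// Runs in both debug and release builds, unlike `debug_assert!`.
fn validate_indices(indices: &[u32], vertex_count: usize) -> Result<(), IndexError> {
    if indices.len() % 3 != 0 {
        return Err(IndexError::NotTriangleList {
            index_count: indices.len(),
        });
    }
    for (position, &index) in indices.iter().enumerate() {
        if index as usize >= vertex_count {
            return Err(IndexError::OutOfRange {
                position,
                index,
                vertex_count,
            });
        }
    }
    Ok(())
}
```

The scan is linear in the index count, so it could plausibly sit behind a feature flag for callers who have already validated their data.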
Would you mind adding a derive_debug() to the bindgen chain so it's easier to println!() some of the structs?
An evaluation should be done for using pure rust compression crates, such as miniz_oxide.
The current implementation uses miniz_oxide_c_api, which, aside from being an unsafe cc crate, has some very weird nuances, such as the compression bound having a different type on Windows than on Linux/OSX.
Current workaround hack in demo.rs:
// GW-TODO: Wow, on Windows the bound type is u32, and on OSX the bound type is u64 (fix me)
#[cfg(windows)]
type BoundsType = u32;
#[cfg(not(windows))]
type BoundsType = u64;
let compress_bound = miniz_oxide_c_api::mz_compressBound(input_size as BoundsType);
Trying to run the example with this command:
cargo run --release --example demo
Here is the end of the error:
cargo:warning=c++: error: vendor/src/allocator.cpp: No such file or directory
cargo:warning=c++: fatal error: no input files
cargo:warning=compilation terminated.
exit code: 1
--- stderr
error occurred: Command "c++" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-I" "src" "-Wall" "-Wextra" "-std=c++11" "-o" "/home/jlb6740/workloads/meshopt-rs/target/release/build/meshopt-1cbb12cbae4a23de/out/vendor/src/allocator.o" "-c" "vendor/src/allocator.cpp" with args "c++" did not execute successfully (status code exit code: 1).
Hey, would be really neat to have a new crates.io release at some time :)
The current API has many nice aspects over the native FFI, but there is still considerable room for improvement.
Improvements to:
Current performance comparison on a Threadripper 1950X with 64 GB DDR4 (3 GHz). Most metrics are quite similar between native and Rust, but the outliers are likely due to the DecodePosition trait and other redundant work:
Rust:
=== tessellated plane: 40401 vertices, 80000 triangles
Original : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.00 msec
Random : ACMR 2.999275 ATVR 5.939012 (NV 5.938665 AMD 5.939407 Intel 5.925942) Overfetch 8.998045 Overdraw 1.000000 in 2.63 msec
Cache : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.146308 Overdraw 1.000000 in 25.75 msec
CacheFifo: ACMR 0.606637 ATVR 1.201233 (NV 1.625851 AMD 1.357565 Intel 1.191357) Overfetch 1.107151 Overdraw 1.000000 in 4.30 msec
Overdraw : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 4.27 msec
Fetch : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.18 msec
FetchMap : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.77 msec
Complete : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.000025 Overdraw 1.000000 in 30.07 msec
Stripify : ACMR 0.650700 ATVR 1.288483 (NV 1.653598 AMD 1.565159 Intel 1.249771); 154826 strip indices (64.5%) in 5.68 msec
IdxCodec : 9.1 bits/triangle (post-deflate 0.2 bits/triangle); encode 2.44 msec, decode 0.58 msec (1.55 GB/s)
VtxPack : 128.0 bits/vertex (post-deflate 33.3 bits/vertices)
VtxCodec : 34.9 bits/vertex (post-deflate 2.9 bits/vertex); encode 2.07 msec, decode 0.47 msec (0.32 GB/s)
VtxCodec0: 34.4 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.53 msec, decode 0.35 msec (0.43 GB/s)
Simplify : 80000 triangles => 5 LOD levels down to 19207 triangles in 34.20 msec, optimized in 87.34 msec
ACMR 0.633500...0.656584 Overfetch 1.347244..1.054782 Codec VB 34.6 bits/vertex IB 15.3 bits/triangle
=== "examples/pirate.obj": 3152 vertices, 5010 triangles
Original : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.00 msec
Random : ACMR 2.984631 ATVR 4.743972 (NV 4.740165 AMD 4.749048 Intel 4.590102) Overfetch 1.000000 Overdraw 1.292420 in 0.19 msec
Cache : ACMR 0.749301 ATVR 1.190990 (NV 1.387373 AMD 1.283947 Intel 1.039023) Overfetch 1.000000 Overdraw 1.430061 in 1.87 msec
CacheFifo: ACMR 0.756088 ATVR 1.201777 (NV 1.445114 AMD 1.308058 Intel 1.071383) Overfetch 1.000000 Overdraw 1.435101 in 0.44 msec
Overdraw : ACMR 2.579641 ATVR 4.100254 (NV 4.099302 AMD 4.149429 Intel 3.115799) Overfetch 1.000000 Overdraw 1.210810 in 0.40 msec
Fetch : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.19 msec
FetchMap : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.24 msec
Complete : ACMR 0.773254 ATVR 1.229061 (NV 1.429886 AMD 1.318528 Intel 1.134518) Overfetch 1.000000 Overdraw 1.338693 in 2.14 msec
Stripify : ACMR 0.777246 ATVR 1.235406 (NV 1.471764 AMD 1.394353 Intel 1.038706); 8551 strip indices (56.9%) in 0.44 msec
IdxCodec : 9.3 bits/triangle (post-deflate 5.9 bits/triangle); encode 0.20 msec, decode 0.04 msec (1.26 GB/s)
VtxPack : 128.0 bits/vertex (post-deflate 90.1 bits/vertices)
VtxCodec : 78.1 bits/vertex (post-deflate 71.3 bits/vertex); encode 0.19 msec, decode 0.04 msec (0.26 GB/s)
VtxCodec0: 70.0 bits/vertex (post-deflate 64.2 bits/vertex); encode 0.15 msec, decode 0.03 msec (0.36 GB/s)
Simplify : 5010 triangles => 5 LOD levels down to 1202 triangles in 3.65 msec, optimized in 5.73 msec
ACMR 0.752096...0.887687 Overfetch 1.000000..1.000987 Codec VB 73.1 bits/vertex IB 14.9 bits/triangle
C++
=== tessellated plane: 40401 vertices, 80000 triangles
Original : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.00 msec
Random : ACMR 2.999312 ATVR 5.939085 (NV 5.939061 AMD 5.939407 Intel 5.924383) Overfetch 9.001263 Overdraw 1.000000 in 3.22 msec
Cache : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.146308 Overdraw 1.000000 in 11.41 msec
CacheFifo: ACMR 0.606637 ATVR 1.201233 (NV 1.625851 AMD 1.357565 Intel 1.191357) Overfetch 1.107151 Overdraw 1.000000 in 3.36 msec
Overdraw : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 4.07 msec
Fetch : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 0.84 msec
FetchMap : ACMR 1.005000 ATVR 1.990050 (NV 2.120740 AMD 2.053910 Intel 1.990050) Overfetch 1.000025 Overdraw 1.000000 in 1.03 msec
Complete : ACMR 0.633500 ATVR 1.254424 (NV 1.620282 AMD 1.375436 Intel 1.249796) Overfetch 1.000025 Overdraw 1.000000 in 18.59 msec
Stripify : ACMR 0.650700 ATVR 1.288483 (NV 1.653598 AMD 1.565159 Intel 1.249771); 154826 strip indices (64.5%) in 5.44 msec
IdxCodec : 9.1 bits/triangle (post-deflate 0.2 bits/triangle); encode 2.35 msec, decode 0.52 msec (1.73 GB/s)
VtxPack : 128.0 bits/vertex (post-deflate 33.3 bits/vertex)
VtxCodec : 34.9 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.94 msec, decode 0.45 msec (1.34 GB/s)
VtxCodecO: 34.4 bits/vertex (post-deflate 2.9 bits/vertex); encode 1.47 msec, decode 0.36 msec (1.27 GB/s)
Simplify : 80000 triangles => 5 LOD levels down to 19207 triangles in 30.13 msec, optimized in 48.58 msec
ACMR 0.633500...0.656584 Overfetch 1.347244..1.054782 Codec VB 34.6 bits/vertex IB 15.3 bits/triangle
=== "examples/pirate.obj": 3152 vertices, 5010 triangles
Original : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.00 msec
Random : ACMR 2.983034 ATVR 4.741434 (NV 4.738896 AMD 4.745241 Intel 4.585342) Overfetch 1.000000 Overdraw 1.322785 in 0.13 msec
Cache : ACMR 0.749301 ATVR 1.190990 (NV 1.387373 AMD 1.283947 Intel 1.039023) Overfetch 1.000000 Overdraw 1.430061 in 1.33 msec
CacheFifo: ACMR 0.756088 ATVR 1.201777 (NV 1.445114 AMD 1.308058 Intel 1.071383) Overfetch 1.000000 Overdraw 1.435101 in 0.31 msec
Overdraw : ACMR 2.579641 ATVR 4.100254 (NV 4.099302 AMD 4.149429 Intel 3.115799) Overfetch 1.000000 Overdraw 1.210810 in 0.29 msec
Fetch : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.09 msec
FetchMap : ACMR 1.152695 ATVR 1.832170 (NV 1.875635 AMD 1.948287 Intel 1.279822) Overfetch 1.000000 Overdraw 1.409778 in 0.08 msec
Complete : ACMR 0.773254 ATVR 1.229061 (NV 1.429886 AMD 1.318528 Intel 1.134518) Overfetch 1.000000 Overdraw 1.338693 in 1.64 msec
Stripify : ACMR 0.777246 ATVR 1.235406 (NV 1.471764 AMD 1.394353 Intel 1.038706); 8551 strip indices (56.9%) in 0.44 msec
IdxCodec : 9.3 bits/triangle (post-deflate 5.9 bits/triangle); encode 0.22 msec, decode 0.04 msec (1.28 GB/s)
VtxPack : 128.0 bits/vertex (post-deflate 90.1 bits/vertex)
VtxCodec : 78.1 bits/vertex (post-deflate 71.3 bits/vertex); encode 0.20 msec, decode 0.04 msec (1.13 GB/s)
VtxCodecO: 70.0 bits/vertex (post-deflate 64.2 bits/vertex); encode 0.14 msec, decode 0.03 msec (1.09 GB/s)
Simplify : 5010 triangles => 5 LOD levels down to 1202 triangles in 3.35 msec, optimized in 4.31 msec
ACMR 0.752096...0.887687 Overfetch 1.000000..1.000987 Codec VB 73.1 bits/vertex IB 14.9 bits/triangle
I need the newer meshlet API from the meshoptimizer library. Are there any plans to update the library, or if not, would you accept a pull request for an update?
Now that mesh optimizer has added a meshlet clusterizer, it would be a good time to upgrade to the latest version and provide bindings for all the new stuff.
In utilities.rs there is a quantize_half routine that could likely be replaced by a well-tested crate like half-rs.
It appears that the Rust wrapper is limited in the vertex attributes it supports: position, normal, and a single texture coordinate set:
Lines 81 to 87 in 37dc884
I'm interested in using the library for processing glTF files, which can have considerably more vertex attributes, and even application-specific custom attributes (prefixed with _, like _GAME_DATA). See:
https://github.com/KhronosGroup/glTF/blob/master/specification/2.0/README.md#meshes
I assume that the native meshoptimizer library must support these cases, because the gltfpack tool it provides supports them. Would it be possible to allow more attributes in the meshopt-rs library, or are there other obstacles here? Thanks!
I just published 0.2.1 on crates.io. It includes the simplify_scale (by @BeastLe9enD) and simplify_with_locks API functions.
I think we should do a proper release on GitHub, but I noticed that we haven't done that for 0.2.0 either. What do you think, should we just ignore 0.2.0 or still do that for history's sake? :)
Either way, we need a good summary of all the changes. Is the changelog complete?
Self-explanatory ;)
There should be tests for all components of the library.
Many routines and types are missing code comments; this should be addressed.
optimize_vertex_cache_in_place currently takes a &[u32], which is then passed as a mut pointer to the FFI function, which mutates it. This is obviously unsound.
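A sound signature borrows the indices mutably, so the compiler knows the buffer is being written through. The sketch below illustrates the pattern with a stand-in: `native_reorder_in_place` is a hypothetical placeholder for the FFI call (it only rotates each triangle, it is not the real optimizer), and `reorder_in_place` is the safe wrapper shape being suggested.

```rust
/// Stand-in for a native routine that rewrites the index buffer in place.
/// Hypothetical: it rotates each triangle (a, b, c) -> (b, c, a).
///
/// # Safety
/// `indices` must point to `index_count` valid, writable `u32` values.
unsafe fn native_reorder_in_place(indices: *mut u32, index_count: usize) {
    let slice = unsafe { std::slice::from_raw_parts_mut(indices, index_count) };
    for triangle in slice.chunks_exact_mut(3) {
        triangle.rotate_left(1);
    }
}

/// Safe wrapper: taking `&mut [u32]` guarantees exclusive access for the
/// duration of the call, which is what the mutating native code requires.
fn reorder_in_place(indices: &mut [u32]) {
    // `as_mut_ptr` on a mutable slice yields a pointer that may be written to.
    unsafe { native_reorder_in_place(indices.as_mut_ptr(), indices.len()) }
}
```

With a `&[u32]` parameter the same call would have to cast away constness, and callers could legally hold other shared references to data that is being mutated.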
Hi, I'm encountering a bug. Here is some information:
Working code:
let mut lod: Vec<u32>;
{
let src = &mesh.indices;
lod = meshopt::simplify_sloppy(
src,
vertex_adapter,
::std::cmp::min(src.len(), target_index_count),
target_error,
None,
);
}
println!("finalizing lod {}...", display_id);
meshopt::optimize_vertex_cache_in_place(&mut lod, vertex_adapter.vertex_count);
meshopt::optimize_overdraw_in_place(&mut lod, &vertex_adapter, 1f32);
This snippet uses simplify_sloppy and works well. But for my project, I want to use the simplify function to keep mesh borders and texture, so here is the new code:
let mut lod: Vec<u32>;
{
let src = &mesh.indices;
lod = meshopt::simplify(
src,
vertex_adapter,
::std::cmp::min(src.len(), target_index_count),
target_error,
SimplifyOptions::LockBorder,
None,
);
}
println!("finalizing lod {}...", display_id);
meshopt::optimize_vertex_cache_in_place(&mut lod, vertex_adapter.vertex_count);
meshopt::optimize_overdraw_in_place(&mut lod, &vertex_adapter, 1f32);
It is basically the same thing, except that I'm changing the function to simplify and adding a SimplifyOptions::LockBorder parameter.
But with the same input settings, simplify_sloppy creates a nice LOD, while simplify barely changes the mesh. See:
As you can see, simplify is barely affecting the mesh. I don't know if it's a bug or if I'm doing something wrong. I tried a ton of parameters and can't make it work.
Thank you for any help!