I have a Morton code library in Rust that just uses BMI2, and I wanted to fall back to magicbits when BMI2 is not available. For testing, I used a round-trip invariant of the form:
fn invariant<T: Int>(v: T) {
    { // 2D
        let (x, y) = morton_decode_2d(v);
        let vs = morton_encode_2d(x, y);
        assert_eq!(v, vs);
    }
    { // 3D
        let (x, y, z) = morton_decode_3d(v);
        let vs = morton_encode_3d(x, y, z);
        assert_eq!(v, vs);
    }
}
for many integer ranges: all of u8 and u16, plus [u16::MAX as u32, u16::MAX as u32 + 1000000], [u32::MAX - 1000000, u32::MAX], [u32::MAX as u64, u32::MAX as u64 + 1000000], and [u64::MAX - 1000000, u64::MAX].
The BMI2 version always passes all of these tests, so at the very least there must be a bug in my magic-bits implementation.
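For context, the BMI2 path is just pdep/pext with alternating-bit masks, which is why I trust it as the oracle. A hypothetical sketch (function names are mine, not my library's actual API; it runtime-checks for BMI2 before calling the intrinsics):

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_pdep_u32, _pext_u32};

/// BMI2 2-D encode: pdep scatters each coordinate's bits onto the
/// even (x) and odd (y) bit positions of the Morton code.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "bmi2")]
unsafe fn encode_2d_bmi2(x: u32, y: u32) -> u32 {
    _pdep_u32(x, 0x5555_5555) | _pdep_u32(y, 0xAAAA_AAAA)
}

/// BMI2 2-D decode: pext gathers the even/odd bits back into coordinates.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "bmi2")]
unsafe fn decode_2d_bmi2(m: u32) -> (u32, u32) {
    (_pext_u32(m, 0x5555_5555), _pext_u32(m, 0xAAAA_AAAA))
}

#[cfg(target_arch = "x86_64")]
fn main() {
    if std::is_x86_feature_detected!("bmi2") {
        for &m in &[0u32, 0xDEAD_BEEF, u32::MAX] {
            // Safe: we just verified BMI2 is available at runtime.
            let (x, y) = unsafe { decode_2d_bmi2(m) };
            assert_eq!(unsafe { encode_2d_bmi2(x, y) }, m);
        }
        println!("BMI2 round-trip ok");
    } else {
        println!("BMI2 not available; skipping");
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```

pdep/pext make the round-trip hold by construction, since pext with a mask is the exact inverse of pdep with the same mask over the masked bits.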
I needed to change your magicbits implementation for the 2D case to:
// 2D decode 32-bit:
x = x & MORTON_MASK2D_U32[5];
x = (x ^ (x >> 1)) & MORTON_MASK2D_U32[4];
x = (x ^ (x >> 2)) & MORTON_MASK2D_U32[3];
x = (x ^ (x >> 4)) & MORTON_MASK2D_U32[2];
x = (x ^ (x >> 8)) & MORTON_MASK2D_U32[1];
x = (x ^ (x >> 16)) & MORTON_MASK2D_U32[0];
// 2D encode 32-bit:
x = (x | (x << 16)) & MORTON_MASK2D_U32[1];
x = (x | (x << 8)) & MORTON_MASK2D_U32[2];
x = (x | (x << 4)) & MORTON_MASK2D_U32[3];
x = (x | (x << 2)) & MORTON_MASK2D_U32[4];
x = (x | (x << 1)) & MORTON_MASK2D_U32[5];
// 2D decode 64-bit:
x = x & MORTON_MASK2D_U64[5];
x = (x ^ (x >> 1)) & MORTON_MASK2D_U64[4];
x = (x ^ (x >> 2)) & MORTON_MASK2D_U64[3];
x = (x ^ (x >> 4)) & MORTON_MASK2D_U64[2];
x = (x ^ (x >> 8)) & MORTON_MASK2D_U64[1];
x = (x ^ (x >> 16)) & MORTON_MASK2D_U64[0];
// 2D encode 64-bit:
x = (x | (x << 32)) & MORTON_MASK2D_U64[0];
x = (x | (x << 16)) & MORTON_MASK2D_U64[1];
x = (x | (x << 8)) & MORTON_MASK2D_U64[2];
x = (x | (x << 4)) & MORTON_MASK2D_U64[3];
x = (x | (x << 2)) & MORTON_MASK2D_U64[4];
x = (x | (x << 1)) & MORTON_MASK2D_U64[5];
to make it pass this test. I am still having trouble with the 3D version though...
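On the 3D case: note that a 64-bit Morton code only has room for 3 × 21 = 63 coordinate bits, so a round-trip invariant over the full u64 range cannot hold for the top bit no matter what the implementation does. With that caveat, here is a 3D magic-bits pair as a self-contained sketch. The mask values are the commonly published ones, not necessarily libmorton's tables, and the names are hypothetical:

```rust
// Standard 3-D magic-bits masks for 21-bit coordinates in a 64-bit code.
// These values are an assumption (the widely used ones), not libmorton's.
const M3: [u64; 6] = [
    0x0000_0000_001F_FFFF,
    0x001F_0000_0000_FFFF,
    0x001F_0000_FF00_00FF,
    0x100F_00F0_0F00_F00F,
    0x10C3_0C30_C30C_30C3,
    0x1249_2492_4924_9249,
];

/// Spread the low 21 bits of `x` so two zero bits separate each payload bit.
fn part1by2(mut x: u64) -> u64 {
    x &= M3[0];
    x = (x | (x << 32)) & M3[1];
    x = (x | (x << 16)) & M3[2];
    x = (x | (x << 8)) & M3[3];
    x = (x | (x << 4)) & M3[4];
    (x | (x << 2)) & M3[5]
}

/// Inverse of `part1by2`: gather every third bit into the low 21 bits.
fn compact1by2(mut x: u64) -> u64 {
    x &= M3[5];
    x = (x ^ (x >> 2)) & M3[4];
    x = (x ^ (x >> 4)) & M3[3];
    x = (x ^ (x >> 8)) & M3[2];
    x = (x ^ (x >> 16)) & M3[1];
    (x ^ (x >> 32)) & M3[0]
}

fn morton_encode_3d(x: u64, y: u64, z: u64) -> u64 {
    part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)
}

fn morton_decode_3d(m: u64) -> (u64, u64, u64) {
    (compact1by2(m), compact1by2(m >> 1), compact1by2(m >> 2))
}

fn main() {
    // Round-trip holds for the low 63 bits; bit 63 has no home in 3 * 21 coordinates.
    for &m in &[0u64, 1, 0x1F_FFFF, 0xDEAD_BEEF_CAFE, u64::MAX >> 1] {
        let (x, y, z) = morton_decode_3d(m);
        assert_eq!(morton_encode_3d(x, y, z), m);
    }
    // u64::MAX round-trips to u64::MAX >> 1: the top bit is dropped.
    let (x, y, z) = morton_decode_3d(u64::MAX);
    assert_eq!(morton_encode_3d(x, y, z), u64::MAX >> 1);
    println!("3D round-trip ok over 63 bits");
}
```

If the 3D failures only occur for codes with bit 63 set, the invariant itself is the culprit, not the bit manipulation.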
Just wanted to ping you on the issue that the magicbits implementation might not be 100% correct. It seems to work for small enough Morton codes, but if you only need those, you should probably be using u8- or u16-specific implementations.
A good test is simply to compare it against the BMI2 version for the full u8, u16, and u32 integer ranges, and for some subranges of u64 (exhaustively testing the whole u64 range is hopeless: 2^64 codes at even one code per cycle on a 3 GHz CPU already takes roughly 195 years).
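Where BMI2 isn't available to serve as the oracle, a slow bit-by-bit reference implementation works just as well. A sketch of that comparison for the 2D 32-bit case (all names are hypothetical; the magic-bits routines follow the same structure as the corrected snippet above, with the mask values written inline):

```rust
/// Bit-by-bit reference encode (slow but obviously correct); stands in
/// for the BMI2 oracle on machines without BMI2.
fn ref_encode_2d(x: u32, y: u32) -> u32 {
    let mut m = 0u32;
    for i in 0..16 {
        m |= ((x >> i) & 1) << (2 * i);
        m |= ((y >> i) & 1) << (2 * i + 1);
    }
    m
}

/// Bit-by-bit reference decode.
fn ref_decode_2d(m: u32) -> (u32, u32) {
    let (mut x, mut y) = (0u32, 0u32);
    for i in 0..16 {
        x |= ((m >> (2 * i)) & 1) << i;
        y |= ((m >> (2 * i + 1)) & 1) << i;
    }
    (x, y)
}

/// Magic-bits encode under test: spread each 16-bit coordinate onto
/// alternating bit positions, then interleave.
fn magic_encode_2d(x: u32, y: u32) -> u32 {
    fn spread(mut x: u32) -> u32 {
        x = (x | (x << 8)) & 0x00FF_00FF;
        x = (x | (x << 4)) & 0x0F0F_0F0F;
        x = (x | (x << 2)) & 0x3333_3333;
        (x | (x << 1)) & 0x5555_5555
    }
    spread(x & 0xFFFF) | (spread(y & 0xFFFF) << 1)
}

/// Magic-bits decode under test: gather the even/odd bits back together.
fn magic_decode_2d(m: u32) -> (u32, u32) {
    fn gather(mut x: u32) -> u32 {
        x &= 0x5555_5555;
        x = (x ^ (x >> 1)) & 0x3333_3333;
        x = (x ^ (x >> 2)) & 0x0F0F_0F0F;
        x = (x ^ (x >> 4)) & 0x00FF_00FF;
        (x ^ (x >> 8)) & 0x0000_FFFF
    }
    (gather(m), gather(m >> 1))
}

fn main() {
    // Full u16 range, plus a subrange just above the 16-bit boundary.
    let ranges = [(0u32, u16::MAX as u32), (u16::MAX as u32, u16::MAX as u32 + 1_000)];
    for &(lo, hi) in &ranges {
        for m in lo..=hi {
            assert_eq!(magic_decode_2d(m), ref_decode_2d(m));
            let (x, y) = ref_decode_2d(m);
            assert_eq!(magic_encode_2d(x, y), ref_encode_2d(x, y));
        }
    }
    println!("magic bits match reference");
}
```

The reference version is orders of magnitude slower, but as a test oracle that doesn't matter, and it is portable to every architecture.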
If it turns out that this is not a bug in my code but actually a problem with libmorton, you might also want to rerun the benchmarks: if more operations are needed for correctness, BMI2 may well compare more favorably.