New morton class with arithmetic and comparison operators #860
base: master
Conversation
… on HLSL side by specializing , a bunch of morton operators
…or both cpp and hlsl
include/nbl/builtin/hlsl/morton.hlsl
Outdated
NBL_CONSTEXPR_STATIC uint16_t Stages = mpl::log2_ceil_v<Bits>;
[[unroll]]
for (uint16_t i = Stages; i > 0; i--)
This loop will never unroll: a static const declared as a plain variable inside a function will never be constexpr. @keptsecret got screwed over by this in the HLSL Path Tracer!
Better to unroll by hand, or use mpl::log2_ceil_v<Bits> directly as the initializer of the loop variable uint16_t i
struct MortonEncoder
{
    template<typename decode_t = conditional_t<(Bits > 16), vector<uint32_t, Dim>, vector<uint16_t, Dim> >
    NBL_FUNC_REQUIRES(concepts::IntVector<decode_t> && 8 * sizeof(typename vector_traits<decode_t>::scalar_type) >= Bits)
It's actually > Bits+Dim, not >= Bits, because you will be left-shifting the components, and the last one will have its MSB at Bits+Dim-1
But this is for the decode_t, which immediately gets transformed to a vector of encode_t which does have enough Bits to hold the interleaved and shifted coordinates
Idk what I was thinking when I wrote this tbh, maybe the check should be the other way around?
8 * sizeof(typename vector_traits<decode_t>::scalar_type) <= max(Bits, 16) to ensure you don't get an implicit truncation
or just drop that altogether idk
include/nbl/builtin/hlsl/morton.hlsl
Outdated
encode_t encoded = _static_cast<encode_t>(uint64_t(0));
array_get<portable_vector_t<encode_t, Dim>, encode_t> getter;
[[unroll]]
for (uint16_t i = 0; i < Dim; i++)
    encoded = encoded | getter(interleaveShifted, i);
I wouldn't count on the compiler noticing that |0 is an identity and can be optimized out for emulated_uint64_t, so do
encode_t encoded = getter(interleaveShifted,0);
[[unroll]]
for (uint32_t i=1; i<Dim; i++)
    encoded = encoded | getter(interleaveShifted,i);
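The suggestion above amounts to seeding the OR-reduction with the first component instead of zero. A minimal C++ sketch of that shape follows; `or_reduce` and the use of `std::array` are assumptions for illustration and do not appear in the PR, which uses `portable_vector_t` and `array_get`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: OR together all components, starting from the
// first one so that no `0 | x` operation is ever emitted.
template<typename T, std::size_t Dim>
constexpr T or_reduce(const std::array<T, Dim>& parts)
{
    static_assert(Dim > 0, "need at least one component to seed the reduction");
    T acc = parts[0];                     // seed with component 0, not with zero
    for (std::size_t i = 1; i < Dim; ++i) // remaining components only
        acc = acc | parts[i];
    return acc;
}
```

For a native integer type the two forms should compile to the same thing; the difference only matters when `|` is a user-defined operator on an emulated type.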
include/nbl/builtin/hlsl/morton.hlsl
Outdated
// ----------------------------------------------------------------- MORTON ENCODER ---------------------------------------------------

template<uint16_t Dim, uint16_t Bits, typename encode_t NBL_PRIMARY_REQUIRES(Dimension<Dim> && Dim * Bits <= 64 && 8 * sizeof(encode_t) == mpl::round_up_to_pot_v<Dim * Bits>)
struct MortonEncoder
morton::impl::Morton, too many mortons
Done.
include/nbl/builtin/hlsl/morton.hlsl
Outdated
};

// ----------------------------------------------------------------- MORTON DECODER ---------------------------------------------------

template<uint16_t Dim, uint16_t Bits, typename encode_t NBL_PRIMARY_REQUIRES(Dimension<Dim> && Dim * Bits <= 64 && 8 * sizeof(encode_t) == mpl::round_up_to_pot_v<Dim * Bits>)
struct MortonDecoder
{
Why not merge Decoder and Encoder into a single Transcoder?
Done.
struct MortonDecoder
{
    template<typename decode_t = conditional_t<(Bits > 16), vector<uint32_t, Dim>, vector<uint16_t, Dim> >
    NBL_FUNC_REQUIRES(concepts::IntVector<decode_t> && 8 * sizeof(typename vector_traits<decode_t>::scalar_type) >= Bits)
same thing with >=Bits needing to be > Bits+Dim
actually same comments as for the interleaveShift function
include/nbl/builtin/hlsl/morton.hlsl
Outdated
setter(decoded, i, encodedValue);
decoded = rightShift(decoded, _static_cast<vector<uint32_t, Dim> >(vector<uint32_t, 4>(0, 1, 2, 3)));
could just write setter(decoded, i, encodedValue>>i);
Done.
include/nbl/builtin/hlsl/morton.hlsl
Outdated
NBL_HLSL_MORTON_SPECIALIZE_FIRST_CODING_MASK(2, 0x5555555555555555) // Groups bits by 1 on, 1 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(2, 1, uint64_t(0x3333333333333333)) // Groups bits by 2 on, 2 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(2, 2, uint64_t(0x0F0F0F0F0F0F0F0F)) // Groups bits by 4 on, 4 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(2, 3, uint64_t(0x00FF00FF00FF00FF)) // Groups bits by 8 on, 8 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(2, 4, uint64_t(0x0000FFFF0000FFFF)) // Groups bits by 16 on, 16 off

NBL_HLSL_MORTON_SPECIALIZE_FIRST_CODING_MASK(3, 0x9249249249249249) // Groups bits by 1 on, 2 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(3, 1, uint64_t(0x30C30C30C30C30C3)) // Groups bits by 2 on, 4 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(3, 2, uint64_t(0xF00F00F00F00F00F)) // Groups bits by 4 on, 8 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(3, 3, uint64_t(0x00FF0000FF0000FF)) // Groups bits by 8 on, 16 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(3, 4, uint64_t(0xFFFF00000000FFFF)) // Groups bits by 16 on, 32 off

NBL_HLSL_MORTON_SPECIALIZE_FIRST_CODING_MASK(4, 0x1111111111111111) // Groups bits by 1 on, 3 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(4, 1, uint64_t(0x0303030303030303)) // Groups bits by 2 on, 6 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(4, 2, uint64_t(0x000F000F000F000F)) // Groups bits by 4 on, 12 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(4, 3, uint64_t(0x000000FF000000FF)) // Groups bits by 8 on, 24 off
NBL_HLSL_MORTON_SPECIALIZE_CODING_MASK(4, 4, uint64_t(0x000000000000FFFF)) // Groups bits by 16 on, 48 off (unused but here for completion + likely keeps compiler from complaining)
ull suffixes on the mask literals please
Done.
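For readers unfamiliar with how such coding masks are used, here is a minimal C++ sketch of a 2D Morton encode built from the Dim = 2 masks quoted above, written with `ull` suffixes as requested. The function names `spread_bits_2d` and `morton_encode_2d` are hypothetical and are not the PR's API; this is the textbook shift-and-mask interleave, not necessarily what the PR's transcoder does internally.

```cpp
#include <cstdint>

// Hypothetical helper: move bit i of `x` to bit 2*i of the result.
// Each round halves the group size, using the masks from the diff above.
constexpr std::uint64_t spread_bits_2d(std::uint32_t x)
{
    std::uint64_t v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFull; // groups of 16 on, 16 off
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFull; // groups of 8 on, 8 off
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0Full; // groups of 4 on, 4 off
    v = (v | (v << 2))  & 0x3333333333333333ull; // groups of 2 on, 2 off
    v = (v | (v << 1))  & 0x5555555555555555ull; // 1 on, 1 off
    return v;
}

// Hypothetical helper: x occupies the even bits, y the odd bits.
constexpr std::uint64_t morton_encode_2d(std::uint32_t x, std::uint32_t y)
{
    return spread_bits_2d(x) | (spread_bits_2d(y) << 1);
}
```

The 3D and 4D cases follow the same pattern with the corresponding mask families and shifts scaled by Dim - 1.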
// If `Bits` is greater than half the bitwidth of the decode type, then we can avoid `&`ing against the last mask since duplicated MSB get truncated
NBL_IF_CONSTEXPR(Bits > 4 * sizeof(typename vector_traits<decode_t>::scalar_type))
I think that > should be a >= because if you have 16 bit morton (e.g. dim=2 stored in a uint32_t) getting decoded into a vector of uint16_t you'll have a shift by 8 in the final coding round
But the comparison is against half the bitwidth. For example, when decoding to a vector of uint16_t, this decision is made based on whether we have more than 8 bits.
If you have exactly 8 bits, the last shift is by 4. Ignore hex, let's just say the encoded number is ABCDEFGH where each letter represents a binary digit. Then in the last round you'll have decoded = 0000ABCD0000EFGH, decoded >> 4 = 00000000ABCD0000 (need 16 bits to hold two 8-bit Mortons) and the | between these looks like 0000ABCDABCDEFGH. Here, to get the correct value, I do need to mask off the highest 8 bits.
Now say you have more than half the bitwidth of the decode type, for example a 9-bit Morton ABCDEFGHI being decoded to a uint16_t. Since we have more than 8 bits, the last round is a shift by 8, so the spacing between bits is also 8, and decoded will look like decoded = 000000000000000A00000000BCDEFGHI (now need 32 bits to hold two 9-bit Mortons) and decoded >> 8 = 00000000000000000000000A00000000, so the | between them returns 000000000000000A0000000ABCDEFGHI. Now there's no need to mask, since taking only the lowest 16 bits correctly yields 0000000ABCDEFGHI (the same holds for any value from 10 to 16 bits)
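The truncation argument above can be checked with a small C++ sketch for the Dim = 2 case. All function names here (`spread2`, `compact2_truncating`, `compact2_8bit_unmasked`) are hypothetical and not part of the PR; the masks are the 32-bit versions of the usual bit-compaction masks.

```cpp
#include <cstdint>

// Hypothetical helper: put bit i of a 16-bit value at bit 2*i (2D Morton layout).
constexpr std::uint32_t spread2(std::uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// More than 8 bits decoded into uint16_t: the last round shifts by 8 and the
// final `& 0x0000FFFF` is skipped, because the cast truncates the duplicates.
constexpr std::uint16_t compact2_truncating(std::uint32_t code)
{
    std::uint32_t v = code & 0x55555555u;
    v = (v | (v >> 1)) & 0x33333333u;
    v = (v | (v >> 2)) & 0x0F0F0F0Fu;
    v = (v | (v >> 4)) & 0x00FF00FFu;
    v = v | (v >> 8);                  // no final mask
    return static_cast<std::uint16_t>(v);
}

// Exactly 8 bits decoded into uint16_t: the last round shifts by 4, and
// skipping the mask leaves duplicated bits inside the low 16 bits.
constexpr std::uint16_t compact2_8bit_unmasked(std::uint32_t code)
{
    std::uint32_t v = code & 0x55555555u;
    v = (v | (v >> 1)) & 0x33333333u;
    v = (v | (v >> 2)) & 0x0F0F0F0Fu;
    v = v | (v >> 4);                  // the `& 0x00FF` is deliberately left out
    return static_cast<std::uint16_t>(v);
}
```

In this sketch the truncating variant round-trips any coordinate of up to 16 bits, while the 8-bit variant returns a value that still needs `& 0x00FF` applied, which matches the ABCDEFGH walkthrough.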
include/nbl/builtin/hlsl/morton.hlsl
Outdated
template<typename I NBL_FUNC_REQUIRES(Comparable<Signed, Bits, storage_t, true, I>)
NBL_CONSTEXPR_STATIC_INLINE_FUNC vector<bool, D> __call(NBL_CONST_REF_ARG(storage_t) value, NBL_CONST_REF_ARG(portable_vector_t<I, D>) rhs)
{
    NBL_CONSTEXPR portable_vector_t<storage_t, D> zeros = _static_cast<portable_vector_t<storage_t, D> >(_static_cast<vector<uint64_t, D> >(vector<uint64_t, 4>(0,0,0,0)));
again, this will create a hidden variable with an initializer
you literally have to use a temporary to compare against
or declare the variable as a plain const, not a static const
wait what does static const do vs using just const
read the discord thread
include/nbl/builtin/hlsl/morton.hlsl
Outdated
NBL_CONSTEXPR_STATIC portable_vector_t<storage_t, D> InterleaveMasks = _static_cast<portable_vector_t<storage_t, D> >(_static_cast<vector<uint64_t, D> >(vector<uint64_t, 4>(coding_mask_v<D, Bits, 0>, coding_mask_v<D, Bits, 0> << 1, coding_mask_v<D, Bits, 0> << 2, coding_mask_v<D, Bits, 0> << 3)));
NBL_CONSTEXPR_STATIC portable_vector_t<storage_t, D> SignMasks = _static_cast<portable_vector_t<storage_t, D> >(_static_cast<vector<uint64_t, D> >(vector<uint64_t, 4>(SignMask<Bits, D>, SignMask<Bits, D> << 1, SignMask<Bits, D> << 2, SignMask<Bits, D> << 3)));
again a plain const is okay, static const is not
Also, is there a prettier way to write this? Or at least format it (every component on a new line?)
include/nbl/builtin/hlsl/morton.hlsl
Outdated
// Obtain a vector of deinterleaved coordinates and flip their sign bits
const portable_vector_t<storage_t, D> thisCoord = (InterleaveMasks & value) ^ SignMasks;
// rhs already deinterleaved, just have to cast type and flip sign
const portable_vector_t<storage_t, D> rhsCoord = _static_cast<portable_vector_t<storage_t, D> >(rhs) ^ SignMasks;
Why are you always flipping signs, regardless of Signed?
Should be done?
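The point of the question can be illustrated with the standard trick of ordering two's-complement values through an unsigned comparison after XOR-ing the sign bit. The following is a minimal C++ sketch on 8-bit values; `less_than` and its `is_signed` parameter are hypothetical names, and the PR's actual comparison works on interleaved masks rather than on plain bytes.

```cpp
#include <cstdint>

// Hypothetical helper: compare two 8-bit patterns held in unsigned storage.
// For signed (two's-complement) interpretation, flipping the sign bit maps
// the signed order onto the unsigned order. For unsigned interpretation the
// flip must be skipped, otherwise large values would sort below small ones.
constexpr bool less_than(std::uint8_t a, std::uint8_t b, bool is_signed)
{
    const std::uint8_t flip = is_signed ? 0x80u : 0x00u;
    return static_cast<std::uint8_t>(a ^ flip) < static_cast<std::uint8_t>(b ^ flip);
}
```

With `is_signed` set, the pattern 0xFF (that is, -1) compares below 0x01; without it, 0xFF compares above 0x01, which is why the flip should depend on `Signed`.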
include/nbl/builtin/hlsl/morton.hlsl
Outdated
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> equals(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
{
    return impl::Equals<Signed, Bits, D, storage_t, BitsAlreadySpread>::__call(value, rhs);
}

NBL_CONSTEXPR_INLINE_FUNC bool operator!=(NBL_CONST_REF_ARG(this_t) rhs) NBL_CONST_MEMBER_FUNC
{
    return value != rhs.value;
}

template<bool BitsAlreadySpread, typename I
NBL_FUNC_REQUIRES(impl::Comparable<Signed, Bits, storage_t, BitsAlreadySpread, I>)
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> notEquals(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
{
    return !equals<BitsAlreadySpread, I>(rhs);
}

template<bool BitsAlreadySpread, typename I
NBL_FUNC_REQUIRES(impl::Comparable<Signed, Bits, storage_t, BitsAlreadySpread, I>)
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> less(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
{
    return impl::LessThan<Signed, Bits, D, storage_t, BitsAlreadySpread>::__call(value, rhs);
}

template<bool BitsAlreadySpread, typename I
NBL_FUNC_REQUIRES(impl::Comparable<Signed, Bits, storage_t, BitsAlreadySpread, I>)
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> lessEquals(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
{
    return impl::LessEquals<Signed, Bits, D, storage_t, BitsAlreadySpread>::__call(value, rhs);
}

template<bool BitsAlreadySpread, typename I
NBL_FUNC_REQUIRES(impl::Comparable<Signed, Bits, storage_t, BitsAlreadySpread, I>)
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> greater(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
{
    return impl::GreaterThan<Signed, Bits, D, storage_t, BitsAlreadySpread>::__call(value, rhs);
}

template<bool BitsAlreadySpread, typename I
NBL_FUNC_REQUIRES(impl::Comparable<Signed, Bits, storage_t, BitsAlreadySpread, I>)
NBL_CONSTEXPR_INLINE_FUNC vector<bool, D> greaterEquals(NBL_CONST_REF_ARG(vector<I, D>) rhs) NBL_CONST_MEMBER_FUNC
Spelling nitpick: those functions are usually called equal, without an s at the end
https://registry.khronos.org/OpenGL-Refpages/gl4/html/equal.xhtml
Done.
`NBL_CONSTEXPR_FUNC`
Adds `OpUndef` to spirv `intrinsics.hlsl` and `cpp_compat.hlsl`
Adds an explicit `truncate` function for vectors and emulated vectors
Adds a bunch of specializations for vectorial types in `functional.hlsl`
Bugfixes and changes to Morton codes, very close to them working properly with emulated ints
struct Promote
{
    T operator()(U v)
    NBL_CONSTEXPR_FUNC T operator()(NBL_CONST_REF_ARG(U) v)
vectors should not be taken by NBL_CONST_REF_ARG
Done.
|
After #947 gets merged, make sure to |
…not a single token
Description
Adds a new class for 2-, 3- and 4-dimensional Morton codes, with arithmetic and comparison operators
Testing
TODO
TODO list:
Need to make sure all operators work properly before merging