-
Notifications
You must be signed in to change notification settings - Fork 67
HLSL path tracer example (new pr) #959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
|
||
| bool operator()(floating_point_type leftProb, NBL_REF_ARG(floating_point_type) xi, NBL_REF_ARG(floating_point_type) rcpChoiceProb) | ||
| { | ||
| const floating_point_type NEXT_ULP_AFTER_UNITY = bit_cast<floating_point_type>(bit_cast<uint_type>(floating_point_type(1.0)) + uint_type(1u)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick, renane to NextULPAfterUnity upper snake case we reserve for macros
| struct PartitionRandVariable | ||
| { | ||
| using floating_point_type = T; | ||
| using uint_type = typename unsigned_integer_of_size<sizeof(floating_point_type)>::type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick, don't we have a _t suffixed alias
| using floating_point_type = T; | ||
| using uint_type = typename unsigned_integer_of_size<sizeof(floating_point_type)>::type; | ||
|
|
||
| bool operator()(floating_point_type leftProb, NBL_REF_ARG(floating_point_type) xi, NBL_REF_ARG(floating_point_type) rcpChoiceProb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftProb should be a member, not a argument, why? I can draw multiple times with multiple xi but same leftProb
| using scalar_type = T; | ||
| using vector2_type = vector<T, 2>; | ||
|
|
||
| static Linear<T> create(NBL_CONST_REF_ARG(vector2_type) linearCoeffs) // start and end importance values (start, end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const is enough for vector types
| retval.linearCoeffStart = linearCoeffs[0]; | ||
| retval.rcpDiff = 1.0 / (linearCoeffs[0] - linearCoeffs[1]); | ||
| vector2_type squaredCoeffs = linearCoeffs * linearCoeffs; | ||
| retval.squaredCoeffStart = squaredCoeffs[0]; | ||
| retval.squaredCoeffDiff = squaredCoeffs[1] - squaredCoeffs[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you actually want to compute rcpDiff first
Also leave a TODO that we should probably pre-multiply linearCoeffs by rcpDiff so that we have less registers/live-variables to use
and linearCoeffStart, squaredCoeffStart and quaredCoeffDiff would already have rcpDiff inside them.
Trading 1 less register for 1 more MUL
P.S. then the mix condition would then be abs(linearCoeffStartOverDiff) < max
| retval.vertex1 = nbl::hlsl::normalize(vertex1 - origin); | ||
| retval.vertex2 = nbl::hlsl::normalize(vertex2 - origin); | ||
| retval.cos_sides = vector3_type(hlsl::dot(retval.vertex1, retval.vertex2), hlsl::dot(retval.vertex2, retval.vertex0), hlsl::dot(retval.vertex0, retval.vertex1)); | ||
| const vector3_type csc_sides2 = hlsl::promote<vector3_type>(1.0) - retval.cos_sides * retval.cos_sides; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpict, this is sin2_sides it becomes a cosecant when you do the rsqrt
| bool pyramidAngles() | ||
| { | ||
| return hlsl::any<vector<bool, 3> >(csc_sides >= (vector3_type)(numeric_limits<scalar_type>::max)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment, what does pyramicAngles() mean ?
| return hlsl::any<vector<bool, 3> >(csc_sides >= (vector3_type)(numeric_limits<scalar_type>::max)); | ||
| } | ||
|
|
||
| scalar_type solidAngleOfTriangle(NBL_REF_ARG(vector3_type) cos_vertices, NBL_REF_ARG(vector3_type) sin_vertices, NBL_REF_ARG(scalar_type) cos_a, NBL_REF_ARG(scalar_type) cos_c, NBL_REF_ARG(scalar_type) csc_b, NBL_REF_ARG(scalar_type) csc_c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't need cos_a, cos_c, csc_b, csc_c as output arguments, I can simply steam them from the members if I wish
| vector3_type awayFromEdgePlane0 = hlsl::cross<vector3_type>(vertex1, vertex2) * csc_sides[0]; | ||
| vector3_type awayFromEdgePlane1 = hlsl::cross<vector3_type>(vertex2, vertex0) * csc_sides[1]; | ||
| vector3_type awayFromEdgePlane2 = hlsl::cross<vector3_type>(vertex0, vertex1) * csc_sides[2]; | ||
|
|
||
| // useless here but could be useful somewhere else | ||
| cos_vertices[0] = hlsl::dot<vector3_type>(awayFromEdgePlane1, awayFromEdgePlane2); | ||
| cos_vertices[1] = hlsl::dot<vector3_type>(awayFromEdgePlane2, awayFromEdgePlane0); | ||
| cos_vertices[2] = hlsl::dot<vector3_type>(awayFromEdgePlane0, awayFromEdgePlane1); | ||
| // TODO: above dot products are in the wrong order, either work out which is which, or try all 6 permutations till it works | ||
| cos_vertices = hlsl::clamp<vector3_type>((cos_sides - cos_sides.yzx * cos_sides.zxy) * csc_sides.yzx * csc_sides.zxy, hlsl::promote<vector3_type>(-1.0), hlsl::promote<vector3_type>(1.0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're overwriting them anyway, whats the point of computing awayfromEdgePlane and cos_vertices above?
| namespace util | ||
| { | ||
| // Use this convetion e_i = v_{i+2}-v_{i+1}. vertex index is modulo by 3. | ||
| template <typename float_t> | ||
| vector<float_t, 3> compInternalAngle(NBL_CONST_REF_ARG(vector<float_t, 3>) e0, NBL_CONST_REF_ARG(vector<float_t, 3>) e1, NBL_CONST_REF_ARG(vector<float_t, 3>) e2) | ||
| { | ||
| // Calculate this triangle's weight for each of its three m_vertices | ||
| // start by calculating the lengths of its sides | ||
| const float_t a = hlsl::dot(e0, e0); | ||
| const float_t asqrt = hlsl::sqrt(a); | ||
| const float_t b = hlsl::dot(e1, e1); | ||
| const float_t bsqrt = hlsl::sqrt(b); | ||
| const float_t c = hlsl::dot(e2, e2); | ||
| const float_t csqrt = hlsl::sqrt(c); | ||
|
|
||
| const float_t angle0 = hlsl::acos((b + c - a) / (2.f * bsqrt * csqrt)); | ||
| const float_t angle1 = hlsl::acos((-b + c + a) / (2.f * asqrt * csqrt)); | ||
| const float_t angle2 = hlsl::numbers::pi<float_t> - (angle0 + angle1); | ||
| // use them to find the angle at each vertex | ||
| return vector<float_t, 3>(angle0, angle1, angle2); | ||
| } | ||
| } | ||
|
|
||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this used for? and why is it in the spherical_triangle.hlsl ?
its clearly for a cartesian/euclidean triangle thats flat!
| namespace util | ||
| { | ||
| // Use this convetion e_i = v_{i+2}-v_{i+1}. vertex index is modulo by 3. | ||
| template <typename float_t> | ||
| vector<float_t, 3> compInternalAngle(NBL_CONST_REF_ARG(vector<float_t, 3>) e0, NBL_CONST_REF_ARG(vector<float_t, 3>) e1, NBL_CONST_REF_ARG(vector<float_t, 3>) e2) | ||
| { | ||
| // Calculate this triangle's weight for each of its three m_vertices | ||
| // start by calculating the lengths of its sides | ||
| const float_t a = hlsl::dot(e0, e0); | ||
| const float_t asqrt = hlsl::sqrt(a); | ||
| const float_t b = hlsl::dot(e1, e1); | ||
| const float_t bsqrt = hlsl::sqrt(b); | ||
| const float_t c = hlsl::dot(e2, e2); | ||
| const float_t csqrt = hlsl::sqrt(c); | ||
|
|
||
| const float_t angle0 = hlsl::acos((b + c - a) / (2.f * bsqrt * csqrt)); | ||
| const float_t angle1 = hlsl::acos((-b + c + a) / (2.f * asqrt * csqrt)); | ||
| const float_t angle2 = hlsl::numbers::pi<float_t> - (angle0 + angle1); | ||
| // use them to find the angle at each vertex | ||
| return vector<float_t, 3>(angle0, angle1, angle2); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should have stayed here, also awful name InternalAngle but of what ? a Pentagon!?
please change the name to anglesFromTriangleEdge and make sure the smooth normal compute still works
| return angle_adder.getSumofArccos() - numbers::pi<scalar_type>; | ||
| } | ||
|
|
||
| scalar_type solidAngleOfTriangle() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also OfTriangle in the name of the method is a tautology
| cascade_layer_scalar_type getLuma(NBL_CONST_REF_ARG(CascadeLayerType) col) | ||
| { | ||
| return hlsl::dot<CascadeLayerType>(hlsl::transpose(colorspace::scRGBtoXYZ)[1], col); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a method of the SplattingParameters
| const cascade_layer_scalar_type log2Base = unpackedParams[1]; | ||
| const cascade_layer_scalar_type luma = getLuma(_sample); | ||
| const cascade_layer_scalar_type log2Luma = log2<cascade_layer_scalar_type>(luma); | ||
| const cascade_layer_scalar_type cascade = log2Luma * 1.f / log2Base - log2Start / log2Base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its okay to write this as (log2Luma-log2Start)/log2Base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually your splatting parameter should store rcpLog2Base and baseRootOfStart
| const float32_t2 unpackedParams = hlsl::unpackHalf2x16(splattingParameters.packedLog2); | ||
| const cascade_layer_scalar_type log2Start = unpackedParams[0]; | ||
| const cascade_layer_scalar_type log2Base = unpackedParams[1]; | ||
| const cascade_layer_scalar_type luma = getLuma(_sample); | ||
| const cascade_layer_scalar_type log2Luma = log2<cascade_layer_scalar_type>(luma); | ||
| const cascade_layer_scalar_type cascade = log2Luma * 1.f / log2Base - log2Start / log2Base; | ||
| const cascade_layer_scalar_type clampedCascade = clamp(cascade, 0, CascadeCount - 1); | ||
| // c<=0 -> 0, c>=Count-1 -> Count-1 | ||
| uint32_t lowerCascadeIndex = floor<cascade_layer_scalar_type>(cascade); | ||
| // 0 whenever clamped or `cascade` is integer (when `clampedCascade` is integer) | ||
| cascade_layer_scalar_type higherCascadeWeight = clampedCascade - floor<cascade_layer_scalar_type>(clampedCascade); | ||
| // never 0 thanks to magic of `1-fract(x)` | ||
| cascade_layer_scalar_type lowerCascadeWeight = cascade_layer_scalar_type(1) - higherCascadeWeight; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all this computation should be done in SplattingParameters::scalar_t
Also make the lowerCascadeIndex a uint16_t
| if (cascade > CascadeCount - 1) | ||
| lowerCascadeWeight = exp2(log2Start + log2Base * (CascadeCount - 1) - log2Luma); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap the CascadeCount-1 in a _static_cast<SplattingParameters::scalar_t> so we get no conversion/comparison warnings.
Actually declare a const SplattingParameters::scalar_t LastCascade = CascadeCount-1 to use throughout
| using output_storage_type = CascadeEntry; | ||
| using initialization_data = SplattingParameters; | ||
| output_storage_type accumulation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the need for extra typedefs here, I get that an alias of CascadeLayerType is needed, but not the rest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although after you do https://github.com/Devsh-Graphics-Programming/Nabla/pull/959/files#r2589131299 then they may be needed (except for initialization_data alias)
| uint32_t cascadeSampleCounter[CascadeCount]; | ||
| CascadeLayerType data[CascadeCount]; | ||
|
|
||
| void addSampleIntoCascadeEntry(CascadeLayerType _sample, uint32_t lowerCascadeIndex, float lowerCascadeLevelWeight, float higherCascadeLevelWeight, uint32_t sampleCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you want the weights to be same type as SplattingParameters::scalar_t
| struct CascadeEntry | ||
| { | ||
| uint32_t cascadeSampleCounter[CascadeCount]; | ||
| CascadeLayerType data[CascadeCount]; | ||
|
|
||
| void addSampleIntoCascadeEntry(CascadeLayerType _sample, uint32_t lowerCascadeIndex, float lowerCascadeLevelWeight, float higherCascadeLevelWeight, uint32_t sampleCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can pull the CascadeEntry (can rename to DefaultCascades) outside of CascadeAccumulator just to allow it to be templated on both CascadeLayerType and SampleCountType (should default to uint16_t instead of uint32_t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then the CascadeAccumulator would be templated only on the CascadesType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I told przemek to put it inside before, but now I realize we can mess around with the types of more things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the rationale for this : #936 (comment)
| retval.accumulation.data[i] = promote<CascadeLayerType, float32_t>(0.0f); | ||
| retval.accumulation.cascadeSampleCounter[i] = 0u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to require the accumulation ( CascadesType::clear(cascadeIx)) has a clear method than poke members directly
| // declare concept | ||
| #define NBL_CONCEPT_NAME ResolveAccessorBase | ||
| #define NBL_CONCEPT_TPLT_PRM_KINDS (typename)(typename)(int32_t) | ||
| #define NBL_CONCEPT_TPLT_PRM_NAMES (T)(VectorScalarType)(Dims) | ||
| // not the greatest syntax but works | ||
| #define NBL_CONCEPT_PARAM_0 (a,T) | ||
| #define NBL_CONCEPT_PARAM_1 (scalar,VectorScalarType) | ||
| // start concept | ||
| NBL_CONCEPT_BEGIN(2) | ||
| // need to be defined AFTER the concept begins | ||
| #define a NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_0 | ||
| #define scalar NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_1 | ||
| NBL_CONCEPT_END( | ||
| ((NBL_CONCEPT_REQ_EXPR)((a.calcLuma(vector<VectorScalarType, 3>(scalar, scalar, scalar))))) | ||
| ); | ||
| #undef a | ||
| #undef scalar | ||
| #include <nbl/builtin/hlsl/concepts/__end.hlsl> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this should be a more generic concept because this is what I want SplattingParameters to be checked for when its used with `CascadeAccumulator
- it should check that
T::scalar_t(which is a FloatingPointScalar) and a templated (on the sample type)calcLumawhich does not necessarily enforce that the sample type is avector<T,3> - don't take a
Dimstemplate parameter, you should only have two
| using output_type = vector<OutputScalar, 4>; | ||
| NBL_CONSTEXPR int32_t image_dimension = 2; | ||
|
|
||
| RWTexture2DArray<float32_t4> cascade; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can't store a texture as a local variable in DXC, it really doesn't like it, they need to be declared by the user on the fly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this unfortunately always need to be user-space and the texture declaration needs to be a global
| float32_t calcLuma(NBL_REF_ARG(float32_t3) col) | ||
| { | ||
| return hlsl::dot<float32_t3>(colorspace::scRGB::ToXYZ()[1], col); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok SplattingParameters should inherit from a BaseParameters that defines scalar_t, and calcLuma this stuff obviously need to be shared
| using output_scalar_type = OutputScalar; | ||
| using output_type = vector<OutputScalar, 4>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd advise for the output type to already have the correct number of components
| template<typename T, typename VectorScalarType, int32_t Dims> | ||
| NBL_BOOL_CONCEPT ResolveAccessor = ResolveAccessorBase<T, VectorScalarType, Dims> && concepts::accessors::LoadableImage<T, VectorScalarType, Dims>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok we definitely need to extend the LoadableImage concept with a final template argument which is the number of components (so they don't always load vector<T,4>)
| template<typename CascadeAccessor, typename OutputColorTypeVec NBL_PRIMARY_REQUIRES(concepts::Vector<OutputColorTypeVec> && ResolveAccessor<CascadeAccessor, typename CascadeAccessor::output_scalar_type, CascadeAccessor::image_dimension>) | ||
| struct Resolver | ||
| { | ||
| using output_type = OutputColorTypeVec; | ||
| using scalar_t = typename vector_traits<output_type>::scalar_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
template only on the CascadeAccessor, require the CascadeAccessor to have an output_t alias
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also template on the cascade count
| ResolveParameters computeResolveParameters(float base, uint32_t sampleCount, float minReliableLuma, float kappa, uint32_t cascadeSize) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not cascadeSize, but cascadeCount , also it should really just be a template parameter on the Resolver, the loops really need to be unrolled and its something you can set up at compile time no problem
Finally this should have been static create factory inside the struct.
replaces #947