Skip to content

Conversation

@keptsecret
Copy link
Contributor

@keptsecret keptsecret commented Dec 2, 2025

replaces #947


bool operator()(floating_point_type leftProb, NBL_REF_ARG(floating_point_type) xi, NBL_REF_ARG(floating_point_type) rcpChoiceProb)
{
const floating_point_type NEXT_ULP_AFTER_UNITY = bit_cast<floating_point_type>(bit_cast<uint_type>(floating_point_type(1.0)) + uint_type(1u));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, renane to NextULPAfterUnity upper snake case we reserve for macros

struct PartitionRandVariable
{
using floating_point_type = T;
using uint_type = typename unsigned_integer_of_size<sizeof(floating_point_type)>::type;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, don't we have a _t suffixed alias

using floating_point_type = T;
using uint_type = typename unsigned_integer_of_size<sizeof(floating_point_type)>::type;

bool operator()(floating_point_type leftProb, NBL_REF_ARG(floating_point_type) xi, NBL_REF_ARG(floating_point_type) rcpChoiceProb)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftProb should be a member, not a argument, why? I can draw multiple times with multiple xi but same leftProb

using scalar_type = T;
using vector2_type = vector<T, 2>;

static Linear<T> create(NBL_CONST_REF_ARG(vector2_type) linearCoeffs) // start and end importance values (start, end)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const is enough for vector types

Comment on lines +27 to +31
retval.linearCoeffStart = linearCoeffs[0];
retval.rcpDiff = 1.0 / (linearCoeffs[0] - linearCoeffs[1]);
vector2_type squaredCoeffs = linearCoeffs * linearCoeffs;
retval.squaredCoeffStart = squaredCoeffs[0];
retval.squaredCoeffDiff = squaredCoeffs[1] - squaredCoeffs[0];
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you actually want to compute rcpDiff first

Also leave a TODO that we should probably pre-multiply linearCoeffs by rcpDiff so that we have less registers/live-variables to use

and linearCoeffStart, squaredCoeffStart and quaredCoeffDiff would already have rcpDiff inside them.

Trading 1 less register for 1 more MUL

P.S. then the mix condition would then be abs(linearCoeffStartOverDiff) < max

retval.vertex1 = nbl::hlsl::normalize(vertex1 - origin);
retval.vertex2 = nbl::hlsl::normalize(vertex2 - origin);
retval.cos_sides = vector3_type(hlsl::dot(retval.vertex1, retval.vertex2), hlsl::dot(retval.vertex2, retval.vertex0), hlsl::dot(retval.vertex0, retval.vertex1));
const vector3_type csc_sides2 = hlsl::promote<vector3_type>(1.0) - retval.cos_sides * retval.cos_sides;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpict, this is sin2_sides it becomes a cosecant when you do the rsqrt

Comment on lines +42 to +45
bool pyramidAngles()
{
return hlsl::any<vector<bool, 3> >(csc_sides >= (vector3_type)(numeric_limits<scalar_type>::max));
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment, what does pyramicAngles() mean ?

return hlsl::any<vector<bool, 3> >(csc_sides >= (vector3_type)(numeric_limits<scalar_type>::max));
}

scalar_type solidAngleOfTriangle(NBL_REF_ARG(vector3_type) cos_vertices, NBL_REF_ARG(vector3_type) sin_vertices, NBL_REF_ARG(scalar_type) cos_a, NBL_REF_ARG(scalar_type) cos_c, NBL_REF_ARG(scalar_type) csc_b, NBL_REF_ARG(scalar_type) csc_c)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't need cos_a, cos_c, csc_b, csc_c as output arguments, I can simply steam them from the members if I wish

Comment on lines +80 to +89
vector3_type awayFromEdgePlane0 = hlsl::cross<vector3_type>(vertex1, vertex2) * csc_sides[0];
vector3_type awayFromEdgePlane1 = hlsl::cross<vector3_type>(vertex2, vertex0) * csc_sides[1];
vector3_type awayFromEdgePlane2 = hlsl::cross<vector3_type>(vertex0, vertex1) * csc_sides[2];

// useless here but could be useful somewhere else
cos_vertices[0] = hlsl::dot<vector3_type>(awayFromEdgePlane1, awayFromEdgePlane2);
cos_vertices[1] = hlsl::dot<vector3_type>(awayFromEdgePlane2, awayFromEdgePlane0);
cos_vertices[2] = hlsl::dot<vector3_type>(awayFromEdgePlane0, awayFromEdgePlane1);
// TODO: above dot products are in the wrong order, either work out which is which, or try all 6 permutations till it works
cos_vertices = hlsl::clamp<vector3_type>((cos_sides - cos_sides.yzx * cos_sides.zxy) * csc_sides.yzx * csc_sides.zxy, hlsl::promote<vector3_type>(-1.0), hlsl::promote<vector3_type>(1.0));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're overwriting them anyway, whats the point of computing awayfromEdgePlane and cos_vertices above?

Comment on lines +105 to +128
namespace util
{
// Use this convetion e_i = v_{i+2}-v_{i+1}. vertex index is modulo by 3.
template <typename float_t>
vector<float_t, 3> compInternalAngle(NBL_CONST_REF_ARG(vector<float_t, 3>) e0, NBL_CONST_REF_ARG(vector<float_t, 3>) e1, NBL_CONST_REF_ARG(vector<float_t, 3>) e2)
{
// Calculate this triangle's weight for each of its three m_vertices
// start by calculating the lengths of its sides
const float_t a = hlsl::dot(e0, e0);
const float_t asqrt = hlsl::sqrt(a);
const float_t b = hlsl::dot(e1, e1);
const float_t bsqrt = hlsl::sqrt(b);
const float_t c = hlsl::dot(e2, e2);
const float_t csqrt = hlsl::sqrt(c);

const float_t angle0 = hlsl::acos((b + c - a) / (2.f * bsqrt * csqrt));
const float_t angle1 = hlsl::acos((-b + c + a) / (2.f * asqrt * csqrt));
const float_t angle2 = hlsl::numbers::pi<float_t> - (angle0 + angle1);
// use them to find the angle at each vertex
return vector<float_t, 3>(angle0, angle1, angle2);
}
}

}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this used for? and why is it in the spherical_triangle.hlsl ?

its clearly for a cartesian/euclidean triangle thats flat!

Comment on lines -19 to -40
namespace util
{
// Use this convetion e_i = v_{i+2}-v_{i+1}. vertex index is modulo by 3.
template <typename float_t>
vector<float_t, 3> compInternalAngle(NBL_CONST_REF_ARG(vector<float_t, 3>) e0, NBL_CONST_REF_ARG(vector<float_t, 3>) e1, NBL_CONST_REF_ARG(vector<float_t, 3>) e2)
{
// Calculate this triangle's weight for each of its three m_vertices
// start by calculating the lengths of its sides
const float_t a = hlsl::dot(e0, e0);
const float_t asqrt = hlsl::sqrt(a);
const float_t b = hlsl::dot(e1, e1);
const float_t bsqrt = hlsl::sqrt(b);
const float_t c = hlsl::dot(e2, e2);
const float_t csqrt = hlsl::sqrt(c);

const float_t angle0 = hlsl::acos((b + c - a) / (2.f * bsqrt * csqrt));
const float_t angle1 = hlsl::acos((-b + c + a) / (2.f * asqrt * csqrt));
const float_t angle2 = hlsl::numbers::pi<float_t> - (angle0 + angle1);
// use them to find the angle at each vertex
return vector<float_t, 3>(angle0, angle1, angle2);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have stayed here, also awful name InternalAngle but of what ? a Pentagon!?

please change the name to anglesFromTriangleEdge and make sure the smooth normal compute still works

return angle_adder.getSumofArccos() - numbers::pi<scalar_type>;
}

scalar_type solidAngleOfTriangle()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also OfTriangle in the name of the method is a tautology

Comment on lines +64 to +67
cascade_layer_scalar_type getLuma(NBL_CONST_REF_ARG(CascadeLayerType) col)
{
return hlsl::dot<CascadeLayerType>(hlsl::transpose(colorspace::scRGBtoXYZ)[1], col);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be a method of the SplattingParameters

const cascade_layer_scalar_type log2Base = unpackedParams[1];
const cascade_layer_scalar_type luma = getLuma(_sample);
const cascade_layer_scalar_type log2Luma = log2<cascade_layer_scalar_type>(luma);
const cascade_layer_scalar_type cascade = log2Luma * 1.f / log2Base - log2Start / log2Base;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its okay to write this as (log2Luma-log2Start)/log2Base

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually your splatting parameter should store rcpLog2Base and baseRootOfStart

Comment on lines +72 to +84
const float32_t2 unpackedParams = hlsl::unpackHalf2x16(splattingParameters.packedLog2);
const cascade_layer_scalar_type log2Start = unpackedParams[0];
const cascade_layer_scalar_type log2Base = unpackedParams[1];
const cascade_layer_scalar_type luma = getLuma(_sample);
const cascade_layer_scalar_type log2Luma = log2<cascade_layer_scalar_type>(luma);
const cascade_layer_scalar_type cascade = log2Luma * 1.f / log2Base - log2Start / log2Base;
const cascade_layer_scalar_type clampedCascade = clamp(cascade, 0, CascadeCount - 1);
// c<=0 -> 0, c>=Count-1 -> Count-1
uint32_t lowerCascadeIndex = floor<cascade_layer_scalar_type>(cascade);
// 0 whenever clamped or `cascade` is integer (when `clampedCascade` is integer)
cascade_layer_scalar_type higherCascadeWeight = clampedCascade - floor<cascade_layer_scalar_type>(clampedCascade);
// never 0 thanks to magic of `1-fract(x)`
cascade_layer_scalar_type lowerCascadeWeight = cascade_layer_scalar_type(1) - higherCascadeWeight;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all this computation should be done in SplattingParameters::scalar_t

Also make the lowerCascadeIndex a uint16_t

Comment on lines +87 to +88
if (cascade > CascadeCount - 1)
lowerCascadeWeight = exp2(log2Start + log2Base * (CascadeCount - 1) - log2Luma);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap the CascadeCount-1 in a _static_cast<SplattingParameters::scalar_t> so we get no conversion/comparison warnings.

Actually declare a const SplattingParameters::scalar_t LastCascade = CascadeCount-1 to use throughout

Comment on lines +45 to +47
using output_storage_type = CascadeEntry;
using initialization_data = SplattingParameters;
output_storage_type accumulation;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the need for extra typedefs here, I get that an alias of CascadeLayerType is needed, but not the rest

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although after you do https://github.com/Devsh-Graphics-Programming/Nabla/pull/959/files#r2589131299 then they may be needed (except for initialization_data alias)

uint32_t cascadeSampleCounter[CascadeCount];
CascadeLayerType data[CascadeCount];

void addSampleIntoCascadeEntry(CascadeLayerType _sample, uint32_t lowerCascadeIndex, float lowerCascadeLevelWeight, float higherCascadeLevelWeight, uint32_t sampleCount)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you want the weights to be same type as SplattingParameters::scalar_t

Comment on lines +19 to +24
struct CascadeEntry
{
uint32_t cascadeSampleCounter[CascadeCount];
CascadeLayerType data[CascadeCount];

void addSampleIntoCascadeEntry(CascadeLayerType _sample, uint32_t lowerCascadeIndex, float lowerCascadeLevelWeight, float higherCascadeLevelWeight, uint32_t sampleCount)
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can pull the CascadeEntry (can rename to DefaultCascades) outside of CascadeAccumulator just to allow it to be templated on both CascadeLayerType and SampleCountType (should default to uint16_t instead of uint32_t)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the CascadeAccumulator would be templated only on the CascadesType

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I told przemek to put it inside before, but now I realize we can mess around with the types of more things

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the rationale for this : #936 (comment)

Comment on lines +56 to +57
retval.accumulation.data[i] = promote<CascadeLayerType, float32_t>(0.0f);
retval.accumulation.cascadeSampleCounter[i] = 0u;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to require the accumulation ( CascadesType::clear(cascadeIx)) has a clear method than poke members directly

Comment on lines +17 to +34
// declare concept
#define NBL_CONCEPT_NAME ResolveAccessorBase
#define NBL_CONCEPT_TPLT_PRM_KINDS (typename)(typename)(int32_t)
#define NBL_CONCEPT_TPLT_PRM_NAMES (T)(VectorScalarType)(Dims)
// not the greatest syntax but works
#define NBL_CONCEPT_PARAM_0 (a,T)
#define NBL_CONCEPT_PARAM_1 (scalar,VectorScalarType)
// start concept
NBL_CONCEPT_BEGIN(2)
// need to be defined AFTER the concept begins
#define a NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_0
#define scalar NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_1
NBL_CONCEPT_END(
((NBL_CONCEPT_REQ_EXPR)((a.calcLuma(vector<VectorScalarType, 3>(scalar, scalar, scalar)))))
);
#undef a
#undef scalar
#include <nbl/builtin/hlsl/concepts/__end.hlsl>
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. this should be a more generic concept because this is what I want SplattingParameters to be checked for when its used with `CascadeAccumulator
  2. it should check that T::scalar_t (which is a FloatingPointScalar) and a templated (on the sample type) calcLuma which does not necessarily enforce that the sample type is a vector<T,3>
  3. don't take a Dims template parameter, you should only have two

using output_type = vector<OutputScalar, 4>;
NBL_CONSTEXPR int32_t image_dimension = 2;

RWTexture2DArray<float32_t4> cascade;
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can't store a texture as a local variable in DXC, it really doesn't like it, they need to be declared by the user on the fly

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this unfortunately always need to be user-space and the texture declaration needs to be a global

Comment on lines +53 to +56
float32_t calcLuma(NBL_REF_ARG(float32_t3) col)
{
return hlsl::dot<float32_t3>(colorspace::scRGB::ToXYZ()[1], col);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok SplattingParameters should inherit from a BaseParameters that defines scalar_t, and calcLuma this stuff obviously need to be shared

Comment on lines +47 to +48
using output_scalar_type = OutputScalar;
using output_type = vector<OutputScalar, 4>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd advise for the output type to already have the correct number of components

Comment on lines +41 to +42
template<typename T, typename VectorScalarType, int32_t Dims>
NBL_BOOL_CONCEPT ResolveAccessor = ResolveAccessorBase<T, VectorScalarType, Dims> && concepts::accessors::LoadableImage<T, VectorScalarType, Dims>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok we definitely need to extend the LoadableImage concept with a final template argument which is the number of components (so they don't always load vector<T,4>)

Comment on lines +72 to +76
template<typename CascadeAccessor, typename OutputColorTypeVec NBL_PRIMARY_REQUIRES(concepts::Vector<OutputColorTypeVec> && ResolveAccessor<CascadeAccessor, typename CascadeAccessor::output_scalar_type, CascadeAccessor::image_dimension>)
struct Resolver
{
using output_type = OutputColorTypeVec;
using scalar_t = typename vector_traits<output_type>::scalar_type;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template only on the CascadeAccessor, require the CascadeAccessor to have an output_t alias

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also template on the cascade count

Comment on lines +24 to +25
ResolveParameters computeResolveParameters(float base, uint32_t sampleCount, float minReliableLuma, float kappa, uint32_t cascadeSize)
{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not cascadeSize, but cascadeCount , also it should really just be a template parameter on the Resolver, the loops really need to be unrolled and its something you can set up at compile time no problem

Finally this should have been static create factory inside the struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants