GPU Path Tracing: Mobile device triangle intersection precision issue #350

Open
gkjohnson opened this issue Oct 22, 2021 · 7 comments

@gkjohnson
Owner

The gpu path tracing demo exhibits a lot of artifacts on Pixel 3. Increasing the ray origin offset by the geometry normal improves the rendering but it's still a problem.

The next step is to check whether the reported intersection point actually sits on the surface of the triangle and whether that's causing the subsequent path tracing issues.

@gkjohnson gkjohnson added this to the v0.5.1 milestone Oct 22, 2021
@gkjohnson
Owner Author

gkjohnson commented Oct 22, 2021

By computing the distance between the retrieved point and the triangle surface using the following snippet we can see that the point is relatively far off the surface in either direction (red is on the positive side of the plane, green is negative). The pattern of red on the inner curve of the torus is interesting:

vec3 geomNorm = hit.face.normal;
vec3 a = texelFetch1D( bvh.position, hit.face.a ).rgb;
vec3 b = texelFetch1D( bvh.position, hit.face.b ).rgb;

vec3 apVec = hit.point - a;
float dist = dot( apVec, geomNorm );

dist *= 1000.0;
gl_FragColor = vec4( dist, - dist, 0.0, 1.0 );
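For reference, the same measurement and color mapping can be sketched on the CPU. This is a minimal JavaScript sketch in double precision, not the shader itself; the function names `signedDistance` and `distanceToColor` are made up for illustration, and the clamp to [0, 1] is assumed to match what happens when the value is displayed.

```javascript
// Signed distance from point p to the plane through a with unit normal n.
// Positive means p is on the side the normal points toward.
function signedDistance(p, a, n) {
  const ap = [p[0] - a[0], p[1] - a[1], p[2] - a[2]];
  return ap[0] * n[0] + ap[1] * n[1] + ap[2] * n[2];
}

// Mirrors `vec4( dist, - dist, 0.0, 1.0 )` after the 1000x scale.
// Assumption: channels are clamped to [0, 1] on display.
function distanceToColor(dist) {
  const clamp01 = (v) => Math.min(1, Math.max(0, v));
  const scaled = dist * 1000.0;
  return [clamp01(scaled), clamp01(-scaled), 0, 1];
}

// Point above the plane z = 0: fully red.
console.log(distanceToColor(signedDistance([1, 2, 0.5], [0, 0, 0], [0, 0, 1])));
// Point below the plane: fully green.
console.log(distanceToColor(signedDistance([1, 2, -0.5], [0, 0, 0], [0, 0, 1])));
```

A point exactly on the plane maps to black, which matches what is seen on desktop.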

Distances Visualized

These images are black on a desktop machine.

Adjusting Point To Surface

Adjusting the point to the surface and remeasuring the distances yields the following. It doesn't look a whole lot better. Some sides flip, but overall it looks very similar.

hit.point -= geomNorm * dist;
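One way to see why the projection can leave a nonzero distance is to emulate 32-bit floats on the CPU. The sketch below rounds to float32 after every operation using `Math.fround`; the helper names are hypothetical and it only approximates what a GPU does, since mobile hardware may use even lower precision.

```javascript
// Emulate 32-bit float arithmetic by rounding to float32 after every operation.
function sub32(u, v) {
  return u.map((x, i) => Math.fround(Math.fround(x) - Math.fround(v[i])));
}

function dot32(u, v) {
  let sum = 0;
  for (let i = 0; i < 3; i++) {
    sum = Math.fround(sum + Math.fround(u[i] * v[i]));
  }
  return sum;
}

function normalize32(v) {
  const len = Math.fround(Math.sqrt(dot32(v, v)));
  return v.map((x) => Math.fround(x / len));
}

// Same operation as `hit.point -= geomNorm * dist`, in emulated float32.
function projectToPlane32(p, a, n) {
  const dist = dot32(sub32(p, a), n);
  return p.map((x, i) => Math.fround(Math.fround(x) - Math.fround(n[i] * dist)));
}

// Signed distance that remains after the projection.
function residual32(p, a, n) {
  return dot32(sub32(projectToPlane32(p, a, n), a), n);
}

// For a general normal the residual is tiny but is not guaranteed to be zero.
console.log(residual32([0.3, 0.7, 0.2], [0.1, 0.1, 0.1], normalize32([1, 2, 3])));
```

With an axis-aligned normal and exactly representable coordinates the residual is zero; rounding only shows up once the normal and the point have components that cannot be stored exactly.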

Larger Triangles

Adjusting the geometry to use larger triangles doesn't change the coloring pattern or fix the issue. The red or positive distances seem to be more prominent on triangles that are facing the origin.

This can be shown by creating a sphere and shifting it along the X axis, which causes the part of the surface whose normals point toward the origin to turn red (it's green when fully centered):

TODO

  • Since the error can't be completely eliminated, shift the point onto the positive side of the geometry instead
  • Just adjusting the negative side by the negative distance (or even a multiple of 2, 3, or 4) doesn't always yield a new point on the other side of the triangle due to floating point error. This is particularly noticeable with normals that face almost exactly away from the origin (see sphere). It's possible these adjustments scale some dimensions of the normal down so far that they're trivially small relative to the computation, meaning they're effectively a no-op. We should enforce a minimum value for each dimension of the normal vector when adjusting. Or perhaps a max? What if a dimension is 0?
  • We can't know the bit resolution for a float on a particular device but perhaps we can tune a mobile epsilon for this worst case?
  • In testing, the point was adjusted along the normal to sit on the surface, but it may be best if it's adjusted along the ray to be on the correct side.
  • Swap edges used to derive normal to see if surface marks go away on dragon model
  • Lambert blemishes are actually caused by incorrect normal attribute sampling? Even when sampling a coord directly via vertex attr it has issues. Maybe try a power of 2 texture? Or a non floating point texture? Is the index incorrect? how?
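The "effectively a no-op" case in the list above can be reproduced with emulated float32 arithmetic. The following is a rough JavaScript sketch using `Math.fround`. `nudgeFixed` and `nudgeScaled` are hypothetical names, and the magnitude-scaled offset is only one possible approach, not the fix used in this project.

```javascript
// Add an offset to a coordinate, rounding to float32 like a highp GPU float.
function add32(coord, offset) {
  return Math.fround(Math.fround(coord) + Math.fround(offset));
}

// Fixed-size nudge: the same absolute epsilon regardless of coordinate size.
function nudgeFixed(coord, normalComponent, eps) {
  return add32(coord, normalComponent * eps);
}

// Hypothetical alternative: scale the nudge by the coordinate's magnitude so
// it stays above the float32 spacing (about |coord| * 2^-23) at that size.
function nudgeScaled(coord, normalComponent, relEps) {
  const scale = Math.max(1, Math.abs(coord));
  return add32(coord, normalComponent * relEps * scale);
}

console.log(nudgeFixed(1, 1, 1e-5) > 1);                       // true: offset survives near 1
console.log(nudgeFixed(1000, 1, 1e-5) === 1000);               // true: offset is lost at 1000
console.log(nudgeScaled(1000, 1, Math.pow(2, -20)) > 1000);    // true: scaled offset survives
console.log(nudgeScaled(1000, 1e-6, Math.pow(2, -20)) === 1000); // true: tiny normal component is lost
```

The last line shows the case raised above: when one component of the normal is very small, the adjustment along that axis vanishes even with a scaled epsilon.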

@gkjohnson
Owner Author

Adding the following line for mobile seems to ensure the point is on the right side of the surface. It's possible the error could be improved further but this is better than before:

@gkjohnson
Owner Author

A separate issue is that smooth normals seem to not be sampled correctly:

Here's an image of the normals:

geometry normals smooth normals

For some reason, when sampling the smooth normals texture, some components seem to be zeroed out or negated:

TODO

  • Sample the normal texture directly to see if values are corrupted compared to desktop -- perhaps it's only happening with texelFetch?
  • Try texture sampling vs texelFetch

@gkjohnson
Owner Author

Looking at the texture displayed raw using texelFetch and texture it doesn't look like any values are zeroed out (displayed with absolute value):

TODO

  • Next assumption is that the modulus and / or int / uint conversions for sampling are causing issues. Try with constant / uniform texture resolution vectors, with int width and modulus instead of uint, or ultra wide textures.
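For context, the kind of index math under suspicion looks roughly like the following. This is a JavaScript sketch that assumes the flat element index is mapped to a 2D texel with a modulus and an integer division; the thread does not show the actual `texelFetch1D` implementation, so `indexToTexel` and `texelToIndex` are illustrative names only.

```javascript
// Map a flat element index to 2D texel coordinates in a texture that is
// `width` texels wide (assumed row-major layout).
function indexToTexel(index, width) {
  return { x: index % width, y: Math.floor(index / width) };
}

// Inverse mapping, useful for checking that the round trip is lossless.
function texelToIndex(texel, width) {
  return texel.y * width + texel.x;
}

console.log(indexToTexel(5000, 1024));                     // { x: 904, y: 4 }
console.log(texelToIndex(indexToTexel(5000, 1024), 1024)); // 5000
```

If either the index or the width loses bits before this math runs, the resulting texel is wrong even though the texture contents are intact, which would be consistent with the raw texture looking fine.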

@gkjohnson
Owner Author

gkjohnson commented Oct 24, 2021

And a comparison between desktop and mobile images (captured as screenshots, so the mobile one has compression applied):

Desktop Mobile
Normal TexelFetch image image
Index image image

NOTE: When displaying the triangle indices reported from the hit, they differ between desktop and mobile. Try int textures? Compare final textures?

Desktop Mobile
"A" tri index image image

NOTES / TODO

  • Displaying the triangle indices between mobile and desktop directly resulted in different colors (some mobile triangles rendered as black for some reason)
  • Switching "index" to use an int rather than a uint buffer resulted in some triangle indices just seeming to be 0 or negative?
  • Add utility for saving out texture as image to compare (save to device, upload)
  • Ensure the bounding boxes and therefore triangle order are identical on desktop and mobile
  • Add a utility for computing machine epsilon in a shader:
float eps = 1.0;
while ( 1.0 + ( eps / 2.0 ) > 1.0 ) eps = eps / 2.0;
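As a CPU-side reference for what a shader version should report, the same loop can be run in JavaScript with an explicit rounding step. This is a sketch only: `machineEpsilon` is a made-up helper, `Math.fround` stands in for a 32-bit GPU float, and a device with lower precision would report a larger value.

```javascript
// Halve eps until adding half of it to 1 no longer changes the result.
// `round` emulates the storage precision: Math.fround for float32,
// or the identity function for JavaScript's native float64.
function machineEpsilon(round) {
  let eps = 1;
  while (round(1 + round(eps / 2)) > 1) {
    eps = round(eps / 2);
  }
  return eps;
}

console.log(machineEpsilon(Math.fround)); // 1.1920928955078125e-7 (2^-23)
console.log(machineEpsilon((x) => x));    // 2.220446049250313e-16 (2^-52)
```

A shader result noticeably larger than 2^-23 would suggest the device is not computing at full 32-bit float precision.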

@gkjohnson
Owner Author

It turns out that when using structs in a fragment shader, the precision of values is truncated, resulting in incorrect indices and values during the sample phase. Repro shader:

#define STRUCT_TEST

struct TestStruct {
	int val;
};

void main() {
	#ifdef STRUCT_TEST

		// displays black
		TestStruct str;
		str.val = 1 << 20;
		gl_FragColor = vec4( str.val, str.val, str.val, 1.0 );

	#else

		// displays white
		int val = 1 << 20;
		gl_FragColor = vec4( val, val, val, 1.0 );

	#endif
}

Who knows how many devices are affected by this. According to this page the numbers are otherwise high precision values when not using structs. I think the only solution would be to use out variables for all hit results which is not at all ergonomic.
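To illustrate how a truncated integer would produce the black output in the repro, here is a small JavaScript sketch. It assumes the struct member ends up stored as a 16-bit signed integer, which is only a guess; the thread establishes that precision is lost but not the exact bit width. `wrapToSignedBits` and `toGray` are hypothetical helpers.

```javascript
// Keep only the low `bits` bits of a 32-bit integer and reinterpret them as
// a signed two's-complement value. Assumption: this models the truncation.
function wrapToSignedBits(value, bits) {
  const shift = 32 - bits;
  return (value << shift) >> shift;
}

// Color channels are clamped to [0, 1], so any value >= 1 displays as white.
function toGray(value) {
  return Math.min(1, Math.max(0, value));
}

console.log(toGray(1 << 20));                       // 1 (white, full precision)
console.log(toGray(wrapToSignedBits(1 << 20, 16))); // 0 (black, low 16 bits are all zero)
console.log(wrapToSignedBits(70000, 16));           // 4464 (a wrong but plausible-looking index)
console.log(wrapToSignedBits(40000, 16));           // -25536 (a negative index)
```

Under this assumption, large triangle indices would silently alias to small, zero, or negative ones, which is in line with the incorrect indices seen earlier in the thread.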

@gkjohnson
Owner Author

gkjohnson commented Oct 28, 2021

On a number of Adreno chips the precision seems to be lower only when using structs:

image

https://gkjohnson.github.io/webgl-precision/

Related to KhronosGroup/WebGL#3351
