Flicking problem #334

caxieyou · 2023-06-28T01:38:23Z

I have tested an SVG file, which is not that big, not as big as the CIA map case.

When I load the file and zoom in, the screen flickers and some small parts do not render correctly.

Is this because of a floating-point precision problem?

I'm quite sure there is no clipping in this file.

llsansun · 2023-07-19T05:37:37Z

I found that the problem is related to config.n_drawobj in coarse.wgsl. When it exceeds 65535, could the data before 65535 conflict with the data after 65535 because of workgroup synchronization, resulting in the display problems? Is there a better solution to the 65535 draw object limit? I hope it can be resolved.

raphlinus · 2023-07-19T05:39:14Z

Yes, a limit of 64k draw objects is a known problem, and has a straightforward solution. This issue can serve as the tracking bug for that. Thanks for the analysis!

caxieyou · 2023-07-19T05:47:22Z

Can you tell us which bug or issue tracks this, so we can know when it's fixed or has made progress?
Thanks a lot.

caxieyou · 2023-09-15T10:07:17Z

Has this bug been fixed?

raphlinus · 2023-09-15T15:22:59Z

Not yet. The stroke rework is taking a lot longer than expected, though there is progress. This will be a high priority after that, and is also one of the items tracked in #302.

DorianRudolph · 2023-12-23T16:23:44Z

Is this the same issue?

warning: flashing images

Screen.Recording.2023-12-23.at.17.22.19.mov

raphlinus · 2023-12-23T16:45:39Z

No, that issue is caused by overflow of internal buffers (related to #366), which is in turn provoked by not culling lines and tiles that land outside the viewport. We do plan to work on all that.

raphlinus · 2024-03-18T22:19:45Z

I plan on addressing the 64k draw object problem shortly. There are three approaches that can be taken.

One is to conditionally apply a 3-level dispatch when the (workgroup size)^2 limit is crossed. This is what's done with pathtags, and I find it ugly. Among other things, it requires more permutations of shaders to be compiled, and there's also some complex conditional logic for which shaders to dispatch. I do have a local patch which is almost done, so it is perhaps the path of least resistance.
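To make the (workgroup size)^2 threshold concrete, here is a minimal sketch of how many levels a hierarchical scan needs for a given input size. It assumes a workgroup size of 256 and that each level reduces the problem by a factor of the workgroup size; `scan_levels` is a hypothetical helper for illustration, not the actual pathtag dispatch logic.

```python
def scan_levels(n_items, wg_size=256):
    """Levels needed when each level shrinks the input by a factor of wg_size.

    Hypothetical helper: k levels cover up to wg_size**k items.
    """
    levels = 1
    capacity = wg_size
    while n_items > capacity:
        capacity *= wg_size
        levels += 1
    return levels


# 256**2 = 65536 items still fit in two levels; one more item needs a third.
print(scan_levels(65536), scan_levels(65537))
```

Under these assumptions a third level would raise the ceiling to 256^3 (about 16.7M) items, at the cost of the extra shader permutations and dispatch logic described above.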

The second approach is inspired by a technique I saw in FidelityFX sort, and is implemented in my recent sorting exploration. In that approach, each workgroup iterates over num_blocks_per_wg blocks, where each block is the amount of data currently handled by a single workgroup (256 draw objects). In that way, the size of the sequence is not inherently bounded by workgroup sizes.
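The idea can be sketched on the CPU roughly as follows. This is only an illustration under stated assumptions: plain integer addition stands in for the real draw object monoid, the function names (`plan_dispatch`, `exclusive_scan`) are made up for this sketch, and the loops over workgroups run sequentially here where a GPU would run them in parallel. It is not the shader code from the PR.

```python
from itertools import accumulate

WG_SIZE = 256  # items one workgroup handles per block (assumed, as in the text)


def div_ceil(a, b):
    return (a + b - 1) // b


def plan_dispatch(n_items, wg_size=WG_SIZE):
    """Return (n_workgroups, blocks_per_wg); workgroup count is capped at wg_size."""
    n_blocks = div_ceil(n_items, wg_size)
    n_wg = max(1, min(n_blocks, wg_size))
    return n_wg, div_ceil(n_blocks, n_wg)


def exclusive_scan(data, wg_size=WG_SIZE):
    n_wg, blocks_per_wg = plan_dispatch(len(data), wg_size)
    span = blocks_per_wg * wg_size  # items owned by one workgroup

    # Pass 1 (reduce): one partial sum per workgroup, looping over its blocks.
    partials = []
    for wg in range(n_wg):
        total = 0
        for b in range(blocks_per_wg):
            start = wg * span + b * wg_size
            total += sum(data[start:start + wg_size])
        partials.append(total)

    # There are at most wg_size partials, so one workgroup could scan them.
    bases = [0] + list(accumulate(partials))[:-1]

    # Pass 2 (leaf): each workgroup walks its blocks, carrying a running prefix.
    out = [0] * len(data)
    for wg in range(n_wg):
        running = bases[wg]
        for b in range(blocks_per_wg):
            start = wg * span + b * wg_size
            for i in range(start, min(start + wg_size, len(data))):
                out[i] = running
                running += data[i]
    return out
```

With 100,000 items this plan uses 256 workgroups of 2 blocks each, so the input size is bounded only by how many blocks each workgroup is willing to loop over.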

A drawback to the latter approach is that it may limit the amount of addressable parallelism. Doing a quick calculation, for very large inputs it will dispatch 64k threads, regardless of the size of the input. That is more threads than directly supported by any existing hardware (RTX 4090 has 16k), though it may limit opportunities for latency hiding.
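The quick calculation can be reproduced with a few lines, assuming a workgroup size of 256 and a workgroup count capped at that same value (the helper name `dispatched_threads` is hypothetical):

```python
def dispatched_threads(n_items, wg_size=256):
    """Threads launched when the workgroup count is capped at wg_size."""
    n_blocks = -(-n_items // wg_size)  # ceiling division
    n_workgroups = min(n_blocks, wg_size)
    return n_workgroups * wg_size


# Small inputs scale with size; large inputs saturate at wg_size squared.
for n in (1_000, 65_536, 10_000_000):
    print(n, dispatched_threads(n))
```

Once the input passes 65,536 items the thread count stops growing, and the extra work is absorbed by each workgroup iterating over more blocks.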

An advantage to the latter approach is that it's two fewer dispatches.

As a future potential optimization, we may want to have more permutations (specialization by pipeline override) to (a) allow larger workgroups when the hardware supports it (the WebGPU spec only requires 256, which informs the choices we've made), and (b) support iteration over multiple elements per thread. The former is probably the best way to improve opportunities to exploit parallelism on powerful GPUs (1M threads should be plenty for at least a while) and has no real downside other than wiring up the plumbing. The latter is more of a tradeoff, as it improves bandwidth for large problems but limits parallelism for small ones. To switch between the two adaptively requires potentially compiling both variants (affecting cold-start time including shader compilation) and of course the complexity of the logic.

The third approach is to go back to single pass scan techniques, as was done in piet-gpu. We now know how to do this in WebGPU (see Zulip thread) but the performance implications are mixed; in particular it would be a performance regression on Apple Silicon.

I'm most inclined to go with the second approach, as I think it's the best set of tradeoffs and admits additional optimization that would address the biggest shortcoming. I'll start on a PR, and if that goes well, probably apply the same technique to path tags.

Previously there was a limit of workgroup size squared for the number of draw objects, which is 64k in practice. This PR makes each workgroup iterate multiple blocks if that limit is exceeded, borrowing a technique from FidelityFX sort. WIP, this causes hangs on mac. Uploading to test on other hardware. Also contains some changes for testing that may not want to be committed as is. Fixes #334

* Allow large numbers of draw objects
* Add missing barrier: Add barrier for write-after-read hazard in coarse. The loop in question processes 64k draw objects at a time, so the barrier only gets invoked when that limit is exceeded. Also move new test scene so it isn't the first.
* Address review comments: Set resolution in params for test scene. Add comments explaining division of work.

NyxAlexandra · 2024-04-03T11:50:24Z

Should the README be changed now that this is closed?

DJMcNab · 2024-04-03T12:02:09Z

Thanks for the reminder!

We intend to go through the list of issues in the README before publishing version 0.2.0, but a PR to remove the outdated items now would be welcome.

NyxAlexandra · 2024-04-03T14:11:40Z

See #543

raphlinus mentioned this issue Mar 19, 2024

Allow large numbers of draw objects #526

Merged

raphlinus closed this as completed in #526 Mar 20, 2024

This issue was closed.
