Further sectioned results performance improvements #6646

tgoyne · 2023-05-17T23:09:36Z

(But actually just performance improvements to the underlying Table and TableView operations, since that's where all the time is spent.)

This eliminates TableRef checks in a pile of places where the TableRef is guaranteed to be valid, generally because we get to the place via a call on a Table. TableRef checking isn't very expensive, but it adds up to a few percent of the runtime.

Checking if a TableView is up to date previously involved memory allocations, and it's an operation we do a lot. It now reuses a buffer between calls. Frozen table views now skip the check entirely as they're always up to date if they've ever been evaluated.
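A minimal sketch of the buffer-reuse idea, not the actual realm-core code. `FakeTable`, `ViewSketch`, and `TableVersions` are hypothetical stand-ins, and the real dependency tracking is assumed to be more involved. The point is only that clearing a member vector keeps its capacity, so repeated sync checks stop allocating, and that a frozen view can return early.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not realm-core types.
using TableVersions = std::vector<std::pair<std::uint32_t, std::uint64_t>>; // (table id, content version)

struct FakeTable {
    std::uint32_t id;
    std::uint64_t version;
};

class ViewSketch {
public:
    explicit ViewSketch(std::vector<const FakeTable*> deps, bool frozen = false)
        : m_deps(std::move(deps)), m_frozen(frozen) {}

    // Record the versions the view was evaluated against.
    void sync() {
        collect_versions(m_last_seen);
        m_evaluated = true;
    }

    bool is_in_sync() const {
        // Frozen data cannot change, so an evaluated frozen view is always current.
        if (m_frozen)
            return m_evaluated;
        // Reuse the scratch buffer instead of building a fresh vector per call.
        collect_versions(m_scratch);
        return m_evaluated && m_scratch == m_last_seen;
    }

private:
    void collect_versions(TableVersions& out) const {
        out.clear(); // drops the elements but keeps the allocated capacity
        for (const FakeTable* t : m_deps)
            out.emplace_back(t->id, t->version);
    }

    std::vector<const FakeTable*> m_deps;
    bool m_frozen;
    bool m_evaluated = false;
    TableVersions m_last_seen;
    mutable TableVersions m_scratch; // reused between is_in_sync() calls
};
```

After the first call the scratch vector has enough capacity for every dependency, so later checks only overwrite existing storage.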

Sorting a TableView previously handled detached ObjKeys, which could result from "snapshot" TableViews that we didn't re-evaluate. We never actually sort those after the fact, so we can just skip that handling and save some time.

Looking up an ObjKey in a Table is actually a kinda slow operation and was a meaningful portion of the runtime of sorting TableViews. The newly introduced ClusterTree::LeafCache optimizes the case where consecutive lookups land on the same leaf, which happens fairly often in cache_first_column() because it reads matching rows in table order. It is performance-neutral for sparse queries which never have multiple rows on the same leaf.

This dramatically speeds up sorting TableViews with fairly dense sets of rows, and is performance-neutral for sparse sets with at most one row per leaf used.
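The leaf-caching idea can be sketched with a toy structure. This is an illustration under assumptions, not ClusterTree::LeafCache itself: `Leaf`, `ToyTree`, and `LeafCacheSketch` are hypothetical names, and the "tree" is just a sorted vector of leaves with a counter standing in for the cost of a traversal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in for a cluster tree leaf: a sorted, non-empty run of keys.
struct Leaf {
    std::vector<std::int64_t> keys; // sorted ascending
    std::vector<int> values;        // values[i] belongs to keys[i]
};

class ToyTree {
public:
    explicit ToyTree(std::vector<Leaf> leaves) : m_leaves(std::move(leaves)) {}

    // The "slow" step: find the last leaf whose first key is <= key.
    const Leaf* find_leaf(std::int64_t key) const {
        ++traversals;
        auto it = std::upper_bound(
            m_leaves.begin(), m_leaves.end(), key,
            [](std::int64_t k, const Leaf& l) { return k < l.keys.front(); });
        if (it == m_leaves.begin())
            return nullptr;
        return &*(it - 1);
    }

    mutable std::size_t traversals = 0; // counts simulated tree traversals

private:
    std::vector<Leaf> m_leaves;
};

class LeafCacheSketch {
public:
    explicit LeafCacheSketch(const ToyTree& tree) : m_tree(tree) {}

    // Returns a pointer to the value stored for key, or nullptr if absent.
    const int* get(std::int64_t key) {
        // Only traverse the tree when the key is outside the cached leaf's range.
        if (!m_leaf || key < m_leaf->keys.front() || key > m_leaf->keys.back()) {
            m_leaf = m_tree.find_leaf(key);
            if (!m_leaf)
                return nullptr;
        }
        auto it = std::lower_bound(m_leaf->keys.begin(), m_leaf->keys.end(), key);
        if (it == m_leaf->keys.end() || *it != key)
            return nullptr;
        return &m_leaf->values[static_cast<std::size_t>(it - m_leaf->keys.begin())];
    }

private:
    const ToyTree& m_tree;
    const Leaf* m_leaf = nullptr; // most recently used leaf
};
```

With keys visited in table order, a dense result set hits the cached leaf repeatedly and pays for one traversal per leaf. A sparse set with one row per leaf pays one traversal per lookup, the same as without the cache, plus a cheap range check.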

Passing the Results to the SDK rather than Mixed allows it to use `.get<Obj>()` where applicable, which eliminates a redundant lookup by obj key when sectioning a table.

Support for sorting TableViews which aren't up to date was never actually used (Results().snapshot().sort() produces a live Results) and had a significant performance impact.

tgoyne self-assigned this May 17, 2023

cla-bot added the cla: yes label May 17, 2023

tgoyne added 6 commits May 24, 2023 11:17

Assume the table ref is valid when constructing an Obj

7a2c820

Improve performance of checking if a TableView is in sync

78ce9ca

Cache the leaf lookups when reading the first column for sorting

4c87c88

This dramatically speeds up sorting TableViews with fairly dense sets of rows, and is performance-neutral for sparse sets with at most one row per leaf used.

Adjust the sectioning API to enable more efficient SDK implementations

415ed10

Passing the Results to the SDK rather than Mixed allows it to use `.get<Obj>()` where applicable, which eliminates a redundant lookup by obj key when sectioning a table.

Remove redundant const/non-const versions of Table find functions

6b4f275

Remove support for sorting TableViews which aren't up to date

3467913

This was never actually used (Results().snapshot().sort() produces a live Results) and had a significant performance impact.

tgoyne force-pushed the tg/sectioned-perf-2 branch from fa26a82 to 3467913 on May 24, 2023 18:18
