Replace dict with new hashtable: sorted set datatype #1427

Merged
zuiderkwast merged 6 commits into valkey-io:unstable from rainsupreme:zset-datatype
Jan 8, 2025

@rainsupreme
Contributor

@rainsupreme rainsupreme commented Dec 11, 2024

This PR replaces dict with hashtable in the ZSET datatype. Instead of mapping key to score as dict did, the hashtable maps key to a node in the skiplist, which contains the score. This takes advantage of hashtable performance improvements and saves 15 bytes per set item - 24 bytes overhead before, 9 bytes after.

Closes #1096
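
The layout change described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Valkey's code: `toyNode`, `toyIndex`, `toyIndexAdd`, `toyIndexFind`, and `toyGetScore` are hypothetical names, and the fixed-size linear-probing table merely stands in for the real hashtable. The point it shows is that an index entry holds nothing but a pointer to the skiplist node, and the score is read through that node. It does not reproduce the byte accounting; per the description, the per-item overhead goes from 24 bytes to 9 bytes, which is the quoted 15-byte saving.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 8 /* fixed capacity, enough for a demonstration */

/* Simplified skiplist node: the score lives here, next to the element.
 * (Hypothetical type; the real node also has levels and a backward link.) */
typedef struct toyNode {
    double score;
    const char *ele;
} toyNode;

/* Index over node pointers. No separate key or score is stored per entry. */
typedef struct toyIndex {
    toyNode *slots[TOY_SLOTS];
} toyIndex;

/* djb2-style string hash reduced to a slot number. */
static size_t toyHash(const char *s) {
    size_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % TOY_SLOTS;
}

/* Insert a node pointer using linear probing. Returns 0 if the table is
 * full. Duplicate elements are not checked in this sketch. */
static int toyIndexAdd(toyIndex *ix, toyNode *node) {
    assert(node != NULL && node->ele != NULL);
    size_t start = toyHash(node->ele);
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        size_t pos = (start + i) % TOY_SLOTS;
        if (ix->slots[pos] == NULL) {
            ix->slots[pos] = node;
            return 1;
        }
    }
    return 0;
}

/* Find the node for an element, or NULL if it is absent. */
static toyNode *toyIndexFind(const toyIndex *ix, const char *ele) {
    size_t start = toyHash(ele);
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        toyNode *n = ix->slots[(start + i) % TOY_SLOTS];
        if (n == NULL) return NULL;
        if (strcmp(n->ele, ele) == 0) return n;
    }
    return NULL;
}

/* Score lookup goes through the node instead of a stored score value. */
static int toyGetScore(const toyIndex *ix, const char *ele, double *score) {
    toyNode *n = toyIndexFind(ix, ele);
    if (n == NULL) return 0;
    *score = n->score;
    return 1;
}
```

Because the index and the ordered structure share one node, changing `node->score` is immediately visible to lookups by element; there is no second copy of the score to keep in sync.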

@rainsupreme rainsupreme marked this pull request as draft December 11, 2024 17:24
@rainsupreme rainsupreme force-pushed the zset-datatype branch 2 times, most recently from d2d854d to 6a8ee44 Compare December 17, 2024 22:49
@zuiderkwast
Contributor

> This PR is ready for review! Needed changes have been merged and I've rebased

Great! You could mark it as not draft then. :) We use the top comment and PR title for the final commit message when it gets merged, so you could update those to concisely describe the change (i.e. like a commit message).

It's late in my time zone so I'll look tomorrow.

@rainsupreme rainsupreme marked this pull request as ready for review December 18, 2024 00:36
@rainsupreme rainsupreme changed the title [draft] replace dict with hashtable: ZSET datatype replace dict with hashtable: ZSET datatype Dec 18, 2024
@enjoy-binbin enjoy-binbin added release-notes This issue should get a line item in the release notes run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) labels Dec 18, 2024
@codecov

codecov Bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 92.55663% with 23 lines in your changes missing coverage. Please review.

Project coverage is 70.75%. Comparing base (b3b4bdc) to head (8316c90).
Report is 11 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/module.c | 0.00% | 10 Missing ⚠️ |
| src/defrag.c | 84.37% | 5 Missing ⚠️ |
| src/object.c | 37.50% | 5 Missing ⚠️ |
| src/db.c | 89.47% | 2 Missing ⚠️ |
| src/debug.c | 92.30% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1427      +/-   ##
============================================
- Coverage     70.83%   70.75%   -0.09%     
============================================
  Files           120      120              
  Lines         64911    64959      +48     
============================================
- Hits          45982    45962      -20     
- Misses        18929    18997      +68     

| Files with missing lines | Coverage Δ |
|---|---|
| src/aof.c | 80.23% <100.00%> (+0.11%) ⬆️ |
| src/evict.c | 98.47% <100.00%> (-0.38%) ⬇️ |
| src/geo.c | 93.58% <100.00%> (+0.02%) ⬆️ |
| src/rdb.c | 76.75% <100.00%> (+0.67%) ⬆️ |
| src/server.c | 87.61% <100.00%> (+0.14%) ⬆️ |
| src/server.h | 100.00% <ø> (ø) |
| src/sort.c | 94.82% <100.00%> (-0.34%) ⬇️ |
| src/t_zset.c | 96.80% <100.00%> (+1.13%) ⬆️ |
| src/debug.c | 52.12% <92.30%> (+0.13%) ⬆️ |
| src/db.c | 89.48% <89.47%> (-0.07%) ⬇️ |

... and 3 more

... and 37 files with indirect coverage changes

Contributor

@zuiderkwast zuiderkwast left a comment

This looks awesome. Just a few nits.

Comment thread src/db.c Outdated
Comment thread src/t_zset.c Outdated
Comment thread src/t_zset.c Outdated
Comment thread src/t_zset.c Outdated
Comment thread src/defrag.c
Comment thread src/server.c Outdated
Comment thread src/t_zset.c Outdated
Member

@ranshid ranshid left a comment

Great work, @SoftlyRaining!

Some minor comments, and I feel somewhat uncomfortable about the way we implemented the union.

Comment thread src/db.c Outdated
Comment thread src/defrag.c Outdated
Comment thread src/t_zset.c Outdated
@zuiderkwast zuiderkwast changed the title replace dict with hashtable: ZSET datatype Replace dict with new hashtable: sorted set datatype Dec 21, 2024
Comment thread src/t_zset.c Outdated
Comment thread src/t_zset.c Outdated
Contributor

@zuiderkwast zuiderkwast left a comment

Just fix the naming convention (compareNodeScoreEle -> zslCompareNodeScoreEle, etc.). Then I think it's good to merge.

I don't really get why you want to delete zslGetRank. If it's an optimization beyond the replacement of dict, then it can as well be a follow up, right? As you want.

Comment thread src/defrag.c Outdated
@rainsupreme
Contributor Author

I poked around and found that all uses of zslGetRank could more efficiently use zslGetRankByNode, and it became dead code. I already made the revision at any rate but you're right, it might've been a separate small cleanup PR 😅

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rsg000@gmail.com>
…etRank

Signed-off-by: Rain Valentine <rsg000@gmail.com>
@ranshid
Member

ranshid commented Jan 5, 2025

> I poked around and found that all uses of zslGetRank could more efficiently use zslGetRankByNode, and it became dead code. I already made the revision at any rate but you're right, it might've been a separate small cleanup PR 😅

@SoftlyRaining My only concern is a potential small degradation in the performance of zcount and zlexcount. Can we just make sure to verify we have no impact on these operations?

@rainsupreme
Contributor Author

> > I poked around and found that all uses of zslGetRank could more efficiently use zslGetRankByNode, and it became dead code. I already made the revision at any rate but you're right, it might've been a separate small cleanup PR 😅
>
> @SoftlyRaining My only concern is a potential small degradation in the performance of zcount and zlexcount. Can we just make sure to verify we have no impact on these operations?

I will remove that aspect of the change so it can be more thoroughly investigated and benchmarked as a separate PR. I'd prefer to avoid delaying the core hashtable work. :)

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Contributor

@zuiderkwast zuiderkwast left a comment

LGTM

@ranshid The zslGetRank changes have been removed, so I guess it's safe to merge. WDYT?

Member

@ranshid ranshid left a comment

Small comment which I think we could skip handling for now and maybe only extend the comment.
I did not rescan the entire change so LGTM

Comment thread src/t_zset.c
Comment on lines +306 to +310
    if ((node->backward == NULL || node->backward->score < newscore) &&
        (node->level[0].forward == NULL || node->level[0].forward->score > newscore)) {
        node->score = newscore;
        return NULL;
    }
Member

Small remark: it is still possible that a node position change will NOT take place after this check, for example when we update the score to a value that exactly matches the score of the prev or next node.
The check can be extended to check also equality of the score (but will also need to compare the key order).
I guess this is fine for now, but maybe extend the comment above to explain that?

Contributor

Right, the edge cases. Now, we remove and re-insert the node at the same position in this case, which is OK. It's not introduced in this PR anyway. We can convert the comment to a follow-up issue.

We could use zslCompareNodes here, but then we'd need to set the new score before we compare and then revert it if the check fails. Or use a stack-allocated temporary node for comparing, just to be able to use zslCompareNodes.

Suggested change
    - if ((node->backward == NULL || node->backward->score < newscore) &&
    -     (node->level[0].forward == NULL || node->level[0].forward->score > newscore)) {
    -     node->score = newscore;
    -     return NULL;
    - }
    + double oldscore = node->score;
    + node->score = newscore;
    + if ((node->backward == NULL || zslCompareNodes(node->backward, node) <= 0) &&
    +     zslCompareNodes(node->level[0].forward, node) >= 0) {
    +     return NULL;
    + } else {
    +     /* Restore score to restore skiplist order. */
    +     node->score = oldscore;
    + }
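
The alternative mentioned in this thread, comparing against a stack-allocated temporary node so that ties on score fall back to element order, might look roughly like the sketch below. It is a standalone toy, not the Valkey implementation: `toyNode`, `toyCompareNodes`, and `toyUpdateScoreInPlace` are hypothetical names, a plain doubly linked list stands in for level 0 of the skiplist, and elements are assumed to be unique NUL-terminated strings compared with `strcmp`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a skiplist node, reduced to its level-0 links. */
typedef struct toyNode {
    double score;
    const char *ele;
    struct toyNode *backward;
    struct toyNode *forward;
} toyNode;

/* Order by score first, then by element. Returns <0, 0 or >0. */
static int toyCompareNodes(const toyNode *a, const toyNode *b) {
    assert(a != NULL && b != NULL);
    if (a->score < b->score) return -1;
    if (a->score > b->score) return 1;
    return strcmp(a->ele, b->ele);
}

/* If the node would keep its position with the new score, update the score
 * in place and return 1. Otherwise leave the node untouched and return 0,
 * so the caller can remove and re-insert it. */
static int toyUpdateScoreInPlace(toyNode *node, double newscore) {
    /* Temporary node on the stack: same element, candidate score. The real
     * node is not modified until the check has passed, so nothing needs to
     * be reverted on failure. */
    toyNode probe = {newscore, node->ele, NULL, NULL};
    if ((node->backward == NULL || toyCompareNodes(node->backward, &probe) < 0) &&
        (node->forward == NULL || toyCompareNodes(node->forward, &probe) > 0)) {
        node->score = newscore;
        return 1;
    }
    return 0;
}
```

With unique elements a neighbor never compares equal to the probe, so strict comparisons are enough here. A new score equal to a neighbor's score is accepted in place only when the element order still agrees with the node's current position.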

Comment thread src/t_zset.c Outdated
about edge cases: allowing score to be equal to prev or next node and also compare ele in these cases.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
@zuiderkwast zuiderkwast merged commit ab627d6 into valkey-io:unstable Jan 8, 2025
proost pushed a commit to proost/valkey that referenced this pull request Jan 17, 2025
This PR replaces dict with hashtable in the ZSET datatype. Instead of
mapping key to score as dict did, the hashtable maps key to a node in
the skiplist, which contains the score. This takes advantage of
hashtable performance improvements and saves 15 bytes per set item - 24
bytes overhead before, 9 bytes after.

Closes valkey-io#1096

---------

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: proost <jwalag87@gmail.com>
kronwerk pushed a commit to kronwerk/valkey that referenced this pull request Jan 27, 2025
@rainsupreme rainsupreme deleted the zset-datatype branch January 31, 2025 23:33
moticless added a commit to redis/redis that referenced this pull request Jan 18, 2026
* Embed sds element inside skiplist nodes: Changed zset dict to store
zskiplistNode* as keys (with no_value=1) instead of storing sds keys and
double* values, eliminating redundant sds storage and enabling
single-allocation nodes
* Single allocation for skiplist nodes: Each node now contains: fixed
fields + level[] array + embedded sds, reducing memory fragmentation and
allocation overhead. This optimization is based on valkey-io/valkey#1427
* Optimize lookups with dictFindLink: Use dictFindLink in zsetAdd to
avoid double hash table lookup when inserting new elements (find + add
becomes single operation)
* Simplify score updates
Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Use new hashtable for sorted sets
