[kv] Support index lookup for primary key table #222

swuferhong · 2024-12-18T13:52:50Z

Purpose

Linked issue: #65

Index lookup exposes lookup capabilities built on top of secondary indexes. A secondary index lets the required rows be located quickly, which can be combined with Flink to implement delta joins.
This PR provides index lookup for kv tables. The approach is to define the key of the kv storage as "secondary keys + primary key" and to set the bucket key to the secondary keys. A lookup by secondary keys can then identify the corresponding bucket and server directly, which gives efficient point queries.
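The key layout described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class `IndexKeySketch`, the methods `storageKey` and `bucketFor`, and the use of `List.hashCode()` as the bucketing hash are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and the hash function are assumptions, not Fluss APIs.
class IndexKeySketch {

    // The stored key is "secondary keys + primary key", in that order.
    static List<Object> storageKey(List<?> secondaryKeys, List<?> primaryKey) {
        List<Object> key = new ArrayList<Object>(secondaryKeys);
        key.addAll(primaryKey);
        return key;
    }

    // The bucket depends only on the secondary keys, so a lookup that knows
    // just the secondary keys can still find the right bucket.
    static int bucketFor(List<?> secondaryKeys, int numBuckets) {
        return Math.floorMod(secondaryKeys.hashCode(), numBuckets);
    }

    public static void main(String[] args) {
        List<Object> secondary = Arrays.<Object>asList("u1", 42);

        // Two rows that share secondary keys but differ in primary key.
        List<Object> rowA = storageKey(secondary, Arrays.<Object>asList(1001L));
        List<Object> rowB = storageKey(secondary, Arrays.<Object>asList(1002L));

        boolean sameBucket = bucketFor(secondary, 8) == bucketFor(secondary, 8);
        System.out.println(rowA + " and " + rowB + " share a bucket: " + sameBucket);
    }
}
```

Because the bucket is derived from the secondary keys alone, all rows that match a given secondary key land in one bucket, and the stored keys for those rows share a common prefix inside that bucket.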

Tests

API and Format

Documentation

wuchong

I think our current index is not a general index; it is just an index on a prefix of the primary key. So it is really a prefix scan/lookup on a prefix of the primary key (the prefix should include the bucket key). I don't want to call this indexLookup, because that would occupy the API name for a possible future index (an index on arbitrary columns).

How about changing the API to prefixLookup? The key parameter should be a prefix of the primary key and must include the bucket key. For DDL, we don't need to introduce a new option table.index.keys; we can just continue to use bucket.key.

Since we don't enforce that the bucket key is a prefix of the primary key, we have to add some best practices for Delta Join cases in future documentation. For tables used in DeltaJoin queries, the best practice is to put the bucket key columns before the other columns in the primary key definition. Otherwise, prefixLookup doesn't work when the key parameter contains only the bucket key. For example, take a primary key table orders with schema user_id, item_id, order_id, col1, col2, col3 (order_id alone could serve as the primary key because it is unique). If the join key is (user_id, item_id), the primary key of the table must be set to (user_id, item_id, order_id) and the bucket key to (user_id, item_id). prefixLookup will not work if the primary key is set to (order_id, user_id, item_id), because the join key is not a prefix of the primary key.
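The rule in the orders example can be expressed as a small check. This is only a sketch of the constraint as stated in this comment; `PrefixLookupCheck`, `isPrefix`, and `canPrefixLookup` are hypothetical names, not part of the Fluss client API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper illustrating the prefixLookup constraint; not a Fluss API.
class PrefixLookupCheck {

    // True if lookupKey equals the first lookupKey.size() columns of primaryKey.
    static boolean isPrefix(List<String> lookupKey, List<String> primaryKey) {
        return lookupKey.size() <= primaryKey.size()
                && primaryKey.subList(0, lookupKey.size()).equals(lookupKey);
    }

    // The lookup key must be a prefix of the primary key and include every bucket key column.
    static boolean canPrefixLookup(
            List<String> lookupKey, List<String> primaryKey, List<String> bucketKey) {
        return isPrefix(lookupKey, primaryKey)
                && new HashSet<String>(lookupKey).containsAll(bucketKey);
    }

    public static void main(String[] args) {
        List<String> joinKey = Arrays.asList("user_id", "item_id");
        List<String> bucketKey = Arrays.asList("user_id", "item_id");

        // Bucket key columns come first in the primary key: lookup works.
        List<String> goodPk = Arrays.asList("user_id", "item_id", "order_id");
        System.out.println(canPrefixLookup(joinKey, goodPk, bucketKey)); // prints true

        // order_id comes first, so the join key is not a prefix: lookup fails.
        List<String> badPk = Arrays.asList("order_id", "user_id", "item_id");
        System.out.println(canPrefixLookup(joinKey, badPk, bucketKey)); // prints false
    }
}
```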

website/docs/maintenance/monitor-metrics.md

fluss-client/src/main/java/com/alibaba/fluss/client/lookup/AbstractLookup.java

fluss-client/src/main/java/com/alibaba/fluss/client/lookup/AbstractLookupBatch.java

fluss-client/src/main/java/com/alibaba/fluss/client/lookup/AbstractLookup.java

fluss-server/src/main/java/com/alibaba/fluss/server/replica/ReplicaManager.java

fluss-server/src/test/java/com/alibaba/fluss/server/replica/ReplicaManagerTest.java

fluss-server/src/test/java/com/alibaba/fluss/server/testutils/KvTestUtils.java

fluss-server/src/test/java/com/alibaba/fluss/server/tablet/TabletServiceITCase.java

swuferhong · 2024-12-26T08:18:36Z

@wuchong Comments addressed. The PR is ready for review.

swuferhong requested review from wuchong and luoyuxia December 18, 2024 13:52

wuchong linked an issue Dec 18, 2024 that may be closed by this pull request

[Feature] Fluss support index lookup for primary key table #65

Open

2 tasks

swuferhong force-pushed the index-lookup-1216 branch 3 times, most recently from b95540f to 90f6295 Compare December 20, 2024 09:55

wuchong requested changes Dec 21, 2024

swuferhong force-pushed the index-lookup-1216 branch from 90f6295 to eeff7c0 Compare December 26, 2024 07:50

swuferhong added 2 commits December 26, 2024 15:58

[kv] Support index lookup for primary key table

db788e5

address jark's comments

7554840

swuferhong force-pushed the index-lookup-1216 branch from eeff7c0 to 23bc3fd Compare December 26, 2024 08:17

swuferhong force-pushed the index-lookup-1216 branch from 23bc3fd to c26f475 Compare December 26, 2024 08:25

rebase main and resolve conflicts

593cb02

swuferhong force-pushed the index-lookup-1216 branch from c26f475 to 593cb02 Compare December 26, 2024 10:12
