-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API V2 RFC #69
Add API V2 RFC #69
Conversation
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
text/0069-api-v2.md
Outdated
Once the `API V2` is enabled, the key will be either start with either: | ||
|
||
1. `m` or `t`: TxnKV key. Used by TiDB. | ||
2. `x\0`: TxnKV key. The `\0` is a placeholder for the potential [Keyspace](https://github.com/tikv/rfcs/pull/39). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the compatibility with other systems like titan or juicefs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They should choose API V1
in client mode. If they want to support mixed TxnKV and RawKV, they should upgrade to API V2
client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid they can't do that easily as data need to be reloaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they can't, but also i think that they don't really need api v2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we always maintain two versions, which doesn't seem reasonable to me, otherwise they have to use V2 someday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop the support for v1 api? If not, I think the maintainace effort is not the avoidable.
text/0069-api-v2.md
Outdated
|
||
- Reject all requests with `enable_api_v2` in the context. | ||
|
||
### TiKV Clients |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TiKV client needs to consider if it needs to decode keys when it loads a region from PD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know there is a bug that the client may try to decode the start key and end key of a region, which will panic when the boundary is not decodable. But this is not related to the api v2 encode, so I didn't list it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just a bug. It is a quite more complicated problem. Currently, based on whether TiKV is in txnkv or rawkv mode, the region split keys can be encoded or not encoded. When TiDB loads a region from PD, it should determine if it needs to decode region key based on it uses txnkv or rawkv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix should change the region cache to not assume the split key is decodable.
text/0069-api-v2.md
Outdated
If the key is started with `r`, then the value can be either: | ||
|
||
1. `{0x0}{data}` | ||
2. `{0x1}{TTL expire timestamp}{data}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why TTL is not appended after data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no difference. So I'd like to put metadata together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refering #69 (comment):
If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?
that is value encoding, so the later put with ttl will overwrite the former.
text/0069-api-v2.md
Outdated
|
||
### TiKV | ||
|
||
Add a new configuatrion `storage.enable_api_v2`. When enabled, `storage.enable_ttl` must also be enabled. TiKV will carry it's `storage.enable_api_v2` in `PutStore` request. PD will reject the `PutStore` if the switch mismatched. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just remove storage.enable_ttl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not appropriate to remove it but we can make it enabled by default.
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, do we need to consider a new garbage collection logic apart from TxnKV for this design? e.g. how to handle TTL, how to remove old CAS result
text/0069-api-v2.md
Outdated
If the key is started with `r`, then the value can be either: | ||
|
||
1. `{0x0}{data}` | ||
2. `{0x1}{TTL expire timestamp}{data}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?
that is value encoding, so the later put with ttl will overwrite the former.
The outdated rawkv will be deleted by a compaction filter specific for rawkv. The CAS works in memory layer and no mvcc is used. |
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md
Outdated
4. RawKV compare_and_swap is easy to make mistake. | ||
5. It could be nice if we can deploy multiple applications on one TiKV cluster. | ||
|
||
## Document-level design |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does document-level
mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Referring https://github.com/rust-lang/rfcs/blob/master/0000-template.md#guide-level-explanation:
Guide-level explanation
Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:
- Introducing new named concepts.
- Explaining the feature largely in terms of examples.
- Explaining how Rust programmers should think about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.
For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms.
Reference-level explanation
This is the technical portion of the RFC. Explain the design in sufficient detail that:
- Its interaction with other features is clear.
- It is reasonably clear how the feature would be implemented.
- Corner cases are dissected by example.
The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.
text/0069-api-v2.md
Outdated
If the key is started with `r`, then the value can be either: | ||
|
||
1. `{0x0}{data}` | ||
2. `{0x1}{TTL expire timestamp}{data}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md
Outdated
|
||
### Data migration | ||
|
||
It's reasonable to provide a way to import and export non-TiDB data in TiKV during the upgrade or downgrade. On TiKV before 4.0, the only way to do that is `scan` and `batch_put` on the client. After 4.0, TiKV start to support importing SST file into TxnKV, and after 5.1, importing on RawKV is also supported. You can find more information in [`RFC: Online Bulk Load for RawKV`](https://github.com/tikv/rfcs/pull/72). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to distinguish and scan all the non-tidb data in TiKV? If there is non-tidb data inside a tikv cluster 4.0, what's the correct step to update the cluster to the latest version without data losing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to distinguish and scan all the non-tidb data in TiKV? If there is non-tidb data inside a tikv cluster 4.0
The TxnKV data start with m
or t
are considered as tidb data, otherwise are non-tidb data.
If there is non-tidb data inside a tikv cluster 4.0, what's the correct step to update the cluster to the latest version without data losing?
The user should migrate the non-tidb to new version tikv either by the new SST import and export
feature or get()
and put()
through the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if users want to upgrade a tikv cluster to a newer version. should they first scan all the existing non-tidb data from tikv, store them to external storage (maybe a file), then upgrade the cluster, and finally put these non-tidb data from the external storage to the upgraded tikv cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the approches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can put other reasonable approaches to this section as well.
text/0069-api-v2.md
Outdated
|
||
### Client Version | ||
|
||
TiKV clients will have a new configuration choosing `API V1` or `API V2` mode on connecting to TiKV. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the config defaults to v1 for an old tikv client which hasn't set the value of this config, am I correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user is required to explicitly specify the version to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative is to fetch the API version from PD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to let old clients be able to connect the new version tikv cluster. consider that during the upgrade, an old tidb-server may need to connect to the new tikv cluster to provide services. setting the config to v1
may be a better choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the upgrade, both server and client should be using API V1. The API version switch should take place after the upgrade has finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The server (set by the administrator) could specify whether the version must be set by the client or whether the server will fill in a default value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the upgrade, both server and client should be using API V1. The API version switch should take place after the upgrade has finished.
@andylokandy I thought this is done during upgrading the cluster. Could you write this information explicitly in the "Upgrade" section?
The server (set by the administrator) could specify whether the version must be set by the client or whether the server will fill in a default value.
agree with @gregwebs.
3. `r{keyspace prefix id}`: RawKV key. | ||
4. `a{keyspace prefix id}`: RawKV atomic key, on which you can use `CompareAndSwap`. | ||
|
||
The `{keyspace prefix id}` is the [keyspace](https://github.com/tikv/rfcs/pull/39) prefix for seperating keys of different keyspace. It should be an vary-length integer whose highest bit of every byte denotes whether the next byte is still part of the integer. The client will fetch the prefix from PD by the keyspace name specified by the user when initializing the client, so it means that the keyspace prefix is valid during the session, in other words, the change on keyspace on PD will not take effect on running seesions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to utilize key space in the tidb side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TiDB doesn't have a corresponding keyspace because it uses the API V1 reserving for TiDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Older TiDB can default to using m,t but newer TiDB can use a keyspace with API V2
text/0069-api-v2.md
Outdated
|
||
### PD | ||
|
||
Add a new http interface for adding, renaming, deleteing and querying the mapping from keyspace name to prefix id: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The keyspace name comes from user application, which makes it too sensitive to be leaked. We need to control the access privilege of this information. Is it better to use gRPC or other methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no difference between gRPC and HTTP on the aspect of security. The thing you might want is RBAC Support for PD.
|
||
### TiKV | ||
|
||
In TiKV config file, add a new configuration `storage.api_version`. When enabled, `storage.enable_ttl` must also be enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would happen if we enable storage.api_version
and leave storage.enable_ttl
disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tikv will panic on start.
Signed-off-by: Andy Lok <[email protected]>
Yeah, I agree. |
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
text/0069-api-v2.md
Outdated
[ | ||
{ | ||
name: "default", | ||
id: "0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can consider it later to satisfy specific user requirements, for now, let's focused on the minimum viable product for the keyspace.
|
||
#### PD | ||
|
||
Add the new APIs described [above](#Keyspace-Management). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when we execute the command in pd-ctl to enable API v2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then PD will enable its keyspace APIs. but nothing will happen on TiKV unless we manually edit the tikv config and restart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am really looking forward to this!
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
+1 |
1. Adding keyspace: newly added keyspaces can only be viable to new clients. | ||
2. Deleting keyspace: | ||
2.1. Deleting keyspace only marks the metadata to inviable in PD, the data in the keyspace and metadata in PD will not be deleted automatically. Garbage collecting the data in deleted keyspaces may be introduced in the future since it increases the complexity of this RFC. To ensure no data is left, the user should clean up the keyspace before deleting the keyspace at present. | ||
2.2. Every client syncs the keyspace information with the PD leader every 5 minutes. When a keyspace is deleted, the client should be aware of the deletion in 5 minutes. The keyspace can not be accessed by any living client after 5 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about make it configurable but deny extreme settings, 0 minute for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big topic. The main problem is that it's hard to synchronize this setting across all components including the clients the servers and pd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can hardcode a setting at present. optimize the mechanism to improve user experience in the future.
Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md
Outdated
|
||
#### RawKV Value Encode | ||
|
||
If the key has RawKV prefix, which are `k{keyspace id prefix}r` or `k{keyspace id prefix}a`, then the value can be either: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @Connor1996
text/0069-api-v2.md
Outdated
@@ -64,7 +64,18 @@ GET /keyspaces | |||
[ | |||
{ | |||
name: "default", | |||
id: "0" | |||
id: "0", | |||
metadata: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to rename it to "property" rather than "metadata". BTW, is there any guideline about what fields should be contained in this JSON object, what fields should be put outside this JSON object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also update the metadata design in PD after the guideline is settled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to rename it to "property" rather than "metadata".
Fair
BTW, is there any guideline about what fields should be contained in this JSON object, what fields should be put outside this JSON object?
No. Every component can define a new config item. The component may print a warning if it meets an unrecognized property under its section (eg. TxnClient can warn the property default-config.txn-client.abcd
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to put "created_at" to this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're mentioning the created_at
in top-level metadata, no, because it's only writable by PD. Otherwise, if you ask whether the user can set a custom created_at
property, it's yes, the user is allowed to set whatever he wants.
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
@Connor1996 @BusyJay I've removed the raw atomic prefix, PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still missing parts in this design, for example, like key codec in the TiKV read/write, the impact on performance, etc.
|
||
### CDC / BR | ||
|
||
Since all access to TiDB is unchanged during the upgrade, CDC and BR should work the same after upgrade/downgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @overvenus PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by the key codec in the TiKV read/write
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, parsing keys or constructing keys in coprocessors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coprocessor only read and write tidb data, and since tidb data will not change, coprocessor will not change either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So tidb will not use API V2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tidb uses api v1
Signed-off-by: Andy Lok [email protected]
Motivation
API V2
is a set of breaking changes that aim to solve serval issues with current RawKV (hereafter referred to asAPI V1
):API V1
is not safe to use RawKV along with TxnKV.API V1
Raw TTL is controlled by Store configuration. Switching the configuration will cause data corruption in silence.API V1
Raw TTL is encoded into the value by appending 8-bytes UNIX timestamp to the end of the value, therefore it's hard to introduce other encode afterward.By solving issue 1, TiDB will be able to support RawKV as the table's storage engine, which will enrich TiDB's use cases.
Rendered