Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API V2 RFC #69

Merged
merged 25 commits into from
Sep 28, 2021
Merged

Add API V2 RFC #69

merged 25 commits into from
Sep 28, 2021

Conversation

andylokandy
Copy link
Contributor

@andylokandy andylokandy commented Jul 27, 2021

Signed-off-by: Andy Lok [email protected]

Motivation

API V2 is a set of breaking changes that aim to solve serval issues with current RawKV (hereafter referred to as API V1):

  1. API V1 is not safe to use RawKV along with TxnKV.
  2. API V1 Raw TTL is controlled by Store configuration. Switching the configuration will cause data corruption in silence.
  3. API V1 Raw TTL is encoded into the value by appending 8-bytes UNIX timestamp to the end of the value, therefore it's hard to introduce other encode afterward.

By solving issue 1, TiDB will be able to support RawKV as the table's storage engine, which will enrich TiDB's use cases.

Rendered

Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
Once the `API V2` is enabled, the key will be either start with either:

1. `m` or `t`: TxnKV key. Used by TiDB.
2. `x\0`: TxnKV key. The `\0` is a placeholder for the potential [Keyspace](https://github.com/tikv/rfcs/pull/39).
Copy link
Member

@BusyJay BusyJay Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the compatibility with other systems like titan or juicefs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should choose API V1 in client mode. If they want to support mixed TxnKV and RawKV, they should upgrade to API V2 client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid they can't do that easily as data need to be reloaded.

Copy link
Contributor Author

@andylokandy andylokandy Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they can't, but also i think that they don't really need api v2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we always maintain two versions, which doesn't seem reasonable to me, otherwise they have to use V2 someday.

Copy link
Contributor Author

@andylokandy andylokandy Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop the support for v1 api? If not, I think the maintainace effort is not the avoidable.

text/0069-api-v2.md Outdated Show resolved Hide resolved

- Reject all requests with `enable_api_v2` in the context.

### TiKV Clients
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TiKV client needs to consider if it needs to decode keys when it loads a region from PD.

Copy link
Contributor Author

@andylokandy andylokandy Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there is a bug that the client may try to decode the start key and end key of a region, which will panic when the boundary is not decodable. But this is not related to the api v2 encode, so I didn't list it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just a bug. It is a quite more complicated problem. Currently, based on whether TiKV is in txnkv or rawkv mode, the region split keys can be encoded or not encoded. When TiDB loads a region from PD, it should determine if it needs to decode region key based on it uses txnkv or rawkv.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix should change the region cache to not assume the split key is decodable.

text/0069-api-v2.md Outdated Show resolved Hide resolved
If the key is started with `r`, then the value can be either:

1. `{0x0}{data}`
2. `{0x1}{TTL expire timestamp}{data}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why TTL is not appended after data

Copy link
Contributor Author

@andylokandy andylokandy Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no difference. So I'd like to put metadata together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refering #69 (comment):

If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?

that is value encoding, so the later put with ttl will overwrite the former.


### TiKV

Add a new configuatrion `storage.enable_api_v2`. When enabled, `storage.enable_ttl` must also be enabled. TiKV will carry it's `storage.enable_api_v2` in `PutStore` request. PD will reject the `PutStore` if the switch mismatched.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just remove storage.enable_ttl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@andylokandy andylokandy Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not appropriate to remove it but we can make it enabled by default.

Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, do we need to consider a new garbage collection logic apart from TxnKV for this design? e.g. how to handle TTL, how to remove old CAS result

If the key is started with `r`, then the value can be either:

1. `{0x0}{data}`
2. `{0x1}{TTL expire timestamp}{data}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?

@andylokandy
Copy link
Contributor Author

andylokandy commented Aug 3, 2021

@skyzh

If I understand correctly, this encoding will make keys of continuous TTLs more likely to be stored in one region, instead of keys of the same user-key. Will this hurt scan performance?

that is value encoding, so the later put with ttl will overwrite the former.

By the way, do we need to consider a new garbage collection logic apart from TxnKV for this design? e.g. how to handle TTL, how to remove old CAS result

The outdated rawkv will be deleted by a compaction filter specific for rawkv. The CAS works in memory layer and no mvcc is used.

text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
4. RawKV compare_and_swap is easy to make mistake.
5. It could be nice if we can deploy multiple applications on one TiKV cluster.

## Document-level design
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does document-level mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referring https://github.com/rust-lang/rfcs/blob/master/0000-template.md#guide-level-explanation:

Guide-level explanation

Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:

  • Introducing new named concepts.
  • Explaining the feature largely in terms of examples.
  • Explaining how Rust programmers should think about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
  • If applicable, provide sample error messages, deprecation warnings, or migration guidance.
  • If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.

For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms.

Reference-level explanation

This is the technical portion of the RFC. Explain the design in sufficient detail that:

  • Its interaction with other features is clear.
  • It is reasonably clear how the feature would be implemented.
  • Corner cases are dissected by example.

The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.

text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
If the key is started with `r`, then the value can be either:

1. `{0x0}{data}`
2. `{0x1}{TTL expire timestamp}{data}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

text/0069-api-v2.md Show resolved Hide resolved
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md Outdated Show resolved Hide resolved

### Data migration

It's reasonable to provide a way to import and export non-TiDB data in TiKV during the upgrade or downgrade. On TiKV before 4.0, the only way to do that is `scan` and `batch_put` on the client. After 4.0, TiKV start to support importing SST file into TxnKV, and after 5.1, importing on RawKV is also supported. You can find more information in [`RFC: Online Bulk Load for RawKV`](https://github.com/tikv/rfcs/pull/72).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to distinguish and scan all the non-tidb data in TiKV? If there is non-tidb data inside a tikv cluster 4.0, what's the correct step to update the cluster to the latest version without data losing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to distinguish and scan all the non-tidb data in TiKV? If there is non-tidb data inside a tikv cluster 4.0

The TxnKV data start with m or t are considered as tidb data, otherwise are non-tidb data.

If there is non-tidb data inside a tikv cluster 4.0, what's the correct step to update the cluster to the latest version without data losing?

The user should migrate the non-tidb to new version tikv either by the new SST import and export feature or get() and put() through the client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if users want to upgrade a tikv cluster to a newer version. should they first scan all the existing non-tidb data from tikv, store them to external storage (maybe a file), then upgrade the cluster, and finally put these non-tidb data from the external storage to the upgraded tikv cluster?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the approches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can put other reasonable approaches to this section as well.


### Client Version

TiKV clients will have a new configuration choosing `API V1` or `API V2` mode on connecting to TiKV.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the config defaults to v1 for an old tikv client which hasn't set the value of this config, am I correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user is required to explicitly specify the version to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to fetch the API version from PD.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to let old clients be able to connect the new version tikv cluster. consider that during the upgrade, an old tidb-server may need to connect to the new tikv cluster to provide services. setting the config to v1 may be a better choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the upgrade, both server and client should be using API V1. The API version switch should take place after the upgrade has finished.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server (set by the administrator) could specify whether the version must be set by the client or whether the server will fill in a default value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the upgrade, both server and client should be using API V1. The API version switch should take place after the upgrade has finished.

@andylokandy I thought this is done during upgrading the cluster. Could you write this information explicitly in the "Upgrade" section?

The server (set by the administrator) could specify whether the version must be set by the client or whether the server will fill in a default value.

agree with @gregwebs.

3. `r{keyspace prefix id}`: RawKV key.
4. `a{keyspace prefix id}`: RawKV atomic key, on which you can use `CompareAndSwap`.

The `{keyspace prefix id}` is the [keyspace](https://github.com/tikv/rfcs/pull/39) prefix for seperating keys of different keyspace. It should be an vary-length integer whose highest bit of every byte denotes whether the next byte is still part of the integer. The client will fetch the prefix from PD by the keyspace name specified by the user when initializing the client, so it means that the keyspace prefix is valid during the session, in other words, the change on keyspace on PD will not take effect on running seesions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to utilize key space in the tidb side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TiDB doesn't have a corresponding keyspace because it uses the API V1 reserving for TiDB.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Older TiDB can default to using m,t but newer TiDB can use a keyspace with API V2


### PD

Add a new http interface for adding, renaming, deleteing and querying the mapping from keyspace name to prefix id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The keyspace name comes from user application, which makes it too sensitive to be leaked. We need to control the access privilege of this information. Is it better to use gRPC or other methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no difference between gRPC and HTTP on the aspect of security. The thing you might want is RBAC Support for PD.

text/0069-api-v2.md Outdated Show resolved Hide resolved

### TiKV

In TiKV config file, add a new configuration `storage.api_version`. When enabled, `storage.enable_ttl` must also be enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would happen if we enable storage.api_version and leave storage.enable_ttl disabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tikv will panic on start.

text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
Signed-off-by: Andy Lok <[email protected]>
@andylokandy
Copy link
Contributor Author

@zz-jason

we can leave this as an open question to solve in the future. at the initial version of keyspace, I prefer a strict privilege control that does not allow users to add keyspace whose id conflicts with an existing one. We can relax the control in the future if a specific requirement appears. What do you think?

Yeah, I agree.

Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
[
{
name: "default",
id: "0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can consider it later to satisfy specific user requirements, for now, let's focused on the minimum viable product for the keyspace.

text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved

#### PD

Add the new APIs described [above](#Keyspace-Management).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when we execute the command in pd-ctl to enable API v2?

Copy link
Contributor Author

@andylokandy andylokandy Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then PD will enable its keyspace APIs. but nothing will happen on TiKV unless we manually edit the tikv config and restart.

Copy link
Contributor

@gregwebs gregwebs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really looking forward to this!

Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sunxiaoguang
Copy link
Member

I am really looking forward to this!

+1

1. Adding keyspace: newly added keyspaces can only be viable to new clients.
2. Deleting keyspace:
2.1. Deleting keyspace only marks the metadata to inviable in PD, the data in the keyspace and metadata in PD will not be deleted automatically. Garbage collecting the data in deleted keyspaces may be introduced in the future since it increases the complexity of this RFC. To ensure no data is left, the user should clean up the keyspace before deleting the keyspace at present.
2.2. Every client syncs the keyspace information with the PD leader every 5 minutes. When a keyspace is deleted, the client should be aware of the deletion in 5 minutes. The keyspace can not be accessed by any living client after 5 minutes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about make it configurable but deny extreme settings, 0 minute for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big topic. The main problem is that it's hard to synchronize this setting across all components including the clients the servers and pd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can hardcode a setting at present. optimize the mechanism to improve user experience in the future.

Signed-off-by: andylokandy <[email protected]>
text/0069-api-v2.md Outdated Show resolved Hide resolved

#### RawKV Value Encode

If the key has RawKV prefix, which are `k{keyspace id prefix}r` or `k{keyspace id prefix}a`, then the value can be either:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -64,7 +64,18 @@ GET /keyspaces
[
{
name: "default",
id: "0"
id: "0",
metadata: {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to rename it to "property" rather than "metadata". BTW, is there any guideline about what fields should be contained in this JSON object, what fields should be put outside this JSON object?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also update the metadata design in PD after the guideline is settled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to rename it to "property" rather than "metadata".

Fair

BTW, is there any guideline about what fields should be contained in this JSON object, what fields should be put outside this JSON object?

No. Every component can define a new config item. The component may print a warning if it meets an unrecognized property under its section (eg. TxnClient can warn the property default-config.txn-client.abcd).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to put "created_at" to this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're mentioning the created_at in top-level metadata, no, because it's only writable by PD. Otherwise, if you ask whether the user can set a custom created_at property, it's yes, the user is allowed to set whatever he wants.

Signed-off-by: andylokandy <[email protected]>
This was referenced Sep 13, 2021
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
text/0069-api-v2.md Outdated Show resolved Hide resolved
Signed-off-by: Andy Lok <[email protected]>
Signed-off-by: Andy Lok <[email protected]>
@andylokandy
Copy link
Contributor Author

@Connor1996 @BusyJay I've removed the raw atomic prefix, PTAL.

Copy link
Member

@Connor1996 Connor1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andylokandy andylokandy requested a review from zhongzc September 28, 2021 09:54
@andylokandy andylokandy merged commit 4c8fee7 into tikv:master Sep 28, 2021
Copy link
Member

@BusyJay BusyJay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still missing parts in this design, for example, like key codec in the TiKV read/write, the impact on performance, etc.


### CDC / BR

Since all access to TiDB is unchanged during the upgrade, CDC and BR should work the same after upgrade/downgrade.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @overvenus PTAL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by the key codec in the TiKV read/write?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, parsing keys or constructing keys in coprocessors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coprocessor only read and write tidb data, and since tidb data will not change, coprocessor will not change either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So tidb will not use API V2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tidb uses api v1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.