tikv · ystaticy · Mar 22, 2022 · Mar 22, 2022 · Mar 29, 2022 · Mar 29, 2022
diff --git a/media/tikv-rawkv-gc-compactionfilter.png b/media/tikv-rawkv-gc-compactionfilter.png
diff --git a/text/0090-tikv-gc.md b/text/0090-tikv-gc.md
@@ -0,0 +1,88 @@
+# RFC: TiKV RawKV MVCC GC
+
+
+## Summary
+Move the TiKV MVCC GC worker out of TiDB into a group of independent GC worker nodes (a new node role), and implement a new GC process in TiKV for RawKV.
+
+## Motivation
+1. The GC worker is an important component for TiKV: it deletes outdated MVCC data so that storage does not grow without bound. Currently, the GC worker is implemented in TiDB, which makes TiKV unusable without TiDB. The current GC process also only handles TiDB transactions, so it cannot be used for RawKV.  
+2. Standardize the API used to set and obtain GC status in PD to improve the developer experience.
+
+## Background
+According to the documentation for the current GC worker in a TiDB cluster, the GC process is as follows:
+
+In the TiDB GC worker leader:
+1. Regularly calculate a new timestamp called the "GC safe point" (the default interval is 10 minutes), and push the safe point to PD.
+2. Get the minimal service safe point among all services from the response of step 1; this is the GC safe point.
+3. Txn GC process: resolve locks and record delete ranges information.
+
+In the PD leader:
+1. Receive update safe point requests from TiDB or other tools (e.g. CDC, BR).
+2. Calculate the minimal timestamp = min(all service safe points, now - gc_life_time).
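+
+The PD-side calculation in step 2 can be sketched as follows. This is a minimal illustration in Go, not PD's actual code: `calcGCSafePoint` and its parameters are hypothetical names, and all timestamps are assumed to be plain `uint64` values in the same unit.
+
```go
package main

// calcGCSafePoint is a hypothetical helper, not PD's real implementation.
// It returns min(all service safe points, now - gcLifeTime), with every
// value expressed as a uint64 timestamp in the same unit.
func calcGCSafePoint(serviceSafePoints []uint64, now, gcLifeTime uint64) uint64 {
	// Start from the upper bound imposed by the GC life time,
	// guarding against unsigned underflow.
	safePoint := uint64(0)
	if now > gcLifeTime {
		safePoint = now - gcLifeTime
	}
	// Any service (e.g. CDC, BR) that still needs older data lowers the result.
	for _, sp := range serviceSafePoints {
		if sp < safePoint {
			safePoint = sp
		}
	}
	return safePoint
}
```
+
+For example, with `now = 1000`, `gc_life_time = 100`, and service safe points `950` and `800`, the sketch returns `800`, because one service still holds an older safe point.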
+
+The GC safe point data is stored in PD's etcd as follows:  
+- Safe point generated by TiDB:
+  ```shell
+  /gc/safe_point/service
+  ```
+- Service safe point generated by CDC, BR, or Lightning:
+  ```shell
+  /gc/safe_point/service/$serviceID
+  ```
+
+In every TiKV node:
+1. Get the GC safe point from PD regularly.
+2. Deletion is triggered in the CompactionFilter and the GcTask thread.
+
+## New GC worker architecture
+In a TiKV cluster without TiDB nodes, there are a few differences:
+1. We need to move the GC worker into another node role.
+2. For [API V2](https://github.com/tikv/rfcs/blob/master/text/0069-api-v2.md), earlier versions in the default CF need to be garbage collected, but the Txn GC process is triggered by the WriteCompactionFilter of the write CF.
+3. RawKV data is encoded differently from Txn data in TiDB.
+
+So we designed a new GC architecture and process for TiKV clusters. It extends the original interface to support RawKV MVCC GC. For the original TiDB scenario, the old GC implementation continues to be used for now.
+
+## Detailed design
+To support RawKV GC in a TiKV cluster deployed without TiDB nodes:
+1. Add a new node role to replace the GC worker in TiDB nodes.
+- Why we chose to create a new node role:
+  - If we add the GC worker to PD, it causes a circular dependency with client-go.
+  - If we add the GC worker to TiKV, more development work is needed: the logic required by the GC worker is well implemented in client-go but missing in client-rust.
+
+   So after discussion, we decided to add a new role for the GC worker.
+  - Its main job is to regularly calculate a new timestamp called the "GC safe point" and push it to PD.
+  - It is implemented in Go, which makes it convenient to call the client-go interface.
+  - The code of the new GC worker will be added to [tikv/migration](https://github.com/tikv/migration).
+
+2. Changes on PD:
+- Introduce a new concept, 'service group':  
+  - TiDB, TxnKV, and RawKV are allowed to coexist. Because the data of the three scenarios are independent, separate safe points are used in GC, which reduces interference between workloads and speeds up GC.
+  - If multi-tenancy is supported in the future, 'service group' can also support it.
+  - New interfaces are needed to update the service safe point with a 'service group'.
+  - Add UpdateServiceGCSafepointByServiceGroup and getGCSafepointByServiceGroup to standardize the API.
+    - The safe point data path in PD's etcd will change. The new safe point paths in etcd are as follows:
+    - gc_worker safe point
+    ```shell
+    /gc_servicegroup/$service_group_id/service
+    ```
+    - CDC, BR service safe point
+    ```shell
+    /gc_servicegroup/$service_group_id/service/$serviceId
+    ```
+  - The GC worker configuration is stored in the config/gc_worker_conf.toml file.
+  - The default interval for generating the GC safe point is still '10m0s'.
+  - TiKV gets the GC safe point from PD. GC safe point = min(all service safe points, GC worker safe point).
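+
+To make the new etcd layout concrete, the two key shapes above can be built as in the following Go sketch. The function names and the example IDs are assumptions made for illustration only; they are not part of PD or client-go.
+
```go
package main

import "path"

// gcWorkerSafePointKey is a hypothetical helper that builds the etcd key
// holding the GC worker safe point of one service group.
func gcWorkerSafePointKey(serviceGroupID string) string {
	return path.Join("/gc_servicegroup", serviceGroupID, "service")
}

// serviceSafePointKey is a hypothetical helper that builds the etcd key
// holding the service safe point of one service (e.g. CDC or BR)
// inside a service group.
func serviceSafePointKey(serviceGroupID, serviceID string) string {
	return path.Join("/gc_servicegroup", serviceGroupID, "service", serviceID)
}
```
+
+With an illustrative service group ID `rawkv` and service ID `cdc`, the helpers yield `/gc_servicegroup/rawkv/service` and `/gc_servicegroup/rawkv/service/cdc`, so safe points of different service groups never share a key prefix.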
+
+
+3. Changes on TiKV:
+- Get the GC safe point from PD through the getGCSafepointByServiceGroup interface.
+- For API V2, we need to add a new CompactionFilter named RawGCcompactionFilter and a new GCTask type implementation.
+- The GC condition in RawGCcompactionFilter is: (ts < GCSafePoint) && ( ttl-expired || deleted-mark || not the newest version ).  
+   - If the newest version is earlier than the GC safe point and is marked as deleted or has an expired TTL, that key and the earlier versions of the same user key are sent to a GC scheduler thread to be collected asynchronously.
+      ![raw gc compaction filter](../media/tikv-rawkv-gc-compactionfilter.png)
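+
+The GC condition above can be written as a small predicate. The sketch below is in Go purely for illustration (TiKV itself is written in Rust); `rawVersion`, its fields, and `shouldGC` are hypothetical names and do not mirror TiKV's actual types.
+
```go
package main

// rawVersion describes one MVCC version of a RawKV user key.
// The field names are illustrative assumptions.
type rawVersion struct {
	TS         uint64 // timestamp of this version
	TTLExpired bool   // the TTL of this version has expired
	Deleted    bool   // this version carries a delete mark
	Newest     bool   // this is the newest version of the user key
}

// shouldGC evaluates the condition stated in the RFC:
// (ts < GCSafePoint) && (ttl-expired || deleted-mark || not the newest version).
func shouldGC(v rawVersion, gcSafePoint uint64) bool {
	return v.TS < gcSafePoint && (v.TTLExpired || v.Deleted || !v.Newest)
}
```
+
+Under this predicate, a live newest version is always kept, even when it is older than the safe point, while any version at or after the safe point is kept regardless of its flags.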
+
+
+## Reference
+https://docs.google.com/document/d/1jA3lK9QbYlwsvn67wGsSuusD1Dzx7ANq_vya384RBIg/edit#heading=h.rr3hcmc7ejb8  
+https://docs.pingcap.com/tidb/stable/garbage-collection-overview