pingcap · sre-bot · Mar 16, 2020 · Mar 13, 2020 · Mar 16, 2020
diff --git a/TOC.md b/TOC.md
@@ -307,6 +307,7 @@
     - [Upgrade](/reference/tidb-binlog/upgrade.md)
     - [Reparo](/reference/tidb-binlog/reparo.md)
     - [Binlog Slave Client](/reference/tidb-binlog/binlog-slave-client.md)
+    - [TiDB Binlog Relay Log](/reference/tidb-binlog/relay-log.md)
     - [Glossary](/reference/tidb-binlog/glossary.md)
     + Troubleshoot
       - [Troubleshooting](/reference/tidb-binlog/troubleshoot/binlog.md)

diff --git a/reference/tidb-binlog/deploy.md b/reference/tidb-binlog/deploy.md
@@ -553,6 +553,13 @@ The following part shows how to use Pump and Drainer based on the nodes above.
 
         # replicate-do-db = ["~^b.*","s1"]
 
+        # [syncer.relay]
+        # The directory where the relay log is saved. The relay log is disabled if the value is empty.
+        # This configuration takes effect only if the downstream is TiDB or MySQL.
+        # log-dir = ""
+        # The maximum size of each relay log file, in bytes.
+        # max-file-size = 10485760
+
         # [[syncer.replicate-do-table]]
         # db-name ="test"
         # tbl-name = "log"

diff --git a/reference/tidb-binlog/overview.md b/reference/tidb-binlog/overview.md
@@ -46,6 +46,7 @@ The TiDB Binlog cluster is composed of Pump and Drainer.
 * TiDB uses the built-in Pump Client to send the binlog to each Pump
 * Pump stores binlogs and sends the binlogs to Drainer in order
 * Drainer reads binlogs of each Pump, merges and sorts the binlogs, and sends the binlogs downstream
+* Drainer supports the [relay log](/reference/tidb-binlog/relay-log.md). With the relay log, Drainer can restore the downstream clusters to a consistent state.
 
 ## Notes
 

diff --git a/reference/tidb-binlog/relay-log.md b/reference/tidb-binlog/relay-log.md
@@ -0,0 +1,66 @@
+---
+title: TiDB Binlog Relay Log
+summary: Learn how to use the relay log to maintain data consistency in extreme cases.
+category: reference
+---
+
+# TiDB Binlog Relay Log
+
+When replicating binlogs, Drainer splits transactions from the upstream and replicates the split transactions concurrently to the downstream.
+
+In extreme cases where the upstream clusters are unavailable and Drainer exits abnormally, the downstream clusters (MySQL or TiDB) might be left in an intermediate state with inconsistent data. In such cases, Drainer can use the relay log to restore the downstream clusters to a consistent state.
+
+## Consistent state during Drainer replication
+
+The downstream clusters are in a consistent state when their data is the same as an upstream snapshot read with `tidb_snapshot = ts`.
+
+Checkpoint consistency means that the Drainer checkpoint records the consistency state of replication in the `consistent` field. While Drainer is running, `consistent` is `false`. After Drainer exits normally, `consistent` is set to `true`.
+
+You can query the downstream checkpoint table as follows:
+
+{{< copyable "sql" >}}
+
+```sql
+select * from tidb_binlog.checkpoint;
+```
+
+```
++---------------------+----------------------------------------------------------------+
+| clusterID           | checkPoint                                                     |
++---------------------+----------------------------------------------------------------+
+| 6791641053252586769 | {"consistent":false,"commitTS":414529105591271429,"ts-map":{}} |
++---------------------+----------------------------------------------------------------+
+```
+
+## Implementation principles
+
+After the relay log is enabled, Drainer first writes binlog events to disk and then replicates the events to the downstream clusters.
+
+If the upstream clusters are not available, Drainer can restore the downstream clusters to a consistent state by reading the relay log.
+
+> **Note:**
+>
+> This method does not work if the relay log data is lost at the same time, but this is very unlikely. To further protect the relay log, you can store it on a Network File System (NFS).
+
+### Scenarios where Drainer consumes binlogs from the relay log
+
+When Drainer starts, if it fails to connect to the Placement Driver (PD) of the upstream clusters and detects `consistent = false` in the checkpoint, it tries to read the relay log and restore the downstream clusters to a consistent state. After that, the Drainer process sets `consistent` in the checkpoint to `true` and then exits.
+
+### GC mechanism of relay log
+
+While Drainer is running, once it confirms that all data in a relay log file has been successfully replicated to the downstream, it deletes the file immediately. Therefore, the relay log does not occupy much disk space.
+
+If a relay log file reaches the maximum size (10 MB by default, set by `max-file-size`), the file is split, and subsequent data is written into a new relay log file.
+
+## Configuration
+
+To enable the relay log, add the following to the Drainer configuration:
+
+{{< copyable "" >}}
+
+```
+[syncer.relay]
+# The directory where the relay log is saved. The relay log is disabled if the value is empty.
+# This configuration takes effect only if the downstream is TiDB or MySQL.
+log-dir = "/dir/to/save/log"
+```
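
Note on the checkpoint record and trigger condition described in `reference/tidb-binlog/relay-log.md`: the rule can be restated as a small Python sketch. This is only an illustration of the documented behavior, not Drainer's actual code. The function names `parse_checkpoint` and `needs_relay_recovery` and the `pd_reachable` flag are hypothetical; the sample JSON is the `checkPoint` value shown in the query output.

```python
import json


def parse_checkpoint(raw: str) -> dict:
    """Decode one checkPoint value read from tidb_binlog.checkpoint."""
    checkpoint = json.loads(raw)
    # The documented record is a JSON object with a boolean `consistent` field.
    if not isinstance(checkpoint, dict) or not isinstance(checkpoint.get("consistent"), bool):
        raise ValueError("checkpoint has no boolean 'consistent' field")
    return checkpoint


def needs_relay_recovery(pd_reachable: bool, checkpoint: dict) -> bool:
    """Hypothetical helper restating the documented trigger:
    PD cannot be reached AND the checkpoint says consistent = false."""
    return (not pd_reachable) and (not checkpoint["consistent"])


# Sample value copied from the documented query output.
raw = '{"consistent":false,"commitTS":414529105591271429,"ts-map":{}}'
cp = parse_checkpoint(raw)
print(cp["commitTS"], needs_relay_recovery(pd_reachable=False, checkpoint=cp))
```

Under the documented rule, recovery from the relay log is attempted only when both conditions hold. A reachable PD or a checkpoint that already has `consistent` set to `true` means the relay log is not replayed.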