ticdc: add documentation for preventing TiCDC from using the same TiDB cluster as both upstream and downstream #20341

Open
wants to merge 3 commits into
base: master
33 changes: 33 additions & 0 deletions ticdc/ticdc-manage-changefeed.md
@@ -255,6 +255,39 @@
- `resolved-ts`: The largest TSO among the sorted data in the current processor.
- `checkpoint-ts`: The largest TSO that has been successfully written to the downstream in the current processor.

## Security mechanism

Starting from v9.0.0, TiCDC introduces a security mechanism to prevent users from accidentally configuring the same TiDB cluster as both the upstream and downstream for data replication, which could lead to circular replication and data anomalies.

When you create, update, or resume a replication task, TiCDC automatically checks whether the upstream and downstream TiDB clusters have the same `cluster_id`. If they do, TiCDC rejects the operation. The `cluster_id` (introduced in v9.0.0) is a unique identifier for a TiDB cluster. You can query it using the following SQL statement:

```sql
SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'cluster_id';
```
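
To see in advance whether two clusters would trip this check, you can run the query above against each cluster and compare the results. The following is a minimal sketch, not part of TiCDC: `get_cluster_id` and `compare_cluster_ids` are hypothetical helper names, and the sketch assumes that the `mysql` client is installed and that a passwordless `root` user can connect.

```shell
# Hypothetical helper: prints the cluster_id of the TiDB instance at host $1, port $2.
# Assumes the `mysql` client is available and a passwordless root user; adjust as needed.
get_cluster_id() {
  mysql -h "$1" -P "$2" -u root -N -B \
    -e "SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'cluster_id';"
}

# Hypothetical helper: classifies two cluster_id values.
compare_cluster_ids() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "unknown"    # at least one cluster returned no cluster_id
  elif [ "$1" = "$2" ]; then
    echo "same"       # TiCDC would reject a changefeed between these clusters
  else
    echo "different"
  fi
}

# Example usage (placeholder addresses):
# compare_cluster_ids "$(get_cluster_id 127.0.0.1 4000)" "$(get_cluster_id 10.0.0.2 4000)"
```

An `unknown` result means that the comparison is inconclusive, not that the clusters are different.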

### Compatibility

- For non-TiDB downstream systems, such as MySQL and Kafka, TiCDC skips this check to ensure compatibility.
- For TiDB versions earlier than v9.0.0, TiCDC cannot retrieve the `cluster_id`, so it still allows you to create replication tasks to avoid affecting existing functionality. In this case, you need to manually check the configuration to make sure that the upstream and downstream are not the same cluster.
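
For the manual check on clusters earlier than v9.0.0, one possible approach is to compare the endpoint in `sink-uri` with the upstream TiDB addresses that you already know. The following is a minimal sketch, not part of TiCDC: `sink_endpoint` and `matches_upstream` are hypothetical helper names, and the addresses are placeholders. Matching by address is only a heuristic, because a load balancer or an alternate hostname can point at the same cluster without matching.

```shell
# Hypothetical helper: extracts "host:port" from a sink URI.
# Assumes the user name and password contain no "/" or "?" characters.
sink_endpoint() {
  rest="${1#*://}"       # drop the scheme, for example "mysql://"
  rest="${rest%%/*}"     # drop the path
  rest="${rest%%\?*}"    # drop any query string
  rest="${rest##*@}"     # drop "user:password@" if present
  echo "$rest"
}

# Hypothetical helper: prints "yes" if endpoint $1 equals any of the
# upstream TiDB addresses passed as the remaining arguments.
matches_upstream() {
  endpoint="$1"
  shift
  for addr in "$@"; do
    if [ "$addr" = "$endpoint" ]; then
      echo "yes"
      return 0
    fi
  done
  echo "no"
}

# Example usage (placeholder addresses):
# matches_upstream "$(sink_endpoint 'mysql://root:@127.0.0.1:4000/')" 127.0.0.1:4000 10.0.0.2:4000
```

A `no` result does not prove that the clusters are different. It only rules out an exact address match.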

### Example error messages

When you create, update, or resume a replication task, TiCDC reports an error if it detects the same `cluster_id` for both the upstream and downstream TiDB clusters. For example, suppose you use the following CLI command to create a replication task that uses the same cluster as both the upstream and downstream:

```shell
cdc cli changefeed create --server=http://127.0.0.1:8300 --sink-uri="mysql://root:@127.0.0.1:8300/" --changefeed-id="create-cmd"
```

TiCDC returns the following error message:

```
Error: [CDC:ErrSameUpstreamDownstream]TiCDC does not support creating a changefeed with the same TiDB cluster as both the source and the target for the changefeed.
```

This error message includes the error code `CDC:ErrSameUpstreamDownstream`, indicating that TiCDC detected the same cluster being used as both the upstream and downstream. If you encounter this error, check whether the `sink-uri` parameter for your replication task is configured correctly.
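
If you create changefeeds from a script, you can look for this error code in the command output and handle it separately from other failures. The following is a minimal sketch: `is_same_cluster_error` is a hypothetical helper name, and the sketch assumes only that the output contains the error code shown above.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given output text
# contains the CDC:ErrSameUpstreamDownstream error code.
is_same_cluster_error() {
  case "$1" in
    *"CDC:ErrSameUpstreamDownstream"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage: capture the CLI output, then branch on the error code.
# output="$(cdc cli changefeed create ... 2>&1)"
# if is_same_cluster_error "$output"; then
#   echo "sink-uri points at the upstream cluster; check the configuration" >&2
# fi
```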

## Replicate tables with the new framework for collations enabled

Starting from v4.0.15, v5.0.4, v5.1.1, and v5.2.0, TiCDC supports tables that have the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) enabled.