Merge remote-tracking branch 'origin' into max-user-connections
joccau committed Mar 7, 2025
2 parents f41775d + 6458e86 commit 3fb48fb
Showing 36 changed files with 533 additions and 129 deletions.
1 change: 1 addition & 0 deletions TOC.md
@@ -1126,6 +1126,7 @@
- v6.6
- [6.6.0-DMR](/releases/release-6.6.0.md)
- v6.5
- [6.5.12](/releases/release-6.5.12.md)
- [6.5.11](/releases/release-6.5.11.md)
- [6.5.10](/releases/release-6.5.10.md)
- [6.5.9](/releases/release-6.5.9.md)
5 changes: 4 additions & 1 deletion auto-increment.md
@@ -457,11 +457,14 @@ INSERT INTO t VALUES (); -- Returns ID 30001
While IDs are always increasing and without significant gaps like those seen with `AUTO_ID_CACHE 0`, small gaps in the sequence might still occur in the following scenarios. These gaps are necessary to maintain both uniqueness and the strictly increasing property of the IDs.
- During failover when the primary instance exits or crashes
After you enable the MySQL compatibility mode, the allocated IDs are **unique** and **monotonically increasing**, and the behavior is almost the same as MySQL. Even when accessing across multiple TiDB instances, ID monotonicity is maintained. However, if the primary instance of the centralized service crashes, a few IDs might become non-continuous. This occurs because the secondary instance discards some IDs allocated by the primary instance during failover to ensure ID uniqueness.
- During rolling upgrades of TiDB nodes
- During normal concurrent transactions (similar to MySQL)
> **Note:**
>
>
> The behavior and performance of `AUTO_ID_CACHE 1` have evolved across TiDB versions:
>
> - Before v6.4.0, each ID allocation requires a TiKV transaction, which affects performance.
58 changes: 32 additions & 26 deletions br/br-log-architecture.md
@@ -108,18 +108,24 @@ Log backup generates the following types of files:
.
├── v1
│   ├── backupmeta
│   │   ├── {min_restored_ts}-{uuid}.meta
│   │   └── {checkpoint}-{uuid}.meta
│   │   ├── ...
│   │   └── {resolved_ts}-{uuid}.meta
│   ├── global_checkpoint
│   │   └── {store_id}.ts
│   └── {date}
│      └── {hour}
│         └── {store_id}
│            ├── {min_ts}-{uuid}.log
│            └── {min_ts}-{uuid}.log
└── v1_stream_truncate_safepoint.txt
│   │   └── {store_id}.ts
│   └── {date}
│      └── {hour}
│         └── {store_id}
│            ├── ...
│            └── {min_ts}-{uuid}.log
└── v1_stream_truncate_safepoint.txt
```

Explanation of the backup file directory structure:

- `backupmeta`: stores backup metadata. The `resolved_ts` in the filename indicates the backup progress, meaning that data before this TSO has been fully backed up. However, note that this TSO only reflects the progress of certain shards.
- `global_checkpoint`: represents the global backup progress. It records the latest point in time to which data can be restored using `br restore point`.
- `{date}/{hour}`: stores backup data for the corresponding date and hour. When cleaning up storage, always use `br log truncate` instead of manually deleting data. This is because the metadata references the data in this directory, and manual deletion might lead to restore failures or data inconsistencies after restore.

The following is an example:

```
@@ -129,24 +135,24 @@ The following is an example:
│   │   ├── ...
│   │   ├── 435213818858112001-e2569bda-a75a-4411-88de-f469b49d6256.meta
│   │   ├── 435214043785779202-1780f291-3b8a-455e-a31d-8a1302c43ead.meta
│   │   └── 435214443785779202-224f1408-fff5-445f-8e41-ca4fcfbd2a67.meta
│   │   └── 435214443785779202-224f1408-fff5-445f-8e41-ca4fcfbd2a67.meta
│   ├── global_checkpoint
│   │   ├── 1.ts
│   │   ├── 2.ts
│   │   └── 3.ts
│   └── 20220811
│      └── 03
│         ├── 1
│         │   ├── ...
│         │   ├── 435213866703257604-60fcbdb6-8f55-4098-b3e7-2ce604dafe54.log
│         │   └── 435214023989657606-72ce65ff-1fa8-4705-9fd9-cb4a1e803a56.log
│         ├── 2
│         │   ├── ...
│         │   ├── 435214102632857605-11deba64-beff-4414-bc9c-7a161b6fb22c.log
│         │   └── 435214417205657604-e6980303-cbaa-4629-a863-1e745d7b8aed.log
│         └── 3
│            ├── ...
│            ├── 435214495848857605-7bf65e92-8c43-427e-b81e-f0050bd40be0.log
│            └── 435214574492057604-80d3b15e-3d9f-4b0c-b133-87ed3f6b2697.log
└── v1_stream_truncate_safepoint.txt
│   │   └── 3.ts
│   └── 20220811
│      └── 03
│         ├── 1
│         │   ├── ...
│         │   ├── 435213866703257604-60fcbdb6-8f55-4098-b3e7-2ce604dafe54.log
│         │   └── 435214023989657606-72ce65ff-1fa8-4705-9fd9-cb4a1e803a56.log
│         ├── 2
│         │   ├── ...
│         │   ├── 435214102632857605-11deba64-beff-4414-bc9c-7a161b6fb22c.log
│         │   └── 435214417205657604-e6980303-cbaa-4629-a863-1e745d7b8aed.log
│         └── 3
│            ├── ...
│            ├── 435214495848857605-7bf65e92-8c43-427e-b81e-f0050bd40be0.log
│            └── 435214574492057604-80d3b15e-3d9f-4b0c-b133-87ed3f6b2697.log
└── v1_stream_truncate_safepoint.txt
```
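
Each file name in the example starts with a timestamp followed by a UUID. The following Python sketch splits such a name into its parts and converts the timestamp to wall-clock time. It assumes the usual TiDB TSO layout (physical milliseconds in the high bits and an 18-bit logical counter in the low bits), which this document does not state. The helper names `parse_backup_name` and `tso_to_datetime` are hypothetical and are not part of BR.

```python
from datetime import datetime, timezone
from typing import Tuple

LOGICAL_BITS = 18  # assumption: the low 18 bits of a TSO hold the logical counter


def parse_backup_name(filename: str) -> Tuple[int, str, str]:
    """Split '{ts}-{uuid}.{ext}' into (ts, uuid, ext)."""
    stem, _, ext = filename.rpartition(".")
    ts_text, _, uuid_text = stem.partition("-")
    return int(ts_text), uuid_text, ext


def tso_to_datetime(tso: int) -> datetime:
    """Convert the physical part of a TSO (milliseconds) to a UTC datetime."""
    physical_ms = tso >> LOGICAL_BITS
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)


name = "435213866703257604-60fcbdb6-8f55-4098-b3e7-2ce604dafe54.log"
ts, uuid_text, ext = parse_backup_name(name)
print(ts, uuid_text, ext, tso_to_datetime(ts).date())
```

A helper like this is only useful for inspecting a backup directory. To clean up data, still use `br log truncate` as described above.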
73 changes: 57 additions & 16 deletions dm/dm-master-configuration-file.md
@@ -45,19 +45,60 @@ This section introduces the configuration parameters of DM-master.

### Global configuration

| Parameter | Description |
| :------------ | :--------------------------------------- |
| `name` | The name of the DM-master. |
| `log-level` | Specifies a log level from `debug`, `info`, `warn`, `error`, and `fatal`. The default log level is `info`. |
| `log-file` | Specifies the log file directory. If the parameter is not specified, the logs are printed onto the standard output. |
| `master-addr` | Specifies the address of DM-master which provides services. You can omit the IP address and specify the port number only, such as ":8261". |
| `advertise-addr` | Specifies the address that DM-master advertises to the outside world. |
| `peer-urls` | Specifies the peer URL of the DM-master node. |
| `advertise-peer-urls` | Specifies the peer URL that DM-master advertises to the outside world. The value of `advertise-peer-urls` is by default the same as that of `peer-urls`. |
| `initial-cluster` | The value of `initial-cluster` is the combination of the `advertise-peer-urls` value of all DM-master nodes in the initial cluster. |
| `join` | The value of `join` is the combination of the `advertise-peer-urls` value of the existed DM-master nodes in the cluster. If the DM-master node is newly added, replace `initial-cluster` with `join`. |
| `ssl-ca` | The path of the file that contains list of trusted SSL CAs for DM-master to connect with other components. |
| `ssl-cert` | The path of the file that contains X509 certificate in PEM format for DM-master to connect with other components. |
| `ssl-key` | The path of the file that contains X509 key in PEM format for DM-master to connect with other components. |
| `cert-allowed-cn` | Common Name list. |
| `secret-key-path` | The file path of the secret key, which is used to encrypt and decrypt upstream and downstream passwords. The file must contain a 64-character hexadecimal AES-256 secret key. One way to generate this key is by calculating SHA256 checksum of random data, such as <code>head -n 256 /dev/urandom \| sha256sum</code>. For more information, see [Customize a secret key for DM encryption and decryption](/dm/dm-customized-secret-key.md). |
#### `name`

- The name of the DM-master.

#### `log-level`

- Specifies a log level.
- Default value: `info`
- Value options: `debug`, `info`, `warn`, `error`, `fatal`

#### `log-file`

- Specifies the log file directory. If this parameter is not specified, logs are printed to the standard output.

#### `master-addr`

- Specifies the address on which DM-master provides services. You can omit the IP address and specify the port number only, such as `":8261"`.

#### `advertise-addr`

- Specifies the address that DM-master advertises to the outside world.

#### `peer-urls`

- Specifies the peer URL of the DM-master node.

#### `advertise-peer-urls`

- Specifies the peer URL that DM-master advertises to the outside world. The value of `advertise-peer-urls` is by default the same as that of [`peer-urls`](#peer-urls).

#### `initial-cluster`

- The value of `initial-cluster` is the combination of the [`advertise-peer-urls`](#advertise-peer-urls) value of all DM-master nodes in the initial cluster.

#### `join`

- The value of `join` is the combination of the [`advertise-peer-urls`](#advertise-peer-urls) value of the existing DM-master nodes in the cluster. If the DM-master node is newly added, replace `initial-cluster` with `join`.

#### `ssl-ca`

- The path of the file that contains the list of trusted SSL CAs for DM-master to connect with other components.

#### `ssl-cert`

- The path of the file that contains the X509 certificate in PEM format for DM-master to connect with other components.

#### `ssl-key`

- The path of the file that contains the X509 key in PEM format for DM-master to connect with other components.

#### `cert-allowed-cn`

- Common Name list.

#### `secret-key-path`

- The file path of the secret key, which is used to encrypt and decrypt upstream and downstream passwords. The file must contain a 64-character hexadecimal AES-256 secret key. One way to generate this key is by calculating SHA256 checksum of random data, such as `head -n 256 /dev/urandom | sha256sum`. For more information, see [Customize a secret key for DM encryption and decryption](/dm/dm-customized-secret-key.md).
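
As an alternative to the shell pipeline above, the following Python sketch produces a key in the required shape (64 hexadecimal characters, that is, 32 random bytes) and checks that a candidate string has that shape. It uses only the standard `secrets` and `string` modules. The function names and the idea of validating the key before writing it to the file are illustrative assumptions, not part of DM.

```python
import secrets
import string


def generate_secret_key() -> str:
    """Return 32 random bytes encoded as 64 hexadecimal characters."""
    return secrets.token_hex(32)


def is_valid_secret_key(key: str) -> bool:
    """Check the shape described above: exactly 64 hexadecimal characters."""
    return len(key) == 64 and all(c in string.hexdigits for c in key)


key = generate_secret_key()
print(is_valid_secret_key(key))
```

The generated string can then be saved to the file that `secret-key-path` points to.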
115 changes: 87 additions & 28 deletions dm/dm-source-configuration-file.md
@@ -66,49 +66,108 @@ This section describes each configuration parameter in the configuration file.

### Global configuration

| Parameter | Description |
| :------------ | :--------------------------------------- |
| `source-id` | Represents a MySQL instance ID. |
| `enable-gtid` | Determines whether to pull binlog from the upstream using GTID. The default value is `false`. In general, you do not need to configure `enable-gtid` manually. However, if GTID is enabled in the upstream database, and the primary/secondary switch is required, you need to set `enable-gtid` to `true`. |
| `enable-relay` | Determines whether to enable the relay log feature. The default value is `false`. This parameter takes effect from v5.4. Additionally, you can [enable relay log dynamically](/dm/relay-log.md#enable-and-disable-relay-log) using the `start-relay` command. |
| `relay-binlog-name` | Specifies the file name from which DM-worker starts to pull the binlog. For example, `"mysql-bin.000002"`. It only works when `enable_gtid` is `false`. If this parameter is not specified, DM-worker will start pulling from the earliest binlog file being replicated. Manual configuration is generally not required. |
| `relay-binlog-gtid` | Specifies the GTID from which DM-worker starts to pull the binlog. For example, `"e9a1fc22-ec08-11e9-b2ac-0242ac110003:1-7849"`. It only works when `enable_gtid` is `true`. If this parameter is not specified, DM-worker will start pulling from the latest GTID being replicated. Manual configuration is generally not required. |
| `relay-dir` | Specifies the relay log directory. |
| `host` | Specifies the host of the upstream database. |
| `port` | Specifies the port of the upstream database. |
| `user` | Specifies the username of the upstream database. |
| `password` | Specifies the user password of the upstream database. It is recommended to use the password encrypted with dmctl. |
| `security` | Specifies the TLS config of the upstream database. The configured file paths of the certificates must be accessible to all nodes. If the configured file paths are local paths, then all the nodes in the cluster need to store a copy of the certificates in the same path of each host.|
#### `source-id`

- Represents a MySQL instance ID.

#### `enable-gtid`

- Determines whether to pull binlog from the upstream using GTID.
- In general, you do not need to configure `enable-gtid` manually. However, if GTID is enabled in the upstream database and a primary/secondary switchover is required, you need to set `enable-gtid` to `true`.
- Default value: `false`

#### `enable-relay`

- Determines whether to enable the relay log feature. This parameter takes effect from v5.4. Additionally, you can [enable relay log dynamically](/dm/relay-log.md#enable-and-disable-relay-log) using the `start-relay` command.
- Default value: `false`

#### `relay-binlog-name`

- Specifies the file name from which DM-worker starts to pull the binlog. For example, `"mysql-bin.000002"`.
- It only works when [`enable-gtid`](#enable-gtid) is `false`. If this parameter is not specified, DM-worker will start pulling from the earliest binlog file being replicated. Manual configuration is generally not required.

#### `relay-binlog-gtid`

- Specifies the GTID from which DM-worker starts to pull the binlog. For example, `"e9a1fc22-ec08-11e9-b2ac-0242ac110003:1-7849"`.
- It only works when [`enable-gtid`](#enable-gtid) is `true`. If this parameter is not specified, DM-worker will start pulling from the latest GTID being replicated. Manual configuration is generally not required.
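
The interaction between `enable-gtid`, `relay-binlog-name`, and `relay-binlog-gtid` can be sketched as a small Python function. This is only an illustration of the rule described above, not DM code: the function name `effective_relay_start`, the dictionary representation of the configuration, and the `source-id` value `mysql-replica-01` are hypothetical.

```python
from typing import Optional, Tuple


def effective_relay_start(config: dict) -> Tuple[str, Optional[str]]:
    """Return the start-position parameter that applies and its value.

    `relay-binlog-gtid` applies only when `enable-gtid` is true;
    `relay-binlog-name` applies only when it is false (the default).
    A value of None means the parameter is unset, so DM-worker chooses
    the starting point itself.
    """
    if config.get("enable-gtid", False):
        key = "relay-binlog-gtid"
    else:
        key = "relay-binlog-name"
    return key, config.get(key)


source = {
    "source-id": "mysql-replica-01",  # hypothetical value
    "enable-gtid": False,
    "relay-binlog-name": "mysql-bin.000002",
    "relay-binlog-gtid": "e9a1fc22-ec08-11e9-b2ac-0242ac110003:1-7849",
}
print(effective_relay_start(source))
```

With `enable-gtid` set to `false`, the GTID value in the example is ignored even though it is present.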

#### `relay-dir`

- Specifies the relay log directory.
- Default value: `"./relay_log"`

#### `host`

- Specifies the host of the upstream database.

#### `port`

- Specifies the port of the upstream database.

#### `user`

- Specifies the username of the upstream database.

#### `password`

- Specifies the user password of the upstream database. It is recommended to use a password encrypted with dmctl.

#### `security`

- Specifies the TLS configuration of the upstream database. The configured file paths of the certificates must be accessible to all nodes. If the configured file paths are local paths, every node in the cluster needs to store a copy of the certificates at the same path on its host.

### Relay log cleanup strategy configuration (`purge`)

Generally, there is no need to manually configure these parameters unless there is a large amount of relay logs and disk capacity is insufficient.

| Parameter | Description | Default value |
| :------------ | :--------------------------------------- | :-------------|
| `interval` | Sets the time interval at which relay logs are regularly checked for expiration, in seconds. | `3600` |
| `expires` | Sets the expiration time for relay logs, in hours. The relay log that is not written by the relay processing unit, or does not need to be read by the existing data migration task will be deleted by DM if it exceeds the expiration time. If this parameter is not specified, the automatic purge is not performed. | `0` |
| `remain-space` | Sets the minimum amount of free disk space, in gigabytes. When the available disk space is smaller than this value, DM-worker tries to delete relay logs. | `15` |
#### `interval`

- Specifies the time interval at which relay logs are regularly checked for expiration.
- Default value: `3600`
- Unit: seconds

#### `expires`

- Specifies the expiration time for relay logs.
- A relay log that is not being written by the relay processing unit, or that no existing data migration task needs to read, is deleted by DM once it exceeds the expiration time. If this parameter is not specified, the automatic purge is not performed.
- Default value: `0`
- Unit: hours

#### `remain-space`

- Specifies the minimum amount of free disk space. When the available disk space is smaller than this value, DM-worker tries to delete relay logs.
- Default value: `15`
- Unit: GiB

> **Note:**
>
> The automatic data purge strategy only takes effect when `interval` is not 0 and at least one of the two configuration items `expires` and `remain-space` is not 0.
> The automatic data purge strategy only takes effect when [`interval`](#interval) is not `0` and at least one of the two configuration items [`expires`](#expires) and [`remain-space`](#remain-space) is not `0`.
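
The condition in the note can be written as a one-line predicate. The following Python sketch uses the default values listed above. The function name `auto_purge_active` is hypothetical and the code is not taken from DM.

```python
def auto_purge_active(interval: int = 3600, expires: int = 0, remain_space: int = 15) -> bool:
    """Return True when the automatic purge strategy takes effect.

    The strategy needs a non-zero `interval` and at least one non-zero
    value among `expires` and `remain-space`.
    """
    return interval != 0 and (expires != 0 or remain_space != 0)


print(auto_purge_active())                           # default values
print(auto_purge_active(interval=0))                 # periodic check disabled
print(auto_purge_active(expires=0, remain_space=0))  # no purge criterion set
```

With the defaults, the strategy is active because `remain-space` is `15`, even though `expires` is `0`.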
### Task status checker configuration (`checker`)

DM periodically checks the current task status and error message to determine whether resuming the task will eliminate the error. If needed, DM automatically retries resuming the task. DM adjusts the checking interval using an exponential backoff strategy. You can adjust this behavior with the following parameters.

| Parameter | Description |
| :------------ | :--------------------------------------- |
| `check-enable` | Whether to enable this feature. |
| `backoff-rollback` | If the current checking interval of backoff strategy is larger than this value and the task status is normal, DM will try to decrease the interval. |
| `backoff-max` | The maximum value of checking interval of backoff strategy, must be larger than 1 second. |
#### `check-enable`

- Determines whether to enable automatic task status checking.

#### `backoff-rollback`

- If the current checking interval of the backoff strategy is larger than this value and the task status is normal, DM tries to decrease the interval.

#### `backoff-max`

- Specifies the maximum checking interval of the backoff strategy. The value must be larger than 1 second.
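
To show how `backoff-rollback` and `backoff-max` bound an exponential backoff, the following Python sketch computes the next checking interval in seconds. It is a simplified model, not DM's implementation: the doubling and halving factor, the one-second lower bound, the function name `next_check_interval`, and the example values are all assumptions made for illustration.

```python
def next_check_interval(
    current: float,
    healthy: bool,
    backoff_rollback: float,
    backoff_max: float,
    backoff_min: float = 1.0,  # assumed lower bound, in seconds
    factor: float = 2.0,       # assumed growth/shrink factor
) -> float:
    """Return the next checking interval in seconds.

    On an unhealthy check the interval grows, capped at `backoff_max`.
    On a healthy check the interval shrinks only if it is currently
    larger than `backoff_rollback`; otherwise it stays unchanged.
    """
    if not healthy:
        return min(backoff_max, current * factor)
    if current > backoff_rollback:
        return max(backoff_min, current / factor)
    return current


# Hypothetical settings: roll back above 10 seconds, never exceed 60 seconds.
interval = 8.0
for healthy in (False, False, False, True, True):
    interval = next_check_interval(interval, healthy, backoff_rollback=10, backoff_max=60)
    print(healthy, interval)
```

In this model, repeated failures push the interval toward `backoff-max`, and the interval shrinks again only after the task has returned to a normal state.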

### Binlog event filter

Starting from DM v2.0.2, you can configure binlog event filters in the source configuration file.

| Parameter | Description |
| :------------ | :--------------------------------------- |
| `case-sensitive` | Determines whether the filtering rules are case-sensitive. The default value is `false`. |
| `filters` | Sets binlog event filtering rules. For details, see [Binlog event filter parameter explanation](/dm/dm-binlog-event-filter.md#parameter-descriptions). |
#### `case-sensitive`

- Determines whether the filtering rules are case-sensitive.
- Default value: `false`

#### `filters`

- Specifies binlog event filtering rules. For details, see [Binlog event filter parameter explanation](/dm/dm-binlog-event-filter.md#parameter-descriptions).