Commit 1b93da6

Update pd-recover docs (pingcap#3281)
Signed-off-by: Ryan Leung <[email protected]>
1 parent cf02cc3 commit 1b93da6

1 file changed: +59 -82 lines changed

Diff for: pd-recover.md

@@ -6,145 +6,122 @@ aliases: ['/docs/dev/pd-recover/','/docs/dev/reference/tools/pd-recover/']
66

77
# PD Recover User Guide
88

9-
PD Recover is a disaster recovery tool of PD, used to recover the PD cluster which cannot start or provide services normally. PD Recover is downloaded with TiDB Ansible in the `resource/bin/pd-recover` path.
9+
PD Recover is a disaster recovery tool for PD, used to recover a PD cluster that cannot start or provide services normally.
10+
11+
## Compile from source code
12+
13+
+ [Go](https://golang.org/) version 1.13 or later is required because Go modules are used.
14+
+ In the root directory of the [PD project](https://github.com/pingcap/pd), use the `make pd-recover` command to compile and generate `bin/pd-recover`.
15+
16+
> **Note:**
17+
>
18+
> Generally, you do not need to compile the source code because the PD Recover tool is already included in the released binary or Docker image. However, developers can refer to the instructions above to compile it from source.
19+
20+
## Download TiDB installation package
21+
22+
PD Recover is included in the TiDB package, so to get the latest version of PD Recover, download the TiDB package.
23+
24+
| Package name | OS | Architecture | SHA256 checksum |
25+
|:---|:---|:---|:---|
26+
| `https://download.pingcap.org/tidb-{version}-linux-amd64.tar.gz` (pd-recover) | Linux | amd64 | `https://download.pingcap.org/tidb-{version}-linux-amd64.sha256` |
27+
28+
> **Note:**
29+
>
30+
> `{version}` indicates the version number of TiDB. For example, if `{version}` is `v4.0.0`, the package download link is `https://download.pingcap.org/tidb-v4.0.0-linux-amd64.tar.gz`. You can also download the latest unpublished version by replacing `{version}` with `latest`.
1031
1132
## Quick Start
1233

1334
This section describes how to use PD Recover to recover a PD cluster.
1435

1536
### Get cluster ID
1637

17-
The cluster ID can be obtained from the log of PD, TiKV or TiDB. To get the cluster ID, you can either use the `ansible ad-hoc` command in the Control Machine, or view the log directly on the server.
38+
The cluster ID can be obtained from the log of PD, TiKV or TiDB. To get the cluster ID, you can view the log directly on the server.
1839

19-
#### Get `[info] cluster ID` from PD log (recommended)
40+
#### Get cluster ID from PD log (recommended)
2041

21-
To get the `[info] cluster id` from the PD log, run the following command:
42+
To get the cluster ID from the PD log, run the following command:
2243

2344
{{< copyable "shell-regular" >}}
2445

25-
```
26-
ansible -i inventory.ini pd_servers -m shell -a 'cat {{deploy_dir}}/log/pd.log | grep "init cluster id" | head -10'
46+
```bash
47+
cat {{/path/to}}/pd.log | grep "init cluster id"
2748
```
2849

29-
```
30-
10.0.1.13 | CHANGED | rc=0 >>
50+
```bash
3151
[2019/10/14 10:35:38.880 +00:00] [INFO] [server.go:212] ["init cluster id"] [cluster-id=6747551640615446306]
32-
……
52+
...
3353
```
3454

35-
#### Get `[info] cluster ID` from TiDB log
55+
#### Get cluster ID from TiDB log
3656

37-
To get the `[info] cluster ID` from the TiDB log, run the following command:
57+
To get the cluster ID from the TiDB log, run the following command:
3858

3959
{{< copyable "shell-regular" >}}
4060

41-
```
42-
ansible -i inventory.ini tidb_servers -m shell -a 'cat {{deploy_dir}}/log/tidb*.log | grep "init cluster id" | head -10'
61+
```bash
62+
cat {{/path/to}}/tidb.log | grep "init cluster id"
4363
```
4464

45-
```
46-
10.0.1.15 | CHANGED | rc=0 >>
65+
```bash
4766
2019/10/14 19:23:04.688 client.go:161: [info] [pd] init cluster id 6747551640615446306
48-
……
67+
...
4968
```
5069

51-
#### Get `[info] PD cluster` from TiKV log
70+
#### Get cluster ID from TiKV log
5271

53-
To get the `[info] PD cluster` from the TiKV log, run the following command:
72+
To get the cluster ID from the TiKV log, run the following command:
5473

5574
{{< copyable "shell-regular" >}}
5675

57-
```
58-
ansible -i inventory.ini tikv_servers -m shell -a 'cat {{deploy_dir}}/log/tikv* | grep "PD cluster" | head -10'
76+
```bash
77+
cat {{/path/to}}/tikv.log | grep "connect to PD cluster"
5978
```
6079

61-
```
62-
10.0.1.15 | CHANGED | rc=0 >>
80+
```bash
6381
[2019/10/14 07:06:35.278 +00:00] [INFO] [tikv-server.rs:464] ["connect to PD cluster 6747551640615446306"]
64-
……
82+
...
6583
```
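
Because the three logs print the cluster ID in different formats, it can help to extract just the number from each and confirm that they agree before using it. The following is a minimal sketch, assuming the log lines look like the sample output shown in this guide; the variable names and `sed` patterns are illustrative and not part of PD Recover.

```bash
# Sample lines copied from the outputs in this guide.
# In practice, replace them with the matching lines from your own logs.
pd_line='[2019/10/14 10:35:38.880 +00:00] [INFO] [server.go:212] ["init cluster id"] [cluster-id=6747551640615446306]'
tidb_line='2019/10/14 19:23:04.688 client.go:161: [info] [pd] init cluster id 6747551640615446306'
tikv_line='[2019/10/14 07:06:35.278 +00:00] [INFO] [tikv-server.rs:464] ["connect to PD cluster 6747551640615446306"]'

# Keep only the digits that follow each component's marker text.
pd_id=$(printf '%s\n' "$pd_line" | sed -n 's/.*cluster-id=\([0-9][0-9]*\).*/\1/p')
tidb_id=$(printf '%s\n' "$tidb_line" | sed -n 's/.*init cluster id \([0-9][0-9]*\).*/\1/p')
tikv_id=$(printf '%s\n' "$tikv_line" | sed -n 's/.*connect to PD cluster \([0-9][0-9]*\).*/\1/p')

# Report whether all three sources give the same cluster ID.
if [ "$pd_id" = "$tidb_id" ] && [ "$pd_id" = "$tikv_id" ]; then
  echo "cluster ID agrees: $pd_id"
else
  echo "cluster ID mismatch: pd=$pd_id tidb=$tidb_id tikv=$tikv_id" >&2
fi
```

If the values differ, the logs probably come from different clusters or deployments, and the ID should be confirmed before running the recovery.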
6684

67-
### Get `Alloc ID` (TiKV StoreID)
68-
69-
The `alloc-id` value you specify must be larger than the currently largest `Alloc ID` value. To get `Alloc ID`, you can either use the `ansible ad-hoc` command in the Control Machine, or view the log directly on the server.
70-
71-
#### Get `[info] allocates id` from PD log
72-
73-
To get the `[info] allocates id` from the PD log, run the following command:
85+
### Get allocated ID
7486

75-
{{< copyable "shell-regular" >}}
87+
The allocated ID value you specify must be larger than the largest allocated ID currently in use. To get the allocated ID, you can either read it from the monitor or view the log directly on the server.
7688

77-
```
78-
ansible -i inventory.ini pd_servers -m shell -a 'cat {{deploy_dir}}/log/pd* | grep "allocates" | head -10'
79-
```
80-
81-
```
82-
10.0.1.13 | CHANGED | rc=0 >>
83-
[2019/10/15 03:15:05.824 +00:00] [INFO] [id.go:91] ["idAllocator allocates a new id"] [alloc-id=3000]
84-
[2019/10/15 08:55:01.275 +00:00] [INFO] [id.go:91] ["idAllocator allocates a new id"] [alloc-id=4000]
85-
……
86-
```
89+
#### Get allocated ID from the monitor (recommended)
8790

88-
#### Get `[info] alloc store id` from TiKV log
91+
To get the allocated ID from the monitor, make sure that the metrics you are viewing are those of **the last PD leader**. You can then get the largest allocated ID from the **Current ID allocation** panel in the PD dashboard.
92+
93+
#### Get allocated ID from PD log
8994

90-
To get the `[info] alloc store id` from the TiKV log, run the following command:
95+
To get the allocated ID from the PD log, you need to make sure that the log you are viewing is the log of **the last PD leader**, and you can get the maximum allocated ID by running the following command:
9196

9297
{{< copyable "shell-regular" >}}
9398

94-
```
95-
ansible -i inventory.ini tikv_servers -m shell -a 'cat {{deploy_dir}}/log/tikv* | grep "alloc store" | head -10'
99+
```bash
100+
cat {{/path/to}}/pd*.log | grep "idAllocator allocates a new id" | awk -F'=' '{print $2}' | awk -F']' '{print $1}' | sort -rn | head -n 1
96101
```
97102

98-
```
99-
10.0.1.13 | CHANGED | rc=0 >>
100-
[2019/10/14 07:06:35.516 +00:00] [INFO] [node.rs:229] ["alloc store id 4 "]
101-
10.0.1.14 | CHANGED | rc=0 >>
102-
[2019/10/14 07:06:35.734 +00:00] [INFO] [node.rs:229] ["alloc store id 5 "]
103-
10.0.1.15 | CHANGED | rc=0 >>
104-
[2019/10/14 07:06:35.418 +00:00] [INFO] [node.rs:229] ["alloc store id 1 "]
105-
10.0.1.21 | CHANGED | rc=0 >>
106-
[2019/10/15 03:15:05.826 +00:00] [INFO] [node.rs:229] ["alloc store id 2001 "]
107-
10.0.1.20 | CHANGED | rc=0 >>
108-
[2019/10/15 03:15:05.987 +00:00] [INFO] [node.rs:229] ["alloc store id 2002 "]
103+
```bash
104+
4000
105+
...
109106
```
110107

111-
### Deploy a new PD cluster
112-
113-
To deploy a new PD cluster, run the following command:
108+
Alternatively, you can run the above command on all PD servers and take the largest value.
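
Allocated IDs are numbers, so they need to be compared numerically: a plain text sort would rank `4000` above `10000`. The following is a minimal sketch of picking the largest value, assuming log lines in the `idAllocator allocates a new id` format; the `10000` entry and the variable names are made up for illustration.

```bash
# Two lines follow the PD log format; the third (alloc-id=10000) is a made-up
# entry added to show why a numeric comparison is needed.
sample_log='[2019/10/15 03:15:05.824 +00:00] [INFO] [id.go:91] ["idAllocator allocates a new id"] [alloc-id=3000]
[2019/10/15 08:55:01.275 +00:00] [INFO] [id.go:91] ["idAllocator allocates a new id"] [alloc-id=4000]
[2019/10/16 01:02:03.000 +00:00] [INFO] [id.go:91] ["idAllocator allocates a new id"] [alloc-id=10000]'

# Extract the digits after "alloc-id=", sort them as numbers, keep the largest.
max_alloc_id=$(printf '%s\n' "$sample_log" \
  | grep 'idAllocator allocates a new id' \
  | sed -n 's/.*\[alloc-id=\([0-9][0-9]*\)\].*/\1/p' \
  | sort -n | tail -n 1)

echo "$max_alloc_id"
```

To cover several PD servers, the same pipeline can read the concatenated logs from all of them in place of `sample_log`.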
114109

115-
{{< copyable "shell-regular" >}}
116-
117-
```
118-
ansible-playbook bootstrap.yml --tags=pd &&
119-
ansible-playbook deploy.yml --tags=pd &&
120-
ansible-playbook start.yml --tags=pd
121-
```
110+
### Deploy a new PD cluster
122111

123-
To delete the old cluster, delete the `data.pd` directory and restart the PD service.
112+
Before deploying a new PD cluster, you need to stop the existing PD cluster and then delete the previous data directory, which is specified by `--data-dir`.
124113

125114
### Use pd-recover
126115

127116
{{< copyable "shell-regular" >}}
128117

129-
```
118+
```bash
130119
./pd-recover -endpoints http://10.0.1.13:2379 -cluster-id 6747551640615446306 -alloc-id 10000
131120
```
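
The `-alloc-id` value must be larger than the largest allocated ID found earlier. The following is a minimal sketch of deriving it and assembling the command for review. The `margin` value and the variable names are assumptions for illustration; only the `-endpoints`, `-cluster-id`, and `-alloc-id` flags come from the command above.

```bash
# Values taken from the earlier steps in this guide.
cluster_id=6747551640615446306
max_alloc_id=4000

# Assumed safety margin; any value that keeps alloc_id above max_alloc_id works.
margin=6000
alloc_id=$((max_alloc_id + margin))

# Refuse to continue if the chosen value is not strictly larger.
if [ "$alloc_id" -le "$max_alloc_id" ]; then
  echo "alloc-id must be larger than $max_alloc_id" >&2
  exit 1
fi

# Print the command for review instead of executing it.
recover_cmd="./pd-recover -endpoints http://10.0.1.13:2379 -cluster-id $cluster_id -alloc-id $alloc_id"
echo "$recover_cmd"
```

With these sample values, the printed command matches the one shown above, with `-alloc-id 10000`.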
132121

133-
### Restart PD cluster
122+
### Restart the whole cluster
134123

135-
{{< copyable "shell-regular" >}}
136-
137-
```
138-
ansible-playbook rolling_update.yml --tags=pd
139-
```
140-
141-
### Restart TiDB or TiKV
142-
143-
{{< copyable "shell-regular" >}}
144-
145-
```
146-
ansible-playbook rolling_update.yml --tags=tidb,tikv
147-
```
124+
When the prompt indicates that the recovery is successful, restart the whole cluster.
148125

149126
## FAQ
150127
