
Commit 95785f0

docs(etcdctl): document diagnosis subcommand and example report
Adds README section for the `diagnosis` command, its flags, and its output. Fixes README link to the example.

Signed-off-by: Yerasala Venkata, Seshachalam <[email protected]>
1 parent 11dfb31 commit 95785f0

2 files changed: +218 -0 lines changed


etcdctl/README.md

Lines changed: 37 additions & 0 deletions
@@ -1129,6 +1129,43 @@ DOWNGRADE CANCEL cancels the ongoing downgrade action to cluster.
 ./etcdctl downgrade cancel
 Downgrade cancel success, cluster version 3.5
 ```
+### DIAGNOSIS
+
+`etcdctl diagnosis [flags]` - Collects and analyzes troubleshooting data from a running etcd cluster.
+
+The `diagnosis` command gathers a concise set of diagnostic details from each cluster member by performing several checks, including:
+
+* **Membership checks**: Verifies the cluster membership information.
+* **Endpoint status**: Retrieves the status of each endpoint.
+* **Serializable and linearizable reads**: Performs read operations to validate data consistency.
+* **Metrics snapshot**: Collects a small snapshot of key metrics.
+
+#### Flags
+
+- `--cluster`: use all endpoints discovered from the cluster member list.
+- `--etcd-storage-quota-bytes`: expected etcd storage quota in bytes (value passed to etcd with `--quota-backend-bytes`).
+- `-o, --output`: optional file path to write the JSON report; by default the report is written to stdout. Logs are written to stderr.
+
+Global flags (like `--endpoints`, TLS, auth, and timeouts) are shared with other `etcdctl` commands. See `etcdctl options` for the full list.
+
+#### Examples
+
+To perform analysis of a running etcd cluster, you can use the following command. This will collect and analyze data from all specified endpoints.
+
+```bash
+etcdctl diagnosis --endpoints=https://10.0.1.10:2379,https://10.0.1.11:2379,https://10.0.1.12:2379 \
+  --cacert ./ca.crt --key ./etcd-diagnosis.key --cert ./etcd-diagnosis.crt
+
+# Use cluster-discovered endpoints
+etcdctl diagnosis --cluster
+
+# Write report to a file (logs still go to stderr)
+etcdctl diagnosis -o report.json
+```
+
+
+Example output: see [ctlv3/command/diagnosis/examples/etcd_diagnosis_report.json](ctlv3/command/diagnosis/examples/etcd_diagnosis_report.json)
+
 
 ## Concurrency commands
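
Reviewer note: a report written with `-o report.json` lends itself to scripted triage. The following is a minimal Python sketch, not part of this commit, that prints one line per checker. It assumes the report layout of the example file added here, where `summary` is a list for some checkers, a plain string for others, and absent for `membershipChecker`. The names `SAMPLE_REPORT` and `summarize` are hypothetical, and the embedded JSON is a trimmed stand-in for a real report.

```python
import json

# Trimmed stand-in for a report written with `etcdctl diagnosis -o report.json`.
# Field names mirror the example report; the content is abbreviated.
SAMPLE_REPORT = """
{
  "results": [
    {"name": "membershipChecker", "memberList": {"members": []}},
    {"name": "epStatusChecker", "summary": ["Successful"]},
    {"name": "serializableReadChecker", "summary": "Successful"}
  ]
}
"""


def summarize(report):
    """Map each checker name to a list of its summary strings."""
    by_checker = {}
    for result in report.get("results", []):
        summary = result.get("summary")
        if summary is None:
            lines = []  # e.g. membershipChecker carries no summary field
        elif isinstance(summary, str):
            lines = [summary]  # read checkers use a single string
        else:
            lines = list(summary)  # status and metrics checkers use a list
        by_checker[result["name"]] = lines
    return by_checker


summary_by_checker = summarize(json.loads(SAMPLE_REPORT))
for name, lines in summary_by_checker.items():
    print(f"{name}: {', '.join(lines) or '(no summary field)'}")
```

To run it against a real report, replace `json.loads(SAMPLE_REPORT)` with `json.load(open("report.json"))`. Normalizing `summary` to a list keeps the caller independent of which shape a given checker emits.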

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
+{
+  "input": {
+    "endpoints": [
+      "http://127.0.0.1:2379"
+    ],
+    "useClusterEndpoints": true,
+    "dial-timeout": 2000000000,
+    "command-timeout": 5000000000,
+    "keep-alive-time": 2000000000,
+    "keep-alive-timeout": 5000000000,
+    "insecure": true,
+    "insecure-discovery": true,
+    "db-quota-bytes": 2147483648
+  },
+  "results": [
+    {
+      "name": "membershipChecker",
+      "memberList": {
+        "header": {
+          "cluster_id": 17237436991929493444,
+          "member_id": 9372538179322589801,
+          "raft_term": 2
+        },
+        "members": [
+          {
+            "ID": 9372538179322589801,
+            "name": "infra1",
+            "peerURLs": [
+              "http://127.0.0.1:12380"
+            ],
+            "clientURLs": [
+              "http://127.0.0.1:2379"
+            ]
+          },
+          {
+            "ID": 10501334649042878790,
+            "name": "infra2",
+            "peerURLs": [
+              "http://127.0.0.1:22380"
+            ],
+            "clientURLs": [
+              "http://127.0.0.1:22379"
+            ]
+          },
+          {
+            "ID": 18249187646912138824,
+            "name": "infra3",
+            "peerURLs": [
+              "http://127.0.0.1:32380"
+            ],
+            "clientURLs": [
+              "http://127.0.0.1:32379"
+            ]
+          }
+        ]
+      }
+    },
+    {
+      "name": "epStatusChecker",
+      "summary": [
+        "Successful"
+      ],
+      "epStatusList": [
+        {
+          "endpoint": "http://127.0.0.1:2379",
+          "epStatus": {
+            "header": {
+              "cluster_id": 17237436991929493444,
+              "member_id": 9372538179322589801,
+              "revision": 1,
+              "raft_term": 2
+            },
+            "version": "3.5.9",
+            "dbSize": 98304,
+            "leader": 18249187646912138824,
+            "raftIndex": 8,
+            "raftTerm": 2,
+            "raftAppliedIndex": 8,
+            "dbSizeInUse": 98304
+          }
+        },
+        {
+          "endpoint": "http://127.0.0.1:22379",
+          "epStatus": {
+            "header": {
+              "cluster_id": 17237436991929493444,
+              "member_id": 10501334649042878790,
+              "revision": 1,
+              "raft_term": 2
+            },
+            "version": "3.5.9",
+            "dbSize": 98304,
+            "leader": 18249187646912138824,
+            "raftIndex": 8,
+            "raftTerm": 2,
+            "raftAppliedIndex": 8,
+            "dbSizeInUse": 98304
+          }
+        },
+        {
+          "endpoint": "http://127.0.0.1:32379",
+          "epStatus": {
+            "header": {
+              "cluster_id": 17237436991929493444,
+              "member_id": 18249187646912138824,
+              "revision": 1,
+              "raft_term": 2
+            },
+            "version": "3.5.9",
+            "dbSize": 98304,
+            "leader": 18249187646912138824,
+            "raftIndex": 8,
+            "raftTerm": 2,
+            "raftAppliedIndex": 8,
+            "dbSizeInUse": 98304
+          }
+        }
+      ]
+    },
+    {
+      "name": "serializableReadChecker",
+      "summary": "Successful",
+      "readResponses": [
+        {
+          "endpoint": "http://127.0.0.1:2379",
+          "took": "686.5µs"
+        },
+        {
+          "endpoint": "http://127.0.0.1:22379",
+          "took": "1.129291ms"
+        },
+        {
+          "endpoint": "http://127.0.0.1:32379",
+          "took": "1.034625ms"
+        }
+      ]
+    },
+    {
+      "name": "linearizableReadChecker",
+      "summary": "Successful",
+      "readResponses": [
+        {
+          "endpoint": "http://127.0.0.1:2379",
+          "took": "1.286333ms"
+        },
+        {
+          "endpoint": "http://127.0.0.1:22379",
+          "took": "890.417µs"
+        },
+        {
+          "endpoint": "http://127.0.0.1:32379",
+          "took": "1.257791ms"
+        }
+      ]
+    },
+    {
+      "name": "metricsChecker",
+      "summary": [
+        "Successful"
+      ],
+      "epMetricsList": [
+        {
+          "endpoint": "http://127.0.0.1:2379",
+          "took": "3.752625ms",
+          "epMetrics": {
+            "etcd_disk_backend_commit_duration_seconds_bucket": [
+              "etcd_disk_backend_commit_duration_seconds_bucket{le=\"0.001\"} 0"
+            ],
+            "etcd_disk_wal_fsync_duration_seconds_bucket": [
+              "etcd_disk_wal_fsync_duration_seconds_bucket{le=\"0.001\"} 0"
+            ],
+            "etcd_network_peer_round_trip_time_seconds_bucket": [
+              "etcd_network_peer_round_trip_time_seconds_bucket{To=\"91bc3c398fb3c146\",le=\"0.0001\"} 2"
+            ],
+            "process_resident_memory_bytes": null
+          }
+        }
+      ]
+    }
+  ]
+}
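
Reviewer note: the fields in this example report support a few quick sanity checks. The Python sketch below is illustrative only and is not part of this commit. It computes each endpoint's `dbSize` as a fraction of `db-quota-bytes`, collects the distinct `leader` IDs reported by the endpoints, and converts the timeout values to seconds. Reading the timeouts as nanoseconds is an assumption, based on `2000000000` and `5000000000` matching 2 s and 5 s. The helper names (`find_result`, `quota_usage`, `leaders_seen`, `timeouts_seconds`) are hypothetical, and `SAMPLE` is a trimmed copy of the report.

```python
import json

# Trimmed copy of the example report; only the fields used below are kept.
SAMPLE = """
{
  "input": {
    "dial-timeout": 2000000000,
    "command-timeout": 5000000000,
    "db-quota-bytes": 2147483648
  },
  "results": [
    {
      "name": "epStatusChecker",
      "epStatusList": [
        {"endpoint": "http://127.0.0.1:2379",
         "epStatus": {"dbSize": 98304, "leader": 18249187646912138824}},
        {"endpoint": "http://127.0.0.1:22379",
         "epStatus": {"dbSize": 98304, "leader": 18249187646912138824}}
      ]
    }
  ]
}
"""


def find_result(report, name):
    """Return the first result entry whose "name" matches, or None."""
    for result in report.get("results", []):
        if result.get("name") == name:
            return result
    return None


def quota_usage(report):
    """Map endpoint -> dbSize as a fraction of the configured quota."""
    quota = report["input"]["db-quota-bytes"]
    statuses = find_result(report, "epStatusChecker")["epStatusList"]
    return {s["endpoint"]: s["epStatus"]["dbSize"] / quota for s in statuses}


def leaders_seen(report):
    """Set of leader IDs reported across endpoints; one element means agreement."""
    statuses = find_result(report, "epStatusChecker")["epStatusList"]
    return {s["epStatus"]["leader"] for s in statuses}


def timeouts_seconds(report):
    """Convert timeout fields to seconds, ASSUMING they are nanoseconds."""
    keys = ("dial-timeout", "command-timeout")
    return {k: report["input"][k] / 1e9 for k in keys}


report = json.loads(SAMPLE)
usage = quota_usage(report)
leaders = leaders_seen(report)
timeouts = timeouts_seconds(report)

for endpoint, fraction in usage.items():
    print(f"{endpoint}: {fraction:.4%} of quota")
print("leader agreement:", len(leaders) == 1)
print("timeouts (s):", timeouts)
```

One design note: the member and cluster IDs in the report exceed 2^53, so Python's arbitrary-precision integers are a convenient fit here. Tools that parse JSON numbers as double-precision floats may round these IDs.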
