Commit 7f61997

Added diagrams to overview

1 parent bcb0094 commit 7f61997

13 files changed: +291 -171 lines

docs/solutions/ha-architecture.md (+2 -2)

@@ -1,8 +1,8 @@
-# Architecture layout
+# Architecture

 As we discussed in the [overview of high availability](high-availability.md), the minimalist approach to a highly-available deployment is to have a three-node PostgreSQL cluster with the cluster management and failover mechanisms, load balancer and a backup / restore solution.

-The following diagram shows this architecture.
+The following diagram shows this architecture with the tools we recommend.

 ![Architecture of the three-node, single primary PostgreSQL cluster](../_images/diagrams/ha-architecture-patroni.png)

docs/solutions/ha-etcd.md (+103 -115)

@@ -1,142 +1,130 @@
 # Configure etcd distributed store

-The distributed configuration store provides a reliable way to store data that needs to be accessed by large scale distributed systems. The most popular implementation of the distributed configuration store is etcd. etcd is deployed as a cluster for fault-tolerance and requires an odd number of members (n/2+1) to agree on updates to the cluster state. An etcd cluster helps establish a consensus among nodes during a failover and manages the configuration for the three PostgreSQL instances.
-
-This document provides configuration for etcd version 3.5.x. For how to configure etcd cluster with earlier versions of etcd, read the blog post by _Fernando Laudares Camargos_ and _Jobin Augustine_ [PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios](https://www.percona.com/blog/postgresql-ha-with-patroni-your-turn-to-test-failure-scenarios/)
-
-If you [installed the software from tarballs](../tarball.md), check how you [enable etcd](../enable-extensions.md#etcd).
-
-The `etcd` cluster is first started in one node and then the subsequent nodes are added to the first node using the `add` command.
+In our implementation we use the etcd distributed configuration store. [Refresh your knowledge about etcd](high-availability.md#etcd).

 !!! note
+
+    If you [installed the software from tarballs](../tarball.md), you must first [enable etcd](../enable-extensions.md#etcd) before configuring it.

-Users with deeper understanding of how etcd works can configure and start all etcd nodes at a time and bootstrap the cluster using one of the following methods:
-
-* Static in the case when the IP addresses of the cluster nodes are known
-* Discovery service - for cases when the IP addresses of the cluster are not known ahead of time.
-
-See the [How to configure etcd nodes simultaneously](../how-to.md#how-to-configure-etcd-nodes-simultaneously) section for details.
-
-### Configure `node1`
-
-1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node name and IP address with the actual name and IP address of your node.
-
-    ```yaml title="/etc/etcd/etcd.conf.yaml"
-    name: 'node1'
-    initial-cluster-token: PostgreSQL_HA_Cluster_1
-    initial-cluster-state: new
-    initial-cluster: node1=http://10.104.0.1:2380
-    data-dir: /var/lib/etcd
-    initial-advertise-peer-urls: http://10.104.0.1:2380
-    listen-peer-urls: http://10.104.0.1:2380
-    advertise-client-urls: http://10.104.0.1:2379
-    listen-client-urls: http://10.104.0.1:2379
-    ```
+To get started with the `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so they can form a cluster. The following bootstrapping mechanisms are available:

-2. Start the `etcd` service to apply the changes on `node1`.
+* Static - in the case when the IP addresses of the cluster nodes are known
+* Discovery service - for cases when the IP addresses of the cluster are not known ahead of time.
+
+Since we know the IP addresses of the nodes, we will use the static method. To use the discovery service, refer to the [etcd documentation :octicons-external-link-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}.
+
+We will configure and start all etcd nodes in parallel. This can be done either by modifying each node's configuration file or by using the command line options. Use the method that you prefer.
+
+### Method 1. Modify the configuration file
+
+1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
+
+    === "node1"
+
+        ```yaml title="/etc/etcd/etcd.conf.yaml"
+        name: 'node1'
+        initial-cluster-token: PostgreSQL_HA_Cluster_1
+        initial-cluster-state: new
+        initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+        data-dir: /var/lib/etcd
+        initial-advertise-peer-urls: http://10.104.0.1:2380
+        listen-peer-urls: http://10.104.0.1:2380
+        advertise-client-urls: http://10.104.0.1:2379
+        listen-client-urls: http://10.104.0.1:2379
+        ```
+
+    === "node2"
+
+        ```yaml title="/etc/etcd/etcd.conf.yaml"
+        name: 'node2'
+        initial-cluster-token: PostgreSQL_HA_Cluster_1
+        initial-cluster-state: new
+        initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+        data-dir: /var/lib/etcd
+        initial-advertise-peer-urls: http://10.104.0.2:2380
+        listen-peer-urls: http://10.104.0.2:2380
+        advertise-client-urls: http://10.104.0.2:2379
+        listen-client-urls: http://10.104.0.2:2379
+        ```
+
+    === "node3"
+
+        ```yaml title="/etc/etcd/etcd.conf.yaml"
+        name: 'node3'
+        initial-cluster-token: PostgreSQL_HA_Cluster_1
+        initial-cluster-state: new
+        initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+        data-dir: /var/lib/etcd
+        initial-advertise-peer-urls: http://10.104.0.3:2380
+        listen-peer-urls: http://10.104.0.3:2380
+        advertise-client-urls: http://10.104.0.3:2379
+        listen-client-urls: http://10.104.0.3:2379
+        ```
+
+2. Enable and start the `etcd` service on all nodes:

     ```{.bash data-prompt="$"}
     $ sudo systemctl enable --now etcd
     $ sudo systemctl start etcd
     $ sudo systemctl status etcd
     ```

-3. Check the etcd cluster members on `node1`:
+    During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail with a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created.

-    ```{.bash data-prompt="$"}
-    $ sudo etcdctl member list --write-out=table --endpoints=http://10.104.0.1:2379
-    ```
-
-    Sample output:
-
-    ```{.text .no-copy}
-    +------------------+---------+-------+------------------------+------------------------+------------+
-    | ID               | STATUS  | NAME  | PEER ADDRS             | CLIENT ADDRS           | IS LEARNER |
-    +------------------+---------+-------+------------------------+------------------------+------------+
-    | 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://10.104.0.1:2379 | false      |
-    +------------------+---------+-------+------------------------+------------------------+------------+
-    ```
-
-4. Add the `node2` to the cluster. Run the following command on `node1`:
-
-    ```{.bash data-prompt="$"}
-    $ sudo etcdctl member add node2 --peer-ulrs=http://10.104.0.2:2380
-    ```
-
-    ??? example "Sample output"
-
-        ```{.text .no-copy}
-        Added member named node2 with ID 10042578c504d052 to cluster
-
-        etcd_NAME="node2"
-        etcd_INITIAL_CLUSTER="node2=http://10.104.0.2:2380,node1=http://10.104.0.1:2380"
-        etcd_INITIAL_CLUSTER_STATE="existing"
-        ```
+--8<-- "check-etcd.md"

-### Configure `node2`
+### Method 2. Start etcd nodes with command line options

-1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
+1. On each etcd node, set the environment variables for the cluster members, the cluster token and state:

-    ```yaml title="/etc/etcd/etcd.conf.yaml"
-    name: 'node2'
-    initial-cluster-token: PostgreSQL_HA_Cluster_1
-    initial-cluster-state: existing
-    initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380
-    data-dir: /var/lib/etcd
-    initial-advertise-peer-urls: http://10.104.0.2:2380
-    listen-peer-urls: http://10.104.0.2:2380
-    advertise-client-urls: http://10.104.0.2:2379
-    listen-client-urls: http://10.104.0.2:2379
     ```
-
-3. Start the `etcd` service to apply the changes on `node2`:
-
-    ```{.bash data-prompt="$"}
-    $ sudo systemctl enable --now etcd
-    $ sudo systemctl start etcd
-    $ sudo systemctl status etcd
-    ```
-
-### Configure `node3`
-
-1. Add `node3` to the cluster. **Run the following command on `node1`**
-
-    ```{.bash data-prompt="$"}
-    $ sudo etcdctl member add node3 http://10.104.0.3:2380
+    TOKEN=PostgreSQL_HA_Cluster_1
+    CLUSTER_STATE=new
+    NAME_1=node1
+    NAME_2=node2
+    NAME_3=node3
+    HOST_1=10.104.0.1
+    HOST_2=10.104.0.2
+    HOST_3=10.104.0.3
+    CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
     ```

-2. On `node3`, create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
-
-    ```yaml title="/etc/etcd/etcd.conf.yaml"
-    name: 'node3'
-    initial-cluster-token: PostgreSQL_HA_Cluster_1
-    initial-cluster-state: existing
-    initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
-    data-dir: /var/lib/etcd
-    initial-advertise-peer-urls: http://10.104.0.3:2380
-    listen-peer-urls: http://10.104.0.3:2380
-    advertise-client-urls: http://10.104.0.3:2379
-    listen-client-urls: http://10.104.0.3:2379
-    ```
+2. Start each etcd node in parallel using the following command:

-3. Start the `etcd` service to apply the changes.
+    === "node1"

-
-    ```{.bash data-prompt="$"}
-    $ sudo systemctl enable --now etcd
-    $ sudo systemctl start etcd
-    ```
+        ```{.bash data-prompt="$"}
+        THIS_NAME=${NAME_1}
+        THIS_IP=${HOST_1}
+        etcd --data-dir=data.etcd --name ${THIS_NAME} \
+            --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+            --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+            --initial-cluster ${CLUSTER} \
+            --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
+        ```

-4. Check the etcd cluster members.
+    === "node2"

-    ```{.bash data-prompt="$"}
-    $ sudo etcdctl member list
-    ```
+        ```{.bash data-prompt="$"}
+        THIS_NAME=${NAME_2}
+        THIS_IP=${HOST_2}
+        etcd --data-dir=data.etcd --name ${THIS_NAME} \
+            --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+            --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+            --initial-cluster ${CLUSTER} \
+            --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
+        ```

-    ??? example "Sample output"
-
-    ```
-    2d346bd3ae7f07c4: name=node2 peerURLs=http://10.104.0.2:2380 clientURLs=http://10.104.0.2:2379 isLeader=false
-    8bacb519ebdee8db: name=node3 peerURLs=http://10.104.0.3:2380 clientURLs=http://10.104.0.3:2379 isLeader=false
-    c5f52ea2ade25e1b: name=node1 peerURLs=http://10.104.0.1:2380 clientURLs=http://10.104.0.1:2379 isLeader=true
-    ```
+    === "node3"
+
+        ```{.bash data-prompt="$"}
+        THIS_NAME=${NAME_3}
+        THIS_IP=${HOST_3}
+        etcd --data-dir=data.etcd --name ${THIS_NAME} \
+            --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+            --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+            --initial-cluster ${CLUSTER} \
+            --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
+        ```
+
+--8<-- "check-etcd.md"
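The included `check-etcd.md` snippet is not part of this diff. As a rough sketch of what that verification could look like, reusing the `etcdctl` invocation and the example IP addresses from this guide, cluster membership can be checked from any node:

```{.bash data-prompt="$"}
$ sudo etcdctl member list --write-out=table --endpoints=http://10.104.0.1:2379
```

If bootstrapping succeeded, the table should list `node1`, `node2` and `node3`, each with the status `started`.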

docs/solutions/ha-haproxy.md (+67)

@@ -0,0 +1,67 @@
+# Configure HAProxy
+
+HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAProxy URL and sends its read/write requests there. Behind the scenes, HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.
+
+This way, a client application doesn’t know which node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected.
+
+1. Install HAProxy on the `HAProxy-demo` node:
+
+    ```{.bash data-prompt="$"}
+    $ sudo apt install percona-haproxy
+    ```
+
+2. The HAProxy configuration file path is `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file:
+
+    ```
+    global
+        maxconn 100
+
+    defaults
+        log global
+        mode tcp
+        retries 2
+        timeout client 30m
+        timeout connect 4s
+        timeout server 30m
+        timeout check 5s
+
+    listen stats
+        mode http
+        bind *:7000
+        stats enable
+        stats uri /
+
+    listen primary
+        bind *:5000
+        option httpchk /primary
+        http-check expect status 200
+        default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+        server node1 node1:5432 maxconn 100 check port 8008
+        server node2 node2:5432 maxconn 100 check port 8008
+        server node3 node3:5432 maxconn 100 check port 8008
+
+    listen standbys
+        balance roundrobin
+        bind *:5001
+        option httpchk /replica
+        http-check expect status 200
+        default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+        server node1 node1:5432 maxconn 100 check port 8008
+        server node2 node2:5432 maxconn 100 check port 8008
+        server node3 node3:5432 maxconn 100 check port 8008
+    ```
+
+    HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately.
+
+3. Restart HAProxy:
+
+    ```{.bash data-prompt="$"}
+    $ sudo systemctl restart haproxy
+    ```
+
+4. Check the HAProxy logs to see if there are any errors:
+
+    ```{.bash data-prompt="$"}
+    $ sudo journalctl -u haproxy.service -n 100 -f
+    ```
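As a quick smoke test of the routing described above (a sketch: the `HAProxy-demo` hostname comes from this guide, while the `postgres` user is an assumption, so adjust both to your environment), a client can confirm that port 5000 reaches the primary and port 5001 reaches a replica:

```{.bash data-prompt="$"}
$ psql -h HAProxy-demo -p 5000 -U postgres -c 'SELECT pg_is_in_recovery();'   # expected: f (primary)
$ psql -h HAProxy-demo -p 5001 -U postgres -c 'SELECT pg_is_in_recovery();'   # expected: t (replica)
```

The statistics page defined in the `listen stats` section is served on port 7000, for example at `http://HAProxy-demo:7000/`.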

docs/solutions/ha-install-software.md (+2 -2)

@@ -70,8 +70,8 @@ Run the following commands as root or with `sudo` privileges.
 3. Stop and disable all installed services:

     ```{.bash data-prompt="$"}
-    $ sudo systemctl stop {etcd,patroni,postgresql}
-    $ sudo systemctl disable {etcd,patroni,postgresql}
+    $ sudo systemctl stop {etcd,patroni,postgresql-{{pgversion}}}
+    $ sudo systemctl disable {etcd,patroni,postgresql-{{pgversion}}}
     ```

 4. Even though Patroni can use an existing Postgres installation, remove the data directory to force it to initialize a new Postgres cluster instance.
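The data directory path is not shown in this hunk and depends on the operating system and package layout. As an illustrative sketch only, assuming the Red Hat-style layout that matches the `postgresql-{{pgversion}}` service name above, the removal might look like:

```{.bash data-prompt="$"}
$ sudo rm -rf /var/lib/pgsql/{{pgversion}}/data   # assumed default path; verify your data_directory before deleting
```

On Debian or Ubuntu the default is usually `/var/lib/postgresql/{{pgversion}}/main`. Check the actual location (for example with `psql -c 'SHOW data_directory;'`) before removing anything.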

docs/solutions/ha-measure.md (+6 -1)

@@ -1,6 +1,11 @@
 # Measuring high availability

-The need for high availability is determined by the business requirements, potential risks, and operational limitations. The level of high availability depends on how much downtime you can bear without negatively impacting your users and how much data loss you can tolerate during the system outage.
+The need for high availability is determined by the business requirements, potential risks, and operational limitations (e.g., the more components you add to your infrastructure, the more complex and time-consuming it is to maintain).
+
+The level of high availability depends on the following:
+
+* how much downtime you can bear without negatively impacting your users, and
+* how much data loss you can tolerate during a system outage.

 The measurement of availability is done by establishing a measurement time frame and dividing the time the system was available by that time frame. This ratio will rarely be one, which is equal to 100% availability. At Percona, we don’t consider a solution to be highly available if it is not at least 99% or two nines available.

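To make that ratio concrete, availability over a measurement window can be written as (a sketch using the definitions above):

$$
\text{Availability} = \frac{\text{time the system was available}}{\text{measurement time frame}} \times 100\%
$$

For example, a system that is down for 3.65 days during a 365-day year was available for 361.35 days, which gives 361.35 / 365 ≈ 99%, i.e. two nines of availability.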