PostgreSQL 13 master with write-ahead log shipping replica
The following are notes from some projecty thing that summarize configuration of PostgreSQL 13 HA with HAproxy eventually bolted on top. There are multiple replication solutions that can be used. My scenario called for write-ahead log shipping. This was done on CentOS 7 machine with PostgreSQL 13 rpm packages.
The setup assumes there are no existing databases on the master or the replica - clean start. On high level, what happens is when pg_basebackup command below is run, database instances from the master are copied to the replica along with configuration files needed for HA.
To start, initialize PostgreSQL default instance on the master. At this point, $PGDATA directory on the replica will be empty. It is also assumed that both, the master and the replica have the same values for $PGDATA. Otherwise, things get funny…
[root@masterdb somedude]# /usr/pgsql-13/bin/postgresql-13-setup initdb
systemctl start postgresql-13
Next, verify the following setting is present and enabled in postgresql.conf on the master:
listen_addresses = '*'
On the master server, create a user - replicator to be used for PostgreSQL replication:
[root@masterdb somedude]# su - postgres
-bash-4.2$ psql
psql (13.3)
Type "help" for help.
postgres=# create user replicator with replication login encrypted password 'replicator';
CREATE ROLE
postgres=# quit
-bash-4.2$
The replicator user must be able to login to the master server. Remember, pg_basebackup copies everything to the replica. So, in reality, replicator will be logging in to the master from the replica, when invoked by pg_basebackup command below.
Add IPs of both servers, master and replica to pg_hba.conf. This way, when switchover happens, i.e. replica is promoted to the master, there will be no need to fiddle with pg_hba.conf.
Note, that 10.10.10.10 is the masterdb and 10.10.10.20 is the replicadb.
host replication replicator 10.10.10.10/32 scram-sha-256
host replication replicator 10.10.10.20/32 scram-sha-256
Make sure following are set in postgresql.conf. Naturally, these should be set according to your needs…
max_wal_senders = 10
max_replication_slots = 16
Now, ensure there are no databases on the replica server, i.e. $PGDATA directory is empty.
Assuming both master and replica are running - on the replica, perform backup and restore of the master server. The replica server will connect to the master server and grab master’s databases. This command needs to be run as psql user on the replica:
[root@replicadb somedude]# su - postgres
-bash-4.2$ pg_basebackup -h masterdb.unixpowered.com -U replicator -D $PGDATA -Fp -Xs -P -R -C --slot=rslot_replica
Password:
24990/24990 kB (100%), 1/1 tablespace
Start PostgreSQL on the replica. If everything goes fine you should be able to check replication status on the master:
[root@masterdb somedude]# su - postgres
-bash-4.2$ psql
psql (13.3)
Type "help" for help.
postgres=# \x auto
Expanded display is used automatically.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 23561
usesysid | 16385
usename | replicator
application_name | walreceiver
client_addr | 10.10.10.20
client_hostname |
client_port | 34072
backend_start | 2021-08-12 12:44:49.796086+00
backend_xmin |
state | streaming
sent_lsn | 0/2E000110
write_lsn | 0/2E000110
flush_lsn | 0/2E000110
replay_lsn | 0/2E000110
write_lag | 00:00:00.016497
flush_lag | 00:00:00.017004
replay_lag | 00:00:00.017043
sync_priority | 0
sync_state | async
reply_time | 2021-08-12 12:49:20.442419+00
postgres=# select usename, application_name, client_addr, state, sync_priority, sync_state from pg_stat_replication;
usename | application_name | client_addr | state | sync_priority | sync_state
------------+------------------+----------------+-----------+---------------+------------
replicator | walreceiver | 10.10.10.20 | streaming | 0 | async
(1 row)
Let’s assume something happens to the master server and the replica has to become the new master. In this scenario, the master will have to be stopped and the replica will be promoted to the master. Afterwards, new replica must be built. Assuming the master had been stopped or disappeared, never to appear:
[root@replicadb somedude]# su - postgres
-bash-4.2$ pg_ctl promote
At this point, replicadb is the new master. The old masterdb must be rebuild using pg_basebackup as shown above.
Naturally, this is not ideal. There are tools to deal with PostgreSQL HA, etc. but those did not fit my projecty thing. Hence, HAproxy was bolted on and will be described next time. In this setup, DB traffic will be directed at HAproxy, which will determine which server is acting as the replica and which server is acting as the master and route traffic appropriately.