to sync 2 or more postgresql database that exist on seprate server (with specific IP) we have some way.
First we should tell about replication. replication uses when we want to make an backup from current exist postgresql DB. not a fix backup, an streaming replication that replicate each changes in primary DB on others backup DB.
So we can use this to make a mirror server, but what happen if we use write query (e.g. insert, update) on the secondery db (backup db)? ok we should not allow to write on db because every change on backup db cause to dismiss data because new data has been written on that when update from primary db. So what should we do? we should forward write query to primary db and after primary DB changes the replication send new data to backup DB so the user think write on the DB really worked while the write query forwarded to primary DB.
we start with Replication config.
1. Replication
to make an replication server we only need to use this command:
pg_basebackup -D /path/to/new/data/dir -R --slot=some_name -C -h host -p port -U dbuser
it's realy a simple way to make an database for replication. note that this command should be run on the backup server, so `host` should the IP address of primary server.
maybe you can not connect to primary srever because postgresql denied access to db from other remote server (only localhost access is allowed). so to change this you should change below file.
/etc/postgresql/14/main/pg_hba.conf
so you can below line in the `IPv4 local connections` section:
host all all 10.10.20.31/32 scram-sha-256
you should replace 10.10.20.31 with your DB backup server IP address.
NOTE: if you use `trust` instead of scram-sha-256 you does not need any password to connect to primary DB in backup server.
after that you can check access by run this command on backup server.
psql -U postgres -p 5432 -h 10.10.20.30
we assume that 10.10.20.30 is IP of primary server.
because in the backup DB every user has been transfered to backup server so not need any extra authentication or not need any password. after pg_basebackup the primary DB and backup DB can be run and reflect any change in priamry on backup.
2. pgpool
after init replication we should initialize a way to forward write command from backup DB to primary DB. we use pgpool for this. the pgpool often use for load balanceing.
to start work with pgpool I recommended you to use version 4.3.0 because older version has bugs and you can't use follow this article with older version.
to install pgpool v4.3.0 you should compile it from source code. (use --with-openssl with configure before use make to compile)
after install pgpool v4.3.0 you should config it. the config file located at /usr/local/etc/pgpool.conf you should introduce your priamry and backup db server IPs.
and also enable health check and sc_username to automaticaly check connection between DBs.
NOTE: the pool_passwd in /usr/local/etc/ should contain username and password. for exmaple
postgres:TEXTpostgresql
this means user postgres password postgresql.
NOTE: in the primary DB server and backup DB server pg_hba.conf should be similar because pgpool use same username password for all DB.
you can make an replication user to increase security and use repuser instead of postgres user.
to run pgpool you can use this command:
pgpool -n -d -D
-n cause pgpool run in current terminal not in the background.
-d shows debug message
-D cause delete status file and retry to get new status (is help you some times)
after pgpool run you can connect to it with below command:
psql -U postgres -p 9999 -h localhost
NOTE: in version 4.3.0 the roles for the servers show as primary and standby but in older version it shows at master and slave. don't use older version at all.
to show status of servers you can use this command after open psql:
show pool_nodes;
it get you servers status.
NOTE: I test replicatoin, when you add a column or add/delete an user replication send this to backup db so create table queries and some alter query also supported in replication.
NOTE: to run an instance of postgresql with specific data directory you can use this command:
/usr/lib/postgresql/14/bin/pg_ctl -D /path/to/data_dir start
Question1 what is diff between status and pg_status
if an node detached with pcp_detach_node the status goes to down. but pg_status show status of backend so it remain UP. you can attach node again with pcp_attach_node command.
but issue occured when primary backend has been failed. when backend server connection lose, the pgpool change status to DOWN if pgpool can connect to server again the status remain DOWN this is low of pgpool but pg_status set to UP again.
when primary backend goto down the pgpool should be restarted to UP again. so pcp_attach_node not work for this situation.
to slove this you can use health_check_max_retries config in pgpool.conf according to pgpool document: This setting can be useful in spotty networks, when it is expected that health checks will fail occasionally even when the primary node is fine.
and according to that docuemnt: It is advised that failover_on_backend_error must be disabled, if you want to enable health_check_max_retries.
so you can increase health_check_max_retries and disable failover_on_backend_error config.
or also to solve this problem you can disable failover_on_backend_error and disable health_check also this cause every request check server status. and status not to DOWN when occured any error.
refrences:
https://www.youtube.com/watch?v=qpxKlH7DBjU
https://www.pgpool.net/docs/42/en/html/auth-methods.html
https://www.pgpool.net/docs/latest/en/html/runtime-config-failover.html#GUC-FAILOVER-ON-BACKEND-ERROR
https://www.pgpool.net/docs/latest/en/html/runtime-config-health-check.html