repmgr 2.0 released
What is repmgr?
repmgr is a set of open source tools that helps DBAs and system administrators manage a cluster of PostgreSQL databases.
By taking advantage of the Hot Standby capability introduced in PostgreSQL 9, repmgr greatly simplifies the process of setting up and managing databases with high availability and scalability requirements.
repmgr simplifies administration and daily management, enhances productivity and reduces the overall costs of a PostgreSQL cluster by:
- monitoring the replication process;
- allowing DBAs to issue high availability operations such as switch-overs and fail-overs.
repmgr is production-quality software and is used widely across the world with PostgreSQL. Many users rely on repmgr to maintain their replication setups, so we take new releases seriously and mark the level of maturity as a guide for users. As repmgr 2.0 moves towards production, we continue to regard the autofailover feature as beta; we expect it to fully mature in the 2.1 release.
2ndQuadrant provides contract support for PostgreSQL that includes both repmgr and the autofailover feature.
Features
- Improved Documentation
- General refactoring, code quality improvements and stabilization work
- Support for daemonizing (-d/--daemonize)
- PID file handling (-p/--pid-file)
- New config option: monitor_interval_secs
- New config option: retry_promote_interval
- New config option: logfile
- New config option: pg_bindir
- New config option: pgctl_options
- Add timestamps to log line in stderr
- Add an ssh_options parameter
- Make CLONE command try to make an exact copy including $PGDATA location
- Add detection of master failure
- Add the notion of a witness server
- Add experimental autofailover capabilities
- Add a configuration parameter to indicate the script to execute on failover or follow
- Make the monitoring optional and turned off by default; it can be turned on with the --monitoring-history switch
- Add tunables to specify number of retries to reconnect to master and the time between them
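As an illustration of the "timestamps in log lines" item above, here is a minimal sketch in C of a logger that prefixes each line with a timestamp and flushes the stream right away. It is not repmgr's actual logging code: the function names format_log_line() and log_line(), the timestamp layout and the use of UTC are all assumptions made for this example.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not part of repmgr): writes
 * "[YYYY-MM-DD HH:MM:SS] msg\n" into buf, using UTC.
 * Returns the number of characters written, or -1 on error or truncation. */
int format_log_line(char *buf, size_t size, time_t when, const char *msg)
{
    char stamp[32];
    struct tm *tm_utc = gmtime(&when);
    int n;

    if (tm_utc == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", tm_utc) == 0)
        return -1;

    n = snprintf(buf, size, "[%s] %s\n", stamp, msg);
    return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Hypothetical helper: emits one timestamped line on the given stream and
 * flushes it immediately, so a redirected log file does not stay empty
 * while the data sits in a stdio buffer. Returns 0 on success, -1 on error. */
int log_line(FILE *out, const char *msg)
{
    char line[512];

    if (format_log_line(line, sizeof line, time(NULL), msg) < 0)
        return -1;
    if (fputs(line, out) == EOF)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

A caller would simply use log_line(stderr, "standby connection lost"). Keeping the formatting in its own function makes the timestamp layout easy to check without touching a real stream.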
Bugfixes
- Fixed PQexec() calls: fixed several calls where we checked only the return value of PQexec() and not the result status; the query may have failed nonetheless
- Flush stderr after a log message appears: the log file could appear empty for a long time because of file buffering. We now call fflush() after every log message so the log file gets written out to disk quickly
- Fixed repmgr repl_status columns: the repmgr repl_status view had the column time_lag, which was documented to be the time a standby is behind the master. In fact it only worked like this when viewed on the standby; on the master it was only the time of the last status update. We dropped that column and replaced it with a new column, communication_time_lag, which holds the value the old column showed on the master. On the standby, the time of the last update is kept in shared memory, so the column shows the correct time regardless of where repl_status is queried. We also added a new column, replication_time_lag, which refers to the apply delay.
- Set connections to NULL when calling PQfinish() on them.
- Performance improvements: the old implementation took about 8 seconds per monitoring interval because it got caught in a sleep call and had to wait for timeouts. That is much too long, especially given the default monitor_interval value of 2 seconds, which we could never keep to. The new implementation uses PQgetResult() and select() to avoid the sleep, so the monitoring routine now needs only a fraction of the previous time (<1s).
- Leak and memory fixes: fixed some leaks and an overlapping strcpy() call.
- Overhauled CloseConnections(): CloseConnections() was a macro and did not have a NULL check before the PQisBusy() call. It also didn't set the connections to NULL. Now it is a function that checks for NULL before calling functions on connection variables and sets the connections to NULL.
- Ignore pg_log when cloning
- Correctly check wal_keep_segments
- General code refactoring
- Log format fixes
- Handle stdin/stdout/stderr for repmgrd
- Added format checking for printf()-like functions
- Added forgotten priority value when creating a witness
- pg_config is now settable from outside of the makefile
- Split install targets into install_prog and install_ext, with doing both as the default
- Flush output before calling system()
- Initialize variables, as sscanf() leaves them untouched upon error
- No longer exit when standby connection drops
- Several typos have been corrected
- Fixed string comparison when reloading config files
- Do not create data directory before sanity checks succeeded when creating a witness
- Also check if query was successful when registering a new standby
- Remove master node earlier so that master register --force succeeds when it is already registered
- Do not exit with database in backup mode (pg_start_backup())
- Debian control file now accepts PostgreSQL 9.0, 9.1, 9.2 and 9.3
- Now compiles with 9.3
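The sscanf() item in the list above describes a general C pitfall: when a conversion fails, sscanf() does not store anything in the target variable, so an uninitialized variable keeps whatever garbage it held. The following is a minimal sketch of the defensive pattern, not the actual repmgr change; parse_int_or_default() is a hypothetical helper invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper (not part of repmgr): parses a decimal integer from
 * text and returns fallback when nothing could be converted. */
int parse_int_or_default(const char *text, int fallback)
{
    /* Initialized up front: sscanf() leaves value untouched on failure. */
    int value = fallback;

    /* sscanf() returns the number of successful conversions, or EOF when
     * the input ends before the first conversion. Anything other than 1
     * means value was not assigned. */
    if (text == NULL || sscanf(text, "%d", &value) != 1)
        return fallback;

    return value;
}
```

For example, parse_int_or_default("15", 2) yields 15, while parse_int_or_default("abc", 2) yields the fallback 2. Checking the return value and initializing the variable are both cheap, and together they make the failure path explicit.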
Upcoming
Features we are working on in the near future:
- Timeline increase when a standby gets promoted
- A better check of which standby has received the most data
- Treat the fact that a standby can be delayed on purpose as a factor in the voting algorithm
- Support for delayed standbys
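To give an idea of what a check for the most advanced standby could look like, here is a rough sketch in C. It is not the planned repmgr implementation. It assumes each standby reports its last received WAL location as text in PostgreSQL's usual hex/hex form (for example 0/3000060) and picks the highest one; both function names are hypothetical, and purposely delayed standbys are not considered.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: converts a WAL location such as "0/3000060" into a
 * single 64-bit value that can be compared numerically.
 * Returns 0 on success, -1 if the text does not start with "hex/hex". */
int parse_wal_location(const char *text, uint64_t *out)
{
    unsigned int hi = 0, lo = 0;

    if (text == NULL || out == NULL || sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;

    *out = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Hypothetical helper: returns the index of the entry with the highest
 * WAL location, or -1 if no entry could be parsed. Ties go to the
 * earliest entry. */
int most_advanced_standby(const char *const locations[], int count)
{
    int best = -1;
    uint64_t best_pos = 0;

    for (int i = 0; i < count; i++) {
        uint64_t pos;

        if (parse_wal_location(locations[i], &pos) != 0)
            continue;           /* skip unparsable reports */
        if (best < 0 || pos > best_pos) {
            best = i;
            best_pos = pos;
        }
    }
    return best;
}
```

A real voting algorithm would also need to weigh node priority and intentional delay, which this sketch leaves out.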
Community and development
repmgr is free and open source software and is licensed under the GPLv3.
Contributions to repmgr are welcome. See the README.rst file for information about how to contribute.