repmgr 2.0 released
What is repmgr?
repmgr is a set of open source tools that helps DBAs and system administrators manage a cluster of PostgreSQL databases.
By taking advantage of the Hot Standby capability introduced in PostgreSQL 9, repmgr greatly simplifies the process of setting up and managing databases with high availability and scalability requirements.
repmgr simplifies administration and daily management, enhances productivity and reduces the overall costs of a PostgreSQL cluster by:
- monitoring the replication process;
- allowing DBAs to issue high availability operations such as switch-overs and fail-overs.
repmgr is production-quality software and is used widely across the world with PostgreSQL. Many users rely on repmgr to maintain their replication setups, so we take new releases seriously and mark the level of maturity as a guide for users. As repmgr 2.0 moves towards production, we continue to regard the autofailover feature as beta; we expect it to fully mature in the 2.1 release.
2ndQuadrant provides contract support for PostgreSQL that includes both repmgr and the autofailover feature.
Features
- Improved Documentation
- General refactoring, code quality improvements and stabilization work
- Support for daemonizing (-d/--daemonize)
- PID file handling (-p/--pid-file)
- New config option: monitor_interval_secs
- New config option: retry_promote_interval
- New config option: logfile
- New config option: pg_bindir
- New config option: pgctl_options
- Add timestamps to log line in stderr
- Add a ssh_options parameter
- Make CLONE command try to make an exact copy including $PGDATA location
- Add detection of master failure
- Add the notion of a witness server
- Add experimental autofailover capabilities
- Add a configuration parameter to indicate the script to execute on failover or follow
- Make the monitoring optional and turned off by default; it can be turned on with the --monitoring-history switch
- Add tunables to specify number of retries to reconnect to master and the time between them
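The retry tunables in the last item can be pictured as a bounded loop: try to connect, wait, try again, and give up after a fixed number of attempts. The following is a minimal sketch in C, not repmgr's actual code. The names reconnect_with_retries, connect_fn and fake_connect are hypothetical, and the attempt count and wait time are passed as plain arguments rather than read from a configuration file.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type: returns true when a connection attempt succeeds. */
typedef bool (*connect_fn)(void *ctx);

/*
 * Try to connect up to max_attempts times, sleeping wait_secs seconds
 * between attempts. Returns true as soon as one attempt succeeds and
 * false once all attempts have failed.
 */
static bool
reconnect_with_retries(connect_fn try_connect, void *ctx,
                       int max_attempts, unsigned int wait_secs)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (try_connect(ctx))
            return true;

        /* No point in sleeping after the final failed attempt. */
        if (attempt < max_attempts)
            sleep(wait_secs);
    }
    return false;
}

/*
 * Stand-in for a real connection attempt, for illustration only:
 * ctx points to an int holding the number of failures still to simulate.
 */
static bool
fake_connect(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With two simulated failures and three attempts allowed, reconnect_with_retries(fake_connect, &failures, 3, 0) succeeds on the third attempt; with more failures than attempts it returns false, which is the point at which a monitoring daemon would treat the master as lost.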
Bugfixes
- Fixed PQexec() calls: fixed several calls where we did not check the result status but only the return value of PQexec(); the query may have failed nonetheless
- Flush stderr after a log message appears: the log file could appear empty for a long time because of file buffering. We now call fflush() after every log message so the log file gets written out to disk quickly
- Fixed repmgr repl_status columns: the repl_status view had a time_lag column that was documented as the time a standby is behind the master. In fact it only worked that way when viewed on the standby; on the master it was merely the time of the last status update. We dropped that column and replaced it with a new column, communication_time_lag, which holds what time_lag used to show on the master. On the standby we keep the time of the last update in shared memory, so the correct time is reported wherever repl_status is queried. We also added a new column, replication_time_lag, which refers to the apply delay.
- Set connections to NULL when calling PQfinish() on them.
- Performance improvements: the old implementation took roughly 8 seconds per monitoring interval because it got caught in a sleep call and had to wait for timeouts. That is much too long, especially given the default monitor_interval value of 2 seconds, which we could never meet. The new implementation uses PQgetResult() and select() to avoid the sleep, so the monitoring routine now takes only a fraction of the previous time (<1s).
- Leak and memory fixes: fixed some leaks and an overlapping strcpy() call.
- Overhauled CloseConnections(): CloseConnections() was a macro, did not have a NULL check before its PQisBusy() call, and did not set the connections to NULL. Now it is a function that checks for NULL before calling functions on connection variables and sets the connections to NULL afterwards.
- Ignore pg_log when cloning
- Correctly check wal_keep_segments
- General code refactoring
- Log format fixes
- Handle stdin/stdout/stderr for repmgrd
- Added format checking for printf()-like functions
- Added forgotten priority value when creating a witness
- pg_config is now settable from outside of the makefile
- Split install targets into install_prog and install_ext, with both run by default
- Flush output before calling system()
- Initialize variables as sscanf() leaves them untouched upon error
- No longer exit when standby connection drops
- Several typos have been corrected
- Fixed string comparison when reloading config files
- Do not create data directory before sanity checks succeeded when creating a witness
- Also check if query was successful when registering a new standby
- Remove master node earlier so that master register --force succeeds when it is already registered
- Do not exit with database in backup mode (pg_start_backup())
pg_start_backup()) - Debian control file now accepts PostgreSQL 9.0, 9.1, 9.2 and 9.3
- Now compiles with 9.3
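Two of the smaller fixes above, flushing after every log message and initializing variables before sscanf(), can be illustrated without libpq. The following is a sketch under assumptions rather than the code shipped in repmgr: log_line and parse_seconds are hypothetical names, and the timestamp format is chosen only for the example.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: write one timestamped line to the stream and flush
 * it at once, so a redirected log file does not sit in the stdio buffer.
 * Returns 0 on success and a negative value on failure.
 */
static int
log_line(FILE *stream, const char *message)
{
    char        stamp[32];
    time_t      now = time(NULL);
    struct tm  *tm_now = localtime(&now);

    /* Fall back to an empty timestamp if the time cannot be formatted. */
    if (tm_now == NULL ||
        strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", tm_now) == 0)
        stamp[0] = '\0';

    if (fprintf(stream, "[%s] %s\n", stamp, message) < 0)
        return -1;

    /* Without this, buffered output may reach the file much later. */
    return fflush(stream);
}

/*
 * Hypothetical helper: parse an integer setting from text.
 * sscanf() leaves the target untouched when conversion fails, so the
 * variable is initialized first and the return value is checked.
 */
static int
parse_seconds(const char *text, int fallback)
{
    int value = fallback;

    if (sscanf(text, "%d", &value) != 1)
        return fallback;

    return value;
}
```

Calling log_line(stderr, "standby connection lost") writes and flushes a single line, and parse_seconds("abc", 2) returns the fallback of 2 instead of reading an uninitialized value.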
Upcoming
Features we are working on in the near future:
- Timeline increase when a standby gets promoted
- A better check of which standby has received the most data
- Treat the fact that a standby can be delayed on purpose as a factor in the voting algorithm
- Include support for delayed standbys
Community and development
repmgr is free and open source software and is licensed under the GPLv3.
Contributions to repmgr are welcome. See the README.rst file for information about how to contribute.
