repmgr node check — performs some health checks on a node from a replication perspective
Performs some health checks on a node from a replication perspective. This command must be run on the local node.
Currently repmgr performs health checks on physical replication slots only, with the aim of warning about streaming replication standbys which have become detached and the associated risk of uncontrolled WAL file growth.
    $ repmgr -f /etc/repmgr.conf node check
    Node "node1":
        Server role: OK (node is primary)
        Replication lag: OK (N/A - node is primary)
        WAL archiving: OK (0 pending files)
        Downstream servers: OK (2 of 2 downstream nodes attached)
        Replication slots: OK (node has no physical replication slots)
        Missing replication slots: OK (node has no missing physical replication slots)
Each check can be performed individually by supplying an additional command line parameter, e.g.:
    $ repmgr node check --role
    OK (node is primary)
Parameters for individual checks are as follows:
--role
: checks if the node has the expected role
--replication-lag
: checks if the node is lagging by more than replication_lag_warning or replication_lag_critical
--archive-ready
: checks for WAL files which have not yet been archived, and returns WARNING or CRITICAL if the number exceeds archive_ready_warning or archive_ready_critical respectively
--downstream
: checks that the expected downstream nodes are attached
--slots
: checks there are no inactive physical replication slots
--missing-slots
: checks there are no missing physical replication slots
--data-directory-config
: checks that the data directory configured in repmgr.conf matches the actual data directory. This check is not directly related to replication, but is useful for verifying that repmgr is correctly configured.
--csv
: generate output in CSV format (not available for individual checks)
--nagios
: generate output in a Nagios-compatible format (for individual checks only)
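The individual checks listed above can also be driven from a script. The following is a minimal Python sketch, not part of repmgr itself: the helper names `build_check_command` and `run_check` are hypothetical, and the default configuration path is simply the one used in the example earlier in this section. It only uses the options documented here.

```python
import subprocess

# Individual check options as listed in this section.
INDIVIDUAL_CHECKS = (
    "--role",
    "--replication-lag",
    "--archive-ready",
    "--downstream",
    "--slots",
    "--missing-slots",
    "--data-directory-config",
)


def build_check_command(check, config_file="/etc/repmgr.conf", nagios=False):
    """Return the argument list for one individual check (hypothetical helper)."""
    if check not in INDIVIDUAL_CHECKS:
        raise ValueError(f"unknown individual check: {check}")
    argv = ["repmgr", "-f", config_file, "node", "check", check]
    if nagios:
        # --nagios is documented as available for individual checks only.
        argv.append("--nagios")
    return argv


def run_check(check, **kwargs):
    """Run one check on the local node and return (exit_code, output)."""
    result = subprocess.run(
        build_check_command(check, **kwargs),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is a check result, not a crash
    )
    return result.returncode, result.stdout.strip()
```

Because the command must be run on the local node, a script like this would need to be executed on each node to be checked, for example calling `run_check("--role", nagios=True)` from a monitoring agent.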
When executing repmgr node check with one of the individual checks listed above, repmgr will emit one of the following Nagios-style exit codes (even if --nagios is not supplied):
0
: OK
1
: WARNING
2
: ERROR
3
: UNKNOWN
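As an illustration of how a wrapper script might interpret these codes, here is a small Python sketch. The names `status_label` and `needs_attention` are hypothetical, and treating any code outside the documented range as UNKNOWN is an assumption of this sketch, not documented repmgr behaviour.

```python
# Exit codes for individual checks, as listed above.
INDIVIDUAL_CHECK_STATUS = {
    0: "OK",
    1: "WARNING",
    2: "ERROR",
    3: "UNKNOWN",
}


def status_label(exit_code):
    """Translate an individual check's exit code into its status name."""
    # Assumption: codes outside 0-3 are reported as UNKNOWN.
    return INDIVIDUAL_CHECK_STATUS.get(exit_code, "UNKNOWN")


def needs_attention(exit_code):
    """Return True for any status other than OK."""
    return exit_code != 0
```

A monitoring script could, for instance, raise an alert whenever `needs_attention()` returns True and include `status_label()` in the alert text.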
One of the following exit codes will be emitted by repmgr node check if no individual check was specified:
SUCCESS (0)
No issues were detected.
ERR_NODE_STATUS (25)
One or more issues were detected.
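A script that runs the full set of checks only needs to distinguish the two codes above. The Python sketch below shows one way to do so; the function name `summarize_full_check` is hypothetical, and the handling of any other exit code is an assumption, since this section does not describe one.

```python
# Exit codes documented for a full check (no individual check specified).
SUCCESS = 0
ERR_NODE_STATUS = 25


def summarize_full_check(exit_code):
    """Return a short description of a full check's exit code."""
    if exit_code == SUCCESS:
        return "no issues were detected"
    if exit_code == ERR_NODE_STATUS:
        return "one or more issues were detected"
    # Assumption: anything else is outside what this section documents.
    return f"unexpected exit code {exit_code}"
```

When ERR_NODE_STATUS is returned, running the individual checks one by one is a reasonable way to find out which check failed.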