repmgr standby switchover

repmgr standby switchover — promote a standby to primary and demote the existing primary to a standby

Description

Promotes a standby to primary and demotes the existing primary to a standby. This command must be run on the standby to be promoted, and requires a passwordless SSH connection to the current primary.

If other nodes are connected to the demotion candidate, repmgr can instruct these to follow the new primary if the option --siblings-follow is specified. This requires a passwordless SSH connection between the promotion candidate (new primary) and the nodes attached to the demotion candidate (existing primary). Note that a witness server, if in use, is also counted as a "sibling node" as it needs to be instructed to synchronise its metadata with the new primary.

Note

Performing a switchover is a non-trivial operation. In particular, it relies on the current primary being able to shut down cleanly and quickly. repmgr will attempt to check for potential issues but cannot guarantee a successful switchover.

repmgr will refuse to perform the switchover if an exclusive backup is running on the current primary, or if WAL replay is paused on the standby.

For more details on performing a switchover, including preparation and configuration, see section Performing a switchover with repmgr.

Note

From repmgr 4.2, repmgr will instruct any running repmgrd instances to pause operations while the switchover is being carried out, to prevent repmgrd from unintentionally promoting a node. For more details, see pausing the repmgrd service.

Users of repmgr versions prior to 4.2 should ensure that repmgrd is not running on any nodes while a switchover is being executed.

User permission requirements

data_directory

repmgr needs to be able to determine the location of the data directory on the demotion candidate. If the repmgr user is not a superuser or a member of the pg_read_all_settings predefined role, the name of a superuser should be provided with the -S/--superuser option.

CHECKPOINT

repmgr executes CHECKPOINT on the demotion candidate as part of the shutdown process to ensure it shuts down as smoothly as possible.

Note that CHECKPOINT requires database superuser permissions to execute. If the repmgr user is not a superuser, the name of a superuser should be provided with the -S/--superuser option.

If repmgr is unable to execute the CHECKPOINT command, the switchover can still be carried out, albeit at a greater risk that the demotion candidate may not be able to shut down as smoothly as might otherwise have been the case.

pg_promote() (PostgreSQL 12 and later)

From PostgreSQL 12, repmgr defaults to using the built-in pg_promote() function to promote a standby to primary.

Note that execution of pg_promote() is restricted to superusers or to any user who has been granted execution permission for this function. If the repmgr user is not permitted to execute pg_promote(), repmgr will fall back to using "pg_ctl promote". For more details see repmgr standby promote.

Options

--always-promote

Promote standby to primary, even if it is behind or has diverged from the original primary. The original primary will be shut down in any case, and will need to be manually reintegrated into the replication cluster.

--dry-run

Check prerequisites but don't actually execute a switchover.

Important

Success of --dry-run does not imply the switchover will complete successfully, only that the prerequisites for performing the operation are met.

-F
--force

Ignore warnings and continue anyway.

Specifically, if a problem is encountered when shutting down the current primary, using -F/--force will cause repmgr to continue by promoting the standby to be the new primary, and if --siblings-follow is specified, attach any other standbys to the new primary.

--force-rewind[=/path/to/pg_rewind]

Use pg_rewind to reintegrate the old primary if necessary (and the prerequisites for using pg_rewind are met).

If using PostgreSQL 9.4, and the pg_rewind binary is not installed in the PostgreSQL bin directory, provide its full path. For more details see also Switchover and pg_rewind and Using pg_rewind.

-R
--remote-user

System username for remote SSH operations (defaults to local system user).

--repmgrd-no-pause

Don't pause repmgrd while executing a switchover.

This option should not be used unless you take steps by other means to ensure repmgrd is paused or not running on all nodes.

This option cannot be used together with --repmgrd-force-unpause.

--repmgrd-force-unpause

Always unpause all repmgrd instances after executing a switchover. This will ensure that any repmgrd instances which were paused before the switchover will be unpaused.

This option cannot be used together with --repmgrd-no-pause.

--siblings-follow

Have nodes attached to the old primary follow the new primary.

This will also ensure that a witness node, if in use, is updated with the new primary's data.

Note

In a future repmgr release, --siblings-follow will be applied by default.

-S/--superuser

Use the named superuser instead of the normal repmgr user to perform actions requiring superuser permissions.
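As a minimal sketch of the rule that --repmgrd-no-pause and --repmgrd-force-unpause cannot be combined, a wrapper script could check its arguments before calling repmgr. The function name repmgrd_flags_compatible is hypothetical and not part of repmgr; repmgr performs its own option validation, so this only illustrates the constraint.

```shell
# Hypothetical pre-flight helper (not part of repmgr): succeeds unless both
# --repmgrd-no-pause and --repmgrd-force-unpause appear in the argument list.
repmgrd_flags_compatible() {
    no_pause=0
    force_unpause=0
    for arg in "$@"; do
        case "$arg" in
            --repmgrd-no-pause)      no_pause=1 ;;
            --repmgrd-force-unpause) force_unpause=1 ;;
        esac
    done
    # Compatible when at least one of the two options is absent.
    [ "$no_pause" -eq 0 ] || [ "$force_unpause" -eq 0 ]
}
```

For example, repmgrd_flags_compatible --siblings-follow --repmgrd-force-unpause succeeds, while passing both repmgrd options makes it return a non-zero status.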

Configuration file settings

The following parameters in repmgr.conf are relevant to the switchover operation:

replication_lag_critical

If replication lag (in seconds) on the standby exceeds this value, the switchover will be aborted (unless the -F/--force option is provided).

shutdown_check_timeout

The maximum number of seconds to wait for the demotion candidate (current primary) to shut down, before aborting the switchover.

Note that this parameter is set on the node where repmgr standby switchover is executed (promotion candidate); setting it on the demotion candidate (former primary) will have no effect.

Note

In versions prior to repmgr 4.2, repmgr standby switchover would use the values defined in reconnect_attempts and reconnect_interval to determine the timeout for demotion candidate shutdown.

wal_receive_check_timeout

After the primary has shut down, the maximum number of seconds to wait for the walreceiver on the standby to flush WAL to disk before comparing the WAL receive location with the primary's shutdown location.

standby_reconnect_timeout

The maximum number of seconds to wait for the demotion candidate (former primary) to reconnect to the promoted primary (default: 60 seconds).

Note that this parameter is set on the node where repmgr standby switchover is executed (promotion candidate); setting it on the demotion candidate (former primary) will have no effect.

node_rejoin_timeout

The maximum number of seconds to wait for the demotion candidate (former primary) to reconnect to the promoted primary (default: 60 seconds).

Note that this parameter is set on the demotion candidate (former primary); setting it on the node where repmgr standby switchover is executed will have no effect.

However, this value must be less than standby_reconnect_timeout on the promotion candidate (the node where repmgr standby switchover is executed).
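The relationship between the two reconnect timeouts can be sketched as a simple comparison. The helper below is hypothetical (check_rejoin_timeouts is not a repmgr command) and assumes both values have already been read from the respective repmgr.conf files: node_rejoin_timeout from the demotion candidate and standby_reconnect_timeout from the promotion candidate.

```shell
# Hypothetical helper (not part of repmgr).
#   $1 = node_rejoin_timeout, as set on the demotion candidate
#   $2 = standby_reconnect_timeout, as set on the promotion candidate
# Succeeds only when node_rejoin_timeout is strictly less than
# standby_reconnect_timeout.
check_rejoin_timeouts() {
    node_rejoin_timeout="$1"
    standby_reconnect_timeout="$2"
    [ "$node_rejoin_timeout" -lt "$standby_reconnect_timeout" ]
}
```

With illustrative values, check_rejoin_timeouts 30 60 succeeds, whereas check_rejoin_timeouts 60 60 fails because the two values are equal.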

Execution

Execute with the --dry-run option to test the switchover as far as possible without actually changing the status of either node.
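One way to apply this advice is a small wrapper that runs the dry run first and attempts the real switchover only if the dry run exits with 0. This is a sketch, not a recommended procedure: the function name safe_switchover and the variables REPMGR_BIN and REPMGR_CONF are illustrative, and the configuration file path is an assumption. As noted for --dry-run, a successful dry run does not guarantee that the switchover itself will succeed.

```shell
# Illustrative names, not repmgr settings; override them as needed.
REPMGR_BIN="${REPMGR_BIN:-repmgr}"
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr.conf}"

# Run on the standby to be promoted. Extra arguments (for example
# --siblings-follow) are passed to both the dry run and the real run.
safe_switchover() {
    if ! "$REPMGR_BIN" standby switchover -f "$REPMGR_CONF" --dry-run "$@"; then
        echo "dry run reported problems; switchover not attempted" >&2
        return 1
    fi
    "$REPMGR_BIN" standby switchover -f "$REPMGR_CONF" "$@"
}
```

Calling safe_switchover --siblings-follow would then execute the dry run followed by the actual switchover, returning the exit code of whichever step ran last.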

External database connections, e.g. from an application, should not be permitted while the switchover is taking place. In particular, active transactions on the primary can potentially disrupt the shutdown process.

Event notifications

standby_switchover and standby_promote event notifications will be generated for the new primary, and a node_rejoin event notification for the former primary (new standby).

If using an event notification script, standby_switchover will populate the placeholder parameter %p with the node ID of the former primary.
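A minimal sketch of an event notification handler for these events follows. It assumes the script is invoked with the placeholders %n (node ID), %e (event type), %s (success flag) and %p, in that order; the order depends entirely on how event_notification_command is configured, and the function name handle_event is hypothetical.

```shell
# Assumed argument order (depends on event_notification_command):
#   $1 = %n node ID, $2 = %e event type, $3 = %s success flag,
#   $4 = %p (for standby_switchover: node ID of the former primary)
handle_event() {
    node_id="$1"
    event="$2"
    success="$3"
    former_primary="$4"
    case "$event" in
        standby_switchover)
            echo "node $node_id is the new primary; former primary was node $former_primary (success=$success)" ;;
        standby_promote)
            echo "node $node_id was promoted (success=$success)" ;;
        node_rejoin)
            echo "node $node_id rejoined as a standby (success=$success)" ;;
        *)
            echo "ignoring event $event from node $node_id" ;;
    esac
}
```

A script built around this function would typically end with handle_event "$@", so that the arguments supplied by repmgr are passed straight through.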

Exit codes

One of the following exit codes will be emitted by repmgr standby switchover:

SUCCESS (0)

The switchover completed successfully or, if --dry-run was provided, no issues were detected that would prevent the switchover operation.

ERR_SWITCHOVER_FAIL (18)

The switchover could not be executed.

ERR_SWITCHOVER_INCOMPLETE (22)

The switchover was executed but a problem was encountered. Typically this means the former primary could not be reattached as a standby. Check preceding log messages for more information.
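A calling script can branch on these exit codes. The sketch below maps each documented code to a short message; the function name describe_switchover_exit is hypothetical, and the message wording is illustrative rather than repmgr output.

```shell
# Hypothetical helper: translate the exit code of "repmgr standby switchover"
# into a short description, using the codes listed above.
describe_switchover_exit() {
    case "$1" in
        0)  echo "SUCCESS: switchover completed (or dry run found no blocking issues)" ;;
        18) echo "ERR_SWITCHOVER_FAIL: switchover could not be executed" ;;
        22) echo "ERR_SWITCHOVER_INCOMPLETE: switchover executed but a problem was encountered; check the logs" ;;
        *)  echo "unexpected exit code: $1" ;;
    esac
}
```

Typical use is to run the switchover, capture $? immediately afterwards, and pass it to describe_switchover_exit. Exit code 22 deserves separate handling because the new primary may already be active even though the former primary could not be reattached.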

See also

repmgr standby follow, repmgr node rejoin

For more details on performing a switchover operation, see the section Performing a switchover with repmgr.