
5.2.6 Handling a failed node - the fail command

The set, swap and disable commands all rely on being able to connect to both nodes via ssh and MySQL in order to carry out their operations. This ensures that the masterpair is always in a consistent, operable state.

If a node fails due to hardware or network failure, and there is no prospect of recovery of that node within an acceptable timeframe, then it is possible to use the fail command.

When both nodes can be contacted, the fail command behaves exactly like the disable command: it removes the IPs from the given node and brings them up on the other node, whilst ensuring the masterpair is in a consistent, operable state.

The fail command differs in that it will attempt to contact the failed node and, if it cannot, will carry on regardless. Because this bypasses the usual safety checks, the ‘--yes’ option must be specified on the command line to confirm that the user wishes to carry out this operation.

If a node has suffered a problem with MySQL, but can still be contacted via ssh, then the fail command will handle moving IPs away from the failed node correctly, but will not attempt to synchronise replication.
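The behaviour described so far can be summarised as a small decision function. This is only an illustrative sketch, not the tool's actual implementation: the function name `plan_fail`, its parameters and the step descriptions are hypothetical, and the order of the steps is an assumption. The case where ssh is unavailable is included for completeness.

```python
from typing import List


def plan_fail(ssh_ok: bool, mysql_ok: bool) -> List[str]:
    """Return the steps a fail-style operation would take.

    Hypothetical sketch: names and step order are assumptions,
    not taken from the real command.
    """
    steps = []

    # IPs can only be removed from the failed node if it answers on ssh.
    if ssh_ok:
        steps.append("remove IPs from failed node")
    else:
        steps.append("skip IP removal on failed node (unreachable)")

    # The IPs are brought up on the surviving node in every case.
    steps.append("bring IPs up on other node")

    # Replication is only synchronised when the failed node is fully reachable.
    if ssh_ok and mysql_ok:
        steps.append("synchronise replication")
    else:
        steps.append("skip replication synchronisation")

    return steps
```

With both checks passing, the plan matches what the text describes for disable; with only MySQL down, the IPs still move but replication is left alone.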

If a node cannot be contacted via ssh, then the fail command won't be able to take down the IPs on the failed node. This is dangerous, as it can leave the masterpair in an inconsistent state. For example, if the fail command is used on a node which is only temporarily inaccessible, then when that node comes back, one or more IPs will be up on both nodes.
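Once a failed node becomes reachable again, it may be worth checking for the inconsistent state described above. The following is a minimal sketch, assuming the list of IPs configured on each node has already been gathered by some other means; the function name `ips_up_on_both` and the example addresses are hypothetical.

```python
from typing import Iterable, List


def ips_up_on_both(node_a_ips: Iterable[str], node_b_ips: Iterable[str]) -> List[str]:
    """Return the IPs that are configured on both nodes at once.

    Hypothetical helper: how the per-node IP lists are collected
    is outside the scope of this sketch.
    """
    # The set intersection gives the addresses present on both nodes.
    return sorted(set(node_a_ips) & set(node_b_ips))


# Example with documentation-range addresses (not real hosts).
duplicates = ips_up_on_both(
    ["192.0.2.10", "192.0.2.11"],
    ["192.0.2.11", "192.0.2.12"],
)
```

A non-empty result means at least one IP is up on both nodes and should be removed from one of them before the masterpair is relied upon again.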