ctdb-daemon: Don't explicitly disable monitoring around recovery
Author:     Martin Schwenke <martin@meltin.net>
AuthorDate: Fri, 1 Sep 2017 02:12:45 +0000 (12:12 +1000)
Commit:     Amitay Isaacs <amitay@samba.org>
CommitDate: Thu, 14 Sep 2017 12:49:15 +0000 (14:49 +0200)

Monitoring can fail during recovery due to databases (e.g. registry)
being unavailable.  This has been avoided by explicitly disabling
monitoring around recovery via the START_RECOVERY and END_RECOVERY
controls.  Even with this approach, a window remains between
enabling recovery mode and START_RECOVERY in which monitoring could
be attempted.  However, explicitly disabling monitoring is unnecessary
because monitoring is not done when a node is in recovery.

So remove the explicit disable/enable of monitoring and rely on
monitoring being skipped when recovery mode is active.
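
The guard this change relies on can be pictured with a minimal sketch.
The names below (node_state, monitor_tick, RECOVERY_NORMAL,
RECOVERY_ACTIVE) are hypothetical stand-ins for illustration and are
not the actual CTDB types or functions.  The sketch only assumes that
each monitoring attempt checks the recovery mode first and does nothing
while recovery is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real CTDB types. */
enum recovery_mode { RECOVERY_NORMAL = 0, RECOVERY_ACTIVE = 1 };

struct node_state {
	enum recovery_mode recovery_mode;
	unsigned int monitor_runs;	/* monitor events actually run */
};

/*
 * One monitoring attempt.  Returns true if the monitor event was run,
 * false if it was skipped because the node is in recovery.  No
 * separate "monitoring disabled" flag is consulted: the recovery mode
 * alone decides.
 */
static bool monitor_tick(struct node_state *node)
{
	assert(node != NULL);

	if (node->recovery_mode != RECOVERY_NORMAL) {
		/* Databases may be unavailable; skip this attempt. */
		return false;
	}

	node->monitor_runs++;
	return true;
}
```

In this sketch, setting recovery_mode to RECOVERY_ACTIVE is enough to
stop monitor events, so there is no window before START_RECOVERY in
which an attempt could slip through.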

The only possible behaviour change is that there is now a window,
between setting recovery mode to normal and the END_RECOVERY control,
during which monitoring is enabled.  However, at this point databases
would be available and the "recovered" event will cancel any
in-progress monitoring.
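
The cancellation described above can be sketched as follows.  This is
an illustrative model only: event_runner, start_event and the EVENT_*
values are hypothetical names, and the real event script handling is
more involved.  The sketch assumes that a monitor event never preempts
another event, while any other event (such as "recovered") cancels a
monitor event that is still running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds for illustration; not the CTDB enum. */
enum event_type { EVENT_NONE, EVENT_MONITOR, EVENT_RECOVERED };

struct event_runner {
	enum event_type running;		/* event currently in progress */
	unsigned int cancelled_monitors;	/* monitor events cancelled */
};

/*
 * Try to start an event.  Returns true if the event was started,
 * false if it was refused because another event is in progress.
 */
static bool start_event(struct event_runner *r, enum event_type ev)
{
	assert(r != NULL);

	if (ev == EVENT_MONITOR) {
		if (r->running != EVENT_NONE) {
			/* Monitoring never preempts another event. */
			return false;
		}
		r->running = EVENT_MONITOR;
		return true;
	}

	if (r->running == EVENT_MONITOR) {
		/* A non-monitor event cancels in-progress monitoring. */
		r->cancelled_monitors++;
	}
	r->running = ev;
	return true;
}
```

Under these assumptions, a monitor event that starts in the window
before END_RECOVERY is cancelled as soon as the "recovered" event
begins, so it cannot report a result based on mid-recovery state.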

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
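
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index 683211a26c0317e50e47e9c9aa8baefc0d571a05..f4cd5f64eee5821e1884a1c6743b93f3c2868a8f 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c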
@@ -1044,7 +1044,6 @@ static void ctdb_end_recovery_callback(struct ctdb_context *ctdb, int status, vo
 {
        struct recovery_callback_state *state = talloc_get_type(p, struct recovery_callback_state);
 
-       ctdb_enable_monitoring(ctdb);
        CTDB_INCREMENT_STAT(ctdb, num_recoveries);
 
        if (status != 0) {
@@ -1083,16 +1082,12 @@ int32_t ctdb_control_end_recovery(struct ctdb_context *ctdb,
 
        state->c    = c;
 
-       ctdb_disable_monitoring(ctdb);
-
        ret = ctdb_event_script_callback(ctdb, state,
                                         ctdb_end_recovery_callback, 
                                         state, 
                                         CTDB_EVENT_RECOVERED, "%s", "");
 
        if (ret != 0) {
-               ctdb_enable_monitoring(ctdb);
-
                DEBUG(DEBUG_ERR,(__location__ " Failed to end recovery\n"));
                talloc_free(state);
                return -1;
@@ -1124,8 +1119,6 @@ static void run_start_recovery_event(struct ctdb_context *ctdb,
 {
        int ret;
 
-       ctdb_disable_monitoring(ctdb);
-
        ret = ctdb_event_script_callback(ctdb, state,
                                         ctdb_start_recovery_callback,
                                         state,