obnox/ctdb.git
11 years agoeventscripts: Do not use bashism for string comparison
Amitay Isaacs [Tue, 14 May 2013 13:18:32 +0000 (23:18 +1000)]
eventscripts: Do not use bashism for string comparison

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agorecoverd: Move IP flags into ctdb_takeover.c
Martin Schwenke [Thu, 9 May 2013 02:53:48 +0000 (12:53 +1000)]
recoverd: Move IP flags into ctdb_takeover.c

These should never be seen outside the IP allocation code.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agorecoverd: Clear IP flags after IP allocation algorithm has run
Martin Schwenke [Thu, 9 May 2013 02:51:57 +0000 (12:51 +1000)]
recoverd: Clear IP flags after IP allocation algorithm has run

If these flags are left set they will confuse other recovery daemon
code.

Factor the clearing code into new function clear_ipflags().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

11 years agorecoverd: Remove unused mask argument and initial mask calculation
Martin Schwenke [Fri, 3 May 2013 10:46:15 +0000 (20:46 +1000)]
recoverd: Remove unused mask argument and initial mask calculation

This has been replaced by set_ipflags() and associated functionality.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agorecoverd: When calculating rebalance candidates don't consider flags
Martin Schwenke [Fri, 3 May 2013 10:41:32 +0000 (20:41 +1000)]
recoverd: When calculating rebalance candidates don't consider flags

This is really a check to see if a node is already hosting IPs.  If
so, we assume it was previously healthy so it isn't considered as a
rebalance candidate.  There's no need to limit this to healthy nodes,
since this is checked elsewhere.

Due to this the variable newly_healthy is renamed everywhere to
rebalance_candidates.

The mask argument is now completely unused.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agorecoverd: Remove unused mask argument from IP allocation functions
Martin Schwenke [Fri, 3 May 2013 10:13:40 +0000 (20:13 +1000)]
recoverd: Remove unused mask argument from IP allocation functions

This is a no-op and is in a separate commit to make the previous
commit less cumbersome.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agotests/takeover: Add takeover tests, mostly for NoIPHostOnAllDisabled
Martin Schwenke [Fri, 3 May 2013 05:57:21 +0000 (15:57 +1000)]
tests/takeover: Add takeover tests, mostly for NoIPHostOnAllDisabled

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

11 years agorecoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled
Martin Schwenke [Fri, 3 May 2013 06:59:20 +0000 (16:59 +1000)]
recoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled

This really needs to be per-node.  The rename is because nodes with
this tunable switched on should drop IPs if they become unhealthy (or
disabled in some other way).

* Add new flag NODE_FLAGS_NOIPHOST, only used in recovery daemon.

* Enhance set_ipflags_internal() and set_ipflags() to set up
  NODE_FLAGS_NOIPHOST depending on setting of NoIPHostOnAllDisabled
  and/or whether nodes are disabled/inactive.

* Replace can_node_serve_ip() with functions can_node_host_ip() and
  can_node_takeover_ip().  These functions are the only ones that need
  to look at NODE_FLAGS_NOIPTAKEOVER and NODE_FLAGS_NOIPHOST.  They
  can make the decision without looking at any other flags due to
  previous setup.

* Remove explicit flag checking in IP allocation functions (including
  unassign_unsuitable_ips()) and just call can_node_host_ip() and
  can_node_takeover_ip() as appropriate.

* Update test code to handle CTDB_SET_NoIPHostOnAllDisabled.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
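
The flag setup described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from ctdb_takeover.c: the `ex_` names, the flag values and the exact rule for setting the NOIPHOST flag are assumptions based on one reading of the message above (a disabled node cannot host IPs unless every node is disabled, in which case only nodes with the tunable switched on refuse to host).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real flags live in the CTDB headers. */
#define EX_NODE_FLAGS_DISABLED     0x01u /* stand-in for "disabled/inactive" */
#define EX_NODE_FLAGS_NOIPTAKEOVER 0x02u
#define EX_NODE_FLAGS_NOIPHOST     0x04u

struct ex_node {
    uint32_t flags;
    bool no_ip_host_on_all_disabled; /* per-node tunable value */
};

static bool ex_all_nodes_are_disabled(const struct ex_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!(nodes[i].flags & EX_NODE_FLAGS_DISABLED)) {
            return false;
        }
    }
    return true;
}

/* Setup step: derive NOIPHOST once so later checks need no other flags. */
static void ex_set_ipflags(struct ex_node *nodes, size_t n)
{
    assert(nodes != NULL);
    bool all_disabled = ex_all_nodes_are_disabled(nodes, n);

    for (size_t i = 0; i < n; i++) {
        nodes[i].flags &= ~EX_NODE_FLAGS_NOIPHOST;
        if (all_disabled) {
            /* Assumed rule: only nodes with the tunable set refuse to host. */
            if (nodes[i].no_ip_host_on_all_disabled) {
                nodes[i].flags |= EX_NODE_FLAGS_NOIPHOST;
            }
        } else if (nodes[i].flags & EX_NODE_FLAGS_DISABLED) {
            /* Some node is usable, so disabled nodes must not host. */
            nodes[i].flags |= EX_NODE_FLAGS_NOIPHOST;
        }
    }
}

/* The only two places that look at the IP flags. */
static bool ex_can_node_host_ip(const struct ex_node *node)
{
    return !(node->flags & EX_NODE_FLAGS_NOIPHOST);
}

static bool ex_can_node_takeover_ip(const struct ex_node *node)
{
    if (node->flags & EX_NODE_FLAGS_NOIPTAKEOVER) {
        return false;
    }
    return ex_can_node_host_ip(node);
}
```

Because the setup step folds the tunable and the disabled state into a single flag, the allocation code only needs to call the two predicates.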

11 years agorecoverd: Factor out new function all_nodes_are_disabled()
Martin Schwenke [Fri, 3 May 2013 06:56:24 +0000 (16:56 +1000)]
recoverd: Factor out new function all_nodes_are_disabled()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agotests/takeover: Allow per-node tunable settings
Martin Schwenke [Fri, 3 May 2013 05:55:01 +0000 (15:55 +1000)]
tests/takeover: Allow per-node tunable settings

Implemented for CTDB_SET_NoIPTakeover.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

11 years agorecoverd: Refactor code to get NoIPTakeover tunable from all nodes
Martin Schwenke [Fri, 3 May 2013 06:21:16 +0000 (16:21 +1000)]
recoverd: Refactor code to get NoIPTakeover tunable from all nodes

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

11 years agotests: Unit test diff output should use filtered output
Martin Schwenke [Fri, 3 May 2013 05:53:13 +0000 (15:53 +1000)]
tests: Unit test diff output should use filtered output

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agorecoverd: Add debug message when dropping IPs in IP allocation
Martin Schwenke [Fri, 3 May 2013 05:41:26 +0000 (15:41 +1000)]
recoverd: Add debug message when dropping IPs in IP allocation

Update tests accordingly.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: NFS RPC checks no longer support "knfsd"
Martin Schwenke [Tue, 23 Apr 2013 02:30:33 +0000 (12:30 +1000)]
eventscripts: NFS RPC checks no longer support "knfsd"

No longer used, support removed from test infrastructure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: 60.nfs uses nfs_check_rpc_services() to check NFS RPC services
Martin Schwenke [Tue, 23 Apr 2013 02:17:31 +0000 (12:17 +1000)]
eventscripts: 60.nfs uses nfs_check_rpc_services() to check NFS RPC services

* New directory nfs-rpc-checks.d/ replaces hardcoded rules in 60.nfs

* Installation and packaging additions to handle nfs-rpc-checks.d/

* Unit test updates, including deleting 1 test that sanity checked
  test infrastructure

* Test infrastructure changes to use nfs-rpc-checks.d/

Note that this removes support for $CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK in
60.nfs.  To get the equivalent behaviour, edit 20.nfsd.check and
remove/comment all lines.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: NFS RPC checks allow "nfsd" in addition to "knfsd"
Martin Schwenke [Tue, 23 Apr 2013 01:14:48 +0000 (11:14 +1000)]
eventscripts: NFS RPC checks allow "nfsd" in addition to "knfsd"

Want nfs_check_rpc_services() to support filenames without the 'k'.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: New function nfs_check_rpc_services()
Martin Schwenke [Mon, 22 Apr 2013 20:42:54 +0000 (06:42 +1000)]
eventscripts: New function nfs_check_rpc_services()

This is intended to replace nfs_check_rpc_service(), which builds
configuration into eventscripts.

nfs_check_rpc_services() uses a directory of configuration checks that
can be edited by an administrator.  The files have one limit check and
a set of actions per line.  The program name is extracted from the
file name.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action()
Martin Schwenke [Mon, 22 Apr 2013 20:28:27 +0000 (06:28 +1000)]
eventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Factor out common code from nfs_check_rpc_service()
Martin Schwenke [Mon, 22 Apr 2013 20:27:02 +0000 (06:27 +1000)]
eventscripts: Factor out common code from nfs_check_rpc_service()

This creates new function _nfs_check_rpc_common().

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove ganesha support from nfs_check_rpc_service()
Martin Schwenke [Mon, 22 Apr 2013 20:17:15 +0000 (06:17 +1000)]
eventscripts: Remove ganesha support from nfs_check_rpc_service()

This is unused so doesn't need to be maintained.  An attempt to use it
now will explicitly fail rather than implicitly fail via bitrot.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoRevert "Eventscript functions: add optional version to nfs_check_rpc_service()"
Martin Schwenke [Mon, 22 Apr 2013 20:14:43 +0000 (06:14 +1000)]
Revert "Eventscript functions: add optional version to nfs_check_rpc_service()"

This reverts commit 92f74fd589467b46c758e116e97417edfe8773d7.

This change is unused and is just complicating the function.

Conflicts:
config/functions

11 years agoeventscripts: Move rpc.statd existence check into nfs_check_rpc_service ()
Martin Schwenke [Mon, 22 Apr 2013 19:54:12 +0000 (05:54 +1000)]
eventscripts: Move rpc.statd existence check into nfs_check_rpc_service ()

The code in 60.nfs is going to be genericised, so make all the checks
look the same.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Factor NFS RPC check action code into nfs_check_rpc_action()
Martin Schwenke [Mon, 22 Apr 2013 05:45:13 +0000 (15:45 +1000)]
eventscripts: Factor NFS RPC check action code into nfs_check_rpc_action()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove unused function ctdb_check_counter_limit()
Martin Schwenke [Tue, 30 Apr 2013 05:33:12 +0000 (15:33 +1000)]
eventscripts: Remove unused function ctdb_check_counter_limit()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Use ctdb_check_counter() instead of ctdb_check_counter_limit()
Martin Schwenke [Tue, 30 Apr 2013 05:23:20 +0000 (15:23 +1000)]
eventscripts: Use ctdb_check_counter() instead of ctdb_check_counter_limit()

ctdb_check_counter_limit() can soon be removed...

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Might as well try to stat the reclock file first
Martin Schwenke [Tue, 30 Apr 2013 05:19:52 +0000 (15:19 +1000)]
eventscripts: Might as well try to stat the reclock file first

It is in the background but it still might cause the counter to be
reset before it is checked.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Make the early exit in 01.reclock earlier
Martin Schwenke [Tue, 30 Apr 2013 05:16:44 +0000 (15:16 +1000)]
eventscripts: Make the early exit in 01.reclock earlier

That way we don't even check the counter...

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Minor cleanups for killtcp/tickle functions
Martin Schwenke [Mon, 6 May 2013 06:23:25 +0000 (16:23 +1000)]
eventscripts: Minor cleanups for killtcp/tickle functions

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Tweak the timeout check in kill_tcp_connections()
Martin Schwenke [Tue, 30 Apr 2013 01:39:46 +0000 (11:39 +1000)]
eventscripts: Tweak the timeout check in kill_tcp_connections()

This has 2 advantages:

1. It uses get_tcp_connections_for_ip() to check for leftover
   connections, instead of custom code.

2. It checks for the timeout condition before sleeping.  The current
   code sleeps and then checks, so wastes a second.

Signed-off-by: Martin Schwenke <martin@meltin.net>
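
The second point (check the timeout before sleeping) can be illustrated with a small loop. The real code is a shell function; this C sketch only shows the control flow, and the callback types, the fake helpers and all `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks so the loop can run without netstat or a real sleep. */
typedef int (*ex_count_fn)(void *ctx);
typedef void (*ex_sleep_fn)(void *ctx);

/*
 * Returns true if no connections are left within max_checks checks.
 * The timeout is tested before sleeping, so a failing run does not
 * waste a final sleep that nothing will look at.
 */
static bool ex_wait_for_connections_gone(ex_count_fn count, ex_sleep_fn sleep_once,
                                         void *ctx, int max_checks)
{
    assert(max_checks > 0);
    int checks = 0;

    for (;;) {
        if (count(ctx) == 0) {
            return true;
        }
        checks++;
        if (checks >= max_checks) {
            return false; /* timed out: return without sleeping again */
        }
        sleep_once(ctx);
    }
}

/* Fake environment: each "sleep" lets one connection go away. */
struct ex_fake {
    int remaining;
    int sleeps;
};

static int ex_fake_count(void *ctx)
{
    struct ex_fake *f = ctx;
    return f->remaining;
}

static void ex_fake_sleep(void *ctx)
{
    struct ex_fake *f = ctx;
    f->sleeps++;
    if (f->remaining > 0) {
        f->remaining--;
    }
}
```

With a limit of 3 checks and connections that never go away, this loop sleeps twice rather than three times.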
11 years agoeventscripts: In killtcp/tickle functions, $_failed should be boolean
Martin Schwenke [Mon, 29 Apr 2013 20:31:30 +0000 (06:31 +1000)]
eventscripts: In killtcp/tickle functions, $_failed should be boolean

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove unused $_killcount from tickle_tcp_connections()
Martin Schwenke [Mon, 29 Apr 2013 20:27:58 +0000 (06:27 +1000)]
eventscripts: Remove unused $_killcount from tickle_tcp_connections()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Refactor connection listing in killtcp and tickle functions
Martin Schwenke [Mon, 29 Apr 2013 20:25:26 +0000 (06:25 +1000)]
eventscripts: Refactor connection listing in killtcp and tickle functions

Uses new function get_tcp_connections_for_ip().  This avoids using a
temporary file and running netstat twice.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Reimplement kill_tcp_connections_local_only()
Martin Schwenke [Mon, 29 Apr 2013 20:19:18 +0000 (06:19 +1000)]
eventscripts: Reimplement kill_tcp_connections_local_only()

... using kill_tcp_connections()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Change handling of one-way kills in kill_tcp_connections()
Martin Schwenke [Mon, 29 Apr 2013 20:14:01 +0000 (06:14 +1000)]
eventscripts: Change handling of one-way kills in kill_tcp_connections()

This change is a no-op.  However, in a subsequent commit we'll merge
kill_tcp_connections_local_only() with this function.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove unnecessary variables from killtcp/tickle functions
Martin Schwenke [Mon, 29 Apr 2013 20:05:52 +0000 (06:05 +1000)]
eventscripts: Remove unnecessary variables from killtcp/tickle functions

Setting these variables spawns lots of unnecessary processes, which
would surely slow down these functions on a busy system.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Clean up ctdb_check_command()
Martin Schwenke [Mon, 29 Apr 2013 17:54:17 +0000 (03:54 +1000)]
eventscripts: Clean up ctdb_check_command()

* Command is now multiple arguments, preserving quoting
* $service_name no longer printed, no longer an argument
* Debug output from failed command

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Clean up ctdb_check_directories()
Martin Schwenke [Mon, 29 Apr 2013 17:48:51 +0000 (03:48 +1000)]
eventscripts: Clean up ctdb_check_directories()

The documentation comments are wrong, so fix them.  Also remove the
optional $service_name argument.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Assert that $service_name is set in a few key places
Martin Schwenke [Mon, 29 Apr 2013 17:45:21 +0000 (03:45 +1000)]
eventscripts: Assert that $service_name is set in a few key places

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: counters default to $script_name if $service_name not set
Martin Schwenke [Tue, 30 Apr 2013 05:31:27 +0000 (15:31 +1000)]
eventscripts: counters default to $script_name if $service_name not set

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Simplify handling of $service name in "managed" functions
Martin Schwenke [Mon, 29 Apr 2013 17:32:29 +0000 (03:32 +1000)]
eventscripts: Simplify handling of $service name in "managed" functions

Complicated argument handling was introduced to deal with multiple
services per eventscript.  This was a failure and we split 50.samba.

This simplifies several functions to use global $service_name
unconditionally instead of having an optional argument.

$service_name is no longer automatically set in the functions file.
This means it needs to be explicitly set in 13.per_ip_routing because
this script uses ctdb_service_check_reconfigure().

Eventscript unit test infrastructure needs to set $service_name during
fake service setup, and policy routing tests need to be updated
accordingly.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Simplify handling of $service name in start/stop functions
Martin Schwenke [Mon, 29 Apr 2013 17:18:01 +0000 (03:18 +1000)]
eventscripts: Simplify handling of $service name in start/stop functions

Complicated argument handling was introduced to deal with multiple
services per eventscript.  This was a failure and we split 50.samba.

This simplifies several functions to use global $service_name
unconditionally instead of having an optional argument.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Simplify handling of $service name in service_management
Martin Schwenke [Mon, 29 Apr 2013 17:13:36 +0000 (03:13 +1000)]
eventscripts: Simplify handling of $service name in service_management

Complicated argument handling was introduced to deal with multiple
services per eventscript.  This was a failure and we split 50.samba.

This simplifies several functions to use global $service_name
unconditionally instead of having an optional argument.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Simplify handling of $service name in reconfigure functions
Martin Schwenke [Mon, 29 Apr 2013 16:59:41 +0000 (02:59 +1000)]
eventscripts: Simplify handling of $service name in reconfigure functions

Complicated argument handling was introduced to deal with multiple
services per eventscript.  This was a failure and we split 50.samba.

This simplifies several functions to use global $service_name
unconditionally instead of having an optional argument.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove unused function ctdb_check_counter_equal()
Martin Schwenke [Wed, 24 Apr 2013 07:14:32 +0000 (17:14 +1000)]
eventscripts: Remove unused function ctdb_check_counter_equal()

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoscripts: Fix script_log() regression
Martin Schwenke [Tue, 23 Apr 2013 03:56:15 +0000 (13:56 +1000)]
scripts: Fix script_log() regression

5940a2494e9e43a83f2bca098bd04dfc1a8f2e93 makes script_log() always
pass a message to logger, so script_log() can no longer log stdin.

Put all the tag fu in the actual tag so the message argument is empty
if no message was passed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoinitscript: Look for tdbtool/tdbdump using which, not in fixed locations
Martin Schwenke [Tue, 23 Apr 2013 03:49:28 +0000 (13:49 +1000)]
initscript: Look for tdbtool/tdbdump using which, not in fixed locations

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoctdbd: Log CTDB startup before creating the PID file
Martin Schwenke [Mon, 22 Apr 2013 04:55:33 +0000 (14:55 +1000)]
ctdbd: Log CTDB startup before creating the PID file

Otherwise the messages are in a stupid order...  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reported-by: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: Remove the "stopped" event
Martin Schwenke [Thu, 21 Feb 2013 03:28:13 +0000 (14:28 +1100)]
ctdbd: Remove the "stopped" event

It isn't used, superseded by "ipreallocated".

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Remove use of "stopped" event
Martin Schwenke [Thu, 21 Feb 2013 03:17:09 +0000 (14:17 +1100)]
eventscripts: Remove use of "stopped" event

Use "ipreallocated" instead.  The "stopped" event pre-dates the
"ipreallocated" event.  The only way of stopping a node is via the
ctdb tool, which explicitly causes a takeover run to occur after the
node is stopped.  The takeover run will generate an "ipreallocated"
event.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agorecoverd: ctdb_takeover_run() uses CTDB_CONTROL_IPREALLOCATED
Martin Schwenke [Thu, 21 Feb 2013 02:13:09 +0000 (13:13 +1100)]
recoverd: ctdb_takeover_run() uses CTDB_CONTROL_IPREALLOCATED

This means "ipreallocated" is now run on stopped nodes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoctdbd: New control CTDB_CONTROL_IPREALLOCATED
Martin Schwenke [Fri, 19 Apr 2013 03:05:02 +0000 (13:05 +1000)]
ctdbd: New control CTDB_CONTROL_IPREALLOCATED

This is an alternative to using ctdb_run_eventscripts() that can be
used when in recovery.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoctdbd: Avoid freeing non-monitor event callback when monitoring is disabled
Martin Schwenke [Tue, 30 Apr 2013 07:22:23 +0000 (17:22 +1000)]
ctdbd: Avoid freeing non-monitor event callback when monitoring is disabled

When running a non-monitor event, a check is made for any active monitor
events.  If there is an active monitor event, it is cancelled.  This is
done by freeing state->callback, which is allocated from monitor_context.

When CTDB is stopped or shut down, monitoring is disabled by freeing
monitor_context, which frees the callback, and then the stopped or
shutdown event is run.  This creates a new callback structure, which is
allocated at the exact same memory location as the monitor callback that
was just freed.  So the check for active monitor events frees the new
callback for the non-monitor event.  Since the callback function is what
flags successful completion of that event, the event is never marked
complete and CTDB is stuck in a loop waiting for completion.

Move the monitor cancellation to the top of the function so that this
can't happen.

The following log snippet highlights the problem.

2013/04/30 16:54:10.673807 [21505]: Received SHUTDOWN command. Stopping CTDB daemon.
2013/04/30 16:54:10.673814 [21505]: Shutting down recovery daemon
2013/04/30 16:54:10.673852 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0
2013/04/30 16:54:10.673858 [21505]: Monitoring has been stopped
2013/04/30 16:54:10.673899 [21505]: server/eventscript.c:594 Sending SIGTERM to child pid:23847
2013/04/30 16:54:10.673913 [21505]: server/eventscript.c:629 searching for callback 0x1c6d5c0
2013/04/30 16:54:10.673932 [21505]: server/eventscript.c:641 running callback
2013/04/30 16:54:10.673939 [21505]: server/eventscript.c:866 in event_script_callback
2013/04/30 16:54:10.673946 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
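
The address-reuse problem described above can be modelled without real heap behaviour. This is a deliberately simplified sketch, not the eventscript.c code: a one-slot pool stands in for the allocator so that a free followed by an allocation always returns the same address, and every `ex_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_callback {
    bool in_use;
    bool completed;
};

/* One-slot pool: free followed by alloc always yields the same address. */
static struct ex_callback ex_slot;

static struct ex_callback *ex_alloc(void)
{
    ex_slot.in_use = true;
    ex_slot.completed = false;
    return &ex_slot;
}

static void ex_free(struct ex_callback *cb)
{
    cb->in_use = false;
}

struct ex_state {
    struct ex_callback *monitor_cb; /* may be stale after monitoring stops */
};

/*
 * Run a non-monitor event and report whether it was marked complete.
 * cancel_first == false models the buggy order (cancel after the new
 * callback exists); cancel_first == true models the fix (cancel at the
 * top of the function).
 */
static bool ex_run_event(struct ex_state *st, bool cancel_first)
{
    assert(st != NULL);

    if (cancel_first && st->monitor_cb != NULL) {
        ex_free(st->monitor_cb);
        st->monitor_cb = NULL;
    }

    struct ex_callback *cb = ex_alloc();

    if (!cancel_first && st->monitor_cb != NULL) {
        /* Stale pointer has the same address as cb: this frees the new callback. */
        ex_free(st->monitor_cb);
        st->monitor_cb = NULL;
    }

    if (cb->in_use) {
        cb->completed = true; /* only a live callback can flag completion */
    }
    return cb->completed;
}
```

In the buggy order the new event is never marked complete, which corresponds to the daemon waiting forever; with the cancellation moved first, the stale pointer is dropped before the new callback is allocated.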

11 years agorecoverd: Interface reference count changes should not cause takeover runs
Martin Schwenke [Wed, 20 Feb 2013 23:43:35 +0000 (10:43 +1100)]
recoverd: Interface reference count changes should not cause takeover runs

At the moment a naive comparison of all the interface data is done.
So, if any IPs move then the reference counts for the relevant
interfaces change, interfaces appear to have changed and another
takeover run is initiated by each node that took/released IPs.

This change stops the spurious takeover runs by changing the interface
comparison to ignore the reference counts.

Signed-off-by: Martin Schwenke <martin@meltin.net>
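
A comparison that ignores reference counts might look like the sketch below. The struct layout and the `ex_` names are assumptions made for illustration; only the idea of skipping the reference count comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface record; field names are illustrative. */
struct ex_iface {
    char name[16];
    uint16_t link_state;
    uint32_t references; /* number of IPs currently using the interface */
};

/* Returns true if the two interface lists differ in a way that matters. */
static bool ex_ifaces_changed(const struct ex_iface *a, size_t na,
                              const struct ex_iface *b, size_t nb)
{
    assert(a != NULL && b != NULL);

    if (na != nb) {
        return true;
    }
    for (size_t i = 0; i < na; i++) {
        if (strcmp(a[i].name, b[i].name) != 0) {
            return true;
        }
        if (a[i].link_state != b[i].link_state) {
            return true;
        }
        /* references is deliberately not compared: it changes whenever IPs move */
    }
    return false;
}
```

A plain memcmp() over the whole array would report a change whenever an IP moved, which is the spurious trigger the commit removes.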
11 years agorecover: use CTDB_REC_RO_FLAGS where appropriate
Michael Adam [Fri, 19 Apr 2013 14:24:32 +0000 (16:24 +0200)]
recover: use CTDB_REC_RO_FLAGS where appropriate

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoctdb_daemon: use CTDB_REC_RO_FLAGS where appropriate
Michael Adam [Fri, 19 Apr 2013 14:23:16 +0000 (16:23 +0200)]
ctdb_daemon: use CTDB_REC_RO_FLAGS where appropriate

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoctdb_call: use CTDB_REC_RO_FLAGS where appropriate
Michael Adam [Fri, 19 Apr 2013 14:22:49 +0000 (16:22 +0200)]
ctdb_call: use CTDB_REC_RO_FLAGS where appropriate

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: use CTDB_REC_RO_FLAGS in the vacuuming code
Michael Adam [Fri, 19 Apr 2013 14:09:34 +0000 (16:09 +0200)]
vacuum: use CTDB_REC_RO_FLAGS in the vacuuming code

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoltdb_server: use CTDB_REC_RO_FLAGS where appropriate
Michael Adam [Fri, 19 Apr 2013 13:55:38 +0000 (15:55 +0200)]
ltdb_server: use CTDB_REC_RO_FLAGS where appropriate

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoinclude: define CTDB_REC_RO_FLAGS - all read-only related record flags
Michael Adam [Fri, 19 Apr 2013 14:01:45 +0000 (16:01 +0200)]
include: define CTDB_REC_RO_FLAGS - all read-only related record flags

This is used for some checks

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
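
A combined mask of this kind is typically the bitwise OR of the individual flags, so a single test covers all of them. The sketch below uses hypothetical `EX_` names and values rather than the definitions from the CTDB headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical read-only related record flags; names and values are illustrative. */
#define EX_REC_RO_HAVE_DELEGATIONS  0x01000000u
#define EX_REC_RO_HAVE_READONLY     0x02000000u
#define EX_REC_RO_REVOKING_READONLY 0x04000000u
#define EX_REC_RO_REVOKE_COMPLETE   0x08000000u

/* One mask covering every read-only related flag. */
#define EX_REC_RO_FLAGS (EX_REC_RO_HAVE_DELEGATIONS | EX_REC_RO_HAVE_READONLY | \
                         EX_REC_RO_REVOKING_READONLY | EX_REC_RO_REVOKE_COMPLETE)

/* True if the record carries any read-only related state. */
static bool ex_record_has_ro_state(uint32_t flags)
{
    return (flags & EX_REC_RO_FLAGS) != 0;
}

/* Clear all read-only related state, leaving other flags untouched. */
static uint32_t ex_strip_ro_flags(uint32_t flags)
{
    uint32_t out = flags & ~EX_REC_RO_FLAGS;
    assert(!ex_record_has_ro_state(out));
    return out;
}
```

Call sites that previously spelled out each flag can then use the mask, which keeps them in step if another read-only flag is added.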
11 years agovacuum: Update (C)
Michael Adam [Fri, 22 Feb 2013 15:12:17 +0000 (16:12 +0100)]
vacuum: Update (C)

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: extend the header comment for ctdb_process_delete_list()
Michael Adam [Sat, 29 Dec 2012 16:23:27 +0000 (17:23 +0100)]
vacuum: extend the header comment for ctdb_process_delete_list()

Describe the (new) process more precisely.
And mention that it is the last step of the vacuuming process
that is performed on the lmaster.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: turn the vacuuming on lmaster into a three-phase process.
Michael Adam [Sat, 5 Jan 2013 00:20:18 +0000 (01:20 +0100)]
vacuum: turn the vacuuming on lmaster into a three-phase process.

More precisely, before locally deleting an empty record that has been
migrated with data and that we are dmaster and lmaster for, we now perform
the deletion on the other nodes in two steps instead of a single step.

- First send out the list of records to be deleted to all
  other nodes with the new RECEIVE_RECORDS control to store
  the lmaster's current empty copy.
- Then send those records that could be deleted on all nodes
  to all nodes again with the TRY_DELETE_RECORDS control
  as before for deletion.
- Finally delete those records locally that were successfully
  deleted remotely in the previous step.

This fixes an old race where a recovery that hits the vacuum process
square between the eyes can create gaps in the record's history and
hence let the records resurrect. In the case of the locking.tdb,
that could mean that a file that was already closed, was recorded as
being open and locked again, so samba clients were locked out of that
file until samba was restarted.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
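
The three phases listed above can be sketched as three passes over a candidate list. This is a toy model under stated assumptions: remote nodes are reduced to tables saying whether a store or delete would succeed, and all `ex_` names are hypothetical rather than taken from the vacuuming code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_NUM_RECORDS 3
#define EX_NUM_REMOTES 2

/* Toy cluster: per remote node and record, would the control succeed? */
struct ex_cluster {
    bool can_store[EX_NUM_REMOTES][EX_NUM_RECORDS];  /* RECEIVE_RECORDS result */
    bool can_delete[EX_NUM_REMOTES][EX_NUM_RECORDS]; /* TRY_DELETE_RECORDS result */
    bool local_present[EX_NUM_RECORDS];
};

/* Drop candidates for which any remote node reported failure. */
static void ex_filter(bool ok[EX_NUM_REMOTES][EX_NUM_RECORDS],
                      bool candidate[EX_NUM_RECORDS])
{
    for (int r = 0; r < EX_NUM_RECORDS; r++) {
        for (int n = 0; n < EX_NUM_REMOTES; n++) {
            if (!ok[n][r]) {
                candidate[r] = false;
            }
        }
    }
}

/* Returns the number of records deleted locally. */
static int ex_vacuum(struct ex_cluster *c)
{
    assert(c != NULL);
    bool candidate[EX_NUM_RECORDS];
    int deleted = 0;

    for (int r = 0; r < EX_NUM_RECORDS; r++) {
        candidate[r] = c->local_present[r];
    }

    /* Phase 1: every remote node must store the lmaster's empty copy. */
    ex_filter(c->can_store, candidate);

    /* Phase 2: every remote node must then delete the record. */
    ex_filter(c->can_delete, candidate);

    /* Phase 3: delete locally only what was deleted everywhere else. */
    for (int r = 0; r < EX_NUM_RECORDS; r++) {
        if (candidate[r]) {
            c->local_present[r] = false;
            deleted++;
        }
    }
    return deleted;
}
```

A record that fails either remote phase simply stays in the local database and can be retried in a later vacuuming run.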
11 years agovacuum: introduce the RECEIVE_RECORDS control
Michael Adam [Thu, 20 Dec 2012 23:24:47 +0000 (00:24 +0100)]
vacuum: introduce the RECEIVE_RECORDS control

This is in preparation for turning the vacuuming on the lmaster
into a two-phase process:

- First the node sends the list of records to be vacuumed
  to all other nodes with this new RECEIVE_RECORDS control.
  The remote nodes should store the lmaster's empty current copy.
- Only those records that could be stored on all other nodes
  are processed further. They are sent to all other nodes with
  the TRY_DELETE_RECORDS control as before for deletion.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: reorder some of ctdb_process_delete_list() more intuitively
Michael Adam [Sat, 29 Dec 2012 17:32:39 +0000 (18:32 +0100)]
vacuum: reorder some of ctdb_process_delete_list() more intuitively

Now that the nodemap and its talloc children don't hang off of the
delete_records_list talloc context, we can build the nodemap
earlier, and move the construction of the delete_records_list
to where it is more obvious what it is used for.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: add explicit temporary memory context to ctdb_process_delete_list()
Michael Adam [Sat, 29 Dec 2012 16:16:33 +0000 (17:16 +0100)]
vacuum: add explicit temporary memory context to ctdb_process_delete_list()

This removes the implicit artificial talloc hierarchy and makes the
code easier to understand.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: fix indentation in ctdb_process_delete_list()
Michael Adam [Sat, 5 Jan 2013 00:19:06 +0000 (01:19 +0100)]
vacuum: fix indentation in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: free temporary allocated memory correctly in ctdb_process_delete_list().
Michael Adam [Mon, 17 Dec 2012 16:31:55 +0000 (17:31 +0100)]
vacuum: free temporary allocated memory correctly in ctdb_process_delete_list().

Add a common exit point for cleanup.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: move variable into scope of use in ctdb_process_delete_list()
Michael Adam [Mon, 17 Dec 2012 16:26:22 +0000 (17:26 +0100)]
vacuum: move variable into scope of use in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: move variable into scope of use in ctdb_process_delete_list()
Michael Adam [Mon, 17 Dec 2012 12:07:21 +0000 (13:07 +0100)]
vacuum: move variable into scope of use in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: simplify ctdb_process_delete_list(): reduce indentation
Michael Adam [Mon, 17 Dec 2012 12:03:42 +0000 (13:03 +0100)]
vacuum: simplify ctdb_process_delete_list(): reduce indentation

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: add DEBUG to skip conditions in delete_record_traverse()
Michael Adam [Wed, 3 Apr 2013 12:12:27 +0000 (14:12 +0200)]
vacuum: add DEBUG to skip conditions in delete_record_traverse()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agovacuum: break line for RO-flags check in delete_record_traverse() for readability
Michael Adam [Fri, 5 Apr 2013 15:14:43 +0000 (17:14 +0200)]
vacuum: break line for RO-flags check in delete_record_traverse() for readability

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoclient: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY
Michael Adam [Mon, 22 Apr 2013 14:21:02 +0000 (10:21 -0400)]
client: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY

This was apparently not used before in this context, and the bug hence
not detected. It becomes necessary when ctdb_local_schedule_for_deletion()
is called from a client ctdbd (the vacuuming child), hence needs to send
the SCHEDULE_FOR_DELETION control to its parent.

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: Set num_clients statistic from ctdb->num_clients
Amitay Isaacs [Fri, 19 Apr 2013 03:29:04 +0000 (13:29 +1000)]
ctdbd: Set num_clients statistic from ctdb->num_clients

This fixes the problem of "ctdb statisticsreset" clearing the number of
clients even when there are active clients.

Values returned in statistics for frozen, recovering, memory_used are based on
the current state of CTDB and are not maintained as statistics.  This should
include num_clients as well.

Currently ctdb->num_clients is unused. So use that to track the number of
clients and fill in statistics field only when requested.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
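
The approach in the last paragraph (keep the live count outside the resettable statistics and copy it in on request) can be sketched as below. Struct layouts and `ex_` names are hypothetical; only the num_clients idea comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced statistics block. */
struct ex_statistics {
    uint32_t num_clients;
    uint32_t total_calls;
};

struct ex_daemon {
    uint32_t num_clients;            /* live state, tracks connected clients */
    struct ex_statistics statistics; /* accumulated counters, may be reset */
};

/* Resetting clears accumulated counters only; live state is untouched. */
static void ex_statistics_reset(struct ex_daemon *d)
{
    assert(d != NULL);
    memset(&d->statistics, 0, sizeof(d->statistics));
}

/* Fill in state-derived fields at the moment statistics are requested. */
static struct ex_statistics ex_get_statistics(const struct ex_daemon *d)
{
    struct ex_statistics out = d->statistics;
    out.num_clients = d->num_clients;
    return out;
}
```

After a reset the accumulated counters read zero while the reported client count still matches the live value.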
11 years agoctdbd: Log PID file creation and removal at NOTICE level
Martin Schwenke [Mon, 22 Apr 2013 03:52:04 +0000 (13:52 +1000)]
ctdbd: Log PID file creation and removal at NOTICE level

Unexpected removal of this file can have serious consequences, so it
is best if this is logged at the default level.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoscripts: Ensure even external scripts get tagged in logs as "ctdbd"
Martin Schwenke [Mon, 22 Apr 2013 03:48:06 +0000 (13:48 +1000)]
scripts: Ensure even external scripts get tagged in logs as "ctdbd"

Our practice is to search logs for "ctdbd:".  We want to make sure we
find everything.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoeventscripts: Ensure directories are created
Martin Schwenke [Sun, 21 Apr 2013 20:52:49 +0000 (06:52 +1000)]
eventscripts: Ensure directories are created

Previous commits stopped the top level of the script from creating
certain directories but some functions assume that required
directories exist.

Create those directories instead.

Signed-off-by: Martin Schwenke <martin@meltin.net>
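
A minimal sketch of the pattern described above, assuming a function that writes into a state directory. The function name and paths are hypothetical, and a temporary directory stands in for the real location so that the sketch is safe to run anywhere.

```sh
# Hypothetical names; not taken from the eventscripts themselves.
demo_base=$(mktemp -d)
demo_state_dir="$demo_base/state/service"

save_state ()
{
    # Create the directory here instead of assuming that the top
    # level of the script has already done so.
    mkdir -p "$demo_state_dir" || return 1
    echo "$2" > "$demo_state_dir/$1"
}

save_state "example" "some value"
cat "$demo_state_dir/example"     # prints "some value"
rm -rf "$demo_base"
```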
11 years agoscripts: Clean up update_tickles() and handling of associated directory
Martin Schwenke [Wed, 17 Apr 2013 03:26:04 +0000 (13:26 +1000)]
scripts: Clean up update_tickles() and handling of associated directory

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoscripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex
Martin Schwenke [Wed, 17 Apr 2013 03:12:32 +0000 (13:12 +1000)]
scripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex

The current logic is horrible and creates an unnecessary file.  Let's
make the script debug level independent of ctdb's debug level.

* Have debug() use $CTDB_SCRIPT_DEBUGLEVEL directly

* Remove ctdb_set_current_debuglevel()

* Remove the "getdebug" command from ctdb stub in eventscript unit
  tests

* Update relevant eventscript unit tests to use
  $CTDB_SCRIPT_DEBUGLEVEL

Signed-off-by: Martin Schwenke <martin@meltin.net>
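
A minimal sketch of a debug() helper that consults $CTDB_SCRIPT_DEBUGLEVEL directly, as the first bullet above describes. It is not the code from this commit: the threshold of 4, the default of 2 and the "DEBUG:" prefix are assumptions made for illustration.

```sh
# Sketch only. The threshold (4) and the default (2) are assumed.
debug ()
{
    if [ "${CTDB_SCRIPT_DEBUGLEVEL:-2}" -ge 4 ] ; then
        echo "DEBUG: $*"
    fi
}

# Subshells keep the assignments from leaking into the caller.
( CTDB_SCRIPT_DEBUGLEVEL=4 ; debug "shown at level 4" )
( CTDB_SCRIPT_DEBUGLEVEL=2 ; debug "suppressed at level 2" )
```

Only the first call prints anything. No call to ctdb and no intermediate file is needed to decide whether to log.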
11 years agoscripts: Ensure service command is in $PATH in ctdb-crash-cleanup.sh
Martin Schwenke [Fri, 19 Apr 2013 03:10:27 +0000 (13:10 +1000)]
scripts: Ensure service command is in $PATH in ctdb-crash-cleanup.sh

Move the use of the service command below inclusion of functions file,
which sets $PATH.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoinitscript: Remove duplicate setting of $ctdbd
Martin Schwenke [Mon, 15 Apr 2013 09:15:22 +0000 (19:15 +1000)]
initscript: Remove duplicate setting of $ctdbd

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoutil: Removed unused declaration of ctdbd_start()
Martin Schwenke [Tue, 16 Apr 2013 01:40:55 +0000 (11:40 +1000)]
util: Removed unused declaration of ctdbd_start()

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoinclude: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h
Martin Schwenke [Mon, 15 Apr 2013 03:31:42 +0000 (13:31 +1000)]
include: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h

It really is internal.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoscripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running
Martin Schwenke [Mon, 15 Apr 2013 05:42:55 +0000 (15:42 +1000)]
scripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running

"ctdb ping" can time out.  How many times should we try?

Instead, depend on the initscript to implement something sane.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoinitscript: Use a PID file to implement the "status" option
Martin Schwenke [Mon, 15 Apr 2013 05:18:12 +0000 (15:18 +1000)]
initscript: Use a PID file to implement the "status" option

Using "ctdb ping" and "ctdb status" is fraught with danger.  These
commands can time out when ctdbd is running, leading callers to believe
that ctdbd is not running.  Timeouts could be increased but we would
still have to handle potential timeouts.

Everything else in the world implements the "status" option by
checking if the relevant process is running.  This change makes CTDB
do the same thing and uses standard distro functions.

This change is backward compatible in the sense that a missing
/var/run/ctdb/ directory means that we don't do a PID file check but
just depend on the distro's checking method.  Therefore, if CTDB was
started with an older version of this script then "service ctdb
status" will still work.

This script does not support changing the value of CTDB_VALGRIND
between calls.  If you start with CTDB_VALGRIND=yes then you need to
check status with the same setting.  CTDB_VALGRIND is a debug
variable, so this is acceptable.

This also adds sourcing of /lib/lsb/init-functions to make the Debian
function status_of_proc() available.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
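
The PID-file approach described above can be sketched as follows. This is a hypothetical helper, not the initscript's actual code, which relies on the distro's own functions such as status_of_proc(). The function name is invented, the return values follow the LSB convention for "status", and the fallback for a missing /var/run/ctdb/ directory is not shown.

```sh
# Hypothetical helper: report whether the process named in a PID file
# is alive.  Returns 0 if it is running, 1 if the PID file exists but
# the process does not, and 3 if there is no PID file.
pidfile_status ()
{
    _pidfile="$1"
    [ -f "$_pidfile" ] || return 3
    _pid=$(cat "$_pidfile")
    case "$_pid" in
        ''|*[!0-9]*) return 1 ;;    # empty or not a number
    esac
    # kill -0 sends no signal; it only checks that the process exists.
    # It also fails if we are not permitted to signal the process.
    if kill -0 "$_pid" 2>/dev/null ; then
        return 0
    fi
    return 1
}

# Usage with a temporary PID file that holds this shell's own PID.
demo_pidfile=$(mktemp)
echo $$ > "$demo_pidfile"
if pidfile_status "$demo_pidfile" ; then
    echo "running"
fi
rm -f "$demo_pidfile"
```

Unlike "ctdb ping", this check involves no communication with the daemon, so it cannot time out.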
11 years agoctdbd: Add --pidfile option
Martin Schwenke [Mon, 15 Apr 2013 03:32:57 +0000 (13:32 +1000)]
ctdbd: Add --pidfile option

Default is not to create a pid file.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoutil: ctdb_fork() should call ctdb_set_child_info()
Martin Schwenke [Mon, 15 Apr 2013 06:14:40 +0000 (16:14 +1000)]
util: ctdb_fork() should call ctdb_set_child_info()

For now we pass NULL as the child name.  Later we'll give ctdb_fork()
and friends an extra argument and pass that through.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoutil: New functions ctdb_set_child_info() and ctdb_is_child_process()
Martin Schwenke [Tue, 16 Apr 2013 01:11:11 +0000 (11:11 +1000)]
util: New functions ctdb_set_child_info() and ctdb_is_child_process()

Must be called by all child processes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agotests: add a comment to recovery db corruption test
Michael Adam [Wed, 17 Apr 2013 11:08:49 +0000 (13:08 +0200)]
tests: add a comment to recovery db corruption test

The comment explains that we use "ctdb stop" and "ctdb continue"
but we should use "ctdb setrecmasterrole off".

Signed-off-by: Michael Adam <obnox@samba.org>
11 years agotests: Add a test for subsequent recoveries corrupting databases
Amitay Isaacs [Thu, 11 Apr 2013 06:59:36 +0000 (16:59 +1000)]
tests: Add a test for subsequent recoveries corrupting databases

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agotests: Support waiting for "recovered" state in tests
Amitay Isaacs [Thu, 11 Apr 2013 06:58:34 +0000 (16:58 +1000)]
tests: Support waiting for "recovered" state in tests

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agoctdb_call: don't bump the rsn in ctdb_become_dmaster() any more
Michael Adam [Wed, 3 Apr 2013 10:02:59 +0000 (12:02 +0200)]
ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more

This is now done in ctdb_ltdb_store_server(), so this
extra bump can be spared.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
11 years agoFix a severe recovery bug that can lead to data corruption for SMB clients.
Michael Adam [Wed, 3 Apr 2013 09:40:25 +0000 (11:40 +0200)]
Fix a severe recovery bug that can lead to data corruption for SMB clients.

Problem:
Recovery can under certain circumstances lead to old record copies
resurrecting: Recovery selects the newest record copy purely by RSN. At
the end of the recovery, the recovery master is the dmaster for all
records in all (non-persistent) databases. And the other nodes locally
hold the complete copy of the databases. The bug is that the recovery
process does not increment the RSN on the recovery master at the end of
the recovery. Clients acting directly on the recovery master will then
change a record's content on the recmaster without migration and hence
without an RSN bump. So a subsequent recovery cannot tell that the
recmaster's copy is newer than the copies on the other nodes, since
their RSN is the same. Hence, if the recmaster is not node 0 (or more
precisely not the active node with the lowest node number), the recovery
will choose copies from nodes with a lower number and stick to these.

Here is how to reproduce:

- assume we have a cluster with at least 2 nodes
- ensure that the recmaster is not node 0
  (maybe ensure with "onnode 0 ctdb setrecmasterrole off")
  say recmaster is node 1
- choose a new database name, say "test1.tdb"
  (make sure it is not yet attached as persistent)
- choose a key name, say "key1"
- all cluster nodes should be ok and no recovery should be running
- now do the following on node 1:

1. dbwrap_tool test1.tdb store key1 uint32 1
2. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
3. ctdb recover
4. dbwrap_tool test1.tdb store key1 uint32 2
5. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 2
6. ctdb recover
7. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
   ==> BUG

This is a very severe bug, since when applied to Samba's locking.tdb
database, it means that for SMB clients on clustered Samba there is
the potential for locking out oneself from previously opened files
or even worse, data corruption:

Case 1: locking out

- client on recmaster opens file
- recovery propagates open file handle (entry in locking.tdb) to
  other nodes
- client closes file
- client opens the same file
- recovery resurrects old copy of open file record in locking.tdb
  from lower node
- client closes file but fails to delete entry in locking.tdb
- client tries to open same file again but fails, since
  the old record locks it out (since the client is still connected)

Case 2: data corruption

- client1 on recmaster opens file
- recovery propagates open file info to other nodes
- client1 closes the file and disconnects
- client2 opens the same file
- recovery resurrects old copy of locking.tdb record,
  where client2 has no entry, but client1 has.
- but client2 believes it still has a handle
- client3 opens the file and succeeds without
  conflicting with client2
  (the detached entry for client1 is discarded because
   the server does not exist any more).
=> both client2 and client3 believe they have exclusive
  access to the file and writing creates data corruption

Fix:

When storing a record on the dmaster, bump its RSN.

ctdb_ltdb_store_server() is the central function for storing
a record to a local tdb from the ctdbd server context.
So this is also the place where the RSN of the record to be stored
should be incremented, when storing on the dmaster.

For the case of the record migration, this is currently done in
ctdb_become_dmaster() in ctdb_call.c, but there are other places
such as in recovery, where we should bump the RSN, but currently
don't do it.

So moving the RSN increment into ctdb_ltdb_store_server() fixes
the recovery-record-resurrection bug.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
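
The selection rule and the effect of the fix can be modelled in a few lines of shell. This is a toy model under stated assumptions, not CTDB code: each record copy is written as "RSN:value", the function names are hypothetical, and the tie-break in favour of the lowest node number follows the problem description above.

```sh
# pick_winner COPY...  (argument order is node order: node 0 first)
# Prints the copy with the highest RSN.  On a tie the earlier
# (lower-numbered) node wins, because only a strictly greater RSN
# replaces the current best.
pick_winner ()
{
    _best=""
    _best_rsn=-1
    for _copy in "$@" ; do
        _rsn=${_copy%%:*}
        if [ "$_rsn" -gt "$_best_rsn" ] ; then
            _best=$_copy
            _best_rsn=$_rsn
        fi
    done
    echo "$_best"
}

# store_on_dmaster MODE COPY NEWVALUE
# MODE "nobump" models the old behaviour, "bump" the fixed one.
store_on_dmaster ()
{
    _rsn=${2%%:*}
    if [ "$1" = "bump" ] ; then
        _rsn=$(($_rsn + 1))
    fi
    echo "$_rsn:$3"
}

# After the first recovery both nodes hold RSN 1, value 1.
node0="1:1"
# The recmaster (node 1) then stores value 2 locally.
old=$(store_on_dmaster nobump "1:1" 2)
new=$(store_on_dmaster bump "1:1" 2)

echo "without bump: $(pick_winner "$node0" "$old")"   # prints "without bump: 1:1"
echo "with bump: $(pick_winner "$node0" "$new")"      # prints "with bump: 2:2"
```

Without the bump, the stale copy on node 0 ties on RSN and wins, which is the resurrection seen in step 7 of the reproduction. With the bump, the recmaster's copy has the strictly higher RSN and survives the second recovery.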
11 years agologging: fix comment typo
Michael Adam [Mon, 15 Apr 2013 10:50:42 +0000 (12:50 +0200)]
logging: fix comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: unimplement the unused SET_DMASTER control
Michael Adam [Wed, 3 Apr 2013 12:03:32 +0000 (14:03 +0200)]
ctdbd: unimplement the unused SET_DMASTER control

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
11 years agorecoverd: remove bogus comment "qqq" from "add prototype new banning code"
Michael Adam [Fri, 22 Mar 2013 16:48:00 +0000 (17:48 +0100)]
recoverd: remove bogus comment "qqq" from "add prototype new banning code"

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
11 years agobuild: silence building of porting_test
Michael Adam [Fri, 5 Apr 2013 14:55:18 +0000 (16:55 +0200)]
build: silence building of porting_test

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
11 years agotraverse: Ensure backward compatibility for CTDB_CONTROL_TRAVERSE_ALL
Amitay Isaacs [Thu, 11 Apr 2013 03:20:09 +0000 (13:20 +1000)]
traverse: Ensure backward compatibility for CTDB_CONTROL_TRAVERSE_ALL

This makes sure that CTDB_CONTROL_TRAVERSE_ALL is compatible with older versions
of CTDB (i.e. 1.2.39 and 1.2.40 branches).

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agotraverse: Add CTDB_CONTROL_TRAVERSE_ALL_EXT to support withemptyrecords
Amitay Isaacs [Thu, 11 Apr 2013 03:18:36 +0000 (13:18 +1000)]
traverse: Add CTDB_CONTROL_TRAVERSE_ALL_EXT to support withemptyrecords

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
11 years agotests: Fix typo in variable name
Amitay Isaacs [Thu, 11 Apr 2013 06:58:59 +0000 (16:58 +1000)]
tests: Fix typo in variable name

Signed-off-by: Amitay Isaacs <amitay@gmail.com>