ctdb.git
10 years agotests/tool: Fix some comment typos
Martin Schwenke [Wed, 18 Sep 2013 03:43:53 +0000 (13:43 +1000)]
tests/tool: Fix some comment typos

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Stop return value from being clobbered in control_lvsmaster()
Martin Schwenke [Wed, 18 Sep 2013 03:40:52 +0000 (13:40 +1000)]
tools/ctdb: Stop return value from being clobbered in control_lvsmaster()

ret is initialised too early and is clobbered by the call to
ctdb_ctrl_getcapabilities().  Initialising it later means that the
function returns -1 when no LVS master is found.
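
A minimal sketch of the pattern, not the actual control_lvsmaster() code: the
helper query_capabilities() and the rule that only node 2 is LVS-capable are
hypothetical stand-ins.  The "not found" value is assigned only after the last
call that overwrites ret.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16   /* illustrative limit for this sketch */

/* Hypothetical stand-in for a per-node capability query; 0 means success. */
static int query_capabilities(int pnn, bool *is_lvs_capable)
{
    *is_lvs_capable = (pnn == 2);   /* pretend only node 2 is LVS-capable */
    return 0;
}

/* Returns 0 and stores the master in *master, or -1 if none is found. */
int find_lvs_master(const int *pnns, size_t n, int *master)
{
    bool capable[MAX_NODES];
    size_t i;
    int ret;

    if (n > MAX_NODES) {
        return -1;
    }

    for (i = 0; i < n; i++) {
        ret = query_capabilities(pnns[i], &capable[i]);  /* overwrites ret */
        if (ret != 0) {
            return -1;
        }
    }

    /* Initialise here, after the last call that overwrites ret. */
    ret = -1;
    for (i = 0; i < n; i++) {
        if (capable[i]) {
            *master = pnns[i];
            ret = 0;
            break;
        }
    }
    return ret;
}
```

Had ret been set to -1 before the first loop, the successful queries would have
left it at 0 and the function would have reported success with no master.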

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoclient: Fix some format string compiler warnings
Martin Schwenke [Wed, 18 Sep 2013 03:40:10 +0000 (13:40 +1000)]
client: Fix some format string compiler warnings

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agocommon: Fix setting of debug level in the client code
Amitay Isaacs [Fri, 30 Aug 2013 13:38:15 +0000 (23:38 +1000)]
common: Fix setting of debug level in the client code

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agolibctdb: Remove incomplete libctdb
Amitay Isaacs [Sun, 25 Aug 2013 11:44:59 +0000 (21:44 +1000)]
libctdb: Remove incomplete libctdb

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Pass memory context for returning nodes in parse_nodestring
Amitay Isaacs [Tue, 27 Aug 2013 04:46:08 +0000 (14:46 +1000)]
tools/ctdb: Pass memory context for returning nodes in parse_nodestring

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests: Do not use libctdb code in tests
Amitay Isaacs [Sun, 25 Aug 2013 11:43:29 +0000 (21:43 +1000)]
tests: Do not use libctdb code in tests

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Do not use libctdb for commandline tool
Amitay Isaacs [Thu, 29 Aug 2013 07:22:38 +0000 (17:22 +1000)]
tools/ctdb: Do not use libctdb for commandline tool

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Add ctdb_ctrl_getdbseqnum() function
Amitay Isaacs [Fri, 23 Aug 2013 06:52:24 +0000 (16:52 +1000)]
client: Add ctdb_ctrl_getdbseqnum() function

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Add ctdb_ctrl_getdbstatistics() function
Amitay Isaacs [Fri, 23 Aug 2013 06:52:02 +0000 (16:52 +1000)]
client: Add ctdb_ctrl_getdbstatistics() function

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Add ctdb_client_check_message_handlers() function
Amitay Isaacs [Fri, 23 Aug 2013 06:51:26 +0000 (16:51 +1000)]
client: Add ctdb_client_check_message_handlers() function

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Remove extra whitespaces
Amitay Isaacs [Fri, 23 Aug 2013 06:49:46 +0000 (16:49 +1000)]
client: Remove extra whitespaces

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests: Remove unused test program ctdb_fetch_lock_once
Amitay Isaacs [Fri, 23 Aug 2013 07:21:24 +0000 (17:21 +1000)]
tests: Remove unused test program ctdb_fetch_lock_once

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: When printing TDB data as a string, use correct length of the string
Amitay Isaacs [Thu, 29 Aug 2013 06:58:47 +0000 (16:58 +1000)]
tools/ctdb: When printing TDB data as a string, use correct length of the string
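
As an illustration only (struct blob and format_blob() are hypothetical names,
not the tool's code), length-delimited data that may lack a NUL terminator can
be printed safely by passing the length as a precision:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a TDB value: a pointer plus an explicit size. */
struct blob {
    const unsigned char *ptr;
    size_t size;
};

/* Format at most b.size bytes; the data need not be NUL-terminated. */
int format_blob(char *out, size_t outlen, struct blob b)
{
    return snprintf(out, outlen, "%.*s", (int)b.size, (const char *)b.ptr);
}
```

The precision stops the conversion after b.size bytes, so bytes that follow the
value in memory are never read.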

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Remove un-implemented ctdb vacuum command
Amitay Isaacs [Fri, 23 Aug 2013 06:57:40 +0000 (16:57 +1000)]
tools/ctdb: Remove un-implemented ctdb vacuum command

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests: Add a simple test to test cluster wide database traverse
Amitay Isaacs [Wed, 25 Sep 2013 09:10:13 +0000 (19:10 +1000)]
tests: Add a simple test to test cluster wide database traverse

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Send traverse end record from traverse child process
Amitay Isaacs [Mon, 9 Sep 2013 02:46:26 +0000 (12:46 +1000)]
traverse: Send traverse end record from traverse child process

Traverse records are sent directly from traverse child process, but
the last empty record signalling end of traverse is sent from ctdbd.
This creates a race condition between ctdbd and traverse child.
There are two fds from traverse child to ctdbd - a pipe to track status
of the child process and unix socket connection for sending records.
It's possible that last few records are sitting in unix socket buffer
when ctdbd reads the status written from traverse child.  This will
be interpreted as end of traverse and ctdbd will send the last empty
record to originating node before it has processed the pending packets
in unix socket connection.

The race is avoided by sending the last empty record marking end of
traverse from the child process.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Wait till all data has been flushed from output queue
Amitay Isaacs [Tue, 10 Sep 2013 07:52:26 +0000 (17:52 +1000)]
traverse: Wait till all data has been flushed from output queue

To improve the traverse performance, records are directly sent from
traverse child process to the originating node.  Make sure that all the
data is sent via socket, before informing ctdbd that traverse is complete.

Without waiting for all the packets to be flushed from the queue,
child process can incorrectly signal ctdbd that traverse has ended.
This will cause the pending records in the queue never to make it to
the originating node and traverse information will not be complete.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Use ctdb local variable for convenience
Amitay Isaacs [Fri, 13 Sep 2013 03:28:31 +0000 (13:28 +1000)]
traverse: Use ctdb local variable for convenience

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Check if local traverse failed or succeeded
Amitay Isaacs [Fri, 6 Sep 2013 08:11:40 +0000 (18:11 +1000)]
traverse: Check if local traverse failed or succeeded

Passing the result of tdb_traverse_read() allows ctdbd to determine
if the local traverse succeeded or not.  In case of a problem with local
traverse, ctdbd can log an error.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Log information when traverse starts and ends
Amitay Isaacs [Fri, 6 Sep 2013 04:51:54 +0000 (14:51 +1000)]
traverse: Log information when traverse starts and ends

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotool/ltdbtool: -h option does not require an argument
Martin Schwenke [Mon, 23 Sep 2013 06:23:36 +0000 (16:23 +1000)]
tool/ltdbtool: -h option does not require an argument

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoscripts: Add support for optional ctdbd.conf configuration file
Martin Schwenke [Mon, 23 Sep 2013 06:22:36 +0000 (16:22 +1000)]
scripts: Add support for optional ctdbd.conf configuration file

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agoutils: Make debug level strings case-insensitive
Martin Schwenke [Mon, 23 Sep 2013 06:21:30 +0000 (16:21 +1000)]
utils: Make debug level strings case-insensitive
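
One way to do this kind of lookup is sketched below.  The table contents, the
numeric values and the name parse_level() are assumptions made for the example
and are not taken from the CTDB sources.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

struct level_name {
    const char *name;
    int level;
};

/* Illustrative table; the names and numbers are assumptions. */
static const struct level_name level_names[] = {
    { "ERR", 0 },
    { "WARNING", 1 },
    { "NOTICE", 2 },
    { "INFO", 3 },
    { "DEBUG", 4 },
};

/* Return the numeric level for a name in any letter case, or -1. */
int parse_level(const char *s)
{
    size_t i;

    for (i = 0; i < sizeof(level_names) / sizeof(level_names[0]); i++) {
        if (strcasecmp(s, level_names[i].name) == 0) {
            return level_names[i].level;
        }
    }
    return -1;
}
```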

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agotools/ctdb: Fix help messages for ctdb commands
Martin Schwenke [Mon, 23 Sep 2013 06:20:42 +0000 (16:20 +1000)]
tools/ctdb: Fix help messages for ctdb commands

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Ban time of 0 is invalid
Martin Schwenke [Mon, 23 Sep 2013 06:19:52 +0000 (16:19 +1000)]
tools/ctdb: Ban time of 0 is invalid

Apparently it used to mean a permanent ban but it is unclear if this
was ever supported.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Load CTDB configuration settings in 70.iscsi
Amitay Isaacs [Mon, 16 Sep 2013 04:35:13 +0000 (14:35 +1000)]
eventscripts: Load CTDB configuration settings in 70.iscsi

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agorecoverd: Disable takeover runs on other nodes for 5 minutes
Martin Schwenke [Wed, 18 Sep 2013 07:07:32 +0000 (17:07 +1000)]
recoverd: Disable takeover runs on other nodes for 5 minutes

60 seconds might not be long enough to kill all connections and
release IPs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Improve logging for takeover runs
Martin Schwenke [Wed, 18 Sep 2013 07:06:16 +0000 (17:06 +1000)]
recoverd: Improve logging for takeover runs

Takeover runs are currently silent when they succeed.  However, they
are important, so log something by default.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Use the standard long timeout when disabling takeover runs
Martin Schwenke [Wed, 18 Sep 2013 06:35:18 +0000 (16:35 +1000)]
tools/ctdb: Use the standard long timeout when disabling takeover runs

This means that takeover runs will be disabled for about as long as the
reloadips control can take to complete.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Fix arguments/semantics of rebalance node
Martin Schwenke [Fri, 6 Sep 2013 03:20:26 +0000 (13:20 +1000)]
tools/ctdb: Fix arguments/semantics of rebalance node

There's no reason why specifying a node should be compulsory.  This is
a cluster-wide operation because it is implemented by the recovery
master so multiple nodes should not be specified using -n.  However,
the command should be able to specify multiple nodes so let it have
its own nodestring argument.

This change should be backward compatible with the old requirement of
specifying a single node via -n.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Make rebalancenode more robust
Martin Schwenke [Fri, 6 Sep 2013 03:19:09 +0000 (13:19 +1000)]
tools/ctdb: Make rebalancenode more robust

Use a broadcast instead of trying to win the race of determining the
recovery master and then sending the message before the recovery
master changes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotests/simple: Fix the reloadips test to cope with changes to reloadips
Martin Schwenke [Fri, 6 Sep 2013 01:29:14 +0000 (11:29 +1000)]
tests/simple: Fix the reloadips test to cope with changes to reloadips

Specifying nodes to reload no longer uses -n.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Be careful about freeing the list of IP rebalance target nodes
Martin Schwenke [Fri, 6 Sep 2013 01:23:07 +0000 (11:23 +1000)]
recoverd: Be careful about freeing the list of IP rebalance target nodes

It can change during a takeover run.  If it does then don't free it.

There are potentially fancier solutions (e.g. check what PNNs are new
to the list) to this issue but this is the simplest.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: reloadips should rebalance target nodes for new IPs
Martin Schwenke [Fri, 6 Sep 2013 01:21:10 +0000 (11:21 +1000)]
recoverd: reloadips should rebalance target nodes for new IPs

Otherwise, if existing IPs are added to extra nodes (that have,
perhaps, been disconnected) then those IPs will not be rebalanced
across the extra nodes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoctdbd: Make ctdb_reloadips_child send controls asynchronously
Martin Schwenke [Thu, 5 Sep 2013 05:56:51 +0000 (15:56 +1000)]
ctdbd: Make ctdb_reloadips_child send controls asynchronously

Deleting IPs can take a while because IPs are released and connections
are killed, so send the controls in parallel.  In fact, since the sets of
IPs being added and deleted are disjoint, send all the adds/deletes at
the same time and then wait.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE
Martin Schwenke [Wed, 4 Sep 2013 04:30:04 +0000 (14:30 +1000)]
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE

The current implementation has a few flaws:

* A takeover run is called unconditionally when the timer fires even if
  the recovery master role has moved.  This means a node other than
  the recovery master can incorrectly do a takeover run.

* The rebalancing target nodes are cleared in the setup for a takeover
  run, regardless of whether the takeover run succeeds.

* The timer to force a rebalance isn't cleared if another takeover run
  occurs before the deadline.  Any forced rebalancing will happen in
  the first takeover run and when the timer expires some time later
  then an unnecessary takeover run will occur.

* If the recovery master role moves then the rebalancing data will
  stay on the original node and affect the next takeover run to occur
  if the recovery master role should come back to the original node.

Instead, store an array of rebalance target nodes in the recovery
master context.  This is passed as an extra argument to
ctdb_takeover_run() each time it is called and is cleared when a
takeover run succeeds.  The timer hangs off the array of rebalance
target nodes, which is cleared if the node isn't the recovery master.

This means that it is possible to lose rebalance data if the recovery
master role moves.  However, that's a difficult problem to solve.  The
best way of approaching it is probably to try to stop the recovery
master role from jumping around unnecessarily when inactive nodes join
the cluster.

The long term solution is to avoid this nonsense completely.  The IP
allocation algorithm needs to cache state between runs so that it
knows which nodes have just become healthy.  This also needs recovery
master stability.
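
The bookkeeping described above might look roughly like the following sketch.
All names (rebalance_ctx, run_takeover() and the two sample callbacks) are
hypothetical, and the timer is left out.  The point is only that the target
list survives a failed takeover run, is cleared by a successful one, and is
dropped when the node is not the recovery master.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TARGETS 8   /* illustrative fixed capacity */

struct rebalance_ctx {
    uint32_t targets[MAX_TARGETS];   /* PNNs to force rebalancing onto */
    size_t count;
};

typedef bool (*takeover_fn)(const uint32_t *targets, size_t count);

bool add_rebalance_target(struct rebalance_ctx *ctx, uint32_t pnn)
{
    if (ctx->count >= MAX_TARGETS) {
        return false;
    }
    ctx->targets[ctx->count++] = pnn;
    return true;
}

bool run_takeover(struct rebalance_ctx *ctx, bool is_recmaster, takeover_fn fn)
{
    bool ok;

    if (!is_recmaster) {
        ctx->count = 0;      /* not the recovery master: drop the targets */
        return false;
    }
    ok = fn(ctx->targets, ctx->count);
    if (ok) {
        ctx->count = 0;      /* clear only when the run succeeded */
    }
    return ok;
}

/* Sample callbacks standing in for a real takeover run. */
bool takeover_fails(const uint32_t *t, size_t c) { (void)t; (void)c; return false; }
bool takeover_succeeds(const uint32_t *t, size_t c) { (void)t; (void)c; return true; }
```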

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler
Martin Schwenke [Wed, 28 Aug 2013 05:46:27 +0000 (15:46 +1000)]
recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Reimplement reloadips
Martin Schwenke [Wed, 28 Aug 2013 05:38:48 +0000 (15:38 +1000)]
tools/ctdb: Reimplement reloadips

This implementation disables takeover runs on all nodes before trying
to reload IPs.  It also takes "all" or the list of PNNs as an argument
to the command instead of to -n.  -n can still be specified with a
single node indicating that node should be considered the current node
- that might be confusing so could be removed.

This implementation does not use CTDB_SRVID_RELOAD_ALL_IPS, so it can
be removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Defer ipreallocated requests when takeover runs are disabled
Martin Schwenke [Wed, 28 Aug 2013 01:50:23 +0000 (11:50 +1000)]
recoverd: Defer ipreallocated requests when takeover runs are disabled

The takeover run will fail anyway but deferring seems like a cleaner
option.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK
Martin Schwenke [Wed, 28 Aug 2013 01:32:54 +0000 (11:32 +1000)]
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK

Use disable_takeover_runs_handler() instead of maintaining duplicate
logic.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS
Martin Schwenke [Tue, 27 Aug 2013 05:04:40 +0000 (15:04 +1000)]
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS

This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK.  It stops
the IP checks but also causes any attempted takeover runs to fail and
be rescheduled.

This is meant to completely stop IP movements.
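
A rough model of the behaviour, with invented names (takeover_state,
disable_takeover_runs(), try_takeover_run()) and a plain timestamp in place of
the real timer: while the disable period is active, an attempted takeover run
fails and is flagged to be retried later.

```c
#include <stdbool.h>
#include <time.h>

struct takeover_state {
    time_t disabled_until;     /* takeover runs fail before this time */
    bool need_takeover_run;    /* set when a run must be retried */
};

/* Disable takeover runs for the given number of seconds. */
void disable_takeover_runs(struct takeover_state *s, time_t now, unsigned timeout)
{
    s->disabled_until = now + (time_t)timeout;
}

/* Returns true if the run was allowed to go ahead. */
bool try_takeover_run(struct takeover_state *s, time_t now)
{
    if (now < s->disabled_until) {
        s->need_takeover_run = true;   /* fail now, reschedule for later */
        return false;
    }
    s->need_takeover_run = false;
    return true;
}
```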

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Add a wait_for_all option to srvid_broadcast()
Martin Schwenke [Fri, 16 Aug 2013 08:47:51 +0000 (18:47 +1000)]
tools/ctdb: Add a wait_for_all option to srvid_broadcast()

This will be useful for other SRVIDs.

The error checking in the handler depends on the SRVID responding with
an int32_t where <0 indicates an error and >=0 is a PNN that
succeeded.
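
A sketch of how such replies could be tracked, assuming the reply is read as a
signed 32-bit value.  The reply_tracker structure and handle_reply() are
hypothetical and are not the code in tools/ctdb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reply_tracker {
    const uint32_t *pnns;   /* nodes a reply is expected from */
    bool *seen;             /* seen[i] is set once pnns[i] has replied */
    size_t count;
    bool wait_for_all;      /* false: the first successful reply is enough */
    bool done;
};

/* Record one reply: negative is an error, otherwise it is the replying PNN. */
void handle_reply(struct reply_tracker *t, int32_t reply)
{
    size_t i;
    bool all = true;

    if (reply < 0) {
        return;             /* error: keep waiting so the caller can retry */
    }
    if (!t->wait_for_all) {
        t->done = true;
        return;
    }
    for (i = 0; i < t->count; i++) {
        if (t->pnns[i] == (uint32_t)reply) {
            t->seen[i] = true;
        }
        if (!t->seen[i]) {
            all = false;
        }
    }
    t->done = all;
}
```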

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Factor out SRVID broadcast code from ipreallocate()
Martin Schwenke [Fri, 16 Aug 2013 07:06:23 +0000 (17:06 +1000)]
tools/ctdb: Factor out SRVID broadcast code from ipreallocate()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Change ipreallocate() to use a local done flag
Martin Schwenke [Fri, 16 Aug 2013 06:25:28 +0000 (16:25 +1000)]
tools/ctdb: Change ipreallocate() to use a local done flag

Instead of the current global variable.  This is in anticipation of
abstracting the code.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Factor out the SRVID handling code
Martin Schwenke [Fri, 16 Aug 2013 10:02:34 +0000 (20:02 +1000)]
recoverd: Factor out the SRVID handling code

The code that handles IP reallocate requests can be reused.

This also changes the result back to a SRVID caller to the PNN on
success or a negative error code on failure.  None of the callers
currently look at the result so this is harmless... but it will be
useful later.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Make the SRVID request structure generic
Martin Schwenke [Fri, 16 Aug 2013 10:10:10 +0000 (20:10 +1000)]
recoverd: Make the SRVID request structure generic

No need for a separate one for each SRVID.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Move disabling of IP checks into do_takeover_run()
Martin Schwenke [Tue, 3 Sep 2013 01:21:09 +0000 (11:21 +1000)]
recoverd: Move disabling of IP checks into do_takeover_run()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: do_takeover_run() should mark when a takeover run is in progress
Martin Schwenke [Tue, 3 Sep 2013 01:20:01 +0000 (11:20 +1000)]
recoverd: do_takeover_run() should mark when a takeover run is in progress

Nested takeover runs should never happen so they should fail.
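
The idea can be sketched with an in-progress flag.  The names below are made up
for the example and the real do_takeover_run() does considerably more.

```c
#include <stdbool.h>

struct rec_state {
    bool takeover_run_in_progress;
};

typedef bool (*run_fn)(struct rec_state *rec);

/* Run body() unless a takeover run is already in progress. */
bool guarded_takeover_run(struct rec_state *rec, run_fn body)
{
    bool ok;

    if (rec->takeover_run_in_progress) {
        return false;                       /* nested run: fail */
    }
    rec->takeover_run_in_progress = true;
    ok = body(rec);
    rec->takeover_run_in_progress = false;
    return ok;
}

/* A body that simply succeeds. */
bool plain_body(struct rec_state *rec)
{
    (void)rec;
    return true;
}

/* A body that (incorrectly) tries to start another run from inside a run. */
bool nesting_body(struct rec_state *rec)
{
    return guarded_takeover_run(rec, plain_body);
}
```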

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: takeover_fail_callback() doesn't need to set rec->need_takeover_run
Martin Schwenke [Tue, 27 Aug 2013 02:19:18 +0000 (12:19 +1000)]
recoverd: takeover_fail_callback() doesn't need to set rec->need_takeover_run

It is set on every failure anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Fail takeover run if "ipreallocated" fails
Martin Schwenke [Mon, 9 Sep 2013 02:13:11 +0000 (12:13 +1000)]
recoverd: Fail takeover run if "ipreallocated" fails

Previously flagging a failure was probably avoided because of attempts
to run "ipreallocated" events on stopped and banned nodes, which would
fail because they are in recovery.  Given the change to a new control
and that fallback only retries the old method on active nodes, this
should never fail in reasonable circumstances.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: New function do_takeover_run()
Martin Schwenke [Tue, 27 Aug 2013 02:14:34 +0000 (12:14 +1000)]
recoverd: New function do_takeover_run()

Factor the calling sequence for ctdb_takeover_run() into a new
function and call it instead.  This changes rec->need_takeover_run to
false for each successful takeover run and that seems to be the right
thing to do.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Stabilise the recovery master role
Martin Schwenke [Tue, 17 Sep 2013 02:00:26 +0000 (12:00 +1000)]
recoverd: Stabilise the recovery master role

On rare occasions a node that has been inactive will trigger
an election when it becomes active again.  If that node has been up
for the longest then it will win the election and the recovery master
role will spuriously move.

While a node remains inactive we reset the priority time to discourage
it from winning elections.  The priority time will now reflect roughly
how long the node has been active rather than how long it has been up.
That means the most stable node is more likely to win elections.

Having a stable recovery master means that disabling takeover runs
while reloading IPs is more likely to succeed.  It also improves the
chances of being able to cache information in the recovery master -
for example, between takeover runs.
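
A simplified model of the effect on elections, assuming that the candidate with
the earlier priority time wins.  The structure, the function names and the PNN
tie-break are assumptions for the example, not the recovery daemon's actual
election code.

```c
#include <stdbool.h>
#include <time.h>

struct candidate {
    unsigned pnn;
    time_t priority_time;   /* roughly: when the node last became active */
};

/* While a node is inactive, keep resetting its priority time to "now". */
void update_priority_time(struct candidate *c, bool inactive, time_t now)
{
    if (inactive) {
        c->priority_time = now;
    }
}

/* Earlier priority time wins; ties go to the lower PNN (an assumption). */
bool beats(const struct candidate *a, const struct candidate *b)
{
    if (a->priority_time != b->priority_time) {
        return a->priority_time < b->priority_time;
    }
    return a->pnn < b->pnn;
}
```

A node that has been up the longest but was inactive until a moment ago now
loses to a node that has been active for longer.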

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Banned nodes should not be told to run "ipreallocated" event
Martin Schwenke [Wed, 4 Sep 2013 03:54:23 +0000 (13:54 +1000)]
recoverd: Banned nodes should not be told to run "ipreallocated" event

They will reject it because they are in recovery.  This can result in
extra banning credits being applied to banned nodes.

This corresponds to commit 9132e6814ed927fa317f333f03dedb18f75d0e5b
from the 1.2.40 branch.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agocommon: Make parse_ip() valgrind-clean
Martin Schwenke [Mon, 9 Sep 2013 06:16:24 +0000 (16:16 +1000)]
common: Make parse_ip() valgrind-clean

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agorecoverd: Remove an orphaned comment
Martin Schwenke [Tue, 27 Aug 2013 05:27:30 +0000 (15:27 +1000)]
recoverd: Remove an orphaned comment

This should have been removed with the associated code in commit
14bd0b6961ef1294e9cba74ce875386b7dfbf446.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Update a comment to use current terminology
Martin Schwenke [Tue, 27 Aug 2013 05:24:17 +0000 (15:24 +1000)]
recoverd: Update a comment to use current terminology

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoclient: Remove unused function list_of_active_nodes_except_pnn()
Martin Schwenke [Tue, 27 Aug 2013 05:16:51 +0000 (15:16 +1000)]
client: Remove unused function list_of_active_nodes_except_pnn()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()
Martin Schwenke [Tue, 27 Aug 2013 05:14:10 +0000 (15:14 +1000)]
tools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()

list_of_active_nodes_except_pnn() is only used here and can be removed
if we remove this call.  Less is more...

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Fix a memory leak in parse_nodestring()
Martin Schwenke [Wed, 28 Aug 2013 05:36:27 +0000 (15:36 +1000)]
tools/ctdb: Fix a memory leak in parse_nodestring()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotests/eventscripts: Tests for memory checking in 00.ctdb
Martin Schwenke [Fri, 6 Sep 2013 06:37:52 +0000 (16:37 +1000)]
tests/eventscripts: Tests for memory checking in 00.ctdb

... plus updates to test infrastructure to support.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Clean up monitoring of system memory in 00.ctdb
Martin Schwenke [Fri, 6 Sep 2013 02:13:31 +0000 (12:13 +1000)]
eventscripts: Clean up monitoring of system memory in 00.ctdb

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoserver: standardize formatting of comment block for ctdb_reply_dmaster() while I...
Michael Adam [Thu, 22 Aug 2013 14:17:09 +0000 (16:17 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..

This was the comment block I was touching and meant to adapt in
commit 00d3bf092e2f72eda330978c75ec85f17e870553.
My search was apparently not unique...

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agodoc: Update NEWS ctdb-2.4
Martin Schwenke [Wed, 21 Aug 2013 04:01:25 +0000 (14:01 +1000)]
doc: Update NEWS

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agobuild: Fix build dependencies for ctdb_lock_tdb
Amitay Isaacs [Thu, 22 Aug 2013 07:59:31 +0000 (17:59 +1000)]
build: Fix build dependencies for ctdb_lock_tdb

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests/simple: Minimise the chance of a monitor event being cancelled
Martin Schwenke [Thu, 22 Aug 2013 04:04:59 +0000 (14:04 +1000)]
tests/simple: Minimise the chance of a monitor event being cancelled

A monitor event following a "ctdb delip" might reconfigure services.
If the monitor event is cancelled then a service might be stopped but
not yet restarted and this could result in the subsequent monitor
events failing.

This obviously needs to be fixed in CTDB itself.  This will happen by
making "ctdb reloadips" the supported way of reconfiguring IPs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agopackaging: Remove pushd/popd from maketarball.sh, don't need bash
Martin Schwenke [Wed, 21 Aug 2013 07:24:03 +0000 (17:24 +1000)]
packaging: Remove pushd/popd from maketarball.sh, don't need bash

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb_diagnostics: Add output of "ctdb getdbmap"
Martin Schwenke [Wed, 21 Aug 2013 06:48:21 +0000 (16:48 +1000)]
tools/ctdb_diagnostics: Add output of "ctdb getdbmap"

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb_diagnostics: Safer temporary file creation
Martin Schwenke [Wed, 21 Aug 2013 06:38:17 +0000 (16:38 +1000)]
tools/ctdb_diagnostics: Safer temporary file creation

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Avoid using a temporary file in 62.cnfs
Martin Schwenke [Wed, 21 Aug 2013 04:34:49 +0000 (14:34 +1000)]
eventscripts: Avoid using a temporary file in 62.cnfs

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoscripts: Remove gdb_backtrace
Martin Schwenke [Wed, 21 Aug 2013 04:27:39 +0000 (14:27 +1000)]
scripts: Remove gdb_backtrace

This uses potentially insecure temporary files and is not referenced
anywhere else.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Make most non-auto-all commands abort if run with -n all
Martin Schwenke [Mon, 19 Aug 2013 04:40:52 +0000 (14:40 +1000)]
tools/ctdb: Make most non-auto-all commands abort if run with -n all

Or if run with -n A,B,...

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Remove more non-essential fetching of PNN from daemon
Martin Schwenke [Wed, 14 Aug 2013 19:02:37 +0000 (05:02 +1000)]
tools/ctdb: Remove more non-essential fetching of PNN from daemon

The useful cases are either CTDB_CURRENT_NODE, in which case
ctdb_get_pnn() does the job, or a PNN, which is... ummm... a PNN!  :-)

This works because parse_nodestring() validates PNNs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Improve auto-all settings for some commands
Martin Schwenke [Mon, 19 Aug 2013 03:54:49 +0000 (13:54 +1000)]
tools/ctdb: Improve auto-all settings for some commands

* ipreallocate is cluster-wide so should not be auto-all

* enablescript, disablescript, getreclock, setreclock, natgwlist can
  all be auto-all without issues

* xpnn, ipiface are local-only so don't work with -n, so might as well
  not be auto-all

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Remove an unused temporary talloc context
Martin Schwenke [Fri, 16 Aug 2013 10:27:25 +0000 (20:27 +1000)]
recoverd: Remove an unused temporary talloc context

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
Martin Schwenke [Fri, 16 Aug 2013 04:10:57 +0000 (14:10 +1000)]
recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c

This is an internal structure.  It was moved into ctdb_private.h a
long time ago to allow unit testing.  Unit test compilation was
changed shortly afterwards to make this unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Log more information when interfaces change
Martin Schwenke [Thu, 15 Aug 2013 07:04:01 +0000 (17:04 +1000)]
recoverd: Log more information when interfaces change

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotraverse: Log when database traverse is started
Amitay Isaacs [Thu, 11 Jul 2013 06:00:30 +0000 (16:00 +1000)]
traverse: Log when database traverse is started

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Finish eventscript callback processing before debugging hung script
Amitay Isaacs [Thu, 22 Aug 2013 05:12:17 +0000 (15:12 +1000)]
ctdbd: Finish eventscript callback processing before debugging hung script

This ensures that the result of eventscripts is updated and the callback
is processed before debugging the hung script.  So "ctdb scriptstatus"
output will be useful when run from the hung-script debugging script.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

10 years agoctdbd: Make sure call data is freed if doing an early return
Amitay Isaacs [Tue, 23 Jul 2013 06:00:15 +0000 (16:00 +1000)]
ctdbd: Make sure call data is freed if doing an early return

This should avoid memory bloat when a request bounces between nodes.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon/io: Limit the queue buffer size for fair scheduling via tevent
Amitay Isaacs [Wed, 21 Aug 2013 04:42:06 +0000 (14:42 +1000)]
common/io: Limit the queue buffer size for fair scheduling via tevent

If we process all the data available in a socket buffer, CTDB can stay busy
processing lots of packets via immediate event mechanism in tevent.  After
processing an immediate event, tevent returns without epoll_wait.  So as long
as there are immediate events, tevent will never poll other FDs.  CTDB will
report this as "Event handling took xx seconds" warning.  This is misleading
since CTDB is very busy processing packets, but never gets to the point of
polling FDs.

The improvement in socket handling made it worse when handling traverse
control.  There were lots of packets filled in the socket buffer quickly and
CTDB stayed busy processing those packets and not polling other FDs and timer
events.  This can lead to controls timing out and, in the worst case, other
nodes marking the busy node as disconnected.
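
The general idea of bounding the work done per callback can be sketched as
follows.  This is not the queue code itself: it caps a packet count rather than
a buffer size, and MAX_PACKETS_PER_CALL and the structure are invented for the
illustration.

```c
#include <stddef.h>

#define MAX_PACKETS_PER_CALL 4   /* illustrative cap */

struct packet_queue {
    size_t pending;     /* packets waiting to be handled */
    size_t processed;   /* packets handled so far */
};

/*
 * Handle at most MAX_PACKETS_PER_CALL packets and return how many remain,
 * so the caller can go back to the event loop before continuing.
 */
size_t process_some(struct packet_queue *q)
{
    size_t n = q->pending;

    if (n > MAX_PACKETS_PER_CALL) {
        n = MAX_PACKETS_PER_CALL;
    }
    q->pending -= n;
    q->processed += n;
    return q->pending;
}
```

Returning to the event loop between batches gives other FDs and timers a chance
to be serviced even when one socket has a large backlog.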

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoRevert "common/io: Keep queue buffer size multiple of 4K"
Amitay Isaacs [Tue, 20 Aug 2013 04:20:09 +0000 (14:20 +1000)]
Revert "common/io: Keep queue buffer size multiple of 4K"

This reverts commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9.

This is not the best approach.  Allowing the queue buffer size to grow
indefinitely causes a large number of CTDB packets to be queued up very
quickly which, when processed via immediate events, will block CTDB from
processing events from other FDs.  If there are immediate events queued
up, tevent will never process any of the FDs till all immediate events
are processed.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoRevert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy...
Amitay Isaacs [Mon, 19 Aug 2013 05:04:46 +0000 (15:04 +1000)]
Revert "LACOUNT:  Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"

This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.

This is a premature optimization.  A record can bounce between nodes
very quickly if it is contended.  There is no need to hold a
record on a node unnecessarily.  If record contention becomes bad,
enabling sticky records on a database is a better idea.

Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Print a log message when a key becomes hot
Amitay Isaacs [Mon, 15 Jul 2013 05:39:47 +0000 (15:39 +1000)]
ctdbd: Print a log message when a key becomes hot

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster
Amitay Isaacs [Fri, 9 Aug 2013 07:22:55 +0000 (17:22 +1000)]
ctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster

An empty record with rsn=0 should not be written on any node other than
the dmaster.  This is however not true for persistent databases.  So for
now, apply the check only for volatile databases.
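
A minimal sketch of the rule, assuming a simplified record layout.  The
names struct sketch_record and sketch_may_store() are hypothetical and are
not taken from the ctdbd sources.  They only restate the condition given
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a record (not the real ctdb header). */
struct sketch_record {
	uint64_t rsn;		/* record sequence number */
	uint32_t dmaster;	/* PNN of the node that is dmaster */
	size_t data_len;	/* 0 means the record is empty */
};

/* Return true if this node may write the record locally. */
static bool sketch_may_store(const struct sketch_record *rec,
			     bool persistent, uint32_t this_pnn)
{
	assert(rec != NULL);

	bool empty_with_rsn0 = (rec->data_len == 0 && rec->rsn == 0);

	if (!empty_with_rsn0) {
		return true;	/* the check only concerns empty rsn=0 records */
	}
	if (persistent) {
		return true;	/* check is not applied to persistent databases */
	}
	/* Volatile database: only the dmaster writes the empty record. */
	return rec->dmaster == this_pnn;
}
```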

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Fix message in showban when node is banned
Martin Schwenke [Fri, 9 Aug 2013 07:00:10 +0000 (17:00 +1000)]
tools/ctdb: Fix message in showban when node is banned

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()
Martin Schwenke [Fri, 9 Aug 2013 06:58:42 +0000 (16:58 +1000)]
tools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()

This has the side effect of making these commands more resilient to
control timeouts.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Factor out common pattern used in disable/enable/stop/continue
Martin Schwenke [Fri, 9 Aug 2013 06:34:59 +0000 (16:34 +1000)]
tools/ctdb: Factor out common pattern used in disable/enable/stop/continue

Now we will only have one set of bugs.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agotools/ctdb: Factor, simplify and improve robustness of ipreallocate code
Martin Schwenke [Fri, 9 Aug 2013 05:41:37 +0000 (15:41 +1000)]
tools/ctdb: Factor, simplify and improve robustness of ipreallocate code

Having other functions call control_ipreallocate() suggests that it
might look at the argc/argv arguments that are passed.  This is not
the case.  Change the callers so they call the new ipreallocate()
function instead.

Broadcast CTDB_SRVID_TAKEOVER_RUN to all connected nodes.  Inactive
nodes will ignore it.  This is safe since we only want 1 reply.  If we
didn't get a response, we don't actually care if there's no active
recovery master - just fire, wait, retry, ...

Ignore some failures on the basis that they might be transient, so it
is probably worth retrying.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Use ctdb_get_pnn() to get PNN of the current node
Martin Schwenke [Wed, 14 Aug 2013 18:38:02 +0000 (04:38 +1000)]
tools/ctdb: Use ctdb_get_pnn() to get PNN of the current node

This has already been stored at connect time and can't fail.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoutil: In passing the code, fix a space vs. tab in set_close_on_exec().
Michael Adam [Mon, 19 Aug 2013 14:54:06 +0000 (16:54 +0200)]
util: In passing the code, fix a space vs. tab in set_close_on_exec().

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agoserver: standardize formatting of comment block for ctdb_reply_dmaster() while I...
Michael Adam [Mon, 19 Aug 2013 15:07:19 +0000 (17:07 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agoserver: fix wording and punctuation in comment block for ctdb_reply_dmaster().
Michael Adam [Tue, 13 Aug 2013 08:17:45 +0000 (10:17 +0200)]
server: fix wording and punctuation in comment block for ctdb_reply_dmaster().

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agorecoverd: Improve log message when nodes disagree on recmaster
Amitay Isaacs [Wed, 14 Aug 2013 01:44:12 +0000 (11:44 +1000)]
recoverd: Improve log message when nodes disagree on recmaster

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon: Null terminate process name string so valgrind doesn't complain
Amitay Isaacs [Fri, 2 Aug 2013 01:05:08 +0000 (11:05 +1000)]
common: Null terminate process name string so valgrind doesn't complain

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agovacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)
Amitay Isaacs [Mon, 12 Aug 2013 05:50:30 +0000 (15:50 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.
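
The failure mode can be modelled with a small sketch.  It assumes that
each node forwards a request to whichever node its own copy of the record
header names as dmaster.  The function sketch_route_request() and its
array representation are hypothetical and do not correspond to ctdb code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* believed_dmaster[n] is the PNN that node n's copy of the record header
 * names as dmaster.  Follow the forwarding chain from node start.
 * Returns the number of hops until a node names itself as dmaster, or -1
 * if the request is still being forwarded after num_nodes hops, which
 * can only happen when the headers form a loop. */
static int sketch_route_request(const uint32_t *believed_dmaster,
				size_t num_nodes, uint32_t start)
{
	assert(believed_dmaster != NULL && start < num_nodes);

	uint32_t node = start;

	for (size_t hops = 0; hops <= num_nodes; hops++) {
		uint32_t next = believed_dmaster[node];

		assert(next < num_nodes);
		if (next == node) {
			return (int)hops;	/* reached the real dmaster */
		}
		node = next;			/* forward the request */
	}
	return -1;				/* request keeps bouncing */
}
```

With consistent headers, for example {1, 1, 1}, a request from node 0
reaches the dmaster in one hop.  With headers {1, 0, 0}, nodes 0 and 1
point at each other and the request never settles.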

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agovacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)
Amitay Isaacs [Mon, 12 Aug 2013 05:51:00 +0000 (15:51 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agodb_wrap: Make sure tdb messages are logged correctly
Amitay Isaacs [Tue, 6 Aug 2013 04:37:13 +0000 (14:37 +1000)]
db_wrap: Make sure tdb messages are logged correctly

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoeventscripts: Become unhealthy faster on nfsd failure
Martin Schwenke [Mon, 12 Aug 2013 01:36:25 +0000 (11:36 +1000)]
eventscripts: Become unhealthy faster on nfsd failure

Anecdotal evidence suggests that most nfsd RPC check failures are due
to cluster filesystem or storage problems.  Apparently these are rarely
helped by attempting to restart the NFS service because the restart
tends to hang.

Fail after 2 nfsd RPC check failures, instead of waiting for 6
failures.  Restart on every 10th failure to try to bring the node back
to good health.
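
The thresholds can be sketched as below.  The eventscripts themselves are
shell.  This C fragment only illustrates the counting, and it assumes that
"after 2 failures" means a consecutive failure count of 2 or more.  The
names sketch_nfsd_actions(), SKETCH_ACTION_UNHEALTHY and
SKETCH_ACTION_RESTART are hypothetical.

```c
#include <assert.h>

enum {
	SKETCH_ACTION_UNHEALTHY = 1,	/* report the node as unhealthy */
	SKETCH_ACTION_RESTART = 2	/* attempt to restart the NFS service */
};

/* Map a count of consecutive nfsd RPC check failures to actions. */
static unsigned sketch_nfsd_actions(int consecutive_failures)
{
	assert(consecutive_failures >= 0);

	unsigned actions = 0;

	if (consecutive_failures >= 2) {
		actions |= SKETCH_ACTION_UNHEALTHY;
	}
	if (consecutive_failures > 0 && consecutive_failures % 10 == 0) {
		actions |= SKETCH_ACTION_RESTART;
	}
	return actions;
}
```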

Update unit tests to match.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Increase default control timeout to 10 seconds
Martin Schwenke [Fri, 9 Aug 2013 01:56:29 +0000 (11:56 +1000)]
tools/ctdb: Increase default control timeout to 10 seconds

The current 3 second timeout is arbitrary and users trip over it
sometimes.

Signed-off-by: Martin Schwenke <martin@meltin.net>