Martin Schwenke [Wed, 18 Sep 2013 03:43:53 +0000 (13:43 +1000)]
tests/tool: Fix some comment typos
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 18 Sep 2013 03:40:52 +0000 (13:40 +1000)]
tools/ctdb: Stop return value from being clobbered in control_lvsmaster()
ret is initialised too early and is clobbered by the call to
ctdb_ctrl_getcapabilities(). Initialising it later means that the
function returns -1 when no LVS master is found.
Signed-off-by: Martin Schwenke <martin@meltin.net>
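The clobbering pattern described in this commit can be sketched as follows. This is only an illustration: query_capability() is a hypothetical stand-in for a call such as ctdb_ctrl_getcapabilities(), find_lvs_master() is not the real control_lvsmaster(), and "node 2 is LVS-capable" is an invented rule. The point is that the "not found" value is assigned after the loop, so an earlier successful call cannot overwrite it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability query; returns 0 on success, like many
 * control calls. Here only node 2 is pretended to be LVS-capable. */
int query_capability(int node, bool *is_lvs)
{
    *is_lvs = (node == 2);
    return 0;
}

/* Returns the first LVS-capable node, or -1 if there is none. */
int find_lvs_master(const int *nodes, size_t n)
{
    int ret;
    size_t i;

    for (i = 0; i < n; i++) {
        bool is_lvs = false;

        /* If ret had been set to -1 before the loop, this assignment
         * would replace it with 0 and "not found" would be lost. */
        ret = query_capability(nodes[i], &is_lvs);
        if (ret != 0) {
            return -1;
        }
        if (is_lvs) {
            return nodes[i];
        }
    }

    /* Initialise the failure value late, after the last call. */
    ret = -1;
    return ret;
}
```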
Martin Schwenke [Wed, 18 Sep 2013 03:40:10 +0000 (13:40 +1000)]
client: Fix some format string compiler warnings
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Fri, 30 Aug 2013 13:38:15 +0000 (23:38 +1000)]
common: Fix setting of debug level in the client code
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Sun, 25 Aug 2013 11:44:59 +0000 (21:44 +1000)]
libctdb: Remove incomplete libctdb
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 27 Aug 2013 04:46:08 +0000 (14:46 +1000)]
tools/ctdb: Pass memory context for returning nodes in parse_nodestring
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Sun, 25 Aug 2013 11:43:29 +0000 (21:43 +1000)]
tests: Do not use libctdb code in tests
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 29 Aug 2013 07:22:38 +0000 (17:22 +1000)]
tools/ctdb: Do not use libctdb for commandline tool
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 23 Aug 2013 06:52:24 +0000 (16:52 +1000)]
client: Add ctdb_ctrl_getdbseqnum() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 23 Aug 2013 06:52:02 +0000 (16:52 +1000)]
client: Add ctdb_ctrl_getdbstatistics() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 23 Aug 2013 06:51:26 +0000 (16:51 +1000)]
client: Add ctdb_client_check_message_handlers() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 23 Aug 2013 06:49:46 +0000 (16:49 +1000)]
client: Remove extra whitespaces
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 23 Aug 2013 07:21:24 +0000 (17:21 +1000)]
tests: Remove unused test program ctdb_fetch_lock_once
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 29 Aug 2013 06:58:47 +0000 (16:58 +1000)]
tools/ctdb: When printing TDB data as a string, use correct length of the string
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
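A minimal sketch of printing length-delimited data as a string, assuming a blob type with a pointer and an explicit size. struct blob and format_blob() are hypothetical names used for illustration, not the tool's actual code. The "%.*s" precision limits how many bytes are read, so the data does not need a trailing NUL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a TDB-style blob: a pointer plus an
 * explicit size, with no guarantee of a trailing NUL byte. */
struct blob {
    const unsigned char *dptr;
    size_t dsize;
};

/* Format the blob into out[], reading at most dsize bytes from it.
 * Returns whatever snprintf() returns. */
int format_blob(char *out, size_t outlen, struct blob b)
{
    return snprintf(out, outlen, "%.*s",
                    (int)b.dsize, (const char *)b.dptr);
}
```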
Amitay Isaacs [Fri, 23 Aug 2013 06:57:40 +0000 (16:57 +1000)]
tools/ctdb: Remove un-implemented ctdb vacuum command
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 25 Sep 2013 09:10:13 +0000 (19:10 +1000)]
tests: Add a simple test to test cluster wide database traverse
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 9 Sep 2013 02:46:26 +0000 (12:46 +1000)]
traverse: Send traverse end record from traverse child process
Traverse records are sent directly from traverse child process, but
the last empty record signalling end of traverse is sent from ctdbd.
This creates a race condition between ctdbd and traverse child.
There are two fds from traverse child to ctdbd - a pipe to track status
of the child process and unix socket connection for sending records.
It's possible that last few records are sitting in unix socket buffer
when ctdbd reads the status written from traverse child. This will
be interpreted as end of traverse and ctdbd will send the last empty
record to originating node before it has processed the pending packets
in unix socket connection.
The race is avoided by sending the last empty record marking end of
traverse from the child process.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 10 Sep 2013 07:52:26 +0000 (17:52 +1000)]
traverse: Wait till all data has been flushed from output queue
To improve the traverse performance, records are directly sent from
traverse child process to the originating node. Make sure that all the
data is sent via socket, before informing ctdbd that traverse is complete.
Without waiting for all the packets to be flushed from the queue,
child process can incorrectly signal ctdbd that traverse has ended.
This will cause the pending records in the queue never to make it to
the originating node and traverse information will not be complete.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
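The requirement in this commit (signal completion only after every byte has been handed to the socket) can be illustrated with a blocking analogue. This is a sketch under assumptions: write_all() is a hypothetical helper for a blocking file descriptor, whereas the daemon's own output queue is event-driven and is not shown here.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly len bytes to a blocking fd, retrying on short writes
 * and EINTR. Returns 0 once everything is written, -1 on error. Only
 * after a 0 return would it be safe to report "traverse complete". */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted, try again */
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```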
Amitay Isaacs [Fri, 13 Sep 2013 03:28:31 +0000 (13:28 +1000)]
traverse: Use ctdb local variable for convenience
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 6 Sep 2013 08:11:40 +0000 (18:11 +1000)]
traverse: Check if local traverse failed or succeeded
Passing the result of tdb_traverse_read() allows ctdbd to determine
if the local traverse succeeded or not. In case of a problem with local
traverse, ctdbd can log an error.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 6 Sep 2013 04:51:54 +0000 (14:51 +1000)]
traverse: Log information when traverse starts and ends
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 23 Sep 2013 06:23:36 +0000 (16:23 +1000)]
tool/ltdbtool: -h option does not require an argument
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 23 Sep 2013 06:22:36 +0000 (16:22 +1000)]
scripts: Add support for optional ctdbd.conf configuration file
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 23 Sep 2013 06:21:30 +0000 (16:21 +1000)]
utils: Make debug level strings case-insensitive
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
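A minimal sketch of case-insensitive level parsing using POSIX strcasecmp(). The level names in the table and the function parse_debug_level() are illustrative assumptions, not the project's actual list or API.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Illustrative level names; the real table may differ. */
static const char *const level_names[] = {
    "ERR", "WARNING", "NOTICE", "INFO", "DEBUG"
};

/* Returns the index of the matching level name, ignoring case,
 * or -1 if the string matches no known name. */
int parse_debug_level(const char *s)
{
    size_t i;

    for (i = 0; i < sizeof(level_names) / sizeof(level_names[0]); i++) {
        if (strcasecmp(s, level_names[i]) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```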
Martin Schwenke [Mon, 23 Sep 2013 06:20:42 +0000 (16:20 +1000)]
tools/ctdb: Fix help messages for ctdb commands
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 23 Sep 2013 06:19:52 +0000 (16:19 +1000)]
tools/ctdb: Ban time of 0 is invalid
Apparently it used to mean a permanent ban but it is unclear if this
was ever supported.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Mon, 16 Sep 2013 04:35:13 +0000 (14:35 +1000)]
eventscripts: Load CTDB configuration settings in 70.iscsi
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Wed, 18 Sep 2013 07:07:32 +0000 (17:07 +1000)]
recoverd: Disable takeover runs on other nodes for 5 minutes
60 seconds might not be long enough to kill all connections and
release IPs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 18 Sep 2013 07:06:16 +0000 (17:06 +1000)]
recoverd: Improve logging for takeover runs
Takeover runs are currently silent when they succeed. However, they
are important, so log something by default.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 18 Sep 2013 06:35:18 +0000 (16:35 +1000)]
tools/ctdb: Use the standard long timeout when disabling takeover runs
This means that takeover runs will be disabled for about as long as the
reloadips control can take to complete.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 03:20:26 +0000 (13:20 +1000)]
tools/ctdb: Fix arguments/semantics of rebalance node
There's no reason why specifying a node should be compulsory. This is
a cluster-wide operation because it is implemented by the recovery
master so multiple nodes should not be specified using -n. However,
the command should be able to specify multiple nodes so let it have
its own nodestring argument.
This change should be backward compatible with the old requirement of
specifying a single node via -n.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 03:19:09 +0000 (13:19 +1000)]
tools/ctdb: Make rebalancenode more robust
Use a broadcast instead of trying to win the race of determining the
recovery master and then sending the message before the recovery
master changes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 01:29:14 +0000 (11:29 +1000)]
tests/simple: Fix the reloadips test to cope with changes to reloadips
Specifying nodes to reload no longer uses -n.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 01:23:07 +0000 (11:23 +1000)]
recoverd: Be careful about freeing the list of IP rebalance target nodes
It can change during a takeover run. If it does then don't free it.
There are potentially fancier solutions (e.g. check what PNNs are new
to the list) to this issue but this is the simplest.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 01:21:10 +0000 (11:21 +1000)]
recoverd: reloadips should rebalance target nodes for new IPs
Otherwise, if existing IPs are added to extra nodes (that have,
perhaps, been disconnected) then those IPs will not be rebalanced
across the extra nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 5 Sep 2013 05:56:51 +0000 (15:56 +1000)]
ctdbd: Make ctdb_reloadips_child send controls asynchronously
Deleting IPs can take a while because IPs are released and connections
are killed, so send the delete controls in parallel. In fact, since
the set of IPs being added and deleted will be disjoint, send all the
adds/deletes at the same time and then wait.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 4 Sep 2013 04:30:04 +0000 (14:30 +1000)]
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE
The current implementation has a few flaws:
* A takeover run is called unconditionally when the timer goes off, even if
the recovery master role has moved. This means a node other than
the recovery master can incorrectly do a takeover run.
* The rebalancing target nodes are cleared in the setup for a takeover
run, regardless of whether the takeover run succeeds.
* The timer to force a rebalance isn't cleared if another takeover run
occurs before the deadline. Any forced rebalancing will happen in
the first takeover run and when the timer expires some time later
then an unnecessary takeover run will occur.
* If the recovery master role moves then the rebalancing data will
stay on the original node and affect the next takeover run to occur
if the recovery master role should come back to the original node.
Instead, store an array of rebalance target nodes in the recovery
master context. This is passed as an extra argument to
ctdb_takeover_run() each time it is called and is cleared when a
takeover run succeeds. The timer hangs off the array of rebalance
target nodes, which is cleared if the node isn't the recovery master.
This means that it is possible to lose rebalance data if the recovery
master role moves. However, that's a difficult problem to solve. The
best way of approaching it is probably to try to stop the recovery
master role from jumping around unnecessarily when inactive nodes join
the cluster.
The long term solution is to avoid this nonsense completely. The IP
allocation algorithm needs to cache state between runs so that it
knows which nodes have just become healthy. This also needs recovery
master stability.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 28 Aug 2013 05:46:27 +0000 (15:46 +1000)]
recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 28 Aug 2013 05:38:48 +0000 (15:38 +1000)]
tools/ctdb: Reimplement reloadips
This implementation disables takeover runs on all nodes before trying
to reload IPs. It also takes "all" or the list of PNNs as an argument
to the command instead of to -n. -n can still be specified with a
single node indicating that node should be considered the current node
- that might be confusing so could be removed.
This implementation does not use CTDB_SRVID_RELOAD_ALL_IPS, so it can
be removed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 28 Aug 2013 01:50:23 +0000 (11:50 +1000)]
recoverd: Defer ipreallocated requests when takeover runs are disabled
The takeover run will fail anyway but deferring seems like a cleaner
option.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 28 Aug 2013 01:32:54 +0000 (11:32 +1000)]
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK
Use disable_takeover_runs_handler() instead of maintaining duplicate
logic.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 27 Aug 2013 05:04:40 +0000 (15:04 +1000)]
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS
This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops
the IP checks but also causes any attempted takeover runs to fail and
be rescheduled.
This is meant to completely stop IP movements.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 08:47:51 +0000 (18:47 +1000)]
tools/ctdb: Add a wait_for_all option to srvid_broadcast()
This will be useful for other SRVIDs.
The error checking in the handler depends on the SRVID responding with
a uint32_t where <0 indicates an error and >=0 is a PNN that
succeeded.
Signed-off-by: Martin Schwenke <martin@meltin.net>
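The reply convention described here can be sketched as below. Two assumptions are made loudly: the result is read as a signed 32-bit value (an unsigned value cannot be negative), and struct bcast_state, bcast_init() and bcast_reply() are hypothetical names, not the tool's actual code. A negative reply ends the wait with a failure. Otherwise the reply is a PNN, and with wait_for_all set the broadcast is done only once every expected node has answered.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 32

struct bcast_state {
    bool wait_for_all;
    bool replied[MAX_NODES];   /* which PNNs have answered */
    unsigned expected;         /* number of nodes we wait for */
    unsigned received;
    bool failed;
    bool done;
};

void bcast_init(struct bcast_state *s, unsigned expected, bool wait_for_all)
{
    memset(s, 0, sizeof(*s));
    s->expected = expected;
    s->wait_for_all = wait_for_all;
}

/* Handle one reply: result < 0 is an error, result >= 0 is a PNN. */
void bcast_reply(struct bcast_state *s, int32_t result)
{
    if (result < 0) {
        s->failed = true;
        s->done = true;
        return;
    }
    if (result >= MAX_NODES || s->replied[result]) {
        return;                /* out of range or duplicate: ignore */
    }
    s->replied[result] = true;
    s->received++;
    if (!s->wait_for_all || s->received == s->expected) {
        s->done = true;
    }
}
```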
Martin Schwenke [Fri, 16 Aug 2013 07:06:23 +0000 (17:06 +1000)]
tools/ctdb: Factor out SRVID broadcast code from ipreallocate()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 06:25:28 +0000 (16:25 +1000)]
tools/ctdb: Change ipreallocate() to use a local done flag
Instead of the current global variable. This is in anticipation of
abstracting the code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 10:02:34 +0000 (20:02 +1000)]
recoverd: Factor out the SRVID handling code
The code that handles IP reallocate requests can be reused.
This also changes the result back to a SRVID caller to the PNN on
success or a negative error code on failure. None of the callers
currently look at the result so this is harmless... but it will be
useful later.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 10:10:10 +0000 (20:10 +1000)]
recoverd: Make the SRVID request structure generic
No need for a separate one for each SRVID.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 3 Sep 2013 01:21:09 +0000 (11:21 +1000)]
recoverd: Move disabling of IP checks into do_takeover_run()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 3 Sep 2013 01:20:01 +0000 (11:20 +1000)]
recoverd: do_takeover_run() should mark when a takeover run is in progress
Nested takeover runs should never happen so they should fail.
Signed-off-by: Martin Schwenke <martin@meltin.net>
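A minimal sketch of guarding against nested runs with an in-progress flag. All names here (struct rec_sketch, do_takeover_run_sketch(), run_ok(), run_nested()) are hypothetical; the real do_takeover_run() does considerably more. The sketch also follows the neighbouring commits in clearing the "need a takeover run" flag on success and setting it on failure.

```c
#include <stdbool.h>

struct rec_sketch {
    bool takeover_run_in_progress;
    bool need_takeover_run;
};

typedef bool (*run_fn)(struct rec_sketch *rec);

/* Run fn() unless a takeover run is already in progress.
 * Returns true on success, false on failure or on a nested attempt. */
bool do_takeover_run_sketch(struct rec_sketch *rec, run_fn fn)
{
    bool ok;

    if (rec->takeover_run_in_progress) {
        /* Nested run: fail, and leave the flag to the outer caller. */
        rec->need_takeover_run = true;
        return false;
    }

    rec->takeover_run_in_progress = true;
    ok = fn(rec);
    rec->takeover_run_in_progress = false;

    rec->need_takeover_run = !ok;
    return ok;
}

/* Illustrative bodies: one that succeeds, one that tries to nest. */
bool run_ok(struct rec_sketch *rec)
{
    (void)rec;
    return true;
}

bool run_nested(struct rec_sketch *rec)
{
    return do_takeover_run_sketch(rec, run_ok);
}
```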
Martin Schwenke [Tue, 27 Aug 2013 02:19:18 +0000 (12:19 +1000)]
recoverd: takeover_fail_callback() doesn't need to set rec->need_takeover_run
It is set on every failure anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Sep 2013 02:13:11 +0000 (12:13 +1000)]
recoverd: Fail takeover run if "ipreallocated" fails
Previously flagging a failure was probably avoided because of attempts
to run "ipreallocated" events on stopped and banned nodes, which would
fail because they are in recovery. Given the change to a new control
and that fallback only retries the old method on active nodes, this
should never fail in reasonable circumstances.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 27 Aug 2013 02:14:34 +0000 (12:14 +1000)]
recoverd: New function do_takeover_run()
Factor the calling sequence for ctdb_takeover_run() into a new
function and call it instead. This changes rec->need_takeover_run to
false for each successful takeover run and that seems to be the right
thing to do.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 17 Sep 2013 02:00:26 +0000 (12:00 +1000)]
recoverd: Stabilise the recovery master role
On rare occasions a node that has been inactive will trigger
an election when it becomes active again. If that node has been up
for the longest then it will win the election and the recovery master
role will spuriously move.
While a node remains inactive we reset the priority time to discourage
it from winning elections. The priority time will now reflect roughly
how long the node has been active rather than how long it has been up.
That means the most stable node is more likely to win elections.
Having a stable recovery master means that disabling takeover runs
while reloading IPs is more likely to succeed. It also improves the
chances of being able to cache information in the recovery master -
for example, between takeover runs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
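The effect of resetting the priority time can be sketched as follows. This is an illustration only: struct node_sketch, refresh_priority_time() and beats() are hypothetical, and the comparison assumes that an earlier priority time wins, ignoring any other factors a real election may weigh.

```c
#include <stdbool.h>
#include <time.h>

struct node_sketch {
    bool active;
    time_t priority_time;   /* earlier means "more senior" */
};

/* Called periodically: while a node is inactive its priority time is
 * pushed forward, so it tracks time active rather than time up. */
void refresh_priority_time(struct node_sketch *n, time_t now)
{
    if (!n->active) {
        n->priority_time = now;
    }
}

/* True if a would beat b on priority time alone. */
bool beats(const struct node_sketch *a, const struct node_sketch *b)
{
    return a->priority_time < b->priority_time;
}
```

Without the reset, a long-running but inactive node would win on age as soon as it rejoined. With it, the node that has been active longest is favoured.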
Martin Schwenke [Wed, 4 Sep 2013 03:54:23 +0000 (13:54 +1000)]
recoverd: Banned nodes should not be told to run "ipreallocated" event
They will reject it because they are in recovery. This can result in
extra banning credits being applied to banned nodes.
This corresponds to commit
9132e6814ed927fa317f333f03dedb18f75d0e5b
from the 1.2.40 branch.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Sep 2013 06:16:24 +0000 (16:16 +1000)]
common: Make parse_ip() valgrind-clean
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Tue, 27 Aug 2013 05:27:30 +0000 (15:27 +1000)]
recoverd: Remove an orphaned comment
This should have been removed with the associated code in commit
14bd0b6961ef1294e9cba74ce875386b7dfbf446.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 27 Aug 2013 05:24:17 +0000 (15:24 +1000)]
recoverd: Update a comment to use current terminology
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 27 Aug 2013 05:16:51 +0000 (15:16 +1000)]
client: Remove unused function list_of_active_nodes_except_pnn()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 27 Aug 2013 05:14:10 +0000 (15:14 +1000)]
tools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()
list_of_active_nodes_except_pnn() is only used here and can be removed
if we remove this call. Less is more...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 28 Aug 2013 05:36:27 +0000 (15:36 +1000)]
tools/ctdb: Fix a memory leak in parse_nodestring()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 06:37:52 +0000 (16:37 +1000)]
tests/eventscripts: Tests for memory checking in 00.ctdb
... plus updates to test infrastructure to support.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 6 Sep 2013 02:13:31 +0000 (12:13 +1000)]
eventscripts: Clean up monitoring of system memory in 00.ctdb
Signed-off-by: Martin Schwenke <martin@meltin.net>
Michael Adam [Thu, 22 Aug 2013 14:17:09 +0000 (16:17 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..
This was the comment block I was touching and meant to adapt in
commit
00d3bf092e2f72eda330978c75ec85f17e870553.
My search was apparently not unique...
Signed-off-by: Michael Adam <obnox@samba.org>
Martin Schwenke [Wed, 21 Aug 2013 04:01:25 +0000 (14:01 +1000)]
doc: Update NEWS
Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 22 Aug 2013 07:59:31 +0000 (17:59 +1000)]
build: Fix build dependencies for ctdb_lock_tdb
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Thu, 22 Aug 2013 04:04:59 +0000 (14:04 +1000)]
tests/simple: Minimise the chance of a monitor event being cancelled
A monitor event following a "ctdb delip" might reconfigure services.
If the monitor event is cancelled then a service might be stopped but
not yet restarted and this could result in the subsequent monitor
events failing.
This obviously needs to be fixed in CTDB itself. This will happen by
making "ctdb reloadips" the supported way of reconfiguring IPs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 21 Aug 2013 07:24:03 +0000 (17:24 +1000)]
packaging: Remove pushd/popd from maketarball.sh, don't need bash
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 21 Aug 2013 06:48:21 +0000 (16:48 +1000)]
tools/ctdb_diagnostics: Add output of "ctdb getdbmap"
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 21 Aug 2013 06:38:17 +0000 (16:38 +1000)]
tools/ctdb_diagnostics: Safer temporary file creation
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 21 Aug 2013 04:34:49 +0000 (14:34 +1000)]
eventscripts: Avoid using a temporary file in 62.cnfs
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 21 Aug 2013 04:27:39 +0000 (14:27 +1000)]
scripts: Remove gdb_backtrace
This uses potentially insecure temporary files and is not referenced
anywhere else.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 19 Aug 2013 04:40:52 +0000 (14:40 +1000)]
tools/ctdb: Make most non-auto-all commands abort if run with -n all
Or if run with -n A,B,...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 14 Aug 2013 19:02:37 +0000 (05:02 +1000)]
tools/ctdb: Remove more non-essential fetching of PNN from daemon
The useful cases are either CTDB_CURRENT_NODE, in which case
ctdb_get_pnn() does the job, or a PNN, which is... ummm... a PNN! :-)
This works because parse_nodestring() validates PNNs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 19 Aug 2013 03:54:49 +0000 (13:54 +1000)]
tools/ctdb: Improve auto-all settings for some commands
* ipreallocate is cluster-wide so should not be auto-all
* enablescript, disablescript, getreclock, setreclock, natgwlist can
all be auto-all without issues
* xpnn, ipiface are local-only so don't work with -n, so might as well
not be auto-all
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 10:27:25 +0000 (20:27 +1000)]
recoverd: Remove an unused temporary talloc context
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Aug 2013 04:10:57 +0000 (14:10 +1000)]
recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
This is an internal structure. It was moved into ctdb_private.h a
long time ago to allow unit testing. Unit test compilation was
changed shortly afterwards to make this unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 15 Aug 2013 07:04:01 +0000 (17:04 +1000)]
recoverd: Log more information when interfaces change
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Thu, 11 Jul 2013 06:00:30 +0000 (16:00 +1000)]
traverse: Log when database traverse is started
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 22 Aug 2013 05:12:17 +0000 (15:12 +1000)]
ctdbd: Finish eventscript callback processing before debugging hung script
This ensures that the eventscript result is updated and the callback is
processed before a hung script is debugged, so that "ctdb scriptstatus"
output is useful when run while debugging the hung script.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Tue, 23 Jul 2013 06:00:15 +0000 (16:00 +1000)]
ctdbd: Make sure call data is freed if doing an early return
This should avoid memory bloat when a request bounces between nodes.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 21 Aug 2013 04:42:06 +0000 (14:42 +1000)]
common/io: Limit the queue buffer size for fair scheduling via tevent
If we process all the data available in a socket buffer, CTDB can stay busy
processing lots of packets via immediate event mechanism in tevent. After
processing an immediate event, tevent returns without epoll_wait. So as long
as there are immediate events, tevent will never poll other FDs. CTDB will
report this as "Event handling took xx seconds" warning. This is misleading
since CTDB is very busy processing packets, but never gets to the point of
polling FDs.
The improvement in socket handling made it worse when handling traverse
control. There were lots of packets filled in the socket buffer quickly and
CTDB stayed busy processing those packets and not polling other FDs and timer
events. This can lead to controls timing out and, in the worst case, other
nodes marking the busy node as disconnected.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
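The idea of capping how much data is buffered per read, so that control returns to the event loop regularly, can be sketched as below. read_budget() and its parameters are hypothetical, and any limit passed in is an arbitrary illustrative value rather than the one the daemon uses.

```c
#include <stddef.h>

/* Given the bytes available on the socket, the bytes already buffered
 * and a buffer cap, return how many bytes to read in this callback.
 * Anything beyond the cap stays in the socket until the event loop
 * has had a chance to poll other FDs and timers. */
size_t read_budget(size_t available, size_t buffered, size_t max_buffer)
{
    size_t room;

    if (buffered >= max_buffer) {
        return 0;
    }
    room = max_buffer - buffered;
    return available < room ? available : room;
}
```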
Amitay Isaacs [Tue, 20 Aug 2013 04:20:09 +0000 (14:20 +1000)]
Revert "common/io: Keep queue buffer size multiple of 4K"
This reverts commit
5e9b1a7e24d058ff88aaa0563db36a804e866fa9.
This is not the best approach. Allowing queue buffer size to grow
indefinitely causes large number of CTDB packets to be queued up very
quickly which when processed via immediate events will block CTDB from
processing events from other FDs. If there are immediate events queued
up, tevent will never process any of the FDs till all immediate events
are processed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 19 Aug 2013 05:04:46 +0000 (15:04 +1000)]
Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"
This reverts commit
035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.
This is a premature optimization. Record can bounce between nodes
very quickly if it is a contended record. There is no need to hold a
record on a node unnecessarily. In case record contention becomes bad,
enabling sticky records on a database is a better idea.
Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 15 Jul 2013 05:39:47 +0000 (15:39 +1000)]
ctdbd: Print a log message when a key becomes hot
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 9 Aug 2013 07:22:55 +0000 (17:22 +1000)]
ctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster
An empty record with rsn=0 should not be written on any node other than
dmaster. This is however not true for persistent databases. So currently
apply the check only for volatile databases.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 9 Aug 2013 07:00:10 +0000 (17:00 +1000)]
tools/ctdb: Fix message in showban when node is banned
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 9 Aug 2013 06:58:42 +0000 (16:58 +1000)]
tools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()
This has the side effect of making these commands more resilient to
control timeouts.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 9 Aug 2013 06:34:59 +0000 (16:34 +1000)]
tools/ctdb: Factor out common pattern used in disable/enable/stop/continue
Now we will only have one set of bugs. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 9 Aug 2013 05:41:37 +0000 (15:41 +1000)]
tools/ctdb: Factor, simplify and improve robustness of ipreallocate code
Having other functions call control_ipreallocate() suggests that it
might look at the argc/argv arguments that are passed. This is not
the case. Change the callers so they call the new ipreallocate()
function instead.
Broadcast CTDB_SRVID_TAKEOVER_RUN to all connected nodes. Inactive
nodes will ignore it. This is safe since we only want 1 reply. If we
didn't get a response, we don't actually care if there's no active
recovery master - just fire, wait, retry, ...
Ignore some failures on the basis that they might be transient, so it
is probably worth retrying.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 14 Aug 2013 18:38:02 +0000 (04:38 +1000)]
tools/ctdb: Use ctdb_get_pnn() to get PNN of the current node
This has already been stored at connect time and can't fail.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Michael Adam [Mon, 19 Aug 2013 14:54:06 +0000 (16:54 +0200)]
util: In passing the code, fix a space vs. tab in set_close_on_exec().
Signed-off-by: Michael Adam <obnox@samba.org>
Michael Adam [Mon, 19 Aug 2013 15:07:19 +0000 (17:07 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..
Signed-off-by: Michael Adam <obnox@samba.org>
Michael Adam [Tue, 13 Aug 2013 08:17:45 +0000 (10:17 +0200)]
server: fix wording and punctuation in comment block for ctdb_reply_dmaster().
Signed-off-by: Michael Adam <obnox@samba.org>
Amitay Isaacs [Wed, 14 Aug 2013 01:44:12 +0000 (11:44 +1000)]
recoverd: Improve log message when nodes disagree on recmaster
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 2 Aug 2013 01:05:08 +0000 (11:05 +1000)]
common: Null terminate process name string so valgrind doesn't complain
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 12 Aug 2013 05:50:30 +0000 (15:50 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)
This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster. This makes a request for
that record bounce between nodes endlessly.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 12 Aug 2013 05:51:00 +0000 (15:51 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)
This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster. This makes a request for
that record bounce between nodes endlessly.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 6 Aug 2013 04:37:13 +0000 (14:37 +1000)]
db_wrap: Make sure tdb messages are logged correctly
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 12 Aug 2013 01:36:25 +0000 (11:36 +1000)]
eventscripts: Become unhealthy faster on nfsd failure
Anecdotal evidence suggests that most nfsd RPC check failures are due
to cluster filesystem or storage problems. Apparently these are rarely
helped by attempting to restart the NFS service because the restart
tends to hang.
Fail after 2 nfsd RPC check failures, instead of waiting for 6
failures. Restart on every 10th failure to try to bring the node back
to good health.
Update unit tests to match.
Signed-off-by: Martin Schwenke <martin@meltin.net>
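The failure policy in this commit can be sketched as a small decision function. This is an assumption-laden illustration in C (the eventscript itself is a shell script): nfsd_failure_actions() and the flag names are hypothetical, and "after 2 failures" is read here as "from the second consecutive failure onwards".

```c
/* Actions to take for a given count of consecutive nfsd RPC check
 * failures. The flags may be combined. */
enum {
    NFSD_ACT_NONE      = 0,
    NFSD_ACT_UNHEALTHY = 1 << 0,   /* mark the node unhealthy */
    NFSD_ACT_RESTART   = 1 << 1    /* attempt a service restart */
};

unsigned nfsd_failure_actions(unsigned failures)
{
    unsigned actions = NFSD_ACT_NONE;

    if (failures >= 2) {
        actions |= NFSD_ACT_UNHEALTHY;
    }
    if (failures > 0 && failures % 10 == 0) {
        actions |= NFSD_ACT_RESTART;   /* every 10th failure */
    }
    return actions;
}
```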
Martin Schwenke [Fri, 9 Aug 2013 01:56:29 +0000 (11:56 +1000)]
tools/ctdb: Increase default control timeout to 10 seconds
The current 3 second timeout is arbitrary and users trip over it
sometimes.
Signed-off-by: Martin Schwenke <martin@meltin.net>