metze/ctdb/wip.git
14 years agoWhen testing make the time taken for some operations more obvious.
Martin Schwenke [Mon, 6 Jul 2009 06:39:08 +0000 (16:39 +1000)]
When testing make the time taken for some operations more obvious.

If wait_until() does not timeout, print the time taken for the command
to succeed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
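
The behaviour described in the commit above can be sketched roughly as follows. This is an illustrative re-implementation, not the code from the ctdb test suite: the argument order, the variable names and the exact message text are assumptions, and `date +%s` is assumed to be available.

```sh
# Hypothetical sketch of a wait_until() that reports the time taken.
# Usage: wait_until <timeout-seconds> <command> [args...]
wait_until ()
{
    wu_timeout="$1" ; shift
    wu_start=$(date +%s)

    while : ; do
        if "$@" ; then
            # Success: report how long the command took to succeed.
            echo "OK ($(( $(date +%s) - wu_start ))s)"
            return 0
        fi
        # Give up once the timeout has elapsed.
        [ $(( $(date +%s) - wu_start )) -ge "$wu_timeout" ] && break
        sleep 1
    done

    echo "*TIMEOUT*"
    return 1
}
```

A call such as `wait_until 30 some_check` would retry `some_check` once per second and print the elapsed time only when it succeeds before the timeout.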
14 years agoNew tests for different aspects of failover.
Martin Schwenke [Fri, 3 Jul 2009 10:55:02 +0000 (20:55 +1000)]
New tests for different aspects of failover.

3 separate tests:

* Check that gratuitous ARPs are received and take effect.

* Check that ping still works after failover.

* Check, via SSH, that the hostname changes after failover.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoUpdates to TCP tickle tests and supporting functions.
Martin Schwenke [Fri, 3 Jul 2009 10:44:55 +0000 (20:44 +1000)]
Updates to TCP tickle tests and supporting functions.

* Removed a race from tcpdump_start().  It seems impossible to tell
  when tcpdump is actually ready to capture packets.  So this function
  now generates some dummy ping packets and waits until it sees them
  in the output file.

* tcpdump_start() sets $tcpdump_filter.  This is the default filter
  for tcpdump_wait() and tcpdump_show(), but other filters may be
  passed to those functions.

* New functions tcptickle_sniff_start() and
  tcptickle_sniff_wait_show() handle capturing TCP tickle packets.
  These are used by complex/31_nfs_tickle.sh and
  complex/32_cifs_tickle.sh.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoAdd an extra ctdb recovery to test function restart_ctdb().
Martin Schwenke [Fri, 3 Jul 2009 08:01:29 +0000 (18:01 +1000)]
Add an extra ctdb recovery to test function restart_ctdb().

There are still very rare cases where IPs haven't been reallocated
before the beginning of the next test, so this adds a sleep and an
extra call to "ctdb recover" to restart_ctdb().

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoFix the run_tests script so that the number of columns is never 0.
Martin Schwenke [Fri, 3 Jul 2009 07:58:38 +0000 (17:58 +1000)]
Fix the run_tests script so that the number of columns is never 0.

Sometimes "stty size" reports 0, for example when running in a shell
under Emacs.  In this case, we just change it to 80.

Signed-off-by: Martin Schwenke <martin@meltin.net>
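
A minimal sketch of the fallback described in the commit above, assuming the script parses the "rows columns" output of "stty size". The function name and the handling of empty or non-numeric output are assumptions made for illustration, not the run_tests code.

```sh
# Hypothetical helper: take the output of "stty size" ("rows cols")
# and print a usable column count, falling back to 80.
columns_or_default ()
{
    set -- $1                # split "rows cols" into $1 and $2
    cols="${2:-0}"
    # Treat 0, empty, or non-numeric values as "unknown".
    [ "$cols" -gt 0 ] 2>/dev/null || cols=80
    echo "$cols"
}

# Example use; stty may fail or report 0 when there is no real terminal.
test_columns=$(columns_or_default "$(stty size 2>/dev/null)")
```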
14 years agoSeparate test cleanup code in output and clean up ctdb restart code.
Martin Schwenke [Fri, 3 Jul 2009 07:40:16 +0000 (17:40 +1000)]
Separate test cleanup code in output and clean up ctdb restart code.

* ctdb_restart_when_done() now schedules a restart by setting an
  explicit variable that is respected in ctdb_test_exit(), rather than
  adding a restart to $ctdb_test_exit_hook.  This means that restarts
  are all done in one place.

* ctdb_test_exit() turns off "set -e" to make sure that all cleanup
  happens.

* ctdb_test_exit() now prints a clear message indicating where the
  test ends and the cleanup begins.  This message also includes the
  return code of the test.

* Add debug in cluster_is_healthy to try to capture information about
  unexpected unhealthiness when a test starts.

* Simplify simple/07_ctdb_process_exists.sh so that the exit code is
  generated more obviously.

* Remove redundant calls to ctdb_test_exit at the end of tests, since
  they're done automatically via a trap.  Also remove any preceding
  warnings of restarts or final hints about test success/failure.

* Allow multi-digit debug levels in simple/12_ctdb_getdebug.sh and
  simple/13_ctdb_setdebug.sh.

Signed-off-by: Martin Schwenke <martin@meltin.net>
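
The first three points of the commit above can be sketched as follows. Only the names ctdb_restart_when_done, ctdb_test_exit and restart_ctdb come from the commit messages; the variable names, the message text, the stubbed restart_ctdb and the run_example_test wrapper are assumptions made for illustration.

```sh
# Stub standing in for the real cluster restart (assumption).
restart_ctdb () { echo "restarting ctdb (stub)" ; }

# Explicit flag, rather than appending a restart to an exit hook.
ctdb_test_restart_scheduled=false
ctdb_restart_when_done () { ctdb_test_restart_scheduled=true ; }

ctdb_test_exit ()
{
    saved_rc=$?              # exit status of the test body
    trap - 0                 # avoid re-entering this handler
    set +e                   # make sure all cleanup steps run

    echo "*** TEST COMPLETED (RC=${saved_rc}), CLEANING UP..."

    # All restarts happen in this one place.
    if $ctdb_test_restart_scheduled ; then
        restart_ctdb
    fi

    exit $saved_rc
}

# Example test body, run in a subshell so the trap stays contained.
run_example_test ()
(
    trap ctdb_test_exit 0    # cleanup runs automatically on exit
    ctdb_restart_when_done
    exit 3                   # simulated test failure
)
```

Because the handler is installed with a trap, a test written this way does not need to call ctdb_test_exit explicitly at the end.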
14 years agoInitscript cleanups.
Ronnie Sahlberg [Tue, 7 Jul 2009 03:45:19 +0000 (13:45 +1000)]
Initscript cleanups.

* Move building of CTDB_OPTIONS to new function build_ctdb_options()
  and have it use a helper function for readability.

* New functions check_persistent_databases() and set_ctdb_variables().

* Remove valgrind-specific stop code, since the general pkill should
  kill ctdbd when running under valgrind.

* Remove some bash-isms (e.g. >& /dev/null) since the script is /bin/sh.

* Make indentation consistent.

* Minor clean-ups.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:

config/ctdb.init
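
As a rough illustration of the first point in the commit above, a build_ctdb_options() with a small helper might look like the sketch below. The helper name add_option, the choice of options and the sysconfig variable names used here are assumptions for illustration and are not copied from config/ctdb.init.

```sh
# Hypothetical helper: append "<option>=<value>" to CTDB_OPTIONS,
# but only when a value has actually been configured.
add_option ()
{
    [ -n "$2" ] || return 0
    CTDB_OPTIONS="${CTDB_OPTIONS}${CTDB_OPTIONS:+ }$1=$2"
}

build_ctdb_options ()
{
    CTDB_OPTIONS=""
    add_option "--reclock"          "${CTDB_RECOVERY_LOCK:-}"
    add_option "--nlist"            "${CTDB_NODES:-}"
    add_option "--public-addresses" "${CTDB_PUBLIC_ADDRESSES:-}"
}
```

Keeping the "is it set?" test inside the helper means each option takes one line in build_ctdb_options(), which is the readability gain the commit message refers to.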

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Tue, 7 Jul 2009 01:19:44 +0000 (11:19 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agosend ARPs with an interval of 1.1 seconds during ip takeover.
Ronnie Sahlberg [Tue, 7 Jul 2009 01:40:01 +0000 (11:40 +1000)]
send ARPs with an interval of 1.1 seconds during ip takeover.

This is to better handle Linux clients, which often default to ignoring gratuitous ARPs that arrive within 1 second of each other.

14 years agoPerform an ipreallocate after each enable/disable.
Ronnie Sahlberg [Mon, 6 Jul 2009 01:49:55 +0000 (11:49 +1000)]
Perform an ipreallocate after each enable/disable.

The enable/disable command now waits until the IP reallocation that normally follows shortly afterwards has finished before it returns to the prompt.  This makes scripting of enable/disable more predictable.

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Mon, 6 Jul 2009 01:28:10 +0000 (11:28 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoadd a new command "ctdb ipreallocate", this command will force the recovery master...
Ronnie Sahlberg [Thu, 2 Jul 2009 03:00:26 +0000 (13:00 +1000)]
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process.
the ctdb command will block until the ip reallocation has completed

14 years agoWhen we dispatch a message to a handler, pass the data as a real talloc object so...
Ronnie Sahlberg [Thu, 2 Jul 2009 02:58:49 +0000 (12:58 +1000)]
When we dispatch a message to a handler, pass the data as a real talloc object so that the handler can talloc_steal() the message content.

14 years agodocument the ipreallocate command
Ronnie Sahlberg [Thu, 2 Jul 2009 02:45:14 +0000 (12:45 +1000)]
document the ipreallocate command

14 years agoupdate enable/disable
Ronnie Sahlberg [Tue, 30 Jun 2009 23:33:08 +0000 (09:33 +1000)]
update enable/disable

14 years agoupdate the sysconfig to show setting the debuglevel using a string literal instead...
Ronnie Sahlberg [Tue, 30 Jun 2009 23:23:52 +0000 (09:23 +1000)]
update the sysconfig to show setting the debuglevel using a string literal instead of a numeric value

14 years agoshow the valid debuglevels that can be used in the error text when an invalid level...
Ronnie Sahlberg [Tue, 30 Jun 2009 23:21:07 +0000 (09:21 +1000)]
show the valid debuglevels that can be used in the error text when an invalid level was specified to ctdb setdebug

14 years agoupdate the handling of debug levels so that we always can use a literal instead of...
Ronnie Sahlberg [Tue, 30 Jun 2009 23:17:13 +0000 (09:17 +1000)]
update the handling of debug levels so that we always can use a literal instead of a numeric value.

validate the input values used and refuse setting the debug level to an unknown value

14 years agowhen no debuglevel is specified, make 'ctdb setdebug' show the available options
Ronnie Sahlberg [Tue, 30 Jun 2009 22:26:00 +0000 (08:26 +1000)]
when no debuglevel is specified, make 'ctdb setdebug' show the available options

14 years agodont try sending a keepalive if the transport is down
Ronnie Sahlberg [Tue, 30 Jun 2009 02:17:05 +0000 (12:17 +1000)]
dont try sending a keepalive if the transport is down

14 years agoDont even try allocating and sending a CALL packet if the transport is down
Ronnie Sahlberg [Tue, 30 Jun 2009 02:16:13 +0000 (12:16 +1000)]
Dont even try allocating and sending a CALL packet if the transport is down

14 years agofailing a dmaster send due to the transport being down is fatal
Ronnie Sahlberg [Tue, 30 Jun 2009 02:14:58 +0000 (12:14 +1000)]
failing a dmaster send due to the transport being down is fatal

14 years agoif we fail a dmaster migration due to the transport being down, then that is a fatal...
Ronnie Sahlberg [Tue, 30 Jun 2009 02:13:15 +0000 (12:13 +1000)]
if we fail a dmaster migration due to the transport being down, then that is a fatal condition.

14 years agodont try to send error packets if the transport is down
Ronnie Sahlberg [Tue, 30 Jun 2009 02:10:27 +0000 (12:10 +1000)]
dont try to send error packets if the transport is down

14 years agodont even try to send a message from the main daemon if the transport is down
Ronnie Sahlberg [Tue, 30 Jun 2009 02:09:28 +0000 (12:09 +1000)]
dont even try to send a message from the main daemon if the transport is down

14 years agoDont try to allocate and send packets if the transport is down
Ronnie Sahlberg [Tue, 30 Jun 2009 02:03:12 +0000 (12:03 +1000)]
Dont try to allocate and send packets if the transport is down

14 years agodont even try to allocate a packet if the transport is down since it will fail
Ronnie Sahlberg [Tue, 30 Jun 2009 01:55:42 +0000 (11:55 +1000)]
dont even try to allocate a packet if the transport is down since it will fail

14 years agoNew version 1.0.86
Ronnie Sahlberg [Mon, 29 Jun 2009 23:09:06 +0000 (09:09 +1000)]
New version 1.0.86

14 years agoupdate the man pages with the "getreclock" and "setreclock" commands.
Ronnie Sahlberg [Thu, 25 Jun 2009 04:45:57 +0000 (14:45 +1000)]
update the man pages with the "getreclock" and "setreclock" commands.

14 years agoDo not allow the "VerifyRecoveryLock" tunable to be changed if there is no reclock...
Ronnie Sahlberg [Thu, 25 Jun 2009 04:45:17 +0000 (14:45 +1000)]
Do not allow the "VerifyRecoveryLock" tunable to be changed if there is no reclock file

14 years agodisable VerifyRecoveryLock when the user modifies the filename
Ronnie Sahlberg [Thu, 25 Jun 2009 04:34:21 +0000 (14:34 +1000)]
disable VerifyRecoveryLock when the user modifies the filename

14 years agoadd a control to set the reclock file
Ronnie Sahlberg [Thu, 25 Jun 2009 04:25:18 +0000 (14:25 +1000)]
add a control to set the reclock file

14 years agoupdate the recovery daemon to read the recovery lock file off the main daemon and...
Ronnie Sahlberg [Thu, 25 Jun 2009 02:55:43 +0000 (12:55 +1000)]
update the recovery daemon to read the recovery lock file off the main daemon and handle when the file is changed/enabled/disabled

14 years agoreturn NULL and not a "" when there is no reclock file returned from the server
Ronnie Sahlberg [Thu, 25 Jun 2009 02:26:14 +0000 (12:26 +1000)]
return NULL and not a "" when there is no reclock file returned from the server

14 years agoadd a control to read the current reclock file from a node
Ronnie Sahlberg [Thu, 25 Jun 2009 02:17:19 +0000 (12:17 +1000)]
add a control to read the current reclock file from a node

14 years agoDocument that you can run ctdb without a reclock file in the sysconfig file
Ronnie Sahlberg [Thu, 25 Jun 2009 01:59:21 +0000 (11:59 +1000)]
Document that you can run ctdb without a reclock file in the sysconfig file

14 years agoAllow setting the recovery lock file as "", which means that we do not use a file...
Ronnie Sahlberg [Thu, 25 Jun 2009 01:50:45 +0000 (11:50 +1000)]
Allow setting the recovery lock file as "", which means that we do not use a file and that we implicitly also disable the recovery lock checking.

Update the init script to allow starting without a reclock file.

14 years agoDont access the reclock file at all if VerifyRecoveryLock is zero and also
Ronnie Sahlberg [Thu, 25 Jun 2009 01:41:18 +0000 (11:41 +1000)]
Dont access the reclock file at all if VerifyRecoveryLock is zero and also
make sure the reclock file is closed if the variable is cleared at runtime

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Tue, 23 Jun 2009 01:21:37 +0000 (11:21 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agonew version 1.0.85
Ronnie Sahlberg [Tue, 23 Jun 2009 01:30:25 +0000 (11:30 +1000)]
new version 1.0.85

14 years agorename 99.routing to 11.routing so that it is executed before the service scripts
Ronnie Sahlberg [Tue, 23 Jun 2009 01:29:26 +0000 (11:29 +1000)]
rename 99.routing to 11.routing so that it is executed before the service scripts

14 years agonew version 1.0.85
Ronnie Sahlberg [Tue, 23 Jun 2009 01:23:54 +0000 (11:23 +1000)]
new version 1.0.85

14 years agorename 99.routing to 11.routing so the eventscript is processed before
Ronnie Sahlberg [Tue, 23 Jun 2009 01:01:04 +0000 (11:01 +1000)]
rename 99.routing to 11.routing so the eventscript is processed before
NFS and LVS

14 years agoFix minor problem in previous initscript commit.
Martin Schwenke [Tue, 2 Jun 2009 05:54:04 +0000 (15:54 +1000)]
Fix minor problem in previous initscript commit.

The valgrind start case should not use daemon, since this is specific
to Red Hat.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoInitscript fixes, mostly for "stop" action.
Martin Schwenke [Tue, 2 Jun 2009 00:01:50 +0000 (10:01 +1000)]
Initscript fixes, mostly for "stop" action.

Use a local variable $ctdbd so that we always run ctdbd from the
same place and so that we know what to kill.  This variable respects
the $CTDBD environment variable, which may be used to specify an
alternative location for the daemon.

In the important cases use "pkill -0 -f" to check if ctdbd is
running.  Also, remove the special case for killing ctdbd when running
under valgrind.  The regular case will handle this just fine.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoClean up the handling of CTDB restarts in testcases.
Martin Schwenke [Fri, 19 Jun 2009 01:40:09 +0000 (11:40 +1000)]
Clean up the handling of CTDB restarts in testcases.

Glitches during restarts of the CTDB cluster have been causing some
tests to fail.  This is because restarts are initiated in the body of
many tests.  This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoFix minor onnode bugs relating to local daemons.
Martin Schwenke [Fri, 19 Jun 2009 02:12:39 +0000 (12:12 +1000)]
Fix minor onnode bugs relating to local daemons.

Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression.  Due to the subtlety, this description is much longer than
the 1 line patch that fixes it!  The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:

1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).

In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).

The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr.  A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed.  This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell.  It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.

The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.
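
The blocking behaviour and the fix can be reproduced outside onnode with a sketch like the one below. It is not the onnode or fakessh code: the function names are made up, descriptor 3 is simply made a copy of the command-substitution pipe to mimic the leak, and `date +%s` is assumed to be available.

```sh
# Background a job the way a command passed to onnode might.
# With "fixed", descriptor 3 is redirected to /dev/null for the job.
run_bg ()
{
    if [ "$1" = "fixed" ] ; then
        sleep 3 >/dev/null 2>&1 3>/dev/null &
    else
        sleep 3 >/dev/null 2>&1 &    # inherits descriptor 3
    fi
    echo "started"
}

# Print how many seconds $(...) takes to return.  Descriptor 3 is a
# copy of the pipe that $(...) reads from, so a background job that
# keeps it open delays end-of-file on that pipe.
time_capture ()
{
    tc_start=$(date +%s)
    tc_out=$( exec 3>&1 ; run_bg "$1" )
    echo "$(( $(date +%s) - tc_start ))"
}
```

Under these assumptions `time_capture leaky` only returns once the background sleep has exited, while `time_capture fixed` returns almost immediately.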

Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination.  The code uses the node
name as a suffix for the output filename(s).  Usually this is an IP
address.  However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename.  3 possible
fixes were considered:

1. Replace all '/'s in the node name by '_'s.  Nice and simple.
2. Use the basename of the node name.  However, sockets may be in
   different directories but have the same basename.
3. Create all required directories before redirecting.  This is a
   little more complex and probably doesn't meet the user's
   expectations.

Option (1) is implemented here.

Signed-off-by: Martin Schwenke <martin@meltin.net>
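
Option (1) from the commit above amounts to a one-line substitution. In this sketch the function name and the "." separator between prefix and suffix are assumptions; only the replacement of '/' by '_' is taken from the commit message.

```sh
# Hypothetical helper: build an output filename from a prefix and a
# node name, replacing every '/' in the node name with '_'.
output_file_for_node ()
{
    echo "$1.$(echo "$2" | tr '/' '_')"
}
```

For example, a prefix of `out` and a socket path of `/tmp/ctdb/sock.0` give `out._tmp_ctdb_sock.0`, which can be created by a plain redirection without any extra directories.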
14 years agodont log an error if waitpid returns -1 and errno is ECHILD
Ronnie Sahlberg [Fri, 19 Jun 2009 05:55:13 +0000 (15:55 +1000)]
dont log an error if waitpid returns -1 and errno is ECHILD

14 years agodont leak file descriptors when set recmode times out
Ronnie Sahlberg [Fri, 19 Jun 2009 04:58:06 +0000 (14:58 +1000)]
dont leak file descriptors when set recmode times out

14 years agodont leak file descriptors
Ronnie Sahlberg [Fri, 19 Jun 2009 04:54:22 +0000 (14:54 +1000)]
dont leak file descriptors

14 years agoin the recovery daemon, check that the recovery master can access the recovery lock...
Ronnie Sahlberg [Fri, 19 Jun 2009 04:44:26 +0000 (14:44 +1000)]
in the recovery daemon, check that the recovery master can access the recovery lock file and verify it is not stale from a child process.
This allows us to timeout the operation if the underlying filesystem has become temporarily unresponsive without causing a new recovery.

14 years agoreduce the timeout we wait for the reclock child process to finish to 5 seconds
Ronnie Sahlberg [Fri, 19 Jun 2009 03:09:11 +0000 (13:09 +1000)]
reduce the timeout we wait for the reclock child process to finish to 5 seconds
before we log an error and abort

14 years agoincrease the timeout before we shutdown when the recovery daemon is hung
Ronnie Sahlberg [Wed, 17 Jun 2009 23:20:18 +0000 (09:20 +1000)]
increase the timeout before we shutdown when the recovery daemon is hung

14 years agorename 99.routing to 11.routing
Ronnie Sahlberg [Wed, 17 Jun 2009 23:11:46 +0000 (09:11 +1000)]
rename 99.routing to 11.routing
so it is executed before any of the service scripts

14 years agoNew tests for NFS and CIFS tickles.
Martin Schwenke [Tue, 16 Jun 2009 02:47:59 +0000 (12:47 +1000)]
New tests for NFS and CIFS tickles.

New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.

Changes to ctdb_test_functions.bash to support these tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoIncrease threshold in 51_ctdb_bench from 2% to 5%.
Martin Schwenke [Tue, 16 Jun 2009 02:42:29 +0000 (12:42 +1000)]
Increase threshold in 51_ctdb_bench from 2% to 5%.

The threshold for the difference in the number of messages sent in either
direction around the ring of nodes was set to 2%.  Something
environmental is causing this difference to sometimes be as high as 3%.
We're confident it isn't a CTDB issue so we're increasing the
threshold to 5%.

Signed-off-by: Martin Schwenke <martin@meltin.net>
15 years agoWhen we ban a node, only drop the IPs on the node being banned, not on every node
Ronnie Sahlberg [Wed, 10 Jun 2009 00:28:47 +0000 (10:28 +1000)]
When we ban a node, only drop the IPs on the node being banned, not on every node

15 years agoremove unused variable
Ronnie Sahlberg [Tue, 9 Jun 2009 00:58:46 +0000 (10:58 +1000)]
remove unused variable

15 years agodont require particular values for NoIPFailback and DeterministicIPs when
Ronnie Sahlberg [Tue, 9 Jun 2009 00:57:46 +0000 (10:57 +1000)]
dont require particular values for NoIPFailback and DeterministicIPs when
using ctdb moveip

15 years agoimprove ctdb moveip so that it does not always trigger a recovery.
Ronnie Sahlberg [Tue, 9 Jun 2009 00:56:50 +0000 (10:56 +1000)]
improve ctdb moveip so that it does not always trigger a recovery.

15 years agotry to avoid causing a recovery when deleting a public ip from a node
Ronnie Sahlberg [Fri, 5 Jun 2009 07:57:14 +0000 (17:57 +1000)]
try to avoid causing a recovery when deleting a public ip from a node

15 years agowhen adding an ip, try manually adding and taking over the ip instead of triggering...
Ronnie Sahlberg [Fri, 5 Jun 2009 07:00:47 +0000 (17:00 +1000)]
when adding an ip, try manually adding and taking over the ip instead of triggering a full recovery to do the same thing

15 years agodont list DELETED nodes in the ctdb listnodes output
Ronnie Sahlberg [Thu, 4 Jun 2009 03:25:58 +0000 (13:25 +1000)]
dont list DELETED nodes in the ctdb listnodes output

15 years agomake it possible to run 'ctdb listnodes' also if the daemon is not running.
Ronnie Sahlberg [Thu, 4 Jun 2009 03:21:25 +0000 (13:21 +1000)]
make it possible to run 'ctdb listnodes' also if the daemon is not running.
in this case, read the nodes file directly instead of asking the local daemon for the list.

add an option -Y to provide machine-readable output to listnodes

15 years agoFrom William Jojo <w.jojo[AT]hvcc.edu>
Ronnie Sahlberg [Wed, 3 Jun 2009 23:41:05 +0000 (09:41 +1000)]
From William Jojo <w.jojo[AT]hvcc.edu>

AIX doesn't have getopt.h by default.
Don't try including this file when building on AIX.

15 years agonew version 1.0.84
Ronnie Sahlberg [Tue, 2 Jun 2009 05:05:41 +0000 (15:05 +1000)]
new version 1.0.84

15 years agoteach ONNODE about deleted nodes
Ronnie Sahlberg [Tue, 2 Jun 2009 05:03:44 +0000 (15:03 +1000)]
teach ONNODE about deleted nodes

15 years agonew version 1.0.83
Ronnie Sahlberg [Tue, 2 Jun 2009 03:13:03 +0000 (13:13 +1000)]
new version 1.0.83

15 years agodocument how to remove a node from an existing cluster using 'ctdb
Ronnie Sahlberg [Tue, 2 Jun 2009 02:43:11 +0000 (12:43 +1000)]
document how to remove a node from an existing cluster using 'ctdb
reloadnodes'

15 years agohide all DELETED nodes from the ctdb command output
Ronnie Sahlberg [Mon, 1 Jun 2009 05:43:30 +0000 (15:43 +1000)]
hide all DELETED nodes from the ctdb command output

15 years agolower the loglevel when we log that we skip an eventscript because it is not executable
Ronnie Sahlberg [Mon, 1 Jun 2009 05:29:36 +0000 (15:29 +1000)]
lower the loglevel when we log that we skip an eventscript because it is not executable

15 years agodont try to queue packets for sending to (recently) deleted nodes since these nodes...
Ronnie Sahlberg [Mon, 1 Jun 2009 04:56:19 +0000 (14:56 +1000)]
dont try to queue packets for sending to (recently) deleted nodes since these nodes do not have a queue.

15 years agowhen building the initial vnnmap, make sure to skip any deleted nodes
Ronnie Sahlberg [Mon, 1 Jun 2009 04:44:15 +0000 (14:44 +1000)]
when building the initial vnnmap, make sure to skip any deleted nodes

15 years agouse num_nodes and the nodes array instead of walking the vnnmap
Ronnie Sahlberg [Mon, 1 Jun 2009 04:39:34 +0000 (14:39 +1000)]
use num_nodes and the nodes array instead of walking the vnnmap
when counting the number of active nodes

15 years agoadd a new node state : DELETED.
Ronnie Sahlberg [Mon, 1 Jun 2009 04:18:34 +0000 (14:18 +1000)]
add a new node state : DELETED.

This is used to mark nodes as being DELETED internally in ctdb
so that nodes are not renumbered if / when they are removed from the nodes file.

This is used to be able to do "ctdb reloadnodes" at runtime without
causing nodes to be renumbered.
To do this, instead of deleting a node from the nodes file, just comment it out like

   1.0.0.1
   #1.0.0.2
   1.0.0.3

After removing 1.0.0.2 from the cluster,  the remaining nodes retain their
pnn's from prior to the deletion, namely 0 and 2

Any line in the nodes file that is commented out represents a DELETED pnn
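
The numbering rule described above can be sketched as a small filter that reads a nodes file on standard input. This is an illustration only, not ctdb code: the function name and output format are made up, and blank lines are not handled.

```sh
# Hypothetical sketch: assign PNNs by line position, so that a
# commented-out line still consumes a PNN and is reported as DELETED.
list_pnns ()
{
    pnn=0
    while IFS= read -r line ; do
        case "$line" in
            "#"*) echo "$pnn DELETED" ;;
            *)    echo "$pnn $line" ;;
        esac
        pnn=$((pnn + 1))
    done
}
```

Fed the three-line example above, this prints PNN 0 for 1.0.0.1, marks PNN 1 as DELETED and keeps PNN 2 for 1.0.0.3, so no node is renumbered.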

15 years agodont remove the socket when the daemon stops. This can race if the
Ronnie Sahlberg [Fri, 29 May 2009 08:16:13 +0000 (18:16 +1000)]
dont remove the socket when the daemon stops. This can race if the
service is immediately restarted

15 years agoNew attempt at TDB transaction nesting allow/disallow.
Ronnie Sahlberg [Mon, 25 May 2009 07:04:42 +0000 (17:04 +1000)]
New attempt at TDB transaction nesting allow/disallow.

Make the default be that transaction nesting is not allowed, so that any attempt to create a nested transaction will fail with TDB_ERR_NESTING.

If an application can cope with transaction nesting and the implicit
semantics of tdb_transaction_commit(), it can enable transaction nesting
by using the TDB_ALLOW_NESTING flag.

15 years agoRevert "we only need to have transaction nesting disabled when we start the new trans...
Ronnie Sahlberg [Mon, 25 May 2009 06:55:27 +0000 (16:55 +1000)]
Revert "we only need to have transaction nesting disabled when we start the new transaction for the recovery"

This reverts commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be.

15 years agoRevert "set the TDB_NO_NESTING flag for the tdb before we start a transaction from...
Ronnie Sahlberg [Mon, 25 May 2009 06:55:02 +0000 (16:55 +1000)]
Revert "set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery"

This reverts commit 1b2029dbb055ff07367ebc1f307f5241320227b2.

15 years agoRevert "add TDB_NO_NESTING. When this flag is set tdb will not allow any nested trans...
Ronnie Sahlberg [Mon, 25 May 2009 06:54:25 +0000 (16:54 +1000)]
Revert "add TDB_NO_NESTING. When this flag is set tdb will not allow any nested transactions and tdb_transaction_start() will implicitely _cancel() any pending transactions before starting any new ones."

This reverts commit 459e4ee135bd1cd24c15e5325906eb4ecfd550ec.

15 years agoremove the obsolete ipmux component.
Ronnie Sahlberg [Mon, 25 May 2009 02:33:52 +0000 (12:33 +1000)]
remove the obsolete ipmux component.
this was replaced by LVS a long time ago

15 years agofix the git path to the repository
Ronnie Sahlberg [Mon, 25 May 2009 02:15:13 +0000 (12:15 +1000)]
fix the git path to the repository

15 years agoinstall the 31.clamd script as 644 by default
Ronnie Sahlberg [Mon, 25 May 2009 02:02:36 +0000 (12:02 +1000)]
install the 31.clamd script as 644 by default

15 years agoadd 31.clamd to the install and the rpm
Ronnie Sahlberg [Mon, 25 May 2009 01:46:47 +0000 (11:46 +1000)]
add 31.clamd to the install and the rpm

15 years agoFrom Flavio Carmo Junior <carmo.flavio@gmail.com>
Ronnie Sahlberg [Mon, 25 May 2009 02:10:29 +0000 (12:10 +1000)]
From Flavio Carmo Junior <carmo.flavio@gmail.com>

Add an eventscript to manage ClamAV

15 years agoFrom Flavio Carmo Junior <carmo.flavio@gmail.com>
Ronnie Sahlberg [Mon, 25 May 2009 02:08:50 +0000 (12:08 +1000)]
From Flavio Carmo Junior <carmo.flavio@gmail.com>
(with modifications)

Add a webpage about CLAMAV support in CTDB

15 years agodocument the new support for ClamAV
Ronnie Sahlberg [Mon, 25 May 2009 01:44:27 +0000 (11:44 +1000)]
document the new support for ClamAV

15 years agofix re pattern to accept the new recovery lock times in the statistics output
Sumit Bose [Thu, 21 May 2009 11:43:41 +0000 (13:43 +0200)]
fix re pattern to accept the new recovery lock times in the statistics output

15 years agochange the socket we use for sending gratuitous ARPs from AF_INET/SOCK_PACKET to AF_PA...
Ronnie Sahlberg [Thu, 21 May 2009 04:10:45 +0000 (14:10 +1000)]
change the socket we use for sending gratuitous ARPs from AF_INET/SOCK_PACKET to AF_PACKET/SOCK_RAW

15 years agoWhitespace changes and using the CTDB_NO_MEMORY() macro changes to
Ronnie Sahlberg [Thu, 21 May 2009 01:49:16 +0000 (11:49 +1000)]
Whitespace changes and using the CTDB_NO_MEMORY() macro changes to
the previous patch.

15 years agoadd missing checks on so far ignored return values
Sumit Bose [Wed, 20 May 2009 10:08:13 +0000 (12:08 +0200)]
add missing checks on so far ignored return values

Most of these were found during a review by Jim Meyering <meyering@redhat.com>

15 years agostructure member node_list_file is not used anywhere
Sumit Bose [Wed, 20 May 2009 10:02:27 +0000 (12:02 +0200)]
structure member node_list_file is not used anywhere

15 years agostructure member logfile is not used anywhere
Sumit Bose [Wed, 20 May 2009 09:47:34 +0000 (11:47 +0200)]
structure member logfile is not used anywhere

15 years agofix a configure warning while checking for netfilter.h
Sumit Bose [Wed, 20 May 2009 07:17:01 +0000 (09:17 +0200)]
fix a configure warning while checking for netfilter.h

15 years agoadded a missing dependency
Sumit Bose [Wed, 20 May 2009 06:59:00 +0000 (08:59 +0200)]
added a missing dependency

15 years agoChange the loglevel of "registered tcp client for ..." to INFO
Ronnie Sahlberg [Mon, 18 May 2009 22:55:42 +0000 (08:55 +1000)]
Change the loglevel of "registered tcp client for ..." to INFO
instead of ERR

15 years agoFrom : Flavio Carmo Junior <carmo.flavio@gmail.com>
Ronnie Sahlberg [Mon, 18 May 2009 22:47:19 +0000 (08:47 +1000)]
From : Flavio Carmo Junior <carmo.flavio@gmail.com>

Add a helper function that checks whether a unix domain socket exists
and there is a daemon LISTENING to it  similar to the existing function
to check for a daemon LISTENING to a tcp/ip socket.

15 years agoFix http://ctdb.samba.org/download.html
Volker Lendecke [Fri, 15 May 2009 20:08:21 +0000 (22:08 +0200)]
Fix http://ctdb.samba.org/download.html

15 years agoRemove error messages about a non-existing /var/log/log.ctdb when running ctdb with...
Christian Ambach [Wed, 6 May 2009 17:01:58 +0000 (19:01 +0200)]
Remove error messages about a non-existing /var/log/log.ctdb when running ctdb with logging to syslog

15 years agoadd additional log info to track if/why we cant switch to client mode.
Ronnie Sahlberg [Thu, 14 May 2009 08:25:00 +0000 (18:25 +1000)]
add additional log info to track if/why we cant switch to client mode.