Ronnie Sahlberg [Wed, 17 Dec 2008 01:01:40 +0000 (12:01 +1100)]
dont call ctdb_fatal() just because we are asked to restart a connection
to a remote node and ctdb->methods is NULL.
This can happen when we are in the middle of a normal shutdown of the
daemon and we have already shut down the transport layer (thus setting
ctdb->methods == NULL in the transport layer destructor)
band there is some unprocessed data related to a remote node.
This prevents an ugly race condition where ctdb might sometimes (rare)
cause a core dump during "ctdb shutdown".
Michael Adam [Fri, 12 Dec 2008 15:00:07 +0000 (16:00 +0100)]
ctdb.init: behave correctly when calling "service ctdb stop" on stopped service
When "service ctdb stop" is called and the ctdbd is not running,
don't print the "Failed to connect to daemon" error messages.
But print a warning and exit with status success instead.
Michael Adam [Sat, 5 Jul 2008 12:42:46 +0000 (14:42 +0200)]
packaging: set docdir in calls to make (to get it right on e.g. SuSE systems).
Currently docdir = /usr/share/doc is hardcoded in the Makefile.in.
Some systems use a different doc dir (SuSE uses /usr/share/doc/packages).
And not all versions of autoconf provide the --docdir parameter
(2.61 does, while 2.59 does not). So we use the quick solution
to specify "docdir=%{_docdir}" in the make calls in the spec file.
Michael Adam [Wed, 10 Dec 2008 21:27:36 +0000 (22:27 +0100)]
Improve the monitor event test for ethernet interfaces (link detection).
On some systems, the ethtool link detection is not successful when a
cable is plugged but the interface has not been brought up previously.
This improves the test by bringing the interface up (without checking
for success here) and trying the ethtool test again afterwards.
root [Wed, 10 Dec 2008 01:06:51 +0000 (12:06 +1100)]
update the "ctdb recover" command.
block and wait until the clustered has completed the recovery before returning.
this makes it easier to script since it avoids the common need for
ctdb recover
... complex loop to wait for recovery to complete ...
script continues
root [Wed, 10 Dec 2008 01:01:19 +0000 (12:01 +1100)]
add a CTDB_TIMEOUT variable for the ctdb tool.
If set this specified the maximum runtime for the ctdb tool before it will terminate with status == 20
Just like the -T ... option would.
root [Tue, 9 Dec 2008 01:03:42 +0000 (12:03 +1100)]
add a helper that waits until the clueter is no longe rin recovery mode and return the generation number.
change the ban/unban logic to wait until we are not in recovery before it bans/unbans the node.
also wait until after the cluster has recovered from the ban/unban before returning so that the cluster is in recpovery mode == normal when the command returns. this makes it much easier to script things ...
root [Fri, 5 Dec 2008 05:32:30 +0000 (16:32 +1100)]
redo and update how we synchronize flags across the cluster.
this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing.
root [Thu, 4 Dec 2008 23:33:38 +0000 (10:33 +1100)]
some platforms are very picky about the third argument passed to bind().
and would complain if sa.family is AF_INET and the third argument is not exactly the size of a sockaddr_in.
We used to pass a union containing both a sockaddr_in and a sockaddr_in6 which would mean that on those platforms bind() would fail since the passed structure for AF_INET would be too big.
Thus we need to set and pass the appropriate size to bind. At the same time for thos eplatforms we can also set sin[6]_size to the expected size.
(bind() on those platforms were isurprisingly perfectly ok with sin_len was "too big")
Ronnie Sahlberg [Tue, 2 Dec 2008 02:26:30 +0000 (13:26 +1100)]
redesign how reloadnodes is implemented.
modify the transport methods to allow to restart individual connections
and set up destructors properly.
only tear down/set-up tcp connections to nodes removed from the cluster
or nodes added to the cluster.
Leave tcp connections to unchanged nodes connected.
make "ctdb reloadnodes" explicitely cause a recovery of the cluster once
the files have been realoaded
Ronnie Sahlberg [Thu, 27 Nov 2008 22:52:26 +0000 (09:52 +1100)]
make it possible to delete an ip from all nodes at once using
"ctdb delip x.x.x.x -n all"
This is not as straightforward as one might think since during the
delete process we don not want the ip to be bouncing from one node to
another as node by node deletes it.
Thus we first delete the ip from all connected nodes which are not
currently hosting it.
After this we delete the ip from the node which is hosting it.
Andrew Tridgell [Thu, 20 Nov 2008 21:05:59 +0000 (08:05 +1100)]
fixed problem with looping ctdb recoveries
After a node failure, GPFS can get into a state where non-blocking
fcntl() locks can take a long time. This means to the ctdb set_recmode
test timing out, which leads to a recovery failure, and a new
recovery. The recovery loop can last a long time.
The fix is to consider a fcntl timeout as a success of this test. The
test is to see that we can't lock the shared reclock file, so a
timeout is fine for a success.
Ronnie Sahlberg [Thu, 20 Nov 2008 02:35:08 +0000 (13:35 +1100)]
Keepalive packets were only sent every KeepaliveInterval if the socket
had been completely idle during that interval.
If we had been sending other packets such as Messages, Calls or Controls
there wouldnt be any need for an explicit keepalive and thus we didnt
send one.
This does make it somewhat awkward when analyzing traces since it is
non-intuitive when keepalives are sent and when they are not sent.
Change the keepalive logic to always send a keepalive regardless of
whether the link is idle or not.
Ronnie Sahlberg [Wed, 19 Nov 2008 03:43:46 +0000 (14:43 +1100)]
reqrite the handling of flag updates across the cluster to eliminate a
race between the ctdb tool and the recovery daemon both at once
trying to push flag changes across the cluster.
Ronnie Sahlberg [Tue, 14 Oct 2008 16:02:09 +0000 (03:02 +1100)]
lower the loglevel for the informational message that a TCP_ADD opeation
described an ip address not known to be a public address.
This could happen if someone for genuine reasons accesses a share
through a static ip address.
It can also happen if non homogenous public address configurations are
used and when a tcp description is pushed out to a different node that
does not server/know the specific ip address.
Ronnie Sahlberg [Tue, 14 Oct 2008 13:24:44 +0000 (00:24 +1100)]
update the client side of getnodemap and getpublicips controls to
fallback to the old-style ipv4-only controls if the new-style ipv4/ipv6
control fails.
this allows a 1.0.59+ (ipv4/ipv6) ctdb daemon being recmaster to be
compatible with
pre-1.0.59 versions of ctdb that are ipv4 only.
Ronnie Sahlberg [Mon, 13 Oct 2008 23:40:29 +0000 (10:40 +1100)]
update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an
older ipv4-only version of these controls.
We need this so that we are backwardcompatible with old versions of ctdb
and so that we can interoperate with a ipv4-only recmaster during a
rolling upgrade.
Ronnie Sahlberg [Sun, 12 Oct 2008 21:27:33 +0000 (08:27 +1100)]
from Mathieu Parent <math.parent@gmail.com>
Hi,
I have attached a patch necessary as debian log dir (/var/log) is not
a subdir of VARDIR (/var/lib on rpm systems, /var/lib/ctdb on debian).
As I don't know much about autotools and friends, this patch may be
hacky.
This is part of the process to minimize diff between distributions.
Ronnie Sahlberg [Wed, 17 Sep 2008 04:17:41 +0000 (14:17 +1000)]
The ctdb daemon keeps track of whether the recovery process is running
correctly by measuring how long it was since the last successful
communication with the recovery daemon was recorded.
After a certain timeout the ctdb daemon would deem the recovery daemon
as inoperable and shut down.
If the system clock is suddenly changed forward by many (60 or more)
seconds this could cause the timeout to trigger prematurely/immediately
where ctdb would incorrectly think that more than 60 seconds had passed
since last successful communications and thus abort.
Instead of cehcking for one timeout occuring, only deem the recovery
daemon to be "down" and trigger a shutdown if communications have
timedout for three intervals in a row.
Martin Schwenke [Fri, 12 Sep 2008 06:55:18 +0000 (16:55 +1000)]
onnode changes. "ok" is an alias for "healthy", "con" is an alias for
"connected". Allow "rm" or "recmaster" to be a nodespec for the
recovery master. Better error handling for interaction with ctdb
client.
Signed-off-by: Martin Schwenke <martin@meltin.net>