Martin Schwenke [Wed, 31 Aug 2011 07:27:05 +0000 (17:27 +1000)]
Tests - eventscripts - allow "ctdb scriptstatus" output to be primed
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 31 Aug 2011 05:38:55 +0000 (15:38 +1000)]
Merge branch 'eventscripts' into tests
Martin Schwenke [Wed, 31 Aug 2011 05:34:43 +0000 (15:34 +1000)]
Eventscripts - enhance ctdb_replay_monitor_status()
Print useful output and return a suitable exit code.
The DISABLED and TIMEDOUT statuses use fake negative return codes, and
these can't be faked from the shell. So we map DISABLED to OK and
TIMEDOUT to ERROR - this should avoid nearly all surprises. When we
do this we add a note to the beginning of the output. The alternative
is to "fix" ctdbd to use only codes that can actually be returned by
shell scripts. However, the reason for using negative codes is
probably to distinguish them from real ones...
Signed-off-by: Martin Schwenke <martin@meltin.net>
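The status mapping described in the commit above can be sketched roughly as follows. This is an illustration only: the function name map_status_to_exit_code is hypothetical, the wording of the notes is made up, and treating ERROR as exit code 1 is an assumption (a real script may have exited with any non-zero code).

```shell
#!/bin/sh
# Hypothetical helper: turn a recorded script status into something a
# shell function can return. DISABLED and TIMEDOUT have no code that a
# shell script can return, so they are mapped to OK (0) and ERROR (1),
# and a note is printed first so the substitution is visible.
map_status_to_exit_code ()
{
    case "$1" in
        OK)
            return 0 ;;
        ERROR)
            return 1 ;;   # assumption: any real failure collapses to 1
        DISABLED)
            echo "[Replay of DISABLED status as OK]"
            return 0 ;;
        TIMEDOUT)
            echo "[Replay of TIMEDOUT status as ERROR]"
            return 1 ;;
        *)
            echo "unknown status: $1" >&2
            return 1 ;;
    esac
}
```

Printing the note before returning keeps it at the beginning of the replayed output, as the commit message describes.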
Martin Schwenke [Tue, 30 Aug 2011 06:27:04 +0000 (16:27 +1000)]
Tests - eventscripts - ctdb stub - implement scriptstatus, tweaks
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 06:25:13 +0000 (16:25 +1000)]
Tests - eventscripts - formatting tweak in simple_test()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 04:21:02 +0000 (14:21 +1000)]
Tests - eventscripts - new function simple_test_event()
Just like simple_test() but 1st arg is the event name.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 04:20:38 +0000 (14:20 +1000)]
Tests - eventscripts - output format tweaks
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 04:19:09 +0000 (14:19 +1000)]
Tests - eventscripts - add extra filename format for multi-event tests
$event may not be set so we need to test for it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 04:16:28 +0000 (14:16 +1000)]
Tests - eventscripts - add die() function and use it
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 03:35:43 +0000 (13:35 +1000)]
Tests - eventscripts - remove undefined argument in some simple_test calls
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 23 Aug 2011 03:53:39 +0000 (13:53 +1000)]
Tests - eventscripts - add symlink to ctdb.sysconfig
Some of the tests expect the default to be
CTDB_SERVICE_AUTOSTARTSTOP=yes
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 23 Aug 2011 03:52:42 +0000 (13:52 +1000)]
Tests - eventscripts - Samba TCP port checking fixes
Expect "ctdb checktcpport" to exit with 1 if not implemented.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:41:01 +0000 (16:41 +1000)]
Tests - eventscripts - TCP port checking, no working checkers
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:37:22 +0000 (16:37 +1000)]
Tests - eventscripts - new Samba TCP port checking test - no nmap
This one should fall back to netstat.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:14:55 +0000 (16:14 +1000)]
Tests - eventscripts - nmap and netstat stubs can pretend they weren't found
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:10:44 +0000 (16:10 +1000)]
Tests - eventscripts - new Samba tests to test TCP port checking
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:06:16 +0000 (16:06 +1000)]
Tests - eventscripts - add a new ctdb_not_implemented() function
This allows a single ctdb command to be defined as not-implemented
and provides the associated output from the ctdb stub in
$ctdb_not_implemented.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:00:52 +0000 (16:00 +1000)]
Tests - eventscripts - new function setup_nmap_output_filter()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 06:07:36 +0000 (16:07 +1000)]
Tests - eventscripts - add some output filtering
This allows $OUT_FILTER to be set to one or more sed commands to
filter eventscript output. This allows expected output to be
generalised.
Signed-off-by: Martin Schwenke <martin@meltin.net>
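A minimal sketch of how such a filter might be applied. The function name filter_output is hypothetical and the sample line is invented; only the $OUT_FILTER variable comes from the commit above.

```shell
#!/bin/sh
# Sketch: pass text through the sed command(s) in $OUT_FILTER if it is
# set, otherwise pass it through unchanged.
filter_output ()
{
    if [ -n "${OUT_FILTER:-}" ] ; then
        sed -e "$OUT_FILTER"
    else
        cat
    fi
}

# Example: replace a process ID with a fixed token so that the expected
# output of a test does not depend on the run.
OUT_FILTER='s/pid [0-9][0-9]*/pid PID/'
echo "service started with pid 12345" | filter_output
```

Several sed commands can be given in one value by separating them with semicolons or newlines.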
Martin Schwenke [Mon, 22 Aug 2011 05:59:20 +0000 (15:59 +1000)]
Tests - eventscripts - ctdb default debug level is 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 05:58:23 +0000 (15:58 +1000)]
Tests - eventscripts - add output for "not implemented" in ctdb stub
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 22 Aug 2011 05:56:57 +0000 (15:56 +1000)]
Tests - eventscripts - add an nmap stub
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 06:51:08 +0000 (16:51 +1000)]
Tests - eventscripts - stop timeouts waiting for backgrounded testparm
Not sleeping at all speeds up the tests. However, it can also cause
timeouts. Therefore, every time sleep is run we force the stub to do
a short 0.1s sleep instead of whatever is specified. This should be
enough to avoid races.
Signed-off-by: Martin Schwenke <martin@meltin.net>
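The idea can be sketched with a shell function standing in for the stub. The real stub is presumably a separate script found early in $PATH; this function-based variant is an assumption made to keep the example self-contained.

```shell
#!/bin/sh
# Sketch only: whatever duration is requested, sleep for 0.1s instead.
# "command" bypasses this function so that the real sleep is run.
# Fractional seconds are an extension (GNU sleep accepts them); POSIX
# only guarantees whole seconds.
sleep ()
{
    command sleep 0.1
}
```

With this in place a call such as "sleep 30" returns almost immediately, while still yielding long enough for a backgrounded process to make progress.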
Martin Schwenke [Fri, 19 Aug 2011 03:54:49 +0000 (13:54 +1000)]
Tests - add getdebug and checktcpport to ctdb eventscripts stub
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 03:54:20 +0000 (13:54 +1000)]
Tests - add hooks to simulate ctdb commands that aren't implemented
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 03:53:05 +0000 (13:53 +1000)]
Tests - add eventscripts testing stub for sleep command.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 01:35:38 +0000 (11:35 +1000)]
Tests - Change variable used to fake listening TCP ports.
Change from $FAKE_NETSTAT_TCP_LISTEN to $FAKE_TCP_LISTEN.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 01:24:56 +0000 (11:24 +1000)]
Tests - new NFS share checking tests
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 01:22:51 +0000 (11:22 +1000)]
Tests - eventscripts exportfs stub should split lines
The real exportfs splits lines longer than 15 characters. The stub
should do that too...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 01:21:33 +0000 (11:21 +1000)]
Tests - add -T (trace) option to eventscripts run_test.sh
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 30 Aug 2011 06:31:17 +0000 (16:31 +1000)]
Eventscripts - use ctdb scriptstatus -Y when replaying status
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 16 May 2011 04:23:28 +0000 (14:23 +1000)]
Eventscripts: add a synchronous synthetic reconfigure event.
In the current code services can only be reconfigured asynchronously.
This means that configuration file changes can be made, an asynchronous
reconfigure event can be triggered, and it always succeeds. Some time
later when a service is actually reconfigured then a failure may be
seen.
This adds a synthetic reconfigure event that reconfigures a service
synchronously so that any failure is reported on exit.
ctdb_service_check_reconfigure() is essentially reimplemented.
If a reconfigure event is in flight and an ipreallocated or monitor
event occurs then any scheduled asynchronous reconfigure is deferred
until the next monitor cycle. This is to avoid reconfigures trampling
on each other. In this case a monitor event will also replay the
previous status to try to avoid exposing any temporary instability.
If a reconfigure event collides with another reconfigure event it will
exit with status 2, indicating that the reconfigure should be retried.
The reconfigure event is implemented using a subprocess to control the
exit from the synthetic event.
As before, if a monitor event causes a scheduled synchronous
reconfigure to occur then it will replay the previous status for the
service, given that a reconfigure can cause temporary instability.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 23 Aug 2011 06:43:53 +0000 (16:43 +1000)]
Eventscripts - call ctdb_check_args() in 00.ctdb
This is the first eventscript. Sanity check as early as possible and
everyone benefits.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 23 Aug 2011 06:36:19 +0000 (16:36 +1000)]
Eventscripts - call ctdb_check_args() instead of doing hand checking
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 23 Aug 2011 06:32:34 +0000 (16:32 +1000)]
Eventscripts - new function ctdb_check_args()
Pass this "$@" to do common eventscript argument checking.
For regular use putting this in 00.ctdb would be enough. However, for
developer testing it can be useful to call this in other eventscripts.
For example, 10.interfaces and 13.per_ip_routing currently check these
by hand.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 04:20:58 +0000 (14:20 +1000)]
Eventscripts - ctdb_check_tcp_ports() bug fix.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 03:55:55 +0000 (13:55 +1000)]
Eventscripts - fix debugging buglet in ctdb_check_tcp_ports_ctdb()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 8 Aug 2011 03:13:59 +0000 (13:13 +1000)]
Eventscripts: New configuration variable CTDB_SERVICE_AUTOSTARTSTOP.
Some of the current auto-start/stop logic is broken, particularly for
Samba. Fixing it is non-trivial.
If $CTDB_SERVICE_AUTOSTARTSTOP is "yes" then auto-start/stop services
when told to newly manage or no longer manage them. This defaults to
"yes".
However, if using a canned configuration file that doesn't set
$CTDB_SERVICE_AUTOSTARTSTOP then this stops the auto-start-stop logic
from working. Therefore, this works around CQ S1026685 - on the
system in question another daemon controls service auto-start/stop and
CTDB just gets in the way.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 17 Aug 2011 07:42:07 +0000 (17:42 +1000)]
Eventscripts - in 60.nfs uniquify the share check directory list
There are sites that have multiple entries for the same export. This
optimises the share check in this case.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Thu, 25 Aug 2011 23:39:25 +0000 (09:39 +1000)]
Logging: when we log stdout/stderr messages from eventscripts to the system log, prefix every line of output with the name of the eventscript.
CQ S1028412
Ronnie Sahlberg [Tue, 23 Aug 2011 06:35:08 +0000 (16:35 +1000)]
LibCTDB : update the ctdb tool to use libctdb to read the recovery mode
Ronnie Sahlberg [Tue, 23 Aug 2011 06:32:38 +0000 (16:32 +1000)]
LibCTDB : update the ctdb tool to use libctdb to query for the recmaster
Ronnie Sahlberg [Tue, 23 Aug 2011 06:15:34 +0000 (16:15 +1000)]
LibCTDB : initialize ctdb->pnn to -1 when we create a new context
but before we learn the pnn of the local node
Ronnie Sahlberg [Tue, 23 Aug 2011 05:13:40 +0000 (15:13 +1000)]
LibCTDB : change the ctdb_fetch_lock_once test tool to use libctdb instead of the old client
Ronnie Sahlberg [Tue, 23 Aug 2011 05:00:27 +0000 (15:00 +1000)]
LibCTDB : add support for getrecmode
Ronnie Sahlberg [Tue, 23 Aug 2011 02:43:16 +0000 (12:43 +1000)]
LibCTDB: add commands that let an application query how many commands are active,
i.e. commands for which we have not yet received a reply.
Applications may use this to check whether it is "safe" to stop the event system and sleep
or whether they should first wait for all activity to ctdb daemons to cease.
Volker Lendecke [Mon, 22 Aug 2011 14:40:58 +0000 (16:40 +0200)]
Fix a const warning
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Mon, 22 Aug 2011 14:39:32 +0000 (16:39 +0200)]
Remove an unused variable
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "unpack_reply_control" does not need the ctdb_connection parameter
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "unpack_reply_call" does not need the ctdb_connection parameter
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "ctdb_request_free" does not need the ctdb_connection parameter
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Fri, 19 Aug 2011 14:36:20 +0000 (16:36 +0200)]
libctdb: Make sure ctdb_request->ctdb is filled correctly
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 12:47:09 +0000 (14:47 +0200)]
libctdb: Ensure 0-termination of sun_path
Rusty, please check!
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 11:59:48 +0000 (13:59 +0200)]
libctdb: Fix a few format warnings
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 11:57:58 +0000 (13:57 +0200)]
libctdb: Add license header to messages.c
Rusty, please check!
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 11:37:23 +0000 (13:37 +0200)]
libctdb: Reorder attachdb
No code change, this makes the sequence of what happens easier to read
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 11:55:24 +0000 (13:55 +0200)]
libctdb: Reorder set_message_handler
No code change, this is for better readability
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Thu, 18 Aug 2011 11:54:36 +0000 (13:54 +0200)]
libctdb: Correct 4bfdfda, stddef.h is needed by libctdb_private.h
Signed-off-by: Michael Adam <obnox@samba.org>
Volker Lendecke [Wed, 17 Aug 2011 12:46:43 +0000 (14:46 +0200)]
Add missing #include to libctdb/ctdb.c
We need that to have the "offsetof" macro, thus we don't need to redeclare it
in libctdb_private.h
Signed-off-by: Michael Adam <obnox@samba.org>
Ronnie Sahlberg [Wed, 17 Aug 2011 04:10:04 +0000 (14:10 +1000)]
Merge remote branch 'martins/eventscripts'
Martin Schwenke [Wed, 17 Aug 2011 04:02:45 +0000 (14:02 +1000)]
Eventscripts - new default TCP port checker using "ctdb checktcpport"
New function ctdb_check_tcp_ports_ctdb(). This should be fast... and
is now the default checker. If it fails in an unexpected way we fall
back to the nmap and netstat checkers.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 17 Aug 2011 02:12:20 +0000 (12:12 +1000)]
Eventscripts - generalise TCP port checking plus new nmap-based checker
Split the netstat-specific parts of ctdb_check_tcp_ports() into new
function ctdb_check_tcp_ports_netstat().
Implement new ctdb_check_tcp_ports_nmap() function that uses
"nmap -PS" to check if the desired ports are listening.
ctdb_check_tcp_ports() now uses new configuration variable
CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default
value is currently "nmap netstat". If nmap is not found then this
will fall back to netstat - if logging is at debug level this will
also fill the logs with messages saying the nmap checker failed. This
indicates that either nmap should be installed or the default value of
CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to
avoid trying to use nmap.
Signed-off-by: Martin Schwenke <martin@meltin.net>
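The fallback between checkers can be sketched as below. The check_with_* functions are stand-ins rather than the real checkers, check_tcp_port is a hypothetical name, and using status 127 to mean "this checker could not be run" is a convention chosen for this example.

```shell
#!/bin/sh
# Sketch: try each checker listed in $CTDB_TCP_PORT_CHECKERS in order
# and use the answer of the first one that is able to run.
CTDB_TCP_PORT_CHECKERS="${CTDB_TCP_PORT_CHECKERS:-nmap netstat}"

# Stand-ins: pretend nmap is not installed and only port 445 listens.
check_with_nmap ()    { return 127 ; }
check_with_netstat () { [ "$1" = "445" ] ; }

check_tcp_port ()
{
    _port="$1"
    for _checker in $CTDB_TCP_PORT_CHECKERS ; do
        _rc=0
        "check_with_${_checker}" "$_port" || _rc=$?
        if [ "$_rc" -eq 127 ] ; then
            continue    # checker unavailable, fall back to the next one
        fi
        return "$_rc"
    done
    return 127          # no checker could be run at all
}
```

Setting CTDB_TCP_PORT_CHECKERS="netstat" in this sketch skips the nmap attempt entirely, which corresponds to the configuration change suggested above.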
Martin Schwenke [Wed, 17 Aug 2011 00:27:01 +0000 (10:27 +1000)]
Eventscripts - ctdb_check_tcp_ports() only prints netstat output if debugging
Use the new debug function to conditionally print the netstat output.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 5 Aug 2011 06:39:57 +0000 (16:39 +1000)]
Eventscripts - weaken TCP port check message if CTDB has just been started.
Sometimes smbd and other services can take a while to start,
especially when there is a lot of activity after ctdbd has just
started. The TCP port check can then pollute the logs with lots of
"ERROR" messages and possibly extra debug.
This creates a flag file when a service is started (but not restarted)
and this flag is removed the first time that TCP port checks succeed
for that service. When a port check fails and the flag file still
exists, a less extreme "INFO" message is printed rather than the usual
"ERROR" message. This means that until the node actually becomes
healthy we see more friendly messages.
The subtext is that we're quite often hearing false positive reports
("recreates") of CQ S1024874 (samba stopped responding on port 445)
when ctdbd is started. This reduces the chances of people reporting
such false recreates...
Signed-off-by: Martin Schwenke <martin@meltin.net>
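A rough sketch of the flag-file mechanism. The directory, the file naming, the function names and the message texts are all hypothetical; only the INFO versus ERROR distinction is taken from the commit above.

```shell
#!/bin/sh
# Sketch of the "service has just been started" flag.
flag_dir="${TMPDIR:-/tmp}/ctdb-flag-sketch.$$"
mkdir -p "$flag_dir"

# Called when a service is started (but not when it is restarted).
service_mark_started () { touch "$flag_dir/$1.started" ; }

# Called the first time the TCP port checks succeed for the service.
service_ports_ok () { rm -f "$flag_dir/$1.started" ; }

# Called when a port check fails: pick the severity of the message.
service_port_failed ()
{
    if [ -f "$flag_dir/$1.started" ] ; then
        echo "INFO: $1 not yet listening on TCP port $2"
    else
        echo "ERROR: $1 not listening on TCP port $2"
    fi
}
```

Once service_ports_ok has run, any later failure is reported as an ERROR again, so a genuine outage is not hidden.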
Martin Schwenke [Tue, 5 Jul 2011 01:32:06 +0000 (11:32 +1000)]
Eventscript functions: optimise ctdb_check_tcp_ports() and add debug.
ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each
port. There are 2 problems with this:
* Netstat is run on each loop iteration when it need only be run once.
* The -a option is used to list all connections but the function only
cares about the listening ports. There may be many thousands of
non-listening ports to grep through.
This changes ctdb_check_tcp_ports() to run netstat with the -l option
instead of the -a option. It also only runs netstat once before the
main loop.
When a port is found to not be listening the output of the netstat
command is now dumped to help with debugging.
Signed-off-by: Martin Schwenke <martin@meltin.net>
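The shape of the optimisation, sketched with a stand-in for netstat so that it runs anywhere. list_listening and check_ports are hypothetical names and the sample lines are invented; the real function would capture the output of netstat with -l in place of -a.

```shell
#!/bin/sh
# Stand-in for the netstat call: prints listening TCP sockets only.
list_listening ()
{
    cat <<EOF
tcp        0      0 0.0.0.0:445       0.0.0.0:*      LISTEN
tcp        0      0 0.0.0.0:139       0.0.0.0:*      LISTEN
EOF
}

check_ports ()
{
    # Capture the list once, before the loop, instead of once per port.
    _listening=$(list_listening)
    for _port in "$@" ; do
        if ! echo "$_listening" | grep -q ":${_port}[[:space:]].*LISTEN" ; then
            echo "port $_port is not listening"
            # Dump the captured output to help with debugging.
            echo "$_listening"
            return 1
        fi
    done
    return 0
}
```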
Martin Schwenke [Tue, 16 Aug 2011 23:44:11 +0000 (09:44 +1000)]
Eventscripts: add a debug() function and call ctdb_set_current_debuglevel()
The debug function passes its arguments to echo if
$CTDB_CURRENT_DEBUGLEVEL is >= 4 (i.e. DEBUG). If no args are given
then use stdin - this allows the function to be used with here
documents.
To ensure $CTDB_CURRENT_DEBUGLEVEL is set,
ctdb_set_current_debuglevel() is called near the end of the functions
file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
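A minimal sketch of such a function, not necessarily the committed code. Draining stdin when not debugging is a choice made for this example, so that a process writing into a pipe is not left with a closed reader.

```shell
#!/bin/sh
# Sketch of debug(): produce output only when the level is >= 4 (DEBUG).
# With arguments, echo them. Without arguments, copy stdin, which lets
# the function be fed from a here document.
debug ()
{
    if [ "${CTDB_CURRENT_DEBUGLEVEL:-0}" -ge 4 ] ; then
        if [ $# -gt 0 ] ; then
            echo "$@"
        else
            cat
        fi
    elif [ $# -eq 0 ] ; then
        cat >/dev/null      # not debugging: discard stdin quietly
    fi
}
```

Usage with a here document looks like "debug <<EOF", followed by the text and a closing "EOF" line.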
Ronnie Sahlberg [Wed, 17 Aug 2011 00:16:35 +0000 (10:16 +1000)]
Add a new command 'ctdb checktcpport <port>'
that tries to bind to the specified port on INADDR_ANY.
This can be used for testing if a service is listening to that port or not.
Errors are printed to stdout. The returned status code is either 0, if we managed to bind to the port (in which case the service is NOT listening on that port), or the value of errno that stopped us from binding to the port.
errno for EADDRINUSE is 98 so a script using this command should check the status code against the value 98.
If this command returns 98 it means the service is listening to the specified port.
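A script might interpret the status code along these lines. The ctdb function below is a stand-in so that the sketch runs without CTDB installed, and port_is_listening is a hypothetical name.

```shell
#!/bin/sh
# Stand-in for the real tool: pretend a service listens on port 445 only.
ctdb ()
{
    if [ "$1" = "checktcpport" ] && [ "$2" = "445" ] ; then
        return 98       # EADDRINUSE: the bind failed, port is in use
    fi
    return 0            # the bind succeeded, so nothing is listening
}

# Succeeds only if "ctdb checktcpport" reports EADDRINUSE (98).
port_is_listening ()
{
    _rc=0
    ctdb checktcpport "$1" >/dev/null || _rc=$?
    [ "$_rc" -eq 98 ]
}
```

Any status other than 0 or 98 means the check itself could not give an answer, so a caller may want to treat that case separately.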
Ronnie Sahlberg [Tue, 16 Aug 2011 23:59:42 +0000 (09:59 +1000)]
Don't use too large a persistence timeout value
Martin Schwenke [Tue, 16 Aug 2011 23:14:23 +0000 (09:14 +1000)]
Eventscripts - conditionally inherit ctdbd debug level in each monitor event
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 16 Aug 2011 23:00:46 +0000 (09:00 +1000)]
Eventscripts - new function ctdb_set_current_debuglevel()
This function ensures that CTDB_CURRENT_DEBUGLEVEL is set. It works
like this:
1. If it is already set then do nothing, since it might have been set
some other way.
The recommended "other way" would be to add a file in rc.local.d/.
2. If it is not set then set it by sourcing
/var/ctdb/eventscript_debuglevel.
3. If this file does not exist then create it using output from "ctdb
getdebug".
If the optional 1st argument is set to "create" then don't source an
existing file but create a new one instead - this is useful for
creating the file just once in each event run in, say, 00.ctdb.
If there's a problem getting the debug level from ctdb then it is
silently set to 0 - no use spamming logs if our debug code is
broken...
Signed-off-by: Martin Schwenke <martin@meltin.net>
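The steps above might look roughly like this. It is a sketch under assumptions: $DEBUGLEVEL_FILE is a hypothetical variable introduced so the example can run unprivileged, get_debug_level is a stand-in for extracting the level from "ctdb getdebug" (whose output format is not shown here), and the file is assumed to hold a single variable assignment.

```shell
#!/bin/sh
DEBUGLEVEL_FILE="${DEBUGLEVEL_FILE:-/var/ctdb/eventscript_debuglevel}"

# Stand-in: the real function would ask ctdbd for its debug level.
get_debug_level () { echo 4 ; }

set_current_debuglevel ()
{
    # 1. Already set some other way: do nothing.
    [ -z "${CTDB_CURRENT_DEBUGLEVEL:-}" ] || return 0

    # 3. Create the file if asked to, or if it does not exist yet.
    #    Any problem getting the level silently results in 0.
    if [ "${1:-}" = "create" ] || [ ! -f "$DEBUGLEVEL_FILE" ] ; then
        _level=$(get_debug_level 2>/dev/null) || _level=0
        echo "CTDB_CURRENT_DEBUGLEVEL=\"${_level:-0}\"" >"$DEBUGLEVEL_FILE"
    fi

    # 2. Source the file to set the variable.
    . "$DEBUGLEVEL_FILE"
}
```

Caching the level in a file means that only one call per event run needs to talk to ctdbd.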
Martin Schwenke [Tue, 16 Aug 2011 03:28:40 +0000 (13:28 +1000)]
Eventscripts - ensure the statd update-trigger file always exists.
See the comment in the code for details.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 16 Aug 2011 03:18:40 +0000 (13:18 +1000)]
Eventscripts: remove "return 0" from 50.samba service_stop().
This potentially masks errors and was basically included by accident.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 15 Aug 2011 05:53:04 +0000 (15:53 +1000)]
Change the errors for 10.interface to clearly state ERROR: for error messages
Update the tests system to catch the new error strings generated by this change
Ronnie Sahlberg [Mon, 15 Aug 2011 05:43:15 +0000 (15:43 +1000)]
Merge remote branch 'martins/eventscript_tests'
Martin Schwenke [Mon, 15 Aug 2011 05:40:35 +0000 (15:40 +1000)]
Tests - exportfs stub needs to print out export options.
This is needed due to bd39b91ad12fd05271a7fced0e6f9d8c4eba92e6.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 15 Aug 2011 05:27:50 +0000 (15:27 +1000)]
Merge remote branch 'martins/eventscript.10.interface'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:22:20 +0000 (15:22 +1000)]
Merge remote branch 'martins/60_nfs_regression'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:20:18 +0000 (15:20 +1000)]
Merge remote branch 'martins/eventscript.60.nfs.rpc'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:16:06 +0000 (15:16 +1000)]
Merge remote branch 'martins/test_suite'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:15:12 +0000 (15:15 +1000)]
Merge remote branch 'martins/eventscript_tests'
Martin Schwenke [Mon, 15 Aug 2011 03:53:39 +0000 (13:53 +1000)]
Tests - ctdb listvars test should allow alphanumericals in tunable names.
This matches the new "LCP2PublicIPs" tunable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 15 Aug 2011 00:23:50 +0000 (10:23 +1000)]
Change the default for ip failover to be LCP2 and not DeterministicIPs
Martin Schwenke [Tue, 5 Jul 2011 07:21:57 +0000 (17:21 +1000)]
Eventscripts: 10.interfaces - make startup event actually mark interfaces up!
The startup event intends to mark interfaces up. However, it doesn't
actually do that because $INTERFACES is empty.
This uses the function get_all_interfaces() to list the
interfaces... and then mark them up.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 5 Jul 2011 07:20:09 +0000 (17:20 +1000)]
Eventscripts: 10.interfaces - startup comment says assume all interfaces good.
Interfaces are currently marked down. Mark them up instead, as per
the comment... and discussion with Ronnie.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 5 Jul 2011 07:18:30 +0000 (17:18 +1000)]
Eventscripts: 10.interfaces - new function get_all_interfaces().
Move existing interface listing code to new function in preparation
for using it in startup event.
While we're here change the "sort | uniq" into "sort -u" and save some
complexity.
Signed-off-by: Martin Schwenke <martin@meltin.net>
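For illustration, the two forms give the same result. The interface names are sample data and the function names are hypothetical.

```shell
#!/bin/sh
# Sample input with a duplicate entry.
sample_interfaces () { printf '%s\n' eth1 eth0 eth1 ; }

# Two processes: sort, then drop adjacent duplicates.
uniq_old () { sort | uniq ; }

# One process: sort and drop duplicates in a single step.
uniq_new () { sort -u ; }

sample_interfaces | uniq_new
```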
Martin Schwenke [Tue, 28 Jun 2011 07:07:39 +0000 (17:07 +1000)]
Eventscripts: 10.interface clean-ups - minor tweaks and new comments.
* sed can read files, it doesn't need a file piped to it
* use $() subshells instead of `` - they seem to quote better in dash
* tweak the uniquifying code so that it is easier to read
* add comments
* remove some extraneous semicolons at ends of lines
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 12 Aug 2011 06:30:54 +0000 (16:30 +1000)]
Tests: re-enable the NFS eventscript tests - they work again.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 12 Aug 2011 06:28:09 +0000 (16:28 +1000)]
Eventscripts: In 60.nfs don't restart NFS when restarting rpc.lockd.
This effectively reverts 953dbfbddad656a64e30a6aca115cb1479d11573 and
is a policy decision.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 28 Jun 2011 06:50:47 +0000 (16:50 +1000)]
Eventscripts: 10.interface clean-ups - variable name fix-ups.
Change most of the uppercase variable names to lowercase for
consistency with other variables, readability and so they can be
easily distinguished from environment/configuration variables. Change
the name of 2 of the variables to add some clarity. Changes are as
follows:
INTERFACES -> all_interfaces
IFACES -> ctdb_interfaces
IFACE -> iface
I -> i
REALIFACE -> realiface
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 28 Jun 2011 06:27:01 +0000 (16:27 +1000)]
Eventscripts: 10.interfaces clean-ups - push logic into monitor_interfaces().
The logic in the monitor event itself is very complex. Nearly all of
it can go away by adding a single check of
$CTDB_PARTIALLY_ONLINE_INTERFACES to the return logic of
monitor_interfaces() and reversing the sense of the corresponding
check.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 28 Jun 2011 06:10:23 +0000 (16:10 +1000)]
Eventscripts: 10.interfaces clean-up - use more descriptive variable names.
The name of variable $ok gives no clue to its meaning/use so this
changes that variable to be named $up_interfaces_found.
The return logic relating to $ok and $fail is difficult to read, so
these variables are given true/false values, allowing the return logic
to be simplified.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 28 Jun 2011 05:53:54 +0000 (15:53 +1000)]
Eventscripts: 10.interfaces cleanup - new functions mark_up(), mark_down().
The same few lines of logic are used every time an interface is marked
up or down. This encapsulates those few lines in 2 new functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 13 Jan 2011 22:40:11 +0000 (09:40 +1100)]
Eventscripts: change failure counts and behaviour for statd and nfsd.
We reduce the number of failures before attempting a restart.
However, after 6 failures we mark the cluster unhealthy and no longer
try to restart. If the previous 2 attempts didn't work then there
isn't any use in bogging the system down with an attempted restart on
every monitor event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 17 Dec 2010 05:25:04 +0000 (16:25 +1100)]
Eventscripts: clean up 60.nfs monitor event.
This adds a helper function called nfs_check_rpc_service() and uses it
to make the monitor event much more readable. An example of usage is
as follows:
    nfs_check_rpc_service "mountd" \
        -ge 10 "verbose restart:b unhealthy" \
        -eq 5 "restart:b"
The first argument to nfs_check_rpc_service() is the name of the RPC
service to be checked. The RPC service corresponding to this command
is checked for availability using the rpcinfo command. If the service
is available then the function succeeds and subsequent arguments are
ignored.
If the rpcinfo check fails then a failure counter for that particular
RPC service is incremented and subsequent arguments are processed in
groups of 3:
1. An integer comparison operator supported by test.
2. An integer failure limit.
3. An action string.
The value of the failure counter is checked using (1) and (2) above.
The first check that succeeds has its action string processed - note
that this explains the somewhat curious reverse ordering of checks.
In the example above:
* If the counter is >= 10 then a verbose message is printed
describing the failure, the service is restarted in the background
and the node is marked as unhealthy (via an "exit 1" from the
function).
* If the counter is == 5 then the service is restarted in the
background.
For more action options please see the code.
This also changes the ctdb_check_rpc() function so that it no longer
takes a program number to check. It now just takes a real RPC program
name that rpcinfo can resolve via /etc/rpc.
Signed-off-by: Martin Schwenke <martin@meltin.net>
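The way the argument triples are walked can be sketched as follows. pick_action is a hypothetical name and covers only the selection step: it takes the current failure count as an explicit first argument and prints the chosen action string rather than acting on it.

```shell
#!/bin/sh
# Sketch: given a failure count followed by (operator, limit, action)
# triples, print the action of the first triple whose comparison holds.
pick_action ()
{
    _count="$1"
    shift
    while [ $# -ge 3 ] ; do
        _op="$1" ; _limit="$2" ; _action="$3"
        shift 3
        # The operator is handed straight to test, e.g. [ 12 -ge 10 ].
        if [ "$_count" "$_op" "$_limit" ] ; then
            echo "$_action"
            return 0
        fi
    done
    return 1    # no check matched, so there is nothing to do yet
}

pick_action 12 -ge 10 "verbose restart:b unhealthy" -eq 5 "restart:b"
pick_action 5  -ge 10 "verbose restart:b unhealthy" -eq 5 "restart:b"
```

Because the first matching triple wins, putting the highest limit first makes sure the most severe action takes effect once its limit is reached.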
Martin Schwenke [Thu, 11 Aug 2011 05:33:46 +0000 (15:33 +1000)]
Tests: Re-enable the Samba eventscript tests.
They work again.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 11 Aug 2011 05:32:28 +0000 (15:32 +1000)]
Revert "Tests: tweak some samba tests to cope with debug from ctdb_check_tcp_ports()."
This reverts commit 557ac30e60516742da10b83bfbbbb41430c977a2.
Martin Schwenke [Wed, 13 Apr 2011 02:37:42 +0000 (12:37 +1000)]
Eventscripts: fix regression in 60.nfs export checking.
Commit 35a60a63a9b5c7d98dde514ae552239506b691c9 introduced a
regression, reported by "Jonathan Buzzard" <J.Buzzard@dundee.ac.uk>,
as follows:
Basically the use of sed in the following code snippet does not work
for long exports where exportfs wraps the host or network onto the
next line.
    exportfs | grep -v '^#' | grep '^/' |
        sed -e 's/[[:space:]]*[^[:space:]]*$//' |
        ctdb_check_directories
The result is that the you get lots of blank lines being sent to
ctdb_check_directories which causes the host to be marked as
unhealthy and then thrashing sets in of the managed IP's making the
whole cluster unusable.
This tightens up the sed expression so that it is less likely to
produce a spurious empty line. It also removes an unnecessary "grep -v".
Signed-off-by: Martin Schwenke <martin@meltin.net>
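The failure can be reproduced with simulated exportfs output. The sample exports are invented, and the tightened expression in new_filter is one possible fix, shown as an assumption rather than as the expression that was actually committed.

```shell
#!/bin/sh
# Simulated exportfs output: the second export is long, so the host is
# wrapped onto a continuation line.
sample_exports ()
{
    printf '/short\t\thost1(rw)\n'
    printf '/a/very/long/export/path\n\t\thost2(rw)\n'
}

# Expression from the report: the leading whitespace is optional, so a
# line holding only a path is consumed entirely and becomes empty.
old_filter () { grep '^/' | sed -e 's/[[:space:]]*[^[:space:]]*$//' ; }

# Possible tightening: require at least one whitespace character before
# the last field, so that a line holding only a path is left alone.
new_filter () { grep '^/' | sed -e 's/[[:space:]][[:space:]]*[^[:space:]]*$//' ; }

sample_exports | new_filter
```

With old_filter the long export turns into an empty line, which is what ends up being passed to ctdb_check_directories.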
Ronnie Sahlberg [Thu, 11 Aug 2011 04:15:22 +0000 (14:15 +1000)]
Merge remote branch 'martins/eventscript.10.interface'
Ronnie Sahlberg [Thu, 11 Aug 2011 04:01:02 +0000 (14:01 +1000)]
Merge remote branch 'martins/eventscript_infrastructure'
Martin Schwenke [Mon, 23 May 2011 06:00:05 +0000 (16:00 +1000)]
Eventscripts: in 60.nfs move statd-notify code to service_reconfigure().
This means that it now occurs on every reconfigure event. As a result
the ipreallocated event is removed.
Signed-off-by: Martin Schwenke <martin@meltin.net>