ctdb.git
12 years agoReadOnly: update the documentation about readonly locks master-readonly-records
Ronnie Sahlberg [Thu, 1 Sep 2011 01:40:51 +0000 (11:40 +1000)]
ReadOnly: update the documentation about readonly locks

12 years agoReadOnly: add a new control to activate readonly lock capability for a database.
Ronnie Sahlberg [Thu, 1 Sep 2011 01:08:18 +0000 (11:08 +1000)]
ReadOnly: add a new control to activate readonly lock capability for a database.
Let all databases default to not supporting this until it is enabled through this control.

12 years agoReadOnly: add a readonly flag to the getdbmap control and show the readonly setting...
Ronnie Sahlberg [Thu, 1 Sep 2011 00:28:15 +0000 (10:28 +1000)]
ReadOnly: add a readonly flag to the getdbmap control and show the readonly setting in ctdb getdbmap output

12 years agoReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boole...
Ronnie Sahlberg [Thu, 1 Sep 2011 00:21:55 +0000 (10:21 +1000)]
ReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boolean for
the persistent flag.
This is the same size as the original boolean but allows us to add additional flags for the database.

12 years agoLibCTDB : initialize ctdb->pnn to -1 when we create a new context
Ronnie Sahlberg [Tue, 23 Aug 2011 06:15:34 +0000 (16:15 +1000)]
LibCTDB : initialize ctdb->pnn to -1 when we create a new context
but before we learn the pnn of the local node

12 years agoLibCTDB : change the ctdb_fetch_lock_once test tool to use libctdb instead of the...
Ronnie Sahlberg [Tue, 23 Aug 2011 05:13:40 +0000 (15:13 +1000)]
LibCTDB : change the ctdb_fetch_lock_once test tool to use libctdb instead of the old client

12 years agoLibCTDB : add support for getrecmode
Ronnie Sahlberg [Tue, 23 Aug 2011 05:00:27 +0000 (15:00 +1000)]
LibCTDB : add support for getrecmode

12 years agoReadOnly: Check the readonly flag instead of whether the tdb pointer is NULL or not
Ronnie Sahlberg [Tue, 23 Aug 2011 00:41:52 +0000 (10:41 +1000)]
ReadOnly: Check the readonly flag instead of whether the tdb pointer is NULL or not

12 years agoReadOnly: add description of readonly records
Ronnie Sahlberg [Tue, 23 Aug 2011 00:37:20 +0000 (10:37 +1000)]
ReadOnly: add description of readonly records

12 years agoReadOnly: clear out the tracking record once a revoke is completed
Ronnie Sahlberg [Wed, 17 Aug 2011 06:14:57 +0000 (16:14 +1000)]
ReadOnly: clear out the tracking record once a revoke is completed

12 years agoReadOnly: When the client wants a readwrite lock but the local node is the dmaster...
Ronnie Sahlberg [Thu, 21 Jul 2011 05:59:37 +0000 (15:59 +1000)]
ReadOnly: When the client wants a readwrite lock but the local node is the dmaster and also has delegations active, we must send a CALL to the local daemon to trigger it to revoke the delegations

12 years agoReadOnly: Change the update_record test tool to use the new fetchlock routine that...
Ronnie Sahlberg [Thu, 21 Jul 2011 05:58:56 +0000 (15:58 +1000)]
ReadOnly: Change the update_record test tool to use the new fetchlock routine that can do either normal or readonly fetchlock

12 years agoReadOnly: Add a test tool that requests a readonly delegation in a loop
Ronnie Sahlberg [Wed, 20 Jul 2011 05:47:15 +0000 (15:47 +1000)]
ReadOnly: Add a test tool that requests a readonly delegation in a loop

12 years agoReadOnly: Add a test tool to fetch a record, requesting a readonly delegation and...
Ronnie Sahlberg [Wed, 20 Jul 2011 05:43:55 +0000 (15:43 +1000)]
ReadOnly: Add a test tool to fetch a record, requesting a readonly delegation and lock the record once

12 years agoReadOnly: Add clientside code to fetch readonly records
Ronnie Sahlberg [Wed, 20 Jul 2011 05:37:37 +0000 (15:37 +1000)]
ReadOnly: Add clientside code to fetch readonly records

12 years agoReadOnly: Add a ctdb_ltdb_fetch_readonly() helper function
Ronnie Sahlberg [Wed, 20 Jul 2011 05:31:44 +0000 (15:31 +1000)]
ReadOnly: Add a ctdb_ltdb_fetch_readonly() helper function

12 years agoReadOnly: Add handling of readonly requests, readwrite requests, delegations and revok...
Ronnie Sahlberg [Wed, 20 Jul 2011 05:17:29 +0000 (15:17 +1000)]
ReadOnly: Add handling of readonly requests, readwrite requests, delegations and revoking of delegations to the processing loop for CALL requests coming in from a local client via domain socket

12 years agoReadOnly: Add processing for ReadOnly delegation requests and revoke requests to...
Ronnie Sahlberg [Wed, 20 Jul 2011 05:13:47 +0000 (15:13 +1000)]
ReadOnly: Add processing for ReadOnly delegation requests and revoke requests to the processing loop for CALL packets we receive from different nodes.

This implements the ReadOnly and ReadWrite request processing, delegation and revoking of delegations for all requests coming in across the network from a remote node.

12 years agoReadOnly: Once recovery has finished, make sure to free all revoke child processes...
Ronnie Sahlberg [Wed, 20 Jul 2011 04:25:29 +0000 (14:25 +1000)]
ReadOnly: Once recovery has finished, make sure to free all revoke child processes and trigger the destructors for all deferred calls to re-queue the original packets to the input packet processing function

12 years agoReadOnly: When releasing all deferred calls that blocked during revoke of all previou...
Ronnie Sahlberg [Wed, 20 Jul 2011 04:23:05 +0000 (14:23 +1000)]
ReadOnly: When releasing all deferred calls that blocked during revoke of all previous delegations, add a 1 second grace/delay for any new readonly delegation requests so that the read-write fetch-lock process has a chance to make progress

12 years agoReadOnly: Add a new flag to call request packet to indicate that the client wants...
Ronnie Sahlberg [Wed, 20 Jul 2011 04:21:04 +0000 (14:21 +1000)]
ReadOnly: Add a new flag to call request packet to indicate that the client wants a readonly delegation

12 years agoReadOnly: Add a function to start a revoke of all delegations for a record.
Ronnie Sahlberg [Tue, 23 Aug 2011 00:27:31 +0000 (10:27 +1000)]
ReadOnly: Add a function to start a revoke of all delegations for a record.
This triggers a child process to be created to perform the actual potentially blocking calls that are required.

12 years agoReadOnly: Add functions to register CALLs to a context used to handle deferral of...
Ronnie Sahlberg [Wed, 20 Jul 2011 03:49:17 +0000 (13:49 +1000)]
ReadOnly: Add functions to register CALLs to a context used to handle deferral of processing of CALL commands.
Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again.
This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations.

The data is passed through several layers of callbacks, and finally a timed event callback, to ensure that the processing of the packet will be restarted at the topmost event loop, avoiding event loop nesting.

12 years agoReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write...
Ronnie Sahlberg [Wed, 20 Jul 2011 03:30:12 +0000 (13:30 +1000)]
ReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write the record and header back to the tdb (for example we do when performing dmaster migrations)

12 years agoReadOnly: After recovering all databases, make sure to clear out the tracking databas...
Ronnie Sahlberg [Wed, 20 Jul 2011 03:20:32 +0000 (13:20 +1000)]
ReadOnly: After recovering all databases, make sure to clear out the tracking database used to track delegations and revoke. This is because the recovery will implicitly result in a revoke of all delegations.

12 years agoReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database...
Ronnie Sahlberg [Wed, 20 Jul 2011 03:15:48 +0000 (13:15 +1000)]
ReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database supports readonly operations or not. Add a private lock-less tdb file to the ctdb_db_context to use for tracking delegations for records

Assume all databases will support readonly mode for now and set the flag for all databases. At a later stage we will add support to control on a per database level whether delegations will be supported or not.

12 years agoReadOnly: After performing a recovery, clear out all flags related to readonly delega...
Ronnie Sahlberg [Wed, 20 Jul 2011 03:08:21 +0000 (13:08 +1000)]
ReadOnly: After performing a recovery, clear out all flags related to readonly delegations and revoke

12 years agoAdd the missing "persistent" argument to db_exist()
Ronnie Sahlberg [Tue, 23 Aug 2011 00:23:18 +0000 (10:23 +1000)]
Add the missing "persistent" argument to db_exist()
The API for this function has changed since the 1.2 branch where readonly locks are being merged from

12 years agoReadOnly: Add a new command 'ctdb cattdb'. This function differs from 'ctdb catdb...
Ronnie Sahlberg [Wed, 20 Jul 2011 02:30:33 +0000 (12:30 +1000)]
ReadOnly: Add a new command 'ctdb cattdb'. This function differs from 'ctdb catdb' in that 'cattdb' will always traverse the local tdb file only, while 'catdb' does a cluster traverse.

Since some record flags may differ between nodes in the cluster when read only delegations are in use, cattdb is needed when you need to know the exact flag settings on the current node itself.

12 years agoReadOnly: Add printing of the record flags when we are traversing a database to print...
Ronnie Sahlberg [Wed, 20 Jul 2011 02:21:33 +0000 (12:21 +1000)]
ReadOnly: Add printing of the record flags when we are traversing a database to print its content.

12 years agoReadOnly: Add 4 new record flags to handle read only delegation and revoking of deleg...
Ronnie Sahlberg [Wed, 20 Jul 2011 02:17:27 +0000 (12:17 +1000)]
ReadOnly: Add 4 new record flags to handle read only delegation and revoking of delegations

12 years agoReadOnly: add a new test tool that does a fetchlock on a record, then bumps the RSN...
Ronnie Sahlberg [Wed, 20 Jul 2011 02:13:53 +0000 (12:13 +1000)]
ReadOnly: add a new test tool that does a fetchlock on a record, then bumps the RSN by 10 and writes the new content to the record as sprintf("%d", rsn)

12 years agoReadOnly: Add clientside functions to send the UPDATE_RECORD control
Ronnie Sahlberg [Wed, 20 Jul 2011 02:06:37 +0000 (12:06 +1000)]
ReadOnly: Add clientside functions to send the UPDATE_RECORD control

12 years agoReadOnly: Add test tool to validate the functions to manipulate and enumerate the...
Ronnie Sahlberg [Wed, 20 Jul 2011 01:50:14 +0000 (11:50 +1000)]
ReadOnly: Add test tool to validate the functions to manipulate and enumerate the bitmap of nodes to where we have readonly delegations

12 years agoReadOnly: Add helper functions to manipulate a TDB_DATA as a bitmap for nodes that...
Ronnie Sahlberg [Wed, 20 Jul 2011 01:39:50 +0000 (11:39 +1000)]
ReadOnly: Add helper functions to manipulate a TDB_DATA as a bitmap for nodes that we are tracking as having a readonly delegation

12 years agoReadOnly records: Add a new RPC function FETCH_WITH_HEADER.
Ronnie Sahlberg [Wed, 20 Jul 2011 01:27:05 +0000 (11:27 +1000)]
ReadOnly records: Add a new RPC function FETCH_WITH_HEADER.
This function differs from the old FETCH in that this function will also fetch the record header and not just the record data

12 years agoFix a const warning
Volker Lendecke [Mon, 22 Aug 2011 14:40:58 +0000 (16:40 +0200)]
Fix a const warning

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agoRemove an unused variable
Volker Lendecke [Mon, 22 Aug 2011 14:39:32 +0000 (16:39 +0200)]
Remove an unused variable

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: "unpack_reply_control" does not need the ctdb_connection parameter
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "unpack_reply_control" does not need the ctdb_connection parameter

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: "unpack_reply_call" does not need the ctdb_connection parameter
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "unpack_reply_call" does not need the ctdb_connection parameter

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: "ctdb_request_free" does not need the ctdb_connection parameter
Volker Lendecke [Fri, 19 Aug 2011 15:05:36 +0000 (17:05 +0200)]
libctdb: "ctdb_request_free" does not need the ctdb_connection parameter

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Make sure ctdb_request->ctdb is filled correctly
Volker Lendecke [Fri, 19 Aug 2011 14:36:20 +0000 (16:36 +0200)]
libctdb: Make sure ctdb_request->ctdb is filled correctly

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Ensure 0-termination of sun_path
Volker Lendecke [Thu, 18 Aug 2011 12:47:09 +0000 (14:47 +0200)]
libctdb: Ensure 0-termination of sun_path

Rusty, please check!

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Fix a few format warnings
Volker Lendecke [Thu, 18 Aug 2011 11:59:48 +0000 (13:59 +0200)]
libctdb: Fix a few format warnings

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Add license header to messages.c
Volker Lendecke [Thu, 18 Aug 2011 11:57:58 +0000 (13:57 +0200)]
libctdb: Add license header to messages.c

Rusty, please check!

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Reorder attachdb
Volker Lendecke [Thu, 18 Aug 2011 11:37:23 +0000 (13:37 +0200)]
libctdb: Reorder attachdb

No code change; this makes the sequence of what happens easier to read

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Reorder set_message_handler
Volker Lendecke [Thu, 18 Aug 2011 11:55:24 +0000 (13:55 +0200)]
libctdb: Reorder set_message_handler

No code change, this is for better readability

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agolibctdb: Correct 4bfdfda, stddef.h is needed by libctdb_private.h
Volker Lendecke [Thu, 18 Aug 2011 11:54:36 +0000 (13:54 +0200)]
libctdb: Correct 4bfdfda, stddef.h is needed by libctdb_private.h

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agoAdd missing #include to libctdb/ctdb.c
Volker Lendecke [Wed, 17 Aug 2011 12:46:43 +0000 (14:46 +0200)]
Add missing #include to libctdb/ctdb.c

We need that to have the "offsetof" macro, thus we don't need to redeclare it
in libctdb_private.h

Signed-off-by: Michael Adam <obnox@samba.org>
12 years agoMerge remote branch 'martins/eventscripts'
Ronnie Sahlberg [Wed, 17 Aug 2011 04:10:04 +0000 (14:10 +1000)]
Merge remote branch 'martins/eventscripts'

12 years agoEventscripts - new default TCP port checker using "ctdb checktcpport"
Martin Schwenke [Wed, 17 Aug 2011 04:02:45 +0000 (14:02 +1000)]
Eventscripts - new default TCP port checker using "ctdb checktcpport"

New function ctdb_check_tcp_ports_ctdb().  This should be fast... and
is now the default checker.  If it fails in an unexpected way we fall
back to the nmap and netstat checkers.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - generalise TCP port checking plus new nmap-based checker
Martin Schwenke [Wed, 17 Aug 2011 02:12:20 +0000 (12:12 +1000)]
Eventscripts - generalise TCP port checking plus new nmap-based checker

Split the netstat-specific parts of ctdb_check_tcp_ports() into new
function ctdb_check_tcp_ports_netstat().

Implement new ctdb_check_tcp_ports_nmap() function that uses
"nmap -PS" to check if the desired ports are listening.

ctdb_check_tcp_ports() now uses new configuration variable
CTDB_TCP_PORT_CHECKERS to decide which port checkers to try.  Default
value is currently "nmap netstat".  If nmap is not found then this
will fall back to netstat - if logging is at debug level this will
also fill the logs with messages saying the nmap checker failed.  This
indicates that either nmap should be installed or the default value of
CTDB_TCP_PORT_CHECKERS should be changed (in a configuration file) to
avoid trying to use nmap.
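
The fallback between checkers could look roughly like the sketch below.
It is not the code from this commit: the name try_port_checkers, the
check_ports_<name> helpers and the exit-status convention (0 = all ports
listening, 1 = a port is down, anything else = checker unusable) are
assumptions made for illustration.

```shell
# Sketch only. Assumed convention for each hypothetical check_ports_<name>
# helper: 0 = all ports listening, 1 = some port not listening,
# any other status = the checker itself could not run (e.g. tool missing).
try_port_checkers ()
{
    for _c in ${CTDB_TCP_PORT_CHECKERS:-nmap netstat} ; do
        _rc=0
        "check_ports_${_c}" "$@" || _rc=$?
        case "$_rc" in
            0|1) return "$_rc" ;;
            *) echo "DEBUG: checker \"${_c}\" failed (status ${_rc}), trying next" >&2 ;;
        esac
    done
    # No checker was usable; what to do here is a policy choice.
    return 2
}
```

With the default value, a missing nmap would make the first helper fail
in the "unusable" way and the loop would move on to netstat.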

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - ctdb_check_tcp_ports() only prints netstat output if debugging
Martin Schwenke [Wed, 17 Aug 2011 00:27:01 +0000 (10:27 +1000)]
Eventscripts - ctdb_check_tcp_ports() only prints netstat output if debugging

Use the new debug function to conditionally print the netstat output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - weaken TCP port check message if CTDB has just been started.
Martin Schwenke [Fri, 5 Aug 2011 06:39:57 +0000 (16:39 +1000)]
Eventscripts - weaken TCP port check message if CTDB has just been started.

Sometimes smbd and other services can take a while to start,
especially when there is a lot of activity after ctdbd has just
started.  The TCP port check can then pollute the logs with lots of
"ERROR" messages and possibly extra debug.

This creates a flag file when a service is started (but not restarted)
and this flag is removed the first time that TCP port checks succeed
for that service.  When a port check fails and the flag file still
exists, a less extreme "INFO" message is printed rather than the usual
"ERROR" message.  This means that until the node actually becomes
healthy we see more friendly messages.
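
A minimal sketch of the flag-file idea, assuming a hypothetical FLAG_DIR
and hypothetical function names (the real eventscript helpers and file
locations may differ):

```shell
# Sketch only. FLAG_DIR and both function names are hypothetical.
FLAG_DIR="${FLAG_DIR:-/tmp}"

service_started_sketch ()
{
    # Called when a service is started (not restarted).
    touch "${FLAG_DIR}/${1}.just-started"
}

report_port_check_sketch ()
{
    _service="$1" ; _port="$2" ; _listening="$3"   # _listening: true|false
    _flag="${FLAG_DIR}/${_service}.just-started"
    if $_listening ; then
        rm -f "$_flag"      # the first success ends the grace period
        return 0
    fi
    if [ -f "$_flag" ] ; then
        echo "INFO: ${_service} not yet listening on TCP port ${_port}"
    else
        echo "ERROR: ${_service} not listening on TCP port ${_port}"
    fi
    return 1
}
```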

The subtext is that we're hearing false positive reports "recreates"
of CQ S1024874 (samba stopped responding on port 445) quite often when
ctdbd is started.  This reduces the chances of people reporting such
false recreates...

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscript functions: optimise ctdb_check_tcp_ports() and add debug.
Martin Schwenke [Tue, 5 Jul 2011 01:32:06 +0000 (11:32 +1000)]
Eventscript functions: optimise ctdb_check_tcp_ports() and add debug.

ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each
port.  There are 2 problems with this:

* Netstat is run on each loop iteration when it need only be run once.

* The -a option is used to list all connections but the function only
  cares about the listening ports.  There may be many thousands of
  non-listening ports to grep through.

This changes ctdb_check_tcp_ports() to run netstat with the -l option
instead of the -a option.  It also only runs netstat once before the
main loop.

When a port is found to not be listening the output of the netstat
command is now dumped to help with debugging.
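
The shape of the optimised check might be sketched as follows. The
function name, grep pattern and message text are illustrative
assumptions, not the code from the commit; only the "run netstat -l once,
then loop" structure is taken from the description above.

```shell
# Sketch only: function name and message text are illustrative.
check_tcp_ports_sketch ()
{
    # One netstat run for all ports; -l restricts output to listening sockets.
    _ns=$(netstat -l -t -n 2>/dev/null)
    for _p ; do
        if ! printf '%s\n' "$_ns" | grep -q ":${_p}[[:space:]]" ; then
            echo "ERROR: TCP port ${_p} is not listening"
            printf '%s\n' "$_ns"    # dump the listing to help with debugging
            return 1
        fi
    done
    return 0
}
```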

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: add a debug() function and call ctdb_set_current_debuglevel()
Martin Schwenke [Tue, 16 Aug 2011 23:44:11 +0000 (09:44 +1000)]
Eventscripts: add a debug() function and call ctdb_set_current_debuglevel()

The debug function passes its arguments to echo if
$CTDB_CURRENT_DEBUGLEVEL is >= 4 (i.e. DEBUG).  If no args are given
then use stdin - this allows the function to be used with here
documents.

To ensure $CTDB_CURRENT_DEBUGLEVEL is set,
ctdb_set_current_debuglevel() is called near the end of the functions
file.
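
A minimal sketch of such a function, written from the description above
rather than copied from the functions file:

```shell
# Sketch only: prints its arguments, or stdin when called without
# arguments, but only when the debug level is 4 (DEBUG) or higher.
debug ()
{
    if [ "${CTDB_CURRENT_DEBUGLEVEL:-0}" -ge 4 ] ; then
        if [ $# -gt 0 ] ; then
            echo "$@"
        else
            cat     # no arguments: copy stdin, e.g. a here document
        fi
    fi
}
```

Reading stdin is what allows a call such as "debug <<EOF" followed by
several lines of text.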

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoAdd a new command 'ctdb checktcpport <port>'
Ronnie Sahlberg [Wed, 17 Aug 2011 00:16:35 +0000 (10:16 +1000)]
Add a new command 'ctdb checktcpport <port>'
that tries to bind to the specified port on INADDR_ANY.

This can be used for testing if a service is listening to that port or not.

Errors are printed to stdout and the returned status code is either 0, if we managed to bind to the port (in which case the service is NOT listening on that port), or the value of errno that stopped us from binding to the port.

errno for EADDRINUSE is 98 (on Linux) so a script using this command should check the status code against the value 98.
If this command returns 98 it means the service is listening to the specified port.
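
A script might wrap the command along these lines. This is a sketch: the
function name port_is_listening and the CTDB override variable are
hypothetical, and the value 98 is the Linux errno quoted above.

```shell
# Sketch only. Returns 0 if something is listening on the port,
# 1 if the bind succeeded (nothing listening), 2 on any other errno.
port_is_listening ()
{
    _rc=0
    ${CTDB:-ctdb} checktcpport "$1" >/dev/null 2>&1 || _rc=$?
    case "$_rc" in
        98) return 0 ;;     # EADDRINUSE: the port is already bound
        0)  return 1 ;;     # bind worked, so no service holds the port
        *)  return 2 ;;     # some other error stopped the bind
    esac
}
```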

12 years agoDon't use too large a persistence timeout value
Ronnie Sahlberg [Tue, 16 Aug 2011 23:59:42 +0000 (09:59 +1000)]
Don't use too large a persistence timeout value

12 years agoEventscripts - conditionally inherit ctdbd debug level in each monitor event
Martin Schwenke [Tue, 16 Aug 2011 23:14:23 +0000 (09:14 +1000)]
Eventscripts - conditionally inherit ctdbd debug level in each monitor event

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - new function ctdb_set_current_debuglevel()
Martin Schwenke [Tue, 16 Aug 2011 23:00:46 +0000 (09:00 +1000)]
Eventscripts - new function ctdb_set_current_debuglevel()

This function ensures that CTDB_CURRENT_DEBUGLEVEL is set.  It works
like this:

1. If it is already set then do nothing, since it might have been set
   some other way.

   The recommended "other way" would be to add a file in rc.local.d/.

2. If it is not set then set it by sourcing
   /var/ctdb/eventscript_debuglevel.

3. If this file does not exist then create it using output from "ctdb
   getdebug".

If the optional 1st argument is set to "create" then don't source an
existing file but create a new one instead - this is useful for
creating the file just once in each event run in, say, 00.ctdb.

If there's a problem getting the debug level from ctdb then it is
silently set to 0 - no use spamming logs if our debug code is
broken...
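
The steps above can be sketched as follows. The function name,
DEBUGLEVEL_FILE and get_ctdb_debuglevel are hypothetical; in particular,
get_ctdb_debuglevel stands in for whatever parsing of "ctdb getdebug"
output the real function performs.

```shell
# Sketch only. get_ctdb_debuglevel is a hypothetical helper that prints
# a numeric debug level and fails if ctdb cannot be queried.
set_current_debuglevel_sketch ()
{
    # 1. Already set some other way: do nothing.
    [ -z "${CTDB_CURRENT_DEBUGLEVEL:-}" ] || return 0

    _f="${DEBUGLEVEL_FILE:-/var/ctdb/eventscript_debuglevel}"

    # 3. (Re)create the file if asked to, or if it cannot be read.
    if [ "${1:-}" = "create" ] || [ ! -r "$_f" ] ; then
        _l=$(get_ctdb_debuglevel 2>/dev/null) || _l=0   # silent fallback
        echo "export CTDB_CURRENT_DEBUGLEVEL=\"${_l:-0}\"" >"$_f"
    fi

    # 2. Set the variable by sourcing the file.
    . "$_f"
}
```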

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - ensure the statd update-trigger file always exists.
Martin Schwenke [Tue, 16 Aug 2011 03:28:40 +0000 (13:28 +1000)]
Eventscripts - ensure the statd update-trigger file always exists.

See the comment in the code for details.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: remove "return 0" from 50.samba service_stop().
Martin Schwenke [Tue, 16 Aug 2011 03:18:40 +0000 (13:18 +1000)]
Eventscripts: remove "return 0" from 50.samba service_stop().

This potentially masks errors and was basically included by accident.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoChange the errors for 10.interface to clearly state ERROR: for error messages
Ronnie Sahlberg [Mon, 15 Aug 2011 05:53:04 +0000 (15:53 +1000)]
Change the errors for 10.interface to clearly state ERROR: for error messages

Update the tests system to catch the new error strings generated by this change

12 years agoMerge remote branch 'martins/eventscript_tests'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:43:15 +0000 (15:43 +1000)]
Merge remote branch 'martins/eventscript_tests'

12 years agoTests - exportfs stub needs to print out export options.
Martin Schwenke [Mon, 15 Aug 2011 05:40:35 +0000 (15:40 +1000)]
Tests - exportfs stub needs to print out export options.

This is needed due to bd39b91ad12fd05271a7fced0e6f9d8c4eba92e6.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoMerge remote branch 'martins/eventscript.10.interface'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:27:50 +0000 (15:27 +1000)]
Merge remote branch 'martins/eventscript.10.interface'

12 years agoMerge remote branch 'martins/60_nfs_regression'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:22:20 +0000 (15:22 +1000)]
Merge remote branch 'martins/60_nfs_regression'

12 years agoMerge remote branch 'martins/eventscript.60.nfs.rpc'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:20:18 +0000 (15:20 +1000)]
Merge remote branch 'martins/eventscript.60.nfs.rpc'

12 years agoMerge remote branch 'martins/test_suite'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:16:06 +0000 (15:16 +1000)]
Merge remote branch 'martins/test_suite'

12 years agoMerge remote branch 'martins/eventscript_tests'
Ronnie Sahlberg [Mon, 15 Aug 2011 05:15:12 +0000 (15:15 +1000)]
Merge remote branch 'martins/eventscript_tests'

12 years agoTests - ctdb listvars test should allow alphanumericals in tunable names.
Martin Schwenke [Mon, 15 Aug 2011 03:53:39 +0000 (13:53 +1000)]
Tests - ctdb listvars test should allow alphanumericals in tunable names.

This matches the new "LCP2PublicIPs" tunable.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoChange the default for ip failover to be LCP2 and not DeterministicIPs
Ronnie Sahlberg [Mon, 15 Aug 2011 00:23:50 +0000 (10:23 +1000)]
Change the default for ip failover to be LCP2 and not DeterministicIPs

12 years agoEventscripts: 10.interfaces - make startup event actually mark interfaces up!
Martin Schwenke [Tue, 5 Jul 2011 07:21:57 +0000 (17:21 +1000)]
Eventscripts: 10.interfaces - make startup event actually mark interfaces up!

The startup event intends to mark interfaces up.  However, it doesn't
actually do that because $INTERFACES is empty.

This uses the function get_all_interfaces() to list the
interfaces... and then mark them up.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interfaces - startup comment says assume all interfaces good.
Martin Schwenke [Tue, 5 Jul 2011 07:20:09 +0000 (17:20 +1000)]
Eventscripts: 10.interfaces - startup comment says assume all interfaces good.

Interfaces are currently marked down.  Mark them up instead, as per
the comment... and discussion with Ronnie.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interfaces - new function get_all_interfaces().
Martin Schwenke [Tue, 5 Jul 2011 07:18:30 +0000 (17:18 +1000)]
Eventscripts: 10.interfaces - new function get_all_interfaces().

Move existing interface listing code to new function in preparation
for using it in startup event.

While we're here change the "sort | uniq" into "sort -u" and save some
complexity.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interface clean-ups - minor tweaks and new comments.
Martin Schwenke [Tue, 28 Jun 2011 07:07:39 +0000 (17:07 +1000)]
Eventscripts: 10.interface clean-ups - minor tweaks and new comments.

* sed can read files, it doesn't need a file piped to it
* use $() subshells instead of `` - they seem to quote better in dash
* tweak the uniquifying code so that it is easier to read
* add comments
* remove some extraneous semicolons at ends of lines
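
As an illustration of the first two points, together with "sort -u", a
sketch that lists the unique interface names in a file. The function name
is hypothetical and the file layout ("address/mask interface" per line)
is an assumption made for the example.

```shell
# Sketch only: hypothetical helper, assumed "address/mask interface" lines.
list_unique_ifaces ()
{
    _file="$1"
    # sed reads the file itself (no "cat file | sed"), $() replaces
    # backticks, and "sort -u" replaces "sort | uniq".
    _ifaces=$(sed -e 's/^[^[:space:]]*[[:space:]]*//' "$_file" | sort -u)
    # Unquoted on purpose: joins the names onto one line.
    echo $_ifaces
}
```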

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoTests: re-enable the NFS eventscript tests - they work again.
Martin Schwenke [Fri, 12 Aug 2011 06:30:54 +0000 (16:30 +1000)]
Tests: re-enable the NFS eventscript tests - they work again.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: In 60.nfs don't restart NFS when restarting rpc.lockd.
Martin Schwenke [Fri, 12 Aug 2011 06:28:09 +0000 (16:28 +1000)]
Eventscripts: In 60.nfs don't restart NFS when restarting rpc.lockd.

This effectively reverts 953dbfbddad656a64e30a6aca115cb1479d11573 and
is a policy decision.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interface clean-ups - variable name fix-ups.
Martin Schwenke [Tue, 28 Jun 2011 06:50:47 +0000 (16:50 +1000)]
Eventscripts: 10.interface clean-ups - variable name fix-ups.

Change most of the uppercase variable names to lowercase for
consistency with other variables, readability and so they can be
easily distinguished from environment/configuration variables.  Change
the name of 2 of the variables to add some clarity.  Changes are as
follows:

  INTERFACES   -> all_interfaces
  IFACES       -> ctdb_interfaces
  IFACE        -> iface
  I            -> i
  REALIFACE    -> realiface

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interfaces clean-ups - push logic into monitor_interfaces().
Martin Schwenke [Tue, 28 Jun 2011 06:27:01 +0000 (16:27 +1000)]
Eventscripts: 10.interfaces clean-ups - push logic into monitor_interfaces().

The logic in the monitor event itself is very complex.  Nearly all of
it can go away by adding a single check of
$CTDB_PARTIALLY_ONLINE_INTERFACES to the return logic of
monitor_interfaces() and reversing the sense of the corresponding
check.
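
The resulting return logic might be sketched like this. The function
name is hypothetical and the two true/false arguments stand in for state
that monitor_interfaces() would compute itself.

```shell
# Sketch only. $1 = "true" if any interface check failed,
# $2 = "true" if at least one interface was found to be up.
monitor_result_sketch ()
{
    _fail="$1" ; _up_interfaces_found="$2"

    # Nothing failed: healthy.
    $_fail || return 0

    # Something failed, but partial availability is acceptable.
    $_up_interfaces_found && \
        [ "${CTDB_PARTIALLY_ONLINE_INTERFACES:-}" = "yes" ] && \
        return 0

    return 1
}
```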

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interfaces clean-up - use more descriptive variable names.
Martin Schwenke [Tue, 28 Jun 2011 06:10:23 +0000 (16:10 +1000)]
Eventscripts: 10.interfaces clean-up - use more descriptive variable names.

The name of variable $ok gives no clue to its meaning/use so this
changes that variable to be named $up_interfaces_found.

The return logic relating to $ok and $fail is difficult to read, so
these variables are given true/false values, allowing the return logic
to be simplified.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 10.interfaces cleanup - new functions mark_up(), mark_down().
Martin Schwenke [Tue, 28 Jun 2011 05:53:54 +0000 (15:53 +1000)]
Eventscripts: 10.interfaces cleanup - new functions mark_up(), mark_down().

The same few lines of logic are used every time an interface is marked up or down.

This encapsulates those few lines in 2 new functions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: change failure counts and behaviour for statd and nfsd.
Martin Schwenke [Thu, 13 Jan 2011 22:40:11 +0000 (09:40 +1100)]
Eventscripts: change failure counts and behaviour for statd and nfsd.

We reduce the number of failures before attempting a restart.
However, after 6 failures we mark the cluster unhealthy and no longer
try to restart.  If the previous 2 attempts didn't work then there
isn't any use in bogging the system down with an attempted restart on
every monitor event.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: clean up 60.nfs monitor event.
Martin Schwenke [Fri, 17 Dec 2010 05:25:04 +0000 (16:25 +1100)]
Eventscripts: clean up 60.nfs monitor event.

This adds a helper function called nfs_check_rpc_service() and uses it
to make the monitor event much more readable.  An example of usage is
as follows:

  nfs_check_rpc_service "mountd" \
    -ge 10 "verbose restart:b unhealthy" \
    -eq 5 "restart:b"

The first argument to nfs_check_rpc_service() is the name of the RPC
service to be checked.  The RPC service corresponding to this command
is checked for availability using the rpcinfo command.  If the service
is available then the function succeeds and subsequent arguments are
ignored.

If the rpcinfo check fails then a failure counter for that particular
RPC service is incremented and subsequent arguments are processed in
groups of 3:

1. An integer comparison operator supported by test.
2. An integer failure limit.
3. An action string.

The value of the failure counter is checked using (1) and (2) above.
The first check that succeeds has its action string processed - note
that this explains the somewhat curious reverse ordering of checks.

In the example above:

* If the counter is >= 10 then a verbose message is printed
  describing the failure, the service is restarted in the background
  and the node is marked as unhealthy (via an "exit 1" from the
  function).

* If the counter is == 5 then the service is restarted in the
  background.

For more action options please see the code.
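
The "first check that succeeds" processing can be sketched on its own,
separate from the rpcinfo check and the failure counter. The function
name first_matching_action is hypothetical; it takes the current counter
value followed by the operator/limit/action groups and prints the action
string of the first group that matches.

```shell
# Sketch only: selects the action for a given failure count.
first_matching_action ()
{
    _count="$1" ; shift
    while [ $# -ge 3 ] ; do
        _op="$1" ; _limit="$2" ; _action="$3"
        shift 3
        # $_op is an integer comparison operator understood by test,
        # such as -ge or -eq.
        if [ "$_count" "$_op" "$_limit" ] ; then
            echo "$_action"
            return 0
        fi
    done
    return 1    # no group matched: nothing to do on this failure
}
```

Because only the first match is used, the stricter ">= 10" group has to
come before "== 5", which is the reverse ordering mentioned above.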

This also changes the ctdb_check_rpc() function so that it no longer
takes a program number to check.  It now just takes a real RPC program
name that rpcinfo can resolve via /etc/rpc.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoTests: Re-enable the Samba eventscript tests.
Martin Schwenke [Thu, 11 Aug 2011 05:33:46 +0000 (15:33 +1000)]
Tests: Re-enable the Samba eventscript tests.

They work again.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoRevert "Tests: tweak some samba tests to cope with debug from ctdb_check_tcp_ports()."
Martin Schwenke [Thu, 11 Aug 2011 05:32:28 +0000 (15:32 +1000)]
Revert "Tests: tweak some samba tests to cope with debug from ctdb_check_tcp_ports()."

This reverts commit 557ac30e60516742da10b83bfbbbb41430c977a2.

12 years agoEventscripts: fix regression in 60.nfs export checking.
Martin Schwenke [Wed, 13 Apr 2011 02:37:42 +0000 (12:37 +1000)]
Eventscripts: fix regression in 60.nfs export checking.

Commit 35a60a63a9b5c7d98dde514ae552239506b691c9 introduced a
regression, reported by "Jonathan Buzzard" <J.Buzzard@dundee.ac.uk>,
as follows:

  Basically the use of sed in the following code snippet does not work
  for long exports where exportfs wraps the host or network onto the
  next line.

         exportfs | grep -v '^#' | grep '^/' |
         sed -e 's/[[:space:]]*[^[:space:]]*$//' |
         ctdb_check_directories

  The result is that you get lots of blank lines being sent to
  ctdb_check_directories which causes the host to be marked as
  unhealthy and then thrashing sets in of the managed IP's making the
  whole cluster unusable.

This tightens up the sed expression so that it is less likely to
produce a spurious empty line.  It also removes an unnecessary "grep -v".
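
The failure mode can be reproduced in isolation.  The sketch below
compares the quoted expression with one stricter variant that requires
at least one whitespace character before the final field.  The stricter
expression is an assumption made for illustration and is not necessarily
the expression used in the commit; both function names are hypothetical.

```shell
# Quoted expression: it also matches a line containing no whitespace,
# so a bare path (host wrapped onto the next line) becomes empty.
strip_host_loose ()
{
    sed -e 's/[[:space:]]*[^[:space:]]*$//'
}

# Illustrative stricter variant: at least one whitespace character must
# precede the final field, so a bare path is left untouched.
strip_host_strict ()
{
    sed -e 's/[[:space:]][[:space:]]*[^[:space:]]*$//'
}
```

Feeding the single line "/long/path" to strip_host_loose yields an empty
line, whereas strip_host_strict passes it through unchanged.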

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoMerge remote branch 'martins/eventscript.10.interface'
Ronnie Sahlberg [Thu, 11 Aug 2011 04:15:22 +0000 (14:15 +1000)]
Merge remote branch 'martins/eventscript.10.interface'

12 years agoMerge remote branch 'martins/eventscript_infrastructure'
Ronnie Sahlberg [Thu, 11 Aug 2011 04:01:02 +0000 (14:01 +1000)]
Merge remote branch 'martins/eventscript_infrastructure'

12 years agoEventscripts: in 60.nfs move statd-notify code to service_reconfigure().
Martin Schwenke [Mon, 23 May 2011 06:00:05 +0000 (16:00 +1000)]
Eventscripts: in 60.nfs move statd-notify code to service_reconfigure().

This means that it now occurs on every reconfigure event.  As a result
the ipreallocated event is removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts - 60.nfs should define service_reconfigure().
Martin Schwenke [Thu, 11 Aug 2011 03:55:02 +0000 (13:55 +1000)]
Eventscripts - 60.nfs should define service_reconfigure().

Not $service_reconfigure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoWhen starting and stopping ctdb through the init-script, make sure we first clear...
Ronnie Sahlberg [Thu, 11 Aug 2011 01:45:59 +0000 (11:45 +1000)]
When starting and stopping ctdb through the init-script, make sure we first clear all public IPs before we start the daemon, in case they are still hanging around after a previous kill -9. Also make sure we drop them after we have stopped the daemon when shutting down.

CQ S1027550

12 years agoEventscripts: improvements to ctdb_service_check_reconfigure().
Martin Schwenke [Thu, 13 Jan 2011 22:31:56 +0000 (09:31 +1100)]
Eventscripts: improvements to ctdb_service_check_reconfigure().

* Make this function applicable to "ipreallocated" event too.

* Monitor event should not always succeed just because we reconfigure.

  If the service was unhealthy before the reconfigure and we end the
  reconfigure with "exit 0" then we can cause the node's health status
  to flip-flop.

  To avoid this we return the status of the service from the previous
  monitor event.
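
One way to picture this is a small state file that remembers the last
monitor result.  The sketch below is an assumption-laden illustration:
the file location and both function names are hypothetical and do not
come from the CTDB scripts.

```shell
# Hypothetical state file holding the exit status of the last monitor
# event for an example service.
status_file="${TMPDIR:-/tmp}/example_service.monitor_status"

# Record the result of a monitor event.
save_monitor_status ()
{
    echo "$1" >"$status_file"
}

# Status to report when a monitor event is replaced by a reconfigure:
# the previous monitor result, or 0 if none has been recorded.
previous_monitor_status ()
{
    if [ -r "$status_file" ] ; then
        cat "$status_file"
    else
        echo 0
    fi
}
```

A reconfigure that ends with the value printed by
previous_monitor_status, rather than an unconditional 0, leaves an
unhealthy service reported as unhealthy.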

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 50.samba - only start/stop nmbd if $CTDB_SERVICE_NMB set.
Martin Schwenke [Fri, 27 May 2011 04:37:37 +0000 (14:37 +1000)]
Eventscripts: 50.samba - only start/stop nmbd if $CTDB_SERVICE_NMB set.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 50.samba needs null service_reconfigure() function.
Martin Schwenke [Mon, 23 May 2011 05:37:09 +0000 (15:37 +1000)]
Eventscripts: 50.samba needs null service_reconfigure() function.

Samba doesn't need to do anything for configuration changes.  It will
notice configuration changes and reload automatically.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: 40.vsftpd service_stop() no longer /dev/null's output.
Martin Schwenke [Thu, 13 Jan 2011 22:42:18 +0000 (09:42 +1100)]
Eventscripts: 40.vsftpd service_stop() no longer /dev/null's output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: improvements to 41.httpd.
Martin Schwenke [Thu, 13 Jan 2011 22:43:01 +0000 (09:43 +1100)]
Eventscripts: improvements to 41.httpd.

* Reduce the failure counts so that restart attempts happen sooner.

* Use service_start() and service_stop() for the restart.
  ctdb_service_start() resets the failure count, which isn't very
  useful in this context.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscript functions: new function ctdb_check_counter().
Martin Schwenke [Fri, 17 Dec 2010 05:10:56 +0000 (16:10 +1100)]
Eventscript functions: new function ctdb_check_counter().

This should eventually be able to replace ctdb_check_counter_limit()
and ctdb_check_counter_equal(), although it doesn't issue warnings
like the former.

It takes 4 optional arguments:

1. _msg - If "error" then over limit causes an error message and an
   exit 1.  Anything else fails silently but the function returns 1.
   Default is "error".

2. _op - An integer operator supported by test (e.g. -eq, -ge, -gt).
   Default is -ge.

3. _limit - Limit for the counter to be used in comparison.  Default is
   $service_fail_limit.

4. _service_name - Used to identify the counter.  Default is
   $service_name.

For example:

  ctdb_check_counter error -ge 5 foo

will print a message and exit 1 if the counter for foo is >= 5,
whereas

  ctdb_check_counter check -ge 5 foo

will just return 1 if the counter for foo is >= 5, and

  ctdb_check_counter

will print a message and exit 1 if the counter for $service_name is >=
$service_fail_limit.
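
The argument handling described above can be sketched as follows.  This
is not the CTDB implementation: the function name check_counter_sketch,
the counter_dir location, the counter file format (a single integer) and
the fallback defaults are all assumptions made for illustration.

```shell
# Assumed defaults; in the real scripts these would come from the
# eventscript environment.
service_name="${service_name:-example}"
service_fail_limit="${service_fail_limit:-3}"
counter_dir="${counter_dir:-${TMPDIR:-/tmp}/counter_sketch}"

check_counter_sketch ()
{
    _msg="${1:-error}"                      # "error" => message and exit 1
    _op="${2:--ge}"                         # integer operator for test
    _limit="${3:-$service_fail_limit}"      # limit to compare against
    _service_name="${4:-$service_name}"     # identifies the counter

    # Hypothetical storage: one file per service holding an integer.
    _count=0
    if [ -r "$counter_dir/$_service_name" ] ; then
        _count=$(cat "$counter_dir/$_service_name")
    fi

    if [ "$_count" "$_op" "$_limit" ] ; then
        if [ "$_msg" = "error" ] ; then
            echo "ERROR: $_count failures for $_service_name"
            exit 1
        fi
        # Any other _msg value fails silently.
        return 1
    fi

    return 0
}
```

Note that the "error" mode calls exit, so it terminates the calling
script; run it in a subshell when only the status is wanted.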

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: remove unused remove_ip() function.
Martin Schwenke [Tue, 28 Jun 2011 04:57:11 +0000 (14:57 +1000)]
Eventscripts: remove unused remove_ip() function.

Signed-off-by: Martin Schwenke <martin@meltin.net>
12 years agoEventscripts: startstop_nfs stop no longer redirects output to /dev/null.
Martin Schwenke [Thu, 13 Jan 2011 22:31:05 +0000 (09:31 +1100)]
Eventscripts: startstop_nfs stop no longer redirects output to /dev/null.

When stopping (as opposed to restarting) it is useful to see this
information.

Signed-off-by: Martin Schwenke <martin@meltin.net>