Ronnie Sahlberg [Wed, 1 Feb 2012 01:14:13 +0000 (12:14 +1100)]
Merge branch '1.2-readonly' of 10.1.1.27:/shared/ctdb/ctdb-master into 1.2-readonly
Conflicts:
packaging/RPM/ctdb.spec.in
Ronnie Sahlberg [Tue, 31 Jan 2012 22:32:02 +0000 (09:32 +1100)]
ReadOnly: update the loop test tool to print number of fetches per second
Ronnie Sahlberg [Tue, 31 Jan 2012 06:20:35 +0000 (17:20 +1100)]
Niceify the readonlyrecord API. Dont force clients to be exposed to the featch_with_header function
We dont strictly need to force clients to use CTDB_FETCH_WITH_HEADER instead of CTDB_FETCH when they ask for readonly records.
Have ctdbd internally remap this internally to FETCH_WITH_HEADER and map the reply back to CTDB_FETCH_FUNC or CTDB_FETCH_WITH_HEADER_FUNC based on what the client initially asked for.
This removes the need for the client to know about the CTDB_FETCH_WITH_HEADER_FUNC function and simplifies the client code.
Clients that do not care what the header after the request is can just continue using the old CTDB_FETCH_FUNC call and ctdbd will do all the difficult stuff.
Ronnie Sahlberg [Fri, 28 Oct 2011 02:38:32 +0000 (13:38 +1100)]
libctdb: dont allow ctdb_writerecord() for readonly records
Ronnie Sahlberg [Fri, 28 Oct 2011 01:41:27 +0000 (12:41 +1100)]
ReadOnly: If record does not exist, upgrade to write-lock
If we attempt a readonly lock request for a record that do not exist (yet)
in the local TDB, then upgrade the request to ask for a write lock and force a
request for migrate the record onto the local node.
This allows the "only request record on second local request for known contended records"
heuristics to try to avoid creating readonly delegations unless we have good reason to
assume it is a contended record.
Ronnie Sahlberg [Tue, 31 Jan 2012 23:26:41 +0000 (10:26 +1100)]
ReadOnly: add readonly record lock requests to libctdb
Initial readonly record support in libctdb.
New records are not yet created by the library but extising records will be
This needs a bit more tests before we can drop the "old style" implementatio
code in client/ctdb_client.c
Ronnie Sahlberg [Fri, 28 Oct 2011 00:44:19 +0000 (11:44 +1100)]
ReadOnly: fix bug writing incorrect amount of data in delegated record
Fix bug when ctdbd updates the local copy of a delegated record to write the correct
amount of data to the record.
Ronnie Sahlberg [Mon, 24 Oct 2011 02:19:30 +0000 (13:19 +1100)]
ReadOnly DOCS: update the docs for readonly delegations to remove the passage that records are written/updated by the client
Ronnie Sahlberg [Mon, 24 Oct 2011 02:14:26 +0000 (13:14 +1100)]
ReadOnly: Dont update the record header from the calling client. While it is convenient since it avoids having to create a child process from the main dameon for writing the updated record it makes the cleitn more complex.
Remove the code in the example client code that writes the record to the local tdb.
Add code to the local ctdbd processing of replies to check if this reply contain a ro delegation and if so, spawn a child process to lock the tdb and then write the data.
Ronnie Sahlberg [Tue, 13 Sep 2011 08:47:18 +0000 (18:47 +1000)]
ReadOnly: revokechild_active is a list, not a context.
Dont reset the pointer to NULL after deleting the first entry, loop deleting one entry
at a time until they are all gone or we will leak some memory and possibly a process.
Ronnie Sahlberg [Tue, 13 Sep 2011 08:41:34 +0000 (18:41 +1000)]
fix some compiler warnings for the test tools
Ronnie Sahlberg [Tue, 13 Sep 2011 08:38:20 +0000 (18:38 +1000)]
ReadOnly: Rename the function ctdb_ltdb_fetch_readonly() to ctdb_ltdb_fetch_with_header() since this is what it actually does.
Ronnie Sahlberg [Thu, 1 Sep 2011 01:40:51 +0000 (11:40 +1000)]
ReadOnly: update the documentation about readonly locks
Ronnie Sahlberg [Thu, 1 Sep 2011 01:08:18 +0000 (11:08 +1000)]
ReadOnly: add a new control to activate readonly lock capability for a database.
let all databases default to not support this until enabled through this control
Ronnie Sahlberg [Thu, 1 Sep 2011 00:28:15 +0000 (10:28 +1000)]
ReadOnly: add a readonly flag to the getdbmap control and show the readonly setting in ctdb getdbmap output
Ronnie Sahlberg [Thu, 1 Sep 2011 00:21:55 +0000 (10:21 +1000)]
ReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boolean for
the persistent flag.
This is the same size as the original boolean but allows ut to add additional flags for the database
Ronnie Sahlberg [Tue, 23 Aug 2011 00:41:52 +0000 (10:41 +1000)]
ReadOnly: Check the readonly flag instead of whether the tdb pointer is NULL or not
Ronnie Sahlberg [Tue, 23 Aug 2011 00:37:20 +0000 (10:37 +1000)]
ReadOnly: add description of readonly records
Ronnie Sahlberg [Wed, 17 Aug 2011 06:14:57 +0000 (16:14 +1000)]
ReadOnly: clear out the tracking record once a revoke is completed
Ronnie Sahlberg [Thu, 21 Jul 2011 05:59:37 +0000 (15:59 +1000)]
ReadOnly: When the client wants a readwrite lock but the local node is the dmaster and also have delegations active we must send a CALL to the local daemon to trigger it to revoke the delegations
Ronnie Sahlberg [Thu, 21 Jul 2011 05:58:56 +0000 (15:58 +1000)]
ReadOnly: Change the update_record test tool to use the new fetchlock routine that can do either normal or readonly fetchlock
Ronnie Sahlberg [Wed, 20 Jul 2011 05:47:15 +0000 (15:47 +1000)]
ReadOnly: Add a test tool that requests a readonly delegation in a loop
Ronnie Sahlberg [Wed, 20 Jul 2011 05:43:55 +0000 (15:43 +1000)]
ReadOnly: Add a test tool to fetch a record, requesting a readonly delegation and lock the record once
Ronnie Sahlberg [Wed, 20 Jul 2011 05:37:37 +0000 (15:37 +1000)]
ReadOnly: Add clientside code to fetch readonly records
Ronnie Sahlberg [Wed, 20 Jul 2011 05:31:44 +0000 (15:31 +1000)]
ReadOnly: Add a ctdb_ltdb_fetch_readonly() helper function
Ronnie Sahlberg [Wed, 20 Jul 2011 05:17:29 +0000 (15:17 +1000)]
ReadOnly: Add handlign of readonly requests readwrite requests, delegations and revoking of delegation to the processing loop for CALL requests coming in from a local client via domain socket
Ronnie Sahlberg [Wed, 20 Jul 2011 05:13:47 +0000 (15:13 +1000)]
ReadOnly: Add processing for ReadOnly delegation requests and revoke requests to the processing loop for CALL packets we receive from different nodes.
This implements the ReadOnly and ReadWrite request processing, delegation and revoking of delegations for all requests coming in across the network from a remote node.
Ronnie Sahlberg [Wed, 20 Jul 2011 04:25:29 +0000 (14:25 +1000)]
ReadOnly: Once recovery has finished, make sure to free all revoke child processes and trigger the destructors for all deferred calls to re-queue the original packets to the input packet processing function
Ronnie Sahlberg [Wed, 20 Jul 2011 04:23:05 +0000 (14:23 +1000)]
ReadOnly: When releasing all deferred calls that blocked during revoke of all previous delegations, add a 1 second grace/delay for any new readonly delegation requests so that the read-write fetch-lock porcess has a chance to make progress
Ronnie Sahlberg [Wed, 20 Jul 2011 04:21:04 +0000 (14:21 +1000)]
ReadOnly: Add a new flag to call request packet to indicate that the client wants a readonly delegation
Ronnie Sahlberg [Tue, 23 Aug 2011 00:27:31 +0000 (10:27 +1000)]
ReadOnly: Add a function to start a revoke of all delegations for a record.
This triggers a child process to be created to perform the actual potentially blocking calls that are required.
Ronnie Sahlberg [Wed, 20 Jul 2011 03:49:17 +0000 (13:49 +1000)]
ReadOnly: Add functions to register CALLs to a context used to handle deferal of processing of CALL commands.
Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again.
This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations.
The data is passed through several layers of callbacks, and finally a timed event callback to ensure that the processing of the packet will be restarted again at the topmost eventloop, avoinding event loop nesting.
Ronnie Sahlberg [Wed, 20 Jul 2011 03:30:12 +0000 (13:30 +1000)]
ReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write the record and header back to the tdb (for example we do when performing dmaster migrations)
Ronnie Sahlberg [Wed, 20 Jul 2011 03:20:32 +0000 (13:20 +1000)]
ReadOnly: After recovering all databases, make sure to clear out the tracking database used to track delegations and revoke. This is because the recovery will implicitely result in a revoke of all delegations.
Ronnie Sahlberg [Wed, 20 Jul 2011 03:15:48 +0000 (13:15 +1000)]
ReadOnly: Add "readonly" flag to the ctdb_db_context to indicate if this database supports readonly operations or not. Add a private lock-less tdb file to the ctdb_db_context to use for tracking delegarions for records
Assume all databases will support readonly mode for now and se thte flag for all databases. At later stage we will add support to control on a per database level whether delegations will be supported or not.
Ronnie Sahlberg [Wed, 20 Jul 2011 03:08:21 +0000 (13:08 +1000)]
ReadOnly: After performing a recovery, clear out all flags related to readonly delegations and revoke
Ronnie Sahlberg [Wed, 20 Jul 2011 02:30:33 +0000 (12:30 +1000)]
ReadOnly: Add a new command 'ctdb cattdb'. This fucntion differs from 'ctdb catdb' in that 'cattdb' will always traverse the local tdb file only, while 'catdb' does a cluster traverse.
Since some record flags may differ between nodes in the cluster when read only delegations are in use, cattdb is needed when you need to know the exact flag settings on the current node itself.
Ronnie Sahlberg [Wed, 20 Jul 2011 02:21:33 +0000 (12:21 +1000)]
ReadOnly: Add printing of the record flags when we are traversing a database to print its content.
Ronnie Sahlberg [Wed, 20 Jul 2011 02:17:27 +0000 (12:17 +1000)]
ReadOnly: Add 4 new record flags to handle read only delegation and revoking of delegations
Ronnie Sahlberg [Wed, 20 Jul 2011 02:13:53 +0000 (12:13 +1000)]
ReadOnly: add a new test tool that does a fetchlock on a record, then bunps the RSN by 10 and writes the new content to the record as sprintf("%d", rsn)
Ronnie Sahlberg [Wed, 20 Jul 2011 02:06:37 +0000 (12:06 +1000)]
ReadOnly: Add clientside functions to send the UPDATE_RECORD control
Ronnie Sahlberg [Wed, 20 Jul 2011 01:50:14 +0000 (11:50 +1000)]
ReadOnly: Add test tool to validate the functions to manipulate and enumerate the bitmap of nodes to where we have readonly delegations
Ronnie Sahlberg [Wed, 20 Jul 2011 01:39:50 +0000 (11:39 +1000)]
ReadOnly: Add helper functions to manipulate a TDB_DATA as a bitmap for nodes that we are tracking as having a readonly delegation
Ronnie Sahlberg [Wed, 20 Jul 2011 01:27:05 +0000 (11:27 +1000)]
ReadOnly records: Add a new RPC function FETCH_WITH_HEADER.
This function differs from the old FETCH in that this function will also fetch the record header and not just the record data
Volker Lendecke [Mon, 31 Oct 2011 12:29:13 +0000 (13:29 +0100)]
Add CTDB_CONTROL_CHECK_SRVID
Ronnie Sahlberg [Mon, 28 Nov 2011 02:56:30 +0000 (13:56 +1100)]
Recover Persistent database DB by DB and not record by record
Add a new tunable that changes the mode how persistent databases are recovered.
RecoveryPDBBySeqNum
When set to 1, persistent databases will be recovered in whole from the node which
has the highest "__db_sequence_number__" record.
This record is managed by samba for those databases where we do persistent writes and have
inter-record relations.
For these databases we do not want the usual "blend records from all nodes based
on individual record RSN" but instead a mode where we pick one instance of the persistent database.
If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN".
Some persistent databases do not contain record interrelations and as such does not
contain this special record at all.
Ronnie Sahlberg [Sun, 27 Nov 2011 23:57:39 +0000 (10:57 +1100)]
LibCTDB: add get persistent db seqnum control
Ronnie Sahlberg [Sun, 27 Nov 2011 23:41:17 +0000 (10:41 +1100)]
DB Seqnum: must provide a ctdb_ltdb_header when calling ctdb_ltdb_fetch()
Ronnie Sahlberg [Thu, 17 Nov 2011 02:34:29 +0000 (13:34 +1100)]
Eventscripts: Add special -ECANCELED status for monitor events that are cancelled
When a monitor event is canceled by a higher priority script, make sure we return
status -ECANCELED to the callback in ctdB_monitor.c
Also treat -ECANCELED as a simple "try monitor event again" and skip modifying any HEALTHY/UNHEALTHY flags when this happens
Martin Schwenke [Tue, 1 Nov 2011 09:52:57 +0000 (20:52 +1100)]
LCP IP allocation algorithm - try harder to find a candidate source node
There's a bug in LCP2. Selecting the node with the highest imbalance
doesn't always work. Some nodes can have a high imbalance metric
because they have a lot of IPs. However, these nodes can be part of a
group that is perfectly balanced. Nodes in another group with less
IPs might actually be imbalanced.
Instead of just trying the source node with the highest imbalance this
tries them in descending order of imbalance until it finds one where
an IP can be moved to another node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 1 Nov 2011 08:49:38 +0000 (19:49 +1100)]
LCP IP allocation algorithm - new function lcp2_failback_candidate()
There's a bug in LCP2. Selecting the node with the highest imbalance
doesn't always work. Some nodes can have a high imbalance metric
because they have a lot of IPs. However, these nodes can be part of a
group that is perfectly balanced. Nodes in another group with less
IPs might actually be imbalanced.
Factor out the code from lcp2_failback() that actually takes a node
and decides which address should be moved to which node.
This is the first step in fixing the above bug.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 7 Nov 2011 19:55:46 +0000 (06:55 +1100)]
Record Fetch Collapse: Collapse multiple fetch request into one single request.
When multiple clients fetch the same record concurrently, send only one single
fetch across the network and deferr all other fetches locally.
This improves performance for hot records and reduces cpu load on ctdb.
Ronnie Sahlberg [Mon, 17 Oct 2011 04:27:41 +0000 (15:27 +1100)]
new version 1.2.38
Martin Schwenke [Fri, 14 Oct 2011 06:29:35 +0000 (17:29 +1100)]
Eventscripts: Work around 50.samba autostop failure
This hack is shameful but the eventscript really needs to be split to
avoid this type of rubbish - 1 service per eventscript!
There are a few ways of hacking the conditions and this doesn't seem
much worse than any of the others...
This works... but on autostart will still restart the service that
isn't being autostarted. That's a separate bug.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 17 Oct 2011 01:11:54 +0000 (12:11 +1100)]
S1031575
When performing addip we dont allow "gratious failvoer" which can, due to timing, and depending on which order the "ctdb addip ..." is called on the nodes lead to imperfect balancing of the ip addresses when addigng several at the same time.
This patch makes sure that once the ip address is added to a node, any node, this ip address is released from the node currently hosting the address and there will possibly be a failover after a short while while performing the rebalance of the ip address.
This means that when performing "ctdb addip ..." and adding it to a new node, this could affect/disrupt the i/o on this address to the node currently hosting the address, but
it will mean we do get a more even distribution after the assignment.
This is based on the assumption that it will be more common to "add completely new ip to a set of nodes" rather than "add an ip address that is already in service to a brand new node"
Ronnie Sahlberg [Thu, 13 Oct 2011 06:16:46 +0000 (17:16 +1100)]
new version 1.2.37
Martin Schwenke [Fri, 7 Oct 2011 04:00:42 +0000 (15:00 +1100)]
Make ctdb_diagnostics more resilient to uncontactable nodes.
Current behaviour is for onnode to timeout (for about 20s) for each
attempted ssh to a down node. With 40 or 50 invocations of onnode
this takes a long time.
2 changes to work around this:
* If EXTRA_SSH_OPTS (which is passed to ssh by onnode) does not
contains a ConnectTimeout= setting then add a setting for a 5 second
timeout.
* Filter the nodes before starting any diagnosis, taking out any "bad
nodes" that are uncontactable via onnode.
In the nodes summary at the beginning of the output, print
information about any "bad nodes".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Thu, 22 Sep 2011 05:16:06 +0000 (15:16 +1000)]
New version 1.2.36
Ronnie Sahlberg [Thu, 22 Sep 2011 05:13:26 +0000 (15:13 +1000)]
One of the entry points to release an ip reset the pnn field before invoking the eventscript.
this triggered a check for "only run the eventscript if we host the address" to trigger and shortcir=cuit calling the eventscript.
An effect of this would be that 'ctdb delip' would remove the ip from ctdb, but fail to delete it from the interface.
S1028798
Ronnie Sahlberg [Thu, 8 Sep 2011 05:05:56 +0000 (15:05 +1000)]
new version 1.2.35
Ronnie Sahlberg [Wed, 7 Sep 2011 23:28:33 +0000 (09:28 +1000)]
Drop loglevel for a tevent message from FATAL to ERROR
CQ S1028400
Michael Adam [Fri, 2 Sep 2011 14:42:10 +0000 (16:42 +0200)]
Add a tunable "AllowClientDBAttach" with default value 1.
When set to 0, clients will not be able to attach to databases
via the db_attach control. This might can be useful for maintenance
where ctdb should be kept running but clients should not be able
to modify databases.
Ronnie Sahlberg [Fri, 26 Aug 2011 06:25:21 +0000 (16:25 +1000)]
New version 1.2.34
Ronnie Sahlberg [Fri, 26 Aug 2011 00:03:47 +0000 (10:03 +1000)]
Sometimes external services will activate Samba in ctdb without doing hte proper and correct
initialization and setup, which leads to the state directory never become created.
Workaround this configuration issue by always creating the direcotry for now
S1028573
Martin Schwenke [Fri, 19 Aug 2011 04:20:58 +0000 (14:20 +1000)]
Eventscripts - ctdb_check_tcp_ports() bug fix.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 19 Aug 2011 03:55:55 +0000 (13:55 +1000)]
Eventscripts - fix debugging buglet in ctdb_check_tcp_ports_ctdb()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 17 Aug 2011 04:02:45 +0000 (14:02 +1000)]
Eventscripts - new default TCP port checker using "ctdb checktcpport"
New function ctdb_check_tcp_ports_ctdb(). This should be fast... and
is now the default checker. If it fails in an unexpected way we fall
back to the nmap and netstat checkers.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 17 Aug 2011 02:12:20 +0000 (12:12 +1000)]
Eventscripts - generalise TCP port checking plus new nmap-based checker
Split the netstat-specific parts of ctdb_check_tcp_ports() into new
function ctdb_check_tcp_ports_netstat().
Implement new ctdb_check_tcp_ports_nmap() function that uses
"nmap -PS" to check if the desired ports are listening.
ctdb_check_ctdb_ports() now uses new configuration variable
CTDB_TCP_PORT_CHECKERS to decide which port checkers to try. Default
value is currently "nmap netstat". If nmap is not found then this
will fall back to netstat. This indicates that either nmap should be
installed or the default value of CTDB_TCP_PORT_CHECKERS should be
changed (in a configuration file) to avoid trying to use nmap.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Wed, 17 Aug 2011 00:16:35 +0000 (10:16 +1000)]
Add a new command 'ctdb checktcpport <port>'
that tries to bind to the specified port on INADDR_ANY.
This can be used for testing if a service is listening to that port or not.
Errors are printed to stdout and the returned status code is either 0 : if we managed to bind to the port (in which case the service is NOT listening on that bort) or the value of errno that stopped us from binding to a port.
errno for EADDRINUSE is 98 so a script using this command should check the status code against the value 98.
If this command returns 98 it means the service is listening to the specified port.
Ronnie Sahlberg [Wed, 17 Aug 2011 00:00:28 +0000 (10:00 +1000)]
Revert "Add new eventscript 40.fs_use that can be used to monitor file system use and flag a node unhealthy when they become full"
This reverts commit
b65efc69317756fe60479c0df58c874da3fde6db.
Ronnie Sahlberg [Tue, 16 Aug 2011 23:59:42 +0000 (09:59 +1000)]
dont use a too big persistence timeout value
Ronnie Sahlberg [Mon, 15 Aug 2011 00:27:49 +0000 (10:27 +1000)]
new version 1.2.33
Ronnie Sahlberg [Mon, 15 Aug 2011 00:23:50 +0000 (10:23 +1000)]
Change the default for ip failover to be LCP2 and not DeterministicIPs
Ronnie Sahlberg [Sun, 14 Aug 2011 03:50:12 +0000 (13:50 +1000)]
remove the nfs share check completely from the 1.2 branch
Ronnie Sahlberg [Thu, 11 Aug 2011 01:45:59 +0000 (11:45 +1000)]
When starting and stopping ctdb through the init-script, make sure we first clear all public ips bvefore we start the daemon, in case they are still hanging around since a previous kill -9 and also make sure we drop them after we have stopped the deamon when shutting down
CQ S1027550
Ronnie Sahlberg [Thu, 11 Aug 2011 00:09:52 +0000 (10:09 +1000)]
document the new check for file system use
Ronnie Sahlberg [Thu, 11 Aug 2011 00:00:53 +0000 (10:00 +1000)]
Add new eventscript 40.fs_use that can be used to monitor file system use and flag a node unhealthy when they become full
Ronnie Sahlberg [Wed, 10 Aug 2011 23:11:38 +0000 (09:11 +1000)]
make the persistent even longer for lvs to make people even happier
Ronnie Sahlberg [Wed, 10 Aug 2011 21:14:57 +0000 (07:14 +1000)]
increase the persistent timeout to make people happier
Ronnie Sahlberg [Wed, 10 Aug 2011 21:13:28 +0000 (07:13 +1000)]
check the shares if they are available before we decide to try to restart nfs
CQ S1027529
Martin Schwenke [Mon, 8 Aug 2011 03:13:59 +0000 (13:13 +1000)]
Eventscripts: New configuration variable CTDB_SERVICE_AUTOSTARTSTOP.
Some of the current auto-start/stop logic is broken, particularly for
Samba. Fixing it is non-trivial.
If $CTDB_SERVICE_AUTOSTARTSTOP is "yes" then auto-start/stop services
when told to newly manage or no longer manage them. This defaults to
"yes".
However, if using a canned configuration file that doesn't set
$CTDB_SERVICE_AUTOSTARTSTOP then this stops the auto-start-stop logic
from working. Therefore, this works around CQ S1026685 - on the
system in question another daemon controls service auto-start/stop and
CTDB just gets in the way.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 5 Aug 2011 06:39:57 +0000 (16:39 +1000)]
Eventscripts - weaken TCP port check message if CTDB has just been started.
Sometimes smbd and other services can take a while to start,
especially when there is a lot of activity after ctdbd has just
started. The TCP port check can then pollute the logs with lots of
"ERROR" messages and possibly extra debug.
This creates a flag file when a service is started (but not restarted)
and this flag is removed the first time that TCP port checks succeed
for that service. When a port check fails and the flag file still
exists, a less extreme "INFO" message is printed rather than the usual
"ERROR" message. This means that until the node actually becomes
healthy we see more friendly messages.
The subtext is that we're hearing false positive reports "recreates"
of CQ S1024874 (samba stopped responding on port 445) quite often when
ctdbd is started. This reduces the chances of people reporting such
false recreates...
Signed-off-by: Martin Schwenke <martin@meltin.net>
David Disseldorp [Sun, 31 Jul 2011 01:14:54 +0000 (03:14 +0200)]
io: Make queue_io_read() safe for reentry
queue_io_read() may be reentered via the queue callback, recoverd is
particularly guilty of this.
queue_io_read() is not safe for reentry if more than one packet is
received and partial chunks follow - data read off the pipe on re-entry
is assumed to be the start-of-packet four byte length. This leads to a
wrongly aligned stream and the notorious "Invalid packet of length 0"
errors.
This change fixes queue_io_read() to be safe under reentry, only a
single packet is processed per call.
https://bugzilla.samba.org/show_bug.cgi?id=8319
Ronnie Sahlberg [Fri, 5 Aug 2011 00:03:34 +0000 (10:03 +1000)]
Remove a log message about setting linkstate for an unknown interface.
sometimes we do want to try to set the linkstate for interfaces that are not in use by public addresses right now (but posisbly by other mechanisms) and these messages just spam the logs
S1026357
Ronnie Sahlberg [Thu, 4 Aug 2011 03:49:02 +0000 (13:49 +1000)]
remove log message we dont need
S1026492
Ronnie Sahlberg [Thu, 4 Aug 2011 03:47:52 +0000 (13:47 +1000)]
remove a non-error logmessage about persistent databases being healthy, as expected
S1026492
Ronnie Sahlberg [Thu, 4 Aug 2011 03:46:12 +0000 (13:46 +1000)]
remove a log message we dont need about "allow clients to attach to databases"
S1026492
Ronnie Sahlberg [Thu, 4 Aug 2011 03:44:25 +0000 (13:44 +1000)]
Change the message when we start the daemon to "CTDB starting on node"
S1026492
Martin Schwenke [Mon, 1 Aug 2011 03:37:06 +0000 (13:37 +1000)]
Eventscripts - 10.interfaces should not check orphaned interfaces.
If the last IP address on an interfaces is removed then that
interfaces should no longer be checked by 10.interfaces. However,
"ctdb ifaces" still lists such interfaces so they are currently
checked.
The problem really needs to be addressed in ctdbd but a neat quick
eventscript fix will be minimally invasive...
This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y
ifaces". The former includes details of all public addresses and
associated interfaces, so when an address is removed there is no
output for it. This avoids orphaned interfaces from being listed.
The logic is also slightly improved so that $IFACES includes just a
(non-uniquified) list of interfaces, allowing an existing loop to be
removed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Thu, 28 Jul 2011 22:57:51 +0000 (08:57 +1000)]
new version 1.2.32
Martin Schwenke [Thu, 28 Jul 2011 05:22:42 +0000 (15:22 +1000)]
Tests: Initial test code for LCP2 IP allocation algorithm.
Move struct ctdb_public_ip_list to ctdb_private.h and put some
definitions for some functions from ctdb_takeover.c there. This
allows those functions to be called from unit tests.
Add ctdb_takeover_tests.c and the Makefile support to build it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 28 Jul 2011 05:16:46 +0000 (15:16 +1000)]
IP allocation - add LCP2 algorithm.
The current non-deterministic IP allocation algorithm balances IPs
across the whole cluster. It does not consider different
interfaces/VLANs/subnets, so these different groups of IPs aren't
generally well balanced.
This adds the LCP2 algorithm for IP allocation and allows it to be
enabled by setting the "LCP2PublicIPs" tunable to 1.
The LCP2 algorithm calculates the imbalance of a node by totalling the
squares of the distances between each IP on the node. The IP distance
is defined as the length longest common prefix (LCP) of bits that is
found when comparing 2 IPs. The imbalance of a cluster is the maximum
imbalance for any node. At each step the algorithm selects an
allocation to the IP/node combination that results in the choosing the
allocation that best reduces the imbalance of the cluster.
The implementation splits out the IP allocation part of
ctdb_takeover_run() into new function ctdb_takeover_run_core(), and
then extracts out the basic IP assignment code into new functions
basic_allocate_unassigned() and basic_failback(). 3 new functions
lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement
the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Thu, 28 Jul 2011 22:41:35 +0000 (08:41 +1000)]
Update the delip command
Dont talloc_free(vnn) immediately but postphone it until later when
the eventscript callback has completed.
CQ S1026664
Rusty Russell [Mon, 25 Jul 2011 08:26:06 +0000 (17:56 +0930)]
eventscript: fix callback after free
ctdb_event_script_callback() takes a mem_ctx arg which it doesn't use, but
the implication is pretty clear, that when that mem_ctx is freed, the callback
shouldn't happen. Indeed, Ronnie reproduced a case where that callback
refers to freed memory, in the ip reallocation code under stress.
So attach the callback to the mem_ctx they give us, and remove it from the
script state structure when that's freed. It's a bit weird, but it works.
CQ: S1026179
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Ronnie Sahlberg [Mon, 11 Jul 2011 07:41:12 +0000 (17:41 +1000)]
new version 1.2.31
Martin Schwenke [Tue, 5 Jul 2011 01:32:06 +0000 (11:32 +1000)]
Eventscript functions: optimise ctdb_check_tcp_ports() and add debug.
ctdb_check_tcp_ports() runs "netstat -a -t -n" in a loop for each
port. There are 2 problems with this:
* Netstat is run on each loop iteration when it need only be run once.
* The -a option is used to list all connections but the function only
cares about the listening ports. There may be many thousands of
non-listening ports to grep through.
This changes ctdb_check_tcp_ports() to run netstat with the -l option
instead of the -a option. It also only runs netstat once before the
main loop.
When a port is found to not be listening the output of the netstat
command is now dumped to help with debugging.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Ronnie Sahlberg [Mon, 4 Jul 2011 20:29:00 +0000 (06:29 +1000)]
Add log output to wipedb and backupdb
CQ S1025379
Ronnie Sahlberg [Tue, 3 Aug 2010 03:34:27 +0000 (13:34 +1000)]
When trying to re-balance the ip assignment and shuffle ips from
nodes with many addresses to nodes with few addresses,
loop up to num_ips+5 times instead of only 5 times.
When we have very many public ips per node, we might need to loop more than
5 times or else we will exit without reaching optimal balance.
Ronnie Sahlberg [Tue, 28 Jun 2011 05:39:38 +0000 (15:39 +1000)]
change the name for the key for the record where we stoire the public address config from public-addresses... to public_addresses...
CQ1019030
Ronnie Sahlberg [Sat, 18 Jun 2011 00:47:25 +0000 (10:47 +1000)]
Remove a benign by annoying log message that will be logged after an interface that has been in use has later been removed and is no longer referenced by any public addresses.
CQ S1024495