sahlberg/ctdb.git
15 years ago Don't call ctdb_fatal() just because we are asked to restart a connection 1.0.64 obnox/1.0.64 origin/1.0.64
Ronnie Sahlberg [Wed, 17 Dec 2008 01:01:40 +0000 (12:01 +1100)]
Don't call ctdb_fatal() just because we are asked to restart a connection
to a remote node and ctdb->methods is NULL.

This can happen when we are in the middle of a normal shutdown of the
daemon and we have already shut down the transport layer (thus setting
ctdb->methods == NULL in the transport layer destructor)
and there is some unprocessed data related to a remote node.

This prevents an ugly race condition where ctdb could, on rare occasions,
dump core during "ctdb shutdown".
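
The guard described in this commit could look roughly like the sketch below. The struct and function names are hypothetical stand-ins chosen for illustration; they are not the identifiers used in the ctdb tree. The only point is that a NULL method table is treated as "nothing to do" rather than as a fatal error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transport method table. */
struct transport_methods {
    int (*restart)(unsigned int node_id);
};

/* Hypothetical stand-in for the daemon context. */
struct daemon_ctx {
    /* NULL once the transport layer has been torn down. */
    const struct transport_methods *methods;
};

/* Returns true if a restart was issued, false if it was skipped or failed. */
static bool restart_connection(struct daemon_ctx *ctx, unsigned int node_id)
{
    assert(ctx != NULL);
    if (ctx->methods == NULL) {
        /* Transport already destroyed during shutdown: skip quietly. */
        return false;
    }
    return ctx->methods->restart(node_id) == 0;
}
```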

15 years ago new version 1.0.64-2
Ronnie Sahlberg [Thu, 27 Nov 2008 00:48:43 +0000 (11:48 +1100)]
new version 1.0.64-2

15 years ago Keepalive packets were only sent every KeepaliveInterval if the socket
Ronnie Sahlberg [Thu, 20 Nov 2008 02:35:08 +0000 (13:35 +1100)]
Keepalive packets were only sent every KeepaliveInterval if the socket
had been completely idle during that interval.
If we had been sending other packets such as Messages, Calls or Controls
there wouldn't be any need for an explicit keepalive and thus we didn't
send one.

This does make it somewhat awkward when analyzing traces since it is
non-intuitive when keepalives are sent and when they are not sent.

Change the keepalive logic to always send a keepalive regardless of
whether the link is idle or not.
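
As a rough illustration of the change, the two policies can be written as predicates evaluated once per KeepaliveInterval. Both function names and the packet counter are assumptions made for this sketch, not code from ctdb.

```c
#include <stdbool.h>

/* Old policy as described: send a keepalive only if the link was idle. */
static bool should_send_keepalive_old(unsigned int packets_sent_in_interval)
{
    return packets_sent_in_interval == 0;
}

/* New policy: always send one keepalive per interval. */
static bool should_send_keepalive_new(unsigned int packets_sent_in_interval)
{
    (void)packets_sent_in_interval; /* traffic no longer matters */
    return true;
}
```

With the new policy a trace shows one keepalive per interval regardless of other traffic, which is what makes it easier to read.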

15 years ago new version 1.0.64_1
Ronnie Sahlberg [Thu, 20 Nov 2008 23:33:12 +0000 (10:33 +1100)]
new version 1.0.64_1

15 years ago fixed problem with looping ctdb recoveries
Andrew Tridgell [Thu, 20 Nov 2008 21:05:59 +0000 (08:05 +1100)]
fixed problem with looping ctdb recoveries

After a node failure, GPFS can get into a state where non-blocking
fcntl() locks can take a long time. This causes the ctdb set_recmode
test to time out, which leads to a recovery failure, and a new
recovery. The recovery loop can last a long time.

The fix is to consider a fcntl timeout as a success of this test. The
test is to see that we can't lock the shared reclock file, so a
timeout counts as a success.
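
The decision rule can be sketched as follows. The enum and function are hypothetical; how ctdb actually performs the probe and detects the timeout is not shown here.

```c
#include <stdbool.h>

/* Hypothetical outcomes of trying to lock the shared reclock file. */
enum lock_probe {
    PROBE_LOCK_ACQUIRED, /* we got the lock: nobody else holds it */
    PROBE_LOCK_DENIED,   /* lock is held elsewhere, as expected */
    PROBE_TIMED_OUT      /* fcntl() did not answer in time */
};

/* The test passes when we could NOT take the lock. After the fix,
 * a timeout is counted as a pass as well. */
static bool reclock_check_passed(enum lock_probe result)
{
    switch (result) {
    case PROBE_LOCK_DENIED:
    case PROBE_TIMED_OUT:
        return true;
    case PROBE_LOCK_ACQUIRED:
        return false;
    }
    return false;
}
```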

15 years ago new version 1.0.64 ctdb-1.0.64
Ronnie Sahlberg [Wed, 22 Oct 2008 00:06:18 +0000 (11:06 +1100)]
new version 1.0.64

15 years ago add a context and a timed event so that once we have been in recovery
Ronnie Sahlberg [Wed, 22 Oct 2008 00:04:41 +0000 (11:04 +1100)]
add a context and a timed event so that once we have been in recovery
mode for too long we drop all public ip addresses

15 years ago new version 1.0.63 ctdb-1.0.63
Ronnie Sahlberg [Sun, 19 Oct 2008 22:47:54 +0000 (09:47 +1100)]
new version 1.0.63

15 years ago don't log "running periodic cleanup" ...
Ronnie Sahlberg [Sun, 19 Oct 2008 22:45:15 +0000 (09:45 +1100)]
don't log "running periodic cleanup" ...

15 years ago null out the pointer before we reload the nodes file
Ronnie Sahlberg [Fri, 17 Oct 2008 10:38:42 +0000 (21:38 +1100)]
null out the pointer before we reload the nodes file

15 years ago when we reload the nodes file, we may need to reload the nodes file
Ronnie Sahlberg [Fri, 17 Oct 2008 10:18:06 +0000 (21:18 +1100)]
when we reload the nodes file, we may need to reload the nodes file
inside the recovery daemon as well.

15 years ago make it possible to set the script log level in CTDB sysconfig
Ronnie Sahlberg [Thu, 16 Oct 2008 22:02:03 +0000 (09:02 +1100)]
make it possible to set the script log level in CTDB sysconfig

15 years ago specify a "script log level" on the commandline to set under which log
Ronnie Sahlberg [Thu, 16 Oct 2008 20:56:12 +0000 (07:56 +1100)]
specify a "script log level" on the commandline to set under which log
level any/all output from eventscripts will be logged as

15 years ago new version 1.0.62 ctdb-1.0.62
Ronnie Sahlberg [Thu, 16 Oct 2008 06:59:55 +0000 (17:59 +1100)]
new version 1.0.62

15 years ago allow multiple eventscripts using the same prefix.
Ronnie Sahlberg [Thu, 16 Oct 2008 06:57:50 +0000 (17:57 +1100)]
allow multiple eventscripts using the same prefix.
this eases the pain for users that use out of tree eventscripts

15 years ago new version 1.0.61 ctdb-1.0.61
Ronnie Sahlberg [Wed, 15 Oct 2008 05:40:44 +0000 (16:40 +1100)]
new version 1.0.61

15 years ago install the new multipath monitoring event script
Ronnie Sahlberg [Wed, 15 Oct 2008 05:29:09 +0000 (16:29 +1100)]
install the new multipath monitoring event script

15 years ago add an eventscript to monitor that the multipath devices are healthy
Ronnie Sahlberg [Wed, 15 Oct 2008 05:27:33 +0000 (16:27 +1100)]
add an eventscript to monitor that the multipath devices are healthy

15 years ago we must also check the status returned from the get tickles control to
Ronnie Sahlberg [Tue, 14 Oct 2008 21:33:37 +0000 (08:33 +1100)]
we must also check the status returned from the get tickles control to
determine whether it was successful or not

15 years ago lower the loglevel for the informational message that a TCP_ADD operation
Ronnie Sahlberg [Tue, 14 Oct 2008 16:02:09 +0000 (03:02 +1100)]
lower the loglevel for the informational message that a TCP_ADD operation
described an ip address not known to be a public address.

This could happen if someone for genuine reasons accesses a share
through a static ip address.
It can also happen if non-homogeneous public address configurations are
used and when a tcp description is pushed out to a different node that
does not serve/know the specific ip address.

15 years ago change ip route add to route add -net since this works more reliably
Ronnie Sahlberg [Tue, 14 Oct 2008 14:49:19 +0000 (01:49 +1100)]
change ip route add to route add -net since this works more reliably

update the makefile and rpm to install 99.routing

15 years ago new version 1.0.60
Ronnie Sahlberg [Tue, 14 Oct 2008 14:32:46 +0000 (01:32 +1100)]
new version 1.0.60

15 years ago verify that the nodes we try to ban/unban are operational and print an ctdb-1.0.60
Ronnie Sahlberg [Tue, 14 Oct 2008 14:23:57 +0000 (01:23 +1100)]
verify that the nodes we try to ban/unban are operational and print an
error to the user otherwise.

15 years ago Revert "from Mathieu Parent <math.parent@gmail.com>"
Ronnie Sahlberg [Tue, 14 Oct 2008 14:08:29 +0000 (01:08 +1100)]
Revert "from Mathieu Parent <math.parent@gmail.com>"

This reverts commit dc9cd4779db4a89697731e4cf415be51067a07c1.

Conflicts:

15 years ago update the client side of getnodemap and getpublicips controls to
Ronnie Sahlberg [Tue, 14 Oct 2008 13:24:44 +0000 (00:24 +1100)]
update the client side of getnodemap and getpublicips controls to
fall back to the old-style ipv4-only controls if the new-style ipv4/ipv6
control fails.

this allows a 1.0.59+ (ipv4/ipv6) ctdb daemon acting as recmaster to be
compatible with pre-1.0.59 versions of ctdb that are ipv4 only.
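
The fallback pattern can be sketched as below. Everything here (the enum, the callback type, the stub controls) is hypothetical and exists only to show the control flow: try the new-style control first and retry with the old-style one if it fails.

```c
#include <assert.h>
#include <stddef.h>

/* Which variant of the control ended up answering (hypothetical). */
enum control_style { STYLE_NONE, STYLE_NEW, STYLE_OLD };

/* A control call returns 0 on success, non-zero on failure. */
typedef int (*control_fn)(void *ctx);

static enum control_style call_with_fallback(control_fn new_style,
                                             control_fn old_style,
                                             void *ctx)
{
    assert(new_style != NULL && old_style != NULL);
    if (new_style(ctx) == 0) {
        return STYLE_NEW;   /* peer understands ipv4/ipv6 controls */
    }
    if (old_style(ctx) == 0) {
        return STYLE_OLD;   /* peer only speaks the ipv4-only control */
    }
    return STYLE_NONE;      /* both attempts failed */
}

/* Stubs used only to exercise the sketch. */
static int control_ok(void *ctx) { (void)ctx; return 0; }
static int control_unsupported(void *ctx) { (void)ctx; return -1; }
```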

15 years ago update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an
Ronnie Sahlberg [Mon, 13 Oct 2008 23:40:29 +0000 (10:40 +1100)]
update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an
older ipv4-only version of these controls.

We need this so that we are backward compatible with old versions of ctdb
and so that we can interoperate with an ipv4-only recmaster during a
rolling upgrade.

15 years ago from Mathieu Parent <math.parent@gmail.com>
Ronnie Sahlberg [Sun, 12 Oct 2008 21:27:33 +0000 (08:27 +1100)]
from Mathieu Parent <math.parent@gmail.com>
Hi,

I have attached a patch necessary as debian log dir (/var/log) is not
a subdir of VARDIR (/var/lib on rpm systems, /var/lib/ctdb on debian).
As I don't know much about autotools and friends, this patch may be
hacky.

This is part of the process to minimize diff between distributions.

15 years ago From Mathieu Parent
Ronnie Sahlberg [Sun, 12 Oct 2008 21:21:20 +0000 (08:21 +1100)]
From Mathieu Parent
patch to make debian systems log the package versions in
ctdb_diagnostics

15 years ago skip empty lines in the public addresses file, not skip all non-empty ctdb-1.0.59
Ronnie Sahlberg [Tue, 7 Oct 2008 08:34:34 +0000 (19:34 +1100)]
skip empty lines in the public addresses file, not skip all non-empty
lines

15 years ago from Michael Adams : allow #-style comments in the nodes and public
Ronnie Sahlberg [Tue, 7 Oct 2008 08:25:10 +0000 (19:25 +1100)]
from Michael Adams : allow #-style comments in the nodes and public
addresses file
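
A line filter of the kind this change and its follow-up fix describe might look like the sketch below. The function name is hypothetical, and accepting leading whitespace before '#' is an assumption of this sketch rather than documented ctdb behaviour.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a nodes/public-addresses line carries no entry,
 * i.e. it is empty, whitespace only, or a '#' comment. */
static bool line_is_ignorable(const char *line)
{
    assert(line != NULL);
    while (*line != '\0' && isspace((unsigned char)*line)) {
        line++;
    }
    return *line == '\0' || *line == '#';
}
```

A parser would call this for each line and skip the line when it returns true; getting that condition inverted is the kind of bug that skips every non-empty line instead.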

15 years ago new version 1.0.59
Ronnie Sahlberg [Tue, 7 Oct 2008 07:23:12 +0000 (18:23 +1100)]
new version 1.0.59

15 years ago remove an unused variable
Ronnie Sahlberg [Tue, 7 Oct 2008 07:14:44 +0000 (18:14 +1100)]
remove an unused variable

15 years ago When we reload the nodes file
Ronnie Sahlberg [Tue, 7 Oct 2008 07:12:54 +0000 (18:12 +1100)]
When we reload the nodes file
instead of shutting down/restarting the entire tcp layer
just bounce all outgoing connections and reconnect

15 years ago add a new eventscript : 99.routing that is used to add static routes to
Ronnie Sahlberg [Tue, 7 Oct 2008 00:03:30 +0000 (11:03 +1100)]
add a new eventscript : 99.routing that is used to add static routes to
interfaces when they are activated (an ip address is added during
takeip)

15 years ago merged a bugfix for the idtree code from the Linux kernel. This
Andrew Tridgell [Tue, 30 Sep 2008 14:09:06 +0000 (07:09 -0700)]
merged a bugfix for the idtree code from the Linux kernel. This
matches commit 7aae6dd80e265aa9402ed507caaff4a5dba55069 in the kernel.

Many thanks to Jim Houston for pointing out this fix to us

15 years ago Check that a database exists first before we dump its content (and
Ronnie Sahlberg [Mon, 22 Sep 2008 15:38:28 +0000 (01:38 +1000)]
Check that a database exists first before we dump its content (and
implicitly also create it) using 'ctdb catdb'

15 years ago expanded ctdb_diagnostics based on recent experience
Andrew Tridgell [Wed, 17 Sep 2008 11:00:04 +0000 (21:00 +1000)]
expanded ctdb_diagnostics based on recent experience

15 years ago use the correct tunable: failcount, not timeout
Ronnie Sahlberg [Wed, 17 Sep 2008 04:24:12 +0000 (14:24 +1000)]
use the correct tunable: failcount, not timeout

15 years ago The ctdb daemon keeps track of whether the recovery process is running
Ronnie Sahlberg [Wed, 17 Sep 2008 04:17:41 +0000 (14:17 +1000)]
The ctdb daemon keeps track of whether the recovery process is running
correctly by measuring how long it was since the last successful
communication with the recovery daemon was recorded.

After a certain timeout the ctdb daemon would deem the recovery daemon
as inoperable and shut down.

If the system clock is suddenly changed forward by many (60 or more)
seconds this could cause the timeout to trigger prematurely/immediately
where ctdb would incorrectly think that more than 60 seconds had passed
since last successful communications and thus abort.

Instead of checking for one timeout occurring, only deem the recovery
daemon to be "down" and trigger a shutdown if communications have
timed out for three intervals in a row.
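
The "three intervals in a row" rule amounts to a small consecutive-miss counter. The sketch below uses hypothetical names; only the threshold of three comes from the commit text. A single clock jump can produce at most one spurious miss, so it no longer triggers a shutdown on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MISSED_INTERVALS 3u /* "three intervals in a row" */

/* Hypothetical monitor state kept by the main daemon. */
struct recd_monitor {
    unsigned int missed; /* consecutive intervals without contact */
};

/* Call once per check interval. Returns true when the recovery
 * daemon should be deemed down and a shutdown triggered. */
static bool recd_monitor_tick(struct recd_monitor *m, bool heard_from_recd)
{
    assert(m != NULL);
    if (heard_from_recd) {
        m->missed = 0; /* any successful contact resets the count */
        return false;
    }
    m->missed++;
    return m->missed >= MAX_MISSED_INTERVALS;
}
```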

15 years ago fix a slow memory leak in the recovery daemon in the error paths for the
Ronnie Sahlberg [Mon, 15 Sep 2008 23:00:48 +0000 (09:00 +1000)]
fix a slow memory leak in the recovery daemon in the error paths for the
memdump function

15 years ago fix some slow memory leaks in the vacuuming handler in the recovery
Ronnie Sahlberg [Mon, 15 Sep 2008 21:55:57 +0000 (07:55 +1000)]
fix some slow memory leaks in the vacuuming handler in the recovery
daemon

15 years ago From Volker L
Ronnie Sahlberg [Mon, 15 Sep 2008 20:50:28 +0000 (06:50 +1000)]
From Volker L
Fix a slow memory leak in the recovery daemon if there is a recovery
triggered during the public ip reassignment process

15 years ago updates to the precompiled documentation
Ronnie Sahlberg [Sun, 14 Sep 2008 21:04:26 +0000 (07:04 +1000)]
updates to the precompiled documentation

15 years ago Document the new descriptive node specifications.
Martin Schwenke [Fri, 12 Sep 2008 08:20:52 +0000 (18:20 +1000)]
Document the new descriptive node specifications.

Signed-off-by: Martin Schwenke <martin@meltin.net>

15 years ago onnode changes. "ok" is an alias for "healthy", "con" is an alias for
Martin Schwenke [Fri, 12 Sep 2008 06:55:18 +0000 (16:55 +1000)]
onnode changes.  "ok" is an alias for "healthy", "con" is an alias for
"connected".  Allow "rm" or "recmaster" to be a nodespec for the
recovery master. Better error handling for interaction with ctdb
client.

Signed-off-by: Martin Schwenke <martin@meltin.net>

15 years ago Merge commit 'origin/master' into for-ronnie
Martin Schwenke [Fri, 12 Sep 2008 08:21:51 +0000 (18:21 +1000)]
Merge commit 'origin/master' into for-ronnie

15 years ago add a new ctdb command "ctdb recmaster"
Ronnie Sahlberg [Fri, 12 Sep 2008 02:06:53 +0000 (12:06 +1000)]
add a new ctdb command "ctdb recmaster"
this shows the node id of the current recmaster

15 years ago Changes to onnode. Add "healthy" and "connected" as possible
Martin Schwenke [Fri, 12 Sep 2008 01:22:50 +0000 (11:22 +1000)]
Changes to onnode.  Add "healthy" and "connected" as possible
nodespecs.  Since we're now explicitly using bash, use local variables
when sensible.

Signed-off-by: Martin Schwenke <martin@meltin.net>

15 years ago Merge commit 'origin/master' into for-ronnie
Martin Schwenke [Fri, 12 Sep 2008 01:26:25 +0000 (11:26 +1000)]
Merge commit 'origin/master' into for-ronnie

15 years ago Minor documentation fixes.
Martin Schwenke [Fri, 12 Sep 2008 00:36:15 +0000 (10:36 +1000)]
Minor documentation fixes.

Signed-off-by: Martin Schwenke <martin@meltin.net>

15 years ago lower the debuglevel when logging unknown idr in responses
Ronnie Sahlberg [Tue, 9 Sep 2008 03:59:48 +0000 (13:59 +1000)]
lower the debuglevel when logging unknown idr in responses

15 years ago lower the debug level for when printing that the nodeflags have changed
Ronnie Sahlberg [Tue, 9 Sep 2008 03:55:31 +0000 (13:55 +1000)]
lower the debug level for when printing that the nodeflags have changed

15 years ago additional monitoring between the two daemons.
Ronnie Sahlberg [Tue, 9 Sep 2008 03:44:46 +0000 (13:44 +1000)]
additional monitoring between the two daemons.

we currently only monitor that the daemons are running by kill(pid, 0)
and verifying that the domain socket between them is ok.

this is not sufficient since we can have a situation where the recovery
daemon is hung.

this new code monitors that the recovery daemon is operating.
if the recovery daemon hangs, we log this and shut down the main daemon
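
For reference, the existence-only check mentioned above can be sketched with the POSIX kill() call and signal 0. The helper name is hypothetical. The sketch also shows why the check is too weak: a hung process still exists, so it still passes.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h> /* getpid(), used in the usage note */

/* Returns true if a process with this pid exists. Signal 0 performs
 * the permission and existence checks without delivering a signal.
 * EPERM means the process exists but we may not signal it. */
static bool process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0) {
        return true;
    }
    return errno == EPERM;
}
```

For example, process_exists(getpid()) is true for the calling process whether or not it is making progress, which is the gap the additional monitoring closes.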

15 years ago From C Cowan.
Ronnie Sahlberg [Sun, 7 Sep 2008 22:57:42 +0000 (08:57 +1000)]
From C Cowan.
Patch to make AIX compile with the new ipv6 additions.

15 years ago zero out the address structure to keep valgrind happy
Ronnie Sahlberg [Fri, 29 Aug 2008 02:26:02 +0000 (12:26 +1000)]
zero out the address structure to keep valgrind happy

15 years ago new version 1.0.58 ctdb-1.0.58
Ronnie Sahlberg [Wed, 27 Aug 2008 00:26:34 +0000 (10:26 +1000)]
new version 1.0.58

15 years ago rename ctdb_tcp_client back to the original name ctdb_control_tcp
Ronnie Sahlberg [Wed, 27 Aug 2008 00:24:35 +0000 (10:24 +1000)]
rename ctdb_tcp_client back to the original name ctdb_control_tcp

15 years ago From Abhijith Das <adas@redhat.com>:
Ronnie Sahlberg [Mon, 25 Aug 2008 00:13:18 +0000 (10:13 +1000)]
From Abhijith Das <adas@redhat.com>:

Fixup the initscript so it passes rpm-lint

15 years ago Add a "reload" option to the initscript.
Ronnie Sahlberg [Mon, 25 Aug 2008 00:03:16 +0000 (10:03 +1000)]
Add a "reload" option to the initscript.

15 years ago add a link to my webpage
Ronnie Sahlberg [Sun, 24 Aug 2008 23:41:08 +0000 (09:41 +1000)]
add a link to my webpage

15 years ago version 1.0.57 : initial ipv6 support ctdb-1.0.57
Ronnie Sahlberg [Sun, 24 Aug 2008 22:52:29 +0000 (08:52 +1000)]
version 1.0.57 : initial ipv6 support

15 years ago Do not fail the takeip event if the "ip addr add ..." command failed. ipv6-test obnox/ipv6-test origin/ipv6-test
Ronnie Sahlberg [Thu, 21 Aug 2008 23:25:47 +0000 (09:25 +1000)]
Do not fail the takeip event if the "ip addr add ..." command failed.
Let the event complete successfully. The local recovery daemon will check that we have the address and reissue takeip otherwise.

There are several reasons why "ip addr add" can fail. One is a misconfiguration;
another is that for ipv6 the stack is a lot more picky than for ipv4. For example, this WILL fail in ipv6 if there is a duplicate ip address on the network.

Thus this check could cause rolling recoveries, which is why it has to go.

15 years ago when we collect all ip addresses and sort them for the "ctdb ip -n all" output we...
Ronnie Sahlberg [Thu, 21 Aug 2008 23:09:08 +0000 (09:09 +1000)]
when we collect all ip addresses and sort them for the "ctdb ip -n all" output we must look at more than just the first 4 bytes of the sockaddr address or ipv6 won't work

15 years ago When we harvest all tcp connections to kill off after a takeip/releaseip event we...
Ronnie Sahlberg [Wed, 20 Aug 2008 02:50:50 +0000 (12:50 +1000)]
When we harvest all tcp connections to kill off after a takeip/releaseip event we must also harvest the ipv4 connections which may be presented in ::ffff:xxxx:xxxx form by netstat

15 years ago we must canonicalize the sockaddr structures in killtcp so that we do the necessary...
Ronnie Sahlberg [Wed, 20 Aug 2008 02:02:54 +0000 (12:02 +1000)]
we must canonicalize the sockaddr structures in killtcp so that we do the necessary downgrade if required

15 years ago make the function to canonicalize a sockaddr structure public
Ronnie Sahlberg [Wed, 20 Aug 2008 01:58:27 +0000 (11:58 +1000)]
make the function to canonicalize a sockaddr structure public

15 years ago when we compare ip addresses in ctdb_same_ip we must first canonicalize the addresses...
Ronnie Sahlberg [Wed, 20 Aug 2008 01:52:36 +0000 (11:52 +1000)]
when we compare ip addresses in ctdb_same_ip we must first canonicalize the addresses so that we realize that 127.0.0.1:22 is really the same thing as ::ffff:127.0.0.1:22

Downgrade all AF_INET6 ::ffff:xxxx:xxxx sockaddresses into AF_INET ones
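
A minimal sketch of such a downgrade, using the standard IN6_IS_ADDR_V4MAPPED macro, is shown below. The helper names are hypothetical and this is not the function from the ctdb tree. The comparison here looks at the address only; comparing ports as well would be a separate check.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Rewrite an AF_INET6 ::ffff:a.b.c.d sockaddr in place as AF_INET. */
static void canonicalize_sockaddr(struct sockaddr_storage *ss)
{
    struct sockaddr_in6 in6;
    struct sockaddr_in v4;

    assert(ss != NULL);
    if (ss->ss_family != AF_INET6) {
        return;
    }
    memcpy(&in6, ss, sizeof(in6));
    if (!IN6_IS_ADDR_V4MAPPED(&in6.sin6_addr)) {
        return;
    }
    memset(&v4, 0, sizeof(v4));
    v4.sin_family = AF_INET;
    v4.sin_port = in6.sin6_port;
    /* The IPv4 address occupies the last 4 of the 16 address bytes. */
    memcpy(&v4.sin_addr, &in6.sin6_addr.s6_addr[12], sizeof(v4.sin_addr));
    memset(ss, 0, sizeof(*ss));
    memcpy(ss, &v4, sizeof(v4));
}

/* Compare two addresses after canonicalizing copies of them. */
static bool same_ip(const struct sockaddr_storage *a,
                    const struct sockaddr_storage *b)
{
    struct sockaddr_storage ca = *a, cb = *b;

    canonicalize_sockaddr(&ca);
    canonicalize_sockaddr(&cb);
    if (ca.ss_family != cb.ss_family) {
        return false;
    }
    if (ca.ss_family == AF_INET) {
        struct sockaddr_in x, y;
        memcpy(&x, &ca, sizeof(x));
        memcpy(&y, &cb, sizeof(y));
        return x.sin_addr.s_addr == y.sin_addr.s_addr;
    }
    if (ca.ss_family == AF_INET6) {
        struct sockaddr_in6 x, y;
        memcpy(&x, &ca, sizeof(x));
        memcpy(&y, &cb, sizeof(y));
        /* All 16 bytes matter for ipv6, not just the first 4. */
        return memcmp(&x.sin6_addr, &y.sin6_addr, sizeof(x.sin6_addr)) == 0;
    }
    return false;
}
```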

15 years ago update the socketkiller in the eventscripts to be able to handle ipv6
Ronnie Sahlberg [Tue, 19 Aug 2008 23:47:00 +0000 (09:47 +1000)]
update the socketkiller in the eventscripts to be able to handle ipv6

15 years ago fix a bug in the tcp socketkiller for ipv6
Ronnie Sahlberg [Tue, 19 Aug 2008 23:23:31 +0000 (09:23 +1000)]
fix a bug in the tcp socketkiller for ipv6

15 years ago fix the ipv6 checksum calculation for pseudoheader so that it actually works
Ronnie Sahlberg [Tue, 19 Aug 2008 08:24:08 +0000 (18:24 +1000)]
fix the ipv6 checksum calculation for pseudoheader so that it actually works

add support to send ipv6 "gratuitous arp" aka neighbor solicitation packets from ctdb
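
For context, the upper-layer checksum over the IPv6 pseudo-header (source address, destination address, 32-bit upper-layer length, three zero bytes, next-header value) can be computed as in the sketch below. The function names are hypothetical and this is not the ctdb implementation; it returns the checksum in host byte order and assumes a payload small enough that the 32-bit accumulator cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add big-endian 16-bit words of data to a running sum. */
static uint32_t sum16(const uint8_t *data, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    }
    if (i < len) {
        sum += (uint32_t)(data[i] << 8); /* pad odd trailing byte */
    }
    return sum;
}

/* One's-complement checksum over pseudo-header plus payload.
 * The checksum field inside payload must be zero when computing. */
static uint16_t ipv6_upper_checksum(const uint8_t src[16],
                                    const uint8_t dst[16],
                                    uint8_t next_header,
                                    const uint8_t *payload, size_t len)
{
    uint32_t sum = 0;

    assert(payload != NULL || len == 0);
    sum = sum16(src, 16, sum);
    sum = sum16(dst, 16, sum);
    sum += (uint32_t)((len >> 16) & 0xffff); /* length, high word */
    sum += (uint32_t)(len & 0xffff);         /* length, low word */
    sum += next_header;                      /* 3 zero bytes + next header */
    sum = sum16(payload, len, sum);
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);  /* fold carries back in */
    }
    return (uint16_t)~sum;
}
```

A quick sanity check is to store the result in the message's checksum field and recompute: a correct message then yields 0.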

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

15 years ago remove a file we don't need
Ronnie Sahlberg [Tue, 19 Aug 2008 04:58:57 +0000 (14:58 +1000)]
remove a file we don't need

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

15 years ago initial ipv6 patch
Ronnie Sahlberg [Tue, 19 Aug 2008 04:58:29 +0000 (14:58 +1000)]
initial ipv6 patch

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

15 years ago use a local tdb_traverse instead of a ctdb_pulldb to lessen the impact of the system...
Ronnie Sahlberg [Thu, 14 Aug 2008 00:57:08 +0000 (10:57 +1000)]
use a local tdb_traverse instead of a ctdb_pulldb to lessen the impact on the system while performing a database backup

15 years ago only freeze the local node when doing a backup and not the entire cluster
Ronnie Sahlberg [Wed, 13 Aug 2008 23:52:23 +0000 (09:52 +1000)]
only freeze the local node when doing a backup and not the entire cluster

15 years ago store the database name, not the backup filename in the database header
Ronnie Sahlberg [Wed, 13 Aug 2008 22:36:39 +0000 (08:36 +1000)]
store the database name, not the backup filename in the database header

15 years ago Encode a file version number in the database backup header
Ronnie Sahlberg [Wed, 13 Aug 2008 22:35:19 +0000 (08:35 +1000)]
Encode a file version number in the database backup header
Encode the database name in the header so we don't need to provide the database
name when doing a restore
Encode a timestamp in the header telling us when the backup was created
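
A header carrying these three fields might be laid out as in the sketch below. The struct, field widths, name limit and version value are all assumptions made for illustration; this is not ctdb's actual on-disk backup format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BACKUP_FORMAT_VERSION 1u /* hypothetical version value */
#define BACKUP_NAME_MAX 256      /* hypothetical name limit */

/* Hypothetical backup file header. */
struct backup_header {
    uint32_t version;                  /* file format version */
    uint64_t timestamp;                /* when the backup was created */
    char     db_name[BACKUP_NAME_MAX]; /* database name, NUL terminated */
};

/* Fill in a header. Returns 0 on success, -1 if the name is too long. */
static int backup_header_init(struct backup_header *h,
                              const char *db_name, time_t now)
{
    size_t len;

    assert(h != NULL && db_name != NULL);
    len = strlen(db_name);
    if (len >= sizeof(h->db_name)) {
        return -1;
    }
    memset(h, 0, sizeof(*h));
    h->version = BACKUP_FORMAT_VERSION;
    h->timestamp = (uint64_t)now;
    memcpy(h->db_name, db_name, len + 1);
    return 0;
}

/* A restore can reject files written in a format it does not know. */
static int backup_header_check(const struct backup_header *h)
{
    assert(h != NULL);
    return h->version == BACKUP_FORMAT_VERSION ? 0 : -1;
}
```

Because the database name travels inside the file, a restore only needs the backup file itself.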

15 years ago Add two new ctdb commands :
Ronnie Sahlberg [Wed, 13 Aug 2008 12:03:29 +0000 (22:03 +1000)]
Add two new ctdb commands :

ctdb backupdb : which will copy a database out from ctdb and write it to a file
ctdb restoredb : which will read a database backup from a file and write it into ctdb

15 years ago fixed merge
Andrew Tridgell [Mon, 11 Aug 2008 14:10:48 +0000 (00:10 +1000)]
fixed merge

15 years ago up release version
Andrew Tridgell [Mon, 11 Aug 2008 13:52:46 +0000 (23:52 +1000)]
up release version

15 years ago new version 1.0.56 ctdb-1.0.56
Ronnie Sahlberg [Mon, 11 Aug 2008 13:50:42 +0000 (23:50 +1000)]
new version 1.0.56

15 years ago Merge commit 'ronnie/master'
Andrew Tridgell [Mon, 11 Aug 2008 13:33:46 +0000 (23:33 +1000)]
Merge commit 'ronnie/master'

15 years ago fixed a memory leak in the recovery daemon
Andrew Tridgell [Mon, 11 Aug 2008 13:33:05 +0000 (23:33 +1000)]
fixed a memory leak in the recovery daemon

thanks to vl for spotting this

15 years ago fix the date so rpmbuild works 1.0.55 obnox/1.0.55 origin/1.0.55 ctdb-1.0.55
Ronnie Sahlberg [Mon, 11 Aug 2008 00:36:38 +0000 (10:36 +1000)]
fix the date so rpmbuild works

15 years ago new version 1.0.55
Ronnie Sahlberg [Mon, 11 Aug 2008 00:33:22 +0000 (10:33 +1000)]
new version 1.0.55

15 years ago fixed send of release IP message
Andrew Tridgell [Fri, 8 Aug 2008 12:06:39 +0000 (22:06 +1000)]
fixed send of release IP message

15 years ago Merge git://git.samba.org/tridge/ctdb 1.0.54 obnox/1.0.54 origin/1.0.54 ctdb-1.0.54
Ronnie Sahlberg [Fri, 8 Aug 2008 03:11:07 +0000 (13:11 +1000)]
Merge git://git.samba.org/tridge/ctdb

15 years ago added retry handling in client
Andrew Tridgell [Fri, 8 Aug 2008 03:11:41 +0000 (13:11 +1000)]
added retry handling in client

15 years ago added a new control CTDB_CONTROL_TRANS2_COMMIT_RETRY so we can tell
Andrew Tridgell [Fri, 8 Aug 2008 03:11:28 +0000 (13:11 +1000)]
added a new control CTDB_CONTROL_TRANS2_COMMIT_RETRY so we can tell
the difference between an initial commit attempt and a retry, which
allows us to get the persistent updates counter right for retries

15 years ago imported failure handling from dbwrap_ctdb.c
Andrew Tridgell [Fri, 8 Aug 2008 01:04:21 +0000 (11:04 +1000)]
imported failure handling from dbwrap_ctdb.c

15 years ago Merge git://git.samba.org/tridge/ctdb
Ronnie Sahlberg [Fri, 8 Aug 2008 00:59:40 +0000 (10:59 +1000)]
Merge git://git.samba.org/tridge/ctdb

15 years ago save writing the same data twice
Andrew Tridgell [Fri, 8 Aug 2008 00:15:23 +0000 (10:15 +1000)]
save writing the same data twice

15 years ago new version 1.0.54
Ronnie Sahlberg [Fri, 8 Aug 2008 00:01:20 +0000 (10:01 +1000)]
new version 1.0.54

15 years ago up release number
Andrew Tridgell [Fri, 8 Aug 2008 00:00:33 +0000 (10:00 +1000)]
up release number

15 years ago return a more detailed error code from a trans2 commit error
Andrew Tridgell [Thu, 7 Aug 2008 23:58:49 +0000 (09:58 +1000)]
return a more detailed error code from a trans2 commit error

15 years ago Merge commit 'ronnie/1.0.53'
Andrew Tridgell [Thu, 7 Aug 2008 14:48:19 +0000 (00:48 +1000)]
Merge commit 'ronnie/1.0.53'

15 years ago fixed a looping error bug with the new transactions code
Andrew Tridgell [Thu, 7 Aug 2008 14:44:33 +0000 (00:44 +1000)]
fixed a looping error bug with the new transactions code

15 years ago new version 1.0.53 1.0.53 obnox/1.0.53 origin/1.0.53 ctdb-1.0.53
Ronnie Sahlberg [Thu, 7 Aug 2008 08:57:24 +0000 (18:57 +1000)]
new version 1.0.53

this adds completely new transaction code for persistent databases

15 years ago Merge git://git.samba.org/tridge/ctdb
Ronnie Sahlberg [Thu, 7 Aug 2008 08:50:48 +0000 (18:50 +1000)]
Merge git://git.samba.org/tridge/ctdb

15 years ago cover some corner cases where the persistent database could become
Andrew Tridgell [Thu, 7 Aug 2008 03:34:18 +0000 (13:34 +1000)]
cover some corner cases where the persistent database could become
inconsistent

15 years ago remove the reclock file we store pnn counts in.
Ronnie Sahlberg [Wed, 6 Aug 2008 01:52:26 +0000 (11:52 +1000)]
remove the reclock file we store pnn counts in.
This file creates additional locking stress on the backend filesystem and we may not need it anyway.