git.samba.org - sahlberg/ctdb.git/log

Ronnie Sahlberg [Wed, 2 Dec 2009 00:28:42 +0000 (11:28 +1100)]

version 1.0.107


Rusty Russell [Tue, 1 Dec 2009 22:27:42 +0000 (08:57 +1030)]

ctdb_io: fix use-after-free on invalid packets

Wolfgang saw a talloc complaint about using freed memory in ctdb_tcp_read_cb.
His fix was to remove the talloc_free() in that function, which causes
loops when a socket is closed (as it does not get removed from the event
system), eg:
netcat 192.168.1.2 4379 < /dev/null

The real bug is that when we have more than one pending packet in the
queue, we loop calling the callback without any safeguards should that
callback free the queue (as it tends to do on invalid packets). This
can be reproduced by sending more than one bogus packet at once:
# Length word at start: 4 == empty packet (assumed little endian)
/usr/bin/printf \\4\\0\\0\\0\\4\\0\\0\\0 > /tmp/pkt
netcat 192.168.1.2 4379 < /tmp/pkt

Using a destructor we can check if the callback frees us, and exit
immediately. Elsewhere, we return after the callback anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
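The failure mode and the destructor-style guard described above can be sketched in Python. This is only an illustration: the daemon itself is C using talloc, the names `Queue`, `feed`, `destroy`, and `reject_invalid` are hypothetical, and the 4-byte little-endian length word follows the assumption stated in the reproduction recipe.

```python
import struct

# Two "empty" packets: each is just a 4-byte little-endian length word of 4,
# the same bytes the printf command in the commit message writes to /tmp/pkt.
BOGUS = struct.pack("<I", 4) * 2


class Queue:
    """Hypothetical stand-in for the C packet queue."""

    def __init__(self, callback):
        self.callback = callback
        self.buffer = b""
        self.destroyed = False  # plays the role of the flag set by a destructor

    def destroy(self):
        self.destroyed = True
        self.buffer = b""

    def feed(self, data):
        """Append data, deliver complete packets, return how many were delivered."""
        self.buffer += data
        delivered = 0
        while len(self.buffer) >= 4:
            (length,) = struct.unpack_from("<I", self.buffer)
            if length < 4 or len(self.buffer) < length:
                break
            packet, self.buffer = self.buffer[:length], self.buffer[length:]
            self.callback(self, packet)
            delivered += 1
            if self.destroyed:
                # The callback freed the queue: stop touching it immediately.
                return delivered
        return delivered


def reject_invalid(queue, packet):
    # A header-only packet is treated as invalid and tears the queue down.
    if len(packet) <= 4:
        queue.destroy()
```

Feeding `BOGUS` to a `Queue(reject_invalid)` delivers only the first packet; without the `destroyed` check the loop would go on to use the torn-down queue for the second one.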


Ronnie Sahlberg [Wed, 2 Dec 2009 00:26:51 +0000 (11:26 +1100)]

version 1.0.106


Michael Adam [Thu, 26 Nov 2009 07:35:20 +0000 (08:35 +0100)]

packaging:maketarball.sh: add a DEBIAN_MODE to the tarball creation

It is triggered by setting DEBIAN_MODE=yes in the environment.
This creates a tarball suitable for use in Debian packages.
The differences from the standard tarball are these:

* The tarball file is called ctdb_VERSION.orig.tar.gz
* The base directory in the tarball is ctdb-VERSION.orig/

Michael
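The naming rule above can be sketched as a small Python function. It is an illustration only, not the logic of maketarball.sh: the function name `tarball_layout` is hypothetical, and the non-Debian names it returns are an assumption, since the commit message only states the Debian ones.

```python
def tarball_layout(version, env):
    """Return (tarball file name, base directory) for a version string.

    `env` is a mapping such as os.environ.
    """
    if env.get("DEBIAN_MODE") == "yes":
        # Naming stated in the commit message above.
        return ("ctdb_%s.orig.tar.gz" % version, "ctdb-%s.orig/" % version)
    # Assumed default naming; the commit message does not spell it out.
    return ("ctdb-%s.tar.gz" % version, "ctdb-%s/" % version)
```

Passing `os.environ` as `env` would mirror the "set DEBIAN_MODE=yes in the environment" trigger.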


Michael Adam [Thu, 26 Nov 2009 07:34:44 +0000 (08:34 +0100)]

configure:maketarball.sh: call autogen.sh and include configure in the tarball

Michael


Michael Adam [Thu, 26 Nov 2009 07:32:24 +0000 (08:32 +0100)]

packaging:maketarball.sh: create the specfile from the ctdb.spec.in

Michael


Ronnie Sahlberg [Tue, 1 Dec 2009 05:06:59 +0000 (16:06 +1100)]

When we detect an IP-allocation mismatch, just force a new IP reassignment
instead of a full-blown recovery.


Ronnie Sahlberg [Tue, 1 Dec 2009 02:19:58 +0000 (13:19 +1100)]

When starting up ctdbd, wait until all initial recoveries have finished
and until we have gone through a full re-recovery timeout without triggering
any pending recoveries before we start up the services and start monitoring
the node.


Ronnie Sahlberg [Mon, 30 Nov 2009 23:53:18 +0000 (10:53 +1100)]

Merge commit 'martins/status-test-2'

Conflicts:

server/eventscript.c


Martin Schwenke [Fri, 27 Nov 2009 04:57:33 +0000 (15:57 +1100)]

Event scripts: functions file now intercepts status and setstatus.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Fri, 27 Nov 2009 02:35:39 +0000 (13:35 +1100)]

remove a stray ) so we compile


Ronnie Sahlberg [Fri, 27 Nov 2009 02:28:31 +0000 (13:28 +1100)]

Don't use talloc_steal() on an object that is already a child of ctdb.


Ronnie Sahlberg [Fri, 27 Nov 2009 01:50:45 +0000 (12:50 +1100)]

Merge commit 'martins/status-test' into status-test-2


Martin Schwenke [Fri, 27 Nov 2009 01:49:31 +0000 (12:49 +1100)]

Merge commit 'martins-svart/status-test-2' into status-test


Martin Schwenke [Fri, 27 Nov 2009 01:04:02 +0000 (12:04 +1100)]

Event script infrastructure: add reload event to check_options().

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Thu, 26 Nov 2009 05:26:25 +0000 (16:26 +1100)]

Merge commit 'martins/status-test' into status-test-2


Martin Schwenke [Thu, 26 Nov 2009 05:25:15 +0000 (16:25 +1100)]

Merge commit 'martins-svart/status-test-2' into status-test


Martin Schwenke [Thu, 26 Nov 2009 04:49:49 +0000 (15:49 +1100)]

Add flag to ctdb_event_script_callback indicating when called by client.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Thu, 26 Nov 2009 02:42:12 +0000 (13:42 +1100)]

resolve some conflicts from merging from martins branch


Ronnie Sahlberg [Thu, 26 Nov 2009 01:08:35 +0000 (12:08 +1100)]

Change the lock wait child handling to use a pipe instead of a socketpair.

Remove a stray alarm(30) that caused databases to be unlocked after 30 seconds.


Martin Schwenke [Wed, 25 Nov 2009 23:49:47 +0000 (10:49 +1100)]

Merge commit 'martins-svart/status-test-2' into status-test

Signed-off-by: Martin Schwenke <martin@meltin.net>


Martin Schwenke [Wed, 25 Nov 2009 05:42:14 +0000 (16:42 +1100)]

Event scripts: use $script_name rather than $service name for status.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Martin Schwenke [Wed, 25 Nov 2009 05:34:49 +0000 (16:34 +1100)]

Event scripts: Respect CTDB_MANAGES_NFS and add function log_status_cat.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Martin Schwenke [Fri, 20 Nov 2009 05:45:36 +0000 (16:45 +1100)]

More eventscript cleanups. Initial smoke testing seems OK.

Apart from lots of cleanup work, this also fixes a bug where the share
checks did not cope with directory names containing spaces.
The previous commit also loaded the config incorrectly.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Wed, 25 Nov 2009 00:54:40 +0000 (11:54 +1100)]

Use a binary tree and sort all IPv4/IPv6 addresses before we assign them out to nodes.
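A rough Python sketch of why sorting first helps: every node that sorts the same address list gets the same order, so a deterministic assignment gives the same result everywhere. This uses a plain sort in place of the binary tree the commit mentions, and the round-robin step is an assumption for illustration, not the allocation algorithm ctdb uses. Both function names are hypothetical.

```python
import ipaddress


def sort_addresses(addresses):
    """Return the addresses in a stable, family-aware order."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # IPv4 and IPv6 objects do not compare with each other,
    # so sort on (version, integer value) instead.
    parsed.sort(key=lambda ip: (ip.version, int(ip)))
    return [str(ip) for ip in parsed]


def assign_round_robin(addresses, nodes):
    """Hypothetical assignment: hand sorted addresses to nodes in turn."""
    assignment = {node: [] for node in nodes}
    for index, address in enumerate(sort_addresses(addresses)):
        assignment[nodes[index % len(nodes)]].append(address)
    return assignment
```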


Rusty Russell [Wed, 25 Nov 2009 00:32:29 +0000 (11:02 +1030)]

eventscript: check that ctdb forced script events are correct

Now we're doing checking, we might as well make sure the commands from
"ctdb eventscripts" are valid.

This gets rid of the "UNKNOWN" event type.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Ronnie Sahlberg [Tue, 24 Nov 2009 21:03:42 +0000 (08:03 +1100)]

It is better to plainly disallow clients from connecting here
if the node is BANNED.
Don't even let them attach to the database at all.

Revert "temporarily try allowing clients to attach to databases even if
the node is banned/stopped or inactive in any other way."

This reverts commit 227fe99f105bdc3a4f1000f238cbe3adeb3f22f0.


Martin Schwenke [Tue, 24 Nov 2009 05:14:54 +0000 (16:14 +1100)]

Merge commit 'origin/status-test' into status-test


Rusty Russell [Tue, 24 Nov 2009 00:54:22 +0000 (11:24 +1030)]

eventscript: check that ctdb forced script events are correct

Now we're doing checking, we might as well make sure the commands from
"ctdb eventscripts" are valid.

This gets rid of the "UNKNOWN" event type.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:53:13 +0000 (11:23 +1030)]

eventscript: check that internal script events are being invoked correctly

This is not as good as a compile-time check, but at least we check that the
number of arguments is correct.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:52:46 +0000 (11:22 +1030)]

eventscript: remove call name from state->options

Finally, we remove the call name (eg. "monitor" or "start") from the
options field of the struct: it now contains only extra options.

This is clearer, and mainly involves adding some %s to debug statements.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:49:58 +0000 (11:19 +1030)]

eventscript: put call type into state struct.

This means we can get rid of more strcmp; they can simply use the
state->call value instead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:46:49 +0000 (11:16 +1030)]

eventscript: introduce enum for different event script calls.

Rather than doing strcmp everywhere, pass an explicit enum around.  This
also subtly documents what options are available.  The "options" arg
is now used for extra arguments only.

Unfortunately, gcc complains on empty format strings, so we make
ctdb_event_script() take no varargs, and add ctdb_event_script_args().  We
leave ctdb_event_script_callback() taking varargs, which means callers
have to do "%s", "".

For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts
from the ctdb tool.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
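The idea of passing an explicit enum instead of comparing strings can be sketched in Python. The member list is a hypothetical subset ("monitor" and "start" are the only call names this log mentions), and `build_command` is an illustrative helper, not a function from the C sources.

```python
from enum import Enum


class EventCall(Enum):
    # Hypothetical subset of event calls; the real enum lives in the C code.
    START = "start"
    MONITOR = "monitor"
    UNKNOWN = "unknown"  # placeholder for forced scripts, as described above


def build_command(call, *extra_options):
    """Join the call name and any extra options into one command string."""
    if not isinstance(call, EventCall):
        # A bare string is rejected, so typos cannot slip through silently.
        raise TypeError("call must be an EventCall member")
    return " ".join([call.value, *extra_options])
```

Callers branch on `call is EventCall.MONITOR` rather than on string comparison, and the options carry only the extra arguments.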


Rusty Russell [Tue, 24 Nov 2009 00:39:46 +0000 (11:09 +1030)]

eventscript: put timeout inside ctdb_event_script_callback_v

Everyone uses the same timeout value, so just remove it from the API.
If we ever need variable timeouts, that might as well be central too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:39:01 +0000 (11:09 +1030)]

eventscript: cleanup ctdb_event_script_v

ctdb_event_script_v doesn't take varargs. ctdb_run_event_script is
a better name, and fix comment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:38:39 +0000 (11:08 +1030)]

eventscript: typo cleanups

1) ctdb_event_script_v doesn't take varargs.  ctdb_run_event_script is
   a better name, and fix comment.
2) Fix indentation on allowed_scripts.
3) Comment on run_eventscripts_callback is wrong; it's the callback
   for any ctdb forced event.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:36:53 +0000 (11:06 +1030)]

eventscript: fix bug in timeouts on forced eventscripts.  Again.

In 15bc66ae801b0c69, Ronnie fixed a double-free race.  The problem was that
ctdb_run_eventscripts() hands a context to ctdb_event_script_callback() to
hang its data off, which gets freed in the callback.  This particularly
hurt in ctdb_event_script_timeout.

There's nothing wrong with this, but obviously we should make the callback
call last of all.  At the time, ctdb_event_script_timeout() carefully
extracted everything from the struct ctdb_event_script_state before
calling ->callback.

This was cleaned up in 64da4402c6ad485f (Ronnie again), and now state
was referred to after the callback again.  But the same change introduced
a direct use-after-free bug which caused an occasional oops.

So in our last episode (eda052101728cf92) Volker fixed this, and Michael
committed it.

But we still have the double free bug which 15bc66ae801b0c69 was supposed
to fix!  Let's try to fix this in a more permanent way, by always doing
the callback from the destructor.  This means we need to hold the status,
and don't send the KILL signal if ->child is set to 0.

Finally, add a comment about freeing ourselves in run_eventscripts_callback
and the structure definition.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Rusty Russell [Tue, 24 Nov 2009 00:30:13 +0000 (11:00 +1030)]

eventscript: clean up forked handler event code

Write the whole int through the pipe, rather than quietly cutting it
off. Also, use -2 as the result if the read fails; -1 comes from many
paths if the child fails before running the script.

Add a comment about why we don't need to check the write.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
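A minimal Python sketch of the pipe protocol described above: the child writes the whole native int, and the parent reports -2 when it cannot read a complete value. The function names and the constant `READ_FAILED` are hypothetical; the real code is C.

```python
import os
import struct

READ_FAILED = -2  # distinct from -1, which many child-side failure paths produce


def send_status(write_fd, status):
    # Send the whole native int rather than a single truncated byte.
    os.write(write_fd, struct.pack("i", status))


def receive_status(read_fd):
    size = struct.calcsize("i")
    try:
        data = os.read(read_fd, size)
    except OSError:
        return READ_FAILED
    if len(data) != size:
        # EOF or a short read: the child went away before reporting.
        return READ_FAILED
    return struct.unpack("i", data)[0]
```

A status such as 300 does not fit in one byte, which is why truncating the write would silently change the result.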


Ronnie Sahlberg [Wed, 25 Nov 2009 00:30:11 +0000 (11:00 +1030)]

rework and simplify the eventscript handling

This version has no trailing whitespace, and fixed


Ronnie Sahlberg [Mon, 23 Nov 2009 22:27:22 +0000 (09:27 +1100)]

reduce the log level for three vacuuming related log messages


Ronnie Sahlberg [Mon, 23 Nov 2009 20:40:51 +0000 (07:40 +1100)]

rework and simplify the eventscript handling


Martin Schwenke [Thu, 19 Nov 2009 05:48:19 +0000 (16:48 +1100)]

Now vaguely tested initscript updates.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Martin Schwenke [Thu, 19 Nov 2009 04:00:17 +0000 (15:00 +1100)]

More untested eventscript factorisation.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Martin Schwenke [Thu, 19 Nov 2009 03:54:05 +0000 (14:54 +1100)]

Test suite: Make the CIFS tickle test wait until it sees the required tickle.

The test depended on the exit code of "ctdb gettickles", which always
succeeds. This change wraps the command in a function that checks
whether the tickle we're interested in is registered.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Thu, 19 Nov 2009 00:08:14 +0000 (11:08 +1100)]

new version 1.0.105


Ronnie Sahlberg [Thu, 19 Nov 2009 00:03:51 +0000 (11:03 +1100)]

Don't reset the event script context every time we start a new "ctdb eventscript ..."
command.
Use the existing context used for non-monitor events.

Multiple concurrent uses of "ctdb eventscript ..." could otherwise lead to a SEGV.


Ronnie Sahlberg [Wed, 18 Nov 2009 08:10:50 +0000 (19:10 +1100)]

Make the ringbuffer logging more efficient and marshall the data by writing to a tmpfile instead of continuously talloc-resizing a blob.


Ronnie Sahlberg [Wed, 18 Nov 2009 01:44:18 +0000 (12:44 +1100)]

Add an in-memory ringbuffer where we store the last 500000 log entries regardless of log level.

Add commands to extract this in-memory buffer and to clear it.
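A bounded ring buffer of this kind can be sketched in Python with `collections.deque`. The class name `LogRing` and its methods are hypothetical; the default capacity mirrors the 500000 entries mentioned above, and the C implementation is not shown in this log.

```python
from collections import deque


class LogRing:
    """Keep only the most recent `capacity` log entries, whatever their level."""

    def __init__(self, capacity=500000):
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self.entries = deque(maxlen=capacity)

    def log(self, level, message):
        self.entries.append((level, message))

    def dump(self):
        """Return the buffered entries, oldest first."""
        return list(self.entries)

    def clear(self):
        self.entries.clear()
```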


Ronnie Sahlberg [Tue, 17 Nov 2009 01:07:10 +0000 (12:07 +1100)]

create a new event context for the syslog daemon


Ronnie Sahlberg [Mon, 16 Nov 2009 04:17:32 +0000 (15:17 +1100)]

Set up a pipe between the main daemon and the child we use for syslogging so that we can clean up the child process when we stop ctdbd.


Martin Schwenke [Fri, 13 Nov 2009 07:28:25 +0000 (18:28 +1100)]

Eventscripts: Untested factorisations and introduction of status event.

This is the first stage of an experimental change to eventscripts.
Ronnie and I did a few hours of factorisation of 40.vsftpd and applied
many of the changes to 41.httpd. Other eventscripts were also
modified.

At this stage this is completely untested.

Signed-off-by: Martin Schwenke <martin@meltin.net>


Ronnie Sahlberg [Fri, 13 Nov 2009 01:37:55 +0000 (12:37 +1100)]

test of a change to make ctdbd use "status" event instead of the "monitor" event.

This allows running the actual monitoring asynchronously from ctdbd
and only using "status" to pick up the actual results.


Ronnie Sahlberg [Fri, 13 Nov 2009 01:25:31 +0000 (12:25 +1100)]

Merge commit 'martins/master'


Martin Schwenke [Thu, 12 Nov 2009 22:44:34 +0000 (09:44 +1100)]

Test suite: Fix the NFS and CIFS tickle tests.

The NFS test sleeps for MonitorInterval to give CTDB time to record an
NFS tickle.  However, this isn't always long enough.  This changes the
test to wait until a monitor event has actually occurred.

The CIFS test assumes that Samba is able to register a tickle with
CTDB before it notices that netstat has registered the tickle and can
use onnode to ask CTDB about it.  That is an incorrect assumption -
sometimes we can get to the point of asking CTDB about the tickle
before Samba and CTDB have processed it.  This adds a timeout loop
that makes the CIFS test wait until the tickle has been registered or
fail after 10 seconds.

Signed-off-by: Martin Schwenke <martin@meltin.net>
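The "wait until the tickle is registered or fail after 10 seconds" loop can be sketched in Python. The test suite itself is shell; `wait_until`, the one-second polling interval, and the injectable `clock` and `sleep` parameters are assumptions made for illustration.

```python
import time


def wait_until(predicate, timeout=10.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it is true; give up once `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The predicate would wrap a check such as "is the tickle we care about in the output of ctdb gettickles", rather than relying on the command's exit code.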


Martin Schwenke [Wed, 11 Nov 2009 01:16:30 +0000 (12:16 +1100)]

Merge commit 'origin/master'


Mathieu Parent [Tue, 10 Nov 2009 11:04:13 +0000 (12:04 +0100)]

Fix bashism in events.d/11.natgw

Signed-off-by: Michael Adam <obnox@samba.org>


Ronnie Sahlberg [Fri, 6 Nov 2009 00:16:05 +0000 (11:16 +1100)]

version 1.0.104


Ronnie Sahlberg [Thu, 5 Nov 2009 22:54:03 +0000 (09:54 +1100)]

Suggestion from metze:
use killtcp and kill both directions of the NFS connections.
We used to kill only one direction since the other direction was unkillable,
but recent kernels allow us to kill both.


Ronnie Sahlberg [Thu, 5 Nov 2009 21:19:32 +0000 (08:19 +1100)]

Suggestion from Christian:

don't allow UNHEALTHY nodes to become natgw master, unless all nodes
are unhealthy
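The eligibility rule can be sketched in Python. The function name and the `(node_id, healthy)` tuple layout are hypothetical, and how the master is then picked among the eligible nodes is not stated here, so the sketch only returns the candidates.

```python
def eligible_natgw_masters(nodes):
    """nodes: list of (node_id, healthy) tuples. Return candidate master ids."""
    healthy = [node_id for node_id, is_healthy in nodes if is_healthy]
    if healthy:
        # At least one healthy node exists, so unhealthy ones are excluded.
        return healthy
    # Every node is unhealthy: fall back to allowing all of them.
    return [node_id for node_id, _ in nodes]
```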


Volker Lendecke [Tue, 3 Nov 2009 19:01:00 +0000 (20:01 +0100)]

Fix a segfault in the eventscript timeout handler.

The state was freed too early.

Signed-off-by: Michael Adam <obnox@samba.org>


Michael Adam [Tue, 3 Nov 2009 19:00:27 +0000 (20:00 +0100)]

ctdb.sysconfig: add a comment section about CTDB_RUN_TIMEOUT_MONITOR

Michael


Michael Adam [Tue, 3 Nov 2009 19:00:07 +0000 (20:00 +0100)]

Add a 99.timeout event script to trigger monitor timeouts.

This just sleeps for twice the value of EventScriptTimeout
in the monitor action. It is not run by default, but
can be activated by setting CTDB_RUN_TIMEOUT_MONITOR
in /etc/sysconfig/ctdb .

Michael


Ronnie Sahlberg [Thu, 5 Nov 2009 05:07:23 +0000 (16:07 +1100)]

Don't use the pointer after it has been talloc_free()d.


Ronnie Sahlberg [Thu, 5 Nov 2009 04:57:46 +0000 (15:57 +1100)]

From Rusty

It's much nicer for post-mortem debugging to have a body to examine.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


Ronnie Sahlberg [Thu, 5 Nov 2009 01:12:06 +0000 (12:12 +1100)]

Add an extra test for the bond devices and check that there is an active slave.
This is to handle the case where all links do have a physical layer, but where all slaves have been disabled using ifdown.


Ronnie Sahlberg [Tue, 3 Nov 2009 20:50:26 +0000 (07:50 +1100)]

Don't verify that winbindd is running properly at startup.


Ronnie Sahlberg [Tue, 3 Nov 2009 00:46:37 +0000 (11:46 +1100)]

new version 1.0.103


Ronnie Sahlberg [Mon, 2 Nov 2009 23:48:27 +0000 (10:48 +1100)]

Move the check to skip vacuuming on persistent databases to the ctdb_vacuuming_init() function.


Michael Adam [Mon, 2 Nov 2009 00:37:07 +0000 (01:37 +0100)]

packaging: use githash in rpm release by default.

setting USE_GITHASH=no in the environment makes
makerpms.sh omit the git hash

Michael


Michael Adam [Mon, 2 Nov 2009 23:04:27 +0000 (00:04 +0100)]

server: disable vacuuming for persistent tdbs.

The vacuum process treats persistent databases the same as
non-persistent and thus ignores the extra state for transactions.
This way, it breaks the api-level transactions.

Michael


Michael Adam [Thu, 29 Oct 2009 21:53:44 +0000 (22:53 +0100)]

client: randomize the transaction_start retry loop:

instead of sleeping 1 second, sleep between 1 and 100 milliseconds

Michael
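A randomized retry delay of this kind can be sketched in Python. The function name, the `max_tries` bound, and the injectable `sleep` and `rng` parameters are assumptions for illustration; the commit only states the 1 to 100 millisecond range.

```python
import random
import time


def retry_with_jitter(attempt, max_tries=10, min_ms=1, max_ms=100,
                      sleep=time.sleep, rng=random):
    """Call attempt() until it returns True; return the number of tries used."""
    for tries in range(1, max_tries + 1):
        if attempt():
            return tries
        # A random delay keeps competing clients from retrying in lockstep.
        sleep(rng.randint(min_ms, max_ms) / 1000.0)
    raise TimeoutError("gave up after %d tries" % max_tries)
```

Compared with a fixed one-second sleep, contending clients wake at different times and the typical wait is far shorter.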


Michael Adam [Thu, 29 Oct 2009 21:40:00 +0000 (22:40 +0100)]

Revert "dont exit on a commit failure"

This reverts commit 4e9a3a5dc232bac12ab387ea0cf4f1b279bed5c1.

Transaction commit should not be allowed to fail.
This is a real error.

Michael


Michael Adam [Thu, 29 Oct 2009 21:20:38 +0000 (22:20 +0100)]

client: fix a race in the local race condition fix in transaction_start

The gap that remained is between checking whether a transaction commit
is in progress and taking the lock. Now we first take the lock and then
check whether a transaction commit is in progress. If so, we release the
lock, wait for one second and retry.

Michael
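The ordering described above (take the lock first, then check for a running commit) can be sketched in Python with a `threading.Lock`. The function signature and the `commit_in_progress` callable are hypothetical; the client code itself is C and locks a database record rather than a Python lock.

```python
import threading
import time


def transaction_start(lock, commit_in_progress, retry_delay=1.0,
                      sleep=time.sleep):
    """Return with `lock` held once no commit is in progress."""
    while True:
        lock.acquire()
        if not commit_in_progress():
            # The lock stays held, so no commit can start between
            # the check and the moment the caller begins its work.
            return
        # A commit is running: drop the lock, wait, and try again.
        lock.release()
        sleep(retry_delay)
```

Checking before locking would leave exactly the gap the commit message describes: a commit could begin after the check but before the lock is taken.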


Michael Adam [Thu, 29 Oct 2009 21:19:19 +0000 (22:19 +0100)]

client: add a debug message when a transaction_commit needs to be retried

Michael


Michael Adam [Tue, 20 Oct 2009 15:02:16 +0000 (17:02 +0200)]

packaging(RPM): don't touch the run levels in ctdb install/update.

We should really leave it up to the administrator to decide
whether ctdb should be started automatically at boot-time.

Michael


Ronnie Sahlberg [Fri, 30 Oct 2009 08:39:11 +0000 (19:39 +1100)]

start the syslog child a little later, after we have forked and detached from the local shell


Ronnie Sahlberg [Fri, 30 Oct 2009 07:53:17 +0000 (18:53 +1100)]

Create a child process to write to syslog.

Use a UDP socket on the ctdbd port to send messages to the syslog child process for logging.

We need this when syslog becomes very slow, and on boxes where syslog is limited to 100 lines per second and starts to block after that.


Michael Adam [Thu, 29 Oct 2009 12:48:36 +0000 (13:48 +0100)]

server: fix debug message in trans2_commit (refusing persistent store during transaction)

log the right db_id
also log the client_id

Michael


Michael Adam [Thu, 29 Oct 2009 12:45:38 +0000 (13:45 +0100)]

client: log db_id as 8-digit hex in ctdb_transaction_fetch_start()

Michael


Michael Adam [Thu, 29 Oct 2009 12:44:39 +0000 (13:44 +0100)]

server: uniformly log db and client ids as 8-digit hex numbers in trans2_commit

Michael
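What "8-digit hex" looks like can be shown with a small Python sketch. The helper name `format_id` and the "0x" prefix are assumptions; the commit only says the ids are logged as 8-digit hex numbers.

```python
def format_id(value):
    """Render a 32-bit id as zero-padded hex, like a C "0x%08x" format."""
    # Mask to 32 bits so the output is always exactly eight hex digits.
    return "0x%08x" % (value & 0xFFFFFFFF)
```

Fixed-width ids line up in the logs and can be matched with a plain text search across client and server messages.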


Michael Adam [Thu, 29 Oct 2009 12:30:03 +0000 (13:30 +0100)]

server: line-wrap a debug statement in trans2_commit

Michael


Michael Adam [Thu, 29 Oct 2009 12:27:47 +0000 (13:27 +0100)]

server: output client_id in some debug messages in trans2_commit

Michael


Michael Adam [Thu, 29 Oct 2009 12:24:19 +0000 (13:24 +0100)]

server: fix a debug message in trans2_commit - log the correct db_id

Michael


Michael Adam [Thu, 29 Oct 2009 12:54:55 +0000 (13:54 +0100)]

server: extend a debug message in ctdb_control_trans2_error()

Michael


Michael Adam [Thu, 29 Oct 2009 12:53:44 +0000 (13:53 +0100)]

server: add positive debug statements to trans2_commit and trans2_finished

When the operation completed / started successfully.

Michael


Michael Adam [Thu, 29 Oct 2009 18:49:10 +0000 (19:49 +0100)]

client: improve "control timed out" debug message

* add __location__
* wrap overly long line
* print unsigned ints as unsigned (reqid, opcode, destnode)

Michael


Michael Adam [Thu, 29 Oct 2009 16:08:37 +0000 (17:08 +0100)]

server: trans2_active: don't report a transaction active on the node that performs the transaction

Otherwise a node can lock itself out, e.g. when a commit control times out...

Michael


Ronnie Sahlberg [Thu, 29 Oct 2009 02:49:27 +0000 (13:49 +1100)]

new version 1.0.102


Wolfgang Mueller-Friedt [Wed, 28 Oct 2009 11:54:29 +0000 (14:54 +0300)]

ensure tdb names end with .tdb. and any number of digits
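The name check can be sketched as a regular expression in Python. This is an illustration, not the check in the C code: the helper name is hypothetical, and "any number of digits" is read here as one or more digits, which is an assumption.

```python
import re

# ".tdb." followed by digits, anchored at the very end of the name.
TDB_NAME = re.compile(r"\.tdb\.\d+\Z")


def is_tdb_name(name):
    """Return True if the file name ends with .tdb.<digits>."""
    return TDB_NAME.search(name) is not None
```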


Wolfgang Mueller-Friedt [Wed, 28 Oct 2009 10:01:27 +0000 (13:01 +0300)]

Vacuuming needed an additional check before getting rid of a record: there is a gap between selecting the records and deleting them, so we have to check whether each record can still be deleted when we are actually about to delete it.
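The select-then-recheck pattern can be sketched in Python. The function names, the dict-based database, and the `can_delete` predicate are hypothetical stand-ins for the C vacuuming code.

```python
def select_candidates(db, can_delete):
    """First pass: collect keys whose records currently look deletable."""
    return [key for key, record in db.items() if can_delete(record)]


def delete_candidates(db, candidates, can_delete):
    """Second pass: delete only the records that still qualify right now."""
    deleted = []
    for key in candidates:
        record = db.get(key)
        # Re-check: the record may have changed or vanished since selection.
        if record is not None and can_delete(record):
            del db[key]
            deleted.append(key)
    return deleted
```

A record that is rewritten between the two passes survives, which is the case the extra check is meant to cover.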


Ronnie Sahlberg [Thu, 29 Oct 2009 02:44:12 +0000 (13:44 +1100)]

Revert "From Wolfgang M."

This reverts commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed.


Ronnie Sahlberg [Thu, 29 Oct 2009 00:54:24 +0000 (11:54 +1100)]

make the error logged when winbindd fails to access the dc during startup more scary and easier to spot in the logs

CTDB repository