metze/ctdb/wip.git
13 years agoTODO: server: collect the panic action output for logging master-backtrace
Stefan Metzmacher [Tue, 12 Jan 2010 11:18:23 +0000 (12:18 +0100)]
TODO: server: collect the panic action output for logging

metze

13 years agoTODO: server: add --panic-action and CTDB_PANIC_ACTION option for backtrace support
Stefan Metzmacher [Thu, 7 Jan 2010 08:21:56 +0000 (09:21 +0100)]
TODO: server: add --panic-action and CTDB_PANIC_ACTION option for backtrace support

pass cmdline parameters with spaces...

metze

13 years ago41.HTTPD
Ronnie Sahlberg [Tue, 21 Dec 2010 23:27:53 +0000 (10:27 +1100)]
41.HTTPD

Httpd can be very slow to start on some platforms,
wait 5 monitor intervals before we try to restart it if
it has not bound to port 80 yet.
After 10 failed intervals, flag the node as unhealthy.

13 years ago60.nfs
Ronnie Sahlberg [Tue, 21 Dec 2010 23:09:35 +0000 (10:09 +1100)]
60.nfs

Try to restart LOCKD after 10 failures and
flag the node as unhealthy after 15 failures

13 years agoDont run net serverid wipe in the background
Ronnie Sahlberg [Tue, 21 Dec 2010 23:05:40 +0000 (10:05 +1100)]
Dont run net serverid wipe in the background

13 years ago50.samba
Ronnie Sahlberg [Tue, 14 Dec 2010 10:17:14 +0000 (21:17 +1100)]
50.samba

Net serverid wipe can take a bit of time sometimes so background it.

Only perform auto start/stop of the managed service on the monitor event

13 years agoctdb addip:
Ronnie Sahlberg [Mon, 13 Dec 2010 01:06:01 +0000 (12:06 +1100)]
ctdb addip:

After finishing "ctdb addip"  wait for an implicit "iptakeover" to complete
the assignment to a node.

This makes it more wasteful and time-consuming when adding multiple ips
at once, or the same ip to multiple nodes,
but makes it easier to script the use of this command.

13 years agoLVS
Ronnie Sahlberg [Sun, 12 Dec 2010 08:38:39 +0000 (19:38 +1100)]
LVS

update lvs configuration on ipreallocated events too

13 years agoWhen assigning the single-public-ip during startup,
Ronnie Sahlberg [Sun, 12 Dec 2010 03:22:20 +0000 (14:22 +1100)]
When assigning the single-public-ip during startup,
flag the interface as initially being "link ok"
so that we can add it and startup.

The eventscript can later drop the flag if required

13 years agoRevert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA...
Ronnie Sahlberg [Mon, 13 Dec 2010 03:23:48 +0000 (14:23 +1100)]
Revert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag"

This reverts commit 17e231abf5ade83d7fa624b5cf54ae876e2795aa.

13 years agoRevert "Add a new header flag for "migrated with data" and set this to 1"
Ronnie Sahlberg [Mon, 13 Dec 2010 03:23:32 +0000 (14:23 +1100)]
Revert "Add a new header flag for "migrated with data" and set this to 1"

This reverts commit a8cc35191df1cd4b866897df71d317ce5f198cb5.

13 years agolibctdb
Ronnie Sahlberg [Fri, 10 Dec 2010 03:18:28 +0000 (14:18 +1100)]
libctdb

fix a compile problem after renaming a structure field

13 years agoLibCTDB
Ronnie Sahlberg [Fri, 10 Dec 2010 02:39:18 +0000 (13:39 +1100)]
LibCTDB

Add an input queue where we keep received pdus we have not yet processed
This allows us to perform SYNC calls from an ASYNC callback
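
The input queue idea can be sketched roughly as follows. This is a toy model in Python, not libctdb code: the Client class, the simulated "wire" and the dict-shaped PDUs are all hypothetical, chosen only to show why parking unprocessed PDUs lets a SYNC call run from inside an ASYNC callback.

```python
from collections import deque


class Client:
    """Toy model of a client with an input queue (all names hypothetical)."""

    def __init__(self, wire):
        self.wire = deque(wire)   # PDUs not yet read from the (simulated) socket
        self.inqueue = deque()    # PDUs received but not yet processed
        self.handled = []         # reqids dispatched to the async callback

    def sync_call(self, reqid):
        """Read until the reply for reqid arrives; park every other PDU."""
        while self.wire:
            pdu = self.wire.popleft()
            if pdu["reqid"] == reqid:
                return pdu["data"]
            self.inqueue.append(pdu)
        raise RuntimeError("no reply for reqid %d" % reqid)

    def service(self, callback):
        """Dispatch parked PDUs first, then whatever is still on the wire."""
        while self.inqueue or self.wire:
            if self.inqueue:
                pdu = self.inqueue.popleft()
            else:
                pdu = self.wire.popleft()
            self.handled.append(pdu["reqid"])
            callback(self, pdu)
```

A callback handling reqid 1 can call sync_call(3); the PDU for reqid 2 that arrives in between is parked and dispatched once the callback returns, rather than being lost.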

13 years agoonly run "serverid wipe" if we are actually running samba.
Ronnie Sahlberg [Wed, 8 Dec 2010 00:08:19 +0000 (11:08 +1100)]
only run "serverid wipe" if we are actually running samba.
we dont need to run this on systems where we do run winbind but not samba

13 years agoidtree: fix overflow for v. large ids on allocation and removal
Rusty Russell [Mon, 6 Dec 2010 03:22:38 +0000 (13:52 +1030)]
idtree: fix overflow for v. large ids on allocation and removal

(Imported from SAMBA commit 09a6538969ac).

Chris Cowan tracked down a SEGV in sub_alloc: idp->level can actually
be equal to 7 (MAX_LEVEL) there, as it can be in sub_remove.

(We unfairly blamed a shift of a signed var for this crash in commit
 2db1987f5a3a).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
13 years agoAdd a new header flag for "migrated with data" and set this to 1
Ronnie Sahlberg [Mon, 6 Dec 2010 05:09:38 +0000 (16:09 +1100)]
Add a new header flag for "migrated with data" and set this to 1
when we migrate a non-empty record onto the node
or a non-empty record off the node

When we migrate a record back to the lmaster and yield the dmaster role,
inspect this flag; if it is still not set, we can delete the record from
the local database as soon as we have migrated it back to the lmaster.

13 years agoadd new command line functions
Ronnie Sahlberg [Mon, 6 Dec 2010 05:07:55 +0000 (16:07 +1100)]
add new command line functions
ctdb readkey <dbid> <key>
ctdb writekey <dbid> <key> <value>

these are mainly intended for debugging of databases and dmaster migration issues

13 years agoadd a new ctdb_ltdb function to delete a record in a normal database
Ronnie Sahlberg [Mon, 6 Dec 2010 05:06:20 +0000 (16:06 +1100)]
add a new ctdb_ltdb function to delete a record in a normal database

13 years agoserver: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag
Michael Adam [Fri, 3 Dec 2010 14:21:51 +0000 (15:21 +0100)]
server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag

13 years agoAdd 60.ganesha to what gets installed by make install as well as by the RPM
Ronnie Sahlberg [Mon, 6 Dec 2010 00:30:24 +0000 (11:30 +1100)]
Add 60.ganesha to what gets installed by make install as well as by the RPM

13 years agoadd a missing part of the import of the previous ganesha patch
Ronnie Sahlberg [Mon, 6 Dec 2010 00:26:43 +0000 (11:26 +1100)]
add a missing part of the import of the previous ganesha patch

13 years agomake changes to ctdb event scripts to support NFS-Ganesha.
Chandra Seetharaman [Fri, 3 Dec 2010 23:26:22 +0000 (15:26 -0800)]
make changes to ctdb event scripts to support NFS-Ganesha.

make changes to ctdb event scripts to support NFS-Ganesha.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
13 years agoduring ip allocation, there are failure modes where a node might hold an ip address
Ronnie Sahlberg [Fri, 3 Dec 2010 02:28:35 +0000 (13:28 +1100)]
during ip allocation, there are failure modes where a node might hold an ip address
but thinks it is still unassigned (-1).

add code to the recovery daemon to detect this case and trigger a reallocation
so that the ip gets covered

and change the takeip code to allow for this condition, taking on an ip address that is
already hosted.

cq s1021073

13 years agodont try starting samba through the "init" event
Ronnie Sahlberg [Thu, 2 Dec 2010 19:07:03 +0000 (06:07 +1100)]
dont try starting samba through the "init" event

13 years agoWhen we are no longer the natgw master, dont put the natgw ip on loopback.
Ronnie Sahlberg [Mon, 29 Nov 2010 01:39:14 +0000 (12:39 +1100)]
When we are no longer the natgw master, dont put the natgw ip on loopback.
We put the ip on loopback just to make sure we would still interoperate with
non-standard configurations on unix-KDC, that are configured to verify the optional
HostAddresses field.
This is not required for AD, since AD does not use this field, and is replaced in
unix land with other/better mechanisms than this "dodgy" check.

This makes it "easier" for applications that have bound to the natgw address
to detect a socket problem and try to reconnect/recover if the ip address
is completely missing from the system.

At the same time, use the winbind specific hook that exists to explicitly tell winbindd: this address is gone, so if you have bound to it, this is a good time to close and rebind your socket.

cq 1020333

13 years agoupdate autostart/stop to work for samba
Ronnie Sahlberg [Thu, 18 Nov 2010 04:40:19 +0000 (15:40 +1100)]
update autostart/stop to work for samba

13 years agoadd an explicit _is_managed_service to iscsi eventscript
Ronnie Sahlberg [Thu, 18 Nov 2010 03:15:18 +0000 (14:15 +1100)]
add an explicit _is_managed_service to iscsi eventscript

13 years agoDont pollute the logs with a "file not found" message
Ronnie Sahlberg [Thu, 18 Nov 2010 02:52:46 +0000 (13:52 +1100)]
Dont pollute the logs with a "file not found" message

CQ S1020745

13 years ago60.nfs eventscript should do nothing if NFS isn't managed by CTDB.
Martin Schwenke [Thu, 18 Nov 2010 02:23:40 +0000 (13:23 +1100)]
60.nfs eventscript should do nothing if NFS isn't managed by CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoEventscript functions - catch failures in ctdb_service_start().
Martin Schwenke [Thu, 18 Nov 2010 00:27:10 +0000 (11:27 +1100)]
Eventscript functions - catch failures in ctdb_service_start().

ctdb_service_start() currently succeeds if ctdb_counter_init()
succeeds.

This changes it to fail when a service start fails.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years ago50.samba eventscript should stop/start services when they become (un)managed.
Martin Schwenke [Thu, 18 Nov 2010 00:04:52 +0000 (11:04 +1100)]
50.samba eventscript should stop/start services when they become (un)managed.

When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND changes (or
corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This adds calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoadd a new support function ctdb_check_counter_equal()
Ronnie Sahlberg [Wed, 17 Nov 2010 02:50:56 +0000 (13:50 +1100)]
add a new support function ctdb_check_counter_equal()

update nfs to try to restart the service after 10 consecutive failures
and to flag the node unhealthy after 15

add similar function to mountd
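
As a rough illustration of the thresholds described above, the counting logic might look like the sketch below. It is written in Python rather than as an eventscript, and the function names and the exact comparison used for the restart (equality with 10) are assumptions, not the script's actual code.

```python
RESTART_AFTER = 10    # consecutive failures before a restart is attempted
UNHEALTHY_AFTER = 15  # consecutive failures before the node is flagged unhealthy


def monitor_action(consecutive_failures):
    """Return the action for the current run of consecutive failures."""
    if consecutive_failures >= UNHEALTHY_AFTER:
        return "unhealthy"
    if consecutive_failures == RESTART_AFTER:
        return "restart"
    return "ok"


def run_monitor(check_results):
    """Feed True (check passed) / False (check failed) results through the
    counter and collect the action taken on each monitor event."""
    failures = 0
    actions = []
    for passed in check_results:
        # A successful check resets the run of consecutive failures.
        failures = 0 if passed else failures + 1
        actions.append(monitor_action(failures))
    return actions
```

Using an equality test for the restart means the restart is attempted once per run of failures instead of on every monitor event after the tenth.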

13 years agoEventscripts: make loadconfig() function hookable by the test suite.
Martin Schwenke [Tue, 31 Aug 2010 07:40:40 +0000 (17:40 +1000)]
Eventscripts: make loadconfig() function hookable by the test suite.

Rename loadconfig() to _loadconfig().  Add a new loadconfig() that
simply calls _loadconfig().

This makes it easy for the test suite to override loadconfig().
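
The same wrapper pattern, sketched in Python for illustration only (the eventscripts themselves are shell). The function bodies, service_summary() and run_with_fake_config() are hypothetical stand-ins for how a test suite might use the hook.

```python
def _loadconfig(name):
    """Stand-in for the real loader; here it just returns fixed settings."""
    return {"service": name, "source": "system"}


def loadconfig(name):
    """Thin public wrapper: callers use this name, tests may replace it."""
    return _loadconfig(name)


def service_summary(name):
    # loadconfig is looked up at call time, so an override takes effect here.
    cfg = loadconfig(name)
    return "%s from %s" % (cfg["service"], cfg["source"])


def run_with_fake_config(fake, name):
    """What a test suite might do: swap in a fake loader, then restore it."""
    global loadconfig
    saved = loadconfig
    loadconfig = fake
    try:
        return service_summary(name)
    finally:
        loadconfig = saved
```

Because the real work stays in _loadconfig(), an override can still delegate to it when only part of the configuration needs to be faked.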

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoMake a time comparison in 60.nfs eventscript more readable.
Martin Schwenke [Tue, 16 Nov 2010 08:42:31 +0000 (19:42 +1100)]
Make a time comparison in 60.nfs eventscript more readable.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years ago60.nfs only fails or warns after 10 consecutive nfsd/statd failures.
Martin Schwenke [Tue, 16 Nov 2010 08:31:18 +0000 (19:31 +1100)]
60.nfs only fails or warns after 10 consecutive nfsd/statd failures.

These failures are sometimes the result of slow restarts so we want to
avoid dirtying the logs or marking a node unhealthy because of them,
unless they are excessive.

For these 2 cases we use the existing fail counting code but hack a
temporary service_name in a subshell to allow separate fail counts.

We also update ctdb_check_rpc() so that it captures the error output
from rpcinfo and we add a message including the service name to the
beginning.  The error is printed to stdout but is also stored in
ctdb_check_rpc_out to allow it to be conditionally used by the caller.
This function also now returns non-zero rather than exiting on
failure.

Other direct rpcinfo calls are replaced by calls to ctdb_check_rpc()
for consistency.

Option handling code for service restarts is cleaned up so that it fits
in 80 columns.  A more informative restart message is now used in all
cases, printing the exact command being used to start a service.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoTest suite: fix typo in ctdb ping test grep pattern.
Martin Schwenke [Tue, 12 Oct 2010 00:10:38 +0000 (11:10 +1100)]
Test suite: fix typo in ctdb ping test grep pattern.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoTest suite: match changed output for ctdb ping to disconnected node.
Martin Schwenke [Wed, 6 Oct 2010 05:32:22 +0000 (16:32 +1100)]
Test suite: match changed output for ctdb ping to disconnected node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoTest suite: make statistics test cope with changes to statistics output.
Martin Schwenke [Fri, 15 Oct 2010 04:09:08 +0000 (15:09 +1100)]
Test suite: make statistics test cope with changes to statistics output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
13 years agoinitialize the statistics to the current time, not start of epoch
Ronnie Sahlberg [Mon, 15 Nov 2010 05:30:44 +0000 (16:30 +1100)]
initialize the statistics to the current time, not start of epoch
this makes "ctdb statistics" show correct "start of starts collection"

13 years agoDont exit the update ip function if the old and new interfaces are the same
Ronnie Sahlberg [Wed, 10 Nov 2010 03:47:28 +0000 (14:47 +1100)]
Dont exit the update ip function if the old and new interfaces are the same
since if they are the same for whatever reason this triggers the system
to go into an infinite loop and is unrobust

The scripts have been changed instead to be able to cope with this
situation for enhanced robustness

During takeover_run and when merging all ip allocations across the cluster
try to keep track of when and which node currently hosts an ip address
so that we avoid extra ip failovers between nodes

13 years agochange the takeover script timeout to 9 seconds from 5
Ronnie Sahlberg [Wed, 10 Nov 2010 03:46:45 +0000 (14:46 +1100)]
change the takeover script timeout to 9 seconds from 5

13 years agoDont check remote ip allocation if public ip mgmt is disabled
Ronnie Sahlberg [Wed, 10 Nov 2010 03:46:05 +0000 (14:46 +1100)]
Dont check remote ip allocation if public ip mgmt is disabled

13 years agothis stuff is just so fragile that it will enter infinite recovery and fail loops
Ronnie Sahlberg [Wed, 10 Nov 2010 03:45:43 +0000 (14:45 +1100)]
this stuff is just so fragile  that it will enter infinite recovery and fail loops
on any kind of tiny unexpected error

unconditionally try to remove ip addresses from both old and new interface
before trying to add it to the new interface to make it less
fragile
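
The "remove from both, then add" ordering can be modelled as below. This is a Python sketch over an in-memory dict, not the eventscript; move_ip() and the dict layout are hypothetical.

```python
def move_ip(interfaces, ip, old_iface, new_iface):
    """Move ip to new_iface in a dict mapping interface name -> set of ips.

    The address is removed from both interfaces unconditionally before it is
    added, so the result is the same whether or not old_iface == new_iface
    and whether or not the address was already present anywhere.
    """
    for iface in (old_iface, new_iface):
        interfaces.setdefault(iface, set()).discard(ip)
    interfaces[new_iface].add(ip)
    return interfaces
```

discard() never raises for a missing address, which mirrors the intent of not failing on small unexpected differences in state.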

13 years agodelete from old interface before adding to new interface
Ronnie Sahlberg [Wed, 10 Nov 2010 03:40:43 +0000 (14:40 +1100)]
delete from old interface before adding to new interface
this stops the script from failing with an error if
both interfaces are specified as the same, which otherwise breaks and leads to an infinite recovery loop

13 years agodelay loading the public ip address file until after we have started the transport...
Ronnie Sahlberg [Wed, 10 Nov 2010 01:59:25 +0000 (12:59 +1100)]
delay loading the public ip address file until after we have started the transport and discovered our own pnn number

13 years agowhen we load the public address file, at the same time check if we are already hosti...
Ronnie Sahlberg [Wed, 10 Nov 2010 01:11:11 +0000 (12:11 +1100)]
when we load the public address file,  at the same time check if we are already hosting the public address, if so, set ourselves up as the pnn for that address

13 years agodont check the public ip assignment or even if we are hosting them and shouldnt
Ronnie Sahlberg [Wed, 10 Nov 2010 01:06:05 +0000 (12:06 +1100)]
dont check the public ip assignment or even if we are hosting them and shouldnt
when public ips have been disabled

13 years agoAdd a new tunable : DisableIPFailover that when set to non 0
Ronnie Sahlberg [Tue, 9 Nov 2010 04:19:06 +0000 (15:19 +1100)]
Add a new tunable : DisableIPFailover that when set to non 0
will stop any ip reallocations at all from happening.

13 years agochange the default for how long to wait before dropping all ips to 120 seconds
Ronnie Sahlberg [Tue, 9 Nov 2010 01:59:05 +0000 (12:59 +1100)]
change the default for how long to wait before dropping all ips to 120 seconds

13 years agodont delete all ips from the system during the initial "init" event
Ronnie Sahlberg [Tue, 9 Nov 2010 01:56:02 +0000 (12:56 +1100)]
dont delete all ips from the system during the initial "init" event
leave any ips as they are and let the recovery daemon remove them as required

13 years agowhen creating/adding a public ip, set the initial interface to be the first interface...
Ronnie Sahlberg [Tue, 9 Nov 2010 01:55:20 +0000 (12:55 +1100)]
when creating/adding a public ip, set the initial interface to be the first interface specified

13 years agoBoth nfs and nfslock scripts can fail under redhat in very rare situations.
Ronnie Sahlberg [Thu, 28 Oct 2010 02:43:57 +0000 (13:43 +1100)]
Both nfs and nfslock scripts can fail under redhat in very rare situations.
Ctdb can also be configured to ignore checking for knfsd and if it is alive.
In that situation, no attempt will be made to restart nfs, and since nfs is not running, lockd can not be restarted either.

To work around this, every time we try to restart the lockmanager, also try to restart nfsd

13 years agoduring shutdown there is a window after we have stopped TCP and disconnected from...
Ronnie Sahlberg [Thu, 28 Oct 2010 02:38:34 +0000 (13:38 +1100)]
during shutdown there is a window after we have stopped TCP and disconnected from all other nodes but before we have stopped all processing.

During this window we may still hit asynchronous events that will fail because we can not send/receive packets from other nodes.

These messages are logged as "... Transport is DOWN." to help indicate that they are benign messages related to the process of shutting down.

These messages spam the syslog during normal shutdown, so this patch will drop the loglevel of these messages to DEBUG, so that they will not appear in or spam the syslog.

13 years agoWhen shutting down, we always unconditionally try to remove the natgw address
Ronnie Sahlberg [Thu, 28 Oct 2010 02:36:24 +0000 (13:36 +1100)]
When shutting down, we always unconditionally try to remove the natgw address
even if we are not currently the natgw master.
This adds extra reliability in case we have stopped previously without removing it properly,
but does add spam messages to syslog every time we shut down.

Remove these spam messages from polluting the syslog upon normal shutdown

13 years agoRedirect the output from 00.ctdb pfetch to stdout.
Ronnie Sahlberg [Thu, 28 Oct 2010 02:34:33 +0000 (13:34 +1100)]
Redirect the output from 00.ctdb pfetch to stdout.
Normally, the config.tdb database would not exist, so we do not need
to spam syslog with a "config.tdb does not exist" message every time we start ctdb

13 years agoDrop the loglevel of the "reqid wrap" developer debug message to DEBUG
Ronnie Sahlberg [Thu, 28 Oct 2010 02:32:29 +0000 (13:32 +1100)]
Drop the loglevel of the "reqid wrap" developer debug message to DEBUG
so that we dont spam the logs with this normal benign message.

13 years agoAdd support to create TDB databases using the new jenkins hash.
Ronnie Sahlberg [Mon, 25 Oct 2010 00:31:12 +0000 (11:31 +1100)]
Add support to create TDB databases using the new jenkins hash.

SRVID for the control to attach to a database is used to pass
tdb flags from samba to ctdb when samba attached to a database.
This has been used earlier for TDB_NOSYNC flag.

Add TDB_INCOMPATIBLE_HASH as a supported tdb flag to store in the
SRVID field when attaching to a database.

This allows samba to control if ctdb should create databases using the
new jenkins hash, or using the old hash.
This only affects new databases when they are initially created.
Existing databases remain using the old hash when attached to.
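
A minimal sketch of the flag handling described above, in Python for illustration. The numeric flag values are assumed to match tdb.h and should be checked against the header; attach_flags() and hash_for_attach() are hypothetical helpers, not ctdb functions.

```python
# Assumed values; verify against tdb.h before relying on them.
TDB_NOSYNC = 64
TDB_INCOMPATIBLE_HASH = 2048

# Flags that are honoured when passed in the SRVID of an attach control.
SUPPORTED_ATTACH_FLAGS = TDB_NOSYNC | TDB_INCOMPATIBLE_HASH


def attach_flags(srvid):
    """Keep only the tdb flags that are supported in the SRVID field."""
    return srvid & SUPPORTED_ATTACH_FLAGS


def hash_for_attach(srvid, existing_hash=None):
    """Pick the hash for a database being attached to.

    An existing database keeps whatever hash it was created with; the flag
    only matters when the database is created for the first time.
    """
    if existing_hash is not None:
        return existing_hash
    if attach_flags(srvid) & TDB_INCOMPATIBLE_HASH:
        return "jenkins"
    return "old"
```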

13 years agonew version 1.10
Ronnie Sahlberg [Thu, 21 Oct 2010 00:12:30 +0000 (11:12 +1100)]
new version 1.10

13 years agoweb: fix link to tdb README
Stefan Metzmacher [Mon, 10 May 2010 07:20:13 +0000 (09:20 +0200)]
web: fix link to tdb README

metze

13 years agodoc: regenerate docs
Stefan Metzmacher [Tue, 14 Sep 2010 14:28:27 +0000 (16:28 +0200)]
doc: regenerate docs

metze

13 years agodoc/ctdb.1: fix "ctdb restore <filename> [<dbname>]" cmdline
Stefan Metzmacher [Tue, 14 Sep 2010 13:14:29 +0000 (15:14 +0200)]
doc/ctdb.1: fix "ctdb restore <filename> [<dbname>]" cmdline

metze

13 years agodoc/ctdb.1: document "persistent" flag for "ctdb attach"
Stefan Metzmacher [Tue, 14 Sep 2010 13:05:42 +0000 (15:05 +0200)]
doc/ctdb.1: document "persistent" flag for "ctdb attach"

metze

13 years agotools/ctdb: allow "ctdb pfetch" only on persistent databases
Stefan Metzmacher [Tue, 14 Sep 2010 14:21:27 +0000 (16:21 +0200)]
tools/ctdb: allow "ctdb pfetch" only on persistent databases

metze

13 years agotools/ctdb: add 'persistent' flag to "ctdb attach"
Stefan Metzmacher [Tue, 14 Sep 2010 12:49:42 +0000 (14:49 +0200)]
tools/ctdb: add 'persistent' flag to "ctdb attach"

metze

13 years agotools/ctdb: let "ctdb catdb" pass the persistent flag to ctdb_attach()
Stefan Metzmacher [Tue, 14 Sep 2010 12:45:16 +0000 (14:45 +0200)]
tools/ctdb: let "ctdb catdb" pass the persistent flag to ctdb_attach()

metze

13 years agoevents.d/11.routing: handle "updateip" event
Stefan Metzmacher [Tue, 19 Oct 2010 17:21:23 +0000 (19:21 +0200)]
events.d/11.routing: handle "updateip" event

metze

13 years agoIf tdb_open() fails when trying to open the vacuuming database,
Ronnie Sahlberg [Wed, 13 Oct 2010 22:49:23 +0000 (09:49 +1100)]
If tdb_open() fails when trying to open the vacuuming database,
print errno so we get some idea of why this failed.

13 years agotry to restart NFS LOCKD if it failed to start
Ronnie Sahlberg [Wed, 13 Oct 2010 21:12:41 +0000 (08:12 +1100)]
try to restart NFS LOCKD if it failed to start

13 years agoRemove a debug message "Timed out waiting ..."
Ronnie Sahlberg [Tue, 12 Oct 2010 22:21:09 +0000 (09:21 +1100)]
Remove a debug message "Timed out waiting ..."
from the ctdb command.

This is a debugging message and is normal to trigger on a busy system.
It should not be logged as ERROR.

13 years agoMake sure the statd directory exists before trying to access the
Ronnie Sahlberg [Mon, 11 Oct 2010 21:02:18 +0000 (08:02 +1100)]
Make sure the statd directory exists before trying to access the
"update trigger" file.

CQ 1020344

13 years agomove extracting the config from config.tdb for public addresses
Ronnie Sahlberg [Mon, 11 Oct 2010 15:49:11 +0000 (02:49 +1100)]
move extracting the config from config.tdb for public addresses
into its own function

13 years agoUpdate latency counters to show min/max and average
Ronnie Sahlberg [Mon, 11 Oct 2010 04:11:18 +0000 (15:11 +1100)]
Update latency counters to show min/max and average
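
One way such a counter can be kept, sketched in Python. The LatencyCounter class and its field names are hypothetical and do not reflect the actual ctdb statistics structures.

```python
class LatencyCounter:
    """Running min/max/average over latency samples (hypothetical layout)."""

    def __init__(self):
        self.num = 0        # number of samples seen
        self.total = 0.0    # sum of all samples, used for the average
        self.min = None     # smallest sample so far
        self.max = None     # largest sample so far

    def update(self, latency):
        self.num += 1
        self.total += latency
        if self.min is None or latency < self.min:
            self.min = latency
        if self.max is None or latency > self.max:
            self.max = latency

    def average(self):
        # Avoid dividing by zero before the first sample arrives.
        return self.total / self.num if self.num else 0.0
```

Keeping only the count and the sum is enough for the average, so no per-sample history has to be stored.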

13 years agoUpdate the default hash size to be 100001 instead of 10000
Ronnie Sahlberg [Sun, 10 Oct 2010 20:09:18 +0000 (07:09 +1100)]
Update the default hash size to be 100001 instead of 10000
This can sometimes improve performance for environments where very many
files are touched in rapid succession

13 years agoRevert "change the hash function to use the much better Jenkins hash"
Ronnie Sahlberg [Sun, 10 Oct 2010 20:05:41 +0000 (07:05 +1100)]
Revert "change the hash function to use the much better Jenkins hash"

This reverts commit f7e91ae905cd61249028e15f2cb509ea69f10b9e.

This may require a change to the ctdb protocol, or a mechanism
to negotiate/verify that we dont run with different hash functions
across the cluster.

Reverting the change until we decide how to solve this in the master
version.

13 years agodont stop checking interfaces after the first bond device
Ronnie Sahlberg [Fri, 8 Oct 2010 23:54:12 +0000 (10:54 +1100)]
dont stop checking interfaces after the first bond device
continue the loop to process all other interfaces too

13 years agoSpotted by rusty.
Ronnie Sahlberg [Fri, 8 Oct 2010 04:51:44 +0000 (15:51 +1100)]
Spotted by rusty.

Add a missing $
so we delete $_ip   and not _ip

13 years agochange the hash function to use the much better Jenkins hash
Ronnie Sahlberg [Fri, 8 Oct 2010 02:14:14 +0000 (13:14 +1100)]
change the hash function to use the much better Jenkins hash
from the tdb library

cq S1020233

13 years agoMerge commit 'rusty/tdb-update'
Ronnie Sahlberg [Fri, 8 Oct 2010 01:49:08 +0000 (12:49 +1100)]
Merge commit 'rusty/tdb-update'

13 years agoidtree: fix right shift of signed ints, crash on large ids on AIX
Rusty Russell [Tue, 5 Oct 2010 02:36:19 +0000 (13:06 +1030)]
idtree: fix right shift of signed ints, crash on large ids on AIX

Right-shifting signed integers is undefined; indeed it seems that on
AIX with their compiler, doing a 30-bit shift on (INT_MAX-200) gives
0, not 1 as we might expect.

The obvious fix is to make id and oid unsigned: l (level count) is also
logically unsigned.

(Note: Samba doesn't generally get to ids > 1 billion, but ctdb does)
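
The expected value of 1 can be checked with plain arithmetic. The Python sketch below assumes a 32-bit int and only models the unsigned shift; it does not reproduce the AIX compiler behaviour, and logical_shift_right() is a hypothetical helper.

```python
INT_MAX = 2**31 - 1  # assumes a 32-bit int


def logical_shift_right(value, bits, width=32):
    """Shift value as an unsigned width-bit quantity would be shifted."""
    return (value & ((1 << width) - 1)) >> bits


# The case from the message: bit 30 of (INT_MAX - 200) is set, bit 31 is not.
large_id = INT_MAX - 200
top_bits = logical_shift_right(large_id, 30)

# For contrast: Python's >> on a negative int is an arithmetic shift,
# while the unsigned view of the same 32-bit pattern gives a small index.
arithmetic = -1 >> 30
unsigned_view = logical_shift_right(-1, 30)
```

With unsigned ids the shift always yields a small non-negative value, which is what the level-by-level lookup needs.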

Reported-by: Chris Cowan <cc@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-User: Rusty Russell <rusty@samba.org>
Autobuild-Date: Wed Oct  6 08:31:09 UTC 2010 on sn-devel-104

13 years agoget rid of the "ctdb setflags" command since
Ronnie Sahlberg [Thu, 7 Oct 2010 05:18:27 +0000 (16:18 +1100)]
get rid of the "ctdb setflags" command since
1, we dont need it
2, it uses the ugly "modify flags" control that should die

13 years agoidtree: fix right shift of signed ints, crash on large ids on AIX
Rusty Russell [Tue, 5 Oct 2010 02:36:19 +0000 (13:06 +1030)]
idtree: fix right shift of signed ints, crash on large ids on AIX

Right-shifting signed integers is undefined; indeed it seems that on
AIX with their compiler, doing a 30-bit shift on (INT_MAX-200) gives
0, not 1 as we might expect.

The obvious fix is to make id and oid unsigned: l (level count) is also
logically unsigned.

(Note: Samba doesn't generally get to ids > 1 billion, but ctdb does)

Reported-by: Chris Cowan <cc@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-User: Rusty Russell <rusty@samba.org>
Autobuild-Date: Wed Oct  6 08:31:09 UTC 2010 on sn-devel-104

13 years agopytdb: Add __version__ attribute.
Jelmer Vernooij [Mon, 4 Oct 2010 11:17:25 +0000 (13:17 +0200)]
pytdb: Add __version__ attribute.

13 years agopytdb: Include Python.h first to prevent warning.
Jelmer Vernooij [Sat, 2 Oct 2010 21:40:19 +0000 (23:40 +0200)]
pytdb: Include Python.h first to prevent warning.

13 years agopytdb: Check errors after PyObject_New() calls
Kirill Smelkov [Sat, 2 Oct 2010 13:43:50 +0000 (17:43 +0400)]
pytdb: Check errors after PyObject_New() calls

The call could fail with e.g. MemoryError, and we'll dereference NULL
pointer without checking.

Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Add support for tdb_repack()
Kirill Smelkov [Sat, 2 Oct 2010 13:43:46 +0000 (17:43 +0400)]
pytdb: Add support for tdb_repack()

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Add TDB_INCOMPATIBLE_HASH open flag
Kirill Smelkov [Sat, 2 Oct 2010 13:43:40 +0000 (17:43 +0400)]
pytdb: Add TDB_INCOMPATIBLE_HASH open flag

In 2dcf76 Rusty added TDB_INCOMPATIBLE_HASH open flag which selects
Jenkins lookup3 hash for new databases.

Expose this flag to python users too.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agotdb: fix non-WAF build, commit 1.2.6 ABI file.
Rusty Russell [Mon, 27 Sep 2010 01:36:51 +0000 (11:06 +0930)]
tdb: fix non-WAF build, commit 1.2.6 ABI file.

Sorry Jeremy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
13 years agotdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.
Rusty Russell [Fri, 24 Sep 2010 06:15:11 +0000 (15:45 +0930)]
tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.

This flag to tdb_open/tdb_open_ex affects creation of a new database:
1) Uses the Jenkins lookup3 hash instead of the old gdbm hash if none is
   specified,
2) Places a non-zero field in header->rwlocks, so older versions of TDB will
   refuse to open it.

This means that the caller (ie Samba) can set this flag to safely
change the hash function.  Versions of TDB from this one on will either
use the correct hash or refuse to open (if a different hash is specified).
Older TDB versions will see the nonzero rwlocks field and refuse to open
it under any conditions.
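
The compatibility rules above can be summarised as a small decision model. This is a Python sketch of the behaviour as described in this message, not TDB's code: the header dict, the placeholder rwlocks value of 1 and the function names are all hypothetical.

```python
OLD, JENKINS = "old", "jenkins"


def create_header(incompatible_hash):
    """Model the header of a newly created database."""
    if incompatible_hash:
        # Jenkins hash plus a non-zero rwlocks marker (1 is a placeholder).
        return {"hash": JENKINS, "rwlocks": 1}
    return {"hash": OLD, "rwlocks": 0}


def try_open(header, reader_is_new, requested_hash=None):
    """Return the hash the reader ends up using, or None if it refuses."""
    if not reader_is_new:
        # Older TDB: a non-zero rwlocks field means refuse, unconditionally.
        if header["rwlocks"] != 0:
            return None
        return requested_hash or OLD
    # Newer TDB: refuse if the caller insists on a different hash,
    # otherwise use the hash the database was created with.
    if requested_hash is not None and requested_hash != header["hash"]:
        return None
    return header["hash"]
```

In this model no reader ever opens a Jenkins-hashed database with the wrong hash: it either uses the right one or refuses.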

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
13 years agotdb: automatically identify Jenkins hash tdbs
Rusty Russell [Fri, 24 Sep 2010 06:09:43 +0000 (15:39 +0930)]
tdb: automatically identify Jenkins hash tdbs

If the caller to tdb_open_ex() doesn't specify a hash, and tdb_old_hash
doesn't match, try tdb_jenkins_hash.

This was Metze's idea: it makes life simpler, especially with the upcoming
TDB_INCOMPATIBLE_HASH flag.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
13 years agotdb: add Bob Jenkins lookup3 hash as helper hash.
Rusty Russell [Fri, 24 Sep 2010 06:04:06 +0000 (15:34 +0930)]
tdb: add Bob Jenkins lookup3 hash as helper hash.

This is a better hash than the default: shipping it with tdb makes it easy
for callers to use it as the hash by passing it to tdb_open_ex().

This version taken from CCAN and modified, which took it from
http://www.burtleburtle.net/bob/c/lookup3.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
13 years agotdb: add restore
Volker Lendecke [Sat, 18 Sep 2010 06:56:10 +0000 (10:56 +0400)]
tdb: add restore

Based on an idea by Simon McVittie, largely rewritten

13 years agolib/tdb: fix c++ build warning in tdb_header_hash().
Günther Deschner [Mon, 20 Sep 2010 23:01:51 +0000 (16:01 -0700)]
lib/tdb: fix c++ build warning in tdb_header_hash().

Guenther

13 years agopytdb: Make filename argument optional.
Jelmer Vernooij [Sun, 19 Sep 2010 17:42:29 +0000 (10:42 -0700)]
pytdb: Make filename argument optional.

13 years agopytdb: Add support for tdb_freelist_size()
Kirill Smelkov [Sun, 19 Sep 2010 09:53:29 +0000 (13:53 +0400)]
pytdb: Add support for tdb_freelist_size()

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Add support for tdb_transaction_prepare_commit()
Kirill Smelkov [Sun, 19 Sep 2010 09:53:32 +0000 (13:53 +0400)]
pytdb: Add support for tdb_transaction_prepare_commit()

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Add support for tdb_enable_seqnum, tdb_get_seqnum and tdb_increment_seqnum_non...
Kirill Smelkov [Sun, 19 Sep 2010 16:34:33 +0000 (09:34 -0700)]
pytdb: Add support for tdb_enable_seqnum, tdb_get_seqnum and tdb_increment_seqnum_nonblock

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Update open flags to match those for tdb_open() in tdb.h
Kirill Smelkov [Sun, 19 Sep 2010 09:53:19 +0000 (13:53 +0400)]
pytdb: Update open flags to match those for tdb_open() in tdb.h

Namely TDB_NOSYNC, TDB_SEQNUM, TDB_VOLATILE, TDB_ALLOW_NESTING and
TDB_DISALLOW_NESTING were missing.

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Fix repr segfault for internal db
Kirill Smelkov [Sun, 19 Sep 2010 09:53:21 +0000 (13:53 +0400)]
pytdb: Fix repr segfault for internal db

The problem was tdb->name is NULL for TDB_INTERNAL databases, and
so it was crashing ...

    #0  0xb76944f3 in strlen () from /lib/i686/cmov/libc.so.6
    #1  0x0809862b in PyString_FromFormatV (format=0xb72b6a26 "Tdb('%s')", vargs=0xbfc26a94 "")
        at ../Objects/stringobject.c:211
    #2  0x08098888 in PyString_FromFormat (format=0xb72b6a26 "Tdb('%s')") at ../Objects/stringobject.c:358
    #3  0xb72b65f2 in tdb_object_repr (self=0xb759e060) at ./pytdb.c:439

Cc: 597089@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agopytdb: Add support for tdb_add_flags() & tdb_remove_flags()
Kirill Smelkov [Sun, 19 Sep 2010 09:53:20 +0000 (13:53 +0400)]
pytdb: Add support for tdb_add_flags() & tdb_remove_flags()

Note, unlike tdb_open where flags is `int', tdb_{add,remove}_flags want
flags as `unsigned', so instead of "i" I used "I" in PyArg_ParseTuple.

Cc: 597386@bugs.debian.org
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
13 years agotdb: added TDB_NO_FSYNC env variable
Andrew Tridgell [Thu, 16 Sep 2010 10:06:44 +0000 (20:06 +1000)]
tdb: added TDB_NO_FSYNC env variable

this might help reduce test times and load on test machines