Ronnie Sahlberg [Fri, 4 Jun 2010 06:26:52 +0000 (16:26 +1000)]
add logging functions
Ronnie Sahlberg [Fri, 4 Jun 2010 04:47:06 +0000 (14:47 +1000)]
Readrecordlock changes:
Make the use of ctdb_release_lock() mandatory from the callback.
Split ctdb_release_lock() in two: release the tdb lock in
ctdb_release_lock() and move the freeing of the lock structure to
ctdb_free_lock(), which is private to libctdb.
When the callback returns, verify that the callback has actually released the lock and warn (FIXME) if not.
Update ctdb_writerecord to warn and fail (FIXME) if writing while the lock is not held.
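The release-then-verify flow described above can be sketched roughly as follows. This is only an illustration: every demo_* name is hypothetical and is not the libctdb API, and having the wrapper release a forgotten lock itself is an assumption of the sketch (the commit only says it warns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the libctdb lock object. */
struct demo_lock {
	bool held;	/* true while the record lock is held */
};

typedef void (*demo_rrl_cb)(struct demo_lock *lock, void *private_data);

/* Analogue of ctdb_release_lock(): drops the lock, does not free it. */
static void demo_release_lock(struct demo_lock *lock)
{
	lock->held = false;
}

/* Analogue of ctdb_writerecord(): warn and fail if the lock is not held. */
static int demo_writerecord(struct demo_lock *lock)
{
	if (!lock->held) {
		fprintf(stderr, "write attempted without holding the lock\n");
		return -1;
	}
	return 0;
}

/* Run the callback, then verify that it released the lock.
 * Returns 0 if the callback behaved, -1 if it forgot to release. */
static int demo_invoke_callback(demo_rrl_cb cb, void *private_data)
{
	struct demo_lock lock = { .held = true };
	int ret = 0;

	cb(&lock, private_data);
	if (lock.held) {
		fprintf(stderr, "callback returned without releasing the lock\n");
		demo_release_lock(&lock);	/* assumption: clean up on the caller's behalf */
		ret = -1;
	}
	/* The private free step (ctdb_free_lock() in the commit) would go here. */
	return ret;
}

/* A well-behaved callback: writes, then releases. */
static void good_cb(struct demo_lock *lock, void *private_data)
{
	int *write_result = private_data;

	*write_result = demo_writerecord(lock);
	demo_release_lock(lock);
}

/* A buggy callback: returns without releasing the lock. */
static void bad_cb(struct demo_lock *lock, void *private_data)
{
	(void)lock;
	(void)private_data;
}
```

Calling demo_invoke_callback(good_cb, &result) returns 0, while demo_invoke_callback(bad_cb, NULL) prints the warning and returns -1.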
Ronnie Sahlberg [Fri, 4 Jun 2010 04:20:17 +0000 (14:20 +1000)]
remove the global rrl_cb_called from the libctdb example
and pass it through the callback via private_data.
add a comment that the callback may sometimes have already been invoked
when the ctdb_readrecordlock_async() call returns
and that the application can use *private_data IF the application
needs to know if the callback has already triggered or not.
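A minimal sketch of the private_data pattern described above, assuming a hypothetical demo_readrecordlock_async() that stands in for the real call; the fast_path parameter exists only to simulate the case where the callback fires before the call returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*demo_cb)(int status, void *private_data);

/* Hypothetical async call. On the simulated fast path the callback is
 * invoked before this function returns; otherwise it would be invoked
 * later from the event loop (not modelled here). */
static int demo_readrecordlock_async(bool fast_path, demo_cb cb,
				     void *private_data)
{
	if (fast_path) {
		cb(0, private_data);
	}
	return 0;
}

/* The callback records that it ran through private_data, so no global
 * flag is needed. */
static void demo_rrl_cb(int status, void *private_data)
{
	bool *cb_called = private_data;

	(void)status;
	*cb_called = true;
}
```

The caller keeps a local bool, passes its address as private_data, and checks it after the call returns to learn whether the callback has already triggered.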
Rusty Russell [Fri, 4 Jun 2010 04:03:08 +0000 (13:33 +0930)]
libctdb: change callback for ctdb_readrecordlock.
After discussion with Ronnie, we decided to revisit this interface. We use
the name ctdb_readrecordlock_async, as it is *not* always a send, and we
use a specific callback to avoid the "fake request" creation on the fast
path.
The request itself is never exposed: this means it can't be cancelled,
but we can revisit that later if need be.
This makes both use and implementation simpler.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 4 Jun 2010 04:04:06 +0000 (13:34 +0930)]
libctdb: fix wrong argument being handed to callback on attachdb fail
When attachdb failed, we were handing the db, not the user-supplied
arg to the callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Ronnie Sahlberg [Wed, 2 Jun 2010 07:06:14 +0000 (17:06 +1000)]
When we say "current time of statistics" in the "ctdb statistics" output,
print the current time and not the start time
Ronnie Sahlberg [Wed, 2 Jun 2010 06:49:05 +0000 (16:49 +1000)]
ctdb_req_control contains 4 padding bytes. Create an explicit pad variable here and set it to 0 when creating a control to keep valgrind happy.
PDUs are padded to 8 byte boundary. If padding is used, memset it to 0
to keep valgrind happy.
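As an illustration of the padding rule above, here is a small sketch that rounds a payload up to an 8-byte boundary and zeroes the tail so no uninitialised bytes are sent. The demo_* names are hypothetical; only the 8-byte boundary and the memset to 0 come from the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PDU_ALIGN 8	/* PDUs are padded to an 8 byte boundary */

/* Round len up to the next multiple of DEMO_PDU_ALIGN. */
static size_t demo_padded_len(size_t len)
{
	return (len + DEMO_PDU_ALIGN - 1) & ~(size_t)(DEMO_PDU_ALIGN - 1);
}

/* Copy the payload into a freshly allocated, padded buffer and zero the
 * padding bytes. The caller frees the result. */
static uint8_t *demo_build_pdu(const void *payload, size_t len,
			       size_t *out_len)
{
	size_t padded;
	uint8_t *buf;

	assert(payload != NULL && len > 0);
	padded = demo_padded_len(len);
	buf = malloc(padded);
	if (buf == NULL) {
		return NULL;
	}
	memcpy(buf, payload, len);
	memset(buf + len, 0, padded - len);	/* keeps valgrind quiet */
	*out_len = padded;
	return buf;
}
```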
Ronnie Sahlberg [Wed, 2 Jun 2010 05:13:32 +0000 (15:13 +1000)]
Add the offsetof macro to libctdb
change all calls to new_ctdb_request() to use the offset macro to calculate the correct size (instead of allocating one byte too many and hoping the alignment padding saves us.)
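To show why offsetof gives the exact size, consider a header that ends in a one-byte data array. The struct below is hypothetical and is not the real ctdb request layout; it only demonstrates how sizeof() includes the placeholder byte and any trailing alignment padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request header followed by a variable-length payload. */
struct demo_req {
	uint32_t length;
	uint32_t opcode;
	uint32_t datalen;
	uint8_t  data[1];	/* first byte of the payload */
};

/* Over-allocates: sizeof() counts data[0] plus trailing padding. */
static size_t demo_size_sizeof(size_t datalen)
{
	return sizeof(struct demo_req) + datalen;
}

/* Exact: the header ends where data begins. */
static size_t demo_size_offsetof(size_t datalen)
{
	return offsetof(struct demo_req, data) + datalen;
}
```

For a 10-byte payload demo_size_offsetof() returns 22 (12 header bytes plus 10), while demo_size_sizeof() returns a larger value that depends on the compiler's padding.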
Ronnie Sahlberg [Wed, 2 Jun 2010 03:49:34 +0000 (13:49 +1000)]
Make the call to free the request explicit in the callback
instead of implicit
Ronnie Sahlberg [Wed, 2 Jun 2010 03:42:03 +0000 (13:42 +1000)]
Automatically free the request once the callback has returned.
Ronnie Sahlberg [Wed, 2 Jun 2010 03:13:09 +0000 (13:13 +1000)]
Add a variable for start/current time to ctdb statistics
and print the time the statistics were taken and for how long the statistics have been collected to the "ctdb statistics" output.
Ronnie Sahlberg [Wed, 2 Jun 2010 00:37:00 +0000 (10:37 +1000)]
link ctdb with libctdb and connect to the daemon both the old way and by using libctdb
update the function "control_pnn()" to use libctdb to ask the daemon for the pnn
Ronnie Sahlberg [Wed, 2 Jun 2010 00:36:19 +0000 (10:36 +1000)]
add a sync wrapper for the getpnn control
Ronnie Sahlberg [Wed, 2 Jun 2010 00:25:31 +0000 (10:25 +1000)]
add a function to read the current socketname from the ctdb structure
Ronnie Sahlberg [Wed, 2 Jun 2010 00:05:58 +0000 (10:05 +1000)]
rename ctdb_remove_message_handler to ctdb_client_remove_message_handler
to avoid conflict with the function of the same name in libctdb
Ronnie Sahlberg [Wed, 2 Jun 2010 00:00:58 +0000 (10:00 +1000)]
rename ctdb_message_fn_t to ctdb_msg_fn_t to avoid a conflict with the type of the same name used in libctdb
Ronnie Sahlberg [Tue, 1 Jun 2010 23:51:47 +0000 (09:51 +1000)]
rename ctdb_set_message_handler to ctdb_client_set_message_handler
to avoid a collision with the function of the same name in libctdb
Ronnie Sahlberg [Tue, 1 Jun 2010 23:45:21 +0000 (09:45 +1000)]
rename ctdb_send_message to ctdb_client_send_message to resolve a collision with the function of the same name in libctdb
Ronnie Sahlberg [Tue, 1 Jun 2010 23:43:16 +0000 (09:43 +1000)]
Don't link with libctdb
This makes it easier to ensure we catch all places
when we rename old api functions due to collisions with
functions in libctdb
Ronnie Sahlberg [Tue, 1 Jun 2010 23:24:17 +0000 (09:24 +1000)]
Update the tst.c example application for libctdb to
use the headers from /usr/include
and add a comment about how to compile it
Ronnie Sahlberg [Tue, 1 Jun 2010 23:18:48 +0000 (09:18 +1000)]
rename ccan/typesafe_cb.h to ctdb_typesafe_cb.h and
add this file to the install/rpm
Ronnie Sahlberg [Tue, 1 Jun 2010 06:22:48 +0000 (16:22 +1000)]
When adding an ip at runtime, it might not yet have an iface assigned to it, so ensure that the next takeover_ip call will fall through to accept the ip and add it.
Ronnie Sahlberg [Tue, 1 Jun 2010 04:51:09 +0000 (14:51 +1000)]
Don't check linkstatus for loopback. This interface never has
issues with the physical layer
Ronnie Sahlberg [Tue, 1 Jun 2010 02:43:32 +0000 (12:43 +1000)]
Prevent clients from connecting to the natgw address.
This address is dedicated for outgoing connections.
BZ62613
Ronnie Sahlberg [Wed, 26 May 2010 03:55:19 +0000 (13:55 +1000)]
add a gplv3 boilerplate to the example application for libctdb
Ronnie Sahlberg [Wed, 26 May 2010 03:43:28 +0000 (13:43 +1000)]
check if vnn is a valid pointer before dereferencing it
based on Rusty's patch for bz62783
Ronnie Sahlberg [Wed, 26 May 2010 00:01:37 +0000 (10:01 +1000)]
move the header files and libctdb.a out into a separate ctdb-devel rpm
Ronnie Sahlberg [Tue, 25 May 2010 23:01:26 +0000 (09:01 +1000)]
make install to install libctdb.a
Ronnie Sahlberg [Tue, 25 May 2010 22:56:46 +0000 (08:56 +1000)]
make sure we build libctdb for "make all"
Ronnie Sahlberg [Tue, 25 May 2010 02:48:49 +0000 (12:48 +1000)]
Merge commit 'rusty/libctdb2'
Ronnie Sahlberg [Tue, 25 May 2010 02:47:15 +0000 (12:47 +1000)]
new version 1.9
Ronnie Sahlberg [Mon, 24 May 2010 02:33:47 +0000 (12:33 +1000)]
Add monitoring of quorum and make the node UNHEALTHY when quorum is lost
Ronnie Sahlberg [Sun, 23 May 2010 23:51:52 +0000 (09:51 +1000)]
in 62.cnfs, lines in /etc/exports can have the exports quoted,
so strip off any initial " on the exports line
Ronnie Sahlberg [Fri, 21 May 2010 04:25:47 +0000 (14:25 +1000)]
It was possible for ->recovery_mode to get out of sync with the new three db priorities in such a way that
->recovery_mode was set to normal but database priority level 2 or 3 was still set to frozen,
causing the recovery daemon to fail to detect that a recovery was needed to recover access to the database.
BZ63951
Rusty Russell [Mon, 24 May 2010 04:22:17 +0000 (13:52 +0930)]
libctdb: tweak interface for readrecordlock
Previously we could hang in poll with the callback pending (since we
fake it): explicitly call it immediately.
Note: I experienced corruption using DLIST_ADD_END (ctdb->pnn was blatted
when adding to the message_handler list). I switched them all to DLIST_ADD,
but maybe I'm using it wrong?
Rusty Russell [Mon, 24 May 2010 03:47:36 +0000 (13:17 +0930)]
libctdb: uniform callbacks, _recv functions to pull out data.
This is a bit tricky for those cases where we need to do multiple or
zero I/Os (eg. attachdb and readrecordlock), but works well for the
simple cases.
Rusty Russell [Fri, 21 May 2010 02:43:10 +0000 (12:13 +0930)]
patch libctdb-single-callback.patch
Rusty Russell [Thu, 20 May 2010 06:46:04 +0000 (16:16 +0930)]
tst.c: update to Ronnie's latest
This provides a slightly more comprehensive test of the library.
Rusty Russell [Thu, 20 May 2010 06:27:40 +0000 (15:57 +0930)]
libctdb: Ronnie's build changes, so we actually build libctdb with make.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 21 May 2010 02:37:41 +0000 (12:07 +0930)]
libctdb: first cut, supports getrecmaster only
This is a completely standalone library using only ctdb_protocol.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Header from folded patch 'libctdb-message-handling.patch':
libctdb: add message handling to libctdb.
Now clients can send and receive ctdb messages.
Rusty Russell [Thu, 20 May 2010 06:37:30 +0000 (16:07 +0930)]
libctdb: API changes from Ronnie's version
These simplifications mostly came up due to the implementation.
o Rename ctdb_context to ctdb_connection.
We already have a ctdb_context internally in ctdbd; don't confuse them!
o Rename ctdb_handle to struct ctdb_request.
From the user POV it's a request, and it's also useful internally to
avoid implicit cast to/from void *.
o Rename ctdb_db_context to ctdb_db.
o Introduce ctdb_lock.
This provides an explicit "lock object" you get from readrecordlock
and have to hand to those functions which need you to hold a lock.
o status args are "int" not int32_t.
Should this be a bool?
o Remove last traces of generic callback.
Without semi-async API, this doesn't help anything and loses type safety.
o Remove the semi-async API.
We can add this later, but I think a sync and async API is enough for
our poor users for the moment :)
o Registering a message handler also takes a callback.
This way you can tell if it failed. Not sure if this is overkill, but it's
consistent.
o ctdb_service() takes an revents arg
Strictly not necessary for a nonblocking fd, but nice to know if a
read or write is possible.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 20 May 2010 06:31:28 +0000 (16:01 +0930)]
libctdb: ctdb.h and tst.c from Ronnie
This imports ctdb.h and tst.c from Ronnie's work: it's a separate commit
for now to make the changes obvious.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 20 May 2010 05:48:30 +0000 (15:18 +0930)]
libctdb: reorganize headers: remove ctdb.h, add ctdb_client.h and ctdb_protocol.h
ctdb_client.h is the existing internal client interface (which was mainly
in ctdb.h), and ctdb_protocol.h is the information needed for the wire
protocol only.
ctdb.h will be the new, shiny, libctdb API.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Ronnie Sahlberg [Thu, 20 May 2010 02:35:57 +0000 (12:35 +1000)]
In control_ipreallocate() we wait at most 5 tries before aborting the command
and returning an error.
This might not be sufficient if there are several recoveries in a row.
Instead loop as long as it takes for the recovery master to finish the recoveries and
respond to the ipreallocate call.
Increase the log level of the error message when the recovery master was busy and could
not perform the ipreallocation promptly
BZ61783
Ronnie sahlberg [Thu, 20 May 2010 01:44:18 +0000 (11:44 +1000)]
document how to restore a backup into a different database
Ronnie Sahlberg [Thu, 20 May 2010 01:26:37 +0000 (11:26 +1000)]
Enhance the "ctdb restoredb" command so you can restore a backup into a different database.
Ronnie sahlberg [Tue, 11 May 2010 09:44:11 +0000 (19:44 +1000)]
Merge commit 'obnox/master-rebase'
Michael Adam [Fri, 26 Mar 2010 15:40:00 +0000 (16:40 +0100)]
functions: when checking for a directory also check whether it can be accessed.
Thanks to "waKKu" on irc for this improvement.
Michael
Michael Adam [Mon, 15 Mar 2010 15:35:39 +0000 (16:35 +0100)]
tests:ctdb_bench: make send_start_messages() static - eliminates compile warning
Michael
Michael Adam [Mon, 15 Mar 2010 15:34:43 +0000 (16:34 +0100)]
tests: eliminate a floating point exception by requiring -n option to ctdb_bench
Michael
Ronnie Sahlberg [Mon, 10 May 2010 23:28:59 +0000 (09:28 +1000)]
Add the number of performed recoveries to the "ctdb statistics" output.
Rusty Russell [Sat, 8 May 2010 12:54:11 +0000 (22:24 +0930)]
ctdb: use full range of IDR
This resolves a problem with huge numbers of requests which could overflow
16 bits. Fortunately, the IDR should scale reasonably well, so we can simply
hold all the requests.
Although no one checks for failure, I added a constant for that.
BZ: 60540
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Ronnie Sahlberg [Wed, 5 May 2010 23:33:08 +0000 (09:33 +1000)]
when performing a recovery,
ensure that all nodes use the same reclock file setting as the recovery master
Ronnie Sahlberg [Tue, 4 May 2010 03:56:55 +0000 (13:56 +1000)]
Add a new eventscript 62.cnfs to integrate better with gpfs/cnfs
Ronnie sahlberg [Mon, 3 May 2010 05:57:41 +0000 (15:57 +1000)]
Merge commit 'rusty/signal-fix'
Ronnie Sahlberg [Mon, 3 May 2010 05:52:02 +0000 (15:52 +1000)]
Don't check ip assignment across the cluster while ip-verification
checks are disabled
Ronnie Sahlberg [Wed, 28 Apr 2010 05:43:11 +0000 (15:43 +1000)]
The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.
Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.
BZ62782
Ronnie Sahlberg [Wed, 28 Apr 2010 04:47:37 +0000 (14:47 +1000)]
Make create_merged_ip_list() a static function since
it is not called from outside of ctdb_takeover.c
Ronnie Sahlberg [Wed, 28 Apr 2010 04:44:53 +0000 (14:44 +1000)]
In the log message when we have found an inconsistent ip address allocation,
add extra log information about what the inconsistency is.
Ronnie Sahlberg [Tue, 27 Apr 2010 22:46:41 +0000 (08:46 +1000)]
If the admin makes a configuration mistake and configures NATGW to use the
same ip address as a normal public-address,
check for this in the natgw script and warn the user.
Also prevent ctdb from starting up since this configuration will not work.
BZ60933
Ronnie sahlberg [Thu, 22 Apr 2010 23:25:25 +0000 (09:25 +1000)]
Merge commit 'rusty/tdb-update'
Ronnie Sahlberg [Thu, 22 Apr 2010 22:52:09 +0000 (08:52 +1000)]
Add a setting where CTDB will monitor and warn for low memory conditions.
CTDB_MONITOR_FREE_MEMORY_WARN
BZ 59747
Ronnie Sahlberg [Thu, 22 Apr 2010 22:35:01 +0000 (08:35 +1000)]
In the example script to remove all ip addresses after a ctdb crash,
add the NATGW address as one to be removed in addition to the
public addresses.
Rusty Russell [Thu, 22 Apr 2010 04:41:38 +0000 (14:11 +0930)]
tdb: define _PUBLIC_ so we can compile tdb.
The Samba tree defines _PUBLIC_ (and _PRIVATE_) for libraries to
control visibility. The last commit absorbed this from their tdb,
but we need to #define to stub it out since ctdb doesn't use it
(and doesn't need to: we only use tdb internally).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Andrew Tridgell [Thu, 22 Apr 2010 04:31:36 +0000 (14:01 +0930)]
tdb: update tdb ABI to use hide_symbols=True
We now use -fvisibility=hidden to hide symbols from outside the tdb
shared library.
This also moved tdb_transaction_recover() into the tdb_private.h
header, as it should never have been a public API. For that reason we
are changing the version number. We're only doing a minor version
increment as it is extremely unlikely that anyone was actually using
tdb_transaction_recover() as its locking requirements were rather
unusual.
Pair-Programmed-With: Rusty Russell <rusty@samba.org>
(Imported from commit
773a8afbba27a5e2e48577100f3ca9873b506615)
Jelmer Vernooij [Thu, 22 Apr 2010 04:29:22 +0000 (13:59 +0930)]
subunit: Support formatting compatible with upstream subunit, for consistency.
Upstream subunit makes a ":" after commands optional, so I've fixed any
places where we might trigger commands accidentally. I've filed a bug
about this in subunit.
(Imported from commit
7da94cc4a664521be279b019e9f32121cd410193)
Simo Sorce [Thu, 22 Apr 2010 04:28:35 +0000 (13:58 +0930)]
tdb: update exports and signatures files
(Imported from commit
c1f6f61f620e865516d1856c9d937b5326a29046)
Volker Lendecke [Thu, 22 Apr 2010 04:28:35 +0000 (13:58 +0930)]
tdb: Add a non-blocking version of tdb_transaction_start
(Imported from commit
261c3b4f1beed820647061bacbee3acccbcbb089)
Volker Lendecke [Thu, 22 Apr 2010 04:28:07 +0000 (13:58 +0930)]
tdb: Fix indentation in tdb_new_database()
(Imported from commit
59315887a07033316edf91c0c57563eee5ea992d)
Volker Lendecke [Thu, 22 Apr 2010 04:28:07 +0000 (13:58 +0930)]
Fix some nonempty blank lines
(Imported from commit
ea8e0d5d54b020c530e392c4edaeed43e20af303)
Andrew Tridgell [Thu, 22 Apr 2010 04:27:17 +0000 (13:57 +0930)]
python: use '#!/usr/bin/env python' to cope with varying install locations
this should be much more portable
(Imported from commit
088096d1bad51428a2e2d487214995d4fdfc7ccc)
Volker Lendecke [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: Fix bug 7248, avoid the nanosleep dependency
(Imported from commit
e2c7e5c4f72565fe49265d5b036531926ea1ac92)
Volker Lendecke [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: If tdb_parse_record does not find a record, return -1 instead of 0
(Imported from commit
fb98f60594b6cabc52d0f2f49eda08f793ba4748)
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: handle processes dying during transaction commit.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit. Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.
The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:
Before:
2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75
After:
2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
ec96ea690edbe3398d690b4a953d487ca1773f1c)
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch
(Imported from commit
1bf482b9ef9ec73dd7ee4387d7087aa3955503dd)
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: add -k option to tdbtorture
To test the case of death of a process during transaction commit, add
a -k (kill random) option to tdbtorture. The easiest way to do this
is to make every worker a child (unless there's only one child), which
is why this patch is bigger than you might expect.
Using -k without -t (always transactions) you expect corruption, though
it doesn't happen every time. With -t, we currently get corruption but
the next patch fixes that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
ececeffd85db1b27c07cdf91a921fd203006daf6)
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: don't truncate tdb on recovery
The current recovery code truncates the tdb file on recovery. This is
fine if recovery is only done on first open, but is a really bad idea
as we move to allowing recovery on "live" databases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
8c3fda4318adc71899bc41486d5616da3a91a688)
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: remove lock ops
Now the transaction code uses the standard allrecord lock, that stops
us from trying to grab any per-record locks anyway. We don't need to
have special noop lock ops for transactions.
This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
9f295eecffd92e55584fc36539cd85cd32c832de)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it. Change
this, and rename it to show it's clearly transaction-specific.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
a84222bbaf9ed2c7b9c61b8157b2e3c85f17fa32)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Now the transaction allrecord lock is the standard one, and thus is cleaned
in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
dd1b508c63034452673dbfee9956f52a1b6c90a5)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.
Then we use this in the transaction code. Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.
One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
fca1621965c547e2d076eca2a2599e9629f91266)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: suppress record write locks when allrecord lock is taken.
Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).
The write record lock, grabbed by the delete code, is not suppressed
by the allrecord lock. This is now bad: it causes us to punch a hole
in the allrecord lock when we release the write record lock. Make this
consistent: *no* record locks of any kind when the allrecord lock is
taken.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
caaf5c6baa1a4f340c1f38edd99b3a8b56621b8b)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: always grab allrecord lock to infinity.
We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains. Change it to always grab to end of file for simplicity and
so we can merge the two.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
9341f230f8968b4b18e451d15dda5ccbe7787768)
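For reference, "to end of file" maps onto POSIX fcntl locking by setting l_len to 0, which makes the lock extend from l_start to the largest possible offset, however far the file grows. The sketch below only fills in a struct flock; DEMO_FREELIST_TOP is a hypothetical placeholder value, not the real tdb constant.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_FREELIST_TOP 168	/* hypothetical offset, for illustration only */

/* Describe a lock covering everything from DEMO_FREELIST_TOP onwards.
 * type is F_RDLCK or F_WRLCK. The result would be handed to
 * fcntl(fd, F_SETLKW, fl) by real locking code. */
static void demo_allrecord_flock(struct flock *fl, short type)
{
	memset(fl, 0, sizeof(*fl));
	fl->l_type = type;
	fl->l_whence = SEEK_SET;
	fl->l_start = DEMO_FREELIST_TOP;
	fl->l_len = 0;	/* 0 means "to end of file and beyond" */
}
```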
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: remove num_locks
This was redundant before this patch series: it mirrored num_lockrecs
exactly. It still does.
Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
1ab8776247f89b143b6e58f4b038ab4bcea20d3a)
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: use tdb_nest_lock() for seqnum lock.
This is pure overhead, but it centralizes the locking. Realloc (esp. as
most implementations are lazy) is fast compared to the fcntl anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
d48c3e4982a38fb6b568ed3903e55e07a0fe5ca6)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for active lock.
Use our newly-generic nested lock tracking for the active lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
4738d474c412cc59d26fcea64007e99094e8b675)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for open lock.
This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.
Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
9136818df30c7179e1cffa18201cdfc990ebd7b7)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for transaction lock.
Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
e8fa70a321d489b454b07bd65e9b0d95084168de)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: find_nestlock() helper.
Factor out two loops which find locks; we are going to introduce a couple
more so a helper makes sense.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
ce41411c84760684ce539b6a302a0623a6a78a72)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_release_extra_locks() helper
Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
db270734d8b4208e00ce9de5af1af7ee11823f6d)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_have_extra_locks() helper
In many places we check whether locks are held: add a helper to do this.
The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
fba42f1fb4f81b8913cce5a23ca5350ba45f40e1)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: don't suppress the transaction lock because of the allrecord lock.
tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we
hold the allrecord lock. However, the two locks don't overlap, so
this is wrong.
This simplification makes the transaction lock a straight-forward nested
lock.
There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock
isn't held.
2) The traverse code, which wants to stop transactions whether it has the
allrecord lock or not. There have been deadlocks here before, however
this should not bring them back (I hope!)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
b754f61d235bdc3e410b60014d6be4072645e16f)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0. We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.
To generalize this:
1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
allrecord check out to the callers (except the mark case which doesn't
care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
move the allrecord out to the callers (except mark again).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
5d9de604d92d227899e9b861c6beafb2e4fa61e0)
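The nesting scheme in the commit above (count per offset, touch the real lock only on the 0 to 1 and 1 to 0 transitions) might look roughly like this. It is a simplified sketch with hypothetical demo_* names and a fixed-size array; the real fcntl calls are replaced by counters so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_LOCKS 16

/* One tracked lock: the file offset it covers and its nest count. */
struct demo_lock_rec {
	long offset;
	int count;
};

struct demo_ctx {
	struct demo_lock_rec lockrecs[DEMO_MAX_LOCKS];
	size_t num_lockrecs;
	int os_lock_calls;	/* stands in for fcntl() lock calls */
	int os_unlock_calls;	/* stands in for fcntl() unlock calls */
};

/* Find the tracking record for an offset, or NULL if not held. */
static struct demo_lock_rec *demo_find(struct demo_ctx *c, long offset)
{
	for (size_t i = 0; i < c->num_lockrecs; i++) {
		if (c->lockrecs[i].offset == offset) {
			return &c->lockrecs[i];
		}
	}
	return NULL;
}

/* Take a nested lock: only the first holder places the real lock. */
static int demo_nest_lock(struct demo_ctx *c, long offset)
{
	struct demo_lock_rec *r = demo_find(c, offset);

	if (r != NULL) {
		r->count++;
		return 0;
	}
	if (c->num_lockrecs == DEMO_MAX_LOCKS) {
		return -1;
	}
	c->os_lock_calls++;	/* real code would call fcntl() here */
	c->lockrecs[c->num_lockrecs].offset = offset;
	c->lockrecs[c->num_lockrecs].count = 1;
	c->num_lockrecs++;
	return 0;
}

/* Drop a nested lock: only the last holder releases the real lock. */
static int demo_nest_unlock(struct demo_ctx *c, long offset)
{
	struct demo_lock_rec *r = demo_find(c, offset);

	if (r == NULL) {
		return -1;	/* not held */
	}
	if (--r->count > 0) {
		return 0;
	}
	c->os_unlock_calls++;	/* real code would call fcntl() here */
	*r = c->lockrecs[--c->num_lockrecs];	/* fill the hole with the last entry */
	return 0;
}
```

Locking the same offset twice results in a single underlying lock call, and the underlying unlock happens only on the second demo_nest_unlock().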
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename global_lock to allrecord_lock.
The word global is overloaded in tdb. The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.
Rename it to allrecord_lock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
e9114a758538d460d4f9deae5ce631bf44b1eff8)
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).
Rename it to OPEN_LOCK.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
7ab422d6fbd4f8be02838089a41f872d538ee7a7)
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: make _tdb_transaction_cancel static.
Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
a6e0ef87d25734760fe77b87a9fd11db56760955)
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: cleanup: split brlock and brunlock methods.
This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.
For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fcntl locks don't need this). This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.
We also use a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit
452b4a5a6efeecfb5c83475f1375ddc25bcddfbe)
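One way to picture the flags argument from the commit above: a single bitmask replaces three separate knobs. The flag values and helper names below are hypothetical, and the meanings given to the mark-only and probe bits (skip the kernel lock, suppress failure logging) are an assumed reading of the commit text rather than the tdb source.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical flag values modelled on the names in the commit. */
enum demo_lock_flags {
	DEMO_LOCK_WAIT      = 0,	/* default: block until granted */
	DEMO_LOCK_NOWAIT    = 1,	/* fail immediately if contended */
	DEMO_LOCK_MARK_ONLY = 2,	/* record the lock, do not place it */
	DEMO_LOCK_PROBE     = 4,	/* caller expects possible failure */
};

/* NOWAIT selects F_SETLK; otherwise use the blocking F_SETLKW. */
static int demo_fcntl_cmd(int flags)
{
	return (flags & DEMO_LOCK_NOWAIT) ? F_SETLK : F_SETLKW;
}

/* Assumed meaning: mark-only locks never reach fcntl(). */
static bool demo_touches_kernel(int flags)
{
	return !(flags & DEMO_LOCK_MARK_ONLY);
}

/* Assumed meaning: a probing caller does not want failures logged. */
static bool demo_should_log_failure(int flags)
{
	return !(flags & DEMO_LOCK_PROBE);
}
```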
Brad Hards [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
Spelling fixes for tdb.
Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>
(Imported from commit
09e756b1d651caef203a4b7e02234f6dea374b08)
Andrew Tridgell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: use fdatasync() instead of fsync() in transactions
This might help on some filesystems
(Imported from commit
1373e748aa53fbd3afe4d2377208257d42628d86)
Volker Lendecke [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: Apply some const, just for clarity
(Imported from commit
6824c6f46ba7c15e8af91d5aa8b21a946b63107b)