Amitay Isaacs [Thu, 25 Sep 2014 07:55:15 +0000 (17:55 +1000)]
daemon: Fix the usage for lock helper
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 25 17:16:31 CEST 2014 on sn-devel-104
(Imported from commit
0f92de8463b71a2d7e9acdd27454be7859713436)
Amitay Isaacs [Thu, 25 Sep 2014 07:17:04 +0000 (17:17 +1000)]
recoverd: If obtaining recovery lock fails, try again
When ctdb daemon starts up, it considers itself the recovery master
and tries to do first recovery. However, it's possible that there is
already a recovery master and the current node has not yet heard from it.
So do not ban ourselves immediately if ctdb_recovery_lock() fails when
doing first recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
57310f80c9b8146a0978d912f73b0a64fde7697e)
Amitay Isaacs [Thu, 25 Sep 2014 02:46:22 +0000 (12:46 +1000)]
scripts: Fix the regular expression for parsing /proc/locks
The major and minor device numbers are hexadecimal not decimal.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 25 07:19:59 CEST 2014 on sn-devel-104
(Imported from commit
f1e281cd47d9ebd79e09294606b8fa411ec0fbb4)
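The point about hexadecimal device numbers in the /proc/locks fix above can be illustrated with a small C sketch. This is an analogy, not the patched script: the function name parse_lock_dev is hypothetical, and the sketch assumes the common line layout in which the sixth field is MAJOR:MINOR:INODE and the line carries no "->" blocked-lock marker.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: extract MAJOR:MINOR:INODE from one /proc/locks
 * line. Major and minor are read as hexadecimal (%x), the inode as
 * decimal (%lu). */
bool parse_lock_dev(const char *line, unsigned int *dev_major,
                    unsigned int *dev_minor, unsigned long *inode)
{
    /* Skip the five leading fields (index, class, mode, type, pid),
     * then read the device/inode field. */
    int n = sscanf(line, "%*s %*s %*s %*s %*s %x:%x:%lu",
                   dev_major, dev_minor, inode);

    return n == 3;
}
```

A device field such as fd:00 only parses correctly with a hexadecimal conversion; a decimal pattern would reject "fd" and would misread a minor of "11" as 11 instead of 17.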
Amitay Isaacs [Thu, 25 Sep 2014 02:44:59 +0000 (12:44 +1000)]
locking: Reset ttimer before doing an early return
When the timer expires, the timeout handler sets lock_ctx->ttimer
to a newly created timer event. However, when a node is INACTIVE,
the timeout handler returns early with lock_ctx->ttimer still set to
the previous timer event. That event is freed when the callback
returns, leaving lock_ctx->ttimer pointing at an already freed event.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
c64369cba2e5a975d87d518737abbf04c9871a26)
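The ttimer problem described in the commit above follows a common pattern, sketched here in C. The types and names (struct timer_event, struct lock_ctx, timeout_handler) are simplified stand-ins chosen for illustration; they are not the CTDB or tevent definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the real types; all names here are illustrative. */
struct timer_event { int id; };

struct lock_ctx {
    struct timer_event *ttimer; /* currently armed timer, or NULL */
    bool node_inactive;
};

/* Timeout handler: the event that just fired is freed by the caller
 * once this function returns, so ctx->ttimer must not keep pointing
 * at it on any return path. */
void timeout_handler(struct lock_ctx *ctx, struct timer_event *next)
{
    /* Clear the stale pointer first, before any early return. */
    ctx->ttimer = NULL;

    if (ctx->node_inactive) {
        return; /* not re-armed, and no dangling pointer left behind */
    }

    ctx->ttimer = next; /* re-arm with the newly created event */
}
```

Resetting the pointer at the top of the handler means every return path, including ones added later, leaves the context in a safe state.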
Amitay Isaacs [Mon, 22 Sep 2014 01:54:04 +0000 (11:54 +1000)]
doc: Update NEWS
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 16 Sep 2014 02:33:26 +0000 (12:33 +1000)]
scripts: Do not export variables if they are not set
Variables that are exported but not set may return an empty string
for getenv(). Tested on FreeBSD.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Sep 17 09:55:47 CEST 2014 on sn-devel-104
(Imported from commit
22257dd4b6d226ee956ede5a847ce0bcb99333be)
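On the consuming side, the getenv() behaviour mentioned in the commit above means a NULL check alone may not be enough. A minimal C sketch, assuming the caller wants to treat an empty value the same as an unset one (the helper name env_is_set is hypothetical):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: a variable counts as "set" only if it exists
 * in the environment and its value is non-empty. */
bool env_is_set(const char *name)
{
    const char *value = getenv(name);

    return value != NULL && value[0] != '\0';
}
```

The commit itself takes the other route and avoids exporting unset variables in the scripts, so that callers never see the empty string in the first place.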
Amitay Isaacs [Tue, 16 Sep 2014 02:00:10 +0000 (12:00 +1000)]
scripts: Fix a typo
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
8509bffdebb7884b765904f8112ff83056511a30)
Martin Schwenke [Mon, 15 Sep 2014 01:48:33 +0000 (11:48 +1000)]
util: Log an error if there is no way to set scheduler
Although configure should catch this, logging a run-time error is
better than being mystified when ctdbd silently exits.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
fb9c49f2ce0838baa5f94f4ca03d1c92cb58b306)
Amitay Isaacs [Fri, 12 Sep 2014 06:24:09 +0000 (16:24 +1000)]
doc: Add reference to new manpage ctdb-statistics
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Sep 12 11:13:56 CEST 2014 on sn-devel-104
(Imported from commit
d744eb03c5236284cf0141c1a2f687263cbd8414)
Amitay Isaacs [Fri, 12 Sep 2014 04:22:00 +0000 (14:22 +1000)]
doc: Add ctdb-statistics manual page
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
efd34bb274a5ed015d7fe9374718671e0d7f9cc6)
Amitay Isaacs [Fri, 12 Sep 2014 00:50:27 +0000 (10:50 +1000)]
daemon: Decrement pending calls statistics when calls are deferred
Deferred calls should not be treated as pending calls since they are
re-processed from the beginning.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
f5f11e1a05d4d75a7662d6c413a14c4cd18f8ed9)
Amitay Isaacs [Fri, 12 Sep 2014 01:25:14 +0000 (11:25 +1000)]
tests: Do not expect real-time priority when running local daemons
Local daemons are started mainly for testing and usually not as root.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
3c1bae12217ead74863a7cdd9b8a338aef80adb1)
Amitay Isaacs [Fri, 12 Sep 2014 01:22:36 +0000 (11:22 +1000)]
daemon: Make sure ctdb runs with real-time priority
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
d410b20601cccd8b67d48c42a6d689cd65e94f61)
Martin Schwenke [Wed, 13 Aug 2014 05:01:54 +0000 (15:01 +1000)]
locking: Fork lock helper with vfork_with_logging()
Otherwise errors printed by the lock helper get lost.
lock_helper_args() no longer adds the program name to the list of
arguments, since vfork_with_logging() does that. Update the lock
helper to handle the extra log_fd parameter passed by
vfork_with_logging() and send stdout/stderr there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
7ae7a9c46301e4fed870516c448a79bb7a9ac53a)
Martin Schwenke [Wed, 13 Aug 2014 04:46:31 +0000 (14:46 +1000)]
locking: Add argc parameter to lock_helper_args()
To make this sane, also add an argv parameter and change the return
type to bool. Anticipating a subsequent change, make the type of argv
match what is needed by vfork_with_logging() and cast it when passing
to execv(). This also means changing the type of the name member of
struct db_namelist.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
2e17b0ecddffb8590c4e8b9afaf1767ef7e8f89c)
Amitay Isaacs [Wed, 10 Sep 2014 08:14:24 +0000 (18:14 +1000)]
locking: Set real-time priority for lock helpers
To avoid lock helper starvation when userspace robust mutexes are
enabled.
Commit 6f072f85a138f595494dbec137bcf23d1e666acc removed
reset_scheduler() to avoid resetting scheduler priority. However, that
is not sufficient because commit
1be8564e553ce044426dbe7b3987edf514832940 sets the SCHED_RESET_ON_FORK
flag. With SCHED_RESET_ON_FORK, all CTDB child processes
automatically get normal scheduling priority.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 11 11:31:10 CEST 2014 on sn-devel-104
(Imported from commit
4e5a6b154e1549e959c5de4b58432e33c0d57b55)
Amitay Isaacs [Tue, 9 Sep 2014 09:01:55 +0000 (19:01 +1000)]
daemon: Increment pending calls statistics correctly
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
e6127a9eceb215e421ee56c09032bb1e81c8131e)
Amitay Isaacs [Tue, 2 Sep 2014 06:10:20 +0000 (16:10 +1000)]
call: Drop all deferred requests from older generation
Deferring packets has a nasty interaction with recovery. All deferred
packets must be dropped when recovery happens, since those packets are
tracked as pending requests and will be re-sent with new generation.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Sep 5 09:30:50 CEST 2014 on sn-devel-104
(Imported from commit
2c57cc9597cb9cfe5ab3a458df74d6b5cda45465)
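The idea of dropping deferred requests from an older generation, as in the commit above, can be sketched in C as follows. The structure and function names (struct deferred_call, drop_stale) are assumptions made for illustration and do not reflect the actual CTDB data structures.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative deferred-call record; not the CTDB structure. */
struct deferred_call {
    uint32_t generation;        /* generation when the call was deferred */
    struct deferred_call *next;
};

/* Unlink and free every entry whose generation differs from `current`.
 * Returns the number of entries dropped. */
size_t drop_stale(struct deferred_call **head, uint32_t current)
{
    size_t dropped = 0;
    struct deferred_call **pp = head;

    while (*pp != NULL) {
        struct deferred_call *c = *pp;

        if (c->generation != current) {
            *pp = c->next; /* unlink the stale entry */
            free(c);
            dropped++;
        } else {
            pp = &c->next; /* keep it and move on */
        }
    }

    return dropped;
}
```

Dropping is safe in this scheme because, as the commit message notes, the same requests are tracked as pending and are re-sent with the new generation.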
Amitay Isaacs [Tue, 19 Aug 2014 11:49:59 +0000 (21:49 +1000)]
locking: Do not reset real-time priority for lock helpers
When using TDB robust mutexes, the kernel wakes waiting processes one
by one, in the priority list order. To ensure that ctdb lock helper
processes do not starve, lock helper processes need to run at a higher
priority than smbd.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
6f072f85a138f595494dbec137bcf23d1e666acc)
Amitay Isaacs [Fri, 15 Aug 2014 05:20:36 +0000 (15:20 +1000)]
daemon: Defer all calls when processing dmaster packets
When CTDB receives DMASTER_REQUEST or DMASTER_REPLY packet, the specified
record needs to be updated as soon as possible to avoid inconsistent
dmaster information between nodes. During this time, queue up all calls
for that record and process them only after dmaster request/reply has
been processed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
ef59f2e6bbd502f7cb58ad3a74a6448ccd1ebe59)
Amitay Isaacs [Fri, 15 Aug 2014 03:33:24 +0000 (13:33 +1000)]
daemon: Remove duplicate code with refactored function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
deb7bb89b3844f209ef73cc5707fcb4673bf08d7)
Amitay Isaacs [Fri, 15 Aug 2014 03:22:29 +0000 (13:22 +1000)]
common: Refactor code to convert TDB_DATA key to aligned uint32 array
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
bd133894672fcf3c79868605466ba7b527af3018)
Amitay Isaacs [Fri, 15 Aug 2014 03:31:37 +0000 (13:31 +1000)]
include: Remove declaration of non-existent function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
13d5af48ac514621a6a820ba800771a7fdb4fe75)
Amitay Isaacs [Mon, 11 Aug 2014 07:10:23 +0000 (17:10 +1000)]
locking: Remove unused function ctdb_free_lock_request_context
There is no need for a special function to free lock request and
corresponding lock context. Freeing lock request will free lock
context also.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
2592ae5a56e813bb7cb68789f93fc281b1822a82)
Amitay Isaacs [Mon, 11 Aug 2014 07:08:20 +0000 (17:08 +1000)]
locking: Talloc lock request from client specified context
This makes sure that when the client context is destroyed, the lock
request goes away. If the lock request is already scheduled, then the
lock child process will be terminated.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
374cbc7b0ff68e04ee4e395935509c7df817b3c0)
Amitay Isaacs [Mon, 11 Aug 2014 06:43:07 +0000 (16:43 +1000)]
locking: Run debug locks script only if the node is active
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
d9e4622a446c9ed60771c508638fb89055320f03)
Martin Schwenke [Mon, 4 Aug 2014 04:50:17 +0000 (14:50 +1000)]
daemon: Fix some strict-aliasing warnings
Seeing these with -Wall:
../server/ctdb_call.c:1117:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen];
^
memcpy() seems to be the easiest way to fix these. The
alternative would be to use unmarshalling functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
6fd3ce53914c5c5aa79b972b42258c722b227b88)
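The memcpy() approach named in the strict-aliasing commit above looks roughly like this in C. The helper name read_u32 is hypothetical; the sketch only shows the general technique of copying bytes into a properly typed variable instead of dereferencing a cast pointer.

```c
#include <stdint.h>
#include <string.h>

/* Read a uint32_t stored at an arbitrary byte offset in a buffer
 * without type-punning through a cast pointer. The value is returned
 * in host byte order, exactly as it was stored. */
uint32_t read_u32(const uint8_t *buf, size_t offset)
{
    uint32_t value;

    memcpy(&value, buf + offset, sizeof(value));
    return value;
}
```

Besides silencing -Wstrict-aliasing, this form also tolerates unaligned offsets, which a direct *(uint32_t *) dereference does not guarantee.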
Martin Schwenke [Wed, 30 Jul 2014 11:10:01 +0000 (21:10 +1000)]
util: Fix warning about ignored result from system()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
2807b185f438c40544d4fd133bc386e411b12d0c)
Martin Schwenke [Wed, 30 Jul 2014 11:03:53 +0000 (21:03 +1000)]
Use sys_read() and sys_write() to ensure correct signal interaction
... and avoid compiler warnings in some cases.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
c1558adeaa980fb4bd6177d36250ec8262e9b9fe)
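The "correct signal interaction" mentioned in the commit above usually means retrying a call that was interrupted by a signal. The following C sketch shows that general pattern under the assumption that only EINTR is retried; read_retry and write_retry are hypothetical names, not the sys_read() and sys_write() functions themselves.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: repeat read() if a signal interrupted it. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t ret;

    do {
        ret = read(fd, buf, count);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical wrapper: repeat write() if a signal interrupted it. */
ssize_t write_retry(int fd, const void *buf, size_t count)
{
    ssize_t ret;

    do {
        ret = write(fd, buf, count);
    } while (ret == -1 && errno == EINTR);

    return ret;
}
```

Because both wrappers return the result, callers are nudged to check it, which also addresses warnings about ignored return values.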
Martin Schwenke [Wed, 30 Jul 2014 10:50:59 +0000 (20:50 +1000)]
common: Copy functions sys_read() and sys_write() from source3
We really should extricate these from source3 and into some common
code. However, just copy them for now to help get rid of a lot of
warnings.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
fcd6ee1eac8627e75f72019027513cc46429a3a9)
Martin Schwenke [Wed, 20 Aug 2014 23:26:39 +0000 (09:26 +1000)]
tools: Be more helpful when CTDB CLI tool is run on unconfigured node
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
72fa984423b77eaddb16b63e6c3857600e054836)
Martin Schwenke [Fri, 8 Aug 2014 00:43:44 +0000 (10:43 +1000)]
tools: Factor out new function find_node_xpnn() from control_xpnn()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
26a02a64cda0501d57db686dee61fda0846083ee)
Martin Schwenke [Thu, 31 Jul 2014 01:03:59 +0000 (11:03 +1000)]
replace: Remove unused item returned by FAILED()
The (return) value of FAILED() is a constant 1. However, it is never
used, so the compiler complains when run with -Wall:
lib/replace/test/os2_delete.c: In function ‘cleanup’:
lib/replace/test/os2_delete.c:39:163: warning: right-hand operand of comma expression has no effect [-Wunused-value]
FAILED("system");
So just remove the ", 1" since it is the bit that does nothing and
is never used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Aug 20 16:54:31 CEST 2014 on sn-devel-104
(Imported from commit
47e7440be9ab422b3b2544c0b071fb8717a7a915)
Amitay Isaacs [Tue, 12 Aug 2014 13:58:00 +0000 (23:58 +1000)]
readonly: Do not abort if revoke of readonly record fails on a node
Revoking readonly record involves first marking the record on dmaster as
RO_REVOKING_READONLY. Then all the other nodes are sent update_record
control to get rid of RO_DELEGATION. Once that succeeds, the record
is marked RO_REVOKING_COMPLETE.
Currently, revoking of readonly delegations on the nodes is tried only
once. If a node goes in recovery, it can fail update_record control and
revoke code will abort ctdb. Since database recovery would revoke all
readonly delegations anyway, there is no reason to abort. Simply undo
the start of revoke process by resetting RO_REVOKING_READONLY flag.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Aug 13 11:24:09 CEST 2014 on sn-devel-104
(Imported from commit
c6d0e8dadcff55ea21973f4f7a89f241180d17e8)
Amitay Isaacs [Tue, 12 Aug 2014 13:54:39 +0000 (23:54 +1000)]
readonly: Add an early return to simplify code
This patch makes the subsequent logic change small and easier to
understand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
f96f395d853e0181d9ee031c3e3f1d31f5cff35c)
Martin Schwenke [Sun, 10 Aug 2014 09:02:42 +0000 (19:02 +1000)]
doc: Fix default database directories in ctdbd.1
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Aug 13 07:06:42 CEST 2014 on sn-devel-104
(Imported from commit
b8e9f6b015811d7fb162634f85721b5d27ab503b)
Volker Lendecke [Mon, 4 Aug 2014 13:57:12 +0000 (13:57 +0000)]
locking: Simplify ctdb_find_lock_context()
I like early returns that avoid else branches :-)
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Aug 6 14:44:31 CEST 2014 on sn-devel-104
(Imported from commit
e185ff22caf430f680f8bad1edf14bc98dd7c64e)
Volker Lendecke [Mon, 4 Aug 2014 12:41:06 +0000 (12:41 +0000)]
locking: TALLOC_FREE copes with NULL
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
9f596c17c7d255213df6201d4d489df1580faef4)
Amitay Isaacs [Thu, 24 Jul 2014 05:56:41 +0000 (15:56 +1000)]
locking: Add per database queues for pending and active lock requests
This avoids traversing a single pending queue which is quite expensive
when there are lots of pending lock requests. This seems to happen
quite a lot on a loaded cluster for notify_index.tdb.
Adding per database queues avoids the need to traverse pending queue
for that database if there are already the maximum number of active
lock requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Aug 4 20:23:45 CEST 2014 on sn-devel-104
(Imported from commit
88f6a6c188b8e43f710c50a9c1f88af660772e3d)
Amitay Isaacs [Wed, 23 Jul 2014 02:52:03 +0000 (12:52 +1000)]
locking: Update a comment
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
f73adff737c8fd3ab797de35bf1463359ce801cd)
Amitay Isaacs [Tue, 15 Jul 2014 04:49:44 +0000 (14:49 +1000)]
locking: Simplify check for locks on record or database
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
a890e760bbcb2d0f384aff285d1282de2a42d313)
Amitay Isaacs [Tue, 15 Jul 2014 04:38:52 +0000 (14:38 +1000)]
locking: Decrement pending statistics when lock is scheduled
and not when the lock is obtained.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
aa1ff305f9bdd97675ceb4ce2b18f4cd623b8a38)
Amitay Isaacs [Tue, 15 Jul 2014 04:38:12 +0000 (14:38 +1000)]
locking: Update ctdb statistics for all lock types
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
dce68a21416dd3dc016ed6a7c884b1314ffca121)
Amitay Isaacs [Tue, 15 Jul 2014 04:13:25 +0000 (14:13 +1000)]
locking: Add DB lock requests to head of the pending queue
This allows DB locks to be scheduled quickly without having to scan through
the pending lock requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
7189437be447d33038eb26bca055b1025cebacd3)
Amitay Isaacs [Tue, 15 Jul 2014 02:59:57 +0000 (12:59 +1000)]
locking: Remove unused variable lock_num_pending
The number of pending locks displayed in ctdb statistics is stored in
ctdb_statistics structure and not ctdb_context.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
3aa96c3a3eb87fc6a1ad94c983e363b402b48ff5)
Amitay Isaacs [Tue, 15 Jul 2014 06:41:31 +0000 (16:41 +1000)]
locking: Increase number of lock processes per database to 200
This was the original limit in the older versions of CTDB.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
3ff8ec02830b5fd2f88e33748a2bfd9f066a1285)
Amitay Isaacs [Tue, 15 Jul 2014 02:13:53 +0000 (12:13 +1000)]
locking: Add new tunable LockProcessesPerDB
This makes the maximum number of lock processes that can be active
configurable.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
59d45ea307fd460953a3b4924dfa60f5ab6dea4a)
Amitay Isaacs [Fri, 30 May 2014 05:49:46 +0000 (15:49 +1000)]
locking: Allocate lock request soon after allocating lock context
This avoids extra work in case lock request allocation fails.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
e0d54594519de67b6f0d0ec003bc0327f70f026b)
Amitay Isaacs [Tue, 15 Jul 2014 04:44:55 +0000 (14:44 +1000)]
locking: Remove unused function find_lock_context()
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
97a5c579574fb5702a743e07b896c9a0ec0acc4f)
Amitay Isaacs [Thu, 29 May 2014 07:27:32 +0000 (17:27 +1000)]
locking: Schedule the next possible lock based on per-db limit
This prevents searching through active lock requests for every pending
lock request to check if the pending lock request can be scheduled or not.
The locks are scheduled in strict first-in-first-out order.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
c9664b4b17660c03ed96072a9f5392dbb0800f2c)
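The scheduling rule described in the commit above (strict first-in-first-out, gated by a per-database limit) can be sketched in C as below. All structure, field, and function names are assumptions made for illustration; max_active plays the role of a per-database limit such as the LockProcessesPerDB tunable.

```c
#include <stddef.h>

/* Illustrative request and per-database state; not CTDB structures. */
struct lock_req {
    struct lock_req *next;
};

struct db_locks {
    struct lock_req *pending_head; /* oldest pending request */
    struct lock_req *pending_tail; /* newest pending request */
    unsigned int num_active;       /* requests currently running */
    unsigned int max_active;       /* per-database limit */
};

/* Append a request to the tail of the pending queue. */
void pending_add(struct db_locks *db, struct lock_req *req)
{
    req->next = NULL;
    if (db->pending_tail != NULL) {
        db->pending_tail->next = req;
    } else {
        db->pending_head = req;
    }
    db->pending_tail = req;
}

/* Return the oldest pending request if the limit allows it, else NULL.
 * Only a counter is checked; the active requests are never scanned. */
struct lock_req *schedule_next(struct db_locks *db)
{
    struct lock_req *req = db->pending_head;

    if (req == NULL || db->num_active >= db->max_active) {
        return NULL;
    }

    db->pending_head = req->next;
    if (db->pending_head == NULL) {
        db->pending_tail = NULL;
    }
    req->next = NULL;
    db->num_active++;

    return req;
}
```

When a running request finishes, the caller would decrement num_active and call schedule_next() again to start the next request in queue order.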
Amitay Isaacs [Fri, 30 May 2014 05:36:03 +0000 (15:36 +1000)]
locking: Remove multiple lock requests per lock context (part 2)
Store only a single request instead of storing a queue in lock context.
Lock request structure does not need to be a linked list any more.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
19b3810b61e4856dc5ab6b8a5a785b836172f01e)
Amitay Isaacs [Fri, 30 May 2014 05:36:28 +0000 (15:36 +1000)]
locking: Remove multiple lock requests per lock context (part 1)
This was a bad idea and caused out of order scheduling of lock requests.
The logic to append lock requests to existing lock context is already
commented out. Remove the commented-out code; there is no need to check
if lock_ctx is NULL, since we are always creating a new one.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
a89f3508796d6be8efe45ccc1f9ffee7e4d3f4f3)
Amitay Isaacs [Fri, 30 May 2014 03:57:44 +0000 (13:57 +1000)]
locking: Remove unused structure members
block_child was used to keep track of a process which was created to debug
why a lock process has blocked. That logic was replaced by executing an
external debug script.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
b93d9c062229252a78a4497fba402ec968be9713)
Amitay Isaacs [Fri, 30 May 2014 03:48:45 +0000 (13:48 +1000)]
locking: Fix the lock_type_str corresponding to LOCK_ALLDB
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
8aa6c039ae8f2bfd99515999e1ce647f0e4028d7)
Martin Schwenke [Tue, 29 Jul 2014 05:08:36 +0000 (15:08 +1000)]
eventscripts: Remove special case for virtio_net
The current check is incorrect in 2 ways:
* Commit be71a84565e9e7532a77c175732b764d1f42c1cd contained a thinko
  that stops virtio_net interfaces from simply being marked up
* virtio_net interfaces can actually be down
virtio_net has supported ethtool since Linux 2.6.29, so just remove
the special case. This means that testing CTDB on very old virtual
machines is not supported.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 31 13:08:47 CEST 2014 on sn-devel-104
(Imported from commit
bc59e508d381e6ec2a47eed1e0bc8fc3025904a2)
Martin Schwenke [Mon, 28 Jul 2014 02:56:47 +0000 (12:56 +1000)]
eventscripts: Remove unused argument to natgw_ensure_master()
This was used to limit damage in the "recovered" event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 29 10:03:16 CEST 2014 on sn-devel-104
(Imported from commit
7c2c6748e323fb0e54fbc2d1b773608904458e94)
Martin Schwenke [Tue, 22 Jul 2014 02:31:44 +0000 (12:31 +1000)]
tests: Add another LCP2 takeover test
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
d011e70215717a0ec94f6ca30d44b0302e4533ef)
Martin Schwenke [Fri, 25 Jul 2014 06:56:57 +0000 (16:56 +1000)]
eventscripts: Remove NAT gateway "monitor" event
This event was introduced to handle misconfiguration. For example,
where all nodes were configured as NAT gateway slaves.
However, this event can fail when there are performance issues and
capabilities can't be retrieved from a remote node. The problem is
most likely with the remote node, so marking the local node UNHEALTHY
is probably a mistake.
Having a NAT gateway master node only matters in "ipreallocated", so
leave it to do the checking. Given that a node will run
"ipreallocated" as part of the first recovery, this should cause
misconfigurations to be detected nice and early.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
cb94eba157679574c05d85f05828195e4099f2ba)
Michael Adam [Wed, 23 Jul 2014 11:34:03 +0000 (13:34 +0200)]
vacuum: stop vacuuming when the first delete_list traverse fails.
This indirect caller of delete_marshall_traverse was missed in
fa4a81c86b6073b2563b090aa657d8e8b63c1276
which lets failure of the second traverse fail the vacuum run.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
9d6f187b5811faed6e9b6c4bc61e42175c0c0ae2)
Amitay Isaacs [Tue, 6 May 2014 08:52:54 +0000 (18:52 +1000)]
vacuum: Use existing function ctdb_marshall_finish
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jul 23 09:44:00 CEST 2014 on sn-devel-104
(Imported from commit
f87b7f664f813957ee55a6f35abb208eb0f3dcad)
Amitay Isaacs [Tue, 6 May 2014 08:39:25 +0000 (18:39 +1000)]
vacuum: Use ctdb_marshall_add to add a record to marshall buffer
This avoids duplicate code and extra talloc in ctdb_marshall_record.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
6edc4f23e9094860ad5cc6b93ce66169dd99047a)
Amitay Isaacs [Tue, 6 May 2014 08:26:41 +0000 (18:26 +1000)]
util: Refactor record marshalling routines to avoid extra talloc
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
42ba7a0a400c970dd534e92d2effa3ed385f8d6d)
Amitay Isaacs [Tue, 22 Jul 2014 11:23:03 +0000 (11:23 +0000)]
util: Refactor ctdb_marshall_record
Create new routines ctdb_marshall_record_size and ctdb_marshall_record_copy
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
64ea6e30ef601d91ea16f6a9c5b7a6b9395c0152)
Amitay Isaacs [Tue, 22 Jul 2014 11:22:25 +0000 (11:22 +0000)]
util: Fix nonempty line endings
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
5eac2302819b26b8eaf4f6c0a333e4af2b368679)
Amitay Isaacs [Thu, 10 Jul 2014 08:38:13 +0000 (18:38 +1000)]
vacuum: If talloc_realloc fails, terminate traverse
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
fa4a81c86b6073b2563b090aa657d8e8b63c1276)
Amitay Isaacs [Tue, 22 Jul 2014 11:19:13 +0000 (11:19 +0000)]
vacuum: Fix talloc hierarchy in delete_marshall_traverse
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit
9a4a9ccda397e20b0a894541f4f1a6d24e09bf19)
Volker Lendecke [Mon, 21 Jul 2014 09:48:45 +0000 (09:48 +0000)]
common: Fix verbose_memory_names
If we have already partly written a packet, "data" and thus "pkt->data"
does not point to the start of the packet anymore. Assign "hdr" while
it still points at the start of the header.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 22 06:09:50 CEST 2014 on sn-devel-104
(Imported from commit
478ef9493f131c4d94bada708f790db3254f0a59)
Volker Lendecke [Mon, 21 Jul 2014 09:42:54 +0000 (09:42 +0000)]
common: Avoid a talloc in ctdb_queue_send
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
70c79f514024551128acc2d3ba879ef1407ed130)
Martin Schwenke [Mon, 2 Jun 2014 09:09:38 +0000 (19:09 +1000)]
recoverd: Gently abort recovery when election is underway
Sometimes the recovery daemon fails to get the recovery lock on one
node so that node is banned. This seems to always happen during an
election. The recovery is triggered because other nodes are found to
have recovery mode enabled. They have recovery mode enabled because
an election has been forced.
The recovery daemon's main_loop() only does an initial check for an
election. After that, a node can force an election and, in the
process, set itself to be the current winner. In this situation,
verify_recmode() will always return MONITOR_RECOVERY_NEEDED so
do_recovery() is called. If the previous recovery master hasn't
admitted defeat and released the recovery lock, then do_recovery()
will rightly fail. However, it would be better if it failed a little
more gracefully, since this case is not that unusual.
Instead of trying to take the recovery lock, return early with an
error if there is an election in progress. Note that the race is
still there but it is now much narrower.
There are probably more subtle ways of avoiding this issue, including
something like this in main_loop():
-       if (pnn != rec->recmaster) {
+       if (pnn != rec->recmaster || rec->election_timeout) {
                return;
        }
However, this check is done earlier so it leaves the race window open
a little wider.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jul 21 06:57:07 CEST 2014 on sn-devel-104
(Imported from commit
705e4174c988eea5c5b3a834710f9f920369c8ee)
Amitay Isaacs [Mon, 14 Jul 2014 06:30:18 +0000 (16:30 +1000)]
ltdb: Use tdb_null instead of zeroing TDB_DATA variable
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Jul 14 16:01:31 CEST 2014 on sn-devel-104
(Imported from commit
208b2d88c4efacee79fe4d856ee8256c680cad5c)
Amitay Isaacs [Sun, 10 Nov 2013 13:32:31 +0000 (00:32 +1100)]
daemon: Support per-node robust mutex feature
To enable TDB mutex support, set tunable TDBMutexEnabled=1.
When databases are attached for the first time, attach flags must include
TDB_MUTEX_LOCKING and TDBMutexEnabled must be set to enable mutex support.
However, when CTDB attaches databases internally for recovery, it will
enable mutex support if TDBMutexEnabled is set.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jul 9 06:45:17 CEST 2014 on sn-devel-104
(Imported from commit
55fbe364b93000c7766e95e16fa35cc6a80c697b)
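The rule stated in the robust mutex commit above amounts to a small decision function, sketched here in C. The function name use_mutexes and the constant EXAMPLE_MUTEX_LOCKING are hypothetical; the constant is a placeholder bit standing in for TDB_MUTEX_LOCKING, whose real value comes from the tdb headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration only; real code would use the
 * TDB_MUTEX_LOCKING flag from the tdb headers. */
#define EXAMPLE_MUTEX_LOCKING 0x1000u

/* Decide whether a database attach should enable mutex support.
 * tunable_enabled: value of the TDBMutexEnabled tunable
 * attach_flags:    flags passed with the attach request
 * internal_attach: true when CTDB attaches internally for recovery */
bool use_mutexes(bool tunable_enabled, uint32_t attach_flags,
                 bool internal_attach)
{
    if (!tunable_enabled) {
        return false; /* the tunable gates everything */
    }

    if (internal_attach) {
        return true; /* recovery attach follows the tunable alone */
    }

    /* A first-time attach must also request mutex locking. */
    return (attach_flags & EXAMPLE_MUTEX_LOCKING) != 0;
}
```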
Amitay Isaacs [Mon, 30 Jun 2014 05:09:32 +0000 (15:09 +1000)]
daemon: Enable robust mutexes only if TDB_MUTEX_LOCKING is defined
Runtime check for robust mutexes is performed just before opening local tdb.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit
2e7b0870ec1014f8320032b86dc54f0a6fd55776)
Volker Lendecke [Wed, 20 Feb 2013 14:09:36 +0000 (15:09 +0100)]
daemon: Allow flag TDB_MUTEX_LOCKING to pass into db_attach
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit
1627171792567fc55290330feaaef9d9efc66c48)
Amitay Isaacs [Tue, 24 Jun 2014 02:04:25 +0000 (12:04 +1000)]
daemon: Simplify code a bit
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit
91be76dbe93a2be763a93163bec8c17d35057944)
Amitay Isaacs [Tue, 24 Jun 2014 01:46:53 +0000 (11:46 +1000)]
daemon: Use false instead of 0 for boolean arguments
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit
1ed330f7cbd753b6c29246d522c5ddca5160d8bb)
Amitay Isaacs [Tue, 6 May 2014 07:44:24 +0000 (17:44 +1000)]
tests: Add a test for ctdb restoredb
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Mon Jul 7 16:06:39 CEST 2014 on sn-devel-104
(Imported from commit
eccce073d084eceb4bfb5c25001b5873e2c0f2b2)
Amitay Isaacs [Tue, 6 May 2014 07:16:19 +0000 (17:16 +1000)]
tests: Check that ctdb wipedb cleans the database
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
9c8c8a7b0bfd4c1cafa3deaa012049b7f0851617)
Amitay Isaacs [Tue, 6 May 2014 04:20:44 +0000 (14:20 +1000)]
daemon: Do not thaw databases if recovery is active
This prevents ctdb tool from thawing databases prematurely in
thaw/wipedb/restoredb commands if recovery is active.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
2855173dac5386bff655d1bb94c1848591b963e1)
Amitay Isaacs [Tue, 6 May 2014 04:24:52 +0000 (14:24 +1000)]
recoverd: Set recovery mode before freezing databases
Setting recovery mode to active is the only correct way to inform recovery
daemon to run database recovery. Freezing databases without setting
recovery mode should not trigger database recovery, as this mechanism
is used in the tool to implement the wipedb/restoredb commands.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
28a1b75886fb4aea65e23bfd00b9f4c98780fdfd)
Amitay Isaacs [Tue, 6 May 2014 04:15:45 +0000 (14:15 +1000)]
tools: There is no need for forcing a recovery
This effectively reverts commit
442953c540424ad0c64f4264b5ee27c45a3130e8.
The correct way of telling recovery daemon to trigger a database recovery is
by setting recovery mode to active. There is no need to freeze databases as
recovery master will do that across the cluster anyway.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
72c6500ee440779819b9adb768a7022cc251f07e)
Amitay Isaacs [Tue, 6 May 2014 04:07:00 +0000 (14:07 +1000)]
Revert "It was possible for ->recovery_mode to get out of sync with the new three db priorities in such a way that"
This reverts commit
6578a97bd94fc14d5b6df85b84e50447f7bdb2e3.
This condition cannot happen since when recovery is triggered, all the
databases would get frozen and thawed in the order of priority. The only
other place where databases get frozen is the implementation of the ctdb
wipedb/restoredb commands.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit
e5cd81da77ef58992b7eb9ff7d972b499b946bb7)
Martin Schwenke [Sat, 31 May 2014 04:01:53 +0000 (14:01 +1000)]
common: Use SCHED_RESET_ON_FORK when setting SCHED_FIFO
This makes the scheduler reset code a no-op.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jul 7 13:28:25 CEST 2014 on sn-devel-104
(Imported from commit
1be8564e553ce044426dbe7b3987edf514832940)
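For readers unfamiliar with the flag named in "common: Use SCHED_RESET_ON_FORK when setting SCHED_FIFO" above, here is a minimal Linux-only sketch in Python (CTDB itself is C). The function names are hypothetical. With the flag ORed into the policy, children created by fork() do not inherit the real-time policy, which is why explicit reset code in children is no longer needed.

```python
import os


def fifo_policy_reset_on_fork() -> int:
    """Return SCHED_FIFO with the reset-on-fork flag ORed in (Linux only)."""
    return os.SCHED_FIFO | os.SCHED_RESET_ON_FORK


def set_realtime(priority: int = 1) -> bool:
    """Try to switch the calling process to SCHED_FIFO.

    Returns False if the kernel refuses, typically for lack of privilege.
    """
    try:
        os.sched_setscheduler(0, fifo_policy_reset_on_fork(),
                              os.sched_param(priority))
    except OSError:
        return False
    return True
```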
Martin Schwenke [Fri, 20 Jun 2014 03:36:25 +0000 (13:36 +1000)]
recoverd: Don't say "Election timed out"
That makes people think there's a problem (and report bugs) so say
something a bit less scary instead...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
a283b9e43a602b9c72065336edbe8ad7c2499117)
Martin Schwenke [Fri, 20 Jun 2014 00:51:16 +0000 (10:51 +1000)]
recoverd: Log a message when releasing the recovery lock
It is a non-trivial event and will make it easier to debug recovery
lock issues.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
8bdb9b85cc02f589a3b219de07f3c2ef7510d937)
Martin Schwenke [Thu, 26 Jun 2014 00:36:17 +0000 (10:36 +1000)]
scripts: Support NFS on RHEL7 with systemd
Need to be able to recognise a RHEL system. Still use "system" to
start and stop service, since that still works and yields the smallest
change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
61b1fdec2fdb19be9b9cd39bc5298917e914cc04)
Amitay Isaacs [Thu, 3 Jul 2014 03:47:07 +0000 (13:47 +1000)]
recoverd: No need to set ctdbd_pid again
This is unnecessary since ctdbd_pid is set very early in the code before
creating any other processes including recovery daemon.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Sat Jul 5 09:20:27 CEST 2014 on sn-devel-104
(Imported from commit
331fb7fc64c0a4f64c28001a1644a2a6a923be75)
Martin Schwenke [Thu, 3 Jul 2014 02:29:46 +0000 (12:29 +1000)]
daemon: Remove ctdbd_pid global variable
This duplicates ctdb->ctdbd_pid.
Thanks to Sumit Bose <sbose@redhat.com> for the suggestion.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
1677dd499c571081a8ddaf560eb3b033156e1c67)
Martin Schwenke [Mon, 30 Jun 2014 07:22:14 +0000 (17:22 +1000)]
daemon: Check PID in ctdb_remove_pidfile(), not unreliable flag
If something unexpectedly uses fork() then an exiting child will
remove the PID file while the main daemon is still running. The real
test is whether the current process has the PID of the main CTDB
daemon, which is the process that calls setsid().
This could be done using getpgrp() instead. At the moment the
eventscript handler harmlessly calls setpgid() - harmless because the
atexit() handlers are cleared upon exec(). However, it is possible
that process groups will be used more in future so it is probably
better to rely on the session ID.
Thanks to Sumit Bose <sbose@redhat.com> for the idea.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
e454e5ac9c8ed77409d9fa4463b2b29985e67e10)
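The check described in "daemon: Check PID in ctdb_remove_pidfile(), not unreliable flag" above can be sketched as below. This is a Python illustration under the assumption that the test compares the caller's PID with its session ID; it is not the actual C implementation. The pid and sid parameters are hypothetical and exist only so the logic can be exercised without forking.

```python
import os
from typing import Optional


def remove_pidfile(path: str, pid: Optional[int] = None,
                   sid: Optional[int] = None) -> bool:
    """Remove the PID file only when called by the session leader.

    pid and sid default to the calling process's own values.
    """
    pid = os.getpid() if pid is None else pid
    sid = os.getsid(0) if sid is None else sid
    if pid != sid:
        # A forked child: leave the PID file for the main daemon.
        return False
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # already gone, nothing to do
    return True
```

Only the process that called setsid() has a PID equal to its session ID, so an exiting child takes the early return and the file survives.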
Martin Schwenke [Thu, 3 Jul 2014 02:12:20 +0000 (12:12 +1000)]
daemon: Exit if setting the session ID fails
Currently ctdbd_wrapper depends on the session ID. Very soon PID file
removal will too. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
c7b3be97d96ee5a17bb88dceec42c57e9bf69c5d)
Martin Schwenke [Thu, 26 Jun 2014 05:16:12 +0000 (15:16 +1000)]
tests: Fix racy test for debugging hung scripts
Debugging can still be running when a monitor event times out and
scriptstatus output changes.
When debugging a hung script to a log file, write to a temporary file
and move the temporary file over the log file when done. The test
then waits for the log file to appear.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 3 08:19:23 CEST 2014 on sn-devel-104
(Imported from commit
a7c55007659ab768293f15c5f5fc00c5d9e5c814)
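The write-then-move approach from "tests: Fix racy test for debugging hung scripts" above is sketched here in Python; the real change is in shell. The function name and the temporary-file prefix are assumptions. Because the log file only appears once it is complete, a test that waits for the file never sees partial output.

```python
import os
import tempfile


def write_log_atomically(log_path: str, text: str) -> None:
    """Write text to a temporary file, then rename it over log_path."""
    directory = os.path.dirname(log_path) or "."
    # Same directory as the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".debug-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
        os.replace(tmp_path, log_path)  # atomic on POSIX
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temporary file
        raise
```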
Martin Schwenke [Thu, 26 Jun 2014 04:46:54 +0000 (14:46 +1000)]
scripts: Always print footer when debugging hung script
There shouldn't be an early exit for the "init" event. Just make the
"ctdb scriptstatus" call conditional.
While here, move the comment about only running a single instance to
be near locking code. The comment is more useful there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
b0c191e5de15e54646b02925e37458d6a56db015)
Martin Schwenke [Mon, 16 Jun 2014 00:59:20 +0000 (10:59 +1000)]
eventscripts: Ensure $GANRECDIR points to configured subdirectory
Check that the $GANRECDIR symlink points to the location specified by
$CTDB_GANESHA_REC_SUBDIR and replace it if incorrect. This handles
reconfiguration and filesystem changes.
While touching this code:
* Create the $GANRECDIR link as a separate step if it doesn't exist.
This means there is only 1 place where the link is created.
* Change some variable names to the style used for local function
variables.
* Remove some "ln failed" error messages. ln failures will be logged
anyway.
* Add -v to various mkdir/rm/ln commands so that these actions are
logged when they actually do something.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 20 05:40:16 CEST 2014 on sn-devel-104
(Imported from commit
aac607d7271eb50e776423329f2446a1e33a2641)
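The symlink check from "eventscripts: Ensure $GANRECDIR points to configured subdirectory" above amounts to the following idea. This is a Python sketch with a hypothetical function name; the eventscript itself is shell. It assumes the existing path, if any, is a symlink or a plain file rather than a directory.

```python
import os


def ensure_symlink(link: str, target: str) -> bool:
    """Make link a symlink to target; return True if anything changed."""
    if os.path.islink(link) and os.readlink(link) == target:
        return False  # already correct, leave it alone
    if os.path.lexists(link):
        os.unlink(link)  # wrong target or not a symlink: replace it
    os.symlink(target, link)  # the single place where the link is created
    return True
```

Returning whether a change was made mirrors the commit's use of -v, where actions are logged only when they actually do something.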
Martin Schwenke [Wed, 5 Mar 2014 05:21:45 +0000 (16:21 +1100)]
daemon: Debugging for tickle updates
This was useful for debugging the race fixed by commit
4f79fa6c7c843502fcdaa2dead534ea3719b9f69. It might be useful again.
Also fix a nearby comment typo.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 20 02:07:48 CEST 2014 on sn-devel-104
(Imported from commit
6f43896e1258c4cf43401cbfeba24a50de3c3140)
Martin Schwenke [Tue, 10 Jun 2014 05:16:44 +0000 (15:16 +1000)]
tests: Try harder to avoid failures due to repeated recoveries
About a year ago a check was added to _cluster_is_healthy() to make
sure that node 0 isn't in recovery. This was to avoid unexpected
recoveries causing tests to fail. However, it was misguided because
each test initially calls cluster_is_healthy() and will now fail if an
unexpected recovery occurs.
Instead, have cluster_is_healthy() warn if the cluster is in recovery.
Also:
* Rename wait_until_healthy() to wait_until_ready() because it waits
until both healthy and out of recovery.
* Change the post-recovery sleep in restart_ctdb() to 2 seconds and
add a loop to wait (for 2 seconds at a time) if the cluster is back
in recovery. The logic here is that the re-recovery timeout has
been set to 1 second, so sleeping for just 1 second might race
against the next recovery.
* Use reverse logic in node_has_status() so that it works for "all".
* Tweak wait_until() so that it can handle timeouts with a
recheck-interval specified.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
6a552f1a12ebe43f946bbbee2a3846b5a640ae4f)
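A timeout loop with a recheck interval, as mentioned for wait_until() in "tests: Try harder to avoid failures due to repeated recoveries" above, might look like this. It is a Python sketch only: the test suite's wait_until() is a shell function, and the signature here (including the injectable sleep) is an assumption made for illustration.

```python
import time
from typing import Callable


def wait_until(timeout: float, check: Callable[[], bool],
               interval: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() every interval seconds until it passes or time runs out."""
    remaining = timeout
    while True:
        if check():
            return True
        if remaining <= 0:
            return False
        # Never sleep past the deadline when interval does not divide timeout.
        step = min(interval, remaining)
        sleep(step)
        remaining -= step
```

For example, wait_until(30, cluster_ready, interval=2) would recheck every 2 seconds for up to 30 seconds, where cluster_ready is whatever predicate the caller supplies.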
Michael Adam [Sat, 15 Feb 2014 00:36:06 +0000 (01:36 +0100)]
vacuum: always run freelist_size again
and not only if repack_limit != 0. This partially reverts
commit
48f2d1158820bfb063ba0a0bbfb6f496a8e7522.
With the new tdb code this defragments the
free list by merging adjacent records.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
5334881afab42eae77bb2015ec21cbfe1df87807)
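To illustrate what "merging adjacent records" means in "vacuum: always run freelist_size again" above, here is a toy Python model. It is not tdb's on-disk free list or its code; representing free records as (offset, length) tuples is an assumption made purely for the example.

```python
from typing import List, Tuple


def merge_adjacent(free: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Coalesce (offset, length) free records that touch each other."""
    merged: List[Tuple[int, int]] = []
    for offset, length in sorted(free):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This record starts exactly where the previous one ends.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged
```

Fewer, larger free records make it more likely that a later allocation can be satisfied from the free list.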
Michael Adam [Tue, 22 Apr 2014 20:09:35 +0000 (22:09 +0200)]
vacuum: add missing return to ctdb_vacuum_traverse_db() error path.
This got lost in commit
19948702992c94553e1a611540ad398de9f9d8b9
("ctdb-vacuum: make ctdb_vacuum_traverse_db() void.")
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
026d79cb009beba6987da6a6dd5fd98609140136)
Michael Adam [Sat, 19 Apr 2014 01:36:49 +0000 (03:36 +0200)]
vacuum: remove now unused talloc ctx argument from ctdb_vacuum_db()
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
b8658b395921a5400c9f794a07748f5ad18991f8)
Michael Adam [Sat, 19 Apr 2014 01:34:05 +0000 (03:34 +0200)]
vacuum: move init of vdata into init_vdata function
This is a small code cleanup.
vdata is only used in ctdb_vacuum_db() and not in
ctdb_vacuum_and_repack_db() where it is currently initialized.
This patch moves creation and all previously scattered
initialization of vacuum_data into ctdb_vacuum_init_vacuum_data
which is called from ctdb_vacuum_db.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
c3cb8c277a02a8a68c11ef8d341c8116172e989b)
Michael Adam [Sat, 19 Apr 2014 01:08:20 +0000 (03:08 +0200)]
vacuum: remove vacuum limit from vdata - not used
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
3cf018935e057c1748ab44491135c632c023de9f)
Michael Adam [Sat, 19 Apr 2014 01:02:42 +0000 (03:02 +0200)]
vacuum: remove a superfluous comment.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit
a99035a4c52f68a4a4f1862c74c1c71273a47d5b)