git.samba.org - sahlberg/ctdb.git/log


commit | commitdiff | tree

Ronnie Sahlberg [Thu, 3 Mar 2011 19:55:24 +0000 (06:55 +1100)]

Restart recovery daemon if it looks like it hung.
Don't shut down ctdbd completely; that only makes the problem worse.

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 1 Mar 2011 01:09:42 +0000 (12:09 +1100)]

If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main daemon too.

While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust.
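The restart-instead-of-shutdown idea can be sketched as a small supervision loop. This is only an illustration in Python, not ctdb's C code: `supervise`, `run_child`, and `max_restarts` are hypothetical names, and the restart cap is an assumption that the commit message does not mention.

```python
def supervise(run_child, max_restarts=5):
    """Run a child until it exits cleanly, restarting it on failure.

    run_child is a hypothetical callable that starts the child, waits
    for it, and returns its exit status (0 means a clean shutdown).
    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        status = run_child()
        if status == 0:
            # Clean exit: nothing to recover from.
            return restarts
        if restarts == max_restarts:
            # Assumed safety cap so a crash loop cannot spin forever.
            raise RuntimeError("child keeps failing, giving up")
        # Unexpected termination: start the child again instead of
        # taking the parent down with it.
        restarts += 1
```

The parent stays up throughout, which matches the stated goal of reducing the impact of a recovery daemon failure rather than fixing its cause.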

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 28 Feb 2011 23:48:04 +0000 (10:48 +1100)]

version 1.2.200

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 24 Feb 2011 23:33:12 +0000 (10:33 +1100)]

ATTACH_DB: simplify the code slightly and change the semantics to only
refuse a db attach during recovery IF we can associate the request from a
genuine real client instead of deciding this on whether client_id is zero or

This will suppress/avoid messages like these :
DB Attach to database %s refused. Can not match clientid...

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 24 Feb 2011 23:06:08 +0000 (10:06 +1100)]

Don't return an error when trying to set the db priority on a db that does not yet exist.
Just treat it as a no-op.

When the database is created later it will get its priority set properly.

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 24 Feb 2011 21:57:07 +0000 (08:57 +1100)]

new version 1.3.2

commit | commitdiff | tree

Michael Adam [Wed, 23 Feb 2011 16:39:57 +0000 (17:39 +0100)]

recover: finish pending trans3 commits when a recovery is finished.

When the end_recovery control is received, pending trans3 commits are
finished. During the recovery, all the actions like persistent_callback
and persistent_store_timeout had been disabled to let the recovery do
its job. After the recovery is completed, send the reply to the waiting
clients.

commit | commitdiff | tree

Michael Adam [Wed, 23 Feb 2011 16:38:40 +0000 (17:38 +0100)]

persistent: add ctdb_persistent_finish_trans3_commits().

This function walks all databases and checks for running trans3 commits.
It sends replies to all of them (with error code) and ends them.
To be called when a recovery finishes.

commit | commitdiff | tree

Michael Adam [Wed, 23 Feb 2011 16:37:42 +0000 (17:37 +0100)]

daemon: correctly end a running trans3_commit if the client disconnects.

commit | commitdiff | tree

Michael Adam [Wed, 23 Feb 2011 16:35:27 +0000 (17:35 +0100)]

persistent: add a client context to the persistent_state and track the db_id

The db_id is tracked in the client context as an indication that a
transaction commit is in progress. This is cleared in the persistent_state
talloc destructor.

This is in order to properly treat running trans3_commits if the client
disconnects.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 23:03:07 +0000 (00:03 +0100)]

persistent: reject trans3_control when a commit is already active.

This should actually never happen.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 23:01:13 +0000 (00:01 +0100)]

persistent: allocate the persistent state in the ctdb_db struct in trans3_commit

Make sure that ctdb_db->persistent_state is correctly NULL-ed when
the state is freed. This way, we can use ctdb_db->persistent_state
as an indication for whether a transaction commit is currently
running.
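A minimal sketch of the "pointer doubles as an in-progress flag" pattern described above, written in Python for brevity. The class and method names are hypothetical and do not correspond to ctdb's C structures; `None` stands in for the NULL pointer.

```python
class Database:
    """Toy stand-in for a per-database context (hypothetical)."""

    def __init__(self, name):
        self.name = name
        # None plays the role of NULL: no commit is running.
        self.persistent_state = None

    def commit_running(self):
        return self.persistent_state is not None

    def start_commit(self, client_id):
        # Reject a second commit while one is already active.
        if self.commit_running():
            raise RuntimeError("commit already active on " + self.name)
        self.persistent_state = {"client_id": client_id}

    def finish_commit(self):
        # Always clear the reference when the state is released, so
        # that commit_running() stays trustworthy.
        self.persistent_state = None
```

The pattern only works if every path that frees the state also clears the reference, which is the point the commit message makes.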

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 23:23:18 +0000 (00:23 +0100)]

persistent: add a ctdb_db context to the ctdb_persistent_state struct.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 23:00:04 +0000 (00:00 +0100)]

persistent: add a ctdb_persistent_state member to the ctdb_db context.

To be used for tracking running transaction commits through recoveries.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 21:49:52 +0000 (22:49 +0100)]

persistent_callback: print "no error message given" instead of "(null)"

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 21:47:30 +0000 (22:47 +0100)]

persistent: reduce indentation for the finishing moves in ctdb_persistent_callback

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 21:44:16 +0000 (22:44 +0100)]

persistent: if a node failed to update_record, trigger a recovery

and stop processing of the update_record replies in order to let
the recovery finish the trans3_commit control.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 21:24:50 +0000 (22:24 +0100)]

persistent_store_timeout: do not really time out the trans3_commit control in recovery

If a recovery was started, then all further processing of the update_record
controls sent by the trans3_commit control and timing them out is disabled.
The recovery should trigger sending the reply for the update record control
when finished.

commit | commitdiff | tree

Michael Adam [Tue, 22 Feb 2011 21:24:50 +0000 (22:24 +0100)]

persistent_callback: ignore the update_record return code of remote node in recovery

If a recovery was started, then all further processing of the update_record
controls sent by the trans3_commit control is disabled. The recovery should
trigger sending the reply for the update record control when finished.

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 23 Feb 2011 04:46:36 +0000 (15:46 +1100)]

Deferred attach : at early startup, defer any db attach calls until we are out of recovery.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 21 Feb 2011 05:48:18 +0000 (16:48 +1100)]

new version 1.3.1

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 21 Feb 2011 05:47:45 +0000 (16:47 +1100)]

50.samba run the smbcontrol in the background. no need to block waiting for it.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 21 Feb 2011 05:19:18 +0000 (16:19 +1100)]

Merge branch '1.3' of 10.1.1.27:/shared/ctdb/ctdb-git into 1.3

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 18 Feb 2011 00:21:19 +0000 (11:21 +1100)]

ctdb_req_dmaster from non-master

If we find a situation where we get a stray packet with the wrong
dmaster, don't suicide with ctdb_fatal() since this is too disruptive.
Just drop the stray packet and force a recovery to make sure all is good again.

CQ S1022004

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 18 Feb 2011 00:11:31 +0000 (11:11 +1100)]

Merge branch '1.3' of 10.1.1.27:/shared/ctdb/ctdb-git into 1.3

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 18 Feb 2011 00:02:21 +0000 (11:02 +1100)]

Merge branch '1.3' of 10.1.1.27:/shared/ctdb/ctdb-git into 1.3

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 17 Feb 2011 23:44:55 +0000 (10:44 +1100)]

50.samba : Tell winbind about every time we add/remove an ip from the node

CQ S1021636

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 17 Feb 2011 23:36:15 +0000 (10:36 +1100)]

New branch 1.3

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 17 Feb 2011 20:39:14 +0000 (07:39 +1100)]

Revert "Dont allow client processes to attach to databases while we are still in recovery mode."

This reverts commit faf3b1542fd27b3ad32ac7b362ef39d8cb0b05ff.

git pull ... 1.2-splitbrain
does not do what I think it does.
Revert patch and pull it into the right branch instead.

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 17 Feb 2011 02:14:41 +0000 (13:14 +1100)]

Dont allow client processes to attach to databases while we are still in recovery mode.

The exception is the local recovery daemon which needs to be able to attach (==create) any missing databases during recovery. This process requires the use of the attach control.

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 8 Feb 2011 06:01:33 +0000 (17:01 +1100)]

New version 1.2.20

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 2 Feb 2011 04:00:53 +0000 (15:00 +1100)]

We default to non-deterministic ip now where ips are "sticky" and don't change
too much.
This means we can simplify the way we add ips significantly and stop
trying to move them.

We also check if the node already hosts the ip, in which case we used to return an error. Instead just print an error string but return 0, ok.
This makes it easier to script, and works around broken scripts.

CQ1021034

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 31 Jan 2011 06:48:22 +0000 (17:48 +1100)]

New version 1.2.19

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 31 Jan 2011 06:40:26 +0000 (17:40 +1100)]

If the node is stopped, put a log entry in /var/log/* to indicate this is why we never become ready

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 24 Jan 2011 00:42:50 +0000 (11:42 +1100)]

LockWait congestion.

Add a dlist to track all active lockwait child processes.
Every time we create a new lockwait handle, check if there is already an
active lockwait process for this database/key and if so,
send the new request straight to the overflow queue.

This means we will only have one active lockwait child process for a certain key,
even if there were thousands of fetch-lock requests for this key.

When the lockwait processing finishes for the original request, the processing in d_overflow() will automagically process all remaining keys as well.
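The bookkeeping described above could look roughly like the following Python sketch. It is not the ctdb implementation: `LockWaitTracker` and its methods are hypothetical, a dict stands in for the dlist of active children, and a "child" is represented only by its entry in that dict.

```python
from collections import deque


class LockWaitTracker:
    """At most one active waiter per key; later requests overflow."""

    def __init__(self):
        self.active = {}         # key -> request served by a child
        self.overflow = deque()  # (key, request) pairs without a child

    def request(self, key, req):
        """Return True if a new child would be started for this key."""
        if key in self.active:
            # A child is already waiting on this key, so queue the
            # request instead of starting another child.
            self.overflow.append((key, req))
            return False
        self.active[key] = req
        return True

    def lock_obtained(self, key):
        """The child for key got the lock; return every request served."""
        served = [self.active.pop(key)]
        remaining = deque()
        for queued_key, req in self.overflow:
            if queued_key == key:
                served.append(req)
            else:
                remaining.append((queued_key, req))
        self.overflow = remaining
        return served
```

With this layout, a thousand fetch-lock requests for one key cost one child plus a thousand queue entries, all of which are completed when that single child obtains the lock.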

Add back a --nosetsched argument to make it easier to run under gdb

commit | commitdiff | tree

Ronnie Sahlberg [Sun, 23 Jan 2011 22:43:45 +0000 (09:43 +1100)]

Compile fix

commit | commitdiff | tree

Rusty Russell [Fri, 21 Jan 2011 10:47:02 +0000 (21:17 +1030)]

ctdb_lockwait: create overflow queue.

Once we have more than 200 children waiting on a particular db, don't create
any more. Just put them on an overflow queue, and when a child gets a lock
search that queue to see if others were after the same lock (they probably
were).

commit | commitdiff | tree

Ronnie Sahlberg [Sun, 23 Jan 2011 20:39:33 +0000 (07:39 +1100)]

Add a new test tool that fetch locks a record and then blocks until it receives
user input to unlock the record again.

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 20 Jan 2011 23:56:56 +0000 (10:56 +1100)]

60.nfs
Don't update the statd settings that often.
When we have very many nodes and very many ips, this would generate
a lot of unnecessary load on the system

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 18 Jan 2011 21:00:36 +0000 (08:00 +1100)]

TDB : Fix for a deadlock with transaction lock and lockall/lockallmark
causing ctdbd hangs

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 18 Jan 2011 02:33:24 +0000 (13:33 +1100)]

ctdb: hold transaction locks during freeze, mark during recover.

Make the ctdb parent "mark" the transaction lock once the child process
has frozen/locked the entire database.
This stops the ctdb daemon from using a blocking fcntl() locking on the tdb during the
read traverse during recovery.

CQ 1021388

commit | commitdiff | tree

Rusty Russell [Tue, 18 Jan 2011 00:17:11 +0000 (10:47 +1030)]

tdb: expose transaction lock infrastructure for ctdb

tdb_traverse_read() grabs the transaction lock. This can cause ctdbd
(which uses it) to block when it should not; expose mark and normal
variants of this lock, so ctdbd's child (the recovery daemon) can
acquire it and the ctdbd parent can mark it as held.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 17 Jan 2011 01:05:43 +0000 (12:05 +1100)]

New version 1.2.17

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 17 Jan 2011 01:00:18 +0000 (12:00 +1100)]

change Christian's previous patch to only perform the check/logging
if we are the main ctdb daemon.
Other daemons/child processes are not guaranteed to get events on regular basis
so those should not be checked.

commit | commitdiff | tree

Christian Ambach [Fri, 14 Jan 2011 12:55:28 +0000 (13:55 +0100)]

improve timing issue detections

the original "Time jumped" messages are too coarse to interpret
exactly what was going wrong inside of CTDB.

This patch removes the original logs and adds two other logs that
differentiate between the time it took to work on an event and
the time it took to get the next event.
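The two measurements can be illustrated with a small Python sketch. The class name, the 3-second threshold, and the log wording are all assumptions made for this example; the actual patch changes ctdb's C event loop.

```python
import time


class LoopTimer:
    """Separate 'slow to handle an event' from 'slow to get one'."""

    def __init__(self, threshold=3.0, clock=time.monotonic, log=print):
        self.threshold = threshold
        self.clock = clock
        self.log = log
        self.last_done = clock()  # when the previous event finished
        self.event_start = None

    def event_begin(self):
        now = self.clock()
        waited = now - self.last_done
        if waited > self.threshold:
            self.log("waited %.1fs for the next event" % waited)
        self.event_start = now

    def event_end(self):
        now = self.clock()
        took = now - self.event_start
        if took > self.threshold:
            self.log("handling the event took %.1fs" % took)
        self.last_done = now
```

Calling `event_begin()` and `event_end()` around each handler yields two distinct messages, so a stall can be attributed either to the handler itself or to the gap before it.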

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 14 Jan 2011 06:35:31 +0000 (17:35 +1100)]

LIBCTDB: add support for traverse

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 13 Jan 2011 22:46:04 +0000 (09:46 +1100)]

We can not always rely on the recovery daemon pinging us in a timely manner
so we need a "ticker" in the main ctdbd daemon too to ensure we get at least one event to process every second.

This will improve the accuracy of "Time jumped" messages and remove false positives when the recovery daemon is "slow".

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 13 Jan 2011 05:17:43 +0000 (16:17 +1100)]

ADDIP failure

Found during automatic regression testing.
We do not allow the takeip/releaseip events to be executed during a recovery.

All of "ctdb addip, ctdb delip, ctdb moveip" use and force these events to
trigger to perform the ip assignments required.

If these commands collide with a recovery, these commands could fail since we do
not allow takeip/releaseip events to trigger during the recovery.
While it is easy to just try running the command again, this is suboptimal for script use.

Change these commands to retry these operations a few times until either successful or until we give up.
This makes the commands much easier to use in scripts.
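A retry wrapper of the kind described might look like this Python sketch. The attempt count and delay are assumptions (the commit only says "a few times"), and `operation` is a hypothetical callable that returns True on success.

```python
import time


def retry(operation, attempts=5, delay=1.0, sleep=time.sleep):
    """Call operation until it succeeds or attempts run out.

    Returns the number of the attempt that succeeded and raises
    RuntimeError once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        if operation():
            return attempt
        if attempt < attempts:
            # Give a colliding recovery some time to finish.
            sleep(delay)
    raise RuntimeError("operation failed after %d attempts" % attempts)
```

A script calling such a command then sees a single success or failure, rather than having to detect a collision with a recovery and rerun the command itself.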

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 12 Jan 2011 22:35:37 +0000 (09:35 +1100)]

IPALLOCATION : If the node is held pinned down in "init" state
by external services failing to start, or blocking CTDBD from finishing the startup phase,
we can encounter a situation where we have not yet fully initialized, but a
remote recovery master tries to release a certain ip clusterwide.

In this situation the node that is pinned down in init/startup phase
would fail to perform the release of the ip address since we are not yet fully operational and not yet host any valid interfaces.

In this situation, we just need to remain unhealthy; there is no need to
also ban the node.

Remove the autobanning for this condition and just let the node remain in
unhealthy mode.
Banning is overkill in this situation when the system is broken and just
draws attention to ctdbd instead of the root cause.

commit | commitdiff | tree

Martin Schwenke [Tue, 11 Jan 2011 06:13:57 +0000 (17:13 +1100)]

Eventscripts: lower the fail/restart limits for nfsd.

We were potentially leaving a node unable to serve requests for too
long.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Tue, 11 Jan 2011 06:13:06 +0000 (17:13 +1100)]

Eventscripts: use "startstop_nfs restart" to reconfigure NFS.

This was defaulting to just "service nfs restart", which doesn't have
the workarounds we need.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Tue, 11 Jan 2011 06:12:03 +0000 (17:12 +1100)]

Eventscripts: only autostart during a monitor event.

Otherwise we might short-circuit events that are run only once and
actually need to do something.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Tue, 11 Jan 2011 06:10:55 +0000 (17:10 +1100)]

Eventscripts: print a message when reconfiguring a service.

Otherwise there can be strange error messages from services
stopping/starting, without any context.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Tue, 11 Jan 2011 06:06:48 +0000 (17:06 +1100)]

Eventscripts: work around NFS restart failure under load.

"service nfs restart" can fail.  To stop nfsd it sends a SIGINT and
nfsd might take a while to process it if the system is loaded.
Starting nfsd may then fail because resources are still in use.

This does some /proc magic to tell nfsd to do no more processing.  It
then runs service stop, kills nfsd with SIGKILL, and then runs service
start.  This is much less likely to fail.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 11 Jan 2011 05:17:06 +0000 (16:17 +1100)]

TYPO

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 11 Jan 2011 05:15:41 +0000 (16:15 +1100)]

STATD is 100027 not 1000247

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 10 Jan 2011 20:37:17 +0000 (07:37 +1100)]

LIBCTDB uninitialized inqueue element

From Michael Anderson,
initialize the inqueue element of the ctdb structure to NULL,
else it might be used uninitialized and cause a segv.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 10 Jan 2011 05:51:56 +0000 (16:51 +1100)]

recoverd: avoid triggering a full recovery if just some ip allocation
has failed.
We don't need to rebuild the databases in this situation, we just
need to try again to sort out the ip address allocations.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 10 Jan 2011 02:57:49 +0000 (13:57 +1100)]

Add ctdb_fork() which will fork a child process and drop the real-time
scheduler for the child.

Use ctdb_fork() from callers where we don't want the child to be running
at real-time privilege.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 10 Jan 2011 02:35:39 +0000 (13:35 +1100)]

Revert scheduling back to use real-time processes

Revert this patch:
commit 482c302d46e2162d0cf552f8456bc49573ae729d

We may need to use real-time processes for the main daemon and the recovery daemon to handle the cases where systems come under very high loads.

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 6 Jan 2011 04:42:45 +0000 (15:42 +1100)]

60.nfs Check if we have rpc.statd and if not, skip checking for statd
availability at all (since we can't restart it, there is no point checking
if it is alive)

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 21 Dec 2010 23:39:25 +0000 (10:39 +1100)]

New version 1.2.16

- 50.samba  don't run serverid wipe in the background in case it
   is so slow to start that samba manages to come up before it finishes.
- 60.nfs  wait 10 intervals before trying to restart lockd.
   flag the node unhealthy after 15 failures.
   CQ S1021266
- 41.httpd  httpd can sometimes be slow, wait 5 intervals before we try to
   restart it and 10 intervals before we flag the node unhealthy.

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 21 Dec 2010 23:27:53 +0000 (10:27 +1100)]

41.HTTPD

Httpd can be very slow to start on some platforms,
wait 5 monitor intervals before we try to restart it if
it has not bound to port 80 yet.
After 10 failed intervals, flag the node as unhealthy.

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 21 Dec 2010 23:09:35 +0000 (10:09 +1100)]

60.nfs

Try to restart LOCKD after 10 failures and
flag the node as unhealthy after 15 failures
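The two thresholds can be sketched as a consecutive-failure counter. This Python example is illustrative only: the real logic lives in the shell eventscripts, and the class name, the return strings, and the choice to trigger the restart exactly once (when the count equals the first threshold) are assumptions.

```python
class ServiceMonitor:
    """Count consecutive failed monitor checks for one service."""

    def __init__(self, restart_after=10, unhealthy_after=15):
        self.restart_after = restart_after
        self.unhealthy_after = unhealthy_after
        self.failures = 0

    def check(self, service_ok):
        """Record one monitor result and return the action to take."""
        if service_ok:
            # Any success resets the streak.
            self.failures = 0
            return "ok"
        self.failures += 1
        if self.failures >= self.unhealthy_after:
            return "unhealthy"
        if self.failures == self.restart_after:
            return "restart"
        return "waiting"
```

Tolerating several failed checks before acting avoids restarting a service that is merely slow, while the second threshold still bounds how long a broken node keeps claiming to be healthy.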

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 21 Dec 2010 23:05:40 +0000 (10:05 +1100)]

Don't run net serverid wipe in the background

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 14 Dec 2010 10:17:14 +0000 (21:17 +1100)]

50.samba

Net serverid wipe can take a bit of time sometimes so background it.

Only perform auto start/stop of the managed service on the monitor event

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 13 Dec 2010 03:20:37 +0000 (14:20 +1100)]

New version 1.2.15

* Mon Dec 13 2010 : Version 1.2.15
- Add two new debugging commands "ctdb readkey/writekey"
- idtree overflow bugfix
- only run "serverid wipe" when we are actually running samba
- libctdb, add roper input queueing so we can support calling
sync functions from an async callback
- lvs updates
- addip, always wait across at least one ip reallocation, making the
command slower, but making it easier to use in scripts

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 13 Dec 2010 01:39:01 +0000 (12:39 +1100)]

Revert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag"

This reverts commit c63dab9763d45fd4f9be77b9c9f463bd457de808.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 13 Dec 2010 01:38:39 +0000 (12:38 +1100)]

Revert "Add a new header flag for "migrated with data" and set this to 1"

This reverts commit d22e7e47a7f3d450bbbc2267322dadbdbf192e84.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 13 Dec 2010 01:06:01 +0000 (12:06 +1100)]

ctdb addip:

After finishing "ctdb addip" wait for an implicit "iptakeover" to complete
the assignment to a node.

This makes it more wasteful and timeconsuming when adding multiple ips
at once, or the same ip to multiple nodes,
but makes it easier to script the use of this command.

commit | commitdiff | tree

Ronnie Sahlberg [Sun, 12 Dec 2010 08:38:39 +0000 (19:38 +1100)]

LVS

update lvs configuration on ipreallocated events too

commit | commitdiff | tree

Ronnie Sahlberg [Sun, 12 Dec 2010 03:22:20 +0000 (14:22 +1100)]

When assigning the single-public-ip during startup,
flag the interface as initially being "link ok"
so that we can add it and startup.

The eventscript can later drop the flag if required

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 10 Dec 2010 03:18:28 +0000 (14:18 +1100)]

libctdb

fix a compile problem after renaming a structure field

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 10 Dec 2010 02:39:18 +0000 (13:39 +1100)]

LibCTDB

Add an input queue where we keep received pdus we have not yet processed
This allows us to perform SYNC calls from an ASYNC callback
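Why an input queue makes this possible can be shown with a toy model in Python. Everything here is hypothetical (the `Connection` class, the dict-shaped PDUs, the `reqid` field) and the "wire" is a pre-filled deque rather than a socket, so this is a sketch of the idea and not libctdb's API.

```python
from collections import deque


class Connection:
    def __init__(self, incoming):
        self.wire = deque(incoming)  # PDUs not yet read (simulated)
        self.inqueue = deque()       # read but not yet processed

    def call_sync(self, reqid):
        """Read until the reply for reqid arrives.

        PDUs that belong to other requests are parked on the input
        queue instead of being dropped or dispatched re-entrantly.
        Raises IndexError if the simulated wire runs dry.
        """
        while True:
            pdu = self.wire.popleft()
            if pdu["reqid"] == reqid:
                return pdu
            self.inqueue.append(pdu)

    def service(self, callback):
        """Dispatch parked PDUs first, then whatever is on the wire."""
        while self.inqueue or self.wire:
            if self.inqueue:
                pdu = self.inqueue.popleft()
            else:
                pdu = self.wire.popleft()
            callback(self, pdu)
```

A callback invoked from `service()` can call `call_sync()` safely: unrelated PDUs read while waiting are kept in order on `inqueue` and are delivered once the callback returns.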

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 8 Dec 2010 00:08:19 +0000 (11:08 +1100)]

only run "serverid wipe" if we are actually running samba.
we don't need to run this on systems where we do run winbind but not samba

commit | commitdiff | tree

Rusty Russell [Mon, 6 Dec 2010 03:22:38 +0000 (13:52 +1030)]

idtree: fix overflow for v. large ids on allocation and removal

(Imported from SAMBA commit 09a6538969ac).

Chris Cowan tracked down a SEGV in sub_alloc: idp->level can actually
be equal to 7 (MAX_LEVEL) there, as it can be in sub_remove.

(We unfairly blamed a shift of a signed var for this crash in commit
2db1987f5a3a).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 05:09:38 +0000 (16:09 +1100)]

Add a new header flag for "migrated with data" and set this to 1
when we migrate a non-empty record onto the node
or a non-empty record off the node

When we migrate a record back to the lmaster and yield the dmaster role,
inspect this flag; if it is still not set, we can delete the record from
the local database as soon as we have migrated it back to the lmaster.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 05:07:55 +0000 (16:07 +1100)]

add new command line functions
ctdb readkey <dbid> <key>
ctdb writekey <dbid> <key> <value>

these are mainly intended for debugging of databases and dmaster migration issues

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 05:06:20 +0000 (16:06 +1100)]

add a new ctdb_ltdb function to delete a record in a normal database

commit | commitdiff | tree

Michael Adam [Fri, 3 Dec 2010 14:21:51 +0000 (15:21 +0100)]

server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 07:34:49 +0000 (18:34 +1100)]

new version 1.2.14

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 02:08:53 +0000 (13:08 +1100)]

Add two new flags for the ltdb header.
One of which signals that the record has never been migrated to/from a node
while containing data.
This property "has never been migrated while non-zero" is important later
to provide heuristics on which records we might be able to purge
from the tdb files cheaply, i.e. without having to rely on the full-blown
database vacuum.

These records are believed to be very common and the pattern would look like
this :
1, no record exists at all.
2, client opens a file
3, samba requests the record for this file
4, an empty record is created on the LMASTER
5, the empty record is migrated to the DMASTER
6, samba writes a <sharemode> to the record locally and the record grows
7, client finishes working the file and closes the file
8, samba removes the sharemode and the record becomes empty again.
9, much later : vacuuming will delete the record

At stage 8, since the record has never been migrated onto a node while being
non-zero it would be safe, and much more efficient to just delete the record
completely from the database and hand it back to the LMASTER.

The flags occupy the same uint32_t as was previously used for laccessor/lacount
in the header. For now, make sure the flags only define/use the top 16 bits
of this field so that we are sure we don't collide with bits set to one
from previous generations of the ctdb cluster database prior to this
change in semantics of this word.
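The "top 16 bits only" rule and the stage 8 shortcut can be sketched as follows. This is Python rather than the C header code, and the bit position chosen for `MIGRATED_WITH_DATA` and the helper names are assumptions made purely for illustration.

```python
FLAG_MASK = 0xFFFF0000           # flags may only live in the top 16 bits
MIGRATED_WITH_DATA = 0x00010000  # hypothetical bit position


def set_flag(word, flag):
    """Return word with flag set, refusing flags in the low 16 bits."""
    if flag & ~FLAG_MASK:
        raise ValueError("flags must stay within the top 16 bits")
    return word | flag


def has_flag(word, flag):
    # Mask first, so stale laccessor/lacount values left in the low
    # 16 bits by an older version are never read as flags.
    return (word & FLAG_MASK & flag) != 0


def can_delete_on_yield(word, data):
    """Stage 8 shortcut: empty now and never migrated while non-empty."""
    return len(data) == 0 and not has_flag(word, MIGRATED_WITH_DATA)
```

A header word written by an older cluster, for example `0x00000007`, therefore reads as "no flags set", and setting a flag leaves those low bits untouched.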

This is a rework of Michael's patch :
commit 2af1a47cbe1a608496c8caf3eb0c990eb7259a0d
Author: Michael Adam <obnox@samba.org>
Date: Tue Nov 30 17:00:54 2010 +0100

add a DEFAULT record flag and a MIGRATED_WITH_DATA record flag.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 02:04:44 +0000 (13:04 +1100)]

change one of the reserved words in the ctdb ltdb header to be a flags field
for now, try avoiding using bits in the low16 bits as flags since this may
collide with laccessor/lacount values from previous versions of the cluster
databases

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 29 Nov 2010 02:07:59 +0000 (13:07 +1100)]

Remove LACOUNT and LACCESSOR and migrate the records immediately.

This concept didn't work out and it is really just as expensive as a full migration
anyway, without the benefit of caching the data for subsequent accesses.

Now, migrate the records immediately on first access.
This will be combined with a "cheap vacuum-lite" for special empty records to
prevent growth of databases.

Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway.

Removing this special case and the handling of lacount/laccessor makes the codepath where shared read-only locking will be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 00:30:24 +0000 (11:30 +1100)]

Add 60.ganesha to what gets installed by make install as well as by the RPM

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 6 Dec 2010 00:26:43 +0000 (11:26 +1100)]

add a missing part of the import of the previous ganesha patch

commit | commitdiff | tree

Chandra Seetharaman [Fri, 3 Dec 2010 23:26:22 +0000 (15:26 -0800)]

make changes to ctdb event scripts to support NFS-Ganesha.


Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 3 Dec 2010 02:28:35 +0000 (13:28 +1100)]

during ip allocation, there are failure modes where a node might hold an ip address
but thinks it is still unassigned (-1).

add code to the recovery daemon to detect this case and trigger a reallocation
so that the ip gets covered

and change the takeip code to allow for this condition, taking on an ip address that is
already hosted.
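The detection step can be sketched as a comparison between the recorded assignment and what nodes actually host. The data layout below (two dicts keyed by ip and by node number) and the function name are hypothetical; the recovery daemon's real data structures are not shown in this log.

```python
UNASSIGNED = -1


def unrecorded_ips(assignments, held_by_node):
    """Return ips that some node hosts but that are recorded as unassigned.

    assignments:  ip -> node number, or -1 when believed unassigned
    held_by_node: node number -> set of ips actually on its interfaces
    """
    stray = set()
    for ips in held_by_node.values():
        for ip in ips:
            if assignments.get(ip, UNASSIGNED) == UNASSIGNED:
                stray.add(ip)
    return sorted(stray)
```

A non-empty result would be the trigger for a reallocation; the takeip side then has to accept being asked to take an address the node already hosts.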

cq s1021073

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 2 Dec 2010 19:08:44 +0000 (06:08 +1100)]

new version 1.2.13

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 2 Dec 2010 19:07:03 +0000 (06:07 +1100)]

don't try starting samba through the "init" event

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 29 Nov 2010 08:31:05 +0000 (19:31 +1100)]

new version 1.2.12

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 29 Nov 2010 01:39:14 +0000 (12:39 +1100)]

When we are no longer the natgw master, don't put the natgw ip on loopback.
We put the ip on loopback just to make sure we would still interoperate with
non-standard configurations on unix-KDC, that are configured to verify the optional
HostAddresses field.
This is not required for AD, since AD does not use this field, and is replaced in
unix land with other/better mechanisms than this "dodgy" check.

This makes it "easier" for applications that have bound to the natgw address
to detect a socket problem and try to reconnect/recover if the ip address
is completely missing from the system.

At the same time, use the winbind specific hook that exists to explicitly tell winbindd : this address is gone, so if you have bound to it, this is a good time to close and rebind your socket.

cq 1020333

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 22 Nov 2010 09:57:27 +0000 (20:57 +1100)]

new version 1.2.11

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 18 Nov 2010 04:40:19 +0000 (15:40 +1100)]

update autostart/stop to work for samba

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 18 Nov 2010 03:15:18 +0000 (14:15 +1100)]

add an explicit _is_managed_service to iscsi eventscript

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 18 Nov 2010 02:52:46 +0000 (13:52 +1100)]

Don't pollute the logs with a "file not found" message

CQ S1020745

commit | commitdiff | tree

Martin Schwenke [Thu, 18 Nov 2010 02:23:40 +0000 (13:23 +1100)]

60.nfs eventscript should do nothing if NFS isn't managed by CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Thu, 18 Nov 2010 00:27:10 +0000 (11:27 +1100)]

Eventscript functions - catch failures in ctdb_service_start().

ctdb_service_start() currently succeeds if ctdb_counter_init()
succeeds.

This changes it to fail when a service start fails.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Thu, 18 Nov 2010 00:04:52 +0000 (11:04 +1100)]

50.samba eventscript should stop/start services when they become (un)managed.

When the value of $CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND changes
(or corresponding changes are made to $CTDB_MANAGED_VERSIONS), the
associated service should be started or stopped as necessary.

This adds calls to ctdb_start_stop_service() to manage
starting/stopping samba and winbind.

An associated cleanup is made to the initial checks that one of
$CTDB_MANAGES_SAMBA or $CTDB_MANAGES_WINBIND is set, replacing them
with calls to is_ctdb_managed_service().

To handle the winbind cases ctdb_start_stop_service() and
is_ctdb_managed_service() are updated to take an optional service name
parameter.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 17 Nov 2010 02:50:56 +0000 (13:50 +1100)]

add a new support function ctdb_check_counter_equal()

update nfs to try to restart the service after 10 consecutive failures
and to flag the node unhealthy after 15

add similar function to mountd

CTDB repository