git.samba.org - sahlberg/ctdb.git/commit

author	Rusty Russell <rusty@rustcorp.com.au>
	Wed, 9 Jun 2010 23:28:55 +0000 (08:58 +0930)
committer	Rusty Russell <rusty@rustcorp.com.au>
	Wed, 30 Jun 2010 05:21:05 +0000 (14:51 +0930)
commit	9b4884e0bad3b23a8cf32ff19dc9bb8b26436e2d
tree	52a06f0058a2c618cf6110172d6dec96ad40fec3
parent	7d4658d3fc09560ccf16b304ffdb5391a2b48f72

Delay reusing ids to make protocol more robust

Ronnie and I tracked down a bug which seems to be caused by a node
running so slowly that we timed out the request and reused the request
id before it responded.

The result was that we unlocked the wrong record, leading to the
following:

ctdbd: tdb_unlock: count is 0
ctdbd: tdb_chainunlock failed
smbd[1630912]: [2010/06/08 15:32:28.251716, 0] lib/util_sock.c:1491(get_peer_addr_internal)
ctdbd: Could not find idr:43
ctdbd: server/ctdb_call.c:492 reqid 43 not found

This exact problem is now detected, but in general we want to delay
id reuse as long as possible to make our system more robust.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
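
The idea of delaying id reuse can be illustrated with a minimal sketch. This is not the code from this commit: the table size and the names id_table, id_new, id_find and id_remove are hypothetical and chosen only for illustration. The allocator remembers the last id it handed out and always searches upward from there, wrapping around at the end, so a freed id is reissued only after every other slot has had its turn. A late reply carrying a timed-out id is then much more likely to find no matching request than to match an unrelated one.

```c
#include <stddef.h>

#define MAX_IDS 8   /* hypothetical, deliberately tiny table size */

struct id_table {
    void *slot[MAX_IDS];  /* per-request state; NULL means the id is free */
    int last_id;          /* most recently allocated id, -1 before first use */
};

static void id_table_init(struct id_table *t)
{
    for (int i = 0; i < MAX_IDS; i++)
        t->slot[i] = NULL;
    t->last_id = -1;
}

/*
 * Allocate the first free id after last_id, wrapping around at MAX_IDS.
 * last_id itself is tried last, so a just-freed id is the least likely
 * one to be handed out again. 'state' must be non-NULL.
 * Returns the new id, or -1 if the table is full.
 */
static int id_new(struct id_table *t, void *state)
{
    for (int n = 1; n <= MAX_IDS; n++) {
        int id = (t->last_id + n) % MAX_IDS;
        if (t->slot[id] == NULL) {
            t->slot[id] = state;
            t->last_id = id;
            return id;
        }
    }
    return -1;
}

/* Look up the state for an id; NULL if the id is out of range or unused. */
static void *id_find(const struct id_table *t, int id)
{
    if (id < 0 || id >= MAX_IDS)
        return NULL;
    return t->slot[id];
}

/* Release an id, for example when its request completes or times out. */
static void id_remove(struct id_table *t, int id)
{
    if (id >= 0 && id < MAX_IDS)
        t->slot[id] = NULL;
}
```

With this scheme, if request 0 times out and is removed, the next request gets id 1 rather than 0, and a slow reply for id 0 fails the id_find lookup instead of being applied to the new request. An allocator that always returns the lowest free id would have reissued 0 immediately.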

client/ctdb_client.c
common/ctdb_util.c
include/ctdb_private.h