git.samba.org - sahlberg/ctdb.git/commit

author	Ronnie Sahlberg <ronniesahlberg@gmail.com>
	Wed, 9 Jun 2010 06:12:36 +0000 (16:12 +1000)
committer	Ronnie Sahlberg <ronniesahlberg@gmail.com>
	Wed, 9 Jun 2010 06:12:36 +0000 (16:12 +1000)
commit	4b5bce6bcebb5cdb6048283181591562badfc2d9
tree	a1b95fe57852363b1453844d29afdbf287792a41
parent	3cd9d214e8a2e915fbd0dc321cc12b5d80130fd2

An idr can time out and wrap or be reused quite quickly.

If a remote node hangs for an extended period, we may have
a DMASTER request for record A in flight to that node.
Eventually we will reuse the idr, and may reuse it for a DMASTER request to a different node for a different record B.

If, while the request for B is in flight, the first node un-hangs and responds,
we would receive a dmaster reply for the wrong record.

This would cause a record to become perpetually locked: inside the daemon we would tdb_chainlock(dmaster_reply->pdu->key), but once the migration completed we would chainunlock idr->state->call->key, so the key that was locked is never unlocked.

Add code to verify that, when we receive a dmaster reply packet, its key exactly matches the key held in the state variable for the idr that is in flight.
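
The check described above can be sketched roughly as follows. This is only an illustration, not the code from this commit: struct key_blob, struct call_state, key_equal() and dmaster_reply_matches() are hypothetical names standing in for the real ctdb types and functions. It assumes a key is a length plus a byte buffer and that the state looked up from the idr holds the key of the original request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a record key: a byte buffer and its length. */
struct key_blob {
	const unsigned char *dptr;
	size_t dsize;
};

/* Hypothetical state kept for the request that currently owns an idr. */
struct call_state {
	unsigned int reqid;
	struct key_blob key;	/* key of the record this request is about */
};

/* Two keys match only if both length and contents are identical. */
bool key_equal(const struct key_blob *a, const struct key_blob *b)
{
	if (a->dsize != b->dsize) {
		return false;
	}
	if (a->dsize == 0) {
		return true;
	}
	return memcmp(a->dptr, b->dptr, a->dsize) == 0;
}

/*
 * Return true if a reply carrying reply_key belongs to the request
 * described by state. A stale reply that arrives after the idr has been
 * reused for another record fails this test and should be dropped
 * before any lock is taken on its key.
 */
bool dmaster_reply_matches(const struct call_state *state,
			   const struct key_blob *reply_key)
{
	assert(state != NULL && reply_key != NULL);
	return key_equal(&state->key, reply_key);
}
```

With a check of this kind, a late reply for record A that arrives while the idr is owned by the request for record B is rejected, so the lock and unlock always operate on the same key.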