takeover: prevent crash by avoiding free in traverse on RST timeout
author Rusty Russell <rusty@rustcorp.com.au>
Mon, 26 Jul 2010 04:28:48 +0000 (13:58 +0930)
committer Rusty Russell <rusty@rustcorp.com.au>
Mon, 26 Jul 2010 04:28:48 +0000 (13:58 +0930)
After 5 attempts to send a RST to a client without any response, we free
"con"; this is done during a traverse.  This also frees the node we are
walking through (the node is made a child of "con" down in rb_tree.c's
trbt_create_node()).  Valgrind would catch this, as Martin confirmed.

So, we create a temporary parent and reparent onto that; then we free
that parent after the traverse, thus deleting the unwanted nodes.

CQ:S1019041
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
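
The deferred-deletion idea described in the message can be sketched in plain C, without talloc or the ctdb rb-tree. This is only an illustration under stated assumptions: every name below (struct conn, struct graveyard, tickle_all, MAX_ATTEMPTS and the helpers) is hypothetical and not part of ctdb, and a side list stands in for the temporary talloc parent that the patch reparents onto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ATTEMPTS 5  /* assumption: mirrors the "count >= 5" check */

/* Hypothetical stand-in for a tracked connection; not the ctdb struct. */
struct conn {
    int count;                /* RST attempts made so far */
    struct conn *next;        /* link in the live list being traversed */
    struct conn *doomed_next; /* link in the deferred-deletion list */
};

struct conn_list { struct conn *head; };  /* plays the role of the tree */
struct graveyard { struct conn *head; };  /* plays the role of delete_cons */

typedef void (*visit_fn)(struct conn *c, void *param);

/* Add a connection with a given attempt count; NULL on allocation failure. */
static struct conn *conn_add(struct conn_list *list, int count)
{
    struct conn *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->count = count;
    c->doomed_next = NULL;
    c->next = list->head;
    list->head = c;
    return c;
}

static size_t conn_len(const struct conn_list *list)
{
    size_t n = 0;
    for (const struct conn *c = list->head; c != NULL; c = c->next)
        n++;
    return n;
}

/* The traverse reads c->next after the callback returns, so the callback
 * must not free c. */
static void traverse(struct conn_list *list, visit_fn fn, void *param)
{
    for (struct conn *c = list->head; c != NULL; c = c->next)
        fn(c, param);
}

/* Callback: instead of freeing an exhausted connection, park it. */
static void tickle(struct conn *c, void *param)
{
    struct graveyard *g = param;
    if (c->count >= MAX_ATTEMPTS) {
        c->doomed_next = g->head;
        g->head = c;
        return;
    }
    c->count++;
}

static void unlink_conn(struct conn_list *list, struct conn *victim)
{
    struct conn **pp = &list->head;
    while (*pp != NULL && *pp != victim)
        pp = &(*pp)->next;
    if (*pp == victim)
        *pp = victim->next;
}

/* Traverse first, delete afterwards; returns how many entries were freed. */
static size_t tickle_all(struct conn_list *list)
{
    struct graveyard g = { NULL };
    size_t freed = 0;

    assert(list != NULL);
    traverse(list, tickle, &g);

    /* The traverse is finished, so deletion is now safe. */
    while (g.head != NULL) {
        struct conn *c = g.head;
        g.head = c->doomed_next;
        unlink_conn(list, c);
        free(c);
        freed++;
    }
    return freed;
}
```

A caller would build a list with conn_add() and call tickle_all() once per timer tick. In the actual patch the same two phases are expressed with talloc_steal() inside the callback and a single talloc_free() of the temporary parent after the traverse.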
server/ctdb_takeover.c

index ae6c0646aece8b7ab2d0ff79f5a17a2208acef53..6c771e0e75ae6e72f49e6a20d848dc368e294087 100644
@@ -1607,7 +1607,8 @@ static void tickle_connection_traverse(void *param, void *data)
 
        /* have tried too many times, just give up */
        if (con->count >= 5) {
-               talloc_free(con);
+               /* can't delete in traverse: reparent to delete_cons */
+               talloc_steal(param, con);
                return;
        }
 
@@ -1627,11 +1628,13 @@ static void ctdb_tickle_sentenced_connections(struct event_context *ev, struct t
                                              struct timeval t, void *private_data)
 {
        struct ctdb_kill_tcp *killtcp = talloc_get_type(private_data, struct ctdb_kill_tcp);
-
+       void *delete_cons = talloc_new(NULL);
 
        /* loop over all connections sending tickle ACKs */
-       trbt_traversearray32(killtcp->connections, KILLTCP_KEYLEN, tickle_connection_traverse, NULL);
+       trbt_traversearray32(killtcp->connections, KILLTCP_KEYLEN, tickle_connection_traverse, delete_cons);
 
+       /* now we've finished traverse, it's safe to do deletion. */
+       talloc_free(delete_cons);
 
        /* If there are no more connections to kill we can remove the
           entire killtcp structure