eventscript: fix bug in timeouts on forced eventscripts. Again.
author Rusty Russell <rusty@rustcorp.com.au>
Tue, 24 Nov 2009 00:36:53 +0000 (11:06 +1030)
committer Rusty Russell <rusty@rustcorp.com.au>
Tue, 24 Nov 2009 00:36:53 +0000 (11:06 +1030)
commit b90bdb07c1f6913ddbf11bde9684bdc8af61c549
tree a3f72cdb9112cff004c2ade03a79f0f01e200aa9
parent 6804f880436645b52c09a78fa300377fa8058d0e

In 15bc66ae801b0c69, Ronnie fixed a double-free race.  The problem was that
ctdb_run_eventscripts() hands a context to ctdb_event_script_callback() to
hang its data off, which gets freed in the callback.  This particularly
hurt in ctdb_event_script_timeout.

There's nothing wrong with this, but obviously we should make the callback
call last of all.  At the time, ctdb_event_script_timeout() carefully
extracted everything from the struct ctdb_event_script_state before
calling ->callback.

This was cleaned up in 64da4402c6ad485f (Ronnie again), after which state
was once more referred to after the callback.  The same change also
introduced a direct use-after-free bug which caused an occasional oops.

So in our last episode (eda052101728cf92) Volker fixed this, and Michael
committed it.

But we still have the double free bug which 15bc66ae801b0c69 was supposed
to fix!  Let's try to fix this in a more permanent way, by always doing
the callback from the destructor.  This means we need to hold the status
in the state, and must not send the KILL signal if ->child is set to 0.
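
A minimal sketch of the idea, not the code in server/eventscript.c: it
uses plain malloc/free instead of talloc, and struct script_state,
script_state_free(), script_child_done(), script_timed_out() and
record_status() are hypothetical names made up for illustration.  The
point is only that every exit path stores a status and then goes through
one teardown function, which makes the callback its very last step.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct ctdb_event_script_state. */
struct script_state {
	void (*callback)(int status, void *private_data);
	void *private_data;
	int status;	/* held here until teardown reports it */
	pid_t child;	/* 0 means the child has already exited */
};

struct script_state *script_state_new(void (*cb)(int, void *),
				      void *private_data, pid_t child)
{
	struct script_state *state = calloc(1, sizeof(*state));

	if (state == NULL)
		return NULL;
	state->callback = cb;
	state->private_data = private_data;
	state->child = child;
	return state;
}

/* The single teardown path, playing the role of the destructor. */
void script_state_free(struct script_state *state)
{
	/* Copy out what the callback needs; state is gone before it runs. */
	void (*cb)(int, void *) = state->callback;
	void *private_data = state->private_data;
	int status = state->status;

	/* Only signal a child that is still running (reaping omitted). */
	if (state->child != 0)
		kill(state->child, SIGKILL);

	free(state);

	/* Callback goes last: it may tear down whatever owned the state. */
	if (cb != NULL)
		cb(status, private_data);
}

/* Child exited by itself: record the status, mark the child as gone. */
void script_child_done(struct script_state *state, int status)
{
	state->status = status;
	state->child = 0;
	script_state_free(state);
}

/* Timeout fired: record an error status; teardown kills the child. */
void script_timed_out(struct script_state *state)
{
	state->status = -ETIMEDOUT;
	script_state_free(state);
}

/* Example callback: stores the status in the int that private_data points to. */
void record_status(int status, void *private_data)
{
	*(int *)private_data = status;
}
```

Because neither script_child_done() nor script_timed_out() touches state
after handing it to script_state_free(), there is no window in which the
callback can free something that is then used or freed a second time.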

Finally, add a comment about freeing ourselves in run_eventscripts_callback
and the structure definition.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
server/eventscript.c