config/events.d/README

This directory is where you should put any local or
application-specific event scripts for ctdb to call.

All event scripts start with the prefix 'NN.' where N is a digit.
The event scripts are run in sequence based on NN.
Thus 10.interfaces will be run before 60.nfs.

Each NN must be unique; duplicates will cause undefined behaviour.
For example, having both 10.interfaces and 10.otherstuff is not allowed.


As a special case, any event script whose name ends with a '~' character
will be ignored, since this is a common suffix that some editors append
to backup copies of a file.
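
As a rough illustration of the naming rules above, the following shell
sketch lists the scripts in a directory in the order they would be
considered and reports any NN that is used more than once. The function
names list_event_scripts and duplicate_prefixes are made up for this
example; this is not how ctdb itself scans the directory.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ctdb.

# Print the event scripts found in directory $1, one per line, in NN order.
list_event_scripts() {
    dir=$1
    for path in "$dir"/[0-9][0-9].*; do
        [ -e "$path" ] || continue      # the glob matched nothing
        name=${path##*/}
        case "$name" in
            *~) continue ;;             # editor backup file, ignored
        esac
        printf '%s\n' "$name"
    done | sort
}

# Print every NN prefix in directory $1 that is used by more than one script.
duplicate_prefixes() {
    list_event_scripts "$1" | cut -c1-2 | uniq -d
}
```

Running "duplicate_prefixes ." inside this directory should print nothing
when every NN is unique.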


The event scripts are called with a varying number of arguments.
The first argument is the "event" and the rest of the arguments depend
on which event was triggered.
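
A minimal sketch of how a script might dispatch on that first argument is
shown here. The function name handle_event and the messages it prints are
assumptions made for illustration; only the event names and the three
'takeip' arguments come from this README.

```shell
#!/bin/sh
# Skeleton of a hypothetical event script.

handle_event() {
    event=$1
    shift                       # the remaining arguments depend on the event
    case "$event" in
        startup)
            echo "startup: no additional arguments"
            ;;
        takeip)
            echo "takeip: interface=$1 ipaddress=$2 netmask=$3"
            ;;
        *)
            :                   # events this script does not handle
            ;;
    esac
    return 0
}

# Only dispatch when an event name was actually passed in.
if [ $# -gt 0 ]; then
    handle_event "$@"
fi
```

Events the script does not care about fall through to the default branch
and report success.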

The events currently implemented are:
startup
        This event does not take any additional arguments.
        This event is only invoked once, when ctdb is starting up.
        This event is used to wait for the service to start and for all
        resources for the service to become available.

        This is used to prevent ctdb from starting up and advertising its
        services until all dependent services have become available.

        All services that are managed by ctdb should implement this
        event and use it to start the service.

        Example: 50.samba uses this event to start the samba daemon
        and then wait until samba and all its associated services have
        become available. It then also proceeds to wait until all
        shares have become available.
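
        The start-then-wait pattern described above could look roughly
        like the sketch below. This is not taken from 50.samba:
        wait_until, start_example_service and example_service_is_ready
        are hypothetical names, and the limit of 30 tries and the
        non-zero exit on timeout are assumptions.

```shell
#!/bin/sh
# Sketch only: none of these function names are provided by ctdb.

# wait_until TRIES CMD [ARGS...]
# Run CMD up to TRIES times, one second apart, until it succeeds.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Placeholders: replace with the real commands for the managed service.
start_example_service()    { :; }
example_service_is_ready() { :; }

case "${1:-}" in
    startup)
        start_example_service || exit 1
        # Assumption: give up with a non-zero status after 30 tries.
        wait_until 30 example_service_is_ready || exit 1
        ;;
esac
```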

shutdown
        This event is called when the ctdb service is shutting down.

        All services that are managed by ctdb should implement this event
        and use it to perform a controlled shutdown of the service.

        Example: 60.nfs uses this event to shut down nfs and all associated
        services and stop exporting any shares when this event is invoked.

monitor
        This event is invoked at a regular interval.
        The interval can be configured using the MonitorInterval tunable
        and defaults to 15 seconds.

        This event is triggered by ctdb to continuously monitor that all
        managed services are healthy.
        When invoked, the event script will check that the service is healthy
        and return 0 if so. If the service is not healthy the event script
        should return non-zero.

        If the event script returns non-zero, ctdb will consider the node
        status as UNHEALTHY and will fail over the public address and all
        associated services to a different node in the cluster.

        All managed services should implement this event.

        Example: 10.interfaces which checks that the public interface (if used)
        is healthy, i.e. it has a physical link established.
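
        A health check in the spirit of that example might be sketched
        as follows. It assumes a Linux host, where the carrier file
        under /sys/class/net reads 1 when a link is established. It is
        not necessarily how 10.interfaces performs its check, and
        EXAMPLE_IFACE, SYSFS_NET and link_is_up are hypothetical names.

```shell
#!/bin/sh
# Sketch only: variable and function names are made up for illustration.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}
EXAMPLE_IFACE=${EXAMPLE_IFACE:-eth0}

# Succeed only if interface $1 reports an established physical link.
link_is_up() {
    carrier_file="$SYSFS_NET/$1/carrier"
    [ -r "$carrier_file" ] || return 1
    [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]
}

case "${1:-}" in
    monitor)
        # Healthy: fall through with status 0. Unhealthy: non-zero.
        if ! link_is_up "$EXAMPLE_IFACE"; then
            echo "no link on $EXAMPLE_IFACE" >&2
            exit 1
        fi
        ;;
esac
```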

takeip
        This event is triggered every time the node takes over a public ip
        address during recovery.
        This event takes three additional arguments:
        'interface', 'ipaddress' and 'netmask'

        This event will always be followed by a 'recovered' event once
        all ip addresses have been reassigned to new nodes and the ctdb database
        has been recovered.
        If multiple ip addresses are reassigned during recovery it is
        possible to get several 'takeip' events followed by a single
        'recovered' event.

        Since taking over an ip address might involve substantial work for
        the service, and since multiple ip addresses might be taken over in
        a single recovery, it is often best to only mark which addresses
        are being taken over in this event and defer the actual work to
        reconfigure or restart the services until the 'recovered' event.

        Example: 60.nfs which just records which ip addresses are being taken
        over into a local state directory and which defers the actual
        restart of the services until the 'recovered' event.


releaseip
        This event is triggered every time the node releases a public ip
        address during recovery.
        This event takes three additional arguments:
        'interface', 'ipaddress' and 'netmask'

        In all other regards this event is analogous to the 'takeip' event
        above.

        Example: 60.nfs

recovered
        This event is triggered every time a full ctdb recovery has completed
        and all public ip addresses have been reassigned among the nodes.

        Example: 60.nfs which, if the ip address configuration has changed
        during the recovery (i.e. if addresses have been taken over or
        released), will kill off any tcp connections that exist for that
        service and also send out statd notifications to all registered
        clients.
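
The record-now, act-later pattern described for 'takeip', 'releaseip' and
'recovered' can be sketched as below. This is not the code of 60.nfs: the
state directory path, the file name ip-changed and the function names are
all hypothetical, and the real work is left as a placeholder.

```shell
#!/bin/sh
# Sketch only: paths and function names are made up for illustration.
STATE_DIR=${STATE_DIR:-/var/run/example-service}

# takeip/releaseip: only record what changed, do no heavy work yet.
# Arguments: event interface ipaddress netmask
note_ip_change() {
    mkdir -p "$STATE_DIR" || return 1
    printf '%s %s %s %s\n' "$1" "$2" "$3" "$4" >> "$STATE_DIR/ip-changed"
}

# recovered: act once, and only if something was recorded.
reconfigure_if_changed() {
    [ -f "$STATE_DIR/ip-changed" ] || return 0
    echo "address configuration changed:"
    cat "$STATE_DIR/ip-changed"
    # Placeholder: restart or reconfigure the real service here.
    rm -f "$STATE_DIR/ip-changed"
}

case "${1:-}" in
    takeip|releaseip) note_ip_change "$@" ;;
    recovered)        reconfigure_if_changed ;;
esac
```

Several 'takeip' or 'releaseip' events append to the same file, so the
single 'recovered' event that follows does the expensive work only once.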


Additional note for takeip, releaseip, recovered:

ALL services that depend on the ip address configuration of the node must
implement all three of these events.

ALL services that use TCP should also implement these events and at least
kill off any tcp connections to the service if the ip address config has
changed, in a similar fashion to how 60.nfs does it.
The reason one must do this is that ESTABLISHED tcp connections may survive
when an ip address is released and removed from the host, until the ip
address is taken over again.
Any tcp connections that survive a releaseip/takeip sequence can potentially
leave the client and server out of sync on sequence and ack numbers
and cause a disruptive ack storm.