ctdb.git
8 years ago  client: Increase the timeout for TRANS3_COMMIT control  (refs: 2.5, ctdb-2.5.6)
Amitay Isaacs [Thu, 10 Mar 2016 07:01:31 +0000 (18:01 +1100)]
client: Increase the timeout for TRANS3_COMMIT control

On a busy system, the TRANS3_COMMIT control can take up to, or longer
than, 3 seconds.  On timeout, there are a few possible outcomes.

1. The transaction has completed on all nodes and TRANS3_COMMIT control
   has returned.  In such a case, there is no problem.

2. The transaction has completed on the local node, but the TRANS3_COMMIT
   control is still active.  In such a case, ctdb_transaction_commit()
   can return successfully.  If it was called from the ctdb tool, the
   tool will then exit.  The ctdb daemon will trigger a recovery because
   the client exited while the transaction was still active, and that
   recovery is unnecessary.

3. Database recovery was started and ctdb_transaction_commit() will
   retry till the recovery completes the transaction.

Increasing the timeout to 30 seconds will avoid the spurious database
recoveries when TRANS3_COMMIT control takes longer to finish.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Fri Mar 11 19:59:53 CET 2016 on sn-devel-144

(Imported from commit ad5b9c3df2f2e3c93642fb1c069a6f4c56eb94f4)

8 years ago  common: For AF_PACKET socket types, protocol is in network order
Amitay Isaacs [Thu, 3 Mar 2016 03:17:40 +0000 (14:17 +1100)]
common: For AF_PACKET socket types, protocol is in network order

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11770

From man page of packet(7):

                                             protocol is the  IEEE  802.3
   protocol  number  in  network  byte  order.  See the <linux/if_ether.h>
   include file for a list of allowed protocols.  When protocol is set  to
   htons(ETH_P_ALL),  then all protocols are received.

The protocol argument was wrongly changed from network order to host
order in commit 9f8395cb7d49b63a82f75bf504f5f83920102b29.

Specifying the "protocol" field in the socket(AF_PACKET, ...) call only
affects the packets that are received.  So use protocol = 0 when sending
raw packets.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Mar  4 12:58:50 CET 2016 on sn-devel-144

(Imported from commit f5b6a5b13406c245ab9cc8c1699483af9eb21f88)
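
To illustrate why the byte order matters in the commit above, here is a
small worked example of the 16-bit byte swap that htons() performs on a
little-endian host.  The helper name swap16 is hypothetical and exists
only for this illustration; ETH_P_ALL is 0x0003 in <linux/if_ether.h>.

```sh
# swap16: hypothetical helper, not part of CTDB.  It swaps the two bytes
# of a 16-bit value, which is what htons() does on a little-endian host.
swap16 () {
    _v=$(( $1 ))
    printf '0x%04x\n' $(( ((_v & 0xff) << 8) | ((_v >> 8) & 0xff) ))
}

swap16 0x0003    # ETH_P_ALL; prints 0x0300
```

On such a host, passing the unswapped value to socket(AF_PACKET, ...)
would therefore name a different protocol number than the one intended.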

8 years ago  common: Use documented names for protocol family in socket()
Amitay Isaacs [Thu, 28 Jan 2016 13:06:18 +0000 (00:06 +1100)]
common: Use documented names for protocol family in socket()

Instead of using PF_*, use AF_*.

https://bugzilla.samba.org/show_bug.cgi?id=11705

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit 9f94620a308a3b17c1886c2c4807b34b8d5edacb)

8 years ago  common: Protocol argument must be in host order for socket() call
Amitay Isaacs [Thu, 28 Jan 2016 13:05:26 +0000 (00:05 +1100)]
common: Protocol argument must be in host order for socket() call

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11705

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit 9f8395cb7d49b63a82f75bf504f5f83920102b29)

8 years ago  Update NEWS
Martin Schwenke [Wed, 24 Feb 2016 23:12:42 +0000 (10:12 +1100)]
Update NEWS

Signed-off-by: Martin Schwenke <martin@meltin.net>

8 years ago  scripts: Fix regression in updateip code
Martin Schwenke [Fri, 18 Dec 2015 04:33:38 +0000 (15:33 +1100)]
scripts: Fix regression in updateip code

Regression introduced in commit
6471541d6d2bc9f2af0ff92b280abbd1d933cf88.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit d8e4c5a468286ecc1c38ecd66a3606e84db02373)

8 years ago  daemon: Don't leak memory if not using recovery lock
Martin Schwenke [Mon, 11 Jan 2016 02:41:30 +0000 (13:41 +1100)]
daemon: Don't leak memory if not using recovery lock

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit 24160ee6a4a0727840d73955b99aef690450f345)

8 years ago  daemon: Drop the "schedule for deletion" messages to DEBUG level
Martin Schwenke [Thu, 17 Dec 2015 01:27:58 +0000 (12:27 +1100)]
daemon: Drop the "schedule for deletion" messages to DEBUG level

Thousands of these can be generated each second, rendering INFO level
debugging useless.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit 8f73ae03cc50f26e85b78e35bf22e40eb1ff7684)

8 years ago  Fix CID 1347319 Unchecked return value
Volker Lendecke [Thu, 7 Jan 2016 20:14:05 +0000 (21:14 +0100)]
Fix CID 1347319 Unchecked return value

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
(Imported from commit 0cb8b9d113b322f784100365669d2be8b7fa635a)

8 years ago  ping_pong: add -l option
Ralph Boehme [Sat, 9 May 2015 23:39:16 +0000 (01:39 +0200)]
ping_pong: add -l option

Add a new option -l to check whether POSIX byte range locks are
working. Usage:

node1$ touch /path/to/cluster-fs/FILE

node1$ ./bin/ping_pong -l /path/to/cluster-fs/FILE
Holding lock, press any key to continue...
You should run the same command on another node now.

node2$ ./bin/ping_pong -l /path/to/cluster-fs/FILE

Output can either be:

  Holding lock, press any key to continue...

This means POSIX byte range locks are *not* working.

If you see this instead:

  file already locked, calling check_lock to tell us who has it locked...:
  check_lock failed: lock held: pid='27375', type='1', start='0', len='0'
  Working POSIX byte range locks

Congrats, you have a cluster fs with functional byte range locks!

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Thu Dec 10 08:48:38 CET 2015 on sn-devel-104

(Imported from commit 2f16675a2294c8197ad45862c3e8a4fa2061d2e9)

8 years ago  Fix a 32-bit problem
Volker Lendecke [Thu, 3 Sep 2015 14:25:02 +0000 (16:25 +0200)]
Fix a 32-bit problem

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Thu Sep  3 22:12:02 CEST 2015 on sn-devel-104

(Imported from commit 239062a062bb70c50bf884885b58054c44c9ebf8)

8 years ago  Fix some clang uninitialized errors
Volker Lendecke [Wed, 19 Aug 2015 05:35:32 +0000 (07:35 +0200)]
Fix some clang uninitialized errors

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
(Imported from commit 963874279997b98c8b29bee6d2417f81a0e8b0d2)

8 years ago  pmda: Add missing prototype declaration for non-static function
Amitay Isaacs [Mon, 3 Aug 2015 05:36:06 +0000 (15:36 +1000)]
pmda: Add missing prototype declaration for non-static function

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11434

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 6538ba5243a043bc727039a16a7a9d5d8027fa06)

8 years ago  scripts: Fix regression in VLAN interface support
Martin Schwenke [Tue, 7 Jul 2015 10:49:38 +0000 (20:49 +1000)]
scripts: Fix regression in VLAN interface support

Commit 6471541d6d2bc9f2af0ff92b280abbd1d933cf88 broke support for VLAN
interfaces.  Releasing a public IP address depends on
ip_maskbits_iface() and for a VLAN interface this will return an
interface of the form <vlan>@<iface>, which can't be fed back into
"ip" commands.

Update ip_maskbits_iface() to drop the '@' and everything after it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reported-by: Jan Schwaratzki <jschwaratzki@ddn.com>
(Imported from commit 87c5c96b767aa317dd620f89ac3e11bb40dae70f)
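
A minimal sketch of the string handling described in the commit above,
assuming POSIX shell parameter expansion is used.  The function name
strip_vlan_parent is hypothetical; this is not the actual body of
ip_maskbits_iface().

```sh
# strip_vlan_parent: hypothetical name, for illustration only.
# Drops the first '@' and everything after it from an interface name.
strip_vlan_parent () {
    printf '%s\n' "${1%%@*}"
}

strip_vlan_parent "eth0.100@eth0"    # prints eth0.100
```

A name without '@' passes through unchanged, so the same call is safe
for non-VLAN interfaces.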

8 years ago  Accept hex format for pdelete and ptrans commands
Christof Schmitt [Mon, 6 Jul 2015 21:32:15 +0000 (14:32 -0700)]
Accept hex format for pdelete and ptrans commands

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
(Imported from commit cd55349e9b0cfc9bb8c04a5cfa3c142efead6b83)

8 years ago  Create helper function for optional hex input
Christof Schmitt [Mon, 6 Jul 2015 20:07:33 +0000 (13:07 -0700)]
Create helper function for optional hex input

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
(Imported from commit 663db9fbb028fe524bb0eef09398c62bf4fb08d4)

8 years ago  Accept the key in hex format for the pstore command
Christof Schmitt [Thu, 2 Jul 2015 20:06:32 +0000 (13:06 -0700)]
Accept the key in hex format for the pstore command

This follows the same pattern as the tstore command, and it allows
specifying key strings with a trailing \0 character.

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Mon Jul  6 23:23:22 CEST 2015 on sn-devel-104

(Imported from commit cdbc6d92c6bf0645c5a23955e8ec5e253212e86d)

8 years ago  scripts: Only write to /proc route flush files if they exist
Martin Schwenke [Wed, 24 Jun 2015 11:06:22 +0000 (21:06 +1000)]
scripts: Only write to /proc route flush files if they exist

On IPv4-only or IPv6-only systems one of these files will not exist.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 0c609c95051ff8c1e8fd61acc6abc7e4b4c4441b)
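
The guard described in the commit above could look roughly like this.
The helper name write_if_present is hypothetical, and the /proc paths in
the usage comment are the standard Linux route flush files rather than
something quoted from the CTDB scripts.

```sh
# write_if_present: hypothetical helper.  Writes VALUE to each listed
# file that exists and is writable, silently skipping the others.
write_if_present () {
    _value="$1"
    shift
    for _f in "$@" ; do
        if [ -w "$_f" ] ; then
            echo "$_value" >"$_f"
        fi
    done
}

# Intended use (needs root, so it is left commented out here):
# write_if_present 1 /proc/sys/net/ipv4/route/flush \
#                    /proc/sys/net/ipv6/route/flush
```

On an IPv4-only or IPv6-only system the missing file is simply skipped
instead of producing a shell redirection error.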

8 years ago  scripts: Create the directory containing the recovery lock
Martin Schwenke [Tue, 19 May 2015 18:19:09 +0000 (04:19 +1000)]
scripts: Create the directory containing the recovery lock

This will handle the most obvious cases.  It won't handle the case
where the directory is missing and the recovery lock location is
updated at run-time.  However, this is a good improvement.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 27674c413de5cadae633c834d4f2e41f26ab455e)
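
As a rough sketch of the approach in the commit above, the parent
directory can be derived from the lock path and created if needed.  The
helper name ensure_parent_dir is hypothetical, and the usage comment
assumes the lock path is held in CTDB_RECOVERY_LOCK.

```sh
# ensure_parent_dir: hypothetical helper that creates the directory
# containing the given file path, if it does not already exist.
ensure_parent_dir () {
    _dir=$(dirname "$1")
    mkdir -p "$_dir"
}

# Intended use, assuming CTDB_RECOVERY_LOCK holds the lock file path:
# [ -n "$CTDB_RECOVERY_LOCK" ] && ensure_parent_dir "$CTDB_RECOVERY_LOCK"
```

Because this would run once at startup, it cannot help when the lock
location is changed at run-time, which is the limitation the commit
message notes.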

8 years ago  ping_pong: Fix CID 1273087 Resource leak
Volker Lendecke [Sun, 3 May 2015 09:34:41 +0000 (09:34 +0000)]
ping_pong: Fix CID 1273087 Resource leak

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ira Cooper <ira@samba.org>
(Imported from commit bfbaf51cd627b2d1052dd23be4b0df5e004cc92f)

8 years ago  Revert "ctdb-recoverd: Abort when daemon can take recovery lock during recovery"
Martin Schwenke [Mon, 4 May 2015 05:27:19 +0000 (15:27 +1000)]
Revert "ctdb-recoverd: Abort when daemon can take recovery lock during recovery"

This reverts commit 39d2fd330a60ea590d76213f8cb406a42fa8d680.

An election can occur in the middle of a recovery.  During the
election the recovery master can change.  When a node loses a round of
the election and stops being the recovery master it releases the
recovery lock.  Then at the end of the ongoing recovery all nodes are
able to take the recovery lock so they will all abort.

The most likely cause for a change in recovery master is that several
(all?) nodes are starting up and the "connected-ness" of each node is
a primary factor in winning the election.  In this situation the
recovery master can bounce around the cluster.

The simplest solution is to revert this patch so that the recovery
will fail.  The new recovery master will then start a new recovery.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon May  4 10:40:36 CEST 2015 on sn-devel-104

(Imported from commit 20a7945a2695d7ed811237adde5af6549e53c6e9)

8 years ago  scripts: Run tdb checker under timeout command
Amitay Isaacs [Tue, 28 Apr 2015 13:15:37 +0000 (23:15 +1000)]
scripts: Run tdb checker under timeout command

If a tdb database file grows beyond 4GB, tdbtool/tdbdump can hang
indefinitely.  This will prevent CTDB from starting up.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit f6af2d96c275ad7614671aabac1e21f9d58b1585)
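
A minimal sketch of running a checker under a time limit, assuming
timeout(1) from GNU coreutils is available.  The wrapper name
run_with_timeout, the 10 second limit and the database path in the usage
comment are illustrative assumptions, not values from the CTDB scripts.

```sh
# run_with_timeout: hypothetical wrapper around timeout(1).  Runs the
# command and returns non-zero if it fails or does not finish within
# the given number of seconds.
run_with_timeout () {
    _secs="$1"
    shift
    timeout "$_secs" "$@"
}

# Intended use; the limit and the path are illustrative assumptions:
# run_with_timeout 10 tdbtool /path/to/db.tdb check || echo "check failed"
```

A hung checker then turns into an ordinary failure that the startup
code can handle, rather than blocking startup indefinitely.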

8 years ago  scripts: New configuration variable CTDB_NODE_ADDRESS
Martin Schwenke [Sun, 19 Apr 2015 23:53:23 +0000 (09:53 +1000)]
scripts: New configuration variable CTDB_NODE_ADDRESS

Required when automatic address detection can not be used.  This can
be the case when running multiple ctdbd daemons/nodes on the same
physical host (usually for testing), using InfiniBand for the private
network or on Linux when sysctl net.ipv4.ip_nonlocal_bind=1.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Apr 27 06:10:08 CEST 2015 on sn-devel-104

(Imported from commit 0621f07eb482daf7495f6314b0af32853573cb82)

8 years ago  scripts: Replace uses of "ctdb pnn" with ctdb_get_pnn()
Martin Schwenke [Sun, 19 Apr 2015 09:45:41 +0000 (19:45 +1000)]
scripts: Replace uses of "ctdb pnn" with ctdb_get_pnn()

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 1092f9755fed331251ae508f1e04e85dc47ae902)

8 years ago  scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn()
Martin Schwenke [Sat, 18 Apr 2015 12:00:49 +0000 (22:00 +1000)]
scripts: Changed uses of "ctdb xpnn" to ctdb_get_pnn()

"ctdb xpnn" does not work when sysctl net.ipv4.ip_nonlocal_bind=1,
since it determines the node by attempting to bind to each address in
the nodes file.  The solution is to not use "ctdb xpnn".  After the
initial call, ctdb_get_pnn() will be more efficient than "ctdb xpnn".

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 09b5e4978ab1df09f47156147848a6bf099ea665)

8 years ago  tests: New function ctdb_set_pnn() to change PNN
Martin Schwenke [Sat, 18 Apr 2015 11:55:50 +0000 (21:55 +1000)]
tests: New function ctdb_set_pnn() to change PNN

ctdb_get_pnn() incorrectly caches to the same file regardless of what
node is selected via FAKE_CTDB_PNN.

Instead, set the PNN using new function ctdb_set_pnn(), which also
makes CTDB_VARDIR point to a node-specific subdirectory.  This means
that ctdb_get_pnn() will correctly cache to the node-specific
directory.

Fake tickle and TDB files/directories used by the ctdb stub need to be
the same across all PNNs, so change these to use
$EVENTSCRIPTS_TESTS_VAR_DIR instead of node-specific $CTDB_VARDIR.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit af93ae1a540003824b32301d3c9f09c713f1fa7a)

8 years ago  scripts: New function ctdb_get_pnn() does cached retrieval of PNN
Martin Schwenke [Fri, 17 Apr 2015 10:44:15 +0000 (20:44 +1000)]
scripts: New function ctdb_get_pnn() does cached retrieval of PNN

This avoids the expense of establishing a client connection to the
daemon just to get the PNN of the current node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 579dda6858f547d360073cd67235e49ab03b355e)
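
The caching idea in the commit above can be sketched as follows.  This
is not the actual ctdb_get_pnn() implementation: the helper name
cached_output is hypothetical, and the cache file name in the usage
comment is illustrative.

```sh
# cached_output: hypothetical helper.  On the first call it runs the
# given command and stores its output in the cache file; later calls
# print the cached file without running the command again.
cached_output () {
    _cache="$1"
    shift
    if [ ! -s "$_cache" ] ; then
        "$@" >"$_cache" || { rm -f "$_cache" ; return 1 ; }
    fi
    cat "$_cache"
}

# Intended use, assuming CTDB_VARDIR is set and the command prints the
# value to cache:
# cached_output "$CTDB_VARDIR/my-pnn" ctdb pnn
```

Only the first call pays for a client connection to the daemon; a
failed first call removes the empty cache file so that a later call can
retry.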

8 years ago  Fix the O3 developer build
Volker Lendecke [Tue, 21 Apr 2015 08:34:54 +0000 (10:34 +0200)]
Fix the O3 developer build

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit b8ac9853b0483fc4af82f731337464f9b5aaf53c)

8 years ago  Coverity fix for CID 1125630
Rajesh Joseph [Thu, 16 Apr 2015 06:25:53 +0000 (11:55 +0530)]
Coverity fix for CID 1125630

Due to the use of the CTDB_NO_MEMORY macro,
some resources are not freed in failure cases.

Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Günther Deschner <gd@samba.org>
Autobuild-Date(master): Fri Apr 17 16:49:05 CEST 2015 on sn-devel-104

(Imported from commit 9b33732a57a919059bf17e9348a60019146e9e1d)

8 years ago  Coverity fix for CID 1125625
Rajesh Joseph [Thu, 16 Apr 2015 06:55:28 +0000 (12:25 +0530)]
Coverity fix for CID 1125625

Memory allocated by ctdb_sys_find_ifname is not
freed by the caller.

Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit a689cd5d955214fe94f19af9d1b5aec6d44d568a)

8 years ago  check for talloc_asprintf() failure
David Disseldorp [Tue, 31 Mar 2015 16:06:43 +0000 (18:06 +0200)]
check for talloc_asprintf() failure

Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Wed Apr  1 15:36:03 CEST 2015 on sn-devel-104

(Imported from commit 12309f8bfb70878bec5fcec4681eb4e463e07357)

8 years ago  Fix CID 1125615 Copy into fixed size buffer
Volker Lendecke [Thu, 26 Mar 2015 12:11:14 +0000 (13:11 +0100)]
Fix CID 1125615 Copy into fixed size buffer

Might be a "can't happen", but strcpy always looks fishy

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit 508b45fca93ca2dfb048fdf7465602bc34df42db)

8 years ago  Fix CID 1125634 Out-of-bounds write
Volker Lendecke [Thu, 26 Mar 2015 12:06:26 +0000 (13:06 +0100)]
Fix CID 1125634 Out-of-bounds write

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit 93d4e801298d8ebb7261adbfc2bdb1a5fbe7115c)

8 years ago  Fix 1125553 Buffer not null terminated
Volker Lendecke [Sat, 7 Mar 2015 10:29:21 +0000 (10:29 +0000)]
Fix 1125553 Buffer not null terminated

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ira Cooper <ira@samba.org>
(Imported from commit 621bd0784290f24e229caf0590206805f6f2e75c)

8 years ago  build: fix building with external libtdb
Michael Adam [Wed, 11 Jun 2014 16:16:34 +0000 (18:16 +0200)]
build: fix building with external libtdb

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Jelmer Vernooij <jelmer@samba.org>
(Imported from commit 4106cf2eb969da179720067c86728441442cde59)

8 years ago  daemon avoid goto ctdb_remove_orphaned_ifaces()
Gregor Beck [Mon, 31 Mar 2014 06:04:21 +0000 (08:04 +0200)]
daemon avoid goto ctdb_remove_orphaned_ifaces()

Signed-off-by: Gregor Beck <gbeck@sernet.de>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Tue Apr  1 02:59:05 CEST 2014 on sn-devel-104

(Imported from commit 6cdde2711b5b4ad09f9703b2558db7c5d90e9a35)

8 years ago  daemon take a shortcut in all_nodes_are_disabled()
Gregor Beck [Mon, 31 Mar 2014 05:50:45 +0000 (07:50 +0200)]
daemon take a shortcut in all_nodes_are_disabled()

Signed-off-by: Gregor Beck <gbeck@sernet.de>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit dd56afc7df1149e809486bc0f1c336a42bc7c0aa)

8 years ago  recoverd: LCP2 cleanups
Martin Schwenke [Fri, 7 Feb 2014 06:32:12 +0000 (17:32 +1100)]
recoverd: LCP2 cleanups

* Remove unnecessary candimbl parameter.

  This parameter can be cheaply calculated in
  lcp2_failback_candidate().  The compiler will probably do an
  excellent job optimising it.  :-)

* Clarify a debug statement

  This is much clearer than doing a complex recalculation of a known
  value.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 24b734f084de36160d065dc639100eab3b186f6c)

8 years ago  recoverd: Optimise check for rebalance candidates in LCP2
Martin Schwenke [Fri, 7 Feb 2014 03:28:54 +0000 (14:28 +1100)]
recoverd: Optimise check for rebalance candidates in LCP2

Currently this can be checked many times.  However, there's no point
calling the rebalance/failback code at all if there are no rebalance
candidates.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 9e5ef44f32fad6606bd95e619f0720a72344e441)

8 years ago  Fix CID 1138340 Resource leak
Volker Lendecke [Sun, 15 Dec 2013 19:28:53 +0000 (20:28 +0100)]
Fix CID 1138340 Resource leak

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
(Imported from commit c943937ec69f6547533f34ae83a268960395b521)

8 years ago  Fix CID 1138341 Resource leak
Volker Lendecke [Sun, 15 Dec 2013 19:28:04 +0000 (20:28 +0100)]
Fix CID 1138341 Resource leak

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
(Imported from commit b2937fd6186003740f3bef3c2f4fd54a4d3cf335)

8 years ago  scripts: Fix CTDB_DBDIR=tmpfs support
Martin Schwenke [Tue, 17 Nov 2015 03:57:44 +0000 (14:57 +1100)]
scripts: Fix CTDB_DBDIR=tmpfs support

Various scripts (including debug_locks.sh, 00.ctdb, 05.system) need
CTDB_DBDIR to point to the right place... but it doesn't.

Move the rewriting of CTDB_DBDIR to loadconfig() so that it happens
for all scripts.  Have this code set internal variable
CTDB_DBDIR_TMPFS_OPTIONS so that ctdbd_wrapper can do the mount.

This loses the generality that was present in dbdir_tmpfs_start() but
it wasn't being used anyway.  If it is needed in the future then it
will be in the git history.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Nov 18 11:51:54 CET 2015 on sn-devel-104

(Imported from commit d9677894b7aa2248e1884ab9e21667879bf1e3c4)

8 years ago  scripts: Add support for CTDB_DBDIR in tmpfs
Martin Schwenke [Fri, 23 Oct 2015 03:04:04 +0000 (14:04 +1100)]
scripts: Add support for CTDB_DBDIR in tmpfs

The tmpfs is mounted and unmounted by ctdbd_wrapper.  Format is
CTDB_DBDIR=tmpfs:<tmpfs-options>.  The only default for the tmpfs is
mode=700 - to override, specify a different value in <tmpfs-options>.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
Autobuild-User(master): Michael Adam <obnox@samba.org>
Autobuild-Date(master): Mon Nov  9 10:58:32 CET 2015 on sn-devel-104

(Imported from commit be670ef0103878d8d939de5972b567c4db404082)
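
One way to turn the CTDB_DBDIR=tmpfs:<tmpfs-options> format from the
commit above into mount options is sketched below.  The function name
tmpfs_mount_options is hypothetical and this is not the ctdbd_wrapper
code.  It assumes that when an option is repeated the last occurrence
takes effect, so placing mode=700 first lets a user-supplied mode=
override it.

```sh
# tmpfs_mount_options: hypothetical helper.  Given a value of the form
# tmpfs[:<tmpfs-options>], print the mount options with the mode=700
# default first.  Returns 1 for values that do not request a tmpfs.
tmpfs_mount_options () {
    case "$1" in
    tmpfs|tmpfs:*)
        _opts="${1#tmpfs}"
        _opts="${_opts#:}"
        printf '%s\n' "mode=700${_opts:+,$_opts}"
        ;;
    *)
        return 1
        ;;
    esac
}

tmpfs_mount_options "tmpfs:size=1G,mode=750"  # prints mode=700,size=1G,mode=750
```

The printed string could then be passed to mount via its -o option; an
ordinary directory path makes the function return 1 so the caller can
skip the mount.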

8 years ago  scripts: Improve CTDB wrapper shutdown code
Martin Schwenke [Fri, 23 Oct 2015 03:04:04 +0000 (14:04 +1100)]
scripts: Improve CTDB wrapper shutdown code

This will make it easier to run things after CTDB is stopped.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(Imported from commit f05c6d32cce334d29e373f1e74f0b52cab14409d)

8 years ago  scripts: Drop use of "smbcontrol winbindd ip-dropped ..."
Martin Schwenke [Mon, 8 Feb 2016 04:55:17 +0000 (15:55 +1100)]
scripts: Drop use of "smbcontrol winbindd ip-dropped ..."

This is unnecessary in Samba >= 4.0 because winbindd monitors IP
addresses itself and no longer needs to be told when they are dropped.
The smbcontrol commands can hang if a node has recovery mode active
because smbcontrol is unable to connect to the registry.  Therefore,
the smbcontrol commands should be removed.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11719

Signed-off-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 519564bb35a0f840bc4d7c8c5a92441c97b49791)

8 years ago  scripts: Improve error handling for 50.samba testparm failure
Martin Schwenke [Thu, 30 Jul 2015 06:49:35 +0000 (16:49 +1000)]
scripts: Improve error handling for 50.samba testparm failure

Also add tests.  Update testparm stub to fake error and timeout.  Add
timeout stub.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 7d04778c82a8f657b6ba0173c29529fa03ab7a25)

8 years ago  tests: Run transaction tests with externally imposed timeout
Martin Schwenke [Wed, 8 Oct 2014 01:22:06 +0000 (12:22 +1100)]
tests: Run transaction tests with externally imposed timeout

This works around cases where ctdb_transaction gets stuck - this still
needs to be debugged.  However, this change will at least cause
individual tests to fail rather than having whole test runs time out.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit f4871b8736f22941b227c19656319033c0c812e8)

8 years ago  daemon: Reset database statistics when resetting statistics
Amitay Isaacs [Thu, 2 Apr 2015 02:53:09 +0000 (13:53 +1100)]
daemon: Reset database statistics when resetting statistics

When the ctdb statistics are reset, also reset the per-database
statistics to keep them consistent with the ctdb statistics.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 7949ce103f2062aa703a24f72e11be96dc497a7a)

8 years ago  system: Remove unused system specific calls
Amitay Isaacs [Mon, 3 Aug 2015 05:02:43 +0000 (15:02 +1000)]
system: Remove unused system specific calls

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit d9030d8c10ebe6f95f33cbc691b5756d97395b0f)

8 years ago  daemon: Check if updates are in flight when releasing all IPs
Martin Schwenke [Fri, 24 Jul 2015 05:32:42 +0000 (15:32 +1000)]
daemon: Check if updates are in flight when releasing all IPs

Some code involved in releasing IPs is not re-entrant.  Memory
corruption can occur if, for example, overlapping attempts are made to
ban a node.  We haven't been able to recreate the corruption but this
should protect against it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 952a50485f68b3cffdf57da84aa9bb9fde630b7e)

8 years ago  banning: If node is already banned, do not run ctdb_local_node_got_banned()
Amitay Isaacs [Mon, 27 Jul 2015 06:51:08 +0000 (16:51 +1000)]
banning: If node is already banned, do not run ctdb_local_node_got_banned()

This calls release_all_ips() only once on the first ban.  If the node gets
banned again due to event script timeout while running release_all_ips(),
then avoid calling release_all_ips() in re-entrant fashion.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 8eb04d09b119e234c88150e1dc35fc5057f9c926)

8 years ago  client: Return the correct status sent from the daemon
Amitay Isaacs [Thu, 23 Jul 2015 21:39:26 +0000 (07:39 +1000)]
client: Return the correct status sent from the daemon

If a control fails and an error message is set, the returned status of
the control is always set to -1, ignoring the status passed by the daemon.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 1286b02e24a521dafa7061d09fb5c21d1ebb3011)

8 years ago  daemon: Correctly process the exit code from failed eventscripts
Amitay Isaacs [Tue, 21 Jul 2015 06:37:04 +0000 (16:37 +1000)]
daemon: Correctly process the exit code from failed eventscripts

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Jul 22 15:03:53 CEST 2015 on sn-devel-104

(Imported from commit 00ec3c477eba50206801b451ae4eb64c12aba5db)

8 years ago  tool: Correctly print timed out event scripts output
Amitay Isaacs [Mon, 20 Jul 2015 06:37:58 +0000 (16:37 +1000)]
tool: Correctly print timed out event scripts output

The timed out error is ignored for certain events (start_recovery,
recoverd, takeip, releaseip).  If these events time out, then the debug
hung script outputs the following:

 3 scripts were executed last releaseip cycle
 00.ctdb              Status:OK    Duration:4.381 Thu Jul 16 23:45:24 2015
 01.reclock           Status:OK    Duration:13.422 Thu Jul 16 23:45:28 2015
 10.external          Status:DISABLED
 10.interface         Status:OK    Duration:-1437083142.208 Thu Jul 16 23:45:42 2015

The end time for timed out scripts is not set.  Since the status is not
returned as -ETIME for some events, ctdb scriptstatus prints a negative
duration.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 71b89b2b7a9768de437347e6678370b2682da892)
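
The negative duration quoted in the commit above is consistent with
subtracting a Unix start time from an end time that was never set.  The
worked arithmetic below assumes the unset end time is zero and drops the
fractional part; script_duration is a hypothetical helper, not a CTDB
function.

```sh
# script_duration: hypothetical helper mirroring the arithmetic
# "end time minus start time", in whole seconds.
script_duration () {
    echo $(( $2 - $1 ))
}

# Start time inferred from the 10.interface line quoted above (fraction
# dropped); the end time is 0 because it was never set.
script_duration 1437083142 0    # prints -1437083142
```

The start value corresponds to 16 July 2015, which matches the date
shown in the quoted scriptstatus output.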

8 years ago  daemon: Ignore SIGUSR1
Martin Schwenke [Tue, 21 Jul 2015 02:23:27 +0000 (12:23 +1000)]
daemon: Ignore SIGUSR1

No use dying or failing eventscripts if someone sends a random
SIGUSR1.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 21 11:00:17 CEST 2015 on sn-devel-104

(Imported from commit 65515919142c922fe6ddf63d0f50449eec445b30)

8 years ago  daemon: Return correct sequence number for CONTROL_GET_DB_SEQNUM
Amitay Isaacs [Tue, 14 Jul 2015 06:54:59 +0000 (16:54 +1000)]
daemon: Return correct sequence number for CONTROL_GET_DB_SEQNUM

Due to a missing uint64_t cast, CONTROL_GET_DB_SEQNUM always returned
a seqnum <= 256.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11398

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 14 13:03:25 CEST 2015 on sn-devel-104

(Imported from commit 1023db2543f7785e4527a4565db91edcde4ca7f1)

8 years ago  daemon: Allow a new monitor event to cancel one already in progress
Martin Schwenke [Tue, 14 Jul 2015 03:43:14 +0000 (13:43 +1000)]
daemon: Allow a new monitor event to cancel one already in progress

Before commit cbffbb7c2f406fc1d8ebad3c531cc2757232690e this was
possible and some users depend on this behaviour.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 182ebc07289c776ca104e648911a53209bcdaf00)

8 years ago  daemon: Improve error messages when eventscript control is cancelled
Martin Schwenke [Mon, 6 Jul 2015 02:02:00 +0000 (12:02 +1000)]
daemon: Improve error messages when eventscript control is cancelled

Warn specifically about cancellation instead of printing a generic
error message.  Also pass back an error message for the tool - it
could just rely on the status but it already looks at the error
message.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 122a4fda7272ec4d63452037f0b838d2bdc5a79a)

8 years ago  tools: Avoiding printing "(null)" on "ctdb eventscript" error
Martin Schwenke [Mon, 6 Jul 2015 01:48:28 +0000 (11:48 +1000)]
tools: Avoiding printing "(null)" on "ctdb eventscript" error

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit b71d18d2dc090e99d67c6bd8552380b44f8db810)

8 years ago  daemon: Avoid double-free during monitor cancellation
Amitay Isaacs [Fri, 10 Jul 2015 04:02:29 +0000 (14:02 +1000)]
daemon: Avoid double-free during monitor cancellation

The eventscript state should never be freed externally, so it should
never be allocated off a temporary context.  It will either be freed
by the handler or in the cancellation code.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit f951ff13838e796cd6661d800daf460247cac60b)

8 years ago  tests: Add some 10.interfaces VLAN tests
Martin Schwenke [Wed, 8 Jul 2015 12:22:09 +0000 (22:22 +1000)]
tests: Add some 10.interfaces VLAN tests

One without a bond, one with a bond.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 8ed0cacaf4aa9fc63b8c8d610a6164c5d01e473a)

8 years ago  tests: Add VLAN support to the "ip link" stub
Martin Schwenke [Wed, 8 Jul 2015 12:14:51 +0000 (22:14 +1000)]
tests: Add VLAN support to the "ip link" stub

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 8e41cb1e4e7b4a7d92628771260649ded4432772)

8 years agotests: Interface number in "ip link show" stub defaults to 42
Martin Schwenke [Wed, 8 Jul 2015 11:39:51 +0000 (21:39 +1000)]
tests: Interface number in "ip link show" stub defaults to 42

It needs to have a default for the standalone case, when it is not run
in a loop inside "ip addr show".

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 4f84d42b511a4c9a79bd835eeca0a80082e76227)

8 years agoscripts: Support monitoring of interestingly named VLANs on bonds
Martin Schwenke [Wed, 8 Jul 2015 11:23:48 +0000 (21:23 +1000)]
scripts: Support monitoring of interestingly named VLANs on bonds

VLAN interfaces on bonds with a name other than <iface>.<id>@<iface>
are not currently supported.  That is, where the VLAN name isn't based
on the underlying bond name.  Such VLAN interfaces can be created with
the "ip link" command, as opposed to the "vconfig" command, or by
renaming a VLAN interface.

This is improved by determining the underlying interface name for a
VLAN from the output of "ip link".

No serious attempt is made to support VLANs with '@' in their name,
although this seems to be legal.  Why would you do that?

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit bc71251433ce618c95c674d7cbe75b01a94adad9)
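
Illustration (not part of the commit): the change above derives the underlying interface from the name that "ip link" prints, which has the form <vlan>@<parent>. The actual change is in the event scripts; the C helper below is only a sketch of the same idea. The function name vlan_parent() and its return convention are hypothetical. It splits at the first '@', so, as the commit notes, VLAN names that themselves contain '@' are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given the interface field printed by "ip link"
 * (for example "vlan100@bond0"), copy the part after the first '@' into
 * 'out'.  Returns 0 on success, -1 if there is no '@', the parent name
 * is empty, or the parent does not fit in 'out'.
 */
int vlan_parent(const char *field, char *out, size_t outlen)
{
	const char *at;
	size_t len;

	assert(field != NULL && out != NULL);

	at = strchr(field, '@');
	if (at == NULL) {
		return -1;	/* not a stacked interface */
	}

	len = strlen(at + 1);
	if (len == 0 || len >= outlen) {
		return -1;
	}

	memcpy(out, at + 1, len + 1);	/* include the terminating NUL */
	return 0;
}
```

The parent is taken from the "ip link" field rather than guessed from the VLAN name, so a VLAN called "vlan100" on "bond0" resolves the same way as "bond0.100".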

8 years agodaemon: Fix valgrind invalid read error in db_statistics control
Amitay Isaacs [Thu, 9 Jul 2015 04:55:59 +0000 (14:55 +1000)]
daemon: Fix valgrind invalid read error in db_statistics control

  ==20761== Invalid read of size 8
  ==20761==    at 0x11BE30: ctdb_ctrl_dbstatistics (ctdb_client.c:1286)
  ==20761==    by 0x12BA89: control_dbstatistics (ctdb.c:713)
  ==20761==    by 0x1312E0: main (ctdb.c:6543)
  ==20761==  Address 0x713b0d0 is 0 bytes after a block of size 560 alloc'd
  ==20761==    at 0x4C27A2E: malloc (vg_replace_malloc.c:270)
  ==20761==    by 0x5CB0954: _talloc_memdup (talloc.c:615)
  ==20761==    by 0x11395C: ctdb_control_recv (ctdb_client.c:1146)
  ==20761==    by 0x11BDD7: ctdb_ctrl_dbstatistics (ctdb_client.c:1265)
  ==20761==    by 0x12BA89: control_dbstatistics (ctdb.c:713)
  ==20761==    by 0x1312E0: main (ctdb.c:6543)

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 9aa90482f8ffbddf898eb8a900112f45d82f0930)

8 years agodaemon: Promote debug messages about --start-as-* to NOTICE level
Martin Schwenke [Wed, 17 Jun 2015 05:05:30 +0000 (15:05 +1000)]
daemon: Promote debug messages about --start-as-* to NOTICE level

It is important to know when ctdbd is started with --start-as-stopped
or --start-as-disabled.  Given that this only happens once it makes
sense to promote these debug items to NOTICE level.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit eb159f3ff530de8828631b04e17bf0990aed906e)

8 years agorecoverd: Clear IP assignment tree on election loss
Martin Schwenke [Thu, 11 Jun 2015 05:49:25 +0000 (15:49 +1000)]
recoverd: Clear IP assignment tree on election loss

If a node was previously recovery master (say, 20 years ago) and it
becomes recovery master again then, if IP assignments have changed,
verify_remote_ip_allocation() can produce messages like the following
when called during recovery:

  ctdbd: recoverd:Inconsistent IP allocation - node 0 thinks 10.1.1.1 is held by node 0 while it is assigned to node 1

When a node loses an election it should clear all data specific to it
being the recovery master.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit b234ae0a900052b03ca22efab8fa1b9e11f44ecc)
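
Illustration (not part of the commit): a minimal sketch of clearing master-only state when an election is lost. All names are hypothetical, and a singly linked list stands in for whatever structure the daemon really uses for IP assignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cached assignment: one public IP (as text) per entry. */
struct ip_assignment {
	char ip[46];
	int node;
	struct ip_assignment *next;
};

struct recovery_state {
	bool is_master;
	struct ip_assignment *assignments;	/* only meaningful while master */
};

/* Free every cached entry and leave the list empty. */
void clear_assignments(struct recovery_state *st)
{
	assert(st != NULL);

	while (st->assignments != NULL) {
		struct ip_assignment *next = st->assignments->next;

		free(st->assignments);
		st->assignments = next;
	}
}

/* On losing an election, drop everything that only a master should hold. */
void election_lost(struct recovery_state *st)
{
	assert(st != NULL);

	st->is_master = false;
	clear_assignments(st);
}
```

If the node later wins an election again, it starts from an empty cache instead of comparing against assignments recorded during its previous term.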

8 years agorecoverd: Add new function clear_ip_assignment_tree()
Martin Schwenke [Thu, 11 Jun 2015 05:46:27 +0000 (15:46 +1000)]
recoverd: Add new function clear_ip_assignment_tree()

This needs to be cleared to avoid stale data when a new recovery
master is elected.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 036c2a92438585ab6b99a22fcf67b67890c525f0)

8 years agovacuum: revert "Do not delete VACUUM MIGRATED records immediately"
Michael Adam [Fri, 12 Jun 2015 08:59:54 +0000 (10:59 +0200)]
vacuum: revert "Do not delete VACUUM MIGRATED records immediately"

This reverts commit 257311e337065f089df688cbf261d2577949203d.

That commit was due to a misunderstanding, and it
does not fix what it was supposed to fix.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 1898200481f64676e596e52dc177c8d70ca1a00c)

8 years agoib: make sure the tevent_fd is removed before the fd is closed
Stefan Metzmacher [Fri, 5 Jun 2015 08:30:39 +0000 (10:30 +0200)]
ib: make sure the tevent_fd is removed before the fd is closed

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11316

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
(Imported from commit 53ff3e4f31f3debd98f9293171c023a0a406858d)

8 years agolocking: move all auto_mark logic into process_callbacks()
Stefan Metzmacher [Tue, 2 Jun 2015 10:43:17 +0000 (12:43 +0200)]
locking: move all auto_mark logic into process_callbacks()

The caller should not dereference lock_ctx after invoking
process_callbacks(), as it might already have been destroyed.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 12 15:28:57 CEST 2015 on sn-devel-104

(Imported from commit b3a18d66c00dba73a3f56a6f95781b4d34db1fe2)

8 years agolocking: make process_callbacks() more robust
Stefan Metzmacher [Tue, 2 Jun 2015 10:39:17 +0000 (12:39 +0200)]
locking: make process_callbacks() more robust

We should not dereference lock_ctx after invoking the callback
in the auto_mark == false case. The callback could have destroyed it.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit a2690bc3f4e28a2ed50ccb47cb404fc8570fde6d)
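
Illustration (not part of the commit): the rule above can be followed by copying whatever is needed out of the context before the callback runs. The struct and function names below are hypothetical simplifications, and plain malloc()/free() stand in for talloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, much-simplified stand-in for a lock context. */
struct lock_ctx {
	bool auto_mark;
	void (*callback)(struct lock_ctx *ctx, void *private_data);
	void *private_data;
};

/* Returns the auto_mark value that was in effect when the callback ran. */
bool run_callback(struct lock_ctx *ctx)
{
	bool auto_mark;

	assert(ctx != NULL && ctx->callback != NULL);

	/* Read everything needed later while ctx is known to be valid. */
	auto_mark = ctx->auto_mark;

	ctx->callback(ctx, ctx->private_data);

	/* ctx may have been freed by the callback: use only the local copy. */
	return auto_mark;
}

/* Example callback that destroys the context it was called with. */
void destroying_callback(struct lock_ctx *ctx, void *private_data)
{
	int *calls = private_data;

	(*calls)++;
	free(ctx);
}
```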

8 years agolocking: Add a comment to explain auto_mark usage
Amitay Isaacs [Tue, 2 Jun 2015 03:15:37 +0000 (13:15 +1000)]
locking: Add a comment to explain auto_mark usage

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit 89849c4d31c0bb0c47864e11abc89efe7d812d87)

8 years agolocking: Avoid resetting talloc destructor
Amitay Isaacs [Tue, 2 Jun 2015 01:25:44 +0000 (11:25 +1000)]
locking: Avoid resetting talloc destructor

Let ctdb_lock_request_destructor() take care of the proper cleanup.
If the request is freed from the callback function, then the lock context
should not be freed.  Setting request->lctx to NULL takes care of that
in the destructor.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit bc747030d435447e62262541cf2e74be4c4229d8)

8 years agolocking: Avoid memory leak in the failure case
Amitay Isaacs [Tue, 2 Jun 2015 01:15:11 +0000 (11:15 +1000)]
locking: Avoid memory leak in the failure case

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit 2b352ff20597b9e34b3777d35deca1bf56209f8a)

8 years agolocking: Set destructor when lock_context is created
Amitay Isaacs [Mon, 1 Jun 2015 14:22:07 +0000 (00:22 +1000)]
locking: Set destructor when lock_context is created

There is already code in the destructor to correctly remove it from the
pending or the active queue.  This also ensures that when lock context
is in pending queue and if the lock request gets freed, the lock context
is correctly removed from the pending queue.

Thanks to Stefan Metzmacher for noticing this and suggesting the fix.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
(Imported from commit 5ae6a8f2fff5b5f4d46f496fd83f555be4b3d448)

8 years agolocking: Set the lock_ctx->request to NULL when request is freed
Stefan Metzmacher [Mon, 1 Jun 2015 14:15:11 +0000 (00:15 +1000)]
locking: Set the lock_ctx->request to NULL when request is freed

The code was added to ctdb_lock_context_destructor() to ensure that
if a lock_ctx gets freed first, the lock_request does not have a
dangling pointer.  However, the reverse is also true.  When a lock_request
is freed, then lock_ctx should not have a dangling pointer.

In commit 374cbc7b0ff68e04ee4e395935509c7df817b3c0, the code for the
second condition was dropped, causing a regression.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 752ec31bcbbfe9f5b3b1c5dde4179d69f41cb53c)
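
Illustration (not part of the commit): when two objects point at each other, each one has to clear the other's pointer when it is freed. The sketch below uses hypothetical, simplified structs and explicit free functions in place of talloc destructors.

```c
#include <assert.h>
#include <stdlib.h>

struct lock_request;

/* Hypothetical simplified pair of objects that point at each other. */
struct lock_context {
	struct lock_request *request;
};

struct lock_request {
	struct lock_context *lctx;
};

/* Connect the two objects in both directions. */
void lock_link(struct lock_context *ctx, struct lock_request *req)
{
	assert(ctx != NULL && req != NULL);

	ctx->request = req;
	req->lctx = ctx;
}

/* Freeing the context must not leave request->lctx dangling. */
void lock_context_free(struct lock_context *ctx)
{
	if (ctx == NULL) {
		return;
	}
	if (ctx->request != NULL) {
		ctx->request->lctx = NULL;
	}
	free(ctx);
}

/* The reverse case: freeing the request must clear ctx->request. */
void lock_request_free(struct lock_request *req)
{
	if (req == NULL) {
		return;
	}
	if (req->lctx != NULL) {
		req->lctx->request = NULL;
	}
	free(req);
}
```

With both directions handled, either object can be freed first and the survivor sees NULL rather than a stale pointer.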

8 years agolocking: Avoid memory corruption in ctdb_lock_context_destructor
Stefan Metzmacher [Tue, 26 May 2015 14:45:34 +0000 (16:45 +0200)]
locking: Avoid memory corruption in ctdb_lock_context_destructor

If the lock request is freed from within the callback, then setting
lock_ctx->request to NULL in ctdb_lock_context_destructor will end up
corrupting memory.  In this case, lock_ctx->request could be reallocated
and point to something else.  This may cause an unexpected abort when
trying to dereference a NULL pointer.

So, set lock_ctx->request to NULL before processing callbacks.

This avoids the following valgrind problem.

==3636== Invalid write of size 8
==3636==    at 0x151F3D: ctdb_lock_context_destructor (ctdb_lock.c:276)
==3636==    by 0x58B3618: _talloc_free_internal (talloc.c:993)
==3636==    by 0x58AD692: _talloc_free_children_internal (talloc.c:1472)
==3636==    by 0x58AD692: _talloc_free_internal (talloc.c:1019)
==3636==    by 0x58AD692: _talloc_free (talloc.c:1594)
==3636==    by 0x15292E: ctdb_lock_handler (ctdb_lock.c:471)
==3636==    by 0x56A535A: epoll_event_loop (tevent_epoll.c:728)
==3636==    by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926)
==3636==    by 0x56A3826: std_event_loop_once (tevent_standard.c:114)
==3636==    by 0x569FFFC: _tevent_loop_once (tevent.c:533)
==3636==    by 0x56A019A: tevent_common_loop_wait (tevent.c:637)
==3636==    by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140)
==3636==    by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320)
==3636==    by 0x118557: main (ctdbd.c:321)
==3636==  Address 0x9c5b660 is 96 bytes inside a block of size 120 free'd
==3636==    at 0x4C29D17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3636==    by 0x58B32D3: _talloc_free_internal (talloc.c:1063)
==3636==    by 0x58B3232: _talloc_free_children_internal (talloc.c:1472)
==3636==    by 0x58B3232: _talloc_free_internal (talloc.c:1019)
==3636==    by 0x58B3232: _talloc_free_children_internal (talloc.c:1472)
==3636==    by 0x58B3232: _talloc_free_internal (talloc.c:1019)
==3636==    by 0x58AD692: _talloc_free_children_internal (talloc.c:1472)
==3636==    by 0x58AD692: _talloc_free_internal (talloc.c:1019)
==3636==    by 0x58AD692: _talloc_free (talloc.c:1594)
==3636==    by 0x11EC30: daemon_incoming_packet (ctdb_daemon.c:844)
==3636==    by 0x136F4A: lock_fetch_callback (ctdb_ltdb_server.c:268)
==3636==    by 0x152489: process_callbacks (ctdb_lock.c:353)
==3636==    by 0x152489: ctdb_lock_handler (ctdb_lock.c:468)
==3636==    by 0x56A535A: epoll_event_loop (tevent_epoll.c:728)
==3636==    by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926)
==3636==    by 0x56A3826: std_event_loop_once (tevent_standard.c:114)
==3636==    by 0x569FFFC: _tevent_loop_once (tevent.c:533)
==3636==    by 0x56A019A: tevent_common_loop_wait (tevent.c:637)
==3636==    by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140)
==3636==    by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320)
==3636==    by 0x118557: main (ctdbd.c:321)

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit ee02e40e869fd46f113d016122dd5384b7887228)

8 years agoscripts: Add alternative network family monitoring for NFS
Martin Schwenke [Tue, 28 Apr 2015 03:51:00 +0000 (13:51 +1000)]
scripts: Add alternative network family monitoring for NFS

For example, adding a file called nfs-rpc-checks.d/20.nfsd@udp.check
will cause NFS to be checked on UDP as well, using a separate counter.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Apr 30 09:24:12 CEST 2015 on sn-devel-104

(Imported from commit e359d826a42656bb02ca2ab85f0fa886a046cb58)
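
Illustration (not part of the commit): the file name in the example above encodes a service and an optional network family. The real handling is in the event scripts; the C parser below is only a sketch. The layout <number>.<service>[@<family>].check, the default family of "tcp" and the name parse_check_name() are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser for names such as "20.nfsd.check" or
 * "20.nfsd@udp.check".  Fills in the service and the network family,
 * defaulting the family to "tcp" (an assumption).  Returns 0 on
 * success, -1 if the name does not match or a buffer is too small.
 */
int parse_check_name(const char *name,
		     char *service, size_t service_len,
		     char *family, size_t family_len)
{
	const char *start, *end, *at;
	size_t len;

	assert(name != NULL && service != NULL && family != NULL);

	start = strchr(name, '.');	/* end of the numeric prefix */
	end = strrchr(name, '.');	/* start of ".check" */
	if (start == NULL || end == start || strcmp(end, ".check") != 0) {
		return -1;
	}
	start++;

	/* The service runs up to '@' if present, else up to ".check". */
	at = memchr(start, '@', (size_t)(end - start));
	len = (size_t)((at != NULL ? at : end) - start);
	if (len == 0 || len >= service_len) {
		return -1;
	}
	memcpy(service, start, len);
	service[len] = '\0';

	if (at == NULL) {
		if (family_len < sizeof("tcp")) {
			return -1;
		}
		memcpy(family, "tcp", sizeof("tcp"));
		return 0;
	}

	len = (size_t)(end - (at + 1));
	if (len == 0 || len >= family_len) {
		return -1;
	}
	memcpy(family, at + 1, len);
	family[len] = '\0';
	return 0;
}
```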

9 years agotests: Switch to tcp check in rpcinfo stub
Amitay Isaacs [Fri, 27 Mar 2015 01:00:56 +0000 (12:00 +1100)]
tests: Switch to tcp check in rpcinfo stub

Use -T tcp instead of deprecated options -u and -t.  Also, check for
localhost.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Mar 27 09:16:50 CET 2015 on sn-devel-104

(Imported from commit 079575d80f5b28e452abf80efc4d005fb6dac270)

9 years agoscripts: Use tcp connection for checking RPC services
Amitay Isaacs [Fri, 27 Mar 2015 01:04:03 +0000 (12:04 +1100)]
scripts: Use tcp connection for checking RPC services

It's possible for an RPC service to register only for UDP and not TCP.
Since we assume all the NFS operations are over TCP, always check RPC
services over TCP.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 14886ed00c998c2ac4deb70f650584e9b371345d)

9 years agoscripts: Respect $RPCMOUNTDOPTS when restarting rpc.mountd
Martin Schwenke [Tue, 24 Mar 2015 09:12:51 +0000 (20:12 +1100)]
scripts: Respect $RPCMOUNTDOPTS when restarting rpc.mountd

$RPCMOUNTDOPTS is ignored when restarting rpc.mountd due to the service
being unresponsive.  This variable can be used to increase the number
of rpc.mountd threads when there are a lot of clients reattaching, so
ignoring it can mean that only a single rpc.mountd thread is started.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 130202d635d8712575fa201a12ef257f4278b862)

9 years agodaemon: Drop tunable that is no longer in use
Amitay Isaacs [Wed, 30 Jul 2014 04:31:54 +0000 (14:31 +1000)]
daemon: Drop tunable that is no longer in use

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 62ba95a9f347d2ac0e4fb53dc62b94f557e17e8b)

9 years agorecoverd: Fix typo in comment
Amitay Isaacs [Wed, 30 Jul 2014 02:32:08 +0000 (12:32 +1000)]
recoverd: Fix typo in comment

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
(Imported from commit 41ed26cbf7b81e372ea0b5cc3d96dfe217a0cf58)

9 years agodoc: Update NEWS ctdb-2.5.5
Amitay Isaacs [Mon, 13 Apr 2015 04:17:12 +0000 (14:17 +1000)]
doc: Update NEWS

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

9 years agoincludes: Remove some unnecessary declarations
Martin Schwenke [Fri, 5 Sep 2014 06:09:34 +0000 (16:09 +1000)]
includes: Remove some unnecessary declarations

To accommodate removing file_lines_load() from here, drop the #ifdef
around the declaration in util.h.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 9726e17e366382776c87a8aaa63884665c604896)

9 years agologging: Move variable debug_extra from debug.*
Martin Schwenke [Sat, 16 Aug 2014 06:17:02 +0000 (16:17 +1000)]
logging: Move variable debug_extra from debug.*

debug_extra is CTDB-specific.  Moving it will help with the
transitions to Samba's updated debug.[ch].

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 8b39141c46458974d5476b2925f2dd5d51d9180d)

9 years agologging: Factor out ctdb_logging.h from includes.h
Martin Schwenke [Tue, 9 Sep 2014 03:52:07 +0000 (13:52 +1000)]
logging: Factor out ctdb_logging.h from includes.h

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 97dc127b81c1923755b59aad6e735aa679af3f64)

9 years agorecoverd: Change include of dlinklist.h to contain directory
Martin Schwenke [Fri, 15 Aug 2014 06:18:05 +0000 (16:18 +1000)]
recoverd: Change include of dlinklist.h to contain directory

This makes it consistent with the rest of the code and avoids problems
when some variant of lib/util isn't in the include path.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 0c0f323bb3e9146dc584a461b225586670fa9c2e)

9 years agotools: Move definition of timeval_delta() to tools/ctdb.c
Martin Schwenke [Fri, 15 Aug 2014 05:53:03 +0000 (15:53 +1000)]
tools: Move definition of timeval_delta() to tools/ctdb.c

This function is only used in this file.  Samba's lib/util doesn't
have timeval_delta(), so this stages a clean transition.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 6e1568149ede06d48b91bbc7ecd8c55da3b41a41)

9 years agodaemon: Drop the argument to fault_setup()
Martin Schwenke [Fri, 15 Aug 2014 05:55:20 +0000 (15:55 +1000)]
daemon: Drop the argument to fault_setup()

Samba's version doesn't accept an argument, so this aids a smooth
transition.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit c5c74e47ee672e9e9605c5c4b96733d899b6f9b6)

9 years agoutil: Add extra max_size argument to file_lines_load()
Martin Schwenke [Fri, 15 Aug 2014 06:11:45 +0000 (16:11 +1000)]
util: Add extra max_size argument to file_lines_load()

This is part of a migration to Samba's lib/util.  CTDB always passes 0
(i.e. no max_size) so use a simple assert() to enforce this, rather
than changing a lot of code that will be discarded anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit a4e76b58a5086e1339dea53b72437ed179e6025a)
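
Illustration (not part of the commit): the approach described above is to accept the extra argument for signature compatibility and assert that it is never used. The function below is a hypothetical stand-in that counts lines in a string; it is not file_lines_load() itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a loader that gained a max_size argument
 * only to match another library's signature.  A limit is not
 * implemented, so any caller passing a non-zero value is a bug.
 */
size_t count_lines(const char *text, size_t max_size)
{
	size_t lines = 0;
	const char *p;

	assert(text != NULL);
	assert(max_size == 0);	/* 0 means "no limit"; nothing else is supported */

	for (p = text; *p != '\0'; p++) {
		if (*p == '\n') {
			lines++;
		}
	}

	/* Count a final line that lacks a trailing newline. */
	if (p != text && p[-1] != '\n') {
		lines++;
	}
	return lines;
}
```

The assert documents the restriction at the point where it matters and fails loudly in development builds if a caller ever starts passing a real limit.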

9 years agocommon: Move hex_decode_talloc() to the lock helper
Martin Schwenke [Wed, 6 Aug 2014 06:36:58 +0000 (16:36 +1000)]
common: Move hex_decode_talloc() to the lock helper

This is the only place it is used.

After migrating to Samba's lib/util, the lock helper can be changed to
use strhex_to_data_blob().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 94a5e28ffb53a268865666038678e78cbbb39de3)
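
Illustration (not part of the commit): a minimal sketch of decoding a hex string into bytes. The function named in the commit allocates with talloc; this version uses malloc() so that it is self-contained, and the names hex_value() and hex_decode() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
int hex_value(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical malloc-based decoder.  Returns a buffer the caller must
 * free(), or NULL on odd length, a bad digit, or allocation failure.
 */
uint8_t *hex_decode(const char *hex, size_t *len)
{
	size_t i, n;
	uint8_t *buf;

	assert(hex != NULL && len != NULL);

	n = strlen(hex);
	if (n % 2 != 0) {
		return NULL;
	}

	buf = malloc(n / 2 + 1);	/* +1 so an empty string still allocates */
	if (buf == NULL) {
		return NULL;
	}

	for (i = 0; i < n / 2; i++) {
		int hi = hex_value(hex[2 * i]);
		int lo = hex_value(hex[2 * i + 1]);

		if (hi < 0 || lo < 0) {
			free(buf);
			return NULL;
		}
		buf[i] = (uint8_t)((hi << 4) | lo);
	}

	*len = n / 2;
	return buf;
}
```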

9 years agocommon: Add some missing #includes
Martin Schwenke [Thu, 4 Sep 2014 03:33:58 +0000 (13:33 +1000)]
common: Add some missing #includes

To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 751ad4b62561b140b7a33d66e63907411a748501)

9 years agodaemon: Move some inline declarations to header file
Martin Schwenke [Thu, 4 Sep 2014 03:31:15 +0000 (13:31 +1000)]
daemon: Move some inline declarations to header file

To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit a81dccf7ad8345a1c44dc7a08e2320bd88e1aaa5)

9 years agotests: Add missing declarations caused by #define magic
Martin Schwenke [Thu, 4 Sep 2014 03:30:09 +0000 (13:30 +1000)]
tests: Add missing declarations caused by #define magic

Some declarations get lost because they basically get #define-d away,
so they need to be repeated after the #undef-s.  Also, some functions
are introduced due to the #define-s.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 6336b958d61ba6901edbaddac8bc10539c8f30ab)

9 years agotests: Mark some functions as static
Martin Schwenke [Thu, 4 Sep 2014 03:28:34 +0000 (13:28 +1000)]
tests: Mark some functions as static

To avoid warnings when using --enable-developer, which uses
-Wmissing-prototypes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 6674949317dd4b2c1855571ea378eb6bc3b2e86c)

9 years agoutil: Remove util/strlist.c and references to str_util_*()
Martin Schwenke [Thu, 4 Sep 2014 02:34:46 +0000 (12:34 +1000)]
util: Remove util/strlist.c and references to str_util_*()

They're not used in CTDB.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 5de4a97fe941c27080061480cdd7ed8f60f4438e)

9 years agoFix some "declarations after code" problems
Martin Schwenke [Thu, 4 Sep 2014 01:21:24 +0000 (11:21 +1000)]
Fix some "declarations after code" problems

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit b0f9d3305850bdcce171b53e7bbbc9628a4e3c20)

9 years agoutil: Variables should be declared extern in headers
Martin Schwenke [Thu, 4 Sep 2014 01:20:28 +0000 (11:20 +1000)]
util: Variables should be declared extern in headers

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(Imported from commit 1d16555fa0ad562dcd8c4bbffaca454e68bcabbf)
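
Illustration (not part of the commit): the usual pattern is an extern declaration in the header and exactly one definition in a source file. The sketch below shows both parts in a single listing; the variable and function names are hypothetical.

```c
#include <assert.h>

/* --- would live in a header, e.g. a hypothetical "example.h" --- */
extern int example_debug_level;		/* declaration only: no storage */

/* --- would live in exactly one .c file --- */
int example_debug_level = 0;		/* the single definition */

/* Any file that includes the header can then use the variable. */
void example_set_debug_level(int level)
{
	assert(level >= 0);
	example_debug_level = level;
}
```

Without extern, every source file that includes the header makes its own tentative definition of the variable, which only links when the toolchain merges such definitions and fails with duplicate symbols when it does not.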