Martin Schwenke [Fri, 12 Jan 2024 01:45:05 +0000 (12:45 +1100)]
Merge branch 'ctdb-statd-callout-sm-notify' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:51 +0000 (12:44 +1100)]
Merge branch 'ctdb-wip' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:38 +0000 (12:44 +1100)]
Merge branch 'ctdb-tickles' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:05 +0000 (12:44 +1100)]
Merge branch 'ctdb-host-monitoring' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:05 +0000 (12:44 +1100)]
Merge branch 'ctdb-killtcp' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:05 +0000 (12:44 +1100)]
Merge branch 'ctdb-ganesha' into ctdb-merge
Martin Schwenke [Fri, 12 Jan 2024 01:44:05 +0000 (12:44 +1100)]
Merge branch 'ctdb-tunnel' into ctdb-merge
Martin Schwenke [Fri, 30 Jun 2023 11:50:10 +0000 (21:50 +1000)]
ctdb-tests: Update statd-callout tests to handle both modes
Add support for shared_dir mode. Add test hooks to statd-callout
itself.
Currently run with:
CTDB_NFS_SHARED_STATE_DIR=/clusterfs \
CTDB_STATD_CALLOUT_SHARED_STORAGE=shared_dir \
./tests/run_tests.sh ./tests/UNIT/eventscripts/statd-callout.*
or similar.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 28 Jun 2023 04:01:44 +0000 (14:01 +1000)]
ctdb-scripts: Support storing statd-callout state in filesystem
CTDB_STATD_CALLOUT_SHARED_STORAGE is a new configuration variable
indicating where statd-callout should store its NFS client locking
data. See the update to ctdb-script.options(5) for details.
This adds back functionality that was removed in commit
12cc82623150ca4a83482f1b7165401cbdecd3de. The commit message doesn't
say why this was changed but it was most likely due to a cluster
filesystem hanging at inopportune times. Hence, this is re-added as a
non-default option.
Note that this could create the files for sm-notify in add-client.
However, this would require an alternate implementation of
send_notifies() (or a change to the implementation for persistent_db
too). It seems better to leave add-client lightweight and do the work
in notify, since add-client is a more frequent operation.
In test mode, the shared storage location is prefixed by a shared
directory location within the test environment.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 30 Jun 2023 02:24:30 +0000 (12:24 +1000)]
ctdb-tests: Default PNN is 0
This is called in a couple of places without an argument, so give it a
default.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
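The defaulting described in the commit above can be sketched with ordinary shell parameter expansion. The function and variable names below are hypothetical, since the commit does not name the helper involved; this is only an illustration of the idiom.

```shell
#!/bin/sh
# Hypothetical helper: the commit does not name the function involved.
set_test_pnn()
{
	# Use the first argument if given and non-empty, otherwise default to 0
	test_pnn="${1:-0}"
}

set_test_pnn
printf 'default: %s\n' "$test_pnn"
set_test_pnn 2
printf 'explicit: %s\n' "$test_pnn"
```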
Martin Schwenke [Thu, 29 Jun 2023 03:25:03 +0000 (13:25 +1000)]
ctdb-scripts: Move state directory creation to "startup" action
Now that there is a startup action, directory creation can be
unconditionally done there.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 29 Jun 2023 03:11:46 +0000 (13:11 +1000)]
ctdb-scripts: Move ctdb.tdb attach to statd-callout
All of the other uses of ctdb.tdb are in statd-callout. This will
also allow an alternate database name to be configured.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 27 Jun 2023 03:37:56 +0000 (13:37 +1000)]
ctdb-scripts: Factor out some statd-callout functions
This captures all of the ctdb.tdb implementation specific details in
functions. Alternate implementations can now be added.
While factoring, remove the need to cd to the queue directory by
explicitly setting and using $statd_callout_queue_dir.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 3 Mar 2017 04:44:08 +0000 (15:44 +1100)]
ctdb-scripts: Use nfs-utils' sm-notify instead of CTDB's smnotify
CTDB's smnotify does not support IPv6 and is difficult to maintain.
So, create directories of files and pass them to NFS utils' sm-notify.
There is an implied change here, because NFS utils' sm-notify stopped
sending IP addresses as mon_name back in 2010:
http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=commitdiff;h=900df0e7c0b9006d72d8459b30dc2cd69ce495a5
This will change advice given in the wiki to use a hostname for the
cluster with round-robin DNS, since this is what is best supported.
Another behavioural change is that sm-notify only sends "up"
notifications with an odd state.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 10 May 2023 02:21:07 +0000 (12:21 +1000)]
ctdb-failover: Add ctdb_smnotify_helper
There are endian-specific outputs in the files used by NFS utils'
sm-notify, which will subsequently be used, so create a tiny helper
instead of depending on complicated shell commands or calling out to
Python.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Mon, 19 Jun 2023 00:39:29 +0000 (10:39 +1000)]
ctdb-scripts: No longer run statd-callout under sudo
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 2 Aug 2023 03:37:03 +0000 (13:37 +1000)]
ctdb-scripts: Use find_statd_sm_dir() in one more place
Take advantage of new function find_statd_sm_dir() when clearing
the local statd state directory so it uses the correct directory when
running on a non-RH distro.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Mon, 19 Jun 2023 02:17:44 +0000 (12:17 +1000)]
ctdb-scripts: Set ownership of statd-callout state directory
For add-client and del-client, statd-callout is called by rpc.statd,
which runs as rpcuser, statd or some other non-root system user. Find
the local statd state directory and use it as a reference to set the
ownership of statd-callout's state directory so add-client and
del-client can write to it.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
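A minimal sketch of using one directory as an ownership reference for another, as the commit above describes. The paths are temporary stand-ins rather than the real statd or statd-callout directories, and owner_of() is a hypothetical helper, not a function from the CTDB scripts.

```shell
#!/bin/sh
# Both directories are temporary stand-ins; real paths differ by distro.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
ref_dir="${tmp}/statd-sm"        # stands in for the local statd state directory
state_dir="${tmp}/statd-callout" # stands in for statd-callout's state directory
mkdir -p "$ref_dir" "$state_dir"

# owner_of DIR: print "uid:gid" of DIR using only POSIX ls and awk
owner_of()
{
	ls -ldn "$1" | awk '{ print $3 ":" $4 }'
}

# Give the state directory the same owner and group as the reference
chown "$(owner_of "$ref_dir")" "$state_dir"
```

On systems with GNU coreutils, chown --reference=DIR is a shorter way to do the same thing.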
Martin Schwenke [Sun, 18 Jun 2023 23:52:21 +0000 (09:52 +1000)]
ctdb-scripts: Avoid connecting to ctdbd in add-client/del-client
rpc.statd runs statd-callout as a non-root user, which is currently
hacked around using some sudo logic that fails to work in some
contexts (e.g. in a container).
Remove the local ctdb_get_my_public_ip_addresses() so that the new
caching version is now used in add-client and del-client. This avoids
connecting to ctdbd when called from rpc.statd.
Use ctdb_get_my_public_ip_addresses() in other places where it makes
sense.
Connections to ctdbd are still made in the "notify" action, but this
is always run as root.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 15 Jun 2023 06:21:19 +0000 (16:21 +1000)]
ctdb-scripts: Add caching function for public IPs
This is way more complicated than I would like but, as per the
comment, this is due to complexities in the way public IPs work. The
main consumer will be statd-callout, which will then be able to run as
a non-root user.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Sun, 18 Jun 2023 23:43:33 +0000 (09:43 +1000)]
ctdb-scripts: Avoid ShellCheck warning SC2162
SC2162 read without -r will mangle backslashes.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
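To illustrate the warning addressed above, the following sketch compares read with and without -r on a made-up input string containing a backslash.

```shell
#!/bin/sh
# Sample input containing a literal backslash (made up for illustration).
line='a\tb'

# Without -r, read treats the backslash as an escape character and drops it.
without_r=$(printf '%s\n' "$line" | {
	read x
	printf '%s' "$x"
})

# With -r, the backslash is preserved verbatim.
with_r=$(printf '%s\n' "$line" | {
	read -r x
	printf '%s' "$x"
})

printf 'without -r: %s\n' "$without_r"
printf 'with -r:    %s\n' "$with_r"
```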
Martin Schwenke [Wed, 2 Aug 2023 03:23:58 +0000 (13:23 +1000)]
ctdb-scripts: Improve update code
Drop the complexity associated with using awk to escape dots in IPv4
addresses to protect them from sed, and generate a grep -F filter
instead. Use temporary files for the grep filter and a file list
constructed by find to avoid command-line length limits.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
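The idea of a fixed-string filter kept in a temporary file can be sketched as follows. The state file format (one "server-IP@client-IP" key per line) and all addresses are assumptions made for illustration, not the format used by statd-callout.

```shell
#!/bin/sh
# Hypothetical data: one "<server-ip>@<client-ip>" key per line.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

cat >"${tmpdir}/state" <<EOF
192.168.1.10@10.0.0.5
192x168x1x10@10.0.0.6
192.168.1.20@10.0.0.7
EOF

# Build the filter in a file, one fixed-string pattern per line, so the
# number of addresses is not limited by command-line length.
for ip in 192.168.1.10 192.168.1.20; do
	printf '%s@\n' "$ip"
done >"${tmpdir}/filter"

# With -F the dots are literal, so "192x168x1x10@" does not match and no
# escaping of the addresses is needed.
matches=$(grep -F -f "${tmpdir}/filter" "${tmpdir}/state")
printf '%s\n' "$matches"
```

Note that -F still matches substrings, so the trailing "@" in each pattern is what stops 192.168.1.1 from matching 192.168.1.10 in this made-up format.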
Martin Schwenke [Thu, 29 Jun 2023 00:12:44 +0000 (10:12 +1000)]
ctdb-scripts: Factor out handling of hosted public IPs
This is done differently in different places, so make it consistent.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 13 Jun 2023 00:39:37 +0000 (10:39 +1000)]
ctdb-scripts: Improve documentation
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 16 Jun 2023 01:09:02 +0000 (11:09 +1000)]
ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
Best reviewed with "git show -w".
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 18 Jul 2023 21:22:08 +0000 (07:22 +1000)]
ctdb-scripts: Improve documentation comment
There is some confusion around how the GlusterFS support should be
used. If nobody chimes in with hints then that support may be removed
in future. The general scheme should work on all filesystems.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 19 Jul 2023 00:27:45 +0000 (10:27 +1000)]
ctdb-scripts: Construct state subdirectory for IP address
Using "grace_period 5:<IP>" without a subdirectory for <IP> causes an
error.
Construct a recovery subdirectory for the <IP> being taken. Find
client IDs (CIDs) matching <IP> (according to NFS-Ganesha's
recovery_fs.c implementation, these will be empty directories) and
effectively copy them to the subdirectory (using mkdir, since they are
directories). After running grace_period, remove all entries in the
subdirectory.
This ensures that NFS-Ganesha only adds CIDs to the in-memory CID list
for clients connecting to <IP> that will reclaim locks on the takeover
node.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
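A rough sketch of the copy-by-mkdir scheme described in the commit above. The directory layout and the rule for matching a CID to an IP address (the directory name containing the address) are assumptions made purely for illustration; the real matching is defined by NFS-Ganesha's recovery code and the callout script.

```shell
#!/bin/sh
# Hypothetical layout: <recdir>/<cid> are empty directories whose names
# are assumed (for illustration only) to contain the relevant IP address.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
recdir="${tmp}/recovery"
ip="10.0.0.1"
mkdir -p "${recdir}/cid-10.0.0.1-a" "${recdir}/cid-10.0.0.2-b"

# "Copy" matching CIDs into a per-IP subdirectory. Since CIDs are empty
# directories, mkdir is enough.
subdir="${recdir}/${ip}"
mkdir -p "$subdir"
for d in "$recdir"/*"${ip}"*; do
	[ -d "$d" ] || continue
	[ "$d" = "$subdir" ] && continue
	mkdir -p "${subdir}/$(basename "$d")"
done
copied=$(ls "$subdir")

# ... grace_period would be run for this IP at this point ...

# Afterwards, remove all entries in the subdirectory
rm -rf "${subdir:?}"/*
remaining=$(ls "$subdir")
```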
Martin Schwenke [Mon, 24 Jul 2023 23:09:46 +0000 (09:09 +1000)]
ctdb-scripts: Switch takeip handling into takeip-pre
The NFS-Ganesha lock manager needs to be in grace before any attempts
to reclaim locks. This needs to happen before the IP is on the
interface via 10.interface, which allows clients to reconnect. So,
use takeip-pre for this, since it is run from 06.nfs.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 18 Jul 2023 21:22:54 +0000 (07:22 +1000)]
ctdb-scripts: Support Lustre
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 10:28:30 +0000 (20:28 +1000)]
ctdb-scripts: Change NFS-Ganesha PID file location
This is the current default.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 03:37:55 +0000 (13:37 +1000)]
ctdb-scripts: Drop reference to historical commit
No novel contributions from that commit remain.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 03:37:03 +0000 (13:37 +1000)]
ctdb-scripts: Fix usage message
An IP address is passed to these actions.
Reported-by: Arnab Tah <atah@ddn.com>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 03:24:22 +0000 (13:24 +1000)]
ctdb-scripts: Drop GPFS special cases
Use the simple GPFS code by default for non-GlusterFS filesystems.
Drop variable GANRECDIR, which is only used once for an obvious
purpose and isn't a configuration variable.
This makes it possible to add support for a new filesystem type by
adding it to the initial case statement.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 03:16:02 +0000 (13:16 +1000)]
ctdb-scripts: Drop get_cluster_fs_state()
It is unclear why NFS monitoring should succeed if the filesystem
containing the shared NFS state directory is not active. Nothing else
in the current monitor code depends on the shared directory, and any
other operation that uses the filesystem may still behave badly.
When commit 28cbe527d47822f870e8252495ab2a1c8fddd12f introduced this
function, there was more monitoring logic. However, that logic didn't
seem to use the shared directory and it has since been removed anyway.
Additionally, checking the state of the filesystem here seems like a
layering violation, where failure in a lower layer is ignored at its
own level, so it then needs to be ignored here in a higher layer. It
should be checked in a previous event script, though it is unclear
what should be done if the filesystem is failing over. It doesn't
seem sane to mark all nodes unhealthy there, but that isn't an NFS
problem.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 5 Jul 2023 22:20:37 +0000 (08:20 +1000)]
ctdb-scripts: Use CTDB_NFS_SHARED_STATE_DIR in nfs-ganesha-callout
Rename CTDB_NFS_STATE_MNT to CTDB_NFS_SHARED_STATE_DIR. It doesn't
have to be a mount but can be any directory in a cluster filesystem.
There are plans to also use CTDB_NFS_SHARED_STATE_DIR for
statd-callout, so the name might as well be better.
CTDB_NFS_SHARED_STATE_DIR is now mandatory when GPFS is used. This is
much saner than choosing the first GPFS filesystem.
Drop CTDB_NFS_STATE_FS_TYPE. The filesystem type is now determined
from CTDB_NFS_SHARED_STATE_DIR and it is now checked against supported
filesystems. This will catch the case when the filesystem for the
specified directory has not been mounted and the filesystem for the
mountpoint (e.g. ext4) is not a supported filesystem for shared state.
A side-effect is that the filesystem containing
CTDB_NFS_SHARED_STATE_DIR must be mounted when nfs-ganesha-callout is
first run.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Thu, 6 Jul 2023 03:13:06 +0000 (13:13 +1000)]
ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
Best reviewed with "git show -w".
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 9 Jan 2024 22:28:45 +0000 (09:28 +1100)]
utils: Add error checking to log level parsing
There is an additional use of atoi() in the backend setup that is not
handled by this change. It is in a void function, where error
handling is more difficult. However, the goal here is to detect
errors in the parsing of the "log level" parameter, not "logging".
Don't attempt to fix testparm (i.e.
https://bugzilla.samba.org/show_bug.cgi?id=11301). That is a
different problem, requiring a lot of detangling.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 10 Nov 2023 00:17:13 +0000 (11:17 +1100)]
WIP ctdb-scripts: Use nftables for IP blocking
TODO:
* Consider calling ip_block_init() if an add element fails.
* 11.natgw
* 70.iscsi
* Drop iptables_wrapper
Don't bother cleaning up the sets on shutdown or similar. The whole
point of sets is that it is easy to delete elements from them. Rules
are difficult to delete because they can only be referenced by handle.
nft is better than iptables for defining an initial setup, but it is
harder for scripting.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 31 May 2022 03:36:39 +0000 (13:36 +1000)]
ctdb-daemon: Add "max log size" configuration parameter
Passing the limit via an environment variable is questionable.
However, the dependencies (logging-conf depends on logging) make other
ways difficult. It would be possible to extend logging_init(), but
not sure it is right for a feature that only applies to file
logging...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 23 Feb 2018 09:16:18 +0000 (20:16 +1100)]
docs: Add some design notes
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 4 Jan 2017 03:25:14 +0000 (14:25 +1100)]
ctdb-tests: Add some debug on an important failure
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 4 Jan 2017 03:14:01 +0000 (14:14 +1100)]
ctdb-util: Add D_TRACE
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 23 Oct 2023 03:23:45 +0000 (14:23 +1100)]
ctdb-scripts: Remove superseded compatibility code
Since commit 224e99804efef960ef4ce2ff2f4f6dced1e74146, square brackets
have been parsed by daemon and tool code, so drop the compatibility
code from here.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Mon, 23 Oct 2023 03:17:36 +0000 (14:17 +1100)]
ctdb-scripts: Track connections for all ports for public IPs
Currently TCP ports like NFS lock manager are not tracked. It is
easier to track all connections than to add a configuration system to
try to track specified ports, so do that.
Note that this is not exactly all connections. smbd connections are
excluded because ctdbd tracks them directly.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
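One way to express "all ports except the smbd ones" is an ss filter fragment that excludes each port. The sketch below only builds such a string; build_sport_exclusion() is a hypothetical helper, and ports 139 and 445 are assumed here to be the smbd ports being excluded.

```shell
#!/bin/sh
# build_sport_exclusion PORT...
# Hypothetical helper: print an ss filter fragment excluding each PORT.
build_sport_exclusion()
{
	_out=""
	for _p in "$@"; do
		# Join terms with " && ", without a leading separator
		_out="${_out:+${_out} && }sport != :${_p}"
	done
	printf '(%s)\n' "$_out"
}

build_sport_exclusion 139 445
```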
Martin Schwenke [Mon, 23 Oct 2023 03:05:21 +0000 (14:05 +1100)]
ctdb-scripts: Move connection tracking to 10.interface
This should really be done for all connections to public IP
addresses. This is the first step.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Mon, 6 Nov 2023 03:12:47 +0000 (14:12 +1100)]
ctdb-tests: Extend ss stub to handle excluded ports
This involves input like:
(sport != :139 && sport != :445)
Add appropriate error checking to ensure that for sport lists:
* == is always used with ||
* != is always used with &&
Add similar checking for src.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
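The operator consistency rule listed above can be sketched as a small check on a single parenthesised sport list. check_sport_filter() is a hypothetical function, not the code in the ss stub, and it assumes it is handed only the sport list rather than a whole ss filter.

```shell
#!/bin/sh
# check_sport_filter FILTER
# Hypothetical check: return 0 if == is only combined with || and != is
# only combined with &&, otherwise return 1.
check_sport_filter()
{
	case "$1" in
	*"=="*"&&"* | *"&&"*"=="*) return 1 ;;
	*"!="*"||"* | *"||"*"!="*) return 1 ;;
	esac
	return 0
}

for f in \
	'(sport != :139 && sport != :445)' \
	'(sport == :139 && sport == :445)'; do
	if check_sport_filter "$f"; then
		printf 'ok:  %s\n' "$f"
	else
		printf 'bad: %s\n' "$f"
	fi
done
```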
Martin Schwenke [Fri, 27 Oct 2023 00:06:23 +0000 (11:06 +1100)]
ctdb-tests: Ensure square brackets are handled around IP addresses
It isn't unreasonable for unit test cases to use square brackets in
their input.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 1 Mar 2023 22:29:31 +0000 (09:29 +1100)]
ctdb-scripts: Add host monitoring
It can be difficult to diagnose certain event script timeouts. So,
at the risk of re-inventing Nagios, this provides some limited host
monitoring to ensure that nodes are able to reach required
infrastructure.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 17 Mar 2023 00:05:19 +0000 (11:05 +1100)]
ctdb-tests: Enhance timeout stub
Allow it to trigger on a particular command instead of
unconditionally. Wrap setting and clearing the variable.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Wed, 1 Mar 2023 01:14:04 +0000 (12:14 +1100)]
ctdb-scripts: Improve some comments
In particular, the word "passing" is ambiguous, since it could refer
to whether a value is passed to the function.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 22 Aug 2023 02:13:44 +0000 (12:13 +1000)]
ctdb-scripts: Add configuration variable CTDB_KILLTCP_USE_SS_KILL
This allows CTDB to be configured to use "ss -K" to reset TCP
connections on "releaseip". This is only supported when the kernel is
configured with CONFIG_INET_DIAG_DESTROY enabled.
From the documentation:
ss -K has been supported in ss since iproute 4.5 in March 2016 and
in the Linux kernel since 4.4 in December 2015. However, the
required kernel configuration item CONFIG_INET_DIAG_DESTROY is
disabled by default. Although enabled in Debian kernels since
~2017 and in Ubuntu since at least 18.04, this has only recently
been enabled in distributions such as RHEL. There seems to be no
way, including running ss -K, to determine if this is supported, so
use of this feature needs to be configurable. When available, it
should be the fastest, most reliable way of killing connections.
For RHEL and derivatives, this was enabled as follows:
* RHEL 8 via https://bugzilla.redhat.com/show_bug.cgi?id=2230213,
arriving in version kernel-4.18.0-513.5.1.el8_9
* RHEL 9 via https://issues.redhat.com/browse/RHEL-212, arriving in
kernel-5.14.0-360.el9
Enabling this option results in a small behaviour change because ss -K
always does a 2-way kill (i.e. it also sends a RST to the client).
Only a 1-way kill is done for SMB connections when ctdb_killtcp is
used - the reasons for this are shrouded in history and the 2-way kill
seems to work fine.
For the summary that is logged, when CTDB_KILLTCP_USE_SS_KILL is "yes"
or "try", always log the method used, even the fallback to
ctdb_killtcp. However, when set to "no", maintain the existing
output.
The decision to use -K rather than --kill is because short options are
trivial to implement in test stubs.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 25 Aug 2023 00:00:57 +0000 (10:00 +1000)]
ctdb-scripts: Factor out function kill_tcp_summarise()
This will be used in a slightly different context in a subsequent
commit. In that case, the number of killed connections will be passed
instead of the total number of connections, so support this here via
different modes instead of churning later.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Tue, 22 Aug 2023 02:12:50 +0000 (12:12 +1000)]
ctdb-doc: Improve description of 10.interface event script
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Martin Schwenke [Fri, 16 Jul 2021 07:21:07 +0000 (17:21 +1000)]
ctdb-tests: Extend tunnel test to cover more cases
Test heterogeneous tunnels where the tunnel IDs are different at
either end. Also test tunnels on the same node - the heterogeneous
case is the useful one, while homogeneous is just loopback.
Drop the time for each run to 15s, given the increase in test cases.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Jul 2021 03:19:29 +0000 (13:19 +1000)]
ctdb-tests: Factor out running the tunnel test
Do this for both the script and the C code. This will allow tests with
different options to be run.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 16 Jul 2021 02:55:32 +0000 (12:55 +1000)]
ctdb: Allow a destination tunnel ID different to that of the sender
This will allow a general transport API to be implemented on top of
tunnels.
Testing will be subsequently enhanced.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Fri, 28 Jun 2019 13:34:56 +0000 (23:34 +1000)]
ctdb-transport: Add new transport protocol
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Wed, 20 Jun 2018 02:09:01 +0000 (12:09 +1000)]
ctdb-failover: Use command-line option for NoIPTakeover
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 18 Jun 2018 11:31:10 +0000 (21:31 +1000)]
ctdb-failover: Use command-line option to select algorithm
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 18 Jun 2018 11:32:58 +0000 (21:32 +1000)]
ctdb-failover: Convert IP allocation helper option handling to use popt
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 19 Jun 2018 20:53:32 +0000 (06:53 +1000)]
ctdb-failover: Merge function ctdb_test_init() into run_ipalloc()
This provides a nice linear code flow.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 19 Jun 2018 20:39:23 +0000 (06:39 +1000)]
ctdb-failover: Improve error handling in IP allocation helper
Rename ctdb_test_ipalloc() to run_ipalloc() while touching the
signature. Drop function read_ctdb_public_ip_info() because it isn't
adding value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 18 Jun 2018 09:03:55 +0000 (19:03 +1000)]
ctdb-failover: Improve logging in IP allocation helper
ERROR level is used most frequently, so make it the default. Use
standard CTDB_DEBUGLEVEL in testcases instead of something more
obscure. Also initialise logging correctly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 27 Jun 2018 12:26:35 +0000 (22:26 +1000)]
ctdb-failover: Don't pass node states to IP allocation helper
This helper is doing IP allocation and simply needs to know which
nodes each IP address is available on. It doesn't need to know
node states. So just pass the number of nodes instead of the node
states.
Update tests accordingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 20 Jun 2018 03:39:42 +0000 (13:39 +1000)]
ctdb-failover: Use takeover test helper as IP allocation helper
After a good clean up this will become a helper in the failover
component. The same unit tests are still run.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 07:28:48 +0000 (17:28 +1000)]
ctdb-tests: Drop checking of runstate in takeover test
Daemons do not return available IPs for nodes that are not RUNNING.
Indicate the IP addresses as not available on relevant nodes instead
of doing runstate checking.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 06:16:00 +0000 (16:16 +1000)]
ctdb-tests: Add function to read IPs, use ipalloc_set_public_ip_info
The input format contains per-address lists of node/available nodes.
ipalloc_read_known_ips() inverts the structure into per-node arrays of
IP lists. Then ipalloc_set_public_ips() converts it back to a
per-address merged IP list containing bitmaps of nodes. This is
needlessly complex.
Instead, add a local function to read the addresses, building merged
IP list and bitmaps directly. Then use ipalloc_set_public_ip_info()
to set this in the state.
Sorting the input list adds some unwelcome complexity. This is partly
to avoid churn in the test cases. However, continuing to have the
input sorted keeps the multi-stage tests sane, since output from one
stage is fed back in as input.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 06:50:44 +0000 (16:50 +1000)]
ctdb-tests: Drop ability to read per-node IP states
This is no longer used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 06:44:49 +0000 (16:44 +1000)]
ctdb-tests: Drop IP allocation test that reads input in "multi" mode
This tests an inconsistency of IP state between nodes and actually
tests the create_merged_ip_list() function rather than IP allocation.
The bug has been fixed for years and soon there won't be a possibility
of inconsistency between nodes (due to state being stored in a
database), so drop this test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 02:33:50 +0000 (12:33 +1000)]
ctdb-ipalloc: Add function ipalloc_set_public_ip_info()
This is an alternative way of setting the public IP list, including
the bitmaps and available counts. It is a much more natural form for
providing the data when it is read in.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 09:23:02 +0000 (19:23 +1000)]
ctdb-takeover: Free known and available IPs when finished with them
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 10 Jul 2018 07:30:45 +0000 (17:30 +1000)]
ctdb-ipalloc: Allow ipalloc() to return an error
Allows the caller to distinguish between an error and an empty out
list (due to empty input list).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 09:19:56 +0000 (19:19 +1000)]
ctdb-ipalloc: Don't store known and available public addresses
They are only used by the setup functions, so pass them to those
functions instead of storing them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 02:30:39 +0000 (12:30 +1000)]
ctdb-ipalloc: Optimise ipalloc_can_host_ips()
With the merged IP list in hand, including the number of nodes each IP
address is available on, this becomes trivial.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 02:25:24 +0000 (12:25 +1000)]
ctdb-ipalloc: Count the number of nodes each IP address is available on
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 02:18:05 +0000 (12:18 +1000)]
ctdb-ipalloc: Call setup functions when setting public IPs
This simplifies the actual IP allocation calculation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 02:16:12 +0000 (12:16 +1000)]
ctdb-ipalloc: Allow ipalloc_set_public_ips() to return an error
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 9 Jul 2018 01:54:48 +0000 (11:54 +1000)]
ctdb-ipalloc: Allow create_merged_ip_list() to return an error
Right now it is not possible to tell the difference between a memory
allocation failure and no known public IP addresses. Fix this by
having the function return an error code separate from the resulting
public IP list.
While touching this, fix the variable name(s) in ipalloc() and a
couple of debug messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
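The pattern this commit describes, returning an error code separately from the resulting list, might look roughly like the sketch below. It is an illustration only: `struct ip_list`, `build_ip_list()` and `ip_list_free()` are hypothetical and do not reflect the actual CTDB signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a public IP list; not the CTDB structure. */
struct ip_list {
	size_t count;
	unsigned int *addrs;
};

/*
 * Build a list from n input addresses.
 * Returns 0 on success, with *out set (possibly to an empty list),
 * or ENOMEM on allocation failure, with *out left untouched.
 */
int build_ip_list(const unsigned int *in, size_t n, struct ip_list **out)
{
	struct ip_list *l;
	size_t i;

	assert(out != NULL);

	l = calloc(1, sizeof(*l));
	if (l == NULL) {
		return ENOMEM;
	}

	if (n > 0) {
		l->addrs = malloc(n * sizeof(*l->addrs));
		if (l->addrs == NULL) {
			free(l);
			return ENOMEM;
		}
		for (i = 0; i < n; i++) {
			l->addrs[i] = in[i];
		}
		l->count = n;
	}

	*out = l;
	return 0;
}

void ip_list_free(struct ip_list *l)
{
	if (l != NULL) {
		free(l->addrs);
		free(l);
	}
}
```

With this shape, a caller can tell "no addresses" (return value 0, `count == 0`) apart from an allocation failure (non-zero return value), which a bare pointer return cannot express if NULL is used for both.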
Andreas Schneider [Tue, 9 Jan 2024 07:50:01 +0000 (08:50 +0100)]
python:gp: Print a nice message if cepces-submit can't be found
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15552
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: David Mulder <dmulder@samba.org>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Wed Jan 10 09:54:34 UTC 2024 on atb-devel-224
Andreas Schneider [Mon, 8 Jan 2024 15:15:03 +0000 (16:15 +0100)]
s3:rpc_server: Mark _lsa_CreateTrustedDomainEx as NOT_IMPLEMENTED
There is no PDB backend supporting this.
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Tue Jan 9 14:17:40 UTC 2024 on atb-devel-224
Andreas Schneider [Mon, 8 Jan 2024 15:13:52 +0000 (16:13 +0100)]
s3:rpc_server: Mark _lsa_CreateTrustedDomain as NOT_IMPLEMENTED
There is no PDB backend supporting this.
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Fri, 24 Nov 2023 13:42:35 +0000 (14:42 +0100)]
dcesrv_reply: just drop responses if the connection is already terminating
There's no reason to waste resources...
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Jan 9 11:26:55 UTC 2024 on atb-devel-224
Stefan Metzmacher [Fri, 24 Nov 2023 13:02:02 +0000 (14:02 +0100)]
dcesrv_core: add dcesrv_call_state->subreq in order to allow tevent_req_cancel() on termination
Requests might be cancelled if the connection got disconnected or
we got an ORPHANED or CO_CANCEL PDU.
But this is all opt-in for the backends to choose.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Stefan Metzmacher [Fri, 29 Dec 2023 09:20:02 +0000 (10:20 +0100)]
witness.idl: add flag(NDR_PAHEX) to some hex based enums
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Stefan Metzmacher [Fri, 24 Nov 2023 15:38:06 +0000 (16:38 +0100)]
witness.idl: make some types public in order to be used elsewhere
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Samuel Cabrero [Wed, 21 Oct 2020 16:30:29 +0000 (18:30 +0200)]
witness.idl: Set cifs as auth service name for the witness interface
Windows clients use the 'cifs' service name to bind to the witness interface.
Signed-off-by: Samuel Cabrero <scabrero@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Stefan Metzmacher [Fri, 24 Nov 2023 15:28:38 +0000 (16:28 +0100)]
tdb: fix python/tdbdump.py example
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Ralph Boehme [Sun, 28 Jan 2018 14:35:44 +0000 (15:35 +0100)]
examples/scripts: add smbXsrvdump
A simple python tool to dump smbXsrv TDB databases.
Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Stefan Metzmacher [Fri, 24 Nov 2023 15:09:58 +0000 (16:09 +0100)]
smbXsrv.idl: add python bindings
This is useful for some scripting examples and debugging...
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Günther Deschner <gd@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>