Stefan Metzmacher [Tue, 23 Feb 2010 07:42:41 +0000 (08:42 +0100)]
s3:winbindd: never mark external domains as internal!
Otherwise we can end up silently using builtin_passdb_methods
for an AD domain without an inbound trust.
This fixes bug #7170.
metze
(cherry picked from commit
f924b7749280b31ece19885de1c3ad1bd71942ac)
(cherry picked from commit
1ea768baa9bb38533d4bd273d6c4e7b1f5fd12bd)
Stefan Metzmacher [Mon, 29 Mar 2010 20:03:55 +0000 (22:03 +0200)]
s3:winbindd: correctly retry if the netlogon pipe gets disconnected during a logon call
This hopefully fixes the last part of bug #7295.
metze
(cherry picked from commit
4c6cde99c0751a073120d8bc36d40922d8027344)
(cherry picked from commit
482518fcafb18bda1f084ebf1906a2ad02436b80)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:45:19 +0000 (14:45 +0200)]
s3:winbindd_reconnect: don't only reconnect on NT_STATUS_UNSUCCESSFUL
metze
(cherry picked from commit
6bd5a2a3739938f95fce23ab2da652c9b5a48111)
(cherry picked from commit
169628fcb656ba5987a99bd50c7f588b731eae51)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Thu, 25 Mar 2010 14:25:47 +0000 (15:25 +0100)]
s3:winbindd_cm: invalidate connection if cm_connect_netlogon() fails
metze
(cherry picked from commit
94a4bcd2f0c0464e192556679c6636639cb307ea)
(cherry picked from commit
c046ae8428fb62ff2749689e7c738f1a2e8f8251)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Thu, 25 Mar 2010 14:17:07 +0000 (15:17 +0100)]
s3:winbindd: consistently use TALLOC_FREE(conn->foo_pipe) if we create a new connection
metze
(cherry picked from commit
4f391fedac7111683d13f2d79fee7c0dbc27f86e)
(cherry picked from commit
c462e54142c00fdd81c2847d16a75119b1cc89fc)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:42:04 +0000 (14:42 +0200)]
s3:winbindd_cm: use rpccli_is_connected() helper function
metze
(cherry picked from commit
d980c06a994d032a833adc8d56d2f2c037f8fdaf)
(cherry picked from commit
aa7d54ed04585a183a88363406ed7f3244b24d85)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Thu, 25 Mar 2010 14:14:02 +0000 (15:14 +0100)]
s3:winbindd_cm: use cli_state_is_connected() helper function
metze
(cherry picked from commit
408a3eb35a0e61b5d66a3b48ebbd1a6796672d0f)
(cherry picked from commit
00a93190d2cae31cd2213b810ea348c055670399)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Sun, 28 Mar 2010 17:34:34 +0000 (19:34 +0200)]
s3:rpc_client: return at least 10 sec as old timeout in rpccli_set_timeout() instead of 0
metze
(cherry picked from commit
3e70da3f470eeb122f95477fb48d89939f501b3e)
(cherry picked from commit
60861fba533027b6c9a0ff704b95dcf631ea3ca3)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:31:17 +0000 (14:31 +0200)]
s3:rpc_client: add set_timeout hook to rpc_cli_transport
metze
(cherry picked from commit
99664ad15460530b6fb44957b6c57823f09884bf)
(cherry picked from commit
89164eb8363ffc0b951256578be48d37ddba46b1)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:26:29 +0000 (14:26 +0200)]
s3:rpc_client: add rpccli_is_connected()
metze
(cherry picked from commit
4f41b53487ac9bc96c7960e8edab464558656373)
(similar to commit
958b49323968740e2cbf69dc2a0a5dd57d5bcf87)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Mon, 29 Mar 2010 12:58:19 +0000 (14:58 +0200)]
s3:rpc_client: don't mix layers and keep a reference to cli_state in the caller
We should not rely on the backend to have a reference to the cli_state.
This will make it possible for the backend to set its cli_state reference
to NULL, when the transport is dead.
metze
(cherry picked from commit
dc09b12681ea0e6d4c2b0f1c99dfeb1f23019c65)
(cherry picked from commit
1e2e47da82aeb249dce431541738a62cb139aebb)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 10:23:39 +0000 (12:23 +0200)]
s3:rpc_transport_np: use cli_state_is_connected() helper
metze
(cherry picked from commit
b862351da8624df893ec77e020a456c1d23c58ed)
(cherry picked from commit
8c2f4426ce178ac33748cfba01532ec2fd205710)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Thu, 25 Mar 2010 12:20:56 +0000 (13:20 +0100)]
s3:libsmb: add cli_state_is_connected() function
metze
(cherry picked from commit
d7bf30ef92031ffddcde3680b38e602510bcae24)
(cherry picked from commit
589f73924273e8a9b54669f42a92381661dcb33f)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Mon, 29 Mar 2010 16:23:40 +0000 (18:23 +0200)]
s3:libsmb: don't let cli_shutdown() segfault with a NULL cli_state
metze
(similar to commit
47e10ab9a85960c78af807b66b99bcd139713644)
(cherry picked from commit
957c0d4a5ee67ac70e576155a0f2f6f84cdb1596)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 10:22:54 +0000 (12:22 +0200)]
s3:rpc_transport_np: handle trans rdata like the output of a normal read
Inspired by bug #7159.
metze
(cherry picked from commit
911287285cc4c8485b75edfad3c1ece901a69b0b)
(cherry picked from commit
e2739a2bf37e654c37cbea6e510f63a7ce4adfea)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:14:53 +0000 (14:14 +0200)]
s3: Fix infinite loop in NCACN_IP_TCP as there is no timeout. Assume lsa_pipe_tcp is ok but the network is down: then sending the request is ok, but select() on writeable fds loops forever since there is no response.
Signed-off-by: Bo Yang <boyang@samba.org>
(cherry picked from commit
36493bf2f6634b84c57107bcb86bcbf3e82e80fc)
(similar to commit
b58b359881c91ec382cfa1d6ba3007b8354b29cb)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:06:39 +0000 (14:06 +0200)]
Fix broken pipe handling
Metze is right: If we have *any* error at the socket level, we just can
not continue.
Also, apply some defensive programming: With this async stuff someone else
might already have closed the socket.
(cherry picked from commit
f140bf2e6578e45b8603d4a6c5feef9a3b735804)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 12:04:33 +0000 (14:04 +0200)]
s3:rpc_client: close the socket when pipe is broken
Signed-off-by: Bo Yang <boyang@samba.org>
(similar to commit
aa70e44cd0576e5280e24cf35000369a47dd958f)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 09:53:33 +0000 (11:53 +0200)]
s3: fix crash in winbindd
(similar to commit
f8cc0e88fbbb082ead023e0cb437b1e12cf35459)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Jeremy Allison [Fri, 19 Feb 2010 22:24:17 +0000 (14:24 -0800)]
Second part of fix for bug #7159 - client rpc_transport doesn't cope with bad server data returns.
If the server returns zero on an NP read, report the pipe as broken.
This prevents the client from looping if it thinks there should be
more data.
Jeremy.
(cherry picked from commit
0055e33dbed0e81548464d01bcf864255bab3159)
(cherry picked from commit
f5ca9f84e9b511c2ba7a4280b1997daa441f9877)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Stefan Metzmacher [Tue, 6 Apr 2010 10:20:02 +0000 (12:20 +0200)]
First part of fix for bug #7159 - client rpc_transport doesn't cope with bad server data returns.
Ensure that subreq is *always* talloc_free'd in the _done
function, as it has an event timeout attached. If the
read requests take longer than cli->timeout, then
the timeout fn is called with already freed data.
Jeremy.
(cherry picked from commit
ad77ae1d5870e06f8587ecf634e0b6bdcbb950d7)
(similar to commit
6e5b6b5acb30869eb63b25ed1406014101a5e89d)
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Christian Ambach [Mon, 5 Apr 2010 12:12:52 +0000 (14:12 +0200)]
fix a segfault in the notify subsystem
When the notify_array cannot be loaded correctly,
do not keep the half-baked parsing results in the global variable.
This can lead to segfaults next time notify_load is entered and
the seqnum has not changed. This has been seen in a case
where mixed smbd versions were running in a CTDB cluster
(versions with and w/o commit
c216d1e6 that changed the
notify_entry structure).
There will be missed notifications until all smbds are at the
same software level, but this should be acceptable and is better
than crashing and interrupting client operations.
This fix cleans up the notify_array, removes the unparseable data
from the TDB and returns a fresh notify_array that can be worked
with.
The NDR_PRINT_DEBUG had to be moved to only be called when the
parsing succeeded, it was seen to cause additional segfaults.
The status variable is intentionally left at NT_STATUS_OK so that
callers do not abort, report errors to the clients and make them
disconnect.
Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>
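The recovery path described in this commit can be sketched as below. This is a minimal Python model, not the C code in notify_load: the dict stands in for the notify TDB, and load_notify_array and its parse argument are hypothetical names used only for illustration.

```python
def load_notify_array(db, key, parse):
    """Return the parsed notify array, or a fresh empty one on bad data."""
    blob = db.get(key)
    if blob is None:
        return []
    try:
        # Parse into a local first; nothing shared is touched on failure.
        entries = parse(blob)
    except ValueError:
        # Unparseable record: drop it from the database and start fresh
        # instead of reporting an error to the caller.
        del db[key]
        return []
    return entries
```

The caller always gets a usable array back, which mirrors the decision to keep the status at NT_STATUS_OK.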
Volker Lendecke [Fri, 26 Mar 2010 12:18:52 +0000 (13:18 +0100)]
s3: Use tdb_transaction_start_nonblock in gencache_stabilize
This avoids the thundering herd problem when 5000 smbds exit simultaneously
because the network went down.
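A rough illustration of why a non-blocking transaction start helps here. This is a sketch under assumptions: a threading.Lock stands in for the tdb transaction lock, and stabilize is a hypothetical helper, not gencache_stabilize itself. The return convention of tdb_transaction_start_nonblock is not shown in this log.

```python
import threading

def stabilize(lock, do_stabilize):
    """Try to stabilize; give up at once if someone else holds the lock."""
    if not lock.acquire(blocking=False):
        # Another process is already inside the transaction. Do not
        # queue up behind it; just skip the work.
        return False
    try:
        do_stabilize()
        return True
    finally:
        lock.release()
```

With thousands of exiting processes, only the one that wins the lock does the work and the rest return immediately.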
Volker Lendecke [Fri, 26 Mar 2010 12:30:28 +0000 (13:30 +0100)]
tdb: Add a non-blocking version of tdb_transaction_start
Volker Lendecke [Fri, 26 Mar 2010 12:20:34 +0000 (13:20 +0100)]
Revert "s3: Optimize gencache for smbd exit"
This reverts commit
e5a63346ecbfff1058c08402c40df927dbac51b8.
That does not fully fix the problem; tdb_transaction_start_nonblock is added
to fix it instead.
Stefan Metzmacher [Tue, 23 Mar 2010 18:46:07 +0000 (19:46 +0100)]
s3:passdb: avoid sid_to_gid() if the sid is "domain users"
If the call fails we would use the "domain users" sid anyway.
metze
(cherry picked from commit
9fbbaa560ae74f015e404cfa700753c0b5909519)
Volker Lendecke [Thu, 18 Mar 2010 11:50:22 +0000 (12:50 +0100)]
s3: Implement an asynchronous echo responder process
This replies to echo requests when the main smbd is stuck somewhere
Signed-off-by: Stefan Metzmacher <metze@samba.org>
(cherry picked from commit
cad0c004ad54d80dcb25803f0ebb317344a42792)
Stefan Metzmacher [Fri, 19 Mar 2010 14:47:11 +0000 (15:47 +0100)]
s3:smbd: disable SMB encryption when the echo handler is active
metze
(cherry picked from commit
5a069f7209855e69082a176969533cc0d0ac0f55)
Stefan Metzmacher [Mon, 22 Mar 2010 08:11:05 +0000 (09:11 +0100)]
s3:smbd: disallow readbraw and writebraw if the echo handler is active
metze
(cherry picked from commit
d663b4c6c03450366375eb0951209bc374835935)
Stefan Metzmacher [Fri, 19 Mar 2010 11:08:13 +0000 (12:08 +0100)]
s3:smbd: disable sendfile if the echo handler is active
metze
(cherry picked from commit
fbf112bd1684acf420b104e0e7d66721af47c676)
Stefan Metzmacher [Thu, 18 Mar 2010 19:22:26 +0000 (20:22 +0100)]
s3:smbd: don't use recvfile if the echo handler is active
metze
(cherry picked from commit
453e6af5b81c8f206d87ec2e62fd79172f695950)
Stefan Metzmacher [Mon, 22 Mar 2010 08:45:43 +0000 (09:45 +0100)]
s3:smbd: setup a shared memory area for the signing state
metze
(cherry picked from commit
79e5e3dda7178c4d3c5952a48474d6dcafba91ec)
Stefan Metzmacher [Mon, 22 Mar 2010 08:43:48 +0000 (09:43 +0100)]
s3:smbd: add echo handler information to struct smbd_server_connection
metze
(cherry picked from commit
44d655b33fecb7a543ff957940716ba93fec12cd)
Stefan Metzmacher [Mon, 22 Mar 2010 08:36:41 +0000 (09:36 +0100)]
s3:param: add "async smb echo handler" option
This will enable an extra forked process that will reply
to SMBecho requests, while the main process is blocked by another
request.
metze
(cherry picked from commit
752240ccdc4dcdce7a2270ee5544e007c44bcf4d)
Stefan Metzmacher [Thu, 18 Mar 2010 14:36:19 +0000 (15:36 +0100)]
s3:smbd: pass down trusted_channel via receive_smb_talloc()
metze
(cherry picked from commit
b2c107ffbcd067ccc42f81a2d0969f7f88b63ae7)
Stefan Metzmacher [Fri, 19 Mar 2010 11:04:32 +0000 (12:04 +0100)]
s3:smbd: let reply_readbraw_error use the locked socket
metze
(cherry picked from commit
1e7086e5ce0924687d657de583adb63a9f0c1bfb)
Stefan Metzmacher [Fri, 19 Mar 2010 11:02:27 +0000 (12:02 +0100)]
s3:smbd: send keepalive packets under the socket lock
metze
(cherry picked from commit
c1653e3b0e536e835faf82a5aadadaec1cd38d1a)
Stefan Metzmacher [Thu, 18 Mar 2010 08:23:48 +0000 (09:23 +0100)]
s3:smbd: smbd_[un]lock_socket() while accessing the socket to the client
metze
(cherry picked from commit
977aa660f452d8ebc8f3a2f4bfbf0dda0bc230a2)
Stefan Metzmacher [Mon, 22 Mar 2010 08:34:07 +0000 (09:34 +0100)]
s3:smbd: add smbd_[un]lock_socket() dummies
metze
(cherry picked from commit
8de8554628bd3b16d9e488adfc31c8014c2eb1db)
Stefan Metzmacher [Mon, 22 Mar 2010 08:31:57 +0000 (09:31 +0100)]
s3:smbd: add an option to skip signing checks in srv_check_sign_mac for trusted channels
metze
(cherry picked from commit
0b7da43da0bd5c7e0986854cda63103f082a26ee)
Stefan Metzmacher [Wed, 17 Mar 2010 14:07:07 +0000 (15:07 +0100)]
s3:libsmb: add a smb_signing_init_ex() function
Make it possible to overload memory handling functions.
metze
(cherry picked from commit
048c919dc0b7bc038becad34c2861c43c72c43c9)
Stefan Metzmacher [Mon, 22 Mar 2010 08:30:39 +0000 (09:30 +0100)]
lib/util: add allocate_anonymous_shared()
metze
(cherry picked from commit
01f2c023f7d2a4b0e016676638a062a5ba29ec0b)
Stefan Metzmacher [Mon, 22 Mar 2010 09:12:42 +0000 (10:12 +0100)]
lib/async_sock: handle queue = NULL in writev_send()
metze
Stefan Metzmacher [Mon, 15 Mar 2010 14:40:34 +0000 (15:40 +0100)]
s3:smbd: use new simplified smb_signing code in the server
We keep the seqnum/mid mapping in the smb_request structure.
This also moves one global variable into the
smbd_server_connection struct.
metze
(cherry picked from commit
c16c90a1cb3b0e2ceadd3dea835a4e69acfc2fae)
Stefan Metzmacher [Mon, 9 Mar 2009 07:42:05 +0000 (08:42 +0100)]
s3:libsmb: add a much simplified smb_signing infrastructure
It's the job of the caller to maintain the seqnum/mid mapping.
Hopefully we can use this code in s4 later too.
metze
(cherry picked from commit
2654653f55ed5744cc9fca6a79127386f55425e1)
Stefan Metzmacher [Sun, 8 Mar 2009 16:47:08 +0000 (17:47 +0100)]
s3:libsmb: rename smb_signing.c => clisigning.c
This prepares a large simplification of the smb_signing code
metze
(cherry picked from commit
1a48d0793b9d3a76aff76580661626e5cd95f427)
Volker Lendecke [Thu, 25 Mar 2010 15:45:02 +0000 (16:45 +0100)]
s3: Add a comment to notify_internal_parent_init, this is pretty confusing
Volker Lendecke [Thu, 25 Mar 2010 15:44:41 +0000 (16:44 +0100)]
s3: Add a comment to serverid_parent_init, this is pretty confusing
Volker Lendecke [Thu, 25 Mar 2010 15:44:02 +0000 (16:44 +0100)]
s3: Add a comment to messaging_tdb_parent_init, this is pretty confusing
Volker Lendecke [Thu, 25 Mar 2010 15:02:54 +0000 (16:02 +0100)]
s3: Make sure our CLEAR_IF_FIRST optimization works for serverid.tdb
In the child, we fully re-open serverid.tdb, which leads to one fcntl lock for
CLEAR_IF_FIRST detection per smbd. This opens the tdb in the parent and holds
it, so that tdb_reopen_all correctly catches the CLEAR_IF_FIRST bit.
Volker Lendecke [Thu, 25 Mar 2010 15:01:54 +0000 (16:01 +0100)]
s3: Make sure our CLEAR_IF_FIRST optimization works for the notify tdbs
The notify tdb files are opened at tconX time, which leads to one fcntl lock
for CLEAR_IF_FIRST detection per smbd. This opens the tdbs in the parent and
holds it, so that tdb_reopen_all correctly catches the CLEAR_IF_FIRST bit.
Volker Lendecke [Thu, 25 Mar 2010 14:59:41 +0000 (15:59 +0100)]
s3: Make sure our CLEAR_IF_FIRST optimization works for messaging.tdb
In the child, we fully re-open messaging.tdb, which leads to one fcntl lock for
CLEAR_IF_FIRST detection per smbd. This opens the tdb in the parent and holds
it, so that tdb_reopen_all correctly catches the CLEAR_IF_FIRST bit.
Volker Lendecke [Wed, 24 Mar 2010 09:28:46 +0000 (10:28 +0100)]
v3-4-ctdb: Use connections_forall_read() in smbstatus
This avoids a dmaster migration for every record when smbstatus is run
Volker Lendecke [Wed, 24 Mar 2010 09:28:44 +0000 (10:28 +0100)]
v3-4-ctdb: Add connections_forall_read()
Volker Lendecke [Tue, 23 Mar 2010 17:36:55 +0000 (18:36 +0100)]
s3: Optimize gencache for smbd exit
If thousands of smbds try to gencache_stabilize at the same time because the
network died, all of them might be sitting in transaction_start. Don't do the
stabilize transaction if nothing has changed in gencache_notrans.tdb.
Volker
Michael Adam [Fri, 12 Feb 2010 15:46:33 +0000 (16:46 +0100)]
s3:configure: prevent using external libtalloc with version >= 1.4.0
There was an ABI change and this results in an error
"undefined symbol: _talloc_free"
Michael
Volker Lendecke [Fri, 5 Mar 2010 15:46:36 +0000 (16:46 +0100)]
s3: Add the "ctdb locktime warn threshold" parameter
This is mainly a debugging aid for post-mortem analysis in case a cluster file
system is slow.
Volker Lendecke [Mon, 22 Mar 2010 10:19:10 +0000 (11:19 +0100)]
s3: Add "log writeable files on exit" parameter
This boolean option controls whether at exit time the server dumps a list of
files with debug level 0 that were still open for write. This is an
administrative aid to find the files that were potentially corrupt if the
network connection died.
Volker Lendecke [Mon, 22 Mar 2010 08:16:57 +0000 (09:16 +0100)]
s3: file_walk_table -> files_forall
This is more in line with the rest of the Samba code, like connections_forall
etc.
Michael Adam [Tue, 2 Mar 2010 13:43:53 +0000 (14:43 +0100)]
s3:net: add a command "net registry setsd_sddl"
This allows setting the security descriptor of a registry
key from the unix command line.
Michael
(cherry picked from commit
27ae935a8df409ce7557bd369250fa450120fdfe)
Michael Adam [Fri, 26 Feb 2010 08:37:45 +0000 (09:37 +0100)]
s3:net: add new subcommand "net registry getsd_sddl" to print secdesc in sddl format
Michael
(cherry picked from commit
caa27bb165a69766585ec4a13a6c09fa774d3b48)
Michael Adam [Fri, 26 Feb 2010 08:31:03 +0000 (09:31 +0100)]
s3:net: refactor getting of secdesc out of net_registry_getsd()
The new net_registry_getsd_internal() does the work,
net_registry_getsd() just prints the result.
This is in preparation for adding support for output formats
other than the currently used display_sec_desc().
Michael
Michael Adam [Tue, 11 Aug 2009 21:35:48 +0000 (23:35 +0200)]
s3:smbcacls: forbid change of debug level from config file
Michael
(cherry picked from commit
a038f1e05b8b7acb5e99257e59178e1ece4ce156)
Michael Adam [Mon, 15 Mar 2010 11:16:52 +0000 (12:16 +0100)]
s3:smbcacls: also honour the "--sddl" flag when setting ACLs.
Michael
Michael Adam [Sun, 28 Feb 2010 21:20:03 +0000 (22:20 +0100)]
s3:smbcacls: add switch "--sddl" to output acls as sddl encoded strings
(cherry picked from commit
9cea4d5969d3061689e7399e0a97f7f83ed31976)
Michael Adam [Sun, 28 Feb 2010 21:15:23 +0000 (22:15 +0100)]
s3: build sddl.c in samba3
Michael Adam [Sun, 28 Feb 2010 21:01:49 +0000 (22:01 +0100)]
libcli/security: fix sddl.c to be able to build it from source3
(cherry picked from commit
f37030b33afa989adaafa6d3d02751bd286f879b)
Michael Adam [Fri, 26 Feb 2010 17:32:21 +0000 (18:32 +0100)]
s4:move the sddl code down to the top level
Michael
Volker Lendecke [Fri, 12 Mar 2010 14:48:35 +0000 (15:48 +0100)]
s3: Add "net registry increment"
A convenience function to increment a DWORD value under a (cluster-wide) lock
Volker Lendecke [Fri, 12 Mar 2010 13:22:54 +0000 (14:22 +0100)]
s3: Add "g_lock_do" as a convenience wrapper function
Volker Lendecke [Fri, 12 Mar 2010 11:12:25 +0000 (12:12 +0100)]
s3: Actually use mem_ctx in net_g_lock_init()
Volker Lendecke [Tue, 2 Mar 2010 16:02:01 +0000 (17:02 +0100)]
s3: Fix a long-standing problem with recycled PIDs
When a samba server process dies hard, it has no chance to clean up its entries
in locking.tdb, brlock.tdb, connections.tdb and sessionid.tdb.
For locking.tdb and brlock.tdb Samba is robust by checking every time we read
an entry from the database if the corresponding process still exists. If it
does not exist anymore, the entry is deleted. This is not 100% failsafe though:
On systems with a limited PID space there is a non-zero chance that between the
smbd's death and the fresh access, the PID is recycled by another long-running
process. This renders all files that had been locked by the killed smbd
potentially unusable until the new process also dies.
This patch is supposed to fix the problem the following way: Every process ID
in every database is augmented by a random 64-bit number that is stored in a
serverid.tdb. Whenever we need to check if a process still exists we know its
PID and the 64-bit number. We look up the PID in serverid.tdb and compare the
64-bit number. If it's the same, the process still is a valid smbd holding the
lock. If it is different, a new smbd has taken over.
I believe this is safe against an smbd that has died hard and the PID has been
taken over by a non-samba process. This process would not have registered
itself with a fresh 64-bit number in serverid.tdb, so the old one still exists
in serverid.tdb. We protect against this case by the parent smbd taking care of
deregistering PIDs from serverid.tdb and the fact that serverid.tdb is
CLEAR_IF_FIRST.
CLEAR_IF_FIRST does not work in a cluster, so the automatic cleanup does not
work when all smbds are restarted. For this, "net serverid wipe" has to be run
before smbd starts up. As a convenience, "net serverid wipedbs" also cleans up
sessionid.tdb and connections.tdb.
While there, this also cleans up overloading connections.tdb with all the
process entries just for messaging_send_all().
Volker
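The liveness check described in this commit can be modelled roughly as follows. This is a minimal Python sketch, not Samba's C code: the dict stands in for serverid.tdb, and ServerIdTable, register, deregister and holder_is_valid are hypothetical names.

```python
import secrets

class ServerIdTable:
    """Toy stand-in for serverid.tdb: maps pid -> random 64-bit number."""

    def __init__(self):
        self._ids = {}

    def register(self, pid):
        # A starting smbd stores a fresh random 64-bit number under its PID.
        unique_id = secrets.randbits(64)
        self._ids[pid] = unique_id
        return unique_id

    def deregister(self, pid):
        # Done by the parent smbd when a child goes away.
        self._ids.pop(pid, None)

    def holder_is_valid(self, pid, unique_id):
        # A lock entry records (pid, unique_id). It is only valid if the
        # PID is still registered with exactly that number.
        return self._ids.get(pid) == unique_id
```

A lock entry written by a dead smbd fails this check both when the PID was deregistered and when a new smbd re-registered the same PID with a different number.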
Michael Adam [Mon, 8 Mar 2010 22:35:17 +0000 (23:35 +0100)]
s3:release-scripts: fix create-tarball to treat vendor patch level correctly
Volker Lendecke [Mon, 8 Mar 2010 16:59:35 +0000 (17:59 +0100)]
packaging(RHEL-CTDB): explicitly build the tsm vfs module.
Volker Lendecke [Fri, 5 Mar 2010 15:10:49 +0000 (16:10 +0100)]
packaging(RHEL-CTDB): Fix the RPM build
Michael Adam [Mon, 8 Mar 2010 21:32:41 +0000 (22:32 +0100)]
s3:build: Fix automatic building of vfs_tsmsm if gpfs and dmapi are present.
Michael
Volker Lendecke [Fri, 5 Mar 2010 16:06:08 +0000 (17:06 +0100)]
s3: Make "smbcontrol xx debuglevel" print the correct cluster pid
Volker Lendecke [Fri, 5 Mar 2010 11:28:59 +0000 (12:28 +0100)]
v3-4-ctdb: Fix the build of vfs_gpfs_prefetch.c
Volker Lendecke [Tue, 16 Feb 2010 14:21:25 +0000 (15:21 +0100)]
s3: Fix timeout calculation if g_lock_lock is given a timeout < 60s
Detected while showing this code to obnox :-)
Volker Lendecke [Tue, 16 Feb 2010 11:31:58 +0000 (12:31 +0100)]
s3: Slightly increase parallelism in g_lock
There's no need to still hold the g_lock tdb-level lock while telling the
waiters to retry
Volker Lendecke [Tue, 16 Feb 2010 11:28:53 +0000 (12:28 +0100)]
s3: Avoid starving locks when many processes die at the same time
In g_lock_unlock we have a little race between the process_exists and
messaging_send call: We only send to 5 waiters now, they all might have died
between us checking their existence and sending the message. This change makes
g_lock_lock retry at least once every minute.
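One way to express "retry at least once every minute" while still honouring a shorter caller timeout is sketched below. next_wait is a hypothetical helper; how g_lock_lock actually computes its wait time is not shown in this log.

```python
RETRY_INTERVAL = 60.0  # seconds; "at least once every minute"

def next_wait(deadline, now, retry_interval=RETRY_INTERVAL):
    """How long to sleep before the next lock attempt.

    Never sleep longer than retry_interval, and never past the
    caller's overall deadline.
    """
    remaining = max(0.0, deadline - now)
    return min(remaining, retry_interval)
```

A waiter whose wakeup message was lost is therefore delayed by at most one retry interval.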
Volker Lendecke [Tue, 16 Feb 2010 11:22:08 +0000 (12:22 +0100)]
s3: Avoid a thundering herd in g_lock_unlock
Only notify the first 5 pending lock waiters. This avoids a thundering herd
problem that is really nasty in a cluster. It also makes acquiring a lock a bit
more FIFO, lock waiters are added to the end of the array.
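The selection of waiters can be illustrated with a few lines of Python. This is only a sketch: waiters_to_notify is a hypothetical name, and the list stands in for the array of pending lock waiters.

```python
MAX_NOTIFY = 5  # the limit named in the commit message above

def waiters_to_notify(waiters, limit=MAX_NOTIFY):
    """Return the waiters that get a retry message on unlock.

    `waiters` is ordered oldest first, since new waiters are appended
    at the end of the array.
    """
    return waiters[:limit]
```

Because the slice starts at the front, the longest-waiting processes are woken first.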
Volker Lendecke [Mon, 15 Feb 2010 15:57:16 +0000 (16:57 +0100)]
s3: Optimize g_lock_lock for a heavily contended case
Only check the existence of the lock owner in g_lock_parse, check the rest of
the records only when we got the lock successfully. This reduces the load on
process_exists which can involve a network roundtrip in the clustered case.
Volker Lendecke [Mon, 15 Feb 2010 15:49:46 +0000 (16:49 +0100)]
s3: Fix handling of processes that died in g_lock
g_lock_parse might have thrown away entries from the locks array because the
processes were not around anymore. Don't store the orphaned entries.
Volker Lendecke [Mon, 15 Feb 2010 15:35:06 +0000 (16:35 +0100)]
s3: Fix a typo
Volker Lendecke [Fri, 12 Feb 2010 11:06:50 +0000 (12:06 +0100)]
s3: notify_onelevel does not use seqnums, so don't open asking for it
Andrew Tridgell [Fri, 5 Feb 2010 03:25:03 +0000 (14:25 +1100)]
s3-events: make the old timed events compatible with tevent
tevent ensures that a timed event is only called once. The old events
code relied on the called handler removing the event itself. If the
handler removed the event after calling a function which invoked the
event loop then the timed event could loop forever.
This change makes the two timed event systems more compatible, by
allowing the handler to free the te if it wants to, but ensuring it is
off the linked list of events before the handler is called, and
ensuring it is freed even if the handler doesn't free it.
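The ordering described above (unlink, call, then free if needed) can be sketched like this. The Python below is an illustrative model with hypothetical names (TimedEvent, run_timed_event); it is not the events code itself.

```python
class TimedEvent:
    def __init__(self, handler):
        self.handler = handler
        self.freed = False

    def free(self):
        self.freed = True


def run_timed_event(pending, te):
    """Fire one timed event exactly once."""
    # Unlink first, so a nested event loop started by the handler
    # cannot see this event and fire it again.
    pending.remove(te)
    te.handler(te)
    # Free the event if the handler did not do so itself.
    if not te.freed:
        te.free()
```

A handler that re-enters the event loop before freeing its event no longer finds that event on the pending list.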
Andrew Tridgell [Fri, 5 Feb 2010 01:42:06 +0000 (12:42 +1100)]
s3-smbd: add a rate limited cleanup of brl, connections and locking db
On unclean shutdown we can end up with stale entries in the brlock,
connections and locking db. Previously we would do the cleanup on
every unclean exit, but that can cause smbd to be completely
unavailable for several minutes when a large number of child smbd
processes exit.
This adds a rate limited cleanup of the databases, with the default
that cleanup happens at most every 20s
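A minimal model of such a rate limit is shown below, assuming the 20s default from this commit. RateLimitedCleanup is a hypothetical name, and this sketch simply drops requests that arrive too early; the real code may instead defer them, which is not modelled here.

```python
import time

class RateLimitedCleanup:
    def __init__(self, cleanup, min_interval=20.0, clock=time.monotonic):
        self._cleanup = cleanup
        self._min_interval = min_interval
        self._clock = clock
        self._last_run = None

    def request(self):
        """Run cleanup unless it ran less than min_interval seconds ago."""
        now = self._clock()
        if (self._last_run is not None
                and now - self._last_run < self._min_interval):
            return False
        self._last_run = now
        self._cleanup()
        return True
```

However many children exit in a burst, the expensive database traversal runs at most once per interval.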
Andrew Tridgell [Thu, 4 Feb 2010 07:02:52 +0000 (18:02 +1100)]
s3-brlock: we don't need these MSG_SMB_UNLOCK calls now
These have been replaced with the min timeout in blocking.c
Andrew Tridgell [Sat, 6 Feb 2010 04:59:43 +0000 (20:59 -0800)]
s3-brlock: add a minimum retry time for pending blocking locks
When we are waiting on a pending byte range lock, another smbd might
exit uncleanly, and therefore not notify us of the removal of the
lock, and thus not trigger the lock to be retried.
We coped with this up to now by adding a message_send_all() in the
SIGCHLD and cluster reconfigure handlers to send a MSG_SMB_UNLOCK to
all smbd processes. That would generate O(N^2) work when a large
number of clients disconnected at once (such as on a network outage),
which could leave the whole system unusable for a very long time (many
minutes, or even longer).
By adding a minimum re-check time for pending byte range locks we
avoid this problem by ensuring that pending locks are retried at a
more regular interval.
Michael Adam [Tue, 9 Feb 2010 07:00:06 +0000 (08:00 +0100)]
packaging(RHEL-CTDB): adapt configure.rpm to match the spec-file configure call
Michael
Abhidnya P Chirmule [Wed, 6 Jan 2010 18:45:24 +0000 (19:45 +0100)]
s3: Add a vfs_time_audit module
This warns if a file system is slow
Michael Adam [Mon, 8 Feb 2010 10:01:47 +0000 (11:01 +0100)]
s3:registry: eliminate race condition in creating/scanning sorted subkeys
Called from key_exists, scan_sorted_subkeys re-creates the sorted
subkeys record of the given key and then searches through it.
The race is that between creation and parsing of the sorted subkey
record, another process that stores some other subkey of the same
parent key can delete the sorted subkey record, resulting in a
WERR_BADFILE for an operation that should actually succeed.
This patch fixes the issue by wrapping the creation and parsing
into a transaction.
Michael
Michael Adam [Fri, 29 Jan 2010 14:04:25 +0000 (15:04 +0100)]
s3:make "net conf addshare" atomic by wrapping all writes in one transaction
Michael
Michael Adam [Sat, 23 Jan 2010 00:17:06 +0000 (01:17 +0100)]
s3:g_lock: remove a nested event loop, replacing the inner loop by select
This made smbd crash in g_lock_lock() when trying to start a
transaction on a db with an already started transaction,
e.g. in a tcon_and_X where the share_info.tdb was not yet
initialized but share_info.tdb was already locked by another
process, or write access to the winreg rpc pipe where the
registry tdb was already locked by another process.
What we really _want_ to do here by design is to react to
MSG_DBWRAP_G_LOCK_RETRY messages that are either sent
by a client doing g_lock_unlock or by ourselves when
we receive a CTDB_SRVID_SAMBA_NOTIFY or
CTDB_SRVID_RECONFIGURE message from ctdbd, i.e. when
either a client holding a lock or a complete node
has died.
Doing this properly involves calling tevent_loop_once(),
but doing this here with the main ctdbd messaging context
creates a nested event loop when g_lock_lock() is called
from the main event loop.
So as a quick fix, we act a little coarsely here: we do
a select on the ctdb connection fd and when it is readable
or we get EINTR, then we retry without actually parsing
any ctdb packets or dispatching messages. This means that
we retry more often than necessary and intended by design,
but this does no harm and it is unobtrusive. When we have
finished, the main loop will pick up all the messages and
ctdb packets. The only extra twist is that we cannot use
timed events here but have to handcode a timeout for select.
Michael
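The "select with a hand-coded timeout" idea can be sketched in Python as follows. wait_for_retry is a hypothetical helper, and fd stands in for the descriptor returned by ctdbd_conn_get_fd(). Python's select() restarts itself on EINTR (PEP 475), so the EINTR case of the C code has no direct counterpart here.

```python
import select
import time

def wait_for_retry(fd, timeout):
    """Block until fd is readable or timeout seconds have passed.

    Returns True if the fd became readable (the caller should retry
    the lock), False on timeout. Nothing is read from fd; the main
    event loop is expected to consume the data later.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        readable, _, _ = select.select([fd], [], [], remaining)
        if readable:
            return True
```

Since the data is left unread, the descriptor stays readable and every later wait returns at once, which is the "retry more often than necessary" trade-off the message accepts.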
Michael Adam [Fri, 22 Jan 2010 23:05:15 +0000 (00:05 +0100)]
s3:ctdb_conn: add ctdbd_conn_get_fd() to get the fd out of the ctdb connection
Michael
Michael Adam [Fri, 22 Jan 2010 14:56:28 +0000 (15:56 +0100)]
s3:g_lock: remove an unreached code path.
Michael
Michael Adam [Mon, 18 Jan 2010 16:26:04 +0000 (17:26 +0100)]
s3:dbwrap_ctdb: fix reading/storing of special key __db_sequence_number__
The key for reading and writing was inconsistent due to an
off-by-one data length.
Michael
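An off-by-one key length of this kind typically comes from counting, or not counting, the trailing NUL byte. The Python below only illustrates that failure mode; which length each code path used is not stated in this log, and make_key, store_seqnum and fetch_seqnum are hypothetical names.

```python
KEY_NAME = b"__db_sequence_number__"

def make_key(name, include_nul):
    """Build the lookup key, with or without the trailing NUL byte."""
    return name + b"\x00" if include_nul else name

def store_seqnum(db, value, include_nul):
    db[make_key(KEY_NAME, include_nul)] = value

def fetch_seqnum(db, include_nul):
    # Returns None when no record exists under this exact key.
    return db.get(make_key(KEY_NAME, include_nul))
```

A record stored under one key length is invisible to a reader using the other, so both sides must agree on the length.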
Volker Lendecke [Thu, 14 Jan 2010 17:26:01 +0000 (18:26 +0100)]
v3-4-ctdb: Do not do any logrotation
Michael Adam [Wed, 13 Jan 2010 22:53:54 +0000 (23:53 +0100)]
s3:dbwrap_ctdb: exit early when nothing has been written in transaction_commit.
This skips update of the __db_sequence_number__ record when nothing else has
been written. There are transactions that are just opened and then nothing
is written until transaction_commit is called. This is for instance the case
with registry initialization routines: They start a transaction and only
write something when the registry has not been initialized yet.
So this change will skip many db_seqnum bumps and TRANS3_COMMIT roundtrips.
Michael
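The early exit can be modelled as below. This is a sketch with hypothetical names: Transaction and SEQNUM_KEY are illustrative, the dict stands in for the database, and no ctdb roundtrip is modelled.

```python
SEQNUM_KEY = "__db_sequence_number__"

class Transaction:
    def __init__(self, db):
        self._db = db
        self._pending = {}

    def store(self, key, value):
        # Writes are buffered until commit.
        self._pending[key] = value

    def commit(self):
        if not self._pending:
            # Nothing was written: skip the sequence number bump
            # and the commit work altogether.
            return False
        self._db.update(self._pending)
        self._db[SEQNUM_KEY] = self._db.get(SEQNUM_KEY, 0) + 1
        self._pending = {}
        return True
```

An empty transaction leaves the sequence number untouched, so readers that watch it do not see a spurious change.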