nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
author Jeff Layton <jlayton@kernel.org>
Fri, 5 Apr 2024 17:56:18 +0000 (13:56 -0400)
committer Chuck Lever <chuck.lever@oracle.com>
Fri, 5 Apr 2024 18:05:35 +0000 (14:05 -0400)
commit 10396f4df8b75ff6ab0aa2cd74296565466f2c8d
tree c34f457c0e65e8a90841638b1f09d458d044c040
parent 05258a0a69b3c5d2c003f818702c0a52b6fea861

Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC, that counter is
really meant for client-driven RPCs, and taking it here has the effect
of preventing the client from being unhashed until the callback completes.

If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as an NFS4ERR_DELAY
return on the CREATE_SESSION operation.

The job is only holding a reference to the client so it can clear a flag
after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a
reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of
reference when dealing with the nfsdfs info files, but it should work
appropriately here to ensure that the nfs4_client doesn't disappear.
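
The difference between the two kinds of reference can be sketched with a
small userspace model. This is only an illustration under stated
assumptions, not the code in fs/nfsd/nfs4state.c: the toy_* names are
hypothetical, plain integers stand in for the kernel's atomic counter and
kref, and the unhash check is reduced to the single condition described
above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a client record. Field names echo the commit message,
 * but the struct and all toy_* helpers are hypothetical.
 */
struct toy_client {
	int cl_rpc_users;          /* in-flight RPCs; nonzero blocks unhashing */
	int cl_ref;                /* lifetime-only count (models cl_nfsdfs.cl_ref) */
	bool hashed;               /* still findable in the lookup tables */
	bool freed;                /* set when the last lifetime ref is dropped */
	bool recall_any_scheduled; /* flag the callback clears on completion */
};

/* A rebooted client tries to replace the old record. */
static bool toy_try_unhash(struct toy_client *clp)
{
	if (clp->cl_rpc_users > 0)
		return false;      /* the caller would see NFS4ERR_DELAY */
	clp->hashed = false;
	return true;
}

/* Drop one lifetime reference; the record is freed at zero. */
static void toy_put_ref(struct toy_client *clp)
{
	assert(clp->cl_ref > 0);
	if (--clp->cl_ref == 0)
		clp->freed = true;
}

/* Old behaviour: the callback pins the client via cl_rpc_users. */
static void toy_recall_any_start_old(struct toy_client *clp)
{
	clp->recall_any_scheduled = true;
	clp->cl_rpc_users++;
}

static void toy_recall_any_release_old(struct toy_client *clp)
{
	clp->recall_any_scheduled = false;
	clp->cl_rpc_users--;
}

/* New behaviour: the callback only keeps the memory alive. */
static void toy_recall_any_start_new(struct toy_client *clp)
{
	clp->recall_any_scheduled = true;
	clp->cl_ref++;
}

static void toy_recall_any_release_new(struct toy_client *clp)
{
	clp->recall_any_scheduled = false;
	toy_put_ref(clp);
}
```

In this model, an outstanding callback started with
toy_recall_any_start_old() makes toy_try_unhash() fail until the callback
is released, which mirrors the stuck CREATE_SESSION. With
toy_recall_any_start_new(), the unhash succeeds immediately, and the
record is freed only once the callback's release drops the last
reference.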

Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Vladimir Benes <vbenes@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
fs/nfsd/nfs4state.c