This hopefully hides the strange behaviour of FreeBSD (at least 12.1)
for already disconnected AF_UNIX sockets.
The race is triggered when the following detects the usage of 'getpeername':
truss -o ./truss.out -f -H -a -e -D -s 160 ctest -V -R test_thread_echo_tcp_connect;
grep getpeername truss.out
In a simplified log the following is happening:
ECHO_SRV(parent): socket(PF_LOCAL,SOCK_STREAM,0) = 4 (0x4)
ECHO_SRV(parent): unlink("/tmp/w_E37bkf/T0A0007") ERR#2 'No such file or directory'
ECHO_SRV(parent): bind(4,{ AF_UNIX "/tmp/w_E37bkf/T0A0007" },106) = 0 (0x0)
ECHO_SRV(parent): listen(4,16) = 0 (0x0)
...
ECHO_SRV(parent): write(2,"SWRAP_ERROR[echo_srv (9792)] - swrap_accept: before accept(sa_socklen=106)\n",75) = 75 (0x4b)
ECHO_SRV(parent): accept4(0x4,0x7ffffffde158,0x7ffffffde150,0x0) = 5 (0x5)
ECHO_SRV(parent): write(2,"SWRAP_ERROR[echo_srv (9792)] - swrap_accept: after accept(sa_socklen=106, family=1)\n",84) = 84 (0x54)
ECHO_SRV(parent): getsockname(5,{ AF_UNIX "/tmp/w_E37bkf/T0A0007" },0x7ffffffde0c0) = 0 (0x0)
ECHO_SRV(parent): swrap_accept() returned a valid connection and a per connection child (pid=9793) handles it
TEST_THREAD: socket(PF_LOCAL,SOCK_STREAM,0) = 7 (0x7)
TEST_THREAD: bind(7,{ AF_UNIX "/tmp/w_E37bkf/T014D4F" },106) = 0 (0x0)
TEST_THREAD: connect(7,{ AF_UNIX "/tmp/w_E37bkf/T0A0007" },106) = 0 (0x0)
TEST_THREAD: close(7) = 0 (0x0)
ECHO_SRV(parent): wait4(-1,0x0,0x0,0x0) = 9793 (0x2641)
ECHO_SRV(parent): close(5) = 0 (0x0)
ECHO_SRV(parent): write(2,"SWRAP_ERROR[echo_srv (9792)] - swrap_accept: before accept(sa_socklen=106)\n",75) = 75 (0x4b)
ECHO_SRV(parent): accept4(0x4,0x7ffffffde158,0x7ffffffde150,0x0) = 5 (0x5)
TEST_THREAD: unlink("/tmp/w_E37bkf/T014D4F") = 0 (0x0)
ECHO_SRV(parent): write(2,"SWRAP_ERROR[echo_srv (9792)] - swrap_accept: after accept(sa_socklen=16, family=1)\n",83) = 83 (0x53)
ECHO_SRV(parent): getpeername(5,0x7ffffffde158,0x7ffffffde150) ERR#57 'Socket is not connected'
ECHO_SRV(parent): getsockname(5,{ AF_UNIX "/tmp/w_E37bkf/T0A0007" },0x7ffffffde0c0) = 0 (0x0)
ECHO_SRV(parent): getpeername(5,0x7ffffffde158,0x7ffffffde150) ERR#57 'Socket is not connected'
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
/* Check if we have a stale fd and remove it */
swrap_remove_stale(fd);
+ if (un_addr.sa.un.sun_path[0] == '\0') {
+ /*
+ * FreeBSD seems to have a problem where
+ * accept4() on the unix socket doesn't
+ * ECONNABORTED for already disconnected connections.
+ *
+ * Let's try libc_getpeername() to get the peer address
+ * as a fallback, but it'll likely return ENOTCONN,
+ * which we have to map to ECONNABORTED.
+ */
+ un_addr.sa_socklen = sizeof(struct sockaddr_un),
+ ret = libc_getpeername(fd, &un_addr.sa.s, &un_addr.sa_socklen);
+ if (ret == -1) {
+ int saved_errno = errno;
+ libc_close(fd);
+ if (saved_errno == ENOTCONN) {
+ /*
+ * If the connection is already disconnected
+ * we should return ECONNABORTED.
+ */
+ saved_errno = ECONNABORTED;
+ }
+ errno = saved_errno;
+ return ret;
+ }
+ }
+
ret = libc_getsockname(fd,
&un_my_addr.sa.s,
&un_my_addr.sa_socklen);