[lxc-devel] [lxd/master] proxy: handle full socket buffer

brauner on Github lxc-bot at linuxcontainers.org
Fri Jun 22 10:27:01 UTC 2018


A non-text attachment was scrubbed...
Name: not available
Type: text/x-mailbox
Size: 1655 bytes
Desc: not available
URL: <http://lists.linuxcontainers.org/pipermail/lxc-devel/attachments/20180622/1489d1c3/attachment.bin>
-------------- next part --------------
From 777e48048023d887db2a0ef58c739b39bbee054e Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner at ubuntu.com>
Date: Fri, 22 Jun 2018 12:20:42 +0200
Subject: [PATCH] proxy: handle full socket buffer

Usually we should wait for the child process somewhere here. But we cannot
really do this. The listener file descriptors are retrieved in the go runtime
but at that point we have already double-fork()ed to daemonize ourselves and so
we can't wait on the child anymore after we received the listener fds. On the
other hand, if we wait on the child here we wait on the child before the
receive. However, if we do this then we can end up in a situation where the
socket send buffer is full and we need to retrieve some file descriptors first
before we can go on sending more. But this won't be possible because we're
waiting before the call to receive the file descriptor in the go runtime.
Luckily, we can just rely on init doing it's job and reaping the zombie
process. So, technically unsatisfying but pragmatically correct.

Explanatory Anectdote:
The limit for sending fds lies somewhere in the hundreds. I've hit this limit
before when sending pty fds in LXC around. Once there was code in LXC that did
this in one send. So if a user requested 500 fds LXC would try to send 500 fds
from parent to child and it would fail with EINVAL. That's when I switched to
batch-based sending. This is a similar situation just that we fill the buffer
bit by bit.

Closes #4681.

Signed-off-by: Christian Brauner <christian.brauner at ubuntu.com>
---
 lxd/main_forkproxy.go | 49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/lxd/main_forkproxy.go b/lxd/main_forkproxy.go
index d9bb4509c..5d73a6304 100644
--- a/lxd/main_forkproxy.go
+++ b/lxd/main_forkproxy.go
@@ -165,19 +165,13 @@ void forkproxy()
 		}
 
 		ret = dup3(sk_fds[1], FORKPROXY_UDS_SOCK_FD_NUM, O_CLOEXEC);
+		close(sk_fds[1]);
 		if (ret < 0) {
 			fprintf(stderr,
 				"%s - Failed to duplicate fd %d to fd 200\n",
 				strerror(errno), sk_fds[1]);
 			_exit(1);
 		}
-
-		ret = close(sk_fds[1]);
-		if (ret < 0) {
-			fprintf(stderr, "%s - Failed to close socket fd %d\n",
-				strerror(errno), sk_fds[1]);
-			_exit(1);
-		}
 	} else {
 		whoami = FORKPROXY_PARENT;
 
@@ -203,6 +197,7 @@ void forkproxy()
 		}
 
 		ret = dup3(sk_fds[0], FORKPROXY_UDS_SOCK_FD_NUM, O_CLOEXEC);
+		close(sk_fds[0]);
 		if (ret < 0) {
 			fprintf(stderr,
 				"%s - Failed to duplicate fd %d to fd 200\n",
@@ -210,18 +205,21 @@ void forkproxy()
 			_exit(1);
 		}
 
-		ret = close(sk_fds[0]);
-		if (ret < 0) {
-			fprintf(stderr, "%s - Failed to close socket fd %d\n",
-				strerror(errno), sk_fds[1]);
-			_exit(1);
-		}
-
-		ret = wait_for_pid(pid);
-		if (ret < 0) {
-			fprintf(stderr, "Failed to start listener\n");
-			_exit(EXIT_FAILURE);
-		}
+		// Usually we should wait for the child process somewhere here.
+		// But we cannot really do this. The listener file descriptors
+		// are retrieved in the go runtime but at that point we have
+		// already double-fork()ed to daemonize ourselves and so we
+		// can't wait on the child anymore after we received the
+		// listener fds. On the other hand, if we wait on the child
+		// here we wait on the child before the receive. However, if we
+		// do this then we can end up in a situation where the socket
+		// send buffer is full and we need to retrieve some file
+		// descriptors first before we can go on sending more. But this
+		// won't be possible because we're waiting before the call to
+		// receive the file descriptor in the go runtime. Luckily, we
+		// can just rely on init doing it's job and reaping the zombie
+		// process. So, technically unsatisfying but pragmatically
+		// correct.
 
 		// daemonize
 		pid = fork();
@@ -443,11 +441,16 @@ func (c *cmdForkproxy) Run(cmd *cobra.Command, args []string) error {
 				return err
 			}
 
+		sAgain:
 			err = shared.AbstractUnixSendFd(forkproxyUDSSockFDNum, int(file.Fd()))
-			file.Close()
 			if err != nil {
+				errno, ok := shared.GetErrno(err)
+				if ok && (errno == syscall.EAGAIN) {
+					goto sAgain
+				}
 				break
 			}
+			file.Close()
 		}
 
 		syscall.Close(forkproxyUDSSockFDNum)
@@ -456,8 +459,14 @@ func (c *cmdForkproxy) Run(cmd *cobra.Command, args []string) error {
 
 	files := []*os.File{}
 	for range lAddr.addr {
+	rAgain:
 		f, err := shared.AbstractUnixReceiveFd(forkproxyUDSSockFDNum)
 		if err != nil {
+			errno, ok := shared.GetErrno(err)
+			if ok && (errno == syscall.EAGAIN) {
+				goto rAgain
+			}
+
 			fmt.Printf("Failed to receive fd from listener process: %v\n", err)
 			syscall.Close(forkproxyUDSSockFDNum)
 			return err


More information about the lxc-devel mailing list