[lxc-devel] [lxd/master] probably fix a concurrent websocket write

tych0 on Github lxc-bot at linuxcontainers.org
Thu Jun 2 19:50:39 UTC 2016


A non-text attachment was scrubbed...
Name: not available
Type: text/x-mailbox
Size: 2350 bytes
Desc: not available
URL: <http://lists.linuxcontainers.org/pipermail/lxc-devel/attachments/20160602/214ca06e/attachment.bin>
-------------- next part --------------
From 38c7a38f12e0f2928d7bcb74a20410d09d23cd67 Mon Sep 17 00:00:00 2001
From: Tycho Andersen <tycho.andersen at canonical.com>
Date: Wed, 1 Jun 2016 21:26:02 +0000
Subject: [PATCH] probably fix a concurrent websocket write

Consider,

t=2016-05-31T20:49:07+0000 lvl=dbug msg=MigrationSink driver=storage/dir live=false container=nonlive objects=[nonlive/snap0]
t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier"
t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream"
t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier"
t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream"
panic: concurrent write to websocket connection

goroutine 560 [running]:
panic(0x9b9da0, 0xc42012a000)
	/lxd/build/tmp.xTgWRy05x1/go/golang/src/runtime/panic.go:500 +0x1a1
github.com/gorilla/websocket.(*Conn).flushFrame(0xc42000bb00, 0xc42034e601, 0x0, 0x0, 0x0, 0xc42034e778, 0xc42046c001)
	/lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:481 +0x414
github.com/gorilla/websocket.(*Conn).NextWriter(0xc42000bb00, 0x2, 0xc42034e778, 0x1044e01, 0xc42028aa20, 0x0)
	/lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:407 +0x1a1
github.com/lxc/lxd/shared.WebsocketMirror.func2(0xc420147180, 0xc42000bb00, 0x10482e0, 0xc42028aa20)
	/lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:257 +0x125
created by github.com/lxc/lxd/shared.WebsocketMirror
	/lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:274 +0xe2
error: Get https://127.0.0.1:22441/1.0/operations/eb592bbb-aa8c-4a0f-a57f-cac9285fedba/wait: Unable to connect to: 127.0.0.1:22441

One way this could happen is that the goroute that is sending the snapshot
signals readDone, and then the storage driver starts sending the next bits over
the websocket before the message barrier is actually sent. This seems really
unlikley to happen since the goroutine sending the message needs to be starved
for a long time, but given that we've only seen this stack trace once in the
last few months it seems quite rare.

There may be other ways for this to occur, but this is the only one I can see
right now.

Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
---
 shared/network.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/shared/network.go b/shared/network.go
index cfd2017..9ad35e4 100644
--- a/shared/network.go
+++ b/shared/network.go
@@ -275,10 +275,10 @@ func WebsocketMirror(conn *websocket.Conn, w io.WriteCloser, r io.ReadCloser) (c
 		for {
 			buf, ok := <-in
 			if !ok {
-				readDone <- true
 				r.Close()
 				Debugf("sending write barrier")
 				conn.WriteMessage(websocket.TextMessage, []byte{})
+				readDone <- true
 				return
 			}
 			w, err := conn.NextWriter(websocket.BinaryMessage)


More information about the lxc-devel mailing list