[lxc-users] lxc-ls -f problem

Serge Hallyn serge.hallyn at ubuntu.com
Mon Jun 1 19:41:06 UTC 2015


Quoting david.andel at bli.uzh.ch (david.andel at bli.uzh.ch):
> Now attached the output of 
> strace -f -ostrace.out -- lxc-ls -f

Hm, that gives us

6886  connect(4, {sa_family=AF_LOCAL, sun_path=@"/home/david/.local/share/lxc/s0_RStSh/command"}, 48) = 0
6886  getuid()                          = 1000
6886  getgid()                          = 1000
6886  sendmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=6886, uid=1000, gid=1000}}, msg_flags=0}, MSG_NOSIGNAL) = 16
6886  recvmsg(4, 0x7ffe4a8ed1d0, 0)     = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6886  --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL, si_value={int=1, ptr=0x1}} ---

> strace -f -ostrace-start.out -- lxc-start -n s0_RStSh

and

6927  connect(7, {sa_family=AF_LOCAL, sun_path=@"/home/david/.local/share/lxc/s0_RStSh/command"}, 48) = 0
6927  getuid()                          = 1000
6927  getgid()                          = 1000
6927  sendmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=6927, uid=1000, gid=1000}}, msg_flags=0}, MSG_NOSIGNAL) = 16
6927  recvmsg(7,  <unfinished ...>
6926  <... read resumed> 0x7fff75794a2c, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6927  <... recvmsg resumed> 0x7fff75794780, 0) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
6926  --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---

In both cases the command socket appears to be bad: connect() succeeds, but the monitor on the other end never answers, so recvmsg() just blocks until you interrupt it.
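
If you want to poke at that socket directly, here is a quick test
program I sketched up (not part of LXC; the abstract socket path and
the 16-byte request are copied straight from your trace, and the
2-second timeout is arbitrary).  If connect() fails, nothing is
listening at all; if the read times out, the monitor is there but
wedged:

/* probe-command-sock.c  (gcc -o probe-command-sock probe-command-sock.c) */
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>

int main(void)
{
    /* The leading '@' strace prints means the socket lives in the
     * abstract namespace: sun_path starts with a NUL byte and nothing
     * exists on the filesystem. */
    const char *path = "/home/david/.local/share/lxc/s0_RStSh/command";
    unsigned char req[16] = { 4 };  /* request bytes seen in the trace */
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    struct timeval tv = { .tv_sec = 2 };  /* don't hang like lxc-ls did */
    char buf[64];
    ssize_t n;
    int fd;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    memcpy(&addr.sun_path[1], path, strlen(path)); /* [0] stays NUL */
    if (connect(fd, (struct sockaddr *)&addr,
                offsetof(struct sockaddr_un, sun_path) + 1 + strlen(path)) < 0) {
        perror("connect");  /* nothing listening on the command socket */
        return 1;
    }
    /* The real client also attaches SCM_CREDENTIALS; a healthy monitor
     * may reject this plain write, but it should answer instead of
     * leaving us blocked forever. */
    if (write(fd, req, sizeof(req)) != sizeof(req)) { perror("write"); return 1; }
    n = read(fd, buf, sizeof(buf));
    if (n < 0)
        perror("read");  /* timed out: monitor is not responding */
    else
        printf("got %zd bytes back, monitor is alive\n", n);
    close(fd);
    return 0;
}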

> lxc-start -n s0_RStSh -l trace -o debug.out
> 
> I was running these not as root this time but if that is required I will post those as well.
> 
> Interestingly, this happens only on a vivid instance running in a KVM guest.
> On three other vivid instances running on bare metal this does not happen.

Very interesting.  If you create a new KVM VM, does it happen there too, or
is that instance perhaps in a bad state somehow?
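
For what it's worth, the cgmanager "Invalid path" errors in your mail
from May 23 (quoted below) line up with the /proc/self/cgroup output
in that same mail: every rejected path is the cgmanager mount base
plus the controller plus the requester's own cgroup plus the container
group.  Roughly like this (purely illustrative; the composition rule
is my reading of the log, not cgmanager's actual code):

/* cgpath.c: reconstruct one of the paths cgmanager rejects below */
#include <stdio.h>

int main(void)
{
    const char *base = "/run/cgmanager/fs";
    const char *controller = "hugetlb";
    /* the requester's cgroup, taken from /proc/self/cgroup */
    const char *caller = "system.slice/ssh.service";
    const char *requested = "lxc/s0_nginx";
    char path[256];

    snprintf(path, sizeof(path), "%s/%s/%s/%s",
             base, controller, caller, requested);
    /* Prints /run/cgmanager/fs/hugetlb/system.slice/ssh.service/lxc/s0_nginx,
     * exactly the path the log calls invalid -- presumably because no
     * lxc/s0_nginx group was ever created under the ssh.service slice. */
    printf("%s\n", path);
    return 0;
}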

> I am running the latest stable releases from the PPA, i.e. lxc 1.1.2-0ubuntu3.
> 
> Cheers,
> David
> 
> 
> -----"lxc-users" <lxc-users-bounces at lists.linuxcontainers.org> wrote: -----
> To: LXC users mailing-list <lxc-users at lists.linuxcontainers.org>
> From: david.andel at bli.uzh.ch
> Sent by: "lxc-users" 
> Date: 05/23/2015 20:47
> Subject: Re: [lxc-users] lxc-ls -f problem
> 
> Hi
> 
> I have the exact same problem after yesterday's update.
> 
> And I suspect it is bug https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1413927 or at least closely related.
> 
> root at andel2:~# cat /proc/self/cgroup
> 10:devices:/system.slice/ssh.service
> 9:perf_event:/system.slice/ssh.service
> 8:cpuset:/system.slice/ssh.service
> 7:cpu,cpuacct:/system.slice/ssh.service
> 6:memory:/system.slice/ssh.service
> 5:freezer:/system.slice/ssh.service
> 4:net_cls,net_prio:/system.slice/ssh.service
> 3:hugetlb:/system.slice/ssh.service
> 2:blkio:/system.slice/ssh.service
> 1:name=systemd:/system.slice/ssh.service
> 
> root at andel2:~# service cgmanager status
> ● cgmanager.service - Cgroup management daemon
>    Loaded: loaded (/lib/systemd/system/cgmanager.service; disabled; vendor preset: enabled)
>    Active: active (running) since Sat 2015-05-23 15:48:07 CEST; 30min ago
>  Main PID: 2994 (cgmanager)
>    Memory: 296.0K
>    CGroup: /system.slice/cgmanager.service
>            ‣ 2994 /sbin/cgmanager -m name=systemd
> 
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager: Invalid path /run/cgmanager/fs/hugetlb/system.slice/ssh.service/lxc/s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/hugetlb/system.slice/ssh.servi...s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager: Invalid path /run/cgmanager/fs/memory/system.slice/ssh.service/lxc/s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/memory/system.slice/ssh.servic...s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager: Invalid path /run/cgmanager/fs/net_cls/system.slice/ssh.service/lxc/s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/net_cls/system.slice/ssh.servi...s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager: Invalid path /run/cgmanager/fs/perf_event/system.slice/ssh.service/lxc/s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/perf_event/system.slice/ssh.se...s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager: Invalid path /run/cgmanager/fs/none,name=systemd/system.slice/ssh.service/lxc/s0_nginx
> May 23 15:48:15 andel2 cgmanager[2994]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/none,name=systemd/system.slice...s0_nginx
> Hint: Some lines were ellipsized, use -l to show in full.
> 
> The unprivileged containers could be stopped, but trying to stop a running privileged container hung and blocked the host completely.
> Even a reboot is not possible: the host answers only to ping requests, and ssh returns "Write failed: Broken pipe".
> And since the machine is geographically distant (and it's the weekend, as usual when such things happen), I cannot provide the results of the commands below.
> 
> But I will probably run into the same error on other machines, and then I will provide the results.
> 
> David
> 
> 
> -----"lxc-users" <lxc-users-bounces at lists.linuxcontainers.org> wrote: -----
> To: LXC users mailing-list <lxc-users at lists.linuxcontainers.org>
> From: Serge Hallyn 
> Sent by: "lxc-users" 
> Date: 05/22/2015 17:44
> Subject: Re: [lxc-users] lxc-ls -f problem
> 
> Quoting Dave Birch (dave.birch at gmail.com):
> > Dave Birch <dave.birch at ...> writes:
> > 
> > Further update - just discovered that lxc-start now hangs for all 
> > containers, even newly created ones using only the standard download 
> > template on lxc-create.
> > 
> > I'm pretty much dead in the water until I can work out how to resolve 
> > this.
> 
> Can you attach the results of
> 
> sudo strace -f -ostrace.out -- lxc-ls -f
> sudo strace -f -ostrace-start.out -- lxc-start -n <container>
> sudo lxc-start -n <container> -l trace -o debug.out
> 
> and show the exact steps from when you originally created these
> containers, if you remember them or have them in your shell history?
> _______________________________________________
> lxc-users mailing list
> lxc-users at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users 


