[lxc-users] Using valgrind with lxc
Vallevand, Mark K
Mark.Vallevand at UNISYS.com
Thu Oct 2 12:43:32 UTC 2014
I called __lxc_start() from my test program, which tries to recreate the issue we see when running valgrind against our application.
Our application, running on Ubuntu 12.04 LTS, is a complicated control program that has a memory leak or memory corruption.
The program manages multiple containers which are running a mature program that doesn't need any valgrinding. The
program does a fork() and in the child process the lxc library is called to start the mature program in a container
using lxc_start(). My test program is a very simple thing that just does an lxc_start() or __lxc_start() against an
existing container. (__lxc_start() and some supporting code were copied into my test program and compiled there.
It was simpler, but it's still __lxc_start().)
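The structure described above can be sketched roughly as follows. This is only an illustration, not the actual application or test program: `run_in_child`, `start_fn`, and the two `fake_start_*` functions are hypothetical names, and the real call into liblxc is replaced by a function pointer so the sketch builds without the library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the call that starts the container,
 * e.g. a thin wrapper around lxc_start() or __lxc_start(). */
typedef int (*start_fn)(const char *name);

/*
 * Fork, run start(name) in the child, and wait for it in the parent.
 * Returns the child's exit status (0 on success, 1 if start() failed),
 * or -1 if fork/waitpid failed or the child was killed by a signal.
 */
static int run_in_child(start_fn start, const char *name)
{
    assert(start != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: this is where the lxc library would be called. */
        int rc = start(name);
        _exit(rc == 0 ? 0 : 1);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Placeholder start functions so the sketch runs without liblxc. */
static int fake_start_ok(const char *name)   { (void)name; return 0; }
static int fake_start_fail(const char *name) { (void)name; return -1; }
```

Calling `run_in_child(fake_start_ok, "vdr1")` exercises the fork/wait path; under valgrind, the child created here is still running inside valgrind, which is where the trouble starts.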
Regards.
Mark K Vallevand
"If there are no dogs in Heaven, then when I die I want to go where they went."
-Will Rogers
THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
-----Original Message-----
From: lxc-users [mailto:lxc-users-bounces at lists.linuxcontainers.org] On Behalf Of Serge Hallyn
Sent: Wednesday, October 01, 2014 04:59 PM
To: LXC users mailing-list
Cc: valgrind-users at lists.sourceforge.net
Subject: Re: [lxc-users] Using valgrind with lxc
You called __lxc_start() from where, how?
Quoting Vallevand, Mark K (Mark.Vallevand at UNISYS.com):
> I did this by calling __lxc_start(). So, lxc_check_inherited() didn't get called. That was this:
> > If I call __lxc_start() rather than lxc_start(), I see this:
> > vdr1: sync wake failure : Broken pipe
> > vdr1: failed to spawn 'vdr1'
> > And, just before that there is some complaining from valgrind:
> > ==25086== Syscall param clone(child_tidptr) contains uninitialised byte(s)
lxc uses the libc clone wrappers and does not pass in a tidptr...
> > ==25086== at 0x56622E1: clone (clone.S:84)
> > ==25086== by 0x4E3BD38: __lxc_start (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==25086== by 0x4014C9: vgVdrStartClone (vgVdrTest.c:88)
> > ==25086== by 0x400F0A: main (vgVdrTest.c:337)
> > ==25086==
> > ==1== Syscall param wait4(status) points to unaddressable byte(s)
> > ==1== at 0x53607C4: wait (wait.c:32)
> > ==1== by 0x4E3A400: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
> > ==1== Address 0xffffffffffffffd4 is not stack'd, malloc'd or (recently) free'd
Would help to see file:line in the lxc code (as would using
a newer lxc :)
> > ==1== Invalid write of size 4
> > ==1== at 0x4E3A4FF: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
> > ==1== Address 0xffffffffffffffc0 is not stack'd, malloc'd or (recently) free'd
> > ==1==
> > ==1==
> > ==1== Process terminating with default action of signal 11 (SIGSEGV)
> > ==1== Access not within mapped region at address 0xFFFFFFFFFFFFFFC0
> > ==1== at 0x4E3A4FF: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
>
>
> Regards.
> Mark K Vallevand
>
> -----Original Message-----
> From: lxc-users [mailto:lxc-users-bounces at lists.linuxcontainers.org] On Behalf Of Serge Hallyn
> Sent: Wednesday, October 01, 2014 04:18 PM
> To: LXC users mailing-list
> Cc: valgrind-users at lists.sourceforge.net
> Subject: Re: [lxc-users] Using valgrind with lxc
>
> Hi,
>
> For the sake of testing I'd go ahead and just 'return 0' at the
> top of lxc_check_inherited.
>
> We can talk about adding an option to do this, i.e.
> lxc.close_all_fds = -1 maybe. It's a very rare case where
> that should be done, though.
>
> -serge
>
> Quoting Vallevand, Mark K (Mark.Vallevand at UNISYS.com):
> > Valgrind meet containers.
> > Containers meet valgrind.
> >
> > I've found what lxc doesn't like when running valgrind.
> >
> > The lxc_start() checks to see if there are extra file descriptors open and won't call __lxc_start().
> > vdr1: inherited fd 1024 on /home/vallevand/trunk_s4m/s4m-appliance/src/vdrd/vgVdrTest
> > vdr1: inherited fd 1025 on /tmp/valgrind_proc_24989_cmdline_4fbfb9a5 (deleted)VdrTest
> > vdr1: inherited fd 1026 on /dev/pts/1ind_proc_24989_cmdline_4fbfb9a5 (deleted)VdrTest
> > vdr1: inherited fd 1027 on pipe:[768863]_proc_24989_cmdline_4fbfb9a5 (deleted)VdrTest
> > vdr1: inherited fd 1028 on pipe:[768863]_proc_24989_cmdline_4fbfb9a5 (deleted)VdrTest
> >
> > vdr1 is the name of my container. All those open files in the child process are related to valgrind.
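For reference, a check of this kind can be sketched by walking /proc/self/fd and resolving each entry with readlink(). This is an assumption about the general approach, not a copy of the lxc source; `report_inherited_fds` is a hypothetical name. One detail worth noting is that readlink() does not NUL-terminate its buffer, so the caller has to do it. A missing terminator might explain the trailing fragments in the log lines above, though that is a guess.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * List every open descriptor >= min_fd (Linux only, needs /proc).
 * Prints one line per descriptor to stderr and returns how many were
 * found, or -1 if /proc/self/fd cannot be opened.
 */
static int report_inherited_fds(const char *name, int min_fd)
{
    assert(name != NULL);

    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL) {
        perror("opendir");
        return -1;
    }

    int self = dirfd(dir);   /* the descriptor used for this scan */
    int count = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue;        /* skip "." and ".." */

        int fd = atoi(ent->d_name);
        if (fd < min_fd || fd == self)
            continue;

        char path[64];
        char target[PATH_MAX];
        snprintf(path, sizeof path, "/proc/self/fd/%d", fd);

        /* readlink() does not append '\0'; leave room and add it. */
        ssize_t n = readlink(path, target, sizeof target - 1);
        if (n < 0)
            continue;
        target[n] = '\0';

        fprintf(stderr, "%s: inherited fd %d on %s\n", name, fd, target);
        count++;
    }

    closedir(dir);
    return count;
}
```

Calling `report_inherited_fds("vdr1", 3)` in the child just before starting the container shows what a check like this would object to.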
> >
> > If I call __lxc_start() rather than lxc_start(), I see this:
> > vdr1: sync wake failure : Broken pipe
> > vdr1: failed to spawn 'vdr1'
> > And, just before that there is some complaining from valgrind:
> > ==25086== Syscall param clone(child_tidptr) contains uninitialised byte(s)
> > ==25086== at 0x56622E1: clone (clone.S:84)
> > ==25086== by 0x4E3BD38: __lxc_start (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==25086== by 0x4014C9: vgVdrStartClone (vgVdrTest.c:88)
> > ==25086== by 0x400F0A: main (vgVdrTest.c:337)
> > ==25086==
> > ==1== Syscall param wait4(status) points to unaddressable byte(s)
> > ==1== at 0x53607C4: wait (wait.c:32)
> > ==1== by 0x4E3A400: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
> > ==1== Address 0xffffffffffffffd4 is not stack'd, malloc'd or (recently) free'd
> > ==1==
> > ==1== Invalid write of size 4
> > ==1== at 0x4E3A4FF: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
> > ==1== Address 0xffffffffffffffc0 is not stack'd, malloc'd or (recently) free'd
> > ==1==
> > ==1==
> > ==1== Process terminating with default action of signal 11 (SIGSEGV)
> > ==1== Access not within mapped region at address 0xFFFFFFFFFFFFFFC0
> > ==1== at 0x4E3A4FF: ??? (in /usr/lib/lxc/liblxc.so.0.7.5)
> > ==1== by 0x566231C: clone (clone.S:112)
> >
> > Our program is designed to close all open file descriptors in the child process before calling lxc_start(). That code can try to close all file descriptors to make sure something doesn't sneak through. However, closing the file descriptors associated with valgrind does not work. I get errno=0 Bad File Descriptor. Valgrind really has them held open. I am running as root in all these tests.
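A close-everything loop of the kind described can be sketched as below. It is a hedged illustration, not the application's code: `close_fd_range` is a hypothetical helper, and it assumes Linux, where a descriptor that is still listed under /proc/self/fd after close() has failed is one the process could not release.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Try to close every descriptor in [first, last].
 * Returns how many were closed. *stuck is set to the number of
 * descriptors for which close() failed although /proc/self/fd
 * still lists them as open.
 */
static int close_fd_range(int first, int last, int *stuck)
{
    assert(first >= 0 && first <= last && stuck != NULL);

    int closed = 0;
    *stuck = 0;

    for (int fd = first; fd <= last; fd++) {
        if (close(fd) == 0) {
            closed++;
            continue;
        }

        int err = errno;     /* save before other calls change it */
        char path[64];
        struct stat st;
        snprintf(path, sizeof path, "/proc/self/fd/%d", fd);

        /* EBADF normally means "not open"; if the entry still exists,
         * something else is holding the descriptor open. */
        if (lstat(path, &st) == 0) {
            fprintf(stderr, "fd %d: close failed (%s) but it is still open\n",
                    fd, strerror(err));
            (*stuck)++;
        }
    }
    return closed;
}
```

In a real program the range would typically run from 3 up to `sysconf(_SC_OPEN_MAX) - 1`. A nonzero `stuck` count under valgrind, and zero without it, would match the behavior described above.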
> >
> > I've also reproduced the problem using the 'lxc-' programs. If you do something like 'lxc-create -n XXX' and then something like 'valgrind lxc-start -n XXX -- ls', you'll see it, or at least the variant of the error involving open file descriptors.
> >
> > My hopes aren't high, but any ideas are very welcome.
> >
> > Regards.
> > Mark K Vallevand
> >
> > From: lxc-users [mailto:lxc-users-bounces at lists.linuxcontainers.org] On Behalf Of Vallevand, Mark K
> > Sent: Thursday, September 25, 2014 09:19 AM
> > To: lxc-users at lists.linuxcontainers.org
> > Subject: [lxc-users] Using valgrind with lxc
> >
> > In our program, we do a fork() and in the child process the lxc library is called to start a program in a container using lxc_start().
> >
> > We don't care about valgrind in the child process. You can disable valgrind messages from child processes, but you cannot detach valgrind unless you exec() a new binary on top. However, valgrind and lxc do not play nicely, at least with the versions in Ubuntu 12.04 LTS. I'm getting an error back from lxc_start(). I'm having trouble getting logs to see why it's failing, so I don't know exactly what's failing yet.
> >
> > But, I'm looking for any ideas for getting valgrind to work with programs that use lxc_start().
> > Any suggestions will be welcome. And, thanks!
> >
> >
> > Regards.
> > Mark K Vallevand
>
> > _______________________________________________
> > lxc-users mailing list
> > lxc-users at lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-users
_______________________________________________
lxc-users mailing list
lxc-users at lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users