[Lxc-users] How to make a container init DIE after finishing runlevel 0

Tony Risinger sweetsinsemilla at gmail.com
Mon Jan 25 21:09:29 UTC 2010


ah ok.  i think i remember reading that message about the shutdown
issues.  also, in my last message i mixed up SIGINT and SIGPWR; with
the inittab i'm using, SIGPWR to the container enters runlevel 0, and
SIGINT enters runlevel 6 and then immediately goes back to 3 (a reboot).

i managed to create a workaround that will suffice for the time being
(wish i'd done this sooner):

ps --no-headers -C init -o pid | \
while read x; do
    [ ${x} -eq 1 ] && continue
    [ $(ps --no-headers --ppid ${x} | wc -l) -eq 0 ] && kill -9 ${x}
done

this will scan for any process named init, except the real init, and
check how many children it has.  if the child count is zero, it gets
sent a SIGKILL and the container's life ends properly.

you could also add a check against /proc/${x}/cgroup to match
container names.
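
for example, something like this might work (untested sketch; it assumes
the container's cgroup shows up as ".../<name>" in /proc/<pid>/cgroup,
which depends on how the cgroup hierarchy is mounted):

name=web1   # hypothetical container name
ps --no-headers -C init -o pid | \
while read x; do
    [ ${x} -eq 1 ] && continue
    # skip inits that do not belong to this container's cgroup
    grep -q "/${name}\$" /proc/${x}/cgroup || continue
    # no children left: kill the lingering init
    [ $(ps --no-headers --ppid ${x} | wc -l) -eq 0 ] && kill -9 ${x}
done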

thanks

On Mon, Jan 25, 2010 at 2:50 PM, Daniel Lezcano <daniel.lezcano at free.fr> wrote:
> Tony Risinger wrote:
>>
>> hello,
>>
>> over the past several weeks i have been working intensively on setting
>> up my personal servers with LXC based containers.  at this point, i am
>> extremely pleased with my setup, and will soon be sharing all that i
>> have accomplished in the form of several scripts and implemented
>> configuration ideas.  i am using LXC in conjunction with BTRFS; all of
>> my containers are populated inside BTRFS subvolumes/snapshots, and i
>> have to say this is _very_ slick.  i can create several containers in
>> a matter of seconds, all sharing the base install (COW of course).
>> this allows me to create a "nano" template, fork it at the filesystem
>> level, update it to a "base" template via a package manager, and
>> repeat this as many times as i wish (LAMP/etc).  i then fork an
>> appropriate template into a usable domain, again at the FS level, and
>> run an LXC container inside it.
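>>
>> roughly, the workflow looks like this (the paths and the "web1" name
>> are only illustrative):
>>
>> btrfs subvolume create /vps/tpl/nano          # minimal "nano" template
>> # ...populate nano with a bare-bones install...
>> btrfs subvolume snapshot /vps/tpl/nano /vps/tpl/base
>> # ...update base via the package manager into a fuller template...
>> btrfs subvolume snapshot /vps/tpl/base /vps/dom/web1/rootfs
>> lxc-start -n web1                             # run the container on the COW copy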
>>
>
> Right, I tried this kind of configuration and the truth is that btrfs is
> *very* useful for consolidating the rootfs of several containers.
>
>> anyways, that is all working extremely well; all my build/run/manage
>> scripts are complete.  i am however experiencing one nuisance that
>> works against the elegance of it all... how to convince init to die
>> once it enters runlevel 0 and all other processes are dead.
>>
>
> You can't. This is something we are trying to solve right now.
>
> Dietmar Maurer posted, some months ago, a process forker running in the
> container which keeps a unix socket open to the supervisor process
> (lxc-start).
> It allows entering the container, or executing a command inside the
> container from the outside, via commands sent over the socket.
> During the discussion about the shutdown problem, Dietmar pointed out that
> when the container shuts down, all the processes exit the container,
> including the "forker".  So we should be able to detect that through the
> af_unix socket closing and then kill the container from lxc-start.
>
> I don't know if it's the right way to handle the shutdown (it would be
> nice to kill and then start the container again on reboot), but I think
> Dietmar's idea is a reasonable workaround, acceptable to keep until we
> find something better.
>
>> i swear i have tried/considered just about everything... playing with
>> /etc/powersave, /etc/initscript, powerfail/SIGPWR, replacing
>> /sbin/init with /bin/true and calling init U, named pipes from host to
>> container, read-only bind mounts of a folder with a named pipe to
>> trigger something in the host, kill -9 1 in inittab itself, writing a
>> custom init in bash, maybe using something other than init like
>> upstart (?), and probably several other things that i've forgotten...
>> but they all feel kludgey and complicated.
>>
>> init simply refuses to die unless it's issued a SIGKILL from the host,
>> and that's super inconvenient :-(.  i know pid 1 has special properties,
>> but i hoped there would be a nice way to address the fact that it's not
>> _really_ pid one... it just thinks it is.
>>
>> this is what i have in the containers' /etc/inittab right now; mostly
>> pretty good i think:
>>
>> id:3:initdefault:
>> rc::sysinit:/etc/rc.sysinit
>> rs:S1:wait:/etc/rc.single
>> rm:2345:wait:/etc/rc.multi
>> rh:06:wait:/etc/rc.shutdown
>> su:S:wait:/sbin/sulogin -p
>> c1:2345:respawn:/sbin/agetty -n -l /bin/autologin -8 38400 tty1 linux
>> rb:6:once:/sbin/init 3
>> kl:0:once:/bin/touch /dev/hostctl
>> p6::ctrlaltdel:/sbin/init 6
>> p0::powerfail:/sbin/init 0
>>
>> this lets me reboot the container from the inside correctly, or from
>> the host with a SIGPWR, or "shutdown" with a SIGINT from the host.
>> the autologin binary lets the host log in as root no matter what.  this
>> next line is my latest/final attempt at managing these "zombie"
>> containers once they enter runlevel 0:
>>
>> kl:0:once:/bin/touch /dev/hostctl
>>
>> on the host i basically run a "timeout 5m cat
>> /vps/dom/<DOM>/rootfs/dev/hostctl" for each dom, and monitor the
>> return code from that process.  the cat command will block until
>> "touched" by init at the end of its life.  at that point i mercilessly
>> SIGKILL the container init.  the other option i'm considering is a
>> cronjob that loops through "running" containers, does an lxc-ps on them,
>> and, if only one process is running, assumes it's init and SIGKILLs the
>> pesky bugger; this is probably the easier way.
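>>
>> the per-dom watcher is roughly this (a sketch only; the fifo at
>> /dev/hostctl, the 5 minute timeout, and the /cgroup mount point are
>> assumptions of my setup):
>>
>> dom=web1   # hypothetical container name
>> # blocks on the fifo until init "touches" it at the end of runlevel 0,
>> # or gives up after 5 minutes (GNU timeout exits 124 in that case)
>> if timeout 5m cat /vps/dom/${dom}/rootfs/dev/hostctl >/dev/null; then
>>     # only init should be left in the container's cgroup by now;
>>     # kill whatever remains
>>     kill -9 $(cat /cgroup/${dom}/tasks)
>> fi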
>>
>> apologies for the length, but how is everyone else handling this?
>> this is the last thing i need to solve before i actually start running
>> all my services on this setup.
>>
>
> I was wondering if the kernel shouldn't send a signal to the init's parent
> when sys_reboot is called.
>
>
