[Lxc-users] On clean shutdown of Ubuntu 10.04 containers

Mon Dec 6 17:38:16 UTC 2010

On 12/6/2010 2:42 AM, Trent W. Buck wrote:
> This post describes my attempts to get "clean" shutdown of Ubuntu 10.04
> containers.  The goal here is that a "shutdown -h now" of the dom0
> should not result in a potentially inconsistent domU postgres database,
> cf. a naive lxc-stop.
>
> As at Ubuntu 10.04 with lxc 0.7.2, lxc-start detects that a container
> has halted by 1) seeing a reboot event in <container>/var/run/utmp; or
> 2) seeing <container>'s PID 1 terminate.
>
> Ubuntu 10.04 simply REQUIRES /var/run to be a tmpfs; this is hard-coded
> into mountall's (upstart's) /lib/init/fstab.  Without it, the most
> immediate issue is that /var/run/ifstate isn't reaped on reboot, ifup(8)
> thinks lo (at least) is already configured, and the boot process hangs
> waiting for the network.
>
> Unfortunately, lxc 0.7's utmp detect requires /var/run to NOT be a
> tmpfs.  The shipped lxc-ubuntu script works around this by deleting the
> ifstate file and not mounting a tmpfs on /var/run, but to me that is
> simply waiting for something else to assume /var/run is empty.  It also
> doesn't cope with a mountall upgrade rewriting /lib/init/fstab.
>
> More or less by accident, I discovered that I can tell lxc-start that
> the container is ready to halt by "crashing" upstart:
>
>      container# kill -SEGV 1
>
> Likewise I can spoof a ctrl-alt-delete event in the container with:
>
>      dom0# pkill -INT lxc-start
>
> I automate the former signalling at the end of shutdowns thusly:
>
>      chroot $template_dir dpkg-divert --quiet --rename /sbin/reboot
>      chroot $template_dir tee >/dev/null /sbin/reboot <<-EOF
>      	#!/bin/bash
>      	while getopts nwdfiph opt
>      	do [[ f = \$opt ]] && exec kill -SEGV 1
>      	done
>      	exec -a "\$0" "\$0.distrib" "\$@"
>      	EOF
>      chroot $template_dir chmod +x /sbin/reboot
>      chroot $template_dir ln -s reboot.distrib /sbin/halt.distrib
>      chroot $template_dir ln -s reboot.distrib /sbin/poweroff.distrib
>
> I use the latter in my customized /etc/init.d/lxc stop rule.
> Note that the lxc-wait's SHOULD be parallelized, but this is not
> possible as at lxc 0.7.2 :-(

Sure it is.
I parallelize the shutdowns (in any version, including 0.7.2) by running 
all the lxc-stop commands in parallel without looking or waiting, then in 
a separate following step looping until no containers are running.
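
To make the two steps concrete, here is a minimal sketch, not the code from 
my init script. The function names, the STOP_CMD/INFO_CMD overrides, the 
one-second poll and the timeout are all assumptions made for illustration; 
the only thing it relies on is that lxc-info prints the "is RUNNING" text 
shown in the transcript further down.

```shell
#!/bin/sh
# Sketch only: stop containers in parallel, then wait until none are running.
# STOP_CMD and INFO_CMD are hypothetical overrides so the logic can be
# exercised without lxc installed; by default they are the real tools.
: "${STOP_CMD:=lxc-stop}"
: "${INFO_CMD:=lxc-info}"

# Step 1: fire off one stop per container in the background, waiting on none.
stop_all() {
    for name in "$@"; do
        $STOP_CMD -n "$name" >/dev/null 2>&1 &
    done
}

# Succeeds if at least one of the named containers still reports RUNNING.
any_running() {
    for name in "$@"; do
        if $INFO_CMD -n "$name" 2>/dev/null | grep -q RUNNING; then
            return 0
        fi
    done
    return 1
}

# Step 2: poll once per second until nothing is running or the timeout
# (in seconds, first argument) runs out.
wait_all_stopped() {
    timeout=$1
    shift
    while any_running "$@"; do
        if [ "$timeout" -le 0 ]; then
            return 1
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 0
}
```

Something like "stop_all vps001 vps002; wait_all_stopped 60 vps001 vps002" 
then takes about as long as the slowest container rather than the sum of 
all of them.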

Here is my openSUSE init.d/lxc:
https://build.opensuse.org/package/files?package=lxc&project=home:aljex
And the packages:
http://download.opensuse.org/repositories/home:/aljex/*/lxc-0.7.2*.rpm

It makes assumptions that are wrong for Ubuntu and is more limited than 
you may want in terms of what it even tries to handle. But that's beside 
the point of parallel shutdowns.

* cgroup handling includes a particular stack of override logic for 
possible cgroup mount points that makes sense to me.
- start with built-in default /var/run/lxc/cgroup, and name it "lxc" so 
as not to conflict with any other cgroup setup by default.
- if you defined something in $LXC_CONF, prefer it over default
- if kernel is providing /sys/fs/cgroup automatically, prefer that over 
either default or $LXC_CONF
- if a cgroup named "lxc" is already mounted, prefer that over all else

* assumes lxc 0.7.2 because the script is part of a lxc-0.7.2 rpm
- removes the shutdown/reboot watchdog functions that were needed in 
0.6.5 but are built in to 0.7.2 now.

* only starts containers that are defined by $LXC_ETC/*/config

* only shuts down containers that it started

* the stop function greps for /sbin/init in container inittab instead of 
trying to allow for any random container pid #1

* no provision for application/service containers, just whole systems 
started with /sbin/init

* starts containers in screen
- I have not figured out what it would take to get nice behavior out of 
lxc-console yet and screen is both easy and standard.
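
As a rough illustration of "only starts containers that are defined by 
$LXC_ETC/*/config" and "starts containers in screen", something along these 
lines would do it. This is a sketch and not the attached script: the 
function names are made up, and the /etc/lxc default for LXC_ETC is an 
assumption.

```shell
#!/bin/sh
# Sketch only. Assumption: LXC_ETC holds one directory per container,
# each containing a file named "config". The default path is a guess.
: "${LXC_ETC:=/etc/lxc}"

# Print the name of every container that has $LXC_ETC/<name>/config.
list_defined() {
    for cfg in "$LXC_ETC"/*/config; do
        [ -f "$cfg" ] || continue      # the glob matched nothing
        basename "$(dirname "$cfg")"
    done
}

# Start each defined container in its own detached screen session, named
# after the container so that "screen -r <name>" attaches to its console.
start_defined() {
    list_defined | while read -r name; do
        screen -dmS "$name" lxc-start -n "$name" -f "$LXC_ETC/$name/config"
    done
}
```

Because each screen session detaches immediately, starting every container 
returns almost at once, and directories without a config file are simply 
never touched.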

The $LXC_CONF (/etc/lxc/lxc.conf) referenced at the top of the script 
usually does not exist, so everything that happens is visible right in the 
script.
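
For the cgroup mount point, the override order listed earlier boils down to 
"later checks win". A sketch of just that precedence follows. The function 
name and the LXC_CGROUP_DIR variable (standing in for whatever $LXC_CONF 
would set) are hypothetical, and the two optional arguments exist only so 
the logic can be tried against a fake mount table.

```shell
#!/bin/sh
# Sketch of the precedence order only; each later check overrides the
# earlier ones. LXC_CGROUP_DIR is a hypothetical name for a value that
# would come from $LXC_CONF if that file existed.
pick_cgroup_dir() {
    mounts=${1:-/proc/mounts}       # mount table to inspect
    sysdir=${2:-/sys/fs/cgroup}     # kernel-provided mount point, if present

    # 1. built-in default
    dir=/var/run/lxc/cgroup

    # 2. a value from $LXC_CONF beats the default
    if [ -n "${LXC_CGROUP_DIR:-}" ]; then
        dir=$LXC_CGROUP_DIR
    fi

    # 3. a kernel-provided directory beats both of the above
    if [ -d "$sysdir" ]; then
        dir=$sysdir
    fi

    # 4. a cgroup filesystem already mounted under the name "lxc" beats all
    mounted=$(awk '$1 == "lxc" && $3 == "cgroup" { print $2; exit }' "$mounts")
    if [ -n "$mounted" ]; then
        dir=$mounted
    fi

    echo "$dir"
}
```

Called with no arguments it reads /proc/mounts and checks /sys/fs/cgroup, 
so on a host where nothing is configured and the kernel provides no such 
directory it prints /var/run/lxc/cgroup.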

I'm using this in production. So far so good.

typical usage:

nj10:~ # rclxc status
Checking for LXC containers... 
                                                  running
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is RUNNING
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # rclxc stop vps008
Shutting down LXC containers... 
                                                  done
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is STOPPED
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # rclxc status
Checking for LXC containers... 
                                                  running
nj10:~ # rclxc stop
Shutting down LXC containers... 
                                                  done
nj10:~ # rclxc status
Checking for LXC containers... 
                                                  unused
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is STOPPED
'vps002' is STOPPED
'vps003' is STOPPED
'vps004' is STOPPED
'vps005' is STOPPED
'vps006' is STOPPED
'vps007' is STOPPED
'vps008' is STOPPED
'vps009' is STOPPED
'vps011' is STOPPED
'vps012' is STOPPED
'vps013' is STOPPED
nj10:~ # time rclxc start
Starting LXC containers... 
                                                  done

real    0m0.242s
user    0m0.012s
sys     0m0.000s
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is RUNNING
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # screen -r vps013

INIT: version 2.88 booting
INIT: Entering runlevel: 3
blogd: can not set console device to /dev/pts/34: Device or resource busy
Master Resource Control: previous runlevel: N, switching to runlevel:3
Initializing random number generator                                 done
Starting syslog services                                             done
Starting D-Bus daemon                                                done
No keyboard map to load
Loading compose table winkeys shiftctrl latin1.add                   done
Stop Unicode mode                                                    done
Setting up (localfs) network interfaces:
     lo
     lo        IP address: 127.0.0.1/8
               IP address: 127.0.0.2/8
     lo                                                               done
     eth0
     eth0      IP address: 71.187.206.90/24
     eth0                                                             done
Setting up service (localfs) network  .  .  .  .  .  .  .  .  .  .   done
Starting SSH daemon                                                  done
Loading CPUFreq modules (CPUFreq not supported)
Starting HAL daemon                                                  done
Setting up (remotefs) network interfaces:
Setting up service (remotefs) network  .  .  .  .  .  .  .  .  .  .  done
Re-Starting syslog services                                          done
Starting auditd The audit system is disabled
                                                                      done
Starting incron                                                      done
Starting mail service (Postfix)                                      done
Starting CRON daemon                                                 done
Starting rpcbind                                                     done
Starting rsync daemon                                                done
Starting smartd                                                      unused
Starting vsftpd                                                      done
Starting INET services. (xinetd)                                     done
Master Resource Control: runlevel 3 has been                         reached
Skipped services in runlevel 3:                            splash smartd

Welcome to openSUSE 11.3 "Teal" - Kernel 2.6.37-rc3-3-default (console).

nj10-013 login:

[detached]
nj10:~ # time rclxc stop
Shutting down LXC containers... 
                                                  done

real    0m8.537s
user    0m0.048s
sys     0m0.124s
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is STOPPED
'vps002' is STOPPED
'vps003' is STOPPED
'vps004' is STOPPED
'vps005' is STOPPED
'vps006' is STOPPED
'vps007' is STOPPED
'vps008' is STOPPED
'vps009' is STOPPED
'vps011' is STOPPED
'vps012' is STOPPED
'vps013' is STOPPED
nj10:~ # screen -ls
No Sockets found in /var/run/screens/S-root.
nj10:~ # lxc-ps --lxc auxwww
CONTAINER  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
nj10:~ #

-- 
bkw
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lxc.init
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20101206/ecc75508/attachment.ksh>