[Lxc-users] On clean shutdown of Ubuntu 10.04 containers
Michael H. Warfield
mhw at WittsEnd.com
Mon Dec 6 20:45:46 UTC 2010
On Mon, 2010-12-06 at 18:42 +1100, Trent W. Buck wrote:
> This post describes my attempts to get "clean" shutdown of Ubuntu 10.04
> containers. The goal here is that a "shutdown -h now" of the dom0
> should not result in a potentially inconsistent domU postgres database,
> cf. a naive lxc-stop.
>
> As at Ubuntu 10.04 with lxc 0.7.2, lxc-start detects that a container
> has halted by 1) seeing a reboot event in <container>/var/run/utmp; or
> 2) seeing <container>'s PID 1 terminate.
>
> Ubuntu 10.04 simply REQUIRES /var/run to be a tmpfs; this is hard-coded
> into mountall's (upstart's) /lib/init/fstab.
Are you absolutely SURE about this? I was under the impression that this
was controlled by the RAMRUN option in /etc/default/rcS. I set both that
and RAMLOCK to "no" and didn't notice any problems, but I'm not sure
whether it was specifically a 10.04 container I was testing with. I'll
have to reverify to see if they've changed that.
That should really be considered a bug, if true. Nothing should require
that something be on tmpfs.
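For anyone wanting to check their own container, something along these
lines should show both the rcS settings and whether /var/run really is a
tmpfs. This is only a sketch: the function names are made up for
illustration, and it assumes the usual KEY=value layout of
/etc/default/rcS and the standard /proc/mounts column order.

```shell
# Hypothetical helpers, not part of lxc or Ubuntu.

# Print the RAMRUN/RAMLOCK settings found in an rcS-style file.
# $1: path to the file (defaults to /etc/default/rcS)
show_ram_opts() {
    grep -E '^(RAMRUN|RAMLOCK)=' "${1:-/etc/default/rcS}"
}

# Succeed if the given mount point is a tmpfs according to a mounts table.
# If the mount point is listed more than once, the last entry wins.
# $1: mount point, $2: mounts file (defaults to /proc/mounts)
is_tmpfs() {
    awk -v mp="$1" '$2 == mp { fs = $3 } END { exit (fs == "tmpfs" ? 0 : 1) }' \
        "${2:-/proc/mounts}"
}
```

Run inside the container, "show_ram_opts" followed by
"is_tmpfs /var/run && echo tmpfs" should settle which of us is right for
a given release.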
> Without it, the most
> immediate issue is that /var/run/ifstate isn't reaped on reboot, ifup(8)
> thinks lo (at least) is already configured, and the boot process hangs
> waiting for the network.
>
> Unfortunately, lxc 0.7's utmp detection requires /var/run to NOT be a
> tmpfs. The shipped lxc-ubuntu script works around this by deleting the
> ifstate file and not mounting a tmpfs on /var/run, but to me that is
> simply waiting for something else to assume /var/run is empty. It also
> doesn't cope with a mountall upgrade rewriting /lib/init/fstab.
>
> More or less by accident, I discovered that I can tell lxc-start that
> the container is ready to halt by "crashing" upstart:
>
> container# kill -SEGV 1
>
> Likewise I can spoof a ctrl-alt-delete event in the container with:
>
> dom0# pkill -INT lxc-start
>
> I automate the former signalling at the end of shutdowns thusly:
>
> chroot $template_dir dpkg-divert --quiet --rename /sbin/reboot
> chroot $template_dir tee >/dev/null /sbin/reboot <<-EOF
> #!/bin/bash
> while getopts nwdfiph opt
> do [[ f = \$opt ]] && exec kill -SEGV 1
> done
> exec -a "\$0" "\$0.distrib" "\$@"
> EOF
> chroot $template_dir chmod +x /sbin/reboot
> chroot $template_dir ln -s reboot.distrib /sbin/halt.distrib
> chroot $template_dir ln -s reboot.distrib /sbin/poweroff.distrib
>
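To see what the wrapper would decide without installing it, the getopts
loop can be pulled out into a function. A sketch only: "wants_force" is
a made-up name, the option string is copied from the wrapper above, and
the bash "[[ ]]" test has been swapped for a plain "[ ]" so it also runs
under a POSIX sh.

```shell
# Hypothetical helper mirroring the wrapper's getopts loop.
# Exits 0 if -f appears among the options, 1 otherwise.
# The body is a subshell, so OPTIND and opt do not leak into the caller.
wants_force() (
    OPTIND=1
    while getopts nwdfiph opt
    do
        [ "$opt" = f ] && exit 0
    done
    exit 1
)
```

For example, "wants_force -d -f -i" succeeds, which is the case where the
wrapper would send SEGV to PID 1, while "wants_force -p" fails and the
wrapper would fall through to the diverted binary.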
> I use the latter in my customized /etc/init.d/lxc stop rule.
> Note that the lxc-wait calls SHOULD be parallelized, but this is not
> possible as at lxc 0.7.2 :-( This means that theoretically the nth
> container gets n×10min to halt, although in practice I find most
> containers go down in a decisecond or two.
>
>     case "$1" in
>         ...
>
>         stop)
>             log_daemon_msg "Stopping $DESC"
>             pkill -INT lxc-start
>             for name in $(lxc-ls)
>             do  if timeout 10m lxc-wait -n $name -s STOPPED
>                 then
>                     log_progress_msg $name
>                 else
>                     lxc-stop -n $name
>                     log_progress_msg "$name (killed)"
>                 fi
>             done
>             wait
>             log_end_msg 0
>             ;;
>     esac
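If a later lxc release allows several lxc-wait processes to run at once
(the post says 0.7.2 does not), the loop could in principle background
each wait so that all containers share a single timeout window instead
of n of them. The following is a sketch under that assumption, untested
against any real lxc version. "stop_one" and "stop_all" are hypothetical
names, the LXC_WAIT, LXC_STOP and WAIT_LIMIT variables exist only so the
commands can be swapped out, echo stands in for log_progress_msg, and
GNU timeout is assumed.

```shell
# Hypothetical helpers; the command names can be overridden for testing.
: "${LXC_WAIT:=lxc-wait}" "${LXC_STOP:=lxc-stop}" "${WAIT_LIMIT:=10m}"

# Wait for one container to reach STOPPED; kill it if the wait times out.
stop_one() {
    if timeout "$WAIT_LIMIT" "$LXC_WAIT" -n "$1" -s STOPPED
    then
        echo "$1"
    else
        "$LXC_STOP" -n "$1"
        echo "$1 (killed)"
    fi
}

# Run stop_one for every named container concurrently, then wait for all.
stop_all() {
    for name in "$@"
    do
        stop_one "$name" &
    done
    wait
}
```

In the stop rule this would be called as "stop_all $(lxc-ls)" after the
pkill. Output order is no longer deterministic, since each container
reports as soon as it finishes.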
Mike
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | mhw at WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!