[Lxc-users] Graceful shutdowns: current best practices?
Derek Simkowiak
derek at simkowiak.net
Tue Oct 18 22:22:11 UTC 2011
What is the best method for gracefully shutting down LXC containers
in a production environment?
By graceful, I mean that apps such as databases get a shutdown
signal, so they can save their data to disk, complete any pending
network ops, flush buffers, close filehandles, etc. without data loss.
Presently, the script /etc/init.d/lxc that ships for Ubuntu just
does an lxc-stop on any container listed in /etc/default/lxc. Since
that is like "pulling the power cord", that seems like an irresponsible
and dangerous thing to do. It also does not handle LXC containers not
listed in /etc/default/lxc. It needs to be fixed.
There is an RPM package for OpenSuse called rclxc at
http://download.opensuse.org/repositories/home:/aljex/ which has an init
script for LXC. It uses the following technique:
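    lxcstop () {
        typeset -i PID=0
        # lxc-ps prints one "CONTAINER PID" pair per init process.
        lxc-ps -- -C init -o pid | while read CN PID; do
            # Ignore lines whose PID field is not greater than 1.
            [[ $PID -gt 1 ]] || continue
            # If a container name was given as $1, act only on that one.
            [[ "${1:-$CN}" = "$CN" ]] || continue
            # Only signal an init that is configured to halt on powerfail.
            grep -q 'p0::powerfail:/sbin/init 0' \
                "${LXC_SRV}/${CN}/etc/inittab" || continue
            kill -PWR "$PID"
        done
    }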
It sends a SIGPWR (after kindly checking .../etc/inittab to make
sure init will handle it). It uses lxc-ps to find the PID of the init
process first.
The Python script posted yesterday has its own technique. It
searches /proc/CONTAINER_PIDs/exe for a link to "/sbin/init", and then
sends a SIGINT to those. That seems like a reasonable approach, but all
of the Ubuntu init scripts are /bin/sh shell scripts, not Python scripts.
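
For comparison, the same /proc lookup can be written in plain /bin/sh. This is only a rough sketch, not the Python script's actual code: the function name init_pids_of and its proc-root argument are made up for illustration, the caller is assumed to already know the container's PIDs, and the exe link is assumed to read exactly "/sbin/init" (it may look different on some setups).

```shell
#!/bin/sh
# init_pids_of PROC_ROOT PID...
# Print every PID from the argument list whose PROC_ROOT/PID/exe symlink
# points at /sbin/init. PROC_ROOT is normally /proc; it is a parameter
# only so the function can be tried against a fake directory tree.
# (Hypothetical helper, not part of LXC.)
init_pids_of() {
    proc_root=$1
    shift
    for pid in "$@"; do
        # Skip PIDs that have vanished or whose link we may not read.
        target=$(readlink "$proc_root/$pid/exe" 2>/dev/null) || continue
        [ "$target" = "/sbin/init" ] || continue
        echo "$pid"
    done
    return 0
}
```

An init script could then signal the matches with something like: for p in $(init_pids_of /proc $PIDS); do kill -INT "$p"; done, where $PIDS is whatever list of container PIDs the script has collected.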
There is also an init script at http://lxc.teegra.net/ for Arch
Linux, but as the page says, "... this one is quite simplistic and does
not invoke *shutdown*/*halt* or *init 0* in the containers. Also, it
might hang on waiting for a container to start." Like the Ubuntu
script, it just calls lxc-stop, i.e., pulls the power cable on your
containers. Not graceful.
Several of the other scripts or tutorials I found are also outdated
or incomplete. For example, many still recommend running the container
using "screen", from before the lxc-start -d option was available.
As an alternate approach, what about running:
    lxc-attach -n CONTAINER -- shutdown -h now
Is there any drawback to doing that instead? The Python script
and the OpenSuse init script mentioned above both need root access,
whereas lxc-attach would, in theory, work without root once User
Namespaces are fully implemented.
Other considerations for a production-quality script:
1. A watchdog timeout, so that if a process hangs during shutdown,
eventually lxc-stop would get called anyway. (A broken LXC process
should not prevent a host O.S. shutdown!) Could a timeout option be
added to lxc-wait for this feature?
2. A method that does not require root, the way virsh does not require
root to start or stop a VM. (Maybe this needs to wait.)
3. An "official" command name for graceful shutdowns from the host. I
propose lxc-shutdown. (There is an unofficial OpenSuse package from
rdannert that has a "lxc-shutdown-all" command, but I have not seen the
name "lxc-shutdown" used anywhere.)
4. Which signal? SIGINT? SIGPWR? Both?
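
To make item 1 concrete, here is a rough sketch of a watchdog wrapper in /bin/sh. It is not an existing LXC command: the names shutdown_or_stop and container_stopped and the LXC_INFO / LXC_STOP override variables are made up for illustration, and it assumes the output of "lxc-info -n NAME" contains the word STOPPED once the container is down. The graceful request itself (SIGPWR, SIGINT, lxc-attach, ...) is passed in by the caller, so the signal question in item 4 stays open.

```shell
#!/bin/sh
# container_stopped NAME
# Succeeds when lxc-info reports the container as STOPPED.
# Assumption: the state word appears somewhere in lxc-info's output.
container_stopped() {
    ${LXC_INFO:-lxc-info} -n "$1" 2>/dev/null | grep -q STOPPED
}

# shutdown_or_stop NAME TIMEOUT REQUEST_CMD [ARGS...]
# Run REQUEST_CMD to ask for a graceful shutdown, then poll once per
# second. Returns 0 if the container stops in time; otherwise falls
# back to lxc-stop after TIMEOUT seconds and returns 1.
shutdown_or_stop() {
    name=$1
    timeout=$2
    shift 2
    # A failed request must not block the fallback below.
    "$@" || true
    elapsed=0
    while ! container_stopped "$name"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            # Watchdog expired: pull the plug so the host can go down.
            ${LXC_STOP:-lxc-stop} -n "$name"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Example use, with a made-up container name and PID variable: shutdown_or_stop web1 60 kill -PWR "$INIT_PID". Polling is used here only because lxc-wait has no timeout option at present; if one were added, the loop could be replaced by a single lxc-wait call.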
I am looking to put some development and testing into this. If
readers would kindly post their own "best practices", I could create a
new lxc-shutdown command and an init script that uses it.
Thank You,
Derek Simkowiak
P.S.> The last major discussion I found about this was from ~two years ago:
http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg00040.html