[Lxc-users] Graceful shutdowns: current best practices?

Derek Simkowiak derek at simkowiak.net
Tue Oct 18 22:22:11 UTC 2011


     What is the best method for gracefully shutting down LXC containers 
in a production environment?

     By graceful, I mean that apps such as databases get a shutdown 
signal, so they can save their data to disk, complete any pending 
network ops, flush buffers, close filehandles, etc. without data loss.

     Presently, the /etc/init.d/lxc script that ships with Ubuntu just 
runs lxc-stop on every container listed in /etc/default/lxc.  Since 
that is like "pulling the power cord", it seems an irresponsible 
and dangerous thing to do.  The script also ignores any LXC container 
not listed in /etc/default/lxc.  It needs to be fixed.

     There is an RPM package for OpenSuse called rclxc at 
http://download.opensuse.org/repositories/home:/aljex/ which has an init 
script for LXC.  It uses the following technique:

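lxcstop () {
     typeset -i PID=0
     # lxc-ps prints one "CONTAINER PID" pair per matching init process
     lxc-ps -- -C init -o pid |while read CN PID ;do
         # skip anything that is not a PID above 1 (header line, host init)
         [[ $PID -gt 1 ]] || continue
         # if a container name was passed as $1, act only on that container
         [[ "${1:-$CN}" = "$CN" ]] || continue
         # only signal containers whose inittab handles powerfail
         grep -q 'p0::powerfail:/sbin/init 0' "${LXC_SRV}/${CN}/etc/inittab" || continue
         kill -PWR $PID
     done
}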

     It sends a SIGPWR (after kindly checking .../etc/inittab to make 
sure init will handle it).  It uses lxc-ps to find the PID of the init 
process first.

     The Python script posted yesterday has its own technique.  It 
searches /proc/CONTAINER_PIDs/exe for a link to "/sbin/init", and then 
sends a SIGINT to those.  That seems like a reasonable approach, but all 
of the Ubuntu init scripts are /bin/sh shell scripts, not Python scripts.
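
     For comparison, here is a rough /bin/sh sketch of the same idea (this 
is not the Python script itself).  The function name find_pids_by_exe and 
the PROC_ROOT variable are my own hypothetical names; PROC_ROOT exists 
only so the function can be pointed at a fake /proc when testing.  Note 
that the sketch scans every process on the host and merely skips PID 1; it 
does not restrict the search to one container's PIDs.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of every process (other than PID 1)
# whose /proc/PID/exe link points at the given path, e.g. /sbin/init.
# PROC_ROOT is an assumed override for testing; it defaults to /proc.
find_pids_by_exe() {
    target=$1
    proc_root=${PROC_ROOT:-/proc}

    for dir in "$proc_root"/[0-9]*; do
        # Skip entries without an exe link (or an unmatched glob).
        [ -L "$dir/exe" ] || continue

        pid=${dir##*/}
        # Skip the host's own init and anything that is not a number.
        [ "$pid" -gt 1 ] 2>/dev/null || continue

        # readlink fails for kernel threads or when permission is denied.
        exe=$(readlink "$dir/exe" 2>/dev/null) || continue

        if [ "$exe" = "$target" ]; then
            echo "$pid"
        fi
    done
    return 0
}
```

     An init script could then signal each match with something like:
for pid in $(find_pids_by_exe /sbin/init); do kill -INT "$pid"; done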

     There is also an init script at http://lxc.teegra.net/ for Arch 
Linux, but as the page says, "... this one is quite simplistic and does 
not invoke *shutdown*/*halt* or *init 0* in the containers. Also, it 
might hang on waiting for a container to start."  Like the Ubuntu 
script, it just calls lxc-stop, i.e., pulls the power cable on your 
containers.  Not graceful.

     Several of the other scripts and tutorials I found are also outdated 
or incomplete.  For example, many still recommend running the container 
under "screen", advice that predates the lxc-start -d option.

     As an alternate approach, what about running:

lxc-attach -n CONTAINER shutdown -h now

     Is there any drawback to doing that instead?  The Python script 
and the OpenSuse init script mentioned above both need root access, 
whereas lxc-attach would theoretically work without root once User 
Namespaces are fully implemented.

     Other considerations for a production-quality script:

1. A watchdog timeout, so that if a process hangs during shutdown, 
eventually lxc-stop would get called anyway.  (A broken LXC process 
should not prevent a host O.S. shutdown!)  Could a timeout option be 
added to lxc-wait for this feature?

2. A method that does not require root, the way virsh does not require 
root to start or stop a VM.  (Maybe this needs to wait.)

3. An "official" command name for graceful shutdowns from the host.  I 
propose lxc-shutdown.  (There is an unofficial OpenSuse package from 
rdannert that has a "lxc-shutdown-all" command, but I have not seen the 
name "lxc-shutdown" used anywhere.)

4. Which signal?  SIGINT?  SIGPWR?  Both?
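
     To make item 1 concrete, the watchdog could look roughly like the 
sketch below.  Everything named here is hypothetical: graceful_stop is my 
own name, and the three hooks (request_shutdown, container_running, 
force_stop) are placeholders that an init script would have to define 
itself.  The 60-second default is likewise only an assumption.

```shell
#!/bin/sh
# Sketch of a shutdown watchdog.  The caller must define three hooks
# (hypothetical names), for example:
#   request_shutdown NAME   ask the container to shut down (signal its
#                           init, or run shutdown inside it)
#   container_running NAME  succeed while the container is still up
#                           (e.g. by checking lxc-info output; assumed)
#   force_stop NAME         hard stop, e.g. lxc-stop -n NAME
#
# Returns 0 if the container stopped on its own, 1 if it had to be forced.
# Note: plain /bin/sh has no "local", so these variables are global.
graceful_stop() {
    name=$1
    timeout=${2:-60}    # seconds to wait; the default is an assumption
    waited=0

    request_shutdown "$name"

    while container_running "$name"; do
        if [ "$waited" -ge "$timeout" ]; then
            # Grace period expired: fall back to the hard stop so a hung
            # container cannot block the host's own shutdown.
            force_stop "$name"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

     Polling once per second is crude; a timeout option on lxc-wait, if it 
existed, would replace the loop.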


     I am looking to put some development and testing into this.  If 
readers would kindly post their own "best practices", I could create a 
new lxc-shutdown command and an init script that uses it.


Thank You,
Derek Simkowiak

P.S. The last major discussion I found about this was from about two years ago:
http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg00040.html

