[Lxc-users] concurrent aptitude/dpkg runs in separate containers --> bork bork bork

Trent W. Buck twb at cybersource.com.au
Fri Feb 4 00:41:42 UTC 2011


Wrapping eatmydata around dpkg "fixes" the problem; details below.

Trent Buck wrote (offlist):

> As I now understand it, the facts are these:
>
>   - lucid's dpkg uses an fsync-per-file
>
>   - lucid-security's dpkg uses a sync-per-rename, because fsync was
>     slow (only on ext4/btrfs, I think).
>
>   - Because containers share a kernel, sync causes ALL writes to ALL
>     filesystems to be flushed to disk, and dpkg blocks while this
>     takes place.
>
>   - Because the syncs come in quick succession, and there are writes
>     on other containers (e.g. postfix, syslog, mysql), this borks at
>     least dpkg, and likely other stuff.
>
>   - using ext3 instead of ext4 won't help significantly.  Reducing the
>     commit interval from 60 (ext4 default) to 5 (ext3 default) might
>     help a little.
>
> We can work around this issue in a few ways:
>
>   - disable lucid-security (at least for dpkg), which should ensure
>     only fsync is used.  Possibly switch to ext3 to avoid problems
>     with ext4 vs. fsync.
>
>   - backport patches from Debian's latest dpkg to lucid-security's
>     dpkg.  NOT FUN.
>
>   - backport dpkg from a newer ubuntu release.  NOT FUN.
>
>   - use the EATMYDATA package to discard all fsync and sync calls from
>     dpkg.
>
> With eatmydata, the worst case is a power outage while dpkg is
> running, in which case THAT FILESYSTEM (i.e. not other filesystems)
> gets corrupted: you lose that container and have to rebuild it.
> Non-container filesystems like home, cyber, mail should be
> completely safe.
>
> Therefore, I'm aiming to have lxc-create wrap /usr/bin/dpkg with
> eatmydata in containers.  Omega itself will still call sync() when you
> apply security updates to it, but that happens an order of magnitude
> less often than containers running dpkg.
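
The difference between an fsync of one file and a global sync, as
described in the quoted text, can be seen from a shell.  This is only a
rough sketch, assuming GNU dd (for conv=fsync) and a Linux
/proc/meminfo; the function names and the scratch file are made up for
illustration and are not part of dpkg or eatmydata:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Kernel-wide amount of dirty (not yet written) page cache, in kB.
# With a shared kernel this covers every container's pending writes.
dirty_kb() {
    awk '/^Dirty:/ { print $2 }' /proc/meminfo
}

# fsync-style: write a file and flush only that file before dd exits.
write_with_fsync() {
    dd if=/dev/zero of="$1" bs=4k count=16 conv=fsync 2>/dev/null
}

# sync-style: write a file, then ask the kernel to flush ALL dirty data
# on ALL filesystems, including writes made by unrelated processes.
write_with_sync() {
    dd if=/dev/zero of="$1" bs=4k count=16 2>/dev/null
    sync
}

demo_file=$(mktemp)
echo "dirty before: $(dirty_kb) kB"
write_with_fsync "$demo_file"
echo "dirty after fsync of one file: $(dirty_kb) kB"
write_with_sync "$demo_file"
echo "dirty after global sync: $(dirty_kb) kB"
rm -f "$demo_file"
```

On a busy host the global sync is the call that has to wait for
everyone else's writes, which is the blocking behaviour described above.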

This worked: placing an unmodified eatmydata .deb from sid in a local
apt repo, and running the following:

    aptitude install eatmydata -yq
    dpkg-divert --rename /usr/bin/dpkg
    cat >/usr/bin/dpkg <<-EOT
    	#!/bin/sh
    	exec eatmydata /usr/bin/dpkg.distrib "\$@"
    	EOT
    chmod +x /usr/bin/dpkg
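
To check the wrapper's quoting without touching /usr/bin, the same
here-document technique can be dry-run in a scratch directory.  This is
a sketch only: the "eatmydata" and "dpkg.distrib" files below are stubs
invented for the test, not the real programs, and the package name is
made up.

```shell
#!/bin/sh
# Dry run of the wrapper technique in a scratch directory; nothing
# under /usr/bin is touched.  Both helper scripts are stand-ins.

scratch=$(mktemp -d)

# Stub for the diverted real dpkg: prints each argument on its own line.
cat >"$scratch/dpkg.distrib" <<'EOT'
#!/bin/sh
for arg in "$@"; do printf '[%s]\n' "$arg"; done
EOT

# Stub for eatmydata: just runs the command it is given.
cat >"$scratch/eatmydata" <<'EOT'
#!/bin/sh
exec "$@"
EOT

# The wrapper, written with an unquoted delimiter as in the commands
# above, so \$@ lands in the file as a literal $@ while $scratch is
# expanded at creation time.
cat >"$scratch/dpkg" <<EOT
#!/bin/sh
exec "$scratch/eatmydata" "$scratch/dpkg.distrib" "\$@"
EOT

chmod +x "$scratch/dpkg" "$scratch/dpkg.distrib" "$scratch/eatmydata"

# Arguments containing spaces should reach the stub intact.
wrapper_output=$("$scratch/dpkg" -i "my package.deb")
printf '%s\n' "$wrapper_output"
rm -rf "$scratch"
```

If the real wrapper ever needs to be backed out, removing it and then
running "dpkg-divert --rename --remove /usr/bin/dpkg" should restore
the original binary.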

Daniel Lezcano <daniel.lezcano at free.fr> writes:

> Assuming you have an Ubuntu version on your host, I think the kernel is
> compiled with DETECT_HUNG_TASK, where a kernel stack trace is displayed
> if a task stays in the 'D' state indefinitely. Do you have such a stack
> trace in your logs?

That is the case; here are the entries I have:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tmp
Type: text/x-mail
Size: 18759 bytes
Desc: dmesg hung tasks
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20110204/f8170f65/attachment.bin>
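
For anyone wanting to look for the same kind of hung-task report on
their own host, something along these lines may help.  It is a sketch:
the function names are made up, the 20 lines of context is an arbitrary
guess at the length of a stack trace, and it assumes a grep that
supports -A and a procps-style ps.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print hung-task reports found in kernel log text on stdin, together
# with the lines of stack trace that follow each one.
hung_task_reports() {
    grep -A 20 'blocked for more than [0-9]* seconds'
}

# List processes currently in uninterruptible sleep (state D),
# keeping the header line from ps.
d_state_tasks() {
    ps -eo pid,stat,comm | awk 'NR == 1 || $2 ~ /^D/'
}

# Typical use on the host:
#   dmesg | hung_task_reports
#   d_state_tasks
```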

