[Lxc-users] mknod inside systemd container

Michael H. Warfield mhw at WittsEnd.com
Thu Apr 4 14:30:36 UTC 2013


Hey John,

On Thu, 2013-04-04 at 09:07 +0100, John wrote:
> On 03/04/13 23:15, Michael H. Warfield wrote:
> > On Wed, 2013-04-03 at 23:03 +0100, John wrote:
> >> On 02/04/13 23:59, Michael H. Warfield wrote:
> >>> On Tue, 2013-04-02 at 16:02 +0100, John wrote:
> >>>> If my understanding is correct, to stop systemd trying to launch udev
> >>>> and generally make a mess of everything inside a container, you need to
> >>>> remove the mknod capability from the container.
> >>> Ah...  That's kind of old information and not really effective.
> >>>
> >>>> But what if I want
> >>>> (need) to be able to use mknod inside a container, how can I do that
> >>>> with a systemd container?
> >>> 1) Get the latest lxc.  lxc 0.8 might suffice for systemd in a container
> >>> but not with systemd in the host and I wouldn't recommend it.  0.9.0 is
> >>> being pulled and bundled now.  It's not up yet but 0.9.0.rc1 is.
> >>>
> >>> 2) You'll have to add "lxc.autodev = 1" to your configuration file.
> >> I already do that. I am running "lxc version: 0.9.0.alpha3"
> > That's strange.  What stops systemd from mounting devtmpfs and firing up
> > udev is having a tmpfs mounted on /dev.  That's part of what autodev = 1
> > is doing.
> I'm taking my understanding from here:
> http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface
> where it says "The udev unit files will check for CAP_SYS_MKNOD, and 
> skip udev if that is not available."

> But it sounds like you're saying that "lxc.autodev = 1" should prevent 
> systemd from firing the udev systemd unit anyway.

Well, sort of.  I may have been a little off there.  I do find that
"systemd-udevd" is running in my systemd containers but that the actual
udevd process itself is not (as opposed to the host systems and other
systems).  I can run mknod in those containers.  So it is possible to do
this.
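
One quick way to see whether a container still holds the mknod
capability is to look at its capability bounding set.  What follows is
only a minimal sketch: the function name has_cap_mknod is made up for
illustration, and it assumes the mask is the hex value from the CapBnd
line of /proc/self/status.  CAP_MKNOD is capability number 27.

```shell
# Sketch: report whether CAP_MKNOD (capability number 27) is set in a
# capability mask given as a hex string without the 0x prefix.
# has_cap_mknod is a hypothetical helper, not part of lxc or systemd.
has_cap_mknod() {
    mask_hex=$1
    # Shift bit 27 down to bit 0 and test it.
    if [ $(( (0x$mask_hex >> 27) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Usage inside the container (reads this shell's bounding set):
#   has_cap_mknod "$(awk '/^CapBnd:/ {print $2}' /proc/self/status)"
```

If that prints "no", then something (such as "lxc.cap.drop = mknod") has
removed the capability and mknod will fail regardless of what udev is
doing.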

But...  A hint may be in the lxc-fedora template where there is
specifically a "configure_fedora_systemd" function that does this:

configure_fedora_systemd()
{
    unlink ${rootfs_path}/etc/systemd/system/default.target
    touch ${rootfs_path}/etc/fstab
    chroot ${rootfs_path} ln -s /dev/null //etc/systemd/system/udev.service
    chroot ${rootfs_path} ln -s /lib/systemd/system/multi-user.target /etc/systemd/system/default.target
    #dependency on a device unit fails it specially that we disabled udev
    sed -i 's/After=dev-%i.device/After=/' ${rootfs_path}/lib/systemd/system/getty\@.service
}


Something similar does exist in the lxc-archlinux template:

# disable services unavailable for container
ln -s /dev/null /etc/systemd/system/systemd-udevd.service
ln -s /dev/null /etc/systemd/system/systemd-udevd-control.socket
ln -s /dev/null /etc/systemd/system/systemd-udevd-kernel.socket
ln -s /dev/null /etc/systemd/system/proc-sys-fs-binfmt_misc.automount
# set default systemd target
ln -s /lib/systemd/system/multi-user.target /etc/systemd/system/default.target

The lxc-archlinux template script seems very badly broken for me: it
expects a fixed bridge name of br0, does not use the defaults
from /etc/lxc/default.conf, and looks for things that are not present
on my Fedora host.  So I haven't been able to build an archlinux
container on my host systems.

Did you build yours with lxc-create or did you roll your own?  You
might want to check those /dev/null links in that container.  It looks
like udevd should not even start if those have been set correctly.
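
Something along these lines could do that check from the host.  It is
just a sketch: check_masked_units is a made-up name, the unit list is
the udev-related subset of the lxc-archlinux snippet above, and the
rootfs path in the usage comment is an assumption (the usual default
location, with a placeholder container name).

```shell
# Sketch: for each udev-related unit, report whether it is masked
# (a symlink pointing at /dev/null) under the given container rootfs.
# check_masked_units is a hypothetical helper for illustration only.
check_masked_units() {
    rootfs=$1
    for unit in systemd-udevd.service \
                systemd-udevd-control.socket \
                systemd-udevd-kernel.socket; do
        link="$rootfs/etc/systemd/system/$unit"
        # -L is true only for symlinks; readlink prints the link target.
        if [ -L "$link" ] && [ "$(readlink "$link")" = "/dev/null" ]; then
            echo "$unit: masked"
        else
            echo "$unit: NOT masked"
        fi
    done
}

# Usage on the host (path and container name are assumptions):
#   check_masked_units /var/lib/lxc/mycontainer/rootfs
```

Any unit reported as "NOT masked" would be a candidate for the
"ln -s /dev/null ..." treatment shown in the template.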

> > What distro is running in the container and what version of systemd?
> > I've seen this with Fedora 16 but the latest systemd and Fedora 17 in
> > the container are fine.
> >
> I am running Arch Linux on both host and container:
> Linux 3.7.10-1-ARCH #1 SMP PREEMPT Thu Feb 28 09:50:17 CET 2013 x86_64 
> GNU/Linux
> 
> On my host, "systemctl --version" reports:
> systemd 197
> +PAM -LIBWRAP -AUDIT -SELINUX -IMA -SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ

> And on the container, it's systemd 196

Those versions are congruent with what I'm running.

> I've just checked and the latest version in the Arch repo is 198. I 
> wonder if I should try and update to that?

Probably won't make a difference.

This did give me some place to look over my headaches with Fedora 15 and
Fedora 16 upgraded containers though.  :-)=)

> >> I found that, without the removal of mknod capability, everything went
> >> crazy. I have working containers with systemd both on host and inside
> >> the container (I even run my full desktop inside a container). To get a
> >> systemd container working I found I needed three things:
> >> lxc.autodev = 1
> >> lxc.cap.drop = mknod
> > I'm not having to do that but I'm avoiding F15 and F16 because they
> > don't seem to play nice and start reliably.  F17 is doing well for me.
> >
> >> lxc.pts = 1024
> >>
> >> It's all working well except for the fact that I might need to allow a
> >> container to have mknod capability. Are you saying that with 0.9.0 there
> >> are changes that negate the requirement for "lxc.cap.drop = mknod"? The
> >> way I understood it was that it was systemd that behaved differently
> >> based on the availability of that capability...
> >>
> >>
> >>> I have found that this works to get recent systemd containers (Fedora
> >>> 17) to work but Fedora 15 and Fedora 16 (neither of which are supported
> >>> any longer) do not work due to udev / systemd interaction.
> >>>
> >>> I would recommend waiting a couple of days until 0.9.0 is up and then
> >>> pulling it down and building it.  That's your best shot with systemd.
> >>>
> >>>> I have this container that is a builder of system images for other nodes
> >>>> (containers and/or metal boxes). In order to correctly do this it needs
> >>>> to execute mknod inside the image as it builds it. (note, the device
> >>>> nodes created don't need to be usable in the context of the image being
> >>>> built - the builder just needs to be able to create it).
> >>>>
> >>>> I've been doing this for ages under sysvinit and it's been fine. I have
> >>>> just migrated this builder container to systemd and hit this problem...
> >>>> Is there another way to keep systemd in line other than removing the
> >>>> mknod capability ?
> >>>>
> >>>> Thanks,
> >>>> John

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!