[Lxc-users] mknod inside systemd container

Joerg Gollnick code4lxc+list at wurzelbenutzer.de
Thu Apr 4 18:20:15 UTC 2013


On Thu, 04 Apr 2013 10:30:36 -0400,
"Michael H. Warfield" <mhw at WittsEnd.com> wrote:

Hi John,
> Hey John,
> 
> On Thu, 2013-04-04 at 09:07 +0100, John wrote:
> > On 03/04/13 23:15, Michael H. Warfield wrote:
> > > On Wed, 2013-04-03 at 23:03 +0100, John wrote:
> > >> On 02/04/13 23:59, Michael H. Warfield wrote:
> > >>> On Tue, 2013-04-02 at 16:02 +0100, John wrote:
> > >>>> If my understanding is correct, to stop systemd trying to
> > >>>> launch udev and generally make a mess of everything inside a
> > >>>> container, you need to remove the mknod capability from the
> > >>>> container.
> > >>> Ah...  That's kind of old information and not really effective.
> > >>>
> > >>>> But what if I want
> > >>>> (need) to be able to use mknod inside a container, how can I
> > >>>> do that with a systemd container?
> > >>> 1) Get the latest lxc.  lxc 0.8 might suffice for systemd in a
> > >>> container but not with systemd in the host and I wouldn't
> > >>> recommend it.  0.9.0 is being pulled and bundled now.  It's not
> > >>> up yet but 0.9.0.rc1 is.
> > >>>
> > >>> 2) You'll have to add "lxc.autodev = 1" to your configuration
> > >>> file.
> > >> I already do that. I am running "lxc version: 0.9.0.alpha3"
> > > That's strange.  What stops systemd from mounting devtmpfs and
> > > firing up udev is having a tmpfs mounted on /dev.  That's part of
> > > what autodev = 1 is doing.
> > I'm taking my understanding from here:
> > http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface
> > where it says "The udev unit files will check for CAP_SYS_MKNOD,
> > and skip udev if that is not available."
> 
> > But it sounds like you're saying that "lxc.autodev = 1" should
> > prevent systemd from firing the udev systemd unit anyway.
> 
> Well, sort of.  I may have been a little off there.  I do find that
> "systemd-udevd" is running in my systemd containers but that the
> actual udevd process itself is not (as opposed to the host systems
> and other systems).  I can run mknod in those containers.  So it is
> possible to do this.
> 
> But...  A hint may be in the lxc-fedora template where there is
> specifically a "configure_fedora_systemd" function that does this:
> 
> configure_fedora_systemd()
> {
>     unlink ${rootfs_path}/etc/systemd/system/default.target
>     touch ${rootfs_path}/etc/fstab
>     chroot ${rootfs_path} ln -s /dev/null //etc/systemd/system/udev.service
>     chroot ${rootfs_path} ln -s /lib/systemd/system/multi-user.target /etc/systemd/system/default.target
>     #dependency on a device unit fails it specially that we disabled udev
>     sed -i 's/After=dev-%i.device/After=/' ${rootfs_path}/lib/systemd/system/getty\@.service
> }
> 
> 
> Something similar does exist in the lxc-archlinux template:
> 
> # disable services unavailable for container
> ln -s /dev/null /etc/systemd/system/systemd-udevd.service
> ln -s /dev/null /etc/systemd/system/systemd-udevd-control.socket
> ln -s /dev/null /etc/systemd/system/systemd-udevd-kernel.socket
> ln -s /dev/null /etc/systemd/system/proc-sys-fs-binfmt_misc.automount
> # set default systemd target
> ln -s /lib/systemd/system/multi-user.target /etc/systemd/system/default.target
> 
> The lxc-archlinux template script seems very badly broken for me,
> expecting a fixed bridge name of br0 and not using the defaults
> from /etc/lxc/default.conf and looking for things that are not present
> on my Fedora host.  So I haven't been able to build an archlinux
> container on my host systems.
> 
> Did you build yours from lxc-create or did you roll your own?  Maybe
> you might want to check those /dev/null links in that container.
> Looks like udevd should not even start if those have been set
> correctly.
> 
> > > What distro is running in the container and what version of
> > > systemd? I've seen this with Fedora 16 but the latest systemd and
> > > Fedora 17 in the container are fine.
> > >
> > I am running Arch Linux on both host and container:
> > Linux 3.7.10-1-ARCH #1 SMP PREEMPT Thu Feb 28 09:50:17 CET 2013
> > x86_64 GNU/Linux
> > 
> > On my host, "systemctl --version" reports:
> > systemd 197
> > +PAM -LIBWRAP -AUDIT -SELINUX -IMA -SYSVINIT +LIBCRYPTSETUP +GCRYPT
> > +ACL +XZ
> 
> > And on the container, it's systemd 196
> 
> Those versions are congruent with what I'm running.
> 
> > I've just checked and the latest version in the Arch repo is 198. I 
> > wonder if I should try and update to that?
> 
> Probably won't make a difference.
> 
> This did give me some place to look over my headaches with Fedora 15
> and Fedora 16 upgraded containers though.  :-)=)
> 
> > >> I found that, without the removal of mknod capability,
> > >> everything went crazy. I have working containers with systemd
> > >> both on host and inside the container (I even run my full
> > >> desktop inside a container). To get a systemd container working
> > >> I found I needed three things: lxc.autodev = 1
> > >> lxc.cap.drop = mknod
> > > I'm not having to do that but I'm avoiding F15 and F16 because
> > > they don't seem to play nice and start reliably.  F17 is doing
> > > well for me.
> > >
> > >> lxc.pts = 1024
> > >>
> > >> It's all working well except for the fact that I might need to
> > >> allow a container to have mknod capability. Are you saying that
> > >> with 0.9.0 there are changes that negate the requirement for
> > >> "lxc.cap.drop = mknod"? The way I understood it was that it was
> > >> systemd that behaved differently based on the availability of
> > >> that capability...
> > >>
> > >>
> > >>> I have found that this works to get recent systemd containers
> > >>> (Fedora 17) to work but Fedora 15 and Fedora 16 (neither of
> > >>> which are supported any longer) do not work due to udev /
> > >>> systemd interaction.
> > >>>
> > >>> I would recommend waiting a couple of days until 0.9.0 is up
> > >>> and then pulling it down and building it.  That's your best
> > >>> shot with systemd.
> > >>>
> > >>>> I have this container that is a builder of system images for
> > >>>> other nodes (containers and/or metal boxes). In order to
> > >>>> correctly do this it needs to execute mknod inside the image
> > >>>> as it builds it. (note, the device nodes created don't need
> > >>>> to be usable in the context of the image being built - the
> > >>>> builder just needs to be able to create them).
> > >>>>
> > >>>> I've been doing this for ages under sysvinit and it's been
> > >>>> fine. I have just migrated this builder container to systemd
> > >>>> and hit this problem... Is there another way to keep systemd
> > >>>> in line other than removing the mknod capability ?
> > >>>>
> > >>>> Thanks,
> > >>>> John
> 
> Regards,
> Mike

Using the mask method above (ln -s /dev/null ...) for the systemd udev
units, I had success with lxc built from git on 20130402 and systemd 198
in a manually built Arch Linux container, on a sysvinit/initscripts host.
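
For reference, the masking step can be scripted roughly like this. It is
only a sketch: mask_units and is_masked are helper names I made up for
illustration, and the rootfs path in the usage line is an assumption about
where the container lives. The unit names are the ones from the
lxc-archlinux template quoted above.

```shell
#!/bin/sh
# Sketch only: mask_units and is_masked are hypothetical helper names.
# A unit is masked by linking its name under /etc/systemd/system
# to /dev/null, as the templates quoted above do.

# mask_units ROOTFS UNIT...
mask_units() {
    rootfs="$1"
    shift
    mkdir -p "$rootfs/etc/systemd/system" || return 1
    for unit in "$@"; do
        # -f replaces an existing link or file, so reruns are harmless
        ln -sf /dev/null "$rootfs/etc/systemd/system/$unit" || return 1
    done
}

# is_masked ROOTFS UNIT: succeeds if the unit is a symlink to /dev/null
is_masked() {
    [ "$(readlink "$1/etc/systemd/system/$2")" = "/dev/null" ]
}
```

Usage would be something like the following, with the rootfs path adjusted
to your setup:

mask_units /var/lib/lxc/mycontainer/rootfs systemd-udevd.service \
    systemd-udevd-control.socket systemd-udevd-kernel.socket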

I run openvpn in this container with the following service unit:

cat /etc/systemd/system/tundev.service
[Unit]
Description=Add tun device workaround
Wants=network.target
Before=openvpn@.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/mkdir /dev/net
ExecStart=/usr/bin/mknod -m 666 /dev/net/tun c 10 200

[Install]
WantedBy=multi-user.target
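
One caveat with the unit above: a plain mkdir returns an error if /dev/net
already exists, so the unit would fail if it is started a second time
without /dev being recreated. A possible variant is to move both steps
into a small script that can be rerun safely. This is a sketch, not
something I have in production: ensure_tun is a made-up name, and the
optional directory argument exists only so the function can be tried
outside of /dev.

```shell
#!/bin/sh
# Sketch only: ensure_tun is a hypothetical helper name.
# Creates the tun device node only if it is not already there.

# ensure_tun [DEV_ROOT]   (DEV_ROOT defaults to /dev)
ensure_tun() {
    dev_root="${1:-/dev}"
    # -p makes mkdir succeed when the directory already exists
    mkdir -p "$dev_root/net" || return 1
    # nothing to do if the character device is already present
    if [ -c "$dev_root/net/tun" ]; then
        return 0
    fi
    # same mode and major/minor numbers as in the unit above
    mknod -m 666 "$dev_root/net/tun" c 10 200
}
```

The unit would then call a script containing this function followed by a
bare "ensure_tun" line from a single ExecStart. It still needs the mknod
capability inside the container, same as before.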

Hope that helps.
With best regards Joerg






