[lxc-users] sshd-keygen fails during container boot

Mon Dec 7 22:23:12 UTC 2015

On 12/07/2015 07:49 AM, Serge Hallyn wrote:
> Quoting Peter Steele (pwsteele at gmail.com):
>> I'm actually not (yet) running lxcfs. My understanding was that it
>> isn't absolutely required but it does offer several benefits. I'd
>> planned to tackle lxcfs after getting things running without it
>> first, and my containers appear to be running reasonably okay
>> (although I am seeing some odd behavior). If it *is* needed, then
>> I guess it's time to tackle lxcfs.
> Oh right, since you're running your containers unconfined, you should
> be fine without it.
>
> Still would be useful to see systemd's output :)
>
I'm having some trouble reproducing this behavior when I launch a 
container with

lxc-start -n containername -F -o /dev/stdout -- /sbin/init log_target=console log_level=debug

I'm seeing these errors when the containers are launched as part of my 
automated installation framework. The installs that are done in this 
manner are hands-free and the containers, once installed, are launched 
with a simple

lxc-start -n containername

Multiple containers are created and launched in parallel as part of this 
installation system. This is all controlled through automation (Python), 
and the framework expects the launched container processes to become 
daemons. I don't get the detailed startup output that happens when
lxc-start is run in the foreground. If I create a container
interactively and launch it manually with the -F option, there are no
errors in /var/log/messages.
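
One way to keep the daemonized launch but still collect a log per container is to pass lxc-start a log file and log priority. A minimal sketch, assuming lxc-start's -d, -o and -l options; the helper names and the log directory are hypothetical, and the file produced is LXC's own debug log, not systemd's console output from inside the container:

```python
import subprocess
from pathlib import Path

# Hypothetical directory for per-container LXC logs.
LOG_DIR = Path("/var/log/lxc-debug")


def build_start_command(name, log_dir=LOG_DIR):
    """Return the argv for a daemonized lxc-start that writes a DEBUG log."""
    return [
        "lxc-start",
        "-n", name,                                 # container name
        "-d",                                       # daemonize, as the framework expects
        "-o", str(Path(log_dir) / f"{name}.log"),   # LXC log file for this container
        "-l", "DEBUG",                              # LXC log priority
    ]


def start_container(name, log_dir=LOG_DIR):
    """Create the log directory, launch the container, return the exit code."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(build_start_command(name, log_dir), check=False).returncode
```

Calling start_container("vm-00") for each container from the parallel workers would leave one log file per container to compare between a failing automated run and a clean manual one.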

The errors I'm seeing are not limited to the sshd-keygen errors I
originally reported. In further tests since then I have seen similar
errors reported for a variety of other units. For example:

Dec  7 13:52:00 pws-vm-00 systemd: Failed at step CGROUP spawning /usr/bin/kmod: No such file or directory
Dec  7 13:52:00 pws-vm-00 systemd: Mounted Huge Pages File System.
Dec  7 13:52:00 pws-vm-00 systemd: kmod-static-nodes.service: main process exited, code=exited, status=219/CGROUP
Dec  7 13:52:00 pws-vm-00 systemd: Failed to start Create list of required static device nodes for the current kernel.
Dec  7 13:52:00 pws-vm-00 systemd: Unit kmod-static-nodes.service entered failed state.

Dec  7 13:52:01 pws-vm-00 systemd: Failed at step CGROUP spawning /etc/rc.d/init.d/jexec: No such file or directory
Dec  7 13:52:01 pws-vm-00 systemd: jexec.service: control process exited, code=exited status=219
Dec  7 13:52:01 pws-vm-00 systemd: Failed to start LSB: Supports the direct execution of binary formats..
Dec  7 13:52:01 pws-vm-00 systemd: Unit jexec.service entered failed state.
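
To compare runs, the spawn failures can be pulled out of the log with a short script. A minimal sketch, assuming each syslog entry occupies a single line in the log file itself and follows the "Failed at step ... spawning ..." wording shown here; the function name is hypothetical:

```python
import re

# Matches systemd's "Failed at step <STEP> spawning <path>: <reason>" entries.
SPAWN_FAILURE = re.compile(
    r"systemd: Failed at step (?P<step>\S+) spawning (?P<path>\S+): (?P<reason>.+)$"
)


def find_spawn_failures(lines):
    """Return (step, path, reason) tuples for every spawn failure in lines."""
    failures = []
    for line in lines:
        match = SPAWN_FAILURE.search(line.rstrip("\n"))
        if match:
            failures.append((match["step"], match["path"], match["reason"]))
    return failures
```

Running it over the container's /var/log/messages, for example find_spawn_failures(open("/var/log/messages")), lists which executables failed at the CGROUP step in a given boot.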

I've also seen errors reported for network.service and others. In every
case, if I reboot my server, forcing the hosted containers to be
restarted, everything comes up fine. If I run the same installation
framework using the same container images but use the libvirt-lxc
create/start commands instead, I do not see these systemd errors. To the
best of my knowledge, the LXC config I'm using more or less matches the
config I'm using with my libvirt containers. For example:

# cat /var/lib/lxc/vm-00/config
lxc.utsname = pws-vm-00
lxc.include = /var/lib/hf/lxc.conf
lxc.network.veth.pair = vm-00
lxc.network.hwaddr = fe:d6:e8:96:7e:2d
lxc.rootfs = /hf/cs/vm-00/rootfs
lxc.cgroup.memory.limit_in_bytes = 1073741824
lxc.cgroup.memory.memsw.limit_in_bytes = 2147483648
lxc.hook.autodev = /var/lib/lxc/vm-00/autodev
lxc.cgroup.devices.allow = b 8:3 rwm

# cat /var/lib/hf/lxc.conf
lxc.include = /usr/share/lxc/config/centos.common.conf
lxc.arch = x86_64

# Networking defaults
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0

Am I missing anything in my config? I am running LXC version 1.1.5 on
a CentOS host. All containers are also CentOS and are privileged.

Peter