[lxc-devel] lxc-1.0.0 reboot error

Vitaly Lavrov vel21ripn at gmail.com
Wed Feb 26 07:55:03 UTC 2014


On 25.02.2014 22:54, Serge Hallyn wrote:
> Quoting Vitaly Lavrov (vel21ripn at gmail.com):
>> On 23.02.2014 03:36, Stéphane Graber wrote:
>>> Hi,
>>>
>>> Thanks for your patch.
>>>
>>> Can I just ask you to sign it off? (Signed-off-by: Name <email>)
>> Hi!
>>
>> I found the source of the container reboot problem, but I do not know how best to fix it.
>> There is a race condition between the teardown of the old container and the creation of the
>> network interfaces for the new container. Inserting usleep(100000) before lxc_delete_network()
>> solves the reboot problem, but it is a bad way to do it.
>>
>> How can we wait until the old container has completely finished?
>
> How exactly are you doing the test?  just script
>
> 	lxc-start;
> 	lxc-stop;
> 	lxc-start;
lxc-stop -rn container


kernel 3.12.8

----lxc.config---
lxc.utsname = w8c
lxc.network.type = vlan
lxc.network.link = eth3
lxc.network.vlan.id = 125
lxc.network.ipv4 = 10.200.4.19/24
lxc.network.name = v125
lxc.network.flags = up

lxc.network.type = phys
lxc.network.link = eth2
lxc.network.name = eth6
lxc.tty = 4
lxc.pts = 1
lxc.rootfs = /LXC/w8c/root
lxc.mount.entry=proc /proc proc none defaults 0 0
lxc.mount.entry=ptsfs /dev/pts devpts mode=0644  0 0
lxc.mount.entry=shmfs /dev/shm tmpfs mode=0644  0 0
lxc.mount.entry=sysfs /sys sysfs defaults  0 0
lxc.mount.entry=tmpfs /tmpfs tmpfs defaults,size=128m  0 0
lxc.cap.drop = sys_module
----log---
lxc-start 1393396945.294 INFO     lxc_start_ui - using rcfile /var/lib/lxc/w8c/config
lxc-start 1393396945.294 DEBUG    lxc_confile - add ipv4 0 10.200.4.19/24 brd none
lxc-start 1393396945.294 WARN     lxc_log - lxc_log_init called with log already initialized
lxc-start 1393396945.295 DEBUG    lxc_conf - allocated pty '/dev/pts/8' (5/6)
lxc-start 1393396945.295 DEBUG    lxc_conf - allocated pty '/dev/pts/9' (7/8)
lxc-start 1393396945.295 DEBUG    lxc_conf - allocated pty '/dev/pts/10' (9/10)
lxc-start 1393396945.295 DEBUG    lxc_conf - allocated pty '/dev/pts/11' (11/12)
lxc-start 1393396945.295 INFO     lxc_conf - tty's configured
lxc-start 1393396945.295 DEBUG    lxc_start - sigchild handler set
lxc-start 1393396945.295 DEBUG    lxc_console - opening /dev/tty for console peer
lxc-start 1393396945.295 DEBUG    lxc_console - using '/dev/tty' as console
lxc-start 1393396945.295 DEBUG    lxc_console - 27484 got SIGWINCH fd 17
lxc-start 1393396945.295 DEBUG    lxc_console - set winsz dstfd:14 cols:96 rows:33
lxc-start 1393396945.295 DEBUG    lxc_console - using '/root/lxc.vlan1.log' as console log
lxc-start 1393396945.295 INFO     lxc_start - 'w8c' is initialized
lxc-start 1393396945.304 DEBUG    lxc_start - Not dropping cap_sys_boot or watching utmp
lxc-start 1393396945.305 DEBUG    lxc_conf - instanciated vlan 'vlan125', ifindex is '50'
lxc-start 1393396945.305 INFO     lxc_start - stored saved_nic #0 idx 3 name eth2
lxc-start 1393396945.305 INFO     lxc_cgroup - cgroup driver cgroupfs initing for w8c
lxc-start 1393396945.307 DEBUG    lxc_cgfs - cgroup 'memory.limit_in_bytes' set to '2048M'
lxc-start 1393396945.307 DEBUG    lxc_cgfs - cgroup 'cpuset.cpus' set to '0-3'
lxc-start 1393396945.307 DEBUG    lxc_cgfs - cgroup 'cpuset.mems' set to '0'
lxc-start 1393396945.307 INFO     lxc_cgfs - cgroup has been setup
lxc-start 1393396945.316 DEBUG    lxc_conf - move 'v125' to '27486'
lxc-start 1393396945.326 DEBUG    lxc_conf - move 'eth6' to '27486'
lxc-start 1393396945.326 INFO     lxc_conf - 'w8c' hostname has been setup
lxc-start 1393396945.336 DEBUG    lxc_conf - 'v125' has been setup
lxc-start 1393396945.340 DEBUG    lxc_conf - 'eth6' has been setup
lxc-start 1393396945.340 INFO     lxc_conf - network has been setup
...
lxc-start 1393396945.349 NOTICE   lxc_start - exec'ing '/sbin/init'
lxc-start 1393396945.349 NOTICE   lxc_start - '/sbin/init' started with pid '27486'
lxc-start 1393396945.349 WARN     lxc_start - invalid pid for SIGCHLD
...
<run lxc-stop -rn w8c>
lxc-start 1393396955.264 DEBUG    lxc_commands - peer has disconnected
lxc-start 1393396956.264 DEBUG    lxc_commands - peer has disconnected
lxc-start 1393396957.264 DEBUG    lxc_commands - peer has disconnected
lxc-start 1393396958.150 DEBUG    lxc_start - container init process exited
lxc-start 1393396958.150 DEBUG    lxc_start - Container rebooting
lxc-start 1393396958.150 INFO     lxc_error - child <27486> ended on signal (1)
lxc-start 1393396958.150 INFO     lxc_conf - running to reset 1 nic names
lxc-start 1393396958.150 INFO     lxc_conf - resetting nic 3 to eth2
lxc-start 1393396958.150 DEBUG    lxc_conf - Delete interface 'v125'
lxc-start 1393396958.151 WARN     lxc_conf - failed to remove interface 'v125'
(1)^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^error code: ENODEV
lxc-start 1393396958.151 WARN     lxc_conf - failed to rename to the initial name the netdev 'eth2'
lxc-start 1393396958.151 INFO     lxc_container - container requested reboot
lxc-start 1393396958.151 DEBUG    lxc_conf - allocated pty '/dev/pts/8' (5/6)
lxc-start 1393396958.151 DEBUG    lxc_conf - allocated pty '/dev/pts/9' (7/8)
lxc-start 1393396958.151 DEBUG    lxc_conf - allocated pty '/dev/pts/10' (9/10)
lxc-start 1393396958.151 DEBUG    lxc_conf - allocated pty '/dev/pts/11' (11/12)
lxc-start 1393396958.151 INFO     lxc_conf - tty's configured
lxc-start 1393396958.151 DEBUG    lxc_start - sigchild handler set
lxc-start 1393396958.151 DEBUG    lxc_console - opening /dev/tty for console peer
lxc-start 1393396958.151 DEBUG    lxc_console - using '/dev/tty' as console
lxc-start 1393396958.151 DEBUG    lxc_console - 27484 got SIGWINCH fd 20
lxc-start 1393396958.151 DEBUG    lxc_console - set winsz dstfd:15 cols:96 rows:33
lxc-start 1393396958.151 DEBUG    lxc_console - using '/root/lxc.vlan1.log' as console log
lxc-start 1393396958.151 INFO     lxc_start - 'w8c' is initialized
lxc-start 1393396958.161 DEBUG    lxc_start - Not dropping cap_sys_boot or watching utmp
lxc-start 1393396958.161 ERROR    lxc_conf - failed to create vlan interface 'vlan125' on 'eth3' : File exists
(2)^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
lxc-start 1393396958.161 ERROR    lxc_conf - failed to create netdev
lxc-start 1393396958.161 ERROR    lxc_start - failed to create the network
lxc-start 1393396958.161 ERROR    lxc_start - failed to spawn 'w8c'
----
(1) The error when deleting the vlan interface always occurs (with veth interfaces as well).

(2) A very strange error! The interface cannot be removed because it no longer exists,
yet it cannot be created because it is still there. A mystery!

I added a delay, usleep(10000), in __lxc_start() in start.c, after the lines:

while (waitpid(handler->pid, &status, 0) < 0 && errno == EINTR)
	continue;

With that delay, the vlan interface was not deleted, and restarting the container
was only possible after removing the vlan manually (vconfig rem vlan125).
Increasing the delay to 50 ms got rid of this error.

The problem with restarting the container appears on heavily loaded systems.
On an idle system the error occurs rarely.

So far I do not know of any other way to fix this error. I need help.
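
One possible alternative to a fixed usleep(), offered only as an untested sketch: retry the interface creation a bounded number of times while it keeps failing with "File exists" (EEXIST), and give up on any other error. The names retry_on_eexist, create_fn, and fake_create are hypothetical and not part of LXC. The real vlan creation call would have to be wrapped to match the callback signature assumed here (0 on success, negative errno on failure).

```c
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*create_fn)(void *arg);

/*
 * Call create(arg) up to max_tries times, sleeping delay_ms between
 * attempts, but only while it keeps failing with -EEXIST. Any other
 * result (success or a different error) is returned immediately.
 */
int retry_on_eexist(create_fn create, void *arg, int max_tries, int delay_ms)
{
	struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
	int ret = -EINVAL;
	int i;

	for (i = 0; i < max_tries; i++) {
		ret = create(arg);
		if (ret != -EEXIST)
			return ret;
		/* Do not sleep after the final attempt. */
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);
	}
	return ret;
}

/*
 * Stand-in for the real interface creation, for demonstration only:
 * fails with -EEXIST until the counter in *arg reaches zero.
 */
int fake_create(void *arg)
{
	int *busy = arg;

	if (*busy > 0) {
		(*busy)--;
		return -EEXIST;
	}
	return 0;
}
```

A bounded retry still waits, but it adapts to the load: on an idle system the first attempt usually succeeds, while on a busy system it keeps trying for up to max_tries * delay_ms before reporting the failure.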

