[lxc-devel] failed to create netdev

Daniel Lezcano daniel.lezcano at free.fr
Wed Apr 6 11:31:52 UTC 2011


On 04/02/2011 11:18 PM, Jean-Philippe Menil wrote:
> Hi,
>
> I experienced some strange problems when restarting a container.
> Sometimes it seems that the veth is not fully released on stop, and
> then the container fails to restart with the following log
> (in this case, the container has the same name as its veth.pair, but
> that doesn't matter):
>
>         lxc-start 1301749565.517 INFO     lxc_start - 'cache2-crous' is initialized
>          lxc-wait 1301749565.517 DEBUG    lxc_cgroup - using cgroup mounted at '/var/local/cgroup'
>         lxc-start 1301749565.517 ERROR    lxc_conf - failed to create cache2-crous-veth2vrehs : File exists
>         lxc-start 1301749565.517 ERROR    lxc_conf - failed to create netdev
>         lxc-start 1301749565.517 ERROR    lxc_start - failed to create the network
>         lxc-start 1301749565.517 ERROR    lxc_start - failed to spawn 'cache2-crous'
>
> I can see the veth when I do an "ifconfig -a".
> I can't destroy it with "tunctl -d cache2-crous".
> I can see it in /sys/devices/virtual/net/cache2-crous/
>
> I must wait 2 or 3 minutes before I can restart the
> container.
>
> Here is the network config relevant to this container:
> lxc.network.type = veth
> lxc.network.flags = up
> lxc.network.link = DMZ-CITE-U
> lxc.network.name = eth0
> lxc.network.mtu = 1500
> lxc.network.hwaddr = de:ad:be:ed:03:01
> lxc.network.veth.pair = cache2-crous
>
> DMZ-CITE-U is a bridge with a tagged interface attached to it (eth1.xx)
>
> lxc is the 0.73 package, kernel is 2.6.37.2
>
> It's related to the network namespace, and I haven't tested with 2.6.38.
>
> Is anybody seeing the same behaviour?
> Can someone point me in the right direction to solve it?

The problem you are facing comes from the life cycle of the network
namespace. Most likely you have a socket in a FIN_WAIT state holding a
reference on the network namespace, which keeps the namespace alive
until the socket is destroyed when its timer expires. When the last
reference is released, the network namespace is destroyed along with
its virtual network interfaces.
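
One way to look for such sockets is to count TCP sockets by state. The
following is only a minimal Python sketch and not part of lxc. It parses
text in the /proc/net/tcp format, where the fourth column is the
hexadecimal socket state (04 is FIN_WAIT1, 05 is FIN_WAIT2). The name
count_tcp_states is made up for illustration. Note that /proc/net/tcp
only shows the network namespace of the reading process, so this assumes
the text was read from inside the container while a process is still
running there.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state (hypothetical helper, for illustration).

    proc_net_tcp_text is the full content of a file in /proc/net/tcp format.
    """
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()
        # Unknown codes are kept as-is rather than guessed.
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

Calling count_tcp_states(open("/proc/net/tcp").read()) inside the
container shortly before stopping it shows whether any connections are
sitting in FIN_WAIT1 or FIN_WAIT2.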

We cannot wipe out the sockets and force the network namespace to be
destroyed, because the kernel buffers the data to be sent while the
application thinks everything was sent. Take the scenario of an
application container sending data through the network: when the last
sendmsg is done the application exits, hence the container exits, but
we must not destroy the network until the last packet is sent to the
peer.

This behavior guarantees that all the packets are received by the peer.
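
Since the namespace cannot be torn down early, a restart script can
simply wait until the old veth has disappeared before starting the
container again. The following is a minimal Python sketch under that
assumption, not something lxc provides. The name wait_for_iface_gone
and its parameters are made up for illustration; it relies only on the
fact that every network interface visible on the host has an entry
under /sys/class/net.

```python
import os
import time


def wait_for_iface_gone(name, timeout=300.0, poll=1.0,
                        sysfs_net="/sys/class/net"):
    """Wait until the interface `name` no longer exists on the host.

    Hypothetical helper: returns True once the sysfs entry is gone,
    or False if it is still present after `timeout` seconds.
    """
    path = os.path.join(sysfs_net, name)
    deadline = time.monotonic() + timeout
    # Entries in /sys/class/net are symlinks, so use lexists.
    while os.path.lexists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

With the configuration quoted above, a wrapper would call
wait_for_iface_gone("cache2-crous") and only run lxc-start for the
container once it returns True. The default timeout of 300 seconds is an
arbitrary choice meant to cover the 2 or 3 minutes reported.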



