[Lxc-users] Bug discussion: implementing high virtual device MAC addresses

Tue Oct 25 02:31:20 UTC 2011

Quoting Derek Simkowiak (derek at simkowiak.net):
>     Serge,
>     Thank you for looking at this.
> 
> Serge> /However, I actually don't think it should happen the way you
> describe./
> 
>     I believe you have mis-read my description.  I think we are
> actually in agreement with what is happening.

You're right :)

>     You said:
> 
> Serge> /So the mac address of the veth endpoint in the container
> should not matter./
> 
>     I think that is the same thing that I said:
> 
> Derek> [The problem MAC address] is NOT the mac address specified in
> lxc.conf, like this:
> 
> 
> lxc.network.hwaddr = fe:16:3e:fd:5a:5b

Ah, right!

>     That MAC address has nothing to do with the bug; the host's bridge
> device (br0) will never assume a configured LXC MAC address as its own.
> 
> 
>     Also, you said:
> 
> Serge> /The other endpoint, the veth which stays in the host's
> network namespace, that is the one which gets placed on the bridge./
> 
>     I agree, that is the address which causes the ~4 second network
> freeze.  As I said in my original description:
> 
> Derek>> ...the MAC address in question is the one of the virtual
> vethXXXX device, as shown with "ifconfig" on the host:
> 
> 
> veth0IEDlk Link encap:Ethernet  HWaddr 4e:34:7c:dc:92:e8
> [...snip...]
> 
> 
>     So, are we in agreement that the problem address is NOT the one
> in the LXC .conf file (as specified by the user), but instead is the
> "random" address of the veth device on the host?

Yes.  So I think it's worth following up.

> Serge> /Hmm, I haven't seen this happen at all./
> 
>     I have seen it on Ubuntu 10.04, and there was an independent
> description of the same symptom (and a different but very similar
> work-around) filed in SourceForge here:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=3411497&group_id=163076&atid=826303
> 
>     (That's SF bug ID# 3411497.)
> 
>     As described in the libvirt bugfix for this issue (linked
> below), the reason some people see it and some people don't is that
> it only happens when the veth MAC address is lower than that of the

Right - I do remember when it came up in libvirt.  Come to think of it,
the reason I don't see it much is that I don't, very often, bridge the
container nic and host nic together.  But obviously for *real* people
(not fake ppl like me) that's a very important use case.

> physical eth0 device's MAC address.  (That is how the Linux kernel
> handles it, by design.  I don't know why.)
> 
>     Since the MAC address is randomly chosen, it is a random symptom
> that will vary from one NIC to another.  Those who happen to have a
> high MAC address for eth0 will see it more frequently (but still
> randomly.)  This is a major impact on production systems, where a
> ~4 second network freeze could trigger admin alerts and/or failover
> scripts.  (Note the exact duration of the network freeze also
> depends on your switches and routers, and how they handle ARP
> caching.)

Yup.  I think you should proceed with a patch.  Patch the function
instanciate_veth() in src/lxc/conf.c to set the hwaddr on veth1 after
lxc_veth_create() but before the call to lxc_bridge_attach().
src/lxc/conf.c:setup_hw_addr() shows how to go about setting a mac
address.  You'll presumably want to only set the first byte, leaving
the rest random.  Libvirt used 0xFE.  It did a SIOCGIFHWADDR ioctl to
get the mac address, overwrote the first byte with 0xFE, then
did SIOCSIFHWADDR to set the tweaked address.

Thanks!

-serge