[Lxc-users] Slow response times (at least, from the LAN) to LXC containers

Michael B. Trausch mike at trausch.us
Mon Mar 15 20:53:14 UTC 2010


On 03/15/2010 01:05 PM, Brian K. White wrote:
> I have 2 cable accounts at my office just like you describe,
> 5 static IPs provided by a cable-company-supplied router with a built-in
> 4-port switch. But I can't exactly replicate your setup because:
>
> 1) The router and cable modem are separate boxes. The router part is a
> Cisco 800 series router with one WAN NIC and one LAN NIC and 4 bridged
> LAN ports. This means my hardware can't be the same as yours, because you
> described a single box with an integrated cable modem. So, being not
> identical, my hardware may not behave exactly as yours does, and it may
> not be an LXC problem, merely a problem LXC tickles.

Yes, it is an SMC box.  I don't have the model number off-hand, but it's 
essentially the same device as the SMC-8014, which (other than being 
highly sensitive to slight fluctuations in power input) works rather 
well.  That said, the device should not have anything to do with 
it---two nodes on an IP network that are on the same physical segment 
talk directly to each other, not through the gateway.

> 2) More relevant, those Ciscos aren't doing any NAT for me. I treat the
> LAN ports on those routers as part of the Internet, and they are only
> connected to NICs with public IPs in the particular range of each
> particular router. No connections to NICs or switches that connect to
> any other NICs having any IPs outside that range.

I don't think that this should be an issue either, since again, we're 
talking about all systems on the same physical segment.  After 5 PM, I 
can disconnect the router and test to be sure, but I am certain that the 
problem communicating with the LXC containers will persist after it 
is removed and that I will still be able to communicate with the 
hardware nodes that have IP addresses in the 173.15.213.184/29 range 
just fine.  That is what I would expect given the relevant standards; 
the only purpose that the device serves is to be a gateway to the 
Internet for both IP network numbers.

> I have a few LXC boxes set up here, and one of them is on one of these
> cable lines. But both the LXC host and the containers within it all have
> public IPs from the same 5-address pool of usable addresses for that
> router. The host and the containers do also have private LAN IPs, but
> they are all on a separate NIC on the host, and that NIC connects to a
> separate switch. Even though that NAT traffic does happen to ultimately
> go back out via one of the public IPs on that same cable line, it does
> so via a separate physical network and NAT router, which happens to be
> another Linux box with 2 NICs, one strictly private and one public,
> directly connected to one of the LAN ports on the Cisco.

That would be one way to set things up; however, such a setup is out of 
my reach.  This setup meets my needs (except for this issue that I am 
having here) and has for two years now.

> Perhaps LXC does miss a beat somewhere with that network, or perhaps
> it's the router, but I think this kind of mystery problem is exactly why
> I "just don't do that". I know it's technically "legal" and I'd do it if
> I had a reason to some time, but where possible I don't mix IP ranges on
> a physical network, or at least not within a VLAN. In particular, I
> avoid potential routing ambiguity such as having LAN and WAN traffic on
> the same physical net where both would end up routing, for different
> reasons, to the same gateway device or NIC. That's just begging for
> problems.

There should not be any routing ambiguity here.  ARP resolution works 
for all of the hardware nodes just fine, and connections between 
real hardware systems with real Ethernet cards work perfectly regardless 
of the operating system running on that real hardware.

That is, no routing is necessary (on _this_ network) to go from 
172.16.0.30 to 173.15.213.185, nor the inverse.  All systems are aware 
of the fact that 172.16.0.0/24 and 173.15.213.184/29 are on the local link.
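
To make the point concrete, here is a toy sketch of that forwarding 
decision in Python (not what the kernel literally runs, just the same 
logic, using the two prefixes on this network):

  import ipaddress

  # Prefixes configured directly on the hosts attached to this segment.
  ON_LINK = [ipaddress.ip_network("172.16.0.0/24"),
             ipaddress.ip_network("173.15.213.184/29")]

  def next_hop(dst):
      # If the destination falls inside a directly connected prefix,
      # deliver it on the link (ARP for the MAC address); otherwise
      # hand it to the default gateway.
      addr = ipaddress.ip_address(dst)
      for net in ON_LINK:
          if addr in net:
              return "on-link (resolved via ARP, no router involved)"
      return "via the default gateway"

  print(next_hop("173.15.213.185"))  # on-link
  print(next_hop("172.16.0.30"))     # on-link
  print(next_hop("8.8.8.8"))         # via the default gateway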

As I have previously mentioned, this setup works with all of the following:

  * Real hardware with real Ethernet cards,
  * KVM virtual machines attached via a bridge interface,
  * VirtualBox virtual machines attached via a bridge interface,
  * QEMU virtual machines attached via a bridge interface, and
  * OpenVZ container instances attached via a bridge interface.
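
As a quick sanity check that the LXC veth ends really are enslaved to 
the same bridge as everything else, something like the following can be 
run on the host (a rough sketch; the bridge name br0 is an assumption on 
my part, and it relies on the usual sysfs layout):

  import os

  BRIDGE = "br0"  # assumption: the bridge the guests are attached to

  def bridge_ports(bridge):
      # The kernel exposes every enslaved port as an entry under
      # /sys/class/net/<bridge>/brif/.
      return sorted(os.listdir(os.path.join("/sys/class/net", bridge, "brif")))

  # Expect the physical NIC plus one veth per running container or VM.
  for port in bridge_ports(BRIDGE):
      print(port)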

The only things I cannot get to work reliably are these containers 
running under LXC; I therefore do not have any probable cause to "blame 
the network", nor do I have any evidence that any system on this network 
is failing to adhere to some standard or specification correctly.  If I 
did, I'd be trying to find it and fix it.  And believe me, if I had an 
excuse to get rid of this SMC appliance that they have on this network, 
I would take advantage of it---I'd love to give its IP address to a 
Linux box that I control to do the IPv4 NAT routing (and then I would 
not have to do my IPv6 routing from within one of my containers nor give 
up a second address for that).

My first clue that there was something amiss with LXC was, in fact, 
IPv6.  Now, OpenVZ is not capable of running tunnels inside containers, 
so I cannot compare against that.  When I was running OpenVZ containers, I 
used a KVM instance for my IPv6 routing.  Note that in that case, IPv6 
forwarding did _not_ need to be enabled on the host system.  However, 
for an LXC container to be able to run an IPv6 tunnel and communicate 
using IPv6 with the LAN, I had to enable IPv6 forwarding not only in the 
container, but on the host system as well.  This tells me that there is 
not a complete separation of the interfaces, and that there is something 
bleeding around the edges outside of the bridge.  I have not yet had the 
time to actually take a look at the code to confirm this suspicion of 
mine, but it is the only rational explanation that I can come up with.
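
For reference, the forwarding knobs I am talking about are the standard 
procfs ones; a minimal sketch that just dumps them on the host (the same 
paths exist inside the container, assuming /proc is mounted there, and 
exactly which per-interface entry matters is part of what I would like 
to pin down):

  # Dump the IPv6 forwarding sysctls.  With OpenVZ plus a KVM router
  # these did not need to be enabled on the host; with LXC I had to
  # enable forwarding on the host as well before the container's
  # tunnel could talk IPv6 to the LAN.
  def read_sysctl(path):
      with open(path) as f:
          return f.read().strip()

  for knob in ("/proc/sys/net/ipv6/conf/all/forwarding",
               "/proc/sys/net/ipv6/conf/default/forwarding"):
      print(knob, "=", read_sysctl(knob))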

Moving from there to the problem at hand, I can only come to the 
conclusion that there is some obscure bug somewhere in LXC or the 
modifications to the networking stack that serve LXC that needs to be 
hammered out.  I'd absolutely _love_ to be able to rip up my network and 
bring it to someone who knows the kernel code and LXC code well enough 
to draw concrete conclusions from it.

	--- Mike

-- 
Michael B. Trausch                                    ☎ (404) 492-6475



