[Lxc-users] Slow response times (at least, from the LAN) to LXC containers

Mon Mar 15 15:29:32 UTC 2010

On Sun, Mar 14, 2010 at 6:02 PM, Daniel Lezcano <daniel.lezcano at free.fr> wrote:
> Michael B. Trausch wrote:
>>
>> Incorrect; I will try to explain again, I was very likely not clear
>> before.
>>
>> The device is a combination cable modem and switch provided by the ISP.
>>  The input to the device is a coax wire, and then it has four gigabit
>> Ethernet ports that are bridged together.  I have the IP network
>> 173.15.213.184/29 from my ISP, and the device takes from that address space
>> a single address for itself, 173.15.213.190, which is the address that it
>> uses to NAT any traffic from the LAN that uses it for a gateway on the LAN's
>> private network address space.
>>
>> This means that there are two address spaces that are valid on my network;
>> I can use either the IP addresses in the range of
>> 173.15.213.185--173.15.213.189 or I can use 172.16.0.0/24.  The device is on
>> the network with two IP addresses, 173.15.213.190 (which is reachable from
>> the public Internet, and is the gateway for the systems on my LAN that are
>> using addresses in the 173.15.213.184/29 subnet) and 172.16.0.1 (which is
>> the default gateway for all of the systems that have IP addresses in the
>> 172.16.0.0/24 subnet).
>
> Oh, very interesting; I didn't know an ISP could provide such a box. That means
> you can have one of your hosts directly available on the Internet without
> routing, right?

I can have up to 5, as I have the entire IP address range from
173.15.213.185 to 173.15.213.189 inclusive.
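
For anyone who wants to check the count, here is a minimal sketch using
Python's standard ipaddress module. The variable names are only
illustrative; the prefix and the appliance's address are the ones given
above.

```python
import ipaddress

# The /29 delegated by the ISP, as described above.
public_net = ipaddress.ip_network("173.15.213.184/29")

# The address the ISP's appliance takes for itself.
gateway = ipaddress.ip_address("173.15.213.190")

# hosts() omits the network (.184) and broadcast (.191) addresses.
usable = [addr for addr in public_net.hosts() if addr != gateway]

for addr in usable:
    print(addr)
print(len(usable), "addresses left for machines on the LAN")
```

This lists 173.15.213.185 through 173.15.213.189.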

>> In short, the device itself has two IP addresses on a single bridge
>> interface and how it handles the packets depends on the IP of the internal
>> machine trying to use it as a gateway.  It is also a black box insofar as I
>> can interact with it; I do not control its software nor am I able to make
>> any configuration changes to its IP stack other than things like port
>> forwarding and the like (I do not even have the ability to do protocol
>> forwarding via the NAT, which is why I have a Linux box running in a
>> container that does my IPv6 routing, and if I had to do any complex things
>> with NAT and protocol forwarding, I would need to suck up a second global
>> IPv4 address and NAT through it instead, probably on a second IPv4 RFC1918
>> subnet).
>>
>>> 2 - you have a host 'saffron' with the ip address 172.16.0.2/24 (ip
>>> forwarding is set and nat is enabled), right ?
>>
>> NAT routing is handled by the router at 172.16.0.1/24, which also has the
>> IP address 173.15.213.190 on the 173.15.213.184/29 subnet.
>>
>>> 3 - you have in this host a container 'spicerack' with two virtualized
>>> interfaces, the first one has 173.15.213.185 set and the second one has
>>> 172.16.0.3/24 set, right ?
>>
>> This is correct.
>>
>>> 4 - you have another host plugged on the lan called 'fennel', when you
>>> ping the container from this host, you receive an icmp redirect and
>>> packets are lost, right ?
>>
>> When I ping any of the 173.15.213.184/29 IP addresses from any host on my
>> network that has a 172.16.0.0/24 IP address, I receive an ICMP redirect from
>> 172.16.0.1, when the containers are running inside of LXC.  That would be
>> the issue as simply as I can state it.
>
> OK, I think I now understand the topology of your network.
>
> IMO, the following is happening:
>
> * 'fennel' pings 173.15.213.185, but this IP address does not belong to its
> network, so it sends the packet to the default gateway, 172.16.0.1.
>
> * The router sees a packet for 173.15.213.185 and knows this IP address
> is on the same LAN as 172.16.0.0/24. It assumes 'fennel' is using a suboptimal
> route, because 'fennel' can send the packet directly to 173.15.213.185
> without being routed. It sends an ICMP_REDIRECT to 'fennel', which creates a
> new route.
>
> * 'fennel' then sends another packet. As there is now a route set by the ICMP
> redirect, the routing resolution no longer passes through the default
> gateway and an ARP request is made, which the container with the IP
> 173.15.213.185 answers.
>
> * At this point all packets are sent directly to 173.15.213.185 without
> passing through the router.
>
> Does this scenario explain why the first packets are lost? And does a
> longer ping (e.g. ping -c 30) still show the first 3 or 4 packets lost?
>
> Is it possible to show the output of a tcpdump -i any -n dst or src host
> 173.15.213.185? That would confirm such a scenario.
>
> If I am right, that should happen without a container. If a host on the LAN
> is set with an aliased IP address 173.15.213.185 and 'fennel' pings it, that
> should trigger another ICMP_REDIRECT.

However, as I have mentioned, it does not happen for global IP
addresses that aren't in an LXC container.  I cannot stress this
single fact enough.

As an example, zest.trausch.us has IP address 173.15.213.186.  It is a
real live hardware computer attached to the network.  It answers ping
without any ICMP redirects when it is alive (when it is off and not
responding, then I will get an ICMP redirect from the routing
appliance on the network).  Zest (which I have rarely used lately) was
booted up just now to confirm this again. I'm not blowing smoke when I
say that this problem is specific to the global IP addresses living
inside of the LXC containers.
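
To make the first step of the quoted scenario concrete, here is a rough
sketch of how a host with one on-link subnet and a default route picks
its next hop. It is a simplified model only, not the kernel's routing
code, and it does not model ICMP redirect handling. The address
172.16.0.10 for 'fennel' is a made-up placeholder, since the thread only
places it in 172.16.0.0/24.

```python
import ipaddress

# Hypothetical address for 'fennel'; only its subnet is known from the thread.
fennel_if = ipaddress.ip_interface("172.16.0.10/24")
default_gw = ipaddress.ip_address("172.16.0.1")


def next_hop(dst: str) -> str:
    """Describe where a host with one subnet and a default route sends dst."""
    addr = ipaddress.ip_address(dst)
    if addr in fennel_if.network:
        return f"on-link: ARP for {addr} directly"
    return f"off-link: forward via default gateway {default_gw}"


for name, dst in [
    ("spicerack", "173.15.213.185"),
    ("zest", "173.15.213.186"),
    ("spicerack (private)", "172.16.0.3"),
]:
    print(f"{name:20} {dst:16} -> {next_hop(dst)}")
```

In this model both public addresses take the same first hop through
172.16.0.1, whether the target is a container or a physical machine;
only the 172.16.0.3 address is reached directly.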

This problem gets a bit worse, thanks to the brain-deadness of
Windows: as things currently stand, Windows (at least Windows XP) can
*never* talk to the LXC hosts unless I install an IPv6 stack.  This was
never the case when the containers were in OpenVZ; there, it just
worked.  And it should just work: I am using a bridge to link the
containers to the physical Ethernet network in this house.

Now, I know I am not forgetting how network bridges work, particularly
under Linux, because I use them all the time.  When I use KVM, or
OpenVZ, or VirtualBox, and I attach those VMs to a Linux bridge
device, it is as if they are physically on the network, and no matter
what IP address(es) or how many interfaces they have, they work.

I will follow up with tcpdump information as soon as I can get it.  My
laptop is down due to a filesystem problem that I am currently fixing,
and that is the machine I need for the capture, because the other
systems on this network are either running in a degraded state or are
running Windows (same thing, really).  But I wanted to clarify the
problem here: this problem *is* LXC specific, which is why I am here,
on the LXC mailing list.  If I were having a generic networking issue,
I would have contacted my local Linux user group mailing list or a
networking expert if I couldn't sort it out on my own.

Mike Warfield, I don't know if you're paying any attention to this
thread, but if you are, do you have global addresses and a setup
similar to mine?  Are you able to confirm or deny that you can trigger
this behavior with your LXC containers as well?

   --- Mike