[Lxc-users] Slow response times (at least, from the LAN) to LXC containers

Sun Mar 14 23:02:33 UTC 2010

Michael B. Trausch wrote:
> On 03/11/2010 03:45 PM, Daniel Lezcano wrote:
>> Michael B. Trausch wrote:
>>> On 03/10/2010 12:06 PM, Daniel Lezcano wrote:
>>>> The redirect you receive means the router find an optimized route for
>>>> the packet you sent to him, so the icmp redirect will trigger the 
>>>> kernel
>>>> to create a new route for these packets. Maybe the route is not 
>>>> created
>>>> in the right container ? Can you check where is created this route ?
>>>> * ip route table show all
>>>> or
>>>> * route -Cn
>>>
>>> The routing tables are automatically setup (that is, they are setup by
>>> Debian's /etc/network/interfaces) based on the network configuration
>>> information.
>>>
>>> Here is the routing table from the spicerack.trausch.us container:
>>> mbt at spicerack:~$ ip route show all
>>> 173.15.213.184/29 dev eth0 proto kernel scope link src 173.15.213.185
>>> 172.16.0.0/24 dev eth1 proto kernel scope link src 172.16.0.3
>>> default via 173.15.213.190 dev eth0 metric 100
>>>
>>> Here is the routing table from the container's host:
>>> mbt at saffron:~$ ip route show all
>>> 172.16.0.0/24 dev br0 proto kernel scope link src 172.16.0.2
>>> 192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1
>>> default via 172.16.0.1 dev br0 metric 100
>>
>> What I would like to see is the route cache (so the ouput of "ip route
>> show table all"). The icmp redirect will create a specific route, I
>> would like to see where it is created.
>
> Ahh!  That command works.  The command that you gave earlier had the 
> words "table" and "show" transposed, so I picked the closest command 
> that I knew would output some routing information.  I had no idea that 
> there were just a few words switched.  Here is the output from "ip 
> route show table all", followed by an IPv4 ping to the troublesome 
> system, and then the "ip route show table all" command again (on 
> pastebin because it's very wide and would very likely be munged in the 
> message):
>
>   http://pastebin.com/UFuLxjtt

Thanks.

Stupid me, I gave the wrong command, it was "ip route show cache"
Sorry ...

>> Ok at this point I still have not enough information, let's summarize:
>>
>> 1 - you have a router with two nics. One is the internet side with
>> 173.15.213.190/29 and the other one is connected to the lan with
>> 172.20.0.1, right ?
>
> Incorrect; I will try to explain again, I was very likely not clear 
> before.
>
> The device is a combination cable modem and switch provided by the 
> ISP.  The input to the device is a coax wire, and then it has four 
> gigabit Ethernet ports that are bridged together.  I have the IP 
> network 173.15.213.184/29 from my ISP, and the device takes from that 
> address space a single address for itself, 173.15.213.190, which is 
> the address that it uses to NAT any traffic from the LAN that uses it 
> for a gateway on the LAN's private network address space.
>
> This means that there are two address spaces that are valid on my 
> network; I can use either the IP addresses in the range of 
> 173.15.213.185--173.15.213.189 or I can use 172.16.0.0/24.  The device 
> is on the network with two IP addresses, 173.15.213.190 (which is 
> reachable from the public Internet, and is the gateway for the systems 
> on my LAN that are using addresses in the 173.15.213.184/29 subnet) 
> and 172.16.0.1 (which is the default gateway for all of the systems 
> that have IP addresses in the 172.16.0.0/24 subnet).

Oh, very interesting I didn't know an ISP can provide such box. That 
means you can have one of your host directly available on internet 
without routing, right ?

> In short, the device itself has two IP addresses on a single bridge 
> interface and how it handles the packets depends on the IP of the 
> internal machine trying to use it as a gateway.  It is also a black 
> box insofar as I can interact with it; I do not control its software 
> nor am I able to make any configuration changes to its IP stack other 
> than things like port forwarding and the like (I do not even have the 
> ability to do protocol forwarding via the NAT, which is why I have a 
> Linux box running in a container that does my IPv6 routing, and if I 
> had to do any complex things with NAT and protocol forwarding, I would 
> need to suck up a second global IPv4 address and NAT through it 
> instead, probably on a second IPv4 RFC1918 subnet).
>
>> 2 - you have a host 'saffron' with the ip address 172.16.0.2/24 (ip
>> forwarding is set and nat is enabled), right ?
>
> NAT routing is handled by the router at 172.16.0.1/24, which also has 
> the IP address 173.15.213.190 on the 173.15.213.184/29 subnet.
>
>> 3 - you have in this host a container 'spicerack' with two virtualized
>> interfaces, the first one has 173.15.213.185 set and the second one has
>> 172.16.0.3/24 set, right ?
>
> This is correct.
>
>> 4 - you have another host plugged on the lan called 'fennel', when you
>> ping the container from this host, you receive an icmp redirect and
>> packets are lost, right ?
>
> When I ping any of the 173.15.213.184/29 IP addresses from any host on 
> my network that has a 172.16.0.0/24 IP address, I receive an ICMP 
> redirect from 172.16.0.1, when the containers are running inside of 
> LXC.  That would be the issue as simply as I can state it.
Ok, I think I understand now the topology of your network.

IMO, the following is happening:

 * 'fennel' pings  172.15.213.185, but this IP address does not belong 
to its network so it sends the packet to default gateway 172.16.0.1.

 * The router sees there is a packet for 172.15.213.185 and this IP 
address is on the same LAN as 172.16.0.0/24. It assumes there is a 
suboptimal route for 'fennel' because this one can send directly the 
packet to 175.15.213.185 wihout being routed. It sends a ICMP_REDIRECT 
to 'fennel' which creates a new route.

 * 'fennel' then resend another packet. As there is a route set by the 
icmp redirect, the routing resolution does no longer pass through the 
default gateway and an ARP request is made, then the container with the 
IP 175.15.213.185 answers.

 * As this point all packets are directly routed to 175.15.213.185 
without passing to the router.

Does this scenario explain why the first packets are lost ? And does a 
longer ping (eg ping -c 30) still show the 3 or 4 first packets lost ?

Is it possible to show the ouput of a tcpdump -i any -n dst or src host 
175.15.213.185 ? that would confirm such scenario.

If I am right, that should happen without a container. If a host on the 
LAN is set with an aliased IP address 175.15.213.185 and 'fennel' pings 
it, that should trigger another ICMP_REDIRECT.

>> - what are the ip address / routes of 'fennel' ?
>
> Fennel is configured via DHCP from 172.16.0.1, and it currently has 
> 172.16.0.33/24 with 172.16.0.1 as a default gateway.
>
> I have several containers running.  At the moment, only two of them 
> have global IP addresses---one has 173.15.213.185 and the other one 
> has 173.15.213.189.  They both are using 173.15.213.190 as the default 
> gateway.
>
> I have several other containers running which have 172.16.0.0/24 
> addresses (all handed out by DHCP; name resolution is done using 
> zeroconf as provided by avahi-daemon running on all of the 
> containers).  I can reach all of those just fine; they answer 
> immediately when I ping them or attempt to connect to services running 
> on them (such as my database server).
>
> Please let me know if there is any more information that would be 
> helpful here.  I don't know what else there is to say at the moment. 
> Given everything that I know about IPv4 networking, everything 
> _should_ be just working.  Also, as I have mentioned before, when I 
> had these containers running under OpenVZ on Proxmox VE, I did not 
> experience these issues.  I only experience them when running under 
> LXC, and only reliably can reproduce the problem when I attempt to 
> access a container on its global IP address from a system that does 
> not also have a global address.  Also of note, when I have a hardware 
> node on the network with a global address, I can ping it from a system 
> that does not have a global address.  The same is true when I run a 
> full VM under something like KVM or VirtualBox and have it attached to 
> the network using a bridged connection.
>
> I do not seem to be having any more trouble with IPv6 networking after 
> enabling IPv6 forwarding on the container host system, though that 
> doesn't make sense to me since enabling IPv6 forwarding should not be 
> necessary, since the container's virtual NIC is part of a bridge and 
> should be able to function forwarding IPv6 to and from the LAN without 
> any dependency on the hardware node's configuration.  I do not have to 
> enable IPv6 forwarding on the host when the system doing IPv6 routing 
> is running in a full virtual machine such as KVM, anyway---I would 
> expect the same assertion to be true of a container.  Is that an 
> incorrect expectation?
>
>     --- Mike
>