[Lxc-users] Bad checksums and lost packets with macvlan on dummy

Daniel Lezcano daniel.lezcano at free.fr
Tue Mar 1 13:29:12 UTC 2011


On 02/28/2011 08:45 AM, Eric Dumazet wrote:
> On Sunday, 27 February 2011 at 21:35 +0100, Daniel Lezcano wrote:
>> On 02/27/2011 08:50 PM, Eric Dumazet wrote:
>>> On Sunday, 27 February 2011 at 16:14 +0100, Daniel Lezcano wrote:
>>>> On 02/23/2011 06:13 PM, Andrian Nord wrote:
>>>>> On Mon, Feb 21, 2011 at 05:07:31PM +0100, Daniel Lezcano wrote:
>>>>>> I Cc'ed the netdev mailing list and Patrick in case my analysis is wrong
>>>>>> or incomplete.
>>>>> I can confirm that this happens only when the macvlans are on top of a
>>>>> dummy net device. With a physical interface under the macvlans there are
>>>>> no lost packets and no broken checksums.
>>>> I did some tests with a 2.6.35 kernel version and it seems the checksum
>>>> errors do not appear.
>>>> I noticed there are some changes in the dummy setup function:
>>>>
>>>>      dev->features   |= NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_TSO;
>>>>      dev->features   |= NETIF_F_NO_CSUM | NETIF_F_HIGHDMA | NETIF_F_LLTX;
>>>>
>>>>
>>>> Maybe that was introduced by commit:
>>>>
>>>> commit 6d81f41c58c69ddde497e9e640ba5805aa26e78c
>>>> Author: Eric Dumazet <eric.dumazet at gmail.com>
>>>> Date:   Mon Sep 27 20:50:33 2010 +0000
>>>>
>>>>        dummy: percpu stats and lockless xmit
>>>>
>>>>        Converts dummy network device driver to :
>>>>
>>>>        - percpu stats
>>>>
>>>>        - 64bit stats
>>>>
>>>>        - lockless xmit (NETIF_F_LLTX)
>>>>
>>>>        - performance features added (NETIF_F_SG | NETIF_F_FRAGLIST |
>>>>        NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA)
>>>>
>>>>        Signed-off-by: Eric Dumazet <eric.dumazet at gmail.com>
>>>>        Signed-off-by: David S. Miller <davem at davemloft.net>
>>>>
>>>>
>>>> Eric,
>>>>
>>>> Andrian is observing, with a couple of macvlans (in bridge mode) on top
>>>> of a dummy interface, a lot of checksum errors and packet drops.
>>>> Each macvlan is in a different network namespace and the dummy interface
>>>> is in the init_net.
>>>>
>>>> Any ideas ?
>>> Not sure I understand... I thought dummy was dropping all frames
>>> anyway ?
>>>
>>> static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
>>> {
>>>           struct pcpu_dstats *dstats = this_cpu_ptr(dev->dstats);
>>>
>>>           u64_stats_update_begin(&dstats->syncp);
>>>           dstats->tx_packets++;
>>>           dstats->tx_bytes += skb->len;
>>>           u64_stats_update_end(&dstats->syncp);
>>>
>>>           dev_kfree_skb(skb);
>>>           return NETDEV_TX_OK;
>>> }
>>>
>>>
>>> Maybe you could describe the setup ?
>> Yes, it is very simple.
>>
>> There are two network namespaces.
>>
>> macvlan1 is in network namespace 1
>> macvlan2 is in network namespace 2
>>
>> Both are in "bridge" mode, so they can communicate together.
>> The lower device is dummy0 in the init network namespace.
>>
>> IMO the problem is coming from the macvlan driver:
>>
>> dev->features = lowerdev->features & MACVLAN_FEATURES
>>
>> As dummy0 has the offloading capabilities set, the macvlan driver
>> inherits these features.
>>
>> In the normal case, dummy0 is supposed to drop the packets. But with
>> macvlan these packets are broadcast to the other macvlan ports, so no
>> checksum is computed when the packets are transmitted between macvlan1
>> and macvlan2.
> So where do frames get bad checksums ?
>
> In this "bridge" mode, I suspect the broadcast is done _before_ sending
> frame to dummy, so maybe macvlan should not inherit from lowerdev in
> this particular case ?

Hi Eric,

Yes, you are right: the packets are forwarded before the frame is sent
to the lower device.

In 'macvlan_queue_xmit', the code checks whether the device is in 'bridge'
mode. If so, it looks for a destination port for the packet and, if one
exists, calls the 'forward' callback, which is 'dev_forward_skb'.
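
To make that path easier to follow, here is a rough userspace model in C
of the behaviour described in this thread. It is only a sketch of the
hypothesis under discussion, not kernel code: every 'model_' name, the
'F_' feature bits and the one-byte "MAC" are made up for illustration,
and the assumption that the receiver counts an unfilled checksum as an
error is exactly the point still being investigated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits for this model; not the kernel's real values. */
#define F_SG      (1u << 0)
#define F_NO_CSUM (1u << 1)
#define F_TSO     (1u << 2)
#define F_LLTX    (1u << 3)

/* Stand-in for MACVLAN_FEATURES: the subset a port copies from its lower dev. */
#define MODEL_MACVLAN_FEATURES (F_SG | F_NO_CSUM | F_TSO)

enum model_verdict { MODEL_FORWARDED, MODEL_SENT_TO_LOWER };

struct model_lower {            /* plays the role of dummy0 */
    uint32_t features;
    unsigned tx_packets;
};

struct model_port {             /* plays the role of one macvlan */
    uint8_t  mac;               /* one-byte stand-in for a MAC address */
    uint32_t features;
    unsigned rx_good;
    unsigned rx_bad_csum;
};

struct model_pkt {
    uint8_t dst_mac;
    bool    csum_filled;
};

/* Mirrors: dev->features = lowerdev->features & MACVLAN_FEATURES */
static void model_port_init(struct model_port *p, uint8_t mac,
                            const struct model_lower *lower)
{
    p->mac = mac;
    p->features = lower->features & MODEL_MACVLAN_FEATURES;
    p->rx_good = 0;
    p->rx_bad_csum = 0;
}

/* Assumption: the sender skips the checksum when the device claims F_NO_CSUM. */
static struct model_pkt model_build_pkt(const struct model_port *src,
                                        uint8_t dst_mac)
{
    struct model_pkt pkt = { dst_mac, !(src->features & F_NO_CSUM) };
    return pkt;
}

/* Assumption: the receiving port verifies the checksum in software. */
static void model_receive(struct model_port *p, const struct model_pkt *pkt)
{
    if (pkt->csum_filled)
        p->rx_good++;
    else
        p->rx_bad_csum++;
}

/* Bridge mode: a frame for a sibling port never reaches the lower device. */
static enum model_verdict model_queue_xmit(struct model_port *ports,
                                           size_t nports,
                                           struct model_lower *lower,
                                           const struct model_port *src,
                                           const struct model_pkt *pkt)
{
    for (size_t i = 0; i < nports; i++) {
        if (&ports[i] != src && ports[i].mac == pkt->dst_mac) {
            model_receive(&ports[i], pkt);  /* the 'forward' step */
            return MODEL_FORWARDED;
        }
    }
    lower->tx_packets++;                    /* dummy0 would then drop it */
    return MODEL_SENT_TO_LOWER;
}
```

In this model, two ports created on a lower device that advertises
F_NO_CSUM exchange a frame without the lower device ever seeing it, and
the receiving port counts it as a bad checksum.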

I was able to reproduce the same problem with qemu and an emulated 
'e1000' card instead of dummy0. The packets are dropped too.

Patrick, do you have any suggestions on how to fix this ?

Thanks
   -- Daniel
