[lxc-users] Seg fault when using VLAN mode network
Serge Hallyn
serge.hallyn at ubuntu.com
Mon Feb 16 17:30:23 UTC 2015
Quoting Rory McCann (Rory.McCann at riverbed.com):
> Hi,
>
> On a fresh Ubuntu 14.04.1, 3.13.0-32 kernel install, with LXC 1.0.7 - LXC 1.1.0,
> I can reliably reproduce a segmentation fault with a VLAN type network.
>
> Container configuration is:
>
> lxc.utsname=ahost
> lxc.network.type=vlan
> lxc.network.vlan.id=10
> lxc.network.flags=up
> lxc.network.link=eth1
> lxc.network.ipv4=192.168.1.2/16
>
> Running
> lxc-execute -f <conf> -n test /bin/bash succeeds, and everything works pretty much as expected. However, the debug log shows:
>
>
> lxc-execute 1423836870.792 INFO lxc_lsm - lsm/lsm.c:lsm_init:48 - LSM security driver AppArmor
> lxc-execute 1423836870.792 DEBUG lxc_start - start.c:setup_signal_fd:259 - sigchild handler set
> lxc-execute 1423836870.792 INFO lxc_console - console.c:lxc_console_create:565 - no console for lxc-execute.
> lxc-execute 1423836870.792 INFO lxc_start - start.c:lxc_init:451 - 'ahost' is initialized
> lxc-execute 1423836870.793 DEBUG lxc_start - start.c:__lxc_start:1130 - Not dropping cap_sys_boot or watching utmp
> lxc-execute 1423836870.801 DEBUG lxc_conf - conf.c:instantiate_vlan:2804 - instantiated vlan ' vlan1000', ifindex is '6'
> lxc-execute 1423836870.801 INFO lxc_cgroup - cgroup.c:cgroup_init:65 - cgroup driver cgmanager initing for ahost
> lxc-execute 1423836870.813 DEBUG lxc_conf - conf.c:lxc_assign_network:3092 - move '(null)' to '1257'
> lxc-execute 1423836870.813 INFO lxc_conf - conf.c:setup_utsname:908 - 'ahost' hostname has been setup
> lxc-execute 1423836870.815 DEBUG lxc_conf - conf.c:setup_netdev:2452 - 'eth0' has been setup
> lxc-execute 1423836870.815 INFO lxc_conf - conf.c:setup_network:2473 - network has been setup
> lxc-execute 1423836870.815 INFO lxc_conf - conf.c:mount_autodev:1137 - Mounting /dev under /usr/lib/x86_64-linux-gnu/lxc
> lxc-execute 1423836870.815 WARN lxc_conf - conf.c:mount_autodev:1148 - No /dev on container rootfs.
> lxc-execute 1423836870.815 WARN lxc_conf - conf.c:mount_autodev:1149 - Proceeding without autodev setup
> lxc-execute 1423836870.815 INFO lxc_conf - conf.c:lxc_execute_bind_init:3672 - lxc.init.static bound into container at /usr/sbin/init.lxc.static
> lxc-execute 1423836870.815 INFO lxc_conf - conf.c:fill_autodev:1204 - Creating initial consoles under /usr/lib/x86_64-linux-gnu/lxc/dev
> lxc-execute 1423836870.815 DEBUG lxc_conf - conf.c:setup_caps:2145 - capabilities have been setup
> lxc-execute 1423836870.815 NOTICE lxc_conf - conf.c:lxc_setup:3927 - 'ahost' is setup.
> lxc-execute 1423836870.815 INFO lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:187 - changed apparmor profile to lxc-container-default
> lxc-execute 1423836870.815 NOTICE lxc_execute - execute.c:execute_start:90 - exec'ing '/bin/bash'
> lxc-execute 1423836870.818 NOTICE lxc_execute - execute.c:execute_post_start:104 - '/bin/bash' started with pid '1257'
> lxc-execute 1423836870.818 INFO lxc_console - console.c:lxc_console_mainloop_add:275 - no console for lxc-execute.
> lxc-execute 1423836870.818 WARN lxc_start - start.c:signal_handler:307 - invalid pid for SIGCHLD
> lxc-execute 1423836872.467 DEBUG lxc_start - start.c:signal_handler:311 - container init process exited
> lxc-execute 1423836872.467 WARN lxc_conf - conf.c:lxc_delete_network:2968 - failed to remove interface '(null)'
Ok, so I get this '(null)' message too, even with veth, but I do not get a
segfault. I'll post a patch to check for null there. I'm not sure whether
we should instead make sure netdev->name gets filled in at some point.
The netdev->name is supposed to be what comes from lxc.network.name in
your configuration file. So when you don't provide one, we don't necessarily
want to take the kernel-created name for it, because if you then write out
your configuration file, it'll try to make that name stick. So I think
we'll just paper that over here.
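
To illustrate the kind of null check meant here, a minimal sketch follows.
It is not the actual patch: struct netdev_sketch, netdev_name_or_placeholder()
and format_remove_warning() are hypothetical names made up for this example,
and the real LXC structures and logging macros differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the part of LXC's netdev data that matters here. */
struct netdev_sketch {
    const char *name; /* from lxc.network.name; NULL when the config sets none */
    int ifindex;      /* kernel interface index */
};

/* Return a printable name, never NULL, so "%s" is always given a valid string. */
static const char *netdev_name_or_placeholder(const struct netdev_sketch *nd)
{
    return (nd != NULL && nd->name != NULL) ? nd->name : "(unnamed)";
}

/* Format the cleanup warning without ever passing NULL to snprintf. */
static int format_remove_warning(char *buf, size_t len,
                                 const struct netdev_sketch *nd)
{
    return snprintf(buf, len, "failed to remove interface '%s'",
                    netdev_name_or_placeholder(nd));
}
```

The placeholder only affects what gets logged. It leaves netdev->name itself
unset, so nothing kernel-generated would end up in a saved configuration.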
Anyway, the segfault may be more related to what is apparently a kernel bug:
> Now, after exiting the container and re-running the lxc-execute command, I straightaway get a segmentation fault, and the following kernel stacktrace:
>
> [ 169.728142] ------------[ cut here ]------------
> [ 169.728151] WARNING: CPU: 0 PID: 1382 at /build/buildd/linux-3.13.0/fs/sysfs/dir.c:486 sysfs_warn_dup+0x86/0xa0()
> [ 169.728153] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:16.0/0000:0b:00.0/net/eth1/upper_vlan10-0'
> [ 169.728155] Modules linked in: 8021q garp mrp xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables ppdev coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel vmw_balloon aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw vmwgfx ttm drm vmw_vmci parport_pc i2c_piix4 shpchp lp mac_hid parport psmouse vmxnet3 mptspi mptscsih mptbase floppy
> [ 169.728204] CPU: 0 PID: 1382 Comm: lxc-execute Not tainted 3.13.0-32-generic #57-Ubuntu
I don't get this bug, and this appears to be a simple kernel bug. Does
eth1 exist?
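
One way to check that from a program is sketched below, assuming the POSIX
if_nametoindex() call from <net/if.h>. The helper name interface_exists() is
made up for this example and is not part of LXC.

```c
#include <net/if.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Report whether an interface with this name exists in the current
 * network namespace. if_nametoindex() returns 0 when no interface
 * has the given name.
 */
static bool interface_exists(const char *ifname)
{
    return ifname != NULL && if_nametoindex(ifname) != 0;
}
```

Calling interface_exists("eth1") on the host before starting the container
would show whether the link named in lxc.network.link is actually there.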
> [ 169.728205] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
> [ 169.728207] 0000000000000009 ffff88003caff5d0 ffffffff8171bcb4 ffff88003caff618
> [ 169.728209] ffff88003caff608 ffffffff810676cd ffff88003c455000 ffff88003c455000
> [ 169.728210] ffff880000018850 0000000000000000 0000000000000001 ffff88003caff668
> [ 169.728212] Call Trace:
> [ 169.728219] [<ffffffff8171bcb4>] dump_stack+0x45/0x56
> [ 169.728223] [<ffffffff810676cd>] warn_slowpath_common+0x7d/0xa0
> [ 169.728224] [<ffffffff8106773c>] warn_slowpath_fmt+0x4c/0x50
> [ 169.728226] [<ffffffff81234116>] sysfs_warn_dup+0x86/0xa0
> [ 169.728227] [<ffffffff81234170>] sysfs_add_one+0x40/0x50
> [ 169.728229] [<ffffffff81234cdf>] sysfs_do_create_link_sd.isra.2+0xbf/0x210
> [ 169.728230] [<ffffffff81234e55>] sysfs_create_link+0x25/0x50
> [ 169.728234] [<ffffffff8161da99>] __netdev_adjacent_dev_insert+0x1d9/0x270
> [ 169.728237] [<ffffffff816226ad>] __netdev_adjacent_dev_link_lists+0x2d/0x80
> [ 169.728239] [<ffffffff81622821>] __netdev_adjacent_dev_link_neighbour+0x71/0xa0
> [ 169.728241] [<ffffffff81622990>] __netdev_upper_dev_link+0xf0/0x460
> [ 169.728244] [<ffffffff8108fbb6>] ? raw_notifier_call_chain+0x16/0x20
> [ 169.728246] [<ffffffff81622d12>] netdev_upper_dev_link+0x12/0x20
> [ 169.728252] [<ffffffffa0255a09>] register_vlan_dev+0xe9/0x260 [8021q]
> [ 169.728254] [<ffffffffa025745c>] vlan_newlink+0xbc/0xf0 [8021q]
> [ 169.728257] [<ffffffff816327c5>] rtnl_newlink+0x4f5/0x5d0
> [ 169.728259] [<ffffffff8163240e>] ? rtnl_newlink+0x13e/0x5d0
> [ 169.728261] [<ffffffff8162f0b9>] rtnetlink_rcv_msg+0x99/0x260
> [ 169.728264] [<ffffffff81610e7e>] ? __alloc_skb+0x7e/0x2b0
> [ 169.728266] [<ffffffff8162f020>] ? rtnetlink_rcv+0x30/0x30
> [ 169.728269] [<ffffffff8164d659>] netlink_rcv_skb+0xa9/0xc0
> [ 169.728270] [<ffffffff8162f018>] rtnetlink_rcv+0x28/0x30
> [ 169.728272] [<ffffffff8164cc85>] netlink_unicast+0xd5/0x1b0
> [ 169.728274] [<ffffffff8164d05f>] netlink_sendmsg+0x2ff/0x740
> [ 169.728277] [<ffffffff81222ecd>] ? proc_alloc_inode+0x1d/0xb0
> [ 169.728280] [<ffffffff812d3d5e>] ? security_inode_alloc+0x1e/0x20
> [ 169.728283] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0
> [ 169.728286] [<ffffffff8122fb8b>] ? proc_tgid_net_lookup+0x3b/0x70
> [ 169.728289] [<ffffffff8114e843>] ? unlock_page+0x23/0x30
> [ 169.728292] [<ffffffff81176a5a>] ? do_wp_page+0x39a/0x7c0
> [ 169.728294] [<ffffffff816076de>] ? move_addr_to_kernel.part.16+0x1e/0x60
> [ 169.728296] [<ffffffff816082a1>] ? move_addr_to_kernel+0x21/0x30
> [ 169.728298] [<ffffffff81608273>] ___sys_sendmsg+0x3c3/0x3d0
> [ 169.728301] [<ffffffff816063cf>] ? sock_destroy_inode+0x2f/0x40
> [ 169.728304] [<ffffffff811d7c98>] ? destroy_inode+0x38/0x60
> [ 169.728305] [<ffffffff811d7ddb>] ? evict+0x11b/0x1b0
> [ 169.728307] [<ffffffff811d281f>] ? __d_free+0x3f/0x60
> [ 169.728311] [<ffffffff8111155c>] ? acct_account_cputime+0x1c/0x20
> [ 169.728313] [<ffffffff8109d7db>] ? account_user_time+0x8b/0xa0
> [ 169.728315] [<ffffffff81608972>] __sys_sendmsg+0x42/0x80
> [ 169.728316] [<ffffffff816089c2>] SyS_sendmsg+0x12/0x20
>
> [ 169.728319] [<ffffffff8172c87f>] tracesys+0xe1/0xe6
> [ 169.728321] ---[ end trace 7dab90e036ea6938 ]---
> [ 169.728881] ------------[ cut here ]------------
> [ 169.728904] kernel BUG at /build/buildd/linux-3.13.0/net/core/dev.c:6437!
> [ 169.728926] invalid opcode: 0000 [#1] SMP
> [ 169.728943] Modules linked in: 8021q garp mrp xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables ppdev coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel vmw_balloon aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw vmwgfx ttm drm vmw_vmci parport_pc i2c_piix4 shpchp lp mac_hid parport psmouse vmxnet3 mptspi mptscsih mptbase floppy
> [ 169.729133] CPU: 0 PID: 1382 Comm: lxc-execute Tainted: G W 3.13.0-32-generic #57-Ubuntu
> [ 169.729162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
> [ 169.730437] task: ffff880036e15fc0 ti: ffff88003cafe000 task.ti: ffff88003cafe000
> [ 169.731010] RIP: 0010:[<ffffffff8162447f>] [<ffffffff8162447f>] free_netdev+0xff/0x110
> [ 169.731581] RSP: 0018:ffff88003caff8c0 EFLAGS: 00010297
> [ 169.732150] RAX: 0000000000000002 RBX: ffff88003c452018 RCX: 0000000000000242
> [ 169.732745] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000282
> [ 169.733306] RBP: ffff88003caff8d8 R08: 001fbc88ffffffc0 R09: 00000040ffffffc0
> [ 169.733863] R10: ffffffc000000040 R11: ffffffc000000be0 R12: ffff88003c452060
> [ 169.734424] R13: ffff88003c452000 R14: ffff88003caff8e8 R15: ffffffffa025a0a0
> [ 169.734985] FS: 00007f2bb52998c0(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
> [ 169.735556] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 169.736129] CR2: 00007f6136f45000 CR3: 000000003a2af000 CR4: 00000000000007f0
> [ 169.736758] Stack:
> [ 169.737328] 00000000ffffffef ffffffff81cda240 ffff88003caff920 ffff88003caffae8
> [ 169.737965] ffffffff81632848 0000000000000000 ffff88003dbe7434 0000000000000000
> [ 169.738573] 0000000000000000 0000000000000000 0000000000000000 ffffffff8163240e
> [ 169.739176] Call Trace:
> [ 169.739777] [<ffffffff81632848>] rtnl_newlink+0x578/0x5d0
> [ 169.740393] [<ffffffff8163240e>] ? rtnl_newlink+0x13e/0x5d0
> [ 169.741002] [<ffffffff8162f0b9>] rtnetlink_rcv_msg+0x99/0x260
> [ 169.741614] [<ffffffff81610e7e>] ? __alloc_skb+0x7e/0x2b0
> [ 169.742226] [<ffffffff8162f020>] ? rtnetlink_rcv+0x30/0x30
> [ 169.742912] [<ffffffff8164d659>] netlink_rcv_skb+0xa9/0xc0
> [ 169.743525] [<ffffffff8162f018>] rtnetlink_rcv+0x28/0x30
> [ 169.744136] [<ffffffff8164cc85>] netlink_unicast+0xd5/0x1b0
> [ 169.744721] [<ffffffff8164d05f>] netlink_sendmsg+0x2ff/0x740
> [ 169.745301] [<ffffffff81222ecd>] ? proc_alloc_inode+0x1d/0xb0
> [ 169.745873] [<ffffffff812d3d5e>] ? security_inode_alloc+0x1e/0x20
> [ 169.746444] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0
> [ 169.747001] [<ffffffff8122fb8b>] ? proc_tgid_net_lookup+0x3b/0x70
> [ 169.747556] [<ffffffff8114e843>] ? unlock_page+0x23/0x30
> [ 169.748086] [<ffffffff81176a5a>] ? do_wp_page+0x39a/0x7c0
> [ 169.748631] [<ffffffff816076de>] ? move_addr_to_kernel.part.16+0x1e/0x60
> [ 169.749138] [<ffffffff816082a1>] ? move_addr_to_kernel+0x21/0x30
> [ 169.749626] [<ffffffff81608273>] ___sys_sendmsg+0x3c3/0x3d0
> [ 169.750102] [<ffffffff816063cf>] ? sock_destroy_inode+0x2f/0x40
> [ 169.750573] [<ffffffff811d7c98>] ? destroy_inode+0x38/0x60
> [ 169.751041] [<ffffffff811d7ddb>] ? evict+0x11b/0x1b0
> [ 169.751476] [<ffffffff811d281f>] ? __d_free+0x3f/0x60
> [ 169.751902] [<ffffffff8111155c>] ? acct_account_cputime+0x1c/0x20
> [ 169.752321] [<ffffffff8109d7db>] ? account_user_time+0x8b/0xa0
> [ 169.752732] [<ffffffff81608972>] __sys_sendmsg+0x42/0x80
> [ 169.753144] [<ffffffff816089c2>] SyS_sendmsg+0x12/0x20
> [ 169.753599] [<ffffffff8172c87f>] tracesys+0xe1/0xe6
> [ 169.754019] Code: a7 e6 ff 5b 41 5c 41 5d 5d c3 66 90 e8 2b 3a b6 ff e9 53 ff ff ff 66 0f 1f 44 00 00 4c 89 ef e8 b8 fe ff ff 5b 41 5c 41 5d 5d c3 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66
> [ 169.755445] RIP [<ffffffff8162447f>] free_netdev+0xff/0x110
> [ 169.755870] RSP <ffff88003caff8c0>
> [ 169.756330] ---[ end trace 7dab90e036ea6939 ]---
>
>
> Debug log only shows:
> lxc-execute 1423837022.982 INFO lxc_lsm - lsm/lsm.c:lsm_init:48 - LSM security driver AppArmor
> lxc-execute 1423837022.982 DEBUG lxc_start - start.c:setup_signal_fd:259 - sigchild handler set
> lxc-execute 1423837022.982 INFO lxc_console - console.c:lxc_console_create:565 - no console for lxc-execute.
> lxc-execute 1423837022.982 INFO lxc_start - start.c:lxc_init:451 - 'ahost' is initialized
> lxc-execute 1423837022.983 DEBUG lxc_start - start.c:__lxc_start:1130 - Not dropping cap_sys_boot or watching utmp
>
>
> I apologise if this is the wrong place to report issues.
>
> Regards,
> Rory McCann
> _______________________________________________
> lxc-users mailing list
> lxc-users at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users