[Lxc-users] kernel bug?
Serge Hallyn
serge.hallyn at ubuntu.com
Thu Mar 14 03:31:34 UTC 2013
Quoting Gary Ballantyne (gary.ballantyne at haulashore.com):
> Hi All
>
> I have an intermittent, but crippling, problem on a raring EC2 instance
> (also on quantal). It's a (raring) lvm-backed container --- I use cgroups
> directly (via /sys/fs) and iptables in the instance (not sure if that's
> relevant at all).
>
> Occasionally, when stopping or starting the container (there is just
> one), the instance becomes unreachable. Rebooting doesn't help, but
> starting/stopping the instance, typically at least twice, fixes things
> (the instance is reachable, and the container auto-starts).
>
> There doesn't appear to be anything sinister in /var/log/dmesg (upon
> restart), but the AWS system log is pasted below. I *think* the first
> part corresponds to before the crash, and the interesting bit is:
>
> [3587596.471053] ------------[ cut here ]------------
> [3587596.471071] Kernel BUG at ffffffff816c7c2c [verbose debug info
> ...
> [3587596.472282] ---[ end trace dc5c4320e1320f1d ]---
> [3587596.472292] Fixing recursive fault but reboot is needed!
Looks to me like the problem is a conflict between the memory cgroup
and Xen:
[3587596.472052] [<ffffffff8100508c>] ? xen_mc_extend_args+0xfc/0x120
[3587596.472061] [<ffffffff816c827b>] do_page_fault+0x2b/0x50
[3587596.472068] [<ffffffff816c4818>] page_fault+0x28/0x30
[3587596.472076] [<ffffffff81187c24>] ? mem_cgroup_charge_statistics.isra.20+0x14/0x50
[3587596.472085] [<ffffffff81189cd0>] __mem_cgroup_uncharge_common+0xd0/0x2d0
[3587596.472093] [<ffffffff8118d21a>] mem_cgroup_uncharge_page+0x2a/0x30
If you have your System.map file, or better yet if you run objdump -d
on your uncompressed vmlinux, you should be able to figure out more
about what is going on at those addresses.
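As a rough illustration of the System.map lookup, here is a minimal sketch
that maps a raw address from the trace to "symbol+offset". It assumes the
usual System.map line layout of "<hex address> <type> <name>". The helper
names (load_system_map, resolve) are made up for this example, and the two
sample lines are not from a real System.map: they were derived by
subtracting the offsets shown in the trace above from the addresses next to
them.

```python
import bisect


def load_system_map(lines):
    """Parse System.map-style lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr.

    Returns None if addr is below every known symbol.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


if __name__ == "__main__":
    # Hypothetical sample: base addresses computed from the trace
    # (e.g. 0xffffffff816c827b - 0x2b), not copied from a real System.map.
    sample = [
        "ffffffff816c47f0 T page_fault",
        "ffffffff816c8250 T do_page_fault",
    ]
    syms = load_system_map(sample)
    name, offset = resolve(syms, 0xFFFFFFFF816C827B)
    print(f"{name}+{offset:#x}")
```

With a real file you would pass open("/boot/System.map-<version>") to
load_system_map instead of the sample list (that path is the usual location
on Ubuntu, but check your own system). The same nearest-symbol lookup is
what lets you jump to the right function in the objdump -d output.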
You don't say what distro/kernel version you have, but you might also
search for these function names, or check the kernel git logs for
recent changes or fixes.
-serge