[lxc-devel] [PATCH] /proc/stat in lxcfs

Mon Feb 16 15:15:22 UTC 2015

Quoting Christian Brauner (christianvanbrauner at gmail.com):
> Hello,
> 
> /proc/stat mounted at /var/lib/lxcfs/ is currently missing
> (a) the cpu average-line from the hosts /proc/stat
> and
> (b) the numbering of the cores starts with 1 and not with the 0.
> (This will e.g.  cause confusion for top/htop in the container.) I wrote a
> minimally invasive patch for that:
> 
> Signed-off-by: Christian Brauner <christianvanbrauner at gmail.com>
> Date:   Sun Feb 15 10:34:41 2015 +0100
> 
>     Bring the cpu-average line to /proc/stat mounted by lxcfs and let the
>     core numbering start at 0.
> 
> diff --git a/lxcfs.c b/lxcfs.c
> index d6fb101..ab296e7 100644
> --- a/lxcfs.c
> +++ b/lxcfs.c
> @@ -1642,7 +1642,7 @@ static int proc_stat_read(char *buf, size_t size, off_t offset,
>     nih_local char *cpuset = NULL;
>     char *line = NULL;
>     size_t linelen = 0, total_len = 0;
> -   int curcpu = 0;
> +   int curcpu = -1; /* cpu numbering starts at 0 */

Thanks, this looks good;

>     FILE *f;
>  
>     if (offset)
> @@ -1662,16 +1662,22 @@ static int proc_stat_read(char *buf, size_t size, off_t offset,
>     while (getline(&line, &linelen, f) != -1) {
>         size_t l;
>         int cpu;
> +        char cpu_char[10]; /* That's a lot of cores */
>         char *c;
>  
> -       if (sscanf(line, "cpu%d", &cpu) != 1) {
> -           /* not a ^cpu line, just print it */
> +       if (sscanf(line, "cpu%9[^ ]", cpu_char) != 1) {
> +           /* not a ^cpuN line containing a number N, just print it */

Ok, so that will print the "cpu  1025201 6863 262753 7391275 7423 2 1333 0 2773 0"
line, fixing that bug.

>             l = snprintf(buf, size, "%s", line);
>             buf += l;
>             size -= l;
>             total_len += l;
>             continue;
>         }
> +        else if (sscanf(cpu_char, "%d", &cpu) != 1) {

This is an unorthodox (for lxc) coding style.  Could you explain
exactly what you're doing here?  I would expect just:

	if (sscanf(cpu_char, "%d", &cpu) != 1))
		continue;
	if (!cpu_in_cpuset(cpu, cpuset))
		continue;

> +        }
> +        else {
> +        }
> +
>         if (!cpu_in_cpuset(cpu, cpuset))
>             continue;
>         curcpu ++;
> 
> Best,
> Christian
> 
> > ----- Forwarded message from Serge Hallyn <serge.hallyn at ubuntu.com> -----
> > 
> > Date: Fri, 13 Feb 2015 04:14:11 +0000
> > From: Serge Hallyn <serge.hallyn at ubuntu.com>
> > To: Christian Brauner <christianvanbrauner at gmail.com>
> > Subject: Re: cpusets in non-unified hierarchy broken?
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > 
> > Quoting Christian Brauner (christianvanbrauner at gmail.com):
> > > On Thu, Feb 12, 2015 at 11:44:32PM +0000, Serge Hallyn wrote:
> > > > Quoting Christian Brauner (christianvanbrauner at gmail.com):
> > > > > Hi Serge,
> > > > > 
> > > > > There is random misreporting at times from top/htop inside of the
> > > > > container.  Whereas e.g. cat /proc/self/status show Cpus_allowed 0-3,
> > > > > top/htop will show only 1 core.
> > > > 
> > > > It may be getting it from /proc/stat, which you *may* be getting through
> > > > lxcfs.  It would then be filtering it through the container's cgroup
> > > > info.
> > > > 
> > > > When you start the container, does uptime say 1 minute, or does it match
> > > > the host's uptime?
> > > > 
> > > It says uptime 0 minutes aka it does not match the host's uptime.
> > > /proc/stat is looking different in the container than on the host
> > > (probably because of the cgroup filtering you mentioned).
> > > 
> > > I mentioned "random misreporting" mainly because the number of cores as
> > > displayed by top/htop in the container does not match the number of cores on
> > > the host. It is always less than on the host. I think this is caused by the
> > > implementation of /proc/stat by lxcfs.
> > > 
> > > E.g. on the host
> > > 
> > >     [chb at conventiont ~]$ cat /proc/stat | grep cpu
> > > 
> > > will show
> > > 
> > >     cpu  2061 0 1623 259471 4879 0 38 0 0 0
> > >     cpu0 568 0 458 63949 1905 0 12 0 0 0
> > >     cpu1 584 0 371 65044 1018 0 8 0 0 0
> > >     cpu2 410 0 390 65450 786 0 12 0 0 0
> > >     cpu3 498 0 403 65026 1169 0 5 0 0 0
> > > 
> > > whereas in the container
> > > 
> > >     root at vivid:~# cat /proc/stat | grep cpu
> > > 
> > > will show
> > > 
> > >     cpu1  580 0 473 68799 1906 0 12 0 0 0
> > >     cpu2  600 0 384 69836 1080 0 8 0 0 0
> > >     cpu3  434 0 413 70268 786 0 12 0 0 0
> > >     cpu4  545 0 428 69839 1169 0 5 0 0 0
> > > 
> > > So
> > > (a) the cpu average-line from the hosts /proc/stat is missing.
> > > and
> > > (b) the numbering of the cores seems to be different in the container
> > > 
> > > I guess this causes top/htop in the container to e.g. only show three cores on
> > > a four core system. Maybe, it doesn't show cpu0 at all because it thinks cpu0
> > > is actually cpu (average)? Lxcfs bug maybe?
> > 
> > Seems very likely.  If you have time to write a patch for that that'd be
> > awesome :)
> > 
> > (Else I'll keep it on my list)
> > 
> > thanks,
> > -serge
> > 
> > ----- End forwarded message -----

> _______________________________________________
> lxc-devel mailing list
> lxc-devel at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel