[lxc-users] Enabling real time support in containers

Serge E. Hallyn serge at hallyn.com
Wed Apr 5 13:45:47 UTC 2017


Quoting Peter Steele (pwsteele at gmail.com):
> On 03/31/2017 10:16 AM, Peter Steele wrote:
> >As you can see, the sched_setscheduler() call fails with an EPERM
> >error. This same app runs fine on the host.
> >
> >Ultimately I expect this app to fail when run under my container
> >since I have not given the container any real time bandwidth. I
> >had hoped the option
> >
> >lxc.cgroup.cpu.rt_runtime_us = 475000
> >
> >would do the trick but this option is rejected with anything other
> >than "0". So presumably this isn't the correct way to give a
> >container real time bandwidth.
> >
> >I have more experience with the libvirt-lxc framework and I have
> >been able to enable real time support for containers under
> >libvirt. The approach used in this case involves explicitly
> >setting cgroup parameters, specifically
> >
> >/sys/fs/cgroup/cpu/machine.slice/cpu.rt_runtime_us
> >
> >under the host and
> >
> >/sys/fs/cgroup/cpu/cpu.rt_runtime_us
> >
> >under the container. For example, I might do something like this:
> >
> >echo 500000 >/sys/fs/cgroup/cpu/machine.slice/cpu.rt_runtime_us   --> on the host
> >echo 25000 >/sys/fs/cgroup/cpu/cpu.rt_runtime_us                   --> on a container
> >
> >These do not work for LXC based containers though.
> >
> 
> The test code I'm running can be simplified to just this simple sequence:
> 
> #include <stdio.h>
> #include <sched.h>
> 
> int main(void) {
>     struct sched_param param = { .sched_priority = 50 };
>     const int myself = 0; // pid 0 means the calling process
>     if (0 != sched_setscheduler(myself, SCHED_FIFO, &param)) {
>         perror("sched_setscheduler"); // reports the errno, e.g. EPERM
>         printf("Failure\n");
>         return 1;
>     }
> 
>     printf("Success\n");
>     return 0;
> }
> 
> On a container with RT support enabled, this should print "Success".
> 
> Am I correct in assuming LXC *does* provide a means to enable RT
> support? If not, we will need to find another approach to this problem.

The kernel has hardcoded checks (which are not namespaced): if you
are not (global) root, you cannot set or change the rt policy.  I
suspect there is a way that could be safely relaxed (for example, if
a container has exclusive use of a cpu), but we'd have to talk to
the scheduling experts about what would make sense.  See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?id=refs/tags/v4.11-rc5#n4164
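
To see which error the kernel check produces inside a container, a
minimal C sketch along these lines may help.  It assumes Linux and
glibc; try_set_fifo() and explain_sched_result() are hypothetical
helper names, not part of any library, and the explanation strings
only paraphrase the behaviour described above.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Try to switch the calling process (pid 0) to SCHED_FIFO at the given
 * priority.  Returns 0 on success, or the errno value set by the call. */
int try_set_fifo(int priority)
{
    struct sched_param param;

    memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
        return errno;
    return 0;
}

/* Hypothetical helper: turn the result of try_set_fifo() into a short
 * human-readable explanation. */
const char *explain_sched_result(int err)
{
    switch (err) {
    case 0:
        return "ok";
    case EPERM:
        return "denied by the kernel permission check";
    case EINVAL:
        return "invalid policy or priority";
    default:
        return strerror(err);
    }
}
```

Printing explain_sched_result(try_set_fifo(50)) from main() inside the
container would be expected to show the EPERM explanation in the
situation described in this thread.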

Otherwise, as a workaround (assuming this is the only problem you
hit), you could make sure ahead of time that the RT policy is already
correct and the priority is high enough that the application is only
lowering it; then the kernel wouldn't stop it.  That is certainly
more fragile.  Or you could get fancier and use LD_PRELOAD to
intercept sched_setscheduler() and redirect it to an API over a
socket to a tiny daemon on the host, which sets it up for you...
But certainly it would be best for everyone if this were supported
in the kernel the right way.
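
A rough sketch of the first workaround, from the application's side:
before asking for SCHED_FIFO, check whether the process already has
that policy at an equal or higher priority, so the request only lowers
it.  This assumes something on the host has set the policy beforehand
(for instance with a tool like chrt, which is an assumption about the
setup, not something stated in this thread).  The function names are
hypothetical, and the check ignores the reset-on-fork flag that
sched_getscheduler() can include in its return value.

```c
#include <sched.h>

/* Returns 1 when a request keeps the current policy and does not raise
 * the priority, 0 otherwise. */
int is_lowering_only(int cur_policy, int cur_prio,
                     int want_policy, int want_prio)
{
    return cur_policy == want_policy && want_prio <= cur_prio;
}

/* Returns 1 if moving the calling process to SCHED_FIFO at want_prio
 * would only lower its priority, 0 if it would not, and -1 if the
 * current scheduling state could not be read.
 * Assumption: the reset-on-fork flag is not set on this process. */
int fifo_request_is_lowering_only(int want_prio)
{
    struct sched_param cur;
    int policy = sched_getscheduler(0);

    if (policy < 0 || sched_getparam(0, &cur) != 0)
        return -1;
    return is_lowering_only(policy, cur.sched_priority,
                            SCHED_FIFO, want_prio);
}
```

An application could call fifo_request_is_lowering_only(50) at startup
and log a warning when it returns 0, since the later
sched_setscheduler() call would then be likely to fail as described.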

-serge

