[lxc-users] debugging a failing clone() call
Christian Brauner
christian.brauner at getenv.org
Mon Apr 9 08:28:27 UTC 2018
On Fri, Mar 23, 2018 at 06:13:15AM -0400, Andrew Cann wrote:
> Hello,
>
> The following syscall is failing when called on a Travis-CI build machine.
>
> clone(..,
> CLONE_FILES |
> CLONE_IO |
> CLONE_SIGHAND |
> CLONE_VM |
> CLONE_SYSVSEM |
> CLONE_NEWNET |
> CLONE_NEWUTS |
> CLONE_NEWUSER,
> ..
> );
>
> This works when I run it on my machine, but inside the Docker container that
> Travis creates it fails with EPERM. Can anyone suggest why this might be
> happening? The clone(2) manpage lists possible reasons:
>
>
> EPERM CLONE_NEWCGROUP, CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS, CLONE_NEWPID,
> or CLONE_NEWUTS was specified by an unprivileged process (process
> without CAP_SYS_ADMIN).
>
> This shouldn't apply since I'm using CLONE_NEWUSER
>
>
> EPERM CLONE_PID was specified by a process other than process 0. (This error
> occurs only on Linux 2.5.15 and earlier.)
>
> Doesn't apply.
>
>
> EPERM CLONE_NEWUSER was specified in flags, but either the effective user ID
> or the effective group ID of the caller does not have a mapping in the
> parent namespace (see user_namespaces(7)).
>
> Again, this shouldn't apply. The process creating the namespace has a valid
> (not-nobody) uid and gid.
>
>
> EPERM (since Linux 3.9) CLONE_NEWUSER was specified in flags and the caller
> is in a chroot environment (i.e., the caller's root directory does not
> match the root directory of the mount namespace in which it resides).
>
> Possibly this one? The docker container shouldn't be aware that it's running in
> a chroot though. Calling mount inside the container lists:
>
> overlay on / type overlay (rw,relatime,...)
>
> Which indicates that it's living inside its own mount namespace with its own
> root directory.
>
> So I'm confused. Does anyone have any suggestions for why else this might be
> failing or things I could try to debug it? Is there a way to get more than just
> an error code out of Linux? Are there reasons for giving that error code that
> aren't listed in the man page?
Well, the Docker container might have dropped CAP_SYS_ADMIN, at which
point you're not allowed to use any of the CLONE_NEW* namespace flags.
Docker does this as a security measure since they still default to running
privileged containers, which are inherently unsafe.
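
To check this guess from inside the container, something like the
following might help. It is only a sketch: it assumes the usual Linux
/proc layout, where /proc/self/status has a "CapEff:" line holding the
effective capability set as a hex mask, and it relies on CAP_SYS_ADMIN
being capability number 21 as in <linux/capability.h>. The function
names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Capability number of CAP_SYS_ADMIN, as in <linux/capability.h>. */
#define CAP_SYS_ADMIN_BIT 21

/*
 * Parse one line of /proc/<pid>/status.
 * Returns 1 and stores the mask if the line is the "CapEff:" line,
 * otherwise returns 0 and leaves *mask untouched.
 */
int parse_cap_eff(const char *line, uint64_t *mask)
{
    unsigned long long value;

    if (sscanf(line, "CapEff: %llx", &value) != 1)
        return 0;
    *mask = value;
    return 1;
}

/* Returns 1 if the CAP_SYS_ADMIN bit is set in the mask, else 0. */
int mask_has_cap_sys_admin(uint64_t mask)
{
    return (int)((mask >> CAP_SYS_ADMIN_BIT) & 1);
}

/*
 * Returns 1 if the calling process has CAP_SYS_ADMIN in its effective
 * set, 0 if it does not, and -1 if this could not be determined.
 */
int have_cap_sys_admin(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    uint64_t mask;
    int result = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (parse_cap_eff(line, &mask)) {
            result = mask_has_cap_sys_admin(mask);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling have_cap_sys_admin() right before the clone() call, on both the
working machine and the Travis container, would show whether the two
environments differ in this respect.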

Another possibility is that the host is running a distro that does not
enable CLONE_NEWUSER (unprivileged user namespaces) by default. This can
be the case with, e.g., CentOS-based distros.
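
A rough way to look into this is to read the sysctl files that commonly
gate user namespaces. The sketch below is an assumption-laden example,
not a complete check: /proc/sys/user/max_user_namespaces exists on
Linux 4.9 and later, /proc/sys/kernel/unprivileged_userns_clone comes
from a distro patch (Debian and Ubuntu kernels) and is absent elsewhere,
and other kernels may use different switches entirely. The function
names are hypothetical.

```c
#include <stdio.h>

/*
 * Read a single integer from a file.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
int read_long_from_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the value of each user namespace knob that this kernel exposes. */
void report_userns_knobs(void)
{
    /* Paths are assumptions; not every kernel provides both. */
    static const char *const knobs[] = {
        "/proc/sys/user/max_user_namespaces",
        "/proc/sys/kernel/unprivileged_userns_clone",
    };
    size_t i;

    for (i = 0; i < sizeof knobs / sizeof knobs[0]; i++) {
        long value;

        if (read_long_from_file(knobs[i], &value) == 0)
            printf("%s = %ld\n", knobs[i], value);
        else
            printf("%s: not available\n", knobs[i]);
    }
}
```

A value of 0 in either file would suggest that the host does not let
unprivileged processes create user namespaces. Since a container shares
the host kernel, running report_userns_knobs() inside the container
should reflect the host's settings.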
Christian