[lxc-devel] [PATCH v2] safe_mount: Handle mounting proc and refactor

Bogdan Purcareata bogdan.purcareata at nxp.com
Wed Jan 13 11:27:34 UTC 2016


On 11.01.2016 20:59, Serge Hallyn wrote:
> Quoting Bogdan Purcareata (bogdan.purcareata at nxp.com):
>> The safe_mount primitive will mount the fs in the new container
>> environment by using file descriptors referred in /proc/self/fd.
>> However, when the mounted filesystem is proc itself, it will have
>> been previously unmounted, therefore resulting in an error when
>> searching for these file descriptors. This only happens when there's
>> no container rootfs prefix (commonly with lxc-execute).
>>
>> Implement the support for this use case as well, by doing the mount
>> based on the full path.
>>
>> Refactor the whole function in order to remove duplicated code checks
>> and improve readability.
>>
>> Changes since v1:
>> - In order to address CVE-2015-1335, still check if the destination is
>> not a symlink. Do the mount only if the destination file descriptor
>> exists.
>>
>> Signed-off-by: Bogdan Purcareata <bogdan.purcareata at nxp.com>
>> ---
>>   src/lxc/utils.c | 49 ++++++++++++++++++++++++++++---------------------
>>   1 file changed, 28 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/lxc/utils.c b/src/lxc/utils.c
>> index d9e769d..c53711a 100644
>> --- a/src/lxc/utils.c
>> +++ b/src/lxc/utils.c
>> @@ -1644,9 +1644,9 @@ out:
>>   int safe_mount(const char *src, const char *dest, const char *fstype,
>>   		unsigned long flags, const void *data, const char *rootfs)
>>   {
>> -	int srcfd = -1, destfd, ret, saved_errno;
>> +	int srcfd = -1, destfd = -1, ret = 0;
>>   	char srcbuf[50], destbuf[50]; // only needs enough for /proc/self/fd/<fd>
>> -	const char *mntsrc = src;
>> +	const char *mntsrc = src, *mntdest = dest;
>>   
>>   	if (!rootfs)
>>   		rootfs = "";
>> @@ -1655,45 +1655,52 @@ int safe_mount(const char *src, const char *dest, const char *fstype,
>>   	if (flags & MS_BIND && src && src[0] != '/') {
>>   		INFO("this is a relative bind mount");
>>   		srcfd = open_without_symlink(src, NULL);
>> -		if (srcfd < 0)
>> -			return srcfd;
>> +		if (srcfd < 0) {
>> +			ret = srcfd;
>> +			goto out;
>> +		}
>>   		ret = snprintf(srcbuf, 50, "/proc/self/fd/%d", srcfd);
>>   		if (ret < 0 || ret > 50) {
>> -			close(srcfd);
>>   			ERROR("Out of memory");
>> -			return -EINVAL;
>> +			ret = -EINVAL;
>> +			goto out_src;
>>   		}
>>   		mntsrc = srcbuf;
>>   	}
>>   
>>   	destfd = open_without_symlink(dest, rootfs);
>>   	if (destfd < 0) {
>> -		if (srcfd != -1)
>> -			close(srcfd);
>> -		return destfd;
>> +		ret = destfd;
>> +		goto out_src;
>>   	}
>>   
>>   	ret = snprintf(destbuf, 50, "/proc/self/fd/%d", destfd);
>>   	if (ret < 0 || ret > 50) {
>> -		if (srcfd != -1)
>> -			close(srcfd);
>> -		close(destfd);
>>   		ERROR("Out of memory");
>> -		return -EINVAL;
>> +		ret = -EINVAL;
>> +		goto out_dest;
>>   	}
>>   
>> -	ret = mount(mntsrc, destbuf, fstype, flags, data);
>> -	saved_errno = errno;
>> -	if (srcfd != -1)
>> -		close(srcfd);
>> -	close(destfd);
>> +	/* make sure the destination descriptor exists */
>> +	if (access(destbuf, F_OK) == 0)
>> +		mntdest = destbuf;
> First, if we're going to shortcut I'd prefer to say "if /proc/self
> does not exist then skip this check" fo rnow.
> But can we think of any way to still do this check?
>
> What exactly are the cases?
>
> 1. lxc-execute, lxc-init tries to mount /proc.  We should be able
> to simply have lxc always mount /proc before the pivot_root, so
> we can properly do this check.
>
> what use-cases will break if we demand /proc to exist in the
> container?  (We can add an option to umount /proc in lxc-init,
> but the directory would have to exist)
That's my use case - the failing function is tmp_proc_mount, thus happening 
before pivot_root, and the scenario is running an application container with 
lxc-execute. This calls mount_proc_if_needed which is bound to fail, since it 
will first unmount /proc (because after cloning the process in the new 
namespaces, its PID will be different than /proc/self), and then try to 
safe_mount it, depending on /proc/self/fd. However, tmp_proc_mount will not 
fail, and afterwards lxc-init will mount /proc following the behavior you 
mentioned. So functionally all is ok, but tmp_proc_mount -> mount_proc_if_needed 
-> safe_mount will still spit a nasty warning.

BTW this whole thing is with LXC 1.1.5.
>
> 2. lxc.rootfs unset.  In this case we're trusting the *host* admin
> to not have messed with /proc to make it a symlink, if we can't
> trust that we're out of luck.  Other paths are not the same (since
> parts of the rootfs could be bind-mounted from container-owned
> dirs) so we should check those.  But so for this check we should
> simply explicitly check for "/proc".  Doing a more roundabout
> check may leave us open to subtle different attacks.  In particular
> I imagine there are other ways to make /proc/self/fd/N not
> access-able, and you are, in that case, re-introducing the TOCTTOU -
> the attacker could try to quickliy insert the symlink after our
> checks and before the real mount().
Leaving lxc.rootfs unset results in lxc-start blocking right after 
lxc_mount_auto_mounts. However, setting it to "/" leads to a successful 
scenario, since lxc_mount_auto_mounts will mount a temporary /proc in 
/usr/var/lib/lxc/rootfs, so mount_proc_if_needed will no longer lead to mounting 
/proc again.

So my thought right now is:
- leave the open_without_symlink call in order to check that the destination is 
not a symlink
- use the full destination path in mount syscall only if the fstype is proc and 
the destination path is "/proc" - otherwise use destfd

Does this make sense?
>> +	ret = mount(mntsrc, mntdest, fstype, flags, data);
>>   	if (ret < 0) {
>> -		errno = saved_errno;
>>   		SYSERROR("Failed to mount %s onto %s", src, dest);
>> -		return ret;
>> +		goto out_dest;
>>   	}
>>   
>> -	return 0;
>> +	ret = 0;
>> +
>> +out_dest:
>> +	if (destfd > 0)
> These should be >= 0 (here and below) right?
>
True, 0 _is_ a valid file descriptor. Will fix in v3.
>> +		close(destfd);
>> +out_src:
>> +	if (srcfd > 0)
>> +		close(srcfd);
>> +out:
>> +	return ret;
>>   }
Thanks for the feedback!
Bogdan P.


More information about the lxc-devel mailing list