[lxc-devel] Kernel bug? Setuid apps and user namespaces

Eric W. Biederman ebiederm at xmission.com
Tue Oct 22 19:50:43 UTC 2013


Serge Hallyn <serge.hallyn at ubuntu.com> writes:

> Quoting Sean Pajot (sean.pajot at execulink.com):
>> I've been playing with User Namespaces somewhat extensively and I think I've
>> come across a bug in the handling of /proc/$PID/ entries.
>> 
>> This is my example case on a 3.10.x kernel:
>> 
>> -- /var/lib/lxc/test1/config
>> 
>> lxc.rootfs = /lxc/c1
>> lxc.id_map = u 0 1000000 100000
>> lxc.id_map = g 0 1000000 100000
>> lxc.network.type = none
>> 
>> lxc.tty = 6
>> 
>> == END
>> 
>> On one console login as a non-root user and run "su", as an example of a
>> setuid root application. On another console login as root and examine
>> /proc/$(pidof su). You'll find all the files are owned by the "nobody" user
>> and inaccessible. The reason is on the host you'll find these files are owned
>> by "root", uid 0, which is odd because in the container they should be uid
>> 1000000 from the mappings.
>> 
>> I tracked down the cause to kernel source file /fs/proc/base.c function
>> pid_revalidate which contains static references to GLOBAL_ROOT_UID and
>> GLOBAL_ROOT_GID which are always UID 0 on the host. This little patch, which
>> might not be correct in terms of kernel standards, appears to mostly solve the
>> issue. It doesn't affect all entries in /proc/$PID but gets the majority of them.
>> 
>> Thoughts or opinions?
>
> Awesome - I've seen this bug and so far not had time to dig.  
>
> The patch offhand looks good to me.  Do you mind sending it to
> lkml?
>
> Acked-by: Serge E. Hallyn <serge.hallyn at ubuntu.com>
>

It is definitely worth looking at.  I punted on this when I did the
initial round of conversions.  Tasks that we don't consider dumpable are
weird.

At first glance this is fine.  However, __task_cred() does not return
NULL, so handling that case is nonsense and confusing.

Eric

>> --- linux-3.10-clean/fs/proc/base.c	2013-06-30 18:13:29.000000000 -0400
>> +++ linux-3.10-patched/fs/proc/base.c	2013-10-22 13:28:22.561262197 -0400
>> @@ -1632,17 +1632,17 @@
>>  	task = get_proc_task(inode);
>> 
>>  	if (task) {
>> +		rcu_read_lock();
>> +		cred = __task_cred(task);
>>  		if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) ||
>>  		    task_dumpable(task)) {
>> -			rcu_read_lock();
>> -			cred = __task_cred(task);
>>  			inode->i_uid = cred->euid;
>>  			inode->i_gid = cred->egid;
>> -			rcu_read_unlock();
>>  		} else {
>> -			inode->i_uid = GLOBAL_ROOT_UID;
>> -			inode->i_gid = GLOBAL_ROOT_GID;
>> +			inode->i_uid = cred ? make_kuid(cred->user_ns, 0) : GLOBAL_ROOT_UID;
>> +			inode->i_gid = cred ? make_kgid(cred->user_ns, 0) : GLOBAL_ROOT_GID;
>>  		}
>> +		rcu_read_unlock();
>>  		inode->i_mode &= ~(S_ISUID | S_ISGID);
>>  		security_task_to_inode(task, inode);
>>  		put_task_struct(task);
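
For anyone following the thread, here is a minimal user-space sketch of the
ownership logic being discussed. It is not kernel code. The names id_extent,
ns_to_host, host_to_ns_munged, owner_before_patch and owner_after_patch are
hypothetical stand-ins for a single uid_map extent, make_kuid(),
from_kuid_munged() and the non-dumpable branch of pid_revalidate(). It assumes
the one-extent mapping from the config above (0 1000000 100000) and the default
overflow uid of 65534, which is what shows up as "nobody". In line with the
comment about __task_cred(), the "after" variant has no NULL check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OVERFLOW_UID 65534u            /* assumed default overflowuid ("nobody") */
#define INVALID_ID   ((uint32_t)-1)    /* models INVALID_UID */

/* One extent of a uid map: ids [first, first+count) in the namespace
 * correspond to ids [lower_first, lower_first+count) on the host. */
struct id_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Models make_kuid(): namespace id -> host id, or INVALID_ID if unmapped. */
static uint32_t ns_to_host(const struct id_extent *m, uint32_t id)
{
	assert(m != NULL);
	if (id >= m->first && id - m->first < m->count)
		return m->lower_first + (id - m->first);
	return INVALID_ID;
}

/* Models from_kuid_munged(): host id -> namespace id, or the overflow
 * uid when the host id has no mapping in this namespace. */
static uint32_t host_to_ns_munged(const struct id_extent *m, uint32_t kuid)
{
	assert(m != NULL);
	if (kuid >= m->lower_first && kuid - m->lower_first < m->count)
		return m->first + (kuid - m->lower_first);
	return OVERFLOW_UID;
}

/* Non-dumpable task, original code: always host uid 0 (GLOBAL_ROOT_UID). */
static uint32_t owner_before_patch(const struct id_extent *task_ns)
{
	(void)task_ns;
	return 0;
}

/* Non-dumpable task, patched code: uid 0 of the task's own namespace. */
static uint32_t owner_after_patch(const struct id_extent *task_ns)
{
	return ns_to_host(task_ns, 0);
}
```

With the mapping { 0, 1000000, 100000 }, owner_before_patch() yields host uid 0,
which host_to_ns_munged() reports as 65534 inside the container, matching the
"nobody" ownership in the report. owner_after_patch() yields host uid 1000000,
which the container sees as uid 0. The sketch does not model the case where
uid 0 is unmapped in the task's namespace and ns_to_host() returns INVALID_ID.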



