[lxc-devel] Kernel bug? Setuid apps and user namespaces
Eric W. Biederman
ebiederm at xmission.com
Tue Oct 22 19:50:43 UTC 2013
Serge Hallyn <serge.hallyn at ubuntu.com> writes:
> Quoting Sean Pajot (sean.pajot at execulink.com):
>> I've been playing with User Namespaces somewhat extensively and I think I've
>> come across a bug in the handling of /proc/$PID/ entries.
>>
>> This is my example case on a 3.10.x kernel:
>>
>> -- /var/lib/lxc/test1/config
>>
>> lxc.rootfs = /lxc/c1
>> lxc.id_map = u 0 1000000 100000
>> lxc.id_map = g 0 1000000 100000
>> lxc.network.type = none
>>
>> lxc.tty = 6
>>
>> == END
>>
>> On one console, log in as a non-root user and run "su", as an example of a
>> setuid root application. On another console, log in as root and examine
>> /proc/$(pidof su). You'll find all the files are owned by the "nobody" user
>> and inaccessible. The reason is that on the host these files are owned
>> by "root", uid 0, which is odd because for a task in the container they
>> should be owned by uid 1000000 according to the mappings.
>>
>> I tracked down the cause to kernel source file /fs/proc/base.c function
>> pid_revalidate which contains static references to GLOBAL_ROOT_UID and
>> GLOBAL_ROOT_GID which are always UID 0 on the host. This little patch, which
>> might not be correct in terms of kernel standards, appears to mostly solve the
>> issue. It doesn't affect all entries in /proc/$PID but gets the majority of them.
>>
>> Thoughts or opinions?
>
> Awesome - I've seen this bug and so far not had time to dig.
>
> The patch offhand looks good to me. Do you mind sending it to
> lkml?
>
> Acked-by: Serge E. Hallyn <serge.hallyn at ubuntu.com>
>
It is definitely worth looking at. I punted on this when I did the
initial round of conversions. Tasks that we don't consider dumpable are
weird.

At first glance this looks fine. However, __task_cred does not return NULL,
so handling that case is nonsense and confusing.

Eric

>> --- linux-3.10-clean/fs/proc/base.c 2013-06-30 18:13:29.000000000 -0400
>> +++ linux-3.10-patched/fs/proc/base.c 2013-10-22 13:28:22.561262197 -0400
>> @@ -1632,17 +1632,17 @@
>> task = get_proc_task(inode);
>>
>> if (task) {
>> + rcu_read_lock();
>> + cred = __task_cred(task);
>> if ((inode->i_mode == (S_IFDIR|S_IRUGO|S_IXUGO)) ||
>> task_dumpable(task)) {
>> - rcu_read_lock();
>> - cred = __task_cred(task);
>> inode->i_uid = cred->euid;
>> inode->i_gid = cred->egid;
>> - rcu_read_unlock();
>> } else {
>> - inode->i_uid = GLOBAL_ROOT_UID;
>> - inode->i_gid = GLOBAL_ROOT_GID;
>> + inode->i_uid = cred ? make_kuid(cred->user_ns, 0) : GLOBAL_ROOT_UID;
>> + inode->i_gid = cred ? make_kgid(cred->user_ns, 0) : GLOBAL_ROOT_GID;
>> }
>> + rcu_read_unlock();
>> inode->i_mode &= ~(S_ISUID | S_ISGID);
>> security_task_to_inode(task, inode);
>> put_task_struct(task);
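
For readers following the thread, the symptom and the fix can be modeled in
userspace. This is only a minimal sketch, not kernel code: struct id_map,
ns_to_host(), host_to_ns_munged() and check_quoted_mapping() are hypothetical
names that imitate, for a single mapping range, what make_kuid() and
from_kuid_munged() do. It assumes the overflow ID is 65534 (the kernel
default, usually shown as "nobody") and uses the mapping from the quoted
config (lxc.id_map = u 0 1000000 100000).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative userspace model; none of these names are kernel APIs. */
#define INVALID_ID  ((uint32_t)-1)  /* stands in for an invalid kuid */
#define OVERFLOW_ID 65534u          /* assumed overflow uid ("nobody") */

struct id_map {
    uint32_t ns_first;    /* first ID as seen inside the namespace */
    uint32_t host_first;  /* first ID as seen on the host */
    uint32_t count;       /* number of IDs in the range */
};

/* Namespace ID -> host ID; loosely models make_kuid(ns, id). */
static uint32_t ns_to_host(const struct id_map *m, uint32_t ns_id)
{
    if (ns_id < m->ns_first || ns_id - m->ns_first >= m->count)
        return INVALID_ID;
    return m->host_first + (ns_id - m->ns_first);
}

/* Host ID -> namespace ID, with the overflow ID for unmapped values;
 * loosely models from_kuid_munged(ns, kuid). */
static uint32_t host_to_ns_munged(const struct id_map *m, uint32_t host_id)
{
    if (host_id < m->host_first || host_id - m->host_first >= m->count)
        return OVERFLOW_ID;
    return m->ns_first + (host_id - m->host_first);
}

/* Walks through the case from the report using the quoted mapping. */
static void check_quoted_mapping(void)
{
    const struct id_map m = { 0, 1000000, 100000 };

    /* Unpatched: the inode is owned by host uid 0 (global root), which
     * has no mapping in the container, so it shows up as "nobody". */
    assert(host_to_ns_munged(&m, 0) == OVERFLOW_ID);

    /* Patched: the inode is owned by the namespace's uid 0, which is
     * host uid 1000000 and appears as root inside the container. */
    assert(ns_to_host(&m, 0) == 1000000);
    assert(host_to_ns_munged(&m, ns_to_host(&m, 0)) == 0);
}
```

Calling check_quoted_mapping() from a test or a main() passes silently. The
first assertion corresponds to the "nobody" ownership seen inside the
container, the other two to the ownership the patch aims for.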