[lxc-devel] RFC: Device Namespaces

Amir Goldstein amir at cellrox.com
Sun Sep 29 18:14:40 UTC 2013


On Wed, Sep 25, 2013 at 11:13 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:

> Quoting Michael J Coss (michael.coss at alcatel-lucent.com):
> > I've been looking at this problem for some time to help solve my very
> > specific use case.  In our case we are using containers to provide
> > individual "desktops" to a number of users.  We want each desktop to
> > run X, and to bind and unbind a display, keyboard, and mouse to the X
> > server running in a particular container, without being able to grab
> > anyone else's keyboard, mouse, or display unless granted specific
> > access by the owner.  To that end, I started working on a udev
> > solution.  I understand that most containers don't/won't run udev.
> > And systemd won't even start udev if the container doesn't have the
> > mknod capability, which is a bit of an odd cookie, but I digress.
> >
> > Currently the kernel effectively broadcasts uevents to all network
> > namespaces, and this is an issue.  I don't want container A to see
> > container B's events.  It should see only what the admin has set in
> > the policy for that container.  This policy should be handled in
> > userspace on the host, on behalf of the containers.  A daemon there
> > can receive the events, forward the pertinent ones to the appropriate
> > container(s), and disregard the rest.  To accomplish this, I had to
> > change the broadcast mechanism, and then provide a forwarding
> > mechanism to specific network namespaces.
> >
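
For reference, the receive side of such a daemon is just a netlink
socket bound to the kernel uevent multicast group.  Here is a minimal
sketch that only prints each event; a real daemon would apply the
per-container policy at the marked spot and re-send the event into the
target container's network namespace (run as root on the host):

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

int main(void)
{
    struct sockaddr_nl addr = {
        .nl_family = AF_NETLINK,
        .nl_groups = 1,         /* kernel uevent multicast group */
    };
    char buf[8192];
    ssize_t len;
    int fd;

    fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
    if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return 1;

    while ((len = recv(fd, buf, sizeof(buf) - 1, 0)) > 0) {
        buf[len] = '\0';
        /* buf starts with "ACTION@DEVPATH", e.g. "add@/devices/...";
         * this is where policy filtering and forwarding would go. */
        printf("%s\n", buf);
    }
    close(fd);
    return 0;
}
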
> > Back in the day, that would have been sufficient.  Udev running in the
> > container would have gotten the add event, created the appropriate
> > device nodes and symlinks, and then cleaned up on remove/change events.
> > With the introduction of devtmpfs, udev no longer actually creates the
> > device nodes; it just handles links and name changes.  So I'm still
> > left needing to create/manage devtmpfs or some other solution.  This
> > leads me down the path of virtualizing devtmpfs.  I've been fooling
> > around with FUSE to basically mirror the host /dev (filtered
>
> Rather than using FUSE, I'd recommend looking into doing it the same
> way as the devpts fs.  It might not work out (or might be rejected) in
> the end, but at first glance it seems the right way to handle it.  Each
> new instance mount starts empty, changes in one are not reflected in
> another, but new devices which the kernel later creates may (subject to
> the device cgroup of the process which mounted it?) be created in the
> new instances.
>

I was thinking it makes sense to tie unique instances of the devtmpfs
sb to the userns, if for no other reason than that any mounted sb
already carries the knowledge of the userns that mounted it.
But also, I think devtmpfs needs to be made userns-friendly, so that it
can safely get the FS_USERNS_DEV_MOUNT flag.
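
Roughly what I have in mind, in the style of devpts.  This is only a
sketch: the "devns" name, the magic number, and the empty-tree
fill_super are placeholders, and error handling is minimal.

#include <linux/cred.h>
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/user_namespace.h>

#define DEVNS_SUPER_MAGIC 0x64657673    /* placeholder */

static int devns_fill_super(struct super_block *sb, void *data, int silent)
{
    /* Every instance starts as an empty tree. */
    static struct tree_descr empty[] = { { "" } };
    int err = simple_fill_super(sb, DEVNS_SUPER_MAGIC, empty);

    /* Pin the mounting task's userns on the sb, so device events can
     * later be filtered against the namespace that owns this mount. */
    if (!err)
        sb->s_fs_info = get_user_ns(current_user_ns());
    return err;
}

static struct dentry *devns_mount(struct file_system_type *fs_type,
                                  int flags, const char *dev_name, void *data)
{
    /* mount_nodev(): a fresh superblock per mount, so instances start
     * empty and changes in one are not reflected in another. */
    return mount_nodev(fs_type, flags, data, devns_fill_super);
}

static void devns_kill_sb(struct super_block *sb)
{
    put_user_ns(sb->s_fs_info);
    kill_litter_super(sb);
}

static struct file_system_type devns_fs_type = {
    .owner    = THIS_MODULE,
    .name     = "devns",
    .mount    = devns_mount,
    .kill_sb  = devns_kill_sb,
    .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
};

static int __init devns_init(void)
{
    return register_filesystem(&devns_fs_type);
}

static void __exit devns_exit(void)
{
    unregister_filesystem(&devns_fs_type);
}

module_init(devns_init);
module_exit(devns_exit);
MODULE_LICENSE("GPL");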



> > appropriately), but there are many ugly security and implementation
> > details that look bad to me.  I have been kicking around the notion
> > that the device cgroup might provide the list of "acceptable" devices,
> > from which a filter/view for devtmpfs could be constructed.
> >
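
That seems workable; the whitelist is already exported as text.  A
quick userspace sketch of reading a container's devices.list into
(type, major, minor, access) rules for such a view; the cgroup path is
just an example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct dev_rule {
    char type;          /* 'a' (all), 'b' (block) or 'c' (char) */
    int major, minor;   /* -1 means wildcard ('*') */
    char access[4];     /* subset of "rwm" */
};

/* Whitelist lines look like "c 1:3 rwm" or "b *:* rwm". */
static int parse_rule(const char *line, struct dev_rule *r)
{
    char maj[16], min[16];

    if (sscanf(line, "%c %15[^:]:%15s %3s",
               &r->type, maj, min, r->access) != 4)
        return -1;
    r->major = strcmp(maj, "*") ? atoi(maj) : -1;
    r->minor = strcmp(min, "*") ? atoi(min) : -1;
    return 0;
}

int main(void)
{
    /* example path for a container named "ct1" */
    FILE *f = fopen("/sys/fs/cgroup/devices/lxc/ct1/devices.list", "r");
    char line[128];
    struct dev_rule r;

    if (!f)
        return 1;
    while (fgets(line, sizeof(line), f))
        if (parse_rule(line, &r) == 0)
            printf("type=%c major=%d minor=%d access=%s\n",
                   r.type, r.major, r.minor, r.access);
    fclose(f);
    return 0;
}
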
> > I do have these changes working on a mostly stock 3.10 kernel.  The
> > kernel changes are pretty small, and the daemon does pretty minimal
> > filtering, mostly to demonstrate functionality.  It does assume that
> > each container is running in a separate network namespace, but that's
> > about it.
> >
> > Of course, that still leaves you with sysfs needing similar treatment.
> >
> > ---Michael J Coss
> >
> >