[Lxc-users] version 0.8.0 coming soon
Serge Hallyn
serge.hallyn at canonical.com
Tue Feb 28 23:16:12 UTC 2012
Quoting Papp Tamas (tompos at martos.bme.hu):
> On 02/28/2012 04:13 PM, Serge Hallyn wrote:
> >Quoting Papp Tamas (tompos at martos.bme.hu):
> >>On 02/28/2012 01:20 AM, Serge Hallyn wrote:
> >>>Quoting Daniel Lezcano (daniel.lezcano at free.fr):
> >>>>Hi all,
> >>>>
> >>>>I will release a 0.8.0-rc1. I am looking for volunteers to test it :)
> >>>Worked fine for me. Tested create and clone of ubuntu, ubuntu and
> >>>ubuntu-cloud images, with dir and lvm backing stores. (And a run
> >>>of lp:~serge-hallyn/+junk/lxc-test)
> >>>
> >>>Note, because upstream kernel didn't much care about the
> >>>'mount -o remount,ro /' problem, I'm going to patch lxc to
> >>>pin open a '${rootfs}.hold' file, as long as the container
> >>>is running. That will prevent the underlying fs from being
> >>>remounted ro. (see
> >>>https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/942325 for
> >>>details). That'll buy us some time to find a better solution
> >>>in the kernel.
> >>>
> >>>
> >>Why can a container change mount options outside of its rootfs?
> >>Sorry for the stupid question:)
> >It's not a stupid question at all.
> >
> >The container isn't changing mount options outside of its rootfs. There
> >are two places an fs can be marked readonly - in the mount itself, and in
> >the superblock. When you make a bind mount, you are creating more mounts
> >(vfsmounts) using the same superblock.
> >
> >If you do
> >
> > mount --bind / / # not needed in container bc it's already been done
> > mount --bind -o remount,ro /
> >
> >then you are setting the readonly flag on the mount itself. If you just do
> >
> > mount -o remount,ro /
> >
> >then you are setting the readonly flag on the superblock, which will
> >force all other mounts of that superblock to also be readonly.
> >
> >Right now there is no way to prevent a container from doing that. I sent
> >a patch to make the devices cgroup be consulted on that, so that it could
> >return -EPERM. That was refused. The two other options I'm considering
> >(and it wouldn't hurt to have both) are 1. to pass the remount flags to the
> >LSM (selinux or apparmor or smack) so that it can deny permission. Right
> >now it can't do that (except for all-or-nothing check on remount). And 2.
> >to make it so that after doing
> >
> > mount --bind / /
> > mount --bind -o remount,ro /
> > mount --bind -o remount,rw /
> >
> >any subsequent
> >
> > mount -o remount,rw /
> >
> >would be refused (or automatically done only at the mount level). I don't
> >think that should be hard to do at fs/namespace.c:do_remount().
>
>
> This may be too much for my brain:)
>
> Anyway, could you make a deb package from it?
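For reference, the two remount cases in the quoted explanation map onto
mount(2) flags roughly as follows. This is only a sketch: the helper names
remount_ro_flags() and remount_readonly() are made up for illustration and
are not part of lxc or the kernel. The flag behaviour it relies on is that
MS_REMOUNT combined with MS_BIND changes only the per-mount flags, while
MS_REMOUNT alone goes to the superblock.

```c
#include <stddef.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: pick the mount(2) flags for a read-only remount.
 *   per_mount_only != 0  ->  like `mount --bind -o remount,ro /`
 *                            (flag set on this vfsmount only)
 *   per_mount_only == 0  ->  like `mount -o remount,ro /`
 *                            (flag set on the shared superblock)
 */
unsigned long remount_ro_flags(int per_mount_only)
{
    unsigned long flags = MS_REMOUNT | MS_RDONLY;

    if (per_mount_only)
        flags |= MS_BIND;
    return flags;
}

/*
 * Hypothetical wrapper around mount(2). Source and filesystem type are
 * ignored for a remount, so NULL is passed for both. Returns 0 on
 * success, -1 with errno set on failure. Needs CAP_SYS_ADMIN to succeed.
 */
int remount_readonly(const char *target, int per_mount_only)
{
    return mount(NULL, target, NULL, remount_ro_flags(per_mount_only), NULL);
}
```

Calling remount_readonly("/", 0) from inside a container whose rootfs is a
plain directory is the case the patch below works around.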
I've got it working for an ubuntu package, though we're in freeze right
now. I intend to push the patch to my github tree tomorrow, and I've
pushed the package to ppa:serge-hallyn/virt (version 0.7.5-3ubuntu31,
should build in a few hours). Meanwhile here is the actual patch for
now.
Tests fine for me.
Subject: lxc-start: if rootfs is a dir, pin the fs
Otherwise the container can remount the shared underlying fs readonly.
Index: lxc-dnsmasq/src/lxc/conf.c
===================================================================
--- lxc-dnsmasq.orig/src/lxc/conf.c 2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/conf.c 2012-02-28 20:05:45.538144907 +0000
@@ -445,6 +445,51 @@
return mount_unknow_fs(rootfs, target, 0);
}
+/*
+ * pin_rootfs
+ * if rootfs is a directory, then open ${rootfs}.hold for writing for the
+ * duration of the container run, to prevent the container from marking the
+ * underlying fs readonly on shutdown.
+ * return -1 on error.
+ * return -2 if nothing needed to be pinned.
+ * return an open fd (>=0) if we pinned it.
+ */
+int pin_rootfs(const char *rootfs)
+{
+ char absrootfs[MAXPATHLEN];
+ char absrootfspin[MAXPATHLEN];
+ struct stat s;
+ int ret, fd;
+
+ if (!realpath(rootfs, absrootfs)) {
+ SYSERROR("failed to get real path for '%s'", rootfs);
+ return -1;
+ }
+
+ if (access(absrootfs, F_OK)) {
+ SYSERROR("'%s' is not accessible", absrootfs);
+ return -1;
+ }
+
+ if (stat(absrootfs, &s)) {
+ SYSERROR("failed to stat '%s'", absrootfs);
+ return -1;
+ }
+
+ if (!__S_ISTYPE(s.st_mode, S_IFDIR))
+ return -2;
+
+ ret = snprintf(absrootfspin, MAXPATHLEN, "%s%s", absrootfs, ".hold");
+ if (ret >= MAXPATHLEN) {
+ SYSERROR("pathname too long for rootfs hold file");
+ return -1;
+ }
+
+ fd = open(absrootfspin, O_CREAT | O_RDWR, S_IWUSR|S_IRUSR);
+ INFO("opened %s as fd %d\n", absrootfspin, fd);
+ return fd;
+}
+
static int mount_rootfs(const char *rootfs, const char *target)
{
char absrootfs[MAXPATHLEN];
Index: lxc-dnsmasq/src/lxc/conf.h
===================================================================
--- lxc-dnsmasq.orig/src/lxc/conf.h 2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/conf.h 2012-02-28 19:13:01.400960000 +0000
@@ -218,6 +218,8 @@
*/
extern struct lxc_conf *lxc_conf_init(void);
+extern int pin_rootfs(const char *rootfs);
+
extern int lxc_create_network(struct lxc_handler *handler);
extern void lxc_delete_network(struct lxc_list *networks);
extern int lxc_assign_network(struct lxc_list *networks, pid_t pid);
Index: lxc-dnsmasq/src/lxc/start.c
===================================================================
--- lxc-dnsmasq.orig/src/lxc/start.c 2012-02-28 19:13:01.400960000 +0000
+++ lxc-dnsmasq/src/lxc/start.c 2012-02-28 20:07:41.174882442 +0000
@@ -565,6 +565,7 @@
int clone_flags;
int failed_before_rename = 0;
const char *name = handler->name;
+ int pinfd;
if (lxc_sync_init(handler))
return -1;
@@ -585,6 +586,17 @@
}
+ /*
+ * if the rootfs is not a blockdev, prevent the container from
+ * marking it readonly.
+ */
+
+ pinfd = pin_rootfs(handler->conf->rootfs.path);
+ if (pinfd == -1) {
+ ERROR("failed to pin the container's rootfs");
+ goto out_abort;
+ }
+
/* Create a process in a new set of namespaces */
handler->pid = lxc_clone(do_start, handler, clone_flags);
if (handler->pid < 0) {
@@ -627,6 +639,10 @@
}
lxc_sync_fini(handler);
+
+ if (pinfd >= 0)
+ close(pinfd);
+
return 0;
out_delete_net:
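
To try the pinning idea outside of lxc, here is a minimal standalone sketch
of the same steps as pin_rootfs() above. It is not the patch itself:
pin_dir_sketch() is a hypothetical name, it uses the portable S_ISDIR()
macro and PATH_MAX in place of __S_ISTYPE() and MAXPATHLEN, and it drops the
lxc logging macros. The return convention follows the patch (-1 on error,
-2 if the path is not a directory, otherwise an open fd).

```c
#define _XOPEN_SOURCE 700   /* for realpath() under strict -std= modes */

#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical standalone variant of pin_rootfs(): if `dir` is a
 * directory, open "<realpath of dir>.hold" read-write and return the fd.
 * The caller keeps the fd open for as long as the pin is wanted and
 * close()s it afterwards.
 */
int pin_dir_sketch(const char *dir)
{
    char real[PATH_MAX];
    char hold[PATH_MAX];
    struct stat st;
    int n;

    if (!realpath(dir, real))           /* resolve symlinks, "..", etc. */
        return -1;

    if (stat(real, &st) != 0)
        return -1;

    if (!S_ISDIR(st.st_mode))           /* nothing to pin */
        return -2;

    n = snprintf(hold, sizeof(hold), "%s.hold", real);
    if (n < 0 || (size_t)n >= sizeof(hold))
        return -1;                      /* hold path would be truncated */

    /* open() returns -1 on failure, which matches the error convention */
    return open(hold, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
}
```

The hold file sits next to the directory, not inside it, so it lives on the
filesystem that contains the rootfs directory and is not visible from within
the container.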