[Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
Michael H. Warfield
mhw at WittsEnd.com
Sat Jun 25 17:39:16 UTC 2011
Serge,
On Fri, 2011-06-24 at 12:54 -0500, Serge Hallyn wrote:
> I.e. with systemd or libcgroup.
>
> To do this, instead of looking for one cgroup called 'lxc' or
> otherwise taking the first cgroup we find, we actually create a
> container in every mounted cgroup fs. Right now it's done under the
> root of each fs. We may want to put that under lxc, or, better yet,
> make that configurable.
>
> Note the use of clone_children seems not quite right, but that's
> not for this patch to fix. In particular, if clone_children is
> not in the mntopts, we reject it. Yet later we try to set it
> ourselves. I believe we should simply, if ns cgroup is not
> composed, always try to set clone_children to 1. As it stands,
> with libcgroup installed, I had to do
> cd /sys/fs/cgroup
> for d in `/bin/ls`; do
> echo 1 > $d/cgroup.clone_children
> done
Doing this step alone broke lxc totally for me, with or without the
patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as
a build from git without the patch below and a build from git with the
patch below.
[root at forest mhw]# lxc-start -n Alcove
lxc-start: failed to clone(0x6c020000): Invalid argument
lxc-start: Invalid argument - failed to fork into a new namespace
lxc-start: failed to spawn 'Alcove'
lxc-start: No such file or directory - failed to remove cgroup '/sys/fs/cgroup/systemd/Alcove'
If I set those cgroup.clone_children values back to 0, Alcove fires up
normally with or without this patch with no "devices.*" entries in the
config.
If I have devices.allow and device.deny entries in that config, then
your patch below works! And I see subdirectories in all the cgroup
mounts for "Alcove".
<random aside> [[
Daniel -
Also, that nifty little option for ignoring other options in the config
works like a charm so my parameters for this now work uncommented for my
own high level scripts:
--
# High level script control management...
lxc-boot.onboot = true
lxc-boot.disabled = false
lxc-system.description = "Alcove system"
--
Thank you...
</random aside> ]]
> But after that, 'lxc-start -n l1' worked like a charm. It also
> continues to work with a single mount of cgroups under /cgroup.
Bottom line. AFA starting a container, your patch works for me and
fixes the systemd/libcgroup multi-mount problem but if and only if I do
NOT do what you described above. That's disturbing. Why does it work
one way for you and the exact opposite for me?
We do seem to have a remaining problem though...
Running or stopped, lxc-info seems broken:
[root at forest mhw]# lxc-info -n Alcove
Segmentation fault (core dumped)
Daniel - I saw you put this patch off, postponing it. Are you awaiting
testing like mine or is there some additional issues that are being
missed (like this weirdism with cgroup.clone_children). That seg fault
could be reason enough, too.
Regards,
Mike
> Signed-off-by: Serge Hallyn <serge.hallyn at ubuntu.com>
> ---
> src/lxc/cgroup.c | 207 +++++++++++++++++++++++++++++++++--------------------
> src/lxc/cgroup.h | 2 +-
> src/lxc/freezer.c | 2 +-
> src/lxc/lxc.h | 8 +-
> src/lxc/state.c | 2 +-
> 5 files changed, 135 insertions(+), 86 deletions(-)
>
> diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c
> index a068a01..ecba56e 100644
> --- a/src/lxc/cgroup.c
> +++ b/src/lxc/cgroup.c
> @@ -52,11 +52,10 @@ enum {
> CGROUP_CLONE_CHILDREN,
> };
>
> -static int get_cgroup_mount(const char *mtab, char *mnt)
> +static int get_cgroup_mount(const char *mtab, const char *subsystem, char *mnt)
> {
> struct mntent *mntent;
> FILE *file = NULL;
> - int err = -1;
>
> file = setmntent(mtab, "r");
> if (!file) {
> @@ -66,29 +65,24 @@ static int get_cgroup_mount(const char *mtab, char *mnt)
>
> while ((mntent = getmntent(file))) {
>
> - /* there is a cgroup mounted named "lxc" */
> - if (!strcmp(mntent->mnt_fsname, "lxc") &&
> - !strcmp(mntent->mnt_type, "cgroup")) {
> - strcpy(mnt, mntent->mnt_dir);
> - err = 0;
> - break;
> - }
> -
> - /* fallback to the first non-lxc cgroup found */
> - if (!strcmp(mntent->mnt_type, "cgroup") && err) {
> + if (strcmp(mntent->mnt_type, "cgroup"))
> + continue;
> + if (!subsystem || hasmntopt(mntent, subsystem)) {
> strcpy(mnt, mntent->mnt_dir);
> - err = 0;
> + fclose(file);
> + DEBUG("using cgroup mounted at '%s'", mnt);
> + return 0;
> }
> };
>
> - DEBUG("using cgroup mounted at '%s'", mnt);
> + DEBUG("Failed to find cgroup for %s\n", subsystem ? subsystem : "(NULL)");
>
> fclose(file);
>
> - return err;
> + return -1;
> }
>
> -static int get_cgroup_flags(const char *mtab, int *flags)
> +static int get_cgroup_flags(const char *mtab, const char *mnt_dir, int *flags)
> {
> struct mntent *mntent;
> FILE *file = NULL;
> @@ -103,38 +97,24 @@ static int get_cgroup_flags(const char *mtab, int *flags)
> *flags = 0;
>
> while ((mntent = getmntent(file))) {
> -
> - /* there is a cgroup mounted named "lxc" */
> - if (!strcmp(mntent->mnt_fsname, "lxc") &&
> - !strcmp(mntent->mnt_type, "cgroup")) {
> -
> - if (hasmntopt(mntent, "ns"))
> - *flags |= CGROUP_NS_CGROUP;
> -
> - if (hasmntopt(mntent, "clone_children"))
> - *flags |= CGROUP_CLONE_CHILDREN;
> -
> + if (strcmp(mntent->mnt_type, "cgroup"))
> + continue;
> + if (strcmp(mntent->mnt_dir, mnt_dir))
> + continue;
> + if (hasmntopt(mntent, "ns")) {
> + *flags |= CGROUP_NS_CGROUP;
> err = 0;
> - break;
> }
> -
> - /* fallback to the first non-lxc cgroup found */
> - if (!strcmp(mntent->mnt_type, "cgroup") && err) {
> -
> - if (hasmntopt(mntent, "ns"))
> - *flags |= CGROUP_NS_CGROUP;
> -
> - if (hasmntopt(mntent, "clone_children"))
> - *flags |= CGROUP_CLONE_CHILDREN;
> -
> + if (hasmntopt(mntent, "clone_children")) {
> + *flags |= CGROUP_CLONE_CHILDREN;
> err = 0;
> }
> - };
> -
> - DEBUG("cgroup flags is 0x%x", *flags);
> -
> - fclose(file);
> + fclose(file);
> + DEBUG("cgroup flags for %s is 0x%x", mnt_dir, *flags);
> + return err;
> + }
>
> + fclose(file);
> return err;
> }
>
> @@ -199,18 +179,17 @@ static int cgroup_attach(const char *path, pid_t pid)
> return ret;
> }
>
> -int lxc_cgroup_create(const char *name, pid_t pid)
> +/*
> + * create a cgroup for the container in a particular subsystem.
> + * XXX TODO we will of course want to use cgroup_path{subsystem}/lxc/name,
> + * not just cgroup_path{subsystem}/name.
> + */
> +int lxc_one_cgroup_create(const char *name, const char *cgmnt, pid_t pid)
> {
> - char cgmnt[MAXPATHLEN];
> char cgname[MAXPATHLEN];
> char clonechild[MAXPATHLEN];
> int flags;
>
> - if (get_cgroup_mount(MTAB, cgmnt)) {
> - ERROR("cgroup is not mounted");
> - return -1;
> - }
> -
> snprintf(cgname, MAXPATHLEN, "%s/%s", cgmnt, name);
>
> /*
> @@ -222,7 +201,7 @@ int lxc_cgroup_create(const char *name, pid_t pid)
> return -1;
> }
>
> - if (get_cgroup_flags(MTAB, &flags)) {
> + if (get_cgroup_flags(MTAB, cgmnt, &flags)) {
> SYSERROR("failed to get cgroup flags");
> return -1;
> }
> @@ -266,16 +245,44 @@ int lxc_cgroup_create(const char *name, pid_t pid)
> return 0;
> }
>
> -int lxc_cgroup_destroy(const char *name)
> +/*
> + * for each mounted cgroup, create a cgroup for the container
> + */
> +int lxc_cgroup_create(const char *name, pid_t pid)
> {
> - char cgmnt[MAXPATHLEN];
> - char cgname[MAXPATHLEN];
> + struct mntent *mntent;
> + FILE *file = NULL;
> + int ret, err = -1;
>
> - if (get_cgroup_mount(MTAB, cgmnt)) {
> - ERROR("cgroup is not mounted");
> + DEBUG("%s: starting\n", __func__);
> + file = setmntent(MTAB, "r");
> + if (!file) {
> + SYSERROR("failed to open %s", MTAB);
> return -1;
> }
>
> + while ((mntent = getmntent(file))) {
> + if (!strcmp(mntent->mnt_type, "cgroup")) {
> + DEBUG("creating %s %s %d\n", name, mntent->mnt_dir, pid);
> + err = lxc_one_cgroup_create(name, mntent->mnt_dir, pid);
> + if (ret) {
> + fclose(file);
> + return ret;
> + }
> + err = 0;
> + }
> + };
> +
> + fclose(file);
> +
> + return err;
> +}
> +
> +
> +int lxc_one_cgroup_destroy(const char *cgmnt, const char *name)
> +{
> + char cgname[MAXPATHLEN];
> +
> snprintf(cgname, MAXPATHLEN, "%s/%s", cgmnt, name);
> if (rmdir(cgname)) {
> SYSERROR("failed to remove cgroup '%s'", cgname);
> @@ -287,37 +294,79 @@ int lxc_cgroup_destroy(const char *name)
> return 0;
> }
>
> -int lxc_cgroup_path_get(char **path, const char *name)
> +/*
> + * for each mounted cgroup, destroy the cgroup for the container
> + */
> +int lxc_cgroup_destroy(const char *name)
> {
> - static char cgroup[MAXPATHLEN];
> - static const char* cgroup_cached = 0;
> - static char buf[MAXPATHLEN];
> + struct mntent *mntent;
> + FILE *file = NULL;
> + int ret, err = -1;
>
> - if (!cgroup_cached) {
> - if (get_cgroup_mount(MTAB, cgroup)) {
> - ERROR("cgroup is not mounted");
> - return -1;
> - } else {
> - cgroup_cached = cgroup;
> + file = setmntent(MTAB, "r");
> + if (!file) {
> + SYSERROR("failed to open %s", MTAB);
> + return -1;
> + }
> +
> + while ((mntent = getmntent(file))) {
> + if (!strcmp(mntent->mnt_type, "cgroup")) {
> + DEBUG("destroying %s %s\n", mntent->mnt_dir, name);
> + ret = lxc_one_cgroup_destroy(mntent->mnt_dir, name);
> + if (ret) {
> + fclose(file);
> + return ret;
> + }
> + err = 0;
> }
> + }
> +
> + fclose(file);
> +
> + return err;
> +}
> +/*
> + * lxc_cgroup_path_get: put into *path the pathname for
> + * %subsystem and cgroup %name. If %subsystem is NULL, then
> + * the first mounted cgroup will be used (for nr_tasks)
> + */
> +int lxc_cgroup_path_get(char **path, const char *subsystem, const char *name)
> +{
> + static char buf[MAXPATHLEN];
> + static char retbuf[MAXPATHLEN];
> +
> + /* what lxc_cgroup_set calls subsystem is actually the filename, i.e.
> + 'devices.allow'. So for our purposee we trim it */
> + if (subsystem) {
> + snprintf(retbuf, MAXPATHLEN, "%s", subsystem);
> + char *s = index(retbuf, '.');
> + *s = '\0';
> + DEBUG("%s: called for subsys %s name %s\n", __func__, retbuf, name);
> }
> + if (get_cgroup_mount(MTAB, subsystem ? retbuf : NULL, buf)) {
> + ERROR("cgroup is not mounted");
> + return -1;
> + }
> +
> + snprintf(retbuf, MAXPATHLEN, "%s/%s", buf, name);
> +
> + DEBUG("%s: returning %s for subsystem %s", __func__, retbuf, subsystem);
>
> - snprintf(buf, MAXPATHLEN, "%s/%s", cgroup_cached, name);
> - *path = buf;
> + *path = retbuf;
> return 0;
> }
>
> -int lxc_cgroup_set(const char *name, const char *subsystem, const char *value)
> +int lxc_cgroup_set(const char *name, const char *filename, const char *value)
> {
> int fd, ret;
> - char *nsgroup;
> + char *dirpath;
> char path[MAXPATHLEN];
>
> - ret = lxc_cgroup_path_get(&nsgroup, name);
> + ret = lxc_cgroup_path_get(&dirpath, filename, name);
> if (ret)
> return -1;
>
> - snprintf(path, MAXPATHLEN, "%s/%s", nsgroup, subsystem);
> + snprintf(path, MAXPATHLEN, "%s/%s", dirpath, filename);
>
> fd = open(path, O_WRONLY);
> if (fd < 0) {
> @@ -337,18 +386,18 @@ out:
> return ret;
> }
>
> -int lxc_cgroup_get(const char *name, const char *subsystem,
> +int lxc_cgroup_get(const char *name, const char *filename,
> char *value, size_t len)
> {
> int fd, ret = -1;
> - char *nsgroup;
> + char *dirpath;
> char path[MAXPATHLEN];
>
> - ret = lxc_cgroup_path_get(&nsgroup, name);
> + ret = lxc_cgroup_path_get(&dirpath, filename, name);
> if (ret)
> return -1;
>
> - snprintf(path, MAXPATHLEN, "%s/%s", nsgroup, subsystem);
> + snprintf(path, MAXPATHLEN, "%s/%s", dirpath, filename);
>
> fd = open(path, O_RDONLY);
> if (fd < 0) {
> @@ -366,16 +415,16 @@ int lxc_cgroup_get(const char *name, const char *subsystem,
>
> int lxc_cgroup_nrtasks(const char *name)
> {
> - char *nsgroup;
> + char *dpath;
> char path[MAXPATHLEN];
> int pid, ret, count = 0;
> FILE *file;
>
> - ret = lxc_cgroup_path_get(&nsgroup, name);
> + ret = lxc_cgroup_path_get(&dpath, NULL, name);
> if (ret)
> return -1;
>
> - snprintf(path, MAXPATHLEN, "%s/tasks", nsgroup);
> + snprintf(path, MAXPATHLEN, "%s/tasks", dpath);
>
> file = fopen(path, "r");
> if (!file) {
> diff --git a/src/lxc/cgroup.h b/src/lxc/cgroup.h
> index 8607fa8..f6df8c5 100644
> --- a/src/lxc/cgroup.h
> +++ b/src/lxc/cgroup.h
> @@ -28,6 +28,6 @@
> struct lxc_handler;
> int lxc_cgroup_create(const char *name, pid_t pid);
> int lxc_cgroup_destroy(const char *name);
> -int lxc_cgroup_path_get(char **path, const char *name);
> +int lxc_cgroup_path_get(char **path, const char *subsystem, const char *name);
> int lxc_cgroup_nrtasks(const char *name);
> #endif
> diff --git a/src/lxc/freezer.c b/src/lxc/freezer.c
> index 07eede5..94984a3 100644
> --- a/src/lxc/freezer.c
> +++ b/src/lxc/freezer.c
> @@ -45,7 +45,7 @@ static int freeze_unfreeze(const char *name, int freeze)
> char tmpf[32];
> int fd, ret;
>
> - ret = lxc_cgroup_path_get(&nsgroup, name);
> + ret = lxc_cgroup_path_get(&nsgroup, "freezer", name);
> if (ret)
> return -1;
>
> diff --git a/src/lxc/lxc.h b/src/lxc/lxc.h
> index a091baa..04df1c6 100644
> --- a/src/lxc/lxc.h
> +++ b/src/lxc/lxc.h
> @@ -114,22 +114,22 @@ extern lxc_state_t lxc_state(const char *name);
> * Set a specified value for a specified subsystem. The specified
> * subsystem must be fully specified, eg. "cpu.shares"
> * @name : the name of the container
> - * @subsystem : the subsystem
> + * @filename : the cgroup attribute filename
> * @value : the value to be set
> * Returns 0 on success, < 0 otherwise
> */
> -extern int lxc_cgroup_set(const char *name, const char *subsystem, const char *value);
> +extern int lxc_cgroup_set(const char *name, const char *filename, const char *value);
>
> /*
> * Get a specified value for a specified subsystem. The specified
> * subsystem must be fully specified, eg. "cpu.shares"
> * @name : the name of the container
> - * @subsystem : the subsystem
> + * @filename : the cgroup attribute filename
> * @value : the value to be set
> * @len : the len of the value variable
> * Returns the number of bytes read, < 0 on error
> */
> -extern int lxc_cgroup_get(const char *name, const char *subsystem,
> +extern int lxc_cgroup_get(const char *name, const char *filename,
> char *value, size_t len);
>
> /*
> diff --git a/src/lxc/state.c b/src/lxc/state.c
> index 6720011..b435eba 100644
> --- a/src/lxc/state.c
> +++ b/src/lxc/state.c
> @@ -71,7 +71,7 @@ static int freezer_state(const char *name)
> FILE *file;
> int err;
>
> - err = lxc_cgroup_path_get(&nsgroup, name);
> + err = lxc_cgroup_path_get(&nsgroup, "freezer", name);
> if (err)
> return -1;
>
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | mhw at WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 482 bytes
Desc: This is a digitally signed message part
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20110625/db5976b9/attachment.pgp>
More information about the lxc-users
mailing list