[lxc-devel] [lxd/master] Documentation improvements

stgraber on Github lxc-bot at linuxcontainers.org
Tue Apr 14 16:58:26 UTC 2020


From 4e53e75245ec5a182efa3e0071d6c540b42b3db6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?St=C3=A9phane=20Graber?= <stgraber at ubuntu.com>
Date: Tue, 14 Apr 2020 12:52:20 -0400
Subject: [PATCH 1/2] doc/instances: Clarify config conditions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes #7179

Signed-off-by: Stéphane Graber <stgraber at ubuntu.com>
---
 doc/instances.md | 124 +++++++++++++++++++++++------------------------
 1 file changed, 62 insertions(+), 62 deletions(-)
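
Reviewer note: a minimal sketch of the Condition changes made by the diff below, written as a small shell lookup. `key_condition` is a hypothetical helper for illustration only, not part of LXD; it covers only the keys whose Condition value this patch changes, and reports everything else as `unchanged`.

```bash
#!/bin/sh
# key_condition is a hypothetical helper (not part of LXD). It returns the
# Condition value that this patch assigns to the keys whose row it changes.
key_condition() {
    case "$1" in
        # Previously "container", now "unprivileged container"
        raw.idmap|security.idmap.*) echo "unprivileged container" ;;
        # Previously "-", now "container"
        limits.cpu.allowance|limits.cpu.priority) echo "container" ;;
        limits.memory.swap|limits.memory.swap.priority) echo "container" ;;
        security.devlxd|security.devlxd.images|security.nesting) echo "container" ;;
        # Condition not modified by this patch; see the full table
        *) echo "unchanged" ;;
    esac
}

key_condition security.nesting
key_condition security.idmap.size
```

The two calls at the end print the new Condition for a key that moved from `-` to `container` and for one that moved from `container` to `unprivileged container`.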

diff --git a/doc/instances.md b/doc/instances.md
index 8305efe943..c212149f0c 100644
--- a/doc/instances.md
+++ b/doc/instances.md
@@ -34,68 +34,68 @@ currently supported:
 
 The currently supported keys are:
 
-Key                                         | Type      | Default           | Live update   | Condition     | Description
-:--                                         | :---      | :------           | :----------   | :----------       | :----------
-boot.autostart                              | boolean   | -                 | n/a           | -                 | Always start the instance when LXD starts (if not set, restore last state)
-boot.autostart.delay                        | integer   | 0                 | n/a           | -                 | Number of seconds to wait after the instance started before starting the next one
-boot.autostart.priority                     | integer   | 0                 | n/a           | -                 | What order to start the instances in (starting with highest)
-boot.host\_shutdown\_timeout                | integer   | 30                | yes           | -                 | Seconds to wait for instance to shutdown before it is force stopped
-boot.stop.priority                          | integer   | 0                 | n/a           | -                 | What order to shutdown the instances (starting with highest)
-environment.\*                              | string    | -                 | yes (exec)    | -                 | key/value environment variables to export to the instance and set on exec
-limits.cpu                                  | string    | - (all)           | yes           | -                 | Number or range of CPUs to expose to the instance
-limits.cpu.allowance                        | string    | 100%              | yes           | -                 | How much of the CPU can be used. Can be a percentage (e.g. 50%) for a soft limit or hard a chunk of time (25ms/100ms)
-limits.cpu.priority                         | integer   | 10 (maximum)      | yes           | -                 | CPU scheduling priority compared to other instances sharing the same CPUs (overcommit) (integer between 0 and 10)
-limits.disk.priority                        | integer   | 5 (medium)        | yes           | -                 | When under load, how much priority to give to the instance's I/O requests (integer between 0 and 10)
-limits.hugepages.64KB                       | string    | -                 | yes           | container         | Fixed value in bytes (various suffixes supported, see below) to limit number of 64 KB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.1MB                        | string    | -                 | yes           | container         | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 MB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.2MB                        | string    | -                 | yes           | container         | Fixed value in bytes (various suffixes supported, see below) to limit number of 2 MB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.1GB                        | string    | -                 | yes           | container         | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 GB hugepages (Available hugepage sizes are architecture dependent.)
-limits.kernel.\*                            | string    | -                 | no            | container         | This limits kernel resources per instance (e.g. number of open files)
-limits.memory                               | string    | - (all)           | yes           | -                 | Percentage of the host's memory or fixed value in bytes (various suffixes supported, see below)
-limits.memory.enforce                       | string    | hard              | yes           | container         | If hard, instance can't exceed its memory limit. If soft, the instance can exceed its memory limit when extra host memory is available
-limits.memory.hugepages                     | boolean   | false             | no            | virtual-machine   | Controls whether to back the instance using hugepages rather than regular system memory
-limits.memory.swap                          | boolean   | true              | yes           | -                 | Whether to allow some of the instance's memory to be swapped out to disk
-limits.memory.swap.priority                 | integer   | 10 (maximum)      | yes           | -                 | The higher this is set, the least likely the instance is to be swapped to disk (integer between 0 and 10)
-limits.network.priority                     | integer   | 0 (minimum)       | yes           | -                 | When under load, how much priority to give to the instance's network requests (integer between 0 and 10)
-limits.processes                            | integer   | - (max)           | yes           | container         | Maximum number of processes that can run in the instance
-linux.kernel\_modules                       | string    | -                 | yes           | container         | Comma separated list of kernel modules to load before starting the instance
-migration.incremental.memory                | boolean   | false             | yes           | container         | Incremental memory transfer of the instance's memory to reduce downtime
-migration.incremental.memory.goal           | integer   | 70                | yes           | container         | Percentage of memory to have in sync before stopping the instance
-migration.incremental.memory.iterations     | integer   | 10                | yes           | container         | Maximum number of transfer operations to go through before stopping the instance
-nvidia.driver.capabilities                  | string    | compute,utility   | no            | container         | What driver capabilities the instance needs (sets libnvidia-container NVIDIA\_DRIVER\_CAPABILITIES)
-nvidia.runtime                              | boolean   | false             | no            | container         | Pass the host NVIDIA and CUDA runtime libraries into the instance
-nvidia.require.cuda                         | string    | -                 | no            | container         | Version expression for the required CUDA version (sets libnvidia-container NVIDIA\_REQUIRE\_CUDA)
-nvidia.require.driver                       | string    | -                 | no            | container         | Version expression for the required driver version (sets libnvidia-container NVIDIA\_REQUIRE\_DRIVER)
-raw.apparmor                                | blob      | -                 | yes           | container         | Apparmor profile entries to be appended to the generated profile
-raw.idmap                                   | blob      | -                 | no            | container         | Raw idmap configuration (e.g. "both 1000 1000")
-raw.lxc                                     | blob      | -                 | no            | container         | Raw LXC configuration to be appended to the generated one
-raw.qemu                                    | blob      | -                 | no            | virtual-machine   | Raw Qemu configuration to be appended to the generated command line
-raw.seccomp                                 | blob      | -                 | no            | container         | Raw Seccomp configuration
-security.devlxd                             | boolean   | true              | no            | -                 | Controls the presence of /dev/lxd in the instance
-security.devlxd.images                      | boolean   | false             | no            | -                 | Controls the availability of the /1.0/images API over devlxd
-security.idmap.base                         | integer   | -                 | no            | container         | The base host ID to use for the allocation (overrides auto-detection)
-security.idmap.isolated                     | boolean   | false             | no            | container         | Use an idmap for this instance that is unique among instances with isolated set
-security.idmap.size                         | integer   | -                 | no            | container         | The size of the idmap to use
-security.nesting                            | boolean   | false             | yes           | -                 | Support running lxd (nested) inside the instance
-security.privileged                         | boolean   | false             | no            | container         | Runs the instance in privileged mode
-security.protection.delete                  | boolean   | false             | yes           | -                 | Prevents the instance from being deleted
-security.protection.shift                   | boolean   | false             | yes           | container         | Prevents the instance's filesystem from being uid/gid shifted on startup
-security.secureboot                         | boolean   | true              | no            | virtual-machine   | Controls whether UEFI secure boot is enabled with the default Microsoft keys
-security.syscalls.blacklist                 | string    | -                 | no            | container         | A '\n' separated list of syscalls to blacklist
-security.syscalls.blacklist\_compat         | boolean   | false             | no            | container         | On x86\_64 this enables blocking of compat\_\* syscalls, it is a no-op on other arches
-security.syscalls.blacklist\_default        | boolean   | true              | no            | container         | Enables the default syscall blacklist
-security.syscalls.intercept.mknod           | boolean   | false             | no            | container         | Handles the `mknod` and `mknodat` system calls (allows creation of a limited subset of char/block devices)
-security.syscalls.intercept.mount           | boolean   | false             | no            | container         | Handles the `mount` system call
-security.syscalls.intercept.mount.allowed   | string    | -                 | yes           | container         | Specify a comma-separated list of filesystems that are safe to mount for processes inside the instance
-security.syscalls.intercept.mount.fuse      | string    | -                 | yes           | container         | Whether to mount shiftfs on top of filesystems handled through mount syscall interception
-security.syscalls.intercept.mount.shift     | boolean   | false             | yes           | container         | Whether to redirect mounts of a given filesystem to their fuse implemenation (e.g. ext4=fuse2fs)
-security.syscalls.intercept.setxattr        | boolean   | false             | no            | container         | Handles the `setxattr` system call (allows setting a limited subset of restricted extended attributes)
-security.syscalls.whitelist                 | string    | -                 | no            | container         | A '\n' separated list of syscalls to whitelist (mutually exclusive with security.syscalls.blacklist\*)
-snapshots.schedule                          | string    | -                 | no            | -                 | Cron expression (`<minute> <hour> <dom> <month> <dow>`)
-snapshots.schedule.stopped                  | bool      | false             | no            | -                 | Controls whether or not stopped instances are to be snapshoted automatically
-snapshots.pattern                           | string    | snap%d            | no            | -                 | Pongo2 template string which represents the snapshot name (used for scheduled snapshots and unnamed snapshots)
-snapshots.expiry                            | string    | -                 | no            | -                 | Controls when snapshots are to be deleted (expects expression like `1M 2H 3d 4w 5m 6y`)
-user.\*                                     | string    | -                 | n/a           | -                 | Free form user key/value storage (can be used in search)
+Key                                         | Type      | Default           | Live update   | Condition                 | Description
+:--                                         | :---      | :------           | :----------   | :----------               | :----------
+boot.autostart                              | boolean   | -                 | n/a           | -                         | Always start the instance when LXD starts (if not set, restore last state)
+boot.autostart.delay                        | integer   | 0                 | n/a           | -                         | Number of seconds to wait after the instance started before starting the next one
+boot.autostart.priority                     | integer   | 0                 | n/a           | -                         | What order to start the instances in (starting with highest)
+boot.host\_shutdown\_timeout                | integer   | 30                | yes           | -                         | Seconds to wait for the instance to shut down before it is force stopped
+boot.stop.priority                          | integer   | 0                 | n/a           | -                         | What order to shut down the instances in (starting with highest)
+environment.\*                              | string    | -                 | yes (exec)    | -                         | Key/value environment variables to export to the instance and set on exec
+limits.cpu                                  | string    | - (all)           | yes           | -                         | Number or range of CPUs to expose to the instance
+limits.cpu.allowance                        | string    | 100%              | yes           | container                 | How much of the CPU can be used. Can be a percentage (e.g. 50%) for a soft limit or a chunk of time (e.g. 25ms/100ms) for a hard limit
+limits.cpu.priority                         | integer   | 10 (maximum)      | yes           | container                 | CPU scheduling priority compared to other instances sharing the same CPUs (overcommit) (integer between 0 and 10)
+limits.disk.priority                        | integer   | 5 (medium)        | yes           | -                         | When under load, how much priority to give to the instance's I/O requests (integer between 0 and 10)
+limits.hugepages.64KB                       | string    | -                 | yes           | container                 | Fixed value in bytes (various suffixes supported, see below) to limit number of 64 KB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.1MB                        | string    | -                 | yes           | container                 | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 MB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.2MB                        | string    | -                 | yes           | container                 | Fixed value in bytes (various suffixes supported, see below) to limit number of 2 MB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.1GB                        | string    | -                 | yes           | container                 | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 GB hugepages (Available hugepage sizes are architecture dependent.)
+limits.kernel.\*                            | string    | -                 | no            | container                 | This limits kernel resources per instance (e.g. number of open files)
+limits.memory                               | string    | - (all)           | yes           | -                         | Percentage of the host's memory or fixed value in bytes (various suffixes supported, see below)
+limits.memory.enforce                       | string    | hard              | yes           | container                 | If hard, instance can't exceed its memory limit. If soft, the instance can exceed its memory limit when extra host memory is available
+limits.memory.hugepages                     | boolean   | false             | no            | virtual-machine           | Controls whether to back the instance using hugepages rather than regular system memory
+limits.memory.swap                          | boolean   | true              | yes           | container                 | Whether to allow some of the instance's memory to be swapped out to disk
+limits.memory.swap.priority                 | integer   | 10 (maximum)      | yes           | container                 | The higher this is set, the less likely the instance is to be swapped to disk (integer between 0 and 10)
+limits.network.priority                     | integer   | 0 (minimum)       | yes           | -                         | When under load, how much priority to give to the instance's network requests (integer between 0 and 10)
+limits.processes                            | integer   | - (max)           | yes           | container                 | Maximum number of processes that can run in the instance
+linux.kernel\_modules                       | string    | -                 | yes           | container                 | Comma separated list of kernel modules to load before starting the instance
+migration.incremental.memory                | boolean   | false             | yes           | container                 | Incremental memory transfer of the instance's memory to reduce downtime
+migration.incremental.memory.goal           | integer   | 70                | yes           | container                 | Percentage of memory to have in sync before stopping the instance
+migration.incremental.memory.iterations     | integer   | 10                | yes           | container                 | Maximum number of transfer operations to go through before stopping the instance
+nvidia.driver.capabilities                  | string    | compute,utility   | no            | container                 | What driver capabilities the instance needs (sets libnvidia-container NVIDIA\_DRIVER\_CAPABILITIES)
+nvidia.runtime                              | boolean   | false             | no            | container                 | Pass the host NVIDIA and CUDA runtime libraries into the instance
+nvidia.require.cuda                         | string    | -                 | no            | container                 | Version expression for the required CUDA version (sets libnvidia-container NVIDIA\_REQUIRE\_CUDA)
+nvidia.require.driver                       | string    | -                 | no            | container                 | Version expression for the required driver version (sets libnvidia-container NVIDIA\_REQUIRE\_DRIVER)
+raw.apparmor                                | blob      | -                 | yes           | container                 | AppArmor profile entries to be appended to the generated profile
+raw.idmap                                   | blob      | -                 | no            | unprivileged container    | Raw idmap configuration (e.g. "both 1000 1000")
+raw.lxc                                     | blob      | -                 | no            | container                 | Raw LXC configuration to be appended to the generated one
+raw.qemu                                    | blob      | -                 | no            | virtual-machine           | Raw QEMU configuration to be appended to the generated command line
+raw.seccomp                                 | blob      | -                 | no            | container                 | Raw Seccomp configuration
+security.devlxd                             | boolean   | true              | no            | container                 | Controls the presence of /dev/lxd in the instance
+security.devlxd.images                      | boolean   | false             | no            | container                 | Controls the availability of the /1.0/images API over devlxd
+security.idmap.base                         | integer   | -                 | no            | unprivileged container    | The base host ID to use for the allocation (overrides auto-detection)
+security.idmap.isolated                     | boolean   | false             | no            | unprivileged container    | Use an idmap for this instance that is unique among instances with isolated set
+security.idmap.size                         | integer   | -                 | no            | unprivileged container    | The size of the idmap to use
+security.nesting                            | boolean   | false             | yes           | container                 | Support running LXD (nested) inside the instance
+security.privileged                         | boolean   | false             | no            | container                 | Runs the instance in privileged mode
+security.protection.delete                  | boolean   | false             | yes           | -                         | Prevents the instance from being deleted
+security.protection.shift                   | boolean   | false             | yes           | container                 | Prevents the instance's filesystem from being uid/gid shifted on startup
+security.secureboot                         | boolean   | true              | no            | virtual-machine           | Controls whether UEFI secure boot is enabled with the default Microsoft keys
+security.syscalls.blacklist                 | string    | -                 | no            | container                 | A '\n' separated list of syscalls to blacklist
+security.syscalls.blacklist\_compat         | boolean   | false             | no            | container                 | On x86\_64 this enables blocking of compat\_\* syscalls; it is a no-op on other architectures
+security.syscalls.blacklist\_default        | boolean   | true              | no            | container                 | Enables the default syscall blacklist
+security.syscalls.intercept.mknod           | boolean   | false             | no            | container                 | Handles the `mknod` and `mknodat` system calls (allows creation of a limited subset of char/block devices)
+security.syscalls.intercept.mount           | boolean   | false             | no            | container                 | Handles the `mount` system call
+security.syscalls.intercept.mount.allowed   | string    | -                 | yes           | container                 | Specify a comma-separated list of filesystems that are safe to mount for processes inside the instance
+security.syscalls.intercept.mount.fuse      | string    | -                 | yes           | container                 | Whether to redirect mounts of a given filesystem to their FUSE implementation (e.g. ext4=fuse2fs)
+security.syscalls.intercept.mount.shift     | boolean   | false             | yes           | container                 | Whether to mount shiftfs on top of filesystems handled through mount syscall interception
+security.syscalls.intercept.setxattr        | boolean   | false             | no            | container                 | Handles the `setxattr` system call (allows setting a limited subset of restricted extended attributes)
+security.syscalls.whitelist                 | string    | -                 | no            | container                 | A '\n' separated list of syscalls to whitelist (mutually exclusive with security.syscalls.blacklist\*)
+snapshots.schedule                          | string    | -                 | no            | -                         | Cron expression (`<minute> <hour> <dom> <month> <dow>`)
+snapshots.schedule.stopped                  | boolean   | false             | no            | -                         | Controls whether or not stopped instances are to be snapshotted automatically
+snapshots.pattern                           | string    | snap%d            | no            | -                         | Pongo2 template string which represents the snapshot name (used for scheduled snapshots and unnamed snapshots)
+snapshots.expiry                            | string    | -                 | no            | -                         | Controls when snapshots are to be deleted (expects expression like `1M 2H 3d 4w 5m 6y`)
+user.\*                                     | string    | -                 | n/a           | -                         | Free form user key/value storage (can be used in search)
 
 The following volatile keys are currently internally used by LXD:
 

From 24255c43d17d9cd72b92d593b6233f7bb1771de9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?St=C3=A9phane=20Graber?= <stgraber at ubuntu.com>
Date: Tue, 14 Apr 2020 12:57:51 -0400
Subject: [PATCH 2/2] doc/index: Clarify bind-mount in FAQ
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes #7178

Signed-off-by: Stéphane Graber <stgraber at ubuntu.com>
---
 doc/index.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)
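
Reviewer note: a dry-run sketch of two of the options the new FAQ text lists for unprivileged containers. `home_mount_cmds` is a hypothetical helper that only prints the `lxc` commands rather than running them. The container name `c1`, the user `alice`, the uid/gid 1000, and the disk device option being named `shift` are all assumptions for illustration; the `raw.idmap` value follows the `both 1000 1000` example from doc/instances.md.

```bash
#!/bin/sh
# Dry-run sketch: prints the lxc commands for one option instead of running
# them. home_mount_cmds is a hypothetical helper, not an LXD command.
home_mount_cmds() {
    container="$1"   # instance name (placeholder)
    user="$2"        # host user whose home directory is shared
    option="$3"      # "shift" or "idmap"
    device="lxc config device add $container home disk source=/home/$user path=/home/ubuntu"
    case "$option" in
        shift)
            # Assumes the disk device option is named "shift" and that
            # shiftfs is supported on the host (check with `lxc info`).
            printf '%s shift=true\n' "$device"
            ;;
        idmap)
            # Maps host uid/gid 1000 onto uid/gid 1000 in the container
            # (1000 is an assumed value). raw.idmap is not live-updatable,
            # hence the restart.
            printf '%s\n' "$device"
            printf 'lxc config set %s raw.idmap "both 1000 1000"\n' "$container"
            printf 'lxc restart %s\n' "$container"
            ;;
        *)
            echo "unknown option: $option" >&2
            return 1
            ;;
    esac
}

home_mount_cmds c1 alice shift
```

Printing the commands keeps the sketch safe to run on a host without LXD; piping the output to `sh` would execute them for real.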

diff --git a/doc/index.md b/doc/index.md
index b820aa5634..8ca5273edb 100644
--- a/doc/index.md
+++ b/doc/index.md
@@ -277,25 +277,25 @@ experimental stages and may not work for all workloads. Please report bugs on
 lxc-devel, and we can escalate to CRIU lists as necessary.
 
 #### Can I bind mount my home directory in a container?
-Yes. The easiest way to do that is using a privileged container to avoid file ownership issues:
-
-1.a) create a container.
+Yes. This can be done using a disk device:
 
 ```bash
-lxc launch ubuntu privilegedContainerName -c security.privileged=true
+lxc config device add container-name home disk source=/home/$USER path=/home/ubuntu
 ```
 
-1.b) or, if your container already exists.
+For unprivileged containers, you will also need one of:
 
-```bash
-lxc config set privilegedContainerName security.privileged true
-```
+ - Pass `shift=true` to the `lxc config device add` call. This depends on `shiftfs` being supported (see `lxc info`)
+ - Set a `raw.idmap` entry (see [Idmaps for user namespace](userns-idmap.md))
+ - Recursive POSIX ACLs placed on your home directory
 
-2) then.
+Any of those can be used to give the user in the container working read/write permissions.
+When none of them is set, everything will show up as the overflow uid/gid (65534:65534)
+and access to anything that's not world-readable will fail.
 
-```bash
-lxc config device add privilegedContainerName shareName disk source=/home/$USER path=/home/ubuntu
-```
+
+Privileged containers do not have this issue as all uids/gids in the container are the same as on the host.
+But that's also the cause of most of the security issues with such privileged containers.
 
 #### How can I run docker inside a LXD container?
 In order to run Docker inside a LXD container the `security.nesting` property of the container should be set to `true`. 

