[lxc-devel] [lxd/master] Extend nvidia runtime options
stgraber on Github
lxc-bot at linuxcontainers.org
Wed Sep 12 23:02:55 UTC 2018
From 2325ba266da4ffa95084f4e38d1765047ce9b58c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?St=C3=A9phane=20Graber?= <stgraber at ubuntu.com>
Date: Wed, 12 Sep 2018 19:01:16 -0400
Subject: [PATCH] Extend nvidia runtime options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This introduces three additional configuration keys to control the
libnvidia-container integration:
- nvidia.driver.capabilities (maps to NVIDIA_DRIVER_CAPABILITIES)
- nvidia.require.cuda (maps to NVIDIA_REQUIRE_CUDA)
- nvidia.require.driver (maps to NVIDIA_REQUIRE_DRIVER)
Details on the valid values for those options can be found in the NVIDIA
documentation here:
https://github.com/NVIDIA/nvidia-container-runtime
Signed-off-by: Stéphane Graber <stgraber at ubuntu.com>
---
doc/api-extensions.md | 8 ++++++++
doc/containers.md | 3 +++
lxd/container_lxc.go | 30 +++++++++++++++++++++++++++---
scripts/bash/lxd-client | 1 +
shared/container.go | 5 ++++-
shared/version/api.go | 1 +
6 files changed, 44 insertions(+), 4 deletions(-)
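
The mapping described in the commit message can be sketched as a small standalone Go program. This is only an illustration, not the code in the patch: the nvidiaEnvironment helper and its plain map argument are hypothetical (the patch reads c.expandedConfig inline and calls lxcSetConfigItem), and it assumes the two NVIDIA_REQUIRE_* variables are meant to be exported only when their key is set. The "cuda>=9.0" value is a placeholder; valid expressions are described in the NVIDIA documentation linked above.

```go
package main

import (
	"fmt"
)

// nvidiaEnvironment returns the lxc.environment entries implied by the three
// nvidia.* keys. The helper name and the map argument are illustrative
// assumptions; they do not exist in the LXD source.
func nvidiaEnvironment(config map[string]string) []string {
	env := []string{}

	// Driver capabilities fall back to "all" when the key is unset.
	capabilities := config["nvidia.driver.capabilities"]
	if capabilities == "" {
		capabilities = "all"
	}
	env = append(env, fmt.Sprintf("NVIDIA_DRIVER_CAPABILITIES=%s", capabilities))

	// The requirement keys are only exported when they carry a value
	// (assumed intent, see the note above).
	if v := config["nvidia.require.cuda"]; v != "" {
		env = append(env, fmt.Sprintf("NVIDIA_REQUIRE_CUDA=%s", v))
	}
	if v := config["nvidia.require.driver"]; v != "" {
		env = append(env, fmt.Sprintf("NVIDIA_REQUIRE_DRIVER=%s", v))
	}

	return env
}

func main() {
	// Placeholder values for illustration only.
	config := map[string]string{
		"nvidia.driver.capabilities": "compute,utility",
		"nvidia.require.cuda":        "cuda>=9.0",
	}

	for _, entry := range nvidiaEnvironment(config) {
		fmt.Println(entry)
	}
}
```

Run as-is, the program prints the NVIDIA_DRIVER_CAPABILITIES and NVIDIA_REQUIRE_CUDA entries and omits NVIDIA_REQUIRE_DRIVER, because that key is unset in the sample map.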
diff --git a/doc/api-extensions.md b/doc/api-extensions.md
index 393085f1f2..4bdd5bde20 100644
--- a/doc/api-extensions.md
+++ b/doc/api-extensions.md
@@ -585,3 +585,11 @@ This introduces the config keys `candid.domains` and `candid.expiry`. The
former allows specifying allowed/valid Candid domains, the latter makes the
macaroon's expiry configurable. The `lxc remote add` command now has a
`--domain` flag which allows specifying a Candid domain.
+
+## nvidia\_runtime\_config
+This introduces extra config keys that apply when using nvidia.runtime with the libnvidia-container library.
+These keys map directly to the matching nvidia-container environment variables:
+
+ - nvidia.driver.capabilities => NVIDIA\_DRIVER\_CAPABILITIES
+ - nvidia.require.cuda => NVIDIA\_REQUIRE\_CUDA
+ - nvidia.require.driver => NVIDIA\_REQUIRE\_DRIVER
diff --git a/doc/containers.md b/doc/containers.md
index 24842ba6a4..e9038a93d6 100644
--- a/doc/containers.md
+++ b/doc/containers.md
@@ -57,7 +57,10 @@ linux.kernel\_modules | string | - | yes
migration.incremental.memory | boolean | false | yes | migration\_pre\_copy | Incremental memory transfer of the container's memory to reduce downtime.
migration.incremental.memory.goal | integer | 70 | yes | migration\_pre\_copy | Percentage of memory to have in sync before stopping the container.
migration.incremental.memory.iterations | integer | 10 | yes | migration\_pre\_copy | Maximum number of transfer operations to go through before stopping the container.
+nvidia.driver.capabilities | string | all | no | nvidia\_runtime\_config | What driver capabilities the container needs (sets libnvidia-container NVIDIA\_DRIVER\_CAPABILITIES)
nvidia.runtime | boolean | false | no | nvidia\_runtime | Pass the host NVIDIA and CUDA runtime libraries into the container
+nvidia.require.cuda | string | - | no | nvidia\_runtime\_config | Version expression for the required CUDA version (sets libnvidia-container NVIDIA\_REQUIRE\_CUDA)
+nvidia.require.driver | string | - | no | nvidia\_runtime\_config | Version expression for the required driver version (sets libnvidia-container NVIDIA\_REQUIRE\_DRIVER)
raw.apparmor | blob | - | yes | - | Apparmor profile entries to be appended to the generated profile
raw.idmap | blob | - | no | id\_map | Raw idmap configuration (e.g. "both 1000 1000")
raw.lxc | blob | - | no | - | Raw LXC configuration to be appended to the generated one
diff --git a/lxd/container_lxc.go b/lxd/container_lxc.go
index de14f2a814..1fc9203d98 100644
--- a/lxd/container_lxc.go
+++ b/lxd/container_lxc.go
@@ -1229,9 +1229,33 @@ func (c *containerLXC) initLXC(config bool) error {
return err
}
- err = lxcSetConfigItem(cc, "lxc.environment", "NVIDIA_DRIVER_CAPABILITIES=compute,utility")
- if err != nil {
- return err
+ nvidiaDriver := c.expandedConfig["nvidia.driver.capabilities"]
+ if nvidiaDriver == "" {
+ err = lxcSetConfigItem(cc, "lxc.environment", "NVIDIA_DRIVER_CAPABILITIES=all")
+ if err != nil {
+ return err
+ }
+ } else {
+ err = lxcSetConfigItem(cc, "lxc.environment", fmt.Sprintf("NVIDIA_DRIVER_CAPABILITIES=%s", nvidiaDriver))
+ if err != nil {
+ return err
+ }
+ }
+
+ nvidiaRequireCuda := c.expandedConfig["nvidia.require.cuda"]
+ if nvidiaRequireCuda != "" {
+ err = lxcSetConfigItem(cc, "lxc.environment", fmt.Sprintf("NVIDIA_REQUIRE_CUDA=%s", nvidiaRequireCuda))
+ if err != nil {
+ return err
+ }
+ }
+
+ nvidiaRequireDriver := c.expandedConfig["nvidia.require.driver"]
+ if nvidiaRequireDriver != "" {
+ err = lxcSetConfigItem(cc, "lxc.environment", fmt.Sprintf("NVIDIA_REQUIRE_DRIVER=%s", nvidiaRequireDriver))
+ if err != nil {
+ return err
+ }
}
err = lxcSetConfigItem(cc, "lxc.hook.mount", hookPath)
diff --git a/scripts/bash/lxd-client b/scripts/bash/lxd-client
index bb12d7d5ea..95caea3a2c 100644
--- a/scripts/bash/lxd-client
+++ b/scripts/bash/lxd-client
@@ -82,6 +82,7 @@ _have lxc && {
limits.memory.swap limits.memory.swap.priority limits.network.priority \
limits.processes linux.kernel_modules migration.incremental.memory \
migration.incremental.memory.goal nvidia.runtime \
+ nvidia.driver.capabilities nvidia.require.cuda nvidia.require.driver \
migration.incremental.memory.iterations raw.apparmor raw.idmap raw.lxc \
raw.seccomp security.idmap.base security.idmap.isolated \
security.idmap.size security.devlxd security.devlxd.images \
diff --git a/shared/container.go b/shared/container.go
index 5fb1d1ab9b..e7cb82dad1 100644
--- a/shared/container.go
+++ b/shared/container.go
@@ -206,7 +206,10 @@ var KnownContainerConfigKeys = map[string]func(value string) error{
"migration.incremental.memory.iterations": IsUint32,
"migration.incremental.memory.goal": IsUint32,
- "nvidia.runtime": IsBool,
+ "nvidia.runtime": IsBool,
+ "nvidia.driver.capabilities": IsAny,
+ "nvidia.require.cuda": IsAny,
+ "nvidia.require.driver": IsAny,
"security.nesting": IsBool,
"security.privileged": IsBool,
diff --git a/shared/version/api.go b/shared/version/api.go
index 5e5f380823..e15f3f04c3 100644
--- a/shared/version/api.go
+++ b/shared/version/api.go
@@ -123,6 +123,7 @@ var APIExtensions = []string{
"candid_authentication",
"backup_compression",
"candid_config",
+ "nvidia_runtime_config",
}
// APIExtensionsCount returns the number of available API extensions.