I'm using lxd under ubuntu 16.04 with zfs.

I want to use an existing container snapshot as a cloning base to
create other containers. As far as I can see this is done via "lxc
publish", although I can't find much documentation on this apart
from this blog post:

https://insights.ubuntu.com/2015/06/30/publishing-lxd-images/

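For concreteness, the operation I have in mind is something like
this (a sketch - "snap0" is just a hypothetical snapshot name):

  lxc snapshot base1 snap0
  lxc publish base1/snap0 --alias clonemaster
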
My question is around zfs disk space usage. I was hoping that the
publish operation would simply take a snapshot of the existing
container and therefore use no more local disk space, but in fact
it seems to use the whole amount of disk space again. Let me
demonstrate. First, the clean system:

root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  95.5K  77.0G         -     0%     0%  1.00x  ONLINE  -

Now I create a container, then a couple more, from the same image:

root@vtp:~# lxc launch ubuntu:16.04 base1
Creating base1
Retrieving image: 100%
Starting base1
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G   644M  76.4G         -     0%     0%  1.00x  ONLINE  -
root@vtp:~# lxc launch ubuntu:16.04 base2
Creating base2
Starting base2
root@vtp:~# lxc launch ubuntu:16.04 base3
Creating base3
Starting base3
root@vtp:~# lxc exec base1 /bin/sh -- -c 'echo hello >/usr/test.txt'
root@vtp:~# lxc stop base1
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G   655M  76.4G         -     0%     0%  1.00x  ONLINE  -

So disk space usage is about 645MB for the image, and small change
for the instances launched from it. Now I want to clone further
containers from base1, so I publish it:

root@vtp:~# time lxc publish base1 --alias clonemaster
Container published with fingerprint: 80ec0105da9d1f8f173e45233921bc772319e39364c322786a5b4cfec895cb68

real    0m45.155s
user    0m0.000s
sys     0m0.012s
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.27G  75.7G         -     0%     1%  1.00x  ONLINE  -
root@vtp:~# zfs list -t snapshot
NAME                                                                                   USED  AVAIL  REFER  MOUNTPOINT
lxd/images/80ec0105da9d1f8f173e45233921bc772319e39364c322786a5b4cfec895cb68@readonly      0      -   638M  -
lxd/images/f4c4c60a6b752a381288ae72a1689a9da00f8e03b732c8d1b8a8fcd1a8890800@readonly      0      -   638M  -

I notice that (a) publish is a slow process, and (b) disk usage has
doubled. Finally, launch a container:

root@vtp:~# lxc launch clonemaster myclone
Creating myclone
Starting myclone
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.27G  75.7G         -     0%     1%  1.00x  ONLINE  -

That's fine - it's sharing with the image as expected.

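One way to confirm that sharing, assuming LXD's usual
lxd/containers/<name> dataset layout (a sketch):

  # "origin" names the snapshot this dataset was cloned from; for myclone
  # I'd expect the clonemaster image's @readonly snapshot
  zfs get origin lxd/containers/myclone
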
Now, what I was hoping for was that the named image (clonemaster)
would be a snapshot derived directly from the parent, so that it
would also share disk space. What I'm actually trying to achieve is
a workflow like this (sketched below):

- launch (say) 10 initial master containers
- customise those 10 containers in different ways (e.g. install
  different software packages in each one)
- launch multiple instances from each of those master containers

This is for a training lab. The whole lot will then be packaged up
and distributed as a single VM. It would be hugely helpful if the
initial zfs usage came to around 650MB, not 6.5GB.

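Concretely (a sketch - the container names and the apache2 package
are just placeholders):

  # 1. launch and customise one master container
  lxc launch ubuntu:16.04 master1
  lxc exec master1 -- apt-get install -y apache2
  lxc stop master1

  # 2. turn it into an image
  lxc publish master1 --alias master1-image

  # 3. launch multiple instances from that image
  lxc launch master1-image student1a
  lxc launch master1-image student1b

...and so on for each of the ~10 masters. Step 2 is the one I'd
like to be cheap in disk space.
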
The only documentation I can find about images is here:

https://github.com/lxc/lxd/blob/master/doc/image-handling.md

It talks about the tarball image format: is it perhaps the case that
"lxc publish" is creating a tarball, and then untarring it into a
fresh snapshot? Is that tarball actually stored anywhere? If so, I
can't find it. Or is the tarball created dynamically when you do
"lxc image copy" to a remote? If so, why not just use a zfs snapshot
for "lxc publish"?

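By "just use a zfs snapshot" I mean something like this (a sketch -
dataset names as above, and "myclone2" is hypothetical):

  # snapshot the container being published...
  zfs snapshot lxd/containers/base1@clonemaster
  # ...then clone each new container from that snapshot: near-instant,
  # and consuming no extra pool space until the clone diverges
  zfs clone lxd/containers/base1@clonemaster lxd/containers/myclone2
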
<<Digs around>> Maybe it's done this way because "A dataset cannot
be destroyed if snapshots of the dataset exist"
(http://docs.oracle.com/cd/E19253-01/819-5461/6n7ht6r4f/): i.e.
using a snapshot for publish would prevent the original container
being deleted. That makes sense - although I suppose it could have
its contents rm -rf'd and then renamed
(http://docs.oracle.com/cd/E19253-01/819-5461/gamnn/index.html) to
a graveyard name.

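i.e. something like (a sketch - "lxd/deleted" is a hypothetical
graveyard dataset):

  # destroying the container's dataset fails while snapshots of it exist...
  zfs destroy lxd/containers/base1
  # ...but it could be emptied and renamed out of the way instead
  zfs create lxd/deleted
  zfs rename lxd/containers/base1 lxd/deleted/base1
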
The other option I can think of is zfs dedupe. The finished target
system won't have the resources to run dedupe continuously. However,
I could turn on dedupe, do the cloning, and then turn it back off
again (*).

Have I understood this correctly? Any additional clues gratefully
received.

Thanks,

Brian Candler.

(*) P.S. I did a quick test of this. It looks like this doesn't work
to deduplicate against any pre-existing files:

root@vtp:~# zfs set dedup=on lxd
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.27G  75.7G         -     0%     1%  1.00x  ONLINE  -
root@vtp:~# lxc exec base2 /bin/sh -- -c 'echo world >/usr/test.txt'
root@vtp:~# lxc stop base2
root@vtp:~# lxc publish base2 --alias clonemaster2
Container published with fingerprint: 8a288bd1364d82d4d8afb23aee67fa13586699c539fad94e7946f60372767150
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.88G  75.1G         -     1%     2%  1.05x  ONLINE  -

But then I rebooted, and published another image:

root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.87G  75.1G         -     1%     2%  1.05x  ONLINE  -
root@vtp:~# lxc exec base3 /bin/sh -- -c 'echo world2 >/usr/test.txt'
root@vtp:~# lxc stop base3
root@vtp:~# time lxc publish base3 --alias clonemaster3
Container published with fingerprint: 6abbeb5df75989944a533fdbb1d8ab94be4d18cccf20b320c009dd8aef4fb65b

real    0m55.338s
user    0m0.008s
sys     0m0.008s
root@vtp:~# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
lxd     77G  1.88G  75.1G         -     1%     2%  2.11x  ONLINE  -

So I suspect it would all have worked if I'd turned on dedupe before
the very first image was fetched.

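If so, the ordering for the real build would presumably have to be
(a sketch):

  zfs set dedup=on lxd              # before anything is written to the pool
  lxc launch ubuntu:16.04 base1     # first image fetch now writes dedup-able blocks
  # ... customise and publish the masters ...
  zpool get dedupratio lxd          # check the achieved ratio
  zfs set dedup=off lxd             # once all the cloning is done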