[Lxc-users] Copy-on-write hard-link / hashify feature

Gordan Bobic gordan at bobich.net
Thu Jun 10 09:25:54 UTC 2010


On 06/09/2010 11:47 PM, Daniel Lezcano wrote:
> On 06/09/2010 10:46 PM, Gordan Bobic wrote:
>> On 06/09/2010 09:08 PM, Daniel Lezcano wrote:
>>> On 06/09/2010 08:45 PM, Gordan Bobic wrote:
>>>> Is there a feature that allows unifying identical files between guests
>>>> via hard-links to save both space and memory (on shared libraries)?
>>>> VServers has a feature for this called hashify, but I haven't been able
>>>> to find such a thing in LXC documentation. Is there such a thing?
>>>>
>>>> Obviously, I could manually do the searching and hard-linking, but
>>>> without the copy-on-write feature that VServers provides for such
>>>> hard-linked files, this would be dangerous, as any guest could change
>>>> a file on all guests.
>>>>
>>>> Is there a way to do this safely with LXC?
>>> No, because this is already supported by the system through the btrfs
>>> COW / snapshot file system.
>>>
>>> https://btrfs.wiki.kernel.org
>>>
>>> You can create your btrfs filesystem, mount it somewhere in your fs,
>>> install a distro and then make a snapshot, that will result in a
>>> directory. Assign this directory as the rootfs of your container. For
>>> each container you want to install, create a snapshot of the initial
>>> installation and assign each resulting directory for a container.
>> OK, this obviously saves the disk space. What about shared libraries
>> memory conservation? Do the shared files in different snapshots have the
>> same inodes?
>
> Yes.

So this implicitly implements COW hard-linking?

>> What about re-merging them after they get out of sync? For example, if I
>> yum update, and a new glibc gets onto each of the virtual hosts, they
>> will become unshared and each get different inode numbers which will
>> cause them to no longer be mmap()-ed as one, thus rapidly increasing the
>> memory requirements. Is there a way to merge them back together with the
>> approach you are suggesting? I ask because VServer tools handle this
>> relatively gracefully, and I see it as a frequently occurring usage
>> pattern.
>
> The use case you are describing supposes the guests do not upgrade their
> OS, so there is no need for a COW fs for some private modifications, no?

No, the use case I'm describing treats guests pretty independently. I am
saying that I can see a lot of cases where I might update a package in
the guest, which will cause those files to be COW-ed and unshared. I
might then update another guest with the same package. Its files will
now be COW-ed and unshared, too. Proceed until all guests are updated.
Now all instances of files in this package are COW-ed and unshared, but
they are again identical files. I want to merge them back into COW
hard-links in order to save disk space and memory.
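
To make the remerge step concrete, here is a rough sketch of the detection
half only: it walks a set of guest root directories and reports files whose
contents are identical but which sit on different inodes. This is not how
VServer's hashify is implemented; the function names and the choice of
SHA-256 are assumptions made for illustration. It deliberately does not
create any links, since plain hard-links without copy-on-write are unsafe
for the reason given earlier, and since hard-linked files also share
ownership and permissions, which a real tool would have to compare first.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_remerge_candidates(guest_roots):
    """Group regular files that have identical contents but different inodes.

    Returns a sorted list of sorted path lists. Each list names one path per
    distinct inode; paths that already share an inode are reported once.
    """
    by_digest = defaultdict(dict)  # digest -> {(st_dev, st_ino): first path seen}
    for root in guest_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                st = os.lstat(path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip symlinks, devices, sockets, etc.
                key = (st.st_dev, st.st_ino)
                by_digest[file_digest(path)].setdefault(key, path)
    return sorted(
        sorted(inodes.values())
        for inodes in by_digest.values()
        if len(inodes) > 1
    )
```

Calling find_remerge_candidates() with the rootfs directories of all guests
after a round of updates would list the files that have become unshared but
are identical again.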

I know that BTRFS has a block-level deduplication feature (or will have
such a feature soon), but that doesn't address the memory saving, does
it? My understanding (potentially erroneous?) is that DLLs get mapped
into the same shared memory iff their inodes are the same (i.e. if the two
DLLs are hard-linked).
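
Whether two library paths actually refer to the same inode can at least be
checked from userspace. The sketch below compares the device and inode
numbers reported by stat(); the helper name same_inode is made up for
illustration. It only tests inode identity. It does not verify that the
kernel shares the mapped pages in that case, which remains the assumption
stated above.

```python
import os


def same_inode(path_a, path_b):
    """Return True if both paths resolve to the same inode on the same device."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    # The inode number alone is not enough: it is only unique per device.
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

A hard-linked pair returns True, while two separate copies with identical
contents return False, which is the case that arises after each guest has
been updated independently.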

VServer's hashify feature handles this unmerge-remerge scenario
gracefully so as to preserve both the disk and memory savings. I can
understand that BTRFS will preserve (some of) the disk savings with its
features, but it is not at all clear to me that it will preserve the
memory savings.

> In this case, you can use an empty file hierarchy as the rootfs and
> bind-mount the host's system libraries and tools directories read-only
> into this rootfs, with a private /etc and /home.

That is an interesting idea, and might work to some extent, but it is 
rather inflexible compared to the VServer alternative that is 
effectively fully dynamic.

Gordan



