[lxc-devel] LXC snapshot using overlayfs fsfreeze

Tycho Andersen tycho at docker.com
Tue Apr 11 14:59:34 UTC 2017


Hi Amir,

On Tue, Apr 11, 2017 at 01:37:53PM +0300, Amir Goldstein wrote:
> On Mon, Apr 10, 2017 at 5:20 PM, Tycho Andersen <tycho at docker.com> wrote:
> > Hi Amir,
> >
> > On Sat, Apr 08, 2017 at 09:35:01PM +0200, Amir Goldstein wrote:
> >> [moving this discussion over from fsdevel to containers list and
> >> changing the title]
> >>
> >> On Tue, Apr 4, 2017 at 9:07 PM, Tycho Andersen <tycho at docker.com> wrote:
> >> > On Tue, Apr 04, 2017 at 09:59:16PM +0300, Amir Goldstein wrote:
> >> >> On Tue, Apr 4, 2017 at 9:01 PM, Tycho Andersen <tycho at docker.com> wrote:
> >> >> > On Tue, Apr 04, 2017 at 12:47:52PM -0500, Serge E. Hallyn wrote:
> >> >> >> > Would lxc-snapshot gain anything from the ability to fsfreeze an overlay
> >> >> >> > mount?
> >> >> >>
> >> >> >> lxc-snapshot only works on stopped containers.  'lxc snapshot' can do live
> >> >> >> snapshots using criu.  Tycho, does that do anything right now to freeze the
> >> >> >> fs?
> >> >> >
> >> >> > Not that I'm aware of (CRIU might, but we don't in liblxc).
> >> >> >
> >> >> >> I'm not sure that freezing all the tasks is necessarily enough to settle
> >> >> >> the fs, but I assume you're doing something about that already?
> >> >> >
> >> >> > I suspect it's not, but we're not doing anything besides freezing the
> >> >> > tasks. In fact, we freeze the tasks by using the freezer cgroup,
> >> >> > which itself is buggy, since the freezer cgroup can race with various
> >> >> > filesystems. So, freezing tasks is hard, and I haven't even thought
> >> >> > about how to freeze the fs for real :)
> >> >> >
> >> >> > But in any case, an fs freezing primitive does sound useful for
> >> >> > checkpoint restore, assuming that we're right and freezing the tasks
> >> >> > is simply not enough.
> >> >> >
> >> >>
> >> >> So I already asked Pavel that question and he said that freezing
> >> >> the tasks is enough. I am not convinced it is really enough to bring
> >> >> a file system image (i.e. underlying blockdev) to a quiescent state,
> >> >> but I think it may be enough for getting a stable view of the mounted
> >> >> file system, so the files could be dumped somewhere.
> >> >> I am guessing that is what lxc snapshot does?
> >> >
> >> > Yes, lxc snapshot is basically just a frontend for CRIU.
> >> >
> >> >> I still didn't understand wrt lxc snapshot, is there a use case for
> >> >> taking live snapshots without using CRIU? (because of the freezer
> >> >> cgroup races you mentioned, or whatnot?).
> >> >
> >> > No, I think CRIU is the only project that will ever attempt to do
> >> > checkpoint restore this way ;-).
> >>
> >> I don't doubt that.
> >>
> >> My question is whether it is interesting to snapshot a live container fs
> >> without having to checkpoint or restore at all.
> >>
> >> >  CRIU supports two different ways of
> >> > freezing tasks: one using the freezer cgroup and one without. The one
> >> > without doesn't work against fork bombs very well, and the one with
> >> > doesn't work because of some filesystems. So it's mostly a container
> >> > engine implementation choice which to use.
> >> >
> >> >> It's definitely possible with btrfs and if my overlayfs freeze patches
> >> >> are not terribly wrong, then it should be easy with overlayfs as well.
> >> >> Does lxc snapshot already support live snapshot of btrfs container?
> >> >
> >> > Yes, it does. It freezes the tasks via the cgroup freezer and then
> >> > does a btrfs snapshot of the filesystem once the tasks are frozen.
> >> >
> >>
> >> So what I am not sure about is whether there are use cases where criu
> >> cannot be used, or reasons not to use it, and whether for these cases
> >> it may be interesting to support snapshot of the storage by:
> >> - fsfreeze -f
> >> - copy upper dir
> >> - fsfreeze -u
> >
> > I don't see a reason for it, but perhaps I'm not being very
> > imaginative. Without the memory state, the potentially inconsistent fs
> > state doesn't seem very helpful.
> >
> 
> Hi Tycho,
> 
> The use case is quite simple really.
> Same use case as any LVM snapshot and btrfs snapshot on a
> non-containerized system:
> Before installing some stuff, sync, take a snapshot of the root fs and
> you can always restart your system from that snapshot of root fs if
> something went wrong.
> 
> You don't need to save any memory state for that and you don't need to
> dump any process info for that.
> It's simply a snapshot that you can *start* from and not *resume* from.
> 
> I am quite surprised to learn that containers don't have that
> functionality (they don't?).
> I guess it may be because containers CAN freeze processes, so they do it,
> but it's really not a prerequisite for live *image* snapshot -
> fsfreeze is enough.

Well, the problem is when some container has some state in memory that
it hasn't tried to commit to disk yet. Doing an fsfreeze on a running
container doesn't seem safe in the general case. Of course offline
(i.e. the container is not currently running) freezes are safe and in
wide use today; I was speaking only of online freezes.
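
For concreteness, the sequence quoted above (fsfreeze -f, copy upper
dir, fsfreeze -u) could be sketched roughly as below. This is only an
illustration, not what liblxc or CRIU do: the function name, its
parameters and the use of `cp -a` for the copy are my assumptions, it
presumes the overlay mount can actually be frozen (which is what the
patches under discussion are for), and `dest` has to live on a
filesystem that is not frozen or the copy will block. It also does
nothing about state an application still holds in memory.

```python
import subprocess


def snapshot_upper(mountpoint, upper_dir, dest, run=subprocess.run):
    """Freeze `mountpoint`, copy `upper_dir` to `dest`, then thaw.

    `run` is injectable so the command sequence can be inspected
    without root; by default the commands are really executed.
    """
    # Block new writes and flush dirty data for the mounted filesystem.
    run(["fsfreeze", "-f", str(mountpoint)], check=True)
    try:
        # Assumption: cp -a (as root) is enough to carry over the
        # special files and xattrs found in an overlayfs upper dir.
        run(["cp", "-a", str(upper_dir), str(dest)], check=True)
    finally:
        # Always thaw, even if the copy failed, so the container
        # is not left hanging on a frozen filesystem.
        run(["fsfreeze", "-u", str(mountpoint)], check=True)
```

The try/finally is the only design point worth noting: a failed copy
must not leave the filesystem frozen.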

> The thing is it is easy to snapshot container image based on LVM and btrfs today
> (lvm snapshot command does fsfreeze on the file system on top of lvm volume),
> but it is not possible to snapshot container image based on overlayfs
> the same way.
> 
> My patches implement fsfreeze for overlayfs, and quite frankly, I am
> taken by surprise that container users don't find this useful. I may
> be missing something.

I don't think you are. Container engines today use the snapshotting
features of LVM, btrfs (and zfs) for offline freezes (and indeed,
features like `btrfs send` and online snapshots to speed up live
migration).

Cheers,

Tycho

