[lxc-users] Strange freezes with btrfs backend

Fajar A. Nugraha list at fajar.net
Sun Dec 4 10:05:11 UTC 2016

On Sat, Dec 3, 2016 at 7:56 PM, Ron Kelley <rkelleyrtp at gmail.com> wrote:

> My 0.02
> We have been using btrfs in production for more than a year on other
> projects and about 6mos with LXD.  It has been rock solid.  I have multiple
> LXD servers each with >20 containers. We have a separate btrfs filesystem
> (with compression enabled) to store the LXD containers. I take nightly
> snapshots for all containers, and each server probably has 2000 snapshots.
> The only issue thus far is the IO hit when deleting lots of snapshots at
> one time.  You need to delete a few (10 at a time), pause for 60secs, then
> delete the next 10.
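
(A minimal sketch of the batching Ron describes, assuming the snapshots are plain btrfs subvolumes removed with "btrfs subvolume delete". The function name, the injectable run/sleep parameters, and the example paths are hypothetical, added only for illustration; the batch size and pause are the values from his mail.)

```python
import subprocess
import time


def delete_in_batches(snapshots, batch_size=10, pause_secs=60,
                      run=subprocess.check_call, sleep=time.sleep):
    """Delete btrfs snapshots a few at a time, pausing between batches.

    `run` and `sleep` are injectable so the loop can be dry-run or tested
    without touching a real filesystem (an assumption of this sketch).
    """
    deleted = 0
    for start in range(0, len(snapshots), batch_size):
        batch = snapshots[start:start + batch_size]
        for path in batch:
            # Real btrfs-progs subcommand; needs root on a live system.
            run(["btrfs", "subvolume", "delete", path])
            deleted += 1
        # Pause only when another batch is still waiting.
        if start + batch_size < len(snapshots):
            sleep(pause_secs)
    return deleted
```

Passing run=print gives a dry run that only lists the commands it would execute.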

Ultimately, IMHO it comes down to what you're most comfortable with.

I like the fact that btrfs can be used in nested lxd, but I didn't like the
fact that you can't get "disk usage of one container" in btrfs. My
compromise so far has been to always use zfs, but assign a btrfs-formatted
zvol when I need nested lxd.

> I have used ZFS in Linux in the past and could never get adequate
> performance - regardless of tuning or amount of RAM given to ZFS.  In fact,
> I started using ZFS for our backup server (64TB raw storage with 32GB RAM)
> but had to move back to XFS due to severe performance issues.  Nothing
> fancy; I did a by-the-book install and enabled compression and snapshots. I
> tried every tuning option available (including SSD for L2-ARC). Nothing
> would improve the performance.

AFAIK the recommendation is 1GB RAM (for zfs use) for every 1TB of raw
disk. That is on top of whatever RAM the OS and applications require.
Depending on your load, SLOG might be more useful than L2ARC (in fact, when
configured incorrectly, L2ARC can do more harm than good). Testing this is
easy enough though: if you experience much better performance with
"sync=disabled", then you need SLOG.
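
(To put numbers on the rule of thumb above, here is a rough sketch applied to the backup server from the quoted mail: 64TB raw, 32GB RAM. The function names and the dataset name "tank/backup" are hypothetical; the 1GB-per-1TB ratio is only the recommendation as I remember it, not a hard limit. The "zfs set sync=..." commands are the standard way to toggle the property for the test.)

```python
def zfs_ram_rule_of_thumb_gb(raw_tb, gb_per_tb=1.0):
    """RAM suggested for ZFS alone, on top of what the OS/apps need."""
    return raw_tb * gb_per_tb


def sync_test_commands(dataset):
    """Commands to toggle sync for an A/B benchmark (dataset is hypothetical).

    sync=disabled is for testing only: run the workload, compare, then
    restore sync=standard.
    """
    return [
        ["zfs", "set", "sync=disabled", dataset],
        ["zfs", "set", "sync=standard", dataset],
    ]


raw_tb, installed_gb = 64, 32
suggested_gb = zfs_ram_rule_of_thumb_gb(raw_tb)
print("suggested for ZFS: %.0f GB, installed: %d GB" % (suggested_gb, installed_gb))
for cmd in sync_test_commands("tank/backup"):
    print(" ".join(cmd))
```

By this rule the 32GB box is at half the suggested amount for ZFS even before counting the OS and applications.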

> To the OP: are you sure btrfs is causing your issues?  Have you traced the
> OP activity during the hiccup moments?
> ... hence my earlier recommendation: htop, check syslog for OOM messages.

@Pierce: Add "iostat -mx 3" to that (especially to monitor IOPS usage), and
also follow Tomasz's advice: don't use a disk image file.
If your provider doesn't allow additional disk images (or makes it REALLY
hard to do so, like many cheap KVM-SSD VPS providers), then I highly
recommend you check out EC2: their free tier includes a VPS with 1GB RAM,
and you can easily add additional block devices.

