[lxc-users] LXD - Small Production Deployment - Storage

Stéphane Graber stgraber at ubuntu.com
Wed Mar 29 22:21:58 UTC 2017


On Thu, Mar 30, 2017 at 12:01:14AM +0200, Gabriel Marais wrote:
> On Wed, Mar 29, 2017 at 6:01 PM, Stéphane Graber <stgraber at ubuntu.com>
> wrote:
> 
> > On Wed, Mar 29, 2017 at 03:13:36PM +0200, Gabriel Marais wrote:
> > > Hi Guys
> > >
> > > If this is the incorrect platform for this post, please point me in the
> > > right direction.
> > >
> > > We are in the process of deploying a small production environment with
> > > the following equipment:-
> > >
> > > 2 x Dell R430 servers each with 128GB Ram and 3 x 600GB SAS 10k drives
> > > 1 x Dell PowerVault MD3400 with
> > >       3 x 600GB 15k SAS Drives
> > >       3 x 6TB 7.2k Nearline SAS drives
> > >
> > > The PowerVault is cabled directly to the Host Servers via Direct Attached
> > > Storage, redundantly.
> > >
> > >
> > > We would like to run a mixture of KVM and LXD containers on both Host
> > > Servers.
> > >
> > > The big question is, how do we implement the PowerVault (and to a certain
> > > extent the storage on the Host Servers themselves) to be most beneficial
> > > in this mixed environment.
> > >
> > > I have a few ideas on what I could do, but since I don't have much
> > > experience with shared storage, I am probably just picking straws and
> > > would like to hear from others who probably have more experience than me.
> >
> > Hi,
> >
> > I'm not particularly familiar with the DELL PowerVault series, but it
> > looks like the other answers you've received so far have entirely missed
> > the "Direct Attached Storage" part of your description :)
> >
> > For others reading this thread, this setup will effectively show up on
> > both servers as directly attached disks through /dev/mapper (because of
> > multipath), there is no need to use any kind of networked storage on top
> > of this.
> >
> >
> > The answer to your question I suspect will depend greatly on whether
> > you're dealing with a fixed number of VMs and containers, or if you
> > intend to spawn and delete them frequently. And also on whether you need
> > fast (no copy) migration of individual VMs and containers between the
> > two hosts.
> >
> 
> We are not planning on having a fixed number of VMs and containers. It will
> always grow as the requirements grow, up to as many as CPU and RAM will allow.
> Migration of containers seems easy enough using the lxc migrate command,
> although we have only done migrations with containers running on a normal
> file system (e.g. ext4) on a host.
> 
> 
> >
> > One approach is to have a physical partition per virtual machine.
> >
> 
> Meaning, to create physical partitions on the controller and assign them to
> both hosts. So let's say we create a physical partition on the controller,
> 60GB, to be used for a specific VM. That partition will show up as e.g. sdd
> on the host. We can then install the VM on that partition. If I understand
> you correctly?

Correct, though you should use something more stable than sdd: those
device names are assigned sequentially in detection order and so can
change after a reboot. Since you're in an HA setup, you should have
multiple paths to each drive. Something like multipathd will then detect
the different paths and set up a virtual device in /dev/mapper for you
to use. That device is typically named after the WWN of the drive, which
is a unique, stable identifier that you can rely on.
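
A minimal sketch for checking which stable names point at a kernel
device such as sdd. stable_names is a made-up helper, not part of any
tool; it assumes GNU readlink, and which directory is worth scanning
(/dev/disk/by-id by default, or /dev/mapper) depends on the udev and
multipath setup of the host. "multipath -ll" shows the same mapping
from multipathd's side.

```shell
# stable_names DEVICE [DIR]
# Print every symlink in DIR (default: /dev/disk/by-id) that resolves
# to the same device node as DEVICE.
stable_names() {
    target=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example calls (device names are placeholders):
#   stable_names /dev/sdd
#   stable_names /dev/dm-2 /dev/mapper
```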

> > With this, you can then access the drive from either host (obviously
> > never from both at the same time), which means that should you want to
> > start the VM on the other host, you just need to stop the kvm process on
> > one and start it again on the other, without any data ever being moved.
> >
> 
> I assume we would simply use XML (export / import) to create the VM on the
> other host which is pointing to the partition sitting on the storage device?

Yep
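
With libvirt, that export / import could look roughly like the sketch
below. vm_move_plan is a hypothetical helper that only prints the
commands for review instead of running them; "web01" and the
qemu+ssh://host-b/system URI are placeholder names. It assumes the VM's
disk is referenced by the same /dev/mapper path on both hosts, so the
XML can be reused unchanged, and that VM names contain no spaces.

```shell
# vm_move_plan VM DEST_URI
# Print the virsh commands to stop VM on this host and bring it up on
# the host reachable through the libvirt URI DEST_URI.
vm_move_plan() {
    vm=$1
    dest=$2
    cat <<EOF
virsh shutdown $vm
virsh domstate $vm    # repeat until this reports "shut off"
virsh dumpxml $vm > $vm.xml
virsh -c $dest define $vm.xml
virsh -c $dest start $vm
virsh undefine $vm
EOF
}

# Example:
#   vm_move_plan web01 qemu+ssh://host-b/system
```

"virsh shutdown" only asks the guest to shut down, which is why the
plan checks the state before the VM is started anywhere else.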

> > For containers, it's a bit trickier as we don't support using a raw
> > block device as the root of the container. So you'd need LXD to either
> > use the host's local storage for the container root and then mount block
> > devices into those containers at paths that hold the data you care
> > about. Or you'd need to define a block device for each server in the
> > PowerVault and have LXD use that for storage (avoiding using the local
> > storage).
> >
> 
> I like the idea of creating a block device on the storage for each
> container and have LXD use that block device for a specific container. I'm
> not sure how and if we would be able to simply migrate a container from one
> host to the other (assuming that we would have those block devices
> available to both hosts)...?

Right, so as I mentioned, you can't have an LXD container use a single
partition from your storage array the way kvm lets you.

So instead your best bet is probably to set up one chunk of storage for
the container root filesystems and point LXD to that as the default
storage pool.

For container data, you can then use a partition from your storage
array, format it and pass it to the LXD container with something like:

lxc config device add some-container mysql disk source=/dev/mapper/PARTITION path=/var/lib/mysql

> > The obvious advantage of the second option is that should one of the
> > server go away for whatever reason, you'd be able to mount that server's
> > LXD pool onto the other server and spawn a second LXD daemon on it,
> > taking over the role of the dead server.
> >
> 
> From what I've read, a particular host can only use one ZFS Pool. This
> creates a limitation since we won't be able to create two pools - 1 for the
> faster storage drives and 1 for the slower storage drives.

That was true of LXD until 2.9, which introduced our storage API. That
API lets you use multiple different storage pools in whatever way you
want.


On recent LXD, you can now do:

lxc storage create ssd zfs source=tank-ssd/lxd
lxc storage create spindle zfs source=tank-spindle/lxd

Which will define two storage pools, both using ZFS and both using a
subset of a different zpool for the containers.

You can choose which pool to use at launch time with:

lxc launch ubuntu:16.04 blah -s ssd

Or set a default pool for a profile with:

lxc profile device add default root disk path=/ pool=spindle

> My initial planning was to:-
> 
> 
> Option 1
> ----------------
> Take the Fast Storage (3 x 600GB 15k SAS drives) and configure them on the
> controller as RAID5, split them into 2 and give each host a partition (sdx)
> of +/- 600GB
> Take the Slower Storage (3 x 6TB 7.2k drives) and configure them on the
> controller as RAID5, split them into 2 and give each host a partition (sdy)
> of +/- 6TB
> 
> Setup LVM with two volume groups, e.g. Vol0 - Fast Storage (300GB) and Vol1
> (3TB) - Slow Storage
> Create logical volumes as needed in terms of disk space per container and
> VMs and install the container and VMs onto those logical volumes.

That would work fine, though I'd recommend you use ZFS rather than LVM.

ZFS is much better whenever you have to deal with files, as containers
do, but it also supports creating block devices for use with virtual
machines.

You can even configure libvirt to directly allocate such ZFS volumes as
needed (if you're using libvirt); otherwise, they can be created with
"zfs create -V".

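As a rough illustration of the manual route: a volume created with
something like "zfs create -V 60G external-fast/vm/web01" normally
shows up on Linux under /dev/zvol/, and a VM can use it as a raw block
disk. The helper below is a sketch only; vm_disk_xml, the dataset name,
and the vda/virtio target are all assumptions chosen for the example.

```shell
# vm_disk_xml DATASET
# Print a libvirt <disk> element that uses the ZFS volume DATASET
# (for example external-fast/vm/web01) as a raw block device.
vm_disk_xml() {
    cat <<EOF
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/zvol/$1'/>
  <target dev='vda' bus='virtio'/>
</disk>
EOF
}

# Example (as root; dataset name and size are placeholders):
#   zfs create -V 60G external-fast/vm/web01
#   vm_disk_xml external-fast/vm/web01
```
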
> Option 2
> ----------------
> The other option I was thinking of was to create partitions on the
> controller and split the storage up so we could use say:-
> 
> 150GB Fast Storage as a ZFS Pool for containers (that needs disk speed)
> 1.5TB Slow Storage as a ZFS Pool for containers (that doesn't need as much
> disk speed)
> 1.5TB Slow Storage with LVM for VMs
> 
> and the same with the Slower Storage but then as far as I know, we can only
> have one ZFS Pool per host. So that's not going to work...?

So if you're willing to use the non-LTS branch of LXD, you can get the
multi-pool support I described above.

Though again, I'd recommend just putting everything on ZFS so you don't
need to guess how much space you need for containers vs VMs and can run
everything using the same storage technology.

> Option 3
> ----------------
> The last option I had in mind was to create the partitions on the
> controller and assign them onto their respective hosts (having sda, sdb,
> sdc etc.)
> That way, we could select whether we wanted Fast or Slow storage for a
> partition and have LXD and KVM use the partition for the installation.
> 
> 
> At least with LVM we could still leverage from snapshots.
> 
> 
> Option 4
> ----------------
> 
> Create a partition on the controller per host with space and use ZFS on
> those partitions and setup LXD to use those partitions, we would select
> either slower storage or faster storage for the purpose.
> Create the same size partition on both hosts.
> That way we can leverage from live migrations and snapshots.
> 
> Use LVM for the VMs and/or data mount points within the VMs / Containers.


So it sounds like your best bet may be:
 - Set up two "fast" volumes on the PowerVault, give one to each server
 - Set up two "slow" volumes on the PowerVault, give one to each server
 - Set up the hosts to each have:
    - external-fast zpool
    - external-slow zpool
    - internal-fast zpool
 - Then for each zpool, create a "vm" and a "lxd" dataset. If using the
   newer LXD, then create 3 LXD pools using those "lxd" datasets.
 - You can then set quotas for the "vm" and "lxd" datasets as needed,
   tweak compression settings and any other ZFS option.
 - You can then choose for every container and virtual machine, what
   kind of storage to give them and can create additional volumes to attach
   to them as needed (for example a container that's on the limited fast
   storage could get a big chunk of storage attached from the slower
   storage pool for a given path).
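
The layout above could be sketched per pool roughly as follows.
setup_pool and run are made-up helpers, the device paths are
placeholders, and lz4 compression is just one possible setting. By
default the commands are only printed (DRY_RUN=1), since "zpool create"
is destructive; the LXD step assumes a release with the storage API.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# setup_pool NAME VDEV...
# Create zpool NAME with "vm" and "lxd" datasets, then register the
# "lxd" dataset as an LXD storage pool of the same name.
setup_pool() {
    name=$1
    shift
    run zpool create "$name" "$@" &&
    run zfs set compression=lz4 "$name" &&
    run zfs create "$name/vm" &&
    run zfs create "$name/lxd" &&
    run lxc storage create "$name" zfs "source=$name/lxd"
}

# Example (placeholder devices; quota and volume names are arbitrary):
#   setup_pool external-fast /dev/mapper/FAST_VOLUME
#   setup_pool external-slow /dev/mapper/SLOW_VOLUME
#   zfs set quota=200G external-fast/lxd
#   lxc storage volume create external-slow bulk
#   lxc storage volume attach external-slow bulk some-container /srv/bulk
```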

Hope that helps.


-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com