[lxc-devel] a container can remount ro the host's mount point
lxc at zitta.fr
Mon Mar 15 17:52:53 UTC 2010
On 15/03/2010 18:28, Michael H. Warfield wrote:
> On Mon, 2010-03-15 at 15:39 +0100, lxc at zitta.fr wrote:
>
>> On 15/03/2010 15:05, Michael H. Warfield wrote:
>>
>>> On Sun, 2010-03-14 at 08:33 +0100, lxc at zitta.fr wrote:
>>>
>>>
>>>> Hi,
>>>>
>>>> When I create a full OS container (for example, Debian), I have to
>>>> remove the init script that remounts / read-only on halt
>>>> (for example, umountfs on lenny).
>>>>
>>>> If I don't do this, then when the container stops it remounts read-only
>>>> the host mount point that holds its rootfs.
>>>>
>>>> Why is a container able to do this?
>>>> If you store multiple containers on the same mount point, it could be
>>>> very problematic.
>>>>
>>>>
>>> Ah HA! So THAT'S the root cause of THAT problem. Several of us have
>>> noticed that effect. Yeah, major PITA. Also explains just why I no
>>> longer see it. Because of a practice I started using in setting up my
>>> containers...
>>>
>>> As it so happens, because all of my containers are OpenVZ compatibility
>>> containers, I use a bind mount in the fstab for the root fs. OpenVZ has
>>> this concept of a "private" and a "rootfs" to aid in setting disk quotas
>>> in the container and I'm hoping to also eventually use that with union
>>> mounts / unionfs to do a linux-vservers style unify. But... That also
>>> prevents this problem because the container's rootfs is NOT a real fs in
>>> the host; it's the bind mount, and that insulates the host's fs and mount
>>> points from any actions in the container.
>>>
>>> Example from one of my containers is like this:
>>>
>>> Config:
>>>
>>> ==
>>> lxc.rootfs = /srv/lxc/rootfs
>>> lxc.mount = /srv/lxc/config/1004.fstab
>>> ==
>>>
>>> fstab:
>>>
>>> ==
>>> /srv/lxc/private/1004 /srv/lxc/rootfs none bind 0 0
>>>
>>> /export /srv/lxc/rootfs/export none bind 0 0
>>> /home/shared /srv/lxc/rootfs/srv/shared none bind 0 0
>>> ==
>>>
>>> Would be really NICE if that bind could be something like a fuse with
>>> unionfs or, eventually, a union mount once those are mature and stable
>>> in the kernel, but we're not there yet.
>>>
>>> Now, you won't actually see anything in /srv/lxc/rootfs because it's
>>> private to the container and it's just a dummy mount point that can be
>>> used by all of your containers. The only thing that varies between my
>>> containers then is the location of the fstab (and the network stuff,
>>> obviously). The container can screw up its mounts all it wants; they're
>>> ALL isolated and private to the container, including the rootfs.
>>>
>>>
>>>
>>>> Regards,
>>>>
>>>>
>>>
>>>
>>>> Guillaume ZITTA
>>>>
>>>>
>>> Regards,
>>> Mike
>>>
>>>
>> Thanks.
>> I noticed that this practice was used by lxc-create in version 0.6.3.
>>
> No, not exactly, and it wasn't being done by lxc-create. lxc-create was
> merely creating the directory, it was not doing the bind mount and could
> not do the bind mount. The actual mount was being done by lxc-start at
> run time when starting that container. The code in lxc-create was
> removed because the behavior of lxc-start was changed to no longer
> require that directory.
>
>
>> With lxc-0.6.3, lxc-create is a binary and it does this for you, along
>> with other things in /var/lib/lxc.
>> With lxc-0.6.5, lxc-create is a shell script and it does fewer things
>> than the binary one.
>>
> Close but not quite.
>
>
>> Is this an intentional regression?
>>
> It was a change (and Daniel may chime in here and correct me at any
> moment) coupled with the introduction of using pivot root to actually
> improve the isolation of the containers from the host and prevent the
> containers from breaking out of their chrooted jails. That was a
> security fix. He did drop that additional bind mount at that time and
> yes that did provide the additional functional isolation in this one
> peculiar way that protected the host from random acts of terrorism by
> the container on its rootfs. An unanticipated side effect.
>
>
>> If not, I offer to update the lxc-create script to provide the same
>> kind of functionality as the C version.
>>
> No. Do not do that. It did not work the way you're thinking it did and
> that will not work. It would create a situation where you would have to
> rerun lxc-create after reboot or restarting because you will have lost
> the bind mounts. This never was done in lxc-create, only the creation
> of the directory. The mounting is done in lxc-start and must be done in
> lxc-start. Don't do this. Personally, I like the method of adding the
> bind mount explicitly to the fstab and plan to continue that way. Maybe
> we should merely make that be a "best practice". That also gives us the
> flexibility later down the road in adding disk quotas or different types
> of file systems, other than a vanilla bind mount. But all that's up to
> Daniel. I'm sure he's now fully aware of this unintended consequence of
> that change in lxc-start and he'll have to decide on the path moving
> forward. ITMT... The bind mount is a successful and safe workaround
> for the problem. Don't go back to the old way of doing things here.
>
> Regards,
>
> Mike
OK, I'll modify my container creation script to populate the fstab with
a bind mount.
Thanks for the explanation.
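
A minimal sketch of how a creation script might generate the fstab and
config entries Mike describes, assuming his layout (a per-container
private directory bind-mounted onto a shared dummy rootfs). The function
names render_fstab and render_config are hypothetical, not part of lxc,
and the paths in the usage lines are only the ones from his example:

```python
def render_fstab(private_dir, rootfs, extra_binds=None):
    """Return fstab text that bind-mounts private_dir onto rootfs.

    extra_binds maps a host path to a path relative to the container
    root (hypothetical parameter, used here for entries like /export).
    """
    # First entry: the container's real files appear at the shared rootfs.
    lines = [f"{private_dir} {rootfs} none bind 0 0"]
    for src, rel_target in (extra_binds or {}).items():
        # Additional binds land underneath the shared rootfs mount point.
        target = f"{rootfs.rstrip('/')}/{rel_target.lstrip('/')}"
        lines.append(f"{src} {target} none bind 0 0")
    return "\n".join(lines) + "\n"


def render_config(rootfs, fstab_path):
    """Return the two config lines that point lxc at the rootfs and fstab."""
    return f"lxc.rootfs = {rootfs}\nlxc.mount = {fstab_path}\n"


if __name__ == "__main__":
    # Values taken from the example earlier in the thread.
    print(render_config("/srv/lxc/rootfs", "/srv/lxc/config/1004.fstab"), end="")
    print(render_fstab(
        "/srv/lxc/private/1004",
        "/srv/lxc/rootfs",
        {"/export": "export", "/home/shared": "srv/shared"},
    ), end="")
```

This only produces text; the mounts themselves are still performed by
lxc-start when the container starts, as explained above.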
Regards,
Guillaume