[Lxc-users] Containers are all getting same IP address

Jay Taylor jay at jaytaylor.com
Wed Aug 14 20:49:41 UTC 2013


One additional note:

Make sure the btrfs volume is a fast disk.  I just tried with an AWS EBS
volume and was unable to reproduce the problem.  As soon as I switched to
using an ephemeral (local storage) disk, I was able to reproduce after only
2 runs of the test script.
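
To check each run without reading the listing by eye, something like the
following could help.  It is only a rough sketch: the function names are made
up for illustration, and it assumes the listing has the same column order as
the one quoted below (name, state, IPv4, ...), e.g. as piped from something
like `lxc-ls --fancy`.

```shell
#!/usr/bin/env bash

# Hypothetical helpers, not part of LXC.  Both read a container listing on
# stdin and assume whitespace-separated columns: NAME STATE IPV4 ...

# Print how many RUNNING containers show "-" in the IPv4 column.
count_missing_ipv4() {
    awk '$2 == "RUNNING" && $3 == "-" { n++ } END { print n + 0 }'
}

# Print each IPv4 address that appears on more than one RUNNING container.
find_duplicate_ipv4() {
    awk '$2 == "RUNNING" && $3 != "-" { seen[$3]++ }
         END { for (ip in seen) if (seen[ip] > 1) print ip }' | sort
}
```

For example, `sudo lxc-ls --fancy | count_missing_ipv4` after each run of the
test script, assuming that command produces the listing format shown below.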


On Wed, Aug 14, 2013 at 1:22 PM, Jay Taylor <jay at jaytaylor.com> wrote:

> Hi Serge,
>
> I added zfs support to the application and systems creating/hosting the
> containers, and I have subsequently been unable to reproduce any issues.
>
> As far as trying to reproduce it with btrfs, I've had some success.
>
> The general system state is something like:
> N containers already running happily
> Launch N+ more containers in rapid succession (in parallel, not serially).
>
> I've modified your test script to reflect more closely what my application
> is actually doing, by slowly launching 10 containers, and then using "&" to
> rapidly fork an additional 10 clone/start operations.  I have it doing 2
> cycles of this and it eventually triggers the problem (it has taken up to 3
> runs to trigger the problem).
>
> And for reference, here is an exact copy of the scripts I used to reproduce
> the problem:
>
> test.sh:
>
> #!/usr/bin/env bash
>
> prefix=$1
>
> test -z "${prefix}" && echo 'error: missing required parameter: prefix' 1>&2 && exit 1
>
> path=/mnt
>
> sudo lxc-destroy -n c1 2>/dev/null
> sudo lxc-create -t ubuntu -B btrfs -n c1
>
> for i in `seq 1 10`; do
>     sudo lxc-clone -s -B btrfs -P $path -o c1 -n $prefix$i
>     sudo lxc-start -d -n $prefix$i
> done
> for i in `seq 11 20`; do
>     echo $(sudo lxc-clone -s -B btrfs -P $path -o c1 -n $prefix$i; sudo lxc-start -d -n $prefix$i) &
> done
>
> sleep 10
>
> # Create even more.
> for i in `seq 21 30`; do
>     sudo lxc-clone -s -B btrfs -P $path -o c1 -n $prefix$i
>     sudo lxc-start -d -n $prefix$i
> done
> for i in `seq 31 40`; do
>     echo $(sudo lxc-clone -s -B btrfs -P $path -o c1 -n $prefix$i; sudo lxc-start -d -n $prefix$i) &
> done
>
>
> stop.sh:
>
> #!/usr/bin/env bash
>
> prefix=$1
>
> test -z "${prefix}" && echo 'error: missing required parameter: prefix' 1>&2 && exit 1
>
> sudo lxc-destroy -n c1;
>
> for i in `seq 1 40`; do
>     echo $(sudo lxc-stop -k -n $prefix$i; sudo lxc-destroy -n $prefix$i) &
> done
>
>
>
> bash ./test.sh x
> bash ./test.sh y
> bash ./test.sh z
>
>
> If it doesn't manifest at first, try stopping/starting varying quantities
> of containers for several cycles.  Eventually the containers consistently
> end up never getting IP addresses:
>
> x1                          RUNNING  -     -     NO
> x10                         RUNNING  -     -     NO
> x11                         RUNNING  -     -     NO
> x12                         RUNNING  -     -     NO
> x13                         RUNNING  -     -     NO
> x14                         RUNNING  -     -     NO
> x15                         RUNNING  -     -     NO
> x16                         RUNNING  -     -     NO
> x17                         RUNNING  -     -     NO
> x18                         RUNNING  -     -     NO
> x19                         RUNNING  -     -     NO
> x2                          RUNNING  -     -     NO
> x20                         RUNNING  -     -     NO
> x21                         RUNNING  -     -     NO
> x22                         RUNNING  -     -     NO
> x23                         RUNNING  -     -     NO
> x24                         RUNNING  -     -     NO
> x25                         RUNNING  -     -     NO
> x26                         RUNNING  -     -     NO
> x27                         RUNNING  -     -     NO
> x28                         RUNNING  -     -     NO
> x29                         RUNNING  -     -     NO
> x3                          RUNNING  -     -     NO
> x30                         RUNNING  -     -     NO
> x31                         RUNNING  -     -     NO
> x32                         RUNNING  -     -     NO
> x33                         RUNNING  -     -     NO
> x34                         RUNNING  -     -     NO
> x35                         RUNNING  -     -     NO
> x36                         RUNNING  -     -     NO
> x37                         RUNNING  -     -     NO
> x38                         RUNNING  -     -     NO
> x39                         RUNNING  -     -     NO
> x4                          RUNNING  -     -     NO
> x40                         RUNNING  -     -     NO
> x5                          RUNNING  -     -     NO
> x6                          RUNNING  -     -     NO
> x7                          RUNNING  -     -     NO
> x8                          RUNNING  -     -     NO
> x9                          RUNNING  -     -     NO
>
>
> On Wed, Aug 14, 2013 at 10:12 AM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>
>> Quoting Serge Hallyn (serge.hallyn at ubuntu.com):
>> > Quoting Jay Taylor (jay at jaytaylor.com):
>> > > After further investigation yesterday, I am not convinced it is an
>> > > IP-address issue.  The affected host machines are unable to start any
>> > > existing or newly created containers.  The incident that triggered the
>> > > issue was cloning 1 container into 10 new ones, and then launching
>> > > them all simultaneously.  Are there any known concurrency issues with
>> > > LXC which would explain why executing a lot of clone/start LXC
>> > > commands at the same
>> >
>> > Known, no, but that doesn't mean they're not there :)
>> >
>> > However, could you try to reproduce this with non-btrfs?
>> >
>> > I'll try to reproduce with btrfs...
>>
>> In a fresh raring instance I mounted a btrfs disk on /mnt, and did
>>
>>         lxc-create -t ubuntu -B btrfs -P /mnt -n c1
>>         for i in `seq 1 10`; do
>>                 lxc-clone -s -p /mnt -o c1 -n x$i
>>         done
>>         for i in `seq 1 10`; do
>>                 lxc-start -d -P /mnt -n x$i
>>         done
>>
>>         Then connected to two of the containers with lxc-console,
>>                 lxc-console -P /mnt -n x2
>>                 lxc-console -P /mnt -n x9
>>
>>         both were up and had unique ip addresses.
>>
>> Again this was a raring instance with ppa:ubuntu-lxc/daily installed.
>>
>> -serge
>>
>
>