[Lxc-users] Containers slow to start after 1600

Benoit Lourdelet blourdel at juniper.net
Mon Mar 18 17:56:53 UTC 2013


Hello,


My replies are inline, marked BL>>.


Quoting Benoit Lourdelet (blourdel at juniper.net):
>Hello Serge,
>I am running on a 256MB RAM host, with plenty of free memory.

G? :)

>I issued  echo t > /proc/sysrq-trigger  when containers were taking 30s to
>start; it gave the following. Nothing caught my attention.

Hm.  Thanks.

I guess I would first just try:

for i in `seq 1 1700`; do
	sudo unshare -n sleep 2h
	if [ $((i % 100)) -eq 0 ]; then
		echo $i
	fi
done

BL>> On my system it does not give any output: it never reaches the "echo".
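A likely explanation for the missing output is that each "sudo unshare -n sleep 2h" in the loop above runs in the foreground, so the loop waits two hours on the first iteration. A minimal sketch of a backgrounded variant follows; it also prints the seconds taken per batch of 100 so a slowdown is visible. The names spawn_netns and SPAWN_CMD are illustrative, not from the thread, and the sketch assumes sudo does not need to prompt for a password (or that it is run as root).

```shell
#!/bin/sh
# Sketch only. SPAWN_CMD and spawn_netns are hypothetical names.
# Default command: one sleeper in a fresh network namespace.
: "${SPAWN_CMD:=sudo unshare -n sleep 2h}"

# spawn_netns <count>: start <count> background sleepers and print,
# after every 100, the running count and the seconds that batch took.
spawn_netns() {
	count=$1
	i=1
	start=$(date +%s)
	while [ "$i" -le "$count" ]; do
		# '&' is the key change: without it the loop blocks on the first sleep.
		$SPAWN_CMD > /dev/null 2>&1 &
		if [ $((i % 100)) -eq 0 ]; then
			now=$(date +%s)
			echo "$i $((now - start))s"
			start=$now
		fi
		i=$((i + 1))
	done
}
```

Calling "spawn_netns 1700" reproduces the intended test; if the per-batch time climbs as the count grows, plain netns creation is what is not scaling.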

and see if  it starts to slow down with just that.  If so, then go to
the linux-kernel mailing list as there is something in the netns
which is not scaling.  If not, then next write a script like

cat > /bin/simplenetns << EOF
#!/bin/sh
ip link add type veth
sleep 2h
EOF
chmod ugo+x /bin/simplenetns

and do

for i in `seq 1 1700`; do
	sudo unshare -n /bin/simplenetns > /tmp/out.$$ 2>&1 &
	if [ $((i % 100)) -eq 0 ]; then
		echo $i
	fi
done


BL>> This script takes a couple of seconds; I even scaled to 5000 without
it taking more than a couple of seconds.


If that slows down, then it's the veth creations doing it.

If not, then try creating the veths from the parent task and moving one
end of each pair into the container, so that you end up with n/2 veths
in the host.

BL>> I would need detailed instructions.

Thanks

Benoit

If that does it, then it may just be a sysfs scalability bug.
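For the step of creating the veths from the parent task, a rough sketch could look like the following. It is meant to be run as root. The names run, add_pair, spawn_with_veth, DRY_RUN and the interface names h<i>/c<i> are made up for illustration. It assumes /proc/<pid>/ns/net is a readable symlink on the running kernel, which is used to wait until the child has really left the host namespace before moving the interface.

```shell
#!/bin/sh
# Sketch only; run as root. All function and interface names are hypothetical.

# With DRY_RUN set, print each command instead of executing it.
run() {
	if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# add_pair <idx> <pid>: create one veth pair in the host, then move the
# c<idx> end into the netns of <pid>. The h<idx> end stays in the host.
add_pair() {
	idx=$1
	pid=$2
	run ip link add "h$idx" type veth peer name "c$idx"
	run ip link set "c$idx" netns "$pid"
}

# spawn_with_veth <count>: start sleepers in fresh namespaces and give
# each one end of a veth pair created by the parent.
spawn_with_veth() {
	count=$1
	host_ns=$(readlink /proc/self/ns/net)
	i=1
	while [ "$i" -le "$count" ]; do
		# unshare execs sleep, so $! is the pid living in the new netns.
		unshare -n sleep 2h &
		child=$!
		# Wait until the child is no longer in the host netns.
		while [ "$(readlink "/proc/$child/ns/net")" = "$host_ns" ]; do
			sleep 0.1
		done
		add_pair "$i" "$child"
		if [ $((i % 100)) -eq 0 ]; then
			echo "$i"
		fi
		i=$((i + 1))
	done
}
```

Setting DRY_RUN=1 and calling add_pair by hand shows the two ip commands issued per container without touching the system; "spawn_with_veth 1700" runs the real test.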

And if that still doesn't do it, then try adding the host end of each
veth pair to a host bridge.  If that does it, then it may be a bridge
scaling issue.

(if you want scripts for the later ones pls shout)

I'll be very interested to hear which if any of these triggers it.

thanks,
-serge





