[lxc-devel] regression: lxc-start -d hangs in lxc_monitor_sock_name (at process_lock)

Dwight Engen dwight.engen at oracle.com
Thu Sep 12 13:42:08 UTC 2013


On Thu, 12 Sep 2013 00:27:04 -0400
Stéphane Graber <stgraber at ubuntu.com> wrote:

> Hello,
> 
> It looks like Dwight's last change introduced a bit of a regression
> when running lxc-start -d.

Yikes, sorry I didn't catch that in my testing. My follow-on patch,
which moves the monitor socket into the abstract socket namespace, gets
rid of this entirely, so this is one more reason to consider it.

> Tracing it down (added a ton of printf all over), it looks like it's
> hanging on:
>  - lxcapi_start
>    - wait_on_daemonized_start
>      - lxcapi_wait
>        - lxc_wait
>          - lxc_monitor_open
>            - lxc_monitor_sock_name
> 
> Specifically, it's hanging at the process_lock() call because
> process_lock() was already called as part of lxcapi_start and only
> gets unlocked right after wait_on_daemonized_start returns.
> 
> 
> Looking at the code, I'm not even sure why we need process_lock there.
> What it protects is another thread triggering the mkdir_p in parallel,
> but that shouldn't really be a problem since running two mkdir_p at
> the same time should still result in the hierarchy being created, or
> did I miss something?
 
That sounds logical to me, but hmm, does that mean we don't need it in
lxclock_name() either (which is what I modeled this on)? I wonder if
there is a code flow where it's possible for us to hang there too.
 
> Anyway, I'll let someone who knows that code better figure out a fix,
> until then, lxc-start -d is broken in staging.
> 




