Re: Mainloop debugging tool idea



On 18/11/12 23:55, Bastien Nocera wrote:
> was wondering whether the shell was actually blocking at any point

Short answer, from the (blockers of the) bug you mentioned: "yes". Help
welcome; I've done some work on reducing blocking, but I'm probably not
going to be able to continue to work on that in the short term, and I
don't know Shell's design well enough to assess how feasible it is to
design out the blocking by (e.g.) loading images before they're needed.

As far as I could tell, the main problems were:

* Things that take time, but have to be done from the main thread
  (either because library code like telepathy-glib[1], the
  NetworkManager libraries[1], and GtkIconTheme aren't thread-safe, or
  because particular objects like GL contexts can only be manipulated
  from the main thread)

* File I/O for images that needs to be done before the next frame can
  be rendered, because that frame will contain those images

[1] this is mostly dbus-glib's fault

> Came the idea of using of watchdog thread, based on the mainloop, which
> would check for how long the mainloop was running a particular
> iteration, and have to dump a backtrace if the main loop was blocking on
> a particular task for more than X amount of time.

I think having the main loop time itself, as I did on #588139, might be
a better approach: if the watchdog polls the main thread, the
synchronization that polling requires will itself perturb the timing.
I could be wrong.
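
To illustrate what "the main loop timing itself" means, here is a rough
sketch. It is not the patch from #588139 and does not use GLib: the names
timed_dispatch, dispatch_fn and slow_callback are made up for this example,
and it assumes POSIX clock_gettime() and nanosleep() are available. The
point is only that the measurement happens on the dispatching thread, so
no cross-thread synchronization is involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a main-loop source's dispatch callback. */
typedef int (*dispatch_fn)(void *user_data);

/* Monotonic time in microseconds, so wall-clock changes don't matter. */
static long long now_usec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/* Run one callback on the calling thread, measure how long it took,
 * and warn on stderr if it exceeded threshold_usec.
 * Returns the elapsed time in microseconds. */
long long timed_dispatch(const char *name, dispatch_fn fn, void *user_data,
                         long long threshold_usec)
{
    long long start, elapsed;

    assert(fn != NULL);
    start = now_usec();
    fn(user_data);
    elapsed = now_usec() - start;

    if (elapsed > threshold_usec)
        fprintf(stderr, "source '%s' blocked the loop for %lld us\n",
                name, elapsed);
    return elapsed;
}

/* Example callback that blocks for roughly 20 ms. */
int slow_callback(void *user_data)
{
    struct timespec delay = { 0, 20 * 1000 * 1000 };

    (void)user_data;
    nanosleep(&delay, NULL);
    return 0;
}
```

Calling timed_dispatch("slow", slow_callback, NULL, 5000) would print a
warning, since the callback sleeps for well over the 5 ms threshold.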

I found that the backtrace of the location at which the source was added
was a more useful thing to have than the backtrace at the location where
it's run. I didn't include that in the patch for #588139 but I can tidy
it up if there's interest. This part is non-portable (I just used
glibc's execinfo.h) but I doubt that really matters.
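
For anyone unfamiliar with execinfo.h, recording the backtrace at the
point where a source is added could look roughly like this. It is a
sketch only, not the code I mentioned: traced_source and its functions
are hypothetical names, and it assumes glibc's backtrace() and
backtrace_symbols(). Symbol names are only as good as the binary's
symbol information (linking with -rdynamic helps).

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 16

/* Hypothetical record kept alongside each source. */
typedef struct {
    const char *name;
    void *frames[MAX_FRAMES]; /* return addresses captured at add time */
    int n_frames;
} traced_source;

/* Call this where the source is added: it stores the caller's stack. */
void traced_source_init(traced_source *source, const char *name)
{
    source->name = name;
    source->n_frames = backtrace(source->frames, MAX_FRAMES);
}

/* Call this later (e.g. when the source turns out to be slow) to print
 * where it was originally added. */
void traced_source_report(const traced_source *source, FILE *out)
{
    char **symbols = backtrace_symbols(source->frames, source->n_frames);
    int i;

    fprintf(out, "source '%s' was added from:\n", source->name);
    if (symbols == NULL) {
        fprintf(out, "  (symbols unavailable)\n");
        return;
    }
    for (i = 0; i < source->n_frames; i++)
        fprintf(out, "  %s\n", symbols[i]);
    free(symbols); /* one allocation holds all the strings */
}
```

Capturing raw addresses is cheap; the expensive symbol lookup is deferred
to the report, so it only happens for sources that misbehave.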

> This would make testing problems like the ones split off from:
> https://bugzilla.gnome.org/show_bug.cgi?id=687362
> much easier.

Beware that if you insert enough tracepoints to be able to see what's
going on, and attach a tracing tool (assuming the overhead of running
strace is representative of similar tools), the tracing itself slows
everything down quite a lot (I've seen a factor of 4). The approach I
used for #687362 was to have two separate builds of GLib and Shell:

* one with only the patch from #588139, to assess how long we're
  blocking for and whether a change I made has had a positive effect

* one with lots of extra profiling (#688337, and equivalents in Shell),
  to try to debug what is blocking and why, but knowing that everything
  is much slower than it ought to be and numbers I produce from this
  build are not necessarily meaningful

Yes yes I know, systemtap, but that needs the unmerged utrace kernel
patches, which aren't in my target kernel, and I didn't really feel like
digressing into kernel hacking at the moment :-)

    S

