glib: processing events in multiple threads



Hello!

In SyncEvolution, I am using a library with synchronous method calls
which is unaware of glib. In order to use it in combination with glib
event processing, I usually do the following (sketched in code below):
     1. run event loop until I need to call the library
     2. quit event loop
     3. call library
     4. in a callback from the library back into my code: enter event
        loop again until ready to continue in the library, or
     5. after the library returns: go back to step 1
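
In glib terms, those steps look roughly like this (a sketch only;
library_call() and library_callback() are made-up stand-ins for the
real, glib-unaware library):

    static void library_callback(void)
    {
            /* step 4: enter the event loop again until ready to
             * continue in the library; something in that loop is
             * expected to call g_main_loop_quit(nested) */
            GMainLoop *nested = g_main_loop_new(NULL, FALSE);
            g_main_loop_run(nested);
            g_main_loop_unref(nested);
    }

    ....
        GMainLoop *loop = g_main_loop_new(NULL, FALSE);
        /* step 1: run until some event handler decides that the
         * library is needed and calls g_main_loop_quit(loop) (step 2) */
        g_main_loop_run(loop);
        /* step 3: synchronous call, may call library_callback() */
        library_call();
        /* step 5: back to step 1 */
        g_main_loop_unref(loop);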

Having to set up an event loop and quit it at the right time turned
out to be confusing, so instead I started to use the following pattern:
        while (<not done>) {
                g_main_context_iteration(NULL, TRUE);
        }
where <not done> is changed by some event in the event loop. The
assumption obviously is that g_main_context_iteration() will return
control after processing that event.
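
Spelled out with a timeout as the event, that looks like this
(illustration only; wake_up() and the "done" flag are made-up names,
not my real code):

    static gboolean wake_up(gpointer data)
    {
            /* the event being waited for: flips the flag */
            *(gboolean *)data = TRUE;
            return G_SOURCE_REMOVE;
    }

    ....
        gboolean done = FALSE;
        g_timeout_add(1000 /* ms */, wake_up, &done);
        while (!done) {
                g_main_context_iteration(NULL, TRUE);
        }

As long as only one thread iterates the default context, this works as
expected.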

To make matters worse, now I have to use multithreading inside that
library. At first glance, the code above still seemed to work, but I
started to get random test failures which point towards a race
condition. I think the assumption above no longer holds. I haven't seen
it happen yet, but it's fairly obvious that it can (and as Murphy tells
us, will) go wrong like this:
     1. Thread 1 calls g_main_context_iteration(NULL, TRUE).
     2. Thread 2 sets up an event, say g_timeout_add(), then calls
        g_main_context_iteration(NULL, TRUE). It blocks waiting for
        ownership of the context.
     3. Thread 1 gets woken up by the event from thread 2, returns from
        g_main_context_iteration().
     4. Thread 2 (inside g_main_context_iteration()) becomes the owner
        of the context and continues to wait inside the method.
     5. Thread 1 checks its own <not done>, which is still true, calls
        g_main_context_iteration(NULL, TRUE) again and now blocks
        waiting for ownership of the context, which thread 2 holds.
=> process is stuck unless some other event wakes up thread 2, which
might never happen. Right?
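
For concreteness, in this scenario each thread essentially executes a
helper like the following (again just a sketch with made-up names,
reusing wake_up() from above):

    static void wait_for_event(void)
    {
            gboolean done = FALSE;
            g_timeout_add(10 /* ms */, wake_up, &done);
            while (!done) {
                    /* Blocks until this thread owns the default context
                     * and something wakes up its poll - even if wake_up()
                     * was already dispatched by the other thread. */
                    g_main_context_iteration(NULL, TRUE);
            }
    }

Depending on the interleaving, a thread's "done" flag can thus be set
by the other thread while it is still blocked inside
g_main_context_iteration(), which is exactly the stuck state described
above.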

A quick and dirty workaround would be to ensure that at regular
intervals a "watchdog" timer wakes up whatever thread currently owns
the main context. But what's the right solution?
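
Just to illustrate the quick and dirty variant: the watchdog could be
as simple as a recurring do-nothing timeout (a sketch; the 100 ms
interval is arbitrary). Dispatching it forces whichever thread
currently owns the context to return from g_main_context_iteration(),
so that thread can re-check its <not done> flag and release the
context.

    static gboolean watchdog(gpointer data)
    {
            /* nothing to do - being dispatched is the whole point */
            return G_SOURCE_CONTINUE;
    }

    ....
        g_timeout_add(100 /* ms */, watchdog, NULL);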

I thought that g_main_loop_run() with separate GMainLoop instances each
time a thread needs to wait might work, but I can see at least one race
condition with that. The code would look like this, for the simple "time
out after certain time" case:

    gboolean timed_out(gpointer data) {
            GMainLoop *loop = (GMainLoop *)data;
            g_main_loop_quit(loop);
            return G_SOURCE_REMOVE;
    }

    ....
        GMainLoop *loop = g_main_loop_new(NULL, TRUE /* running */);
        g_timeout_add(1 /*ms*/, timed_out, loop);
        g_main_loop_run(loop);

The problem with that is:
     1. Thread 1 is inside g_main_loop_run().
     2. Thread 2 sets up the timeout, but gets preempted after
        g_timeout_add() and before g_main_loop_run().
     3. Thread 1 dispatches the timeout, i.e. calls timed_out(), which
        quits thread 2's loop (which is not even running yet) and
        removes the source.
     4. Thread 2 enters g_main_loop_run() which sets "is running" to
        true and is stuck forever.

It's a bit contrived because it's unlikely to happen when the timeout is
large, but it may be more likely with other kinds of events.

The risk may be reduced a bit further by checking first:
        if (g_main_loop_is_running(loop)) {
            g_main_loop_run(loop);
        }
but because that code can be interrupted, too, it just shifts the
problem. The real solution would be a g_main_loop_run_if_running() which
atomically checks the flag and returns immediately if false. Is
something like that possible with the current API, or are there other
solutions to the problem?

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.



