Instabilities with a multi-threaded application



Yeah, I know you are probably thinking "gdk_threads_enter" and
"gdk_threads_leave" already. But let's assume that I did read the glib,
gdk and gtk+ documentation about multi-threaded applications :). I do
know that issues with multi-threading are often related to misuse of
these functions.

We have an application whose GUI must never hang or block in a way that
makes the user think it has crashed. The application reads image data
from a slow device (in fact it's a scientific microscope; the
application is more or less an embedded GUI for this device).

We use CORBA (more specifically ORBit) for communication with the
device. The device will deliver its images to us through a CORBA link.
The device cannot perform tasks in parallel.

So I created a queue on the GUI side. This queue launches functions
that call CORBA methods, which return the data, and I want to put that
data in a GtkDrawingArea. My question is not about CORBA; we are not
using CORBA yet.

So I did the usual thing for drawing images, which is to implement the
expose callback for the GtkDrawingArea. You will see a struct
InteractiveWindow; this is the struct for the GUI that works
interactively with the microscope. So now you should understand the
naming that I am using :). It basically means: you click on an image,
the microscope does its thing, and you see a zoomed version of what you
just clicked.

static gboolean on_pattern_drawingarea_expose_event
 (GtkDrawingArea *drawingarea, GdkEventExpose *event, gpointer window)
{
 InteractiveWindow *win = (InteractiveWindow*) window;
 gdk_draw_rgb_image(GTK_WIDGET(drawingarea)->window, 
  GTK_WIDGET(drawingarea)->style->fg_gc[GTK_STATE_NORMAL],
  0, 0, width, height,
  GDK_RGB_DITHER_MAX, win->pattern_img, width*3);
  return FALSE;
}

And of course I implemented the realize handler like this:

static void on_pattern_drawingarea_realize
(GtkDrawingArea *drawingarea, gpointer window)
{
  InteractiveWindow *win = (InteractiveWindow*) window;
  gtk_widget_add_events (GTK_WIDGET(drawingarea), 
		GDK_ENTER_NOTIFY_MASK
		| GDK_LEAVE_NOTIFY_MASK
		| GDK_BUTTON_PRESS_MASK
		| GDK_BUTTON_RELEASE_MASK
		| GDK_POINTER_MOTION_MASK
		| GDK_POINTER_MOTION_HINT_MASK );
}

When I alter the pointer win->pattern_img, which is a guchar pointer to
the RGB buffer, the GtkDrawingArea will draw it on the next expose. This
is working and tested.



Because I want the actions that the user requests to be queued, I
created a simple queue:

Queue * new_Queue (gpointer host)
{
	Queue *queue = (Queue*)g_malloc(sizeof (Queue));
	
	queue->app = host;
	queue->add_item = add_item;
	queue->cleanup = cleanup;
	queue->items = NULL;
	pthread_mutex_init (&queue->items_mutex, NULL);
	return queue;
}


static void * thread_main_func (void * data)
{
	Queue *queue = (Queue*)data;

	while (queue->items)
	{
		QueueItem *item;
		gpointer result;

		
		pthread_mutex_lock (&queue->items_mutex);
		item = (QueueItem*)queue->items->data;
		pthread_mutex_unlock (&queue->items_mutex);

		
		item->app = queue->app;
		result = item->launcher (item);

		item->callback (item, result);

		//sleep(1);
		g_free (result);

		pthread_mutex_lock (&queue->items_mutex);
		queue->items = g_list_next (queue->items);
		pthread_mutex_unlock (&queue->items_mutex);
	}

	pthread_mutex_lock (&queue->items_mutex);
	g_list_free (queue->items);
	queue->items = NULL;
	pthread_mutex_unlock (&queue->items_mutex);

	return NULL;
}

static void add_item (Queue *queue, QueueItem *item)
{
	App *app = (App*) queue->app;
	item->app = app;

	pthread_mutex_lock (&queue->items_mutex);
	queue->items = g_list_append (queue->items, item);
	pthread_mutex_unlock (&queue->items_mutex);

	g_print ("%d\n", g_list_length(queue->items));
	if (g_list_length (queue->items) == 1)
	{
	 pthread_create (&queue->thread, NULL, thread_main_func, queue);
	}
}


This is the corresponding .h file:

typedef struct _Queue Queue;
typedef struct _QueueItem QueueItem;

struct _Queue 
{
	gpointer app;
	GList *items;
	pthread_t thread;
	pthread_mutex_t items_mutex;

	void (*add_item) (Queue *queue, QueueItem *item);

	void (*cleanup) (Queue *queue);
};

struct _QueueItem
{
	gpointer app;
	gpointer sender;
	gpointer args;
	gpointer (*launcher) (QueueItem *item);
	void (*callback) (QueueItem *item, gpointer data);
};

Queue * new_Queue (gpointer host);


I don't think that I have to paste the App struct here, and it's not
used yet anyway (it will be, but we are not using it yet).

With this queue, if we add an item, it will be processed. If we add an
item while another item is being processed, the new item is queued and
processed later. Once all items have been processed, the thread exits.
If we add an item and there is no thread, a new thread is created that
starts processing. Thus a very simple queueing mechanism.
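
For comparison, the behaviour described above can be sketched without
any Gtk+ code. This is only a minimal illustration under my own
assumptions, not the application code: WorkQueue, WorkItem and the
work_queue_* functions are hypothetical names, and the int payload is a
placeholder for the real arguments. It differs from the queue above in
three ways: the "is the queue empty?" test and the "start a worker?"
decision are both made while holding the mutex, each item is unlinked
before it is processed, and every item is heap-allocated so it outlives
the function that queued it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Queue and QueueItem; not the application types. */
typedef struct WorkItem {
    int value;                  /* placeholder for the real arguments */
    struct WorkItem *next;
} WorkItem;

typedef struct {
    WorkItem *head, *tail;
    int running;                /* non-zero while a worker thread is alive */
    int processed;              /* number of items handled so far */
    pthread_mutex_t mutex;
    pthread_cond_t idle;        /* broadcast when the worker is about to exit */
} WorkQueue;

static void work_queue_init(WorkQueue *q)
{
    q->head = q->tail = NULL;
    q->running = 0;
    q->processed = 0;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->idle, NULL);
}

static void *work_queue_worker(void *data)
{
    WorkQueue *q = data;

    pthread_mutex_lock(&q->mutex);
    while (q->head != NULL) {
        WorkItem *item = q->head;       /* unlink while holding the lock */
        q->head = item->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->mutex);

        /* The slow call (launcher/callback) would run here, unlocked. */
        free(item);

        pthread_mutex_lock(&q->mutex);
        q->processed++;
    }
    q->running = 0;                     /* cleared under the same lock */
    pthread_cond_broadcast(&q->idle);
    pthread_mutex_unlock(&q->mutex);
    return NULL;
}

/* Returns 0 on success, -1 on allocation or thread-creation failure. */
static int work_queue_push(WorkQueue *q, int value)
{
    int rc = 0;
    WorkItem *item = malloc(sizeof *item);  /* heap item outlives the caller */

    assert(q != NULL);
    if (item == NULL)
        return -1;
    item->value = value;
    item->next = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->tail != NULL)
        q->tail->next = item;
    else
        q->head = item;
    q->tail = item;
    if (!q->running) {                  /* start a worker only if none is alive */
        pthread_t thread;
        if (pthread_create(&thread, NULL, work_queue_worker, q) == 0) {
            pthread_detach(thread);
            q->running = 1;
        } else {
            rc = -1;
        }
    }
    pthread_mutex_unlock(&q->mutex);
    return rc;
}

/* Blocks until no worker thread is alive. */
static void work_queue_wait_idle(WorkQueue *q)
{
    pthread_mutex_lock(&q->mutex);
    while (q->running)
        pthread_cond_wait(&q->idle, &q->mutex);
    pthread_mutex_unlock(&q->mutex);
}
```

Because the worker clears `running` while holding the same mutex that
`work_queue_push` takes, a push can never see a non-empty list with no
worker alive, nor start a second worker next to a live one.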


If I change the processing lines in thread_main_func to this:

//result = item->launcher (item);
//item->callback (item, result);
sleep(1);

Everything is working perfectly! The queue is traversed and the thread
ends once all is done. I can add new items very rapidly and it will just
queue them all and process them one by one. This is of course what I
want.


To add an item to the queue, I am doing this in the GUI:


typedef struct _DemoWalkArgs DemoWalkArgs;
struct _DemoWalkArgs 
{
	gint pattern_width;
	gint pattern_height;
	gint pos;
};


static gpointer demowalk_launcher (QueueItem *item)
{
	gpointer result=NULL;
	DemoWalkArgs *args = (DemoWalkArgs*)item->args;

	gdk_threads_enter ();

	g_print ("demowalk_launcher %d, %d, %d\n", 
		args->pattern_width, args->pattern_height, args->pos);

	result = g_strdup ("a very long string");

	gdk_threads_leave();

	return result;
}

static void demowalk_callback (QueueItem *item, gpointer data)
{
	
	gdk_threads_enter();
	g_print ("demowalk_callback %s\n", (gchar*)data);
	gdk_threads_leave ();
	return;
}


static gboolean on_carrier_drawingarea_button_press_event
	(GtkDrawingArea *drawingarea, GdkEventButton *event, gpointer window)
{

	InteractiveWindow *win = (InteractiveWindow*) window;
	DemoWalkArgs *args = (DemoWalkArgs*) g_malloc
		 (sizeof (DemoWalkArgs));

	/* These variables are specific to our application of course*/

	App* app = (App*) win->app;
	Queue *queue = app->queue;

	QueueItem demowalk_queueitem =
	{
		app,
		win,
		(gpointer)args,
		demowalk_launcher,
		demowalk_callback
	};


	g_print ("Adding\n");

	/* Some dummy values (unused)*/

	args->pattern_width = 256;
	args->pattern_height = 256;
	args->pos = 10;

	queue->add_item (queue, &demowalk_queueitem);
	g_print ("Item added\n");

	return FALSE;
}
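
One detail worth checking in the handler above: demowalk_queueitem is a
local variable, so the pointer handed to add_item refers to the
handler's stack frame, which is no longer valid once the handler
returns and the worker thread gets around to reading it. A minimal
sketch of handing over a heap copy instead follows. The names Item,
item_dup, demo_launcher and make_item are hypothetical, and whether this
is related to the crash described below is an assumption, not something
that has been verified.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for QueueItem. */
typedef struct Item Item;
struct Item {
    void *args;
    int (*launcher)(Item *item);
};

/* Returns a heap copy of *src that stays valid after the caller returns.
 * The receiver (e.g. the worker thread) is expected to free() it. */
static Item *item_dup(const Item *src)
{
    Item *copy;

    assert(src != NULL);
    copy = malloc(sizeof *copy);
    if (copy != NULL)
        *copy = *src;           /* struct assignment copies every field */
    return copy;
}

/* Example launcher: reads the int that args points to. */
static int demo_launcher(Item *item)
{
    return *(int *)item->args;
}

/* Builds the item in a local, then hands out a heap copy.
 * Returning &local instead would leave the caller with a dangling pointer. */
static Item *make_item(int *args)
{
    Item local = { args, demo_launcher };
    return item_dup(&local);
}
```

With this shape, ownership of the copy passes to whoever dequeues it, so
the free() belongs after the callback has run.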




While most of the time this works perfectly, I am getting

Xlib: unexpected async reply (sequence 0x46c)!

when I click the GtkDrawingArea too fast. The thread that is created to
process queued items then dies with a segmentation fault.

This is strange, because if I don't touch the function pointers
"launcher" and "callback", it seems to work. I'd then say that's
because the functions referred to by these pointers call Gtk+
functions, but they don't!! They return immediately.

I have removed and added gdk_threads_enter and gdk_threads_leave
calls, but it didn't really increase stability. Sometimes it did, but
then it just crashed after clicking the GtkDrawingArea very rapidly. I
don't want it to crash, ever. No matter what I do or how fast I abuse
the GUI.


I think it's strange that it's caused by the speed of the clicks. As far
as I know, such a click only adds a new pointer to a GList and in a few
cases also starts a new thread. Neither of those is an Xlib function, so
I wonder what is actually calling Xlib, since it's Xlib (and not Gtk+)
that is throwing a warning at me and killing the thread.

As far as I know, there is not a single Xlib call in the whole lifecycle
of the thread at this moment. There will be, protected with
gdk_threads_enter/gdk_threads_leave of course. Once I have the image, I
want to draw it a first time (or let it trigger the expose event). So
then I will indeed need to touch Xlib through Gtk+. But I am not doing
that yet.


I don't think that I need to include other source, but if you want me
to, I can send you some more samples and code snippets in private. This
is not an open-source project, but it's using open-source technologies,
so we cannot post the complete sources here :-), of course.

-- 
Philip Van Hoof, Software Developer @ Cronos
home: me at freax dot org
gnome: pvanhoof at gnome dot org
work: Philip dot VanHoof at Cronos dot Be
http://www.freax.be, http://www.freax.eu.org



