gnome-perl: problems with threading



Hi

This is not really a gnome-specific problem, but it bit me when
using the gnome-perl module, so maybe a warning is in order,
in case anyone else is puzzling over a similar problem. It certainly
applies to Linux ELF glibc2 systems (e.g. Red Hat 5.2), and maybe to
other systems, depending on how they resolve symbols in shared
libraries. It can catch you whenever you load a threaded shared
library dynamically into a non-threaded application such as a
scripting language. The most obvious symptom is programs
failing with an obscure "X IO error" message.

If this is all well known stuff, my apologies!

I eventually figured it out by using a debugging version of glibc2.
From http://www.wainpr.demon.co.uk/threading.html:

The perils of threading...

Or: where you have to use LD_PRELOAD

Here's a thing...

Last night I compiled gnome-perl, which contains a perl module for
interfacing with the GTK widget set. I downloaded slashes.pl, a news
ticker for slashdot, which uses perl and GTK. I started it up and it
immediately crashed with an X I/O error. Hmmm, that's informative.
However, I had come across this one before, when (don't ask) I had
accidentally linked the module with -lpthread (before pthreads were
actually required).  So I looked in the GTK libraries using

    objdump -x /usr/local/gnome/lib/libgtk.so | grep NEEDED

and, sure enough, libpthread was in there.

In a threaded application, procedures which set the error code
variable errno actually store into a thread-specific location which is
a field in the thread structure. Non-threaded applications use a
global variable. In either case the location of errno is obtained by
calling __errno_location(). This function is a weak symbol in libc.so
(returning the global address), which can be overridden by a
definition in libpthread.so (returning the thread-specific address).
The trouble is this: each shared library makes calls through a jump
table in the .plt section of its own address space. The actual
procedure address is resolved the first time it is called by each
library.  Therefore, if a new library is loaded dynamically during
program execution, and that library overrides weak symbols in another,
then different libraries within the program may obtain different
values for their external symbols.
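
As a small illustration of the errno mechanism described above, here is
a minimal sketch in C, assuming a glibc system where errno is a macro
that expands to (*__errno_location()). The function name
errno_goes_through_location is made up for this example, and the
explicit declaration of __errno_location() is only there to make the
glibc-specific dependency visible.

```c
#include <errno.h>

/* glibc-specific: <errno.h> already declares this; it is repeated here
   only to make the assumption explicit. */
extern int *__errno_location(void);

/* Hypothetical helper: returns 1 if a store through the errno macro is
   visible through the pointer returned by __errno_location(). */
int errno_goes_through_location(void)
{
    int *loc = __errno_location();   /* where this caller's errno lives */
    int saved = errno;               /* keep the caller's errno intact */
    int same;

    errno = EAGAIN;                  /* store through the macro */
    same = (*loc == EAGAIN);         /* read back through the pointer */
    errno = saved;
    return same;
}
```

Whichever definition of __errno_location() a library has bound to
decides which storage that library's errno refers to, and that is the
whole problem.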

As far as I recall from my last encounter with the problem, this
catches us as follows: If for some reason the X request queue or event
queue is temporarily blocked, then a read() or write() will set errno
to EAGAIN or EWOULDBLOCK to indicate a temporary, non-fatal
error. These routines are located in libc.so, which is loaded at
program startup, and errno has certainly been referenced early
in program execution, so libc's reference to __errno_location()
has already been resolved; the error code is therefore stored in the
global errno variable. However, the X routine which retrieves the error code will
obtain its definition of __errno_location() from the pthreads library
which has been loaded with the module. It will read the thread-specific
error code (which is 0) and it doesn't know what to do with it (an error
which is not an error!). This is what generates the X I/O error.
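
The mismatch can be modelled with a toy sketch in C. This is only a
simulation of the situation described above, not the real dynamic
linker: the two function pointers stand in for the already-resolved
.plt entries of libc and of the X library, and every name in it
(libc_slot, xlib_slot, fake_blocked_read and so on) is invented for the
illustration.

```c
#include <errno.h>

static int global_errno;   /* stand-in for libc's global errno */
static int thread_errno;   /* stand-in for the field in the thread structure */

static int *global_location(void) { return &global_errno; }
static int *thread_location(void) { return &thread_errno; }

typedef int *(*errno_location_fn)(void);

/* Each "library" has its own resolved jump-table slot. libc bound early
   to the global version; the X library bound after pthreads was loaded. */
static errno_location_fn libc_slot = global_location;
static errno_location_fn xlib_slot = thread_location;

/* libc side: a read() that would block reports EAGAIN. */
static void fake_blocked_read(void) { *libc_slot() = EAGAIN; }

/* What libc actually stored. */
int stored_by_libc(void) { return *libc_slot(); }

/* X side: fetch the error code after the failed read. */
int observed_by_xlib(void)
{
    global_errno = 0;
    thread_errno = 0;
    fake_blocked_read();
    return *xlib_slot();   /* reads the other location, so sees 0 */
}
```

In this model observed_by_xlib() returns 0 even though stored_by_libc()
reports EAGAIN afterwards: the failure was recorded, but in a place the
reader never looks.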

The moral is, loading pthreads in the middle of a program does not
work.  If you have a library linked against pthreads, and you need to
load it dynamically into a non-threaded application (such as perl on
my system), the solution is to preload the pthreads library by setting
the LD_PRELOAD variable in the environment:

    LD_PRELOAD=libpthread.so.0 perl slashes.pl


--
Peter Wainwright
Home: prw@wainpr.demon.co.uk   Work: peter.wainwright@nrpb.org.uk
http://www.wainpr.demon.co.uk
Visit the Opera Exchange Homepage at http://www.treda.co.uk/opex/


