hi all, i went over gtype.c and inserted the required locks to make it thread safe. however, the code is far from simple, there are more than 100 lock/unlock calls now. that's why i'd like to have it reviewed by as many eyes as possible, intrinsic knowledge of the type system isn't strictly neccessary to catch basic locking issues, so please take the time and have a look (especially sebastian). a couple of notes on the locking used by the type system, a few portions of the type nodes may be modified during runtime, such as node->data, node->gdata, node->children and the node's memchunks to alocate instances from. the rest however is fairly constant except for some global variables like the class cache function list and of course the basic pointer arrays for the type branches. since the most frequently called type functions perform pure readouts, i went for a read/write lock, so concurrent reads can be handled simultanously. the type system has to call out for user code from some portions, e.g. for class cache functions, class/instance/interface initializers and finalizers etc. no locks can be held across invocation of those functions, that and the read/write locking lead me to tag all internal (static) functions to expose their locking behaviour. the different types in use are: * LOCKING: * lock handling issues when calling static functions are indicated by * uppercase letter postfixes, all static functions have to have * one of the below postfixes: * - _I: [Indifferent about locking] * function doesn't care about locks at all * - _U: [Unlocked invocation] * no read or write lock has to be held across function invocation * (locks may be acquired and released during invocation though) * - _L: [Locked invocation] * a write lock or more than 0 read locks have to be held across * function invocation * - _W: [Write-locked invocation] * a write lock has to be held across function invokation * - _Wm: [Write-locked invocation, mutatable] * like _W, but the write lock might be released and reacquired * during invocation, watch your pointers external functions, e.g. g_type_is_a() always have to be called with locks released, i.e. _U alike. special care has to be taken when macros are involved, some macros expand into external function calls where no lock may be held, i tried to mark all of those, e.g.: /* G_TYPE_IS_ABSTRACT() is an external call: _U */ if (G_TYPE_IS_ABSTRACT (type)) [...] a static function that for instance initializes a class by calling third-party code, would have either the _U or _Wm postfix. in actuall code, it does use the _Wm postfix because the code calling class/instance initialization functions do need to do additional type system modifications within a write lock, so using _U here would have been ineffective. an extra pitfall is code that acquires a write lock, checks some type system internal conditions and then calls a _Wm function. after that function has returned, the previously checked conditions usually shouldn't have changed, however because _Wm functions have to temporarily release the write lock, another thread could in theory have modified type systems internal state in a way that screwed our conditions. put another way, type system modifications require write locks, but sometimes third-party code has to be called in the middle of those modifications, requirering us to temporarily release the write lock which introduces a race condition where other threads or recursive invocations of type system functions can screw our internal state. in general, this indicates bugs in third-party code, e.g. g_type_class_ref() is being called on the type system, the type system loads that class' implementation through third-party code (g_type_plugin_*) and the third-party code erroneously calls g_type_class_ref() for this same class again. i inserted extra checks to catch these kind of situations, and then i error out with an INVALID_RECURSION() macro that tries to give a helpfull error message. simplified overview: 1) g_type_class_ref (42) { 2) G_WRITE_LOCK (type_rw_lock); node = lookup_type_node_L (42); 3) if (!node->class) { class = g_new (...); 4) G_WRITE_UNLOCK (type_rw_lock); /* call third-party code, unlocked */ 5) g_type_plugin_use (node->plugin); 5) g_type_plugin_complete_type_info (node->plugin, &node->type_info, ...); 5) node->type_info.class_init (class); 6) G_WRITE_LOCK (type_rw_lock); 7) if (node->class) 8) INVALID_RECURSION (...); 9) node->class = class; } node->class_ref_count += 1; G_WRITE_UNLOCK (type_rw_lock); return node->class; } 1) class should be referenced 2) acquire write lock because referencing a class requires modifications 3) if the class doesn't exist yet, initialize it 4) since class initilaization requires third-party code, release the write lock 5) call third-party code 6) reacquire write lock, so we can proceed with node modifiactions 7) recheck internal state, since 5) could have called g_type_class_ref(42), thus causing node->class to be setup already, this is pathologic, so: 8) error out here, shouldn't be triggered 9) finally setup node->class as i said, usually third-party code shouldn't retrigger the code portions that its being invoked from (i.e. g_type_class_ref(42) in the above example). however, another scenario which i'm not 100% sure is erroneous on the user code side but could still trigger INVALID_RECURSION() is this: thread code A: 1) g_type_class_ref(42); A: 2) G_WRITE_LOCK(); // acquired B: 1) g_type_class_ref(42) B: 2) G_WRITE_LOCK(); // waiting, lock held by A A: 4) G_WRITE_UNLOCK(); // released A: 5) call user code B: 2) G_WRITE_LOCK(); // acquired A: 8) G_WRITE_LOCK(); // waiting, lock held by B B: 5) call user code so basically two threads are in the process of setting up the same class here, one of them will modify node->class first and so the other will definitely INVALID_RECURSION() at some point. questionable here is whether two threads may reference the same class at the same time, provided the user knew the class contents are constant once initialized, he might choose to access them without explicit locking (something similar to GDK global lock for instance). so far i choosed to not provide additional paranoia checks to catch such behaviour, additional input is apprechiated here though (basically we'd had to do per-class/instance/type/etc. locking inside the type system which would be overly complex and of questionable benfit). a note on why no write lock and not even a read lock can be held across third-party code invocation, perfectly valid user code could read: gtk_widget_class_init (GtkWidgetClass *class) { GtkStyleClass *style_class = g_type_class_ref (GTK_TYPE_STYLE); [...] } during GtkWidgetClass initialization, another class has to be initialized, here GtkStyleClass. holding a write lock, or just a read lock across gtk_widget_class_init() would prevent this and result in a dead lock. granted, the code is not so easy to review, but if it were a 1-2-3-hack issue, i'd probably not send out an email calling for peer reviews ;) examples for catches even not-type-system-addicted persons can watch out for: G_READ_UNLOCK (type_rw_lock); g_warning ("cannot unreference class of invalid (unclassed) type `%s'", type_descriptive_name_L (class->g_type)); // type_descriptive_name_L requires a lock to be held, indicated by _L G_READ_UNLOCK (type_rw_lock); /* do something locked */ G_READ_UNLOCK (type_rw_lock); // obviously the first line should s/UNLOCK/LOCK/ G_READ_LOCK (type_rw_lock); g_type_name (0); some_function_W (); G_READ_UNLOCK (type_rw_lock); // g_type_name() is an external function, requires no lock being held // some_function_W() requires a WRITE lock to be held G_*_LOCK (type_rw_lock); return XX; // forgot to unlock attached are the new gtype.c (the old one is still in CVS) and a space-indifferent patch for the diff-adicted ;) --- ciaoTJ
Attachment:
gtype.c.MT.gz
Description: MT-safety patched gtype.c
Attachment:
gtype.c.MT.diff.gz
Description: MT-safety diff for gtype.c