plea for review

From: Tim Janik <timj gtk org>
To: Gtk+ Developers <gtk-devel-list gnome org>
Subject: plea for review
Date: Wed, 24 Jan 2001 11:56:58 +0100 (CET)
hi all,

i went over gtype.c and inserted the required locks to make it thread safe.
however, the code is far from simple, there are more than 100 lock/unlock
calls now.
that's why i'd like to have it reviewed by as many eyes as possible, intrinsic
knowledge of the type system isn't strictly neccessary to catch basic locking
issues, so please take the time and have a look (especially sebastian).

a couple of notes on the locking used by the type system, a few portions of
the type nodes may be modified during runtime, such as node->data,
node->gdata, node->children and the node's memchunks to alocate instances
from.
the rest however is fairly constant except for some global variables like
the class cache function list and of course the basic pointer arrays for the
type branches. since the most frequently called type functions perform
pure readouts, i went for a read/write lock, so concurrent reads can
be handled simultanously.
the type system has to call out for user code from some portions, e.g. 
for class cache functions, class/instance/interface initializers and
finalizers etc.
no locks can be held across invocation of those functions, that and the
read/write locking lead me to tag all internal (static) functions to
expose their locking behaviour. the different types in use are:

 * LOCKING:
 * lock handling issues when calling static functions are indicated by
 * uppercase letter postfixes, all static functions have to have
 * one of the below postfixes:
 * - _I:        [Indifferent about locking]
 *   function doesn't care about locks at all
 * - _U:        [Unlocked invocation]
 *   no read or write lock has to be held across function invocation
 *   (locks may be acquired and released during invocation though)
 * - _L:        [Locked invocation]
 *   a write lock or more than 0 read locks have to be held across
 *   function invocation
 * - _W:        [Write-locked invocation]
 *   a write lock has to be held across function invokation
 * - _Wm:       [Write-locked invocation, mutatable]
 *   like _W, but the write lock might be released and reacquired
 *   during invocation, watch your pointers

external functions, e.g. g_type_is_a() always have to be called
with locks released, i.e. _U alike.
special care has to be taken when macros are involved, some macros
expand into external function calls where no lock may be held,
i tried to mark all of those, e.g.:
  /* G_TYPE_IS_ABSTRACT() is an external call: _U */
  if (G_TYPE_IS_ABSTRACT (type))
    [...]

a static function that for instance initializes a class by calling
third-party code, would have either the _U or _Wm postfix.
in actuall code, it does use the _Wm postfix because the code calling
class/instance initialization functions do need to do additional
type system modifications within a write lock, so using _U here
would have been ineffective.

an extra pitfall is code that acquires a write lock, checks some
type system internal conditions and then calls a _Wm function.
after that function has returned, the previously checked conditions
usually shouldn't have changed, however because _Wm functions have
to temporarily release the write lock, another thread could in theory
have modified type systems internal state in a way that screwed our
conditions.
put another way, type system modifications require write locks, but
sometimes third-party code has to be called in the middle of those
modifications, requirering us to temporarily release the write lock
which introduces a race condition where other threads or recursive
invocations of type system functions can screw our internal state.
in general, this indicates bugs in third-party code, e.g.
g_type_class_ref() is being called on the type system, the type
system loads that class' implementation through third-party code
(g_type_plugin_*) and the third-party code erroneously calls
g_type_class_ref() for this same class again.
i inserted extra checks to catch these kind of situations, and
then i error out with an INVALID_RECURSION() macro that tries to
give a helpfull error message.

simplified overview:
 1) g_type_class_ref (42)
    {
 2)   G_WRITE_LOCK (type_rw_lock);
      node = lookup_type_node_L (42);
 3)   if (!node->class)
        {
          class = g_new (...);
 4)       G_WRITE_UNLOCK (type_rw_lock);
          /* call third-party code, unlocked */
 5)       g_type_plugin_use (node->plugin);
 5)       g_type_plugin_complete_type_info (node->plugin, &node->type_info, ...);
 5)       node->type_info.class_init (class);
 6)       G_WRITE_LOCK (type_rw_lock);
 7)       if (node->class)
 8)         INVALID_RECURSION (...);
 9)       node->class = class;
        }
      node->class_ref_count += 1;
      G_WRITE_UNLOCK (type_rw_lock);
      return node->class;
    }

1) class should be referenced
2) acquire write lock because referencing a class requires modifications
3) if the class doesn't exist yet, initialize it
4) since class initilaization requires third-party code, release the write lock
5) call third-party code
6) reacquire write lock, so we can proceed with node modifiactions
7) recheck internal state, since 5) could have called g_type_class_ref(42),
   thus causing node->class to be setup already, this is pathologic, so:
8) error out here, shouldn't be triggered
9) finally setup node->class

as i said, usually third-party code shouldn't retrigger the code portions
that its being invoked from (i.e. g_type_class_ref(42) in the above example).

however, another scenario which i'm not 100% sure is erroneous on the user
code side but could still trigger INVALID_RECURSION() is this:

thread	code
A:	1) g_type_class_ref(42);
A:	2) G_WRITE_LOCK(); // acquired
B:	1) g_type_class_ref(42)
B:	2) G_WRITE_LOCK(); // waiting, lock held by A
A:	4) G_WRITE_UNLOCK(); // released
A:	5) call user code
B:	2) G_WRITE_LOCK(); // acquired
A:	8) G_WRITE_LOCK(); // waiting, lock held by B
B:	5) call user code

so basically two threads are in the process of setting up the same class
here, one of them will modify node->class first and so the other will
definitely INVALID_RECURSION() at some point.
questionable here is whether two threads may reference the same class
at the same time, provided the user knew the class contents are constant
once initialized, he might choose to access them without explicit locking
(something similar to GDK global lock for instance).

so far i choosed to not provide additional paranoia checks to catch such
behaviour, additional input is apprechiated here though (basically we'd
had to do per-class/instance/type/etc. locking inside the type system
which would be overly complex and of questionable benfit).

a note on why no write lock and not even a read lock can be held
across third-party code invocation, perfectly valid user code could
read:

gtk_widget_class_init (GtkWidgetClass *class)
{
  GtkStyleClass *style_class = g_type_class_ref (GTK_TYPE_STYLE);
  [...]
}

during GtkWidgetClass initialization, another class has to be
initialized, here GtkStyleClass. holding a write lock, or just a
read lock across gtk_widget_class_init() would prevent this and
result in a dead lock.

granted, the code is not so easy to review, but if it were a 1-2-3-hack
issue, i'd probably not send out an email calling for peer reviews ;)

examples for catches even not-type-system-addicted persons can watch out for:

  G_READ_UNLOCK (type_rw_lock);
  g_warning ("cannot unreference class of invalid (unclassed) type `%s'",
             type_descriptive_name_L (class->g_type));
// type_descriptive_name_L requires a lock to be held, indicated by _L

  G_READ_UNLOCK (type_rw_lock);
  /* do something locked */
  G_READ_UNLOCK (type_rw_lock);
// obviously the first line should s/UNLOCK/LOCK/

  G_READ_LOCK (type_rw_lock);
  g_type_name (0);
  some_function_W ();
  G_READ_UNLOCK (type_rw_lock);
// g_type_name() is an external function, requires no lock being held
// some_function_W() requires a WRITE lock to be held
 
  G_*_LOCK (type_rw_lock);
  return XX;
// forgot to unlock


attached are the new gtype.c (the old one is still in CVS) and a
space-indifferent patch for the diff-adicted ;)

---
ciaoTJ
Attachment: gtype.c.MT.gz
Description: MT-safety patched gtype.c
Attachment: gtype.c.MT.diff.gz
Description: MT-safety diff for gtype.c
Follow-Ups:
- Re: plea for review
  - From: Alexander Larsson
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]