Re: Exception handling (was: Quickies about the new GObject)

From: Thomas Mailund <mailund mail1 stofanet dk>
To: gtk-devel-list redhat com
Subject: Re: Exception handling (was: Quickies about the new GObject)
Date: 03 Jun 2000 09:24:12 +0200

>>>>> "M" == Michael Beach <mbeach@cisco.com> writes:

 M> Derek Simkowiak wrote:
 >>  I'm just getting into Java, but this sounds to me like Java's
 >> exceptions.  A "throw" is a signal emission of the "exception"
 >> signal.
 >> 

 M> I think you're a little off track here. The key thing (IMHO) about
 M> exception handling mechanisms in languages that have them is that
 M> the binding of handler to exceptions occurs with dynamic scope (as
 M> opposed to lexical scope or some kind of flat global-ish sort of
 M> scope).

If you really want something like try/throw/catch exceptions in C, the
solution is setjmp/longjmp, as has been pointed out.  It's not really
as horrible as the man pages will have you believe, but you do need to
be careful when you use them.  You probably don't want to use them as
the default error handling mechanism, but there are times when the
Right Thing(tm) is to tag the state, try something out, and get the
hell out if things didn't go as planned, and for these kind of
situations, setjmp/longjmp is just the thing.
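
To make the "tag the state, try, get out" idea concrete, here is a
minimal sketch.  The names 'try_risky', 'throw_ex' and 'thrown' are made
up for illustration.  The exception code travels in a variable rather
than in the return value of 'setjmp', because the C standard only
guarantees 'setjmp' to work in a few contexts, such as the whole
condition of an 'if' or a comparison with a constant.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;     /* the tagged state (hypothetical global) */
static int     thrown;  /* code of the exception in flight */

/* "throw": jump back to the point tagged by setjmp(env) */
static void throw_ex(int code)
{
    assert(code != 0);  /* 0 is reserved for "no exception" */
    thrown = code;
    longjmp(env, 1);
}

/* Something that may fail deep inside a computation. */
static void risky(int fail_code)
{
    if (fail_code != 0)
        throw_ex(fail_code);
}

/* Returns 0 if risky() completed, otherwise the code it threw. */
int try_risky(int fail_code)
{
    if (setjmp(env) != 0)   /* non-zero: we got here via longjmp */
        return thrown;      /* "catch" */

    risky(fail_code);       /* "try" */
    return 0;
}
```

Calling 'try_risky(0)' runs straight through and returns 0, while
'try_risky(42)' leaves 'risky' through the jump and returns 42.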

Off the top of my head, there are two things that will give you trouble
if you treat jumps as exceptions.  

 * they are not really recursive
 * they won't help you manage resources

They are not really recursive:
------------------------------

With 'setjmp' you save the stack context.  You put that somewhere
where the call to 'longjmp' can get to it, for example a global
variable or pass it as a parameter to the further calls.  

    STACK
  +---------+           When you call 'longjmp' you return to the
  |         |           place where you set the jump.  If, in the
  |         |           mean time, you've called 'setjmp' again, well,
  +---------+           who cares?  You still return to the context you
  | current |           pass to 'longjmp'.  So if you want to jump to
  |  frame  |           the nearest setjmp, like the try/catch scheme,
  |         |           you need to handle it explicitly.  When you do
  | longjmp +--+        the second 'setjmp' you need to save the old
  |         |  |        context somewhere where you can get to it, then
  +---------+  |        create the new and store that one somewhere
  |         |  |        where 'longjmp' can get to it.
   .........   .        
  |         |  |        If an exception was thrown and you cannot handle
  +---------+  |        it at the second 'setjmp' just make a 'longjmp'
  | second  |  |        on the old context.  If you return successfully
  | setjmp  +? |        from your 'try' block, you need to restore the
  |         |  |        old context, so that other calls still can
  +---------+  |        throw exceptions to the first 'setjmp'.
  |         |  |
   .........   .        The easiest way to handle this is to take the
  |         |  |        Java approach and explicitly mark the functions
  +---------+  |        that can throw exceptions.  You don't
  |  first  |  |        distinguish between different kinds of exceptions;
  | setjmp  +--+        you just throw 'em.  And you tag them by giving
  |         |           them the extra parameter 'jmp_buf env' to throw
   .........            their exception at.

Then the situation above would look like the pseudo-C below.  There's
a function 'foo' with a try/catch block.  It calls the function 'bar'
in the try block, so 'bar' takes a 'jmp_buf' parameter, and 'foo'
passes the 'env' it got from 'setjmp' to 'bar' using this parameter.
The function 'bar' has its own try/catch block, and therefore has its
own 'jmp_buf' variable.  It calls the function 'baz' both inside and
outside the try/catch block.  Inside, 'baz' is called with 'bar's
local 'jmp_buf'; outside, it is called with the 'env' that 'bar' got
from 'foo'.  So if 'baz' throws an exception it jumps to 'bar' if it
was called inside the try/catch block, and jumps to 'foo' if not.

---<code>-------------------------------------------------------------

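#include <setjmp.h>

void foo ()
{
  int ex;
  jmp_buf env;

  ...
  /* try: setjmp returns 0 now, and the thrown code after a longjmp */
  if ( (ex = setjmp(env)) != 0) goto catch;
  
  /* do your stuff */
  bar (env,...); /* call other function, pass 'env' as parameter */

  goto after_catch;
 catch:
  /* handle exceptions */
  switch (ex)
    {
      ...
    }
 after_catch:
  ;  /* a label must be followed by a statement */
}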

void bar (jmp_buf env, ...)
{
  int ex;
  jmp_buf new_env;

  ...
  /* try: setjmp returns 0 now, and the thrown code after a longjmp */
  if ( (ex = setjmp(new_env)) != 0) goto catch;
  
  /* do your stuff */
  baz (new_env,...); /* call other function, pass 'new_env' as parameter */

  goto after_catch;
 catch:
  /* handle exceptions */
  switch (ex)
    {
      ...

      default:
        /* not handled here: propagate exception to the caller's context */
        longjmp (env, ex);
    }
 after_catch:


  /* call 'baz' outside try block.  here we call with the old env */
  baz (env,...);

}

void baz (jmp_buf env, ...)
{
  ...
  /* throw: 'ex' must be non-zero, since setjmp itself returns 0 */
  longjmp (env, ex);
  ...
}

---</code>------------------------------------------------------------

With this scheme, functions that throw exceptions know where to
throw them and know that there is someone to catch them.  The
disadvantage of this scheme is that all functions between the 'try'
and the 'throw' need to pass the 'jmp_buf' along.  In my experience
this means you have to re-write a _lot_ of code to put in a simple
try/throw/catch.  You have the same problem in Java (which is why I
called it the "Java approach") and it annoys me there as well.
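
For reference, here is a compilable variant of the 'foo'/'bar'/'baz'
scheme.  It is only a sketch: the exception codes, the 'thrown' variable
and the 'throw_to' helper are hypothetical, and the 'int code' parameter
just lets the caller decide what 'baz' throws.  The code is passed in a
variable so that 'setjmp' only appears as the condition of an 'if'.

```c
#include <assert.h>
#include <setjmp.h>

enum { EX_MINOR = 1, EX_MAJOR = 2 };  /* hypothetical exception codes */

static int thrown;                    /* code of the exception in flight */

/* throw 'code' to whoever owns 'env' */
static void throw_to(jmp_buf env, int code)
{
    assert(code != 0);                /* 0 means "no exception" */
    thrown = code;
    longjmp(env, 1);
}

/* 'baz' can throw, so it takes the context to throw to. */
static void baz(jmp_buf env, int code)
{
    if (code != 0)
        throw_to(env, code);
}

/* 'bar' handles EX_MINOR itself and propagates everything else. */
static int bar(jmp_buf env, int code)
{
    jmp_buf new_env;

    if (setjmp(new_env) != 0) {       /* catch */
        if (thrown == EX_MINOR)
            return -1;                /* handled locally */
        longjmp(env, 1);              /* propagate to the caller's context */
    }
    baz(new_env, code);               /* try: throws land in 'bar' */
    return 0;
}

/* Returns 0 on success, -1 if 'bar' handled the exception,
   or the code that reached 'foo'. */
int foo(int code)
{
    jmp_buf env;

    if (setjmp(env) != 0)             /* catch */
        return thrown;
    return bar(env, code);            /* try */
}
```

Here 'foo(EX_MINOR)' returns -1 because 'bar' catches the exception,
while 'foo(EX_MAJOR)' returns EX_MAJOR because 'bar' passes it on.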

Another solution, which is SML-like, is to allow any function to throw
an exception, without explicitly stating that it can do so.  In C this
is a bit harder to do, because (1) you need your own 'jmp_buf' stack
and (2) you need to know if someone is going to catch the exception.

Number one is obvious from the discussion above.  Number two is a
consequence of the way 'longjmp' works: when you jump, you jump.  You
jump to the place on the stack where you set the jump point.  If
you've returned from the function that set the point, or there never
was one, you're jumping to garbage which in the best case crashes the
application.  In the worst case you're in for a couple of hours of
exciting debugging.  Of course, the 'jmp_buf' stack will fix both
problems, so problem number two just says that you need a default
handling of un-caught exceptions.  Calling assert will probably do.
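
A rough sketch of such a 'jmp_buf' stack follows.  All names and the
fixed stack size are assumptions made for the example, and the default
handler calls 'abort' after printing a message, which is one of several
reasonable choices.  Note that 'setjmp' has to be called directly in the
function containing the 'try', since a context saved in a helper
function is dead as soon as the helper returns.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_HANDLERS 32                 /* arbitrary limit for this sketch */

static jmp_buf handlers[MAX_HANDLERS];  /* stack of active try contexts */
static int     depth;                   /* number of active handlers */
static int     thrown;                  /* code of the exception in flight */

/* Throw to the innermost handler and pop it, so that a rethrow from
   its catch block reaches the next one out.  If nobody is listening,
   use the default handling: complain and abort. */
void throw_ex(int code)
{
    if (depth == 0) {
        fprintf(stderr, "uncaught exception %d\n", code);
        abort();
    }
    thrown = code;
    longjmp(handlers[--depth], 1);
}

/* Any function may throw; no 'jmp_buf' parameter is needed. */
static void deep(int code)
{
    if (code != 0)
        throw_ex(code);
}

static int inner(int code)
{
    assert(depth < MAX_HANDLERS);
    if (setjmp(handlers[depth]) != 0) { /* catch: already popped */
        if (thrown == 1)
            return -1;                  /* handled here */
        throw_ex(thrown);               /* rethrow to the outer handler */
    }
    depth++;                            /* try: handler is now active */
    deep(code);
    depth--;                            /* normal exit: pop our handler */
    return 0;
}

int outer(int code)
{
    int r;

    assert(depth < MAX_HANDLERS);
    if (setjmp(handlers[depth]) != 0)
        return thrown;
    depth++;
    r = inner(code);
    depth--;
    return r;
}
```

With these definitions 'outer(1)' returns -1 (caught in 'inner') and
'outer(2)' returns 2 (rethrown and caught in 'outer').  Leaving a try
block early with 'return' would skip the 'depth--' and corrupt the
stack, which is one of the ways this scheme is easy to abuse.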

I personally prefer the SML-like exceptions, but be careful with them.
They are much easier to abuse, and if you abuse 'longjmp' you usually
pay the price.  It also gets a bit tricky if you combine these
exceptions with threads.  The exception stack should be private to each
thread, since jumping to a point in another thread's stack is likely to
be confusing...I've never had to combine jumps and threads so I cannot
tell you how it works.


They won't help you manage resources:
-------------------------------------

The recursion is not the big problem.  With the SML-like solution it
becomes almost transparent.  But you really need to be careful with
allocating resources.  When you throw an exception you lose your
references to the resources allocated since the 'setjmp'.  In Java or
SML this is not a problem, since resources are garbage collected.  In
C you need to clean up after yourself.  This means you shouldn't throw
the exception when there are un-freed resources.

With the Java-like solution, you know when you are in the middle of a
call that can throw exceptions and you can behave accordingly: if you
allocate anything you set a jump point so that you will be notified
when an exception is thrown, and in the 'catch' you free the resources
and propagate the exception:

  res = malloc(HUGE_CHUNK_OF_MEM);
  if ( (ex = setjmp(new_env)) != 0) goto catch;
  
  /* do your stuff */

  goto after_catch;
 catch:
  /* handle exceptions */
  switch (ex)
    {
      default:
       /* free resources and propagate */
       free (res);
       longjmp (env, ex);
    }
 after_catch:
  /* do what you want but don't call any function that
     throws exceptions */

Allocating on the stack and copying when you know everything went well
will get you a long way with memory, but for open files and similar
resources the trick above will do.
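
A self-contained sketch of the free-and-propagate trick is given here.
The function names, the 'live_buffers' counter (only there to make a
leak visible) and the exception codes are all invented for the example.
The buffer is allocated before 'setjmp' and its pointer is never
changed afterwards, so the pointer is still valid in the catch block.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

enum { EX_NOMEM = 1 };     /* hypothetical exception code */

int        live_buffers;   /* buffers allocated but not yet freed */
static int thrown;         /* code of the exception in flight */

static void throw_to(jmp_buf env, int code)
{
    assert(code != 0);     /* 0 means "no exception" */
    thrown = code;
    longjmp(env, 1);
}

/* Stands in for any call that may throw. */
static void worker(jmp_buf env, int code)
{
    if (code != 0)
        throw_to(env, code);
}

/* Owns a buffer while calling something that may throw. */
static void middle(jmp_buf env, int code)
{
    jmp_buf new_env;
    char *res = malloc(1024);

    if (res == NULL)
        throw_to(env, EX_NOMEM);
    live_buffers++;

    if (setjmp(new_env) != 0) {   /* catch: clean up, then propagate */
        free(res);
        live_buffers--;
        longjmp(env, 1);
    }
    worker(new_env, code);        /* try */

    free(res);                    /* normal path cleans up as well */
    live_buffers--;
}

/* Returns 0 on success or the code that was thrown. */
int top(int code)
{
    jmp_buf env;

    if (setjmp(env) != 0)
        return thrown;
    middle(env, code);
    return 0;
}
```

Whether 'worker' throws or not, 'live_buffers' is back at zero when
'top' returns.  If 'middle' passed 'env' straight through to 'worker'
instead of installing its own handler, the buffer would leak on a throw.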

With the SML-like solution you can do something similar, but
transparency is lost.  You need to _know_ if an exception can be
thrown.  With the Java solution you have a type check that tells you
if an exception can be thrown; with the SML solution you don't.  This
can lead to leaks.  Since you can call any function between the try
and the throw, this means that _every_ function in principle should
install an exception handler if it allocates resources.  This is not
really acceptable.  You can probably get around some of the problems
with some allocation magic, but I have never had to, so I don't have
any suggestions as to how.


Most of the stuff above could probably be handled with macros and a
few utility functions and become a nifty little library.  If you are
careful with how you use it I would imagine that it would work quite
well, but it will never be "real" exceptions so you shouldn't do magic
with it unless you are _sure_ you know what you are doing.
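
As a taste of what such a library might look like, here is a small
sketch of TRY/CATCH macros on top of a 'jmp_buf' stack.  The macro and
variable names are invented, and there is no overflow check, thread
safety or resource handling, so take it as an illustration rather than
something to ship.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

#define EX_MAX 32                  /* arbitrary nesting limit */

static jmp_buf ex_stack[EX_MAX];   /* active handlers, innermost last */
static int     ex_depth;           /* number of active handlers */
static int     ex_code;            /* code of the exception in flight */

/* Jump to the innermost handler, popping it first. */
static void ex_throw(int code)
{
    assert(code != 0);             /* 0 means "no exception" */
    if (ex_depth == 0)
        abort();                   /* default for uncaught exceptions */
    ex_code = code;
    longjmp(ex_stack[--ex_depth], 1);
}

/* TRY { ... } CATCH { ... } END_TRY
   Never leave the TRY part with return, break or goto: that would
   skip the pop and leave a dead context on the stack. */
#define TRY     if (setjmp(ex_stack[ex_depth]) == 0) { ex_depth++;
#define CATCH   ex_depth--; } else {
#define END_TRY }

/* Returns 0 if nothing was thrown, otherwise the code caught. */
int run(int code)
{
    int result = 0;

    TRY {
        if (code != 0)
            ex_throw(code);
    } CATCH {
        result = ex_code;
    } END_TRY
    return result;
}

/* The inner handler rethrows a modified code to the outer one. */
int run_nested(int code)
{
    int result = 0;

    TRY {
        TRY {
            ex_throw(code);
        } CATCH {
            ex_throw(ex_code + 100);
        } END_TRY
    } CATCH {
        result = ex_code;
    } END_TRY
    return result;
}
```

With these macros 'run(3)' returns 3 and 'run_nested(5)' returns 105.
The restriction on leaving the TRY part early is exactly the kind of
thing that keeps this from being "real" exceptions.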

        /mailund