Re: garbage collection experiment



On the issue of finalization, it may be helpful to know the history
of the matter in Java.

Java, when first shipped, had one mechanism for finalization: you could
define a method called finalize() on an object. If it was present, then
when there were no live references left to the object, the object was
added to a finalization queue, and everything the finalizable object
pointed to was kept alive along with it. In the case of cycles consisting
of finalizable objects, both (or however many) of the objects were put on
the queue simultaneously, and the order in which their finalize methods
were called was unspecified. Once on the finalization queue, the object
was considered alive again, at least temporarily. However, when a
finalizer was called, there was nothing stopping it from creating more
live references to itself or to other finalizable objects. Java's policy
was (and continues to be) that a given object will only ever get added to
the finalization queue once, so if an object managed to still be live
after finalization, it would not be finalized again the next time it
(nearly) became garbage.
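
To make the resurrection problem concrete, here is a minimal sketch of a
finalizer that hands out a new live reference to its own object. The class
and field names (Zombie, resurrected) are made up for illustration, and
note that finalize() is deprecated in modern Java, so a current compiler
will warn about it.

```java
// Hypothetical example class; not taken from any real code base.
final class Zombie {
    // A strong root that the finalizer can write to.
    static volatile Zombie resurrected;

    // Called by the runtime when the object has (nearly) become garbage.
    // finalize() is deprecated in modern Java; the annotation only
    // silences the resulting compiler warnings.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // "I'm not dead yet!": the object stores itself in a static
        // field, so it is strongly reachable again after finalization.
        resurrected = this;
    }
}
```

If the collector ever runs this finalizer, the Zombie is live again, and
because an object is only queued for finalization once, clearing
Zombie.resurrected later will not cause finalize() to run a second time.
When (or whether) the finalizer runs at all is up to the collector.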

I outline the above not because it's a particularly good scheme; quite
the opposite. It causes no end of trouble, and the trouble, when you
boil it down, is attributable to the fact that when finalization happens,
there is still a live reference to the object being finalized. Everything
else stems from that: a live reference means it can hand out more live
references; a live reference means that cycles are possible; etc. That is,
all sorts of weird edge cases arise because, just when the gc by normal
rules ought to be dropping an object, the object instead gets one more
chance to say "I'm not dead yet!" and comedic hijinks ensue.

Thankfully, because enough developers whinged and Sun saw the light, there
is now a viable, useful, powerful, and straightforward way to do
finalization in Java. The key insight is that you split your
object-to-be-finalized into two facets. One facet you hand out publicly,
and the other you keep privately in the class (e.g. as a static variable).
The private facet contains all the information needed for finalization
(the canonical example: it has the file descriptor that needs to be
closed), and it's pointed to in exactly two places: from the public facet,
and from a table internal to the class itself. The private facet points
back at the public facet, but only via a weak reference (that is, a
reference that does not prevent gc from happening, and through which one
can observe when its referent becomes garbage), so the fact that the
private facet is in a strongly held table in the class doesn't keep the
public facets from being gc'ed. Every so often (or as it happens: it can
be a polling or queueing type system; Java supports both patterns), one
runs the finalization code for all the private facets whose public facet
has become garbage.
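
A minimal sketch of the two-facet split, using java.lang.ref.WeakReference
and java.lang.ref.ReferenceQueue from the standard library. Everything
else here is an assumption made for illustration: the names Resource,
Facet, TABLE, drain() and close(), the use of a plain int as a pretend
file descriptor, and the RELEASED list that stands in for actually closing
anything.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Public facet: the only object callers ever see.
final class Resource {

    // Private facet: holds just what cleanup needs (here a pretend file
    // descriptor) and points back at the public facet only weakly.
    private static final class Facet extends WeakReference<Resource> {
        final int fd;

        Facet(Resource owner, int fd) {
            super(owner, QUEUE);   // enqueued once the owner is garbage
            this.fd = fd;
        }
    }

    // The collector puts a Facet here after its Resource has been collected.
    private static final ReferenceQueue<Resource> QUEUE = new ReferenceQueue<>();
    // Strongly held table: keeps private facets alive until they are released.
    private static final Set<Facet> TABLE = new HashSet<>();
    // Stand-in for real cleanup: records which descriptors were "closed".
    private static final List<Integer> RELEASED = new ArrayList<>();

    private final Facet facet;   // the public facet's pointer to its private facet

    Resource(int fd) {
        facet = new Facet(this, fd);
        synchronized (TABLE) { TABLE.add(facet); }
    }

    // Explicit, eager cleanup for callers that know they are done.
    void close() { release(facet); }

    // Polling style: release every facet whose public facet is already
    // garbage. Returns how many were released by this call.
    static int drain() {
        int n = 0;
        Facet f;
        while ((f = (Facet) QUEUE.poll()) != null) {
            if (release(f)) n++;
        }
        return n;
    }

    // Runs the cleanup at most once per facet.
    private static boolean release(Facet f) {
        synchronized (TABLE) {
            if (!TABLE.remove(f)) return false;   // already released
            RELEASED.add(f.fd);                   // a real class would close fd here
            return true;
        }
    }

    static int tracked() { synchronized (TABLE) { return TABLE.size(); } }

    static List<Integer> released() {
        synchronized (TABLE) { return new ArrayList<>(RELEASED); }
    }
}
```

The cleanup code only ever sees the Facet, never the Resource, so it has
no way to bring the public object back to life. Calling drain() every so
often is the polling pattern; for the queueing pattern, a background
thread could instead block on the queue's remove() method and release
each facet as it arrives. When a given Resource actually shows up on the
queue is up to the collector.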

Since the private facets are only ever pointed to (privately) by the
public facets and by a (private) internal table, and because the private
facet only ever points weakly at the public facet, there is no possibility
of having a finalization cycle; all finalization will end up having a
well-defined (partial) order. This is true even if there are cycles in
the public facets.

The typical bug scenario is this: if A's private facet depends on (has a
strong reference to) B's public facet, and B's private facet depends on A's
public facet, then neither A nor B will ever become garbage, each being
strongly pointed to by the other's private facet, which is in turn held by
the strongly held table. In practice, though, I have never actually seen
this scenario arise, whereas in the original Java scheme I saw a number of
cases of the equivalent circularity (because there was no clean separation
between the minimal bit needed for finalization and the rest of the
object).
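
The leak described above can be reproduced in a few lines. This is only a
sketch: Handle, Facet, CycleDemo and the dependsOn field are hypothetical
names, and the public facet's own pointer to its private facet is left out
because it plays no part in the leak.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Public facet (hypothetical).
final class Handle {
    final String name;
    Handle(String name) { this.name = name; }
}

// Private facet: weak link to its own public facet, plus the mistake,
// a strong reference to some other object's public facet.
final class Facet extends WeakReference<Handle> {
    Handle dependsOn;
    Facet(Handle owner) { super(owner); }
}

final class CycleDemo {
    // Strongly held table of private facets.
    static final List<Facet> TABLE = new ArrayList<>();

    static Facet register(Handle h) {
        Facet f = new Facet(h);
        TABLE.add(f);
        return f;
    }

    // Builds the A/B cross-dependency, drops the caller's references,
    // and reports whether both public facets are still reachable.
    static boolean bothStillReachable() {
        Handle a = new Handle("A");
        Handle b = new Handle("B");
        Facet fa = register(a);
        Facet fb = register(b);
        fa.dependsOn = b;   // A's private facet -> B's public facet
        fb.dependsOn = a;   // B's private facet -> A's public facet
        a = null;
        b = null;
        System.gc();        // only a hint, but the result does not depend on it
        // Each Handle is strongly reachable via TABLE -> other Facet ->
        // dependsOn, so neither weak reference can ever be cleared.
        return fa.get() != null && fb.get() != null;
    }
}
```

Because the table holds both private facets strongly, and each private
facet holds the other's public facet strongly, no amount of collecting
will clear either weak reference. The fix is to keep strong references to
public facets out of the private facet entirely.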

Having used both systems pretty extensively at this point, I'd advocate
strongly against pre-mortem finalization (the original Java scheme) and
strongly for post-mortem finalization (the current, recommended Java
scheme). It does depend on gc support for weak references, and I don't know
if the collector you're now using has that or not. If not, it very well may
be worth the effort to add support for weak references to it.

-dan



