Audit your code for dangling timeout event source IDs!

Two recent Rhythmbox bugs (<> and <>) revealed a fairly easy error one can make with timeout event sources. If the bug appears twice in Rhythmbox, it likely appears in other code as well. Developers would be well advised to audit their own code to ensure that they are not making the same mistake.

The mistake is to keep around a timeout event source ID after the corresponding timeout callback has returned FALSE. When the callback returns FALSE, the timeout event source is implicitly destroyed. That means that this event source's ID number is no longer valid. Keeping it around is the ID number equivalent of having a dangling pointer.

What we see in Rhythmbox is that the event source ID is being retained in a private field of an object even after the callback has returned FALSE. Later on, that object might decide to destroy the event source by calling g_source_remove() on this stale ID. If the ID number has been reassigned to some other event source, that other source will be prematurely destroyed. In the case of Rhythmbox, this in turn appears to result in memory corruption and a subsequent crash.

So if your code uses GLib timeouts, take a few minutes to see what you're doing with those event source IDs. You should never keep an ID around after the timeout callback returns FALSE.

As an aside, the methodology by which this bug was found is quite unusual. It was discovered as part of the Cooperative Bug Isolation Project: <>. This is a research project at UC Berkeley and Stanford University that tries to find bugs by identifying statistically significant differences in program behavior between good and bad runs. See <> for slightly more information on what we found in Rhythmbox, or <> for all the gory details.
