GLib RFC: Improve checking provided with --enable-mem-check



Hi,

I recently tried to use a glib compiled with --enable-mem-check to debug gnucash, but instead ran into a 'block freed x times' message (from gtkcalendar.c). The count x was implausibly large, suggesting that either

  1. The pointer being freed had never been allocated or
  2. The memory had been overwritten by something else after previously
     being freed.

The current implementation cannot distinguish between these two possibilities, and is incapable of detecting a few other kinds of problem.

I am proposing to make some additions to g_malloc, g_free and related functions (g_malloc0, g_realloc) to make the debugging memory manager more robust to misuse. This will improve its utility in bug-hunting.

I have previously written a debugging memory manager for C++, which works by overloading operator new and operator delete. It acquires its memory blocks from the malloc/free interface, in much the same way as g_malloc/g_free does today.

My implementation adds two pointers (8 bytes on a 32-bit architecture) to each block, which are used to form a doubly linked list. When a block is allocated, it is entered into a hash table whose buckets overflow into these doubly linked lists. When a pointer is given to g_free, I propose to verify it by looking it up in the hash table. This makes it possible to distinguish frees of invalid pointers from multiple frees.
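
A minimal sketch of the lookup described above. All names here (dbg_malloc, dbg_free, DbgHeader, DBG_BUCKETS) are hypothetical and are not existing GLib functions; the table size and hash are arbitrary choices. It assumes the two-pointer header keeps the user data suitably aligned, which a real version would have to guarantee by padding the header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DBG_BUCKETS 1024u   /* hypothetical table size */

/* Header placed in front of every user block: the two link pointers. */
typedef struct DbgHeader {
    struct DbgHeader *prev;
    struct DbgHeader *next;
} DbgHeader;

typedef enum { DBG_FREE_OK, DBG_FREE_UNKNOWN_POINTER } DbgFreeResult;

/* Each bucket is the head of a doubly linked chain of live blocks. */
static DbgHeader *dbg_table[DBG_BUCKETS];

static size_t dbg_bucket(const void *user)
{
    /* Drop the low bits, which carry little information due to alignment. */
    return (size_t)(((uintptr_t)user >> 4) % DBG_BUCKETS);
}

void *dbg_malloc(size_t size)
{
    DbgHeader *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;

    void *user = h + 1;            /* user data starts right after the header */
    size_t b = dbg_bucket(user);

    h->prev = NULL;                /* push onto the front of the chain */
    h->next = dbg_table[b];
    if (h->next != NULL)
        h->next->prev = h;
    dbg_table[b] = h;
    return user;
}

/* Only release pointers that are found in the table. The header of an
 * unknown pointer is never touched, so a bogus pointer cannot crash us. */
DbgFreeResult dbg_free(void *user)
{
    size_t b = dbg_bucket(user);

    for (DbgHeader *h = dbg_table[b]; h != NULL; h = h->next) {
        if ((void *)(h + 1) != user)
            continue;
        if (h->prev != NULL)
            h->prev->next = h->next;
        else
            dbg_table[b] = h->next;
        if (h->next != NULL)
            h->next->prev = h->prev;
        free(h);
        return DBG_FREE_OK;
    }
    return DBG_FREE_UNKNOWN_POINTER;
}
```

On its own this table only reports that a pointer is not currently allocated. Telling a second free apart from a pointer that was never allocated additionally needs a record of the blocks that have already been freed.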

I also propose to fill freed blocks with a non-zero value, such as 0xDEADBEEF, because zero-filling hides errors. If a freed block filled with some other value is used again, any pointer read from it will be diagnosed as invalid (bus errors, frees of invalid pointers). Zero-filling leads to a segmentation fault if such a pointer is dereferenced, but generates no error if it is passed to g_free.
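
The fill itself could look roughly like the sketch below. The function names are hypothetical, and the byte order of the pattern (0xDE first) is an arbitrary choice made so that the fill also works for sizes that are not a multiple of four.

```c
#include <assert.h>
#include <stddef.h>

/* 0xDEADBEEF written one byte at a time, most significant byte first. */
static const unsigned char dbg_pattern[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Overwrite a block that is being retired with the repeating pattern. */
void dbg_poison(void *block, size_t size)
{
    unsigned char *p = block;
    for (size_t i = 0; i < size; i++)
        p[i] = dbg_pattern[i % 4];
}

/* Return 1 if the block still holds the pattern, or 0 if something has
 * written to it since it was poisoned. */
int dbg_poison_intact(const void *block, size_t size)
{
    const unsigned char *p = block;
    for (size_t i = 0; i < size; i++)
        if (p[i] != dbg_pattern[i % 4])
            return 0;
    return 1;
}
```

The second function is what a later heap check would use to notice a write through a stale pointer.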

It is also possible to detect write overruns by adding a magic check number at the beginning and end of each block, at the cost of 8 more bytes per allocation. This method does not detect reads past the end of an allocated block. The check would be carried out when the block is freed.
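
A sketch of such guard words, under some assumptions that are not part of the proposal: the names are hypothetical, each guard is one size_t wide (so the two guards cost 8 bytes on a 32-bit machine), and the header also stores the block size so that the back guard can be located.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Guard value with no zero bytes; truncated to 0xABADCAFE on 32-bit. */
#define GUARD_MAGIC ((size_t)0xABADCAFEABADCAFEull)

enum { GUARD_OK = 0, GUARD_FRONT_DAMAGED = 1, GUARD_BACK_DAMAGED = 2 };

typedef struct {
    size_t size;    /* user-visible size, needed to find the back guard */
    size_t front;   /* guard word directly before the user data */
} GuardHeader;

void *guard_malloc(size_t size)
{
    GuardHeader *h = malloc(sizeof *h + size + sizeof(size_t));
    size_t magic = GUARD_MAGIC;

    if (h == NULL)
        return NULL;
    h->size = size;
    h->front = GUARD_MAGIC;
    /* The back guard may be unaligned, so copy it byte-wise. */
    memcpy((unsigned char *)(h + 1) + size, &magic, sizeof magic);
    return h + 1;
}

/* Return a bitmask of GUARD_FRONT_DAMAGED and GUARD_BACK_DAMAGED. */
int guard_check(const void *user)
{
    const GuardHeader *h = (const GuardHeader *)user - 1;
    size_t back;
    int result = GUARD_OK;

    if (h->front != GUARD_MAGIC)
        result |= GUARD_FRONT_DAMAGED;
    memcpy(&back, (const unsigned char *)user + h->size, sizeof back);
    if (back != GUARD_MAGIC)
        result |= GUARD_BACK_DAMAGED;
    return result;
}

/* Check the guards at free time, as proposed, then release the block. */
int guard_free(void *user)
{
    int result = guard_check(user);
    free((GuardHeader *)user - 1);
    return result;
}
```

A caller would report the result of guard_free; a write of even one byte past either end of the user data changes a guard word and is caught when the block is freed.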

The current implementation never releases allocated blocks, so as to detect if they are ever freed again, and also to help detect usage of the freed block. It is possible to retain this behaviour by removing the newly freed block from the hash table of allocated blocks, and to enter it into a doubly linked list of freed blocks. If malloc runs out of memory, it is possible to de-allocate some of the blocks on the freed-blocks list until the request is satisfied or there are none left. I consider that it's more important to continue running the program than to hold on to all the blocks ever freed in the hope of detecting an error.
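
The freed-block list and the reclaim-on-failure policy could be sketched as follows. The q_ names are hypothetical, and giving back the oldest block first is an assumption; the proposal does not say which blocks should go first.

```c
#include <assert.h>
#include <stdlib.h>

/* Link header in front of each block; used once the block is retired. */
typedef struct FreedBlock {
    struct FreedBlock *prev;   /* towards more recently freed blocks */
    struct FreedBlock *next;   /* towards older blocks */
} FreedBlock;

static FreedBlock *freed_newest;
static FreedBlock *freed_oldest;

/* Really release the oldest retained block. Returns 0 if none is left. */
int q_release_oldest(void)
{
    FreedBlock *h = freed_oldest;

    if (h == NULL)
        return 0;
    freed_oldest = h->prev;
    if (freed_oldest != NULL)
        freed_oldest->next = NULL;
    else
        freed_newest = NULL;
    free(h);
    return 1;
}

/* Allocate; on failure give back retained blocks until malloc succeeds
 * or the freed list is empty. */
void *q_malloc(size_t size)
{
    FreedBlock *h;

    while ((h = malloc(sizeof *h + size)) == NULL)
        if (!q_release_oldest())
            return NULL;
    h->prev = h->next = NULL;
    return h + 1;
}

/* Retire a block: keep the memory and put it on the freed list. */
void q_free(void *user)
{
    FreedBlock *h = (FreedBlock *)user - 1;

    h->prev = NULL;
    h->next = freed_newest;
    if (freed_newest != NULL)
        freed_newest->prev = h;
    else
        freed_oldest = h;
    freed_newest = h;
}

/* A pointer found here has already been freed once: a multiple free. */
int q_is_freed(const void *user)
{
    for (const FreedBlock *h = freed_newest; h != NULL; h = h->next)
        if ((const void *)(h + 1) == user)
            return 1;
    return 0;
}
```

Once a block has really been released it can no longer be recognised as a multiple free, which is the trade-off accepted above in exchange for keeping the program running.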

The addition of this information allows a more meaningful 'check_heap' operation, which can actually walk the heap and check that the doubly linked list structure is intact, and that the magic numbers guarding the front and back of each block are intact. It would also check that the blocks of memory in the freed-block list have not been overwritten. I would like to be able to invoke this operation from the debugger, but I am not yet aware of how to do so.
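
A sketch of the walk over live blocks only, with hypothetical names and a single list standing in for the hash table chains. It assumes guard words one size_t wide, and it leaves out the freed-block list for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEAP_MAGIC ((size_t)0xABADCAFEABADCAFEull)

typedef struct Block {
    struct Block *prev;
    struct Block *next;
    size_t size;     /* user-visible size, locates the back guard */
    size_t front;    /* guard word directly before the user data */
} Block;

static Block *live_head;   /* all live blocks, newest first */

void *h_malloc(size_t size)
{
    Block *b = malloc(sizeof *b + size + sizeof(size_t));
    size_t magic = HEAP_MAGIC;

    if (b == NULL)
        return NULL;
    b->size = size;
    b->front = HEAP_MAGIC;
    memcpy((unsigned char *)(b + 1) + size, &magic, sizeof magic);
    b->prev = NULL;
    b->next = live_head;
    if (live_head != NULL)
        live_head->prev = b;
    live_head = b;
    return b + 1;
}

/* Walk every live block and return how many are damaged: a broken back
 * link, or a front or back guard that no longer holds the magic value. */
int dbg_check_heap(void)
{
    int bad = 0;
    const Block *expected_prev = NULL;

    for (const Block *b = live_head; b != NULL; b = b->next) {
        size_t back;

        memcpy(&back, (const unsigned char *)(b + 1) + b->size, sizeof back);
        if (b->prev != expected_prev ||
            b->front != HEAP_MAGIC ||
            back != HEAP_MAGIC)
            bad++;
        expected_prev = b;
    }
    return bad;
}
```

Because the function takes no arguments, gdb's `call` command (for example `call dbg_check_heap()` while the program is stopped) would probably be enough to run it from the debugger. A walk like this trusts the next pointers and the stored size, so a badly damaged header can still make the check itself crash.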

The proposed modification will consume more resources than the current implementation, but will also diagnose more errors. It will not consume the enormous amount of resources that Electric Fence does by allocating at least two VM pages per allocation. (This makes Electric Fence unsuitable for use with large programs such as gnucash - the machine runs out of resources before the program has finished initialising.)

Remember that the overhead is only incurred if you compile with --enable-mem-check. This option could be expanded to take no/minimal/yes, similarly to the current --enable-debug option: 'minimal' would retain the current implementation, and 'yes' would select my new implementation.

The techniques that I developed for writing my memory manager were used to test student assignments for errors, and were verified to uncover a wide variety of memory errors.

I propose to leave the --enable-mem-profile code alone and to make my enhancements cooperate with it where they are used together.

Comments are solicited, bearing in mind that this is a proposal for a debugging memory manager, not the normal one, and that I am aiming for robustness in the face of programmer error over efficiency. That being said, if you know of ways to achieve the same goal using fewer bytes and achieving higher performance, please describe them.

I am volunteering to implement this proposal.

I have currently only looked at gmem.c in glib-1.2.9. Is there any more recent version that I should know about?

Ben.





