Heap Analysis: Making Memory Errors a Thing of the Past

Introduction

If you develop a program that dynamically allocates memory, you're also responsible for tracking any memory that you allocate whenever a task is performed, and for releasing that memory when it's no longer required. If you fail to track the memory correctly you may introduce "memory leaks," or unintentionally write to an area outside of the memory space.

Conventional debugging techniques usually prove to be ineffective for locating the source of corruption or leak because memory-related errors typically manifest themselves in an unrelated part of the program. Tracking down an error in a multithreaded environment becomes even more complicated because the threads all share the same memory address space.

In this chapter, we'll introduce you to a special version of our memory management functions that'll help you to diagnose your memory management problems.

Dynamic memory management

In a program, you'll dynamically request memory buffers or blocks of a particular size from the runtime environment using malloc(), realloc(), or calloc(), and then you'll release them back to the runtime environment when they're no longer required using free().

The memory allocator ensures that your requests are satisfied by managing a region of the program's memory area known as the heap. In this heap, it tracks all of the information -- such as the size of the original block -- about the blocks and heap buffers that it has allocated to your program, in order that it can make the memory available to you during subsequent allocation requests. When a block is released, it places it on a list of available blocks called a free list. It usually keeps the information about a block in the header that precedes the block itself in memory.

The runtime environment grows the size of the heap when it no longer has enough memory available to satisfy allocation requests, and it returns memory from the heap to the system when the program releases memory.

Heap corruption

Heap corruption occurs when a program damages the allocator's view of the heap. The outcome can be relatively benign and cause a memory leak (where some memory isn't returned to the heap and is inaccessible to the program afterwards), or it may be fatal and cause a memory fault, usually within the allocator itself. A memory fault typically occurs within the allocator when it manipulates one or more of its free lists after the heap has been corrupted.

It's especially difficult to identify the source of corruption when the source of the fault is located in another part of the code base. This is likely to happen if the fault occurs when:

Contiguous memory blocks

When contiguous blocks are used, a program that writes outside of the bounds can corrupt the allocator's information about the block of memory it's using, as well as, the allocator's view of the heap. The view may include a block of memory that's before or after the block being used, and it may or may not be allocated. In this case, a fault in the allocator will likely occur during an unrelated allocation or release attempt.

Multithreaded programs

Multithreaded execution may cause a fault to occur in a different thread from the thread that actually corrupted the heap, because threads interleave requests to allocate or release memory.

When the source of corruption is located in another part of the code base, conventional debugging techniques usually prove to be ineffective. Conventional debugging typically applies breakpoints -- such as stopping the program from executing -- to narrow down the offending section of code. While this may be effective for single-threaded programs, it's often unyielding for multithreaded execution because the fault may occur at an unpredictable time and the act of debugging the program may influence the appearance of the fault by altering the way that thread execution occurs. Even when the source of the error has been narrowed down, there may be a substantial amount of manipulation performed on the block before it's released, particularly for long-lived heap buffers.

Allocation strategy

A program that works in a particular memory allocation strategy may abort when the allocation strategy is changed in a minor way. A good example of this would be a memory overrun condition (for more information see "Overrun and underrun errors," below) where the allocator is free to return blocks that are larger than requested in order to satisfy allocation requests. Under this circumstance, the program may behave normally in the presence of overrun conditions. But a simple change, such as changing the size of the block requested, may result in the allocation of a block of the exact size requested, resulting in a fatal error for the offending program.

Fatal errors may also occur if the allocator is configured slightly differently, or if the allocator policy is changed in a subsequent release of the runtime library. This makes it all the more important to detect errors early in the life cycle of an application, even if it doesn't exhibit fatal errors in the testing phase.

Common sources

Some of the most common sources of heap corruption include:

Even the most robust allocator can occasionally fall prey to the above problems.

Let's take a look at the last three bullets in more detail:

Overrun and underrun errors

Overrun and underrun errors occur when your program writes outside of the bounds of the allocated block. They're one of the most difficult type of heap corruption to track down, and usually the most fatal to program execution.

Overrun errors occur when the program writes past the end of the allocated block. Frequently this causes corruption in the next contiguous block in the heap, whether or not it's allocated. When this occurs, the behavior that's observed varies depending on whether that block is allocated or free, and whether it's associated with a part of the program related to the source of the error. When a neighboring block that's allocated becomes corrupted, the corruption is usually apparent when that block is released elsewhere in the program. When an unallocated block becomes corrupted, a fatal error will usually result during a subsequent allocation request. Although this may well be the next allocation request, it's actually dependent on a complex set of conditions that could result in a fault at a much later point in time, in a completely unrelated section of the program, especially when small blocks of memory are involved.

Underrun errors occur when the program writes before the start of the allocated block. Often they corrupt the header of the block itself, and sometimes, the preceding block in memory. Underrun errors usually result in a fault that occurs when the program attempts to release a corrupted block.

Releasing memory

Requests to release memory requires your program to track the pointer for the allocated block and pass it to the free() function. If the pointer is stale, or if it doesn't point to the exact start of the allocated block, it may result in heap corruption.

A pointer is stale when it refers to a block of memory that's already been released. A duplicate request to free() involves passing free() a stale pointer -- there's no way to know whether this pointer refers to unallocated memory, or to memory that's been used to satisfy an allocation request in another part of the program.

Passing a stale pointer to free() may result in a fault in the allocator, or worse, it may release a block that's been used to satisfy another allocation request. If this happens, the code making the allocation request may compete with another section of code that subsequently allocated the same region of heap, resulting in corrupted data for one or both. The most effective way to avoid this error is to NULL out pointers when the block is released, but this is uncommon, and difficult to do when pointers are aliased in any way.

A second common source of errors is to attempt to release an interior pointer (i.e. one that's somewhere inside the allocated block rather than at the beginning). This isn't a legal operation, but it may occur when the pointer has been used in conjunction with pointer arithmetic. The result of providing an interior pointer is highly dependent on the allocator and is largely unpredictable, but it frequently results in a fault in the free() call.

A more rare source of errors is to pass an uninitialized pointer to free(). If the uninitialized pointer is an automatic (stack) variable, it may point to a heap buffer, causing the types of coherency problems described for duplicate free() requests above. If the pointer contains some other nonNULL value, it may cause a fault in the allocator.

Using uninitialized or stale pointers

If you use uninitialized or stale pointers, you might corrupt the data in a heap buffer that's allocated to another part of the program, or see memory overrun or underrun errors.

Detecting and reporting errors

The primary goal for detecting heap corruption problems is to correctly identify the source of the error, rather than getting a fault in the allocator at some later point in time.

A first step to achieving this goal is to create an allocator that's able to determine whether the heap was corrupted on every entry into the allocator, whether it's for an allocation request or for a release request. For example, on a release request, the allocator should be capable of determining whether:

To achieve this goal, we'll use a replacement library for the allocator that can keep additional block information in the header of every heap buffer. This library may be used during the testing of the application to help isolate any heap corruption problems. When a source of heap corruption is detected by this allocator, it can print an error message indicating:

The library technique can be refined to also detect some of the sources of errors that may still elude detection, such as memory overrun or underrun errors, that occur before the corruption is detected by the allocator. This may be done when the standard libraries are the vehicle for the heap corruption, such as an errant call to memcpy(), for example. In this case, the standard memory manipulation functions and string functions can be replaced with versions that make use of the information in the debugging allocator library to determine if their arguments reside in the heap, and whether they would cause the bounds of the heap buffer to be exceeded. Under these conditions, the function can then call the error reporting functions to provide information about the source of the error.

Using the malloc debug library

The malloc debug library provides the capabilities described in the above section. It's available when you link to either the normal memory allocator library, or to the debug library:

To access: Link using this option:
Nondebug library -lmalloc
Debug library -lmalloc_g

If you use the debug library, you must also include:

/usr/lib/malloc_g


Note:

This library was updated in the QNX Momentics 6.3.0 SP2.


as the first entry of your $LD_LIBRARY_PATH environment variable before running your application.

Another way to use the debug malloc library is to use the LD_PRELOAD capability to the dynamic loader. The LD_PRELOAD environment variable lets you specify libraries to load prior to any other library in the system. In this case, set the LD_PRELOAD variable to point to the location of the debug malloc library (or the nondebug one as the case may be), by saying:

LD_PRELOAD=/usr/lib/malloc_g/libmalloc.so.2

or:

LD_PRELOAD=/usr/lib/libmalloc.so.2

Note: In this chapter, all references to the malloc library refer to the debug version, unless otherwise specified.

Both versions of the library share the same internal shared object name, so it's actually possible to link against the nondebug library and test using the debug library when you run your application. To do this, you must change the $LD_LIBRARY_PATH as indicated above.

The nondebug library doesn't perform heap checking; it provides the same memory allocator as the system library.

By default, the malloc library provides a minimal level of checking. When an allocation or release request is performed, the library checks only the immediate block under consideration and its neighbors, looking for sources of heap corruption.

Additional checking and more informative error reporting can be done by using additional calls provided by the malloc library. The mallopt() function provides control over the types of checking performed by the library. There are also debug versions of each of the allocation and release routines that you can use to provide both file and line information during error-reporting. In addition to reporting the file and line information about the caller when an error is detected, the error-reporting mechanism prints out the file and line information that was associated with the allocation of the offending heap buffer.

To control the use of the malloc library and obtain the correct prototypes for all the entry points into it, it's necessary to include a different header file for the library. This header file is included in <malloc_g/malloc.h>. If you want to use any of the functions defined in this header file, other than mallopt(), make sure that you link your application with the debug library. If you forget, you'll get undefined references during the link.

The recommended practice for using the library is to always use the library for debug variants in builds. In this case, the macro used to identify the debug variant in C code should trigger the inclusion of the <malloc_g/malloc.h> header file, and the malloc debug library option should always be added to the link command. In addition, you may want to follow the practice of always adding an exit handler that provides a dump of leaked memory, and initialization code that turns on a reasonable level of checking for the debug variant of the program.

The malloc library achieves what it needs to do by keeping additional information in the header of each heap buffer. The header information includes additional storage for keeping doubly-linked lists of all allocated blocks, file, line and other debug information, flags and a CRC of the header. The allocation policies and configuration are identical to the normal system memory allocation routines except for the additional internal overhead imposed by the malloc library. This allows the malloc library to perform checks without altering the size of blocks requested by the program. Such manipulation could result in an alteration of the behavior of the program with respect to the allocator, yielding different results when linked against the malloc library.

All allocated blocks are integrated into a number of allocation chains associated with allocated regions of memory kept by the allocator in arenas or blocks. The malloc library has intimate knowledge about the internal structures of the allocator, allowing it to use short cuts to find the correct heap buffer associated with any pointer, resorting to a lookup on the appropriate allocation chain only when necessary. This minimizes the performance penalty associated with validating pointers, but it's still significant.

The time and space overheads imposed by the malloc library are too great to make it suitable for use as a production library, but are manageable enough to allow them to be used during the test phase of development and during program maintenance.

What's checked?

As indicated above, the malloc library provides a minimal level of checking by default. This includes a check of the integrity of the allocation chain at the point of the local heap buffer on every allocation request. In addition, the flags and CRC of the header are checked for integrity. When the library can locate the neighboring heap buffers, it also checks their integrity. There are also checks specific to each type of allocation request that are done. Call-specific checks are described according to the type of call below.

You can enable additional checks by using the mallopt() call. For more information on the types of checking, and the sources of heap corruption that can be detected, see of "Controlling the level of checking," below.

Allocating memory

When a heap buffer is allocated using any of the heap-allocation routines, the heap buffer is added to the allocation chain for the arena or block within the heap that the heap buffer was allocated from. At this time, any problems detected in the allocation chain for the arena or block are reported. After successfully inserting the allocated buffer in the allocation chain, the previous and next buffers in the chain are also checked for consistency.

Reallocating memory

When an attempt is made to resize a buffer through a call to the realloc() function, the pointer is checked for validity if it's a non-NULL value. If it's valid, the header of the heap buffer is checked for consistency. If the buffer is large enough to satisfy the request, the buffer header is modified, and the call returns. If a new buffer is required to satisfy the request, memory allocation is performed to obtain a new buffer large enough to satisfy the request with the same consistency checks being applied as in the case of memory allocation described above. The original buffer is then released.

If fill-area boundary checking is enabled (described in the "Manual Checking" section) the guard code checks are also performed on the allocated buffer before it's actually resized. If a new buffer is used, the guard code checks are done just before releasing the old buffer.

Releasing memory

This includes, but isn't limited to, checking to ensure that the pointer provided to a free() request is correct and points to an allocated heap buffer. Guard code checks may also be performed on release operations to allow fill-area boundary checking.

Controlling the level of checking

The mallopt() function call allows extra checks to be enabled within the library. The call to mallopt() requires that the application is aware that the additional checks are programmatically enabled. The other way to enable the various levels of checking is to use environment variables for each of the mallopt() options. Using environment variables lets the user specify options that will be enabled from the time the program runs, as opposed to only when the code that triggers these options to be enabled (i.e. the mallopt() call) is reached. For certain programs that perform a lot of allocations before main(), setting options using mallopt() calls from main() or after that may be too late. In such cases it is better to use environment variables.

The prototype of mallopt() is:

int mallopt ( int cmd, 
              int value );

The arguments are:

cmd
Options used to enable additional checks in the library.
value
A value corresponding to the command used.

See the Description section for mallopt() for more details.

Description of optional checks

MALLOC_CKACCESS
Turn on (or off) boundary checking for memory and string operations. Environment variable: MALLOC_CKACCESS.

The value argument can be:

This helps to detect buffer overruns and underruns that are a result of memory or string operations. When on, each pointer operand to a memory or string operation is checked to see if it's a heap buffer. If it is, the size of the heap buffer is checked, and the information is used to ensure that no assignments are made beyond the bounds of the heap buffer. If an attempt is made that would assign past the buffer boundary, a diagnostic warning message is printed.

Here's how you can use this option to find an overrun error:

...
char *p;
int opt;
opt = 1;
mallopt(MALLOC_CKACCESS, opt);
p = malloc(strlen("hello"));
strcpy(p, "hello, there!");  /* a warning is generated
                                          here */
...

The following illustrates how access checking can trap a reference through a stale pointer:

...

char *p;
int opt;
opt = 1;
mallopt(MALLOC_CKACCESS, opt);
p = malloc(30);
free(p);
strcpy(p, "hello, there!");
MALLOC_FILLAREA
Turn on (or off) fill-area boundary checking that validates that the program hasn't overrun the user-requested size of a heap buffer. Environment variable: MALLOC_FILLAREA.

The value argument can be:

It does this by applying a guard code check when the buffer is released or when it's resized. The guard code check works by filling any excess space available at the end of the heap buffer with a pattern of bytes. When the buffer is released or resized, the trailing portion is checked to see if the pattern is still present. If not, a diagnostic warning message is printed.

The effect of turning on fill-area boundary checking is a little different than enabling other checks. The checking is performed only on memory buffers allocated after the point in time at which the check was enabled. Memory buffers allocated before the change won't have the checking performed.

Here's how you can catch an overrun with the fill-area boundary checking option:

...
...
int *foo, *p, i, opt;
opt = 1;
mallopt(MALLOC_FILLAREA, opt);
foo = (int *)malloc(10*4);
for (p = foo, i = 12; i > 0; p++, i--)
    *p = 89;
free(foo);  /* a warning is generated here */

MALLOC_CKCHAIN
Enable (or disable) full-chain checking. This option is expensive and should be considered as a last resort when some code is badly corrupting the heap and otherwise escapes the detection of boundary checking or fill-area boundary checking. Environment variable: MALLOC_CKCHAIN.

The value argument can be:

This can occur under a number of circumstances, particularly when they're related to direct pointer assignments. In this case, the fault may occur before a check such as fill-area boundary checking can be applied. There are also circumstances in which both fill-area boundary checking and the normal attempts to check the headers of neighboring buffer fail to detect the source of the problem. This may happen if the buffer that's overrun is the first or last buffer associated with a block or arena. It may also happen when the allocator chooses to satisfy some requests, particularly those for large buffers, with a buffer that exactly fits the program's requested size.

Full-chain checking traverses the entire set of allocation chains for all arenas and blocks in the heap every time a memory operation (including allocation requests) is performed. This lets the developer narrow down the search for a source of corruption to the nearest memory operation.

Forcing verification

You can force a full allocation chain check at certain points while your program is executing, without turning on chain checking. Specify the following option for cmd:

MALLOC_VERIFY
Perform a chain check immediately. If an error is found, perform error handling. The value argument is ignored.

Specifying an error handler

Typically, when the library detects an error, a diagnostic message is printed and the program continues executing. In cases where the allocation chains or another crucial part of the allocator's view is hopelessly corrupted, an error message is printed and the program is aborted (via abort()).

You can override this default behavior by specifying a handler that determines what is done when a warning or a fatal condition is detected.

cmd
Specify the error handler to use.
MALLOC_FATAL
Specify the malloc fatal handler. Environment variable: MALLOC_FATAL.
MALLOC_WARN
Specify the malloc warning handler handler. Environment variable: MALLOC_WARN.
value
An integer value that indicates which one of the standard handlers provided by the library.
M_HANDLE_ABORT
Terminate execution with a call to abort().
M_HANDLE_EXIT
Exit immediately.
M_HANDLE_IGNORE
Ignore the error and continue.
M_HANDLE_CORE
Cause the program to dump a core file.
M_HANDLE_SIGNAL
Stop the program when this error occurs, by sending itself a stop signal. This lets you one attach to this process using a debugger. The program is stopped inside the error handler function, and a backtrace from there should show you the exact location of the error.

If you use environment variables to specify options to the malloc library for either MALLOC_FATAL or MALLOC_WARN, you must pass the value that indicates the handler, not its symbolic name.

Handler Value
M_HANDLE_IGNORE 0
M_HANDLE_ABORT 1
M_HANDLE_EXIT 2
M_HANDLE_CORE 3
M_HANDLE_SIGNAL 4

These values are also defined in /usr/include/malloc_g/malloc-lib.h


Note:

The last two handlers were added in the QNX Momentics 6.3.0 SP2.

You can OR any of these handlers with the value, MALLOC_DUMP, to cause a complete dump of the heap before the handler takes action.


Here's how you can cause a memory overrun error to abort your program:

...
int *foo, *p, i;
int opt;
opt = 1;
mallopt(MALLOC_FILLAREA,  opt);
foo = (int *)malloc(10*4);
for (p = foo, i = 12; i > 0; p++, i--)
    *p = 89;
opt = M_HANDLE_ABORT;
mallopt(MALLOC_WARN, opt);
free(foo); /* a fatal error is generated here */


Other environment variables

MALLOC_INITVERBOSE
Enable some initial verbose output regarding other variables that are enabled.
MALLOC_BTDEPTH
Set the depth of the backtrace on CPUs that support deeper backtrace levels. Currently the builtin-return-address feature of gcc is used to implement deeper backtraces for the debug malloc library. This environment variable controls the depth of the backtrace for allocations (i.e. where the allocation occurred). The default value is 0.
MALLOC_TRACEBT
Set the depth of the backtrace, on CPUs that support deeper backtrace levels. Currently the builtin-return-address feature of gcc is used to implement deeper backtraces for the debug malloc library. This environment variable controls the depth of the backtrace for errors and warning. The default value is 0.
MALLOC_DUMP_LEAKS
Trigger leak detection on exit of the program. The output of the leak detection is sent to the file named by this variable.
MALLOC_TRACE
Enable tracing of all calls to malloc(), free(), calloc(), realloc() etc. A trace of the various calls is made available in the file named by this variable.
MALLOC_CKACCESS_LEVEL
Specify the level of checking performed by the MALLOC_CKACCESS option. By default, a basic level of checking is performed. By increasing the LEVEL of checking, additional things that could be errors are also flagged. For example, a call to memset() with a length of zero is normally safe, since no data is actually moved. If the arguments, however, point to illegal locations (memory references that are invalid), this normally suggests a case where there is a problem potentially lurking inside the code. By increasing the level of checking, these kinds of errors are also flagged.

Note: These environment variables were added in the QNX Momentiocs 6.3.0 SP2.

Caveats

The debug malloc library, when enabled with various checking, uses more stack space (i.e. calls more functions, uses more local variables etc.) than the regular libc allocator. This implies that programs that explicitly set the stack size to something smaller than the default may encounter problems such as running out of stack space. This may cause the program to crash. You can prevent this by increasing the stack space allocated to the threads in question.

MALLOC_FILLAREA is used to do fill-area checking. If fill-area checking isn't enabled, the program can't detected certain types of errors. For example, errors that occur where an application accesses beyond the end of a block, and the real block allocated by the allocator is larger than what was requested, the allocator won't flag an error unless MALLOC_FILLAREA is enabled. By default, this environment variable isn't enabled.

MALLOC_CKACCESS is used to validate accesses to the str* and mem* family of functions. If this variable isn't enabled, such accesses won't be checked, and errors aren't reported. By default, this environment variable isn't enabled.

MALLOC_CKCHAIN performs extensive heap checking on every allocation. When you enable this environment variable, allocations can be much slower. Also since full heap checking is performed on every allocation, an error anywhere in the heap could be reported upon entry into the allocator for any operation. For example, a call to free(x) will check block x, and also the complete heap for errors before completing the operation (to free block x). So any error in the heap will be reported in the context of freeing block x, even if the error itself isn't specifically related to this operation.

When the debug library reports errors, it doesn't always exit immediately; instead it continues to perform the operation that causes the error, and corrupts the heap (since that operation that raises the warning is actually an illegal operation). You can control this behavior by using the MALLOC_WARN and MALLOC_FATAL handler described earlier. If specific handlers are not provided, the heap will be corrupted and other errors could result and be reported later because of the first error. The best solution is to focus on the first error and fix it before moving onto other errors. Look at description of MALLOC_CKCHAIN for more information on how these errors may end up getting reported.

Although the debug malloc library allocates blocks to the user using the same algorithms as the standard allocator, the library itself requires additional storage to maintain block information, as well as to perform sanity checks. This means that the layout of blocks in memory using the debug allocator is slightly different than with the standard allocator.

The use of certain optimization options such as -O1, -O2 or -O3 don't allow the debug malloc library to work correctly. The problem occurs due to the fact that, during compilation and linking, the gcc command call the builtin functions instead of the intended functions, e.g. strcpy() or strcmp(). You should use -fno-builtin option to circumvent this problem.

Manual checking (bounds checking)

There are times when it may be desirable to obtain information about a particular heap buffer or print a diagnostic or warning message related to that heap buffer. This is particularly true when the program has its own routines providing memory manipulation and the developer wishes to provide bounds checking. This can also be useful for adding additional bounds checking to a program to isolate a problem such as a buffer overrun or underrun that isn't associated with a call to a memory or string function.

In the latter case, rather than keeping a pointer and performing direct manipulations on the pointer, the program may define a pointer type that contains all relevant information about the pointer, including the current value, the base pointer and the extent of the buffer. Access to the pointer can then be controlled through macros or access functions. The accessors can perform the necessary bounds checks and print a warning message in response to attempts to exceed the bounds.

Any attempt to dereference the current pointer value can be checked against the boundaries obtained when the pointer was initialized. If the boundary is exceeded the malloc_warning() function should be called to print a diagnostic message and perform error handling. The arguments are: file, line, message.

Getting pointer information

To obtain information about the pointer, two functions are provided:

find_malloc_ptr()
void* find_malloc_ptr ( const void* ptr,
                        arena_range_t* range );
    

This function finds information about the heap buffer containing the given C pointer, including the type of allocation structure it's contained in and the pointer to the header structure for the buffer. The function returns a pointer to the Dhead structure associated with this particular heap buffer. The pointer returned can be used in conjunction with the DH_() macros to obtain more information about the heap buffer. If the pointer doesn't point into the range of a valid heap buffer, the function returns NULL.

For example, the result from find_malloc_ptr() can be used as an argument to DH_ULEN() to find out the size that the program requested for the heap buffer in the call to malloc(), calloc() or a subsequent call to realloc().

_mptr()
char* _mptr ( const char* ptr );
    

Return a pointer to the beginning of the heap buffer containing the given C pointer. Information about the size of the heap buffer can be obtained with a call to _msize() or _musize() with the value returned from this call.

Getting the heap buffer size

Three interfaces are provided so that you can obtain information about the size of a heap buffer:

_msize()
ssize_t _msize( const char* ptr );
    

Return the actual size of the heap buffer given the pointer to the beginning of the heap buffer. The value returned by this function is the actual size of the buffer as opposed to the program-requested size for the buffer. The pointer must point to the beginning of the buffer -- as in the case of the value returned by _mptr() -- in order for this function to work.

_musize()
ssize_t _musize( const char* ptr );
     

Return the program-requested size of the heap buffer given the pointer to the beginning of the heap buffer. The value returned by this function is the size argument that was given to the routine that allocated the block, or to a subsequent invocation of realloc() that caused the block to grow.

DH_ULEN()
DH_ULEN( ptr )
    

Return the program-requested size of the heap buffer given a pointer to the Dhead structure, as returned by a call to find_malloc_ptr(). This is a macro that performs the appropriate cast on the pointer argument.

Memory leaks

The ability of the malloc library to keep full allocation chains of all the heap memory allocated by the program -- as opposed to just accounting for some heap buffers -- allows heap memory leaks to be detected by the library in response to requests by the program. Leaks can be detected in the program by performing tracing on the entire heap. This is described in the sections that follow.

Tracing

Tracing is an operation that attempts to determine whether a heap object is reachable by the program. In order to be reachable, a heap buffer must be available either directly or indirectly from a pointer in a global variable or on the stack of one of the threads. If this isn't the case, then the heap buffer is no longer visible to the program and can't be accessed without constructing a pointer that refers to the heap buffer -- presumably by obtaining it from a persistent store such as a file or a shared memory object. The set of global variables and stack for all threads is called the root set. Because the root set must be stable for tracing to yield valid results, tracing requires that all threads other than the one performing the trace be suspended while the trace is performed.

Tracing operates by constructing a reachability graph of the entire heap. It begins with a root set scan that determines the root set comprising the initial state of the reachability graph. The roots that can be found by tracing are:

Once the root set scan is complete, tracing initiates a mark operation for each element of the root set. The mark operation looks at a node of the reachability graph, scanning the memory space represented by the node, looking for pointers into the heap. Since the program may not actually have a pointer directly to the start of the buffer -- but to some interior location -- and it isn't possible to know which part of the root set or a heap object actually contains a pointer, tracing utilizes specialized techniques for coping with ambiguous roots. The approach taken is described as a conservative pointer estimation since it assumes that any word-sized object on a word-aligned memory cell that could point to a heap buffer or the interior of that heap buffer actually points to the heap buffer itself.

Using conservative pointer estimation for dealing with ambiguous roots, the mark operation finds all children of a node of the reachability graph. For each child in the heap that's found, it checks to see whether the heap buffer has been marked as referenced. If the buffer has been marked, the operation moves on to the next child. Otherwise, the trace marks the buffer, and recursively initiates a mark operation on that heap buffer.

The tracing operation is complete when the reachability graph has been fully traversed. At this time every heap buffer that's reachable will have been marked, as could some buffers that aren't actually reachable, due to the conservative pointer estimation. Any heap buffer that hasn't been marked is definitely unreachable, constituting a memory leak. At the end of the tracing operation, all unmarked nodes can be reported as leaks.

Causing a trace and giving results

A program can cause a trace to be performed and memory leaks to be reported by calling the malloc_dump_unreferenced() function provided by the library:

int malloc_dump_unreferenced ( int fd, 
                               int detail );

Suspend all threads, clear the mark information for all heap buffers, perform the trace operation, and print a report of all memory leaks detected. All items are reported in memory order.

fd
The file descriptor on which the report should be produced.
detail
Indicate how the trace operation should deal with any heap corruption problems it encounters. For a value of:
1
Any problems encountered can be treated as fatal errors. After the error encountered is printed abort the program. No report is produced.
0
Print case errors, and a report based on whatever heap information is recoverable.

Analyzing dumps

The dump of unreferenced buffers prints out one line of information for each unreferenced buffer. The information provided for a buffer includes:

File and line information is available if the call to allocate the buffer was made using one of the library's debug interfaces. Otherwise, the return address of the call is reported in place of the line number. In some circumstances, no return address information is available. This usually indicates that the call was made from a function with no frame information, such as the system libraries. In such cases, the entry can usually be ignored and probably isn't a leak.

From the way tracing is performed we can see that some leaks may escape detection and may not be reported in the output. This happens if the root set or a reachable buffer in the heap has something that looks like a pointer to the buffer.

Likewise, each reported leak should be checked against the suspected code identified by the line or call return address information. If the code in question keeps interior pointers -- pointers to a location inside the buffer, rather than the start of the buffer -- the trace operation will likely fail to find a reference to the buffer. In this case, the buffer may well not be a leak. In other cases, there is almost certainly a memory leak.

Compiler support

Manual bounds checking can be avoided in circumstances where the compiler is capable of supporting bounds checking under control of a compile-time option. For C compilers this requires explicit support in the compiler. Patches are available for the Gnu C Compiler that allow it to perform bounds checking on pointers in this manner. This will be dealt with later. For C++ compilers extensive bounds checking can be performed through the use of operator overloading and the information functions described earlier.

C++ issues

In place of a raw pointer, C++ programs can make use of a CheckedPtr template that acts as a smart pointer. The smart pointer has initializers that obtain complete information about the heap buffer on an assignment operation and initialize the current pointer position. Any attempt to dereference the pointer causes bounds checking to be performed and prints a diagnostic error in response an attempt to dereference a value beyond the bounds of the buffer. The CheckedPtr template is provided in the <malloc_g/malloc> header for C++ programs.

The checked pointer template provided for C++ programs can be modified to suit the needs of the program. The bounds checking performed by the checked pointer is restricted to checking the actual bounds of the heap buffer, rather than the program requested size.

For C programs it's possible to compile individual modules that obey certain rules with the C++ compiler to get the behavior of the CheckedPtr template. C modules obeying these rules are written to a dialect of ANSI C that can be referred to as Clean C.

Clean C

The Clean C dialect is that subset of ANSI C that is compatible with the C++ language. Writing Clean C requires imposing coding conventions to the C code that restrict use to features that are acceptable to a C++ compiler. This section provides a summary of some of the more pertinent points to be considered. It is a mostly complete but by no means exhaustive list of the rules that must be applied.

To use the C++ checked pointers, the module including all header files it includes must be compatible with the Clean C subset. All the system headers for Neutrino as well as the <malloc_g/malloc.h> header satisfy this requirement.

The most obvious aspect to Clean C is that it must be strict ANSI C with respect to function prototypes and declarations. The use of K&R prototypes or definitions isn't allowable in Clean C. Similarly, default types for variable and function declarations can't be used.

Another important consideration for declarations is that forward declarations must be provided when referencing an incomplete structure or union. This frequently occurs for linked data structures such as trees or lists. In this case the forward declaration must occur before any declaration of a pointer to the object in the same or another structure or union. For example, a list node may be declared as follows:

  
struct ListNode;
struct ListNode {
   struct ListNode *next;
   void *data;
};

Operations on void pointers are more restrictive in C++. In particular, implicit coercions from void pointers to other types aren't allowed including both integer types and other pointer types. Void pointers should be explicitly cast to other types.

The use of const should be consistent with C++ usage. In particular, pointers that are declared as const must always be used in a compatible fashion. Const pointers can't be passed as non-const arguments to functions unless const is cast away.

C++ example

Here's how the overrun example from earlier could have the exact source of the error pinpointed with checked pointers:

typedef CheckedPtr<int> intp_t;
...
intp_t foo, p;
int i;
int opt;
opt = 1;
mallopt(MALLOC_FILLAREA, opt);
foo = (int *)malloc(10*4);
opt = M_HANDLE_ABORT;
mallopt(MALLOC_WARN, opt);
for (p = foo, i = 12; i > 0; p++, i--)
    *p = 89; /* a fatal error is generated here */
opt = M_HANDLE_IGNORE;
mallopt(MALLOC_WARN, opt);
free(foo);

Bounds checking GCC

Bounds checking GCC is a variant of GCC that allows individual modules to be compiled with bounds checking enabled. When a heap buffer is allocated within a checked module, information about the buffer is added to the runtime information about the memory space kept on behalf of the compiler. Attempts to dereference or update the pointer in checked modules invokes intrinsic functions that obtain information about the bounds of the object -- it may be stack, heap or an object in the data segment -- and checks to see that the reference is in bounds. When an access is out of bounds, the runtime environment generates an error.

The bounds checking variant of GCC hasn't been ported to the Neutrino environment. In order to check objects that are kept within the data segment of the application, the compiler runtime environment requires some Unix functions that aren't provided by Neutrino. The intrinsics would have to be modified to work in the Neutrino environment.

The model for obtaining information about heap buffers with this compiler is also slightly different than the model employed by the malloc library. Instead of this, the compiler includes an alternative malloc implementation that registers checked heap buffers with a tree data structure outside of the program's control. This tree is used for searches made by the intrinsics to obtain information about checked objects. This technique may take more time than the malloc mechanism for some programs, and is incompatible with the checking and memory leak detection provided by the malloc library. Rather than performing multiple test runs, a port which reimplemented the compiler intrinsics to obtain heap buffer information from the malloc library would be desirable.

Summary

When you develop an application, we recommend that you test it against a debug version that incorporates the malloc library to detect possible sources of memory errors, such as overruns and memory leaks.

The malloc library and the different levels of compiler support can be very useful in detecting the source of overrun errors (which may escape detection during integration testing) during unit testing and program maintenance. However, in this case, more stringent checking for low-level bounds checking of individual pointers may prove useful. The use of the Clean C subset may also help by facilitating the use of C++ templates for low-level checking. Otherwise, you might consider porting the bounds checking variant of GCC to meet your individual project requirements.