A.2 Finding Dynamic Memory Errors

When writing a program, you frequently can't know how much memory the program will need when it runs. For example, a line read from a file at runtime might have any finite length. C and C++ programs use malloc, free, and their variants to dynamically allocate memory while the program is running. The rules for dynamic memory use include these:

·         The number of allocation calls (calls to malloc) must exactly match the number of deallocation calls (calls to free).

·         Reads and writes to the allocated memory must occur within the memory, not outside its range.

·         The allocated memory cannot be used before it is allocated or after it is deallocated.

Because dynamic memory allocation and deallocation occur at runtime, static program analysis rarely find violations. Instead, memory-checking tools run the program, collecting data to determine if any of these rules have been violated. The violations a tool may find include the following:

·         Reading from memory before allocating it

·         Writing to memory before allocating it

·         Reading before the beginning of allocated memory

·         Writing before the beginning of allocated memory

·         Reading after the end of allocated memory

·         Writing after the end of allocated memory

·         Reading from memory after its deallocation

·         Writing to memory after its deallocation

·         Failing to deallocate allocated memory

·         Deallocating the same memory twice

·         Deallocating memory that is not allocated

It is also useful to warn about requesting an allocation with 0 bytes, which probably indicates programmer error.

Table A.1 indicates four different tools' diagnostic capabilities. Unfortunately, no single tool diagnoses all the memory use errors. Also, no tool claims to detect reading or writing before allocating memory, but doing so will probably cause a segmentation fault. Deallocating memory twice will probably also cause a segmentation fault. These tools diagnose only errors that actually occur while the program is running. If you run the program with inputs that cause no memory to be allocated, the tools will indicate no memory errors. To test a program thoroughly, you must run the program using different inputs to ensure that every possible path through the program occurs. Also, you may use only one tool at a time, so you'll have to repeat testing with several tools to get the best error checking.

Table A.1. Capabilities of Dynamic Memory-Checking Tools (X Indicates Detection, and O Indicates Detection for Some Cases)

Erroneous Behavior

malloc Checking

mtrace

ccmalloc

Electric Fence

Read before allocating memory

 

 

 

 

Write before allocating memory

 

 

 

 

Read before beginning of allocation

 

 

 

X

Write before beginning of allocation

O

 

O

X

Read after end of allocation

 

 

 

X

Write after end of allocation

 

 

X

X

Read after deallocation

 

 

 

X

Write after deallocation

 

 

 

X

Failure to deallocate memory

 

X

X

 

Deallocating memory twice

X

 

X

 

Deallocating nonallocated memory

 

X

X

 

Zero-size memory allocation

 

 

X

X

In the sections that follow, we first describe how to use the more easily used malloc checking and mtrace, and then ccmalloc and Electric Fence.

A.2.1 A Program to Test Memory Allocation and Deallocation

We'll use the malloc-use program in Listing A.2 to illustrate memory allocation, deallocation, and use. To begin running it, specify the maximum number of allocated memory regions as its only command-line argument. For example, malloc-use 12 creates an array A with 12 character pointers that do not point to anything. The program accepts five different commands:

·         To allocate b bytes pointed to by array entry A[ i ], enter a i b . The array index i can be any non-negative number smaller than the command-line argument. The number of bytes must be non-negative.

·         To deallocate memory at array index i, enter d i .

·         To read the pth character from the allocated memory at index i (as in A[i][p]), enter r i p . Here, p can have an integral value.

·         To write a character to the pth position in the allocated memory at index i, enter w i p .

·         When finished, enter q.

We'll present the program's code later, in Section A.2.7, and illustrate how to use it.

A.2.2 malloc Checking

The memory allocation functions provided by the GNU C library can detect writing before the beginning of an allocation and deallocating the same allocation twice. Defining the environment variable MALLOC_CHECK_ to the value 2 causes a program to halt when such an error is detected. (Note the environment variable's ending underscore.) There is no need to recompile the program.

We illustrate diagnosing a write to memory to a position just before the beginning of an allocation.

 
% export MALLOC_CHECK_=2 
% ./malloc-use 12 
Please enter a command: a 0 10 
Please enter a command: w 0 -1 
Please enter a command: d 0 
Aborted (core dumped) 

export turns on malloc checking. Specifying the value 2 causes the program to halt as soon as an error is detected.

Using malloc checking is advantageous because the program need not be recompiled, but its capability to diagnose errors is limited. Basically, it checks that the allocator data structures have not been corrupted. Thus, it can detect double deallocation of the same allocation. Also, writing just before the beginning of a memory allocation can usually be detected because the allocator stores the size of each memory allocation just before the allocated region. Thus, writing just before the allocated memory will corrupt this number. Unfortunately, consistency checking can occur only when your program calls allocation routines, not when it accesses memory, so many illegal reads and writes can occur before an error is detected. In the previous example, the illegal write was detected only when the allocated memory was deallocated.

A.2.3 Finding Memory Leaks Using mtrace

The mtrace tool helps diagnose the most common error when using dynamic memory: failure to match allocations and deallocations. There are four steps to using mtrace, which is available with the GNU C library:

1.       Modify the source code to include <mcheck.h> and to invoke mtrace () as soon as the program starts, at the beginning of main. The call to mtrace turns on tracking of memory allocations and deallocations.

2.       .Specify the name of a file to store information about all memory allocations and deallocations:

3.           
% export MALLOC_TRACE=memory.log 

4.       .Run the program. All memory allocations and deallocations are stored in the logging file.

5.       Using the mtrace command, analyze the memory allocations and deallocations to ensure that they match.

6.           
% mtrace my_program $MALLOC_TRACE 

The messages produced by mtrace are relatively easy to understand. For example, for our malloc-use example, the output would look like this:

 
- 0000000000 Free 3 was never alloc'd malloc-use.c:39 
 
Memory not freed: 
-----------------
   Address     Size     Caller 
0x08049d48      0xc at  malloc-use.c:30 

These messages indicate an attempt on line 39 of malloc-use.c to free memory that was never allocated, and an allocation of memory on line 30 that was never freed.

mtrace diagnoses errors by having the executable record all memory allocations and deallocations in the file specified by the MALLOC_TRACE environment variable. The executable must terminate normally for the data to be written. The mtrace command analyzes this file and lists unmatched allocations and deallocations.

A.2.4 Using ccmalloc

The ccmalloc library diagnoses dynamic memory errors by replacing malloc and free with code tracing their use. If the program terminates gracefully, it produces a report of memory leaks and other errors. The ccmalloc library was written by Armin Bierce.

You'll probably have to download and install the ccmalloc library yourself. Download it from http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/, unpack the code, and run configure. Run make and make install, copy the ccmalloc.cfg file to the directory where you'll run the program you want to check, and rename the copy to .ccmalloc. Now you are ready to use the tool.

The program's object files must be linked with ccmalloc's library and the dynamic linking library. Append -lccmalloc -ldl to your link command, for instance.

 
% gcc -g -Wall -pedantic malloc-use.o -o ccmalloc-use -lccmalloc –ldl 

Execute the program to produce a report. For example, running our malloc-use program to allocate but not deallocate memory produces the following report:

 
% ./ccmalloc-use 12 
file-name=a.out does not contain valid symbols 
trying to find executable in current directory ... 
using symbols from 'ccmalloc-use' 
(to speed up this search specify 'file ccmalloc-use' 
 in the startup file '.ccmalloc') 
Please enter a command: a 0 12 
Please enter a command: q 

graphics/apafig01.gif

The last few lines indicate the chain of function calls that allocated memory that was not deallocated.

To use ccmalloc to diagnose writes before the beginning or after the end of the allocated region, you'll have to modify the .ccmalloc file in the current directory. This file is read when the program starts execution.

A.2.5 Electric Fence

Written by Bruce Perens, Electric Fence halts executing programs on the exact line where a write or a read outside an allocation occurs. This is the only tool that discovers illegal reads. It is included in most GNU/Linux distributions, but the source code can be found at http://www.perens.com/FreeSoftware/.

As with ccmalloc, your program's object files must be linked with Electric Fence's library by appending -lefence to the linking command, for instance:

 
% gcc -g -Wall -pedantic malloc-use.o -o emalloc-use –lefence 

As the program runs, allocated memory uses are checked for correctness. A violation causes a segmentation fault:

 
% ./emalloc-use 12 
  Electric Fence 2.0.5 Copyright (C) 1987-1998 Bruce Perens. 
Please enter a command: a 0 12 
Please enter a command: r 0 12 
Segmentation fault 

Using a debugger, you can determine the context of the illegal action.

By default, Electric Fence diagnoses only accesses beyond the ends of allocations. To find accesses before the beginning of allocations instead of accesses beyond the end of allocations, use this code:

 
% export EF_PROTECT_BELOW=1 

To find accesses to deallocated memory, set EF_PROTECT_FREE to 1. More capabilities are described in the libefence manual page.

Electric Fence diagnoses illegal memory accesses by storing each allocation on at least two memory pages. It places the allocation at the end of the first page; any access beyond the end of the allocation, on the second page, causes a segmentation fault. If you set EF_PROTECT_BELOW to 1, it places the allocation at the beginning of the second page instead. Because it allocates two memory pages per call to malloc, Electric Fence can use an enormous amount of memory. Use this library for debugging only.

A.2.6 Choosing Among the Different Memory-Debugging Tools

We have discussed four separate, incompatible tools to diagnose erroneous use of dynamic memory. How does a GNU/Linux programmer ensure that dynamic memory is correctly used? No tool guarantees diagnosing all errors, but using any of them does increase the probability of finding errors. To ease finding dynamically allocated memory errors, separately develop and test the code that deals with dynamic memory. This reduces the amount of code that you must search for errors. If you are using C++, write a class that handles all dynamic memory use. If you are using C, minimize the number of functions using allocation and deallocation. When testing this code, be sure to use only one tool at a one time because they are incompatible. When testing a program, be sure to vary how the program executes, to test the most commonly executed portions of the code.

Which of the four tools should you use? Because failing to match allocations and deallocations is the most common dynamic memory error, use mtrace during initial development. The program is available on all GNU/Linux systems and has been well tested. After ensuring that the number of allocations and deallocations match, use Electric Fence to find illegal memory accesses. This will eliminate almost all memory errors. When using Electric Fence, you will need to be careful to not perform too many allocations and deallocations because each allocation requires at least two pages of memory. Using these two tools will reveal most memory errors.

A.2.7 Source Code for the Dynamic Memory Program

Listing A.2 shows the source code for a program illustrating dynamic memory allocation, deallocation, and use. See Section A.2.1, "A Program to Test Memory Allocation and Deallocation," for a description of how to use it.

Listing A.2 (malloc-use.c) Dynamic Memory Allocation Checking Example
/* Use C's dynamic memory allocation functions.  */ 
 
/* Invoke the program using one command-line argument specifying the 
   size of an array.  This array consists of pointers to (possibly) 
   allocated arrays. 
 
   When the programming is running, select among the following 
   commands: 
 
   o allocate memory:   a <index> <memory-size> 
   o deallocate memory: d <index> 
   o read from memory:  r <index> <position-within-allocation> 
   o write to  memory:  w <index> <position-within-allocation> 
   o quit:              q 
 
   The user is responsible for obeying (or disobeying) the rules on dynamic 
   memory use.  */ 
 
#ifdef MTRACE 
#include <mcheck.h> 
#endif /* MTRACE */ 
#include <stdio.h> 
#include <stdlib.h> 
#include <assert.h> 
 
/* Allocate memory with the specified  size, returning nonzero upon 
   success.  */ 
 
void allocate (char** array, size_t size) 
{
  *array = malloc (size); 
} 
 
/* Deallocate memory.  */ 
 
void deallocate (char** array) 
{
  free ((void*) *array); 
} 
 
/* Read from a position in memory.  */ 
 
void read_from_memory (char* array, int position) 
{
  char character = array[position]; 
} 
 
/* Write to a position in memory.  */ 
 
void write_to_memory (char* array, int position) 
{
  array[position] = 'a'; 
} 
 
int main (int argc, char* argv[]) 
{
  char** array; 
  unsigned array_size; 
  char command[32]; 
  unsigned array_index; 
  char command_letter; 
  int size_or_position; 
  int error = 0; 
 
#ifdef MTRACE 
  mtrace (); 
#endif /* MTRACE  */ 
 
  if (argc != 2) {
    fprintf (stderr, "%s: array-size\n", argv[0]); 
    return 1; 
  } 
 
  array_size = strtoul (argv[1], 0, 0); 
  array = (char **) calloc (array_size, sizeof (char *)); 
  assert (array != 0); 
 
  /* Follow the user's commands.  */ 
  while (!error) {
    printf ("Please enter a command: "); 
    command_letter = getchar (); 
    assert (command_letter != EOF); 
    switch (command_letter) {
 
    case 'a': 
      fgets (command, sizeof (command), stdin); 
      if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2 
          && array_index < array_size) 
     allocate (&(array[array_index]), size_or_position); 
   else 
     error = 1; 
   break; 
 
  case 'd': 
    fgets (command, sizeof (command), stdin); 
    if (sscanf (command, "%u", &array_index) == 1 
        && array_index < array_size) 
      deallocate (&(array[array_index])); 
    else 
      error = 1; 
    break; 
 
  case 'r': 
    fgets (command, sizeof (command), stdin); 
    if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2 
        && array_index < array_size) 
      read_from_memory (array[array_index], size_or_position); 
    else 
      error = 1; 
    break; 
 
  case 'w': 
    fgets (command, sizeof (command), stdin); 
    if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2 
        && array_index < array_size) 
      write_to_memory (array[array_index], size_or_position); 
    else 
      error = 1; 
    break; 
 
  case 'q': 
    free ((void *) array); 
    return 0; 
 
  default: 
    error = 1; 
  } 
} 
 
 free ((void *) array); 
 return 1; 
}