7.2 Process Entries

The /proc file system contains a directory entry for each process running on the GNU/Linux system. The name of each directory is the process ID of the corresponding process. [1] These directories appear and disappear dynamically as processes start and terminate on the system. Each directory contains several entries providing access to information about the running process. From these process directories the /proc file system gets its name.

[1] On some UNIX systems, the process IDs are padded with zeros. On GNU/Linux, they are not.
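
Because each process directory is named by its process ID, a program can enumerate the processes on the system by scanning /proc for entries whose names consist entirely of digits. The following is a minimal sketch of that idea; the function names is_pid_name and list_process_ids are hypothetical helpers introduced here for illustration, not part of any library.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* Returns nonzero if NAME consists only of decimal digits, which is how
   process directories in /proc are named.  (Hypothetical helper.)  */
int is_pid_name (const char* name)
{
  if (*name == '\0')
    return 0;
  for (; *name != '\0'; ++name)
    if (!isdigit ((unsigned char) *name))
      return 0;
  return 1;
}

/* Prints the name of every process directory found in PROC_DIR
   (normally "/proc").  Returns the number of directories printed, or
   -1 if PROC_DIR could not be opened.  (Hypothetical helper.)  */
int list_process_ids (const char* proc_dir)
{
  DIR* dir = opendir (proc_dir);
  struct dirent* entry;
  int count = 0;

  if (dir == NULL)
    return -1;
  while ((entry = readdir (dir)) != NULL)
    if (is_pid_name (entry->d_name)) {
      printf ("%s\n", entry->d_name);
      ++count;
    }
  closedir (dir);
  return count;
}
```

Calling list_process_ids ("/proc") prints one process ID per line. Because processes start and terminate at any time, the result is only a snapshot; a directory listed here may already be gone by the time the program tries to open it.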

Each process directory contains these entries:

·         cmdline contains the argument list for the process. The cmdline entry is described in Section 7.2.2, "Process Argument List."

·         cwd is a symbolic link that points to the current working directory of the process (as set, for instance, with the chdir call).

·         environ contains the process's environment. The environ entry is described in Section 7.2.3, "Process Environment."

·         exe is a symbolic link that points to the executable image running in the process. The exe entry is described in Section 7.2.4, "Process Executable."

·         fd is a subdirectory that contains entries for the file descriptors opened by the process. These are described in Section 7.2.5, "Process File Descriptors."

·         maps displays information about files mapped into the process's address space. See Chapter 5, "Interprocess Communication," Section 5.3, "Mapped Memory," for details of how memory-mapped files work. For each mapped file, maps displays the range of addresses in the process's address space into which the file is mapped, the permissions on these addresses, the name of the file, and other information.

The maps table for each process displays the executable running in the process, any loaded shared libraries, and other files that the process has mapped in.

·         root is a symbolic link to the root directory for this process. Usually, this is a symbolic link to /, the system root directory. The root directory for a process can be changed using the chroot call or the chroot command. [2]

[2] The chroot call and command are outside the scope of this book. See the chroot man page in Section 1 for information about the command (invoke man 1 chroot), or the chroot man page in Section 2 (invoke man 2 chroot) for information about the call.

·         stat contains lots of status and statistical information about the process. These are the same data as presented in the status entry, but in raw numerical format, all on a single line. The format is difficult to read but might be more suitable for parsing by programs.

If you want to use the stat entry in your programs, see the proc man page, which describes its contents, by invoking man 5 proc.

·         statm contains information about the memory used by the process. The statm entry is described in Section 7.2.6, "Process Memory Statistics."

·         status contains lots of status and statistical information about the process, formatted to be comprehensible by humans. Section 7.2.7, "Process Statistics," contains a description of the status entry.

·         The cpu entry appears only on SMP Linux kernels. It contains a breakdown of process time (user and system) by CPU.

Note that for security reasons, the permissions of some entries are set so that only the user who owns the process (or the superuser) can access them.
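
As an illustration of the maps entry described above, the following sketch reads /proc/self/maps and extracts the address range, permissions, and file name from each line. It assumes the usual line layout (address range, permissions, offset, device, inode, and an optional file name) and that each line fits in the buffer. The function names parse_maps_line and print_own_maps are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parses one line of a maps file.  Stores the start and end addresses,
   the permission string (such as "r-xp"), and the mapped file name
   (empty if the line names no file).  Returns 0 on success, or -1 if
   the line does not have the assumed layout.  (Hypothetical helper.)  */
int parse_maps_line (const char* line, unsigned long* start,
                     unsigned long* end, char perms[5],
                     char* path, size_t path_len)
{
  int consumed = 0;
  size_t length;

  if (path_len == 0)
    return -1;
  /* Fields: address range, permissions, offset, device, inode.  */
  if (sscanf (line, "%lx-%lx %4s %*x %*x:%*x %*u%n",
              start, end, perms, &consumed) != 3 || consumed == 0)
    return -1;
  /* The file name, if any, follows after some padding.  */
  line += consumed;
  line += strspn (line, " \t");
  length = strcspn (line, "\n");
  if (length >= path_len)
    length = path_len - 1;
  memcpy (path, line, length);
  path[length] = '\0';
  return 0;
}

/* Prints the mappings of the calling process.  Returns the number of
   lines parsed, or -1 if /proc/self/maps could not be opened.
   Assumes each line fits in LINE; longer lines are skipped.  */
int print_own_maps (void)
{
  char line[512];
  unsigned long start, end;
  char perms[5];
  char path[256];
  int count = 0;
  FILE* maps = fopen ("/proc/self/maps", "r");

  if (maps == NULL)
    return -1;
  while (fgets (line, sizeof (line), maps) != NULL)
    if (parse_maps_line (line, &start, &end, perms,
                         path, sizeof (path)) == 0) {
      printf ("%lx-%lx %s %s\n", start, end, perms,
              path[0] != '\0' ? path : "(no file)");
      ++count;
    }
  fclose (maps);
  return count;
}
```

Calling print_own_maps from main lists the executable, its shared libraries, and the other mappings of the running program.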

7.2.1 /proc/self

One additional entry in the /proc file system makes it easy for a program to use /proc to find information about its own process. The entry /proc/self is a symbolic link to the /proc directory corresponding to the current process. The destination of the /proc/self link depends on which process looks at it: Each process sees its own process directory as the target of the link.

For example, the program in Listing 7.2 reads the target of the /proc/self link to determine its process ID. (We're doing it this way for illustrative purposes only; calling the getpid function, described in Chapter 3, "Processes," in Section 3.1.1, "Process IDs," is a much easier way to do the same thing.) This program uses the readlink system call, described in Section 8.11, "readlink: Reading Symbolic Links," to extract the target of the symbolic link.

Listing 7.2 (get-pid.c) Obtain the Process ID from /proc/self
#include <stdio.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
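/*  Returns the process ID of the calling process, as determined from 
     the /proc/self symlink, or -1 on error.  */ 
 
pid_t get_pid_from_proc_self () 
{
   char target[32]; 
   ssize_t length; 
   int pid; 
   /*  Read the target of the symbolic link.  readlink does not 
       NUL-terminate the buffer, so leave room to do it here.  */ 
   length = readlink ("/proc/self", target, sizeof (target) - 1); 
   if (length < 0) 
     return (pid_t) -1; 
   target[length] = '\0'; 
   /*  The target is a directory named for the process ID.  */ 
   if (sscanf (target, "%d", &pid) != 1) 
     return (pid_t) -1; 
   return (pid_t) pid; 
} 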
 
int main () 
{
   printf ("/proc/self reports process id %d\n", 
           (int) get_pid_from_proc_self ()); 
   printf ("getpid() reports process id %d\n", (int) getpid ()); 
   return 0; 
} 

7.2.2 Process Argument List

The cmdline entry contains the process argument list (see Chapter 2, "Writing Good GNU/Linux Software," Section 2.1.1, "The Argument List" ). The arguments are presented as a single character string, with arguments separated by NULs. Most string functions expect that the entire character string is terminated with a single NUL and will not handle NULs embedded within strings, so you'll have to handle the contents specially.

NUL vs. NULL

NUL is the character with integer value 0. It's different from NULL, which is a pointer with value 0.

In C, a character string is usually terminated with a NUL character. For instance, the character string "Hello, world!" occupies 14 bytes because there is an implicit NUL after the exclamation point indicating the end of the string.

NULL, on the other hand, is a pointer value that you can be sure will never correspond to a real memory address in your program.

In C and C++, NUL is expressed as the character constant '\0', or (char) 0. The definition of NULL differs among operating systems; on Linux, it is defined as ((void*)0) in C and simply 0 in C++.
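
To make the effect of embedded NULs concrete, the following sketch counts the NUL-terminated strings packed into a buffer of known length. It uses memchr, which takes an explicit length, rather than relying on strlen alone. The function name count_packed_strings is a hypothetical helper introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Counts the strings packed into the first LENGTH bytes of BUFFER,
   where each string is followed by a NUL.  A final string that is not
   NUL-terminated within LENGTH is still counted.  (Hypothetical
   helper.)  */
size_t count_packed_strings (const char* buffer, size_t length)
{
  const char* end = buffer + length;
  const char* next = buffer;
  size_t count = 0;

  while (next < end) {
    /* Look for the NUL that ends this string, without reading past
       the end of the buffer.  */
    const char* nul = memchr (next, '\0', (size_t) (end - next));
    ++count;
    if (nul == NULL)
      break;
    next = nul + 1;
  }
  return count;
}
```

For a six-byte buffer holding "ab", NUL, "cd", NUL, strlen reports 2 because it stops at the first NUL, while count_packed_strings reports two strings.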

In Section 2.1.1, we presented a program in Listing 2.1 that printed out its own argument list. Using the cmdline entries in the /proc file system, we can implement a program that prints the argument list of another process. Listing 7.3 is such a program; it prints the argument list of the process with the specified process ID. Because there may be several NULs in the contents of cmdline rather than a single one at the end, we can't determine the length of the string with strlen (which simply counts the number of characters until it encounters a NUL). Instead, we determine the length of cmdline from read, which returns the number of bytes that were read.

Listing 7.3 (print-arg-list.c) Print the Argument List of a Running Process
#include <fcntl.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 
#include <sys/stat.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
/* Prints the argument list, one argument to a line, of the process 
   given by PID.  */ 
 
void print_process_arg_list (pid_t pid) 
{
  int fd; 
  char filename[24]; 
  char arg_list[1024]; 
  ssize_t length; 
  char* next_arg; 
 
  /* Generate the name of the cmdline file for the process.  */ 
  snprintf (filename, sizeof (filename), "/proc/%d/cmdline", (int) pid); 
  /* Read the contents of the file, leaving room for a terminating NUL.  */ 
  fd = open (filename, O_RDONLY); 
  if (fd < 0) 
    return; 
  length = read (fd, arg_list, sizeof (arg_list) - 1); 
  close (fd); 
  if (length <= 0) 
    return; 
  /* read does not NUL-terminate the buffer, so do it here.  */ 
  arg_list[length] = '\0'; 
 
  /* Loop over arguments. Arguments are separated by NULs.  */ 
  next_arg = arg_list; 
  while (next_arg < arg_list + length) {
       /* Print the argument. Each is NUL-terminated, so just treat it 
          like an ordinary string.  */ 
       printf ("%s\n", next_arg); 
       /* Advance to the next argument. Since each argument is 
          NUL-terminated, strlen counts the length of the next argument, 
          not the entire argument list.  */ 
       next_arg += strlen (next_arg) + 1; 
  } 
} 
 
int main (int argc, char* argv[]) 
{
   pid_t pid; 
   if (argc < 2) {
      fprintf (stderr, "usage: %s PID\n", argv[0]); 
      return 1; 
   } 
   pid = (pid_t) atoi (argv[1]); 
   print_process_arg_list (pid); 
   return 0; 
} 

For example, suppose that process 372 is the system logger daemon, syslogd.

 
% ps 372 
  PID TTY      STAT   TIME COMMAND 
  372 ?        S      0:00 syslogd -m 0 
% ./print-arg-list 372 
syslogd 
-m 
0 

In this case, syslogd was invoked with the arguments -m 0.

7.2.3 Process Environment

The environ entry contains a process's environment (see Section 2.1.6, "The Environment" ). As with cmdline, the individual environment variables are separated by NULs. The format of each element is the same as that used in the environ variable, namely VARIABLE=value .

Listing 7.4 presents a generalization of the program in Listing 2.3 in Section 2.1.6. This version takes a process ID number on its command line and prints the environment for that process by reading it from /proc.

Listing 7.4 (print-environment.c) Display the Environment of a Process
#include <fcntl.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 
#include <sys/stat.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
/* Prints the environment, one environment variable to a line, of the 
   process given by PID.  */ 
 
void print_process_environment (pid_t pid) 
{
  int fd; 
  char filename[24]; 
  char environment[8192]; 
  ssize_t length; 
  char* next_var; 
 
  /* Generate the name of the environ file for the process.  */ 
  snprintf (filename, sizeof (filename), "/proc/%d/environ", (int) pid); 
  /* Read the contents of the file, leaving room for a terminating NUL.  */ 
  fd = open (filename, O_RDONLY); 
  if (fd < 0) 
    return; 
  length = read (fd, environment, sizeof (environment) - 1); 
  close (fd); 
  if (length <= 0) 
    return; 
  /* read does not NUL-terminate the buffer, so do it here.  */ 
  environment[length] = '\0'; 
  /* Loop over variables. Variables are separated by NULs.  */ 
  next_var = environment; 
  while (next_var < environment + length) {
    /* Print the variable. Each is NUL-terminated, so just treat it 
       like an ordinary string. */ 
    printf ("%s\n", next_var); 
    /* Advance to the next variable. Since each variable is 
       NUL-terminated, strlen counts the length of the next variable, 
       not the entire variable list. */ 
    next_var += strlen (next_var) + 1; 
  } 
} 
 
int main (int argc, char* argv[]) 
{
  pid_t pid; 
  if (argc < 2) {
    fprintf (stderr, "usage: %s PID\n", argv[0]); 
    return 1; 
  } 
  pid = (pid_t) atoi (argv[1]); 
  print_process_environment (pid); 
  return 0; 
} 
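
Because each element has the form VARIABLE=value, a program can also look up a single variable in the data read from environ. The following is a minimal sketch; it assumes the buffer has already been filled from an environ entry, and the function name find_environment_value is a hypothetical helper introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Searches the first LENGTH bytes of ENVIRONMENT, which holds
   NUL-separated elements of the form VARIABLE=value, for the variable
   NAME.  Returns a pointer to its NUL-terminated value, or NULL if the
   variable is not present.  (Hypothetical helper.)  */
const char* find_environment_value (const char* environment, size_t length,
                                    const char* name)
{
  const char* end = environment + length;
  const char* next_var = environment;
  size_t name_length = strlen (name);

  while (next_var < end) {
    /* Find the NUL that ends this element.  */
    const char* nul = memchr (next_var, '\0', (size_t) (end - next_var));
    if (nul == NULL)
      /* The last element was cut off; ignore it.  */
      break;
    /* The element matches if it starts with NAME followed by '='.  */
    if ((size_t) (nul - next_var) > name_length
        && strncmp (next_var, name, name_length) == 0
        && next_var[name_length] == '=')
      return next_var + name_length + 1;
    next_var = nul + 1;
  }
  return NULL;
}
```

For example, given a buffer containing HOME=/root and PATH=/bin, looking up PATH yields a pointer to the string /bin, while looking up HOM yields NULL because the name must be followed immediately by an equal sign.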

7.2.4 Process Executable

The exe entry points to the executable file being run in a process. In Section 2.1.1, we explained that typically the program executable name is passed as the first element of the argument list. Note, though, that this is purely conventional; a program may be invoked with any argument list. Using the exe entry in the /proc file system is a more reliable way to determine which executable is running.

One useful technique is to extract the path containing the executable from the /proc file system. For many programs, auxiliary files are installed in directories with known paths relative to the main program executable, so it's necessary to determine where that executable actually is. The function get_executable_path in Listing 7.5 determines the path of the executable running in the calling process by examining the symbolic link /proc/self/exe.

Listing 7.5 (get-exe-path.c) Get the Path of the Currently Running Program Executable
#include <limits.h> 
#include <stdio.h> 
#include <string.h> 
#include <unistd.h> 
 
/* Finds the path containing the currently running program executable. 
   The path is placed into BUFFER, which is of length LEN.  Returns 
   the number of characters in the path, or -1 on error.  */ 
ssize_t get_executable_path (char* buffer, size_t len) 
{
  char* path_end; 
  ssize_t length; 
  if (len == 0) 
    return -1; 
  /* Read the target of /proc/self/exe.  readlink does not NUL-terminate 
     the buffer, so leave room for a NUL and add it here.  */ 
  length = readlink ("/proc/self/exe", buffer, len - 1); 
  if (length <= 0) 
    return -1; 
  buffer[length] = '\0'; 
  /* Find the last occurrence of a forward slash, the path separator.  */ 
  path_end = strrchr (buffer, '/'); 
  if (path_end == NULL) 
    return -1; 
  /* Advance to the character past the last slash.  */ 
  ++path_end; 
  /* Obtain the directory containing the program by truncating the 
     path after the last slash.  */ 
  *path_end = '\0'; 
  /* The length of the path is the number of characters up through the 
     last slash.  */ 
  return (ssize_t) (path_end - buffer); 
} 
 
int main () 
{
  char path[PATH_MAX]; 
  if (get_executable_path (path, sizeof (path)) < 0) {
    fprintf (stderr, "cannot determine the executable path\n"); 
    return 1; 
  } 
  printf ("this program is in the directory %s\n", path); 
  return 0; 
} 

7.2.5 Process File Descriptors

The fd entry is a subdirectory that contains entries for the file descriptors opened by a process. Each entry is a symbolic link to the file or device opened on that file descriptor. You can write to or read from these symbolic links; this writes to or reads from the corresponding file or device opened in the target process. The entries in the fd subdirectory are named by the file descriptor numbers.

Here's a neat trick you can try with fd entries in /proc. Open a new window, and find the process ID of the shell process by running ps.

 
% ps 
  PID TTY          TIME CMD 
 1261 pts/4    00:00:00 bash 
 2455 pts/4    00:00:00 ps 

In this case, the shell (bash) is running in process 1261. Now open a second window, and look at the contents of the fd subdirectory for that process.

 
% ls -l /proc/1261/fd 
total 0 
lrwx------    1 samuel   samuel         64 Jan 30 01:02 0 -> /dev/pts/4 
lrwx------    1 samuel   samuel         64 Jan 30 01:02 1 -> /dev/pts/4 
lrwx------    1 samuel   samuel         64 Jan 30 01:02 2 -> /dev/pts/4 

(There may be other lines of output corresponding to other open file descriptors as well.) Recall that we mentioned in Section 2.1.4, "Standard I/O," that file descriptors 0, 1, and 2 are initialized to standard input, output, and error, respectively. Thus, by writing to /proc/1261/fd/1, you can write to the device attached to stdout for the shell process—in this case, the pseudo TTY in the first window. In the second window, try writing a message to that file:

 
% echo "Hello, world." >> /proc/1261/fd/1 

The text appears in the first window.

File descriptors besides standard input, output, and error appear in the fd subdirectory, too. Listing 7.6 presents a program that simply opens a file descriptor to a file specified on the command line and then loops forever.

Listing 7.6 (open-and-spin.c) Open a File for Reading
#include <fcntl.h> 
#include <stdio.h> 
#include <sys/stat.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
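int main (int argc, char* argv[]) 
{
   const char* filename; 
   int fd; 
   if (argc < 2) {
      fprintf (stderr, "usage: %s FILE\n", argv[0]); 
      return 1; 
   } 
   filename = argv[1]; 
   fd = open (filename, O_RDONLY); 
   if (fd < 0) {
      perror (filename); 
      return 1; 
   } 
   printf ("in process %d, file descriptor %d is open to %s\n", 
           (int) getpid (), fd, filename); 
   while (1); 
   return 0; 
} 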

Try running it in one window:

 
%  ./open-and-spin /etc/fstab 
in process 2570, file descriptor 3 is open to /etc/fstab 

In another window, take a look at the fd subdirectory corresponding to this process in /proc.

 
% ls -l /proc/2570/fd 
total 0 
lrwx------    1 samuel   samuel         64 Jan 30 01:30 0 -> /dev/pts/2 
lrwx------    1 samuel   samuel         64 Jan 30 01:30 1 -> /dev/pts/2 
lrwx------    1 samuel   samuel         64 Jan 30 01:30 2 -> /dev/pts/2 
lr-x------    1 samuel   samuel         64 Jan 30 01:30 3 -> /etc/fstab 

Notice the entry for file descriptor 3, linked to the file /etc/fstab opened on this descriptor.

File descriptors can be opened on sockets or pipes, too (see Chapter 5 for more information about these). In such a case, the target of the symbolic link does not name an ordinary file or device; instead, it reads something like "socket:[12345]" or "pipe:[12345]", where the bracketed number is the inode number of the socket or pipe.
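
The same information that ls -l displays can be gathered from a program by scanning the fd subdirectory and calling readlink on each entry. The following is a minimal sketch; the function names classify_fd_target and print_fd_targets are hypothetical, and the classification assumes link targets that begin with "socket:" or "pipe:" for sockets and pipes.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Describes the kind of object named by TARGET, the target of a link
   in an fd subdirectory.  (Hypothetical helper.)  */
const char* classify_fd_target (const char* target)
{
  if (strncmp (target, "socket:", 7) == 0)
    return "socket";
  if (strncmp (target, "pipe:", 5) == 0)
    return "pipe";
  if (target[0] == '/')
    return "file or device";
  return "other";
}

/* Prints the target of each file descriptor opened by process PID.
   Returns the number of descriptors printed, or -1 if the fd
   subdirectory could not be opened.  (Hypothetical helper.)  */
int print_fd_targets (pid_t pid)
{
  char dir_name[32];
  char link_name[320];
  char target[1024];
  DIR* dir;
  struct dirent* entry;
  int count = 0;

  snprintf (dir_name, sizeof (dir_name), "/proc/%d/fd", (int) pid);
  dir = opendir (dir_name);
  if (dir == NULL)
    return -1;
  while ((entry = readdir (dir)) != NULL) {
    ssize_t length;
    snprintf (link_name, sizeof (link_name), "%s/%s",
              dir_name, entry->d_name);
    /* readlink does not NUL-terminate, so leave room to do it here.  */
    length = readlink (link_name, target, sizeof (target) - 1);
    if (length < 0)
      /* Not a symbolic link (for instance, "." or "..").  */
      continue;
    target[length] = '\0';
    printf ("%s -> %s (%s)\n", entry->d_name, target,
            classify_fd_target (target));
    ++count;
  }
  closedir (dir);
  return count;
}
```

When a process examines its own descriptors with print_fd_targets (getpid ()), the output includes one extra descriptor: the one that opendir itself opened on the fd subdirectory.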

7.2.6 Process Memory Statistics

The statm entry contains a list of seven numbers, separated by spaces. Each number is a count of the number of pages of memory used by the process in a particular category. The categories, in the order the numbers appear, are listed here:

·         The total process size

·         The size of the process resident in physical memory

·         The memory shared with other processes—that is, memory mapped both by this process and at least one other (such as shared libraries or untouched copy-on-write pages)

·         The text size of the process—that is, the size of loaded executable code

·         The size of shared libraries mapped into this process

·         The memory used by this process for its stack

·         The number of dirty pages—that is, pages of memory that have been modified by the program
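
Since statm is a single line of seven page counts, it can be read with one sscanf call and converted to bytes using the system page size. The following is a minimal sketch; the names STATM_FIELDS, parse_statm, and print_own_statm are hypothetical. Note that some kernels may report 0 for some of these fields, so treat the individual values with care.

```c
#include <stdio.h>
#include <unistd.h>

#define STATM_FIELDS 7

/* Parses the contents of a statm entry from TEXT into PAGES.  Returns
   0 if all seven numbers were read, or -1 otherwise.  (Hypothetical
   helper.)  */
int parse_statm (const char* text, unsigned long pages[STATM_FIELDS])
{
  int fields = sscanf (text, "%lu %lu %lu %lu %lu %lu %lu",
                       &pages[0], &pages[1], &pages[2], &pages[3],
                       &pages[4], &pages[5], &pages[6]);
  return fields == STATM_FIELDS ? 0 : -1;
}

/* Prints the total and resident sizes of the calling process, in pages
   and in kilobytes.  Returns 0 on success, or -1 on error.
   (Hypothetical helper.)  */
int print_own_statm (void)
{
  char text[256];
  unsigned long pages[STATM_FIELDS];
  long page_size = sysconf (_SC_PAGESIZE);
  FILE* statm;

  if (page_size <= 0)
    return -1;
  statm = fopen ("/proc/self/statm", "r");
  if (statm == NULL)
    return -1;
  if (fgets (text, sizeof (text), statm) == NULL) {
    fclose (statm);
    return -1;
  }
  fclose (statm);
  if (parse_statm (text, pages) != 0)
    return -1;
  printf ("total size: %lu pages (%lu KB)\n", pages[0],
          pages[0] * ((unsigned long) page_size / 1024));
  printf ("resident:   %lu pages (%lu KB)\n", pages[1],
          pages[1] * ((unsigned long) page_size / 1024));
  return 0;
}
```

The page size is obtained with sysconf rather than hard-coded because it differs among architectures.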

7.2.7 Process Statistics

The status entry contains a variety of information about the process, formatted for comprehension by humans. This information includes the process ID and parent process ID, the real and effective user and group IDs, memory usage, and bit masks specifying which signals are caught, ignored, and blocked.
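
Although status is meant for human readers, each line has the form of a field name, a colon, and a value, so a program can pick out individual fields. The following is a minimal sketch under that assumption; the function names status_field and print_own_status_field are hypothetical, and field names such as "PPid" should be checked against the output on your own system.

```c
#include <stdio.h>
#include <string.h>

/* If LINE has the form "KEY:" followed by whitespace and a value,
   copies the value (without the trailing newline) into VALUE, which is
   of length VALUE_LEN, and returns 0.  Otherwise returns -1.
   (Hypothetical helper.)  */
int status_field (const char* line, const char* key,
                  char* value, size_t value_len)
{
  size_t key_length = strlen (key);
  size_t length;

  if (value_len == 0
      || strncmp (line, key, key_length) != 0
      || line[key_length] != ':')
    return -1;
  /* Skip the key, the colon, and the padding before the value.  */
  line += key_length + 1;
  line += strspn (line, " \t");
  length = strcspn (line, "\n");
  if (length >= value_len)
    length = value_len - 1;
  memcpy (value, line, length);
  value[length] = '\0';
  return 0;
}

/* Prints the field KEY from the status entry of the calling process.
   Returns 0 if the field was found, or -1 otherwise.  (Hypothetical
   helper.)  */
int print_own_status_field (const char* key)
{
  char line[256];
  char value[256];
  int found = -1;
  FILE* status = fopen ("/proc/self/status", "r");

  if (status == NULL)
    return -1;
  while (found != 0 && fgets (line, sizeof (line), status) != NULL)
    found = status_field (line, key, value, sizeof (value));
  fclose (status);
  if (found == 0)
    printf ("%s: %s\n", key, value);
  return found;
}
```

For example, print_own_status_field ("PPid") prints the parent process ID of the calling program. The check for the colon immediately after the key keeps a request for "Pid" from matching the "PPid" line.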