3.2 Creating Processes

Two common techniques are used for creating a new process. The first is relatively simple but should be used sparingly because it is inefficient and has considerable security risks. The second technique is more complex but provides greater flexibility, speed, and security.

3.2.4 Using system

The system function in the standard C library provides an easy way to execute a command from within a program, much as if the command had been typed into a shell. In fact, system creates a subprocess running the standard Bourne shell (/bin/sh) and hands the command to that shell for execution. For example, the program in Listing 3.2 invokes the ls command to display the contents of the root directory, as if you had typed ls -l / into a shell.

Listing 3.2 (system.c) Using the system Call
#include <stdlib.h> 
 
int main ( ) 
{
 int return_value ; 
 return_value = system ( "ls -l /" ); 
 return return_value; 
} 

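The system function returns the exit status of the shell command, encoded as a wait status rather than as a plain exit code. If the shell itself cannot be run, the status is the same as if the shell had exited with code 127; if another error occurs, system returns -1.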
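The following sketch shows one way to decode the value returned by system, using the WIFEXITED and WEXITSTATUS macros from <sys/wait.h>, which interpret the same status format that wait reports. The helper name run_command is an assumption made for illustration; it is not part of the standard library.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run COMMAND through system and return the command's exit code, or
   -1 if the command could not be run or was ended by a signal.
   run_command is a hypothetical helper name.  */
int run_command (const char* command)
{
  int status = system (command);

  if (status == -1)
    /* The child process could not be created or waited for.  */
    return -1;
  if (WIFEXITED (status))
    /* Normal termination: extract the exit code from the status.  */
    return WEXITSTATUS (status);
  /* The command was terminated by a signal.  */
  return -1;
}
```

With this helper, run_command ("exit 3") yields 3, because the shell itself exits with that code. A result of 127 is ambiguous: it can mean that the shell could not be run, or that the command really exited with 127.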

Because the system function uses a shell to invoke your command, it's subject to the features, limitations, and security flaws of the system's shell. You can't rely on the availability of any particular version of the Bourne shell. On many UNIX systems, /bin/sh is a symbolic link to another shell. For instance, on most GNU/Linux systems, /bin/sh points to bash (the Bourne-Again SHell), and different GNU/Linux distributions use different versions of bash. Invoking a program with root privilege through the system function, for example, can have different results on different GNU/Linux systems. Therefore, it's preferable to use the fork and exec method for creating processes.

3.2.5 Using fork and exec

The DOS and Windows API contains the spawn family of functions. These functions take as an argument the name of a program to run and create a new process instance of that program. Linux doesn't contain a single function that does all this in one step. Instead, Linux provides one function, fork, that makes a child process that is an exact copy of its parent process. Linux provides another set of functions, the exec family, that causes a particular process to cease being an instance of one program and to instead become an instance of another program. To spawn a new process, you first use fork to make a copy of the current process. Then you use exec to transform one of these processes into an instance of the program you want to spawn.

Calling fork

When a program calls fork, a duplicate process, called the child process, is created. The parent process continues executing the program from the point that fork was called. The child process, too, executes the same program from the same place.

So how do the two processes differ? First, the child process is a new process and therefore has a new process ID, distinct from its parent's process ID. One way for a program to distinguish whether it's in the parent process or the child process is to call getpid. However, the fork function provides different return values to the parent and child processes: one process "goes in" to the fork call, and two processes "come out," with different return values. The return value in the parent process is the process ID of the child. The return value in the child process is zero. Because no process ever has a process ID of zero, this makes it easy for the program to determine whether it is now running as the parent or the child process.

Listing 3.3 is an example of using fork to duplicate a program's process. Note that the first block of the if statement is executed only in the parent process, while the else clause is executed in the child process.

Listing 3.3 (fork.c) Using fork to Duplicate a Program's Process
#include <stdio.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
int main ()
{
  pid_t child_pid;

  printf ("the main program process ID is %d\n", (int) getpid ());

  child_pid = fork ();
  if (child_pid != 0) {
    printf ("this is the parent process, with id %d\n", (int) getpid ());
    printf ("the child's process ID is %d\n", (int) child_pid);
  }
  else
    printf ("this is the child process, with id %d\n", (int) getpid ());

  return 0;
} 
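
The fork function has a third possible outcome: it returns -1 in the calling process if no child could be created. The sketch below is one way to handle all three cases. The helper name fork_checked is an assumption made for illustration, and the child exits immediately so that only the parent returns to the caller.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fork a child that prints its process ID and exits at once.
   Returns the child's process ID in the parent, or -1 if fork
   failed.  fork_checked is a hypothetical helper name.  */
pid_t fork_checked (void)
{
  pid_t child_pid;

  /* Flush pending output so that the child does not inherit, and
     later repeat, text still held in the stdio buffer.  */
  fflush (stdout);

  child_pid = fork ();
  if (child_pid == -1) {
    /* No child was created; errno describes the reason.  */
    perror ("fork");
    return -1;
  }
  if (child_pid == 0) {
    /* Child: report, then leave without returning to the caller.  */
    printf ("child running with id %d\n", (int) getpid ());
    fflush (stdout);
    _exit (0);
  }
  /* Parent: child_pid holds the new process's ID.  */
  return child_pid;
}
```

This sketch does not wait for the child, so the child's exit status is never collected here.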

Using the exec Family

The exec functions replace the program running in a process with another program. When a program calls an exec function, that process immediately ceases executing that program and begins executing a new program from the beginning, assuming that the exec call doesn't encounter an error.

Within the exec family, there are functions that vary slightly in their capabilities and how they are called.

·         Functions that contain the letter p in their names (execvp and execlp) accept a program name and search for a program by that name in the current execution path; functions that don't contain the p must be given the full path of the program to be executed.

·         Functions that contain the letter v in their names (execv, execvp, and execve) accept the argument list for the new program as a NULL-terminated array of pointers to strings. Functions that contain the letter l (execl, execlp, and execle) accept the argument list using the C language's varargs mechanism.

·         Functions that contain the letter e in their names (execve and execle) accept an additional argument, an array of environment variables. The argument should be a NULL-terminated array of pointers to character strings. Each character string should be of the form "VARIABLE=value".

Because exec replaces the calling program with another one, it never returns unless an error occurs.

The argument list passed to the program is analogous to the command-line arguments that you specify to a program when you run it from the shell. They are available through the argc and argv parameters to main. Remember, when a program is invoked from the shell, the shell sets the first element of the argument list (argv[0]) to the name of the program, the second element of the argument list (argv[1]) to the first command-line argument, and so on. When you use an exec function in your programs, you, too, should pass the name of the program as the first element of the argument list.
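
As a rough illustration of the naming scheme, the two helpers below call an l variant and an e variant. Both helper names, and the LANG=C environment entry, are assumptions chosen for the example. Each function returns only when the exec call fails, for instance because the given path does not exist.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace the calling process with PROGRAM, which must be a full
   path because execl does not search the path.  The argument list is
   written inline and ends with a NULL pointer.  Returns -1 only if
   the exec failed.  Hypothetical helper name.  */
int exec_long_listing (const char* program)
{
  /* The first list entry becomes argv[0] in the new program.  */
  execl (program, program, "-l", "/", (char*) NULL);

  /* Reaching this line means that execl failed.  */
  fprintf (stderr, "cannot execute %s: %s\n", program, strerror (errno));
  return -1;
}

/* Replace the calling process with PROGRAM, giving it an argument
   array and an explicit environment.  Returns -1 only if the exec
   failed.  Hypothetical helper name.  */
int exec_with_environment (const char* program)
{
  char* arg_list[] = { (char*) program, NULL };
  /* Each entry has the form "VARIABLE=value".  */
  char* env_list[] = { "LANG=C", NULL };

  execve (program, arg_list, env_list);

  /* Reaching this line means that execve failed.  */
  fprintf (stderr, "cannot execute %s: %s\n", program, strerror (errno));
  return -1;
}
```

Calling exec_long_listing ("/bin/ls") would, if that path exists on the system, turn the calling process into ls -l / and never return.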

Using fork and exec Together

A common pattern to run a subprogram within a program is first to fork the process and then exec the subprogram. This allows the calling program to continue execution in the parent process while the calling program is replaced by the subprogram in the child process.

The program in Listing 3.4, like Listing 3.2, lists the contents of the root directory using the ls command. Unlike the previous example, though, it invokes the ls command directly, passing it the command-line arguments -l and / rather than invoking it through a shell.

Listing 3.4 (fork-exec.c) Using fork and exec Together
#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
/* Spawn a child process running a new program.  PROGRAM is the name
   of the program to run; the path will be searched for this program.
   ARG_LIST is a NULL-terminated list of character strings to be
   passed as the program's argument list.  Returns the process ID of
   the spawned process.  */
 
int spawn (char* program, char** arg_list) 
{
  pid_t child_pid; 
 
  /* Duplicate this process. */ 
  child_pid = fork (); 
  if (child_pid != 0) 
    /* This is the parent process. */ 
    return child_pid; 
  else {
    /* Now execute PROGRAM, searching for it in the path.  */
    execvp (program, arg_list);
    /* The execvp function returns only if an error occurs.  */
    fprintf (stderr, "an error occurred in execvp\n");
    abort (); 
  } 
} 
 
int main () 
{
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 
 
  /* Spawn a child process running the "ls" command. Ignore the 
     returned child process ID.  */ 
  spawn ("ls", arg_list); 
 
  printf ("done with main program\n"); 
 
  return 0; 
} 
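
Because no shell is involved, each entry in the argument list reaches the new program verbatim, even if it contains spaces or characters such as ; or * that a shell would interpret. The variant below sketches this for a caller-supplied directory name. The helper name spawn_listing is an assumption, and the exit code 127 in the failure path is an arbitrary choice that mirrors the value shells conventionally report when a command cannot be found.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Start "ls -l DIRECTORY" in a child process without involving a
   shell.  Returns the child's process ID, or -1 if fork failed.
   spawn_listing is a hypothetical helper, not part of Listing 3.4.  */
pid_t spawn_listing (const char* directory)
{
  /* DIRECTORY occupies exactly one slot, so ls receives it as a
     single argument no matter which characters it contains.  */
  char* arg_list[] = { "ls", "-l", (char*) directory, NULL };
  pid_t child_pid;

  /* Flush first so the child does not inherit buffered output.  */
  fflush (stdout);

  child_pid = fork ();
  if (child_pid == -1) {
    /* No child was created.  */
    perror ("fork");
    return -1;
  }
  if (child_pid == 0) {
    execvp ("ls", arg_list);
    /* Only reached if execvp failed.  */
    perror ("execvp");
    _exit (127);
  }
  /* Parent: hand the child's process ID back to the caller.  */
  return child_pid;
}
```

The child calls _exit rather than returning, so a failed exec cannot continue running a second copy of the parent's code.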

3.2.6 Process Scheduling

Linux schedules the parent and child processes independently; there's no guarantee of which one will run first, or how long it will run before Linux interrupts it and lets the other process (or some other process on the system) run. In particular, none, part, or all of the ls command may run in the child process before the parent completes. [2] Linux promises that each process will run eventually—no process will be completely starved of execution resources.

[2] A method for serializing the two processes is presented in Section 3.4.1, "Waiting for Process Termination."

You may specify that a process is less important, and should be given a lower priority, by assigning it a higher niceness value. By default, every process has a niceness of zero. A higher niceness value means that the process is given a lesser execution priority; conversely, a process with a lower (that is, negative) niceness gets more execution time.

To run a program with a nonzero niceness, use the nice command, specifying the niceness value with the -n option. For example, this is how you might invoke the command "sort input.txt > output.txt", a long sorting operation, with a reduced priority so that it doesn't slow down the system too much:

 
% nice -n 10 sort input.txt > output.txt

You can use the renice command to change the niceness of a running process from the command line.

To change the niceness of a running process programmatically, use the nice function. Its argument is an increment value, which is added to the niceness value of the process that calls it. Remember that a positive value raises the niceness value and thus reduces the process's execution priority.
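
A minimal sketch of calling nice from a program follows. The helper name lower_priority is an assumption made for illustration. Because -1 can be a legitimate niceness value, the sketch clears errno before the call and treats -1 as a failure only when errno has been set.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Add INCREMENT to the niceness of the calling process.  Returns 0
   on success and -1 on failure.  Hypothetical helper name.  */
int lower_priority (int increment)
{
  int new_niceness;

  /* Clear errno so that a genuine failure can be told apart from a
     new niceness value that happens to be -1.  */
  errno = 0;
  new_niceness = nice (increment);
  if (new_niceness == -1 && errno != 0) {
    perror ("nice");
    return -1;
  }

  printf ("niceness is now %d\n", new_niceness);
  return 0;
}
```

A program about to begin a long background computation might call lower_priority (10) first, which has an effect comparable to starting it with nice -n 10.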

Note that only a process with root privilege can run a process with a negative niceness value or reduce the niceness value of a running process. This means that you may specify negative values to the nice and renice commands only when logged in as root, and only a process running as root can pass a negative value to the nice function. This prevents ordinary users from grabbing execution priority away from others using the system.