4.5 GNU/Linux Thread Implementation

The implementation of POSIX threads on GNU/Linux differs from the thread implementation on many other UNIX-like systems in an important way: on GNU/Linux, threads are implemented as processes. Whenever you call pthread_create to create a new thread, Linux creates a new process that runs that thread. However, this process is not the same as a process you would create with fork; in particular, it shares the same address space and resources as the original process rather than receiving copies.

The program thread-pid shown in Listing 4.15 demonstrates this. The program creates a thread; both the original thread and the new one call the getpid function and print their respective process IDs and then spin infinitely.

Listing 4.15 (thread-pid) Print Process IDs for Threads
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Thread entry point: print this thread's pid, then spin.  */
void* thread_function (void* arg)
{
  fprintf (stderr, "child thread pid is %d\n", (int) getpid ());
  /* Spin forever.  */
  while (1);
  return NULL;
}

int main ()
{
  pthread_t thread;
  fprintf (stderr, "main thread pid is %d\n", (int) getpid ());
  /* Create the second thread; give up if that fails.  */
  if (pthread_create (&thread, NULL, &thread_function, NULL) != 0)
    return 1;
  /* Spin forever.  */
  while (1);
  return 0;
}

Compile the program, run it in the background, and then invoke ps x to display your running processes. Don't forget to kill the thread-pid program afterward, because it consumes lots of CPU doing nothing. Here's what the output might look like:

 
% cc thread-pid.c -o thread-pid -lpthread 
% ./thread-pid & 
[1] 14608 
main thread pid is 14608 
child thread pid is 14610 
% ps x 
  PID TTY      STAT   TIME COMMAND
14042 pts/9    S      0:00 bash
14608 pts/9    R      0:01 ./thread-pid
14609 pts/9    S      0:00 ./thread-pid
14610 pts/9    R      0:01 ./thread-pid
14611 pts/9    R      0:00 ps x
% kill 14608 
[1]+  Terminated              ./thread-pid 

Job Control Notification in the Shell

The lines starting with [1] are from the shell. When you run a program in the background, the shell assigns it a job number (in this case, 1) and prints the program's pid. If a background job terminates, the shell reports that fact the next time you invoke a command.

Notice that there are three processes running the thread-pid program. The first of these, with pid 14608, is the main thread in the program; the third, with pid 14610, is the thread we created to execute thread_function.

What about the second process, with pid 14609? This is the "manager thread," which is part of the internal implementation of GNU/Linux threads. The manager thread is created the first time a program calls pthread_create to create a new thread.

4.5.1 Signal Handling

Suppose that a multithreaded program receives a signal. In which thread is the signal handler invoked? The interaction between signals and threads varies from one UNIX-like system to another. In GNU/Linux, it is dictated by the fact that threads are implemented as processes.

Because each thread is a separate process, and because a signal is delivered to a particular process, there is no ambiguity about which thread receives the signal. Typically, signals sent from outside the program are sent to the process corresponding to the main thread of the program. For instance, if a program forks and the child process execs a multithreaded program, the parent process holds the process ID of the main thread of the child's program and uses that process ID to send signals to its child. This is generally a good convention to follow yourself when sending signals to a multithreaded program.

Note that this aspect of GNU/Linux's implementation of pthreads is at variance with the POSIX thread standard. Do not rely on this behavior in programs that are meant to be portable.
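The convention described above, in which a parent signals its child through the pid that fork returned, can be sketched as follows. This is only an illustration: the function name signal_child_by_pid is hypothetical, and the child simply waits in pause as a stand-in for a child that would exec a multithreaded program.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, send it SIGNAL_NUMBER using the
   pid returned by fork, and return the number of the signal that
   terminated the child, or -1 on any failure.  */
int signal_child_by_pid (int signal_number)
{
  pid_t child_pid;
  int status;

  child_pid = fork ();
  if (child_pid == -1)
    return -1;
  if (child_pid == 0) {
    /* Child: wait for signals.  A real child might exec a
       multithreaded program here instead.  */
    while (1)
      pause ();
  }
  /* Parent: the pid returned by fork identifies the child.  */
  if (kill (child_pid, signal_number) != 0)
    return -1;
  if (waitpid (child_pid, &status, 0) == -1)
    return -1;
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

Calling signal_child_by_pid (SIGTERM) from main should return SIGTERM, assuming the signal's default disposition has not been changed.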

Within a multithreaded program, it is possible for one thread to send a signal specifically to another thread. Use the pthread_kill function to do this. Its first parameter is a thread ID, and its second parameter is a signal number.

4.5.2 The clone System Call

Although GNU/Linux threads created in the same program are implemented as separate processes, they share their virtual memory space and other resources. A child process created with fork, however, gets copies of these items. How is the former type of process created?

The Linux clone system call is a generalized form of fork and pthread_create that allows the caller to specify which resources are shared between the calling process and the newly created process. Also, clone requires you to specify the memory region for the execution stack that the new process will use. Although we mention clone here to satisfy the reader's curiosity, that system call should not ordinarily be used in programs. Use fork to create new processes or pthread_create to create threads.
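For the curious, the difference between sharing and copying memory can be observed with a short sketch built on glibc's clone wrapper. The names run_clone_child, child_function, and shared_value, and the 64KB stack size, are assumptions made for this illustration only. With the CLONE_VM flag, the child's write to a global variable is visible to the parent; without it, the child writes to its own copy, as it would after fork.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)

/* Written by the child; read by the parent after the child exits.  */
static volatile int shared_value = 0;

static int child_function (void* arg)
{
  (void) arg;
  shared_value = 42;
  return 0;
}

/* Hypothetical helper: run child_function in a process created with
   clone, using FLAGS to select what is shared.  Returns the parent's
   view of shared_value afterward, or -1 on failure.  */
int run_clone_child (int flags)
{
  char* stack;
  pid_t child;

  shared_value = 0;
  /* The caller must supply the memory for the child's stack.  */
  stack = malloc (STACK_SIZE);
  if (stack == NULL)
    return -1;
  /* Stacks grow downward on most architectures, so pass the top of
     the region.  SIGCHLD lets the parent wait for the child.  */
  child = clone (&child_function, stack + STACK_SIZE, flags | SIGCHLD, NULL);
  if (child == -1 || waitpid (child, NULL, 0) == -1) {
    free (stack);
    return -1;
  }
  free (stack);
  return shared_value;
}
```

Under these assumptions, run_clone_child (CLONE_VM) should return 42 because the two processes share one address space, while run_clone_child (0) should return 0 because the child modified only its own copy.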