4.1 Thread Creation

Each thread in a process is identified by a thread ID. When referring to thread IDs in C or C++ programs, use the type pthread_t.

Upon creation, each thread executes a thread function. This is just an ordinary function and contains the code that the thread should run. When the function returns, the thread exits. On GNU/Linux, thread functions take a single parameter, of type void*, and have a void* return type. The parameter is the thread argument: GNU/Linux passes the value along to the thread without looking at it. Your program can use this parameter to pass data to a new thread. Similarly, your program can use the return value to pass data from an exiting thread back to its creator.

The pthread_create function creates a new thread. You provide it with the following:

1.       A pointer to a pthread_t variable, in which the thread ID of the new thread is stored.

2.       A pointer to a thread attribute object. This object controls details of how the thread interacts with the rest of the program. If you pass NULL as the thread attribute, a thread will be created with the default thread attributes. Thread attributes are discussed in Section 4.1.5, "Thread Attributes."

3.       A pointer to the thread function. This is an ordinary function pointer, of this type:

void* (*) (void*)

4.       A thread argument value of type void*. Whatever you pass is simply passed as the argument to the thread function when the thread begins executing.

A call to pthread_create returns immediately, and the original thread continues executing the instructions following the call. Meanwhile, the new thread begins executing the thread function. Linux schedules both threads asynchronously, and your program must not rely on the relative order in which instructions are executed in the two threads.
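As a minimal sketch of such a call, the following fragment passes all four arguments and also checks the result. pthread_create returns 0 on success and a nonzero error number on failure; the listings in this chapter omit that check for brevity. The names greet and start_greeter are hypothetical, chosen only for illustration, and the helper waits for the thread with pthread_join, which is described in Section 4.1.2.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical thread function.  Its type matches void* (*) (void*).  */
void* greet (void* arg)
{
  const char* name = (const char*) arg;
  fprintf (stderr, "hello from %s\n", name);
  return NULL;
}

/* Hypothetical helper: starts one thread running greet and waits for it.
   Returns 0 on success, or the error number of the call that failed.  */
int start_greeter (const char* name)
{
  pthread_t thread_id;
  /* Arguments: where to store the thread ID, the attributes (NULL for
     the defaults), the thread function, and the thread argument.  */
  int rc = pthread_create (&thread_id, NULL, &greet, (void*) name);
  if (rc != 0) {
    fprintf (stderr, "pthread_create: %s\n", strerror (rc));
    return rc;
  }
  /* Wait for the new thread to finish (see Section 4.1.2).  */
  return pthread_join (thread_id, NULL);
}
```

Calling start_greeter ("worker") from main should print one line to standard error and return 0.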

The program in Listing 4.1 creates a thread that prints x's continuously to standard error. After calling pthread_create, the main thread prints o's continuously to standard error.

Listing 4.1 (thread-create.c) Create a Thread
#include <pthread.h> 
#include <stdio.h> 
 
/* Prints x's to stderr.  The parameter is unused.  Does not return.  */
 
void* print_xs (void* unused) 
{
  while (1) 
    fputc ('x', stderr); 
  return NULL; 
} 
 
/* The main program.  */ 
 
int main () 
{
  pthread_t thread_id; 
  /* Create a new thread.  The new thread will run the print_xs 
     function.  */ 
  pthread_create  (&thread_id,  NULL, &print_xs, NULL); 
  /* Print o's continuously  to stderr.  */ 
  while (1) 
    fputc ('o', stderr); 
  return 0; 
} 

Compile and link this program using the following command:

% cc -o thread-create thread-create.c -lpthread

Try running it to see what happens. Notice the unpredictable pattern of x's and o's as Linux alternately schedules the two threads.

Under normal circumstances, a thread exits in one of two ways. One way, as illustrated previously, is by returning from the thread function. The return value from the thread function is taken to be the return value of the thread. Alternately, a thread can exit explicitly by calling pthread_exit. This function may be called from within the thread function or from some other function called directly or indirectly by the thread function. The argument to pthread_exit is the thread's return value.
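The second way of exiting can be sketched as follows. Here a helper function, called by the thread function, ends the thread with pthread_exit, so the return statement in the thread function is never reached. The names early_result, late_result, finish_early, worker, and run_worker are hypothetical. The sketch collects the return value with pthread_join, which is described in Sections 4.1.2 and 4.1.3.

```c
#include <pthread.h>
#include <stddef.h>

/* Static storage, so the values remain valid after the thread exits.  */
static int early_result = 1;
static int late_result = 2;

/* Hypothetical helper called by the thread function.  It ends the
   calling thread; the argument becomes the thread's return value.  */
static void finish_early (void)
{
  pthread_exit (&early_result);
}

/* Hypothetical thread function.  */
void* worker (void* unused)
{
  (void) unused;
  finish_early ();
  /* Not reached: pthread_exit does not return to its caller.  */
  return &late_result;
}

/* Runs worker in a new thread and returns the int that its return
   value points to, or -1 if the thread could not be created or joined.  */
int run_worker (void)
{
  pthread_t thread_id;
  void* result;

  if (pthread_create (&thread_id, NULL, &worker, NULL) != 0)
    return -1;
  if (pthread_join (thread_id, &result) != 0)
    return -1;
  return *((int*) result);
}
```

Because the thread exits inside finish_early, run_worker yields the value stored in early_result rather than late_result.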

4.1.1 Passing Data to Threads

The thread argument provides a convenient method of passing data to threads. Because the type of the argument is void*, though, you can't pass a lot of data directly via the argument. Instead, use the thread argument to pass a pointer to some structure or array of data. One commonly used technique is to define a structure for each thread function, which contains the "parameters" that the thread function expects.

Using the thread argument, it's easy to reuse the same thread function for many threads. All these threads execute the same code, but on different data.

The program in Listing 4.2 is similar to the previous example. This one creates two new threads, one to print x's and the other to print o's. Instead of printing infinitely, though, each thread prints a fixed number of characters and then exits by returning from the thread function. The same thread function, char_print, is used by both threads, but each is configured differently using struct char_print_parms.

Listing 4.2 (thread-create2.c) Create Two Threads
#include <pthread.h> 
#include <stdio.h> 
 
/* Parameters to char_print.  */
 
struct char_print_parms 
{
 /* The character to print.*/ 
 char character; 
 /* The number of times to print it.  */ 
 int count; 
}; 
 
/* Prints a number of characters to stderr, as given by PARAMETERS, 
   which is a pointer to a struct char_print_parms. */ 
 
void* char_print (void* parameters) 
{
    /* Cast the cookie pointer to the right type.  */ 
    struct char_print_parms*  p = (struct char_print_parms*) parameters; 
    int i; 
 
    for (i = 0; i < p->count; ++i) 
      fputc (p->character,stderr); 
    return NULL; 
} 
 
/* The main program. */ 
 
int main () 
{
  pthread_t thread1_id;
  pthread_t thread2_id;
  struct char_print_parms thread1_args;
  struct char_print_parms thread2_args;

  /* Create a new thread to print 30,000 x's.  */
  thread1_args.character = 'x';
  thread1_args.count = 30000;
  pthread_create (&thread1_id, NULL, &char_print, &thread1_args);

  /* Create a new thread to print 20,000 o's.  */
  thread2_args.character = 'o';
  thread2_args.count = 20000;
  pthread_create (&thread2_id, NULL, &char_print, &thread2_args);

  return 0;
} 

But wait! The program in Listing 4.2 has a serious bug in it. The main thread (which runs the main function) creates the thread parameter structures (thread1_args and thread2_args) as local variables, and then passes pointers to these structures to the threads it creates. What's to prevent Linux from scheduling the three threads in such a way that main finishes executing before either of the other two threads is done? Nothing! But if this happens, the memory containing the thread parameter structures will be deallocated while the other two threads are still accessing it.

4.1.2 Joining Threads

One solution is to force main to wait until the other two threads are done. What we need is a function similar to wait that waits for a thread to finish instead of a process. That function is pthread_join, which takes two arguments: the thread ID of the thread to wait for, and a pointer to a void* variable that will receive the finished thread's return value. If you don't care about the thread return value, pass NULL as the second argument.

Listing 4.3 shows the corrected main function for the buggy example in Listing 4.2. In this version, main does not exit until both of the threads printing x's and o's have completed, so they are no longer using the argument structures.

Listing 4.3 Revised Main Function for thread-create2.c
int main () 
{
  pthread_t thread1_id; 
  pthread_t thread2_id; 
  struct char_print_parms thread1_args; 
  struct char_print_parms thread2_args; 
  /* Create a new thread to print 30,000 x's.  */
  thread1_args.character = 'x';
  thread1_args.count = 30000;
  pthread_create (&thread1_id, NULL, &char_print, &thread1_args);

  /* Create a new thread to print 20,000 o's.  */
  thread2_args.character = 'o';
  thread2_args.count = 20000;
  pthread_create (&thread2_id, NULL, &char_print, &thread2_args);

  /* Make sure the first thread has finished.  */
  pthread_join (thread1_id, NULL);
  /* Make sure the second thread has finished.  */
  pthread_join (thread2_id, NULL);

  /* Now we can safely return.  */
  return 0;
} 

The moral of the story: Make sure that any data you pass to a thread by reference is not deallocated, even by a different thread, until you're sure that the thread is done with it. This is true both for local variables, which are deallocated when they go out of scope, and for heap-allocated variables, which you deallocate by calling free (or using delete in C++).
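For heap-allocated data, one possible arrangement is to hand ownership of the parameter structure to the new thread, which frees it when done. The following is only a sketch of that idea, not a replacement for Listing 4.3. The names char_print_owned and start_printer are hypothetical, and the structure is repeated from Listing 4.2 so that the fragment stands alone.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Same layout as the structure in Listing 4.2.  */
struct char_print_parms
{
  char character;
  int count;
};

/* Like char_print, but takes ownership of PARAMETERS and frees it.  */
void* char_print_owned (void* parameters)
{
  struct char_print_parms* p = (struct char_print_parms*) parameters;
  int i;

  for (i = 0; i < p->count; ++i)
    fputc (p->character, stderr);
  /* The creator no longer touches this memory, so free it here.  */
  free (p);
  return NULL;
}

/* Hypothetical helper: starts a printing thread whose parameters live
   on the heap.  Returns 0 on success or an error number on failure.  */
int start_printer (pthread_t* thread_id, char character, int count)
{
  struct char_print_parms* p = malloc (sizeof (*p));
  int rc;

  if (p == NULL)
    return ENOMEM;
  p->character = character;
  p->count = count;
  rc = pthread_create (thread_id, NULL, &char_print_owned, p);
  if (rc != 0)
    /* The thread never started, so ownership stays with the caller.  */
    free (p);
  return rc;
}
```

With this arrangement the parameters outlive the function that created them, but the caller must still join (or detach) the thread before the program exits.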

4.1.3 Thread Return Values

If the second argument you pass to pthread_join is non-null, the thread's return value will be placed in the location pointed to by that argument. The thread return value, like the thread argument, is of type void*. If you want to pass back a single int or other small number, you can do this easily by casting the value to void* and then casting back to the appropriate type after calling pthread_join. [1]

[1] Note that this is not portable, and it's up to you to make sure that your value can be cast safely to void* and back without losing bits.

The program in Listing 4.4 computes the nth prime number in a separate thread. That thread returns the desired prime number as its thread return value. The main thread, meanwhile, is free to execute other code. Note that the successive division algorithm used in compute_prime is quite inefficient; consult a book on numerical algorithms if you need to compute many prime numbers in your programs.

Listing 4.4 (primes.c) Compute Prime Numbers in a Thread
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Compute successive prime numbers (very inefficiently).  Return the
   Nth prime number, where N is the value pointed to by ARG.  */

void* compute_prime (void* arg)
{
  int candidate = 2;
  int n = *((int*) arg);

  while (1) {
    int factor;
    int is_prime = 1;

    /* Test primality by successive division.  */
    for (factor = 2; factor < candidate; ++factor)
      if (candidate % factor == 0) {
        is_prime = 0;
        break;
      }
    /* Is this the prime number we're looking for?  */
    if (is_prime) {
      if (--n == 0)
        /* Return the desired prime number as the thread return value.
           Casting through intptr_t avoids an integer-size mismatch.  */
        return (void*) (intptr_t) candidate;
    }
    ++candidate;
  }
  return NULL;
}

int main ()
{
  pthread_t thread;
  int which_prime = 5000;
  void* result;
  int prime;

  /* Start the computing thread, up to the 5,000th prime number.  */
  pthread_create (&thread, NULL, &compute_prime, &which_prime);
  /* Do some other work here...  */
  /* Wait for the prime number thread to complete, and get the result.
     pthread_join stores a full void*, so receive it in a void* variable
     rather than directly in an int, which may be smaller.  */
  pthread_join (thread, &result);
  prime = (int) (intptr_t) result;
  /* Print the prime that the thread computed.  */
  printf ("The %dth prime number is %d.\n", which_prime, prime);
  return 0;
}
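When the cast through void* used in Listing 4.4 is not acceptable, one alternative is for the thread to return a pointer to a heap-allocated result, which the joining thread then reads and frees. The following is a sketch of that approach under simple assumptions; compute_square and square_in_thread are hypothetical names, and squaring stands in for a real computation.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical thread function: returns a pointer to a heap-allocated
   int holding the square of the int pointed to by ARG, or NULL if the
   allocation fails.  */
void* compute_square (void* arg)
{
  int n = *((int*) arg);
  int* result = malloc (sizeof (*result));

  if (result != NULL)
    *result = n * n;
  return result;
}

/* Hypothetical helper: squares N in a separate thread and stores the
   answer in *OUT.  Returns 0 on success or an error number.  */
int square_in_thread (int n, int* out)
{
  pthread_t thread;
  void* result;
  int rc;

  /* N stays valid because this function joins before returning.  */
  rc = pthread_create (&thread, NULL, &compute_square, &n);
  if (rc != 0)
    return rc;
  rc = pthread_join (thread, &result);
  if (rc != 0)
    return rc;
  if (result == NULL)
    return ENOMEM;
  *out = *((int*) result);
  /* The joining thread owns the result and must free it.  */
  free (result);
  return 0;
}
```

This costs an allocation per result, but it does not depend on the relative sizes of int and void*.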

4.1.4 More on Thread IDs

Occasionally, it is useful for a sequence of code to determine which thread is executing it. The pthread_self function returns the thread ID of the thread in which it is called. This thread ID may be compared with another thread ID using the pthread_equal function.

These functions can be useful for determining whether a particular thread ID corresponds to the current thread. For instance, it is an error for a thread to call pthread_join to join itself. (In this case, pthread_join would return the error code EDEADLK.) To check for this beforehand, you might use code like this:

 
if (!pthread_equal (pthread_self (), other_thread))
  pthread_join (other_thread, NULL);
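The same check can be wrapped in a small helper, sketched here. The names join_unless_self and do_nothing are hypothetical; the helper reports EDEADLK itself instead of calling pthread_join when asked to join the calling thread.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: joins OTHER_THREAD unless it is the calling
   thread.  In that case it returns EDEADLK without calling
   pthread_join.  Otherwise it returns the result of pthread_join.  */
int join_unless_self (pthread_t other_thread)
{
  if (pthread_equal (pthread_self (), other_thread))
    return EDEADLK;
  return pthread_join (other_thread, NULL);
}

/* Hypothetical thread function that returns immediately.  */
void* do_nothing (void* unused)
{
  (void) unused;
  return NULL;
}
```

Note that thread IDs are compared with pthread_equal rather than with ==, because pthread_t is not guaranteed to be an arithmetic type on every system.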

4.1.5 Thread Attributes

Thread attributes provide a mechanism for fine-tuning the behavior of individual threads. Recall that pthread_create accepts an argument that is a pointer to a thread attribute object. If you pass a null pointer, the default thread attributes are used to configure the new thread. However, you may create and customize a thread attribute object to specify other values for the attributes.

To specify customized thread attributes, you must follow these steps:

1.       Create a pthread_attr_t object. The easiest way is simply to declare an automatic variable of this type.

2.       Call pthread_attr_init, passing a pointer to this object. This initializes the attributes to their default values.

3.       Modify the attribute object to contain the desired attribute values.

4.       Pass a pointer to the attribute object when calling pthread_create.

5.       Call pthread_attr_destroy to release the attribute object. The pthread_attr_t variable itself is not deallocated; it may be reinitialized with pthread_attr_init.

A single thread attribute object may be used to start several threads. It is not necessary to keep the thread attribute object around after the threads have been created.
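The steps above, with one attribute object shared by several threads, might look like the following sketch. The names NUM_WORKERS, worker, and start_workers are hypothetical. Step 3 is skipped here, so the threads get the default attributes.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 3

/* Hypothetical thread function that returns immediately.  */
static void* worker (void* arg)
{
  return arg;
}

/* Starts NUM_WORKERS threads from a single attribute object and joins
   them.  Returns the number of threads started and joined.  */
int start_workers (void)
{
  pthread_attr_t attr;               /* Step 1: declare the object.  */
  pthread_t threads[NUM_WORKERS];
  int started = 0;
  int joined = 0;
  int i;

  if (pthread_attr_init (&attr) != 0) /* Step 2: set the defaults.  */
    return 0;
  /* Step 3 would modify attr here; this sketch keeps the defaults.  */
  for (i = 0; i < NUM_WORKERS; ++i)
    /* Step 4: the same attribute object serves every thread.  */
    if (pthread_create (&threads[started], &attr, &worker, NULL) == 0)
      ++started;
  /* Step 5: the threads exist, so the object is no longer needed.  */
  pthread_attr_destroy (&attr);

  for (i = 0; i < started; ++i)
    if (pthread_join (threads[i], NULL) == 0)
      ++joined;
  return joined;
}
```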

For most GNU/Linux application programming tasks, only one thread attribute is typically of interest (the other available attributes are primarily for specialty real-time programming). This attribute is the thread's detach state. A thread may be created as a joinable thread (the default) or as a detached thread. A joinable thread, like a process, is not automatically cleaned up by GNU/Linux when it terminates. Instead, the thread's exit state hangs around in the system (kind of like a zombie process) until another thread calls pthread_join to obtain its return value. Only then are its resources released. A detached thread, in contrast, is cleaned up automatically when it terminates. Because a detached thread is cleaned up immediately, another thread cannot use pthread_join to wait for its completion or to obtain its return value.

To set the detach state in a thread attribute object, use pthread_attr_setdetachstate. The first argument is a pointer to the thread attribute object, and the second is the desired detach state. Because the joinable state is the default, it is necessary to call this only to create detached threads; pass PTHREAD_CREATE_DETACHED as the second argument.
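To confirm what an attribute object currently holds, the companion function pthread_attr_getdetachstate (part of POSIX, though not otherwise covered in this section) reads the state back. The sketch below uses a hypothetical helper, report_detach_state, and assumes that neither detach-state constant has the value -1, which the helper reserves for errors.

```c
#include <pthread.h>

/* Hypothetical helper: initializes an attribute object, optionally
   requests the detached state, and returns the detach state that the
   object then reports, or -1 if any call fails.  */
int report_detach_state (int want_detached)
{
  pthread_attr_t attr;
  int state = -1;

  if (pthread_attr_init (&attr) != 0)
    return -1;
  if (want_detached
      && pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_DETACHED) != 0)
    state = -1;
  else if (pthread_attr_getdetachstate (&attr, &state) != 0)
    state = -1;
  pthread_attr_destroy (&attr);
  return state;
}
```

A freshly initialized object should report PTHREAD_CREATE_JOINABLE, and one modified as described above should report PTHREAD_CREATE_DETACHED.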

The code in Listing 4.5 creates a detached thread by setting the detach state thread attribute for the thread.

Listing 4.5 (detached.c) Skeleton Program That Creates a Detached Thread
#include <pthread.h>

void* thread_function (void* thread_arg)
{
  /* Do work here...  */
  return NULL;
}

int main ()
{
  pthread_attr_t attr;
  pthread_t thread;

  pthread_attr_init (&attr);
  pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_DETACHED);
  pthread_create (&thread, &attr, &thread_function, NULL);
  pthread_attr_destroy (&attr);

  /* Do work here...  */

  /* No need to join the detached thread.  */
  return 0;
}

Even if a thread is created in a joinable state, it may later be turned into a detached thread. To do this, call pthread_detach. Once a thread is detached, it cannot be made joinable again.
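A minimal sketch of detaching after creation follows. The names background_task and start_detached_later are hypothetical. pthread_detach takes the thread ID and, like the other calls shown in this chapter, returns 0 on success.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread function that returns immediately.  */
static void* background_task (void* unused)
{
  (void) unused;
  return NULL;
}

/* Creates a joinable thread with the default attributes, then
   detaches it.  Returns 0 on success or an error number.  */
int start_detached_later (void)
{
  pthread_t thread;
  int rc = pthread_create (&thread, NULL, &background_task, NULL);

  if (rc != 0)
    return rc;
  /* From here on, the thread is cleaned up automatically when it
     terminates, and it must not be passed to pthread_join.  */
  return pthread_detach (thread);
}
```

Detaching works whether or not the thread has already finished running; in either case its resources are released without a join.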