Poster of Linux kernelThe best gift for a Linux geek
 Linux kernel map 
⇦ prev ⇱ home next ⇨

8.5. Per-CPU Variables

Per-CPU variables are an interesting 2.6 kernel feature. When you create a per-CPU variable, each processor on the system gets its own copy of that variable. This may seem like a strange thing to want to do, but it has its advantages. Access to per-CPU variables requires (almost) no locking, because each processor works with its own copy. Per-CPU variables can also remain in their respective processors' caches, which leads to significantly better performance for frequently updated quantities.

A good example of per-CPU variable use can be found in the networking subsystem. The kernel maintains no end of counters tracking how many of each type of packet was received; these counters can be u pdated thousands of times per second. Rather than deal with the caching and locking issues, the networking developers put the statistics counters into per-CPU variables. Updates are now lockless and fast. On the rare occasion that user space requests to see the values of the counters, it is a simple matter to add up each processor's version and return the total.

The declarations for per-CPU variables can be found in <linux/percpu.h>. To create a per-CPU variable at compile time, use this macro:

DEFINE_PER_CPU(type, name);

If the variable (to be called name) is an array, include the dimension information with the type. Thus, a per-CPU array of three integers would be created with:

DEFINE_PER_CPU(int[3], my_percpu_array);

Per-CPU variables can be manipulated without explicit locking—almost. Remember that the 2.6 kernel is preemptible; it would not do for a processor to be preempted in the middle of a critical section that modifies a per-CPU variable. It also would not be good if your process were to be moved to another processor in the middle of a per-CPU variable access. For this reason, you must explicitly use the get_cpu_var macro to access the current processor's copy of a given variable, and call put_cpu_var when you are done. The call to get_cpu_var returns an lvalue for the current processor's version of the variable and disables preemption. Since an lvalue is returned, it can be assigned to or operated on directly. For example, one counter in the networking code is incremented with these two statements:

get_cpu_var(sockets_in_use)++;
put_cpu_var(sockets_in_use);

You can access another processor's copy of the variable with:

per_cpu(variable, int cpu_id);

If you write code that involves processors reaching into each other's per-CPU variables, you, of course, have to implement a locking scheme that makes that access safe.

Dynamically allocated per-CPU variables are also possible. These variables can be allocated with:

void *alloc_percpu(type);
void *_ _alloc_percpu(size_t size, size_t align);

In most cases, alloc_percpu does the job; you can call _ _alloc_percpu in cases where a particular alignment is required. In either case, a per-CPU variable can be returned to the system with free_percpu. Access to a dynamically allocated per-CPU variable is done via per_cpu_ptr:

per_cpu_ptr(void *per_cpu_var, int cpu_id);

This macro returns a pointer to the version of per_cpu_var corresponding to the given cpu_id. If you are simply reading another CPU's version of the variable, you can dereference that pointer and be done with it. If, however, you are manipulating the current processor's version, you probably need to ensure that you cannot be moved out of that processor first. If the entirety of your access to the per-CPU variable happens with a spinlock held, all is well. Usually, however, you need to use get_cpu to block preemption while working with the variable. Thus, code using dynamic per-CPU variables tends to look like this:

int cpu;

cpu = get_cpu(  )
ptr = per_cpu_ptr(per_cpu_var, cpu);
/* work with ptr */
put_cpu(  );

When using compile-time per-CPU variables, the get_cpu_var and put_cpu_var macros take care of these details. Dynamic per-CPU variables require more explicit protection.

Per-CPU variables can be exported to modules, but you must use a special version of the macros:

EXPORT_PER_CPU_SYMBOL(per_cpu_var);
EXPORT_PER_CPU_SYMBOL_GPL(per_cpu_var);

To access such a variable within a module, declare it with:

DECLARE_PER_CPU(type, name);

The use of DECLARE_PER_CPU (instead of DEFINE_PER_CPU) tells the compiler that an external reference is being made.

If you want to use per-CPU variables to create a simple integer counter, take a look at the canned implementation in <linux/percpu_counter.h>. Finally, note that some architectures have a limited amount of address space available for per-CPU variables. If you create per-CPU variables in your code, you should try to keep them small.

    ⇦ prev ⇱ home next ⇨
    Poster of Linux kernelThe best gift for a Linux geek