Poster of Linux kernelThe best gift for a Linux geek
 Linux kernel map 
Team LiB
Previous Section Next Section


The place to start this discussion of the actual bottom half methods is with softirqs. Softirqs are rarely used; tasklets are a much more common form of bottom half. Nonetheless, because tasklets are built on softirqs we'll cover them first. The softirq code lives in kernel/softirq.c.

Implementation of Softirqs

Softirqs are statically allocated at compile-time. Unlike tasklets, you cannot dynamically register and destroy softirqs. Softirqs are represented by the softirq_action structure, which is defined in <linux/interrupt.h>:

 * structure representing a single softirq entry
struct softirq_action {
        void (*action)(struct softirq_action *); /* function to run */
        void *data;                              /* data to pass to function */

A 32-entry array of this structure is declared in kernel/softirq.c:

static struct softirq_action softirq_vec[32];

Each registered softirq consumes one entry in the array. Consequently, there can be a maximum of 32 registered softirqs. Note that this cap is fixedthe maximum number of registered softirqs cannot be dynamically changed. In the current kernel, however, only 6 of the 32 entries are used[3].

[3] Most drivers use tasklets for their bottom half. Tasklets are built off softirqs, as the next section explains.

The Softirq Handler

The prototype of a softirq handler, action, looks like:

void softirq_handler(struct softirq_action *)

When the kernel runs a softirq handler, it executes this action function with a pointer to the corresponding softirq_action structure as its lone argument. For example, if my_softirq pointed to an entry in the softirq_vec array, the kernel would invoke the softirq handler function as


It seems a bit odd that the kernel passes the entire structure, and not just the data value, to the softirq handler. This trick allows future additions to the structure without requiring a change in every softirq handler. Softirq handlers can retrieve the data value, if they need to, simply by dereferencing their argument and reading the data member.

A softirq never preempts another softirq. In fact, the only event that can preempt a softirq is an interrupt handler. Another softirqeven the same onecan run on another processor, however.

Executing Softirqs

A registered softirq must be marked before it will execute. This is called raising the softirq. Usually, an interrupt handler marks its softirq for execution before returning. Then, at a suitable time, the softirq runs. Pending softirqs are checked for and executed in the following places:

  • In the return from hardware interrupt code

  • In the ksoftirqd kernel thread

  • In any code that explicitly checks for and executes pending softirqs, such as the networking subsystem

Regardless of the method of invocation, softirq execution occurs in do_softirq(). The function is really quite simple. If there are pending softirqs, do_softirq() loops over each one, invoking its handler. Let's look at a simplified variant of the important part of do_softirq():

u32 pending = softirq_pending(cpu);

if (pending) {
    struct softirq_action *h = softirq_vec;

    softirq_pending(cpu) = 0;

    do {
        if (pending & 1)
        pending >>= 1;
    } while (pending);

This snippet is the heart of softirq processing. It checks for, and executes, any pending softirqs. Specifically,

  1. It sets the pending local variable to the value returned by the softirq_pending() macro. This is a 32-bit mask of pending softirqsif bit n is set, the nth softirq is pending.

  2. Now that the pending bitmask of softirqs is saved, it clears the actual bitmask[4].

    [4] This actually occurs with local interrupts disabled, but that is omitted in this simplified example. If interrupts were not disabled, a softirq could have been raised (and thus be pending) in the intervening time between saving the mask and clearing it. This would result in incorrectly clearing a pending bit.

  3. The pointer h is set to the first entry in the softirq_vec.

  4. If the first bit in pending is set, h->action(h) is called.

  5. The pointer h is incremented by one so that it now points to the second entry in the softirq_vec array.

  6. The bitmask pending is right-shifted by one. This tosses the first bit away, and moves all other bits one place to the right. Consequently, the second bit is now the first (and so on).

  7. The pointer h now points to the second entry in the array and the pending bitmask now has the second bit as the first. Repeat the previous steps.

  8. Continue repeating until pending is zero, at which point there are no more pending softirqs and the work is done. Note, this check is sufficient to ensure h always points to a valid entry in softirq_vec because pending has at most 32 set bits and thus this loop executes at most 32 times.

Using Softirqs

Softirqs are reserved for the most timing-critical and important bottom-half processing on the system. Currently, only two subsystemsnetworking and SCSIdirectly use softirqs. Additionally, kernel timers and tasklets are built on top of softirqs. If you are adding a new softirq, you normally want to ask yourself why using a tasklet is insufficient. Tasklets are dynamically created and are simpler to use because of their weaker locking requirements, and they still perform quite well. Nonetheless, for timing-critical applications that are able to do their own locking in an efficient way, softirqs might be the correct solution.

Assigning an Index

You declare softirqs statically at compile-time via an enum in <linux/interrupt.h>. The kernel uses this index, which starts at zero, as a relative priority. Softirqs with the lowest numerical priority execute before those with a higher numerical priority.

Creating a new softirq includes adding a new entry to this enum. When adding a new softirq you might not want to simply add your entry to the end of the list, as you would elsewhere. Instead, you need to insert the new entry depending on the priority you want to give it. Historically, HI_SOFTIRQ is always the first and TASKLET_SOFTIRQ is always the last entry. A new entry probably belongs somewhere after the network entries, but prior to TASKLET_SOFTIRQ. Table 7.2 contains a list of the existing tasklet types.

Table 7.2. Listing of Tasklet Types



Softirq Description



High-priority tasklets



Timer bottom half



Send network packets



Receive network packets



SCSI bottom half




Registering Your Handler

Next, the softirq handler is registered at run-time via open_softirq(), which takes three parameters: the softirq's index, its handler function, and a value for the data field. The networking subsystem, for example, registers its softirqs like this:

open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

The softirq handlers run with interrupts enabled and cannot sleep. While a handler runs, softirqs on the current processor are disabled. Another processor, however, can execute other softirqs. In fact, if the same softirq is raised again while it is executing, another processor can run it simultaneously. This means that any shared dataeven global data used only within the softirq handler itselfneeds proper locking (as discussed in the next two chapters). This is an important point, and it is the reason tasklets are usually preferred. Simply preventing your softirqs from running concurrently is not ideal. If a softirq obtained a lock to prevent another instance of itself from running simultaneously, there would be no reason to use a softirq. Consequently, most softirq handlers resort to per-processor data (data unique to each processor and thus not requiring locking) or some other tricks to avoid explicit locking and provide excellent scalability.

The raison d'ètre to softirqs is scalability. If you do not need to scale to infinitely many processors, then use a tasklet. Tasklets are ultimately softirqs where multiple instances of the same handler do not run concurrently on multiple processors.

Raising Your Softirq

After a handler is added to the enum list and registered via open_softirq(), it is ready to run. To mark it pending, so that it is run at the next invocation of do_softirq(), call raise_softirq(). For example, the networking subsystem would call


This raises the NET_TX_SOFTIRQ softirq. Its handler, net_tx_action(), runs the next time the kernel executes softirqs. This function disables interrupts prior to actually raising the softirq, and then restores them to their previous state. If interrupts are already off, the function raise_softirq_irqoff() can be used as a minor optimization. For example:

 * interrupts must already be off!

Softirqs are most often raised from within interrupt handlers. In the case of interrupt handlers, the interrupt handler performs the basic hardware-related work, raises the softirq, and then exits. When processing interrupts, the kernel invokes do_softirq(). The softirq then runs and picks up where the interrupt handler left off. In this example, the "top half" and "bottom half" naming should make sense.

    Team LiB
    Previous Section Next Section