Poster of Linux kernelThe best gift for a Linux geek
 Linux kernel map 
⇦ prev ⇱ home next ⇨

9.4. Using I/O Memory

Despite the popularity of I/O ports in the x86 world, the main mechanism used to communicate with devices is through memory-mapped registers and device memory. Both are called I/O memory because the difference between registers and memory is transparent to software.

I/O memory is simply a region of RAM-like locations that the device makes available to the processor over the bus. This memory can be used for a number of purposes, such as holding video data or Ethernet packets, as well as implementing device registers that behave just like I/O ports (i.e., they have side effects associated with reading and writing them).

The way to access I/O memory depends on the computer architecture, bus, and device being used, although the principles are the same everywhere. The discussion in this chapter touches mainly on ISA and PCI memory, while trying to convey general information as well. Although access to PCI memory is introduced here, a thorough discussion of PCI is deferred to Chapter 12.

Depending on the computer platform and bus being used, I/O memory may or may not be accessed through page tables. When access passes though page tables, the kernel must first arrange for the physical address to be visible from your driver, and this usually means that you must call ioremap before doing any I/O. If no page tables are needed, I/O memory locations look pretty much like I/O ports, and you can just read and write to them using proper wrapper functions.

Whether or not ioremap is required to access I/O memory, direct use of pointers to I/O memory is discouraged. Even though (as introduced in Section 9.1) I/O memory is addressed like normal RAM at hardware level, the extra care outlined in the Section 9.1.1 suggests avoiding normal pointers. The wrapper functions used to access I/O memory are safe on all platforms and are optimized away whenever straight pointer dereferencing can perform the operation.

Therefore, even though dereferencing a pointer works (for now) on the x86, failure to use the proper macros hinders the portability and readability of the driver.

9.4.1. I/O Memory Allocation and Mapping

I/O memory regions must be allocated prior to use. The interface for allocation of memory regions (defined in <linux/ioport.h>) is:

struct resource *request_mem_region(unsigned long start, unsigned long len,
                                    char *name);

This function allocates a memory region of len bytes, starting at start. If all goes well, a non-NULL pointer is returned; otherwise the return value is NULL. All I/O memory allocations are listed in /proc/iomem.

Memory regions should be freed when no longer needed:

void release_mem_region(unsigned long start, unsigned long len);

There is also an old function for checking I/O memory region availability:

int check_mem_region(unsigned long start, unsigned long len);

But, as with check_region, this function is unsafe and should be avoided.

Allocation of I/O memory is not the only required step before that memory may be accessed. You must also ensure that this I/O memory has been made accessible to the kernel. Getting at I/O memory is not just a matter of dereferencing a pointer; on many systems, I/O memory is not directly accessible in this way at all. So a mapping must be set up first. This is the role of the ioremap function, introduced in Section 8.4 in Chapter 8. The function is designed specifically to assign virtual addresses to I/O memory regions.

Once equipped with ioremap (and iounmap), a device driver can access any I/O memory address, whether or not it is directly mapped to virtual address space. Remember, though, that the addresses returned from ioremap should not be dereferenced directly; instead, accessor functions provided by the kernel should be used. Before we get into those functions, we'd better review the ioremap prototypes and introduce a few details that we passed over in the previous chapter.

The functions are called according to the following definition:

#include <asm/io.h>
void *ioremap(unsigned long phys_addr, unsigned long size);
void *ioremap_nocache(unsigned long phys_addr, unsigned long size);
void iounmap(void * addr);

First of all, you notice the new function ioremap_nocache. We didn't cover it in Chapter 8, because its meaning is definitely hardware related. Quoting from one of the kernel headers: "It's useful if some control registers are in such an area, and write combining or read caching is not desirable." Actually, the function's implementation is identical to ioremap on most computer platforms: in situations where all of I/O memory is already visible through noncacheable addresses, there's no reason to implement a separate, noncaching version of ioremap.

9.4.2. Accessing I/O Memory

On some platforms, you may get away with using the return value from ioremap as a pointer. Such use is not portable, and, increasingly, the kernel developers have been working to eliminate any such use. The proper way of getting at I/O memory is via a set of functions (defined via <asm/io.h>) provided for that purpose.

To read from I/O memory, use one of the following:

unsigned int ioread8(void *addr);
unsigned int ioread16(void *addr);
unsigned int ioread32(void *addr);

Here, addr should be an address obtained from ioremap (perhaps with an integer offset); the return value is what was read from the given I/O memory.

There is a similar set of functions for writing to I/O memory:

void iowrite8(u8 value, void *addr);
void iowrite16(u16 value, void *addr);
void iowrite32(u32 value, void *addr);

If you must read or write a series of values to a given I/O memory address, you can use the repeating versions of the functions:

void ioread8_rep(void *addr, void *buf, unsigned long count);
void ioread16_rep(void *addr, void *buf, unsigned long count);
void ioread32_rep(void *addr, void *buf, unsigned long count);
void iowrite8_rep(void *addr, const void *buf, unsigned long count);
void iowrite16_rep(void *addr, const void *buf, unsigned long count);
void iowrite32_rep(void *addr, const void *buf, unsigned long count);

These functions read or write count values from the given buf to the given addr. Note that count is expressed in the size of the data being written; ioread32_rep reads count 32-bit values starting at buf.

The functions described above perform all I/O to the given addr. If, instead, you need to operate on a block of I/O memory, you can use one of the following:

void memset_io(void *addr, u8 value, unsigned int count);
void memcpy_fromio(void *dest, void *source, unsigned int count);
void memcpy_toio(void *dest, void *source, unsigned int count);

These functions behave like their C library analogs.

If you read through the kernel source, you see many calls to an older set of functions when I/O memory is being used. These functions still work, but their use in new code is discouraged. Among other things, they are less safe because they do not perform the same sort of type checking. Nonetheless, we describe them here:

unsigned readb(address);

unsigned readw(address);

unsigned readl(address);

These macros are used to retrieve 8-bit, 16-bit, and 32-bit data values from I/O memory.

void writeb(unsigned value, address);

void writew(unsigned value, address);

void writel(unsigned value, address);

Like the previous functions, these functions (macros) are used to write 8-bit, 16-bit, and 32-bit data items.

Some 64-bit platforms also offer readq and writeq, for quad-word (8-byte) memory operations on the PCI bus. The quad-word nomenclature is a historical leftover from the times when all real processors had 16-bit words. Actually, the L naming used for 32-bit values has become incorrect too, but renaming everything would confuse things even more.

9.4.3. Ports as I/O Memory

Some hardware has an interesting feature: some versions use I/O ports, while others use I/O memory. The registers exported to the processor are the same in either case, but the access method is different. As a way of making life easier for drivers dealing with this kind of hardware, and as a way of minimizing the apparent differences between I/O port and memory accesses, the 2.6 kernel provides a function called ioport_map:

void *ioport_map(unsigned long port, unsigned int count);

This function remaps count I/O ports and makes them appear to be I/O memory. From that point thereafter, the driver may use ioread8 and friends on the returned addresses and forget that it is using I/O ports at all.

This mapping should be undone when it is no longer needed:

void ioport_unmap(void *addr);

These functions make I/O ports look like memory. Do note, however, that the I/O ports must still be allocated with request_region before they can be remapped in this way.

9.4.4. Reusing short for I/O Memory

The short sample module, introduced earlier to access I/O ports, can be used to access I/O memory as well. To this aim, you must tell it to use I/O memory at load time; also, you need to change the base address to make it point to your I/O region.

For example, this is how we used short to light the debug LEDs on a MIPS development board:

mips.root# ./short_load use_mem=1 base=0xb7ffffc0
mips.root# echo -n 7 > /dev/short0

Use of short for I/O memory is the same as it is for I/O ports.

The following fragment shows the loop used by short in writing to a memory location:

while (count--) {
    iowrite8(*ptr++, address);
    wmb(  );

Note the use of a write memory barrier here. Because iowrite8 likely turns into a direct assignment on many architectures, the memory barrier is needed to ensure that the writes happen in the expected order.

short uses inb and outb to show how that is done. It would be a straightforward exercise for the reader, however, to change short to remap I/O ports with ioport_map, and simplify the rest of the code considerably.

9.4.5. ISA Memory Below 1 MB

One of the most well-known I/O memory regions is the ISA range found on personal computers. This is the memory range between 640 KB (0xA0000) and 1 MB (0x100000). Therefore, it appears right in the middle of regular system RAM. This positioning may seem a little strange; it is an artifact of a decision made in the early 1980s, when 640 KB of memory seemed like more than anybody would ever be able to use.

This memory range belongs to the non-directly-mapped class of memory.[5] You can read/write a few bytes in that memory range using the short module as explained previously, that is, by setting use_mem at load time.

[5] Actually, this is not completely true. The memory range is so small and so frequently used that the kernel builds page tables at boot time to access those addresses. However, the virtual address used to access them is not the same as the physical address, and thus ioremap is needed anyway.

Although ISA I/O memory exists only in x86-class computers, we think it's worth spending a few words and a sample driver on it.

We are not going to discuss PCI memory in this chapter, since it is the cleanest kind of I/O memory: once you know the physical address, you can simply remap and access it. The "problem" with PCI I/O memory is that it doesn't lend itself to a working example for this chapter, because we can't know in advance the physical addresses your PCI memory is mapped to, or whether it's safe to access either of those ranges. We chose to describe the ISA memory range, because it's both less clean and more suitable to running sample code.

To demonstrate access to ISA memory, we use yet another silly little module (part of the sample sources). In fact, this one is called silly, as an acronym for Simple Tool for Unloading and Printing ISA Data, or something like that.

The module supplements the functionality of short by giving access to the whole 384-KB memory space and by showing all the different I/O functions. It features four device nodes that perform the same task using different data transfer functions. The silly devices act as a window over I/O memory, in a way similar to /dev/mem. You can read and write data, and lseek to an arbitrary I/O memory address.

Because silly provides access to ISA memory, it must start by mapping the physical ISA addresses into kernel virtual addresses. In the early days of the Linux kernel, one could simply assign a pointer to an ISA address of interest, then dereference it directly. In the modern world, though, we must work with the virtual memory system and remap the memory range first. This mapping is done with ioremap, as explained earlier for short:

#define ISA_BASE    0xA0000
#define ISA_MAX     0x100000  /* for general memory access */

    /* this line appears in silly_init */
    io_base = ioremap(ISA_BASE, ISA_MAX - ISA_BASE);

ioremap returns a pointer value that can be used with ioread8 and the other functions explained in Section 9.4.2.

Let's look back at our sample module to see how these functions might be used. /dev/sillyb, featuring minor number 0, accesses I/O memory with ioread8 and iowrite8. The following code shows the implementation for read, which makes the address range 0xA0000-0xFFFFF available as a virtual file in the range 0-0x5FFFF. The read function is structured as a switch statement over the different access modes; here is the sillyb case:

case M_8: 
  while (count) {
      *ptr = ioread8(add);

The next two devices are /dev/sillyw (minor number 1) and /dev/sillyl (minor number 2). They act like /dev/sillyb, except that they use 16-bit and 32-bit functions. Here's the write implementation of sillyl, again part of a switch:

case M_32: 
  while (count >= 4) {
      iowrite8(*(u32 *)ptr, add);
      add += 4;
      count -= 4;
      ptr += 4;

The last device is /dev/sillycp (minor number 3), which uses the memcpy_*io functions to perform the same task. Here's the core of its read implementation:

case M_memcpy:
  memcpy_fromio(ptr, add, count);

Because ioremap was used to provide access to the ISA memory area, silly must invoke iounmap when the module is unloaded:


9.4.6. isa_readb and Friends

A look at the kernel source will turn up another set of routines with names such as isa_readb. In fact, each of the functions just described has an isa_ equivalent. These functions provide access to ISA memory without the need for a separate ioremap step. The word from the kernel developers, however, is that these functions are intended to be temporary driver-porting aids and that they may go away in the future. Therefore, you should avoid using them.

    ⇦ prev ⇱ home next ⇨
    Poster of Linux kernelThe best gift for a Linux geek