11.4. Other Portability Issues
In addition to data typing, there
are a few other software issues to keep in mind when writing a driver
if you want it to be portable across Linux platforms.
A general rule is to be suspicious of explicit constant values.
Usually the code has been parameterized using preprocessor macros.
This section lists the most important portability problems. Whenever
you encounter other values that have been parameterized, you can find
hints in the header files and in the device drivers distributed with
the official kernel.
11.4.1. Time Intervals
When dealing with time
intervals, don't assume that there are 1000 jiffies
per second. Although this is currently true for the i386
architecture, not every Linux platform runs at this speed. The
assumption can be false even for the x86 if you play with the
HZ value (as some people do), and nobody knows
what will happen in future kernels. Whenever you calculate time
intervals using jiffies, scale your times using HZ
(the number of timer interrupts per second). For example, to check
against a timeout of half a second, compare the elapsed time against
HZ/2. More generally, the number of jiffies
corresponding to msec milliseconds is always
msec*HZ/1000.
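The scaling rule can be tried out in plain C. The following is a user-space sketch only: the function names are hypothetical, and the explicit hz parameter stands in for the kernel's HZ so that the same code can be exercised with several tick rates.

```c
/* User-space sketch: convert milliseconds to timer ticks for a given tick rate.
 * "hz" stands in for the kernel's HZ; both function names are illustrative. */

/* msec*HZ/1000, truncating toward zero as the integer formula does. */
unsigned long msec_to_ticks(unsigned long msec, unsigned long hz)
{
    return msec * hz / 1000;
}

/* Same conversion rounded up, so a short nonzero delay never becomes 0 ticks. */
unsigned long msec_to_ticks_round_up(unsigned long msec, unsigned long hz)
{
    return (msec * hz + 999) / 1000;
}
```

With hz set to 100, half a second is 50 ticks; with hz set to 1000, it is 500. Note that the truncating form turns 5 ms into 0 ticks when hz is 100, which is why a rounded-up variant is often the safer choice for timeouts. Neither form guards against overflow of msec * hz for very large intervals.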
11.4.2. Page Size
When playing games with memory, remember
that a memory page is PAGE_SIZE bytes, not 4 KB.
Assuming that the page size is 4 KB and hardcoding the value is a
common error among PC programmers. In fact, supported platforms show
page sizes from 4 KB to 64 KB, and sometimes they differ between
different implementations of the same platform. The relevant macros
are PAGE_SIZE and PAGE_SHIFT.
The latter contains the number of bits to shift an address to get its
page number. The number currently is 12 or greater for pages that are
4 KB and larger. The macros are defined in
<asm/page.h>; user-space programs can use
the getpagesize library function if they ever
need the information.
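For user-space code, the idea can be sketched as follows. This example assumes a POSIX system and uses sysconf(_SC_PAGESIZE), which reports the same value as getpagesize; the helper names are hypothetical, and the shift is derived at run time as a user-space analogue of PAGE_SHIFT.

```c
#include <unistd.h>

/* Page size reported by the running system (user space, POSIX). */
long runtime_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* log2 of the page size: the user-space analogue of PAGE_SHIFT. */
unsigned int runtime_page_shift(void)
{
    long size = runtime_page_size();
    unsigned int shift = 0;

    while ((1L << shift) < size)
        shift++;
    return shift;
}

/* Page number of an address: shift by the real page shift
 * instead of dividing by a hardcoded 4096. */
unsigned long page_number(unsigned long addr)
{
    return addr >> runtime_page_shift();
}
```

The point of the sketch is that nothing in it mentions 4096 or 12; the same source gives correct answers on a machine with 4-KB pages and on one with 64-KB pages.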
Let's look at a nontrivial situation. If a driver
needs 16 KB for temporary data, it shouldn't specify
an order of 2 to
__get_free_pages, because an order of 2 yields 16 KB only when pages are 4 KB. You need a portable solution.
Such a solution, fortunately, has been written by the kernel
developers and is called get_order:
#include <asm/page.h>
int order = get_order(16*1024);   /* order that covers 16 KB on this platform */
buf = __get_free_pages(GFP_KERNEL, order);
Remember that the argument to get_order must be
a power of two.
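To see why a hardcoded order is not portable, it helps to work through the arithmetic. The following user-space sketch is not the kernel's get_order; order_for_size is a hypothetical name, and page_shift is passed in explicitly as a stand-in for PAGE_SHIFT so that several page sizes can be compared.

```c
/* User-space sketch of an order computation. "page_shift" stands in for
 * PAGE_SHIFT; order_for_size is a hypothetical name, not a kernel function. */
int order_for_size(unsigned long size, unsigned int page_shift)
{
    /* Round the size up to a whole number of pages. */
    unsigned long pages = (size + (1UL << page_shift) - 1) >> page_shift;
    int order = 0;

    /* Smallest order such that 2^order pages cover the request. */
    while ((1UL << order) < pages)
        order++;
    return order;
}
```

For a 16-KB request, this returns 2 with 4-KB pages (shift 12), 1 with 8-KB pages (shift 13), and 0 with 64-KB pages (shift 16). A driver that hardcodes 2 would therefore ask for 256 KB on a platform with 64-KB pages.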
11.4.3. Byte Order
Be careful not to make assumptions about
byte ordering. Whereas the PC stores multibyte values low-byte first
(little end first, thus little-endian), some high-level platforms
work the other way (big-endian). Whenever possible, your code should
be written such that it does not care about byte ordering in the data
it manipulates. However, sometimes a driver needs to build an integer
number out of single bytes or do the opposite, or it must communicate
with a device that expects a specific order.
The include file <asm/byteorder.h> defines
either __BIG_ENDIAN or
__LITTLE_ENDIAN, depending on the
processor's byte ordering. When dealing with byte
ordering issues, you could code a bunch of #ifdef
__LITTLE_ENDIAN conditionals, but there is a better way. The
Linux kernel defines a set of macros that handle conversions between
the processor's byte ordering and that of the data
you need to store or load in a specific byte order. For example:
u32 cpu_to_le32 (u32);
u32 le32_to_cpu (u32);
These two macros convert a value from whatever the CPU uses to an
unsigned, little-endian, 32-bit quantity and back. They work whether
your CPU is big-endian or little-endian and, for that matter, whether
it is a 32-bit processor or not. They return their argument unchanged
in cases where there is no work to be done. Use of these macros makes
it easy to write portable code without having to use a lot of
conditional compilation constructs.
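The effect of such a conversion can be illustrated outside the kernel. The sketch below is not how the kernel macros are implemented; it is a user-space illustration, with hypothetical function names, of code that produces and consumes a little-endian 32-bit value without knowing the host's byte order.

```c
#include <stdint.h>

/* Store a 32-bit value as four bytes, least significant byte first.
 * Shifts operate on the value, not on memory, so host byte order is irrelevant. */
void store_le32(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)(value & 0xff);
    buf[1] = (uint8_t)((value >> 8) & 0xff);
    buf[2] = (uint8_t)((value >> 16) & 0xff);
    buf[3] = (uint8_t)((value >> 24) & 0xff);
}

/* Rebuild the 32-bit value from four little-endian bytes. */
uint32_t load_le32(const uint8_t *buf)
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Storing 0x11223344 places 0x44 in the first byte and 0x11 in the last on every host, and loading the buffer gives back the original value. In driver code, the kernel's own conversion macros remain the right tool; this sketch only shows what they accomplish.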
There are dozens of similar routines; you can see the full list in
<linux/byteorder/big_endian.h> and
<linux/byteorder/little_endian.h>. After a
while, the pattern is not hard to follow.
be64_to_cpu converts an unsigned, big-endian,
64-bit value to the internal CPU representation.
le16_to_cpus, instead, converts a
little-endian, 16-bit quantity in place: the trailing s stands for
"in situ," not "signed," so the function takes a pointer and
overwrites the value it points to. When dealing with pointers, you can
also use functions like cpu_to_le32p, which take
a pointer to the value to be converted rather than the value itself
and return the converted value.
See the include file for the rest.
11.4.4. Data Alignment
The last problem worth
considering when writing portable code is how to access unaligned
data—for example, how to read a 4-byte value stored at an
address that isn't a multiple of 4 bytes. i386 users
often access unaligned data items, but not all architectures permit
it. Many modern architectures generate an exception every time the
program tries unaligned data transfers; data transfer is handled by
the exception handler, with a great performance penalty. If you need
to access unaligned data, you should use the following macros:
#include <asm/unaligned.h>
get_unaligned(ptr);
put_unaligned(val, ptr);
These macros are typeless and work for every data item, whether
it's one, two, four, or eight bytes long. They are
defined with any kernel version.
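Outside the kernel, the same effect is commonly obtained by copying bytes rather than dereferencing a misaligned pointer. The following user-space sketch, with hypothetical function names, shows the idea for 32-bit values; it is an illustration of the technique, not the implementation behind get_unaligned and put_unaligned.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
 * memcpy moves bytes, so no misaligned 32-bit load is ever issued by this code. */
uint32_t read_unaligned_u32(const void *ptr)
{
    uint32_t value;

    memcpy(&value, ptr, sizeof(value));
    return value;
}

/* Write a 32-bit value to a possibly unaligned address. */
void write_unaligned_u32(uint32_t value, void *ptr)
{
    memcpy(ptr, &value, sizeof(value));
}
```

A call such as write_unaligned_u32(0xdeadbeef, buf + 1) on a byte buffer touches exactly bytes 1 through 4 and leaves its neighbors alone, and read_unaligned_u32(buf + 1) returns the value just written.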
Another issue related to alignment is
portability of data structures across platforms. The same data
structure (as defined in the C-language source file) can be compiled
differently on different platforms. The compiler arranges structure
fields to be aligned according to conventions that differ from
platform to platform.
In order to write data structures for data
items that can be moved across architectures, you should always
enforce natural alignment of the data items in addition to
standardizing on a specific endianness. Natural
alignment means storing data items at an address that is a
multiple of their size (for instance, 8-byte items go in an address
multiple of 8). To enforce natural alignment while preventing the
compiler from arranging the fields in unpredictable ways, you should use
filler fields that avoid leaving holes in the data structure.
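As an illustration, here is a minimal sketch of a naturally aligned structure with an explicit filler field. The structure and its field names are hypothetical; the compile-time checks assume a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: every field sits at a multiple of its own size. */
struct portable_record {
    uint16_t id;        /* offset 0 */
    uint16_t filler1;   /* explicit filler: brings the next field to offset 4 */
    uint32_t length;    /* offset 4 */
    uint64_t lun;       /* offset 8 */
};

/* Compile-time checks (C11) that the layout matches the comments. */
_Static_assert(offsetof(struct portable_record, length) == 4, "length at offset 4");
_Static_assert(offsetof(struct portable_record, lun) == 8, "lun at offset 8");
_Static_assert(sizeof(struct portable_record) == 16, "no hidden padding");
```

Because the filler occupies the space the compiler might otherwise pad, the layout is the same whether the platform aligns 64-bit values on 4-byte or 8-byte boundaries.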
To show how alignment is
enforced by the compiler, the dataalign program
is distributed in the misc-progs directory of
the sample code, and an equivalent kdataalign
module is part of misc-modules. This is the
output of the program on several platforms and the output of the
module on the SPARC64:
arch     Align: char short  int long  ptr long-long   u8  u16  u32  u64
i386               1     2    4    4    4         4    1    2    4    4
i686               1     2    4    4    4         4    1    2    4    4
alpha              1     2    4    8    8         8    1    2    4    8
armv4l             1     2    4    4    4         4    1    2    4    4
ia64               1     2    4    8    8         8    1    2    4    8
mips               1     2    4    4    4         8    1    2    4    8
ppc                1     2    4    4    4         8    1    2    4    8
sparc              1     2    4    4    4         8    1    2    4    8
sparc64            1     2    4    4    4         8    1    2    4    8
x86_64             1     2    4    8    8         8    1    2    4    8
kernel: arch     Align: char short  int long  ptr long-long   u8  u16  u32  u64
kernel: sparc64            1     2    4    8    8         8    1    2    4    8
It's interesting to note that not all platforms
align 64-bit values on 64-bit boundaries, so you need filler fields
to enforce alignment and ensure portability.
Finally, be aware that the compiler may quietly insert padding into
structures itself to ensure that every field is aligned for good
performance on the target processor. If you are defining a structure
that is intended to match a structure expected by a device, this
automatic padding may thwart your attempt. The way around this
problem is to tell the compiler that the structure must be
"packed," with no fillers added.
For example, the kernel header file
<linux/edd.h> defines several data
structures used in interfacing with the x86 BIOS, and it includes the
following definition:
struct {
    u16 id;
    u64 lun;
    u16 reserved1;
    u32 reserved2;
} __attribute__ ((packed)) scsi;
Without the __attribute__ ((packed)), the
lun field would be preceded by two filler bytes, or by
six if we compile the structure on a 64-bit platform.
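The difference can be checked with a short user-space sketch. The two structure names are hypothetical; the fields mirror the excerpt above, with fixed-width C types in place of the kernel's u16, u32, and u64. The packed attribute is a GCC extension that Clang also accepts.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed layout: fields are stored back to back, with no fillers. */
struct scsi_packed {
    uint16_t id;
    uint64_t lun;
    uint16_t reserved1;
    uint32_t reserved2;
} __attribute__ ((packed));

/* Same fields without the attribute: the compiler is free to pad. */
struct scsi_padded {
    uint16_t id;
    uint64_t lun;
    uint16_t reserved1;
    uint32_t reserved2;
};
```

In the packed version, offsetof(struct scsi_packed, lun) is 2 and the structure is 16 bytes long. In the padded version, lun moves to offset 4 or 8 depending on the platform's alignment of 64-bit values, and the overall size grows accordingly.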
11.4.5. Pointers and Error Values
Many internal kernel functions return a pointer value to the caller.
Many of those functions can also fail. In most cases,
failure is indicated by returning a
NULL pointer value. This technique works, but it
is unable to communicate the exact nature of the problem. Some
interfaces really need to return an actual error code so that the
caller can make the right decision based on what actually went wrong.
A number of kernel interfaces return this information by encoding the
error code in a pointer value. Such functions must be used with care,
since their return value cannot simply be compared against
NULL. To help in the creation and use of this sort
of interface, a small set of functions has been made available (in
<linux/err.h>).
A function returning a pointer type can return an error value with:
void *ERR_PTR(long error);
where error is the usual negative error code. The
caller can use IS_ERR to test whether a returned
pointer is an error code or not:
long IS_ERR(const void *ptr);
If you need the actual error code, it can be extracted with:
long PTR_ERR(const void *ptr);
You should use PTR_ERR only on a value for which
IS_ERR returns a true value; any other value is
a valid
pointer.
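The encoding trick can be demonstrated in user space. The sketch below is an assumption-laden stand-in rather than the kernel's code: every sketch_ name is hypothetical, and it assumes that error codes lie between -1 and -4095 and that no valid object lives in the topmost 4095 bytes of the address space.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* assumption: error codes are -1 .. -4095 */

/* Encode a negative error code as a pointer value. */
void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if the pointer falls in the range reserved for error codes. */
int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the error code; meaningful only when sketch_is_err() is true. */
long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int sketch_slot;   /* the one "object" this toy lookup can return */

/* Hypothetical lookup: key 0 succeeds, anything else reports why it failed. */
void *sketch_find(int key)
{
    if (key < 0)
        return sketch_err_ptr(-EINVAL);
    if (key > 0)
        return sketch_err_ptr(-ENOENT);
    return &sketch_slot;
}
```

A caller tests the result with sketch_is_err before dereferencing it and, on failure, passes it to sketch_ptr_err to learn whether the cause was -EINVAL or -ENOENT. Note that a NULL pointer is not in the error range, so a function using this convention should not also return NULL to signal failure.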