4.4. Debugging by Watching
Sometimes minor
problems
can be tracked down by watching the
behavior of an application in user space. Watching programs can also
help in building confidence that a driver is working correctly. For
example, we were able to feel confident about
scull after looking at how its
read implementation reacted to read requests for
different amounts of data.
There are various ways to watch a user-space program working. You can
run a debugger on it to step through its functions, add print
statements, or run the program under strace.
Here we'll discuss just the last technique, which is
most interesting when the real goal is examining kernel code.
The strace
command is a powerful tool that shows all the system calls issued by
a user-space program. Not only does it show the calls, but it can
also show the arguments to the calls and their return values in
symbolic form. When a system call fails, both the symbolic value of
the error (e.g., ENOMEM) and the corresponding
string (Out of memory) are
displayed. strace has many command-line options;
the most useful of which are -t to display the
time when each call is executed,
-T to display the time spent in the call,
-e to limit the types of calls traced, and
-o to redirect the output to a file. By default,
strace prints tracing information on
stderr.
strace receives information from the kernel
itself. This means that a program can be traced regardless of whether
or not it was compiled with debugging support (the
-g option to gcc) and
whether or not it is stripped. You can also attach tracing to a
running process, similar to the way a debugger can connect to a
running process and control it.
The trace information is often used to support bug reports sent to
application developers, but it's also invaluable to
kernel programmers. We've seen how driver code
executes by reacting to system calls; strace
allows us to check the consistency of input and output data of each
call.
For example, the following screen dump shows (most of) the last lines
of running the command strace ls /dev >
/dev/scull0 :
open("/dev", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=24576, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
getdents64(3, /* 141 entries */, 4096) = 4088
[...]
getdents64(3, /* 0 entries */, 4096) = 0
close(3) = 0
[...]
fstat64(1, {st_mode=S_IFCHR|0664, st_rdev=makedev(254, 0), ...}) = 0
write(1, "MAKEDEV\nadmmidi0\nadmmidi1\nadmmid"..., 4096) = 4000
write(1, "b\nptywc\nptywd\nptywe\nptywf\nptyx0\n"..., 96) = 96
write(1, "b\nptyxc\nptyxd\nptyxe\nptyxf\nptyy0\n"..., 4096) = 3904
write(1, "s17\nvcs18\nvcs19\nvcs2\nvcs20\nvcs21"..., 192) = 192
write(1, "\nvcs47\nvcs48\nvcs49\nvcs5\nvcs50\nvc"..., 673) = 673
close(1) = 0
exit_group(0) = ?
It's apparent from the
first write call that after
ls finished looking in the target directory, it
tried to write 4 KB. Strangely (for ls), only
4000 bytes were written, and the operation was retried. However, we
know that the write implementation in
scull writes a single quantum at a time, so we
could have expected the partial write. After a few steps, everything
sweeps through, and the program exits successfully.
As another example,
let's read the
scull device (using the
wc command):
[...]
open("/dev/scull0", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFCHR|0664, st_rdev=makedev(254, 0), ...}) = 0
read(3, "MAKEDEV\nadmmidi0\nadmmidi1\nadmmid"..., 16384) = 4000
read(3, "b\nptywc\nptywd\nptywe\nptywf\nptyx0\n"..., 16384) = 4000
read(3, "s17\nvcs18\nvcs19\nvcs2\nvcs20\nvcs21"..., 16384) = 865
read(3, "", 16384) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
write(1, "8865 /dev/scull0\n", 17) = 17
close(3) = 0
exit_group(0) = ?
As expected, read is able to retrieve only 4000
bytes at a time, but the total amount of data is the same that was
written in the previous example. It's interesting to
note how retries are organized in this example, as opposed to the
previous trace. wc is optimized for fast reading
and, therefore, bypasses the standard library, trying to read more
data with a single system call. You can see from the
read lines in the trace how
wc tried to read 16 KB at a time.
Linux experts can find much useful information in the output of
strace. If you're put off by
all the symbols, you can limit yourself to watching how the file
methods (open, read, and so
on) work with the efile flag.
Personally, we find strace most useful for
pinpointing runtime errors from system calls. Often
the
perror call in the
application or demo program isn't verbose enough to
be useful for debugging, and being able to tell exactly which
arguments to which system call triggered the error can be a great
help.
|