8.4 fsync and fdatasync: Flushing Disk Buffers

On most operating systems, when you write to a file, the data is not immediately written to disk. Instead, the operating system caches the written data in a memory buffer, to reduce the number of required disk writes and improve program responsiveness. When the buffer fills or some other condition occurs (for instance, enough time elapses), the system writes the cached data to disk all at one time.

Linux provides caching of this type as well. Normally, this is a great boon to performance. However, this behavior can make programs that depend on the integrity of disk-based records unreliable. If the system goes down suddenly—for instance, due to a kernel crash or power outage—any data written by a program that is in the memory cache but has not yet been written to disk is lost.

For example, suppose that you are writing a transaction-processing program that keeps a journal file. The journal file contains records of all transactions that have been processed so that if a system failure occurs, the state of the transaction data can be reconstructed. It is obviously important to preserve the integrity of the journal file: whenever a transaction is processed, its journal entry should be sent to the disk drive immediately.

To help you implement this, Linux provides the fsync system call. It takes one argument, a writable file descriptor, and flushes to disk any data written to this file. The fsync call doesn't return until the data has physically been written.

The function in Listing 8.3 illustrates the use of fsync. It writes a single-line entry to a journal file.

Listing 8.3 (write_journal_entry.c) Write and Sync a Journal Entry
#include <fcntl.h> 
#include <string.h> 
#include <sys/stat.h> 
#include <sys/types.h> 
#include <unistd.h> 
 
const char* journal_filename = "journal.log"; 
 
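/* Append ENTRY, followed by a newline, to the journal file and flush it
   to disk.  Returns 0 on success, or -1 if any step fails.  */
int write_journal_entry (const char* entry)
{
  size_t length = strlen (entry);
  int fd = open (journal_filename, O_WRONLY | O_CREAT | O_APPEND, 0660);
  if (fd == -1)
    return -1;
  /* Write the entry, then force it out of the kernel's buffer cache.  */
  if (write (fd, entry, length) != (ssize_t) length
      || write (fd, "\n", 1) != 1
      || fsync (fd) != 0) {
    close (fd);
    return -1;
  }
  return close (fd);
}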

Another system call, fdatasync, does much the same thing. However, although fsync guarantees that the file's modification time will be updated, fdatasync does not; it guarantees only that the file's data, plus any metadata needed to read that data back (such as the file's size), will be written. This means that in principle, fdatasync can execute faster than fsync because it may need to force only one disk write instead of two. In Linux 2.2 and earlier kernels, these two system calls actually do the same thing, both updating the file's modification time; later kernels implement fdatasync separately.
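As a rough sketch of how fdatasync might be used in place of fsync, the function below appends an entry to a journal and flushes only its data. The function name append_entry_datasync and its path parameter are illustrative assumptions, not part of Listing 8.3, and the error handling is deliberately minimal (a short write is simply treated as a failure).

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append ENTRY and a newline to the file at PATH,
   then flush the file's data with fdatasync.  Returns 0 on success,
   or -1 if any step fails.  */
int append_entry_datasync (const char* path, const char* entry)
{
  size_t length = strlen (entry);
  int fd = open (path, O_WRONLY | O_CREAT | O_APPEND, 0660);
  if (fd == -1)
    return -1;

  /* Write the entry, then ask the kernel to flush the file's data.
     Unlike fsync, fdatasync need not flush the modification time.  */
  int ok = write (fd, entry, length) == (ssize_t) length
           && write (fd, "\n", 1) == 1
           && fdatasync (fd) == 0;

  /* Close the descriptor even if an earlier step failed.  */
  if (close (fd) != 0)
    ok = 0;
  return ok ? 0 : -1;
}
```

A caller might invoke append_entry_datasync ("journal.log", "transaction 17 committed") and treat a return value of -1 as a reason not to acknowledge the transaction.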

The fsync system call enables you to force a buffer write explicitly. You can also open a file for synchronous I/O, which causes all writes to be committed to disk immediately. To do this, specify the O_SYNC flag when opening the file with the open call.
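The O_SYNC approach might look something like the following sketch. The function name write_entry_sync and its path parameter are assumptions made for illustration. Because every write to an O_SYNC descriptor waits for the disk, the sketch uses writev to send the entry and its newline in a single call rather than paying for two synchronous writes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: append ENTRY and a newline to the file at PATH
   using synchronous I/O.  Returns 0 on success, or -1 on failure.  */
int write_entry_sync (const char* path, const char* entry)
{
  size_t length = strlen (entry);
  struct iovec parts[2];
  ssize_t written;

  /* With O_SYNC, each write returns only after the data has been
     committed, so no separate fsync call is needed.  */
  int fd = open (path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0660);
  if (fd == -1)
    return -1;

  /* Gather the entry and the trailing newline into one write.  */
  parts[0].iov_base = (void*) entry;
  parts[0].iov_len = length;
  parts[1].iov_base = (void*) "\n";
  parts[1].iov_len = 1;
  written = writev (fd, parts, 2);

  if (close (fd) != 0)
    return -1;
  /* Treat a short write as a failure in this sketch.  */
  return written == (ssize_t) (length + 1) ? 0 : -1;
}
```

Which approach is preferable depends on the write pattern: O_SYNC is convenient when every write must be durable, while an explicit fsync lets a program batch several writes and pay for one flush.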