Given a pathname for a file, open()
returns a file descriptor, a small, nonnegative integer
for use in subsequent system calls
(read(2), write(2), lseek(2), fcntl(2), etc.).
The file descriptor returned by a successful call will be
the lowest-numbered file descriptor not currently open for the process.
By default, the new file descriptor is set to remain open across an
execve(2) (i.e., the FD_CLOEXEC file descriptor flag described in fcntl(2)
is initially disabled; the Linux-specific O_CLOEXEC
flag, described below, can be used to change this default).
The file offset is set to the beginning of the file (see lseek(2)).
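The lowest-numbered rule can be observed with a small sketch. The helper name reopen_reuses_lowest_fd() is hypothetical, and the check assumes a single-threaded process in which no other descriptor is opened or closed in between.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens `path`, closes it, and opens it again.
 * Returns 1 if the second open() reused the same descriptor number,
 * 0 if it did not, and -1 if either open() failed. */
int reopen_reuses_lowest_fd(const char *path)
{
    int first = open(path, O_RDONLY);
    if (first == -1)
        return -1;
    close(first);               /* `first` is the lowest free number again */

    int second = open(path, O_RDONLY);
    if (second == -1)
        return -1;
    close(second);

    return second == first;
}
```

In a multithreaded program another thread may claim the freed number first, so code should not rely on obtaining a particular descriptor value.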
A call to open() creates a new open file description,
an entry in the system-wide table of open files.
This entry records the file offset and the file status flags
(modifiable via the fcntl(2) F_SETFL operation).
A file descriptor is a reference to one of these entries;
this reference is unaffected if pathname
is subsequently removed or modified to refer to a different file.
The new open file description is initially not shared
with any other process,
but sharing may arise via fork(2).
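The difference between a file descriptor and an open file description can be sketched as follows. The example uses dup(2), which the text above does not mention but which also yields a second descriptor for the same open file description. The helper name compare_offsets() is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens `path`, duplicates the descriptor, and opens `path` a second time.
 * After seeking on the first descriptor, stores the offset seen through
 * the duplicate and through the independent open().
 * Returns 0 on success, -1 on error. */
int compare_offsets(const char *path, off_t *dup_offset, off_t *reopen_offset)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    int dup_fd = dup(fd);                   /* same open file description */
    int reopen_fd = open(path, O_RDONLY);   /* new open file description */
    int rc = -1;

    if (dup_fd != -1 && reopen_fd != -1 &&
        lseek(fd, 10, SEEK_SET) != (off_t) -1) {
        *dup_offset = lseek(dup_fd, 0, SEEK_CUR);       /* follows fd */
        *reopen_offset = lseek(reopen_fd, 0, SEEK_CUR); /* stays at start */
        rc = 0;
    }

    if (reopen_fd != -1)
        close(reopen_fd);
    if (dup_fd != -1)
        close(dup_fd);
    close(fd);
    return rc;
}
```

For a regular file, the duplicate reports offset 10 while the second open() still reports offset 0.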
The argument flags must include one of the following access modes:
O_RDONLY, O_WRONLY, or O_RDWR.
These request opening the file read-only, write-only, or read/write,
respectively.
In addition, zero or more file creation flags and file status flags
can be bitwise-or'd in flags.
The file creation flags are
O_CREAT, O_EXCL, O_NOCTTY, and O_TRUNC.
The file status flags
are all of the remaining flags listed below.
The distinction between these two groups of flags is that
the file status flags can be retrieved and (in some cases)
modified using fcntl(2).
The full list of file creation flags and file status flags is as follows:
O_APPEND
The file is opened in append mode.
Before each write(2),
the file offset is positioned at the end of the file,
as if with lseek(2).
O_APPEND may lead to corrupted files on NFS file systems if more than one process
appends data to a file at once.
This is because NFS does not support
appending to a file, so the client kernel has to simulate it, which
can't be done without a race condition.
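A minimal sketch of append mode on a local file system: even after an explicit seek back to the start, the next write lands at the end of the file. The helper name append_after_rewind() and the mode 0600 are illustrative choices.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes 5 bytes, rewinds, then writes 6 more bytes through an
 * O_APPEND descriptor. Returns the final file size, or -1 on error. */
off_t append_after_rewind(const char *path)
{
    struct stat st;
    off_t size = -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (fd == -1)
        return -1;

    if (write(fd, "first", 5) == 5 &&
        lseek(fd, 0, SEEK_SET) == 0 &&      /* rewind to offset 0 ... */
        write(fd, "second", 6) == 6 &&      /* ... but this still appends */
        fstat(fd, &st) == 0)
        size = st.st_size;

    close(fd);
    return size;
}
```

Without O_APPEND the second write would overwrite the first and the file would be 6 bytes long; with it, the file is 11 bytes long.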
O_ASYNC
Enable signal-driven I/O:
generate a signal
(SIGIO by default, but this can be changed via fcntl(2))
when input or output becomes possible on this file descriptor.
This feature is only available for terminals, pseudo-terminals,
sockets, and (since Linux 2.6) pipes and FIFOs.
See fcntl(2) for further details.
O_CLOEXEC (Since Linux 2.6.23)
Enable the close-on-exec flag for the new file descriptor.
Specifying this flag permits a program to avoid additional
fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.
Additionally,
use of this flag is essential in some multithreaded programs
since using a separate fcntl(2) F_SETFD
operation to set the FD_CLOEXEC
flag does not suffice to avoid race conditions
where one thread opens a file descriptor at the same
time as another thread does a fork(2) plus execve(2).
O_CREAT
If the file does not exist it will be created.
The owner (user ID) of the file is set to the effective user ID
of the process.
The group ownership (group ID) is set either to
the effective group ID of the process or to the group ID of the
parent directory (depending on file system type and mount options,
and the mode of the parent directory, see the mount options
bsdgroups and sysvgroups described in mount(8)).

mode specifies the permissions to use in case a new file is created.
This argument must be supplied when O_CREAT
is specified in flags; if O_CREAT
is not specified, then mode is ignored.
The effective permissions are modified by the process's umask
in the usual way: The permissions of the created file are
(mode & ~umask).
Note that this mode only applies to future accesses of the
newly created file; the open()
call that creates a read-only file may well return a read/write
file descriptor.
The following symbolic constants are provided for mode:
S_IRWXU  00700 user (file owner) has read, write and execute permission
S_IRUSR  00400 user has read permission
S_IWUSR  00200 user has write permission
S_IXUSR  00100 user has execute permission
S_IRWXG  00070 group has read, write and execute permission
S_IRGRP  00040 group has read permission
S_IWGRP  00020 group has write permission
S_IXGRP  00010 group has execute permission
S_IRWXO  00007 others have read, write and execute permission
S_IROTH  00004 others have read permission
S_IWOTH  00002 others have write permission
S_IXOTH  00001 others have execute permission
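The (mode & ~umask) rule can be checked with a short sketch. The helper name create_with_umask() is hypothetical; it temporarily changes the process umask, which is not safe to do in a multithreaded program.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates `path` with the requested `mode` while `mask` is the umask.
 * Returns the permission bits of the new file, or (mode_t) -1 on error.
 * O_EXCL is used so that the call fails rather than reusing an existing
 * file, whose permissions would not be changed by open(). */
mode_t create_with_umask(const char *path, mode_t mode, mode_t mask)
{
    struct stat st;
    mode_t old_mask = umask(mask);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
    umask(old_mask);                    /* restore the caller's umask */

    if (fd == -1)
        return (mode_t) -1;

    int rc = fstat(fd, &st);
    close(fd);
    if (rc == -1)
        return (mode_t) -1;

    return st.st_mode & 0777;
}
```

For example, requesting S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH (0666) under a umask of 022 yields a file with permissions 0644.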
O_DIRECT (Since Linux 2.4.10)
Try to minimize cache effects of the I/O to and from this file.
In general this will degrade performance, but it is useful in
special situations, such as when applications do their own caching.
File I/O is done directly to/from user space buffers.
The O_DIRECT flag on its own makes an effort to transfer data synchronously,
but does not give the guarantees of the O_SYNC flag
that data and necessary metadata are transferred.
To guarantee synchronous I/O, O_SYNC
must be used in addition to O_DIRECT.
See NOTES below for further discussion.

A semantically similar (but deprecated) interface for block devices
is described in raw(8).
O_DIRECTORY
If pathname is not a directory, cause the open to fail.
This flag is Linux-specific, and was added in kernel version 2.1.126, to
avoid denial-of-service problems if opendir(3)
is called on a
FIFO or tape device, but should not be used outside of the
implementation of opendir(3).
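A minimal sketch of the flag's effect; the helper name check_directory() is hypothetical. _GNU_SOURCE is defined defensively, although current glibc versions expose O_DIRECTORY by default.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* may be needed for O_DIRECTORY */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if `path` could be opened as a directory,
 * otherwise the errno value reported by open(). */
int check_directory(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return errno;           /* ENOTDIR when path is not a directory */
    close(fd);
    return 0;
}
```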
O_EXCL
Ensure that this call creates the file:
if this flag is specified in conjunction with O_CREAT, and pathname
already exists, then open() will fail.
The behavior of O_EXCL
is undefined if O_CREAT
is not specified.

When these two flags are specified, symbolic links are not followed:
if pathname is a symbolic link, then open()
fails regardless of where the symbolic link points to.

On NFS, O_EXCL
is only supported when using NFSv3 or later on kernel 2.6 or later.
In NFS environments where O_EXCL
support is not provided, programs that rely on it
for performing locking tasks will contain a race condition.
Portable programs that want to perform atomic file locking using a lockfile,
and need to avoid reliance on NFS support for O_EXCL,
can create a unique file on
the same file system (e.g., incorporating hostname and PID), and use link(2)
to make a link to the lockfile.
If link(2) returns 0, the lock is successful.
Otherwise, use stat(2)
on the unique file to check if its link count has increased to 2,
in which case the lock is also successful.
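The link-based locking procedure just described might be sketched as follows. The function name lock_via_link() is hypothetical, and the caller is assumed to supply a unique_path that is unique on the same file system, for instance by incorporating hostname and PID.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Tries to take the lock `lock_path` by hard-linking a freshly created
 * unique file to it. Returns 0 if the lock was obtained, -1 otherwise.
 * The lock is released by unlinking `lock_path`. */
int lock_via_link(const char *unique_path, const char *lock_path)
{
    struct stat st;
    int have_lock = 0;

    int fd = open(unique_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    if (link(unique_path, lock_path) == 0) {
        have_lock = 1;                      /* link() reported success */
    } else if (stat(unique_path, &st) == 0 && st.st_nlink == 2) {
        have_lock = 1;                      /* link was made despite the error */
    }

    unlink(unique_path);                    /* the lockfile itself remains */
    return have_lock ? 0 : -1;
}
```

A second caller using a different unique file fails to obtain the lock until the first caller unlinks the lockfile.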
O_LARGEFILE
(LFS) Allow files whose sizes cannot be represented in an off_t
(but can be represented in an off64_t)
to be opened.
The _LARGEFILE64_SOURCE
macro must be defined
in order to obtain this definition.
Setting the _FILE_OFFSET_BITS
feature test macro to 64 (rather than using O_LARGEFILE)
is the preferred
method of accessing large files on 32-bit systems (see feature_test_macros(7)).
O_NOATIME (Since Linux 2.6.8)
Do not update the file last access time (st_atime in the inode)
when the file is read(2).
This flag is intended for use by indexing or backup programs,
where its use can significantly reduce the amount of disk activity.
This flag may not be effective on all file systems.
One example is NFS, where the server maintains the access time.
O_NOCTTY
If pathname refers to a terminal device (see tty(4)),
it will not become the process's controlling terminal even if the
process does not have one.
O_NOFOLLOW
If pathname is a symbolic link, then the open fails.
This is a FreeBSD extension, which was added to Linux in version 2.1.126.
Symbolic links in earlier components of the pathname will still be
followed.
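A small sketch of this behavior; the helper name open_nofollow_errno() is hypothetical. On Linux the failure is reported as ELOOP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* may be needed for O_NOFOLLOW */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if `path` was opened without following a final symbolic
 * link, otherwise the errno value reported by open(). */
int open_nofollow_errno(const char *path)
{
    int fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd == -1)
        return errno;           /* ELOOP when the last component is a symlink */
    close(fd);
    return 0;
}
```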
O_NONBLOCK or O_NDELAY
When possible, the file is opened in nonblocking mode.
Neither the open()
nor any subsequent operations on the file descriptor which is
returned will cause the calling process to wait.
For the handling of FIFOs (named pipes), see also fifo(7).
For a discussion of the effect of O_NONBLOCK
in conjunction with mandatory file locks and with file leases, see fcntl(2).
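As an illustration with a FIFO: a nonblocking open for writing returns immediately when no reader is present, failing with ENXIO instead of waiting. The helper name fifo_write_open_errno() is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a FIFO at `path` and tries to open it for writing without
 * blocking while no process has it open for reading.
 * Returns 0 if the open succeeded, otherwise the errno value from
 * mkfifo() or open(). The FIFO is removed before returning. */
int fifo_write_open_errno(const char *path)
{
    if (mkfifo(path, 0600) == -1)
        return errno;

    int fd = open(path, O_WRONLY | O_NONBLOCK);
    int err = (fd == -1) ? errno : 0;   /* ENXIO expected: no reader */
    if (fd != -1)
        close(fd);

    unlink(path);
    return err;
}
```

Without O_NONBLOCK the same open() would block until some process opened the FIFO for reading.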
O_SYNC
The file is opened for synchronous I/O.
Any write(2)s
on the resulting file descriptor will block the calling process until
the data has been physically written to the underlying hardware.
But see NOTES below.
O_TRUNC
If the file already exists and is a regular file and the open mode allows
writing (i.e., is O_RDWR or O_WRONLY)
it will be truncated to length 0.
If the file is a FIFO or terminal device file, the O_TRUNC
flag is ignored.
Otherwise the effect of O_TRUNC is unspecified.
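A minimal sketch of truncation on a regular file; the helper name size_after_trunc() is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes a few bytes to `path`, then reopens it with O_TRUNC.
 * Returns the file size seen after the second open(), or -1 on error. */
off_t size_after_trunc(const char *path)
{
    struct stat st;

    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    ssize_t written = write(fd, "data", 4);
    close(fd);
    if (written != 4)
        return -1;

    fd = open(path, O_WRONLY | O_TRUNC);    /* write access plus O_TRUNC */
    if (fd == -1)
        return -1;
    int rc = fstat(fd, &st);
    close(fd);

    return rc == 0 ? st.st_size : -1;
}
```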
Some of these optional flags can be altered using fcntl(2)
after the file has been opened.

creat() is equivalent to open() with flags equal to
O_CREAT|O_WRONLY|O_TRUNC.

open() and creat()
return the new file descriptor, or -1 if an error occurred
(in which case, errno
is set appropriately).
EACCES
The requested access to the file is not allowed, or search permission
is denied for one of the directories in the path prefix of pathname,
or the file did not exist yet and write access to the parent directory
is not allowed.
EEXIST
pathname already exists and
O_CREAT and O_EXCL were used.
EFAULT
pathname points outside your accessible address space.
EINTR
While blocked waiting to complete an open of a slow device
(e.g., a FIFO; see fifo(7)),
the call was interrupted by a signal handler; see signal(7).
EISDIR
pathname refers to a directory and the access requested involved writing
(that is, O_WRONLY or O_RDWR is set).
ELOOP
Too many symbolic links were encountered in resolving pathname,
or O_NOFOLLOW was specified but pathname
was a symbolic link.
EMFILE
The process already has the maximum number of files open.
ENAMETOOLONG
pathname was too long.
ENFILE
The system limit on the total number of open files has been reached.
ENODEV
pathname refers to a device special file and no corresponding device exists.
(This is a Linux kernel bug; in this situation ENXIO
must be returned.)
ENOENT
O_CREAT is not set and the named file does not exist.
Or, a directory component in pathname
does not exist or is a dangling symbolic link.
ENOMEM
Insufficient kernel memory was available.
ENOSPC
pathname was to be created but the device containing pathname
has no room for the new file.
ENOTDIR
A component used as a directory in pathname
is not, in fact, a directory, or O_DIRECTORY was specified and pathname
was not a directory.
ENXIO
O_NONBLOCK | O_WRONLY
is set, the named file is a FIFO and
no process has the file open for reading.
Or, the file is a device special file and no corresponding device exists.
EOVERFLOW
pathname refers to a regular file that is too large to be opened.
The usual scenario here is that an application compiled
on a 32-bit platform without -D_FILE_OFFSET_BITS=64
tried to open a file whose size exceeds (1<<31)-1 bytes;
see also O_LARGEFILE above.
This is the error specified by POSIX.1-2001;
in kernels before 2.6.24, Linux gave the error EFBIG
for this case.
EPERM
The O_NOATIME
flag was specified, but the effective user ID of the caller
did not match the owner of the file and the caller was not privileged
(CAP_FOWNER).
EROFS
pathname refers to a file on a read-only file system and write access was
requested.
ETXTBSY
pathname refers to an executable image which is currently being executed and
write access was requested.
EWOULDBLOCK
The O_NONBLOCK
flag was specified, and an incompatible lease was held on the file
(see fcntl(2)).
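Callers usually inspect errno after a return value of -1. The sketch below shows one common pattern, retrying when the call was interrupted by a signal handler (EINTR) and leaving every other error to the caller. The wrapper name open_retry() is hypothetical and is not part of any library.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Calls open() and repeats the call for as long as it fails with EINTR.
 * Returns the new descriptor, or -1 with errno set by the last attempt.
 * `mode` is only used by open() when O_CREAT is present in `flags`. */
int open_retry(const char *path, int flags, mode_t mode)
{
    int fd;

    do {
        fd = open(path, flags, mode);
    } while (fd == -1 && errno == EINTR);

    return fd;
}
```

For example, open_retry() on a path whose directory component does not exist returns -1 with errno set to ENOENT.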
These calls conform to SVr4, 4.3BSD, and POSIX.1-2001.
The O_DIRECTORY, O_NOATIME, and O_NOFOLLOW
flags are Linux-specific, and one may need to define _GNU_SOURCE
to obtain their definitions.
The O_CLOEXEC
flag is not specified in POSIX.1-2001,
but is specified in POSIX.1-2008.
O_DIRECT
is not specified in POSIX; one has to define _GNU_SOURCE
to get its definition.
Under Linux, the O_NONBLOCK
flag indicates that one wants to open
but does not necessarily have the intention to read or write.
This is typically used to open devices in order to get a file descriptor
for use with ioctl(2).
Unlike the other values that can be specified in flags,
the access mode values
O_RDONLY, O_WRONLY, and O_RDWR,
do not specify individual bits.
Rather, they define the low order two bits of flags,
and are defined respectively as 0, 1, and 2.
In other words, the combination
O_RDONLY | O_WRONLY
is a logical error, and certainly does not have the same meaning as O_RDWR.
Linux reserves the special, nonstandard access mode 3 (binary 11) in flags
to mean:
check for read and write permission on the file and return a descriptor
that can't be used for reading or writing.
This nonstandard access mode is used by some Linux drivers to return a
descriptor that is only to be used for device-specific ioctl(2)
operations.
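Because the access mode is a two-bit field rather than a set of independent bits, it should be extracted with the O_ACCMODE mask and compared, not tested with a bitwise AND against a single constant. A minimal sketch, with the hypothetical helper access_mode_name():

```c
#include <fcntl.h>

/* Returns the name of the access mode of an open descriptor,
 * "other" for the nonstandard mode 3, or "error" if fcntl() fails. */
const char *access_mode_name(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return "error";

    switch (flags & O_ACCMODE) {    /* isolate the low-order two bits */
    case O_RDONLY:
        return "O_RDONLY";
    case O_WRONLY:
        return "O_WRONLY";
    case O_RDWR:
        return "O_RDWR";
    default:
        return "other";
    }
}
```

A test such as (flags & O_RDONLY) is always false on Linux because O_RDONLY is 0, which is why the mask-and-compare form is needed.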
The (undefined) effect of
O_RDONLY | O_TRUNC
varies among implementations.
On many systems the file is actually truncated.
There are many infelicities in the protocol underlying NFS, affecting
amongst others
O_SYNC and O_NDELAY.
POSIX provides for three different variants of synchronized I/O,
corresponding to the flags
O_SYNC, O_DSYNC, and O_RSYNC.
Currently (2.6.31), Linux only implements O_SYNC,
but glibc maps O_DSYNC and O_RSYNC
to the same numerical value as O_SYNC.
Most Linux file systems don't actually implement the POSIX O_SYNC
semantics, which require all metadata updates of a write
to be on disk on returning to userspace, but only the O_DSYNC
semantics, which require only actual file data and metadata necessary
to retrieve it to be on disk by the time the system call returns.
Note that open()
can open device special files, but creat()
cannot create them; use mknod(2) instead.
On NFS file systems with UID mapping enabled, open() may
return a file descriptor but, for example, read(2)
requests are denied with EACCES.
This is because the client performs open()
by checking the
permissions, but UID mapping is performed by the server upon
read and write requests.
If the file is newly created, its st_atime, st_ctime, and st_mtime fields
(respectively, time of last access, time of last status change, and
time of last modification; see stat(2)) are set
to the current time, and so are the st_ctime and st_mtime
fields of the parent directory.
Otherwise, if the file is modified because of the O_TRUNC
flag, its st_ctime and st_mtime fields are set to the current time.
The O_DIRECT
flag may impose alignment restrictions on the length and address
of userspace buffers and the file offset of I/Os.
In Linux alignment
restrictions vary by file system and kernel version and might be
absent entirely.
However there is currently no file system-independent
interface for an application to discover these restrictions for a given
file or file system.
Some file systems provide their own interfaces
for doing so, for example the XFS_IOC_DIOINFO operation in xfsctl(3).
Under Linux 2.4, transfer sizes, and the alignment of the user buffer
and the file offset must all be multiples of the logical block size
of the file system.
Under Linux 2.6, alignment to 512-byte boundaries suffices.
The O_DIRECT
flag was introduced in SGI IRIX, where it has alignment
restrictions similar to those of Linux 2.4.
IRIX also has an fcntl(2)
call to query appropriate alignments and sizes.
FreeBSD 4.x introduced
a flag of the same name, but without alignment restrictions.
O_DIRECT
support was added under Linux in kernel version 2.4.10.
Older Linux kernels simply ignore this flag.
Some file systems may not implement the flag and open()
will fail with EINVAL
if it is used.
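The following sketch shows one way an application might attempt direct I/O with an aligned buffer and fall back to ordinary I/O when the file system rejects the flag. The 4096-byte alignment is an assumption chosen to be conservative, since the real requirement varies as described above. The helper name write_block_prefer_direct() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for the definition of O_DIRECT */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEMO_ALIGN 4096         /* assumed alignment and transfer size */

/* Writes one aligned block to `path`, using O_DIRECT when possible.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_block_prefer_direct(const char *path)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DEMO_ALIGN, DEMO_ALIGN) != 0)
        return -1;
    memset(buf, 'x', DEMO_ALIGN);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd == -1 && errno == EINVAL)    /* flag not implemented here */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        free(buf);
        return -1;
    }

    ssize_t n = write(fd, buf, DEMO_ALIGN);
    if (n == -1 && errno == EINVAL) {
        /* The open succeeded but the transfer was rejected:
         * clear O_DIRECT and retry through the page cache. */
        int fl = fcntl(fd, F_GETFL);
        if (fl != -1 && fcntl(fd, F_SETFL, fl & ~O_DIRECT) == 0)
            n = write(fd, buf, DEMO_ALIGN);
    }

    close(fd);
    free(buf);
    return n;
}
```

The buffer address, the transfer length, and the file offset (0 here) are all multiples of DEMO_ALIGN, which satisfies both the 512-byte rule and typical logical block sizes.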
Applications should avoid mixing O_DIRECT
and normal I/O to the same file,
and especially to overlapping byte regions in the same file.
Even when the file system correctly handles the coherency issues in
this situation, overall I/O throughput is likely to be slower than
using either mode alone.
Likewise, applications should avoid mixing mmap(2)
of files with direct I/O to the same files.
The behavior of O_DIRECT
with NFS will differ from local file systems.
Older kernels, or
kernels configured in certain ways, may not support this combination.
The NFS protocol does not support passing the flag to the server, so
O_DIRECT
I/O will only bypass the page cache on the client; the server may
still cache the I/O.
The client asks the server to make the I/O
synchronous to preserve the synchronous semantics of O_DIRECT.
Some servers will perform poorly under these circumstances, especially
if the I/O size is small.
Some servers may also be configured to
lie to clients about the I/O having reached stable storage; this
will avoid the performance penalty at some risk to data integrity
in the event of server power failure.
The Linux NFS client places no alignment restrictions on O_DIRECT I/O.
In summary, O_DIRECT
is a potentially powerful tool that should be used with caution.
It is recommended that applications treat use of O_DIRECT
as a performance option which is disabled by default.
"The thing that has always disturbed me about O_DIRECT is that the whole
interface is just stupid, and was probably designed by a deranged monkey
on some serious mind-controlling substances." --- Linus
Currently, it is not possible to enable signal-driven
I/O by specifying O_ASYNC when calling open(); use fcntl(2)
to enable this flag.