Unix file types


The standard Unix file types, as defined by POSIX, are regular, directory, symbolic link, FIFO special, block special, character special, and socket. A file has exactly one type. Some OS-specific implementations support more types than POSIX requires (for example, Solaris doors). A file's type can be identified with the ls -l command, which displays the type as the first character of the file system permissions field.
For regular files, Unix does not impose or provide any internal file structure; therefore, their structure and interpretation are entirely dependent on the software using them. However, the file command can be used to determine what type of data they contain.

Representations

Numeric

In the stat structure, file type and permissions are stored together in a bit field (st_mode), which has a size of at least 12 bits. POSIX defines the permissions to occupy the least-significant 9 bits, but leaves the layout of the remaining bits unspecified.
By convention, the mode is a 16-bit value written out as a six-digit octal number without a leading zero. The file type occupies the leading 4 bits, and the octal digits "10" usually stand for a regular file. The middle 3 bits are usually used for setuid, setgid, and sticky. The trailing 9 bits hold the permissions, as defined by POSIX. An example is "100644" for a typical file. This format can be seen in git, tar, and ar, among other places.
The type of a file can be tested using macros like S_ISDIR. Such a check is usually performed by masking the mode with S_IFMT and checking whether the result matches S_IFDIR. S_IFMT is not a core POSIX concept, but an X/Open System Interfaces extension; systems conforming only to POSIX may use other methods.
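The conventional 16-bit layout described above can be illustrated with a few shifts and masks. The following is a minimal sketch that assumes that conventional layout (which POSIX does not guarantee); the helper name split_mode is hypothetical and introduced only for illustration.

```python
def split_mode(mode):
    """Split a conventional 16-bit mode into (type, special, permissions)."""
    file_type = (mode >> 12) & 0o17  # leading 4 bits: file type
    special = (mode >> 9) & 0o7      # middle 3 bits: setuid, setgid, sticky
    perms = mode & 0o777             # trailing 9 bits: rwx for user, group, other
    return file_type, special, perms


# The "typical file" example from the text, written as an octal literal.
file_type, special, perms = split_mode(0o100644)
print(oct(file_type), oct(special), oct(perms))  # 0o10 0o0 0o644
```

Portable programs should prefer the S_IS* test macros over hand-written shifts, since only the permission bits have a layout fixed by POSIX.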
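As a rough illustration of the masking check, the sketch below uses Python's stat module, which mirrors the C names. Note that in Python stat.S_IFMT is a function that applies the mask, not a mask constant as in C. The helper name is_directory is hypothetical.

```python
import os
import stat
import tempfile


def is_directory(path):
    """Return True if path names a directory, using the type mask explicitly."""
    mode = os.stat(path).st_mode
    # Keep only the file-type bits, then compare with the directory constant.
    return stat.S_IFMT(mode) == stat.S_IFDIR


with tempfile.TemporaryDirectory() as tmp:
    mode = os.stat(tmp).st_mode
    # The convenience predicate and the manual mask should agree.
    print(stat.S_ISDIR(mode), is_directory(tmp))
```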

Mode string

Take for example one line in the ls -l output:
drwxr-xr-x 2 root root 0 Jan 1 1970 home
POSIX specifies the format of the output for the long format. In particular, the first field is dubbed the "file mode string" and its first character describes the file type. The rest of this string indicates the file permissions.
Therefore, in the example, the mode string is drwxr-xr-x: the file type is d and the permissions are rwxr-xr-x.
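The same split can be reproduced programmatically. The sketch below assumes Python's stat.filemode, which renders a numeric mode in the ls -l style; the helper split_mode_string is hypothetical, and the numeric mode 0o040755 is an assumed value chosen to match the example line.

```python
import stat


def split_mode_string(mode_string):
    """Split an ls -l style mode string into (type character, permissions)."""
    return mode_string[0], mode_string[1:10]


# 0o040755 is assumed here: a directory with rwxr-xr-x permissions.
mode_string = stat.filemode(0o040755)
file_type, permissions = split_mode_string(mode_string)
print(mode_string, file_type, permissions)  # drwxr-xr-x d rwxr-xr-x
```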

Examples of implementations

The GNU coreutils version of ls obtains the mode string by calling filemode, a function from the Gnulib library.
FreeBSD uses a simpler approach but allows a smaller number of file types.

Regular file

Regular files show up in ls -l with a hyphen-minus - in the mode field:
$ ls -l /etc/passwd
-rw-r--r--... /etc/passwd

Directory

The most common special file is the directory. The layout of a directory file is defined by the filesystem used. As several filesystems are available under Unix, both native and non-native, there is no one directory file layout.
A directory is marked with a d as the first letter in the mode field in the output of ls -dl or stat, e.g.
$ ls -dl /
drwxr-xr-x 26 root root 4096 Sep 22 09:29 /
$ stat /
File: "/"
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 802h/2050d Inode: 128 Links: 26
Access: Uid: Gid:
...
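Because the on-disk layout of a directory depends on the filesystem, programs normally read directories through a portable interface instead of parsing the directory file itself. The sketch below uses Python's os.scandir for this; the helper list_entries and the file names a.txt and sub are hypothetical.

```python
import os
import stat
import tempfile


def list_entries(path):
    """Return sorted (name, type character) pairs for the entries of a directory."""
    with os.scandir(path) as entries:
        return sorted(
            (e.name, stat.filemode(e.stat(follow_symlinks=False).st_mode)[0])
            for e in entries
        )


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()  # a regular file
    os.mkdir(os.path.join(tmp, "sub"))             # a subdirectory
    print(stat.filemode(os.stat(tmp).st_mode)[0])  # type of the directory itself
    print(list_entries(tmp))
```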

Symbolic link

A symbolic link is a reference to another file. This special file is stored as a textual representation of the referenced file's path.
A symbolic link is marked with an l as the first letter of the mode string, e.g.
lrwxrwxrwx... termcap -> /usr/share/misc/termcap
lrwxrwxrwx... S03xinetd -> ../init.d/xinetd
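To illustrate that a symbolic link stores a path as text, the following sketch creates a link in a temporary directory and inspects it. It uses os.lstat, which examines the link itself, whereas os.stat follows the link to its target. The helper link_info and the file names are hypothetical.

```python
import os
import stat
import tempfile


def link_info(link_path):
    """Return (type character of the link itself, stored target path)."""
    st = os.lstat(link_path)  # lstat does not follow the link
    return stat.filemode(st.st_mode)[0], os.readlink(link_path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link")
    open(target, "w").close()
    os.symlink("target.txt", link)  # the relative path is stored verbatim
    print(link_info(link))
    # os.stat follows the link, so it reports the target's type instead.
    print(stat.S_ISREG(os.stat(link).st_mode))
```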

FIFO (named pipe)

One of the strengths of Unix has always been inter-process communication. Among the facilities provided by the OS are pipes, which connect the output of one process to the input of another. This is fine if both processes exist in the same parent process space, started by the same user, but there are circumstances where the communicating processes must use FIFOs, here referred to as named pipes. One such circumstance occurs when the processes must be executed under different user names and permissions.
Named pipes are special files that can exist anywhere in the file system. They can be created with the command mkfifo as in mkfifo mypipe.
A named pipe is marked with a p as the first letter of the mode string, e.g.
prw-rw----... mypipe
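The same operation is available programmatically. The sketch below creates a named pipe with os.mkfifo and passes a message through it. For brevity it uses two threads of one process, which is an assumption of this example; unrelated processes would open the same path in the same way. The helper fifo_round_trip is hypothetical.

```python
import os
import stat
import tempfile
import threading


def fifo_round_trip(path, message):
    """Create a FIFO at path, send message through it, and return what was read."""
    os.mkfifo(path, 0o660)  # the resulting permissions are subject to the umask

    def writer():
        # Opening a FIFO for writing blocks until a reader opens it.
        with open(path, "w") as pipe:
            pipe.write(message)

    thread = threading.Thread(target=writer)
    thread.start()
    with open(path) as pipe:  # blocks until the writer opens the FIFO
        received = pipe.read()  # returns once the writer closes its end
    thread.join()
    return received


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mypipe")
    print(fifo_round_trip(path, "hello"))
    print(stat.filemode(os.stat(path).st_mode)[0])  # p
```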

Socket

A socket is a special file used for inter-process communication between two processes. In addition to sending data, processes can send file descriptors across a Unix domain socket connection using the sendmsg and recvmsg system calls.
Unlike named pipes, which allow only unidirectional data flow, sockets are fully duplex-capable.
A socket is marked with an s as the first letter of the mode string, e.g.
srwxrwxrwx /tmp/.X11-unix/X0
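The sketch below illustrates both points: binding a Unix domain socket to a path creates a file of type s, and a connected pair of sockets carries data in both directions. The helper names socket_type_char and duplex_demo and the file name demo.sock are hypothetical.

```python
import os
import socket
import stat
import tempfile


def socket_type_char(path):
    """Bind a Unix domain socket at path and return its mode-string type character."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)  # bind() creates the socket file
        return stat.filemode(os.stat(path).st_mode)[0]
    finally:
        server.close()


def duplex_demo():
    """Exchange one message in each direction over a connected socket pair."""
    left, right = socket.socketpair()
    try:
        left.sendall(b"ping")
        request = right.recv(4)
        right.sendall(b"pong")  # the same connection carries the reply
        reply = left.recv(4)
        return request, reply
    finally:
        left.close()
        right.close()


with tempfile.TemporaryDirectory() as tmp:
    print(socket_type_char(os.path.join(tmp, "demo.sock")))  # s
print(duplex_demo())
```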

Device file (block, character)

In Unix, almost everything is handled as a file and has a location in the file system, even hardware devices such as hard drives. The notable exception is network devices, which do not appear in the file system but are handled separately.
Device files are used to apply access rights to the devices and to direct operations on the files to the appropriate device drivers.
Unix makes a distinction between character devices and block devices. The distinction is roughly as follows: character devices provide only a serial stream of input or output, whereas block devices are randomly accessible.
However, the distinction is not strict: disk partitions, for example, may have both character devices that provide unbuffered random access to blocks on the partition and block devices that provide buffered random access to them.
A character device is marked with a c as the first letter of the mode string. Likewise, a block device is marked with a b, e.g.
crw-------... /dev/null
brw-rw----... /dev/sda
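A program can make the same distinction with the S_ISCHR and S_ISBLK tests. The sketch below also reports the major and minor device numbers stored in st_rdev, which the kernel uses to select the driver. The helper device_kind is hypothetical, and the example assumes that /dev/null exists on the system.

```python
import os
import stat


def device_kind(path):
    """Return (kind, major, minor) for a device file, or None for other file types."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev encodes the device numbers that identify the driver and unit.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


print(device_kind("/dev/null"))  # a character device on typical systems
print(device_kind("/"))          # None: a directory is not a device file
```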

Door

A door is a special file for inter-process communication between a client and server, currently implemented only in Solaris.
A door is marked with a D as the first letter of the mode string, e.g.
Dr--r--r--... name_service_door