Sunday, September 27, 2009

Linux: About files and the file system

Chapter 3. About files and the file system
After the initial exploration in Chapter 2, we are ready to discuss the files and directories on a
Linux system in more detail. Many users have difficulties with Linux because they lack an
overview of what kind of data is kept in which locations. We will try to shine some light on
the organization of files in the file system.
We will also list the most important files and directories and use different methods of viewing
the content of those files, and learn how files and directories can be created, moved and
deleted.
After completion of the exercises in this chapter, you will be able to:
· Describe the layout of a Linux file system
· Display and set paths
· Describe the most important files, including kernel and shell
· Find lost and hidden files
· Create, move and delete files and directories
· Display contents of files
· Understand and use different link types
· Find out about file properties and change file permissions
3.1. General overview of the Linux file system
3.1.1. Files
3.1.1.1. General
A simple description of the UNIX system, also applicable to Linux, is this:
"On a UNIX system, everything is a file; if something is not a file, it is a process."
This statement is a simplification, because there are special files that are more than just files (named pipes and sockets, for
instance), but to keep things simple, saying that everything is a file is an acceptable generalization. A Linux
system, just like UNIX, makes no difference between a file and a directory, since a directory is just a file
containing names of other files. Programs, services, texts, images, and so forth, are all files. Input and output
devices, and generally all devices, are considered to be files, according to the system.
In order to manage all those files in an orderly fashion, people like to think of them in an ordered tree-like
structure on the hard disk, as we know from MS-DOS (Disk Operating System) for instance. The large
branches contain more branches, and the branches at the end contain the tree's leaves or normal files. For now
we will use this image of the tree, but we will find out later why this is not a fully accurate image.
3.1.1.2. Sorts of files
Most files are just files, called regular files; they contain normal data, for example text files, executable files
or programs, input for or output from a program and so on.
While it is reasonably safe to suppose that everything you encounter on a Linux system is a file, there are
some exceptions.
· Directories: files that are lists of other files.
· Special files: the mechanism used for input and output. Most special files are in /dev, we will
discuss them later.
· Links: a system to make a file or directory visible in multiple parts of the system's file tree. We will
talk about links in detail.
· (Domain) sockets: a special file type, similar to TCP/IP sockets, providing inter-process networking
protected by the file system's access control.
· Named pipes: act more or less like sockets and form a way for processes to communicate with each
other, without using network socket semantics.
The -l option to ls displays the file type, using the first character of each output line:
jaime:~/Documents> ls -l
total 80
-rw-rw-r-- 1 jaime jaime 31744 Feb 21 17:56 intro Linux.doc
-rw-rw-r-- 1 jaime jaime 41472 Feb 21 17:56 Linux.doc
drwxrwxr-x 2 jaime jaime 4096 Feb 25 11:50 course
This table gives an overview of the characters determining the file type:
Table 3-1. File types in a long list
Symbol  Meaning
-       Regular file
d       Directory
l       Link (symbolic)
c       Special file (character device)
s       Socket
p       Named pipe
b       Block device
To avoid running a long listing each time you want to see the file type, many systems are set up so that
ls actually runs ls -F, which suffixes file names with one of the characters "/=*|@" to indicate the file type.
To make it extra easy on the beginning user, the -F and --color options are usually combined; see
Section 3.3.1.1. We will use ls -F throughout this document for better readability.
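The type characters from Table 3-1 can be checked on a few scratch files. What follows is a minimal sketch, assuming a POSIX shell with mktemp and mkfifo available; the helper name file_type_char and the file names are made up for this illustration:

```shell
#!/bin/sh
# file_type_char is a hypothetical helper: it prints the first character
# of the long listing for one file, which encodes the file type.
file_type_char() {
    ls -ld -- "$1" | cut -c1
}

# Build a scratch directory holding files of several types.
demo=$(mktemp -d)
touch "$demo/regular"
mkdir "$demo/directory"
ln -s regular "$demo/link"
mkfifo "$demo/pipe"

for name in regular directory link pipe; do
    printf '%s  %s\n' "$(file_type_char "$demo/$name")" "$name"
done

# The same information as suffixes: / for directories, @ for links,
# | for named pipes.
ls -F "$demo"

rm -r "$demo"
```

The loop prints one type character per file, in the order -, d, l, p, matching the symbols in the table.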
As a user, you only need to deal directly with plain files, executable files, directories and links. The special
file types are there for making your system do what you demand from it and are dealt with by system
administrators and programmers.
Now, before we look at the important files and directories, we need to know more about partitions.
3.1.2. About partitioning
3.1.2.1. Why partition?
Most people have a vague knowledge of what partitions are, since every operating system has the ability to
create or remove them. It may seem strange that Linux uses more than one partition on the same disk, even
when using the standard installation procedure, so some explanation is called for.
One of the goals of having different partitions is to achieve higher data security in case of disaster. By
dividing the hard disk into partitions, data can be grouped and separated. When an accident occurs, only the data
in the partition that got the hit will be damaged, while the data on the other partitions will most likely survive.
This principle dates from the days when Linux didn't have journaled file systems and power failures might
have led to disaster. The use of partitions remains for security and robustness reasons, so a breach on one
part of the system doesn't automatically mean that the whole computer is in danger. This is currently the most
important reason for partitioning. A simple example: a user creates a script, a program or a web application
that starts filling up the disk. If the disk contains only one big partition, the entire system will stop functioning
if the disk is full. If the user stores the data on a separate partition, then only that (data) partition will be
affected, while the system partitions and possible other data partitions keep functioning.
Mind that having a journaled file system only provides data security in case of power failure and sudden
disconnection of storage devices. This does not protect your data against bad blocks and logical errors in the
file system. In those cases, you should use a RAID (Redundant Array of Inexpensive Disks) solution.
3.1.2.2. Partition layout and types
There are two kinds of major partitions on a Linux system:
· data partition: normal Linux system data, including the root partition containing all the data to start
up and run the system; and
· swap partition: expansion of the computer's physical memory, extra memory on hard disk.
Most systems contain a root partition, one or more data partitions and one or more swap partitions. Systems in
mixed environments may contain partitions for other system data, such as a partition with a FAT or VFAT file
system for MS Windows data.
Most Linux systems use fdisk at installation time to set the partition type. As you may have noticed during the
exercise from Chapter 1, this usually happens automatically. On some occasions, however, you may not be so
lucky. In such cases, you will need to select the partition type manually and even manually do the actual
partitioning. The standard Linux partitions have number 82 for swap and 83 for data, which can be journaled
(ext3) or normal (ext2, on older systems). The fdisk utility has built-in help, should you forget these values.
Apart from these two, Linux supports a variety of other file system types, such as the relatively new Reiser
file system, JFS, NFS, FATxx and many other file systems natively available on other (proprietary) operating
systems.
The standard root partition (indicated with a single forward slash, /) is about 100-500 MB, and contains the
system configuration files, most basic commands and server programs, system libraries, some temporary
space and the home directory of the administrative user. A standard installation requires about 250 MB for the
root partition.
Swap space (indicated with swap) is only accessible for the system itself, and is hidden from view during
normal operation. Swap is the system that ensures, like on normal UNIX systems, that you can keep on
working, whatever happens. On Linux, you will virtually never see irritating messages like Out of memory,
please close some applications first and try again, because of this extra memory. The swap or virtual memory
procedure has long since been adopted by operating systems outside the UNIX world as well.
Using memory on a hard disk is naturally slower than using the real memory chips of a computer, but having
this little extra is a great comfort. We will learn more about swap when we discuss processes in Chapter 4.
Linux generally counts on having twice the amount of physical memory in the form of swap space on the hard
disk. When installing a system, you have to know how you are going to do this. An example on a system with
512 MB of RAM:
· 1st possibility: one swap partition of 1 GB
· 2nd possibility: two swap partitions of 512 MB
· 3rd possibility: with two hard disks: 1 partition of 512 MB on each disk.
The last option will give the best results when a lot of I/O is to be expected.
Read the software documentation for specific guidelines. Some applications, such as databases, might require
more swap space. Others, such as some handheld systems, might not have any swap at all for lack of a hard
disk. Swap space may also depend on your kernel version.
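The rule of thumb of twice the physical memory can be worked out in the shell. This is only a sketch: recommended_swap_mb is a hypothetical helper name, and reading /proc/meminfo and /proc/swaps assumes a running Linux system where those files are available:

```shell
#!/bin/sh
# recommended_swap_mb is a hypothetical helper applying the rule of thumb
# from the text: swap space = 2 x physical memory (both in MB).
recommended_swap_mb() {
    echo $(( $1 * 2 ))
}

# The example from the text: 512 MB of RAM.
echo "512 MB RAM -> $(recommended_swap_mb 512) MB swap"

# On a running Linux system, /proc/meminfo reports installed memory in kB.
if [ -r /proc/meminfo ]; then
    ram_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    if [ -n "$ram_mb" ]; then
        echo "This machine: $ram_mb MB RAM -> $(recommended_swap_mb "$ram_mb") MB swap"
    fi
fi

# Active swap areas, if any, are listed in /proc/swaps.
if [ -r /proc/swaps ]; then
    cat /proc/swaps || true
fi
```

For the 512 MB example the helper yields 1024 MB, which is the 1 GB total that each of the three possibilities listed above adds up to.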
The kernel is on a separate partition as well in many distributions, because it is the most important file of your
system. If this is the case, you will find that you also have a /boot partition, holding your kernel(s) and
accompanying data files.
The rest of the hard disk(s) is generally divided into data partitions, although it may be that all of the
non-system critical data resides on one partition, for example when you perform a standard workstation
installation. When non-critical data is separated on different partitions, it usually happens following a set
pattern:
· a partition for user programs (/usr)
· a partition containing the users' personal data (/home)
· a partition to store temporary data like print- and mail-queues (/var)
· a partition for third party and extra software (/opt)
Once the partitions are made, you can only add more. Changing sizes or properties of existing partitions is
possible but not advisable.
The division of hard disks into partitions is determined by the system administrator. On larger systems, he or
she may even spread one partition over several hard disks, using the appropriate software. Most distributions
allow for standard setups optimized for workstations (average users) and for general server purposes, but also
accept customized partitions. During the installation process you can define your own partition layout using
either your distribution-specific tool, which is usually a straightforward graphical interface, or fdisk, a
text-based tool for creating partitions and setting their properties.
A workstation or client installation is for use by mainly one and the same person. The selected software for
installation reflects this and the stress is on common user packages, such as nice desktop themes, development
tools, client programs for E-mail, multimedia software, web and other services. Everything is put together on
one large partition, swap space twice the amount of RAM is added and your generic workstation is complete,
providing the largest amount of disk space possible for personal use, but with the disadvantage of possible
data integrity loss during problem situations.
On a server, system data tends to be separate from user data. Programs that offer services are kept in a
different place than the data handled by this service. Different partitions will be created on such systems:
· a partition with all data necessary to boot the machine
· a partition with configuration data and server programs
· one or more partitions containing the server data such as database tables, user mails, an ftp archive
etc.
· a partition with user programs and applications
· one or more partitions for the user specific files (home directories)
· one or more swap partitions (virtual memory)
Servers usually have more memory and thus more swap space. Certain server processes, such as databases,
may require more swap space than usual; see the specific documentation for detailed information. For better
performance, swap is often divided into different swap partitions.
3.1.2.3. Mount points
All partitions are attached to the system via a mount point. The mount point defines the place of a particular
data set in the file system. Usually, all partitions are connected through the root partition. On this partition,
which is indicated with the slash (/), directories are created. These empty directories will be the starting point
of the partitions that are attached to them. An example: given a partition that holds the following directories:
videos/ cd-images/ pictures/
We want to attach this partition in the filesystem in a directory called /opt/media. In order to do this, the
system administrator has to make sure that the directory /opt/media exists on the system. Preferably, it
should be an empty directory. How this is done is explained later in this chapter. Then, using the mount
command, the administrator can attach the partition to the system. When you look at the content of the
formerly empty directory /opt/media, it will contain the files and directories that are on the mounted
medium (hard disk or partition of a hard disk, CD, DVD, flash card, USB or other storage device).
During system startup, all the partitions are thus mounted, as described in the file /etc/fstab. Some
partitions are not mounted by default, for instance if they are not constantly connected to the system, such as
the storage used by your digital camera. If well configured, the device will be mounted as soon as the system
notices that it is connected, or it can be user-mountable, i.e. you don't need to be system administrator to
attach and detach the device to and from the system. There is an example in Section 9.3.
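Each entry in /etc/fstab names a device and the mount point it is attached to, followed by the file system type and options. The sketch below reads such a file; list_fstab is a hypothetical helper, and the sample entries (including the device name /dev/hdb1 for the media partition) are assumptions made for illustration, not taken from a real system:

```shell
#!/bin/sh
# list_fstab is a hypothetical helper: it prints "mount point <- device"
# for every entry of an fstab-style file, skipping comments and blank lines.
list_fstab() {
    awk 'NF >= 2 && $1 !~ /^#/ { print $2 " <- " $1 }' "$1"
}

# A made-up fstab fragment; /dev/hdb1 is an assumed device name.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device    mount point  type  options   dump  pass
/dev/hda8   /            ext3  defaults  1     1
/dev/hda1   /boot        ext3  defaults  1     2
/dev/hdb1   /opt/media   ext3  defaults  1     2
EOF

list_fstab "$sample"
rm "$sample"

# Attaching the partition by hand would look like this (needs root, so
# it is left commented out):
#   mkdir -p /opt/media
#   mount /dev/hdb1 /opt/media
```

On a real system, running list_fstab /etc/fstab shows which partitions are attached where at startup.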
On a running system, information about the partitions and their mount points can be displayed using the df
command (which stands for disk full or disk free). In Linux, df is the GNU version, and supports the -h or
human readable option which greatly improves readability. Note that commercial UNIX machines commonly
have their own versions of df and many other commands. Their behavior is usually the same, though GNU
versions of common tools often have more and better features.
The df command only displays information about active non-swap partitions. These can include partitions
from other networked systems, like in the example below where the home directories are mounted from a file
server on the network, a situation often encountered in corporate environments.
freddy:~> df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hda8 496M 183M 288M 39% /
/dev/hda1 124M 8.4M 109M 8% /boot
/dev/hda5 19G 15G 2.7G 85% /opt
/dev/hda6 7.0G 5.4G 1.2G 81% /usr
/dev/hda7 3.7G 2.7G 867M 77% /var
fs1:/home 8.9G 3.7G 4.7G 44% /.automount/fs1/root/home
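Output like this is easy to filter, for instance to spot partitions that are filling up. The following sketch assumes mount points without spaces; over_threshold is a hypothetical helper name and the limit of 80 percent is an arbitrary choice:

```shell
#!/bin/sh
# over_threshold is a hypothetical helper: it reads df-style output on
# standard input and prints the mount point and usage of every file system
# whose Use% is at or above the given limit.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# The listing from the text, fed in as sample data.
over_threshold 80 <<'EOF'
Filesystem Size Used Avail Use% Mounted on
/dev/hda8 496M 183M 288M 39% /
/dev/hda1 124M 8.4M 109M 8% /boot
/dev/hda5 19G 15G 2.7G 85% /opt
/dev/hda6 7.0G 5.4G 1.2G 81% /usr
/dev/hda7 3.7G 2.7G 867M 77% /var
fs1:/home 8.9G 3.7G 4.7G 44% /.automount/fs1/root/home
EOF

# On a live system, the -P option keeps each file system on one line.
df -P | over_threshold 80
```

For the sample listing, only /opt (85%) and /usr (81%) are reported.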
3.1.3. More file system layout
3.1.3.1. Visual
For convenience, the Linux file system is usually thought of in a tree structure. On a standard Linux system
you will find the layout generally follows the scheme presented below.
Figure 3-1. Linux file system layout
This is a layout from a RedHat system. Depending on the system admin, the operating system and the mission
of the UNIX machine, the structure may vary, and directories may be left out or added at will. The names are
not even required; they are only a convention.
The tree of the file system starts at the trunk or slash, indicated by a forward slash (/). This directory,
containing all underlying directories and files, is also called the root directory or "the root" of the file system.
Directories that are only one level below the root directory are often preceded by a slash, to indicate their
position and prevent confusion with other directories that could have the same name. When starting with a
new system, it is always a good idea to take a look in the root directory. Let's see what you could run into:
emmy:~> cd /
emmy:/> ls
bin/ dev/ home/ lib/ misc/ opt/ root/ tmp/ var/
boot/ etc/ initrd/ lost+found/ mnt/ proc/ sbin/ usr/
Table 3-2. Subdirectories of the root directory
Directory    Content
/bin         Common programs, shared by the system, the system administrator and the users.
/boot        The startup files and the kernel, vmlinuz. In some recent distributions also grub data. Grub is
             the GRand Unified Boot loader and is an attempt to get rid of the many different boot-loaders we
             know today.
/dev         Contains references to all the CPU peripheral hardware, which are represented as files with
             special properties.
/etc         Most important system configuration files are in /etc, this directory contains data similar to
             those in the Control Panel in Windows
/home        Home directories of the common users.
/initrd      (on some distributions) Information for booting. Do not remove!
/lib         Library files, includes files for all kinds of programs needed by the system and the users.
/lost+found  Every partition has a lost+found in its upper directory. Files that were saved during failures
             are here.
/misc        For miscellaneous purposes.
/mnt         Standard mount point for external file systems, e.g. a CD-ROM or a digital camera.
/net         Standard mount point for entire remote file systems
/opt         Typically contains extra and third party software.
/proc        A virtual file system containing information about system resources. More information about the
             meaning of the files in proc is obtained by entering the command man proc in a terminal
             window. The file proc.txt discusses the virtual file system in detail.
/root        The administrative user's home directory. Mind the difference between /, the root directory and
             /root, the home directory of the root user.
/sbin        Programs for use by the system and the system administrator.
/tmp         Temporary space for use by the system, cleaned upon reboot, so don't use this for saving any
             work!
/usr         Programs, libraries, documentation etc. for all user-related programs.
/var         Storage for all variable files and temporary files created by users, such as log files, the mail
             queue, the print spooler area, space for temporary storage of files downloaded from the Internet,
             or to keep an image of a CD before burning it.
How can you find out which partition a directory is on? Using the df command with a dot (.) as an argument
shows the partition the current directory belongs to, and informs about the amount of space used on this
partition:
sandra:/lib> df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/hda7 980M 163M 767M 18% /
As a general rule, every directory under the root directory is on the root partition, unless it has a separate entry
in the full listing from df (or df -h with no other options).
Read more in man hier.
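The same check can be scripted for any directory. The sketch below assumes that neither the device name nor the mount point contains spaces; mount_point_of is a hypothetical helper name:

```shell
#!/bin/sh
# mount_point_of is a hypothetical helper: it prints the mount point of the
# file system that holds the given directory (last column of df output).
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Which file system do these directories live on?
for dir in / /tmp .; do
    printf '%s is on the file system mounted at %s\n' "$dir" "$(mount_point_of "$dir")"
done
```

A directory that reports / as its mount point is on the root partition; anything else has its own entry in the df listing.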
3.1.3.2. The file system in reality
For most users and for most common system administration tasks, it is enough to accept that files and
directories are ordered in a tree-like structure. The computer, however, doesn't understand a thing about trees
or tree-structures.
Every partition has its own file system. By imagining all those file systems together, we can form an idea of
the tree-structure of the entire system, but it is not as simple as that. In a file system, a file is represented by an
inode, a kind of serial number containing information about the actual data that makes up the file: to whom
this file belongs, and where it is located on the hard disk.
Every partition has its own set of inodes; throughout a system with multiple partitions, files with the same
inode number can exist.
Each inode describes a data structure on the hard disk, storing the properties of a file, including the physical
location of the file data. When a hard disk is initialized to accept data storage, usually during the initial system
installation process or when adding extra disks to an existing system, a fixed number of inodes per partition is
created. This number will be the maximum number of files, of all types (including directories, special files,
links etc.) that can exist at the same time on the partition. We typically count on having 1 inode per 2 to 8
kilobytes of storage.
At the time a new file is created, it gets a free inode. In that inode is the following information:
· Owner and group owner of the file.
· File type (regular, directory, ...)
· Permissions on the file (see Section 3.4.1).
· Date and time of creation, last read and change.
· Date and time this information has been changed in the inode.
· Number of links to this file (see later in this chapter).
· File size
· An address defining the actual location of the file data.
The only information not included in an inode is the file name and directory. These are stored in the special
directory files. By comparing file names and inode numbers, the system can make up a tree-structure that the
user understands. Users can display inode numbers using the -i option to ls. The inodes have their own
separate space on the disk.
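Inode numbers and link counts can be inspected with plain ls. The sketch below creates a file and a second name for it (a hard link, a topic covered later in this chapter) to show that both names share one inode; inode_of and link_count_of are hypothetical helper names, and df -i is assumed to be available as in the GNU version:

```shell
#!/bin/sh
# inode_of and link_count_of are hypothetical helpers built on ls:
# -i prints the inode number in front of the usual listing, and the
# second column of a long listing is the number of links.
inode_of() {
    ls -di -- "$1" | awk '{ print $1 }'
}
link_count_of() {
    ls -ld -- "$1" | awk '{ print $2 }'
}

demo=$(mktemp -d)
echo "some data" > "$demo/original"
ln "$demo/original" "$demo/second-name"    # a second name for the same inode

echo "inode of original:    $(inode_of "$demo/original")"
echo "inode of second-name: $(inode_of "$demo/second-name")"
echo "number of links:      $(link_count_of "$demo/original")"

# Inode usage per file system (total, used, free), where supported.
df -i "$demo" || true

rm -r "$demo"
```

Both names report the same inode number, and the link count is 2, because the file name lives in the directory while everything else lives in the inode.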
3.2. Orientation in the file system
3.2.1. The path
When you want the system to execute a command, you almost never have to give the full path to that
command. For example, we know that the ls command is in the /bin directory (check with which -a ls),
yet we don't have to enter the command /bin/ls for the computer to list the content of the current directory.
The PATH environment variable takes care of this. This variable lists those directories in the system where
executable files can be found, and thus saves the user a lot of typing and memorizing locations of commands.
So the path naturally contains a lot of directories containing bin somewhere in their names, as the user below
demonstrates. The echo command is used to display the content ("$") of the variable PATH:
rogier:> echo $PATH
/opt/local/bin:/usr/X11R6/bin:/usr/bin:/usr/sbin/:/bin
In this example, the directories /opt/local/bin, /usr/X11R6/bin, /usr/bin, /usr/sbin and
/bin are searched, in that order, for the required program. As soon as a match is found, the search is stopped,
even if not every directory in the path has been searched. This can lead to strange situations. In the first
example below, the user knows there is a program called sendsms to send an SMS message, and another user
on the same system can use it, but she can't. The difference is in the configuration of the PATH variable:
[jenny@blob jenny]$ sendsms
bash: sendsms: command not found
[jenny@blob jenny]$ echo $PATH
/bin:/usr/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jenny/bin
[jenny@blob jenny]$ su - tony
Password:
tony:~>which sendsms
sendsms is /usr/local/bin/sendsms
tony:~>echo $PATH
/home/tony/bin.Linux:/home/tony/bin:/usr/local/bin:/usr/local/sbin:\
/usr/X11R6/bin:/usr/bin:/usr/sbin:/bin:/sbin
Note the use of the su (switch user) facility, which allows you to run a shell in the environment of another
user, on the condition that you know the user's password.
A backslash indicates the continuation of a line on the next, without an Enter separating one line from the
other.
In the next example, a user wants to call on the wc (word count) command to check the number of lines in a
file, but nothing happens and he has to break off his action using the Ctrl+C combination:
jumper:~> wc -l test
(Ctrl-C)
jumper:~> which wc
wc is hashed (/home/jumper/bin/wc)
jumper:~> echo $PATH
/home/jumper/bin:/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:\
/usr/bin:/usr/sbin:/bin:/sbin
The use of the which command shows us that this user has a bin-directory in his home directory, containing
a program that is also called wc. Since the program in his home directory is found first when searching the
paths upon a call for wc, this "home-made" program is executed, with input it probably doesn't understand, so
we have to stop it. To resolve this problem there are several ways (there are always several ways to solve a
problem in UNIX/Linux): one answer could be to rename the user's wc program, or the user can give the full
path to the exact command he wants, which can be found by using the -a option to the which command.
If the user uses programs in the other directories more frequently, he can change his path to look in his own
directories last:
jumper:~> export PATH=/usr/local/bin:/usr/local/sbin:/usr/X11R6/bin:\
/usr/bin:/usr/sbin:/bin:/sbin:/home/jumper/bin
Changes are not permanent!
Note that when using the export command in a shell, the changes are temporary and only valid for this
session (until you log out). Opening new sessions, even while the current one is still running, will not
result in a new path in the new session. We will see in Section 7.2 how we can make these kinds of
changes to the environment permanent, adding these lines to the shell configuration files.
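The search through PATH can be imitated in a few lines, which makes the "first match wins" behavior visible. This is a sketch of the idea, not how the shell is actually implemented; find_in_path is a hypothetical helper that lists every match in search order, much like which -a:

```shell
#!/bin/sh
# find_in_path is a hypothetical helper: it walks the directories in PATH
# from left to right and prints every executable file with the given name.
# The shell itself runs the first one. The function body is a subshell,
# so the changes to IFS and the shell options stay local.
find_in_path() (
    IFS=:
    set -f                       # no wildcard expansion on PATH entries
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
)

# All places where an ls program is found; the first line is the one used.
find_in_path ls

# Appending a private directory, rather than prepending it, makes sure the
# system commands are found first when names clash.
PATH=$PATH:$HOME/bin
export PATH
```

In the wc example above, this helper would list /home/jumper/bin/wc before the system's wc, which is exactly why the home-made program was run.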
3.2.2. Absolute and relative paths
A path, which is the way you need to follow in the tree structure to reach a given file, can be described as
starting from the trunk of the tree (the / or root directory). In that case, the path starts with a slash and is called
an absolute path, since there can be no mistake: only one file on the system can comply.
In the other case, the path doesn't start with a slash and confusion is possible between ~/bin/wc (in the
user's home directory) and bin/wc in /usr, from the previous example. Paths that don't start with a slash
are always relative.
In relative paths we also use the . and .. indications for the current and the parent directory. A couple of
practical examples:
· When you want to compile source code, the installation documentation often instructs you to run the
command ./configure, which runs the configure program located in the current directory (that came
with the new code), as opposed to running another configure program elsewhere on the system.
· In HTML files, relative paths are often used to make a set of pages easily movable to another place:
Garden with trees
· Notice the difference one more time:
theo:~> ls /mp3
ls: /mp3: No such file or directory
theo:~>ls mp3/
oriental/ pop/ sixties/
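How a path is interpreted depends only on its first character. The sketch below makes that rule explicit; to_absolute is a hypothetical helper that prefixes the current directory to relative paths and leaves absolute ones alone (it does not simplify . or .. components):

```shell
#!/bin/sh
# to_absolute is a hypothetical helper: a leading slash means "start at the
# root directory"; anything else is resolved against the current directory.
to_absolute() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "${PWD%/}" "$1" ;;
    esac
}

# Run the demonstration in a subshell so the caller's directory is unchanged.
(
    cd /usr
    to_absolute bin/wc     # relative: depends on where we are
    to_absolute /bin/wc    # absolute: the same from anywhere
    cd /
    to_absolute bin/wc     # same relative path, different file
)
```

The three calls print /usr/bin/wc, /bin/wc and /bin/wc: the relative path bin/wc names a different location each time the current directory changes, while the absolute one never does.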
3.2.3. The most important files and directories
3.2.3.1. The kernel
The kernel is the heart of the system. It manages the communication between the underlying hardware and the
peripherals. The kernel also makes sure that processes and daemons (server processes) are started and stopped
at the exact right times. The kernel has a lot of other important tasks, so many that there is a special
kernel-development mailing list on this subject only, where huge amounts of information are shared. It would
lead us too far to discuss the kernel in detail. For now it suffices to know that the kernel is the most important
file on the system.
3.2.3.2. The shell
3.2.3.2.1. What is a shell?
When I was looking for an appropriate explanation on the concept of a shell, it gave me more trouble than I
expected. All kinds of definitions are available, ranging from the simple comparison that "the shell is the
steering wheel of the car", to the vague definition in the Bash manual which says that "bash is an
sh-compatible command language interpreter," or an even more obscure expression, "a shell manages the
interaction between the system and its users". A shell is much more than that.
A shell can best be compared with a way of talking to the computer, a language. Most users do know that
other language, the point-and-click language of the desktop. But in that language the computer is leading the
conversation, while the user has the passive role of picking tasks from the ones presented. It is very difficult
for a programmer to include all options and possible uses of a command in the GUI-format. Thus, GUIs are
almost always less capable than the command or commands that form the backend.
The shell, on the other hand, is an advanced way of communicating with the system, because it allows for
two-way conversation and taking initiative. Both partners in the communication are equal, so new ideas can
be tested. The shell allows the user to handle a system in a very flexible way. An additional asset is that the
shell allows for task automation.
3.2.3.2.2. Shell types
Just like people know different languages and dialects, the computer knows different shell types:
· sh or Bourne Shell: the original shell still used on UNIX systems and in UNIX related environments.
This is the basic shell, a small program with few features. When in POSIX-compatible mode, bash
will emulate this shell.
· bash or Bourne Again SHell: the standard GNU shell, intuitive and flexible. Probably most advisable
for beginning users while being at the same time a powerful tool for the advanced and professional
user. On Linux, bash is the standard shell for common users. This shell is a so-called superset of the
Bourne shell, a set of add-ons and plug-ins. This means that the Bourne Again SHell is compatible
with the Bourne shell: commands that work in sh, also work in bash. However, the reverse is not
always the case. All examples and exercises in this book use bash.
· csh or C Shell: the syntax of this shell resembles that of the C programming language. Sometimes
asked for by programmers.
· tcsh or TENEX C Shell: a superset of the common C Shell, enhancing user-friendliness and speed.
· ksh or the Korn shell: sometimes appreciated by people with a UNIX background. A superset of the
Bourne shell; with standard configuration a nightmare for beginning users.
The file /etc/shells gives an overview of known shells on a Linux system:
mia:~> cat /etc/shells
/bin/bash
/bin/sh
/bin/tcsh
/bin/csh
Fake Bourne shell
Note that /bin/sh is usually a link to Bash, which will execute in Bourne shell compatible mode when
called this way.
Your default shell is set in the /etc/passwd file, like this line for user mia:
mia:L2NOfqdlPrHwE:504:504:Mia Maya:/home/mia:/bin/bash
To switch from one shell to another, just enter the name of the new shell in the active terminal. The system
finds the directory where the name occurs using the PATH settings, and since a shell is an executable file
(program), the current shell activates it and it gets executed. A new prompt is usually shown, because each
shell has its typical appearance:
mia:~> tcsh
[mia@post21 ~]$
3.2.3.2.3. Which shell am I using?
If you don't know which shell you are using, either check the line for your account in /etc/passwd or type
the command
echo $SHELL
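The login shell is the seventh colon-separated field of a line in /etc/passwd, so it can be extracted directly. This sketch uses a hypothetical helper, login_shell_of; note that on systems where accounts come from a network directory, the local /etc/passwd may have no line for your user:

```shell
#!/bin/sh
# login_shell_of is a hypothetical helper: it prints the seventh
# colon-separated field (the login shell) of a user's line in a
# passwd-style file, /etc/passwd by default.
login_shell_of() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "${2:-/etc/passwd}"
}

# The sample line for user mia from the text.
sample=$(mktemp)
echo 'mia:L2NOfqdlPrHwE:504:504:Mia Maya:/home/mia:/bin/bash' > "$sample"
login_shell_of mia "$sample"
rm "$sample"

# For the current account; SHELL normally holds the same value.
if [ -r /etc/passwd ]; then
    login_shell_of "$(id -un 2>/dev/null)"
fi
echo "$SHELL"
```

For the sample line the helper prints /bin/bash.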
3.2.3.3. Your home directory
Your home directory is your default destination when connecting to the system. In most cases it is a
subdirectory of /home, though this may vary. Your home directory may be located on the hard disk of a
remote file server; in that case your home directory may be found in /nethome/your_user_name. In
another case the system administrator may have opted for a less comprehensible layout and your home
directory may be on /disk6/HU/07/jgillard.
Whatever the path to your home directory, you don't have to worry too much about it. The correct path to your
home directory is stored in the HOME environment variable, in case some program needs it. With the echo
command you can display the content of this variable:
orlando:~> echo $HOME
/nethome/orlando
You can do whatever you like in your home directory. You can put as many files in as many directories as you
want, although the total amount of data and files is naturally limited because of the hardware and size of the
partitions, and sometimes because the system administrator has applied a quota system. Limiting disk usage
was common practice when hard disk space was still expensive. Nowadays, limits are almost exclusively
applied in large environments. You can see for yourself if a limit is set using the quota command:
pierre@lamaison:/> quota -v
Diskquotas for user pierre (uid 501): none
In case quotas have been set, you get a list of the limited partitions and their specific limitations. Exceeding
the limits may be tolerated during a grace period with fewer or no restrictions at all. Detailed information can
be found using the info quota or man quota commands.
No Quota?
If your system cannot find the quota command, then no limitation of file system usage is being applied.
Your home directory is indicated by a tilde (~), shorthand for /path_to_home/user_name. This same
path is stored in the HOME variable, so you don't have to do anything to activate it. A simple application:
switch from /var/music/albums/arno/2001 to images in your home directory using one elegant
command:
rom:/var/music/albums/arno/2001> cd ~/images
rom:~/images> pwd
/home/rom/images
Later in this chapter we will talk about the commands for managing files and directories in order to keep your
home directory tidy.
3.2.4. The most important configuration files
As we mentioned before, most configuration files are stored in the /etc directory. Content can be viewed
using the cat command, which sends text files to the standard output (usually your monitor). The syntax is
straightforward:
cat file1 file2 ... fileN
In this section we try to give an overview of the most common configuration files. This is certainly not a
complete list. Adding extra packages may also add extra configuration files in /etc. When reading the
configuration files, you will find that they are usually quite well commented and self-explanatory. Some files
also have man pages which contain extra documentation, such as man group.
Table 3-3. Most common configuration files
File                    Information/service
aliases                 Mail aliases file for use with the Sendmail and
                        Postfix mail servers. Running a mail server on each
                        and every system has long been common practice in
                        the UNIX world, and almost every Linux distribution
                        still comes with a Sendmail package. In this file
                        local user names are matched with real names as
                        they occur in E-mail addresses, or with other local
                        addresses.
apache                  Config files for the Apache web server.
bashrc                  The system-wide configuration file for the Bourne
                        Again SHell. Defines functions and aliases for all
                        users. Other shells may have their own system-wide
                        config files, like cshrc.
crontab and the cron.*  Configuration of tasks that need to be executed
directories             periodically: backups, updates of the system
                        databases, cleaning of the system, rotating logs
                        etc.
default                 Default options for certain commands, such as
                        useradd.
filesystems             Known file systems: ext3, vfat, iso9660 etc.
fstab                   Lists partitions and their mount points.
ftp*                    Configuration of the ftp-server: who can connect,
                        what parts of the system are accessible etc.
group                   Configuration file for user groups. Use the shadow
                        utilities groupadd, groupmod and groupdel to edit
                        this file. Edit manually only if you really know
                        what you are doing.
hosts                   A list of machines that can be contacted using the
                        network, but without the need for a domain name
                        service. This has nothing to do with the system's
                        network configuration, which is done in
                        /etc/sysconfig.
inittab                 Information for booting: mode, number of text
                        consoles etc.
issue                   Information about the distribution (release version
                        and/or kernel info).
ld.so.conf              Locations of library files.
lilo.conf, silo.conf,   Boot information for the LInux LOader, the system
aboot.conf etc.         for booting that is now gradually being replaced
                        with GRUB.
logrotate.*             Rotation of the logs, a system preventing the
                        collection of huge amounts of log files.
mail                    Directory containing instructions for the behavior
                        of the mail server.
modules.conf            Configuration of modules that enable special
                        features (drivers).
motd                    Message Of The Day: shown to everyone who connects
                        to the system (in text mode), may be used by the
                        system admin to announce system
                        services/maintenance etc.
mtab                    Currently mounted file systems. It is advised to
                        never edit this file.
nsswitch.conf           Order in which to contact the name resolvers when a
                        process demands resolving of a host name.
pam.d                   Configuration of authentication modules.
passwd                  Lists local users. Use the shadow utilities
                        useradd, usermod and userdel to edit this file.
                        Edit manually only when you really know what you
                        are doing.
printcap                Outdated but still frequently used printer
                        configuration file. Don't edit this manually unless
                        you really know what you are doing.
profile                 System-wide configuration of the shell environment:
                        variables, default properties of new files,
                        limitation of resources etc.
rc*                     Directories defining active services for each run
                        level.
resolv.conf             Order in which to contact DNS servers (Domain Name
                        Servers only).
sendmail.cf             Main config file for the Sendmail server.
services                Connections accepted by this machine (open ports).
sndconfig or sound      Configuration of the sound card and sound events.
ssh                     Directory containing the config files for secure
                        shell client and server.
sysconfig               Directory containing the system configuration
                        files: mouse, keyboard, network, desktop, system
                        clock, power management etc. (specific to RedHat)
X11                     Settings for the graphical server, X. RedHat uses
                        XFree86, which is reflected in the name of the main
                        configuration file, XF86Config. Also contains the
                        general directions for the window managers
                        available on the system, for example gdm, fvwm,
                        twm, etc.
xinetd.* or inetd.conf  Configuration files for Internet services that are
                        run from the system's (extended) Internet services
                        daemon (servers that don't run an independent
                        daemon).
Throughout this guide we will learn more about these files and study some of them in detail.
3.2.5. The most common devices
Devices, generally every peripheral attachment of a PC that is not the CPU itself, are presented to the system as
entries in the /dev directory. One of the advantages of this UNIX way of handling devices is that neither
the user nor the system has to worry much about the specification of devices.
Users that are new to Linux or UNIX in general are often overwhelmed by the number of new names and
concepts they have to learn. That is why a list of common devices is included in this introduction.
Table 3-4. Common devices
Name                    Device
cdrom                   CD drive
console                 Special entry for the currently used console.
cua*                    Serial ports
dsp*                    Devices for sampling and recording
fd*                     Entries for most kinds of floppy drives; the
                        default is /dev/fd0, a floppy drive for 1.44 MB
                        floppies.
hd[a-t][1-16]           Standard support for IDE drives with maximum
                        amount of partitions each.
ir*                     Infrared devices
isdn*                   Management of ISDN connections
js*                     Joystick(s)
lp*                     Printers
mem                     Memory
midi*                   MIDI player
mixer* and music        Idealized model of a mixer (combines or adds
                        signals)
modem                   Modem
mouse (also msmouse,    All kinds of mice
logimouse, psmouse,
input/mice, psaux)
null                    Bottomless garbage can
par*                    Entries for parallel port support
pty*                    Pseudo terminals
radio*                  For Radio Amateurs (HAMs).
ram*                    boot device
sd*                     SCSI disks with their partitions
sequencer               For audio applications using the synthesizer
                        features of the sound card (MIDI-device
                        controller)
tty*                    Virtual consoles simulating vt100 terminals.
usb*                    USB card and scanner
video*                  For use with a graphics card supporting video.
3.2.6. The most common variable files
In the /var directory we find a set of directories for storing specific non-constant data (as opposed to the ls
program or the system configuration files, which change relatively infrequently or never at all). All files that
change frequently, such as log files, mailboxes, lock files, spoolers etc. are kept in a subdirectory of /var.
As a security measure these files are usually kept in separate parts from the main system files, so we can keep
a close eye on them and set stricter permissions where necessary. A lot of these files also need more
permissions than usual, like /var/tmp, which needs to be writable for everyone. A lot of user activity might
be expected here, which might even be generated by anonymous Internet users connected to your system. This
is one reason why the /var directory, including all its subdirectories, is usually on a separate partition. This
way, there is no risk that a mail bomb, for instance, fills up the rest of the file system, which contains
more important data such as your programs and configuration files.
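One way to check whether /var really lives on its own partition is to ask df which file system holds it. This is a small sketch, assuming the POSIX output format of df (the -P option); the variable names are illustrative.

```shell
# Show the file systems holding / and /var, one line each in POSIX format.
df -P / /var

# Extract the mount point (last column of the second output line) for each directory.
root_mount="$(df -P / | awk 'NR==2 {print $NF}')"
var_mount="$(df -P /var | awk 'NR==2 {print $NF}')"

if [ "$var_mount" = "$root_mount" ]; then
    echo "/var shares a file system with /"
else
    echo "/var is mounted separately on $var_mount"
fi
```

On a system where /var has its own partition, the second branch is taken and the mount point shown is /var itself.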
/var/tmp and /tmp
Files in /tmp can be deleted without notice, by regular system tasks or because of a system reboot. On
some (customized) systems, /var/tmp might also behave unpredictably. Nevertheless, since this is not
the case by default, we advise using the /var/tmp directory for saving temporary files. When in
doubt, check with your system administrator. If you manage your own system, you can be reasonably
sure that this is a safe place if you did not consciously change settings on /var/tmp (only root can
do this; a normal user cannot).
Whatever you do, try to stick to the privileges granted to a normal user - don't go saving files directly
under the root (/) of the file system, don't put them in /usr or some subdirectory or in another reserved
place. This pretty much limits your access to safe file systems.
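A hedged sketch of using /var/tmp as scratch space without touching reserved locations: the mktemp command (part of GNU coreutils) creates a uniquely named directory that only you can access. The prefix scratch and the file name step1.txt are arbitrary choices for this example.

```shell
# On most systems the permission string of /var/tmp ends in 't' (the sticky bit):
# everyone may create files here, but only a file's owner or root may delete it.
ls -ld /var/tmp

# Create a private, uniquely named working directory under /var/tmp.
workdir="$(mktemp -d /var/tmp/scratch.XXXXXX)"
echo "working in $workdir"

# Store temporary data there, read it back, then clean up when done.
echo "intermediate result" > "$workdir/step1.txt"
saved="$(cat "$workdir/step1.txt")"
rm -r "$workdir"
```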
One of the main security systems on a UNIX system, which is naturally implemented on every Linux machine
as well, is the log-keeping facility, which logs all user actions, processes, system events etc. The configuration
file of the so-called syslog daemon determines which information is logged and how long it will be kept. The
default location of all logs is /var/log, containing different files for access log, server logs, system
messages etc.
In /var we typically find server data, which is kept here to separate it from critical data such as the server
program itself and its configuration files. A typical example on Linux systems is /var/www, which contains
the actual HTML pages, scripts and images that a web server offers. The FTP-tree of an FTP server (data that
can be downloaded by a remote client) is also best kept in one of /var's subdirectories. Because this data is
publicly accessible and often changeable by anonymous users, it is safer to keep it here, away from partitions
or directories with sensitive data.
On most workstation installations, /var/spool will at least contain an at and a cron directory,
containing scheduled tasks. In office environments this directory usually contains lpd as well, which holds
the print queue(s) and further printer configuration files, as well as the printer log files.
On server systems we will generally find /var/spool/mail, containing incoming mails for local users,
sorted in one file per user, the user's "inbox". A related directory is mqueue, the spooler area for unsent mail
messages. These parts of the system can be very busy on mail servers with a lot of users. News servers also
use the /var/spool area because of the enormous amounts of messages they have to process.
The /var/lib/rpm directory is specific to RPM-based (RedHat Package Manager) distributions; it is
where RPM package information is stored. Other package managers generally also store their data somewhere
in /var.
3.3. Manipulating files
3.3.1. Viewing file properties
3.3.1.1. More about ls
Besides the name of the file, ls can give a lot of other information, such as the file type, as we already
discussed. It can also show permissions on a file, file size, inode number, date and time of the last
modification, owners and number of links to the file. With the -a option to ls, files that are normally hidden from view can be displayed
as well. These are files that have a name starting with a dot. A couple of typical examples include the
configuration files in your home directory. When you've worked with a certain system for a while, you will
notice that tens of files and directories have been created that are not automatically listed in a directory index.
Next to that, every directory contains a file named just dot (.) and one with two dots (..), which are used in
combination with their inode number to determine the directory's position in the file system's tree structure.
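To see that .. is simply another name for the parent directory, you can compare inode numbers with the -i option of ls. The sketch below works in a throwaway directory created with mktemp; the name sub and the variable names are arbitrary.

```shell
# Create a scratch directory containing one subdirectory.
demo="$(mktemp -d)"
mkdir "$demo/sub"

# -i prints the inode number, -d lists the directory itself rather than its contents.
ls -id "$demo" "$demo/sub/.."

# Keep only the inode numbers so they can be compared.
parent_inode="$(ls -id "$demo" | awk '{print $1}')"
dotdot_inode="$(ls -id "$demo/sub/.." | awk '{print $1}')"

if [ "$parent_inode" = "$dotdot_inode" ]; then
    echo "sub/.. and its parent share inode $parent_inode"
fi

rm -r "$demo"
```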
You should really read the Info pages about ls, since it is a very common command with a lot of useful
options. Options can be combined, as is the case with most UNIX commands and their options. A common
combination is ls -al; it shows a long list of files and their properties as well as the destinations that any
symbolic links point to. ls -latr displays the same files, only now in reversed order of the last change, so
that the file changed most recently occurs at the bottom of the list. Here are a couple of examples:
krissie:~/mp3> ls
Albums/ Radio/ Singles/ gene/ index.html
krissie:~/mp3> ls -a
./ .thumbs Radio gene/
../ Albums/ Singles/ index.html
krissie:~/mp3> ls -l Radio/
total 8
drwxr-xr-x 2 krissie krissie 4096 Oct 30 1999 Carolina/
drwxr-xr-x 2 krissie krissie 4096 Sep 24 1999 Slashdot/
krissie:~/mp3> ls -ld Radio/
drwxr-xr-x 4 krissie krissie 4096 Oct 30 1999 Radio/
krissie:~/mp3> ls -ltr
total 20
drwxr-xr-x 4 krissie krissie 4096 Oct 30 1999 Radio/
-rw-r--r-- 1 krissie krissie 453 Jan 7 2001 index.html
drwxrwxr-x 30 krissie krissie 4096 Oct 20 17:32 Singles/
drwxr-xr-x 2 krissie krissie 4096 Dec 4 23:22 gene/
drwxrwxr-x 13 krissie krissie 4096 Dec 21 11:40 Albums/
On most Linux versions ls is aliased to color-ls by default. This feature allows you to see the file type without
using any options to ls. To achieve this, every file type has its own color. The standard scheme is in
/etc/DIR_COLORS:
Table 3-5. Color-ls default color scheme
Color File type
blue directories
red compressed archives
white text files
pink images
cyan links
yellow devices
green executables
flashing red broken links
More information is in the man page. The same information was in earlier days displayed using suffixes to
every non-standard file name. For mono-color use (like printing a directory listing) and for general
readability, this scheme is still in use:
Table 3-6. Default suffix scheme for ls
Character File type
nothing regular file
/ directory
* executable file
@ link
= socket
| named pipe
A description of the full functionality and features of the ls command can be read with info coreutils ls.
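The suffix scheme can be tried out with the -F option of ls. The following is a minimal sketch in a scratch directory created with mktemp; all file names in it are made up for the demonstration.

```shell
demo="$(mktemp -d)"

mkdir "$demo/docs"                              # directory
echo "plain text" > "$demo/notes.txt"           # regular file
mkfifo "$demo/pipe"                             # named pipe
printf '#!/bin/sh\necho hi\n' > "$demo/run.sh"
chmod +x "$demo/run.sh"                         # executable file
ln -s notes.txt "$demo/shortcut"                # symbolic link

# -F appends the type suffix to each name.
listing="$(ls -F "$demo")"
echo "$listing"

rm -r "$demo"
```

The listing should show docs/, notes.txt, pipe|, run.sh* and shortcut@, matching the characters in the table above.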
3.3.1.2. More tools
To find out more about the kind of data we are dealing with, we use the file command. By applying certain
tests that check properties of a file in the file system, magic numbers and language tests, file tries to make an
educated guess about the format of a file. Some examples:
mike:~> file Documents/
Documents/: directory
mike:~> file high-tech-stats.pdf
high-tech-stats.pdf: PDF document, version 1.2
mike:~> file Nari-288.rm
Nari-288.rm: RealMedia file
mike:~> file bijlage10.sdw
bijlage10.sdw: Microsoft Office Document
mike:~> file logo.xcf
logo.xcf: GIMP XCF image data, version 0, 150 x 38, RGB Color
mike:~> file cv.txt
cv.txt: ISO-8859 text
mike:~> file image.png
image.png: PNG image data, 616 x 862, 8-bit grayscale, non-interlaced
mike:~> file figure
figure: ASCII text
mike:~> file me+tux.jpg
me+tux.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI),
"28 Jun 1999", 144 x 144
mike:~> file 42.zip.gz
42.zip.gz: gzip compressed data, deflated, original filename,
`42.zip', last modified: Thu Nov 1 23:45:39 2001, os: Unix
mike:~> file vi.gif
vi.gif: GIF image data, version 89a, 88 x 31
mike:~> file slide1
slide1: HTML document text
mike:~> file template.xls
template.xls: Microsoft Office Document
mike:~> file abook.ps
abook.ps: PostScript document text conforming at level 2.0
mike:~> file /dev/log
/dev/log: socket
mike:~> file /dev/hda
/dev/hda: block special (3/0)
The file command has a series of options, among others the -z option to look into compressed files. See info
file for a detailed description. Keep in mind that the results of file are not absolute; they are only a guess. In
other words, file can be tricked.
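A small experiment illustrates both points: file judges by content rather than by name, and a few well-chosen bytes are enough to mislead it. This is only a sketch; the file names are invented, and the exact wording of the output depends on the version of file installed.

```shell
demo="$(mktemp -d)"

# A text file with a misleading extension: file looks at the content, not the name.
echo "just some words" > "$demo/picture.jpg"
by_content="$(file "$demo/picture.jpg")"
echo "$by_content"

# A text file that merely starts with the PDF signature: file is typically
# fooled by these first bytes and reports a PDF document.
printf '%%PDF-1.4\nthis is not really a PDF\n' > "$demo/notes.txt"
fooled="$(file "$demo/notes.txt")"
echo "$fooled"

rm -r "$demo"
```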
Why all the fuss about file types and formats?
Shortly, we will discuss a couple of command-line tools for looking at plain text files. These tools will
not work when used on the wrong type of files. In the worst case, they will crash your terminal and/or
make a lot of beeping noises. If this happens to you, just close the terminal session and start a new one.
But try to avoid it, because it is usually very disturbing for other people.
3.3.2. Creating and deleting files and directories
3.3.2.1. Making a mess...
... Is not a difficult thing to do. Today almost every system is networked, so naturally files get copied from
one machine to another. And especially when working in a graphical environment, creating new files is a
piece of cake and is often done without the approval of the user. To illustrate the problem, here's the full
content of a new user's directory, created on a standard RedHat system:
[newuser@blob user]$ ls -al
total 32
drwx------ 3 user user 4096 Jan 16 13:32 .
drwxr-xr-x 6 root root 4096 Jan 16 13:32 ..
-rw-r--r-- 1 user user 24 Jan 16 13:32 .bash_logout
-rw-r--r-- 1 user user 191 Jan 16 13:32 .bash_profile
-rw-r--r-- 1 user user 124 Jan 16 13:32 .bashrc
drwxr-xr-x 3 user user 4096 Jan 16 13:32 .kde
-rw-r--r-- 1 user user 3511 Jan 16 13:32 .screenrc
-rw------- 1 user user 61 Jan 16 13:32 .xauthDqztLr
At first sight, the content of a "used" home directory doesn't look that bad either:
olduser:~> ls
app-defaults/ crossover/ Fvwm@ mp3/ OpenOffice.org638/
articles/ Desktop/ GNUstep/ Nautilus/ staroffice6.0/
bin/ Desktop1/ images/ nqc/ training/
brol/ desktoptest/ Machines@ ns_imap/ webstart/
C/ Documents/ mail/ nsmail/ xml/
closed/ Emacs@ Mail/ office52/ Xrootenv.0
But when all the directories and files starting with a dot are included, there are 185 items in this directory.
This is because most applications have their own directories and/or files, containing user-specific settings, in
the home directory of that user. Usually these files are created the first time you start an application. In some
cases you will be notified when a non-existent directory needs to be created, but most of the time everything is
done automatically.
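To get a feeling for how many hidden items a directory holds, you can compare a plain listing with one that includes dot files. The sketch below uses a scratch directory with a few made-up entries instead of a real home directory; the -A option of ls shows hidden files but leaves out the . and .. entries.

```shell
demo="$(mktemp -d)"

# One visible file and three hidden entries, mimicking a small home directory.
touch "$demo/report.txt" "$demo/.bashrc" "$demo/.screenrc"
mkdir "$demo/.kde"

# Count the entries with and without hidden files.
visible=$(( $(ls "$demo" | wc -l) ))
all=$(( $(ls -A "$demo" | wc -l) ))
echo "visible: $visible, including hidden: $all"
# prints: visible: 1, including hidden: 4

rm -r "$demo"
```

Replacing "$demo" with ~ gives the same comparison for your own home directory.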
Furthermore, new files are created seemingly continuously because users want to save files, keep different
versions of their work, use Internet applications, and download files and attachments to their local machine. It
doesn't stop. It is clear that one definitely needs a scheme to keep an overview of things.
In the next section, we will discuss our means of keeping order. We only discuss text tools available to the
shell, since the graphical tools are very intuitive and have the same look and feel as the well known
point-and-click MS Windows-style file managers, including graphical help functions and other features you
expect from this kind of application. The following list is an overview of the most popular file managers for
GNU/Linux. Most file managers can be started from the menu of your desktop manager, or by clicking your
home directory icon, or from the command line, issuing these commands:
· nautilus: The default file manager in Gnome, the GNU desktop. Excellent documentation about
  working with this tool can be found at http://www.gnome.org.
· konqueror: The file manager typically used on a KDE desktop. The handbook is at
  http://docs.kde.org.
· mc: Midnight Commander, the Unix file manager after the fashion of Norton Commander. All
  documentation available from http://gnu.org/directory/ or a mirror, such as http://www.ibiblio.org.
These applications are certainly worth giving a try and usually impress newcomers to Linux, if only because
there is such a wide variety: these are only the most popular tools for managing directories and files, and
many other projects are being developed. Now let's find out about the internals and see how these graphical
tools use common UNIX commands.
3.3.2.2. The tools
3.3.2.2.1. Creating directories
A way of keeping things in place is to give certain files specific default locations by creating directories and
subdirectories (or folders and sub-folders if you wish). This is done with the mkdir command:
richard:~> mkdir archive
richard:~> ls -ld archive
drwxrwxrwx 2 richard richard 4096 Jan 13 14:09 archive/
Creating directories and subdirectories in one step is done using the -p option:
richard:~> cd archive
richard:~/archive> mkdir 1999 2000 2001
richard:~/archive> ls
1999/ 2000/ 2001/
richard:~/archive> mkdir 2001/reports/Restaurants-Michelin/
mkdir: cannot create directory `2001/reports/Restaurants-Michelin/':
No such file or directory
richard:~/archive> mkdir
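As a sketch of the -p option mentioned above, the following creates a nested path in one step inside a throwaway directory (made with mktemp, so no real archive directory is touched). The directory names are borrowed from the example; the variable names are illustrative.

```shell
archive="$(mktemp -d)"

# Without -p this would fail, because 2001 and 2001/reports do not exist yet.
# With -p, mkdir creates every missing parent directory along the way.
mkdir -p "$archive/2001/reports/Restaurants-Michelin"

# List the resulting directory tree, relative to the top of the scratch directory.
tree_listing="$(cd "$archive" && find . -type d | sort)"
echo "$tree_listing"

rm -r "$archive"
```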
