Friday, May 27, 2011

Chapter 2. Working with Files and Directories

Chapter Syllabus

2.1 Basic Operations on Files

2.2 File Naming Rules

2.3 Working with Directories

2.4 Copying and Moving Files

2.5 Wildcards

2.6 File Types

2.7 Searching File Contents

2.8 Finding Files

2.9 Miscellaneous File Handling Commands

As a user of a UNIX system, you deal with files routinely, and a considerable proportion of your time is spent working with them. To make the best use of that time, you need to keep file handling quick, which comes from efficient use of the commands for file and directory operations. HP-UX provides a simple and powerful command mechanism. File handling commands can also be grouped, and the output of one command can be passed to another. Operations can be performed on a single file or on a group of files at the same time. When preparing for the HP-UX certification examination, knowing these basic file operations is important. Some questions on the exam relate directly to file handling, and many others implicitly rely on the commands presented here.

Most of this chapter is devoted to the use and explanation of file handling commands. You will learn very basic operations on files. These operations include creating, listing, deleting, and displaying contents of a file. Then you will learn some rules that apply to file names, and you will see which file names are legal in HP-UX. After that, basic operations on directories will be presented, where you will find commands for creating, deleting, and changing directories. Copying and moving files are routine tasks, and commands for these are presented next. For simultaneous operations on multiple files, wildcards are used, and you will see the uses of each of them. Searching text files for a particular string and finding files with a particular name on a system will be the next topic. At the end of the chapter, you will learn more commands for file manipulation.

2.1 Basic Operations on Files

The most common operations on files are creating new files, deleting unnecessary files, listing file names, and displaying the contents of a file. After login, you can perform all of these operations in your home directory. Let us see how we do it.

Creating a File

A file is a named area on the disk(s) where you can store information. The cat command is a basic command used for creating new files containing text. For example, if you want to create a file with the name newfile, containing three lines, the process is as follows:

$ cat > newfile

This is first line. <ENTER>

This is the second line. <ENTER>

This is third and last line. <ENTER>

$

Note that you press the <ENTER> key at the end of each line. When you have finished entering the text, you press <CTRL-d> (pressing the Control and d keys simultaneously) to end the text entry process and save the file.

Please note that use of the cat command for creating a new file is not very common but it is the simplest way to do so. Most of the time you will be using the vi editor to create or modify files. The vi editor is discussed in more detail in Chapter 5.
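
When a file has to be created from a script rather than from the keyboard, a here-document can stand in for the typed lines and the <CTRL-d> keystroke. The following is a minimal sketch, not part of the original example; the scratch directory name ch2demo.$$ and the variable names are assumptions chosen for illustration.

```shell
# Scratch directory so the example does not touch real files
# (the name ch2demo.$$ is an arbitrary choice for this sketch).
demo="${TMPDIR:-/tmp}/ch2demo.$$"
mkdir -p "$demo"

# The same three lines as above, supplied by a here-document
# instead of being typed at the keyboard.
cat > "$demo/newfile" <<'EOF'
This is first line.
This is the second line.
This is third and last line.
EOF

# Count the lines to confirm that all three were written.
lines=$(wc -l < "$demo/newfile")
echo "newfile has $lines lines"
```

The EOF marker plays the role of <CTRL-d>: cat stops reading when it reaches that line.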

Listing Files

Now that you have created a file, you can verify it by listing it using the ls command.

$ ls newfile

newfile

$

The ls command shows that our newly created file, newfile, does exist and that the file creation process was successful. What if you want to see a list of all other files? This is very simple; you use the ls command without any argument.

$ ls

FORMAT FORMAT.ZIP myf newfile rafeeq.zip

$

Now the ls command shows that there are five files with the names shown above. Please note that UNIX is case sensitive, meaning it differentiates between lowercase and uppercase letters. So the file name myfile is different from MyFile.

HP-UX has another popular command to list files. This is the ll (long listing) command.

$ ll

total 350

-rw-r----- 1 boota users 104230 Aug 27 19:04 FORMAT

-rw-rw-rw- 1 boota users 0 Aug 30 20:47 myf

-rw-rw-rw- 1 boota users 72 Aug 30 20:47 newfile

$

This command shows that there are three files, with the names displayed in the last column of the output. If you are wondering what the -rw-rw-rw- characters displayed in the first column are, just leave these for the time being. These are the file permissions showing who is allowed to read, write, and execute a particular file. We will be discussing file permissions in more detail in Chapter 7. If you remember from the first chapter that some commands are linked to other commands, ll is another example. This command is linked to the ls -l command. The ls command has many options, and you can have a look at these using its manual pages.

Now we try to figure out the other columns in the file listing. The second column shows how many links are associated with this file. A 1 (numeric one) means there is no other link to this file. The third column shows the owner of the file, here the user boota. The fourth column shows the group name, users, which is the group of the user boota who owns this file. The next column shows the file size in bytes. Then we have the date and time of last modification to the file, and in the last column the file name is displayed.
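
The column positions described above can be checked by picking individual fields out of a long listing. This is only a sketch: it uses ls -l rather than ll so that it also runs on systems without the HP-UX ll command, and the file name lsdemo.$$ is an assumption made for the example.

```shell
# Create a small file of known size: "hello" plus a newline is 6 bytes.
f="${TMPDIR:-/tmp}/lsdemo.$$"
printf 'hello\n' > "$f"

# Fields of "ls -l": 1 permissions, 2 links, 3 owner, 4 group, 5 size in bytes.
links=$(ls -l "$f" | awk '{print $2}')
owner=$(ls -l "$f" | awk '{print $3}')
size=$(ls -l "$f" | awk '{print $5}')

echo "links=$links owner=$owner size=$size"
rm -f "$f"
```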

Deleting Files

To keep the system clean, you need to delete unwanted files from time to time. The files are deleted with the rm command.

$ rm newfile

$

Warning

The rm command has no output in the normal case. You need to be careful when deleting files, as the deleted files cannot be undeleted. An error message is displayed only if the file you are deleting does not exist.
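
Because rm is silent on success, a script usually relies on its exit status rather than on any message. The sketch below illustrates this under the assumption of a throwaway file named rmdemo.$$; the variable names are also illustrative.

```shell
f="${TMPDIR:-/tmp}/rmdemo.$$"
: > "$f"                      # create an empty file to delete

# The first rm succeeds and prints nothing of its own.
rm "$f" && echo "first rm: removed"

# The file is gone now, so a second rm fails. The error text is
# discarded here and only the exit status is inspected.
if rm "$f" 2>/dev/null; then
    status=ok
else
    status=failed
fi
echo "second rm: $status"
```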

Displaying Contents of a Text File

We already have used the cat command for creating new files. The same command is used to display contents of text files.

$ cat newfile

This is first line.

This is the second line.

This is third and last line.

$

We have just omitted the ">" symbol from the command line. The cat command displays the entire contents of a file in one step, no matter how long the file is. As a result, the user is able to see only the last page of text displayed. There is another useful command with the name more that displays one page of text at a time. After displaying the first page, it stops until the user hits the spacebar. The more command then displays the next page of text and so on. Figure 2-1 shows a screen shot of the more command while displaying the .profile file.

Figure 2-1. Use of the more command.

2.2 File Naming Rules

When you create a new file, there are some rules governing the naming of the file. These rules are related to the length of the file name and the characters allowed in naming a file.

General Guidelines for File Names

Generally a file name in UNIX can be as long as 255 characters. The rules that apply to the file names are as follows.

1. A file name can be a combination of letters, numbers, and special characters.

2. All letters, both upper- (A–Z) and lowercase (a–z) can be used.

3. Numbers from 0 to 9 can be used.

4. Special characters like plus (+), minus (-), underscore (_), or dot (.) can be used.

5. As mentioned earlier, UNIX is case sensitive, and uppercase and lowercase letters are treated separately. So file names myfile, Myfile, MyFile, and myfilE are different names.

6. There are no special names for executable files in UNIX; the file permissions show which file is executable and which is not.

Hidden Files

Any file that starts with a dot (.) is not displayed when using the ll or ls command. These are hidden or invisible files. Usually these files are used to store configuration information. If you remember the user startup file with the name .profile, it is a hidden file. To display the hidden files, use the ls -a command.

$ ls -a

.profile newfile testfile.zip

$

Hidden files have some protection against the rm command when it is used with the * wildcard to delete all files in a directory. Because * does not match names that start with a dot, rm * does not delete hidden files.
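
A small experiment in a scratch directory shows the effect. This is a sketch only; the directory name hiddendemo.$$ is an assumption, and the file names are borrowed from the listing above.

```shell
d="${TMPDIR:-/tmp}/hiddendemo.$$"
mkdir -p "$d"
: > "$d/.profile"
: > "$d/newfile"
: > "$d/testfile.zip"

# Inside the directory, * expands to newfile and testfile.zip only,
# because the shell does not match names that begin with a dot.
( cd "$d" && rm ./* )

# ls -a still lists .profile (along with the . and .. entries).
left=$(ls -a "$d")
echo "$left"
```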

2.3 Working with Directories

Basic operations on directories are creating new directories, deleting directories, and moving from one directory to another in a directory hierarchy. Commands used for these operations are presented in this section. As far as names of directories are concerned, rules that apply to ordinary files also apply here.

Creating Directories

A directory can be considered a special type of file used as a folder to contain other files and directories. Directories are used to organize files in a more logical and manageable way. A directory can be created with the mkdir command.

$ mkdir newdir

$

After creating a directory, verify its existence with the ls or ll command. Note that when we use the ll command for the long listing, the first character in the file permissions is "d" instead of "-", showing that it is a directory, not an ordinary file.

$ ll

total 3

-rw-rw-rw- 1 boota users 0 Aug 30 20:47 myf

drwxrwxrwx 1 boota users 96 Aug 27 19:04 newdir

-rw-rw-rw- 1 boota users 72 Aug 30 20:47 newfile

$

Using the ls command without the -l option shows all names of files and directories, and you are not able to distinguish between them. If you don't want to display the long listing and still need to distinguish between files and directories, you can use the lsf or ls -F command. These are equivalent commands, and the screen output just appends a "/" symbol at the end of the directory name.

$ lsf

mydir/ newfile testfile.zip

$

Here you can see that mydir is a directory, whereas the other two are ordinary files.

Deleting Directories

Directories are deleted with the rmdir command. This command deletes only empty directories. If the directory contains another file or directory, that file or directory must be deleted first. If you need to delete a directory that is not empty, you can use the rm -rf command, which deletes a nonempty directory together with everything inside it.

Warning

Be careful in using rm -rf, as it removes the entire directory tree without any warning to the user.
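
The difference between the two commands can be seen in a scratch directory. The sketch below is illustrative only; the path rmdirdemo.$$ and the variable names are assumptions, and everything happens under a temporary directory so that rm -rf cannot touch real data.

```shell
top="${TMPDIR:-/tmp}/rmdirdemo.$$"
mkdir -p "$top/topdir"
: > "$top/topdir/file1"

# rmdir refuses because topdir still contains file1.
if rmdir "$top/topdir" 2>/dev/null; then
    first=removed
else
    first=refused
fi

# Empty the directory first; after that rmdir succeeds.
rm "$top/topdir/file1"
rmdir "$top/topdir" && second=removed

echo "first attempt: $first, second attempt: $second"

# rm -rf removes the whole scratch tree without asking.
rm -rf "$top"
```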

Understanding Directory Structure

The UNIX file system is composed of directories and files. The top-level directory is called the root directory and is represented by the "/" symbol. All other directories and files may be considered to be inside the root directory. A directory one level above is called a parent directory, while a directory one level below is called a child directory. For example, the root directory "/" is the parent directory of the home directory, and boota is a child directory of the home directory (see the sample directory tree in Figure 2-2).

Figure 2-2. A sample directory tree.

Parent and child directories are just relative to each other. For example, the home directory is a child directory of the root directory, but it is a parent directory for the boota directory.

The directory names are referenced relative to the root directory. A complete reference name to a directory is called a path name. For example, the path name of the home directory is /home. Similarly, the path name of directory boota is /home/boota. It is easy to judge from the path name that boota is a child directory of home, which in turn is a child directory of the root directory. Files also have path names similar to directories. For example, a complete path name for a file created in directory /home/boota with name myfile is /home/boota/myfile. A path name that starts with the "/" symbol is called the absolute path name. We can also use relative path names, which start from the current directory. For example, to refer to a file with the name alpha in the parent directory of the current directory, we may use a path name ../alpha.

Whenever a new directory is created, two entries are created in the new directory automatically. These are "." and ".." where "." is a reference to the current directory and ".." is a reference to the parent directory of the current directory.

Moving Around in a Directory Tree

You used the pwd command in Chapter 1. This command was used to check the current directory. The cd (change directory) command is used to move to some other directory in the directory tree. This command, like other UNIX commands, can be used both with absolute and relative path names. You already know that a user automatically goes to the home directory just after the login process. We again consider the example of user boota who has just logged in and is in home directory, /home/boota. To confirm that she is indeed in her home directory and then move to the /etc directory, the user issues the following commands.

$ pwd

/home/boota

$ cd /etc

$

$ pwd

/etc

$

The last pwd command showed that the user has moved to the destination directory /etc. In this example, we used an absolute path. In an example of using a relative path, consider the user boota is in her home directory /home/boota and wants to move to the /home (the parent) directory. She can use the cd .. or cd /home command, and either will have the same effect. In cd .., she asked the shell to move to the parent directory of the current directory. What if you use cd ../..?
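
One way to explore relative paths safely is to build a small copy of the tree under a scratch directory and move around inside it. The sketch below assumes a throwaway directory named cddemo.$$ that stands in for the root directory; it is not the real /home tree.

```shell
top="${TMPDIR:-/tmp}/cddemo.$$"
mkdir -p "$top/home/boota"

# Each command substitution runs in a subshell, so the cd commands
# here do not change the directory of your own shell.
one_up=$(cd "$top/home/boota" && cd .. && basename "$PWD")
two_up=$(cd "$top/home/boota" && cd ../.. && basename "$PWD")

echo "cd ..    ends in: $one_up"
echo "cd ../.. ends in: $two_up"
rm -rf "$top"
```

Starting from boota, cd .. ends in home, and cd ../.. climbs one level further to the directory that plays the part of the root in this sketch.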

Study Break

Basic Operations on Files and Directories

Having learned how to create and delete files and directories, create a new directory with the name topdir. Use the cd command to go into this directory and create a file with the name file1. Go back to the parent directory with the command cd .. and try to use the rmdir command to remove the topdir directory. You will see an error message showing the directory is not empty. Use the cd command to move to this directory and delete the file using the rm command. Now again go to the parent directory and delete topdir with the rmdir command.

Once again create the same directory and the same file inside it. Now use the rm -rf command to delete the nonempty directory.

Use the cd /var/adm command to move to this directory. Now again use the cd command, using both absolute and relative paths to go to the /etc directory. For the absolute path, you need to use the cd /etc command, while for the relative path the command will be cd ../../etc.

2.4 Copying and Moving Files

Many times you will be copying or moving files from one place to another. These two operations are similar except that the old file is deleted in the move operation.

Copying Files

The files are copied with the cp command. The source and destination file names are supplied to the cp command as arguments. The first argument is the source file name, and second argument is the destination file name.

$ cp myfile anotherfile

$

This command copies myfile from the current directory to anotherfile in the current directory. It is possible to copy files from any directory to any other directory using the path names of the files. For example, if you want to copy profile from the /etc directory to the current directory with the name myprofile, the command line will be as follows.

$ cp /etc/profile myprofile

$

As another example, if you want to copy the file in the above example with the same name in the current directory, you just use "." in place of the destination name. Note that the "." character is a relative path that refers to the current directory. For example, to copy /etc/profile with the name profile in the current directory, the following command can be used.

$ cp /etc/profile .

$

Two or more files can be copied simultaneously using the cp command. In this case, the destination must be a directory name. The following command copies two files, file1 and file2, from the current directory to the /tmp directory.

$ cp file1 file2 /tmp

$

Moving and Renaming Files

The mv command is used for renaming files and moving files from one place to another in the directory structure. Like the cp command, it takes source and destination file names as arguments. If both source and destination names are specified without any path (absolute or relative), the file is renamed. On the other hand, if either or both of the file names contain a path name, the file is moved from the source location to the destination location.

RENAME A FILE

$ mv myfile newfile

$

Make sure that the operation was successful by using the ll command.

MOVE A FILE

$ mv myfile /tmp/myfile

$

Two or more files can be moved simultaneously using the mv command. The destination must be a directory name. The following command moves two files, file1 and file2, to directory /tmp.

$ mv file1 file2 /tmp

$

Note

You must be careful with the mv command: if the destination file name matches an existing file, mv overwrites that file without any warning. To make sure that existing files are not overwritten, use the mv command as mv -i. In this case, if the destination file already exists, the mv command will ask you to confirm the move or rename operation.
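
In a script there is nobody to answer the mv -i prompt, so one common alternative is to test for the destination before moving. The following is a sketch of that idea rather than a feature of mv itself; the directory mvdemo.$$ and the variable names are assumptions.

```shell
d="${TMPDIR:-/tmp}/mvdemo.$$"
mkdir -p "$d"
echo "old contents" > "$d/newfile"
echo "new contents" > "$d/myfile"

# Refuse to rename when the destination name is already taken.
if [ -e "$d/newfile" ]; then
    result=skipped
else
    mv "$d/myfile" "$d/newfile"
    result=renamed
fi
echo "mv was $result"

# The existing file is untouched.
kept=$(cat "$d/newfile")
echo "$kept"
```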

2.5 Wildcards

When you want to use many file names in one command, such as the one where grep is used to search a pattern in many files, it is very inconvenient to type all these names at the command line. Wildcard characters are used as a shortcut to refer to many files. Two wildcards are used in UNIX, the asterisk character (*) and the question mark (?). The * matches zero or more characters, whereas ? matches only one character. There is a third type of character matching mechanism that checks a range of characters. This is the [] pattern, and a range is specified inside the square brackets. Sometimes this is called the third wildcard.

Use of *

Suppose you use the ls command to list files and the following list appears.

$ ls

myfile myfile00 myfile01 myfile010 myf xyz

$

Now we can use the * character to list files we want to be displayed. If we want to list all files that start with myfile, the command is:

$ ls myfile*

myfile myfile00 myfile01 myfile010

$

To list all files that start with my, we use:

$ ls my*

myfile myfile00 myfile01 myfile010 myf

$

Use of ?

The ? matches only a single character. For example, if you want to list all files that start with myfile0 and the last character may be anything, the result is:

$ ls myfile0?

myfile00 myfile01

$

Now try to figure out why myfile010 did not appear in the list.

The wildcard characters can be used wherever you need to specify more than one file. For example, if you want to copy all files from the current directory to the /tmp directory, the command will be:

$ cp * /tmp

$

Similarly, if you want to search for the word root in all files of the /etc directory, you can use this command.

$ grep root /etc/*

The wildcard characters are very useful, and if you master these, you can save a lot of time in your daily computer use.

Use of [ ] Wildcard

This wildcard matches any one character from the set given inside the square brackets. For example, [a-m] means any one character between "a" and "m". Similarly, [acx] means the character "a", "c", or "x". The characters are listed without separators; a comma placed inside the brackets would itself be treated as one of the characters to match.

$ ls /etc/[wx]*

/etc/wall /etc/whodo /etc/wtmp /etc/xtab

$

The above command lists all files in the /etc directory that start with a "w" or "x" character.
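
Wildcards are expanded by the shell before the command runs, so echo is enough to see what a pattern matches. The sketch below recreates the file names used in this section inside a scratch directory; the directory name wilddemo.$$ and the variable names are assumptions.

```shell
w="${TMPDIR:-/tmp}/wilddemo.$$"
mkdir -p "$w"
( cd "$w" && touch myfile myfile00 myfile01 myfile010 myf xyz )

# echo simply prints whatever names the shell substituted for the pattern.
star=$(cd "$w" && echo myfile*)     # zero or more characters after myfile
qmark=$(cd "$w" && echo myfile0?)   # exactly one character after myfile0
range=$(cd "$w" && echo [x-z]*)     # first character is x, y, or z

echo "myfile*  -> $star"
echo "myfile0? -> $qmark"
echo "[x-z]*   -> $range"
rm -rf "$w"
```

The name myfile010 is missing from the second result because two characters follow myfile0, and ? stands for exactly one.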

2.6 File Types

You have been using commands like cat and more with text files. How do you know which file is a text file, which contains binary data, or which is a C program? The UNIX file command is used to determine the type of file. See the following examples.

A Text File

$ file /etc/profile

/etc/profile: ascii text

$

A Directory

$ file /etc

/etc: directory

$

An Executable File

$ file /bin/ls

/bin/ls: PA-RISC1.1 shared executable

$

A Shared Library

$ file /lib/libc.1

/lib/libc.1: PA-RISC1.1 shared library -not stripped

$

A Shell Script

$ file abc

abc: commands text

$

Similarly, the file command is able to detect a number of other file types. The file command uses the /etc/magic file to determine different file types by finding a magic string inside the file. A detailed discussion on magic numbers is out of the scope of this book, but you can see man pages for /etc/magic for further information on magic numbers. The file command is very useful in situations where you want to determine the type of file before performing an operation on it. It is quite possible that your display would be garbled if you were to use the cat command on a binary file.
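
As a sketch of that last point, a script can ask file for the type first and display the contents only when the description mentions text. The exact wording printed by file differs between systems, so matching on the word "text" is an assumption, as are the file name filedemo.$$ and the variable names.

```shell
f="${TMPDIR:-/tmp}/filedemo.$$"
printf 'This is a plain text file.\n' > "$f"

# Ask file for a description; fall back gracefully if the command is missing.
if command -v file >/dev/null 2>&1; then
    desc=$(file "$f")
    desc=${desc#*:}            # drop the leading "name:" part
else
    desc="unknown (file command not available)"
fi

# Display the contents only when the description mentions text.
case $desc in
    *text*) kind=text;  cat "$f" ;;
    *)      kind=other; echo "not displaying $f" ;;
esac
rm -f "$f"
```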

Study Break

Copying and Moving Files Using Wildcards and Finding the Type of a File

General syntax of the cp and mv commands is that you specify the source file name first and then the destination file name. Create a directory with the name impfiles in your home directory. Copy the /etc/hosts file into this directory. Also copy all files starting with "m" from the /etc directory to this directory. Now move the hosts file from the impfiles directory to the /tmp directory. Using the character list [aeiou], list all files in the /usr directory that start with any vowel.

You can find out the type of a file by using the file command. Try to find a shared executable file on the system by applying this command to different files.

2.7 Searching File Contents

Finding a text string in one or multiple text files is easy using the grep (global regular expression print) command. It does the job in a number of ways. You can search for text strings in one or many files. You can also specify additional criteria for the string, such as whether it occurs at the start or at the end of a line. If you are using multiple files for a search, grep also displays the name of the file in which the string is found. It can also display the location in the file where the string is found.

Searching a Word

Here we show how you can find whether a particular user exists by applying the grep command on the /etc/passwd file.

$ grep Mark /etc/passwd

mstyle:elBY:2216:125:Mark Style,,,:/home/mstyle:/usr/bin/sh

mgany:iF5UeWQ:2259:125:Mark Gany,,,:/home/mgany:/usr/bin/sh

mbuna:tQfwUNo:2318:125:Mark Buna,,,:/home/mbuna:/usr/bin/sh

mblack:ipCg:2388:125:Mark Black,,,:/home/mblack:/usr/bin/sh

$

This command shows that there are four users on the system with the name Mark. If you want to make a search case insensitive, you may use grep -i instead of grep. If you are interested to know how many lines contain the string, without displaying the lines themselves, use grep -c. You can even reverse the selection of lines by grep -v. In this case, all lines that don't match the string pattern are displayed.
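
The options can be tried without reading the real /etc/passwd by searching a small sample file. In the sketch below the file grepdemo.$$ and its three entries are sample data modeled on the listing above, with the password field replaced by "*"; they are assumptions made for the example.

```shell
p="${TMPDIR:-/tmp}/grepdemo.$$"
cat > "$p" <<'EOF'
mstyle:*:2216:125:Mark Style,,,:/home/mstyle:/usr/bin/sh
mblack:*:2388:125:Mark Black,,,:/home/mblack:/usr/bin/sh
root:*:0:3:,,,:/home/root:/sbin/sh
EOF

matching=$(grep -c Mark "$p")            # lines that contain Mark
nocase=$(grep -i -c "mark black" "$p")   # same idea, ignoring case
others=$(grep -v -c Mark "$p")           # lines that do NOT contain Mark

echo "matching=$matching nocase=$nocase others=$others"
rm -f "$p"
```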

Searching Multiple Words

If you want to search using a string of multiple words, enclose the words with double quotes. For example, if you want to search for "Mark Black" in /etc/passwd, you will use the grep command.

$ grep "Mark Black" /etc/passwd

mblack:ipCg:2388:125:Mark Black,,,:/home/mblack:/usr/bin/sh

$

For a case-insensitive search of "Mark Black," use the following command.

$ grep -i "mark black" /etc/passwd

mblack:ipCg:2388:125:Mark Black,,,:/home/mblack:/usr/bin/sh

$

Searching a String in Multiple Files

As I mentioned earlier, the grep command can be used to search multiple files for a matching string. You need to specify all file names in which you want to search for the text string. For example, if you search for the word root in the /etc/passwd and /etc/group files, the following result is displayed.

$ grep root /etc/passwd /etc/group

/etc/passwd:root:8JgNSmFv806dA:0:3:,,,:/home/root:/sbin/sh

/etc/group:root::0:root

/etc/group:other::1:root,hpdb

/etc/group:bin::2:root,bin

$

The command shows that the word root is found on one line of the /etc/passwd file and on three lines of the /etc/group file.
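
To see where in each file a match occurs, the -n option adds the line number after the file name. The sketch below uses two small sample files instead of the real system files; the directory grepfiles.$$, the file names, and their contents are assumptions for illustration.

```shell
g="${TMPDIR:-/tmp}/grepfiles.$$"
mkdir -p "$g"
cat > "$g/passwd.sample" <<'EOF'
daemon:*:1:5::/:/sbin/sh
root:*:0:3:,,,:/home/root:/sbin/sh
EOF
cat > "$g/group.sample" <<'EOF'
root::0:root
other::1:root,hpdb
users::20:
EOF

# With several files and -n, each match is shown as file:line:text.
hits=$(cd "$g" && grep -n root passwd.sample group.sample)
echo "$hits"

first=$(printf '%s\n' "$hits" | head -n 1)
total=$(printf '%s\n' "$hits" | wc -l)
rm -rf "$g"
```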

2.8 Finding Files

The find command is used to search for a file on a system. For example, if you want to find all files whose names start with pro in the /etc directory and all of its subdirectories, the command is:

$ find /etc -name "pro*"

/etc/profile

/etc/protocols

$

In a similar way, the find command can be used to find files that are newer versions of a certain file. The search can also be made on file types and file permissions. Please refer to man pages for more information on the find command.
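
A search by name and a search by file type can be compared in a small scratch tree. This is a sketch under stated assumptions: the directory finddemo.$$ and the files inside it are created only for the example, and sort is added because find does not promise any particular order.

```shell
t="${TMPDIR:-/tmp}/finddemo.$$"
mkdir -p "$t/sub"
: > "$t/profile"
: > "$t/hosts"
: > "$t/sub/protocols"

# Quote the pattern so that find, not the shell, expands it.
names=$(cd "$t" && find . -name "pro*" | sort)

# -type d restricts the search to directories (-type f selects ordinary files).
dirs=$(cd "$t" && find . -type d | sort)

echo "$names"
echo "$dirs"
rm -rf "$t"
```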

2.9 Miscellaneous File Handling Commands

Here are some other useful commands related to file handling.

The Head and the Tail

Sometimes you need to view only the first or last few lines of a text file. By default, the head command lists the first ten lines of a text file, and the tail command lists the last ten lines of a file. For example, if you want to see the first ten lines of the /etc/passwd file (used to store user names and passwords), the command and its output will be:

$ head /etc/passwd

root:8JgNSmFv806dA:0:3:,,,:/home/root:/sbin/sh

mmsecad:ETxUQ5wSQZCAk:0:3::/:/sbin/sh

daemon:*:1:5::/:/sbin/sh

bin:*:2:2::/usr/bin:/sbin/sh

sys:*:3:3::/:

adm:*:4:4::/var/adm:/sbin/sh

uucp:*:5:3::/var/spool/uucppublic:/usr/lbin/uucp/uucico

lp:*:9:7::/var/spool/lp:/sbin/sh

nuucp:*:11:11::/var/spool/uucppublic:/usr/lbin/uucp/uucico

hpdb:*:27:1:ALLBASE:/:/sbin/sh

$

Additional parameters can be used with both the head and tail commands to view any number of lines of text. A tail -n 3 /etc/passwd will show the last three lines of the file. If you want to see what is being added to a text file by a process in real time, you can use the tail -f command. This is a very useful tool to see text being added to a log file.
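
The line counts are easy to verify on a file whose contents are known in advance. The sketch below generates a twelve-line file; the file name headtail.$$ and the variable names are assumptions.

```shell
f="${TMPDIR:-/tmp}/headtail.$$"

# Write twelve numbered lines into the file.
i=1
while [ "$i" -le 12 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$f"

default=$(head "$f" | wc -l)   # head with no option prints ten lines
first=$(head -n 2 "$f")        # only the first two lines
last=$(tail -n 3 "$f")         # only the last three lines

echo "$first"
echo "$last"
rm -f "$f"
```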

Counting Characters, Words, and Lines in a Text File

Many times, you want to know how many characters, words, or lines there are in a file. In the /etc/passwd file, for example, there is one line for every user. You can count the number of users on the HP-UX system if you count the number of lines in the file. We use the wc (word count) command for this purpose. It displays the number of lines, words, and characters, respectively.

$ wc /etc/profile

171 470 3280 /etc/profile

$

It shows that there are 171 lines, 470 words, and 3280 characters in the /etc/profile file. If you want to count only the number of lines in a file, you can use wc -l. Similarly, for counting words, wc -w, and for counting characters, wc -c, can be used.

$ wc -l /etc/passwd

2414 /etc/passwd

$

It shows that there are 2414 lines in /etc/passwd, which is an indirect way to find out the number of users on this system.
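
In scripts it is often more convenient to get the bare number without the file name. One way, sketched below, is to let wc read from standard input; the file wcdemo.$$ and its two lines are assumptions made for the example.

```shell
f="${TMPDIR:-/tmp}/wcdemo.$$"
printf 'one two\nthree\n' > "$f"

# Reading from standard input makes wc print only the number.
lines=$(wc -l < "$f")
words=$(wc -w < "$f")
chars=$(wc -c < "$f")

echo "lines=$lines words=$words chars=$chars"
rm -f "$f"
```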

Link Files

Many times you need to refer to the same file by different names. You can create a link, which is a name that points to an existing file. There are two types of links, hard and soft. A hard link is another directory entry for the same file, while a soft link is a special type of file that stores the path name of the file to which it is linked. Soft links may be established across file systems, whereas hard links must stay within a single file system. For soft links, the first character in the ll command listing is "l". To create a link, the ln command is used. For example, to create a hard link, abc, to a file, myfile, we use:

$ ln myfile abc

$

To create a soft link, we use the -s option.

$ ln -s myfile abc

$
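
The practical difference between the two link types shows up when the original name is removed. The sketch below uses a scratch directory named lndemo.$$ and the link names hard and soft, all of which are assumptions for the example.

```shell
l="${TMPDIR:-/tmp}/lndemo.$$"
mkdir -p "$l"
echo "some text" > "$l/myfile"

ln "$l/myfile" "$l/hard"     # hard link: a second name for the same file
ln -s myfile "$l/soft"       # soft link: a small file that stores the name myfile

# With two names for the same file, the link count in ls -l is 2.
links=$(ls -l "$l/myfile" | awk '{print $2}')

# Remove the original name. The data stays reachable through the hard
# link, while the soft link now points at a name that no longer exists.
rm "$l/myfile"
via_hard=$(cat "$l/hard")
if [ -e "$l/soft" ]; then soft=ok; else soft=dangling; fi

echo "links=$links via_hard=$via_hard soft=$soft"
rm -rf "$l"
```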
