Friday, May 27, 2011

Chapter 4. Input/Output Redirection and Pipes

Chapter Syllabus

4.1 Redirecting Standard Output

4.2 Redirecting Standard Input

4.3 Redirecting Standard Error

4.4 Redirecting Standard Input, Output, and Error Simultaneously

4.5 Pipes and How They Are Used

4.6 The T-Junction

Most UNIX commands are designed to take simple text (alphanumeric data and punctuation) as input. Usually, the output is also simple text. Whenever you start a UNIX command, it opens three standard data streams: standard input (stdin), standard output (stdout), and standard error (stderr). Every UNIX command takes input data from stdin and sends its normal output to stdout and its error messages to stderr. These data streams are often called standard input/output. UNIX associates numbers known as file descriptors with all open files. File descriptor 0 is used for standard input, 1 for standard output, and 2 for standard error.

Standard input, usually the user keyboard, is normally the place where a program reads its input from. Standard output, usually your terminal screen, is where the results of a command or program are displayed. In normal cases, standard error messages are also displayed on the terminal screen, but it is always possible to separate stdout from stderr. The UNIX shell can redirect any of these streams to a file, a device, or some other command, as required by the user. We call this process I/O redirection. You studied one example of output redirection in Chapter 2, when you created a new file with the cat command. In its normal use, the cat command reads from the keyboard (stdin) and writes to the terminal screen (stdout). We used the ">" symbol to redirect output from the terminal screen to a file. Similarly, when we displayed the contents of a file with the cat command, the command took its input from the named file instead of from the keyboard. Figure 4-1 shows the standard location of input, output, and error for any UNIX command or program.

Figure 4-1. Location of standard I/O.

Another useful feature of UNIX is the pipe, with which we can send the output of one command to the input of another command. This is often used to process and format data produced by a command and make it more understandable. Many UNIX commands are used as filters, which take input from another command, keep the required data, and discard the rest. For example, the cat /etc/passwd command displays the contents of the password file, but using a filter we can extract only the login names of the system users.

Sometimes UNIX is also called a file-based operating system, meaning that any type of input or output device can be considered as being a file. All of the devices connected to a system are controlled through device driver files. When you want to print something, just direct it to the printer device file. If you want to send something to the terminal display, send it to the display device file. I/O redirection and pipes are considered very powerful features of UNIX, as any combination of commands can be used to get the desired result.

In this chapter, you will learn how to redirect any type of input, output, and error to another location. You can also redirect all of these at the same time. You will also learn the uses of pipes and tees to filter and redirect data to multiple locations.

4.1 Redirecting Standard Output

Redirection of stdout is controlled by the greater-than symbol (">"). The process of redirecting output is shown in Figure 4-2. The command takes input from the keyboard but sends its output to a file on disk.

Figure 4-2. Standard output redirection.

Note that error messages still go to the terminal screen. To demonstrate the process of output redirection, we can use the same example from Chapter 2, where we displayed the contents of a file as follows.

$ cat newfile

This is first line.

This is the second line.

This is third and last line.

$

To redirect the output of the cat command we use the following step.

$ cat newfile > file1

$

This time the cat command displays nothing, because its output is redirected to a file. If we check the contents of file1, it contains the same text as newfile (the output of the cat command).

Note

This is another way of copying text files. As you go through the book, you will find how versatile the UNIX commands are and how many different ways these commands can be used. Until now, you have used the cat command to create a new file, display contents of a file, and copy a text file using redirection. The same command is used for other purposes as well, and you will learn more uses of the cat command later in this chapter.

As another example, consider the who command. We redirected its output to a file with the name whofile. We can verify the contents of whofile with the more or cat command.

$ who > whofile

$ cat whofile

operator pts/ta Aug 30 16:05

boota pts/tb Aug 30 15:59

john pts/tc Aug 30 14:34

$

Note

If a file with the name given after the ">" symbol (file1 or whofile in the examples above) already exists, it is overwritten without any warning.
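The overwriting behavior can be seen in a short sketch. The scratch directory name and the sample text are illustrative assumptions, chosen so that no real file is touched.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_overwrite.$$"
mkdir -p "$workdir"

echo "old contents" > "$workdir/file1"   # creates file1
echo "new contents" > "$workdir/file1"   # ">" empties file1 before writing

cat "$workdir/file1"                     # only the second line is left
```

The first line written to file1 is gone after the second command, and the shell gives no warning.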

Joining Two or More Files

Two or more files can be joined into a single file by the use of the cat command and redirecting output to a file. Let us suppose there are three files in your home directory, with the names file1, file2, and file3. If you use the cat command with file1 and file2 as its arguments, it will show you the contents of file1 and file2, respectively. What if we use the cat * command? It will display the contents of all files in the directory. Now, by simply redirecting the output to another file, the command will concatenate all of these files.

$ cat file1 file2 >file4

$

This command creates file4, which contains the contents of both file1 and file2. The following command creates file5, containing the contents of all files in the directory.

$ cat * >file5

$

Note

This is another use of the cat command: joining two or more files.
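As a small sketch of joining files, the following creates two one-line files and concatenates them. The scratch directory and the file contents are assumptions made for the illustration.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_join.$$"
mkdir -p "$workdir"

printf 'line from file1\n' > "$workdir/file1"
printf 'line from file2\n' > "$workdir/file2"

# file4 receives file1 followed by file2, in the order of the arguments.
cat "$workdir/file1" "$workdir/file2" > "$workdir/file4"

cat "$workdir/file4"
```

The order of the arguments to cat decides the order of the text in the joined file.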

Appending to a File

In the case of output redirection with the ">" symbol, the file to which we redirect the output of a command is overwritten. It means that the previous contents of the file are destroyed. We can use the double redirection symbol ">>" to keep the previous contents of the file. In such a situation, the output of a command is appended to the file. Consider the following example.

$ cat file1 >>file2

$

After this command, file2 still contains its old contents, and the contents of file1 are added to the end of it. If file2 does not exist, it is created. This is a very useful feature and is used in many situations. For example, if we want to check how many users are logged in every hour, we can ask UNIX to run the date and who commands every hour and redirect (append) the output of both of these commands to a log file. The date command will append the current date and time and the who command will append a list of users. Later on we can view this log file to get the desired information.
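The logging idea can be sketched by hand before any scheduling is involved. The log file name below is an assumption, and the two snapshots are taken one right after the other rather than an hour apart.

```shell
# Hypothetical log file in a scratch location.
logfile="${TMPDIR:-/tmp}/who_log.$$"
: > "$logfile"          # start the demo with an empty log

# First snapshot: date and time, then the logged-in users.
date >> "$logfile"
who  >> "$logfile"

# Second snapshot is added at the end; the first one is preserved.
date >> "$logfile"
who  >> "$logfile"

cat "$logfile"
```

Because every command uses ">>", each run adds to the log instead of replacing it.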

Redirecting Standard Output to Devices

In addition to redirecting output of a command to a file, you can also redirect it to any device, as UNIX treats all devices as files. Just as an example, the device file of the console is /dev/console. If you want to send the contents of a file to the console, you can use the following command.

$ cat file1 >/dev/console

$

The file will be displayed on the console screen. Similarly, if you know the device file name of another terminal, you can send a file to the monitor of that terminal.

Note

System administrators often use this procedure to diagnose a faulty terminal. If you don't get a login prompt on an attached terminal, try to send a file to that terminal with the above-mentioned procedure to check the cabling. If the file is displayed on the terminal screen, you have assurance that there is no hardware fault and that something is missing in the configuration of that terminal.

Sometimes you can use the same redirection method to print simple text files, if the printer is directly connected to the HP-UX machine and you know the device name for the printer.

When redirecting output, keep in mind that stderr is not redirected automatically with the output. If the command you issue generates an error message, it will still be displayed on your own terminal screen.

4.2 Redirecting Standard Input

UNIX commands can send output anywhere when using output redirection, and they can also get input from places other than the keyboard. Figure 4-3 shows I/O locations in the case of stdin redirection.

Figure 4-3. Standard input redirection.

We use the "less-than" symbol (<) for input redirection. Say that you have already created a text file with name myfile. You want to send this text file to a user jane through electronic mail. The easiest way to do this is to ask the mail program to get its input from this file instead of from the keyboard. The process of doing this is as follows.

$ mail jane < myfile

$

The mail program sends an email message to user jane on the current system consisting of the contents of myfile. This is a more convenient way to type messages when you need time to compose them. You just create a file, and when you are satisfied with what you have typed, send it through email.
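Input redirection works the same way with any command that reads stdin. The sketch below uses tr instead of mail so that it can be tried without sending anything; the scratch directory and the text in myfile are assumptions.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_stdin.$$"
mkdir -p "$workdir"

printf 'hello jane\n' > "$workdir/myfile"

# tr takes no file name argument; it reads only stdin,
# so "<" is what connects it to myfile instead of the keyboard.
upper=$(tr 'a-z' 'A-Z' < "$workdir/myfile")

echo "$upper"
```

The command itself does not know or care that its input comes from a file; the shell sets that up before the command starts.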

4.3 Redirecting Standard Error

The stderr stream can be redirected in much the same way as stdin or stdout. Figure 4-4 shows what happens when we redirect stderr.

Figure 4-4. Standard error redirection.

There is no special symbol for redirecting stderr. The same ">" symbol is used but with the number 2 attached in front of it. If you remember from previous pages, there are three file descriptors opened whenever a command is issued. These file descriptors are shown in Table 4-1.

Table 4-1. Standard File Descriptors

File Descriptor Number   Description
0                        Standard input
1                        Standard output
2                        Standard error

We use "2>" for stderr redirection to tell the shell that we want to redirect the error messages instead of stdout (for which the file descriptor value is 1). Consider the following command.

$ ll xyz

xyz not found.

$

We tried to list a file with name xyz and the command result shows that this file does not exist. This is an error message of the ll command. Now see the following command.

$ ll xyz >abc

xyz not found.

$

We tried to redirect the output, but still the message is displayed on our screen. The redirection had no effect because we are trying to redirect stdout while the command is generating stderr messages. Now let us see what happens if we change ">" to "2>".

$ ll xyz 2>abc

$

Now there is nothing displayed because the error message has been stored in a file with name abc. You can use the cat command to verify that the error message was indeed stored in the abc file.
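The same experiment can be repeated with ls, which is available on any UNIX system. This is only a sketch: the scratch directory is an assumption, and two separate output files are used here (instead of the single file abc) so the results can be compared.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_stderr.$$"
mkdir -p "$workdir"

# xyz is never created, so ls has only an error message to report.
# "|| true" simply ignores the non-zero exit status of ls in this demo.

# ">" redirects stdout only: the message still reaches the terminal,
# and the output file is created but stays empty.
ls "$workdir/xyz" > "$workdir/abc_out" || true

# "2>" redirects stderr: nothing reaches the terminal this time.
ls "$workdir/xyz" 2> "$workdir/abc_err" || true

cat "$workdir/abc_err"    # the saved error message
```

The exact wording of the error message depends on the system, but it always ends up in the file named after "2>".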

4.4 Redirecting Standard Input, Output, and Error Simultaneously

As you become more and more accustomed to HP-UX, you will often need to run unattended programs that execute at specific times, for example, at midnight. You need the program to take input from some files, such as system log files, and send its output to some other file. You also need to know if some error occurred during the execution of such programs. You can then look over the results of the program and any errors at any convenient time. This is the case where you redirect all of the three types of standard I/O to files. See Figure 4-5, showing where data streams come and go in such a case.

Figure 4-5. Redirection of standard input, output, and error.

We shall demonstrate the use of all of the three redirections with the sort command. Let us suppose we have a file with name unsorted with the following four lines in it.

$ cat unsorted

This is number 1

This is number 5

This is number 3

This is number 2

$

We can use the sort command to arrange (sort) these lines. When we use the sort command with input redirection to this file, this result appears.

$ sort < unsorted

This is number 1

This is number 2

This is number 3

This is number 5

$

Now we can redirect output of the command to a file named sorted and the error to a file named error with the following command.

$ sort <unsorted >sorted 2>error

$

Does this seem complicated to you? Indeed it is not. You can even change the order in which input, output, and error files appear.
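The following sketch runs the sort example twice with the redirections written in a different order. The scratch directory and the names sorted2 and error2 are assumptions added so that both runs can be compared.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_all.$$"
mkdir -p "$workdir"

printf '%s\n' \
    'This is number 1' \
    'This is number 5' \
    'This is number 3' \
    'This is number 2' > "$workdir/unsorted"

# Input, output, and error redirected in the order used in the text.
sort < "$workdir/unsorted" > "$workdir/sorted" 2> "$workdir/error"

# The same three redirections in a different order give the same result.
sort 2> "$workdir/error2" > "$workdir/sorted2" < "$workdir/unsorted"

cat "$workdir/sorted"
```

Each of the three redirections names its own file, so the shell can set them up in any order.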

Study Break

Use of I/O Redirection

As you have seen, I/O redirection is an important tool for a UNIX user. Until now you have studied all types of I/O redirection. As you have seen, you can use one or all types of redirection with a command. The I/O redirection feature is used by system administrators extensively in scripts used for system maintenance purposes. Most of the time, these scripts are time scheduled and run without any user interaction. It is very useful to record output of these scripts to diagnose any problem occurring during execution. You will see in the next section that the pipe is another very important tool, which, when used with I/O redirection, can filter useful data from system log files.

This is the time to practice with I/O redirection. Use the date command and redirect its output to a file named logfile. Use the who command and append its output to the same file. Wait for five minutes and again use the date and who commands and append their output to logfile. Now use the cat command to display logfile. You will see that it contains a line for time and date and then a list of users who were logged in at that time. You can use this technique with the UNIX scheduler (cron) to create a log for a whole day and study it later.

4.5 Pipes and How They Are Used

Look at Figure 4-6 carefully. It shows another powerful feature of the UNIX shell, using the output of one command as input of another command. We call this process piping due to its similarity to the real-world use of a pipe. At the command line, both of the processes (commands) are connected using the vertical bar symbol "|". This symbol is often called a pipe symbol. When two commands are connected through a pipe, the first command sends its output to the pipe instead of sending it to the terminal screen. The second command reads its input from the pipe rather than from the keyboard. Both of the commands still send error messages to the terminal screen, as shown in the figure. The first command takes input from the keyboard (stdin), and the second command sends its output to the terminal screen (stdout). If the second command needs input data but nothing is available in the pipe, it just waits for the first command to send something into the pipe.

Figure 4-6. Use of pipes.

Pipes are often used to filter, modify, or manipulate data output of one command. Multiple levels of pipes can be used in one command line. Similarly, pipes can also be used in combination with I/O redirection symbols.

Use of Pipes as Filters

Many times we don't need all of the output produced by a command. In such a case, we can filter the desired information from the output produced by a command. Filtering means extracting useful data and throwing away the rest. We have already studied the who command, which is used to see the names of logged-in users. In large systems, where hundreds of users are logged in simultaneously, it is difficult to find out whether a particular user is currently logged in. In this situation, we use the filter to get the desired information. We can use the who command with the grep command, where grep acts as a filter. Consider the next example, where we want to find if a user "mike" is logged in.

First we use only the who command and then we combine the who and grep commands.

$ who

operator pts/ta Aug 30 16:05

boota pts/tb Aug 30 15:59

mike pts/tc Aug 30 15:44

linda pts/td Aug 30 14:34

$

Now we use a pipe to filter out our required information.

$ who | grep mike

mike pts/tc Aug 30 15:44

$

As you can see, only the line containing the word "mike" is now displayed. We have used the grep command previously to find a string in one or more files. The grep command, at that time, took file names as arguments. In this example, it did the same thing but took its input from the pipe.

How did grep know that no more data were coming from the pipe and that it should stop processing? When the who command finishes, it closes its end of the pipe. Once grep has read all of the remaining data, its next read from the pipe reports end of file (EOF), and grep stops. No special EOF character travels through the pipe. If the pipe is empty but who has not yet closed it, grep simply waits until more data arrive or the pipe is closed.

As another example, we can get only login names from the who command by using another filter known as cut. We will discuss the cut command in more detail in the last chapter, but for the time being just see how we use it to extract the first word of each line and throw away the rest.

$ who | cut -f 1 -d " "

operator

boota

mike

linda

$

The cut command takes its input as fields separated by space characters and picks the first field from each input line. Since the first field of all output lines is the login name, we got the login names only in the output.

You can also use multiple levels of pipes as shown below.

$ who | cut -f 1 -d " " | grep mike

mike

$

Try to explain what is happening here. We have filtered the output of one command and then again filtered the output of the second command. You can continue this process as far as you want.
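To experiment with this pipeline on a system where these users are not logged in, the who command can be replaced by a stand-in. In the sketch below, sample_who is a hypothetical function that simply prints the sample lines used in this section.

```shell
# sample_who is a hypothetical stand-in for who; it prints fixed sample data.
sample_who() {
    printf '%s\n' \
        'operator pts/ta Aug 30 16:05' \
        'boota pts/tb Aug 30 15:59' \
        'mike pts/tc Aug 30 15:44' \
        'linda pts/td Aug 30 14:34'
}

# Two stages: keep only the first space-separated field (the login name).
sample_who | cut -f 1 -d " "

# Three stages: from those login names, keep the lines containing "mike".
result=$(sample_who | cut -f 1 -d " " | grep mike)
echo "$result"
```

Each stage sees only what the previous stage wrote to its stdout, so grep here searches the login names and not the full lines.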

Use of Pipes for Data Manipulation

As we have used pipes for filtering data, we can also use them for reorganizing and manipulating data. What if you need to get output of a command in sorted form? Yes, it is quite simple if you pass it through the sort command using a pipe. Consider the above example of using the who command. See how the output changes without and with a sort pipe.

$ who

operator pts/ta Aug 30 16:05

boota pts/tb Aug 30 15:59

mike pts/tc Aug 30 15:44

linda pts/td Aug 30 14:34

$

Now we use a pipe with the sort command.

$ who | sort

boota pts/tb Aug 30 15:59

linda pts/td Aug 30 14:34

mike pts/tc Aug 30 15:44

operator pts/ta Aug 30 16:05

$

The sort command has arranged the output of the who command in alphabetical order. If there are many users logged in, the output of the who command just scrolls up and you see only the last page. In that case, you can use the more command as a filter to stop the scrolling at the end of each page.

$ who | more

Filters can do many things for you in a very simple way. If you were using some other operating system, you might need to write separate programs!
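Pipes and redirection can be combined on one command line, as mentioned at the start of this section. The sketch below sorts the login names and saves them in a file. Both sample_who (a stand-in for who that prints fixed sample data) and the output file name are assumptions.

```shell
# sample_who is a hypothetical stand-in for who; it prints fixed sample data.
sample_who() {
    printf '%s\n' \
        'operator pts/ta Aug 30 16:05' \
        'boota pts/tb Aug 30 15:59' \
        'mike pts/tc Aug 30 15:44' \
        'linda pts/td Aug 30 14:34'
}

# Hypothetical output file in a scratch location.
outfile="${TMPDIR:-/tmp}/sorted_users.$$"

# The pipe connects the commands; ">" applies to the last one only.
sample_who | cut -f 1 -d " " | sort > "$outfile"

cat "$outfile"
```

Only the output of sort is written to the file, because the redirection belongs to the last command in the pipeline.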

4.6 The T-Junction

This is a special type of pipe similar to a T pipe junction in real life. This is used to redirect incoming data in a pipe to more than one place. Please see Figure 4-7 to get an idea how the T-junction works.

Figure 4-7. The T-junction.

The tee command is used to form a T-junction. It takes its input from stdin and writes the same thing to stdout as well as to another file at the same time. Consider the same example of the who command. If you want to display the output of the who command at the terminal as well as save it in whofile for future use, the command line and result will be as follows.

$ who | tee whofile

operator pts/ta Aug 30 16:05

boota pts/tb Aug 30 15:59

mike pts/tc Aug 30 15:44

linda pts/td Aug 30 14:34

$

If we now display the contents of whofile, it contains the same data.

$ cat whofile

operator pts/ta Aug 30 16:05

boota pts/tb Aug 30 15:59

mike pts/tc Aug 30 15:44

linda pts/td Aug 30 14:34

$

Like ordinary pipes and redirection symbols, multiple levels of T-junction pipes can be used to send data to many places. Can you use the sort or head commands with the tee command now? How about using the spell command to check the spelling of a command's output?
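One possible way to chain two T-junctions is sketched below. The function sample_who is a hypothetical stand-in for who that prints fixed sample data, and the scratch directory and the name sortedfile are assumptions.

```shell
# sample_who is a hypothetical stand-in for who; it prints fixed sample data.
sample_who() {
    printf '%s\n' \
        'operator pts/ta Aug 30 16:05' \
        'boota pts/tb Aug 30 15:59' \
        'mike pts/tc Aug 30 15:44' \
        'linda pts/td Aug 30 14:34'
}

# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/tee_demo.$$"
mkdir -p "$workdir"

# First tee saves the raw list, second tee saves the sorted list,
# and head shows the first two sorted lines on the terminal.
sample_who | tee "$workdir/whofile" | sort | tee "$workdir/sortedfile" | head -n 2
```

After this command, whofile holds the list in its original order and sortedfile holds it in sorted order, while only two lines appear on the screen.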

Table 4-2 is a summary of the redirection and pipe symbols used in HP-UX.

Table 4-2. Standard I/O Redirection

Symbol   Function                                            Syntax
>        Redirect stdout and overwrite or create a file      prog > file
<        Redirect stdin                                      prog < file
>>       Redirect stdout and append to, or create, a file    prog >> file
2>       Redirect stderr                                     prog 2> file
2>&1     Send stderr and stdout to the same file             prog > file 2>&1
|        Pipe stdout of prog1 to stdin of prog2              prog1 | prog2
|&       Pipe stdout and stderr of prog1 to stdin of prog2   prog1 |& prog2

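Note: prog, prog1, and prog2 represent a command or executable program, while file is any file. The |& form is C shell syntax; in the POSIX and Korn shells, |& starts a co-process instead, so use prog1 2>&1 | prog2 there.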
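The 2>&1 entry can be tried with a command that writes to both streams. In this sketch, ls is given one file that exists and one that does not; the scratch directory and the file names present, absent, and both are assumptions.

```shell
# Scratch directory for the demo (hypothetical path).
workdir="${TMPDIR:-/tmp}/redir_both.$$"
mkdir -p "$workdir"

: > "$workdir/present"      # create an empty file; "absent" is never created

# stdout goes to the file first, then stderr is pointed at the same place.
# "|| true" ignores the non-zero exit status of ls in this demo.
ls "$workdir/present" "$workdir/absent" > "$workdir/both" 2>&1 || true

cat "$workdir/both"         # the listing and the error message together
```

The order matters here: "2>&1" must come after "> file", because it copies wherever stdout points at that moment.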

Test Your Knowledge

1: What is the file descriptor used for stderr?

A. 1
B. 0
C. 2
D. 3

2: The symbol used to append to a file when redirecting stdout to that file is:

A. >
B. >>
C. <
D. 2>

3: When you redirect both stdout and stderr to the same location, you use:

A. 2&>
B. 2&>1
C. 2>&1
D. 1>&2

4: A pipe is used to:

A. send output of one command to the input of another command
B. filter certain data from the output of a command
C. reorder output of a command
D. all of the above

5: Which is not true?

A. Pipes can be used with redirection symbols.
B. Pipes cannot be used when stdin redirection is used in a command.
C. It is possible to redirect all stdin, stdout, and stderr at the same time.
D. The tee command sends output of a command to two locations.
