Linux
by David Till
IN THIS CHAPTER
- What Is Perl?
- How Do I Find Perl?
- A Sample Perl Program
- Running a Perl Program
- Line 1 of Your Program: How Comments Work
- Line 2: Statements, Tokens, and <STDIN>
- Line 3: Writing to Standard Output
- Error Messages
- Interpretive Languages Versus Compiled Languages
Welcome to a brief look at Perl 5. In this chapter, you'll learn about the following
topics:
- What Perl is and why Perl is useful
- How to get Perl if you do not already have it
- How to run Perl programs
- How to write a simple Perl program
- The difference between interpretive and compiled programming languages
Perl is an acronym, short for Practical Extraction and Report Language. It was
designed by Larry Wall as a tool for writing programs in the UNIX environment and
is continually being updated and maintained by him.
For its many fans, Perl provides the best of several worlds. For instance:
- Perl has the power and flexibility of a high-level programming language such
as C. In fact, as you will see, many of the features of the language are borrowed
from C.
- Like shell script languages, Perl does not require a special compiler and linker
to turn the programs you write into working code. Instead, all you have to do is
write the program and tell Perl to run it. This means that Perl is ideal for producing
quick solutions to small programming problems, or for creating prototypes to test
potential solutions to larger problems.
- Perl provides all the features of the script languages sed and awk, plus features
not found in either of these two languages. Perl also supports a sed-to-Perl translator
and an awk-to-Perl translator.
In short, Perl is as powerful as C but as convenient as awk, sed, and shell scripts.
As you'll see, Perl is very easy to learn. Indeed, if you are familiar with other
programming languages, learning Perl is a snap. Even if you have very little programming
experience, Perl can have you writing useful programs in a very short time. If you
pick up a copy of Teach Yourself Perl 5 in 21 Days (Sams Publishing, 1995), you'll
easily learn enough about Perl to be able to solve many problems.
To find out whether Perl already is available on your system, take the following
steps:
- If you are currently working in a UNIX programming environment, check to see whether
the file /usr/local/bin/perl exists.
- If you are working in any other environment, check the place where you normally
keep your executable programs, or check the directories accessible from your PATH
environment variable.
If you do not find Perl in this way, ask your system administrator whether Perl
is installed somewhere else. If Perl is not available in your environment, don't
despair--read on!
One of the reasons Perl is becoming so popular is that it is available free to
anyone who wants it. If you are on the Internet, you can obtain a copy of Perl with
File Transfer Protocol (FTP). Following is a sample FTP session that transfers a
copy of the Perl distribution. The items you would enter during the session are the
commands that follow the $ and ftp> prompts, along with the user name anonymous.
$ ftp prep.ai.mit.edu
Connected to prep.ai.mit.edu.
220 aeneas FTP server (Version wu-2.4(1) Thu Apr 14 20:21:35 EDT 1994) ready.
Name (prep.ai.mit.edu:dave): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:
230-Welcome, archive user!
230-
230-If you have problems downloading and are seeing "Access denied" or
230-"Permission denied", please make sure that you started your FTP
230-client in a directory to which you have write permission.
230-
230-If you have any problems with the GNU software or its downloading,
230-please refer your questions to <[email protected]>. If you have any
230-other unusual problems, please report them to <[email protected]>.
230-
230-If you do have problems, please try using a dash (-) as the first
230-character of your password -- this will turn off the continuation
230-messages that may be confusing your FTP client.
230-
230 Guest login ok, access restrictions apply.
ftp> cd pub/gnu
250-If you have problems downloading and are seeing "Access denied" or
250-"Permission denied", please make sure that you started your FTP
250-client in a directory to which you have write permission.
250-
250-Please note that all files ending in `.gz' are compressed with
250-'gzip', not with the unix `compress' program. Get the file README
250- and read it for more information.
250-
250-Please read the file README
250- it was last modified on Thu Feb 1 15:00:50 1996 - 32 days ago
250-Please read the file README-about-.diff-files
250- it was last modified on Fri Feb 2 12:57:14 1996 - 31 days ago
250-Please read the file README-about-.gz-files
250- it was last modified on Wed Jun 14 16:59:43 1995 - 264 days ago
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get perl-5.001.tar.gz
200 PORT command successful.
150 Opening BINARY mode data connection for perl-5.001.tar.gz (1130765 bytes).
226 Transfer complete.
1130765 bytes received in 9454 seconds (1.20 Kbytes/s)
ftp> quit
221 Goodbye.
$
The commands entered in this session are explained in the following steps. If
some of these steps are not familiar to you, ask your system administrator for help.
1. The command ftp prep.ai.mit.edu, entered at the $ prompt, connects you to the main
Free Software Foundation source repository at MIT.
2. The user ID anonymous tells FTP that you want to perform an anonymous
FTP operation.
3. When FTP asks for a password, enter your user ID and network address (that is,
your e-mail address). This lets the MIT system administrator know who is using the
MIT archives. (For security reasons, the password is not actually displayed when you
type it.)
4. The command cd pub/gnu sets your current working directory to
be the directory containing the Perl source.
5. The binary command tells FTP that the file you'll be receiving
is a file that contains unreadable (nontext) characters.
6. The get command copies the file perl-5.001.tar.gz from
the MIT source repository to your own site. (It's usually best to do this in off-peak
hours to make things easier for other Internet users--it takes a while.) This file
is quite large because it contains all the source files for Perl bundled together
into a single file.
7. The quit command disconnects from the MIT source repository and
returns you to your own system.
After you've retrieved the Perl distribution, take the following steps:
1. Create a directory and move the file you just received, perl-5.001.tar.gz,
to this directory. (Or, alternatively, move it to a directory already reserved for
this purpose.)
2. The perl-5.001.tar.gz file is compressed to save space. To uncompress
it, enter this command:
$ gunzip perl-5.001.tar.gz
gunzip is the GNU uncompress program. If it's not available on your
system, see your system administrator. (You can, in fact, retrieve it from prep.ai.mit.edu
using anonymous FTP with the same commands you used to retrieve the Perl distribution.)
When you run gunzip, the file perl-5.001.tar.gz will be replaced
by perl-5.001.tar, which is the uncompressed version of the Perl distribution
file.
3. The next step is to unpack the Perl distribution. In other words, use the
information in the Perl distribution to create the Perl source files. To do this,
enter the following command:
$ tar xvf - <perl-5.001.tar
As this command executes, it creates each source file in turn and displays the
name and size of each file as it is created. The tar command also creates
subdirectories where appropriate; this ensures that the Perl source files are organized
in a logical way.
4. Using your favorite C compiler, compile the Perl source code using the
makefile for the distribution. (The Perl distribution normally generates this makefile
when you run the Configure script that is unpacked with the source files; see the
README file in the distribution for the exact procedure.)
5. Place the compiled Perl executable into the directory where you normally
keep your executables. On UNIX systems, this directory usually is called /usr/local/bin,
and Perl usually is named /usr/local/bin/perl.
You might need your system administrator's help to do this because you might not
have the necessary permissions.
If you cannot access the MIT site from where you are, you can get Perl from the
following sites via anonymous FTP:
North America

Site                     Internet address    Directory
ftp.netlabs.com          192.94.48.152       /pub/outgoing/perl5.0
ftp.cis.ufl.edu          128.227.100.198     /pub/perl/src/5.0
ftp.uu.net               192.48.96.9         /languages/perl
ftp.khoros.unm.edu       198.59.155.28       /pub/perl
ftp.cbi.tamucc.edu       165.95.1.3          /pub/duff/Perl
ftp.metronet.com         192.245.137.1       /pub/perl/sources
genetics.upenn.edu       128.91.200.37       /perl5

Europe

Site                     Internet address    Directory
ftp.cs.ruu.nl            131.211.80.17       /pub/PERL/perl5.0/src
ftp.funet.fi             128.214.248.6       /pub/languages/perl/ports/perl5
ftp.zrz.tu-berlin.de     130.149.4.40        /pub/unix/perl
src.doc.ic.ac.uk         146.169.17.5        /packages/perl5

Australia

Site                     Internet address    Directory
sungear.mame.mu.oz.au    128.250.209.2       /pub/perl/src/5.0

South America

Site                     Internet address    Directory
ftp.inf.utfsm.cl         146.83.198.3        /pub/gnu

You also can obtain Perl from most sites that store GNU source code, or from any
site that archives the Usenet newsgroup comp.sources.unix.
Now that Perl is available on your system, it's time to show you a very simple
program that illustrates how easy it is to use Perl. The program shown in Listing
29.1 asks for a line of input and writes it.
1: #!/usr/local/bin/perl
2: $inputline = <STDIN>;
3: print( $inputline );
Line 1 is the header comment. Line 2 reads a line of input. Line 3 writes the
line of input back to your screen.
The following sections describe how to create and run this program, and they describe
it in more detail.
To run the program shown in Listing 29.1, carry out the following actions:
1. Using your favorite editor, type the program and save it in a file
called program29_1.
2. Tell the system that this file contains executable statements. To do this
in the UNIX environment, enter the following command:
$ chmod +x program29_1
3. Run the program by entering this command:
$ program29_1
When you run program29_1, it waits for you to enter a line of input.
After you enter the line of input, program29_1 prints what you entered,
as shown here:
$ program29_1
This is my line of input.
This is my line of input.
$
If Listing 29.1 is stored in the file program29_1 and run according to
the preceding steps, the program should run successfully. If the program doesn't
run, one of two things has likely happened:
- The system can't find the file program29_1.
- The system can't find Perl.
If you receive the error message
program29_1 not found
or something similar, your system couldn't find the file program29_1.
To tell the system where program29_1 is located, you can do one of two things
in a UNIX environment:
- Enter the command ./program29_1, which gives the system the pathname
of program29_1 relative to the current directory.
- Add the current directory . to your PATH environment variable.
This tells the system to search in the current directory when looking for executable
programs such as program29_1.
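The search through PATH can be sketched in Perl itself. This is only an illustration of the idea, not a description of how any particular shell performs the lookup. The subroutine name find_in_path is hypothetical, and the use strict, use warnings, and my declarations are additions that the listings in this chapter do not use.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of the first executable file
# named $program in the given directories, or nothing if none is found.
sub find_in_path {
    my ($program, @dirs) = @_;
    foreach my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $program);
        return $candidate if -f $candidate && -x $candidate;
    }
    return;
}

# File::Spec->path returns the directories listed in the PATH variable.
my $found = find_in_path('perl', File::Spec->path());
print defined $found ? "perl found at $found\n" : "perl not found in PATH\n";
```

If the current directory is not one of the directories searched, a program stored there is not found by name alone, which is why ./program29_1 works when program29_1 does not.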
If you receive the message
/usr/local/bin/perl not found
or something similar, Perl is not installed properly on your machine. Refer to
the section "How Do I Find Perl?" earlier in this chapter, for more details.
If you don't understand these instructions or are still having trouble running
Listing 29.1, talk to your system administrator.
Now that you've run your first Perl program, let's look at each line of Listing
29.1 and figure out what it does.
Line 1 of this program is a special line that tells the system that this is a
Perl program:
#!/usr/local/bin/perl
Let's break this line down, one part at a time:
- The first character in the line, the # character, is the Perl comment
character. It tells the system that this line is not an executable instruction.
- The ! character is a special character; it indicates what type of program
this is. (You don't need to worry about the details of what the ! character
does. All you have to do is remember to include it.)
- The path /usr/local/bin/perl is the location of the Perl executable
on your system. This executable interprets your program; in other words, it figures
out what you want to do and then does it. Because the Perl executable has the job
of interpreting Perl instructions, it usually is called the Perl interpreter.
If, after reading this, you still don't understand the meaning of the line #!/usr/local/bin/perl,
don't worry. The actual specifics of what it does are not important for our purposes
in this book. Just remember to include it as the first line of your program, and
Perl will take it from there.
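For readers who want to see the structure of the header line spelled out, here is a minimal sketch that picks the interpreter path out of such a line. On UNIX the operating system does this work when the program is started; the subroutine interpreter_from_header below is a hypothetical helper written only for illustration, and the use strict, use warnings, and my declarations are additions not used in this chapter's listings.

```perl
use strict;
use warnings;

# Hypothetical helper: given the first line of a script, return the
# interpreter path named after "#!", or nothing if the line has no "#!".
sub interpreter_from_header {
    my ($first_line) = @_;
    if ($first_line =~ /^#!\s*(\S+)/) {
        return $1;
    }
    return;
}

my $header = "#!/usr/local/bin/perl\n";
my $interpreter = interpreter_from_header($header);
print "This program asks to be run by $interpreter\n";
```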
NOTE: If you are running
Perl on a system other than UNIX, you might need to replace the line #!/usr/local/bin/perl
with some other line indicating the location of the Perl interpreter on your system.
Ask your system administrator for details on what you need to include here. After
you have found out what the proper first line is in your environment, include that
line as the first line of every Perl program you write, and you're all set.
As you have just seen, the first character of the line
#!/usr/local/bin/perl
is the comment character, #. When the Perl interpreter sees the #,
it ignores the rest of that line.
Comments can be appended to lines containing code, or they can be lines of their
own:
$inputline = <STDIN>; # this line contains an appended comment
# this entire line is a comment
You can--and should--use comments to make your programs easier to understand.
Listing 29.2 is the simple program you saw earlier, but it has been modified to include
comments explaining what the program does.
NOTE: As you create your
own programs--such as the one in Listing 29.2--you can, of course, name them anything
you want. For illustration and discussion purposes, I've adopted the convention of
using a name that corresponds to the listing number. For example, the program in
Listing 29.2 is called program29_2. The program name is used in the input
and output examples such as the one following this listing, as well as in the following
analysis, where the listing is discussed in detail. When you follow the input and
output examples, just remember to substitute your program's name for the one shown
in the example.
1: #!/usr/local/bin/perl
2: # this program reads a line of input and writes the line
3: # back out
4: $inputline = <STDIN>; # read a line of input
5: print( $inputline ); # write the line out
This is the sample input and output of this program:
$ program29_2
This is a line of input.
This is a line of input.
$
The behavior of the program in Listing 29.2 is identical to that of Listing 29.1
because the code is the same. The only difference is that Listing 29.2 has comments
in it.
Note that in an actual program, comments normally are used only to explain complicated
code or to indicate that the following lines of code perform a specific task. Because
Perl instructions usually are pretty straightforward, Perl programs don't need to
have a lot of comments.
NOTE:
Do use comments whenever you think that a line of code is not easy to understand.
Don't clutter your code with unnecessary comments. The goal is readability. If a
comment makes a program easier to read, include it. Otherwise, don't bother.
Don't put anything else after /usr/local/bin/perl in the first line:
#!/usr/local/bin/perl
This line is a special comment line, and it is not treated like the others.
Now that you've learned what the first line of Listing 29.1 does, let's take a
look at line 2:
$inputline = <STDIN>;
This is the first line of code that actually does any work. To understand what
this line does, you need to know what a Perl statement is and what its components
are.
The line of code you have just seen is an example of a Perl statement. Basically,
a statement is one task for the Perl interpreter to perform. A Perl program can be
thought of as a collection of statements performed one at a time.
When the Perl interpreter sees a statement, it breaks the statement into smaller
units of information. In this example, the smaller units of information are $inputline,
=, <STDIN>, and ;. Each of these smaller units of
information is called a token.
Tokens can normally be separated by as many spaces and tabs as you like. For example,
the following statements are identical in Perl:
$inputline = <STDIN>;
$inputline=<STDIN>;
$inputline    =    <STDIN>;
Your statements can take up as many lines of code as you like. For example, the
following statement is equivalent to the preceding ones:
$inputline
=
<STDIN>
;
The collection of spaces, tabs, and new lines separating one token from another
is known as white space.
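The effect of white space can be checked with a short sketch. To keep it runnable without keyboard input, it assigns a fixed string instead of reading <STDIN>; that substitution, the variable names, and the use strict, use warnings, and my declarations are assumptions made for this example.

```perl
use strict;
use warnings;

# Three assignments that differ only in white space.
my $first = "a line of input\n";
my $second="a line of input\n";
my $third
    =
    "a line of input\n"
    ;

# All three variables now hold the same text.
if ($first eq $second && $second eq $third) {
    print "The three statements are equivalent.\n";
}
```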
When programming in Perl, you should use white space to make your programs more
readable. The examples in this book use white space in the following ways:
- New statements always start on a new line.
- One blank space is used to separate one token from another (except in special
cases, some of which you'll see in this chapter).
As you've seen already, the statement
$inputline = <STDIN>;
consists of four tokens: $inputline, =, <STDIN>,
and ;. The following subsections explain what each of these tokens does.
The $inputline and = Tokens
The first token in line 2, $inputline (at the
left of the statement), is an example of a scalar variable. In Perl, a scalar variable
can store one piece of information.
The = token, called the assignment operator, tells the Perl interpreter
to store the item specified by the token to the right of the = in the place
specified by the token to the left of the =. In this example, the item on
the right of the assignment operator is the <STDIN> token, and the
item to the left of the assignment operator is the $inputline token. Thus,
the line read by <STDIN> is stored in the scalar variable $inputline.
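A brief sketch of assignment to a scalar variable follows. It assigns fixed strings rather than reading input so that it runs unattended; the $copy variable and the use strict, use warnings, and my declarations are additions for this example and do not appear in the chapter's listings.

```perl
use strict;
use warnings;

# A scalar variable holds one value at a time.
my $inputline = "first value\n";
print($inputline);

# A second assignment replaces the value stored by the first.
$inputline = "second value\n";
print($inputline);

# Copying to another scalar keeps the value available under a new name.
my $copy = $inputline;
print($copy);
```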
Scalar variables and assignment operators are covered in more detail in Teach
Yourself Perl 5 in 21 Days.
The <STDIN> Token and the Standard Input File
The next token, <STDIN>, represents a line of input from the standard
input file. The standard input file, or STDIN for short, typically contains everything
you enter when running a program.
For example, when you run program29_1 and enter
This is a line of input.
the line you enter is stored in the standard input file.
The <STDIN> token tells the Perl interpreter to read one line from
the standard input file, where a line is defined to be a set of characters terminated
by a new line. In this example, when the Perl interpreter sees <STDIN>,
it reads
This is a line of input.
If the Perl interpreter then sees another <STDIN> in a different
statement, it reads another line of data from the standard input file. The line of
data you read earlier is destroyed unless it has been copied somewhere else.
NOTE: If there are more
lines of input than there are <STDIN> tokens, the extra lines of input
are ignored.
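The way successive reads consume successive lines can be sketched as follows. So that the example runs without keyboard input, it reads from a filehandle opened on a string as a stand-in for the standard input file; that stand-in, the names $typed, $input, and $saved, and the use strict, use warnings, and my declarations are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Stand-in for the standard input file: a filehandle opened on a string.
# (An assumption for this sketch, so it runs without keyboard input.)
my $typed = "This is a line of input.\nThis is another line.\nThis line is never read.\n";
open(my $input, '<', \$typed) or die "Cannot open in-memory input: $!";

my $inputline = <$input>;    # reads the first line
my $saved     = $inputline;  # copy it before it is overwritten
$inputline    = <$input>;    # reads the second line, replacing the first

print("saved:     $saved");
print("inputline: $inputline");
close($input);
```

Because the sketch performs only two reads, the third line of the input is never read, and the first line survives only because it was copied into $saved.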
Because the <STDIN> token is to the right of the assignment operator
=, the line
This is a line of input.
is assigned to the scalar variable $inputline.
The ; Token
The ; token at the end of the statement is a special token that tells Perl that the
statement is complete. You can think of it as a punctuation mark that is like a period
in English.
Now that you understand what statements and tokens are, consider line 3 of Listing
29.1:
print ($inputline);
This statement refers to the library function that is called print. Library
functions, such as print, are provided as part of the Perl interpreter;
each library function performs a useful task.
The print function's task is to send data to the standard output file.
The standard output file stores data that is to be written to your screen. The standard
output file sometimes appears in Perl programs under the name STDOUT.
In this example, print sends $inputline to the standard output
file. Because the second line of the Perl program assigns the line
This is a line of input.
to $inputline, this is what print sends to the standard output
file and what appears on your screen.
When a reference to print appears in a Perl program, the Perl interpreter
calls, or invokes, the print library function. This function invocation
is similar to a function invocation in C, a GOSUB statement in BASIC, or
a PERFORM statement in COBOL. When the Perl interpreter sees the print
function invocation, it executes the code contained in print and returns
to the program when print is finished.
Most library functions require information to tell them what to do. For example,
the print function needs to know what you want to print. In Perl, this information
is supplied as a sequence of comma-separated items located between the parentheses
of the function invocation. For example, the statement you've just seen
print ($inputline);
supplies one piece of information that is passed to print: the variable
$inputline. This piece of information commonly is called an argument.
The following call to print supplies two arguments:
print ($inputline, $inputline);
You can supply print with as many arguments as you like; it prints each
argument starting with the first one (the one on the left). In this case, print
writes two copies of $inputline to the standard output file.
You also can tell print to write to any other specified file.
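The following sketch shows print with more than one argument and print directed to a file other than standard output. The in-memory "file" and the names $captured and $out are assumptions chosen so the example needs no files on disk, and the use strict, use warnings, and my declarations are additions not used in this chapter's listings.

```perl
use strict;
use warnings;

my $inputline = "This is a line of input.\n";

# Two arguments: print writes them in order, left to right.
print($inputline, $inputline);

# Naming a filehandle before the arguments sends the output there
# instead of to standard output. STDERR is the standard error file.
print STDERR ($inputline);

# Output can also go to a filehandle the program has opened. Here the
# "file" is an in-memory string (an assumption made for this sketch).
my $captured = '';
open(my $out, '>', \$captured) or die "Cannot open in-memory output: $!";
print {$out} ($inputline, $inputline);
close($out);
```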
If you incorrectly type a statement when creating a Perl program, the Perl interpreter
detects the error and tells you where the error is located.
For example, look at Listing 29.3. This program is identical to the program you've
been seeing all along, except that it contains one small error. Can you spot it?
1: #!/usr/local/bin/perl
2: $inputline = <STDIN>
3: print ($inputline);
The output should give you a clue.
$ program29_3
Syntax error in file program29_3 at line 3, next char (
Execution of program29_3 aborted due to compilation errors.
$
When you try to run this program, an error message appears. The Perl interpreter
has detected that line 2 of the program is missing its closing ; character.
The error message from the interpreter tells you what the problem is and identifies
the line on which the problem is located.
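The same kind of error can be observed from inside a running program. The sketch below hands Perl a piece of program text with a missing ; and captures the resulting message instead of letting the program stop. It relies on Perl's eval function and the $@ variable; the names $broken_code, $ok, and $error are chosen for this example, and the exact wording of the message varies between Perl versions.

```perl
use strict;
use warnings;

# The same mistake as in the listing: the first statement has no closing ;
my $broken_code = <<'END_OF_CODE';
my $inputline = "some text\n"
print($inputline);
END_OF_CODE

# eval compiles and runs the text; on failure it returns undef and
# places the error message in $@ instead of stopping this program.
my $ok = eval $broken_code;
my $error = $@;

if ($error) {
    print "Perl reported: $error";
}
```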
TIP: You should fix errors
starting from the beginning of your program and working down. When the Perl interpreter
detects an error, it tries to figure out what you meant to say and carries on from
there; this feature is known as error recovery. Error recovery enables the interpreter
to detect as many errors as possible at one time, which speeds up the development
process. Sometimes, however, the Perl interpreter can get confused and think you
meant to do one thing when you really meant to do another. In this situation, the
interpreter might start trying to detect errors that don't really exist. This problem
is known as error cascading. It's usually pretty easy to spot error cascading. If
the interpreter is telling you that errors exist on several consecutive lines, it
usually means that the interpreter is confused. Fix the first error, and the others
might very well go away.
As you've seen, running a Perl program is easy. All you need to do is create the
program, mark it as executable, and run it. The Perl interpreter takes care of the
rest. Languages such as Perl that are processed by an interpreter are known as interpretive
languages.
Some programming languages require more complicated processing. If a language
is a compiled language, the program you write must be translated into machine-readable
code by a special program known as a compiler. In addition, library code might need
to be added by another special program known as a linker. After the compiler and
linker have done their jobs, the result is a program that can be executed on your
machine--assuming, of course, that you have written the program correctly. If not,
you have to compile and link the program all over again.
Interpretive languages and compiled languages both have advantages and disadvantages,
as mentioned here:
- As you've seen with Perl, it takes very little time to get a program running in an
interpretive language, because there is no separate compile-and-link step.
- Interpretive languages, however, cannot run unless the interpreter is available.
Compiled programs, on the other hand, can be transferred to any machine that understands
them.
As you'll see, Perl is as powerful as a compiled language. This means that you
can do a lot of work quickly and easily.
In this chapter you learned that Perl is a programming language that provides
many of the capabilities of a high-level programming language such as C. You also
learned that Perl is easy to use; basically, you just write the program and run it.
You saw a very simple Perl program that reads a line of input from the standard
input file and writes the line to the standard output file. The standard input file
stores everything you type from your keyboard, and the standard output file stores
everything your Perl program sends to your screen.
You learned that Perl programs contain a header comment, which indicates to the
system that your program is written in Perl. Perl programs also can contain other
comments, each of which must be preceded by a #.
Perl programs consist of a series of statements, which are executed one at a time.
Each statement consists of a collection of tokens, which can be separated by white
space.
Perl programs call library functions to perform certain predefined tasks. One
example of a library function is print, which writes to the standard output
file. Library functions are passed chunks of information called arguments; these
arguments tell a function what to do.
The Perl interpreter executes the Perl programs you write. If it detects an error
in your program, it displays an error message and uses the error-recovery process
to try to continue processing your program. If Perl gets confused, error cascading
can occur, and the Perl interpreter might display inappropriate error messages.
Finally, you learned about the differences between interpretive languages and
compiled languages, and that Perl is an example of an interpretive language.
Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.