Perl - Basic Concepts

Perl is also an acronym for Practically Everything Really Likable.

Let's talk about "shell" scripts, and in particular DOS "shell" scripts or "batch" files (although the same concepts apply to UNIX "shell" scripts).  At a DOS root command prompt (assuming C:\ here), enter the following:

C:\>echo date > dosscrip.bat	' write the word "date" into a new file named dosscrip.bat

C:\>echo time >> dosscrip.bat	' append the word "time" to the file created in
				' the step above

C:\>dosscrip			' run the batch file you've just created

As you can see, you've just created and run a DOS "shell" script or batch file which first displays the date and prompts you for a new one (if necessary), and then displays the time and prompts you for a new one (if necessary).  A batch file is basically a sequence of "shell" commands stuffed into a text file.

Similarly, a Perl program is a bunch of Perl statements and definitions placed in a file.  You then type the name of the file at a "shell" prompt.  Bingo, a Perl program!  Perl is written in C and so borrows a lot from the C programming language.  In addition, it was written for the UNIX Operating System and so borrows a lot from UNIX shell scripts.  And, finally, it borrows much from the English language (recall that Larry Wall, the person who originally designed the language and its interpreter, has an academic background in linguistics).
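To make the comparison with the batch file concrete, here is a minimal sketch of a Perl program that also shows the date and then the time.  The helper name date_and_time and the output wording are made up for illustration.  Unlike the DOS date and time commands, this sketch only displays the values and does not prompt for new ones.

```perl
#!/usr/bin/perl
# A rough Perl counterpart to dosscrip.bat: show the date, then the time.
use strict;
use warnings;

# Hypothetical helper: turns a localtime-style list into date and time strings.
sub date_and_time {
    my @t = @_;    # (sec, min, hour, mday, mon, year, ...)
    my $date = sprintf "%04d-%02d-%02d", $t[5] + 1900, $t[4] + 1, $t[3];
    my $time = sprintf "%02d:%02d:%02d", $t[2], $t[1], $t[0];
    return ($date, $time);
}

my ($date, $time) = date_and_time(localtime);
print "The current date is: $date\n";
print "The current time is: $time\n";
```

Saved in a file (say, datetime.pl, a name chosen here only as an example), it would be run by typing perl datetime.pl at the prompt.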

Running a Perl program under the Windows Operating System (O/S) differs in some ways from running it under the UNIX Operating System (O/S), and we'll try to point out the differences as we continue.  One of them is that in UNIX the file itself has to tell UNIX that it is a Perl program.

Most of the time, this step involves placing the line

#!/usr/bin/perl

as the first line of the file.  And since Windows doesn't care about a first line like that, some Perl programmers keep a line like that even in a Perl program written for the Windows O/S!  Then, if they have to port the program over to UNIX, that special "UNIX" line is already in the file!
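As a small sketch of a file written to run in both environments, the program below keeps the "UNIX" first line and then reports which O/S it finds itself on.  The built-in variable $^O holds the name of the operating system Perl was built for (MSWin32 on native Windows builds).  The helper platform_note and its messages are invented for this example, and the path /usr/bin/perl is an assumption about where Perl lives on the UNIX machine.

```perl
#!/usr/bin/perl
# The line above matters to a UNIX shell; on Windows it is harmless.
use strict;
use warnings;

# Hypothetical helper: describes the platform, given an O/S name such as $^O.
sub platform_note {
    my ($os) = @_;
    return $os eq 'MSWin32'
        ? "Windows: run this file as 'perl script.pl'"
        : "UNIX-like ($os): the #! line lets the shell find perl";
}

print platform_note($^O), "\n";
```

On UNIX the file also has to be marked executable (for example with chmod +x) before its name alone will run it.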

Do you have Perl installed?

It's critically important to have Perl installed on your computer before reading too much further.  As you read the examples, you'll want to try them.  If Perl is not already installed, momentum and time will be lost.

It is very easy to see if your system already has Perl installed.  Simply go to a DOS command-line prompt and type:

perl -v

Hopefully, the response will be similar to this:

This is perl, version 5.8.8 built for MSWin32-x86-multi-thread
(with 18 registered patches, see perl -V for more detail)

Copyright 1987-2007, Larry Wall

Binary build 822 [280952] provided by ActiveState Tool Corp. http://www.ActiveState.com
Built Jul 31 2007 19:34:48

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'.  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

If you get an error message or you have version 4 of Perl, please follow the instructions below to get and install Perl yourself.
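Once Perl 5 is installed, the version can also be checked from inside a program.  The sketch below relies on the built-in variable $], which holds the version as a number (for example 5.008008 for Perl 5.8.8).  The helper name is_perl5_or_later and the messages are made up for illustration, and the program itself assumes Perl 5, so a version 4 interpreter would reject it outright.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if a numeric version such as $] is 5 or later.
sub is_perl5_or_later {
    my ($version) = @_;
    return $version >= 5 ? 1 : 0;
}

my $ok = is_perl5_or_later($]);

print "Running Perl ", $], "\n";
print $ok ? "Perl 5 or later: good to go\n" : "Please upgrade\n";
```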

Getting and Installing Perl

New versions of Perl are released on the Internet and distributed to Web sites and ftp archives across the world. 

Each operating system has its own way of getting and installing Perl.

For UNIX and OS/2 - The Comprehensive Perl Archive Network (CPAN) contains software links (http://www.cpan.org/ports/) that will enable you to download the latest Perl source code.  Hopefully, your UNIX system will already have Perl installed.

For Windows 9x/ME/2000/NT - The ActivePerl download page of ActiveState (http://www.activestate.com/Products/ActivePerl/Download.html) contains links to download the most recent build (822 based on Perl 5.8.8) of the x86 Release Binary.

Instructions for compiling Perl or for installing on each operating system are included with the distribution files.  Follow the instructions provided and you should have a working Perl installation rather quickly.

Perl - back to basics

Perl is mostly a "free-form" language (like C) - whitespace (spaces, tabs, newlines, carriage returns, or form feeds) between "tokens" (elements of the program, like print or +) is optional, unless two tokens put together can be mistaken for another token, in which case whitespace of some kind is mandatory.
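The following sketch shows the free-form rule at work.  The variable names are invented for the example; the two assignments differ only in their whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Whitespace between these tokens is optional...
my $tight=2+3*4;

# ...so this spread-out form means exactly the same thing.
my $loose = 2
          + 3
          * 4;

print "tight = $tight, loose = $loose\n";

# Whitespace is mandatory where two tokens would otherwise run together:
# in "sub add_one { ... }" the space after "sub" cannot be dropped, because
# "subadd_one" would be read as one single, different name.
```

Both variables end up holding 14, since multiplication binds tighter than addition no matter how the line is laid out.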

Although nearly any Perl program can be written all on one line, typically a Perl program is indented (much like a C program), with nested parts of statements indented more than the surrounding parts.  A typical indenting style in WordPad would be a quarter-inch tab.  WordPad is the text editor of choice here, unless you're beta testing Visual Perl.  Notepad used with "word wrap" on will mangle a Perl program that is to be run in the UNIX environment, so it's best avoided before you get used to using it.

Perl comments use an indicator right out of UNIX - the # (pound) sign.  Anything from an unquoted # to the end of the line is regarded as a comment.  Perl has no multi-line comment delimiters (nothing like the /* and */ of C), which means that each line of a longer comment has to begin with its own #.
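A short sketch of the comment rules, with variable names chosen only for illustration.  Note the difference between a # that starts a comment and a # that sits inside quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A whole-line comment.
# A longer remark simply repeats the # at the start
# of every line it occupies.
my $count = 3;    # a comment can also trail a statement

# Inside quotes, # is ordinary text, not the start of a comment.
my $label = "item #$count";

print "$label\n";
```

The program prints item #3, which shows that the quoted # stayed part of the string.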

Unlike most shells, the Perl "interpreter" completely parses and compiles a Perl program into an internal format before executing any of it.  This means that you will not get a syntax error from the main program file once it has started running (only code that is compiled later, such as a string handed to eval, can still fail that way), and that the whitespace and comments simply disappear and won't slow the program down.  This compilation phase helps Perl operations execute rapidly once the program is started.  Now, since this compilation does take time, it's inefficient to have a large Perl program that does one small quick task and then exits - the compile-time will dwarf the run-time.
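The parse-everything-first behavior can be observed from inside a running program.  The built-in eval function, when given a string, puts that string through the same compile-then-execute cycle on a small scale.  In the sketch below (variable names are made up for the example), the string holds one valid statement followed by a deliberate syntax error.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $ran = 0;

# The string below is compiled as a unit by eval.  Its first statement is
# valid, but the second one is a deliberate syntax error.
my $code = '$ran = 1; my $oops = ;';

my $result = eval $code;
my $error  = $@;    # eval leaves the compile error message in $@

# Compilation failed, so not even the valid first statement was executed.
print "ran = $ran\n";
print "compile error caught\n" if !defined $result && $error;
```

Because the string never finished compiling, $ran is still 0 afterwards.  A shell script with the same mistake would typically have run its first command before tripping over the second.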

So Perl is like a compiler and an interpreter.  It's a compiler because the program is completely read and parsed before the first statement is executed.  It's an interpreter because there is no object code sitting around filling up disk space.  In some ways, it's the best of both worlds.

Hello World

We're going to hack on a small application.  The explanations during this stroll are very brief, if you don't click on the links!  If you do click on the links, then each subject area is discussed in much greater detail.

But before we begin hacking, let's do an obligatory "Hello, world" program.  Rather straight-forward in Perl.  Enter the following lines into WordPad:

#!/usr/bin/perl
# This program prints "Hello, world" to the standard output (the screen)
print ("Hello, world\n");

Save this file as hello.pl in a sub-folder to your Perl install named "cgi-bin", and at a DOS command prompt enter perl [drive]:\perl\cgi-bin\hello.pl (where [drive] is the drive on which perl is installed on your machine) to run the program.

The first line above is the incantation that tells the UNIX O/S that this is a Perl program.  Strangely enough, it is also a comment (it begins with #), but unlike all other comments in a program, the one on the first line is special: Perl looks on that line for any optional arguments.  Remember that this is not a required line when running under Windows.

The second line above is a program comment.  Get used to including comments in your programs (even simple ones like the above) because there's no telling what you're going to remember about the Perl programming language six months or a year down the road.  Besides, what if you're a newbie to Perl and you are given a more complex Perl program to look at because the person who wrote it originally is out sick or no longer with your company?  Wouldn't you like to have a "coach" to tell you what in the heck is going on in that program?

The third line above is the entire executable part of this program.  There's the print command - a built-in function which has, in this case, just one argument, a C-like text string.  Within this string, the character combination \n stands for the newline character (hex 0a; when printing in text mode on Windows, Perl writes it out as hex 0d0a).  Finally, there's the terminating ;.  As in C (and Java, but unlike BASIC and MPL), all simple statements in Perl end with a semicolon!  Remember to keep your "p's and q's," or, in this case, "semicolons," straight!
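Here is a small sketch of how \n behaves inside a string, using variable names invented for the example.  It contrasts the double-quoted string used in the program above with a single-quoted one, where the backslash and the n are kept as two ordinary characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $greeting = "Hello, world\n";    # double quotes: \n becomes a newline
my $literal  = 'Hello, world\n';    # single quotes: backslash and n stay as typed

print $greeting;
print $literal, "\n";

# length counts characters: 12 for "Hello, world" plus 1 for the newline,
# versus 12 plus 2 for the literal backslash and n.
print length($greeting), " ", length($literal), "\n";
```

Each statement ends with its own semicolon, and the last line printed is 13 14.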

When this program is run, the Windows O/S kernel fires up the Perl interpreter, which parses the entire program (all three lines of it, counting the "comment" lines) and then executes the compiled form.  The first and only operation is the execution of the print function, which sends its arguments to, in this case, standard output (the screen).  After the program has completed, the Perl process ends, and returns a successful exit code to the O/S kernel.

OK!  Now here's another interesting thing about Perl.  You may see Perl programs where the print function (and other functions) is sometimes called with parentheses and other times without them.  So much for consistency!  There's a simple rule (based on the slogans "make easy things easy and hard things possible" and "there's more than one way to do it") - parentheses for built-in functions are never required nor forbidden.  Their use can help or hinder clarity, so use your own judgement.
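The sketch below calls built-in functions both ways.  The variable names are made up for the example; print, split, join and reverse are all built-in Perl functions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The same built-in function, called with and without parentheses.
print("Hello, world\n");
print "Hello, world\n";

# Parentheses can make the grouping explicit when calls are nested.
my @words  = split(/,\s*/, "Hello, world");
my $joined = join " and ", reverse @words;

print "$joined\n";
```

The first two statements print the same line.  The last one prints world and Hello, and whether the join call would read better with parentheses is exactly the kind of judgement call the rule leaves to you.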

Well, that first Perl program is a touch cold and inflexible.  In the next lesson we're going to add a bit more sophistication - have the program call you by your name!  And, your homework assignment, should you decide to accept it, is to consider the programming elements that you would need to accomplish this task.  Hint: Probably three elements, one of which is "a way to ask for your name"!  This tape will self-destruct in 30 seconds . . . .