Chapter 19

How to Build Pages On the Fly




Usually an HTML document is written once and maintained forever. At most, a few pieces may change based on server-side includes (SSI). But sometimes you want to build the page on the spot, in response to a user's database query, real-time events, or other factors.

This chapter begins with a description of why a user might want to build pages in this way, and a general description of how to do it.

We also go into more detail on the different ways to build a page on the fly, ranging from the simplest to the most complex. Finally, you learn how to add a CGI Sandwich to the search engine of the example site. A CGI Sandwich is a document with three parts: a static top, a middle generated at runtime, and a static bottom.

The CGI sandwich is useful when you want to have two variants of a page. For example, you might want a (static) search form that can be called up whenever someone wants to search. Then you might want a search form that is called from the search results page, which uses the search results to tune the parameters of the next search. This second page would be nearly identical to the first, with some of the INPUTs in the middle preset to different options. The top and bottom come directly from the static search form, while the INPUTs are defined at runtime.

The Philosophy of Creating Web Pages Dynamically

Understanding what "pages on the fly" is about is a question of how you look at the world, deciding if the glass is half full or half empty. Is it partly sunny or partly cloudy? Are HTML pages static or dynamic? Are pages made "on the fly" static, real documents or fleeting compositions that disappear as soon as you switch to a new URL?

HTML documents are indeed static: they remain on the server and do not change unless a builder or graphics person needs to update information contained in the page. The act of updating the information involves editing the page by using an authoring tool to change the contents of the HTML document. HTML documents are maintained for as long as they exist on the server.

Building pages on the fly is slang for generating HTML via a CGI script. CGI script programming is the core of almost all the applications presented in this book. Pages built by CGI programming do not exist as files on the server, though. Now, that might sound really strange or make perfect sense. How could a page in the site not exist?

CGI scripts generate "pages on the fly." The page is created on the spur of the moment, when a user does something in the site. Although we are led to believe that the Internet is about simple shrink-wrapped tools found on the shelves of software stores, surfing the Internet is precarious business. Generating pages "on the fly" is not as carefree as the phrase suggests. There is nothing random about dynamic page generation. But it is just as fun as surfing the Web.

The pages appear like magic: a generated page is downloaded to the client just as any static page is. Web browsers are client programs that do interesting things. We have browsers that can read e-mail and UseNet groups, and even play sound and video. But behind all those features, the browser is an HTML viewer. The browser accepts all kinds of documents: GIFs, Shockwave pieces, sound files, and HTML files. Pages generated by CGI scripts are made on the fly the instant the user loads a CGI script URL.

Once again, "pages on the fly" really means that you are generating pages with CGI scripts and programs. The main function of CGI programming is to generate pages. Write that on a card and tape it to your computer monitor.

Why Dynamic Page Generation Is Needed

Dynamic page generation is needed to build pages depending on inputs given by the user and random situations raised by other users. For example, when Web chat environments are implemented, there are no resources to create static HTML files for all possible conversations. The users who visit the Web chat environment create the messages. The creators need to reorganize those messages in a way that all can view them.

We need dynamic page generation to build pages that serve the user's requests. A search tool asks for input, a script or program finds the matches for your query, and the tool then displays the results.

We need dynamic page generation to perform system-level duties on a Web server without giving out shell access to that server. Checking the run status of the Web server, restarting it, or stopping it for maintenance can all be done with dynamic page generation techniques.

Dynamic page generation is needed for any situation where the user doesn't know what to expect. A Web chat environment is an online place where users talk to one another. Dynamic page generation builds the pages for the users depending on what happens in the chat environment. A searching tool generates matches of any length depending on what the search criteria were.

When a page is generated dynamically, it's not apparent just before that process what the page will end up looking like. The process itself of generating pages is dynamic. But, the scope of the content should be expected. For example, if a search tool is built to find articles in a newspaper database, that database would be considered a "closed system." It is expected that the articles would be formatted a certain way and the layout of the search results also formatted in a regular way. The user may not expect what is returned, but the tools used to generate the output must be aware of all possibilities dictated by the "schema" of the newspaper database.

Locating CGI Scripts

The tools you write (and the tools we present in this book) live on the Web server in several places. The main street in CGI town is a directory usually referred to as cgi-bin. The cgi-bin directory is most often located within the ServerRoot of the Web server. Check out the chapter on Web server setup for background on that topic, but come right back!

The CGI scripts that generate pages are located on the Web server, usually in a place called:

/cgi-bin

Although there is no rule where you keep your scripts, the standard usage is to place scripts in /cgi-bin. The configuration files for the Web server specify where your CGI scripts are located. The scripts on the server are stored in a directory "out of reach" of the user. In other words, we don't want to keep CGI scripts in a directory a user can browse. CGI scripts may contain secret information. Only the effect generated by CGI scripts should be visible to the user.

The Dual Purpose of CGI Scripts

Creating pages on the fly introduces two personalities to a CGI script. CGI scripts are programs first; they are written in programming languages that perform logical steps leading to an end. The "end" is the generation of HTML. The HTML they generate is the other personality, or job, of CGI scripts.

Not all CGI scripts perform system tasks and generate HTML at the same time. During the process of generating HTML, CGI scripts have the liberty of performing tasks other than generating HTML. But CGI scripts are written for specific applications. Sometimes that involves only generating HTML, while other times CGI scripts perform many "system level" tasks and only acknowledge the user with a quick link:

"Click here to continue"

Usually, though, CGI scripts are required to do a little of both. They perform system tasks to manage information hidden from the user, and use that information to create HTML pages.

For example, a lot of sites have "hit counters" on them. These are small scripts that generate a block of output. The output is a visible representation of the number of times that particular page has been visited. Some sites generate graphical numbers much like the numbers on your car's odometer. Others just generate text to be inserted into the document. SSI (server-side includes) is a good way to actually implement one of these counters.

The effect of the script is to generate a number. The only goal of the counter script is to generate a number one larger than the one before. First, you need to store the last number somewhere. The script has to open a file and read it to get that number. Then the logic of the script takes over and increments the number. Almost done, the script formulates the textual display and prints the new number. This is the data gathered by the Web server while it parses the SSI page. Finally, the script needs to write the new number back to the system for the next time the page is "counted," which is another system task. So, just for the page counter, at least three system tasks are performed (opening, reading, and rewriting the file that holds the last counter value), plus one "page generation" task: displaying the new count value.

Here is a portion of an HTML file using SSI to count pages:

This page has been read 
<!--#exec cmd="/var/web/book/bin/counter home_page"  -->
times.

The HTML page is static. It will always contain the text "This page has been read…" The data generated "on the fly" is the textual message of the value of how many times the page has been visited.

The program "counter" will compute that value and return the number. When this HTML page is loaded, it generates output like:

This page has been read 10332 times.
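The counter program itself is not listed here, but the steps described above can be sketched roughly as follows. The name next_count, the location of the counter file, and the absence of file locking are all assumptions made for this illustration, not details of the actual counter program.

```perl
#!/usr/local/bin/perl
# Sketch of a hit counter. next_count() and the counter file path are
# hypothetical names chosen for this example.

sub next_count {
    my ($file) = @_;
    my $count = 0;

    # System tasks 1 and 2: open the file and read the last value, if any.
    if (open(COUNT, "< $file")) {
        $count = <COUNT>;
        close(COUNT);
        $count = 0 unless defined $count;
        chomp($count);
        $count = 0 unless $count =~ /^\d+$/;
    }

    $count++;

    # System task 3: write the new value back for the next visit.
    open(COUNT, "> $file") || die "Cannot write $file: $!";
    print COUNT "$count\n";
    close(COUNT);

    return $count;
}

# Page generation task: print the number for the server to insert.
# The page name arrives as the first argument, as in the SSI line above.
print next_count("/tmp/$ARGV[0].count") if @ARGV;
```

Two visitors arriving at the same moment could both read the same value, so a counter on a busy site would also need to lock the file while it reads and rewrites it.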

This is a custom page. It is generated on the fly. The user visited the page and caused the script to be invoked to update the hit count and the entire effect created a new page, one that did not exist before. Using SSI is one form of dynamic page generation. This chapter is about creating HTML on the fly so we'll refocus to traditional CGI programming techniques.

The template of the page is in the static HTML file, but the content is not static; it is ever changing. Do you have a site bookmarked in your browser with a hit counter on it? Go there and reload the page over and over. Watch the number increase. If your browser caches data, the number may not increment because your browser will not actively reload the page. But, in general, going to dynamic pages causes CGI scripts to run and generate HTML.

The issue of formats and templates has more to do with SSI than with pure dynamic page generation, but there are always places where generating pages on the fly can enhance a site's content.

How to Build Dynamic Pages

The plain vanilla approach to building dynamic pages starts with a CGI script. Let's use this one for our skeleton CGI script:

#!/usr/local/bin/perl
require 'web.pl';
%Form = &getStdin;
&beginHTML;
# end of script

First, we've decided to write the skeleton in Perl. Perl is a great language for CGI scripts because Perl scripts don't need to be compiled and the language has many features that make CGI scripts work well. Some scripts you find on the Internet might start with

#!/usr/bin/perl

It's still a Perl script; the only difference is that the Perl program is located at /usr/bin/perl rather than /usr/local/bin/perl. If you are not responsible for installing programs on the Web server, consult with the local guide or system administrator to find out what the correct path name is for Perl. There is a chapter of this book that deals with the installation and configuration issues of Perl because it is so often used as a CGI programming language.

We've set this script up to be run as a Perl script by the first line:

#!/usr/local/bin/perl

That instructs the operating system to run this script with the program /usr/local/bin/perl.

Next, we do some housekeeping and include a library called web.pl. We use require to include the Perl library in this script. web.pl is a Perl library that contains functions used in almost all CGI programs.

We could have simply inserted the contents of the Perl library into the CGI script, but as you write more CGI scripts you'll find that sometimes you want to improve or revise your common functions. Putting them in a library minimizes the amount of inconsistent code you create. When you refer to a function like getStdin, it is the same function as long as it came from the web.pl library. If you found an improved version somewhere or revised it yourself then you can just change the code in one location.

With your own Web library, you're creating an API for all your future CGI scripts. This will come in handy as your projects get bigger and involve many scripts to support just one component of your Web site.

One thing to consider with libraries, though, is that because the functions used by CGI scripts are sourced from the library, changing the interface to a function (the number of arguments, the types of arguments, and so on) can be really dangerous. For example, a function to format a chat message wants the user's name, then the e-mail address, then the message itself. Let's say we move the display function for transcript page generation into a library and then decide to add a new argument to the display routine. If we don't take the existing usage into consideration, we will break the CGI scripts that call it (or else they'll do unexpected things). CGI scripts should never do unexpected things. As we mentioned previously in the newspaper article searching example, the CGI script and the library functions it uses should be aware of the data they manipulate.

A flat HTML file doesn't have that problem. The HTML contained in the file doesn't change and there are no "other cases" to deal with. A CGI script generating HTML on the fly (especially one that performs logic on inputs to selectively generate HTML) needs to "seal up HTML leaks." In other words, the script that generates HTML dynamically must be written so that all possible outcomes based on the logic of the script are accounted for.

So, the best advice is to use the library to store commonly used functions and make changes easier (just changing one function affects all the scripts).

After the require statement, all the instructions in the library are executed. Functions are defined, variables are set.

If you are writing CGI scripts to generate HTML that need to be portable, need to be moved to other machines, then consider using the library to store default path information. For example, a ServerRoot on the native server could be:

/var/web/default

If the whole set of CGI scripts needs to be moved to another server and installed, define a variable to store the path of the ServerRoot. If a CGI script needs to access a file based off the ServerRoot, instead of hard-coding that into the script, the script should use the "global" variable defined in the library.
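As a rough illustration of that idea, the sketch below keeps the path in one variable and builds every file name from it. The names $ServerRoot and serverPath, and the file data/soups.txt, are hypothetical; they are not taken from web.pl.

```perl
# Sketch of the idea only. In practice the first part would live in the
# shared library; $ServerRoot and serverPath() are hypothetical names.

# --- library part: the one place that knows where the server lives ---
$ServerRoot = '/var/web/default';

sub serverPath {
    my ($relative) = @_;
    return "$ServerRoot/$relative";
}

# --- script part: build paths from the variable, never hard-code them ---
print &serverPath('data/soups.txt'), "\n";
```

Moving the scripts to another server then means changing the one assignment to $ServerRoot rather than hunting through every script.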

After the library is loaded, the CGI script begins the generic phase of performing logical instructions toward the end of generating HTML.

In our skeleton CGI script, we make a statement:

%Form = &getStdin;

This statement assigns the return value of the function getStdin to the associative array %Form.

In our applications, we put the function getStdin in the web.pl library because we use that function in almost every CGI script that accepts input from a user.

It's called getStdin for two reasons: first, it "gets" information; second, the information it gets comes from stdin. Where does it get information from? Stdin?

Where Data Comes From

Well, we don't really know until we look at all the options available. We should pause here for a minute and take a close look at what happens when you attempt to send data to a CGI script. There are several methods for sending data to a CGI script; the two common methods are POST and GET.

The GET Versus POST Analogy

Pretend you are in your car and you go to the drive-up teller at your bank. The teller behind the window is the CGI script and you are the Webmaster deciding if you should use GET or POST. First, GET and POST work in the same direction. You always are sending things to the teller. We aren't interested yet in what the teller sends to you. Pretend even more that this teller doesn't greet you with "Hello." Her sole purpose is to accept the information you send her, that's it. Which method do we use to send information?

GET and POST are methods for sending data. The word "Get" might make you think that GET and POST work in opposite directions. They don't. Ok, so you are at the teller booth and the plastic cylinder is there by your window. You also see the microphone. There are two ways to send "information" to the teller. You can speak into the microphone, or you can stuff things into the cylinder. The analogy to CGI programming is you can send data to CGI scripts (the teller) by the GET method (putting things in the cylinder) or POST method (speaking into the microphone).

There is a reason why GET is associated with "stuffing things into the cylinder" and POST refers to "speaking into the microphone." Let's look at GET first.

The GET method makes the data you send the CGI script visible to the user, because data sent via GET is passed in the URL.

http://www.mcp.com/cgi-bin/search.cgi?topic=boats

This is how data is sent using the GET method.

You can see the data; it's in the URL. It's visible, just like the things you stuff in the plastic cylinder. You can see your deposit slip, the pen, even the numbers you wrote on the slip. Sending the cylinder is not enough to complete the process, though: the teller has to notice the cylinder and take it in her hands. Receiving the data is an important step. Only after she receives the cylinder can the teller tell whether any information is present, even before she knows what it is. If the cylinder is empty, there is no information there. If the cylinder is not empty, there is information to handle.

The POST method of sending information to the CGI script is like speaking into the microphone. You cannot "see" the words, but they are transmitted just the same. The teller receives the information you give via the microphone no matter what. She can ignore you or she can listen carefully. It makes no difference, she cannot avoid receiving the information you pass through the microphone. By speaking into the microphone, she automatically receives it. She doesn't have to wait (even intentionally) to notice what you send her via the microphone.

On the other hand, the plastic cylinder can sit there unnoticed by the teller. She has to detect that the cylinder is present before she can ascertain whether it holds any information. It can seem like a childish way to analyze POST and GET, but if we think of GET and POST this way we can visualize the way data is passed to CGI scripts much more easily.

We said that data passed using GET is visible and that data passed using POST is not.

Once the script has the data in hand, the difference doesn't matter: the data is treated the same regardless of the method used to send it. But to the user, there are visible differences. If data is sent to a CGI script using the GET method, then all the variables and data are part of the URL to that CGI script:

/cgi-bin/test/useGet.cgi?x=10&name=Jeff

If the method used to send data to the CGI script is POST then the data sent to the CGI script does not appear in the URL. It is data read from stdin.
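To make the GET case concrete, here is a small sketch that assembles a URL like the one above from name/value pairs. The helpers urlEncode and getUrl are hypothetical names invented for this example and are not part of web.pl; the encoding rule shown (a space becomes +, other special characters become %XX) is the usual one for form data.

```perl
# Hypothetical helpers for illustration; they are not part of web.pl.

# Encode anything other than letters, digits, '-', '_' and '.' as %XX,
# and a space as '+', the way a browser encodes form data.
sub urlEncode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.\- ])/sprintf("%%%02X", ord($1))/ge;
    $text =~ s/ /+/g;
    return $text;
}

# Join name/value pairs with '=' and '&' and append them after a '?'.
sub getUrl {
    my ($script, @pairs) = @_;
    my @parts;
    while (@pairs) {
        my $name  = shift @pairs;
        my $value = shift @pairs;
        push(@parts, urlEncode($name) . '=' . urlEncode($value));
    }
    return $script . '?' . join('&', @parts);
}

print getUrl('/cgi-bin/test/useGet.cgi', 'x', 10, 'name', 'Jeff'), "\n";
```

With POST, the same name=value&name=value string is sent, but it travels in the body of the request instead of in the URL.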

Data streams into and out of a CGI script. Input data is read from a "file" called stdin, and output data is written to stdout (see Fig. 19.1). The names stdin and stdout are completely analogous to the stdin and stdout used when talking about C programs.

Figure 19.1: The CGI script reads data from an input stream (stdin), and writes out to stdout.

Data sent to the CGI script using the POST method is read by the CGI script by reading from stdin.

In Perl:

read(STDIN, $buffer, 256);

This reads up to 256 bytes of data from STDIN and stores whatever it reads in the scalar variable $buffer.

So, now we know the difference between GET and POST. This distinction is really important so please reread the previous section if the concept is still fuzzy.

CONTENT_LENGTH, QUERY_STRING, and Environment Variables

All data sent to a CGI script comes neatly packaged. The mechanism of reading from stdin for the POST method borrows from the C programming model. The way all data is sent to a CGI script is also modeled after the C programming model.

The package containing all the data sent to a CGI script initially is wrapped into one large set of variables. These variables are environment variables. Listing 19.1 is a short Perl script that you can run from the command line to show what environment variables are.


Listing 19.1  myEnvironment.pl - A Perl Script to Inspect the Environment Variables

#!/usr/local/bin/perl

foreach $variable (keys %ENV) {
   print "$variable is set to: $ENV{$variable}\n";
}

If you create a file called myEnvironment.pl with the preceding Perl code and type

perl myEnvironment.pl

you'll see one line per environment variable: the name (probably in capitals) on the left, followed by "is set to:" and the value on the right. The order of the lines may differ.

EDITOR is set to: vi
EXINIT is set to: se wrapmargin=3  sm
HOME is set to: /export/home/jdw
HZ is set to: 100
LD_LIBRARY_PATH is set to: /export/home/oracle/lib
LOGNAME is set to: jdw
MAIL is set to: /var/mail/jdw
MORE is set to: -c
OPENWINHOME is set to: /usr/openwin
ORACLE_HOME is set to: /export/home/oracle
ORACLE_SID is set to: free
ORGANIZATION is set to: FreeRange Media
PATH is set to: /usr/bin:/usr/etc:/usr/local/bin:/usr/local:/usr/lang:/opt/gnu/bin:/usr/ccs/bin:/usr/ccs/lib:/export/home/jdw:/export/home/jdw/bin:/usr/openwin/demo:/usr/openwin/bin:/usr/openwin/bin/xview:.:/var/oracle/bin:/export/home/oracle/bin
PRINTER is set to: lpr
PWD is set to: /export/home/jdw/bookweb/cgi-bin
SHELL is set to: /bin/csh
TERM is set to: vt100
TZ is set to: US/East-Indiana
USER is set to: jdw

When data is sent to a CGI script, it's stored in various environment variables. The type of browser used and the IP address of the client's machine are sent along; these are free, unsolicited pieces of information that the CGI script gets whenever any data is sent to it. Among the environment variables that store the browser type, the IP addresses of the client and the server, and so forth, there are a few very special environment variables directly related to our friends GET and POST.

The CGI script wants to know whether the data you sent used the GET or the POST method. Like our bank teller example, we know that if data is sent via POST, the CGI script cannot avoid listening for it. If the data is passed via GET, the CGI script cannot avoid seeing it.

Here's the quick way to figure out how the data was sent:

  1. If the environment variable CONTENT_LENGTH is equal to a non-zero value, that means the data is sent via POST. CONTENT_LENGTH is the number of bytes the CGI has to read from stdin to get every last byte of data. Remember:
    read(STDIN, $buffer, 256);

  2. Replace 256 with CONTENT_LENGTH and $buffer fills up with exactly all the data sent using the POST method.
    In Perl:
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});

  3. On the other hand, if the environment variable QUERY_STRING is not null, that means the data is sent via GET. QUERY_STRING is the variable containing the data sent to the CGI script. It actually looks suspiciously similar to what is after the ? mark in the URL:
    http://www.mcp.com/cgi-bin/foo.cgi?topic=boats


    QUERY_STRING is equal to topic=boats.
    To illustrate what environment variables are passed to CGI scripts, here is a simple CGI to dump the environment variables:
    #!/usr/local/bin/perl
    
    print "Content-type: text/html\n\n";
    
    foreach $variable( sort keys %ENV) {
      print "$variable:  $ENV{$variable}<br>\n";
    }

  4. Call this script printEnv.cgi and place it in your cgi-bin directory. Be sure to change the mode to 755:
    chmod 755 printEnv.cgi

  5. Then point your browser to that CGI script:
    http://your.server/cgi-bin/printEnv.cgi


    The output (the resulting page) should look something like Figure 19.2.

    Figure 19.2: The output from printEnv.cgi shows all the environment variables of the shell that invoked the Web server process.

    If you ran the CGI script from the UNIX shell prompt, the output will be very similar to the following from env-out.txt:

    <HTML>
    <TITLE>Environment Variables</TITLE>
    <BODY bgcolor=ffffff>
    
    
    
    DOCUMENT_ROOT:  /t2/home/jdw/bookweb/htdocs<br>
    GATEWAY_INTERFACE:  CGI/1.1<br>
    HTTP_ACCEPT:  image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*<br>
    HTTP_CONNECTION:  Keep-Alive<br>
    HTTP_HOST:  www.mcp.com:9888<br>
    HTTP_REFERER:  http://www.mcp.com:9888/ch19/listing2.html<br>
    HTTP_USER_AGENT:  Mozilla/2.0 (Win95; I)<br>
    KEY:  83139635424884<br>
    PATH:  /bin:/usr/bin:/usr/ucb:/usr/bsd:/usr/local/bin<br>
    QUERY_STRING:  <br>
    REMOTE_ADDR:  192.187.229.43<br>
    REMOTE_HOST:  192.187.229.43<br>
    REQUEST_METHOD:  GET<br>
    SCRIPT_NAME:  /cgi-bin/printPrintEnv.cgi<br>
    SERVER_NAME:  www.mcp.com<br>
    SERVER_PORT:  9888<br>
    SERVER_PROTOCOL:  HTTP/1.0<br>
    SERVER_SOFTWARE:  NCSA/1.4<br>


    The default method for sending data to CGI scripts is GET. We can examine the environment variables to get all the information possible. The QUERY_STRING variable is null because no "form data" was passed.
  6. Return to the browser and go to this URL:
    http://your.host/cgi-bin/printEnv.cgi?x=10


    The resulting page contains the same lines, except for the QUERY_STRING variable:
    QUERY_STRING: x=10


    The data passed to the CGI is stored in QUERY_STRING if the method used is GET.
  7. Some HTML forms use the POST method instead:
    <form method="POST" action="/cgi-bin/printEnv.cgi">
    <input name="x">
    <input type="submit" value="Go">
    </form>

  8. Create an HTML file in your document root called env.html and make it a form like the one in Step 7.
  9. Go to the URL:
    http://your.server/env.html

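  10. Put some text in the text box and press "Go." The resulting page lists the same environment variables, but REQUEST_METHOD is now POST and CONTENT_LENGTH holds the number of bytes sent. To see the posted data itself, use a script like printPost.cgi, which reads the data and prints it: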
#!/usr/local/bin/perl


@INC = ('../lib', @INC);

require 'web.pl';
%Form = &getStdin;

&beginHTML('Environment Variables', 'bgcolor=ffffff');


print "CONTENT_LENGTH = ", $ENV{'CONTENT_LENGTH'},"<p>\n";

print "Data passed via POST:<p>\n";


print "<h1>The %Form variable</h1>\n",
      "%Form = (";

@ks = keys %Form;

for($i=0;$i<$#ks;$i++) {
   print "\'$ks[$i]\', \'$Form{$ks[$i]}\',<br>\n";
}
print "\'$ks[$#ks]\', \'$Form{$ks[$#ks]}\');<br>\n";

The following is the output from printPost.cgi:

<HTML>
<TITLE>Environment Variables</TITLE>
<BODY bgcolor=ffffff>



CONTENT_LENGTH = 19<p>
Data passed via POST:<p>
<h1>The %Form variable</h1>
%Form = ('aVariable', 'some data');<br>

Whenever you want to use the POST method, you need some sort of HTML form in front of the script. Unlike GET, POST doesn't pass data in the URL, so it's difficult to simulate without a form that specifies the POST method, as shown in the preceding code.

The HTML form used by printPost.cgi is in Figure 19.3.

Figure 19.3: The data you send via POST is specified as "POST" in the HTML form that collects the data for the CGI.

Processing the Data

We've spent a lot of time at the teller window now, getting practice sending cylinders or talking to the teller. It's time to look at what happens after CGI scripts have received the information.

We are still dealing with our function getStdin. It takes the data from QUERY_STRING or reads it from stdin, depending on which method was used to send it, and breaks the raw data down into a simple associative array.

%Form = &getStdin;
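The source of getStdin is not reproduced in this section, so the following is only a sketch of how a function of this kind might work. The names readRawData and parseFormData are hypothetical, and the decoding rules assume the standard form encoding (pairs joined by &, names and values joined by =, spaces sent as +, and special characters sent as %XX).

```perl
# A sketch only: the real getStdin in web.pl is not reproduced here.
# readRawData() and parseFormData() are hypothetical names.

# Fetch the raw, still-encoded data from wherever the method put it.
sub readRawData {
    my $raw = '';
    if (defined $ENV{'REQUEST_METHOD'} && $ENV{'REQUEST_METHOD'} eq 'POST') {
        read(STDIN, $raw, $ENV{'CONTENT_LENGTH'} || 0);   # POST: stdin
    } elsif (defined $ENV{'QUERY_STRING'}) {
        $raw = $ENV{'QUERY_STRING'};                      # GET: the URL
    }
    return $raw;
}

# Split "name=value&name=value" into pairs and undo the URL encoding.
sub parseFormData {
    my ($raw) = @_;
    my %form;
    foreach my $pair (split(/&/, $raw)) {
        next if $pair eq '';
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # '+' means a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX means one byte
        }
        $form{$name} = $value;
    }
    return %form;
}

%Form = parseFormData('entre=Caesar+Salad&sides=bread');
print "$Form{'entre'} with $Form{'sides'}\n";
```

In a running CGI script the call would be parseFormData(readRawData()). Note that this sketch keeps only the last value when a name appears more than once.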

Consider the HTML form in Listing 19.2, shown in Figure 19.4.

Figure 19.4: The order page from the OK Hotel Café to choose an entree or soup of the day.

Listing 19.2  okHotel.html - Sample Ordering Form for the OK Hotel Café

<HTML>
<TITLE>Chapter 19 Programs for Webmaster Expert Solutions</TITLE>
<BODY BGCOLOR=FFFFFF>
<H1>Chapter 19</H1>


  <h1>OK Hotel Restaurant</h1>
  <h2>Located in Historic Pioneer Square</h2>
<form method="post" action="/cgi-bin/chooseEntre.cgi">
Please place your order and then press "done".
<P>
<input name="entre" value="Route 66 Burrito" type="radio">
Route 66 Burrito <BR>
<input name="entre" value="Caesar Salad" type="radio">
Caesar Salad <BR>
<input name="entre" value="Soup of the Day" type="radio">
Soup of the day:

<!--#exec cmd="/export/home/jdw/bookweb/bin/querySoup.pl" --> 


<BR>
Sides:
<P>
<input name="sides" value="fries" type="radio">
Fries <BR>
<input name="sides" value="pasta" type="radio">
Pasta <BR>
<input name="sides" value="bread" type="radio">
Bread Sticks <BR>
<P>
<input type="submit" value="done">
</form>
<HR>
<A HREF="../index.html">Home</A>
</BODY>
</HTML>

The SSI in this page runs a small script (see Listing 19.3) that gets the date and figures out what the soup of the day is. Like the café, it also chooses two other soups at random in case the soup du jour isn't what you want.


Listing 19.3  querySoup.pl - Picks the Soup du Jour at the OK Hotel Café

#!/usr/local/bin/perl

srand(time|$$);

@dat = localtime(time);

@soups = ('Tomato', 
          'Clam Chowder', 
          'French Onion', 
          'Cream of Broccoli',
          'Vegetable',
          'Navy Bean',
          'Chicken Noodle');

@others = ('Chili',
           'Split Pea',
           'Fish Stew',
           'Won Ton');

print "<select name=\"soup\">\n",
      "<option value=\"$soups[$dat[6]]\">$soups[$dat[6]]\n";

for($i=0;$i<2;$i++) {
   # Pick a random soup and remove it from the list so it cannot repeat.
   ($pick) = splice(@others, int(rand(@others)), 1);
   print "<option value=\"$pick\">$pick\n";
}

print "</select>\n";

If a customer picked the Caesar salad and bread, this is what the data returned by &getStdin looks like (see Fig. 19.5):

Figure 19.5: The form data obtained from the user when he picked his entree from the menu.

%Form = ('entre', 'Caesar Salad',
         'sides', 'bread');

Or another way to look at it:

$Form{'entre'} is equal to "Caesar Salad";

and

$Form{'sides'} is equal to "bread";

The way to organize the data received by the CGI script is in an associative array. It's a very easy storage facility to work with.
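To show why the associative array is convenient, here is a hypothetical sketch of what a script such as chooseEntre.cgi might do with %Form. The actual script is not shown in this section, so the function name orderPage and the wording of the page are assumptions.

```perl
# Hypothetical sketch of what a script like chooseEntre.cgi might do with
# %Form; the real script is not shown in this section.

sub orderPage {
    my (%form) = @_;
    # Fall back to a default when the customer left a choice blank.
    my $entre = defined $form{'entre'} ? $form{'entre'} : 'nothing';
    my $sides = defined $form{'sides'} ? $form{'sides'} : 'no side';
    return "<HTML>\n" .
           "<HEAD><TITLE>Your Order</TITLE></HEAD>\n" .
           "<BODY BGCOLOR=FFFFFF>\n" .
           "<H1>Your Order</H1>\n" .
           "You ordered: $entre with $sides.\n" .
           "</BODY>\n" .
           "</HTML>\n";
}

%Form = ('entre', 'Caesar Salad',
         'sides', 'bread');

print "Content-type: text/html\n\n";
print orderPage(%Form);
```

Each form field is looked up by name, so the script does not care in what order the browser sent the fields. A real script should also escape characters such as < and & in the values before printing them back into a page.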

The Teller Speaks

We're ready to get into what makes people browse Web sites to begin with: the content! We've seen how CGI scripts are formed, looked at a basic skeleton, and seen where scripts reside on the system. We've looked at the process of sending data to CGI scripts using different methods, and a little at how a library function can be used to read the data sent to the CGI script.

"The teller speaks" refers to the analogy we used to describe the process of sending data to the CGI script. That was part of the personality of a CGI script that performs system level tasks. Now we're going to go over to the other side and show how CGI scripts talk back to us. How do we make CGI scripts actually generate HTML, or whatever else we want?

Here's how in Perl: (you may need to sit down)

print "hello world";

We use print. That's how pages are generated; they are printed. Whatever language you speak, whatever the content is, in order to create pages, the content has to be written to stdout. This is done by printing, writing, or calling any other function that sends output to stdout.
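As a minimal sketch of a complete response, the script below prints the Content-type header used by the listings in this chapter, a blank line, and then the page itself. The function name helloPage is hypothetical and exists only to keep the example tidy.

```perl
#!/usr/local/bin/perl
# Minimal sketch: everything the browser receives is printed to stdout.
# helloPage() is a hypothetical name used only in this example.

sub helloPage {
    return "Content-type: text/html\n\n" .      # header, then a blank line
           "<HTML>\n" .
           "<HEAD><TITLE>Hello</TITLE></HEAD>\n" .
           "<BODY>\n" .
           "<H1>hello world</H1>\n" .
           "</BODY>\n" .
           "</HTML>\n";
}

print STDOUT helloPage();    # same as: print helloPage();
```

The blank line after the header matters: it tells the server where the header ends and the page begins.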

We can best demonstrate this by looking at four cases in the following sections.

By Generating a Dynamic Page from a Static File

Let's say we are training a new HTML builder, Nina, and by mistake she picked up the wrong manual. Instead of the HTML guide, Nina grabbed the Perl manual. After a few minutes, here's what she created:

A plain text file:

Hello! This is my first plain text file. 

Nina has been reading about CGI programming by mistake and thinks that all pages are generated from CGI scripts and there is no such thing as static HTML. (See Listing 19.4 and Fig. 19.6.)

Figure 19.6: The CGI sandwich makes a page by inserting in raw text from a file.


Listing 19.4  myFile.cgi-A Simple CGI Sandwich Script

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";


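# Start the page: a title and an opening BODY tag.
print "<title>My File</title>\n",
      "<body bgcolor=ffffff>\n";

# Read the whole text file into an array, one element per line.
open(MYFILE, "< myfile.txt") || die "Cannot open myfile.txt: $!\n";
@allLines = <MYFILE>;
close MYFILE;

# Send the file's contents to the browser unchanged.
print @allLines;
exit(0);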

The manager came by, saw what the builder had done, and was impressed: Nina had managed to be very resourceful. The manager pointed out that if she changed the name of the file to myfile.html she could load it directly, without having to write a CGI script just to read it in and print it back out.

But the builder was only getting started. She looked at the output and noticed the page was not formatted correctly. It was missing the HTML tags, the BODY was never closed, and it had no banner heading (H1). Nina now knew that static HTML works well for documents that don't change very often, but she was still determined to understand everything she could do with CGI scripts. But it is getting close to lunch time.

She gets hungry, so she heads to the deli for some food. While she is in line watching the cooks prepare food, she notices something about the way they are preparing sandwiches. It gives her an idea, so Nina heads back to her office and looks at the CGI script again.

What is missing are the layers before and after the document that make it whole. She can create text files easily, but she wants to explore the limits of CGI programming. She takes a step forward and re-edits the CGI script.

Nina makes some changes and comes up with Listing 19.5.


Listing 19.5  sandwich.cgi-Bigger CGI Sandwich Script

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";



open(MYFILE, "< myfile.txt");
@allLines = <MYFILE>;
close MYFILE;
print "<HTML>\n",
      "<HEAD>\n",
      "<TITLE> My File </TITLE>\n",
      "</HEAD>\n",
      "<BODY BGCOLOR=FFFFFF>\n",
      "<H1> My File </H1>\n";
print @allLines;
print "</BODY>\n",
      "</HTML>\n";
exit(0);

Nina saves her CGI program as sandwich.cgi and loads it with her browser (see Fig. 19.7).

Figure 19.7: The bigger CGI sandwich puts the required HTML and BODY tags around the static file.
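The same sandwich idea can be wrapped in a subroutine so that any text file can become the filling. This is only a sketch: the subroutine name sandwich, its arguments, and the fallback message for a missing file are assumptions made for this example.

```perl
#!/usr/bin/perl
# Sketch: a reusable CGI sandwich. Fixed top, file contents, fixed bottom.
# sandwich() and its arguments are hypothetical names.
use strict;
use warnings;

sub sandwich {
    my ($title, $path) = @_;
    my @filling;
    if (open(my $fh, '<', $path)) {
        @filling = <$fh>;                 # one element per line of the file
        close($fh);
    } else {
        # Assumed fallback so the page is still complete without the file.
        @filling = ("<P>Sorry, $path is not available.</P>\n");
    }
    my $top = "<HTML>\n<HEAD>\n<TITLE>$title</TITLE>\n</HEAD>\n"
            . "<BODY BGCOLOR=\"#FFFFFF\">\n<H1>$title</H1>\n";
    my $bottom = "</BODY>\n</HTML>\n";
    return join('', $top, @filling, $bottom);
}

print "Content-type: text/html\n\n";
print sandwich('My File', 'myfile.txt');
```

Keeping the top and bottom in one place means every page built this way gets the same well-formed wrapper, whatever goes in the middle.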

Making a CGI Sandwich

Unfortunately, there are many different ways to do the same thing, and dynamic page generation is no exception. That causes confusion. When do I use SSI? When do I import whole pages and display them via CGI scripts? The example above shows how a CGI script is used to generate a "bologna" CGI sandwich. It's food, but it's not as tasty as a bacon, lettuce, and tomato, or a ham and cheese. If your appetite is big enough, you might want to go for the submarine sandwich.

Of course, we're talking about sandwiches, not eating CGI scripts.

The BLT CGI Sandwich

The BLT has lettuce, bacon, and tomatoes, three layers that are pretty much standard to all BLTs. Our precocious builder Nina wants to use her basic bologna CGI sandwich program to create a more complex document.

Our builder isn't ready yet for the exotic ingredients (form data) that go into the reuben and the submarine. She decides to stick with generic data sources. She wants her page to include the date and time and the users who are logged in to the server. She still has her raw text file myfile.txt around, and she decides to use that too. She copies it to greeting.txt and edits greeting.txt to hold a nice welcome message:

Welcome to Free Range Media

After flipping through the Perl manual for a few minutes, Nina creates her BLT CGI sandwich script (see Listing 19.6).


Listing 19.6  BLT.cgi-Tastier CGI Sandwich with More Layers Making It More Complicated

#!/usr/local/bin/perl
print "Content-type: text/html\n\n";

$MyLoginName = "nina";
open(MYFILE, "< greeting.txt");
@allLines = <MYFILE>;
close MYFILE;
print "<HTML>\n",
      "<HEAD>\n",
      "<TITLE> My File </TITLE>\n",
      "</HEAD>\n",
      "<BODY BGCOLOR=FFFFFF>\n",
      "<H1> My File </H1>\n";
# Backticks run the date command and capture its output.
chop($Date = `date`);
      
print "Today's date is: $Date\n",
      "<P>\n";
print @allLines;
# Backticks again: who returns one line per login session.
chop(@peopleOn = `who`);
@everyoneElse = grep(!/^$MyLoginName/, @peopleOn);
if ($#everyoneElse < 0) {
    print "No one is logged on <BR>\n";
}
else
{
    print "There are ",
          $#everyoneElse + 1,
          " people logged in: <BR>\n",
          join("<BR>\n", @everyoneElse);
}
print "</BODY>\n",
      "</HTML>\n";

Besides displaying the static page greeting.txt, this CGI script puts some dynamic content above and below it.

First, it gets the date by running the date command and storing the result in the scalar variable $Date.

Then it prints out the contents of greeting.txt, which it read in at the start of the script (see Fig. 19.8).

Figure 19.8: The CGI script shows who else is logged in.

Finally, she plays a little trick. She generates a list of all the users logged in to the server, but she uses grep() to take out any line that begins with her own login name. She also puts some logic into the script. If there are no other users besides herself, $#everyoneElse (the index of the last element of the array @everyoneElse) is -1, and the script sends out the text "No one is logged on." Otherwise, the index is greater than or equal to zero, and the script prints the count, $#everyoneElse + 1, followed by the list.
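The filtering step can be tried without a live server by feeding it a few made-up lines in the format who prints. The sketch below is an illustration under assumptions: the subroutine name everyone_else and the sample lines are invented here. It also requires white space after the login name, so that a user such as ninashaw is not filtered out along with nina.

```perl
#!/usr/bin/perl
# Sketch: filter one login name out of who-style output, then branch on
# whether anything is left. everyone_else() is a hypothetical helper.
use strict;
use warnings;

# Drop every line that starts with the given login name followed by a space.
sub everyone_else {
    my ($me, @who_lines) = @_;
    return grep { !/^\Q$me\E\s/ } @who_lines;
}

# Made-up sample lines in the general shape of who output.
my @sample = ("nina     ttyp0  Jun  3 09:12",
              "joe      ttyp1  Jun  3 09:30",
              "bill     ttyp2  Jun  3 10:01");
my @others = everyone_else('nina', @sample);

if ($#others < 0) {          # $#others is -1 for an empty array
    print "No one is logged on <BR>\n";
} else {
    print "There are ", scalar(@others), " people logged in: <BR>\n";
}
```

Here scalar(@others) gives the same number as $#others + 1; it is simply the element count asked for directly.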

Nina does something else interesting, too. The array @everyoneElse is just an array of lines. The new-line characters were removed by chop(), so if the array gets printed like

print @everyoneElse;

everyone's name will be crammed together. She wants each entry forced onto a separate line, so she uses join() to put a <BR>\n between each pair of elements of the array @everyoneElse. Note that join() only separates elements; nothing is added after the last one. So

joe <BR>\n bill <BR>\n julie

comes out

joe<BR>
bill<BR>
julie
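One detail of join() is worth seeing side by side: it places the separator between elements only, so the last name gets no <BR> of its own. If a tag after every name is wanted, map can append one to each element instead. The two subroutine names below are assumptions made for this sketch.

```perl
#!/usr/bin/perl
# Sketch: separator between elements (join) versus a tag after each
# element (map). Both helper names are hypothetical.
use strict;
use warnings;

# join() puts the separator BETWEEN elements: two separators for three names.
sub br_between {
    return join("<BR>\n", @_);
}

# map appends the tag to EVERY element, including the last one.
sub br_after_each {
    return join('', map { $_ . "<BR>\n" } @_);
}

my @names = ('joe', 'bill', 'julie');
print br_between(@names), "\n";
print "----\n";
print br_after_each(@names);
```

For a list that ends the page, as in the BLT script, the missing final <BR> makes no visible difference; it matters only when more text follows the list.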

The Custom Ham and Cheese CGI Sandwich

Nina starts thinking about the different ways to make a ham and cheese sandwich. On Sundays, the deli offers it on special, but it's on sourdough bread. On Mondays, it's on whole wheat bread. On Tuesdays, it's on light rye bread. It's different every day. OK, what about fixins? The deli has onions, tomatoes, pickles, lettuce, and sprouts. Nina is going to need some help. She wants to use an interface to build a dynamic page that shows all the ingredients in the sandwich besides the basic slice of ham and cheese (see Listing 19.7). She does a little research and calls the deli to get a list of prices and ingredients that go in the ham and cheese.


Listing 19.7  deliEngine.html-An Order Form That Builds Sandwiches

<html>
<title>Deli Engine</title>
<body bgcolor=ffffff>

Bread selection is determined by what day it is. The required ingredients 
are ham and cheese, but you can pick what kind of cheese and also add
 any combination of "fixins".
After you build your ham & cheese sandwich, press "pick up" and it'll
 be ready for you in 10 minutes.
<form method=post action="/cgi-bin/hamcheese.cgi">
Today's bread is:


<!--#exec cmd="/export/home/jdw/bookweb/bin/bread-selector.pl" -->


<p>
What kind of cheese? 
<select name="cheese">
<option value="None:0.0">None
<option value="Cheddar:0.50">Cheddar
<option value="Swiss:0.50">Swiss
<option value="Havarti:1.00">Creamy Havarti
</select>
<p>
Your sandwich comes with Ham. Do you want a vegetarian substitute?
<input name="noMeat" value=1 type="radio">Yes 
<input name="noMeat" value=0 type="radio" checked>No
<p>
Add your own fixins:<BR>
<input name="onions" value="Onion:0.25" type="checkbox">Onions<BR>
<input name="tomatoes" value="Tomatoes:0.50" type="checkbox">Tomatoes<BR>
<input name="lettuce" value="Lettuce:0.10" type="checkbox">Lettuce<BR>
<input name="sprouts" value="Sprouts:0.40" type="checkbox">Sprouts<BR>
<input name="pickles" value="Pickles:0.25" type="checkbox">Pickles<BR>
Your first name: <input type="text" name="customerName"><P>
<input type="submit" value="Pick Up">
</form>
</body>
</html>

Let's go over the HTML form and what it does for this example.

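We tell the user what the form is for. The customer can build a sandwich and submit the order. The bread selection is made by an SSI. The SSI works out the day of the week, prints the name of that day's bread, and adds a hidden input field carrying the same name to the form.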

Nina is becoming a good Perl programmer so she comes up with this SSI to pick the bread (see Listing 19.8).


Listing 19.8  bread-selector.pl-Picks the Bread of the Day

#!/usr/local/bin/perl
$wday = (localtime(time))[6];
%breads = (0, 'Sourdough',
           1, 'Whole Wheat',
           2, 'Light Rye',
           3, 'Dark Rye',
           4, '9 Grain',
           5, 'Russian Rye',
           6, 'Kaiser Roll');
print $breads{$wday}, "\n";
print "<input type=\"hidden\" name=\"bread\" value=\"$breads{$wday}\">\n";

Depending on what day it is, the following text gets "inserted" into the big form (let's say it's Monday):

Whole Wheat
<input type="hidden" name="bread" value="Whole Wheat">

The SSI is called inside the <form> </form> block, so the hidden field storing the bread type is passed along with the other inputs made by the user.
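The day-to-bread lookup is easy to check on its own if the day number is passed in rather than read from the clock. The sketch below is one possible arrangement, not the script above: the subroutine name bread_markup is an assumption, and an ordinary array stands in for the associative array, since the keys are just 0 through 6. In the list that localtime() returns, element 6 is the day of the week, with 0 meaning Sunday.

```perl
#!/usr/bin/perl
# Sketch: bread of the day as a testable function.
# bread_markup() is a hypothetical helper name.
use strict;
use warnings;

# Index 0 is Sunday, matching the weekday number from localtime().
my @BREADS = ('Sourdough', 'Whole Wheat', 'Light Rye', 'Dark Rye',
              '9 Grain', 'Russian Rye', 'Kaiser Roll');

# Return the visible bread name plus the hidden field that carries it.
sub bread_markup {
    my ($wday) = @_;
    my $bread = $BREADS[$wday];
    return "$bread\n"
         . "<input type=\"hidden\" name=\"bread\" value=\"$bread\">\n";
}

print bread_markup((localtime(time))[6]);
```

Calling bread_markup(1) produces the Monday text shown above, whatever day the script is actually run.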

This looks like a good interface (see Fig. 19.9). Nina is using an SSI and an HTML form with different types of input. She starts programming the CGI script to handle this ham and cheese CGI sandwich application. First, she decides what the output of the CGI script is going to include. The time the sandwich will be ready is placed at the top. Next comes the text of a raw text file that contains the standard disclaimer, the information that only natural ingredients are used, and the phone number and address of the deli. Finally, she plans to write the CGI script so that it generates a list of what is in the sandwich: the bread, meat (if any), cheese (if any), and the fixins.

Figure 19.9: The deliEngine offers choices, but also restricts the customer to pick from the daily specials.

This point wasn't covered in much detail before, but now is a good time: knowing exactly what the resulting page will contain is critical to programming a properly functioning CGI script. We know the complete set of variables possible from the inputs asked for in the HTML form.

Nina may have obtained a copy of a menu to see how the deli displays the ingredients of its food. Some of the graphic artists by her station have already taken notice of her project and have started creating some graphics for improving her ham and cheese CGI sandwich script later.

At any rate, she deftly writes a CGI script to handle the data (see Listing 19.9).


Listing 19.9  hamcheese.cgi-Handles the Data from the deliEngine

#!/usr/local/bin/perl


@INC = ('../lib', @INC);

require 'web.pl';

%Form = &getStdin;
&beginHTML;

$doneLater = time + 10 * 60;
($sec, $min, $hour) = localtime($doneLater);
$doneLater = sprintf("%02d:%02d:%02d", $hour, $min, $sec);
open(DISCLAIMER, "< disclaimer.txt") || die "Cannot open disclaimer.txt: $!\n";
@allLines = <DISCLAIMER>;
close DISCLAIMER;
print "<HTML>\n",
      "<HEAD>\n",
      "<TITLE>Your sandwich will be ready at $doneLater</TITLE>\n",
      "</HEAD>\n",
      "<BODY>\n",
      "<H1>Thanks for the order!</H1>\n",
      @allLines;
$totalcost = 3.95;  #basic sandwich
print "<HR>\n",
      "Your sandwich is on $Form{'bread'}.\n";
print $Form{'noMeat'} ? "It has a vegetarian substitute for ham.\n" :
                        "It has a slice of smoked ham.\n";
# Each value looks like "Swiss:0.50": a name, a colon, and a price.
($cheese, $cost) = split(/\:/,$Form{'cheese'});
$totalcost += $cost;
# The outer parentheses keep print from taking the test alone as its argument list.
print((($Form{'cheese'} =~ /^None/) ? "No Cheese" : $cheese), ".\n");
foreach $fixin ('onions','tomatoes','lettuce','pickles','sprouts') {
   if ( defined ($Form{$fixin}) ) {
       ($what, $cost) = split(/\:/, $Form{$fixin});
       push(@fixns, $what);
       $totalcost += $cost;
   }
}
print $#fixns < 0 ? " That's it!\n" :
      " Plus, these fixins: " . join(", ", @fixns) . ".\n";
print "<p>\n",
      "total cost: ", sprintf("%.2f", $totalcost), "<p>\n";
print "<p>\n", 
      "Thanks $Form{'customerName'}. Print this page and bring it\n",
      " with you when you pick up the sandwich.\n",
      "</BODY>\n",
      "</HTML>\n";

The deliEngine form takes the order. Each option's value carries a name and a price separated by a colon, and a hidden field passes the bread of the day along to the CGI script.

The CGI script hamcheese.cgi adds up the costs for the sandwich, and tells the user how much it costs and when it's ready (see Fig. 19.10).
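The pickup time is ordinary clock arithmetic: add the waiting time in seconds to the current time, then format the result. A minimal sketch follows; the subroutine name ready_at and its two arguments are assumptions made for this example.

```perl
#!/usr/bin/perl
# Sketch: work out a pickup time a given number of minutes from now.
# ready_at() is a hypothetical helper name.
use strict;
use warnings;

# Format the clock time that falls $minutes after $now as HH:MM:SS.
sub ready_at {
    my ($now, $minutes) = @_;
    # In list context localtime() returns (sec, min, hour, ...).
    my ($sec, $min, $hour) = localtime($now + $minutes * 60);
    return sprintf("%02d:%02d:%02d", $hour, $min, $sec);
}

print "Your sandwich will be ready at ", ready_at(time, 10), "\n";
```

The %02d in the format pads each part with a leading zero, so five past nine prints as 09:05:00 rather than 9:5:0.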

Figure 19.10: The costs of the sandwich are determined in the hamcheese.cgi.
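The cost calculation can also be pulled out into a small subroutine. The sketch below splits values of the form name:price, adds the prices to a base price, and formats the total with two decimal places. The name order_total is an assumption, and so is the choice to skip any value whose price is not a plain number; since form values arrive from the browser, some check of that kind seems prudent.

```perl
#!/usr/bin/perl
# Sketch: total an order from "Name:price" strings.
# order_total() is a hypothetical helper name.
use strict;
use warnings;

# Returns the formatted total followed by the names that were counted.
sub order_total {
    my ($base, @choices) = @_;
    my $total = $base;
    my @names;
    foreach my $choice (@choices) {
        my ($name, $cost) = split(/:/, $choice, 2);
        # Assumed policy: ignore anything whose price is not a plain number.
        next unless defined $cost && $cost =~ /^\d+(\.\d+)?$/;
        push(@names, $name);
        $total += $cost;
    }
    return (sprintf("%.2f", $total), @names);
}

my ($total, @items) = order_total(3.95, 'Swiss:0.50', 'Onion:0.25', 'Lettuce:0.10');
print "Items: ", join(", ", @items), "\n";
print "total cost: $total\n";
```

With the sample order shown, the total comes to 4.80.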