Chapter 35

How to Add Video


Chapter 33, "How to Add High-End Graphics," introduced the concept of the multimedia Web site. Chapter 34, "How to Add Sound," extended the concept to audio. This chapter brings together the two concepts, visual images and time-based media, and introduces online video.

Recall that high-quality graphics are difficult to serve because they are so large. A bitstream like audio is difficult to serve because it usually has a higher bit rate than the client's network connection. Video suffers from both these problems-each frame can be the size of a single high-end graphic, and there are many frames per second. Nevertheless, there are some ways to deliver video over the Net.

Using Full-Motion Video

Let's face it-most of us are video junkies. We turn to the television for news, watch CNN as we pass through the airports, and go to (or rent) movies for entertainment-if we don't already have one of those 200-plus channel satellite dishes. Is it any wonder that, when we look to the Web, our eyes are drawn to full-motion video?

Full-Motion Video: What It Takes

Everyone loves to watch those thirty- and sixty-second snippets extruded from desktop video systems like WaveFront that show bizarre creatures roaming the landscape, or animated jets roaring across a simulated sky. Or consider the popularity of films such as Toy Story or Star Wars. In fact, some of the most impressive entertainment video today is digital, and could conceivably be downloaded from a Web site. Most Webmasters, in their few idle moments, have asked themselves, "What would it really take to put that kind of thing on my Web site?"

The short answer is, "A lot." The longer answer is, "Maybe not as much as you might think."

Video is, after all, a series of still images. On television, the screen is repainted 60 times a second, but each frame is interlaced and has only half the lines, so a full frame is delivered thirty times a second. Recall from Chapter 33, "How to Add High-End Graphics," that a small full-color graphic can take up 80K or more. At 30 frames a second, a full minute of raw video would take nearly 150M. Not only would it fill a big chunk of most people's hard drives, it would take nearly a day to download over a 14,400 bps connection. This full-motion stuff is a far cry from the simple juggling GIF or Server Push animation described in Chapter 33. Forget it.
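The arithmetic behind those figures is easy to check. The following is a minimal sketch in Python, assuming that "80K" means 80 kilobytes per frame and that the modem delivers its full 14,400 bits per second with no protocol overhead; the function names are invented for illustration.

```python
FRAME_BYTES = 80 * 1024      # assumption: one small full-color frame, 80K
FRAMES_PER_SECOND = 30
CLIP_SECONDS = 60
MODEM_BPS = 14_400           # bits per second, ignoring protocol overhead


def raw_video_bytes(frame_bytes, fps, seconds):
    """Size of uncompressed video: every frame is stored in full."""
    return frame_bytes * fps * seconds


def download_hours(total_bytes, bits_per_second):
    """Time to move total_bytes over a link of the given speed."""
    return total_bytes * 8 / bits_per_second / 3600


total = raw_video_bytes(FRAME_BYTES, FRAMES_PER_SECOND, CLIP_SECONDS)
print(f"{total / 2**20:.0f}M of raw video")
print(f"{download_hours(total, MODEM_BPS):.1f} hours to download")
```

Under these assumptions the clip comes to roughly 140M and a little under 23 hours of download time, which agrees with the "nearly 150M" and "nearly a day" estimates.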

Or maybe not. After all, still images compress rather well, and a video stream should have much more redundancy than a series of still images. What would it take to put video on the site?

Practical Considerations

Good video (you wouldn't want any other kind on your site, would you?) is expensive. High-end, computerized animation workstations start at around $20,000. Some newer technology brings the price down, but animation is always going to take more memory, more disk space, and more time than desktop publishing or even still image production. Commercial design studios typically quote rates between $2,000 and $4,000 a minute. Again, some smaller shops using newer tools on PowerPC Macintoshes or perhaps a high-end Windows machine offer good quality at a lower price.

Of course, when someone says video is expensive, one must ask, "compared to what?" Certainly video is expensive compared with simple text and graphics. If a site is effective with text and graphics, by all means leave out the video. If the alternative to video is a person-to-person sales call and a live demo, video may be competitive price-wise.

Video is best used when the material to be presented is naturally time-based. Movies and TV programs are obviously products that benefit from being promoted using video, but so are software (video the demo), real estate (walk through the homes), and automobiles (provide a test drive). Education, training, and technical support can also benefit from video. For some products, a few seconds of video may deliver compelling impact. For others, only a full clip, many minutes long, will do.

As with most material on the Web, the best answer is a compromise. Given the limited bandwidth of the Web, Webmasters have three choices in delivering video:

If the finished product will be viewed on a computer screen, the dominant factors are disk space, playback speed, and memory requirements. Few desktop machines have the special hardware it takes to keep up with the decoding of highly compressed data in real time. Even fewer have the high-speed connections necessary to accept less-compressed data. If quality is important, the video will have to be downloaded slowly for later playback. The dominant file formats for this kind of work are QuickTime, AVI, and MPEG.

If the final product will be transferred to tape, then much more quality can be preserved since the transfer time is typically short compared to the overall production time. The Disney movie Toy Story, produced by Pixar, set the standard for digital video-to-tape transfer. Details of its approach are given on their Web site, http:/, and in the August 1995 issue of Computer Graphics World.

Fans of Pixar can see more of its work by calling +1-510-236-0388. It has several shorter works available in VHS tape format. Other sources include Expanded Entertainment (1-800-996-TOON, extension 125) and Media Magic (1-800-882-8284).

File Formats

In addition to the "big three" video formats, there are many formats that are either vendor-specific or are being developed by researchers as possible next-generation candidates. Stephane Woillez maintains a Web site that lists conversion utilities available on the Net.


QuickTime was originally defined by Apple Computer. It is the native format of the Macintosh and is supported on both Windows and UNIX machines.

While most people associate QuickTime with video, Apple is quick to point out that QuickTime is suitable for all time-based media, including sound and interactive video. QuickTime version 2.1, introduced in August 1995, includes explicit provision for animated images outside the video data. Figure 35.1 illustrates this technology, called "Sprite Tracks."

Figure 35.1 : Illustration of sprite tracks.

In earlier versions of QuickTime, there was a video track and an audio track, much like MPEG. The new Sprite Track holds a pointer to an image. At runtime, the image can be transformed, translated, or even replaced with a different image. Tracks are available for text, pictures, sounds, and time codes. With a movie editor, such as QuickTime, the user chooses which tracks to use.

Depending upon your point of view, QuickTime and MPEG do the same thing differently, or they do different things (but use similar approaches). At any rate, it is possible to translate from QuickTime to MPEG using a converter written by Rainer Menes called qt2mpeg. This utility is available on the Net. Apple now supports MPEG capability (in QuickTime 2.5, released in March 1996), so the need for qt2mpeg may grow faster than ever if developers produce in QuickTime and want to save in MPEG.

The definitive site on things related to QuickTime is http://quicktime. This site will always have pointers to the latest version, information for developers and users, and links to nice-looking QuickTime movies.

When served on UNIX and Windows machines (and often on Macintoshes), QuickTime movies are identified by the file extension .mov or sometimes .qt. When setting up a server or client, the appropriate MIME type is video/quicktime.



AVI is Microsoft's native video format. There is software available to play AVI on both Macintoshes and UNIX machines. A player is provided with Windows 95.

AVI files are about four times the size of MPEG clips of similar quality and duration, so many Webmasters are turning from AVI to MPEG for video on their sites.

AVI files are identified by the file extension .avi. When setting up a server or client, the appropriate MIME type is video/x-msvideo.



The international standard for computer video is defined by MPEG, the Moving Picture Experts Group. Recall from Chapter 34, "How to Add Sound," that MPEG is part of a committee of the International Organization for Standardization (ISO). Part 3 of its specification (IS-11172) defines how to compress CD-quality audio so that it fits in a small portion of a 1.5-Mbps bitstream. Part 2 of that spec defines how to use the remainder of that bitstream for full-motion video.

A newer version of the standard, MPEG 2, offers higher quality but at higher bit rates. Broadcast quality is possible at rates between 3 and 4 Mbps. Scenes with complex space-time interaction, such as many sporting events, only compress down to 5 or 6 Mbps. Laserdisc quality is achievable between 3 and 6 Mbps.

The MPEG 3 initiative was short-lived, as researchers found they could accomplish the MPEG 3 objectives with a relatively straightforward extension of MPEG 2. MPEG 4 is currently under development, and is aimed at very low bit rate coding. The draft specification is expected to be released in 1997.

MPEG 1 video starts with relatively low-resolution video: 352×240-pixel frames at a frame rate of 30 frames per second. The images are in color, using a color map called YUV. (See Chapter 33, "How to Add High-End Graphics," for a discussion of color maps.) The Y channel carries luminance; U and V carry chrominance. U and V are further decimated down to 176×120 pixels. In natural images, this decimation is not noticeable. At this point, the video signal still requires far more bandwidth than is available.
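To see how far the decimated signal still is from the 1.5-Mbps target, the bit rate can be worked out directly. This sketch assumes 8 bits per sample in each channel, which the text does not state; the function name and parameters are illustrative only.

```python
def raw_bits_per_second(width, height, fps, chroma_divisor=2, bits_per_sample=8):
    """Bit rate of YUV video whose U and V planes are decimated in both directions."""
    luma_samples = width * height
    chroma_samples = 2 * (width // chroma_divisor) * (height // chroma_divisor)
    return (luma_samples + chroma_samples) * bits_per_sample * fps


TARGET_BPS = 1_500_000  # the 1.5-Mbps MPEG 1 bitstream

rate = raw_bits_per_second(352, 240, 30)
print(f"{rate / 1e6:.1f} Mbps before compression")
print(f"about {rate / TARGET_BPS:.0f}:1 compression still needed")
```

Under that assumption the decimated stream runs at about 30 Mbps, so the encoder still has to find roughly a 20:1 reduction, and more once room is set aside for audio.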

MPEG takes advantage of the fact that much of the motion information in a given frame may be predicted by the frames around it. The Y channel of each frame is broken into 16×16 pixel blocks, and the encoder tries to predict motion by looking for a close match to each block in other frames which appear before or after this one. The Discrete Cosine Transform (DCT; the same compression mechanism used in JPEG still images) is applied to each frame using 8×8 pixel blocks on the U and V channels, and the DCT coefficients of the differences between a given block and its close match are quantized. If the differences are small, the quantization drives the differences to zero. Further compression is applied to whatever differences survive the above process.
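The search for a "close match" can be illustrated with a brute-force block matcher. The sketch below is not the method of any particular encoder: it assumes a sum-of-absolute-differences error measure and a small exhaustive search window, and it works on a single channel held as a list of rows. Real encoders use faster search strategies.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def best_match(cur, ref, by, bx, size=16, search=4):
    """Find the block in ref that best predicts the block at (by, bx) in cur.

    Returns (motion_y, motion_x, error); the motion vector points from the
    block's position in cur to the matching block in ref.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            ry, rx = by + my, bx + mx
            if 0 <= ry <= height - size and 0 <= rx <= width - size:
                err = sad(cur, ref, by, bx, ry, rx, size)
                if best is None or err < best[2]:
                    best = (my, mx, err)
    return best


# Synthetic test frames: cur is ref shifted two pixels to the right.
ref = [[(7 * x + 13 * y) % 256 for x in range(32)] for y in range(32)]
cur = [[ref[y][x - 2] if x >= 2 else 0 for x in range(32)] for y in range(32)]

print(best_match(cur, ref, 8, 8))  # (0, -2, 0): found two pixels to the left, no error
```

When the error is zero, as here, only the motion vector needs to be sent. When it is small but nonzero, the differences are what the DCT and quantization steps then go to work on.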

The process of choosing which frames to send and which to predict is sophisticated. To start the process, one frame (not necessarily the first frame) is chosen as an "I-frame" or intraframe. Other frames, known as P-frames, are built up from I-frames by predicting them based on DCT coefficients. If a frame has very little similarity to any existing I or P frame, it is sent as another I-frame.

Between the I and P frames, there are so-called bidirectional frames, or B-frames. The encoder looks at the frame ahead of the B frame, and the frame behind it. If it cannot predict the B-frame from either of those two, it tries to average the blocks ahead and behind and stores the differences between the B-frame and the average. If none of these techniques work, the block is encoded like an I-frame. Thus, a typical sequence in MPEG is

I B B P B B P B B P B B I


There are 12 frames between one I-frame and the next, giving the eye (and the algorithm) a fully transmitted block every 0.4 second.

Some more-sophisticated products tune the sequence of I, P, and B frames to achieve even higher compression, at some loss of compatibility. Be sure to check compatibility when selecting an encoder for MPEGs that are to be served over the Web.

The frames are sent out of sequence so that frames 1 and 2 can be computed based on frame 3. A typical decoder displays frame 0 (an I-frame) and then reads and decodes frame 3 (a P-frame). But it's not time for frame 3 yet, so the decoder reads and decodes frames 1 and 2 (the B-frames). When frame 0 is complete, frame 1 is put up. Then frame 2 is put up. Finally frame 3 goes up. While frame 3 is going up, the same process begins again for the next P frame (frame 6) and its associated B frames (frames 4 and 5).
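The reordering can be written down in a few lines. This sketch assumes the pattern described above: an I-frame every 12 frames, a P-frame every third frame, and B-frames in between. The function names are invented for illustration, and trailing B-frames with no later anchor frame are simply left out.

```python
def display_pattern(gop_length=12, p_spacing=3):
    """Frame types, in display order, from one I-frame up to the next."""
    types = []
    for n in range(gop_length):
        if n == 0:
            types.append("I")
        elif n % p_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return types


def transmission_order(num_frames, p_spacing=3):
    """Frame numbers in the order the decoder must receive them.

    Each anchor frame (I or P) is sent before the B-frames that precede it
    in display order, because those B-frames are predicted from it.
    """
    order = [0]
    prev_anchor = 0
    for anchor in range(p_spacing, num_frames, p_spacing):
        order.append(anchor)
        order.extend(range(prev_anchor + 1, anchor))
        prev_anchor = anchor
    return order


print("".join(display_pattern()))      # IBBPBBPBBPBB
print(transmission_order(7))           # [0, 3, 1, 2, 6, 4, 5]
print(len(display_pattern()) / 30)     # 0.4 second between I-frames at 30 fps
```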

MPEG files are denoted by the file extension .mpg or sometimes .mpeg or .mpe. When setting up a server or client, the appropriate MIME type is video/mpeg.
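The extensions and MIME types for the three formats can be gathered into one lookup table. The following is a minimal sketch in Python rather than the configuration syntax of any particular server. The values video/quicktime, video/x-msvideo, and video/mpeg are the commonly used ones and are an assumption here, as are the names VIDEO_MIME_TYPES and video_mime_type.

```python
import os

# Assumed extension-to-MIME-type table for the "big three" video formats.
VIDEO_MIME_TYPES = {
    ".mov": "video/quicktime",
    ".qt": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".mpg": "video/mpeg",
    ".mpeg": "video/mpeg",
    ".mpe": "video/mpeg",
}


def video_mime_type(filename, default="application/octet-stream"):
    """Return the MIME type for a video file, based only on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return VIDEO_MIME_TYPES.get(extension, default)


print(video_mime_type("demo.MOV"))
print(video_mime_type("walkthrough.mpg"))
```

A server that sends the wrong type (or none) leaves the browser unable to pick a player, so it is worth checking the table against each file actually placed on the site.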


Producing Full-Motion Video

Although artists and producers will not necessarily always do things in the same order, they generally go through the same steps to produce video. These steps are illustrated in Figure 35.2.

Figure 35.2 : The animation process.


To crystallize the concept, start by preparing a video treatment. Summarize the storyline or content and describe each production element: graphics to be produced, animation required, music which must be obtained, and live shots to be recorded. Research each concept to be presented and compile all of the material that will contribute toward the video.

Based on the video treatment, prepare a schedule and budget. Depending upon the level of experience of the production staff, the budget may be fairly accurate or wildly over- or underinflated. If the production team has limited experience, consider hiring a more experienced designer to work with the team. Software (such as Movie Magic Scheduling and Movie Magic Budgeting, both from Screenplay Systems) is available to help double-check initial estimates but does not substitute for experience and judgement.

Estimating Resources

If the in-house resources are limited, put the video treatment out for bid. Expect quotes to run from $2,000 to $4,000 a minute, with extreme values anywhere from $1,000 to $10,000 a minute, depending upon the material. The more thorough the video treatment, the more accurate (and sometimes lower) the quotes will be. A typical budget breakdown will allocate about 30 percent for planning and preproduction, 30 percent for production, and 40 percent for postproduction.
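As a rough planning aid, the quoted rates and the 30/30/40 split can be turned into a quick estimate. The sketch below simply applies the figures in the preceding paragraph. The function names are illustrative, and the result is no substitute for an actual bid.

```python
# Percentage of the budget per phase, from the typical breakdown above.
PHASE_PERCENT = {"preproduction": 30, "production": 30, "postproduction": 40}


def budget_range(minutes, low_rate=2000, high_rate=4000):
    """Low and high estimates, in dollars, for a video of the given length."""
    return minutes * low_rate, minutes * high_rate


def phase_breakdown(total):
    """Split a total budget across the three phases."""
    return {phase: total * pct // 100 for phase, pct in PHASE_PERCENT.items()}


low, high = budget_range(3)
print(low, high)                 # 6000 12000
print(phase_breakdown(low))
```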

Script the Production

Once the initial planning and allocation of resources are complete, the next step is to prepare a script. For best results, use one of the formats employed by professional video production shops:

Various tools are available to help a screenwriter lay out a storyboard and capture key frames, transitions, and dialog. These steps can be done by a general word processor or in special programs such as Scriptor and Dramatica from Screenplay Systems Software. Some software can switch between formats, allowing creative talent to write in teleplay format, and then switch to two-column format during production.

Each major visual element should be documented in a storyboard-a visual rendition of the scene with a description of the associated audio elements. Don't skimp here, particularly if the production staff is new at this. A few days spent laying out each scene in detail can save weeks of production time and many dollars' worth of wasted animation.

Based on the script and storyboard, refine the budget and schedule and get management (or client) approval to begin production.


Use the storyboards or the script to prepare a "shot list." If the video involves location work, group all the shots for a given location together. Gather any stock images or animation that will be put into the finished product. Identify those scenes that will use computer-generated animation and prepare instructions for the animators.

Design the Models

Once an animation concept is set down in a script, the modeler begins to build the characters and components of the video. The modeler is concerned with three-dimensional shape and size and the character of the surface of the model.

High-end animation software has a Model module. With this module the artist can use polygons, metaballs, and Non-Uniform Rational B-Splines (NURBS) as primitives and begin to build up a model.

High-end graphics modeling is done with mathematical components. Polygons and metaballs are used for general shapes. Splines (including NURBS) are used for general lines and curves. These primitives can be built into larger structures. Splines can be rotated and shifted through all degrees of freedom to form complex shapes-these transforms go by names like "extrude," "loft," and "sweep."

Splines and NURBS may be revolved and extruded, lofted and swept along defined paths. Once they join the model, the shapes they define may be moved into position and connected with other entities to produce sophisticated models. Once an object is built up in cross-section, the Model module allows the artist to layer a "skin" on top of it and apply complex curves to the surface.

Surfaces have a number of definable characteristics. They may have various levels of texture, bump, reflection, and transparency mapped on to them.

Design the Animation

The animation designer uses the models and sets them in motion in accordance with the script. The animator may use key-frame or particle techniques (as described in Chapter 33, "How to Add High-End Graphics") to reduce the number of frames that must be set up by hand. Powerful computerized morphing tools are available to build the in-betweens that tie one key frame to the next.

The animator is also concerned with the interaction between characters and components. Do characters collide with walls, floors, or each other? If so, do they recoil in a realistic manner? Is the lighting consistent? Getting the lighting model right can consume a great deal of CPU power.

High-end animators permit key-frame animation, event animation, shape interpolation (morphing), and inverse kinematics. Most high-end software also includes various forms of particle animation such as a "flock" command. Some also include even higher-level functions, such as gravity, friction, collision, turbulence, and wind. One example is an MPEG running with real-time collision detection software by Madhav K. Ponamgi, Jonathan D. Cohen, Ming C. Lin, and Dinesh Manocha at the University of North Carolina. In this MPEG, the hand interacts with various kitchen utensils. Whenever the hand collides with another object, the collision is detected in real time and marked with a red marker.

Recall that Chapter 33, "How to Add High-End Graphics," introduced the technique of producing photo-realistic images by raytracing. Rendering raytraced animation is slow, even on high-end computers. Most artists work with wireframe figures or simple hidden-line depth cue renderings during animation design and only add fully textured surfaces or "skins" when the design is essentially intact.

Key-frame and event animation were described in Chapter 33 in the context of 2-D animation. Their 3-D counterparts are similar. Inverse kinematics has to do with how natural joints bend. The animator specifies the position of the end of a limb (for example, a hand, a hoof, or the tip of a wing) and the computer bends the joints in the right way to put the limb into position.
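For a limb with two segments, the joint angles can be found in closed form. The sketch below is a simplified two-dimensional case, not the solver used by any particular animation package: it assumes a shoulder fixed at the origin, a hinge elbow, and segment lengths called upper and lower. A forward function is included to confirm that the angles put the tip where it was asked to go.

```python
import math


def two_link_ik(x, y, upper, lower):
    """Return (shoulder, elbow) angles in radians that place the limb tip at (x, y)."""
    dist = math.hypot(x, y)
    if dist > upper + lower or dist < abs(upper - lower):
        raise ValueError("target is out of reach")
    # Law of cosines gives the bend at the elbow.
    cos_elbow = (dist**2 - upper**2 - lower**2) / (2 * upper * lower)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    # Aim at the target, then back off by the angle the bent forearm adds.
    shoulder = math.atan2(y, x) - math.atan2(
        lower * math.sin(elbow), upper + lower * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, upper, lower):
    """Position of the limb tip for the given joint angles."""
    elbow_x = upper * math.cos(shoulder)
    elbow_y = upper * math.sin(shoulder)
    return (
        elbow_x + lower * math.cos(shoulder + elbow),
        elbow_y + lower * math.sin(shoulder + elbow),
    )


shoulder, elbow = two_link_ik(1.0, 1.0, upper=1.0, lower=1.0)
print(forward(shoulder, elbow, 1.0, 1.0))  # approximately (1.0, 1.0)
```

There are usually two solutions (elbow bent one way or the other). This sketch always returns the one with a non-negative elbow angle, whereas a production system would choose based on joint limits.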

Paths for animated entities are commonly defined as splines and are used to get more natural motion. All high-end packages allow animated models to be placed on a spline curve.
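One spline that suits motion paths is the Catmull-Rom curve, because it passes through every control point the animator sets. Its use here is an assumption for illustration, since the text does not say which spline a given package uses. The sketch evaluates one segment of a uniform Catmull-Rom spline and samples a path at a fixed number of frames per segment.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the uniform Catmull-Rom segment between p1 and p2, for 0 <= t <= 1."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (
            2 * b
            + (c - a) * t
            + (2 * a - 5 * b + 4 * c - d) * t2
            + (-a + 3 * b - 3 * c + d) * t3
        )
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_path(points, frames_per_segment):
    """Positions along the path, one per frame, through the interior control points."""
    path = []
    for i in range(1, len(points) - 2):
        for frame in range(frames_per_segment):
            t = frame / frames_per_segment
            path.append(catmull_rom(points[i - 1], points[i], points[i + 1], points[i + 2], t))
    path.append(tuple(float(v) for v in points[-2]))
    return path


keys = [(0, 0), (1, 2), (3, 3), (4, 1)]   # hypothetical key positions
for position in sample_path(keys, 4):
    print(position)
```

The first and last control points only shape the curve, and the sampled path runs from the second point to the next-to-last one.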

Morphing is an advanced technique crucial to modern animation. Anyone who has watched an expensive television commercial or a movie with special effects has seen morphing. The term was coined at Industrial Light and Magic (ILM) where they once had a program, "morf," which interpolated between two images. While the details of morphing algorithms are mathematically complex and are often proprietary, the basic principles are clear.

To morph one image into another, the animator specifies which points and regions on one image correspond to which points and regions on the other. If the images are similar, such as faces, the transformation is straightforward. If the images are topologically dissimilar (such as morphing a coffee cup, which is topologically a torus or doughnut, into a brick) special techniques must be used for the morph to be believable.
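The correspondence step can be sketched very simply: each control point on the first image slides toward its partner on the second as the morph progresses. The code below shows only that step, using straight-line interpolation. It is an assumption-laden simplification, since a real morph also warps the pixels around each point and cross-dissolves the two images, and commercial algorithms are far more elaborate.

```python
def lerp(a, b, t):
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def morph_points(src_points, dst_points, t):
    """Positions of the control points a fraction t of the way through the morph."""
    if len(src_points) != len(dst_points):
        raise ValueError("every source point needs a corresponding destination point")
    return [
        (lerp(sx, dx, t), lerp(sy, dy, t))
        for (sx, sy), (dx, dy) in zip(src_points, dst_points)
    ]


# Hypothetical correspondences, for example the corners of a mouth on two faces.
face_a = [(0, 0), (10, 0)]
face_b = [(0, 10), (20, 10)]
print(morph_points(face_a, face_b, 0.5))  # [(0.0, 5.0), (15.0, 5.0)]
```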

Many artists have noted that, as computer animation has gotten better, audiences have become more demanding. Reportedly, a number of people have gone through laserdiscs of the movie Terminator 2 frame by frame, looking for inconsistencies. Professional morphing artists have developed a number of techniques to trick the eye into believing what the mind knows to be impossible. For example, the morph is done at much finer resolution at the beginning and the end (when the objects are most recognizable) than in the middle. The morphs are often staggered-different parts of the image change at different times. In the Michael Jackson film clip "Black or White," there are sometimes up to seven planes of morphing going on at once. In the scene with the dancers, the features on some dancers have already completed the transformation while others still show the original image. This design confuses the eye, so that the viewer has no place on which to focus and try to catch the morph "in the act."

Another advanced technique is volume morphing. While most morphing techniques transform the image, volume morphing transforms the model. Figures 35.3 through 35.5 show a volume morph done by members of the Volume Rendering Project of the Stanford Graphics Laboratory. MPEGs of the morph, as well as more information and other files, are available on the Net.

Figure 35.3 : Original dart.

Figure 35.4 : Morph in progress.

Figure 35.5 : X-29 Fighter Aircraft.

Produce (Render) the Images

Images are actually produced using rendering software. Even on high-end workstations, these tools run slowly, grinding out one image at a time in accordance with the rules of the animation. Because these tools are slow and expensive, some animators like to set up animation prototypes on desktop computers like the Macintosh. In fact, some animators report that these "prototypes" are good enough for production on some jobs.

One of the best software libraries available for building 3-D rendering software is OpenGL, developed by Silicon Graphics, Inc. Most major workstation vendors offer a version of OpenGL for their machines. In addition, Brian Paul, of the Space Science and Engineering Center at the University of Wisconsin at Madison, has released a publicly available version of OpenGL. This version is called Mesa. Not only is Mesa compatible with the OpenGL library calls, but it is distributed as source code, so beginning programmers can examine it to learn how to implement three-dimensional graphics.

The Mesa documentation shows how to set up symbolic links so that "off-the-shelf" OpenGL applications compile painlessly. Some versions of UNIX do not respect symbolic links when they point to shared libraries. For best results in using Mesa as a replacement for commercial OpenGL, put a copy of the libMesaGL.a and libMesaGLU.a libraries into /usr/lib, and rename them libGL.a and libGLU.a, respectively.

Renderers are concerned with light, shadow, and surface texture. One of the most powerful rendering techniques is raytracing, in which the paths of individual beams of light are followed from the source to the eye (or camera). Along the way, they may be diffused, reflected, and absorbed. Raytracing is computer-intensive and is often applied late in the process of refining the rendering. When used, it produces exceptionally high-quality results.
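At the heart of any raytracer is a test for where a ray strikes an object. The sketch below shows that test for the simplest case, a sphere. It makes two assumptions worth noting: the ray direction is a unit-length vector, and the ray is cast from the camera into the scene, which is how most practical raytracers work even though the light physically travels the other way.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit on a sphere, or None."""
    oc = tuple(o - c for o, c in zip(origin, center))
    # Solve t^2 + b*t + c = 0 for points origin + t * direction on the sphere.
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    discriminant = b * b - 4.0 * c
    if discriminant < 0:
        return None                      # the ray misses the sphere
    root = math.sqrt(discriminant)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if t > 1e-9:                     # ignore hits behind the camera
            return t
    return None


camera = (0.0, 0.0, 0.0)
sphere_center = (0.0, 0.0, 5.0)
print(ray_sphere_hit(camera, (0.0, 0.0, 1.0), sphere_center, 1.0))  # 4.0
print(ray_sphere_hit(camera, (0.0, 1.0, 0.0), sphere_center, 1.0))  # None
```

A full renderer repeats this test for every pixel and every object, then spawns further rays for reflection and shadow, which is why the technique is so computer-intensive.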

Some of the best high-end packages do their rendering by handing off to Pixar's RenderMan, a highly regarded dedicated function renderer. Many packages will also interface with NetRenderMan, which performs sophisticated rendering using a network of inexpensive PCs.

Sophisticated rendering techniques, including texture mapping, make some computer-generated images pass for photographs. In texture mapping, a two-dimensional image is mapped onto the skin of a three-dimensional model. The effect is similar to wrapping clothes around a body. Texture mapping is also known as digital image warping. The classic text on the subject is George Wolberg's Digital Image Warping (IEEE Computer Society Press, 1990). Although the focus of the book is on image warping, much of the discussion is applicable to general 3-D graphics, such as the chapters on sampling theory and antialiasing.

When an artist lays down the brush in favor of a computer, he or she moves from the analog world into the digital. (Some of the mathematics of this transition are covered in Chapter 34, "How to Add Sound.") The digital world is sampled-under certain conditions input can be undersampled. When that happens, false signals, called aliases, appear. In digital images, these aliases manifest themselves as jagged edges and moire patterns.

Filtering techniques are available to reduce aliasing. Most of these antialiasing techniques involve blurring the signal before it is sampled and then reconstructing the signal using various mathematically sophisticated techniques. For details on these algorithms, see Chapter 6 of Wolberg.
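One of the simplest antialiasing approaches is supersampling: take several samples inside each pixel and average them, which amounts to a crude box-filter blur. This is offered as an illustration of the idea and is not one of the specific algorithms in Wolberg. The scene function inside is a hypothetical stand-in for a renderer.

```python
def inside(x, y):
    """Hypothetical scene: a shape covering everything left of the line x = 2.5."""
    return x < 2.5


def sample_pixel(px, py, n=4):
    """Average an n x n grid of samples inside pixel (px, py); 1.0 means fully covered."""
    hits = 0
    for j in range(n):
        for i in range(n):
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            hits += inside(x, y)
    return hits / (n * n)


# One sample per pixel gives a hard, jagged edge; sixteen give a soft one.
print([sample_pixel(px, 0, n=1) for px in range(5)])  # [1.0, 1.0, 0.0, 0.0, 0.0]
print([sample_pixel(px, 0, n=4) for px in range(5)])  # [1.0, 1.0, 0.5, 0.0, 0.0]
```

The pixel that straddles the edge comes out half covered instead of all or nothing, which is what removes the staircase appearance.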

To develop an appreciation for the capabilities of digital morphing, watch for morphing in movies and commercials. Exxon ran a commercial in the early '90s in which a moving car begins to ripple. The ripples become stripes, and the car becomes a tiger. The most famous company in the special effects industry is Industrial Light and Magic. Its morphing credits include Willow, The Abyss, Indiana Jones and the Last Crusade, and Terminator 2. Terminator 2 and The Abyss were both rendered on Pixar's RenderMan.

To get a taste of morphing on publicly available software, check out Morphine by Mark Hall. Morphine is not comparable to professional-grade software, but it is quite good, and the source code is available for reference. The input file allows the user to specify

Capturing Natural Images

While for many purposes computer-generated animation is the best way, or even the only way, to produce an image, there are still many times when a video photographer and a television camera can give similar results that are either of higher quality, faster, or less costly.

Setting Up the Camera

The key to recording good images is a good camera and a good video photographer. Just as a professional-grade microphone was recommended for audio in Chapter 34, "How to Add Sound," a good camera is crucial to making good video. Consider hiring a freelance video photographer-most will have their own equipment. A good freelancer will come with ideas for camera angles and creative shots. Their skills will be particularly appreciated if the camera is to move during the shot. Moving the camera can be tricky and is one of the fastest ways to turn an otherwise professional-looking work into an amateurish video.

Be sure to use the highest-quality tape available. If the recorder is a consumer-grade VCR, set it to "SP." This speed eats up tape three times faster than "SLP," but gives the higher quality necessary for getting good results.

Prepare the tapes before going on location. First, "pack" the tape by fast-forwarding to the end and then rewinding to the beginning. This process removes slack introduced during shipping. Then, record one minute of color bars and audio tone and 30 seconds of black at the beginning of the tape. Keep track of tape time, and stop recording at least a full minute before the end of the tape.

Remember that part of video photography, like all photography, is art. Encourage the photographer to think about scene composition. Take advantage of the three-dimensional aspects of video-have a defined foreground and background. Consider having actors and action in both, or having an actor move from foreground to background or vice versa.

Vary the kind of shots taken. Set the scene with long shots. Use close-ups to show visual details. Get plenty of footage-at least three seconds of any given shot. Excess material can be cut during postproduction. Be sure to leave a long pause (at least ten seconds) between the time the video recorder is turned on and the time the action is started. The extra time will help synchronize editing equipment.

During the shooting of natural scenes, actually look at the scene. Do the actors and key equipment stand out from the background? If not, think about changing the background or the actors' clothing or switch to equipment that doesn't blend into the background.

Use clapboards at the beginning and end of each shot. It may seem a bit hokey or "Hollywood," but there are sound, practical reasons why the movie industry does things. (Well, at least some things.) For example, clapboards provide a reference on the tape for each "take" of each scene. If the director decides that "Take 4" is the best one to use for a particular scene, the editor can forward directly to that take.

Likewise, mark each tape with details about its contents. During postproduction, editors will thank you for making sure each shot is clearly marked, both on the outside of the tape and in the footage.

After it has been used, write-protect each tape and put it back into its case to keep it safe. Double-check to make sure it has an accurate label that is visible when the tape is in its case.

Shoot key scenes from various angles and distances, to give the editor some flexibility.

If the location work must stop for the night, take an instant photo of the scene, so it can be set up the next morning exactly as it appeared the previous day. It wouldn't do for a coffee cup or a pen to blink out of existence in the middle of a scene.

Working with Light

Be conscious of the passage of time. Even when shooting indoors, outside scenes are often visible. If a 30-minute interview begins with a mid-morning sun outside the window, it should end with consistent lighting outside the window, even if the actual shooting went on into the late afternoon.

To solve this problem on indoor shots, use a camera angle that avoids outside scenes or lighting.

Ask the video photographer to set up "three-point lighting" when appropriate. Don't use existing office lights-use lights provided by the photographer to ensure repeatability. A good photographer will keep notes about the lighting setup on each scene, in case any material has to be reshot.

Make sure the brightest light is on the person or equipment you want to highlight.

If the scene includes a computer screen, make sure no reflections get through to the camera. Move lights or the camera to avoid reflections, or place an anti-glare filter over the monitor.

Recording Sound

Once on location, be generous with audiotape. Collect natural sound (known in the trade as "nat sound," "wild sound," or "presence"). These sounds can add a realistic quality to computer-generated animation as well as serve as background for natural images.

Get the microphone as close to the sound source as possible. Turn off any background noise such as printers or air conditioners. Keep microphone cables away from power cords-the power hum can transfer onto the audio signal.

Power cords carry AC electricity at relatively high voltages (100 to 200 volts) compared to electronic equipment (which is typically 5 to 12 volts). The AC power is 60 Hz in the U.S. and a few other parts of the world and 50 Hz in Europe and much of Asia. If you hear a low hum in the audio, look at the power cords and find out where that hum may be leaking into the electronics.

Review the recommendations in Chapter 34, "How to Add Sound," about getting good quality sound.


Once the raw material (sound, graphics, stock materials, natural images, and computer-generated animation) is gathered, postproduction begins. During postproduction, the material is logged and edited into the final work. If any material is unusable, arrangements are made to replace it. Finally, the title and credits are added, and the work is ready for duplication and distribution, either on or off the Net.


Postproduction editors review all of the raw material and identify what goes where in the final program. Any material which is not usable must be replaced. This process of examining the material and getting it ready for editing is known as logging.

Modifying, Compositing, and Sequencing the Animation

Once the images are all produced or gathered, it is up to the postproduction staff to edit them into a consistent whole. They may use image-processing tools or other special effects packages to get just the appearance they want. They may also add stock images such as clouds in the sky or repaint portions of the image to change day into night or to turn lights on in a window. If the finished work is to include natural scenes in combination with computerized effects, that compositing is done in postproduction.

At each step in the process, most graphic artists do a bit of touch-up. The computer can relieve the artist of tremendous amounts of labor, but the artist can help the computer out here or there to get the very best results.

Most professionals use a two-step editing process. First, the editors make a series of rough decisions about how to use the material. Then they actually get on the computer or videotape machines and do the edits. The first step is known as offline editing, the second as online editing.

During offline editing, the staff prepares an Edit Decision List (EDL). The EDL is a shot-by-shot record of decisions that is used as a plan during online editing. Some software, such as Adobe Premiere, facilitates the capture of the EDL during offline editing. Adobe reports on its Premiere site that Jeff DePonte of JDVIDEO in Hawaii uses Adobe Premiere to prepare his EDLs and then sends them electronically to the video postproduction house for online editing. DePonte reports that 75 to 80 percent of his online editing costs can be avoided using this technique.
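To make the idea of a shot-by-shot record concrete, here is a minimal sketch of an EDL as a data structure. The field names, the reel labels, and the 30 frames-per-second non-drop timecode are assumptions made for illustration; this is not the file format that Adobe Premiere or any postproduction house actually exchanges.

```python
from dataclasses import dataclass

FPS = 30  # assumed non-drop-frame rate, for illustration only


def timecode_to_frames(tc: str, fps: int = FPS) -> int:
    """Convert an 'HH:MM:SS:FF' timecode to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


@dataclass
class EditDecision:
    """One shot in the list. All field names are hypothetical."""
    reel: str        # source tape the shot comes from
    source_in: str   # first frame to use, as HH:MM:SS:FF
    source_out: str  # frame just past the last one used
    note: str = ""   # free-form reminder for the online editor

    def length_in_frames(self) -> int:
        return timecode_to_frames(self.source_out) - timecode_to_frames(self.source_in)


# A two-shot list, in the order the shots appear in the finished program.
edl = [
    EditDecision("TAPE01", "00:01:10:00", "00:01:14:15", "opening wide shot"),
    EditDecision("TAPE03", "00:12:00:10", "00:12:02:10", "close-up"),
]

total_frames = sum(shot.length_in_frames() for shot in edl)
print(f"{len(edl)} shots, {total_frames} frames, {total_frames / FPS:.1f} seconds")
```

Because every decision is plain data, a list like this can be reviewed, totaled, and sent electronically before anyone books expensive online editing time.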

Once online, editors traditionally use one or more of three types of editing equipment:

Professional editors use equipment like this to perform insert editing. High-end videotape formats have three channels-one for video and two for audio. The editor will edit the narration onto the master tape while waiting for the video. Then video will be added as it comes in, leaving holes for missing material. Finally, the missing images are added, filling all the holes and leaving a finished master.

Digital Video Capabilities

In a world where desktop video software such as Adobe's Premiere allows a user to integrate natural video images, computer-generated animation, still graphics, and sound and music, the price of good editing equipment has come down dramatically. Systems costing just a few thousand dollars can do sophisticated digital video effects (DVEs) that were once out of reach of all but the most expensive systems. Even an inexpensive editor such as Apple's QuickTime Movie Player allows insert editing-the user can drop one track into position beside another, and have the new material "stretch" or "compress" as necessary. The difference between digital video and the older technologies is that, in traditional editing, the computer was used to control the videotape equipment. In digital video, the video data actually becomes a computer file and is manipulated by the computer directly.

Neil Fox, TRW's manager of multimedia services, reports that the corporation used Macintosh computers and Adobe Premiere to put out an interactive employee orientation on CD-ROM to its 65,000 employees. "It would have cost ten times as much if we had to use traditional video editing methods," said Fox. He also estimated that the use of digital video shortened the schedule by a factor of about 25.

Digital Video Requirements and Limitations

The minimum requirements for a digital video system are digitizing hardware-one or more cards used to transform data from a videotape to the computer file-a large, fast hard drive, and software to manipulate the resulting images.

When transferring video data into the computer, the bottleneck is often in the speed of the connection to the hard disk. Disconnect CD-ROMs and disable CD-ROM software. Make sure to use a fast SCSI driver (there is quite a bit of variation between manufacturers). Put media in any unused drives (for example, the floppy drive) to eliminate polling delays.

To improve the throughput of a desktop video system, you need to be able to measure that throughput. The easiest way to do that is to use statistics built into the software. For example, Adobe Premiere has a Report Dropped Frames option under the Capture menu. Set the scratch disk (under the File menu's Preferences menu) to the disk being tested. If you are using the VideoVision Studio from Radius, look for the drive performance estimation feature in the Compression Options dialog box (the Find button) and the Movie Analysis tool to determine how high you can set the quality without dropping frames.
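Once you have a measured figure for the disk, a little arithmetic shows whether a capture setting is likely to drop frames. The sketch below is only a back-of-the-envelope model: the 80K frame size and 30 frames per second come from the start of this chapter, while the 1,500K-per-second disk rate and the function names are hypothetical values chosen for illustration.

```python
def required_kb_per_sec(frame_kb: float, fps: float, compression_ratio: float = 1.0) -> float:
    """Sustained data rate the disk must absorb during capture."""
    return frame_kb * fps / compression_ratio


def min_compression_ratio(frame_kb: float, fps: float, disk_kb_per_sec: float) -> float:
    """Smallest compression ratio at which the disk can keep up."""
    return frame_kb * fps / disk_kb_per_sec


FRAME_KB = 80      # size of one uncompressed frame (figure used earlier in the chapter)
FPS = 30           # full frames per second
DISK_KB_S = 1500   # hypothetical measured sustained write rate

raw_rate = required_kb_per_sec(FRAME_KB, FPS)
needed_ratio = min_compression_ratio(FRAME_KB, FPS, DISK_KB_S)

print(f"Uncompressed capture needs {raw_rate:.0f}K per second")
print(f"This disk keeps up only if frames are compressed at least {needed_ratio:.1f}:1")
```

If the ratio the digitizing card can deliver at your chosen quality is below the computed minimum, expect dropped frames and lower the quality setting or the frame size.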

Digital video systems are much faster than tape-based editing since any frame can be accessed without waiting for tape to roll. The downside is that the quality of the finished product may be slightly less than that of the original image on tape. For video that is going to be delivered in computerized format (on the Net or otherwise), this quality difference is not noticeable. Digital video may not be acceptable for some high-end broadcast applications, however.

Several utilities are available from the drive vendors to test SCSI performance. While these utilities give a relative idea of drive throughput, they are not a good indicator of the system's performance when handling video data.

If the hard drive still cannot keep up with the video rate, use a program like Norton Utilities to see if the disk is badly fragmented. Video files are big-make sure there is one contiguous hole so the disk driver doesn't have to hunt all over finding places to stuff data. Defragment the drive, if necessary. As a last resort, back up the drive, reformat, and restore the files.

On NuBus-based Macintoshes (a particularly popular platform for desktop video) it is possible to set the data rate so high that the NuBus cannot keep up with both audio and video during recording. (The problem does not occur on playback.) If this happens, load the video first and then the audio. Newer Macs, with a PCI-bus, do not have this problem.

On Windows machines, Terminate and Stay Resident (TSR) programs can steal CPU cycles. On Macintoshes, some extensions (such as DiskLight) can do the same thing. Hunt down any program that doesn't absolutely need the CPU and disable it, to get maximum performance when loading video data.

If a system that has been working well suddenly begins to drop frames, listen to the disk drive. If you can hear the heads thrashing, you probably need to defragment the drive (see previous note). If the drive is no noisier than it was in the past, the drive may be engaged in thermal recalibration (T-Cal) (see the following caution).

Radius and others offer hardware solutions that speed up the overall throughput of the system (through adaptive compression) and have more repeatable frame rates than the software-only solutions. For more information on digital video and on Radius's products in particular, visit

As disk drives heat up, they need to recalibrate and realign the heads. This activity is called thermal recalibration, or T-Cal. During T-Cal, no data is written to the disk; incoming data can be lost. The drive sends out a signal to the CPU asking it to resend the data. That request for the data to be resent doesn't work when video or audio is being loaded-the CPU has already reused those buffers for new data. Newer drives have enough memory to cache this data, and don't have this problem.

The last step in editing is the development of the title credits. Be sure everyone involved in production has an opportunity to see their name go on the work.

The FCC has standards about the maximum brightness allowed in a broadcast image. If it is possible that the video may be used on the air, have the post-production shop use a waveform monitor such as the Tektronix 1780R Video Measurement Set to ensure that signal levels meet the government requirements. Even if the video will only be used on computer screens, it is a good idea to ensure that the brightness does not exceed reasonable levels.

Transfer-Putting it in the Can

If the target medium is a Web site, the output of postproduction may already be in the finished format. In some cases, it will be necessary to change it from, say, a QuickTime format to an MPEG. If the medium is film or videotape, the computer-generated images must be transferred to that medium using special hardware.

Even if the work will be distributed over the Net, it is sometimes worthwhile to prepare a VHS master so the finished product may be shared with management, clients, and other people who participated in the process but may not have ready access to the Net and the hardware to display an MPEG to its full effect.

For professional or industrial use, the finished product may end up on VHS tape or Video8 or possibly U-Matic 3/4-inch or U-Matic SP. Broadcast work is usually transferred to one-inch type C or possibly high-end digital media such as D-1 or D-2. Most film recorders have at least 2,000 line-per-inch (lpi) resolution. Professional-grade equipment has 4,000 lpi. For best results, the images should have a little more than twice the resolution of the film recorder.

If it is necessary to make more than about five copies (known in the trade as "dubs"), it is usually cost-effective and more time-efficient to rent time at a professional video duplication service.

Take the same care in packaging the finished tapes as was taken during the development of the product. Use a high-quality graphic in a plastic library case to convey the image of quality work.

How to Compress Files

In a word, don't. The video formats used on the Web are already highly compressed. Backup utilities or special communications utilities that try to compress them again usually backfire and make the file larger. Note that some utilities, such as compress, can figure this out by themselves and leave incompressible files alone, while gzip adds only negligible overhead to them. Note, too, that it's safe to send files over a V.42bis modem connection-the modem will recognize that the file is already compressed and will disable its own compression.

If your modem uses the MNP 5 protocol, disable it before transferring video files. MNP 5 can cause the file to take longer to transfer under some conditions and rarely improves the performance when transmitting data that is already compressed. Do leave the underlying error correction (MNP 2, 3, or 4) on, though.
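The effect is easy to see for yourself. The following sketch uses Python's standard zlib module and a block of pseudo-random bytes as a stand-in for an already compressed video file (an assumption made so the example needs no real MPEG). The function name and the sample text are made up for illustration.

```python
import random
import zlib


def recompressed_size(data: bytes) -> int:
    """Size of the data after one pass of zlib at maximum effort."""
    return len(zlib.compress(data, 9))


# Ordinary repetitive text compresses very well.
plain_text = b"All work and no play makes a dull video. " * 5000

# Pseudo-random bytes behave like data that is already compressed:
# there is no redundancy left for the compressor to remove.
rng = random.Random(1996)  # fixed seed so every run gives the same bytes
compressed_like = bytes(rng.getrandbits(8) for _ in range(200_000))

for label, data in (("text", plain_text), ("video-like", compressed_like)):
    print(f"{label}: {len(data)} bytes before, {recompressed_size(data)} bytes after")
```

The text shrinks dramatically, while the video-like data comes out no smaller than it went in.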

Serving Video-Carefully

Video will remain an exotic medium for some time to come. Even widespread use of ISDN will not allow realtime downloading of MPEG 1, let alone MPEG 2. Initiatives in the cable industry promise high bandwidth over fiber-optic cables in the coming years, but it is far from clear that there is enough capacity for each household to have its own 1.5 Mbps channel. Video has a place on the Web, if it is used carefully.
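A quick calculation shows how far ISDN falls short. This sketch assumes a basic-rate ISDN line with both 64 kbps B channels bonded for data (128 kbps in all) and uses the 1.5 Mbps figure above as the MPEG 1 stream rate. The function names are illustrative only.

```python
ISDN_BPS = 128_000     # assumed: two 64 kbps B channels used together
MPEG1_BPS = 1_500_000  # the 1.5 Mbps channel mentioned in the text


def can_play_in_real_time(link_bps: float, stream_bps: float) -> bool:
    """A stream plays as it arrives only if the link is at least as fast."""
    return link_bps >= stream_bps


def download_seconds_per_second_of_video(link_bps: float, stream_bps: float) -> float:
    """How long the link needs to deliver one second of the stream."""
    return stream_bps / link_bps


ratio = download_seconds_per_second_of_video(ISDN_BPS, MPEG1_BPS)
print("Real-time over ISDN?", can_play_in_real_time(ISDN_BPS, MPEG1_BPS))
print(f"One minute of MPEG 1 takes about {ratio:.1f} minutes to arrive")
```

Under these assumptions, each minute of MPEG 1 video takes nearly twelve minutes to arrive, so the visitor must download first and watch later.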

To get a better idea of the impact various choices have in encoding MPEG, visit That site documents a movie through changes in five different parameters. The movie, the raw log files, and a summary of their findings (tracked across four output variables) are all there (viewed in four different formats, no less!)

When Is Video Useful?

Video is generally less appropriate for highly technical material than for stories, concepts, and matter-of-fact information. Facts, figures, data and analyses are best presented in tables and graphics on a Web site, rather than in a video. Remember, too, that only a small percentage of visitors may see the video. Make sure as much information as possible is presented in a more traditional format.

For some applications, a good solution may be to capture some key frames in JPEG, or possibly put up a short MPEG (say, 2M to 8M) on an FTP site and offer more of the same by videotape or even on CD as MPEGs. A nice MPEG running a minute or two can serve as an incentive to visitors to take the next step, whether that step is ordering a product, voting for a candidate, or contributing to a cause. With a 14,400 bps connection, a simple 30-second MPEG of 300K to 500K can be downloaded in about four to six minutes. If it is carefully crafted, such a clip can make an effective impact on a visitor to the site.
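The download times quoted above can be estimated with a one-line formula. The sketch below assumes that each byte costs ten bits on the line (eight data bits plus a start and a stop bit) and that the modem sustains its full rated speed. Both are simplifying assumptions, and real connections will vary.

```python
def download_minutes(size_kb: float, line_bps: int = 14_400, bits_per_byte: int = 10) -> float:
    """Estimated minutes to transfer size_kb kilobytes over a modem link."""
    total_bits = size_kb * 1024 * bits_per_byte
    return total_bits / line_bps / 60


# A short sample clip, then the 2M to 8M files suggested for an FTP site.
for size_kb in (300, 500, 2 * 1024, 8 * 1024):
    print(f"{size_kb:>5}K: about {download_minutes(size_kb):.1f} minutes")
```

By this estimate the 300K to 500K clip arrives in roughly three and a half to six minutes, while a 2M file ties up the visitor's line for about 24 minutes and an 8M file for more than an hour and a half, which is why the larger files are better offered by FTP, tape, or CD.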

This approach has been used successfully by PHADE SOFTWARE, a German firm that sells a product called "The Internet MPEG CD-ROM." This product is one big HTML document, bundled with the browser Cello. An online intro to the product is available at The actual CD-ROM includes 600M of digital movies, sounds, and songs as well as a variety of utilities. The CD-ROM conforms to ISO-9660, so it should be readable on Windows machines and Macintoshes as well as nearly all UNIX platforms. By making extremely large files available via Web browsers, PHADE may have solved the bandwidth problem and still made effective use of the Web to offer "free samples." The product is available by mail order and can also be ordered through software CD-ROM channels. The publisher is Hardmann Multimedia Services. Full details and pricing information are available on the Web site.

Alexander Scourby Bible Products uses a similar approach. Its product is the King James Bible, as read by Alexander Scourby. Its site, http://www.iminet.com/bible/, offers .avi and .wav samples of its video- and audiocassettes.

Delivering the Multimedia Presentation

Like high-end graphics (discussed in Chapter 33, "How to Add High-End Graphics") and sound (described in Chapter 34, "How to Add Sound"), use of video requires some extra planning by the Webmaster:

Digital video requires considerable care when serving it over the Web. Even a few seconds of video can take many minutes to download. Carefully consider the purpose of the video and decide if video is really the best mechanism for accomplishing the objective.

When video is justified, decide how to deliver it. The options are to send a few seconds over the Web; to make a somewhat longer video available by FTP; or to produce a video (perhaps using digital means), provide a small sample on the Web site, and then provide a longer sample on a videocassette that can be delivered by overnight carrier.

With high-end graphics, sound, and video, a site can justify the label "multimedia." But one more component remains: three-dimensional models, which can be examined using VRML browsers. Those models are the subject of Chapter 36, "The Third Dimension: VRML."