BOOKSHELF ON LASER DISK
ST searches 540 megabytes in 3 seconds
by NAT FRIEDLAND, Antic Editor
540 megabytes of memory. On a disk that's smaller than a 45 rpm phonograph record.
What does that really mean?
540 megabytes is more than 500 million characters-enough typed characters to stretch 1,000 miles from San Francisco to Denver. It's equal to the amount of information that could be stored on 6,000 Atari floppy disks, or in 50 cubic feet of printed pages...
The businesspeople who go to the huge twice-yearly Consumer Electronics Shows are cynical pros who have seen it all. It takes a lot to get them excited.
But the Atari exhibit at the Chicago CES in June was packing them in to see something genuinely new-the 540 megabyte CD ROM system (Compact Disk, Read Only Memory) running on the 520ST computer.
Software by the Activenture Corp. of Monterey, California put the 26-volume Grolier Encyclopedia on a CD ROM disk-along with a smart database that finds all references for any word in the encyclopedia in three seconds flat.
This system will be premiered only with Atari ST computers. Atari is committed to release a CD ROM player-targeted to retail at $599-by the end of the year. Grolier will probably price the CD ROM encyclopedia disk at around $150-200.
CD ROM on the ST computers is in fact a major bonus from the Atari-Digital Research connection. The company that developed the three-second encyclopedia-indexing software was founded by Tom Rolander, a former operating systems architecture designer at Digital Research Inc. (DRI).
Rolander has been a close associate of DRI chairman Gary Kildall since they met at the University of Washington 14 years ago. They're both fanatical pilots and share ownership of a small armada of airplanes, including aerobatic and ultra-light models. As an individual, Kildall is listed as technical consultant to Activenture.
In January 1985, Rolander and Kildall went to see Atari Chairman Jack Tramiel. It was Activenture's very first meeting to raise outside support.
Rolander and Kildall explained how CD ROM took advantage of the new optical disk technology currently becoming popular for digital audio recordings. Read-only optical disks were already being inexpensively mass produced and could be adapted to hold vast amounts of any kind of computer-readable information. They told Tramiel that the Bible, Shakespeare's complete works, the total card catalog of the Library of Congress, entire medical and law databases, computer programs and video images too-virtually any type of data could be digitized, stored on optical disks, and referenced almost instantly by a personal computer.
Only two CD ROM disks would be needed to store every phone listing in the USA for speedy updating.
A single CD could hold a world atlas, a complete directory of international airline schedules, and detailed information about major destinations. It would be like having an expert travel agent on a disk.
Interactive CD ROM cookbooks could be programmed to recommend recipes based on your input of the ingredients available in your refrigerator or on sale at weekly specials.
After 15 minutes of this, Tramiel looked off into space and said quietly, "This would give people a good reason to buy my new computer, wouldn't it?"
As a result of that meeting, not only will the Atari ST be the first microcomputer to have CD ROM capability-Atari will also have exclusive rights to the Activenture process for some time after release.
In the person of Technical Editor Jack Powell and myself, Antic was the first publication to interview Tom Rolander at Activenture after his triumphant return from CES. The company is located in a brand-new high-tech office complex. It's alongside the Monterey Airport so that Rolander can clear his mind with flight breaks if he gets bogged down in a programming problem.
Enthusiastic and personable, Rolander started Antic's visit by showing off the latest version of Facts and Figures on an Atari 520ST. This is the program that comes with the encyclopedia disk and controls all the CD ROM operations.
It should be emphasized that the software we saw here and at CES was not merely a demonstration fragment. The program was fully operational, with apparently only minor debugging left.
Rolander called up the Encyclopedia Bookshelf screen and showed us the Browse Mode, which is like flipping through multiple books and pages with unprecedented ease and convenience. Using the mouse, you can swiftly move forward and backward and choose individual volumes, sections and entries.
It's a lot handier than shelving and reshelving five cubic feet of encyclopedia books. And any of the text displayed on screen is easily copied to your printer or disk.
We then switched to Search Mode, the real selling point of CD ROM databases. Rolander invited us to ask for any article or reference. We requested "Transylvania." In a few seconds we had a long list of references and were clicking our way alphabetically through articles about Balkan history and geography, the infamous Elizabeth Bathory and good old Vlad Dracula.
The program can also do relational searches. It will seek out as many multiple topics as can be fit into some 500 characters. And you can choose whether you want only those multiple topics mentioned in the same sentence, same paragraph, or simply any article containing the multiple topics.
We asked for a multi-topic search on: German, Submarine. A full screen of references popped up in a flash. As we scanned the alphabetized headings, we were surprised to see an entry for "Hemingway, Ernest."
Unable to resist taking a look, we immediately clicked to Hemingway's biography article and discovered that the great novelist considered himself to be a volunteer anti-submarine watcher during WWII when he lived in Cuba. This is typical of the kind of unexpected information that Activenture's CD ROM software consistently turns up.
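To make the three search scopes concrete, here is a minimal sketch in Python of a multi-topic search that can require all topics in the same sentence, the same paragraph, or anywhere in one article. It is our own illustration, not Activenture's code: the names `search`, `units` and `SAMPLE` are hypothetical, the sample text is invented stand-in material, and the sketch scans the text directly, whereas the real program works from its index.

```python
import re

# Invented stand-in text, NOT quoted from the Grolier Encyclopedia.
SAMPLE = {
    "Atlantic Ocean": "German shipping crossed the Atlantic.\n\n"
                      "Submarine cables were laid on its floor.",
    "Hemingway, Ernest": "He lived in Cuba during the war.\n\n"
                         "He served as a volunteer submarine watcher. "
                         "German vessels were reported offshore.",
    "U-boat": "The U-boat was a German submarine.",
}


def words(text):
    """Return the set of lower-cased words in a fragment of text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def units(body, scope):
    """Split an article into the pieces that must each hold all topics."""
    if scope == "article":
        return [body]
    paragraphs = [p for p in body.split("\n\n") if p.strip()]
    if scope == "paragraph":
        return paragraphs
    if scope == "sentence":
        return [s for p in paragraphs
                for s in re.split(r"(?<=[.!?])\s+", p) if s.strip()]
    raise ValueError("unknown scope: " + scope)


def search(articles, topics, scope="article"):
    """Return alphabetized titles where one unit contains every topic."""
    wanted = {t.lower() for t in topics}
    return sorted(title for title, body in articles.items()
                  if any(wanted <= words(u) for u in units(body, scope)))
```

Calling `search(SAMPLE, ["German", "Submarine"], "sentence")` is the strictest request; switching the last argument to `"paragraph"` or `"article"` widens the net, which is how a loosely related entry can turn up in the results.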
For a search on pioneer psychologist Carl Jung, Rolander selected the Bibliography choice from the search window menu. We instantly saw a screen full of book references, arranged by topics and subtopics. Here is another valuable research tool provided by the Activenture CD ROM software.
We asked Rolander if the software would support "wild card" searches. He said that it could be done, but he was still trying to decide on a wild card system that would be easier for non-programmers than the asterisks and question marks commonly used in computer files for indicating the search's wild characters.
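For readers who have not met them, the conventional asterisk and question-mark wild cards behave as in the short Python sketch below. It uses the standard `fnmatch` module; the function name `wildcard_lookup` and the word list are hypothetical, and nothing here reflects the system Rolander eventually chooses.

```python
from fnmatch import fnmatchcase


def wildcard_lookup(index_words, pattern):
    """Return the sorted words matching a file-style wild card pattern.

    '*' stands for any run of characters and '?' for exactly one.
    """
    pattern = pattern.lower()
    return sorted(w for w in index_words if fnmatchcase(w.lower(), pattern))


# Hypothetical sample of indexed words.
WORDS = ["german", "submarine", "submarines", "submerge"]
print(wildcard_lookup(WORDS, "submarine*"))
print(wildcard_lookup(WORDS, "subm?r*"))
```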
The hardware set-up doing all this was a standard 520ST cabled to a Philips CM100 CD ROM player via a prototype controller box. When the product is released, the controller interface circuitry will be reduced to a board inside the disk player.
Atari plans to contract with one of the major CD ROM manufacturers to make a disk player for sale under the Atari name. At this writing, Philips was the front-runner for the deal. The disk mastering for Facts and Figures has been done by Philips in Holland. The first CD ROM mastering facility in North America was due to be opened by 3M this fall.
We started by bringing up our questions about CD hardware. How different is CD ROM from standard digital audio disk technology? Could you use CD ROM on any compact disk player?
"No, it's not 100% compatible with digital audio. But the idea is to keep down CD ROM costs by using as much as possible of the CD audio technology," said Rolander. " And there is a universal CD ROM standard that has been accepted by Philips, Sony, Hitachi and all the other major manufacturers involved in the field. So there won't be any problems with competing formats."
Similarities between CD ROM and CD audio include the same 4.75-inch disk size, with identical mastering and duplicating processes. This keeps expenses low. It costs no more than $4,000 to make a master disk for pressing. The cost for pressing 1,000 disks is $4 apiece.
All CD players share the same principles of laser optics, the same motor and drive specifications. However, CD ROM requires greater precision in mechanically positioning the laser head and mirror.
Also, CD ROM needs a higher degree of error correction accuracy. Its error rate is 10 to the minus 12th power-meaning you might get a typographical error once in a trillion times. This is accomplished by adding 288 bytes of error correction code onto every data "block" of 2,048 bytes. An unformatted CD ROM disk could actually store 600 megabytes.
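The figures above can be checked with a little arithmetic. The Python sketch below is only a back-of-the-envelope calculation under our own assumptions: that a megabyte means one million bytes, and that the one-in-a-trillion rate applies per bit read. The variable names are ours.

```python
DATA_BYTES = 2048          # user data per block (from the article)
ECC_BYTES = 288            # error correction code per block (from the article)
DISK_BYTES = 540 * 10**6   # assumption: 1 megabyte = 1,000,000 bytes
BIT_ERROR_RATE = 1e-12     # assumption: the quoted rate is per bit

# Share of every stored block that is spent on error correction.
overhead = ECC_BYTES / (DATA_BYTES + ECC_BYTES)

# Bytes the disk would hold if the correction code were user data too.
raw_bytes = DISK_BYTES * (DATA_BYTES + ECC_BYTES) // DATA_BYTES

# Expected number of wrong bits in one complete reading of the disk.
expected_bad_bits = DISK_BYTES * 8 * BIT_ERROR_RATE

print(f"overhead: {overhead:.1%}")
print(f"raw capacity: {raw_bytes:,} bytes")
print(f"expected bad bits per full read: {expected_bad_bits:.5f}")
```

Under these assumptions the correction code takes roughly 12 percent of each stored block, the raw capacity comes out at about 616 million bytes (the same range as the 600 megabytes quoted), and you would expect roughly one wrong bit in every 230 complete readings of the disk.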
"CD ROM needs these more precise tolerances because you cannot have the two-or-three bit error factor that's acceptable for compact disk audio reproduction," said Rolander. Accordingly, he wouldn't be surprised if top-of-the-line CD ROM players also include audio disk capability in the near future.
WHAT'S ON TAPE
"Any text that's stored on magnetic tape can be machine read and automatically indexed by our software," said Rolander. This immediately made us ask how much reference material was now available on magnetic tape.
His answer was that just about all printed matter of any substance that has been published within the past five years could be found on tape. That's because the largest state-of-the-art typesetting machines, such as the Compugraphic 8600 and the top-line Mergenthaler model, normally keep the text data on electronic tape.
"Also there are the huge libraries of information already processed electronically for online databases," Rolander added. "A surprising amount of this material is in public domain, often because it has been prepared by the government."
Rolander predicts that CD ROM will soon replace microfiche film storage of documents. Activenture has already been contacted by a U.S. intelligence agency about the possibility of converting vast libraries of raw information into CD ROM databases.
CD ROM SOFTWARE
To get a better idea of how Activenture's software is set up to access massive amounts of data so rapidly, Rolander took us into Activenture's development room.
We walked up two stairs onto the raised floor of an air-conditioned computer center and saw an array of state-of-the-art computer hardware. The heart of the system was a VAX 11/750 super-minicomputer with 8 megabytes of main memory and 1500 megabyte disk drives.
Across the aisle from the VAX was the video equipment, featuring a Sony professional broadcasting one-inch tape deck. A complete video editing system worked off the Sony, including a character generator, special effects and digitizing consoles and a camera stand.
This video set-up has been used for establishing that it's technologically possible to incorporate digitized illustrations into the Facts and Figures database text. However, the first encyclopedia release will not include illustrations.
"For one thing, processing the pictures is very labor-intensive," Rolander explained. "We could fit 13,000 illustrations onto the current encyclopedia disk, at 32K memory per illustration. But that would mean somebody has to place 13,000 pictures on the camera stand and operate the recording controls each time."
But that wasn't all. "Up to now, the encyclopedia companies normally own only a small percentage of the illustrations they print. The rest are leased from archives for one-time use. Activenture does not presently have the resources to negotiate rights for thousands of pictures. However, I'm sure it won't be long before fully illustrated CD ROM databases are marketed."
Rolander sees Activenture as an "optical typesetter." Paid by royalty fees, Activenture offers the service of creating a fast, interactive index for existing reference material and databases. When Rolander isn't hurrying to finalize his software in time for Atari's September deadline, he's flying East to meet with traditional publishers and sew up more CD ROM rights.
HOW IT'S DONE
The CD ROM disk has four different sections. First is the raw data-which is nothing more than all of the encyclopedia, from A to Z. Then comes the index, or table, which contains pointers to all unique words in the encyclopedia. Next is the directory, which is similar to the file management sectors of a floppy disk. It tells the program where to find a file on the disk.
Finally, there is the Facts and Figures software, which loads into the computer and runs the show. At this writing, Rolander was uncertain whether this section would be on the CD ROM or on a separate floppy disk. It depended on whether Atari made the CD ROM Player a self-booting peripheral.
All the Activenture CD ROM software was programmed entirely in the C language. After Rolander wrote his minicomputer indexing program, it took the VAX no more than six hours to read the approximately 58 million characters in the Grolier Encyclopedia and create the index table.
The program counted the number of unique words at just around 141,000. Some 30 "stop words"-including but, a, and, of, the, etc.-were ignored in the index.
At the same time, the unique words were also alphabetized and every one of their locations in the encyclopedia was mapped. One reason for the lightning speed of the Facts and Figures software is that it searches references in the index, not in the encyclopedia.
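The idea of searching an index rather than the text can be sketched in a few lines of Python. This is an illustration of a generic word index, not Rolander's C program: the function names, the sample articles and the use of character offsets as "locations" are all our assumptions, and only the five stop words named above are included out of the 30 or so the real index ignores.

```python
import re
from collections import defaultdict

# Only the stop words named in the article; the real list has about 30.
STOP_WORDS = {"but", "a", "and", "of", "the"}

# Invented two-article sample, standing in for the encyclopedia text.
ARTICLES = {
    "Transylvania": "Transylvania is a region of Romania.",
    "Dracula": "See also the article on Transylvania.",
}


def build_index(articles):
    """Map every non-stop word to a list of (title, character offset)."""
    index = defaultdict(list)
    for title, body in articles.items():
        for match in re.finditer(r"[a-z']+", body.lower()):
            word = match.group()
            if word not in STOP_WORDS:
                index[word].append((title, match.start()))
    # Store the words alphabetized, as the article describes.
    return dict(sorted(index.items()))


def lookup(index, word):
    """Find every location of a word without touching the text itself."""
    return index.get(word.lower(), [])


INDEX = build_index(ARTICLES)
print(lookup(INDEX, "Transylvania"))
```

The expensive pass over all the text happens once, when the index is built. Each later search is a single lookup in the table, which is why it can stay fast however large the text grows.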
Interestingly, the fully mapped index takes up 50 megabytes, almost as large as the 58 megabytes of the encyclopedia itself. However, the entire encyclopedia and index only require one-fifth of a standard compact disk!
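These numbers hang together, as a quick calculation in Python shows. It is a rough check only, assuming that one megabyte equals 1,024K and using the 32K-per-illustration figure Rolander gave; the variable names are ours.

```python
DISK_MB = 540          # capacity of the CD ROM
TEXT_MB = 58           # encyclopedia text
INDEX_MB = 50          # fully mapped index
ILLUSTRATION_KB = 32   # memory per digitized illustration

used_mb = TEXT_MB + INDEX_MB
fraction_used = used_mb / DISK_MB
free_mb = DISK_MB - used_mb

# Assumption: 1 megabyte = 1,024K.
illustrations_that_fit = free_mb * 1024 // ILLUSTRATION_KB

print(used_mb, fraction_used, free_mb, illustrations_that_fit)
```

Text and index together come to 108 megabytes, exactly one-fifth of 540. The remaining 432 megabytes would hold a little under 14,000 pictures at 32K each, in line with the 13,000 illustrations Rolander mentioned.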
The encyclopedia text files must be usable with video monitors that have different resolution formats. So the software formats the text in real time as it is going into display.
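Formatting text at display time can be as simple as re-wrapping each paragraph to the width of the current screen. The Python sketch below uses the standard `textwrap` module to show the idea; the function name and the 40- and 80-column widths are our assumptions, not details of the Facts and Figures program.

```python
import textwrap

PARAGRAPH = ("The encyclopedia text files must be usable with video monitors "
             "that have different resolution formats.")


def format_for_display(paragraph, columns):
    """Break one stored paragraph into lines no wider than the screen."""
    return textwrap.wrap(paragraph, width=columns)


# Assumed screen widths: 40 columns and 80 columns.
for columns in (40, 80):
    print("-" * columns)
    for line in format_for_display(PARAGRAPH, columns):
        print(line)
```

The stored text carries no line breaks of its own, so the same file serves every resolution.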
"To keep the program moving fast, it calls up very large buffers," said Rolander. "In fact, it will use whatever free memory is available." The storage buffer requires a minimum of 64K, and the Facts and Figures software will also need its own 64K of RAM.
Transfer rate of the CD ROM is 150 kilobytes per second. An important design element of the ST, to speed this huge data transfer, is the DMA (Direct Memory Access) chip. And it's no accident the ST has this capability. Rolander and Atari ST hardware designer Shiraz Shivji worked closely together once it was decided the ST would have CD ROM as a peripheral.
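The transfer rate also shows why the index matters. The Python sketch below estimates how long it would take simply to read the text from end to end at 150 kilobytes per second; it ignores seek time and assumes one megabyte equals 1,024 kilobytes, and the function name is ours.

```python
TRANSFER_KB_PER_SECOND = 150   # from the article
TEXT_MB = 58                   # encyclopedia text
DISK_MB = 540                  # whole disk


def seconds_to_read(megabytes):
    """Time to stream this much data at the CD ROM transfer rate."""
    return megabytes * 1024 / TRANSFER_KB_PER_SECOND


text_seconds = seconds_to_read(TEXT_MB)
disk_seconds = seconds_to_read(DISK_MB)

print(f"encyclopedia text: {text_seconds / 60:.1f} minutes")
print(f"entire disk: {disk_seconds / 60:.1f} minutes")
```

Reading the encyclopedia text straight through would take about six and a half minutes, and the whole disk about an hour. A three-second search is only possible because the program reads a small part of the index instead.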
CD FILE FORMAT
The format of a standard floppy disk consists of tracks in concentric rings, each consisting of a number of sectors. Optical laser disks have two standard formats: CAV (Constant Angular Velocity) and CLV (Constant Linear Velocity).
CAV is similar to floppy disk formats. The tracks are concentric rings, each containing a number of sectors-except the sectors are called "frames" or "blocks." The CAV format wastes a great deal of space. The outside tracks are longer, but they contain the same number of blocks as the shorter inside tracks. However, CAV is easier to program for read-write access, and some laser video players use this method because it permits "freeze-frame."
CLV is a spiral format, much like a phonograph record. All the blocks in CLV are equidistant along one long spiral. So there are three times as many blocks per track at the outer edge as there are towards the center. The CD ROM's 540 megabytes in CLV format are divided into 270,000 blocks, with 2,048 bytes in each block. CLV is the format of CD audio and some video players. Rolander chose the CLV format for his CD ROM system because it permits far more storage.
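How much more storage does CLV permit? The Python sketch below compares the two formats on a simplified model of our own: the disk is treated as a set of rings, a CAV ring always holds what the innermost ring holds, and a CLV ring holds blocks in proportion to its length. The three-to-one ratio and the block figures come from the text above; the function name and the ring count are hypothetical.

```python
BLOCKS = 270_000       # blocks on the disk (from the article)
BLOCK_BYTES = 2_048    # bytes per block (from the article)


def clv_to_cav_ratio(outer_to_inner=3.0, rings=20_000):
    """Capacity of CLV relative to CAV on the same disk.

    Each ring's length grows evenly from 1 (innermost) to
    outer_to_inner (outermost), measured in innermost-ring units.
    """
    cav = rings * 1.0
    clv = sum(1.0 + (outer_to_inner - 1.0) * i / (rings - 1)
              for i in range(rings))
    return clv / cav


ratio = clv_to_cav_ratio()
total_bytes = BLOCKS * BLOCK_BYTES

print(f"CLV holds about {ratio:.2f} times as much as CAV")
print(f"{BLOCKS:,} blocks x {BLOCK_BYTES:,} bytes = {total_bytes:,} bytes")
```

On this model CLV stores about twice as much as CAV on the same disk. The block arithmetic gives 552,960,000 bytes, which matches the quoted 540 megabytes to within the usual looseness about what a megabyte is.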
AND THE FUTURE
Personally we can't wait until something like the microfilm library of the "New York Times" becomes available on CD ROM so that we can browse among odd and obscure facts to our heart's content.
At the same time, we have told our typesetter (the same one since Antic began) never to erase any of the magazine's floppy disk files from now on. It would not be a bad idea to bring out a CD ROM disk containing every issue of Antic. All topics and all listings ever printed in the magazine would be instantly accessible via the CD ROM database.
And while we are at it, we might as well include every program in the Antic public domain library on the same disk...
WHAT'S A CD?
CD stands for compact disk, which has become the commonly used term for a digitally recorded audio disk that is read by a beam of laser light.
This new digital recording technology has great potential for unprecedentedly fast, accessible, high-density data storage.
Digital audio has become popular very quickly because it reproduces original sound with remarkable accuracy and dynamic range. Compact disks also have virtually no added background noise or distortion. No stylus ever touches a CD to wear it out, and there is no tape hiss.
Digital music recording involves sampling the audio signal thousands of times a second. These samples are translated into binary code-the zeroes and ones that make up the bytes in your computer memory. A laser beam cuts the binary code onto a master disk in microscopic "pits."
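Sampling and binary coding can be imitated in a few lines of Python. The sketch below measures a pure tone at the CD audio rate of 44,100 samples per second and writes each measurement as a 16-bit binary number. It is a simplified illustration with names of our own choosing, and it leaves out the further encoding a real disk applies before the pits are cut.

```python
import math

SAMPLE_RATE = 44_100   # CD audio samples per second, per channel
BITS = 16              # bits per CD audio sample


def sample_tone(freq_hz, n_samples):
    """Measure a sine tone n_samples times, as signed 16-bit integers."""
    peak = 2 ** (BITS - 1) - 1
    return [round(peak * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(n_samples)]


def to_binary(sample):
    """Write one sample as 16 zeroes and ones (two's complement)."""
    return format(sample & 0xFFFF, "016b")


# A 441 Hz tone completes one cycle every 100 samples at this rate.
for value in sample_tone(441, 5):
    print(f"{value:7d}  {to_binary(value)}")
```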
Compact disks are then pressed from the master and coated with a thin layer of aluminum. They also have a protective layer of plastic that makes the CD extremely hard to damage. A CD player reads the coded pits by using a laser head and a mirror that focuses the light onto an optical sensor.