Classic Computer Magazine Archive COMPUTE! ISSUE 131 / JULY 1991 / PAGE 84

Why PC can't read. (grammar checkers)
by Steven Anzovin

Hal 9000 could do it. So could the Terminator, not to mention the Robot in Lost in Space. These Hollywood computers could read, use a phone book, and even read lips. In the real world, we also want our computers to understand what we write and say and to return enlightening responses. Unfortunately, they don't understand a word.

That's what computing newcomer Daniel Lombardo, curator of the Emily Dickinson Collection in the poet's hometown, Amherst, found out when he tried the popular grammar-checking program Grammatik IV (Reference Software International, 330 Townsend Street, Suite 123, San Francisco, California 94107; 800-872-9933, 415-541-0222). Danny was writing an article about Dickinson on his new PC, and he ran the article through Grammatik's battery of grammar, style, mechanics, and spelling analyzers.

Like other grammar and style checkers, Grammatik offers suggestions for improving text based on rules developed by academic writing experts. The program's internal rules couldn't cope with Dickinson's writing, however. About Dickinson's poem on a hummingbird--

A Route of Evanescence

With a revolving Wheel--

A Resonance of Emerald--

A Rush of Cochineal--

And every Blossom on the Bush

Adjusts its tumbled Head--

The mail from Tunis, probably,

An easy Morning's Ride--

Grammatik said, This may be an incomplete sentence. Long sentences can be difficult to understand. Consider revising so that no more than one complete thought is expressed in each sentence.

The use of in case in a line from one of Dickinson's letters, "I found abundance of candy in my stocking, which I do not think had the anticipated effect upon my disposition, in case it was to sweeten it," prompted this response: Hackneyed, Cliche, or Trite . . . Avoid cliches, they distract the reader and weaken your message. Cliches are a symptom of lazy writing.

As Danny remarked sarcastically, "After 30 years, the great critic Thomas Wentworth Higginson was still bewildered by Emily's writing. Grammatik got right to the point in a microsecond. She was lazy."

Grammatik is actually one of the better programs of its kind on the market and can be a real help to expository writers--not poets--trying to learn their craft. Reference Software doesn't claim Grammatik will make a computer "understand" your writing any more than a paint program can critique your artwork.

Danny's experience points to a more general problem in what's called natural language processing, the yet-to-be-achieved ability of computers to understand everyday language. Computers work by rules, called algorithms, and many theorists of artificial intelligence think the human brain works much the same way. In this view, the only important difference between brains and computers is in the brain's greater complexity and adaptability. Make computers more complex, faster, and better able to learn, and natural language processing should follow--you merely need to feed in the right language rules. The same rules Grammatik now uses to analyze a poem are the primitive precursors of a system that may someday allow computers to read natural human language.
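To see why rules of this kind fall short of reading, consider a minimal sketch of a rule-based checker. It is not Grammatik's actual method; the word limit, the phrase list, and the function name check are all assumptions invented for illustration. Each rule matches surface features of the text and knows nothing about what the words mean.

```python
import re

# Hypothetical rules, invented for illustration; not Grammatik's real rule set.
MAX_WORDS = 25
STOCK_PHRASES = ("in case", "at this point in time")


def check(text):
    """Return (rule_name, offending_text) pairs for surface-level matches."""
    findings = []
    # Naive sentence split: break after ., !, or ? when whitespace follows.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        # Rule 1: flag any sentence longer than the word limit.
        if len(sentence.split()) > MAX_WORDS:
            findings.append(("long sentence", sentence))
        # Rule 2: flag any listed phrase, regardless of how it is used.
        lowered = sentence.lower()
        for phrase in STOCK_PHRASES:
            if phrase in lowered:
                findings.append(("stock phrase", phrase))
    return findings


letter = ("I found abundance of candy in my stocking, which I do not think "
          "had the anticipated effect upon my disposition, in case it was "
          "to sweeten it.")
for rule, detail in check(letter):
    print(rule, "->", detail)
```

Run on the line from Dickinson's letter, this sketch flags both a long sentence and a stock phrase. The rules fire because of word counts and substring matches alone, which is exactly the gap between applying language rules and understanding language.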

But research shows that reading isn't as simple as it appears; it requires a knowledge of how the world works, not just the rules of language. Some experts estimate that an ordinary, common-sense understanding of the world may actually require a knowledge base of as many as 10 million instantly accessible rules of thumb. But because language evolves over time and varies in usage with each writer and speaker, it may not be possible to define all the rules.

To get a sense of the difficulties involved in natural language processing, first remember what it was like to learn how to read in grammar school. Now imagine attempting the same complex task having lived your life in a featureless box with no speaking ability and having the innate language capabilities of a gnat. A daunting prospect.

Clever programming can yield software that gives the appearance of language understanding. MS-DOS seems to "know" what you mean when you type dir, and that's what misleads computing neophytes. Also, it doesn't help that movie robots all talk fluently, only occasionally stumbling over human colloquialisms. However, most researchers in the field of natural language processing are just beginning to admit that a real-world computer capable of understanding text, including poetry, on a human level is probably decades away.

So why can't PC read? Because we don't know how we do it ourselves, and until we know better, our computers will be unable, in Dickinson's words, to "expound the skies."