Classic Computer Magazine Archive COMPUTE! ISSUE 72 / MAY 1986 / PAGE 100

The Beginner's Page

Tom R. Halfhill, Editor

String Comparisons

As we've pointed out more than once in the past few columns, computers really know nothing about our written language of alphabetic characters, punctuation marks, and symbols; they are capable of dealing only with numbers. Although this means computers have to spend a lot of time translating things for our convenience (and vice versa), it also means that computers can perform "arithmetic" on character strings.
    This concept seems a little strange at first, because we're used to thinking of the written word and mathematics as two different, incompatible languages. After all, a phrase such as "The quick brown fox jumped over the lazy dogs" is just as meaningless in mathematics as the phrase "X=(Y+2)*(Z/4)" is in English. But since a computer sees "The quick brown fox..." as merely a string of numbers (character codes), we can write programs that perform a kind of arithmetic on what appears to us as strings of characters. Here's an example:

IF "A"<"B" THEN PRINT "IT WORKS!"

    When you press RETURN, the result is the message IT WORKS!.
    Notice the subtle yet vital difference between this line and the statement IF A<B THEN PRINT "IT WORKS!". Although both statements are comparing two values with an arithmetic operator (<, the less-than sign), the first statement isn't comparing two numeric values; it's comparing two character values.
    At least, that's how it looks on the surface. From the computer's point of view, two numbers (character codes) are still being compared. The character A is "less than" the character B because the character code for A is a smaller number than the character code for B. You can confirm this by typing PRINT ASC("A") and PRINT ASC("B"); the character codes are 65 and 66, respectively. (See the February 1986 "Beginner's Page" for more details on ASCII character codes.) It's easy to remember that the letter A is less than the letter B, because A precedes B in the alphabet. But keep in mind that it's really the character codes, not the alphabetical positions, that count. Consider this example:

IF "A">"a" THEN PRINT "IT WORKS!"


    When you enter this statement, you might expect to see the message IT WORKS!. Alphabetically, the uppercase letter A should take precedence over the lowercase letter a. But it doesn't work that way on most computers. Instead, the IF-THEN test fails; A is not greater than a. Why? Because the character codes for uppercase letters are numbered from 65 to 90, and the codes for lowercase letters are numbered from 97 to 122. (Yes, it's odd.) Therefore, A (65) is less than a (97). The statement above is really the equivalent of this:

IF ASC("A")>ASC("a") THEN PRINT "IT WORKS!"

which, in turn, is the equivalent of this:

IF 65>97 THEN PRINT "IT WORKS!"

    As long as the computer can figure out that 65 isn't greater than 97, it doesn't have to know anything about alphabets.
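
    The same chain of equivalences can be checked in a modern language. What follows is a minimal sketch in Python, which is not one of the BASIC dialects this column covers. It assumes Python's built-in ord() plays the role of ASC, and the helper name codes_greater is made up for illustration.

```python
# Illustrative sketch in Python, not BASIC: ord() stands in for ASC().

def codes_greater(left, right):
    """Compare two single characters by character code, as ASC would."""
    return ord(left) > ord(right)

print(ord("A"), ord("a"))        # character codes of the two letters
print("A" > "a")                 # direct string comparison
print(codes_greater("A", "a"))   # same test, spelled out with codes
```

    Both comparisons give the same answer, because the direct string comparison is decided by the character codes.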
    Incidentally, you'll get different results if you try some of these examples on Commodore computers (except the Amiga). Commodore machines assign character codes a bit differently than other computers do. Normally, the Commodore 64, 128, and VIC-20 don't display upper/lowercase characters; you have to press the SHIFT and Commodore keys together to switch to this mode. This renumbers the lowercase character set from 65 to 90 and the uppercase set from 193 to 218. So on a Commodore, the uppercase letters are indeed "greater than" the lowercase letters.
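
    To see why the comparison flips, here is a rough sketch in Python (not Commodore BASIC) that applies only the numbering described above for upper/lowercase mode: lowercase letters at 65 to 90 and uppercase letters at 193 to 218. The function name commodore_code is hypothetical, and the sketch ignores every character that isn't a letter.

```python
# Illustrative sketch in Python; models only the letter codes quoted in the text.

def commodore_code(ch):
    """Return the upper/lowercase-mode code for one unaccented letter.

    Assumption: lowercase a-z map to 65-90 and uppercase A-Z to 193-218,
    as the paragraph above states. Anything else raises ValueError.
    """
    if len(ch) == 1 and "a" <= ch <= "z":
        return 65 + (ord(ch) - ord("a"))
    if len(ch) == 1 and "A" <= ch <= "Z":
        return 193 + (ord(ch) - ord("A"))
    raise ValueError("only the letters A-Z and a-z are modeled")

print(commodore_code("A"), commodore_code("a"))
print(commodore_code("A") > commodore_code("a"))
```

    Under this numbering the uppercase A carries the larger code, so the test that failed earlier would succeed.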
    Other types of comparisons are possible with strings, too. Try these:

IF "OK"="OK" THEN PRINT "OK"

IF "DIAGNOSTIC TEST"<>"DIAGNOSTIC TEST" THEN PRINT "YOU'VE GOT A HARDWARE PROBLEM"

IF "DOG">"CAT" THEN PRINT "TOLD YA SO"
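
    With strings longer than one character, the comparison is normally settled by the first position where the two strings differ. The sketch below, written in Python rather than BASIC, spells that rule out. The name compare_by_codes is a hypothetical helper, and the rule that a shorter string sorts before a longer one with the same beginning is an assumption that may not hold in every BASIC dialect.

```python
# Illustrative sketch in Python, not BASIC.

def compare_by_codes(left, right):
    """Return -1, 0, or 1 by comparing character codes left to right."""
    for a, b in zip(left, right):
        if ord(a) != ord(b):
            return -1 if ord(a) < ord(b) else 1
    # All shared positions match, so the shorter string counts as smaller.
    if len(left) == len(right):
        return 0
    return -1 if len(left) < len(right) else 1

print(compare_by_codes("DOG", "CAT"))   # D (68) versus C (67) decides it
print(compare_by_codes("OK", "OK"))
print(compare_by_codes("CAT", "CATS"))
```

    A result of 1 means the first string is "greater," 0 means the two are equal, and -1 means the first string is "less."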

    All of the examples we've seen so far compare string literals. Of course, you can also compare characters stored in string variables:

10 DIM A$(5),B$(5):REM This line for Atari only
20 A$="<"
30 B$=">"
40 IF A$<B$ THEN PRINT "< IS LESS THAN >"

    String arithmetic isn't limited to comparisons. Next month, we'll see how you can add two strings together in various versions of BASIC, and cover some remaining string functions as well.