Starting out

FIRST LESSON IN ASSEMBLY LANGUAGE

Excerpt from Atari Roots - a terrific new book

by MARK ANDREWS

This article is an excerpt from Atari Roots - A Guide to Atari Assembly Language. Written by Mark Andrews and published by Datamost ($14.95), this new paperback book is an excellent introduction to Assembly Language programming for Atari computerists.

Start programming immediately in machine language! Turn on your Atari computer and type in this program. Then RUN it, type a few words, and you'll see something very interesting on your computer screen.

10 REM**"D:HEADSUP.BAS"**
20 REM**A MACHINE LANGUAGE PROGRAM**
30 REM**THAT YOU CAN RUN**
40 REM**STANDING ON YOUR HEAD**
50 REM
60 GRAPHICS 0:PRINT
100 POKE 755,4
110 OPEN #1,4,0,"K:"
120 GET #1,K
130 PRINT CHR$(K);
140 GOTO 120

This is, of course, a BASIC program. Line 60 clears your computer screen with a GRAPHICS 0 command. Line 110 opens the Atari keyboard as an input device. Then, in lines 120 through 140, there is a loop that prints typed-in characters on your screen. But the most important line in this program, the line that makes it do what it's supposed to do, is line 100.
The active ingredient of line 100, the instruction POKE 755,4, is actually a machine language instruction. In fact, all POKE commands in BASIC are machine language instructions. When you use a POKE command in BASIC, what you're actually doing is storing a number in a specific memory location in your computer. And when you store a number in a specific memory location in your computer, what you're doing is using machine language.
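To make the idea of storing a number in a memory location concrete, here is a minimal sketch written in Python rather than Atari BASIC. It models memory as 65,536 one-byte cells, the most a 6502 can address. The names memory, poke, and peek are illustrative assumptions; only the address 755 and the value 4 come from the listing above.

```python
# A toy model of a computer's memory: 65,536 cells, each holding 0-255.
# This is an illustration only; it does not touch real Atari hardware.
MEMORY_SIZE = 65536
memory = bytearray(MEMORY_SIZE)

def poke(address, value):
    """Store one number (0-255) in one memory location, like BASIC's POKE."""
    if not 0 <= address < MEMORY_SIZE:
        raise ValueError("address out of range")
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    memory[address] = value

def peek(address):
    """Read back the number stored in one memory location, like BASIC's PEEK."""
    return memory[address]

poke(755, 4)        # the same store that line 100 performs
print(peek(755))    # prints 4
```

On a real Atari, of course, the store has a visible effect because the hardware reads location 755; in this sketch the number simply sits in the array.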

UNDER THE HOOD OF YOUR ATARI
Every computer has three main parts: a Central Processing Unit (CPU), memory (usually divided into two blocks called Random Access Memory (RAM) and Read Only Memory (ROM)), and Input/Output (I/O) devices.
Your Atari's main input device is its keyboard. Its main output device is its video monitor. Other I/O devices that an Atari computer can be connected to (or interfaced with) include telephone modems, graphics tablets, cassette data recorders, and disk drives. In a microcomputer, all of the functions of a central processing unit are contained in a MicroProcessor Unit (or MPU). Your Atari computer's MPU, as well as its CPU (Central Processing Unit), is a circuit using Large Scale Integration (LSI) called a 6502 microprocessor.

THE 6502 FAMILY
The 6502 microprocessor, your computer's command center, is used not only in Atari computers, but also in personal computers manufactured by Apple, Commodore, and Ohio Scientific. That means, of course, that 6502 assembly language can also be used to program many different personal computers.

THE FOUNTAINS OF ROM
Your computer has two kinds of memory: Random Access Memory (RAM) and Read Only Memory (ROM). ROM is your Atari's long-term memory. It was installed in your computer at the factory, and it's as permanent as your keyboard. Your computer's ROM is permanently etched into a certain group of chips, so it never gets erased, even when the power is turned off. For most home computer owners, that's a good thing. Without its ROM, your Atari wouldn't be an Atari. In fact, it wouldn't be much more than an expensive, high tech doorstop.
The biggest block of memory in ROM is the block that holds your computer's Operating System, or OS. Your Atari's Operating System is what enables it to do all of those wonderful things that Ataris are supposed to do, such as accepting inputs from the keyboard, displaying characters on the screen, and so on. ROM is also what enables your computer to communicate with peripherals such as disk drives, cassette recorders, and telephone modems. If you own one of Atari's XL series of computers, your unit's ROM package also contains a number of added features, such as a built-in self-diagnostic system, a built-in foreign language character set, and built-in BASIC.

RAM IS FLEETING
ROM, as you can imagine, was not built in a day. Your Atari's ROM package is the result of a lot of work by a lot of assembly language programmers. RAM, on the other hand, can be written by anybody - even you. RAM is your computer's main memory. It has a lot more memory cells than ROM does, but RAM, unlike ROM, is fleeting. The trouble with RAM is that it's erasable, or, as a computer engineer might put it, volatile.
When you turn your computer on, the block of memory inside it that's reserved for RAM is as empty as a blank sheet of paper. And when you turn your computer off, anything you may have in RAM disappears. That's why most computer programs have to be loaded into RAM from mass storage devices such as cassette data recorders and disk drives. After you've written a program, you have to store it somewhere so it won't be erased when the power goes off and erases your RAM.
Your computer's RAM, or main memory, can be visualized as a huge grid made up of thousands of compartments, or cells, something like tiers upon tiers of post office boxes along a wall. Each cell in this vast memory matrix is called a memory location, and each memory location, like each box in a post office, has an individual and unique memory address. The analogy between computers and post office boxes doesn't end there. A computer program, like an expert postal worker putting mail in post office boxes, can get to any location in its memory about as quickly as it can get to any other. In other words, it can access any location in its memory at random. And that's why user-addressable memory in a computer is known as random access memory.

ITS "LETTERS" ARE NUMBERS
Our post office analogy isn't absolutely perfect, however. A post office box can be stuffed full of letters, but each memory location in a computer's memory can hold only one number. And that number can represent only one of three things:

1. The stored number itself.
2. A code representing a typed character.
3. A machine language instruction.
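As a small illustration of the list above, the Python sketch below (not part of the original book) reads the same stored number three ways. The OPCODES table and the readings function are hypothetical helpers; the table holds only the codes for JSR and RTS, the two instructions whose codes are given later in this article.

```python
# One stored number, three possible readings.
# OPCODES is a deliberately tiny, illustrative table, not a full 6502 list.
OPCODES = {0x20: "JSR", 0x60: "RTS"}

def readings(value):
    """Return the three interpretations of one byte value."""
    as_number = value
    as_character = chr(value)                 # 32 is a space in ASCII and ATASCII
    as_instruction = OPCODES.get(value, "?")  # "?" means not in our tiny table
    return as_number, as_character, as_instruction

print(readings(32))   # prints (32, ' ', 'JSR')
```

Nothing in the number 32 itself says which reading is intended; that depends entirely on what the program does with it.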

WHAT NEXT?
When a computer goes to a memory location and finds a number, it must be told what to do with the number it finds. If the number equates to just a number, then the computer must be told why the number is there. If the number is a code representing a typed character, then the computer must be told how the character is to be used. And if the number is to be interpreted as a machine language instruction, the computer must be told that, too.

ITS INSTRUCTIONS ARE PROGRAMS
The instructions that computers are given so that they can find and interpret the numbers stored in their memories are called computer programs. People who write programs are, of course, called programmers. The languages that programs are written in are called programming languages. Of all the programming languages, assembly language is the most comprehensive.

RUNNING A MACHINE LANGUAGE PROGRAM
When your computer runs a program, the first thing it has to be told is where the program has been stored in its memory. Once it has that information, it can go to the memory address where the program begins and take a look at what's there. If the computer finds an instruction that it's programmed to understand, then it will carry out that instruction. The computer will then move on to the next address in its memory. After it follows the instruction it finds there, it will move on to the next address, and so on.
The computer will repeat this process of carrying out an instruction and moving on to the next one until it reaches the end of whatever program has been stored in its memory. Then, unless it encounters an instruction to return to an address within the program or to jump to a new address, it will simply sit there, patiently waiting to receive another instruction.

COMPUTER LANGUAGES
As you know, programs can be written in dozens of computer languages such as BASIC, COBOL, Pascal, Logo, and so on. Languages like these are called high level languages, not because they're particularly esoteric or profound, but because they're written at too high a level for a computer to understand. A computer can actually understand only one language, machine language, which is written entirely in numbers. So before a computer can run a program written in a high level language, the program must somehow be translated into machine language.
Programs written in high level languages are usually translated into machine language using software packages called interpreters or compilers. An interpreter is a piece of software that translates a program one statement at a time, as the program is being run.
A compiler converts high level languages into machine language after they are written. COBOL, Pascal and other high level languages are usually translated into machine language with compilers.

MACHINE LANGUAGE ASSEMBLERS
Interpreters and compilers are not used in writing assembly language programs. Assembly language programs are almost always written with the aid of software packages called assemblers. A number of assemblers for Atari computers are available, including OSS's very advanced MAC/65. An assembler doesn't work like an interpreter, or like a compiler. That's because assembly language is not a high level language.
One could say, in fact, that assembly language is not really a programming language at all. Actually, assembly language is nothing more than a notation system used for writing machine language programs using alphabetical symbols that human programmers can understand.
What we're trying to get across here is the fact that assembly language is totally different from every other programming language. When a high level language is translated into machine language by an interpreter or a compiler, one instruction in the original programming language can easily equate to dozens - sometimes even hundreds - of machine language instructions. When you write a program in assembly language, however, every assembly language instruction that you use equates to just one machine language instruction with exactly the same meaning. In other words, there is an exact one-to-one relationship between assembly language instructions and machine language instructions.

THE PROGRAMMER'S PLIGHT
Ironically, even though assembly language programs run much faster than programs written in high level languages, they require many more instructions and take much longer to write. One widely quoted estimate is that it takes an expert programmer about ten times as long to write an assembly language program as it would take him (or her) to write the same program in a high level language such as BASIC, COBOL, or Pascal.
On the other hand, assembly language programs run 10 to 1000 times faster than BASIC programs, and can do things that BASIC programs can't do at any speed. So if you want to become an expert programmer, you really have no choice but to learn assembly language.

HOW MACHINE LANGUAGE WORKS
Machine language, like every other computer language, is made up of instructions. As we have pointed out, however, every instruction used in machine language is a number. The numbers that computers understand are not the kind that we're accustomed to using. Computers think in binary numbers - numbers that are nothing but strings of ones and zeros. Here, for example, is part of an actual computer program written in binary numbers (the kind of numbers that a computer understands):

00011000
11011000
10101001
00000010
01101001
00000010
10000101
11001011
01100000

It doesn't take much imagination to see that you'd be in for quite a struggle if you had to write long programs, which typically contain thousands of instructions, in binary style machine language. With an assembler, however, the job of writing a machine language program is considerably easier. Here, for example, is the above program as it would appear if you wrote it in assembly language:

CLC
CLD
LDA #02
ADC #02
STA $CB
RTS

You may not understand all of that, but you'll have to admit that it at least looks more comprehensible. What this program does, by the way, is add 2 and 2. Then it stores the result of its calculation in a certain memory location in your computer - specifically, memory address 203.
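To check that claim, here is a minimal sketch in Python of the fetch-and-execute cycle, handling only the six instructions this routine uses. It is a simplification under stated assumptions: the function name run and the starting address 1536 are illustrative choices, flags other than carry are ignored, and decimal mode is tracked but never acted on.

```python
# The nine bytes of the sample routine, as listed in binary in the article.
PROGRAM = [0b00011000, 0b11011000, 0b10101001, 0b00000010,
           0b01101001, 0b00000010, 0b10000101, 0b11001011, 0b01100000]

def run(program, start=1536):
    """Execute a tiny subset of 6502 machine language and return memory."""
    memory = bytearray(65536)
    memory[start:start + len(program)] = bytes(program)
    pc = start          # program counter: address of the next instruction
    a = 0               # accumulator
    carry = 0           # carry flag
    decimal = 0         # decimal-mode flag (only tracked, never used)
    while True:
        opcode = memory[pc]
        if opcode == 0x18:              # CLC: clear carry
            carry = 0
            pc += 1
        elif opcode == 0xD8:            # CLD: clear decimal mode
            decimal = 0
            pc += 1
        elif opcode == 0xA9:            # LDA #n: load accumulator
            a = memory[pc + 1]
            pc += 2
        elif opcode == 0x69:            # ADC #n: add with carry
            total = a + memory[pc + 1] + carry
            carry = 1 if total > 255 else 0
            a = total & 0xFF
            pc += 2
        elif opcode == 0x85:            # STA zp: store accumulator in page zero
            memory[memory[pc + 1]] = a
            pc += 2
        elif opcode == 0x60:            # RTS: treated here as "stop"
            return memory
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x} at {pc}")

memory = run(PROGRAM)
print(memory[203])   # prints 4
```

The loop does exactly what a running program does: look at the number at the current address, carry out the instruction it stands for, and move on to the next address.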

ASSEMBLY LANGUAGE AND BASIC COMPARED
Assembly language is written using three-letter instructions called mnemonics. Some mnemonics are quite similar to BASIC instructions. One assembly language instruction that's much like a BASIC instruction is RTS, the last instruction in the sample routine we just looked at. RTS (written 0110 0000 in machine language) means "ReTurn from Subroutine." It's used much like the RETURN instruction in BASIC. There's also an assembly language mnemonic that's similar to BASIC's GOSUB instruction. It's written JSR, and means "Jump to SubRoutine." Its equivalent in binary coded machine language is 0010 0000.
Not all assembly language instructions bear such a close resemblance to BASIC instructions, however. An assembly language instruction never tells a computer to do something as complex as draw a line or print a letter on a screen, for example. Instead, most assembly language mnemonics instruct computers to carry out very elementary tasks such as adding two numbers, comparing two pieces of data, or (as we have seen) jumping to a subroutine. That's why it often takes vast numbers of assembly language instructions to equal just one or two words in a high level language.

SOURCE CODE AND OBJECT CODE
When you write an assembly language program, the listing that you produce is called source code, since it's the source from which a machine language program will be produced. Once you've written an assembly language program in source code, you can run it through an assembler. The assembler will then convert it into object code, which is just another name for a machine language program produced by an assembler.
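The source-to-object step can be sketched in a few lines of Python. This is a toy, not a real assembler such as MAC/65: the OPCODES table and the assemble and parse_number functions are hypothetical names, and the table covers only the six instructions in the sample routine shown earlier.

```python
# Toy assembler: source code in, object code out.
# Keys pair a mnemonic with an addressing mode; values are 6502 opcodes.
OPCODES = {
    ("CLC", "implied"): 0x18,
    ("CLD", "implied"): 0xD8,
    ("LDA", "immediate"): 0xA9,
    ("ADC", "immediate"): 0x69,
    ("STA", "zero page"): 0x85,
    ("RTS", "implied"): 0x60,
}

def parse_number(text):
    """Read a number; a leading $ marks hexadecimal, otherwise decimal."""
    return int(text[1:], 16) if text.startswith("$") else int(text)

def assemble(source):
    """Translate each source line into one machine language instruction."""
    object_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        if len(parts) == 1:                       # no operand
            object_code.append(OPCODES[(mnemonic, "implied")])
        elif parts[1].startswith("#"):            # #n means a literal value
            object_code.append(OPCODES[(mnemonic, "immediate")])
            object_code.append(parse_number(parts[1][1:]))
        else:                                     # otherwise a page-zero address
            object_code.append(OPCODES[(mnemonic, "zero page")])
            object_code.append(parse_number(parts[1]))
    return object_code

SOURCE = """
CLC
CLD
LDA #02
ADC #02
STA $CB
RTS
"""

for byte in assemble(SOURCE):
    print(format(byte, "08b"))   # one 8-bit binary number per line
```

Each source line produces exactly one instruction (plus its operand, if any), and the nine numbers printed match the binary listing given earlier in this article.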

THE SPEED AND EFFICIENCY OF MACHINE LANGUAGE
Since assembly language instructions are so specific (you might even say primitive), it obviously takes lots of them to make up a complete program: many, many more instructions than it would take to write the same program in a high level language. Ironically, machine language programs still take up less memory space than programs written in high level languages do.
That's because when a program written in a high level language is interpreted or compiled into machine language, big blocks of machine code must be repeated every time they are used. But in a well-written assembly language program, a routine that's used over and over can be written just once, and then addressed as many times as needed with JSR, RTS, and similar commands. Many other kinds of techniques can also be used to conserve memory in assembly language programs.
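A rough byte count shows how much a shared routine can save. In the Python sketch below, the routine size and the number of uses are made-up figures for illustration; the only hard facts are that a 6502 JSR to an absolute address occupies three bytes and an RTS occupies one.

```python
JSR_BYTES = 3   # opcode plus a two-byte address
RTS_BYTES = 1   # opcode only

def inline_size(routine_bytes, uses):
    """Bytes needed when the routine's code is repeated at every use."""
    return routine_bytes * uses

def subroutine_size(routine_bytes, uses):
    """Bytes needed when the routine is stored once and called with JSR."""
    return routine_bytes + RTS_BYTES + JSR_BYTES * uses

# Hypothetical example: a 40-byte routine used 10 times.
print(inline_size(40, 10))       # prints 400
print(subroutine_size(40, 10))   # prints 71
```

The longer the routine and the more often it is used, the bigger the saving.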

Mark Andrews writes a syndicated column about computers. Atari Roots is his 11th computer book. He owns five home computer systems and the Atari is his favorite.