Starting out
FIRST LESSON IN ASSEMBLY LANGUAGE
Excerpt from Atari Roots - a terrific new book
by MARK ANDREWS
This article is an excerpt from Atari Roots - A Guide to Atari Assembly Language. Written by Mark Andrews and published by Datamost ($14.95), this new paperback book is an excellent introduction to Assembly Language programming for Atari computerists.
Start programming immediately in machine language! Turn on your
Atari computer and type in this program. Then RUN it, type a few
words, and you'll see something very interesting on your computer screen.
10 REM**"D:HEADSUP.BAS"**
20 REM**A MACHINE LANGUAGE PROGRAM**
30 REM**THAT YOU CAN RUN**
40 REM**STANDING ON YOUR HEAD**
50 REM
60 GRAPHICS 0:PRINT
100 POKE 755,4
110 OPEN #1,4,0,"K:"
120 GET #1,K
130 PRINT CHR$(K);
140 GOTO 120
This is, of course, a BASIC program. Line 60 clears your computer
screen with a GRAPHICS 0 command. Line 110 opens the Atari keyboard
as an input device. Then, in lines 120 through 140, there is a loop
that prints typed-in characters on your screen. But the most important
line in this program, the line that makes it do what it's supposed to do,
is line 100.
The active ingredient of line 100, the instruction POKE
755,4, is actually a machine language instruction. In fact, all POKE
commands in BASIC are machine language instructions. When you use
a POKE command in BASIC, what you're actually doing is storing a number
in a specific memory location in your computer. And when you store
a number in a specific memory location in your computer, what you're doing
is using machine language.
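To make the idea of "storing a number in a memory location" concrete, here is a minimal sketch in Python, a modern language used purely for illustration. It models memory as 65,536 numbered cells, the most a 6502 can address. The names memory, poke, and peek are hypothetical stand-ins invented for this sketch, not Atari routines, and running it will not change anything on a real Atari screen.

```python
# A toy model of a 64K memory: one number (0-255) per numbered location.
# "memory", "poke" and "peek" are illustrative names, not Atari routines.
memory = bytearray(65536)

def poke(address, value):
    """Store a number in a memory location, as BASIC's POKE does."""
    memory[address] = value

def peek(address):
    """Read back the number stored in a memory location."""
    return memory[address]

poke(755, 4)       # the same store that line 100 of the BASIC program performs
print(peek(755))   # prints 4
```

On the real machine, the stored number takes effect because the hardware and Operating System consult that location. In this sketch it is simply a number sitting in a cell.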
UNDER THE HOOD OF YOUR ATARI
Every computer has three main parts: a Central Processing Unit (CPU),
memory (usually divided into two blocks called Random Access Memory (RAM)
and Read Only Memory (ROM)), and Input/Output (I/O) devices.
Your Atari's main input device is its keyboard.
Its main output device is its video monitor. Other I/O devices that
an Atari computer can be connected to (or interfaced with) include telephone
modems, graphics tablets, cassette data recorders, and disk drives.
In a microcomputer, all of the functions of a central processing unit are
contained in a MicroProcessor Unit (or MPU). In your Atari computer,
the MPU, and therefore the CPU, is a Large Scale Integration (LSI)
circuit called a 6502 microprocessor.
THE 6502 FAMILY
The 6502 microprocessor, your computer's command center, is used not only
in Atari computers, but also in personal computers manufactured by Apple,
Commodore, and Ohio Scientific. That means, of course, that 6502
assembly language can also be used to program many different personal computers.
THE FOUNTAINS OF ROM
Your computer has two kinds of memory: Random Access Memory (RAM) and
Read Only Memory (ROM). ROM is your Atari's long-term memory.
It was installed in your computer at the factory, and it's as permanent
as your keyboard. Your computer's ROM is permanently etched into
a certain group of chips, so it never gets erased, even when the power
is turned off. For most home computer owners, that's a good thing.
Without its ROM, your Atari wouldn't be an Atari. In fact, it wouldn't
be much more than an expensive, high-tech doorstop.
The biggest block of memory in ROM is the block that holds
your computer's Operating System, or OS. Your Atari's Operating System
is what enables it to do all of those wonderful things that Ataris are
supposed to do, such as accepting inputs from the keyboard, displaying
characters on the screen, and so on. ROM is also what enables your
computer to communicate with peripherals such as disk drives, cassette
recorders, and telephone modems. If you own one of Atari's XL series
of computers, your unit's ROM package also contains a number of added features,
such as a built-in self-diagnostic system, a built-in foreign language
character set, and built-in BASIC.
RAM IS FLEETING
ROM, as you can imagine, was not built in a day. Your Atari's
ROM package is the result of a lot of work by a lot of assembly language
programmers. RAM, on the other hand, can be written to by anybody -
even you. RAM is your computer's main memory. It has a lot
more memory cells than ROM does, but RAM, unlike ROM, is fleeting.
The trouble with RAM is that it's erasable, or, as a computer engineer
might put it, volatile.
When you turn your computer on, the block of memory inside
it that's reserved for RAM is as empty as a blank sheet of paper.
And when you turn your computer off, anything you may have in RAM disappears.
That's why most computer programs have to be loaded into RAM from mass
storage devices such as cassette data recorders and disk drives.
After you've written a program, you have to store it somewhere so it won't
be erased when the power goes off and erases your RAM.
Your computer's RAM, or main memory, can be visualized
as a huge grid made up of thousands of compartments, or cells, something
like tiers upon tiers of post office boxes along a wall. Each cell
in this vast memory matrix is called a memory location, and each memory
location, like each box in a post office, has an individual and unique
memory address. The analogy between computers and post office boxes
doesn't end there. A computer program, like an expert postal worker
putting mail in post office boxes, can get to any location in its memory
about as quickly as it can get to any other. In other words, it can access
any location in its memory at random. And that's why user-addressable
memory in a computer is known as random access memory.
ITS "LETTERS" ARE NUMBERS
Our post office analogy isn't absolutely perfect, however. A
post office box can be stuffed full of letters, but each memory location
in a computer's memory can hold only one number. And that number
can represent only one of three things:
1. The stored number itself.
2. A code representing a typed character.
3. A machine language instruction.
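As a rough illustration of these three readings, the Python sketch below takes one stored value and interprets it each way. The function name three_readings and the small OPCODES table are hypothetical, made up for this sketch; the opcode values in the table are the ones that appear in this article's binary listings. The character reading uses ASCII, which agrees with the Atari's own character code for the space character used in the example.

```python
# Opcode values taken from the binary listings quoted in this article.
# The table and function name are illustrative, not part of any Atari software.
OPCODES = {
    0x18: "CLC", 0xD8: "CLD", 0xA9: "LDA #", 0x69: "ADC #",
    0x85: "STA", 0x60: "RTS", 0x20: "JSR",
}

def three_readings(value):
    """Return the three ways one stored number could be interpreted."""
    as_number = value                          # 1. the number itself
    as_character = chr(value)                  # 2. a character code (ASCII reading)
    as_instruction = OPCODES.get(value, "?")   # 3. a machine language instruction
    return as_number, as_character, as_instruction

print(three_readings(32))   # prints (32, ' ', 'JSR')
```

The cell itself holds only the number 32. Whether that means "thirty-two," "a space," or "jump to a subroutine" depends entirely on what the program does with it.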
WHAT NEXT?
When a computer goes to a memory location and finds a number, it must
be told what to do with the number it finds. If the number equates
to just a number, then the computer must be told why the number is there.
If the number is a code representing a typed character, then the computer
must be told how the character is to be used. And if the number is
to be interpreted as a machine language instruction, the computer must
be told that, too.
ITS INSTRUCTIONS ARE PROGRAMS
The instructions that computers are given so that they can find and
interpret the numbers stored in their memories are called computer programs.
People who write programs are, of course, called programmers. The
languages that programs are written in are called programming languages.
Of all the programming languages, assembly language is the most comprehensive.
RUNNING A MACHINE LANGUAGE PROGRAM
When your computer runs a program, the first thing it has to be told
is where the program has been stored in its memory. Once it has that
information, it can go to the memory address where the program begins and
take a look at what's there. If the computer finds an instruction
that it's programmed to understand, then it will carry out that instruction.
The computer will then move on to the next address in its memory.
After it follows the instruction it finds there, it will move on to the
next address, and so on.
The computer will repeat this process of carrying out
an instruction and moving on to the next one until it reaches the end of
whatever program has been stored in its memory. Then, unless it encounters
an instruction to return to an address within the program or to jump to
a new address, it will simply sit there, patiently waiting to receive another
instruction.
COMPUTER LANGUAGES
As you know, programs can be written in dozens of computer languages
such as BASIC, COBOL, Pascal, Logo, and so on. Languages like these
are called high level languages, not because they're particularly esoteric
or profound, but because they're written at too high a level for a computer
to understand. A computer can actually understand only one language,
machine language, which is written entirely in numbers. So before
a computer can run a program written in a high level language, the program
must somehow be translated into machine language.
Programs written in high level languages are usually translated
into machine language using software packages called interpreters or compilers.
An interpreter is a piece of software that converts a program into machine
language a statement at a time, each time the program is run.
A compiler converts an entire high level language program into machine
language once, after it has been written and before it is run. COBOL, Pascal and other high level
languages are usually translated into machine language with compilers.
MACHINE LANGUAGE ASSEMBLERS
Interpreters and compilers are not used in writing assembly language
programs. Assembly language programs are almost always written with
the aid of software packages called assemblers. A number of assemblers
for Atari computers are available, including OSS's very advanced MAC/65.
An assembler doesn't work like an interpreter, or like a compiler.
That's because assembly language is not a high level language.
One could say, in fact, that assembly language is not
really a programming language at all. Actually, assembly language
is nothing more than a notation system used for writing machine language
programs using alphabetical symbols that human programmers can understand.
What we're trying to get across here is the fact that
assembly language is totally different from every other programming language.
When a high level language is translated into machine language by an interpreter
or a compiler, one instruction in the original programming language can
easily equate to dozens - sometimes even hundreds - of machine language
instructions. When you write a program in assembly language, however,
every assembly language instruction that you use equates to just one machine
language instruction with exactly the same meaning. In other words,
there is an exact one-to-one relationship between assembly language instructions
and machine language instructions.
THE PROGRAMMER'S PLIGHT
Ironically, even though assembly language programs run much faster
than programs written in high level languages, they require many more instructions
and take much longer to write. One widely quoted estimate is that
it takes an expert programmer about ten times as long to write an assembly
language program as it would take him (or her) to write the same program
in a high level language such as BASIC, COBOL, or Pascal.
On the other hand, assembly language programs run 10 to
1000 times faster than BASIC programs, and can do things that BASIC programs
can't do at any speed. So if you want to become an expert programmer,
you really have no choice but to learn assembly language.
HOW MACHINE LANGUAGE WORKS
Machine language, like every other computer language, is made up of
instructions. As we have pointed out, however, every instruction
used in machine language is a number. The numbers that computers
understand are not the kind that we're accustomed to using. Computers
think in binary numbers - numbers that are nothing but strings of ones
and zeros. Here, for example, is part of an actual computer program
written in binary numbers (the kind of numbers that a computer understands):
00011000
11011000
10101001
00000010
01101001
00000010
10000101
11001011
01100000
It doesn't take much imagination to see that you'd be in for quite a struggle if you had to write long programs, which typically contain thousands of instructions, in binary style machine language. With an assembler, however, the job of writing a machine language program is considerably easier. Here, for example, is the above program as it would appear if you wrote it in assembly language:
CLC
CLD
LDA #02
ADC #02
STA $CB
RTS
You may not understand all of that, but you'll have to admit that it at least looks more comprehensible. What this program does, by the way, is add 2 and 2. Then it stores the result of its calculation in a certain memory location in your computer - specifically, memory address 203.
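For readers who want to watch those nine numbers do their work, here is a small sketch in Python that steps through the binary listing the way a processor would: fetch a number, carry out the instruction it stands for, move on to the next. It is a toy model under simplifying assumptions, not a real 6502 emulator. It understands only the six instructions in this routine, it keeps the program separate from the memory it writes to, and the function name run is invented for this sketch.

```python
def run(program):
    """Execute a tiny subset of 6502 machine code; return the memory it wrote."""
    memory = bytearray(65536)   # toy 64K memory, all zeros at start
    a = 0                       # the accumulator
    carry = 0                   # the carry flag
    pc = 0                      # position of the next byte to fetch
    while True:
        opcode = program[pc]
        pc += 1
        if opcode == 0x18:      # CLC: clear the carry flag
            carry = 0
        elif opcode == 0xD8:    # CLD: select binary arithmetic (all this sketch models)
            pass
        elif opcode == 0xA9:    # LDA #n: load the accumulator with n
            a = program[pc]
            pc += 1
        elif opcode == 0x69:    # ADC #n: add n and the carry to the accumulator
            total = a + program[pc] + carry
            pc += 1
            a = total & 0xFF
            carry = 1 if total > 0xFF else 0
        elif opcode == 0x85:    # STA addr: store the accumulator at a zero-page address
            memory[program[pc]] = a
            pc += 1
        elif opcode == 0x60:    # RTS: end of the routine
            return memory
        else:
            raise ValueError("instruction not modeled: %d" % opcode)

# The nine binary numbers from the listing above.
BINARY_LISTING = """
00011000 11011000 10101001 00000010 01101001
00000010 10000101 11001011 01100000
"""
program = bytes(int(bits, 2) for bits in BINARY_LISTING.split())
memory = run(program)
print(memory[203])   # prints 4
```

Address 203 is the same location the assembly listing writes as $CB, in hexadecimal notation.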
ASSEMBLY LANGUAGE AND BASIC COMPARED
Assembly language is written using three-letter instructions called
mnemonics. Some mnemonics are quite similar to BASIC instructions.
One assembly language instruction that's much like a BASIC instruction
is RTS, the last instruction in the sample routine we just looked at.
RTS (written 0110 0000 in machine language) means "ReTurn from Subroutine."
It's used much like the RETURN instruction in BASIC. There's also
an assembly language mnemonic that's similar to BASIC's GOSUB instruction.
It's written JSR, and means "Jump to SubRoutine." Its equivalent in
binary coded machine language is 0010 0000.
Not all assembly language instructions bear such a close
resemblance to BASIC instructions, however. An assembly language instruction
never tells a computer to do something as complex as draw a line or print
a letter on a screen, for example. Instead, most assembly language
mnemonics instruct computers to carry out very elementary tasks such as
adding two numbers, comparing two pieces of data, or (as we have seen)
jumping to a subroutine. That's why it often takes vast numbers
of assembly language instructions to equal just one or two words in a high
level language.
SOURCE CODE AND OBJECT CODE
When you write an assembly language program, the listing that you produce
is called source code, since it's the source from which a machine language
program will be produced. Once you've written an assembly language
program in source code, you can run it through an assembler. The
assembler will then convert it into object code, which is just another
name for a machine language program produced by an assembler.
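The step from source code to object code can be sketched in a few lines of Python. The toy "assembler" below is only an illustration of the one-to-one translation, not a model of how MAC/65 or any other real assembler is built. It knows just the six instructions of the 2 + 2 routine shown earlier, written with each operand on the same line as its mnemonic, and the names OPCODES and assemble are invented for this sketch.

```python
# (mnemonic, addressing mode) -> opcode, for the sample routine only.
OPCODES = {
    ("CLC", "implied"): 0x18,
    ("CLD", "implied"): 0xD8,
    ("LDA", "immediate"): 0xA9,
    ("ADC", "immediate"): 0x69,
    ("STA", "zero page"): 0x85,
    ("RTS", "implied"): 0x60,
}

def assemble(source):
    """Translate source code, one instruction per line, into object code."""
    object_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else ""
        if operand.startswith("#"):        # "#02" is the literal number 2
            mode, value = "immediate", int(operand[1:])
        elif operand.startswith("$"):      # "$CB" is a hexadecimal address
            mode, value = "zero page", int(operand[1:], 16)
        else:                              # no operand at all
            mode, value = "implied", None
        object_code.append(OPCODES[(mnemonic, mode)])
        if value is not None:
            object_code.append(value)
    return bytes(object_code)

SOURCE = """
CLC
CLD
LDA #02
ADC #02
STA $CB
RTS
"""
for byte in assemble(SOURCE):
    print(format(byte, "08b"))   # one 8-digit binary number per line
```

Each line of source code becomes exactly one instruction number, followed by its operand if it has one, which is why the six-line routine produces nine numbers of object code.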
THE SPEED AND EFFICIENCY OF MACHINE LANGUAGE
Since assembly language instructions are so specific (you might even
say primitive) it obviously takes lots of them to make up a complete program;
many, many more instructions than it would take to write the same program
in a high level language. Ironically, machine language programs still take
up less memory space than programs written in high level languages do.
That's because when a program written in a high level
language is interpreted or compiled into machine language, big blocks of
machine code must be repeated every time they are used. But in a
well-written assembly language program, a routine that's used over and
over can be written just once, and then addressed as many times as needed
with JSR, RTS, and similar commands. Many other kinds of techniques
can also be used to conserve memory in assembly language programs.
Mark Andrews writes a syndicated column about computers. Atari
Roots is his 11th computer book. He owns five home computer systems
and the Atari is his favorite.