Classic Computer Magazine Archive COMPUTE! ISSUE 30 / NOVEMBER 1982 / PAGE 90

Part I:

How To Use SYS And USR

J. C. Johnson
McKinney, TX

In addition to introducing SYS, which lets you take advantage of the machine language routines in your BASIC's ROM chips, this article demonstrates a way to pass information between BASIC and machine language.

Written for the CBM/PET (all BASIC versions), the accompanying table makes this article useful to Apple users as well. A companion article, "Getting The Most Out Of USR," expands on some of these topics as they apply to Atari BASIC.

Next month, the tutorial concludes by detailing how to handle complex multiplication from BASIC, but via machine language, as an illustration of the techniques introduced in Part I.

BASIC is a powerful language and is easy to use, but it has limitations. Fortunately, there is a SYS command that can be used to access machine language subroutines. This command is among the most powerful commands in BASIC.

With the SYS command it is possible to generate FORTRAN-like subroutines that allow the user the luxury of defining the variables passed at the line that calls the subroutine. This feature will greatly reduce the manipulation required to set up the variables for a subroutine call. It is also possible to write a subroutine that can be used with many different programs without the need to carefully select variables in such a way that the subroutine and the main program do not interfere with one another.

All of this power is available when the machine language subroutine is called, but it isn't without its price. The penalty is in programming difficulty. When working with machine language, it is necessary to know (or at least be able to find out) all actual machine addresses for each subroutine or variable. Fortunately, this is not too difficult, as will be evident. That, in fact, is the purpose of this article: to define the entry points and the use of some of the commonly needed utilities available in BASIC ROMs, and to show how to pass parameters between machine language and BASIC programs using these subroutines.

It is essential, of course, to define the operating system used. All entry points discussed in the article are for the Commodore Upgrade ROMs. Equivalent entry points for most of the utilities exist in Commodore's Original and 4.0 BASICs and can be found in Table 3. Table 3 also includes the equivalents for Applesoft in ROM.

With the information presented here, it is hoped that the interested reader will be able to realize more of the capabilities of his BASIC and will find it somewhat easier to understand the use of the utilities that are available in the ROMs.

To start with, the SYS command is nothing more than a GOSUB statement. The important difference is that the subroutine GOSUBed to is written in machine language. The form of the statement is:

10 SYS A

where A is a decimal address referencing the location in memory of the first instruction of the machine language subroutine. Another typical example would be:

20 SYS 826

The 826 means that the machine language subroutine starts at decimal address 826, which is the Commodore second cassette buffer.
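SYS takes a decimal address, while the entry points in this article are quoted in hexadecimal, so the conversion comes up often. The short Python sketch below is a modern convenience rather than anything from the original listings, and the helper name `sys_address` is hypothetical. It only shows the arithmetic.

```python
def sys_address(hex_text):
    """Convert a '$'-prefixed hex address, as printed in the tables,
    to the decimal value that SYS expects."""
    return int(hex_text.lstrip("$"), 16)


# $033A is the hexadecimal form of 826, the address used in line 20.
print(sys_address("$033A"))
# An Upgrade ROM entry point quoted later in the article.
print(sys_address("$CF6D"))
```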

Since the subroutine called by the SYS command is written in machine language, the capabilities are limited only by the system capabilities, not by the language implementation. FORTRAN-like subroutines can be implemented where the arguments are transferred in a "transparent" manner. Such a call might look like this:

100 SYS 826, A,B(K),2*INT(Y),3*LOG(A) + SIN(X),A,2

where the parameters between the commas are transferred to the machine language subroutine for processing. The next time the call is made the statement might look slightly different, like this:

576 SYS 826,P1,C,3.6*TAN(Q),A(6),3.1,S(I,J)

The arrangement of the parameters is left up to the user. In the above examples (and throughout the remainder of this article) it is assumed that the first two, P1 and C in line 576, are the outputs and the remainder are the input arguments. It should be obvious that the number and arrangement of outputs and inputs can be defined as needed by any given problem.
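Note that the last parameter of line 576, S(I,J), contains a comma of its own, so "between the commas" really means between the commas that lie outside any parentheses. The Python sketch below is only a model of how a reader would split such a call into parameters; it does not describe how the ROM routines work internally, the name `split_parameters` is hypothetical, and it makes no attempt to handle commas inside quoted strings.

```python
def split_parameters(tail):
    """Split the text that follows 'SYS 826,' on commas that are
    not enclosed in parentheses."""
    params, depth, current = [], 0, ""
    for ch in tail:
        if ch == "," and depth == 0:
            params.append(current.strip())   # a top-level separator
            current = ""
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current += ch
    params.append(current.strip())
    return params


# The parameter list from line 576.
print(split_parameters("P1,C,3.6*TAN(Q),A(6),3.1,S(I,J)"))
# ['P1', 'C', '3.6*TAN(Q)', 'A(6)', '3.1', 'S(I,J)']
```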

Parameter Passing Via CHRGET

If it is desired to have a set of subroutines callable by a single SYS, then the particular subset can be flagged by one of the parameters:

200 SYS 826,*,A,B,C,D,E,F

The "*" might signal a complex multiplication. The remainder of this article will deal with the use of some of the ROM utilities required to pick up and use the parameters that are transferred by the BASIC program. A description of a subroutine to perform complex multiplication and division will be given as an example.

In PET's BASIC implementation there is a line scanner at address $70 called CHRGET. This subroutine picks up the next character in the line being executed. An alternate entry point at $76 called CHRGOT retrieves the last character that was accessed on the BASIC line. To use this subroutine one simply calls with a jump subroutine:

JSR CHRGET

or

JSR CHRGOT

where the character accessed is returned in the 6502 accumulator. In addition, the carry flag is cleared if the accessed character is an ASCII number 0-9. Otherwise, the carry flag is set. All blanks are ignored. If the character is a colon or null the Z-flag is set; otherwise it is clear. Should it be necessary to change the line scan address, this can be done by putting the new address in TXTPTR, $77 and $78, in the standard 6502 LO,HI byte format. The line scanner subroutine is reproduced in Program 1 for reference.
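The flag behavior comes from the compare and the pair of subtractions at the end of the line scanner. As a rough illustration, the Python sketch below imitates that 8-bit arithmetic for a single character code. It is a model for study, not a replacement for the ROM routine, and the function name `chrgot_flags` is hypothetical.

```python
def chrgot_flags(code):
    """Model the flag logic of CHRGOT for one character code (0-255).

    Returns (accumulator, carry, zero), or None for a blank, which the
    real routine skips by branching back to CHRGET.
    """
    a = code & 0xFF
    if a >= 0x3A:                 # CMP #$3A / BCS: colon or above returns at once
        return a, True, a == 0x3A
    if a == 0x20:                 # CMP #$20 / BEQ: blanks are skipped
        return None
    a = (a - 0x30) & 0xFF         # SEC / SBC #$30
    carry = a >= 0xD0             # SEC / SBC #$D0 leaves carry set if no borrow
    a = (a - 0xD0) & 0xFF         # the two subtractions restore the original code
    return a, carry, a == 0


for ch in "7,A:":
    print(repr(ch), chrgot_flags(ord(ch)))
print("null", chrgot_flags(0))
```

Only the digits come back with the carry clear, and only the colon and the null come back with the Z-flag set, which is the behavior described above.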

The significance of this subroutine can be fully appreciated when one realizes that the line scanner is left positioned at the first character beyond the SYS call whenever the machine language subroutine is started. This first character would correspond to the comma after the 826 in line 100. Therefore, the user does not need to know the machine location of the calling statement in the program because the CHRGET subroutine contains that information automatically.

An essential requirement for using the line scan subroutine to fetch input information is that it be left in a position ready for BASIC to continue processing. This almost happens naturally, but is no accident. The scanning of a line, say line 100, is done to gather all parameters on that line that apply to a particular subroutine. When the subroutine is finished, BASIC will assume that the program has obtained all characters up to, and including, the final "2", and the line scanner will then be positioned on the character following that "2". This character should be one of two possible characters. If the SYS was the last statement on the line, the terminating character will be a null ($00). This character signals the end of a BASIC line and is present whether the SYS was entered from a running program or from the keyboard in the immediate mode.

If the SYS statement is not the last statement on the line, then the statements will be separated by a colon, and this character will be the one that is encountered. Returning to BASIC with the line scanner on either of these characters will allow a normal continuation of BASIC processing.

If the line scanner is left positioned on any other character, then BASIC will respond with SYNTAX ERROR. If the subroutine needs to be terminated for any reason before encountering these characters, then it must call CHRGET to "clean up" before returning to BASIC. It is as important to BASIC to leave the line scanner in the right place as it is to leave the 6502 stack properly positioned for a machine language program.

Using LOOKUP

The second subroutine needed is one to fetch the addresses of the variables used such as A and B in line 200. This subroutine, called LOOKUP, is located at $CF6D. This subroutine will activate the line scanner, find the variable, determine its address, and leave the address in zero page memory.

After calling this utility the address of the variable is located in memory locations $44 and $45 with the variable name in $42 and $43. The format for the variable name is the standard BASIC interpreter format listed in Table 1 for reference. If the variable was floating point, address $8 will be set to $00; if integer, $8 will be set to $80. If the variable was numerical (integer or floating), $7 will be set to $00 and if string $7 will be set to $FF. The address returned in $44 and $45 is the actual location in memory where the binary representation of the number exists. If the result was string, however, the address is the location where the string descriptor (3-byte sequence of length, address LO, address HI) can be found.

Table 1. PET Variable Name Format
TYPE      $42      $43
FLOATING  msb clr  msb clr
INTEGER   msb set  msb set
STRING    msb clr  msb set

To use this utility, just position the line scanner to the first character of the variable name (in ASCII) and execute a jump subroutine to $CF6D. When the subroutine returns, the line scanner will be positioned to a terminating character (comma, colon, or null). The calling subroutine may then check $7 and $8 to determine the type of result before proceeding. The floating accumulator is altered if the variable is subscripted. A summary of the operational features is given in Figure 1.
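The two ways of identifying a variable's type, from the name bytes in $42 and $43 or from the flags in $7 and $8, can be summarized in a few lines. The Python sketch below is only a restatement of Table 1 and of the flag values given above; the function names are hypothetical, and the sample name bytes are illustrative assumptions rather than values read from a real machine.

```python
def type_from_name(byte_42, byte_43):
    """Classify a variable from the msb of its two name bytes (Table 1)."""
    msb = (bool(byte_42 & 0x80), bool(byte_43 & 0x80))
    return {
        (False, False): "floating",
        (True, True): "integer",
        (False, True): "string",
    }.get(msb, "not listed in Table 1")


def type_from_flags(flag_7, flag_8):
    """Classify a result from the flags LOOKUP leaves in $7 and $8."""
    if flag_7 == 0xFF:            # $7 = $FF marks a string
        return "string"
    return "integer" if flag_8 == 0x80 else "floating"


# Illustrative name bytes: the ASCII letters, with the msb set or clear.
print(type_from_name(ord("A"), ord("B")))                # floating
print(type_from_name(ord("A") | 0x80, ord("B") | 0x80))  # integer
print(type_from_flags(0x00, 0x80))                       # integer
```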

The third subroutine needed is an expression evaluator. PET BASIC has one, EXEVAL, located at $CCA7. This subroutine is very powerful and versatile. Its purpose is to evaluate any expression that is used as an argument. It retrieves variables, converts numbers, evaluates functions, and performs any operations found between the separators (commas) in the calling statement.

This utility operates in much the same way as the LOOKUP subroutine. The line scanner is used to fetch the expression from the input line, is again left on the terminating character (comma, colon, or null), and will therefore be ready for processing the next piece of information when returning.

If the user's machine language subroutine scans each argument for special characters, such as "$" for hex input, before evaluating the expression, then the line scanner will be left one address beyond the correct starting position. An alternate entry point at $CC9F will take care of this situation by subtracting one from the line scanner address before executing the evaluate routine. To use this subroutine, just jump subroutine to $CCA7, and the utility does the rest.

Since this subroutine can evaluate any expression that can be used on the right hand side of an equal sign, it will evaluate both strings and numerics. While this article is primarily concerned with numerical work, a brief description of both will be presented.

For numerical expressions the result is located in the floating accumulator, FACC, in floating binary format. The FACC is located at $5E to $63. If the desired result is integer, a conversion must be performed. The result can be stored in a variable, at a temporary memory location, transferred to the alternate floating accumulator AFAC at $66 to $6B, or left in the FACC for further processing. The flag at $7 can be tested to determine the type of result (numeric or string).

If the result is string, then the FACC is not used. The string result is placed in upper memory with the string variables. A table is built in zero page starting at $16 containing three bytes of information for each result. The first byte is the string's length, and the next two are the string's address in high memory. The format, of course, is the standard 6502 "LO,HI" byte format. The table may contain two such string descriptors. To determine which one was the last result, another two bytes are provided at $14 and $15, which are the address of the string descriptor. The table is large enough for only two descriptors without overflowing. At this point an example is in order to show how it works.

Suppose that an evaluation of the string "ABC" + "DEF" is accomplished. The result is obviously a string and can be verified by testing location $7 for a value of $FF. Upon examination of $14 we find a value of $16, and $15 contains $00. This means that the string descriptor starts in $16 with the length and continues at $17 and $18 with the address. If this intermediate result is not cleared, then the next temporary result will leave $19 in $14 and $00 in $15, meaning that the length is in $19 and the address is in $1A, $1B. Once the string result is used and stored or discarded, it is necessary that the pointer at $14 be reset. One caution: the string evaluation can proceed to calculate additional intermediate results, but table space is not provided for the temporary descriptors. The resulting descriptors will be stored on top of the indirect index registers and will ultimately cause problems. If a return to BASIC is attempted with three or more string temporaries pending, then a "FORMULA TOO COMPLEX ERROR" will result. All string temporaries should be cleared before returning to BASIC. APPENDIX B summarizes the operation of the expression evaluator.

The fourth utility needed is actually a set of subroutines to transfer numerical results into and out of the floating accumulators and perform the arithmetic operations. Their names and entry points are listed in Table 2. These subroutines all have simple operating instructions. The STFACC subroutine causes the FACC contents to be stored into memory. The location in memory is specified by the contents of the 6502 Y and X registers with the most significant byte in Y. The LDFACC and LDAFAC subroutines cause the contents of memory to be loaded into the FACC and AFAC respectively. Here the address of memory is in the Y and A registers with the Y register again being the most significant. The last subroutine to move data causes the contents of the FACC to be transferred to the AFAC. To execute these subroutines, just load X, Y, and A as appropriate and execute a JSR to the subroutine's address.

Table 2. Some Useful PET Subroutines
NAME    ADDRESS  FUNCTION
STFACC  $DAE3    STORE FACC INTO MEMORY
LDFACC  $DAAE    LOAD FACC FROM MEMORY
LDAFAC  $D998    LOAD AFAC FROM MEMORY
FACALT  $DB18    TRANSFER FACC TO AFAC
FADD    $D773    ADD MEMORY TO FACC
FSUB    $D773    SUBTRACT FACC FROM MEMORY
FMUL    $D934    MULTIPLY FACC BY MEMORY
FDIV    $DA1B    DIVIDE MEMORY BY FACC
FDIV1   $DA20    DIVIDE FACC BY MEMORY WITHOUT SIGN

The remaining subroutines in Table 2 are the dyadic arithmetic subroutines. There are several entry points to each subroutine that can be used, but only a few will be discussed here. The basic function of these subroutines is to perform the desired arithmetic operation in floating point binary format between the FACC and memory. The LDFACC or LDAFAC is part of each subroutine so the address of the number in memory is loaded into Y, A before each call. The FACC is added to or subtracted from the number in memory in the first two, and the number in memory is multiplied by or divided by the FACC in the latter two cases. The alternate entry point for FDIV1 causes the FACC to be divided by memory; however, the sign of the result will always be positive due to the way the FACC is loaded. The sign can be manipulated separately if necessary.
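Each of these calls needs a 16-bit memory address divided between two 8-bit registers, with the most significant byte in Y. As a small worked example, the Python sketch below splits an address into its two halves. The helper name `split_address` is hypothetical, and the address used is simply the cassette buffer location from earlier in the article, chosen for illustration.

```python
def split_address(addr):
    """Return (lo, hi) for a 16-bit address: lo is the least
    significant byte, hi the most significant byte (for Y)."""
    return addr & 0xFF, (addr >> 8) & 0xFF


lo, hi = split_address(0x033A)
print(f"low byte = ${lo:02X}, high byte (Y) = ${hi:02X}")
```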

Table 3: ROM Entry Points
APPLESOFT  ORIGINAL     UPGRADE      4.0 (DISK)  LABEL OR
IN ROM     (2.0) BASIC  (3.0) BASIC  BASIC       DESCRIPTION
$B1        $C2          $70          $70         CHRGET
$B7        $C8          $76          $76         CHRGOT
$B8-B9     $C9-CA       $77-78       $77-78      TXTPTR
$DFE3      $CFD7        $CF6D        $C12B       LOOKUP
$83-84     $96-97       $44-45       $44-45      Address of current variable
$81-82     $94-95       $42-43       $42-43      Name of current variable
$12        $5F          $08          $08         Variable type
$11        $5E          $07          $07         Variable type
$DD7B      $CCB8        $CCA7        $BDA0       EXEVAL
$9D-A2     $B0-B5       $5E-63       $5E-63      FACC (Floating Acc. #1)
$A5-AB     $B8-BD       $66-6B       $66-6B      AFAC (Acc. #2)
$55-5B     $68-6F       $16-1C       $16-1C      String table
$53-54     $66-67       $14-15       $14-15      Last string
$EB1E      $DAAB        $DAE3        $CD0D       STFACC
$EAF9      $DA74        $DAAE        $CCD8       LDFACC
--         $D95E        $D998        $CBC2       LDAFAC
$EB63      --           $DB18        $CD42       FACALT
$E7B9      $D73F        $D773        $C99D       FADD
$E7A7      $D728        $D733        $C986       FSUB
$E982      $D900        $D934        $CB5E       FMUL
$EA55      $D9E4        $DA1B        $CC45       FDIV
$EA60      --           $DA20        $CC4A       FDIV1

Figure 1. Variable Fetch Subroutine Summary

  1. Uses the line scanner to obtain input.
  2. Starts with CHRGOT (i.e., must begin with the line scanner on the first character of the variable name).
  3. Uses the standard PET variable format of ABBB…CDDD where A is an alphabetic character A-Z, B is an alphanumeric A-Z or 0-9, C is a type symbol $ or % if appropriate, D is the subscript information if appropriate.
  4. Returns with the address in $44 and $45.
  5. The converted variable name is left in $42 and $43.
  6. Sets $7 and $8 to flag the result type (numeric/string or floating/integer).
  7. The FACC is altered if the variable is subscripted.
  8. The line scanner is left on the terminating character or "parameter separator" (comma, colon, or null).

Program 1. PET Line Scan Subroutine

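0070 E677   CHRGET INC $77     ;Advance TXTPTR (LO byte)
0072 D002          BNE $76
0074 E678          INC $78     ;Carry into HI byte
0076 AD0704 CHRGOT LDA $0407   ;Operand bytes are TXTPTR
0079 C93A          CMP #$3A    ;Colon or above?
007B B00A          BCS $87     ;Yes: return, carry set
007D C920          CMP #$20    ;Blank?
007F F0EF          BEQ $70     ;Yes: skip it
0081 38            SEC
0082 E930          SBC #$30
0084 38            SEC
0085 E9D0          SBC #$D0    ;Carry clear only for 0-9
0087 60            RTS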

Getting The Most Out Of USR

Charles Brannon, Editorial Assistant

The Atari USR command is very powerful and flexible. Its strength is in parameter passing, the ability to directly communicate with a machine language routine using standard variables and arithmetic expressions.

A simple task for the USR command is to merely transfer control from BASIC to machine language.

X = USR (1536)

This would simulate a JSR (Jump to Sub-Routine) to location 1536, or $0600. The value returned in X is meaningless here. The machine language routine must begin with a PLA (Pull Accumulator) to "clear" the count byte (discussed later) and, when finished, return to BASIC with a RTS (ReTurn from Subroutine).

The real power of USR, however, is that it can pass a series of 16-bit binary integers. These are specified as a list after the address:

X = USR (1536, 1, 3, 5, 7)

Any arithmetic expression can be used, even variables and functions:

X = USR (1536, A * B, ASC (" + "))

From the machine language program's point of view, where are these numbers stored? How about the stack? The Atari USR command "pushes" the high and low bytes of each number onto the stack, and "tops it off" with a count byte. The count byte is the number of values passed. The machine language program would use PLA to read each byte into the accumulator. For example, a routine to simulate the Atari POSITION command might look like:

        ; A = USR (1536, X, Y)
        * = $0600
        PLA             ;Count byte
        PLA             ;MSB of X
        STA 86          ;COLCRS + 1
        PLA             ;LSB of X
        STA 85          ;COLCRS
        PLA             ;MSB of Y (zero)
                        ;so ignore it
        PLA             ;LSB of Y (0-191)
        STA 84          ;ROWCRS
        RTS             ;Return to BASIC

Notice the order of the high byte (MSB) and low byte (LSB) of each argument on the stack. Also, the first argument (X) will be the first value on the stack.
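To make the byte order concrete, the Python sketch below lists the bytes in the order that successive PLA instructions would see them for a given USR call. It is only a model of the stack layout described above; the function name `usr_pull_order` is hypothetical, and masking each value to 16 bits is a simplification made for the sketch.

```python
def usr_pull_order(*args):
    """Bytes in the order successive PLA instructions would return them."""
    pulled = [len(args)]              # the count byte comes off first
    for value in args:
        value &= 0xFFFF               # model each argument as a 16-bit integer
        pulled.append(value >> 8)     # the MSB is pulled before...
        pulled.append(value & 0xFF)   # ...the LSB
    return pulled


# A = USR (1536, 10, 5), i.e. X = 10 and Y = 5 in the POSITION example.
print(usr_pull_order(10, 5))   # [2, 0, 10, 0, 5]
```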

Machine language routines can also work on strings, via the ADR function. ADR(A$) will return the memory location of the contents of A$. Using the LEN function, BASIC can tell the "whole story." For example, this routine transfers the contents of any string to any memory location (useful for player/missile graphics, or custom characters). The length of the string should be limited to 255 bytes.

        ; A = USR (1536, ADR (X$), LEN (X$), MEM)
        * = $0600
ADRL    = $CB
ADRH    = $CC
DESTL   = $CD
DESTH   = $CE
        PLA             ;Count byte
        PLA             ;MSB of address
        STA ADRH        ;zero page loc.
        PLA             ;LSB
        STA ADRL
        PLA             ;MSB of length
                        ;(ignore it)
        PLA             ;LENgth
        TAY             ;Use it for loop
        PLA             ;MSB of destination
                        ;address
        STA DESTH       ;Another z-page loc
        PLA             ;LSB
        STA DESTL
        CPY #0          ;Empty string?
        BEQ DONE        ;Then nothing to copy
LOOP    DEY             ;Offsets LEN-1 down to 0
        LDA (ADRL),Y    ;Get byte
        STA (DESTL),Y
        CPY #0          ;Offset 0 copied?
        BNE LOOP        ;If not, continue loop
DONE    RTS             ;Return to BASIC

Going Back To BASIC

How can a routine pass a value back to BASIC? It could save the values in an area of memory and have BASIC PEEK them out. If only one value (one 16-bit integer) needs to be returned, you can use locations $D4-$D5 (212, 213). Store the result using the standard 6502 low/high byte format. The destination variable (X in X = USR (1536), Z in Z = USR (1536, 3, 2), or any variable) will take on the value placed in $D4-$D5 (labeled FR0). So, to quickly add two numbers, you could use: A = USR (1536, 1, 2) (any two arguments). "A" will contain the answer.

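FR0     = $D4           ;Low byte of return value
FIRSTL  = $CB           ;Scratch zero-page bytes (assumed
FIRSTH  = $CC           ;free, as in the previous routine)
SECONDL = $CD
SECONDH = $CE
        * = $0600
        PLA             ;Throw away count
        PLA
        STA FIRSTH      ;MSB of first argument
        PLA
        STA FIRSTL      ;LSB of first argument
        PLA
        STA SECONDH     ;MSB of second argument
        PLA
        STA SECONDL     ;LSB of second argument
        CLC
        LDA FIRSTL
        ADC SECONDL     ;Add low bytes
        STA FR0
        LDA FIRSTH
        ADC SECONDH     ;Add high bytes plus carry
        STA FR0 + 1
        RTS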
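The routine adds the low bytes first and lets the carry flow into the high bytes. For readers who want to check the arithmetic by hand, the Python sketch below mirrors those two ADC steps. It is a model only; the function name `add16` is hypothetical, and the result simply wraps around past 65535, as a two-byte sum would.

```python
def add16(first, second):
    """Mirror the CLC / ADC / ADC sequence on two 16-bit values.
    Returns the bytes that would be stored in (FR0, FR0 + 1)."""
    lo = (first & 0xFF) + (second & 0xFF)          # ADC on the low bytes
    carry = lo >> 8                                # carry out of the low byte
    hi = ((first >> 8) & 0xFF) + ((second >> 8) & 0xFF) + carry
    return lo & 0xFF, hi & 0xFF


lo, hi = add16(1, 2)
print(lo + 256 * hi)            # 3, as in A = USR (1536, 1, 2)
lo, hi = add16(255, 1)
print(lo, hi)                   # 0 1: the carry moved into the high byte
```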

In many programs, we want to make sure that the proper number of arguments has been sent. For example, if we have a routine that plays a musical tone on the internal speaker for a specified duration,

A = USR (1536, note, duration)

we may want to only accept exactly two values. We can use the first byte, the count byte, to monitor this. If the count is wrong, we must pull all the arguments off the stack and return to BASIC. We could even ring the bell and print an error message.

    * = $0600
    PLA
    CMP #2
    BNE ERROR
    .
    .
    (Routine continues normally)
    .
    .
    .
    RTS
ERROR                ;The error-handling
                     ;routine
    TAX              ;Count is in A
    BEQ NOPULL       ;If zero, don't pull
ERRLOOP              ;ERROR loop
    PLA
    PLA              ;Pop an argument
    DEX              ;Continue
    BNE ERRLOOP      ;Until X = 0
NOPULL
    LDA #253         ;BELL character
    JSR $F6A4        ;Print it
    LDA #03          ;ERROR - 3
                     ;(VALUE ERROR)
    STA $B9          ;Error number
    JMP $B940        ;Print error

Machine language programmers have a friend in USR. If you have an Assembler, type in the examples. And when BASIC bogs you down, remember this motto: Use USR!