Classic Computer Magazine Archive ANTIC VOL. 6, NO. 12 / APRIL 1988

Super Sorters

Part I. Multikey Sort Routine

By KEVIN PECK

Multikey Sort begins a two-part series of powerful machine language sorting routines that will be extremely useful for intermediate BASIC programmers. This USR routine works on Atari 8-bit computers of any memory size, with disk or cassette.


A multikey sort will sort your data by a primary "key field" and then sort by a secondary field within that key field.

A common example would be to sort a mailing list by last name-with all first names then sorted within each last name. Our multikey sort routine can also handle the job when you need to sort by just one field.

Listing 1, MULTISOR.DEM, is a short BASIC demo that shows off some of the features of this speedy USR routine.

Type in listing 1, check it with TYPO II and SAVE a copy before you RUN it. If you have trouble typing in the special characters in lines 1010-1030, don't type them in. Listing 2 will create them for you Type Listing 2, check it with TYPO II, and SAVE a copy. When you RUN Listing 2 creates these hard-to-type lines, and stores them in a file called LINES.LST which contains the BASIC statements defining the machine language sort routine.

To merge the two programs, disk users LOAD "MULTISOR.DEM" and then ENTER "D:LINES.LST". Cassette users: CLOAD Listing 1, then insert the separate cassette used for Listing 2 and ENTER "C:". Finally, remember to SAVE the completed program before you RUN it.

To use this sort routine in your own BASIC programs, you'll need a line that calls the routine, plus some information to sort in another string.

SORTED DETAILS

Here is the format for the USR call that starts the sort. See lines 290, 360 and 430 for working examples of this line.

A = USR(ADR(MKS$),FIRST,LAST,FLEN,FOFF,SLEN, SOFF,RLEN,ORDER)

Below are the elements within this format:

ADR(MKS$) is the address of the machine language routine, MKS$.

FIRST is the starting address of the data to be sorted. Since the data is contained in D$, put ADR(D$) in place of FIRST.

LAST is the ending address of the data to be sorted. Calculate the ending address by adding the length of the string to its starting address. The formula for this is:

LAST= ADR(D$) + LEN(D$)

FLEN is the length, in bytes, of the first field to sort on, the key field.

   Figure 1
            
                   field     
   field          length
   ---------------------
   Last Name         7

   First Name        6

   Position         10
   ---------------------
   Record Length    23

FOFF is the offset into the record where the first field begins.

SLEN and SOFF are the same as FLEN and FOFF, except that they define the secondaty field for the sort. If you pass a value of 0 for SLEN, the routine will sort on the key field only.

RLEN is the length of the record being sorted. You must use fixed length records within a single, big string for the sort to work. All extra spaces in the record must be padded with the character of your choice, usually a space (ASCII 32).

ORDER is the order in which to sort the data-either ascending (pass a value of 0) or descending (pass a value of 1). Any non-zero value will produce a sort in descending order.

The demo program you created has three sample sorts. First it prints out the unsorted data and asks you to press the [SPACEBAR]. Next it sorts the data by last name (key field), with the first name (secondary field) sorted in order within the last name. This is done in line 290. The second sort in line 360 puts the data in order by "position" (key), then by last name (secondary) within each position. The final example sorts the data by the first name only, in descending order.

The program uses a record with the structure shown in Figure 1 for all examples. The table shows each field name and its length in bytes. Notice that not all of the field lengths are equal. Only the total record length must be consistent throughout the data for the sort to work. You must pad extra spaces in each record on a per field basis.

If you have data for a record where the last name is five characters long, the first name is four characters and the position is 10, for a total of 19 characters, you cannot simply add four spaces to the end of the record to get it to equal 23 characters. You will have to add two spaces to the last name to make it seven characters long, then add two spaces to the first name. (You can leave position alone because it's already the proper length.)

   Figure 2
                  1         2
         12345678901234567890123
         -----------------------
   WRONG:SmithMikeProgrammer

         HunterBarneyDriver

         JohnsonJillTrainer


                  1         2
         12345678901234567890123
         ----------------------- 

   RIGHT:Smith  Mike  Programmer

         Hunter BarneyDriver

         JohnsonJill  Trainer
         -----------------------
         7      6     10

This is what I mean by adding characters on a per field basis. Your database does not have to look anything like this-it doesn't have to have a 23-character record length either: there just has to be a fixed length for all records for the sort routine to work.

Now for a detailed explanation of all of the parameters for the routine. In the first example, we'll need to use 7 and 6 for the lengths of the first two fields. Just use the lengths presented in Figure 1.

The "offset" seems strange, but it's easily explained with some examples. Just add up the lengths of all the fields preceding the one you are going to sort on. The field named "position" has an offset of 13 (that is, 7 + 6), the lengths of all the fields before it. The "last name" field has an offset of 0, as there are no fields in front of it. The "first name" field has an offset of 7-the length of "last name" field, the only field preceding it.

You may sort any two fields, regardless of their order in the record itself. In the second example, starting at line 320, we're sorting by the position first, then the last name. Though the last name appears first in the data record, it is still a valid sort. Just choose the two fields you want to sort on, and call the routine.

An example of sorting a single field is presented in the demo starting at line 390. I also chose to sort the data in descending order to show you how the order flag operates.

Notice the two zeros (00) after the 7 in line 430 of the single key example. The first one must be a zero for a single key sort because it's the SLEN, or second field length. This zero tells the routine that we only want to sort off one field. The next zero will not affect anything because the routine only checks for the first zero.

You should usually pass this value as a zero to make single field sorts easier to spot within your program. The final number is a one this time-to do the sort in descending order instead of ascending order as in the three previous examples.

RUN the demo program and see what happens to the data onscreen after each sort.

NOTE: you can't sort one field in ascending order and the other in descending order. Both must be sorted in the same order. The largest string you can sort is 32K. This restriction is imposed by Atari BASIC, but I do not forsee many applications reaching or exceeding this limit.

To conclude this two-part series, Antic will present another machine language routine that sorts floating point numbers within strings and will also sort variable length numbers and a mixture of positive and negative numbers.

Kevin Peck is a computer science major from Salina, Kansas. His Word Searcher appeared in the March 1987 Antic

Listing 1: 88-04/MULTISOR.DEM Download

Listing 2: LINES.BAS (not needed)
On Disk: MULTISOR.M65 Download / View