Zounds!
by Ed Rotberg
Since this issue of ANTIC delves into the mysteries of computer-generated sound, I will share with you some of the inner workings of a major project of mine, the Rotberg Synthesizer. I will have to assume a reasonably high level of programming competency on your part.
The Synthesizer does a pretty good job of shaping POKEY sounds into approximations of real musical instruments. It works much better, in my opinion, than the Atari Music Composer cartridge. The most important reason is that it can provide "envelopes" for the frequency and amplitude of each note.
The term "envelope" refers to the temporal variation of some aspect of a sound. In this case, the aspects to be varied are frequency and amplitude. The code "ADSR" is the standard way of specifying an amplitude envelope, and the code stands for "Attack, Decay, Sustain and Release."
Figure 1 will give an idea of what these terms mean in the case of a harpsichord-like amplitude envelope. The rest of the article will present an approach to creating such envelopes in a music generating program like the Rotberg Synthesizer.
The whole project started as a gag while I was working at the Atari Coin Op Division. One of our colleagues, a disco freak, was compounding his bad judgment by getting married. Such was the birth of the Synthesizer, which was used to compose our congratulatory lament, the Disco Dirge, written by Dan Pliskin, another ex-Atarian.
Some stubbornness prevents me from just listing the program for you. I guess I'd rather lead you to an understanding of how to do it for yourself. I will be referring to various registers in the POKEY chip, and certain functions of the POKEY, but I will in no way describe that chip. Also, I will not be relating any of this to BASIC techniques, which are hopelessly slow for this kind of work. Nor will I discuss any sound editing techniques, but only the means of generating the musical sounds.
There are basically two major classes of sound generation used: static and dynamic. The first consists of nothing more than storing a few values to the various POKEY registers, and sitting back and listening. The capabilities of this approach are quickly exhausted. More useful, and far more interesting, are the dynamic sounds, in which the values stored to the POKEY are constantly changed for as long as the sound lasts. Three approaches to dynamic sound generation are:
1) Algorithmic. A short routine calculates the values to be stored. The possibilities are limited only by the imagination of the programmer.
2) Table driven. A short program keeps an index into a lookup table to determine what values are to be stored into POKEY during that time interval. New sounds can be generated very quickly by slopping some new values into the tables.
3) Interpretive. A small interpreter program reads instructions and data from a command stream, causing the sounds to be generated by a few preset rules. This method keeps the data tables short, compared to a pure table-driven approach.
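As a rough illustration of the table-driven approach, the Python sketch below steps through a lookup table and produces the pair of bytes that would be stored to one channel's frequency and control registers on each time interval. It is not taken from the Synthesizer. The table contents, the function name, and the distortion value 0xA0 (assumed here to select a pure tone) are all assumptions made for the example.

```python
# Hypothetical table-driven sound: one (frequency, volume) entry per interval.
# The values are made up purely for illustration.
SWEEP_TABLE = [(0x40, 8), (0x48, 8), (0x50, 7), (0x58, 6)]

def table_driven_frames(table, distortion=0xA0):
    """Yield the (frequency byte, control byte) pair for each time interval.

    The control byte combines an assumed distortion setting in the high
    nibble with the 4-bit volume from the table in the low nibble.
    """
    for freq, volume in table:
        control = (distortion & 0xF0) | (volume & 0x0F)
        yield freq & 0xFF, control

frames = list(table_driven_frames(SWEEP_TABLE))
```

A new sound is then just a new table; the routine itself never changes.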
Let's go over just what the Synthesizer is capable of. It has the ability to produce sound on all 4 channels of the POKEY simultaneously. The basic unit of sound is called a NOTE, since this program was intended to be primarily a music synthesizer, though it is capable of generating a wide variety of sounds. The frequency of the NOTE is specified by 8 bits which may either be a pointer into a table of frequencies, or the actual frequency itself. This is an implementation decision, and each method has its merits and drawbacks. If the actual frequency is stored, the NOTE must also specify the "noise content or distortion" value to be stored in the control register along with the "sustain" volume for each channel. Each NOTE can specify a 4-bit value for its sustain volume, and can have a duration specified by 16 bits.
This duration is relative to the current TEMPO. The TEMPO is specified by an 8-bit value, which is used as a delay loop counter. The TEMPO can only be changed relative to its current value by a 2's complement add of any 8-bit value. Note that in versions of the Synthesizer that run during the vertical blanking interval, such as the Atari POP Demo program, the TEMPO feature is not implemented, as the timing interval is fixed at 60 hertz. Each channel can specify its own current ENVELOPE table which controls the attack/decay of either amplitude, frequency, or both. Attack and decay are not specified as rates or times, but rather as a table of digitized amplitudes during the attack/decay period. This period can cover a span of a few milliseconds to a few seconds.
Care must be taken that the envelope does not wrap either the frequency or the amplitude value, unless of course that is the intended result. At the present time, "Release" is not implemented. The Synthesizer has the ability to REPEAT a section of music up to 100 (hex) times. These REPEATS may be nested without any restriction except that the total number of REPEATS in a piece of music must not exceed 100 (hex). The Synthesizer can also play PHRASES. I have chosen not to implement the four separately tracking stacks necessary to allow for nesting of PHRASES, although this is certainly simple enough to do. Each PHRASE must specify its own RETURN. In addition, any channel's instruction stream can cause AUDCTL to be changed on the fly. That's about it. In its current form, THE ROTBERG SYNTHESIZER supports 7 instructions:
1) Repeat
2) Set/change Envelope
3) Set/change AUDCTL Register
4) Play Phrase
5) Return from Phrase
6) Change Tempo
7) Play 1 note
The Synthesizer processes 4 sets of these instructions simultaneously, one for each channel in POKEY. Each instruction stream is made up entirely of these instructions, in addition to a STOP directive that is only valid when encountered in channel 1's instruction stream.
The data structure format for each instruction follows, where each cell represents one byte. All value/ranges are given in hexadecimal.
REPEAT: op-code = FF
| FF | nn | ll | hh | ii |
FF = REPEAT op-code
nn = repeat count (0=100, 1=NOP, count indicates number of times section is to be played)
ll = low byte of address of 1st instruction of section
hh = hi byte of address
ii = index into ram table for this section's repeat counter
This instruction has the effect of conditionally repeating a section of the instruction stream a specified number of times. Because each REPEAT instruction has its own loop counter in a RAM table 100 (hex) bytes long, any amount of nesting of these REPEAT instructions is allowed, as long as the total number of REPEATS in any composition is 100 (hex) or fewer. Each REPEAT can play its section up to 100 (hex) times. This instruction appears at the end of the section to be repeated, and refers to the first instruction of that section in its operand field.
SET ENVELOPE: op-code = FE
| FE | ll | hh |
FE = SET ENVELOPE op-code
ll = low byte of address of envelope table
hh = hi byte of address
This instruction sets the pointer to the current ENVELOPE table for that channel. A SET ENVELOPE instruction MUST precede the first note instruction on any channel. ENVELOPES may be changed at any time.
CHANGE AUDCTL: op-code = FD
| FD | cc |
FD = CHANGE AUDCTL op-code
cc = new audctl value
This instruction is used to change AUDCTL on the fly. This represents powerful, dynamic control of the POKEY. It may be used from any channel, but in practice, it is best only altered from one channel within a piece, as AUDCTL can affect ALL channels.
CALL PHRASE: op-code = FC
| FC | ll | hh |
FC = CALL PHRASE op-code
ll = low byte of address of 1st instruction of phrase
hh = hi byte of address
This instruction will transfer control to a PHRASE which can be "called" any number of times. In the current implementation, there is NO nesting of PHRASE calls (i.e. only 1 level of calling a PHRASE). PHRASES themselves may therefore use any instructions other than CALL PHRASE, and must terminate with a RETURN instruction. Note that, while possible, it is dangerous to have 2 channels use the same PHRASE, especially if that PHRASE contains REPEAT instructions.
RETURN FROM PHRASE: op-code = FB
| FB |
FB = RETURN op-code
This instruction is used to return from a PHRASE.
CHANGE TEMPO: op-code = FA
| FA | tt |
FA = CHANGE TEMPO op-code
tt = 2's complement delta change to TEMPO
This instruction is used to change the current TEMPO by a 2's complement delta value. This instruction can appear in any channel, and obviously affects all channels.
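The 2's complement arithmetic is easy to get wrong by hand, so here is a small Python sketch of it. The function names are invented for the example; the only behavior taken from the description above is that the delta byte is added to the 8-bit TEMPO and the result is kept to 8 bits.

```python
def change_tempo(tempo, delta_byte):
    """Add an 8-bit 2's complement delta to the 8-bit TEMPO value."""
    return (tempo + delta_byte) & 0xFF

def delta_to_byte(signed_delta):
    """Encode a signed change (-128..127) as the tt operand byte."""
    if not -128 <= signed_delta <= 127:
        raise ValueError("delta must fit in a signed byte")
    return signed_delta & 0xFF

slower = change_tempo(0x40, delta_to_byte(+16))   # 0x40 + 0x10
faster = change_tempo(0x40, delta_to_byte(-16))   # 0x40 + 0xF0, carry dropped
```

Because TEMPO is a delay loop counter, a larger value presumably means a slower piece, which is why the names above read the way they do.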
NOTE: op-code = any value below FA
| ca | ff | dd | ee |
All instructions that do not have an op-code of FA or greater are NOTE
instructions. ENVELOPES will be applied to all NOTE instructions with
one exception! If the first two bytes (ca,ff) are zero, then the NOTE
is considered a rest, and no envelope is applied. Note that in
processing the instruction stream for each channel, all
non-NOTE instructions are processed immediately, until a NOTE
instruction is encountered. In other words, all non-NOTE instructions
take up NO duration time, and a NOTE instruction MUST be processed for
each channel every cycle through the interpreter. Also, when a rest
(NOTE ca,ff=0) of duration zero is encountered in channel 1, it is
evaluated as a global STOP instruction, and the piece is over.
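To make the instruction formats concrete, the Python sketch below decodes one instruction at a time from a byte stream. It only classifies instructions and finds where the next one begins; it does not follow REPEAT or CALL PHRASE jumps, and it is not the Synthesizer's own code. The mnemonic strings and the function name are invented for the example, and the operand counts are simply read off the formats given above.

```python
# Hypothetical decoder: op-code -> (illustrative name, total length in bytes).
OPCODES = {
    0xFF: ("REPEAT", 5),
    0xFE: ("SET_ENVELOPE", 3),
    0xFD: ("CHANGE_AUDCTL", 2),
    0xFC: ("CALL_PHRASE", 3),
    0xFB: ("RETURN", 1),
    0xFA: ("CHANGE_TEMPO", 2),
}

def decode(stream, pc, channel=1):
    """Return (name, operands, next_pc) for the instruction at stream[pc]."""
    first = stream[pc]
    if first >= 0xFA:
        name, length = OPCODES[first]
        return name, list(stream[pc + 1:pc + length]), pc + length
    # Anything below FA is a 4-byte NOTE: ca, ff, dd (low), ee (high).
    ca, ff, dd, ee = stream[pc:pc + 4]
    duration = dd | (ee << 8)
    if ca == 0 and ff == 0:
        # A zero-length rest is a global STOP, but only in channel 1.
        name = "STOP" if (duration == 0 and channel == 1) else "REST"
    else:
        name = "NOTE"
    return name, [ca, ff, duration], pc + 4

# A made-up channel 1 stream: set envelope, change tempo, one note, stop.
stream = bytes([0xFE, 0x00, 0x60, 0xFA, 0xF0,
                0xA8, 0x51, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00])
decoded, pc = [], 0
while True:
    name, operands, pc = decode(stream, pc)
    decoded.append((name, operands))
    if name == "STOP":
        break
```

A real interpreter would act on each non-NOTE instruction as soon as it is decoded, and would only give up control for the current cycle once a NOTE has been reached.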
Various data structures are used by the interpreter for processing the
instruction streams. A brief description of each follows.
PNTR - 8 bytes
Two bytes per channel. This table maintains the current "program
counter" for each channel.
NRPT - 8 bytes
Two bytes per channel. This structure contains the duration remaining
on the current NOTE of each channel.
RPTBLK - 100 (hex) bytes
There is a one-to-one correspondence between each REPEAT instruction
and a unique byte in this table. These bytes contain the counts
remaining in each repeat section. When a REPEAT instruction is
encountered, this byte is checked. If it is zero, it is then
initialized to the value specified in the REPEAT instruction and
decremented immediately. If it is non-zero, then it is merely
decremented. The interpreter will then execute the REPEAT only if the
decrement does not bring the value to zero. Thus, a 1 for a repeat
count is an effective NOP, and the repeat count represents the number
of times a section is actually played. Obviously, this entire table
MUST be erased prior to starting to play a piece.
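The counter rule above is compact enough to check with a few lines of Python. This is a sketch of the rule as described, not the Synthesizer's code; the function names are invented, and the 8-bit wraparound on decrement is an assumption that follows from the counters being single bytes.

```python
def process_repeat(rptblk, index, count):
    """Apply the REPEAT rule; return True if the section should play again."""
    if rptblk[index] == 0:
        rptblk[index] = count                   # first visit: load the count
    rptblk[index] = (rptblk[index] - 1) & 0xFF  # byte-sized decrement
    return rptblk[index] != 0

def times_played(count):
    """Count how often a section ending in REPEAT(count) is played."""
    rptblk = bytearray(0x100)   # the table must start out erased
    plays = 1                   # the section plays once before the REPEAT
    while process_repeat(rptblk, 0, count):
        plays += 1
    return plays
```

Under these assumptions a count of 3 plays the section three times, a count of 1 falls straight through, and a count of 0 wraps to FF on the first decrement, which gives the 100 (hex) plays mentioned in the REPEAT description. The counter also ends at zero each time, so a REPEAT nested inside another one reloads itself on the next pass.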
TREG - 8 bytes
Two bytes per channel. This is a staging area for the values to be
stored to all 8 frequency and control registers for the four POKEY
voices. Since the processing time for each of the 4 channels in a
single interpreter cycle may vary, the POKEY values generated are saved
in a holding register until all are calculated, and can be stored to
POKEY with a single move loop.
ENVL - 8 bytes
Two bytes per channel. This table maintains the pointer to the current
ENVELOPE table.
EINDX - 4 bytes
One byte per channel. This is the current index into the ENVELOPE
table. It counts up by 2 from an initial value of 2. The reason for
this will become evident in the discussion of the ENVELOPE table
itself. EINDX is reset to 2 at the start of each new note.
RTNADR - 8 bytes
Two bytes per channel. This table contains the return address to a main
instruction stream from a PHRASE. It is zero when in the main
instruction stream so that RETURNS and CALL PHRASES can check for
validity. Because these return addresses are not stacked, there is no
nesting of PHRASE calls allowed.
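The validity check that RTNADR makes possible might look something like the following Python sketch. The function names and the use of exceptions are assumptions for illustration; the description above only says that a zero entry marks the main instruction stream and that the check is possible, not what the interpreter does when it fails.

```python
def call_phrase(pntr, rtnadr, channel, phrase_addr, next_addr):
    """Enter a PHRASE, remembering where the main stream resumes."""
    if rtnadr[channel] != 0:
        raise RuntimeError("CALL PHRASE inside a PHRASE is not supported")
    rtnadr[channel] = next_addr
    pntr[channel] = phrase_addr

def return_from_phrase(pntr, rtnadr, channel):
    """Leave a PHRASE and mark the channel as back in its main stream."""
    if rtnadr[channel] == 0:
        raise RuntimeError("RETURN encountered outside a PHRASE")
    pntr[channel] = rtnadr[channel]
    rtnadr[channel] = 0

# Made-up addresses: channel 0 calls a phrase at 7000 and comes back to 6003.
pntr = [0x6000, 0, 0, 0]
rtnadr = [0, 0, 0, 0]
call_phrase(pntr, rtnadr, 0, 0x7000, 0x6003)
inside = (pntr[0], rtnadr[0])
return_from_phrase(pntr, rtnadr, 0)
```

Since the return address doubles as the "in a PHRASE" flag, this scheme assumes no instruction stream starts at address zero.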
ENVELOPE tables - 4 to 100 (hex) bytes
The first byte holds the table length, a maximum of FE (hex). The EINDX
value is compared against this first byte to determine whether the NOTE
value is to be modified by the ENVELOPE, or whether the duration has
exceeded the attack/decay period and the sustain values for frequency
and amplitude are to be used. Each 2 bytes in the table represent both
frequency and amplitude modifiers for one duration count. Since a
maximum EINDX of FE is allowed, this means that durations longer than
7F cannot be modified by an envelope past that point. The hi byte of
each 2 byte value modifies the amplitude (low nibble only), and the low
byte modifies the frequency, both by 2's complement addition.
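One way to read the envelope rules is sketched in Python below. Several details are assumptions rather than facts from the text: that the second byte of the table is unused, that the length byte holds the index of the last entry (which fits both the 4-byte minimum and the FE maximum), that the envelope applies while EINDX is less than or equal to that byte, and that the low byte of each pair sits at the lower address. The function name and the table values are made up.

```python
def apply_envelope(table, eindx, sustain_freq, sustain_ctl):
    """Return (freq, ctl, next_eindx) for one duration count of a NOTE."""
    if eindx > table[0]:
        # Past the attack/decay period: use the sustain values unchanged.
        return sustain_freq, sustain_ctl, eindx
    freq = (sustain_freq + table[eindx]) & 0xFF          # 2's complement add
    volume = (sustain_ctl + table[eindx + 1]) & 0x0F     # low nibble only
    ctl = (sustain_ctl & 0xF0) | volume
    return freq, ctl, eindx + 2

# Length byte, assumed-unused byte, then two (freq, amplitude) modifier pairs:
# a loud attack, then a slightly quieter step with a one-unit frequency drop.
PLUCK = bytes([0x04, 0x00, 0x00, 0x07, 0xFF, 0x03])

eindx = 2
steps = []
for _ in range(3):
    freq, ctl, eindx = apply_envelope(PLUCK, eindx, 0x51, 0xA8)
    steps.append((freq, ctl))
```

With a sustain volume of 8, the two entries give volumes of F and B before the note settles at 8. A modifier that pushed the volume past F would wrap around to a quiet value, which is the kind of accident the earlier warning about wrapping refers to.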
The remaining data structures used are the instruction streams
themselves. There must be one per channel, even if the channel is
dormant.
There it is, in the proverbial nutshell. This should be enough to get
the more adventuresome of you started.
Ed Rotberg is an Electrical Engineer with many years of computer
programming experience. He was with Atari, Inc. from 1979 to 1981 as
a software developer and consultant on the ATARI 800 project. Among his
programs are the Rotberg Scrolling Marquee and the Rotberg Synthesizer.
He helped create sound effects, using the ATARI, for the movie TRON,
and is a partner in Videa, Inc., a new electronic entertainment firm in
Sunnyvale, CA.
NOTE instruction fields:
c = noise content or distortion
a = sustain volume
ff = sustain frequency or pointer to freq. table
dd = low byte of 16 bit duration
ee = hi byte of 16 bit duration
Duration is relative to TEMPO. For convenience, a value of 100 (hex) is
usually used to represent a whole note. This means that for long
durations, the high byte (ee) of the duration represents a measure
count in 4/4 time.
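With a whole note at 100 (hex), the other note lengths follow by simple division. The Python sketch below works out the dd and ee bytes for a duration given as a fraction of a whole note; the function name and the choice to express durations as fractions are assumptions for the example.

```python
WHOLE_NOTE = 0x100  # the convention described above

def duration_bytes(numerator, denominator=1):
    """Return (dd, ee) for a duration of numerator/denominator whole notes."""
    ticks = WHOLE_NOTE * numerator // denominator
    if not 0 <= ticks <= 0xFFFF:
        raise ValueError("duration does not fit in 16 bits")
    return ticks & 0xFF, ticks >> 8

quarter = duration_bytes(1, 4)         # 0x40 counts
dotted_quarter = duration_bytes(3, 8)  # 0x60 counts
three_measures = duration_bytes(3)     # 0x300 counts, so ee is the measure count
```

In 4/4 time one measure lasts exactly one whole note, which is why the high byte alone counts measures for long durations.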