Atari BASIC


Atari BASIC is an interpreter for the BASIC programming language that shipped with the Atari 8-bit family of 6502-based home computers. Unlike most BASICs of the home computer era, Atari BASIC is not a derivative of Microsoft BASIC, and differs in significant ways. It includes keywords for Atari-specific features and lacks support for string arrays, for example.
The language was originally an 8 KB ROM cartridge for the first machines in the 8-bit series, the 400, 800 and 1200XL. Starting with the 600XL and 800XL, BASIC was built into the machines, but it can be disabled by holding down the OPTION key while booting. The XEGS disables BASIC if powered on without the keyboard attached.
Atari BASIC was generally near the bottom in performance benchmarks of the era. The original authors addressed most of these issues in BASIC XL, and a host of third-party solutions such as Turbo-Basic XL also appeared.
The complete annotated source code and design specifications of Atari BASIC were published as The Atari BASIC Source Book in 1983.

History

The machines that would become the Atari 8-bit family had originally been developed as second-generation video game consoles intended to replace the Atari 2600. Ray Kassar, the new president of Atari, decided to challenge Apple Computer by building a home computer instead. This meant the designs, among other changes, needed to support the BASIC programming language, then the standard language for home computers.
In early 1978, Atari licensed the source code to the MOS 6502 version of Microsoft BASIC. The original Altair BASIC on the Intel 8080 came in three versions for different memory sizes: 4, 8 and 12 kB. The three versions differed considerably: the 4 kB version lacked string variables and functions and used a 32-bit floating point format, the 8 kB version added string functionality, and the 12 kB version added more functions, 64-bit variables, and other features. By the time the Atari machines were being designed, RAM was becoming much less expensive. Microsoft took advantage of this by producing a single version for the 6502, most closely resembling the 8 kB version from the 8080. It was offered in two similar versions, one using the original 32-bit number format that was about 7800 bytes, and another using an extended 40-bit format that was close to 9 kB.
Even the 32-bit version barely fit into the 8 kB size of the machine's ROM cartridge format. Atari also felt that they needed to expand the language to support the hardware features of their computers, similar to what Apple had done with Applesoft BASIC. This increased the size of Atari's version to around 11 kB; Applesoft BASIC on the Apple II+ was 10240 bytes long. After six months, the code was pared down to almost fit in an 8 kB ROM. However, Atari was facing a January 1979 deadline for the Consumer Electronics Show, where the machines would be demonstrated. They decided to ask for help to get a version of BASIC ready in time for the show.

Shepardson Microsystems

In September 1978, Shepardson Microsystems won the bid to complete BASIC. At the time, the company was finishing Cromemco 16K Structured BASIC for the Z80-based Cromemco S-100 bus machines. What became Atari BASIC is essentially a pared-down version of Cromemco BASIC ported to the 6502. That needed 10K of code. To make it fit in Atari's 8K cartridge, some of the common routines were moved to the operating system ROMs. This included 1780 bytes for floating point support that were placed in a separate 2K ROM on the motherboard.
Atari accepted the proposal, and when the specifications were finalized in October 1978, Paul Laughton and Kathleen O'Brien began work on the new language. The contract specified the delivery date on or before 6 April 1979 and this also included a File Manager System. Atari's plans were to take an early 8K version of Microsoft BASIC to the 1979 CES and then switch to the new Atari BASIC for production. Development proceeded quickly, helped by a bonus clause in the contract, and an 8K cartridge was available just before the release of the machines. Atari took that version to CES instead of the MS version. Atari Microsoft BASIC later became available as a separate product.

Releases

The version Shepardson gave to Atari for the CES demo was not supposed to be the final version. Between the time they delivered the demo and the final delivery a few weeks later, Shepardson fixed several bugs in the code. Unknown to Shepardson, Atari had already sent the CES version to manufacturing.
This version was later known as Revision A. It contains a major bug in a subroutine that copies memory; deleting lines of code that were exactly 256 bytes long causes a lockup. This was sometimes known as the "two-line lockup" because it did not trigger until the next line of code or command was entered. It cannot be fixed by pressing the Reset key.
Revision B attempted to fix all of the major bugs in Revision A and was released in 1983 as a built-in ROM in the 600XL and 800XL models. While fixing the memory copying bug, the programmer noticed the same pattern of code in the section for inserting lines, and applied the same fix. This instead introduced the original bug into this code. Inserting new lines is much more common than deleting old ones, so the change dramatically increased the number of crashes. Revision B also contains a bug that adds 16 bytes to a program every time it is SAVEd and LOADed, eventually causing the machine to run out of memory for even the smallest programs. Mapping the Atari described these as "awesome bugs" and advised Revision B owners "Don't fool around; get the new ROM, which is available on cartridge" from Atari. The book provides a type-in program to patch Revision B to Revision C for those without the cartridge.
Revision C eliminates the memory leaks in Revision B. It is built-in on later versions of the 800XL and all XE models including the XEGS. Revision C was also available as a cartridge.
The version can be determined by typing PRINT PEEK(43234) at the READY prompt. The result is 162 for Revision A, 96 for Revision B, and 234 for Revision C.
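As a rough sketch, the same check can be wrapped in a short program. The address 43234 is the location commonly documented for this test and is treated here as an assumption; the three values are the ones quoted above.

```basic
10 REM Report the BASIC revision from the byte at 43234
20 V=PEEK(43234)
30 IF V=162 THEN PRINT "REVISION A"
40 IF V=96 THEN PRINT "REVISION B"
50 IF V=234 THEN PRINT "REVISION C"
```

Any other value prints nothing, which would suggest a different or patched ROM.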

Description

Program editing

Atari BASIC uses a line editor like most home computer BASICs. Unlike most BASICs, Atari BASIC scans the just-entered program line and reports errors immediately. If an error is found, the editor re-displays the line, highlighting the text near the error in inverse video. Errors are displayed as numeric codes, with the descriptions printed in the manual.
A program line entered with a leading line number, from 0 to 32767, inserts a new line or amends an existing one. Lines without a line number are executed immediately. When the programmer types RUN, the program executes from the lowest line number. Atari BASIC allows all commands to be executed in both modes. For instance, the LIST command can be used inside a program; this was allowed in some BASICs and not others.
The LIST statement displays either the entire program or a range of lines separated with a comma. For instance, LIST 10,50 displays all lines from 10 to 50, inclusive. This contrasts with most dialects, which use the minus sign as the separator, as in LIST 10-50. The output can be redirected by specifying the device identifier: LIST "P:" sends a program listing to the printer instead of the screen, and LIST "C:" sends it to the cassette.
Program lines can be up to three screen lines of 40 characters, so 120 characters total. The cursor can be moved freely in these lines, unlike most other BASICs where the cursor can only be moved left or right in the editor. The OS handles tracking whether a physical line flows to the next on the same logical line, so moving the cursor up and down moves within logical lines or between them, automatically.
Keywords can be abbreviated using the pattern set by Palo Alto Tiny BASIC, by typing a period at any point in the word. So L. is expanded to LIST, as is LI.. Only enough letters have to be typed to make the abbreviation unique, so PLOT requires PL. because the single letter P is not unique.
Pressing RETURN sends the tokenizer the line the cursor is on. To fix an error in a line that is already on screen, the author only needs to move the cursor over the offending characters, type the correction and press RETURN. This is also a common editing technique for, say, renumbering lines. Atari BASIC has no built-in renumbering command, but one can quickly learn to overwrite the numbers on a set of lines and then press RETURN repeatedly to put them back into the program.

The tokenizer

Most BASIC interpreters perform at least some conversion from the original text form into various platform-specific formats. Tiny BASIC was on the simple end: it only converted the line number from its decimal format into binary. For instance, the line number "100" became a single byte value, $64, making it smaller to store in memory as well as easier to look up in machine code. The rest of the line was left in its original text format. MS-derived BASICs went slightly further, converting the line number into a two-byte value and also converting keywords into a single-byte value, the "token".
In contrast, Atari BASIC's tokenizer parses the entire line when it is entered or modified. All keywords, not just the first one, are converted into a one-byte token. Numeric constants are parsed into their 40-bit internal form and then placed in the line in that format, while strings are left in their original format, but prefixed with a byte describing their length. Variables have storage set aside as they are encountered, and their name is replaced with a pointer to their storage location in memory. Shepardson referred to this early-tokenizing concept as a "pre-compiling interpreter".
The original text for the line is stored in the BASIC Input Line Buffer in memory between $580 and $5FF. The token output buffer is 256 bytes, and any tokenized statement larger than the buffer generates an error. The output from the tokenizer is then moved into more permanent storage in various locations in memory. The program is stored as a parse tree.
A set of pointers indicates various data: variable names are stored in the variable name table and their values are stored in the variable value table. By indirecting the variable names in this way, a reference to a variable needs only one byte to address its entry in the appropriate table. String variables have their own area, as does the runtime stack, which stores the line numbers of looping statements and subroutine calls.
Atari BASIC allows keywords to be abbreviated using the pattern introduced in Tiny BASIC, using a period. To expand an abbreviation, the tokenizer searches through its list of reserved words to find the first that matches the portion supplied. More commonly used commands occur first in the list of reserved words, with REM at the beginning. When the program is later LISTed it will always write out the full words with three exceptions: PRINT has a synonym, ?; GOTO has a synonym, GO TO; and LET has a synonym which is the empty string. These are separate tokens, and so will remain as such in the program listing. MS BASICs also allowed ? as a short-form for PRINT, but did expand it when listing, treating it as an abbreviation, not a synonym.
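A small illustration of the difference between an abbreviation and a synonym follows. The abbreviations GR. and PR. are the commonly documented ones for GRAPHICS and PRINT; the exact spacing of the listing is an assumption. The lines might be typed as:

```basic
10 GR.0
20 ? "HELLO"
30 PR."WORLD"
```

LIST would then be expected to show something along these lines, with the abbreviations expanded but the ? synonym preserved:

```basic
10 GRAPHICS 0
20 ? "HELLO"
30 PRINT "WORLD"
```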
In the keywords for communicating with peripherals such as OPEN # and PRINT #, the " #" is actually part of the tokenized keyword and not a separate symbol. For example, "PRINT" and "PRINT #0" are the same thing, just presented differently.

Mathematical functions

Atari BASIC includes twelve functions for mathematical and trigonometric calculations. The TAN function is not included, but may be derived from the SIN and COS functions as SIN(X)/COS(X). DEG and RAD are used to set whether trigonometric functions use radians or degrees. The RND function generates a number between 0 and 1, with the parameter to the function not being used. The number is derived via the POKEY random number register at address $D20A.
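The following is a minimal sketch of two common idioms built on these functions: computing a tangent as the quotient of sine and cosine, and scaling RND to an integer range. The variable names are illustrative only.

```basic
10 DEG:REM trigonometric functions now take degrees
20 X=45
30 T=SIN(X)/COS(X):REM tangent of X
40 PRINT T
50 D=INT(RND(0)*6)+1:REM whole number from 1 to 6
60 PRINT D
```

The quotient fails with a division error wherever COS(X) is zero, so a real program would need to guard against angles such as 90 degrees.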

String handling

Atari BASIC differs considerably from Microsoft-style BASICs in the way it handles strings. Microsoft BASIC mostly copied the string-handling system of DEC's BASIC-PLUS, in which strings are first-class types with variable lengths and bounds. This allows both string variables, as well as arrays of strings, as both are represented internally by a computer word pointing to storage on a heap. In contrast, Atari BASIC copied the string-handling system of Hewlett-Packard BASIC, where the basic data type is a single character, and strings are arrays of characters.
The side-effect of this design is that there are no true string arrays and each string variable must be DIMensioned before it can be used. For example:

10 DIM A$(20)
20 PRINT "ENTER MESSAGE: ";
30 INPUT A$
40 PRINT A$

In this program, a 20 character string is reserved, and any characters in excess of the string length will be truncated. The maximum possible length of a string in Atari BASIC is 32,768 characters.
MS BASIC includes functions for accessing parts of strings; for instance, LEFT$(A$,10) would return the leftmost 10 characters of A$. In Atari BASIC the string is represented by an array and is accessed using array indexing. The equivalent expression in Atari BASIC would be A$(1,10); the arrays are 1-indexed, not 0-indexed as in most modern variations. Because this slicing syntax was the same as that of two-dimensional arrays in other BASICs, there was no way to define or work with arrays of strings.
Atari BASIC does not initialize array variables, and a string or array will contain whatever data was present in memory when it was allocated. The following trick allows fast string initialization, and it is also useful for clearing large areas of memory of unwanted garbage:
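As a short sketch of the slicing syntax, the following program pulls substrings out of a string in the way LEFT$ and MID$ would in a Microsoft-style BASIC. The variable names and the sample text are illustrative.

```basic
10 DIM A$(20),B$(10)
20 A$="ATARI BASIC"
30 B$=A$(1,5):REM like LEFT$(A$,5)
40 PRINT B$
50 PRINT A$(7):REM from character 7 to the end
60 PRINT A$(7,9):REM like MID$(A$,7,3)
```

This should print ATARI, BASIC and BAS on successive lines.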

10 REM Initialize A$ with 1000 characters of X
20 DIM A$(1000)
30 A$="X":A$(1000)=A$:A$(2)=A$

String concatenation in Atari BASIC works as in the following example. The target string must be large enough to hold the combined string or an error will result:

10 DIM A$(12),B$(6)
20 A$="Hello ":B$="there!"
30 A$(LEN(A$)+1)=B$
40 PRINT A$

The INPUT statement cannot be used with a prompt or with array variables. The latter must be filled indirectly via a statement like 20 INPUT A:B(1)=A. Array variables in Atari BASIC may also have two subscripts.
Strings included in DATA statements do not have to be enclosed in quote marks in Atari BASIC; as a result, it is also not possible for data items to contain a comma. The READ statement also cannot be used with array variables.
Arrays have a base index of 0, so a statement such as DIM A(10) actually creates an 11-element array.
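The indirect fill described above might look like the following sketch, which also shows the zero-based indexing: DIM B(4) reserves five elements, numbered 0 to 4. The names A, B and I are illustrative.

```basic
10 DIM B(4)
20 FOR I=0 TO 4
30 PRINT "VALUE ";I;
40 INPUT A:B(I)=A
50 NEXT I
60 FOR I=0 TO 4:PRINT B(I):NEXT I
```

The PRINT in line 30 stands in for the prompt that INPUT itself cannot display.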

Input/Output

The Atari OS includes a subsystem for peripheral device input/output known as CIO. Most programs can be written independently of what device they might use, as they all conform to a common interface; this was rare on home computers at the time. New device drivers could be written fairly easily that would automatically be available to Atari BASIC and any other program using the Atari OS, and existing drivers could be supplanted or augmented by new ones. A replacement, for example could displace the one in ROM to provide an 80-column display, or to piggyback on it to generate a checksum whenever a line is returned.
Atari BASIC supports CIO access with reserved words such as OPEN #, CLOSE #, PRINT #, INPUT #, GET #, PUT # and XIO #. There are routines in the OS for graphics fill and draw, but they are not all available as specific BASIC keywords. PLOT and DRAWTO for line drawing are supported while a command providing area fill is not. The fill feature can be used through the general CIO entry point, which is called using the BASIC command XIO.
The BASIC statement OPEN # prepares a device for I/O access:

10 REM Opens the cassette device on channel 1 for reading in BASIC
20 OPEN #1,4,0,"C:MYPROG.DAT"

Here, OPEN #1 means "ensure channel 1 is free" and call the C: driver to prepare the device. The 4 requests read access. The third number is auxiliary information, set to 0 when not needed. The "C:MYPROG.DAT" is the name of the device and the filename; the filename is ignored for the cassette driver. Physical devices can have numbers, so "P1:" might be the plotter and "P2:" the daisy-wheel printer, or "D1:" may be one disk drive and "D2:" another, and so on. If the number is not present, 1 is assumed.
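A fuller round trip through the same statements is sketched below, writing one line to a disk file and reading it back. The file name D:TEST.DAT is hypothetical, and the use of 8 as the write mode comes from Atari's documentation rather than from the text above.

```basic
10 DIM L$(40)
20 OPEN #1,8,0,"D:TEST.DAT":REM 8 = write
30 PRINT #1;"HELLO"
40 CLOSE #1
50 OPEN #1,4,0,"D:TEST.DAT":REM 4 = read
60 INPUT #1,L$
70 CLOSE #1
80 PRINT L$
```

Because every device goes through the same interface, replacing the device and file name with another device, such as the cassette, should require no other change to the program.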
The LPRINT statement is used to output strings to the printer.
Atari BASIC does not include an equivalent of the Microsoft BASIC GET or INKEY$ commands to detect a keypress. This can be simulated either by POKEing the keyboard driver directly or by opening it as a file, although the latter will wait for a keypress, unlike GET or INKEY$.
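Both approaches can be sketched as follows. The K: device name and memory location 764, which holds 255 while no key is waiting, are taken from the standard Atari memory map and are not stated in the text above.

```basic
10 REM Blocking read: open the keyboard as a file
20 OPEN #1,4,0,"K:"
30 GET #1,K
40 PRINT "KEY CODE: ";K
50 CLOSE #1
60 REM Polling: 764 holds 255 until a key is pressed
70 POKE 764,255
80 IF PEEK(764)=255 THEN 80
90 PRINT "KEY PRESSED"
100 POKE 764,255:REM discard the pending key
```

In a real program, line 80 would normally do other work inside the loop instead of simply waiting.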
Typing DOS from BASIC will exit to the Atari DOS command menu. Any unsaved programs will be lost. There is no command to display a disk directory from within BASIC and this must be done by exiting out to DOS.
DOS occupies roughly 5 kB of memory; thus a cassette-based Atari machine will have around 37,000 bytes of free BASIC program memory, or around 32,000 bytes if DOS is present. BASIC cannot use the extra RAM on XL and XE machines.

Graphics and sound support

Atari BASIC has built-in support for sound, setting up the screen graphics, joysticks, and paddles. The underlying operating system included a routine to fill arbitrary shapes, but BASIC did not have a dedicated command for it, so it instead had to be called with the XIO command.
There is no dedicated command for clearing the screen in Atari BASIC; this is usually done with PRINT CHR$(125), which PRINTs the clear screen control code. Atari BASIC does not include a TAB function; this can be simulated by either POKEing the cursor column position at $55 or the tab position at $C9, which has a default value of 10. The changed values will not take effect until a PRINT statement is executed. There is also no SPC function in Atari BASIC.
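A minimal sketch of both workarounds follows. It assumes the usual clear-screen character, CHR$(125), and uses decimal 85, which is the cursor column location $55 mentioned above.

```basic
10 PRINT CHR$(125):REM clear the screen
20 POKE 85,15:PRINT "STARTS AT COLUMN 15"
30 POKE 85,5:PRINT "STARTS AT COLUMN 5"
```

Each POKE only affects the next PRINT, so it has to be repeated for every line that needs positioning.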
Advanced aspects of the hardware such as player/missile graphics, redefined character sets, scrolling, and custom graphics modes are not supported by BASIC; these will require machine language routines or PEEK/POKE statements. A few graphics modes cannot be accessed from BASIC on the Atari 400/800 as the OS ROMs do not support them; the only way to access them is in machine language by setting the ANTIC registers and Display List manually. The OS ROMs on the XL/XE added support for these modes.
Bitmap modes in BASIC are normally set to have a text window occupying the last three rows at the bottom of the screen so the user may display prompts and enter data in a program. If a 16 is added to the mode number invoked via the GRAPHICS statement, the entire screen will be in bitmap mode. If bitmap mode in full screen is invoked, Atari BASIC will automatically switch back into text mode when program execution is terminated unlike many other BASICs which leave the user in bitmap mode and have an unreadable screen that can only be switched out of via typing a blind command or resetting the computer.
Bitmap coordinates run from 0 to the maximum row/column minus one, thus in Mode 6, the maximum coordinates for a pixel can be 159 and 191. If the user goes over the allowed coordinates for the mode, BASIC will exit out with an error.
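The following sketch draws a diagonal across a full-screen bitmap display. It assumes GRAPHICS 7, which Atari's documentation gives as 160 by 96 pixels when the text window is removed; the mode and coordinates are illustrative choices.

```basic
10 GRAPHICS 7+16:REM adding 16 removes the text window
20 COLOR 1
30 PLOT 0,0
40 DRAWTO 159,95
50 GOTO 50:REM hold the display until BREAK is pressed
```

Line 50 is needed because the screen reverts to text mode as soon as the program ends. Changing line 40 to use a coordinate outside the mode's range should stop the program with an error.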

Advanced techniques

Line labels

Atari BASIC allows numeric variables and expressions to be used to supply line numbers to GOTO and GOSUB commands. For instance, a subroutine that clears the screen can be written as GOSUB CLEARSCREEN, which is easier to understand than GOSUB 10000.
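A minimal sketch of this technique is shown below. The variable has to be assigned before it is used, and the line number 1000 is an arbitrary choice for illustration.

```basic
10 CLEARSCREEN=1000
20 GOSUB CLEARSCREEN
30 PRINT "SCREEN CLEARED"
40 END
1000 REM Clear the screen
1010 PRINT CHR$(125)
1020 RETURN
```

If the subroutine is later moved to a different line number, the assignment in line 10 must be updated to match.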

Includes

Most BASICs of the era allow the LIST command to send the source code to a printer or other device. Atari BASIC also includes the ENTER command, which reads source code from a device and merges it back into the program, as if the user had typed it in. This allows programs to be saved out in sections, reading them in using ENTER to merge or replace existing code. By carefully using blocks of line numbers that do not overlap, programmers can build libraries of subroutines and merge them into new programs as needed.
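In immediate mode, the workflow might look like the following sketch. The file names and the line range 10000 to 10999 are hypothetical.

```basic
LIST "D:SUBS.LST",10000,10999
LOAD "D:MAIN.BAS"
ENTER "D:SUBS.LST"
```

The first command writes the library lines out as text, the second loads another program, and the third merges the library into it. Any line in the file whose number matches an existing line replaces that line.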

Embedded machine language

Atari BASIC can call machine code subroutines. The machine code is generally stored in strings, which can be anywhere in memory so the code needs to be position independent, or in the 256-byte Page 6 area, which is not used by BASIC or the operating system. Code can be loaded into Page 6 by reading it from DATA statements.
Machine code is invoked with the USR function. The first parameter is the address of the machine code routine and the following values are parameters. For example, if the machine language code is stored in a string named ROUTINE$ it can be called with parameters as X=USR(ADR(ROUTINE$),P1,P2), where the ADR function supplies the address of the string and P1 and P2 stand for the parameters.
Parameters are pushed onto the hardware stack as 16-bit integers in the order specified in the USR function in low byte, high byte order. The last value pushed to the stack is a byte indicating the number of arguments. The machine language code must remove all of these values before returning via the RTS instruction. A value can be returned to the BASIC program by placing it in addresses 212 and 213 (decimal) as a 16-bit integer.
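A minimal sketch of the whole sequence follows, using a ten-byte routine written for this illustration. It takes no parameters, pulls the argument count from the stack, stores the value 42 in locations 212 and 213, and returns. Page 6 starts at decimal 1536.

```basic
10 REM Routine: PLA / LDA #42 / STA 212 / LDA #0 / STA 213 / RTS
20 FOR I=0 TO 9:READ B:POKE 1536+I,B:NEXT I
30 X=USR(1536)
40 PRINT X
50 DATA 104,169,42,133,212,169,0,133,213,96
```

The program should print 42. A routine that accepts parameters would need two further PLA instructions per parameter before the RTS.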

Performance

In theory, Atari BASIC should run faster than contemporary BASICs based on the MS pattern. Because the source code is fully tokenized when it is entered, the entire tokenization and parsing steps are already complete. Even complex mathematical operations are ready-to-run, with any numerical constants already converted to the internal 40-bit format, and variable values are looked up by address rather than having to be searched for. In spite of these theoretical advantages, in practice, Atari BASIC is slower than other home computer BASICs, often by a large amount.
On two widely used benchmarks of the era, Byte magazine's Sieve of Eratosthenes and the Creative Computing benchmark test written by David H. Ahl, the Atari finished near the end of the list in terms of performance, and was much slower than the contemporary Apple II or Commodore PET, in spite of having the same CPU but running it at roughly twice the speed of either. It finished behind relatively slow machines like the Sinclair ZX81 and even some programmable calculators.
Most of the language's slowness stemmed from three problems.
The first is that the floating-point math routines were poorly optimized. In the Ahl benchmark, a single exponent operation, which internally loops over the slow multiplication function, was responsible for much of the machine's poor showing.
In addition to performing most mathematical operations slowly, the conversion between the internal floating-point format and the 16-bit integers used in certain parts of the language was relatively slow. Internally, these integers were used for line numbers and array indexing, along with a few other tasks, but numbers in the tokenized program were always stored in binary coded decimal format. Whenever one of these is encountered, for instance, in the line number in GOTO 100, the tokenized BCD value has to be converted to an integer, an operation that can take as long as 3500 microseconds. Other BASICs avoided this delay by special-casing the conversion of numbers that could only possibly be integers, like the line number following a GOTO, switching to special ASCII-to-integer code to improve performance.
Another problem is due to how Atari BASIC implemented branches. To perform a branch in a GOTO or GOSUB, the interpreter searches through the entire program for the matching line number it needs. One minor improvement found in most Microsoft-derived BASICs is to compare the target line number to the current line number, and search forward from that point if it is greater, or start from the top if less. This improvement was missing in Atari BASIC.
The most serious problem was the implementation of FOR...NEXT loops. Almost all BASICs, including MS-derived versions, would push a pointer to the location of the FOR on a stack, so when it reached the NEXT it could easily return to the FOR again in a single branch operation. Atari BASIC pushed the line number instead. This meant every time a NEXT was encountered, the system had to search through the entire program to find the corresponding FOR line. As a result, any loops in an Atari BASIC program cause a large loss of performance relative to other BASICs.
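One way to observe loop overhead is to time an empty loop with the system's jiffy clock. The sketch below assumes the real-time clock bytes at locations 19 and 20 from the standard Atari memory map, with location 20 counting once per video frame; neither detail comes from the text above.

```basic
10 REM Time an empty loop using the jiffy clock
20 POKE 19,0:POKE 20,0
30 FOR I=1 TO 1000:NEXT I
40 T=PEEK(19)*256+PEEK(20)
50 PRINT T;" JIFFIES"
```

If the description above holds, moving the loop to the end of a long program should increase the reported time, since each NEXT would have to search past more lines to find the FOR.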
Several BASICs for the Atari addressed some or all of these issues, resulting in large performance gains. BASIC XL reduced the time for the Byte benchmark from 194 to 58 seconds, over three times as fast. This was accomplished by caching the location of FOR/NEXT loops, as in other BASICs, and by using the same cache to perform GOTO and GOSUB line lookups for further improvements. Turbo-Basic XL included a different solution to the line-lookup issue, as well as a rewritten, high-performance floating-point library. On the Ahl benchmark, Atari BASIC required 405 seconds, while exactly the same code in Turbo-Basic XL took 41.6 seconds, an order of magnitude improvement.

Differences from Microsoft BASIC
