Jackson structured programming
Jackson structured programming (JSP) is a method for structured programming developed by British software consultant Michael A. Jackson and described in his 1975 book Principles of Program Design. The technique of JSP is to analyze the data structures of the files that a program must read as input and produce as output, and then to produce a program design based on those data structures, so that the program control structure handles them in a natural and intuitive way.
JSP describes both data and programs using three basic structures: sequence, iteration, and selection. These structures are diagrammed as a visual representation of a regular expression.
Introduction
Michael A. Jackson originally developed JSP in the 1970s. He documented the system in his 1975 book Principles of Program Design. In a 2001 conference talk, he provided a retrospective analysis of the original driving forces behind the method, and related it to subsequent software engineering developments. Jackson's aim was to make COBOL batch file processing programs easier to modify and maintain, but the method can be used to design programs for any programming language that has structured control constructs: sequence, iteration, and selection.
Jackson Structured Programming was similar to Warnier/Orr structured programming, although JSP considered both input and output data structures while the Warnier/Orr method focused almost exclusively on the structure of the output stream.
Motivation for the method
At the time that JSP was developed, most programs were batch COBOL programs that processed sequential files stored on tape. A typical program read through its input file as a sequence of records, so that all programs had the same structure: a single main loop that processed all of the records in the file, one at a time. Jackson asserted that this program structure was almost always wrong, and encouraged programmers to look for more complex data structures. In Chapter 3 of Principles of Program Design Jackson presents two versions of a program, one designed using JSP, the other using the traditional single-loop structure. Here is his example, translated from COBOL into Java. The purpose of these two programs is to recognize groups of repeated records in a sorted file, and to produce an output file listing each record and the number of times that it occurs in the file.
Here is the traditional, single-loop version of the program.
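String line;
int count = 0;
String firstLineOfGroup = null;
// begin single main loop
while ((line = in.readLine()) != null) {
    if (firstLineOfGroup == null || !line.equals(firstLineOfGroup)) {
        if (firstLineOfGroup != null) System.out.println(firstLineOfGroup + " " + count);
        count = 0;
        firstLineOfGroup = line;
    }
    count++;
}
if (firstLineOfGroup != null) System.out.println(firstLineOfGroup + " " + count);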
Here is a JSP-style version of the same program. Note that it has two loops, one nested inside the other. The outer loop processes groups of repeating records, while the inner loop processes the individual records in a group.
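String line;
int numberOfLinesInGroup;
line = in.readLine();
// begin outer loop: process 1 group
while (line != null) {
    numberOfLinesInGroup = 0;
    String firstLineOfGroup = line;
    while (line != null && line.equals(firstLineOfGroup)) {
        numberOfLinesInGroup++;
        line = in.readLine();
    }
    System.out.println(firstLineOfGroup + " " + numberOfLinesInGroup);
}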
Jackson criticises the traditional single-loop version for failing to process the structure of the input file in a natural way. One sign of its unnatural design is that, in order to work properly, it is forced to include special code for handling the first and last record of the file.
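The two designs can be compared directly by wrapping each in a method and running both on the same input. The following is a minimal sketch rather than Jackson's own code: the class name GroupCounter, the method names singleLoop and nestedLoops, and the choice to collect output in a string instead of printing it are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class GroupCounter {
    // Traditional design: one loop over records, with special cases
    // for the first record and for the end of the file.
    static String singleLoop(BufferedReader in) throws IOException {
        StringBuilder out = new StringBuilder();
        String line;
        int count = 0;
        String firstLineOfGroup = null;
        while ((line = in.readLine()) != null) {
            if (firstLineOfGroup == null || !line.equals(firstLineOfGroup)) {
                if (firstLineOfGroup != null) {
                    out.append(firstLineOfGroup).append(' ').append(count).append('\n');
                }
                count = 0;
                firstLineOfGroup = line;
            }
            count++;
        }
        if (firstLineOfGroup != null) {
            out.append(firstLineOfGroup).append(' ').append(count).append('\n');
        }
        return out.toString();
    }

    // JSP-style design: the outer loop handles one group per pass,
    // the inner loop handles one record of that group per pass.
    static String nestedLoops(BufferedReader in) throws IOException {
        StringBuilder out = new StringBuilder();
        String line = in.readLine();
        while (line != null) {
            String firstLineOfGroup = line;
            int numberOfLinesInGroup = 0;
            while (line != null && line.equals(firstLineOfGroup)) {
                numberOfLinesInGroup++;
                line = in.readLine();
            }
            out.append(firstLineOfGroup).append(' ').append(numberOfLinesInGroup).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String input = "a\na\nb\nc\nc\nc\n";
        String one = singleLoop(new BufferedReader(new StringReader(input)));
        String two = nestedLoops(new BufferedReader(new StringReader(input)));
        System.out.print(two);
        System.out.println(one.equals(two));
    }
}
```

Both methods produce the same listing for the same sorted input. Only the single-loop version needs the null checks that stand in for "first record" and "end of file".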
The basic method
JSP uses semi-formal steps to capture the existing structure of a program's inputs and outputs in the structure of the program itself. The intent is to create programs which are easy to modify over their lifetime. Jackson's major insight was that requirement changes are usually minor tweaks to the existing structures. For a program constructed using JSP, the inputs, the outputs, and the internal structures of the program all match, so small changes to the inputs and outputs should translate into small changes to the program.
JSP structures programs in terms of four component types:
- fundamental operations
- sequences
- iterations
- selections
Each of the program's inputs and outputs is first described in terms of these component types. The input and output structures are then unified or merged into a final program structure, known as a Program Structure Diagram (PSD). This step may involve the addition of a small amount of high-level control structure to reconcile the inputs and outputs. Some programs process all the input before doing any output, whilst others read in one record, write one record and iterate. Such approaches have to be captured in the PSD.
The PSD, which is language neutral, is then implemented in a programming language. JSP is geared towards programming at the level of control structures, so the implemented designs use just primitive operations, sequences, iterations and selections. JSP is not used to structure programs at the level of classes and objects, although it can helpfully structure control flow within a class's methods.
JSP uses a diagramming notation to describe the structure of inputs, outputs and programs, with diagram elements for each of the fundamental component types.
A simple operation is drawn as a box.
An operation
A sequence of operations is represented by boxes connected with lines. In the example below, operation A consists of the sequence of operations B, C and D.
A sequence
An iteration is again represented with joined boxes. In addition the iterated operation has a star in the top right corner of its box. In the example below, operation A consists of an iteration of zero or more invocations of operation B.
An iteration
Selection is similar to a sequence, but with a circle drawn in the top right hand corner of each optional operation. In the example, operation A consists of one and only one of operations B, C or D.
A selection
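Because the three composite structures mirror the operators of a regular expression, the diagrams above can also be written down as patterns. The sketch below is only an illustration: it treats the component names B, C and D as single characters, and the class name StructureAsRegex and its constants are hypothetical.

```java
import java.util.regex.Pattern;

class StructureAsRegex {
    // Each JSP structure for component A, written as a regular expression
    // over the one-letter component names B, C and D.
    static final Pattern SEQUENCE = Pattern.compile("BCD");     // A is B, then C, then D
    static final Pattern ITERATION = Pattern.compile("B*");     // A is zero or more B
    static final Pattern SELECTION = Pattern.compile("B|C|D");  // A is exactly one of B, C, D

    // True if the whole string of components has the given structure.
    static boolean matches(Pattern structure, String components) {
        return structure.matcher(components).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(SEQUENCE, "BCD"));  // true
        System.out.println(matches(ITERATION, ""));    // true: zero occurrences are allowed
        System.out.println(matches(SELECTION, "BC"));  // false: only one alternative may occur
    }
}
```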
A worked example
As an example, here is how a programmer would design and code a run length encoder using JSP. A run length encoder is a program which takes as its input a stream of bytes. It outputs a stream of pairs consisting of a byte along with a count of the byte's consecutive occurrences. Run length encoders are often used for crudely compressing bitmaps.
With JSP, the first step is to describe the structure of a program's inputs. A run length encoder has only one input, a stream of bytes which can be viewed as zero or more runs. Each run consists of one or more bytes of the same value. This is represented by the following JSP diagram.
The run length encoder input
The second step is to describe the structure of the output. The run length encoder output can be described as zero or more pairs, each pair consisting of a byte and its count. In this example, the count will also be a byte.
The run length encoder output
The next step is to describe the correspondences between the operations in the input and output structures.
The correspondences between the run length encoder's inputs and its outputs
It is at this stage that the astute programmer may encounter a structure clash, in which there is no obvious correspondence between the input and output structures. If a structure clash is found, it is usually resolved by splitting the program into two parts, using an intermediate data structure to provide a common structural framework with which the two program parts can communicate. The two program parts are often implemented as processes or coroutines.
In this example, there is no structure clash, so the two structures can be merged to give the final program structure.
The run length encoder program structure
At this stage the program can be fleshed out by hanging various primitive operations off the elements of the structure. Primitives which suggest themselves are:
- read a byte
- remember byte
- set counter to zero
- increment counter
- output remembered byte
- output counter
The two iterations also need conditions:
- while there are more bytes
- while there are more bytes and this byte is the same as the run's first byte and the count will still fit in a byte
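Putting the structure, the primitive operations and the loop conditions together gives a program along the following lines. This is a sketch in Java, chosen to match the earlier examples, and not a transcription of any published listing. The class name RunLengthEncoder and the use of InputStream and OutputStream are assumptions. Each primitive from the list appears as a comment next to the statement that implements it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class RunLengthEncoder {
    // Writes one (byte, count) pair per run. The outer loop corresponds to
    // "zero or more runs", the inner loop to "one or more bytes of the same value".
    static void encode(InputStream in, OutputStream out) throws IOException {
        int current = in.read();                      // read a byte (-1 means end of input)
        while (current != -1) {                       // while there are more bytes
            int first = current;                      // remember byte
            int count = 0;                            // set counter to zero
            // while there are more bytes, this byte matches the run's first byte,
            // and the count will still fit in a byte
            while (current != -1 && current == first && count < 255) {
                count++;                              // increment counter
                current = in.read();                  // read a byte
            }
            out.write(first);                         // output remembered byte
            out.write(count);                         // output counter
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] input = {'a', 'a', 'a', 'b', 'c', 'c'};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        encode(new ByteArrayInputStream(input), out);
        for (byte b : out.toByteArray()) {
            System.out.print((b & 0xFF) + " ");       // prints: 97 3 98 1 99 2
        }
        System.out.println();
    }
}
```

A run longer than 255 bytes is emitted as several pairs, because the inner loop stops as soon as the counter would no longer fit in a byte.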
Techniques for handling difficult design problems
In Principles of Program Design Jackson recognized situations that posed specific kinds of design problems, and provided techniques for handling them. One of these situations is a case in which a program processes two input files, rather than one. In 1975, one of the standard "wicked problems" was how to design a transaction-processing program. In such a program, a sequential file of update records is run against a sequential master file, producing an updated master file as output. Principles of Program Design provided a standard solution for that problem, along with an explanation of the logic behind the design.
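To make the shape of the two-file problem concrete, the following sketch runs a sorted list of updates against a sorted master list. It is not the solution given in Principles of Program Design. It is a common sequential-merge approach written under several assumptions: records are {key, amount} pairs held in arrays, both inputs are sorted by key, no key equals Integer.MAX_VALUE, and updates without a matching master record are set aside as rejected. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MasterFileUpdate {
    static final int END = Integer.MAX_VALUE;  // sentinel key meaning "file exhausted"

    // Key of the record at index, or END when the file has been read completely.
    static int keyAt(int[][] file, int index) {
        return index < file.length ? file[index][0] : END;
    }

    // master and updates hold {key, amount} rows sorted by key.
    // Returns the updated master; unmatched updates are appended to rejected.
    static List<int[]> apply(int[][] master, int[][] updates, List<int[]> rejected) {
        List<int[]> out = new ArrayList<>();
        int m = 0;
        int u = 0;
        // outer loop: one pass per distinct key, taken in collated order
        while (keyAt(master, m) != END || keyAt(updates, u) != END) {
            int key = Math.min(keyAt(master, m), keyAt(updates, u));
            boolean hasMaster = keyAt(master, m) == key;
            int balance = hasMaster ? master[m++][1] : 0;
            // inner loop: every update record that carries this key
            while (keyAt(updates, u) == key) {
                if (hasMaster) {
                    balance += updates[u][1];
                } else {
                    rejected.add(updates[u]);
                }
                u++;
            }
            if (hasMaster) {
                out.add(new int[] {key, balance});
            }
        }
        return out;
    }
}
```

The program structure follows a merged view of the two inputs: an iteration of key groups, each consisting of an optional master record followed by zero or more updates.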
Another kind of problem involved what Jackson called "recognition difficulties" and today we would call parsing problems. The basic JSP design technique was supplemented by POSIT and QUIT operations to allow the design of what we would now call a backtracking parser.
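The idea behind POSIT and QUIT can be approximated in Java with a labeled block: the block posits one interpretation of the input, and a break out of the block plays the role of QUIT when that interpretation is refuted. This is a rough analogy and not Jackson's notation. The class name PositQuit, the method classify and the number-versus-word task are assumptions chosen to keep the example small.

```java
class PositQuit {
    // Classifies a token as "number <value>" or "word <text>".
    static String classify(String token) {
        posit: {                                          // POSIT: the token is a decimal number
            if (token.isEmpty()) break posit;             // QUIT: nothing to read
            int value = 0;
            for (char c : token.toCharArray()) {
                if (c < '0' || c > '9') break posit;      // QUIT: not a digit
                int digit = c - '0';
                if (value > (Integer.MAX_VALUE - digit) / 10) break posit;  // QUIT: too large for int
                value = value * 10 + digit;
            }
            return "number " + value;                     // the hypothesis held to the end
        }
        // Fallback branch: the hypothesis was refuted, so the partial value is discarded.
        return "word " + token;
    }

    public static void main(String[] args) {
        System.out.println(classify("123"));  // number 123
        System.out.println(classify("12a"));  // word 12a
    }
}
```

The side effects of the abandoned hypothesis are confined to a local variable here. When the posited branch has already produced output or consumed input, those effects must be undone or deferred, which is what makes backtracking designs harder in practice.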
JSP also recognized three situations that are called "structure clashes" (a boundary clash, an ordering clash, and an interleaving clash) and provided techniques for dealing with them. In structure clash situations the input and output data structures are so incompatible that no single program structure can correspond to both, so the output file cannot be produced directly from the input file. It is necessary, in effect, to write two programs: the first processes the input stream, breaks it down into smaller chunks, and writes those chunks to an intermediate file; the second reads the intermediate file and produces the desired output.
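A boundary clash can be illustrated with text reflow, where input lines and output lines both consist of words but their line boundaries do not coincide. The sketch below splits the work into two parts in the manner described above, with an in-memory list of words standing in for the intermediate file. The class name BoundaryClash, the method names and the width parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BoundaryClash {
    // Program 1: structured like the input (an iteration of lines, each an
    // iteration of words). It writes the words to the intermediate list.
    static List<String> toWords(List<String> inputLines) {
        List<String> words = new ArrayList<>();
        for (String line : inputLines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Program 2: structured like the output (an iteration of lines, each holding
    // one or more words and at most width characters where the words allow it).
    static List<String> toLines(List<String> words, int width) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {                         // one output line per pass
            StringBuilder line = new StringBuilder(words.get(i++));  // a line has at least one word
            while (i < words.size() && line.length() + 1 + words.get(i).length() <= width) {
                line.append(' ').append(words.get(i++));   // add words while they still fit
            }
            out.add(line.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("the quick brown", "fox", "jumps over the lazy dog");
        for (String line : toLines(toWords(input), 10)) {
            System.out.println(line);
        }
    }
}
```

Neither method needs to know about the other's line boundaries. The word list is the common structure that both the input and the output can be described in terms of.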