Colossus computer


Colossus was a set of computers developed by British codebreakers in the years 1943–1945 to help in the cryptanalysis of the Lorenz cipher. Colossus used thermionic valves to perform Boolean and counting operations. Colossus is thus regarded as the world's first programmable, electronic, digital computer, although it was programmed by switches and plugs and not by a stored program.
Colossus was designed by General Post Office research telephone engineer Tommy Flowers to solve a problem posed by mathematician Max Newman at the Government Code and Cypher School at Bletchley Park. Alan Turing's use of probability in cryptanalysis contributed to its design. It has sometimes been erroneously stated that Turing designed Colossus to aid the cryptanalysis of the Enigma. Turing's machine that helped decode Enigma was the electromechanical Bombe, not Colossus.
The prototype, Colossus Mark 1, was shown to be working in December 1943 and was in use at Bletchley Park by early 1944. An improved Colossus Mark 2, which used shift registers to quintuple the processing speed, first worked on 1 June 1944, just in time for the Normandy landings on D-Day. Ten Colossi were in use by the end of the war and an eleventh was being commissioned. Bletchley Park's use of these machines allowed the Allies to obtain a vast amount of high-level military intelligence from intercepted radiotelegraphy messages between the German High Command and their army commands throughout occupied Europe.
The existence of the Colossus machines was kept secret until the mid-1970s; the machines and the plans for building them had previously been destroyed in the 1960s as part of the effort to maintain the secrecy of the project. This deprived most of those involved with Colossus of the credit for pioneering electronic digital computing during their lifetimes. A functioning rebuild of a Mark 2 Colossus was completed in 2008 by Tony Sale and some volunteers; it is on display at The National Museum of Computing at Bletchley Park.

Purpose and origins

The Colossus computers were used to help decipher intercepted radio teleprinter messages that had been encrypted using an unknown device. Intelligence information revealed that the Germans called the wireless teleprinter transmission systems "Sägefisch". This led the British to call encrypted German teleprinter traffic "Fish", and the unknown machine and its intercepted messages "Tunny".
Before the Germans increased the security of their operating procedures, British cryptanalysts diagnosed how the unseen machine functioned and built an imitation of it called "British Tunny".
It was deduced that the machine had twelve wheels and used a Vernam ciphering technique on message characters in the standard 5-bit ITA2 telegraph code. It did this by combining the plaintext characters with a stream of key characters using the XOR Boolean function to produce the ciphertext.
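The Vernam combination described above can be illustrated with a minimal Python sketch. The 5-bit values below are hypothetical and are not real ITA2 traffic; the function name `vernam` is chosen here purely for illustration.

```python
def vernam(chars, key):
    """XOR each 5-bit character with the matching key character."""
    if len(chars) != len(key):
        raise ValueError("message and key must be the same length")
    return [(c ^ k) & 0b11111 for c, k in zip(chars, key)]


# Hypothetical 5-bit values, used only to show the mechanics.
plaintext = [0b10110, 0b00001, 0b11100, 0b01010]
key = [0b01101, 0b11111, 0b00000, 0b10101]

ciphertext = vernam(plaintext, key)
# XOR is its own inverse, so applying the same key again recovers the plaintext.
recovered = vernam(ciphertext, key)
```

Because XOR undoes itself, the same operation serves for both enciphering and deciphering, provided both ends generate the same key stream.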
In August 1941, a blunder by German operators led to the transmission of two versions of the same message with identical machine settings. These were intercepted and worked on at Bletchley Park. First, John Tiltman, a very talented GC&CS cryptanalyst, derived a key stream of almost 4000 characters. Then Bill Tutte, a newly arrived member of the Research Section, used this keystream to work out the logical structure of the Lorenz machine. He deduced that the twelve wheels consisted of two groups of five, which he named the χ and ψ wheels; the remaining two he called μ or "motor" wheels. The chi wheels stepped regularly with each letter that was encrypted, while the psi wheels stepped irregularly, under the control of the motor wheels.
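Why two messages sent with identical settings were so damaging can be shown with a short Python sketch. It demonstrates only the underlying XOR property, not Tiltman's actual working method, and all values are hypothetical.

```python
def xor_streams(a, b):
    """XOR two equal-length streams of 5-bit characters."""
    return [x ^ y for x, y in zip(a, b)]


# Hypothetical key stream, reused for both messages (the operators' blunder).
key = [0b01101, 0b11111, 0b00000, 0b10101]
plain_a = [0b10110, 0b00001, 0b11100, 0b01010]
plain_b = [0b10110, 0b00011, 0b00100, 0b01110]

cipher_a = xor_streams(plain_a, key)
cipher_b = xor_streams(plain_b, key)

# The shared key cancels: what remains is the XOR of the two plaintexts,
# which a cryptanalyst can attack without knowing the key.
combined = xor_streams(cipher_a, cipher_b)
```

Once the two plaintexts have been teased apart from the combined stream, XOR-ing either one with its ciphertext yields the key stream itself.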
With a sufficiently random keystream, a Vernam cipher removes the uneven frequency distribution of characters that is natural to a plaintext message, producing a uniform distribution in the ciphertext. The Tunny machine did this well. However, the cryptanalysts worked out that the frequency distribution of the character-to-character changes in the ciphertext, rather than of the plain characters, departed from uniformity, which provided a way into the system. This was achieved by "differencing", in which each bit or character was XOR-ed with its successor. After Germany surrendered, Allied forces captured a Tunny machine and discovered that it was the electromechanical Lorenz SZ in-line cipher machine.
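Differencing is simple to express in code. The following Python sketch uses hypothetical 5-bit values, and the function name `delta` is an illustrative choice.

```python
def delta(stream):
    """XOR each character with its successor; the result is one item shorter."""
    return [a ^ b for a, b in zip(stream, stream[1:])]


# Hypothetical 5-bit characters.
stream = [0b00001, 0b00011, 0b00011, 0b10000]
differenced = delta(stream)

# Differencing distributes over the Vernam combination:
# delta(plain XOR key) equals delta(plain) XOR delta(key).
plain = [0b10110, 0b00001, 0b11100, 0b01010]
key = [0b01101, 0b11111, 0b00000, 0b10101]
cipher = [p ^ k for p, k in zip(plain, key)]
lhs = delta(cipher)
rhs = [p ^ k for p, k in zip(delta(plain), delta(key))]
```

A repeated character differences to zero, so any tendency of the underlying streams to repeat shows up as an excess of zeros in the differenced stream.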
In order to decrypt the transmitted messages, two tasks had to be performed. The first was "wheel breaking", which was the discovery of the cam patterns for all the wheels. These patterns were set up on the Lorenz machine and then used for a fixed period of time for a succession of different messages. Each transmission, which often contained more than one message, was enciphered with a different start position of the wheels. Alan Turing invented a method of wheel-breaking that became known as Turingery. Turing's technique was further developed into "Rectangling", for which Colossus could produce tables for manual analysis. Colossi 2, 4, 6, 7 and 9 had a "gadget" to aid this process.
The second task was "wheel setting", which worked out the start positions of the wheels for a particular message, and could only be attempted once the cam patterns were known. It was this task for which Colossus was initially designed. To discover the start position of the chi wheels for a message, Colossus compared two character streams, counting statistics from the evaluation of programmable Boolean functions. The two streams were the ciphertext, which was read at high speed from a paper tape, and the key stream, which was generated internally, in a simulation of the unknown German machine. After a succession of different Colossus runs to discover the likely chi-wheel settings, they were checked by examining the frequency distribution of the characters in processed ciphertext. Colossus produced these frequency counts.

Decryption processes

By using differencing and knowing that the psi wheels did not advance with each character, Tutte worked out that trying just two differenced bits of the chi-stream against the differenced ciphertext would produce a statistic that was non-random. This became known as Tutte's "1+2 break in". It involved calculating the following Boolean function, in which ⊕ denotes XOR and the subscripts 1 and 2 select the first two impulses (bits) of each character:
ΔZ1 ⊕ ΔZ2 ⊕ Δχ1 ⊕ Δχ2
and counting the number of times it yielded "false". If this number exceeded a pre-defined threshold value known as the "set total", it was printed out. The cryptanalyst would examine the printout to determine which of the putative start positions was most likely to be the correct one for the chi-1 and chi-2 wheels.
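The counting procedure can be sketched in Python on synthetic data. This is an illustration under stated assumptions, not a model of the hardware: the function evaluated is taken to be the XOR of the first two differenced ciphertext impulses with the first two differenced chi impulses, the wheel lengths of 41 and 31 are not given in the text above, and the wheel patterns, message and threshold are all invented.

```python
import random

CHI1_LEN, CHI2_LEN = 41, 31  # assumed wheel lengths, not stated in the text


def delta(bits):
    """Difference a bit stream: each bit XOR its successor."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def cyclic_delta(pattern):
    """Difference a wheel pattern, wrapping round at the end of the wheel."""
    n = len(pattern)
    return [pattern[i] ^ pattern[(i + 1) % n] for i in range(n)]


def one_plus_two_run(z1, z2, chi1, chi2, set_total):
    """Try every pair of start positions; keep counts above the set total."""
    dz12 = [a ^ b for a, b in zip(delta(z1), delta(z2))]
    dchi1, dchi2 = cyclic_delta(chi1), cyclic_delta(chi2)
    hits = []
    for s1 in range(len(chi1)):
        for s2 in range(len(chi2)):
            count = sum(
                1
                for i, z in enumerate(dz12)
                if z ^ dchi1[(s1 + i) % len(dchi1)] ^ dchi2[(s2 + i) % len(dchi2)] == 0
            )
            if count > set_total:
                hits.append((count, s1, s2))
    return sorted(hits, reverse=True)


# Synthetic test message (all values invented for the demonstration).
rng = random.Random(1944)
chi1 = [rng.randint(0, 1) for _ in range(CHI1_LEN)]
chi2 = [rng.randint(0, 1) for _ in range(CHI2_LEN)]
true_s1, true_s2 = 17, 5
n = 2000

# x stands in for (impulse 1 XOR impulse 2) of the stream left after the
# chi component is removed; it changes rarely, so its delta is mostly 0.
x, bit = [], 0
for _ in range(n):
    if rng.random() < 0.2:
        bit ^= 1
    x.append(bit)
d1 = [rng.randint(0, 1) for _ in range(n)]
d2 = [a ^ b for a, b in zip(d1, x)]
z1 = [d ^ chi1[(true_s1 + i) % CHI1_LEN] for i, d in enumerate(d1)]
z2 = [d ^ chi2[(true_s2 + i) % CHI2_LEN] for i, d in enumerate(d2)]

hits = one_plus_two_run(z1, z2, chi1, chi2, set_total=int(0.7 * (n - 1)))
best_count, best_s1, best_s2 = hits[0]
```

At the correct pair of start positions the chi contribution cancels, leaving only the biased remainder, so that setting produces the highest count. Wrong settings give counts close to half the message length.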
This technique would then be applied to other pairs of, or single, impulses to determine the likely start position of all five chi wheels. From this, the de-chi of a ciphertext could be obtained, from which the psi component could be removed by manual methods. If the frequency distribution of characters in the de-chi version of the ciphertext was within certain bounds, "wheel setting" of the chi wheels was considered to have been achieved, and the message settings and de-chi were passed to the "Testery". This was the section at Bletchley Park led by Major Ralph Tester where the bulk of the decrypting work was done by manual and linguistic methods.
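Producing a de-chi and its character counts can be sketched as follows in Python. The wheels here are short toy patterns invented for the example, not the real cam patterns or wheel lengths, and no acceptance bounds are shown because the text does not state them.

```python
from collections import Counter


def chi_stream(wheels, starts, length):
    """Build 5-bit chi characters; every chi wheel steps once per character."""
    out = []
    for i in range(length):
        ch = 0
        for wheel, start in zip(wheels, starts):
            ch = (ch << 1) | wheel[(start + i) % len(wheel)]
        out.append(ch)
    return out


def de_chi(ciphertext, wheels, starts):
    """Strip the chi component from the ciphertext by XOR."""
    chi = chi_stream(wheels, starts, len(ciphertext))
    return [z ^ c for z, c in zip(ciphertext, chi)]


# Toy wheel patterns and start positions (hypothetical).
wheels = [[1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0, 1], [0, 1], [1, 0, 0]]
starts = [1, 2, 0, 1, 2]

# Stand-in for plaintext combined with the psi stream, skewed toward one value.
stand_in = [3, 3, 3, 0, 3, 3, 31, 3, 24, 3]
ciphertext = [d ^ c for d, c in zip(stand_in, chi_stream(wheels, starts, len(stand_in)))]

recovered = de_chi(ciphertext, wheels, starts)
counts = Counter(recovered)  # frequency count of each 5-bit character
```

With the correct chi settings the counts are visibly uneven; with wrong settings the de-chi stays close to uniform, which is the property the frequency check relied on.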
Colossus could also derive the start position of the psi and motor wheels, but this was not much done until the last few months of the war, when there were plenty of Colossi available and the number of Tunny messages had declined.

Design and construction

Colossus was developed for the "Newmanry", the section headed by the mathematician Max Newman that was responsible for machine methods against the twelve-rotor Lorenz SZ40/42 on-line teleprinter cipher machine. The Colossus design arose out of a prior project that produced a counting machine dubbed "Heath Robinson". Although it proved the concept of machine analysis for this part of the process, it was initially unreliable. The electro-mechanical parts were relatively slow, and it was difficult to synchronise two looped paper tapes, one containing the enciphered message and the other representing part of the key stream of the Lorenz machine. The tapes also tended to stretch when being read at up to 2000 characters per second.
Tommy Flowers MBE was a senior electrical engineer and Head of the Switching Group at the Post Office Research Station at Dollis Hill. Prior to his work on Colossus, he had been involved with GC&CS at Bletchley Park from February 1941 in an attempt to improve the Bombes that were used in the cryptanalysis of the German Enigma cipher machine. He was recommended to Max Newman by Alan Turing, who had been impressed by his work on the Bombes. The main components of the Heath Robinson machine were as follows.
Flowers had been brought in to design the Heath Robinson's combining unit. He was not impressed by the system of a key tape that had to be kept synchronised with the message tape and, on his own initiative, he designed an electronic machine which eliminated the need for the key tape by having an electronic analogue of the Lorenz machine. He presented this design to Max Newman in February 1943, but the idea that the one to two thousand thermionic valves proposed could work together reliably was greeted with great scepticism, so more Robinsons were ordered from Dollis Hill. Flowers, however, knew from his pre-war work that most thermionic valve failures occurred as a result of the thermal stresses at power-up, so not powering a machine down reduced failure rates to very low levels. Additionally, the heaters were started at a low voltage and then slowly brought up to full voltage to reduce the thermal stress. The valves themselves were soldered in to avoid problems with plug-in bases, which could be unreliable. Flowers persisted with the idea and obtained support from the Director of the Research Station, W Gordon Radley. Flowers and his team of some fifty people in the switching group spent eleven months from early February 1943 designing and building a machine that dispensed with the second tape of the Heath Robinson by generating the wheel patterns electronically. Flowers used some of his own money for the project.
This prototype, Mark 1 Colossus, contained 1600 thermionic valves. It performed satisfactorily at Dollis Hill on 8 December 1943 and was dismantled and shipped to Bletchley Park, where it was delivered on 18 January and re-assembled by Harry Fensom and Don Horwood. It was operational in January and it successfully attacked its first message on 5 February 1944. It was a large structure and was dubbed "Colossus", supposedly by the WRNS operators. However, a memo held in the National Archives written by Max Newman on 18 January 1944 records that "Colossus arrives today".
During the development of the prototype, an improved design had been developed – the Mark 2 Colossus. Four of these were ordered in March 1944 and by the end of April the number on order had been increased to twelve. Dollis Hill was put under pressure to have the first of these working by 1 June. Allen Coombs took over leadership of the production Mark 2 Colossi, the first of which – containing 2400 valves – became operational at 08:00 on 1 June 1944, just in time for the Allied Invasion of Normandy on D-Day. Subsequently, Colossi were delivered at the rate of about one a month. By the time of V-E Day there were ten Colossi working at Bletchley Park and a start had been made on assembling an eleventh.
The main units of the Mark 2 design were as follows.
Most of the design of the electronics was the work of Tommy Flowers, assisted by William Chandler, Sidney Broadhurst and Allen Coombs, with Erie Speight and Arnold Lynch developing the photoelectric reading mechanism. Coombs remembered Flowers, having produced a rough draft of his design, tearing it into pieces that he handed out to his colleagues for them to do the detailed design and get their team to manufacture it. The Mark 2 Colossi were both five times faster and simpler to operate than the prototype.
Data input to Colossus was by photoelectric reading of a paper tape transcription of the enciphered intercepted message. This was arranged in a continuous loop so that it could be read and re-read multiple times – there being no internal storage for the data. The design overcame the problem of synchronising the electronics with the speed of the message tape by generating a clock signal from reading its sprocket holes. The speed of operation was thus limited by the mechanics of reading the tape. During development, the tape reader was tested up to 9700 characters per second before the tape disintegrated, so 5000 characters per second was settled on as the speed for regular use. Flowers designed a 6-character shift register, which was used both for computing the delta function and for testing five different possible starting points of Tunny's wheels in the five processors. This five-way parallelism enabled five simultaneous tests and counts to be performed, giving an effective processing speed of 25,000 characters per second. The computation used algorithms devised by W. T. Tutte and colleagues to decrypt a Tunny message.
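One plausible reading of how a 6-character shift register could serve both purposes is sketched below in Python for a single impulse. The text does not describe the actual wiring, so the arrangement shown here is an assumption: processor k is taken to work on the pair of characters delayed by k and k+1 steps, which is equivalent to testing a wheel start position offset by k. All names and values are hypothetical.

```python
from collections import deque


def count_single(z, dchi, start):
    """Reference: one tape pass testing one start position."""
    n = len(dchi)
    return sum(
        1 for j in range(len(z) - 1)
        if (z[j] ^ z[j + 1] ^ dchi[(start + j) % n]) == 0
    )


def count_five(z, dchi, start):
    """One tape pass testing start, start+1, ..., start+4 in parallel."""
    n = len(dchi)
    reg = deque([None] * 6, maxlen=6)  # reg[0] holds the newest character
    counts = [0] * 5
    for t in range(len(z) + 4):  # run on briefly so delayed taps finish
        reg.appendleft(z[t] if t < len(z) else None)
        wheel_bit = dchi[(start + t - 1) % n]  # one shared wheel output
        for k in range(5):
            newer, older = reg[k], reg[k + 1]
            if newer is None or older is None:
                continue  # tap is not yet filled, or has run off the message
            # newer XOR older is the delta of the delayed ciphertext bit
            if (newer ^ older ^ wheel_bit) == 0:
                counts[k] += 1
    return counts


# Hypothetical single-impulse ciphertext bits and differenced wheel pattern.
z = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0]
dchi = [1, 1, 0, 1, 0, 0, 1]
parallel = count_five(z, dchi, start=2)
sequential = [count_single(z, dchi, 2 + k) for k in range(5)]
```

Six stored characters yield five adjacent pairs, so one register supplies five differenced values per clock tick. Five counts per pass at 5000 characters per second gives the effective rate of 25,000 characters per second.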

Operation

The Newmanry was staffed by cryptanalysts, operators from the Women's Royal Naval Service – known as "Wrens" – and engineers who were permanently on hand for maintenance and repair. By the end of the war the staffing was 272 Wrens and 27 men.
The first job in operating Colossus for a new message was to prepare the paper tape loop. This was performed by the Wrens who stuck the two ends together using Bostik glue, ensuring that there was a 150-character length of blank tape between the end and the start of the message. Using a special hand punch they inserted a start hole between the third and fourth channels sprocket holes from the end of the blank section, and a stop hole between the fourth and fifth channels sprocket holes from the end of the characters of the message. These were read by specially positioned photocells and indicated when the message was about to start and when it ended. The operator would then thread the paper tape through the gate and around the pulleys of the bedstead and adjust the tension. The two-tape bedstead design had been carried on from Heath Robinson so that one tape could be loaded whilst the previous one was being run. A switch on the Selection Panel specified the "near" or the "far" tape.
After performing various resetting and zeroizing tasks, the Wren operators would, under instruction from the cryptanalyst, operate the "set total" decade switches and the K2 panel switches to set the desired algorithm. They would then start the bedstead tape motor and lamp and, when the tape was up to speed, operate the master start switch.

Programming

Howard Campaigne, a mathematician and cryptanalyst from the US Navy's OP-20-G, wrote the following in a foreword to Flowers' 1983 paper "The Design of Colossus".
Colossus was not a stored-program computer. The input data for the five parallel processors was read from the looped message paper tape and the electronic pattern generators for the chi, psi and motor wheels. The programs for the processors were set and held on the switches and jack panel connections. Each processor could evaluate a Boolean function and count and display the number of times it yielded the specified value of "false" or "true" for each pass of the message tape.
Input to the processors came from two sources: the shift registers from tape reading, and the thyratron rings that emulated the wheels of the Tunny machine. The characters on the paper tape were called Z and the characters from the Tunny emulator were referred to by the Greek letters that Bill Tutte had given them when working out the logical structure of the machine. On the selection panel, switches specified either Z or ΔZ, either χ or Δχ, and either ψ or Δψ for the data to be passed to the jack field and "K2 switch panel". These signals from the wheel simulators could be specified as stepping on with each new pass of the message tape or not.
The K2 switch panel had a group of switches on the left-hand side to specify the algorithm. The switches on the right-hand side selected the counter to which the result was fed. The plugboard allowed less specialized conditions to be imposed. Overall the K2 switch panel switches and the plugboard allowed about five billion different combinations of the selected variables.
As an example, a set of runs for a message tape might initially involve two chi wheels, as in Tutte's 1+2 algorithm. Such a two-wheel run was called a long run, taking on average eight minutes unless the parallelism was utilised to cut the time by a factor of five. The subsequent runs might only involve setting one chi wheel, giving a short run taking about two minutes. At first, the choice of the next algorithm to be tried after the initial long run was specified by the cryptanalyst. Experience showed, however, that decision trees for this iterative process could be produced for use by the Wren operators in a proportion of cases.

Influence and fate

Although the Colossus was the first of the electronic digital machines with programmability, albeit limited by modern standards, it was not a general-purpose machine, being designed for a range of cryptanalytic tasks, most involving counting the results of evaluating Boolean algorithms.
A Colossus computer was thus not a fully Turing complete machine. However, University of San Francisco professor Benjamin Wells has shown that if all ten Colossus machines made were rearranged in a specific cluster, then the entire set of computers could have simulated a universal Turing machine, and thus be Turing complete. The notion of a computer as a general-purpose machine – that is, as more than a calculator devoted to solving difficult but specific problems – did not become prominent until after World War II.
Colossus and the reasons for its construction were highly secret and remained so for 30 years after the War. Consequently, it was not included in the history of computing hardware for many years, and Flowers and his associates were deprived of the recognition they were due. Colossi 1 to 10 were dismantled after the war and parts returned to the Post Office. Some parts, sanitised as to their original purpose, were taken to Max Newman's Royal Society Computing Machine Laboratory at Manchester University. Tommy Flowers was ordered to destroy all documentation and burnt them in a furnace at Dollis Hill. He later said of that order:
Colossi 11 and 12, along with two replica Tunny machines, were retained, being moved to GCHQ's new headquarters at Eastcote in April 1946, and again with GCHQ to Cheltenham between 1952 and 1954. One of the Colossi, known as Colossus Blue, was dismantled in 1959; the other in 1960. There had been attempts to adapt them to other purposes, with varying success; in their later years they had been used for training. Jack Good related how he was the first to use Colossus after the war, persuading the US National Security Agency that it could be used to perform a function for which they were planning to build a special-purpose machine. Colossus was also used to perform character counts on one-time pad tape to test for non-randomness.
A small number of people who were associated with Colossus—and knew that large-scale, reliable, high-speed electronic digital computing devices were feasible—played significant roles in early computer work in the UK and probably in the US. However, being so secret, it had little direct influence on the development of later computers; it was EDVAC that was the seminal computer architecture of the time. In 1972 Herman Goldstine, who was unaware of Colossus and its legacy to the projects of people such as Alan Turing, Max Newman and Harry Huskey, wrote that,
Professor Brian Randell, who unearthed information about Colossus in the 1970s, commented on this, saying that:
Randell's efforts started to bear fruit in the mid-1970s, after the secrecy about Bletchley Park was broken when Group Captain Winterbotham published his book The Ultra Secret in 1974. In October 2000, a 500-page technical report on the Tunny cipher and its cryptanalysis—entitled General Report on Tunny—was released by GCHQ to the national Public Record Office, and it contains a fascinating paean to Colossus by the cryptographers who worked with it:

Reconstruction

Construction of a fully functional rebuild of a Colossus Mark 2 was undertaken between 1993 and 2008 by a team led by Tony Sale. In spite of the blueprints and hardware having been destroyed, a surprising amount of material survived, mainly in engineers' notebooks, a considerable amount of it in the U.S. The optical tape reader might have posed the biggest problem, but Dr. Arnold Lynch, its original designer, was able to redesign it to his own original specification. The reconstruction is on display, in the historically correct place for Colossus No. 9, at The National Museum of Computing, in H Block, Bletchley Park, in Milton Keynes, Buckinghamshire.
In November 2007, to celebrate the project completion and to mark the start of a fundraising initiative for The National Museum of Computing, a Cipher Challenge pitted the rebuilt Colossus against radio amateurs worldwide in being first to receive and decode three messages enciphered using the Lorenz SZ42 and transmitted from radio station DL0HNF in the Heinz Nixdorf MuseumsForum computer museum. The challenge was easily won by radio amateur Joachim Schüth, who had carefully prepared for the event and developed his own signal processing and code-breaking code using Ada. The Colossus team were hampered by their wish to use World War II radio equipment, delaying them by a day because of poor reception conditions. Nevertheless, the victor's 1.4 GHz laptop, running his own code, took less than a minute to find the settings for all 12 wheels. The German codebreaker said: "My laptop digested ciphertext at a speed of 1.2 million characters per second—240 times faster than Colossus. If you scale the CPU frequency by that factor, you get an equivalent clock of 5.8 MHz for Colossus. That is a remarkable speed for a computer built in 1944."
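The scaling in the quotation can be checked with a few lines of Python, taking the Colossus tape speed as the 5000 characters per second quoted for regular use. The variable names are illustrative.

```python
laptop_rate = 1_200_000   # characters per second, from the quotation
colossus_rate = 5_000     # characters per second, regular tape speed
laptop_clock_hz = 1.4e9   # 1.4 GHz laptop

speed_ratio = laptop_rate / colossus_rate
equivalent_clock_mhz = laptop_clock_hz / speed_ratio / 1e6

print(speed_ratio, round(equivalent_clock_mhz, 1))  # prints: 240.0 5.8
```

The comparison uses the single-tape reading speed rather than the effective five-way parallel rate of 25,000 characters per second.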
The Cipher Challenge verified the successful completion of the rebuild project. "On the strength of today's performance Colossus is as good as it was six decades ago", commented Tony Sale. "We are delighted to have produced a fitting tribute to the people who worked at Bletchley Park and whose brainpower devised these fantastic machines which broke these ciphers and shortened the war by many months."

Other meanings

There was a fictional computer named Colossus in the 1970 film Colossus: The Forbin Project, which was based on the 1966 novel Colossus by D. F. Jones. This was sheer coincidence, as the novel pre-dates the public release of information about Colossus, or even its name.
Neal Stephenson's novel Cryptonomicon also contains a fictional treatment of the historical role played by Turing and Bletchley Park.
