Run-length limited

Run-length limited (RLL) coding is a line coding technique used to send arbitrary data over a communications channel with bandwidth limits. RLL codes are defined by four main parameters: m, n, d and k. The first two, written m/n, give the rate of the code, while the remaining two specify the minimum number d and the maximum number k of zeros between consecutive ones. RLL coding is used in both telecommunication and storage systems that move a medium past a fixed recording head.
Specifically, RLL bounds the length of stretches of repeated bits during which the signal does not change. If the runs are too long, clock recovery is difficult; if they are too short, the high frequencies might be attenuated by the communications channel. By modulating the data, RLL reduces the timing uncertainty in decoding the stored data, which would otherwise lead to the erroneous insertion or removal of bits when reading the data back. This mechanism ensures that the boundaries between bits can always be accurately found, while efficiently using the media to reliably store the maximal amount of data in a given space.
Early disk drives used very simple encoding schemes, such as the (0,1) RLL FM code, followed by the (1,3) RLL MFM code, which were widely used in hard disk drives until the mid-1980s and are still used in digital optical discs such as CD, DVD, MD, Hi-MD and Blu-ray. Higher density (2,7) RLL and (1,7) RLL codes became the de facto industry standard for hard disks by the early 1990s.

Need for RLL coding

On a hard disk drive, information is represented by changes in the direction of the magnetic field on the disk, and on magnetic media, the playback output is proportional to the density of flux transition. In a computer, information is represented by the voltage on a wire. No voltage on the wire in relation to a defined ground level would be a binary zero, and a positive voltage on the wire in relation to ground represents a binary one. Magnetic media, on the other hand, always carries a magnetic flux - either a "north" pole or a "south" pole. In order to convert the magnetic fields to binary data, some encoding method must be used to translate between the two.
One of the simplest practical codes, modified non-return-to-zero-inverted (NRZI), simply encodes a 1 as a magnetic polarity transition, also known as a "flux reversal", and a 0 as no transition. With the disk spinning at a constant rate, each bit is given an equal time period, a "data window", for the magnetic signal that represents that bit, and the flux reversal, if any, occurs at the start of this window.
In practice this method is not quite that simple: because the playback output is proportional to the density of ones, a long run of zeros means no playback output at all.
In a simple example, consider the binary pattern 101 with a data window of 1 ns. This will be stored on the disk as a change, followed by no change, and then another change. If the preceding magnetic polarity was already positive, the resulting pattern might look like this: −−+. A value of 255, or all binary ones, would be written as −+−+−+−+ or +−+−+−+−. A zero byte would be written as ++++++++ or −−−−−−−−. A 512 byte sector of zeros would be written as 4,096 sequential bits with the same polarity.
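The mapping from bits to written polarity in this example can be sketched in a few lines of Python. This is only an illustration of the rule described above: the function name `nrzi_polarity` and the `start` parameter (the polarity already on the disk before the first bit) are assumptions made for the example, and an ASCII hyphen stands in for the minus sign.

```python
def nrzi_polarity(bits: str, start: str = "+") -> str:
    """Return the polarity ('+' or '-') written in each bit cell."""
    flip = {"+": "-", "-": "+"}
    level = start              # polarity preceding the first bit cell
    cells = []
    for bit in bits:
        if bit == "1":         # a 1 is a flux reversal at the start of the cell
            level = flip[level]
        cells.append(level)    # a 0 leaves the polarity unchanged
    return "".join(cells)


print(nrzi_polarity("101"))        # --+
print(nrzi_polarity("11111111"))   # -+-+-+-+
print(nrzi_polarity("00000000"))   # ++++++++
```

Starting from the opposite polarity (`start="-"`) yields the mirrored patterns, which is why each value above has two possible written forms.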
Since a disk drive is a physical piece of hardware, the rotational speed of the drive can change slightly, due to a change in the motor speed or thermal expansion of the disk platter. The physical media on a floppy disk can also become deformed, causing larger timing errors, and the timing circuit on the controller itself may have small variations in speed. The problem is that, with a long string of zeros, there's no way for the disk drive's controller to know the exact position of the read head, and thus no way to know exactly how many zeros there are. A speed variation of even 0.1% - which is more precise than any practical floppy drive - could result in four bits being added to or removed from the 4,096 bit data stream. Without some form of synchronization and error correction, the data would become completely unusable.
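The arithmetic behind the four-bit figure is simply the relative speed error multiplied by the length of the transition-free run. A minimal sketch, with the hypothetical helper name `slipped_bits`:

```python
SECTOR_BITS = 512 * 8  # a 512-byte sector of zeros: 4,096 bit cells without a transition


def slipped_bits(run_bits: int, speed_error: float) -> float:
    """Bit cells gained or lost over a transition-free run, for a relative speed error."""
    return run_bits * speed_error


# 0.1% speed error over the whole sector: roughly four bit cells of drift
print(round(slipped_bits(SECTOR_BITS, 0.001)))
```

Because the drift grows linearly with the run length, bounding the run length bounds the drift that can accumulate between two transitions.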
The other problem is due to the limits of the magnetic media itself: it is only possible to write so many polarity changes in a certain amount of space, so there is also an upper limit to how many 1s can be written sequentially. This limit depends on the linear velocity and the head gap.
To prevent this problem, data is coded in such a way that long repetitions of a single binary value do not occur. By limiting the number of zeros written consecutively, this makes it possible for the drive controller to stay in sync. By limiting the number of 1's written in a row, the overall frequency of polarity changes is reduced, allowing the drive to store more data in the same amount of space, resulting in either a smaller package for the same amount of data or more storage in the same size package.

History

All codes used to record on magnetic disks have limited the length of transition-free runs and can therefore be characterized as RLL codes. The earliest and simplest variants were given specific names, such as Modified Frequency Modulation, and the name "RLL" is commonly used only for the more complex variants not given such specific names, but the term technically applies to them all.
The first "RLL" code used in hard drives was (2,7) RLL, developed by IBM engineers and first used commercially in 1979 on the IBM 3370 DASD, for use with the 4300 series mainframe. During the late 1980s, PC hard disks began using RLL proper, that is, variants more complex than those that had received their own names, such as MFM. RLL codes have found almost universal application in optical disc recording practice since 1980. In consumer electronics, RLL codes such as EFM are employed in the Compact Disc and MiniDisc, and the EFMPlus code is used in the DVD. Parameters d and k are the minimum and maximum allowed run lengths.

Technical overview

Generally, run length is the number of bits for which the signal remains unchanged. A run length of 3 for bit 1 represents the sequence '111'. For instance, the pattern of magnetic polarizations on the disk might be '+−−−−++−−−++++++', with runs of length 1, 4, 2, 3, and 6. However, run-length limited coding terminology assumes NRZI encoding, in which 1 bits indicate changes and 0 bits indicate the absence of change, so the above sequence would be expressed as '11000101001000001', and only runs of zero bits are counted.
Somewhat confusingly, the run length is the number of zeros between adjacent ones (0, 3, 1, 2, and 5 in the preceding example), which is one less than the number of bit times the signal actually remains unchanged. Run-length limited sequences are characterized by two parameters, d and k, which stipulate the minimum and maximum zero-bit run length that can occur in the sequence. RLL codes are therefore generally specified as (d,k) RLL, for example (1,3) RLL.
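The relationship between polarity runs, the NRZI bit string, and the counted zero runs can be checked with a short Python sketch. The helper names (`polarity_to_nrzi`, `zero_runs`, `satisfies`) are made up for this illustration, an ASCII hyphen stands in for the minus sign, and a trailing 1 is appended by hand to represent the transition that ends the last run, as in the example above.

```python
def polarity_to_nrzi(polarity: str, previous: str = "-") -> str:
    """Write a 1 wherever the polarity differs from the preceding cell, else 0."""
    bits = []
    for level in polarity:
        bits.append("1" if level != previous else "0")
        previous = level
    return "".join(bits)


def zero_runs(bits: str) -> list[int]:
    """Number of 0 bits between each pair of adjacent 1 bits."""
    ones = [i for i, bit in enumerate(bits) if bit == "1"]
    return [later - earlier - 1 for earlier, later in zip(ones, ones[1:])]


def satisfies(bits: str, d: int, k: int) -> bool:
    """True if every zero run between adjacent ones lies within [d, k]."""
    return all(d <= run <= k for run in zero_runs(bits))


# The final "1" stands for the transition that terminates the last run.
bits = polarity_to_nrzi("+----++---++++++") + "1"
print(bits)                    # 11000101001000001
print(zero_runs(bits))         # [0, 3, 1, 2, 5]
print(satisfies(bits, 0, 5))   # True
print(satisfies(bits, 1, 7))   # False: the two adjacent ones give a run of 0
```

Each counted run is one less than the corresponding polarity run (1, 4, 2, 3, 6), matching the description above.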

Coding

In the encoded format a "1" bit indicates a flux transition, while a "0" indicates that the magnetic field on the disk does not change for that time interval.

FM: (0,1) RLL

Generally, the term "RLL code" is used to refer to more elaborate encodings, but the original frequency modulation (FM) code, also called differential Manchester encoding, can be seen as a simple rate-1/2 RLL code.

Data	Encoded
0	10
1	11

The added 1 bits are referred to as clock bits.
Example:
Data: 0 0 1 0 1 1 0 1 0 0 0 1 1 0
Encoded: 1010111011111011101010111110
Clock: 1 1 1 1 1 1 1 1 1 1 1 1 1 1
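As the example shows, FM encoding amounts to placing a clock bit of 1 in front of every data bit. A minimal Python sketch (the function name `fm_encode` is an assumption for this illustration):

```python
def fm_encode(data: str) -> str:
    """FM, i.e. (0,1) RLL: emit a clock bit of 1 before every data bit."""
    return "".join("1" + bit for bit in data)


print(fm_encode("00101101000110"))
# 1010111011111011101010111110
```

Since every second encoded bit is a 1, at most one 0 can ever separate two 1s, which is where the k = 1 limit comes from.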

GCR: (0,2) RLL

By extending the maximum run length to 2 adjacent 0 bits, the data rate can be improved to 4/5. This is the original IBM group coded recording variant.

Where possible, the bit pattern abcd is encoded by prefixing it with the complement of a, giving a′abcd, where a′ denotes NOT a. In the five cases where this would violate one of the rules, a code beginning with 11 is substituted.
Example:
Data: 0010 1101 0001 1000
Encoded: 10010011011101111010
Note that to meet the definition of (0,2) RLL, it is not sufficient that each 5-bit code contain no more than two consecutive zeros; it is also necessary that any pair of 5-bit codes, when placed in sequence, contain no more than two consecutive zeros. That is, there must not be more than two zeros between the last one bit in the first code and the first one bit in the second code, for any two arbitrarily chosen codes. This is required because, for any RLL code, the run length limits (0 and 2 in this case) apply to the overall modulated bitstream, not just to the components of it that represent discrete sequences of plain data bits. The IBM GCR code above meets this condition, since the maximum run length of zeros at the beginning of any 5-bit code is one, and likewise the maximum run length at the end of any code is one, making a total run length of two at the junction between adjacent codes.
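The count of five exceptional patterns can be reproduced by applying the prefix rule to all sixteen 4-bit values and testing each result against the constraints just described: no run of three zeros inside a code, and at most one zero at either end. The sketch below only identifies the violating patterns; it does not reproduce the substitute codes, and the helper names are hypothetical.

```python
from itertools import product


def prefixed(nibble: str) -> str:
    """Prefix the 4-bit pattern abcd with the complement of its first bit a."""
    return ("1" if nibble[0] == "0" else "0") + nibble


def violates(code: str) -> bool:
    """True if a 5-bit code breaks the (0,2) rules, alone or at a junction."""
    return "000" in code or code.startswith("00") or code.endswith("00")


nibbles = ["".join(bits) for bits in product("01", repeat=4)]
exceptions = [n for n in nibbles if violates(prefixed(n))]
print(exceptions)   # ['0000', '0001', '0100', '1000', '1100']
```

The two encoded values from the example that begin with 11 (for data 0001 and 1000) both fall in this list, while 0010 and 1101 follow the plain prefix rule.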

MFM: (1,3) RLL

Modified frequency modulation (MFM) is more interesting, because its properties allow its bits to be written to a magnetic medium at twice the density of an arbitrary bit stream. There is a limit to how close in time flux transitions can be for reading equipment to detect them, and that constrains how closely bits can be recorded on the medium. In the worst case, an arbitrary bit stream contains two consecutive 1s, which produce two consecutive flux transitions, so bits must be spaced far enough apart for the reader to resolve them. This code, however, imposes the constraint d = 1: there is at least one 0 between any two 1s. In the worst case, flux transitions are therefore two bit times apart, so the bits can be packed twice as closely as in the arbitrary bit stream without exceeding the reader's capabilities.
This doubled recording density compensates for the 1/2 coding rate of this code and makes it equivalent to a rate-1 code.

Data	Encoded
0	x0
1	01

Where "x" is the complement of the previous encoded bit. Except for the clock bits (that "x" bit, and the "0" in the "01" code), this is the same as the FM table, and that is how this code gets its name. The inserted clock bits are 0 except between two 0 data bits.
Example:
Data: 0 0 1 0 1 1 0 1 0 0 0 1 1 0
Encoded: x010010001010001001010010100
Clock: x 1 0 0 0 0 0 0 0 1 1 0 0 0
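The MFM rule can be sketched in Python as follows. The function name `mfm_encode` and the `previous` parameter (the encoded bit that precedes the data, which determines the leading "x") are assumptions made for this illustration.

```python
def mfm_encode(data: str, previous: str = "0") -> str:
    """MFM, i.e. (1,3) RLL: 1 -> 01, 0 -> x0 with x = NOT(previous encoded bit)."""
    out = []
    for bit in data:
        if bit == "1":
            pair = "01"                     # clock bit 0, then the data bit
        else:
            clock = "1" if previous == "0" else "0"
            pair = clock + "0"              # clock is 1 only between two 0 data bits
        out.append(pair)
        previous = pair[-1]                 # last encoded bit equals the data bit
    return "".join(out)


encoded = mfm_encode("00101101000110", previous="0")
print(encoded)   # 1010010001010001001010010100
```

With `previous="0"` the leading "x" of the example resolves to 1; the remaining 27 bits are identical to the encoded line above regardless of that choice.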

(1,7) RLL

(1,7) RLL maps 2 bits of data onto 3 bits on the disk, and the encoding is done in groups of two or four bits. The encoding rules are: (x, y) becomes (NOT x, x AND y, NOT y), except that (x, 0, 0, y) becomes (NOT x, x AND y, NOT y, 0, 0, 0).
When encoding according to the table below, the longest match must be used; the four-bit entries are exceptions that handle situations where applying the two-bit rules would violate the code constraints.

Data	Encoded
00	101
01	100
10	001
11	010
00 00	101 000
00 01	100 000
10 00	001 000
10 01	010 000

Example:
Data: 0 0 1 0 1 1 0 1 0 0 0 1 1 0
Encoded: 101 001 010 100 100 000 001
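The longest-match rule can be sketched as a table lookup that tries the four-bit entries before the two-bit ones. The dictionary below is copied from the table above; the names `RLL17_TABLE` and `rll17_encode` are assumptions made for this illustration.

```python
RLL17_TABLE = {
    "00": "101", "01": "100", "10": "001", "11": "010",
    # four-bit exceptions, tried first
    "0000": "101000", "0001": "100000", "1000": "001000", "1001": "010000",
}


def rll17_encode(data: str) -> str:
    """Encode a bit string with (1,7) RLL, always taking the longest table match."""
    out = []
    i = 0
    while i < len(data):
        for size in (4, 2):                 # longest match first
            chunk = data[i:i + size]
            if chunk in RLL17_TABLE:
                out.append(RLL17_TABLE[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError("data length must be a multiple of 2 bits")
    return "".join(out)


print(rll17_encode("00101101000110"))
# 101001010100100000001
```

In the example, only the pair of groups 00 01 hits a four-bit entry; encoding them separately as 101 100 would place two 1s next to each other and break the d = 1 constraint.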

(2,7) RLL

(2,7) RLL is a rate-1/2 code, mapping n bits of data onto 2n bits on the disk, like MFM, but because the minimum run length is 50% longer (3 bit times instead of 2), the bits can be written faster, achieving 50% higher effective data density. The encoding is done in groups of two, three or four bits.
Western Digital WD5010A, WD5011A, WD50C12

Data	RLL Encoded
11	1000
10	0100
000	100100
010	000100
011	001000
0011	00001000
0010	00100100

Seagate ST11R, IBM

Data	RLL Encoded
11	1000
10	0100
000	000100
010	100100
011	001000
0011	00001000
0010	00100100

Perstor Systems ADRC

Data	RLL Encoded
11	1000
10	0100
000	100100
010	000100
001	001000
0111	00001000
0110	00100100

The encoded forms begin with at most four, and end with at most three zero bits, giving the maximum run length of seven.
Example:
Data: 1 1 0 1 1 0 0 1 1
Encoded: 1000 001000 00001000
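Because no data pattern in these tables is a prefix of another, encoding reduces to reading bits until a table entry matches. The sketch below uses the Western Digital table from above; the names `RLL27_WD` and `rll27_encode` are assumptions made for this illustration, and input that ends in the middle of a group is rejected rather than padded.

```python
RLL27_WD = {
    "11": "1000", "10": "0100",
    "000": "100100", "010": "000100", "011": "001000",
    "0011": "00001000", "0010": "00100100",
}


def rll27_encode(data: str, table: dict[str, str] = RLL27_WD) -> str:
    """Encode a bit string with (2,7) RLL using a prefix-free lookup table."""
    out = []
    i = 0
    while i < len(data):
        for size in (2, 3, 4):
            chunk = data[i:i + size]
            if len(chunk) == size and chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            raise ValueError("data ends in the middle of a group; padding needed")
    return "".join(out)


print(rll27_encode("110110011"))
# 100000100000001000
```

The zero runs between ones in this output are 5 and 7, the latter arising where a code ending in three zeros meets one beginning with four.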

DC Free (1,7) RLL

There is also an alternate (1,7) RLL encoding that is sometimes used to avoid a DC bias.

Data	Encoded
00	x01
01	010
10	x00
11 00	010 001
11 01	x00 000
11 10	x00 001
11 11	010 000

Where "x" is the complement of the previous encoded bit.
Example:
Data: 0 1 0 0 1 1 0 1 0 1
Encoded: 010 101 000 000 010
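A table-driven sketch of this variant follows, with the "x" placeholder resolved from the previous encoded bit. The dictionary is copied from the table above; the names `DC_FREE_17` and `dc_free_encode`, and the `previous` parameter giving the bit that precedes the data, are assumptions made for this illustration.

```python
DC_FREE_17 = {
    "00": "x01", "01": "010", "10": "x00",
    "1100": "010001", "1101": "x00000", "1110": "x00001", "1111": "010000",
}


def dc_free_encode(data: str, previous: str = "0") -> str:
    """Encode with the alternate (1,7) table; 'x' = NOT(previous encoded bit)."""
    out = []
    i = 0
    while i < len(data):
        for size in (4, 2):                      # longest match first
            chunk = data[i:i + size]
            if chunk in DC_FREE_17:
                code = DC_FREE_17[chunk]
                break
        else:
            raise ValueError("cannot encode remaining bits: " + data[i:])
        if code[0] == "x":
            code = ("1" if previous == "0" else "0") + code[1:]
        out.append(code)
        previous = code[-1]
        i += len(chunk)
    return "".join(out)


print(dc_free_encode("0100110101"))
# 010101000000010
```

Note that 11 has no two-bit entry of its own in this table, so data ending in a lone 11 group cannot be encoded without further bits; the sketch raises an error in that case.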

HHH(1,13)

The HHH(1,13) code is a rate-2/3 code developed by three IBM researchers for use in the 16 Mbit/s IrDA VFIR physical layer. Unlike magnetic encoding, this is designed for an infrared transmitter where a 0 bit represents "off" and a 1 bit represents "on". Because 1 bits consume more power to transmit, this is designed to limit the density of 1 bits to less than 50%. In particular, it is a (1,13|5) RLL code, where the final 5 indicates the additional constraint that there are at most 5 consecutive "10" bit pairs.

Data	Encoded
00	010
01	001
10	100
11	101
01 10	001 000
01 11	010 000
11 10	101 000
11 11	100 000
00 11 00	010 000 000
00 11 01	001 000 000
10 11 00	100 000 000
10 11 01	101 000 000
00 11 10 11	010 000 000 000
10 11 10 11	100 000 000 000

The first eight rows describe a standard (1,7) RLL code. The additional six exceptions increase the maximum run of zeros to 13, but limit the maximum average ones density to 1/3. The longest run of 1–0 pairs is 000 101 010 101 000.
This code limits the ones density to between 1/12 and 1/3, with an average of 25.8%.

Examples

For example, the bit sequence 10110010 is encoded as follows under the different codes:

Encoding	Data	Encoded
(0,1) RLL	10110010	1110111110101110
(0,2) RLL	1011 0010	01011 10010
(1,3) RLL	10110010	0100010100100100
(1,7) RLL	10 11 00 10	001 010 101 001
(2,7) RLL	10 11 0010	0100 1000 00100100

Densities

Suppose a magnetic tape can contain up to 3,200 flux reversals per inch. A modified frequency modulation, or (1,3) RLL, encoding stores each data bit as two bits on tape, but since there is guaranteed to be at least one 0 bit between any two 1 bits, it is possible to store 6,400 encoded bits per inch on the tape, or 3,200 data bits per inch. A (1,7) RLL encoding can also store 6,400 encoded bits per inch on the tape, but since it takes only 3 encoded bits to store 2 data bits, this is 4,267 data bits per inch. A (2,7) RLL encoding takes 2 encoded bits to store each data bit, but since there are guaranteed to be at least two 0 bits between any two 1 bits, it is possible to store 9,600 encoded bits per inch on the tape, or 4,800 data bits per inch.
The flux reversal densities on hard drives are significantly greater, but the same improvements in storage density are seen by using different encoding systems.
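The tape figures above follow one pattern: the medium can hold d + 1 encoded bits per flux reversal, and the code rate m/n converts encoded bits to data bits. A small Python sketch of that arithmetic, with the hypothetical helper name `data_density`:

```python
from fractions import Fraction


def data_density(flux_per_inch: int, d: int, rate: Fraction) -> Fraction:
    """Data bits per inch for a (d,k) code of the given rate m/n."""
    encoded_bits_per_inch = flux_per_inch * (d + 1)   # at least d zeros separate any two ones
    return encoded_bits_per_inch * rate


codes = [
    ("(1,3) RLL (MFM)", 1, Fraction(1, 2)),
    ("(1,7) RLL", 1, Fraction(2, 3)),
    ("(2,7) RLL", 2, Fraction(1, 2)),
]
for name, d, rate in codes:
    print(name, round(data_density(3200, d, rate)))
# (1,3) RLL (MFM) 3200
# (1,7) RLL 4267
# (2,7) RLL 4800
```

The same function applies to any flux reversal limit, so the relative gains between the codes carry over unchanged to denser media.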
