PGP word list

The PGP Word List is a list of words for conveying data bytes in a clear, unambiguous way via a voice channel. It is analogous in purpose to the NATO phonetic alphabet used by pilots, except that a longer list of words is used, each word corresponding to one of the 256 distinct numeric byte values.

History and structure

The PGP Word List was designed in 1995 by Patrick Juola, a computational linguist, and Philip Zimmermann, creator of PGP. The words were carefully chosen for their phonetic distinctiveness, using genetic algorithms to select lists of words that had optimum separations in phoneme space. The candidate word lists were randomly drawn from Grady Ward's Moby Pronunciator list as raw material for the search, successively refined by the genetic algorithms. The automated search converged to an optimized solution in about 40 hours on a DEC Alpha, a particularly fast machine in that era.
The Zimmermann–Juola list was originally designed to be used in PGPfone, a secure VoIP application, to allow the two parties to verbally compare a short authentication string to detect a man-in-the-middle attack. It was called a biometric word list because the authentication depended on the two human users recognizing each other's distinct voices as they read and compared the words over the voice channel, binding the identity of the speaker with the words, which helped protect against the MiTM attack. The list can be used in many other situations where a biometric binding of identity is not needed, so calling it a biometric word list may be imprecise. Later, it was used in PGP to compare and verify PGP public key fingerprints over a voice channel. This is known in PGP applications as the "biometric" representation. When it was applied to PGP, the list of words was further refined, with contributions by Jon Callas. More recently, it has been used in Zfone and the ZRTP protocol, the successor to PGPfone.
The list is actually composed of two lists, each containing 256 phonetically distinct words, in which each word represents a different byte value between 0 and 255. Two lists are used because reading aloud long random sequences of human words usually risks three kinds of errors: 1) transposition of two consecutive words, 2) duplicate words, or 3) omitted words. To detect all three kinds of errors, the two lists are used alternately for the even-offset bytes and the odd-offset bytes in the byte sequence. Each byte value is actually represented by two different words, depending on whether that byte appears at an even or an odd offset from the beginning of the byte sequence. The two lists are readily distinguished by the number of syllables; the even list has words of two syllables, the odd list has three. The two lists have a maximum word length of 9 and 11 letters, respectively. Using a two-list scheme was suggested by Zhahai Stewart.
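
The alternation check described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the PGP implementation: the function name `decode_words` and the tiny excerpt dictionaries are assumptions made for the example, and only a handful of entries from the two lists are included.

```python
# Illustrative excerpt of the two lists (byte value -> word).
# The real lists have 256 entries each; these few are for demonstration.
EVEN_WORDS = {0x00: "aardvark", 0x01: "absurd", 0x02: "accrue", 0x82: "miser"}
ODD_WORDS = {0x00: "adroitness", 0x01: "adviser", 0x02: "aftermath", 0x82: "Istanbul"}

# Reverse lookups, case-insensitive because the words are spoken aloud.
EVEN_LOOKUP = {word.lower(): value for value, word in EVEN_WORDS.items()}
ODD_LOOKUP = {word.lower(): value for value, word in ODD_WORDS.items()}


def decode_words(words):
    """Decode a word sequence to bytes (hypothetical helper).

    Raises ValueError when a word comes from the wrong list for its
    position, which is how a swap, repeat, or dropped word shows up.
    """
    decoded = bytearray()
    for offset, word in enumerate(words):
        if offset % 2 == 0:
            expected, other = EVEN_LOOKUP, ODD_LOOKUP
        else:
            expected, other = ODD_LOOKUP, EVEN_LOOKUP
        key = word.lower()
        if key in expected:
            decoded.append(expected[key])
        elif key in other:
            raise ValueError(f"word {word!r} at offset {offset} is from the wrong list")
        else:
            raise ValueError(f"unknown word {word!r}")
    return bytes(decoded)
```

With this sketch, swapping two adjacent words, repeating a word, or dropping a word from the middle of the sequence each puts a word from the wrong list at some offset, so decoding fails instead of silently returning different bytes.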

Hex	Even Word	Odd Word
00	aardvark	adroitness
01	absurd	adviser
02	accrue	aftermath
03	acme	aggregate
04	adrift	alkali
05	adult	almighty
06	afflict	amulet
07	ahead	amusement
08	aimless	antenna
09	Algol	applicant
0A	allow	Apollo
0B	alone	armistice
0C	ammo	article
0D	ancient	asteroid
0E	apple	Atlantic
0F	artist	atmosphere
10	assume	autopsy
11	Athens	Babylon
12	atlas	backwater
13	Aztec	barbecue
14	baboon	belowground
15	backfield	bifocals
16	backward	bodyguard
17	banjo	bookseller
18	beaming	borderline
19	bedlamp	bottomless
1A	beehive	Bradbury
1B	beeswax	bravado
1C	befriend	Brazilian
1D	Belfast	breakaway
1E	berserk	Burlington
1F	billiard	businessman
20	bison	butterfat
21	blackjack	Camelot
22	blockade	candidate
23	blowtorch	cannonball
24	bluebird	Capricorn
25	bombast	caravan
26	bookshelf	caretaker
27	brackish	celebrate
28	breadline	cellulose
29	breakup	certify
2A	brickyard	chambermaid
2B	briefcase	Cherokee
2C	Burbank	Chicago
2D	button	clergyman
2E	buzzard	coherence
2F	cement	combustion
30	chairlift	commando
31	chatter	company
32	checkup	component
33	chisel	concurrent
34	choking	confidence
35	chopper	conformist
36	Christmas	congregate
37	clamshell	consensus
38	classic	consulting
39	classroom	corporate
3A	cleanup	corrosion
3B	clockwork	councilman
3C	cobra	crossover
3D	commence	crucifix
3E	concert	cumbersome
3F	cowbell	customer

Hex	Even Word	Odd Word
80	merit	intention
81	minnow	inventive
82	miser	Istanbul
83	Mohawk	Jamaica
84	mural	Jupiter
85	music	leprosy
86	necklace	letterhead
87	Neptune	liberty
88	newborn	maritime
89	nightbird	matchmaker
8A	Oakland	maverick
8B	obtuse	Medusa
8C	offload	megaton
8D	optic	microscope
8E	orca	microwave
8F	payday	midsummer
90	peachy	millionaire
91	pheasant	miracle
92	physique	misnomer
93	playhouse	molasses
94	Pluto	molecule
95	preclude	Montana
96	prefer	monument
97	preshrunk	mosquito
98	printer	narrative
99	prowler	nebula
9A	pupil	newsletter
9B	puppy	Norwegian
9C	python	October
9D	quadrant	Ohio
9E	quiver	onlooker
9F	quota	opulent
A0	ragtime	Orlando
A1	ratchet	outfielder
A2	rebirth	Pacific
A3	reform	pandemic
A4	regain	Pandora
A5	reindeer	paperweight
A6	rematch	paragon
A7	repay	paragraph
A8	retouch	paramount
A9	revenge	passenger
AA	reward	pedigree
AB	rhythm	Pegasus
AC	ribcage	penetrate
AD	ringbolt	perceptive
AE	robust	performance
AF	rocker	pharmacy
B0	ruffled	phonetic
B1	sailboat	photograph
B2	sawdust	pioneer
B3	scallion	pocketful
B4	scenic	politeness
B5	scorecard	positive
B6	Scotland	potato
B7	seabird	processor
B8	select	provincial
B9	sentence	proximate
BA	shadow	puberty
BB	shamrock	publisher
BC	showgirl	pyramid
BD	skullcap	quantity
BE	skydive	racketeer
BF	slingshot	rebellion

Examples

Each byte in a bytestring is encoded as a single word. A sequence of bytes is rendered in network byte order, from left to right. The leftmost byte is considered "even" (offset 0) and is encoded using the PGP Even Word table. The next byte to the right is considered "odd" and is encoded using the PGP Odd Word table. The two tables alternate in this way until all bytes are encoded. Thus, "E582" produces "topmost Istanbul", whereas "82E5" produces "miser travesty".
A PGP public key fingerprint that is displayed in hexadecimal as

would display in PGP Words as

The order of bytes in a bytestring depends on endianness.
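
The even/odd encoding described in this section can be sketched as follows. This is a minimal illustration, not PGP's own code: the names `PARTIAL_TABLE` and `encode_bytes` are hypothetical, and the table holds only the two byte values used in the "E582" example rather than all 256 entries.

```python
# Partial table for illustration: byte value -> (even word, odd word).
# Only the two bytes from the "E582" / "82E5" example are included.
PARTIAL_TABLE = {
    0x82: ("miser", "Istanbul"),
    0xE5: ("topmost", "travesty"),
}


def encode_bytes(data, table=PARTIAL_TABLE):
    """Encode bytes as words, alternating even and odd lists by offset."""
    words = []
    for offset, byte in enumerate(data):
        even_word, odd_word = table[byte]  # KeyError if byte not in the excerpt
        words.append(even_word if offset % 2 == 0 else odd_word)
    return " ".join(words)


print(encode_bytes(bytes.fromhex("E582")))  # topmost Istanbul
print(encode_bytes(bytes.fromhex("82E5")))  # miser travesty
```

The same byte value yields a different word depending on its offset, which is why reversing the two bytes changes both words rather than merely swapping them.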

Other word lists for data

There are several other word lists for conveying data in a clear unambiguous way via a voice channel:

The NATO phonetic alphabet maps individual letters and digits to individual words.
The S/KEY system maps 64-bit numbers to 6 short words of 1 to 4 characters each from a publicly accessible 2048-word dictionary. The same dictionary is used in RFC 1760 and RFC 2289.
The Diceware system maps five base-6 random digits to a word from a dictionary of 7,776 unique words.
FIPS 181: Automated Password Generator converts random numbers into somewhat pronounceable "words".
Mnemonic encoding converts 32 bits of data into 3 words from a vocabulary of 1626 words.
what3words encodes geographic coordinates in 3 dictionary words.
