Skip list

In computer science, a skip list is a probabilistic data structure that allows search complexity as well as insertion complexity within an ordered sequence of elements. Thus it can get the best features of a sorted array while maintaining a linked list-like structure that allows insertion, which is not possible in an array. Fast search is made possible by maintaining a linked hierarchy of subsequences, with each successive subsequence skipping over fewer elements than the previous one. Searching starts in the sparsest subsequence until two consecutive elements have been found, one smaller and one larger than or equal to the element searched for. Via the linked hierarchy, these two elements link to elements of the next sparsest subsequence, where searching is continued until finally we are searching in the full sequence. The elements that are skipped over may be chosen probabilistically or deterministically, with the former being more common.

Description

A skip list is built in layers. The bottom layer is an ordinary ordered linked list. Each higher layer acts as an "express lane" for the lists below, where an element in layer appears in layer with some fixed probability . On average, each element appears in lists, and the tallest element in all the lists. The skip list contains lists.
A search for a target element begins at the head element in the top list, and proceeds horizontally until the current element is greater than or equal to the target. If the current element is equal to the target, it has been found. If the current element is greater than the target, or the search reaches the end of the linked list, the procedure is repeated after returning to the previous element and dropping down vertically to the next lower list. The expected number of steps in each linked list is at most, which can be seen by tracing the search path backwards from the target until reaching an element that appears in the next higher list or reaching the beginning of the current list. Therefore, the total expected cost of a search is which is, when is a constant. By choosing different values of, it is possible to trade search costs against storage costs.

Implementation details

The elements used for a skip list can contain more than one pointer since they can participate in more than one list.
Insertions and deletions are implemented much like the corresponding linked-list operations, except that "tall" elements must be inserted into or deleted from more than one linked list.
operations, which force us to visit every node in ascending order, provide the opportunity to perform a behind-the-scenes derandomization of the level structure of the skip-list in an optimal way, bringing the skip list to search time. However this also allows someone to know where all of the higher-than-level 1 nodes are and delete them.
Alternatively, we could make the level structure quasi-random in the following way:
make all nodes level 1
j ← 1
while the number of nodes at level j > 1 do
for each i'th node at level j do
if i is odd and i is not the last node at level j
randomly choose whether to promote it to level j+1
else if i is even and node i-1 was not promoted
promote it to level j+1
end if
repeat
j ← j + 1
repeat
Like the derandomized version, quasi-randomization is only done when there is some other reason to be running an operation.
The advantage of this quasi-randomness is that it doesn't give away nearly as much level-structure related information to an adversarial user as the de-randomized one. This is desirable because an adversarial user who is able to tell which nodes are not at the lowest level can pessimize performance by simply deleting higher-level nodes. The search performance is still guaranteed to be logarithmic.
It would be tempting to make the following "optimization": In the part which says "Next, for each ith...", forget about doing a coin-flip for each even-odd pair. Just flip a coin once to decide whether to promote only the even ones or only the odd ones. Instead of coin flips, there would only be of them. Unfortunately, this gives the adversarial user a 50/50 chance of being correct upon guessing that all of the even numbered nodes are higher than level one. This is despite the property that he has a very low probability of guessing that a particular node is at level N for some integer N.
A skip list does not provide the same absolute worst-case performance guarantees as more traditional balanced tree data structures, because it is always possible that the coin-flips used to build the skip list will produce a badly balanced structure. However, they work well in practice, and the randomized balancing scheme has been argued to be easier to implement than the deterministic balancing schemes used in balanced binary search trees. Skip lists are also useful in parallel computing, where insertions can be done in different parts of the skip list in parallel without any global rebalancing of the data structure. Such parallelism can be especially advantageous for resource discovery in an ad-hoc wireless network because a randomized skip list can be made robust to the loss of any single node.

Indexable skiplist

As described above, a skip list is capable of fast insertion and removal of values from a sorted sequence, but it has only slow lookups of values at a given position in the sequence ; however, with a minor modification the speed of random access indexed lookups can be improved to .
For every link, also store the width of the link. The width is defined as the number of bottom layer links being traversed by each of the higher layer "express lane" links.
For example, here are the widths of the links in the example at the top of the page:
1 10
o---> o---------------------------------------------------------> o Top level
1 3 2 5
o---> o---------------> o---------> o---------------------------> o Level 3
1 2 1 2 3 2
o---> o---------> o---> o---------> o---------------> o---------> o Level 2
1 1 1 1 1 1 1 1 1 1 1
o---> o---> o---> o---> o---> o---> o---> o---> o---> o---> o---> o Bottom level
Head 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th NIL
Node Node Node Node Node Node Node Node Node Node
Notice that the width of a higher level link is the sum of the component links below it. Consequently, the sum of all widths is the same on every level.
To index the skip list and find the i'th value, traverse the skip list while counting down the widths of each traversed link. Descend a level whenever the upcoming width would be too large.
For example, to find the node in the fifth position, traverse a link of width 1 at the top level. Now four more steps are needed but the next width on this level is ten which is too large, so drop one level. Traverse one link of width 3. Since another step of width 2 would be too far, drop down to the bottom level. Now traverse the final link of width 1 to reach the target running total of 5.

function lookupByPositionIndex
node ← head
i ← i + 1 # don't count the head as a step
for level from top to bottom do
while i ≥ node.width do # if next step is not too far
i ← i - node.width # subtract the current width
node ← node.next # traverse forward at the current level
repeat
repeat
return node.value
end function
This method of implementing indexing is detailed in .

History

Skip lists were first described in 1989 by William Pugh.
To quote the author:

Usages

List of applications and frameworks that use skip lists:

Apache Portable Runtime implements skip lists. See
MemSQL uses lock-free skip lists as its prime indexing structure for its database technology.
Cyrus IMAP server offers a "skiplist" backend DB implementation
Lucene uses skip lists to search delta-encoded posting lists in logarithmic time.
template class of Qt that provides a dictionary.
Redis, an ANSI-C open-source persistent key/value store for Posix systems, uses skip lists in its implementation of ordered sets.
, a very fast key-value embedded Database Storage Engine, uses skip lists for its memtable.
is an open-source database format using ordered key/value pairs.
and in the Java 1.6 API. Used by Apache HBase.
are a fast key-value datastore for Tcl that use skiplists for indexes and lockless shared memory.
, a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values
Con Kolivas' MuQSS Scheduler for the Linux kernel uses skip lists
uses skip lists as base data structure to build a more complex 3D Sparse Grid for Robot Mapping systems.
, a persistent C11 key/value storage library uses skip lists as its main data structure.

of running medians.
Skip lists are also used in distributed applications and for implementing highly scalable concurrent priority queues with less lock contention, or even without locking, as well as lockless concurrent dictionaries. There are also several US patents for using skip lists to implement priority queues and concurrent dictionaries.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...