Analysis of parallel algorithms

This article discusses the analysis of parallel algorithms. As in the analysis of "ordinary" sequential algorithms, one is typically interested in asymptotic bounds on resource consumption, but the analysis is performed in the presence of multiple processor units that cooperate to perform computations. Thus, one can determine not only how many "steps" a computation takes, but also how much faster it becomes as the number of processors goes up. The analysis approach works by first suppressing the number of processors. The next paragraph gives background on how this abstraction of the number of processors first emerged.
A so-called work-time (WT) framework was originally introduced by Shiloach and Vishkin for conceptualizing and describing parallel algorithms. In the WT framework, a parallel algorithm is first described in terms of parallel rounds. For each round, the operations to be performed are characterized, but several issues can be suppressed: the number of operations in each round need not be clear, processors need not be mentioned, and any information that may help with the assignment of processors to jobs need not be accounted for. Second, the suppressed information is provided. The inclusion of the suppressed information is, in fact, guided by the proof of a scheduling theorem due to Brent, which is explained later in this article. The WT framework is useful because, while it can greatly simplify the initial description of a parallel algorithm, inserting the details suppressed by that initial description is often not very difficult. For example, the WT framework was adopted as the basic presentation framework in parallel algorithms books as well as in class notes. The overview below explains how the WT framework can be used for analyzing more general parallel algorithms, even when their description is not available within the WT framework.
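A round-based description of this kind can be illustrated with a small sketch. The example below is not taken from the works mentioned above; it assumes a simple pairwise summation and a hypothetical helper name, `parallel_sum_rounds`. Each pass of the loop stands for one parallel round whose additions are mutually independent, and no processors are mentioned. The function merely counts the operations and the rounds.

```python
def parallel_sum_rounds(values):
    """Sum `values` by pairwise combination, one parallel round at a time.

    Returns (total, operations, rounds). The rounds are simulated
    sequentially here; within a round, every addition is independent
    of the others and could run at the same time.
    """
    level = list(values)
    operations = 0
    rounds = 0
    while len(level) > 1:
        pairs = len(level) // 2
        # One parallel round: add neighbouring elements pairwise.
        next_level = [level[2 * i] + level[2 * i + 1] for i in range(pairs)]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd element is carried over unchanged
        operations += pairs
        rounds += 1
        level = next_level
    total = level[0] if level else 0
    return total, operations, rounds


total, operations, rounds = parallel_sum_rounds(range(1, 9))
print(total, operations, rounds)
```

For eight inputs the sketch performs seven additions in three rounds. Assigning those additions to a concrete number of processors is the detail that the first stage of the description leaves out.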

Overview

Suppose computations are executed on a machine that has $p$ processors. Let $T_p$ denote the time that expires between the start of the computation and its end. Analysis of the computation's running time focuses on the following notions:

The work of a computation executed by $p$ processors is the total number of primitive operations that the processors perform. Ignoring communication overhead from synchronizing the processors, this is equal to the time used to run the computation on a single processor, denoted $T_1$.
The depth or span is the length of the longest series of operations that have to be performed sequentially due to data dependencies. The depth may also be called the critical path length of the computation. Minimizing the depth/span is important in designing parallel algorithms, because the depth/span determines the shortest possible execution time. Alternatively, the span can be defined as the time $T_\infty$ spent computing using an idealized machine with an infinite number of processors.
The cost of the computation is the quantity $pT_p$. This expresses the total time spent, by all processors, in both computing and waiting.
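As a rough illustration of these definitions, the following sketch computes the work and the span of a computation given as a dependency graph. It assumes that every operation takes one time unit and that the graph is small and acyclic. The function name `work_and_span` and the example graph are hypothetical.

```python
from functools import lru_cache


def work_and_span(deps):
    """Return (work, span) for a dependency graph of unit-cost operations.

    `deps` maps each operation to the operations it must wait for.
    Work is the number of operations; span is the length of the
    longest dependency chain (the critical path).
    """
    @lru_cache(maxsize=None)
    def finish_time(op):
        # Earliest finish time of `op` with unlimited processors.
        return 1 + max((finish_time(d) for d in deps[op]), default=0)

    work = len(deps)
    span = max((finish_time(op) for op in deps), default=0)
    return work, span


# Hypothetical example: c needs a and b, d needs c, e is independent.
example = {"a": (), "b": (), "c": ("a", "b"), "d": ("c",), "e": ()}
print(work_and_span(example))
```

Here the work is 5 and the span is 3, along the chain a, c, d. If a schedule on two processors finished this graph in three steps, its cost would be 2 × 3 = 6, one unit more than the work because one processor idles for one step.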

Several useful results follow from the definitions of work, span and cost:

Work law. The cost is always at least the work: $pT_p \geq T_1$. This follows from the fact that $p$ processors can perform at most $p$ operations in parallel.
Span law. A finite number $p$ of processors cannot outperform an infinite number, so that $T_p \geq T_\infty$.
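Taken together, the two laws give a lower bound on the running time. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow: a computation with work $T_1 = 1000$ and span $T_\infty = 50$ on $p = 8$ processors.

```latex
\begin{align*}
  \text{Work law:} \quad & T_p \geq \frac{T_1}{p} = \frac{1000}{8} = 125, \\
  \text{Span law:} \quad & T_p \geq T_\infty = 50, \\
  \text{hence} \quad & T_p \geq \max\!\left(\frac{T_1}{p},\, T_\infty\right) = 125 .
\end{align*}
```

In this instance the work law is the binding constraint. With many more processors, say $p = 100$, the work law would only give $T_p \geq 10$, and the span law would become the limit.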

Using these definitions and laws, the following measures of performance can be given:

Speedup is the gain in speed made by parallel execution compared to sequential execution: $S_p = T_1 / T_p$. When the speedup is $\Omega(p)$ on $p$ processors, the speedup is linear, which is optimal in simple models of computation because the work law implies that $T_1 / T_p \leq p$. The situation $T_1 / T_p = p$ is called perfect linear speedup. An algorithm that exhibits linear speedup is said to be scalable.
Efficiency is the speedup per processor, $S_p / p$.
Parallelism is the ratio $T_1 / T_\infty$. It represents the maximum possible speedup on any number of processors. By the span law, the parallelism bounds the speedup: if $p > T_1 / T_\infty$, then $T_1 / T_p \leq T_1 / T_\infty < p$.
The slackness is $T_1 / (pT_\infty)$. A slackness less than one implies, by the span law, that perfect linear speedup is impossible on $p$ processors.
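The four measures are simple ratios, which the following sketch evaluates for given timings. The function name `performance_measures` and the sample numbers are assumptions made for illustration. In practice $T_1$, $T_p$ and $T_\infty$ would come from analysis or measurement.

```python
def performance_measures(t1, tp, t_inf, p):
    """Compute speedup, efficiency, parallelism and slackness.

    t1    -- work (time on one processor)
    tp    -- running time on p processors
    t_inf -- span (time with unboundedly many processors)
    p     -- number of processors
    """
    speedup = t1 / tp
    return {
        "speedup": speedup,
        "efficiency": speedup / p,
        "parallelism": t1 / t_inf,
        "slackness": t1 / (p * t_inf),
    }


# Hypothetical timings: T_1 = 1000, T_8 = 200, T_inf = 50.
measures = performance_measures(t1=1000, tp=200, t_inf=50, p=8)
print(measures)
```

With these numbers the speedup is 5 on 8 processors, so the efficiency is 0.625. The parallelism of 20 exceeds the processor count, giving a slackness of 2.5, so the span law alone does not rule out perfect linear speedup.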

Execution on a limited number of processors

Analysis of parallel algorithms is usually carried out under the assumption that an unbounded number of processors is available. This is unrealistic, but not a problem, since any computation that can run in parallel on $N$ processors can be executed on $p < N$ processors by letting each processor execute multiple units of work. A result called Brent's law states that one can perform such a "simulation" in time $T_p$, bounded by
$T_p \leq T_N + \frac{T_1 - T_N}{p}$,
or, less precisely,
$T_p = O\left(T_N + \frac{T_1}{p}\right)$.
An alternative statement of the law bounds $T_p$ above and below by
$\frac{T_1}{p} \leq T_p \leq \frac{T_1}{p} + T_\infty$,
showing that the span $T_\infty$ and the work $T_1$ together provide reasonable bounds on the computation time.
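The two-sided bound can be checked on a small instance by simulation. The sketch below is one possible greedy scheduler, not the construction used in the proof of Brent's law. It assumes unit-time operations and, in each step, runs up to $p$ of the operations whose dependencies are already complete. The names `greedy_schedule_time` and `EXAMPLE_DEPS` are hypothetical, and the work and span of the example graph are read off by inspection.

```python
def greedy_schedule_time(deps, p):
    """Number of unit-time steps a greedy schedule needs on p processors.

    `deps` maps each operation to the operations it must wait for.
    """
    remaining = {op: set(ds) for op, ds in deps.items()}
    done = set()
    steps = 0
    while remaining:
        # Operations whose dependencies finished in earlier steps.
        ready = [op for op, ds in remaining.items() if ds <= done]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        batch = ready[:p]  # at most p operations run in one step
        for op in batch:
            del remaining[op]
        done.update(batch)
        steps += 1
    return steps


# Pairwise summation of eight inputs: seven additions in three levels.
EXAMPLE_DEPS = {
    "s01": (), "s23": (), "s45": (), "s67": (),
    "s0123": ("s01", "s23"), "s4567": ("s45", "s67"),
    "total": ("s0123", "s4567"),
}
T1, T_INF = 7, 3  # work and span of EXAMPLE_DEPS, by inspection

for p in (1, 2, 3, 4):
    tp = greedy_schedule_time(EXAMPLE_DEPS, p)
    print(p, tp, T1 / p <= tp <= T1 / p + T_INF)
```

On two processors the schedule takes four steps, which lies between $T_1/p = 3.5$ and $T_1/p + T_\infty = 6.5$. On four processors it reaches the span of three steps.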
