In computer science, the analysis of parallel algorithms is the process of finding the computational complexity of algorithms executed in parallel – the amount of time, storage, or other resources needed to execute them. In many respects, analysis of parallel algorithms is similar to the analysis of sequential algorithms, but is generally more involved because one must reason about the behavior of multiple cooperating threads of execution. One of the primary goals of parallel analysis is to understand how a parallel algorithm's use of resources (speed, space, etc.) changes as the number of processors is changed.

Background

A so-called work-time (WT) (sometimes called work-depth, or work-span) framework was originally introduced by Shiloach and Vishkin [1] for conceptualizing and describing parallel algorithms. In the WT framework, a parallel algorithm is first described in terms of parallel rounds. For each round, the operations to be performed are characterized, but several issues can be suppressed. For example, the number of operations at each round need not be clear, processors need not be mentioned and any information that may help with the assignment of processors to jobs need not be accounted for. Second, the suppressed information is provided. The inclusion of the suppressed information is guided by the proof of a scheduling theorem due to Brent,[2] which is explained later in this article. The WT framework is useful since while it can greatly simplify the initial description of a parallel algorithm, inserting the details suppressed by that initial description is often not very difficult. For example, the WT framework was adopted as the basic presentation framework in the parallel algorithms books (for the Parallel random-access machine PRAM model) [3] and, [4] as well as in the class notes .[5] The overview below explains how the WT framework can be used for analyzing more general parallel algorithms, even when their description is not available within the WT framework.

Definitions

Suppose computations are executed on a machine that has p processors. Let Tp denote the time that expires between the start of the computation and its end. Analysis of the computation's running time focuses on the following notions:

Several useful results follow from the definitions of work, span and cost:

Using these definitions and laws, the following measures of performance can be given:

.[9]

Execution on a limited number of processors

Analysis of parallel algorithms is usually carried out under the assumption that an unbounded number of processors is available. This is unrealistic, but not a problem, since any computation that can run in parallel on N processors can be executed on p < N processors by letting each processor execute multiple units of work. A result called Brent's law states that one can perform such a "simulation" in time Tp, bounded by[10]

or, less precisely,[6]

An alternative statement of the law bounds Tp above and below by

.

showing that the span (depth) T and the work T1 together provide reasonable bounds on the computation time.[2]

References

  1. ^ Shiloach, Yossi; Vishkin, Uzi (1982). "An O(n2 log n) parallel max-flow algorithm". Journal of Algorithms. 3 (2): 128–146. doi:10.1016/0196-6774(82)90013-X.
  2. ^ a b Brent, Richard P. (1974-04-01). "The Parallel Evaluation of General Arithmetic Expressions". Journal of the ACM. 21 (2): 201–206. CiteSeerX 10.1.1.100.9361. doi:10.1145/321812.321815. ISSN 0004-5411. S2CID 16416106.
  3. ^ JaJa, Joseph (1992). An Introduction to Parallel Algorithms. Addison-Wesley. ISBN 978-0-201-54856-3.
  4. ^ Keller, Jorg; Kessler, Cristoph W.; Traeff, Jesper L. (2001). Practical PRAM Programming. Wiley-Interscience. ISBN 978-0-471-35351-5.
  5. ^ Vishkin, Uzi (2009). Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques, 104 pages (PDF). Class notes of courses on parallel algorithms taught since 1992 at the University of Maryland, College Park, Tel Aviv University and the Technion.
  6. ^ a b c d e f Casanova, Henri; Legrand, Arnaud; Robert, Yves (2008). Parallel Algorithms. CRC Press. p. 10. CiteSeerX 10.1.1.466.8142.
  7. ^ Blelloch, Guy (1996). "Programming Parallel Algorithms" (PDF). Communications of the ACM. 39 (3): 85–97. CiteSeerX 10.1.1.141.5884. doi:10.1145/227234.227246. S2CID 12118850.
  8. ^ Michael McCool; James Reinders; Arch Robison (2013). Structured Parallel Programming: Patterns for Efficient Computation. Elsevier. pp. 4–5.
  9. ^ a b c d e f Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. pp. 779–784. ISBN 0-262-03384-4.
  10. ^ Gustafson, John L. (2011). "Brent's Theorem". Encyclopedia of Parallel Computing. pp. 182–185. doi:10.1007/978-0-387-09766-4_80. ISBN 978-0-387-09765-7.