Performance Analysis of Parallel Programs

Even after a program has been parallelized, its performance usually remains far from optimal. Performance optimization is difficult because it has to consider the complex interplay between the algorithm and the hardware. Many parallel applications also suffer from latent performance limitations that may prevent them from scaling to larger problem or machine sizes. Often, such scalability bugs manifest themselves only when an attempt to scale the code is actually made, at which point remediation can already be difficult. Performance models allow such issues to be predicted before they become relevant. A performance model is a formula that expresses a performance metric of interest, such as execution time or energy consumption, as a function of one or more execution parameters, such as the size of the input problem or the number of processors. However, deriving such models analytically from the code is so laborious that many application developers shy away from the effort.
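For illustration only, a model of execution time t as a function of the number of processors p could take a form like the one below. The constants c_0 and c_1 and the particular growth term are hypothetical and not taken from any measured application.

```latex
t(p) = c_0 + c_1 \cdot \sqrt{p} \cdot \log_2 p
```

A term that grows with p in this way would indicate that the corresponding function may become a bottleneck at larger machine sizes, even if its cost is negligible at small scale.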

To let a wider audience of developers profit from performance models, we create techniques to learn them automatically from a small set of performance measurements. Our performance-modeling tool Extra-P generates such empirical performance models for each function of even complex applications with hundreds of thousands of lines of code. In this way, the programmer can easily spot scalability problems or identify execution parameters that guarantee the desired degree of efficiency. Extra-P is available for download under an open-source license.
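The idea of learning a model from a handful of measurements can be sketched as follows. This is a minimal illustration, not Extra-P's actual implementation or API: it assumes a single-term hypothesis of the form c0 + c1 * p^i * log2(p)^j, a small hand-picked set of candidate exponents, and selection by the smallest residual on the training data. The function names and the measurement values are hypothetical and synthetic.

```python
import math

# Hypothetical candidate exponents (i, j) for the term p**i * log2(p)**j.
# The constant-only case (0, 0) is excluded because it has no slope to fit.
CANDIDATES = [(i, j) for i in (0, 0.5, 1, 2) for j in (0, 1) if (i, j) != (0, 0)]


def fit_single_term(ps, ts, i, j):
    """Least-squares fit of t ~ c0 + c1 * p**i * log2(p)**j.

    Returns (c0, c1, rss), where rss is the residual sum of squares.
    """
    xs = [p ** i * math.log2(p) ** j for p in ps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    c1 = sxt / sxx
    c0 = mean_t - c1 * mean_x
    rss = sum((t - (c0 + c1 * x)) ** 2 for x, t in zip(xs, ts))
    return c0, c1, rss


def fit_model(ps, ts):
    """Try every candidate term and keep the one with the smallest rss."""
    best = None
    for i, j in CANDIDATES:
        c0, c1, rss = fit_single_term(ps, ts, i, j)
        if best is None or rss < best["rss"]:
            best = {"i": i, "j": j, "c0": c0, "c1": c1, "rss": rss}
    return best


def predict(model, p):
    """Evaluate the fitted model at processor count p."""
    return model["c0"] + model["c1"] * p ** model["i"] * math.log2(p) ** model["j"]


# Synthetic "measurements" generated from t(p) = 5 + 0.5 * p * log2(p).
procs = [2, 4, 8, 16, 32]
times = [5 + 0.5 * p * math.log2(p) for p in procs]

model = fit_model(procs, times)
extrapolated = predict(model, 1024)  # extrapolate beyond the measured range
```

Here the model is fitted on runs with at most 32 processors and then evaluated at 1024, which is the kind of extrapolation that lets scalability problems show up before the large run is attempted. Picking the hypothesis by training residual alone is a simplification; with noisy measurements, a more robust selection criterion would be needed.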

Selected Publications

  • Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Felix Wolf: Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications. In Proc. of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Austin, TX, USA, pages 131–143, ACM, February 2017. [PDF] [DOI] [BibTeX]

  • Alexandru Calotoiu, David Beckingsale, Christopher W. Earl, Torsten Hoefler, Ian Karlin, Martin Schulz, Felix Wolf: Fast Multi-Parameter Performance Modeling. In Proc. of the 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan, pages 172–181, IEEE Computer Society, September 2016. [PDF] [DOI] [BibTeX]

  • Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf: Exascaling Your Library: Will Your Implementation Meet Your Expectations? In Proc. of the International Conference on Supercomputing (ICS), Newport Beach, CA, USA, pages 165–175, ACM, June 2015. [PDF] [DOI] [BibTeX]

  • Alexandru Calotoiu, Torsten Hoefler, Marius Poke, Felix Wolf: Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes. In Proc. of the ACM/IEEE Conference on Supercomputing (SC13), Denver, CO, USA, pages 1–12, ACM, November 2013. [PDF] [DOI] [BibTeX]