Performance analysis is an essential step in the development cycle of HPC codes targeting large-scale infrastructures, such as those used by many attendees of Cluster 2009. This tutorial introduces how programmers can approach this important topic and analyze the performance of their codes using Open|SpeedShop, a comprehensive open source tool set for a wide range of cluster architectures. Open|SpeedShop is developed and made available through a close collaboration between the Krell Institute and DOE/NNSA’s Tri-Labs (Lawrence Livermore, Los Alamos, and Sandia National Laboratories).
Beyond introducing attendees to Open|SpeedShop and its broad functionality, we will focus directly on how they can use its extensive set of performance experiments to understand, step by step, the performance characteristics of their codes. We will cover both node-local performance (by studying global profiles, stack trace sampling, hardware counters, and I/O properties) and parallel performance (using a combination of tracing experiments and advanced analysis techniques). The parallel portion will cover MPI applications as well as threaded codes.
