Profiling Toolkit for HPC Applications

Led by:  Prof. Dr. G. Von Voigt
Year:  2017
Duration:  2017 – 2020
Completed:  yes

High performance computing (HPC) has become a standard method in many scientific and engineering fields. An ever-growing number of researchers, whose background knowledge of HPC systems varies widely, develop and use HPC applications. Often enough, researchers are satisfied as soon as they receive the results of their computations in an acceptable time frame, without recognising the possibilities to optimize their resource usage through performance engineering. The main goal of this project is to increase the awareness of performance issues in all user communities.

Motivation

High performance computing (HPC) has become a standard research tool in many scientific disciplines. In the natural and engineering sciences, research without at least supporting HPC calculations is becoming increasingly rare. In addition, new disciplines such as bioinformatics and the social sciences are discovering HPC as an asset to their research. As a result, more and more scientists start using HPC resources without a good understanding of how such systems work. At the same time, the complexity of HPC resources is growing, which widens this knowledge gap.

Challenges & Highlights

Scientists with basic or intermediate HPC knowledge are often satisfied once their research problem can be solved on an available system in a tolerable time frame, even if this compromises the accuracy or the number of questions addressed. The situation is worsened by the fact that these users mostly use their local Tier-3 compute center, which typically lacks sufficient human resources to work with them individually on application performance. Moreover, at least at the beginning of their scientific research, most users understandably do not concern themselves with the optimal use of system resources or with application performance. Their primary objective is to generate scientific output as soon as possible. This can also lead to a lock-in, where researchers are unable to transfer their work to Tier-2 or Tier-1 compute centers even when that is required, simply because their calculations do not scale sufficiently.

Potential applications & future issues

The main goal of this project is to increase the awareness of performance issues in all user communities. To achieve this, users will be provided with systematic, unified, and easily understandable information on the performance parameters of their applications. These performance profiles will be presented in a way that is useful to newcomers and experienced users alike, and they are intended to motivate users to investigate possibilities for optimizing their applications.
