HDR Defense – Patrick Carribault

On Wednesday, July 8, 2015, Patrick Carribault, research engineer at CEA, will defend his Habilitation à Diriger des Recherches (HDR), entitled Compiler/Runtime Cooperation for High-Performance Multi-Paradigm Parallelism.

The defense will take place at 2 p.m. at the Université de Versailles-Saint-Quentin-en-Yvelines, Campus de Versailles (45 avenue des États-Unis, 78035 Versailles Cedex), Bâtiment Descartes, Amphi B.

The abstract of his work is given below.

The current evolution of high-end hardware architecture raises interesting problems for computer-science research and industry. Since 2010, supercomputers have reached the Petaflops threshold, allowing a program to run at 10^15 floating-point operations per second. Over the last five years, however, various computer-architecture designs have emerged to prepare the next generation of clusters.

Indeed, the Exascale era (10^18 floating-point operations per second) is predicted to arrive by the end of this decade or, failing that, shortly after 2020. Even if this goal is still years away, the time left to prepare both the hardware and software environments is short, especially considering the effort needed to update scientific applications accordingly and the time needed to train their developers. With this target in mind, this presentation discusses the research I conducted over the last seven years on possible evolutions of the software stack (compilers and runtime systems) aimed at the Exascale target.

Given the shift in hardware architecture (large numbers of cores and processing units, a small amount of memory per core…), one way for scientific applications to reach the Exascale milestone is to extend their parallelism from MPI to MPI+X. For this purpose, the whole software environment should evolve too. Indeed, the runtime systems underlying each programming model have to be aware of each other to manage resource allocation (cores and the different memories). Because mixing parallel programming models can be difficult, the whole toolchain, from the compiler to the runtime, should support this transition. This presentation brings together our research on these challenges and makes the following contributions: (i) the design of a unified MPI+OpenMP runtime that lowers overhead and exposes additional features, e.g., a taxonomy for thread placement; (ii) resource management for both cores and memory, tested on different architectures (CPUs and GPGPUs); and (iii) compiler support for parallel programming models, including interaction with the runtime systems for data placement, and debugging tools for various parallel paradigms.
