C++ and the Heterogeneous Challenge

As HPC shifts its long-range focus from peta- to exascale, it has become paramount that programmers can efficiently use all of a machine’s compute resources. This has grown increasingly difficult because many of the Top500 machines rely, in some capacity, on hardware accelerators such as GPUs and coprocessors, which often require special languages and APIs. In C++, the concept of executors, as currently discussed by the C++ standardization committee, opens up a flexible and dynamic choice of execution platform for various types of parallelism, including the portable execution of user code on heterogeneous resources such as accelerators and GPUs. It also makes it possible to develop a solution that seamlessly integrates iterative execution (parallel algorithms) with other types of parallelism, such as task-based parallelism, asynchronous execution flows, continuation-style computation, and explicit fork-join control flow of independent and non-homogeneous code paths.

HPX and C++ Executors

The STE||AR Group has implemented executors in HPX. As described in the C++ standardization proposal ‘Parallel Algorithms Need Executors’ (document number N4406), executors are objects that choose where and how a function call is completed. This is a step in the right direction for HPX and for parallelism in general: executors give more flexibility over how and where task-based work is carried out, and they give the programmer a means to compose executors cleanly with execution policies inside algorithm implementations.
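To make the idea of "an object that chooses where and how a function call is completed" concrete, here is a minimal sketch in standard C++ only. It does not use the HPX API or the exact interface from N4406: the types `inline_executor` and `thread_executor`, the member `async_execute`, and the function `sum_of_squares` are all hypothetical names chosen for illustration. The point is only that an algorithm can be written once against an executor parameter, with the executor deciding where each chunk of work runs.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

// Hypothetical executor: runs the function immediately on the calling thread
// and hands back an already-ready future.
struct inline_executor {
    template <typename F>
    auto async_execute(F&& f) const -> std::future<decltype(f())> {
        std::packaged_task<decltype(f())()> task(std::forward<F>(f));
        auto result = task.get_future();
        task();  // executed right here, synchronously
        return result;
    }
};

// Hypothetical executor: runs the function on a separate thread via std::async.
struct thread_executor {
    template <typename F>
    auto async_execute(F&& f) const -> std::future<decltype(f())> {
        return std::async(std::launch::async, std::forward<F>(f));
    }
};

// An illustrative algorithm written against the executor parameter only.
// It splits the input into `chunks` contiguous ranges, asks the executor to
// run each range, and then combines the partial results.
template <typename Executor>
long sum_of_squares(Executor const& exec, std::vector<int> const& data,
                    std::size_t chunks) {
    assert(chunks > 0);
    std::size_t const n = data.size();
    std::vector<std::future<long>> parts;
    parts.reserve(chunks);
    for (std::size_t c = 0; c < chunks; ++c) {
        std::size_t const first = n * c / chunks;
        std::size_t const last = n * (c + 1) / chunks;
        parts.push_back(exec.async_execute([&data, first, last]() {
            long partial = 0;
            for (std::size_t i = first; i < last; ++i)
                partial += static_cast<long>(data[i]) * data[i];
            return partial;
        }));
    }
    long total = 0;
    for (auto& part : parts) total += part.get();  // wait for every chunk
    return total;
}
```

Calling `sum_of_squares(inline_executor{}, v, 2)` and `sum_of_squares(thread_executor{}, v, 2)` yields the same value; only the place of execution differs. In this sketch the algorithm body never mentions threads, which is the separation of concerns that executors are meant to provide.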