HPX and the C++ Standard – The STE||AR Group

While developing HPX, our goal has always been to create an API that is as easy to learn and use as possible. We quickly realized that almost all of our functionality can be exposed through interfaces that are already standardized as part of the C++11 standard library or that are being proposed for standardization over the next years. So we made it our goal to conform to the C++ standard documents and proposals as closely as possible. This decision has a fundamental impact on almost all aspects of our work on HPX.

The main advantage of conforming to an ISO standard is that it reduces how much a new user has to learn. A lot of information about C++ and its library is already available, so our documentation can focus on the things we have added to accommodate the distributed nature of HPX.

The C++ standardization committee is currently doing a lot of work aimed at extending the support for parallel and concurrent computing. The committee has decided to create various ‘study groups’, each of which focuses on a particular area of interest. Our group actively participates in SG1, the study group discussing the future of parallelism and concurrency in C++. The plan is to publish at least two technical specifications (TS) before or together with the planned C++17 international standard: the Parallelism TS and the Concurrency TS. As a result of the last C++ committee meeting in Lenexa (May 2015), the Parallelism TS will be officially published and the Concurrency TS will now have to be voted on by the national standardization bodies.

Both technical specifications are being implemented for HPX. Additionally, the Parallelism TS will contain three parallel algorithms which were proposed by members of the STE||AR group.

Here are two short examples of those features available today in HPX.

The Parallelism TS proposes standardizing a set of parallel algorithms that closely mirror the well-known algorithms in the standard library. Essentially, for each of the existing algorithms in the standard library we will see a corresponding parallel version. For instance, the following use of a sequential algorithm:

    int count_odd_numbers = 0;
    std::vector<int> data = { 1, 2, 3, 4, 5, /* ... */ };
    std::for_each(
        std::begin(data), std::end(data),
        [&](int element) {
            if (element % 2)
                ++count_odd_numbers;
        });
    std::cout << "Number of odd integers: "
        << count_odd_numbers << std::endl;

could be replaced by its parallel counterpart:

    std::atomic<int> count_odd_numbers(0);
    std::vector<int> data = { 1, 2, 3, 4, 5, /* ... */ };
    hpx::parallel::for_each(
        hpx::parallel::par,
        std::begin(data), std::end(data),
        [&](int element) {
            if (element % 2)
                ++count_odd_numbers;
        });
    hpx::cout << "Number of odd integers: "
        << count_odd_numbers << std::endl;

Note that the algorithm now lives in a different namespace and takes an additional first argument hpx::parallel::par. This is an execution policy defining how the algorithm is executed.
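The counter also changes from int to std::atomic<int>, because under a parallel execution policy the lambda may be invoked concurrently from several threads. As a rough illustration of that reasoning (this is not how HPX implements for_each), here is a minimal sketch using only the standard library. It splits the range into chunks and runs one std::thread per chunk; the helper name count_odd_parallel and the chunking scheme are assumptions made up for this example.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of HPX or the standard library):
// counts the odd values in `data` using up to `num_threads` worker threads.
int count_odd_parallel(std::vector<int> const& data, std::size_t num_threads)
{
    assert(num_threads > 0);

    std::atomic<int> count_odd_numbers(0);    // shared by all workers

    // Chunk size, rounded up so that every element is covered.
    std::size_t const chunk = (data.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk)
    {
        std::size_t const end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &count_odd_numbers, begin, end] {
            for (std::size_t i = begin; i != end; ++i)
            {
                if (data[i] % 2)
                    ++count_odd_numbers;      // atomic increment, no data race
            }
        });
    }

    // Wait for all workers before reading the final count.
    for (std::thread& t : workers)
        t.join();

    return count_odd_numbers.load();
}
```

With a plain int counter, the concurrent increments would be a data race. The atomic makes each increment indivisible, at the cost of some contention between threads.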

The Concurrency TS describes several new features for improving concurrent programming. The most notable of those are additions to the existing type std::future. The TS proposes adding sequential and parallel means of composing future objects. Here is an example of sequential composition, which attaches a continuation to a future object. The continuation is triggered automatically once the future becomes ready.

    int universal_answer() { return 42; }
    // ...
    hpx::future<int> f = hpx::async(&universal_answer);
    f.then(
        [](hpx::future<int> f) {
            hpx::cout << "Universal answer: " << f.get()
                << std::endl;
        });

Here the lambda is invoked automatically once the function universal_answer() has returned its result.
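For readers without HPX at hand, the idea of a continuation can be approximated with plain std::future. The following is only a sketch under stated assumptions: the free function then is a hypothetical helper written for this example, and unlike a real continuation it occupies a thread that blocks until the first future is ready.

```cpp
#include <future>
#include <utility>

// Hypothetical helper emulating sequential composition with standard
// facilities: returns a new future whose task waits for `f` and then
// passes the ready future to `continuation`.
template <typename T, typename F>
auto then(std::future<T> f, F continuation)
    -> std::future<decltype(continuation(std::move(f)))>
{
    return std::async(std::launch::async,
        [](std::future<T> ready, F cont) {
            ready.wait();                     // block until the value is there
            return cont(std::move(ready));    // then run the continuation
        },
        std::move(f), std::move(continuation));
}

int universal_answer() { return 42; }

// Chains a continuation that doubles the answer and returns the result.
int doubled_answer()
{
    std::future<int> f = std::async(std::launch::async, &universal_answer);
    std::future<int> g = then(std::move(f),
        [](std::future<int> ready) { return 2 * ready.get(); });
    return g.get();
}
```

As in the HPX example, the continuation receives the ready future itself rather than the bare value, so it can also observe an exception stored in that future by calling get().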

Parallel composition of future objects allows combining more than one future object into a single one:

    int universal_answer() { return 42; }
    void wait_for_7_million_years()
        { hpx::this_thread::sleep_for(std::chrono::years(7000000)); }
    // ...
    hpx::future<hpx::tuple<hpx::future<int>, hpx::future<void>>> f =
         hpx::when_all(
             hpx::async(&universal_answer),
             hpx::async(&wait_for_7_million_years)
         );

Here, f will become ready once both future objects it depends on become ready. Nifty.
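The same shape can be imitated with the standard library alone. This is a sketch, not the TS or HPX implementation: when_both is a hypothetical helper limited to two inputs, wait_briefly stands in for the long wait, and the combined future is produced by a task that simply blocks on both inputs.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <tuple>
#include <utility>

// Hypothetical helper: combines two futures into one future that becomes
// ready only after both inputs are ready. The result carries the two
// (now ready) input futures in a tuple.
template <typename A, typename B>
std::future<std::tuple<std::future<A>, std::future<B>>>
when_both(std::future<A> a, std::future<B> b)
{
    return std::async(std::launch::async,
        [](std::future<A> fa, std::future<B> fb) {
            fa.wait();    // wait for the first input
            fb.wait();    // wait for the second input
            return std::make_tuple(std::move(fa), std::move(fb));
        },
        std::move(a), std::move(b));
}

int universal_answer() { return 42; }

// Assumed stand-in for a long-running task with no result.
void wait_briefly()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
}

// Launches both tasks, waits for the combined future, returns the answer.
int answer_after_both()
{
    auto f = when_both(
        std::async(std::launch::async, &universal_answer),
        std::async(std::launch::async, &wait_briefly));

    auto results = f.get();              // both inputs are ready here
    std::get<1>(results).get();          // void future: nothing to read
    return std::get<0>(results).get();
}
```

Once the combined future is ready, calling get() on the futures inside the tuple returns immediately.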

In short, HPX gives you the opportunity to try out features today that will become available with every major C++ compiler platform over the next years. Beyond the small examples shown above, HPX implements many more cutting-edge, higher-level abstraction mechanisms supporting parallelism and concurrency. We will continue to describe more of those over the coming weeks.

If you would like to have a look at HPX, please see our GitHub repository.
