The STE||AR Group at LSU is excited to announce a search for a postdoctoral researcher. We are looking for a team member with strong C++ skills and experience and an interest in high performance computing, runtime systems, or machine learning. Familiarity with the paradigms and constructs of
C++ Standard
GSoC 17: Final documentation for Parallel algorithms 2
By taeguk (http://taeguk.me)
Abstract
HPX is “The C++ Standards Library for Concurrency and Parallelism”. So, implementing C++17 parallelism like N4409 is very important. Most of parallel algorithms are already implemented. But, still some algorithms are remained to be not implemented. My main
GSoC 17: Final documentation for Parallel algorithms 1
By Ajai V George under the supervision of Marcin Copik
Abstract
My proposal was to implement distributed versions of STL parallel algorithms. The main focus has been on resolving as much of the pending work in #1338 , which is about ensuring that these algorithms work seamlessly with distributed data structures like
CppChat: On Air with HPX
This past week members of the HPX team were invited to give a talk on Jon Kalb’s weekly CppChat. Bryce Adelstein-Lelbach, Hartmut Kaiser, Thomas Heller, and John Biddiscombe were invited onto the show to talk about HPX, our programming
Vectorized C++ Parallel Algorithms with HPX
In preparation for my talk at CppCon 2016 last week I decided to have a closer look at the possibility to add vectorization to HPX’s parallel abstractions1The slides for this talk can be downloaded here. The goal was to avoid using compiler specific extensions while enabling vectorization support. At the same time, I wanted to be able to integrate this with the already existing parallel algorithms in HPX, proving again that higher level APIs and best possible performance go hand in hand in HPX.
C++ Summer Lecture Series
When people struggle with HPX, we often find that their confusion and frustration is rooted in a lack of understanding of modern C++. In order to address this issue, the STE||AR Group @ LSU is hosting a 2016 C++ summer lecture series. These sessions will cover
HPX and Index-based C++ Parallel Loops
In Jacksonville at the winter 2016 C++ standardization meeting, Intel presented the second revision of their standardization proposal targeting index-based parallel for-loops (the latest document at the time of this writing is P0075R1). This document describes a couple of basic parallel for-loop constructs which complement the existing parallel algorithms which we have already implemented for quite some time in HPX. The following is taken from this document:
HPX and C++17
During the Jacksonville meeting of the C++ standards committee last week the so called Parallelism TS was accepted into the next official International Standard, also known as C++17. We can now expect for all major vendors of C++ compilers to implement the very same parallelism facilities as HPX has exposed for almost 2 years already!
A while back, we wrote about how HPX implements parallel algorithms. At the time of that writing, these parallel algorithms were freshly published as a WG21 Technical Specification with the goal of moving them into the main C++ International Standard at some point in the future. Since then, we have continued to work hard on implementing HPX versions of the proposed algorithms. In addition, we have added extensions in our implementation such as the “task” policy which enables asynchronous execution of the algorithm. We have also spent a lot of time tuning our implementation to make it as performant as possible. For instance, we have shown that it is possible for the relatively higher-level parallelization abstractions in HPX to match or to even outperform the performance of equivalent applications based on well-known and well-honed technologies like OpenMP.
The STE||AR Group is proud of the fact that our implementation in HPX has provided implementation, usage experience, and early feedback to the C++ committee. These contributions have helped to get this specification into the new standard. We hope to continue to be a flexible and valuable testbed for all future standardization proposals relating to parallelism and concurrency.
C++ and the Heterogeneous Challenge
As HPC shifts its long range focus from peta- to exascale, the need for programmers to be able to efficiently utilize the entirety of a machine’s compute resources has become more paramount. This has grown increasingly difficult as most of the Top500 machines utilize, in some capacity, hardware accelerators like GPUs and coprocessors which often require special languages and APIs to take advantage of them. In C++ the concept of executors, as currently discussed by the C++ standardization committee, has created a possibility for a flexible, and dynamic choice of the execution platform for various types of parallelism in C++, including the execution of user code on heterogeneous resources like accelerators and GPUs in a portable way. This will also allow to develop a solution that seamlessly integrates iterative execution (parallel algorithms) with other types of parallelism, such as task-based parallelism, asynchronous execution flows, continuation style computation, and explicit fork-join control flow of independent and non-homogeneous code paths.
HPX and C++ Futures
There has been a lot of attention to Futures in C++ lately. One of the main related events (even if it was not widely mentioned anywhere) was the final call for positions and comments for the preliminary draft technical specification for C++ Extensions for Concurrency (PDTS), see N4538. This call closed on July 7th, 2015. At this point, the document is out for the national bodies to vote on whether it should be accepted as a final TS (the balloting period ends on July 22nd, 2015). Personally, I expect for this document to be accepted unanimously, which means that we soon will have a second TS related to parallelism and concurrency ready. Compiler vendors will have a field day implementing all of this functionality over the next months (and years).