GSoC 2021 – Add vectorization to par_unseq implementations of Parallel Algorithms

by Srinivas Yadav

GSoC 2021 Final Report

Abstract

HPX algorithms have supported data parallelism through explicit vectorization with the Vc library, but only for a few algorithms such as for_each, transform and count. Support for Vc has recently been deprecated in favor of std::experimental::simd. In this project I adapted many algorithms to datapar using the new std::experimental::simd backend, with two new policies, simd and par_simd, built on the data-parallel types proposed in the experimental namespace. Separate tests have been created for every algorithm adapted to datapar.

I have created a new GitHub repository, std-simd-perf, for benchmarks of the algorithms that I adapted to datapar. It contains speed-up plots and roofline-model analyses for both artificial benchmarks and real-world applications.

Pull Requests for HPX Repo

Merged

Open

Other Adapted Algorithms to datapar [code]: 

  • adjacent_difference
  • adjacent_find
  • all_of, any_of, none_of
  • copy
  • count
  • find
  • for_each
  • generate
  • transform

Performance Benchmarks

  • The std-simd-perf repository contains all the simd benchmarks, both for artificial algorithms such as for_each, transform, count and find, and for real-world examples such as the Mandelbrot set.
  • The benchmarks were run on different clusters, and the repo has a separate branch for each architecture.
  • Speed up plot for a compute bound kernel using for_each algorithm
  • Speed up plot for a simd reduction based algorithm using count algorithm
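To make the idea of a simd reduction concrete, here is a minimal sketch of a count-style kernel written directly against std::experimental::simd. It is not the HPX implementation: the namespace sketch and the functions count_equal and example_usage are hypothetical names chosen for illustration, and the code assumes a standard library that ships <experimental/simd> (for example libstdc++ from GCC 11 onwards, compiled as C++17).

```cpp
#include <cassert>
#include <cstddef>
#include <experimental/simd>
#include <vector>

namespace sketch {
namespace stdx = std::experimental;

// Count the elements equal to `value`, one vector register at a time.
// Hypothetical helper for illustration only, not part of HPX.
inline std::size_t count_equal(std::vector<int> const& data, int value)
{
    using simd_t = stdx::native_simd<int>;
    std::size_t const width = simd_t::size();
    simd_t const needle(value);    // broadcast `value` to every lane

    std::size_t matches = 0;
    std::size_t i = 0;

    // Vectorized main loop: compare `width` elements per iteration and
    // add the number of lanes that matched.
    for (; i + width <= data.size(); i += width)
    {
        simd_t const chunk(data.data() + i, stdx::element_aligned);
        matches += static_cast<std::size_t>(stdx::popcount(chunk == needle));
    }

    // Scalar tail for the elements that do not fill a whole vector.
    for (; i < data.size(); ++i)
        matches += (data[i] == value) ? 1 : 0;

    return matches;
}

// Small usage example: 37 elements forces at least one scalar tail step
// for any power-of-two vector width greater than one.
inline void example_usage()
{
    std::vector<int> data(37, 0);
    data[0] = data[5] = data[36] = 7;
    assert(count_equal(data, 7) == 3);
}
}    // namespace sketch
```

The split into a vectorized main loop and a scalar tail is a common pattern for reductions over ranges whose length is not a multiple of the vector width.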

Beyond GSoC

  • Adapt the rest of the algorithms (#2333) to support data parallelism.
  • I will continue working with the STE||AR Group on HPX in other areas as well, since this is a great community with great people in which to learn and expand my knowledge.

Acknowledgements

Special thanks to Hartmut Kaiser, Nikunj Gupta and Auriane R. for all the guidance and help with frequent meetings.

GSoC 2021 – Adapting algorithms to C++ 20 and Ranges TS

by Akhil Nair

Introduction:

My main task involves adapting the remaining algorithms from this issue to C++ 20 by using the tag_invoke CPO mechanism to add the correct overloads for the algorithms as mentioned by the C++20 standard. It also involves adding ranges and sentinel overloads for these algorithms as well as ensuring that the base implementations support sentinels. I also added doxygen documentation for each overload.

We have managed to cover almost all algorithms thanks to previous contributions prior to the 2021 GSoC period from Giannis, Hartmut, Mikael and others as well as from Chuanqiu He and Karame for adapting the rotate/rotate_copy and adjacent_difference respectively.

Apart from the adaptation work, I have also created PRs adding the shift_left and shift_right algorithms (Issue #3706) and the ranges starts_with and ends_with algorithms (Issue #5381) and they’re currently under review.

Details:

Tag_invoke:

We render the old hpx::parallel overloads as deprecated and add new tag_fallback_dispatch overloads according to the function signatures specified in the C++ 20 standard using the tag_invoke CPO mechanism for dispatching the call to the correct overloads.

The segmented overloads of an algorithm use tag_dispatch, while the normal parallel and container overloads use tag_fallback_dispatch, so that the segmented overloads are always preferred before falling back to the remaining parallel overloads.
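The preference order described here can be illustrated with a small, standard-C++17-only sketch of a customization point object that first looks for a tag_invoke overload and only then falls back to tag_fallback_invoke. This is a simplified model and not HPX's actual tag_dispatch or tag_fallback_dispatch machinery: the namespace sketch, the types plain_range and segmented_range, and the describe CPO are all hypothetical names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
namespace detail {
    // Poison pills: give unqualified lookup something to find so that the
    // real overloads are located purely through argument-dependent lookup.
    void tag_invoke();
    void tag_fallback_invoke();

    // Detects whether tag_invoke(Tag, Arg) is callable.
    template <typename Tag, typename Arg, typename = void>
    struct has_tag_invoke : std::false_type {};

    template <typename Tag, typename Arg>
    struct has_tag_invoke<Tag, Arg,
        std::void_t<decltype(tag_invoke(
            std::declval<Tag>(), std::declval<Arg>()))>> : std::true_type {};

    // Prefer a tag_invoke overload; otherwise use the fallback overload.
    template <typename Tag, typename Arg>
    std::string dispatch(Tag tag, Arg const& arg)
    {
        if constexpr (has_tag_invoke<Tag, Arg const&>::value)
            return tag_invoke(tag, arg);
        else
            return tag_fallback_invoke(tag, arg);
    }
}    // namespace detail

struct plain_range {};        // hypothetical ordinary container
struct segmented_range {};    // hypothetical segmented container

struct describe_t
{
    template <typename Range>
    std::string operator()(Range const& r) const
    {
        return detail::dispatch(*this, r);
    }

    // Generic overload, only reached when no tag_invoke overload matches.
    template <typename Range>
    friend std::string tag_fallback_invoke(describe_t, Range const&)
    {
        return "generic overload";
    }
};
inline constexpr describe_t describe{};

// Specialized overload, declared separately from the CPO itself.
inline std::string tag_invoke(describe_t, segmented_range const&)
{
    return "segmented overload";
}

inline void example_usage()
{
    assert(describe(plain_range{}) == "generic overload");
    assert(describe(segmented_range{}) == "segmented overload");
}
}    // namespace sketch
```

Because the specialized overload lives outside the CPO, it can be placed in a different header from the generic one, which is the property the segmented algorithms rely on.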

Range and sentinel overloads:

C++ 20 introduced the ranges overloads for many of the algorithms and we have done the same for our algorithms, available in the hpx::ranges namespace.

We can pass a range as either a single range argument or by using an iterator-sentinel pair. The range overloads also make use of tag_fallback_dispatch for overload resolution.
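As a rough illustration of the two calling conventions, the sketch below defines a count-like algorithm in plain C++17 that accepts either an iterator-sentinel pair or a whole range. It does not use HPX; the namespace sketch, the null_sentinel type and the count_value function are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch {

// A sentinel marking the end of a null-terminated string. Its type differs
// from the iterator type (char const*), which classic iterator-pair
// algorithms do not allow.
struct null_sentinel {};

inline bool operator==(char const* it, null_sentinel)
{
    return *it == '\0';
}
inline bool operator!=(char const* it, null_sentinel s)
{
    return !(it == s);
}

// Iterator-sentinel overload: Iter and Sent may be different types.
template <typename Iter, typename Sent, typename T>
std::size_t count_value(Iter first, Sent last, T const& value)
{
    std::size_t n = 0;
    for (; first != last; ++first)
    {
        if (*first == value)
            ++n;
    }
    return n;
}

// Range overload: forwards to the iterator-sentinel overload.
template <typename Range, typename T>
std::size_t count_value(Range const& range, T const& value)
{
    return sketch::count_value(std::begin(range), std::end(range), value);
}

inline void example_usage()
{
    // Iterator-sentinel pair: no need to compute the string length first.
    assert(count_value("banana", null_sentinel{}, 'a') == 3);

    // Single range argument.
    std::vector<int> const values{1, 2, 1, 3};
    assert(count_value(values, 1) == 2);
}
}    // namespace sketch
```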

Separating the segmented overloads:

For algorithms having segmented overloads, we add tag_dispatch overloads and remove the forward declarations in both files to separate the segmented overloads completely from the parallel overloads.

Shift left and shift right algorithms:

Shift left and shift right algorithms have been added. They make use of reverse in the parallel implementations (anyone reading this in the future, feel free to attempt a more efficient parallel implementation if possible). Range and sentinel overloads for these algorithms have been added as well. Ranges starts_with and ends_with algorithms have been added too.
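One way a shift can be built out of reverse is sketched below as a sequential, standard-library-only example. It is an assumption about the general technique, not a copy of the HPX implementation, and the namespace sketch and the function shift_left_by_reverse are hypothetical names. Reversing the whole range and then reversing its first size - n elements leaves the elements that originally started at position n at the front, in their original order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch {

// Shift the elements of [first, last) left by n positions using two
// reversals. Returns the end of the shifted range, mirroring the return
// value of std::shift_left.
template <typename BidirIt>
BidirIt shift_left_by_reverse(BidirIt first, BidirIt last, std::ptrdiff_t n)
{
    assert(n >= 0);
    auto const size = std::distance(first, last);
    if (n == 0)
        return last;     // nothing to shift
    if (n >= size)
        return first;    // every element is shifted out

    // Step 1: [a b c d e] -> [e d c b a]
    std::reverse(first, last);

    // Step 2: reverse the first (size - n) elements, restoring the original
    // order of the elements that survive the shift.
    BidirIt new_last = std::next(first, size - n);
    std::reverse(first, new_last);

    // Elements in [new_last, last) are leftovers with no meaningful value.
    return new_last;
}

inline void example_usage()
{
    std::vector<int> v{1, 2, 3, 4, 5};
    auto const end = shift_left_by_reverse(v.begin(), v.end(), 2);
    assert(std::distance(v.begin(), end) == 3);
    assert(v[0] == 3 && v[1] == 4 && v[2] == 5);
}
}    // namespace sketch
```

Each element is moved twice, which is part of why a dedicated parallel implementation could be more efficient than one built on reverse.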

Other:

I’ve also been looking into the senders and receivers proposal and looking into the performance issues of the scan partitioner by trying to measure the execution time and scheduling of the various stages of the scan algorithm.

PR Details:

The following PRs have been merged as of writing this report :-

Open PRs currently under review :-

My experience:

My experience working with and being mentored by the STE||AR Group has been amazing. This being my second GSoC, I was looking for an organization that had both challenging and interesting work and a helpful and supportive community, and the STE||AR Group ticked both of those boxes wonderfully.

Hartmut and Giannis were amazing mentors and have been very helpful. The weekly meetings with them and Auriane were very useful to keep track of the progress and get guidance on how to proceed. Thanks to Hartmut, Auriane and Mikael for reviewing my PRs. I’m also grateful for the help of other members of the community who were very helpful and responsive on the IRC chat.

Over the summer my understanding of C++ has definitely increased. There is still a LOT more to cover, but I’m sure that continuing to work on HPX (and asking questions on the IRC) will help with that. Being able to ask questions of community members who have such a deep understanding of these topics is a very valuable advantage of contributing to HPX.

I fully intend to continue working on HPX and with the STE||AR Group after GSoC is over and look forward to learning and working on more interesting stuff in the coming months.

HPX’s Season of Docs 2021

By Rachitt Shah

HPX was recently selected to be part of Google’s Season of Docs (GSoD), a program designed to improve the documentation of open source software, as well as being a Google Summer of Code organization.

GSoD aims to close the documentation gaps that organizations face for various reasons, while giving technical writers an avenue to showcase their skills.

I will be helping to reorganize and update the existing documentation to give it a more navigable, user-friendly structure, since many users have had trouble using the current documentation. I will work closely with the HPX team and our users to collect feedback, find user pain points, and improve the preexisting docs, which mainly consist of the build instructions.

Alongside this, I will create a “design document” containing guidelines for how to add new content to the documentation: tips on how to structure new sections, general guidelines on what sort of content should be presented in which chapters, etc. The project may also include content rearrangement and a change of hierarchy, if users find it is needed.

I am currently working on a timeline and action items, and researching a possible shift to another documentation platform.

I am reachable at rachitt01@gmail.com or on the IRC as rachitt_shah. Please contact me to suggest changes to the documentation or to provide feedback. We can always benefit from your ideas.

About me: I’m an undergrad majoring in electronics, and I’m a casual sport programmer as well. I’ve been a product manager and a venture capital intern in the past, and I have done Google Summer of Code with OpenAstronomy.

GSoD Final Report

By Rebecca Stobaugh

We’ve reached the end of Google’s Season of Docs, and we’ve accomplished a lot in the past three months. My initial proposal was to work on three sections of the manual, and we have far exceeded our goal, managing to make changes to twelve different sections of the documentation. The majority of the work I’ve done has consisted of cleaning up grammatical errors and improving sentence structure. I have also added a style guide to the wiki, which should help standardize future changes to the documentation. The style guide can be found in the “HPX Source Code Structure and Coding Standards” wiki document under the section “Documentation Style Guide”. For a complete list of my pull requests during Season of Docs, please see here. To view my changes to the wiki, please see here.

Announcing HPX’s Season of Docs

By Rebecca Stobaugh

HPX was recently selected to be part of Google’s Season of Docs (GSoD), a program designed to improve the documentation of open source software. While the STE||AR Group has created extensive documentation for HPX, this documentation has been written by several different people, which has led to some inconsistencies and awkward organization. I am a technical writer and English PhD student who has been selected to edit and streamline HPX’s documentation. My goal is to clean up the content, concentrating on both grammatical issues and design concerns, to create a more cohesive, user-friendly product. My primary focus will be on two sections of the STE||AR Group’s instruction manual: “HPX Build System” and “Launching and Configuring HPX Applications.” If time allows, I will also revise the “Why HPX” page, with an emphasis on condensing and trimming repetitive content.

You can read my GSoD proposal here.

Trip Report: ICML 2019


By: Bita Hasheminezhad

A few weeks ago, I had the opportunity to attend the International Conference on Machine Learning (ICML), which is the premier gathering of professionals dedicated to the advancement of machine learning. The thirty-sixth ICML was held from June 9th to 15th at the Long Beach Convention Center.

Compiling and Running Blazemark

By Shahrzad Shirzad

Blazemark is the benchmark suite for the Blaze library. In order to compile and run Blazemark with the HPX backend, take the following steps:

  1. Change the Configfile at blaze/blazemark by filling in the CXX=, CXXFLAGS=, LIBRARY_DIRECTIVES= fields in the Configfile:
    This is an example of the configurations used for Clang:
    # Compiler selection
    CXX="clang++"
    # Special compiler flags
    CXXFLAGS="-O3 -march=native -std=c++17 -stdlib=libc++ -DNDEBUG -fpermissive -DBLAZE_USE_HPX_THREADS -isystem /hpx/install/path/include -Wl,-wrap=main"
    # Library settings (optional)
    # In some cases it might be necessary to specify additional library paths and add additional
    # libraries. This can be done via this setting.
    LIBRARY_DIRECTIVES="-L/hpx/install/path/lib/ -lhpx -rdynamic /hpx/install/path/lib/libhpx_init.a -ldl -lrt -lhpx_wrap -L/boost/install/path/lib -lboost_system -lboost_program_options"
  2. ./configure Configfile
  3. make benchmark_name
  4. ./bin/benchmark_name

Notes:

  • You can change the vector or matrix sizes that a benchmark runs on through the benchmark_name.prm file located in the /blaze/blazemark/params folder.

For more information on the available benchmarks, command line parameters, and the list of supported libraries, please visit Blazemark.

Compiling and Running BlazeTest

By Shahrzad Shirzad

BlazeTest is a testing tool provided by Blaze. In order to compile and run BlazeTest with the HPX backend, take the following steps:

  1.  Change the Configfile at blaze/blazetest by filling in the CXX=, CXXFLAGS=, LIBRARY_DIRECTIVES= fields in the Configfile:
    This is an example of the configurations used for Clang:
    # Compiler selection
    CXX="clang++"
    # Special compiler flags
    CXXFLAGS="-O3 -march=native -std=c++17 -stdlib=libc++ -DNDEBUG -fpermissive -DBLAZE_USE_HPX_THREADS -isystem /hpx/install/path/include -Wl,-wrap=main"
    # Library settings (optional)
    # In some cases it might be necessary to specify additional library paths and add additional
    # libraries. This can be done via this setting.
    LIBRARY_DIRECTIVES="-L/hpx/install/path/lib/ -lhpx -rdynamic /hpx/install/path/lib/libhpx_init.a -ldl -lrt -lhpx_wrap -L/boost/install/path/lib -lboost_system -lboost_program_options"
  2.  ./configure Configfile
  3.  make essentials
  4.  ./run