GSoC 17: Final documentation for Parallel algorithms 2

By taeguk (http://taeguk.me)

Abstract

HPX is “The C++ Standards Library for Concurrency and Parallelism”. So, implementing C++17 parallelism like N4409 is very important. Most of parallel algorithms are already implemented. But, still some algorithms are remained to be not implemented. My main objective is implementing remaining parallel algorithms as many as possible. And I adapt them to Ranges TS and add container versions of them. And I add unit tests and benchmark codes for them. Associated with mainly parallel algorithms, I also catch and suggest many issues and resolve them.

Pull Requests

  1. Implement parallel::is_heap and parallel::is_heap_until. (#2654)
    • Implementation, Unit Tests, Benchmark
    • Easy to implement using partitioner and cancellation_token.
  2. Adapt parallel::is_heap and parallel::is_heap_until to Ranges TS. (#2788)
    • Adapt parallel::is_heap and parallel::is_heap_until to Ranges TS.
    • Add container algorithms of them.
  3. Implement parallel::partition_copy. (#2716)
    • Implementation, Unit Tests, Benchmark, Container Algorithm (Ranges TS)
    • Easy to implement using scan_partitioner and additional memory.
  4. Implement parallel::unique_copy. (#2754)
    • Implementation, Unit Tests, Benchmark, Container Algorithm (Ranges TS)
    • Easy to implement using scan_partitioner and additional memory.
  5. Implement parallel::partition. (#2778)
  6. Implement parallel::merge. (#2833)
    • Implementation, Unit Tests, Benchmark, Container Algorithm (Ranges TS)
    • Only supports random access iterator tag. (See also: #2826)
    • Be implemented with recursively splitting the given range.
  7. Implement parallel::unique. (#2867)
    • Implementation, Unit Tests, Benchmark, Container Algorithm (Ranges TS)
    • It is hard to be implemented efficiently. (See also: #2804)
    • Be implemented using parallel scanning and sequential copying.
    • The parallel::util::scan_partitioner is modified.
    • The parallel::remove and parallel::remove_if are also implemented using same way.
  8. Fix bugs and improve that about HPX_HAVE_CXX11_AUTO_RETURN_VALUE of CMake. (#2821)
    • Fix bugs and improve some things in things that are implemented in #2720.
    • Fix things in cmake files.
  9. Fix a strange thing in parallel::util::detail::handle_local_exceptions. (#2832)

Suggested Issues

  1. Excessive overheads that loop_idx_n calls cancellation_token::was_cancelled per each loop. (#2661)
    • I said that it is excessive overhead that loop_idx_n calls cancellation_token::was_cancelled per each loop.
    • My opinion was accepted. And I was permitted to work and submit PR for this issue.
  2. Iterator requirements are different from standard in parallel copy_if. (#2699)
    • I said that several iterator requirements are different from standard and falling back to sequential execution when iterator doesn’t satisfy requirements may be confusing to the users.
    • My opinion was accepted and @hkaiser did PR about my issue. (#2733)
  3. Naming inconsistency in parallel algorithms. (#2700)
    • I complained that there are several naming inconsistencies in implementations of parallel algorithms.
    • My opinion was accepted. And I was permitted to work and submit PR for this issue.
  4. The in-place parallel algorithms are hard to implement efficiently. (#2804)
    • I discussed how to efficiently implement parallel algorithms such as unique, remove, remove_if, and inplace_merge.
  5. The std::bad_alloc is not handled correctly in parallel::sort. (#2816)
    • I found that std::bad_alloc is not handled correctly in parallel::sort. And I suggested the way to resolve.
    • My opinion was agreed by @hkaiser.
  6. Nested exception_list in parallel algorithm. (#2817)
    • I found that exception_list can be nested in parallel algorithms. And I discussed about that.
  7. A strange thing in parallel::util::detail::handle_local_exceptions. (#2818)
    • I found the strange thing in parallel::util::detail::handle_local_exceptions. And I suggested the way to resolve.
    • My opinion was accepted. And this issue was fixed in #2832.
  8. Support forward and bidirectional iterators in parallel::merge. (#2826)
    • I discussed how to support or handle forward and bidirectional iterators in parallel::merge.
  9. One suspicion in parallel::detail::handle_exception. (#2834)
    • I found the suspicious thing in parallel::detail::handle_exception. And I suggested the way to resolve.
    • My opinion was agreed. And @hkaiser said that he will try to investigate in more detail.
  10. The worst-case time complexity of parallel::sort seems to be O(N^2). (#2836)
    • I found that the worst-case time complexity of parallel::sort can be O(N^2).
  11. The chunk_size_iterator violates multipass guarantee. (#2866)
    • I found that the chunk_size_iterator violates multipass guarantee. And I suggested the way to fix this issue.
    • My opinion was accepted. And I was permitted to work and submit PR for this issue.

Code Reviews

  1. Adapting iterator requirements for parallel algorithms. (#2733)
    • Reviewed PR from @hkaiser for fixing the issue #2699 which I opened.
    • Found several mistakes and improvements.

Beyond GSoc

I will implement all of the rest of parallel algorithms. And I will finally close #1141. And I will resolve many issues that are suggested by me. And I also fix parallel::util::scan_partitioner for getting better performance and close #2325. These are what firstly I will do after gsoc finishes. After I finish these things, I will find other works that I can do. Of course, I will still do works for improving parallel algorithms, too.

Leave a Reply

Your email address will not be published. Required fields are marked *