Journal Publications

Mohammadiporshokooh, K., Brandt, S.R. & Kaiser, H. A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System. SN COMPUT. SCI. 6, 911 (2025). https://doi.org/10.1007/s42979-025-04442-y
Bhattacharya, D., Lipton, R. & Diehl, P. Quasistatic fracture evolution using a nonlocal cohesive model. Int J Fract (2023). https://doi.org/10.1007/s10704-023-00711-0
Littlewood, D.J., Parks, M.L., Foster, J.T., Mitchell, J. & Diehl, P. The Peridigm Meshfree Peridynamics Code. J Peridyn Nonlocal Model (2023). https://doi.org/10.1007/s42102-023-00100-0
P. Diehl et al., “Octo-Tiger’s New Hydro Module and Performance Using HPX+CUDA on ORNL’s Summit,” 2021 IEEE International Conference on Cluster Computing (CLUSTER), 2021, pp. 204-214, doi: 10.1109/Cluster48925.2021.00059.
P. Diehl et al., “Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application,” in Computing in Science & Engineering, doi: 10.1109/MCSE.2021.3073626.
Dominic C Marcello, Sagiv Shiber, Orsola De Marco, Juhan Frank, Geoffrey C Clayton, Patrick M Motl, Patrick Diehl, Hartmut Kaiser. Octo-Tiger: a new, 3D hydrodynamic code for stellar mergers that uses HPX parallelization. Mon Not R Astron Soc, 2021. doi: 10.1093/mnras/stab937
Diehl, P., Jha, P.K., Kaiser, H. et al. An asynchronous and task-based implementation of peridynamics utilizing HPX—the C++ standard library for parallelism and concurrency. SN Appl. Sci. 2, 2144 (2020). doi: 10.1007/s42452-020-03784-x
Serge Prudhomme, Patrick Diehl, On the treatment of boundary conditions for bond-based peridynamic models, Computer Methods in Applied Mechanics and Engineering, Dec 2020, 372, 113391, ISSN 0045-7825, doi: 10.1016/j.cma.2020.113391.
H. Kaiser, P. Diehl, A. Lemoine, et al. HPX – The C++ Standard Library for Parallelism and Concurrency. Journal of Open Source Software, 5(53), 2352 (2020). doi: 10.21105/joss.02352
K. Schatz et al., Visual Analysis of Structure Formation in Cosmic Evolution, 2019 IEEE Scientific Visualization Conference (SciVis), Vancouver, BC, Canada, 2019, pp. 33-41, doi: 10.1109/SciVis47405.2019.8968855
K. Schatz, et al., 2019 IEEE Scientific Visualization Contest Winner: Visual Analysis of Structure Formation in Cosmic Evolution, IEEE Computer Graphics and Applications, doi: 10.1109/MCG.2020.3004613
P. Thoman, K. Dichev, T. Heller, et al. A taxonomy of task-based parallel programming technologies for high-performance computing. Journal of Supercomputing (2018) 74: 1422. doi: 10.1007/s11227-018-2238-4
Thomas Heller, Bryce Adelstein Lelbach, Kevin A Huck, et al. Harnessing Billions of Tasks for a Scalable Portable Hydrodynamic Simulation of the Merger of Two Stars. The International Journal of High Performance Computing Applications, Feb. 2019, doi:10.1177/1094342018819744.
Frank Löffler, Zhoujian Cao, Steven R. Brandt, Zhihui Du. A new parallelization scheme for adaptive mesh refinement. Journal of Computational Science, 16 (2016) 79–88. pdf
Kevin Huck, Allan Porterfield, Nick Chaimov, Hartmut Kaiser, Allen D. Malony, Thomas Sterling, Rob Fowler. An Autonomic Performance Environment for Exascale. Supercomputing frontiers and innovations, 2.3 (2015): 49-66. pdf
Antoine Tran Tan, Joel Falcou, Daniel Etiemble, Hartmut Kaiser: Automatic Task-Based Code Generation for High Performance Domain Specific Embedded Language, International Journal of Parallel Programming (2015).
Mario Mulansky: Optimizing Large Scale ODE Simulations, submitted to SIAM Journal of Scientific Computing (2014), arXiv:1412.0544 [physics.comp-ph]. pdf
Z. Byerly, B. Adelstein-Lelbach, J. Tohline and D. Marcello: A Hybrid Advection Scheme for Conserving Angular Momentum on a Refined Cartesian Mesh, The Astrophysical Journal Supplement Series (2014). pdf
C. Dekate, M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach and T. Sterling: Improving the Scalability of Parallel N-body Applications with an Event-driven Constraint-based Execution Model, International Journal of High Performance Computing Applications (2012). pdf, bibtex

Conference Publications

Mohammadiporshokooh, K., Brandt, S.R., Tohid, R., Kaiser, H. (2026). Adaptively Optimizing the Performance of HPX’s Parallel Algorithms. In: Diehl, P., Cao, Q., Herault, T., Bosilca, G. (eds) Asynchronous Many-Task Systems and Applications. WAMTA 2025. Lecture Notes in Computer Science, vol 15690. Springer, Cham. https://doi.org/10.1007/978-3-031-97196-9_6
Strack, A., Taylor, C., Diehl, P., Pflüger, D. (2024). Experiences Porting Shared and Distributed Applications to Asynchronous Tasks: A Multidimensional FFT Case-Study. In: Diehl, P., Schuchart, J., Valero-Lara, P., Bosilca, G. (eds) Asynchronous Many-Task Systems and Applications. WAMTA 2024. Lecture Notes in Computer Science, vol 14626. Springer, Cham. https://doi.org/10.1007/978-3-031-61763-8_11
Strack, A., Pflüger, D. (2023). Scalability of Gaussian Processes Using Asynchronous Tasks: A Comparison Between HPX and PETSc. In: Diehl, P., Thoman, P., Kaiser, H., Kale, L. (eds) Asynchronous Many-Task Systems and Applications. WAMTA 2023. Lecture Notes in Computer Science, vol 13861. Springer, Cham. https://doi.org/10.1007/978-3-031-32316-4_5
G. Daiß et al., “Beyond Fork-Join: Integration of Performance Portable Kokkos Kernels with HPX,” 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Portland, OR, USA, 2021, pp. 377-386, doi: 10.1109/IPDPSW52791.2021.00066.
P. Diehl, et al., “Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku,” in SC-W ’23: Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis. November 2023. Pages 1533–1542 https://doi.org/10.1145/3624062.3624230
Wu, N. et al. (2023). Quantifying Overheads in Charm++ and HPX Using Task Bench. In: Singer, J., Elkhatib, Y., Blanco Heras, D., Diehl, P., Brown, N., Ilic, A. (eds) Euro-Par 2022: Parallel Processing Workshops. Euro-Par 2022. Lecture Notes in Computer Science, vol 13835. Springer, Cham. https://doi.org/10.1007/978-3-031-31209-0_1
G. Daiß, P. Diehl, H. Kaiser, and D. Pflüger. 2023. Stellar Mergers with HPX-Kokkos and SYCL: Methods of using an Asynchronous Many-Task Runtime System with SYCL. In Proceedings of the 2023 International Workshop on OpenCL (IWOCL ’23). Association for Computing Machinery, New York, NY, USA, Article 8, 1–12. https://doi.org/10.1145/3585341.3585354
G. Daiß, S. Singanaboina, P. Diehl, H. Kaiser and D. Pflüger, “From Merging Frameworks to Merging Stars: Experiences using HPX, Kokkos and SIMD Types,” in 2022 IEEE/ACM 7th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), Dallas, TX, USA, 2022 pp. 10-19. doi: 10.1109/ESPM256814.2022.00007
G. Daiß, et al., “From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels,” in 2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), Dallas, TX, USA, 2022 pp. 89-99. doi: 10.1109/P3HPC56579.2022.00014
S. A. Sakin, A. Bigelow, R. Tohid, C. Scully-Allison, C. Scheidegger, S. R. Brandt, C. Taylor, K. A. Huck, H. Kaiser, K. E. Isaacs, “Traveler: Navigating Task Parallel Traces for Performance Analysis,” To appear in IEEE TVCG proceedings of IEEE VIS, January 2023.
Shirzad, S., Tohid, R., Kheirkhahan, A., Wagle, B., Kaiser, H. (2022). Understanding the Effect of Task Granularity on Execution Time in Asynchronous Many-Task Runtime Systems. In: , et al. Euro-Par 2021: Parallel Processing Workshops. Euro-Par 2021. Lecture Notes in Computer Science, vol 13098. Springer, Cham. doi: https://doi.org/10.1007/978-3-031-06156-1_36
P. Gadikar, P. Diehl and P. Jha, “Load balancing for distributed nonlocal models within asynchronous many-task systems,” in 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Portland, OR, USA, 2021 pp. 669-678. doi: 10.1109/IPDPSW52791.2021.00103
S. Yadav, N. Gupta, A. Reverdell and H. Kaiser, “Parallel SIMD – A Policy Based Solution for Free Speed-Up using C++ Data-Parallel Types,” 2021 IEEE/ACM 6th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), St. Louis, MO, USA, 2021, pp. 20-29, doi: 10.1109/ESPM254806.2021.00008.
Demeshko, Irina P., Diehl, Patrick, Adelstein-Lelbach, Bryce, Buch, Ronak, Kaiser, Hartmut, Kale, Laxmikant, Khatami, Zahra, Koniges, Alice, and Shirzad, Shahrzad. TBAA20: Task-Based Algorithms and Applications. 2021. pdf
N. Gupta, R. Ashiwal, B. Brank, S. K. Peddoju and D. Pleiter, Performance Evaluation of ParalleX Execution model on Arm-based Platforms, 2020 IEEE International Conference on Cluster Computing (CLUSTER), Kobe, Japan, 2020, pp. 567-575, doi: 10.1109/CLUSTER49012.2020.00080.
N. Gupta, J. R. Mayo, A. S. Lemoine and H. Kaiser, Towards Distributed Software Resilience in Asynchronous Many-Task Programming Models, 2020 IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), GA, USA, 2020, pp. 11-20, doi: 10.1109/FTXS51974.2020.00007.
B. Hasheminezhad, S. Shirzad, N. Wu, P. Diehl, H. Schulz and H. Kaiser, Towards a Scalable and Distributed Infrastructure for Deep Learning Applications, 2020 IEEE/ACM Fourth Workshop on Deep Learning on Supercomputers (DLS), Atlanta, GA, 2020, pp. 20-30, doi: 10.1109/DLS51937.2020.00008.
Gupta et al., Deploying a Task-based Runtime System on Raspberry Pi Clusters, 2020 IEEE/ACM 5th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), GA, USA, 2020, pp. 11-20, doi: 10.1109/ESPM251964.2020.00007.
Weile Wei, Arghya Chatterjee, Kevin Huck, Oscar Hernandez, Hartmut Kaiser: Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime. 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Oct 20. pdf
Patrick Diehl, Serge Prudhomme, Pablo Seleson: Report on Workshop on Experimental and Computational Fracture Mechanics 2020, Feb 26–28, 2020. pdf
P. Amini and H. Kaiser, “Assessing the Performance Impact of using an Active Global Address Space in HPX: A Case for AGAS,” 2019 IEEE/ACM Third Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM), 2019, pp. 26-33, doi: 10.1109/IPDRM49579.2019.00008.
Nikunj Gupta, Steve R. Brandt, Bibek Wagle, Nanmiao Wu, Alireza Kheirkhahan, Patrick Diehl, Felix W. Baumann, Hartmut Kaiser: Deploying a Task-based Runtime System on Raspberry Pi Clusters. SC20
Gregor Daiß, Parsa Amini, John Biddiscombe, Patrick Diehl, Juhan Frank, Kevin Huck, Hartmut Kaiser, Dominic Marcello, David Pfander, and Dirk Pflüger. 2019. From Piz Daint to the Stars: Simulation of Stellar Mergers using high-level Abstractions. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’19). ACM, New York, NY, USA, Article 62, 37 pages. DOI: 10.1145/3295500.3356221
Bibek Wagle, Mohammad Alaul Haque Monil, Kevin Huck, Allen D. Malony, Adrian Serio, and Hartmut Kaiser: Runtime Adaptive Task Inlining on Asynchronous Multitasking Runtime Systems, In Proceedings of the 48th International Conference on Parallel Processing (ICPP 2019). ACM, New York, NY, USA, Article 76, 10 pages. doi: 10.1145/3337821.3337915
Tianyi Zhang, Shahrzad Shirzad, Patrick Diehl, R. Tohid, Weile Wei, Hartmut Kaiser: An Introduction to hpxMP — A Modern OpenMP Implementation Leveraging Asynchronous Many-Tasking System, Proceedings of the International Workshop on OpenCL (IWOCL’19), Boston, May 13 – 15, 2019. doi: 10.1145/3318170.3318191, pdf
R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser, “Asynchronous Execution of Python Code on Task Based Runtime Systems”, 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), Supercomputing 2018, November 12, 2018. doi: 10.1109/ESPM2.2018.00009
Patrick Diehl, Madhavan Seshadri, Thomas Heller, Hartmut Kaiser, “Integration of CUDA Processing within the C++ library for parallelism and concurrency (HPX)”, 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), Supercomputing 2018, November 12, 2018. doi: 10.1109/ESPM2.2018.00006
David Pfander, Gregor Daiß, Dominic Marcello, Hartmut Kaiser, Dirk Pflüger, “Accelerating Octo-Tiger: Stellar Mergers on Intel Knights Landing with HPX”, DHPCC++ Conference 2018 hosted by IWOCL, St Catherine’s College, Oxford, May 14, 2018. doi: 10.1145/3204919.3204938, video
B. Wagle, S. Kellar, A. Serio and H. Kaiser, “Methodology for Adaptive Active Message Coalescing in Task Based Runtime Systems,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Vancouver, BC, Canada, 2018, pp. 1133-1140. doi: 10.1109/IPDPSW.2018.00173, pdf
Thomas Heller, Patrick Diehl, Zachary Byerly, John Biddiscombe, and Hartmut Kaiser, “HPX — An open source C++ Standard Library for Parallelism and Concurrency”, In Proceedings of OpenSuCo’17, Supercomputing 2017, Denver, Colorado, November 2017. pdf, bibtex
John Biddiscombe, Thomas Heller, Anton Bikineev, and Hartmut Kaiser. “Zero Copy Serialization using RMA in the Distributed Task-Based HPX runtime”. In 14th International Conference on Applied Computing. IADIS, International Association for Development of the Information Society, October 2017. pdf, bibtex
Zahra Khatami, Lukas Troska, Hartmut Kaiser, J. Ramanujam and Adrian Serio, “HPX Smart Executors”, In Proceedings of ESPM2’17: Third International Workshop on Extreme Scale Programming Models and Middleware (ESPM2’17), 2017. doi: 10.1145/3152041.3152084, pdf
Zahra Khatami, Hartmut Kaiser and J. Ramanujam, “Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques”, Parallel and Distributed Scientific and Engineering Computing (PDSEC17), 2017. pdf
Zahra Khatami, Sungpack Hong, Jinsu Lee, Siegfried Depner, Hassan Chafi, J. Ramanujam and Hartmut Kaiser, “A Load-Balanced Parallel and Distributed Sorting Algorithm Implemented with PGX.D”, Second Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM), 2017. pdf
Zahra Khatami, Lukas Troska, Hartmut Kaiser and J. Ramanujam, “Applying Logistic Regression Model on HPX Parallel Loops”, 15th Annual Workshop on Charm++ and its Applications, 2017. pdf
Marcin Copik and Hartmut Kaiser. “Using SYCL as an Implementation Framework for HPX.Compute.” In Proceedings of the 5th International Workshop on OpenCL (IWOCL 2017). ACM, New York, NY, USA, Article 30, 7 pages. DOI: https://doi.org/10.1145/3078155.3078187 pdf, slides, bibtex
Zahra Khatami, Hartmut Kaiser, Patricia Grubel, Adrian Serio, and J. Ramanujam: “A Massively Parallel Distributed N-Body Application Implemented with HPX”, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA16), The International Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016. DOI=10.1109/ScalA.2016.12. pdf
T. Heller, H. Kaiser, P. Diehl, D. Fey, and M. A. Schweitzer, Closing the Performance Gap with Modern C++, in High Performance Computing: ISC High Performance 2016 International Workshops, ExaComm, E-MuCoCoS, HPC-IODC, IXPUG, IWOPH, P3 MA, VHPC, WOPSSS, Frankfurt, Germany, June 19–23, 2016, Revised Selected Papers, M. Taufer, B. Mohr, and J. M. Kunkel, eds., vol. 9945 of Lecture Notes in Computer Science, Springer International Publishing, 2016, pp. 18–31. ISBN 978-3-319-46079-6. pdf, web, bibtex
Zahra Khatami, H. Kaiser, and J. Ramanujam: “Using HPX and OP2 for Improving Parallel Scaling Performance of Unstructured Grid Applications”, Ninth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2016. pdf
Antoine Tran Tan and Hartmut Kaiser. “Extending C++ with co-array semantics”. In Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (ARRAY 2016). ACM, New York, NY, USA, 63-68. DOI=10.1109/ICPPW.2016.39. pdf
Patricia Grubel, Hartmut Kaiser, Kevin Huck, Jeanine Cook: Using Intrinsic Performance Counters to Assess Efficiency in Task-based Parallel Applications, HPCMASPA 2016 at IEEE IPDPS 2016 – Workshop on Monitoring and Analysis of HPC Systems Plus Applications (2016). DOI=10.1109/ICPPW.2016.39. pdf
Hartmut Kaiser, Thomas Heller, Daniel Bourgeois, Dietmar Fey: Higher-level Parallelization for Local and Distributed Asynchronous Task-Based Programming, ESPM2 2015 at SC’15 – First International Workshop on Extreme Scale Programming Models and Middleware (2015). pdf
Zachary D. Byerly, Hartmut Kaiser, Steven Brus, Andreas Schäfer: A Non-intrusive Technique for Interfacing Legacy Fortran Codes with Modern C++ Runtime Systems, LHAM 2015 at CANDAR’15 – International Workshop on Legacy HPC Application Migration (2015). pdf
Patricia Grubel, Hartmut Kaiser, Jeanine Cook, Adrian Serio: The Performance Implication of Task Size for Applications on the HPX Runtime System, HPCMASPA 2015 at IEEE Cluster 2015 – Workshop on Monitoring and Analysis of HPC Systems Plus Applications (2015). pdf
Hartmut Kaiser, Thomas Heller, Bryce Adelstein-Lelbach, Adrian Serio, Dietmar Fey: HPX – A Task Based Programming Model in a Global Address Space, PGAS 2014: The 8th International Conference on Partitioned Global Address Space Programming Models (2014). pdf
Steven R. Brandt, Hari Krishnan, Gokarna Sharma, Costas Busch: Concurrent, Parallel Garbage Collection in Linear Time, ISMM 2014: International Symposium on Memory Management (2014)
Antoine Tran Tan, Joel Falcou, Daniel Etiemble, Hartmut Kaiser: Automatic Task-based Code Generation for High Performance Domain Specific Embedded Language, HLPP 2014: 7th International Symposium on High-Level Parallel Programming and Applications (2014). pdf
Shuangyang Yang, Maciej Brodowicz, Walter Ligon III, Hartmut Kaiser, PXFS: A Persistent Storage Model for Extreme Scale, ICNC 2014: International Conference on Computing, Networking and Communications, CNC Workshop (2014), pdf
T. Heller, H. Kaiser, A. Schäfer, and D. Fey, Using HPX and LibGeoDecomp for scaling HPC applications on heterogeneous supercomputers, ScalA ’13, Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Article No. 1, 2013, pdf
K. Huck, A. Malony, S. Shende, H. Kaiser, A. Porterfield, R. Fowler, and R. Brightwell: An Early Prototype of an Autonomic Performance Environment for Exascale, International Workshop on Runtime and Operating Systems for Supercomputers (ROSS 2013). pdf
M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach, T. Sterling: Tabulated equations of state with a many-tasking execution model, Workshop on Large-Scale Parallel Processing (2013). pdf
T. Heller, H. Kaiser and K. Iglberger: Application of the ParalleX Execution Model to Stencil-based Problems, International Supercomputing Conference (2012 – Hamburg, Germany). pdf
A. Tabbal, M. Anderson, M. Brodowicz, H. Kaiser and T. Sterling: Preliminary Design Examination of the ParalleX System from a Software and Hardware Perspective, PMBS Workshop SC10 (2010), ACM SIGMETRICS Performance Evaluation Review (2011). pdf
H. Kaiser, M. Brodowicz and T. Sterling: ParalleX: An Advanced Parallel Execution Model for Scaling-Impaired Applications, International Conference on Parallel Processing Workshops (2009 – Los Alamitos, California – pages 394 to 401). pdf

Tutorials

H. Kaiser, S. Brandt, K. Huck, A. Koniges, B. Lelbach: Interactive HPC: Using C++ and HPX Inside Jupyterhub to Write Performant Portable Parallel Code, Tutorial at Supercomputing in Denver, CO. Monday November 13, 2017. slides
John Biddiscombe, Thomas Heller: Task Based Programming with HPX, Tutorial at CSCS in Lugano, Switzerland. September 29 – 30, 2016. tutorial resources
Thomas Heller, Arne Hendricks: Massively Parallel Task-Based Programming with HPX, Tutorial at ARCS given on Tuesday April 5, 2016. web
H. Kaiser, S. Brandt, A. Koniges, S. Shende, K. Huck: Massively Parallel Task-Based Programming with HPX, Tutorial at Supercomputing in Austin, TX. Sunday November 15, 2015. slides

Talks, Presentations, and Videos

Nikunj Gupta, Jackson Mayo, Adrian Lemoine, Hartmut Kaiser: Towards Distributed Software Resilience in Asynchronous Many-Task Programming Models. FTXS workshop in SC’20
Weile Wei: Enabling High-level Parallel Abstractions to Dynamical Cluster Approximation (DCA++) using HPX and GPUDirect on the Summit Supercomputer, Performance, Portability, and Productivity in HPC Forum, Sept 1, 2020. Slides, Video
Hartmut Kaiser: Asynchronous Programming in Modern C++, Performance, Portability, and Productivity in HPC Forum, Sept 1, 2020. Slides, Video
Nanmiao Wu: Blaze Iterative Solvers, Research Re-do International Academic Webinar, June 26, 2020. recording
Karame Mohammadiporshokooh: Profiling Tool for Performance Analysis of Applications Running on HPX, Research Re-do International Academic Webinar, June 25, 2020. recording
Hartmut Kaiser: Asynchronous Programming in Modern C++, talk at CppCon (2019), Aurora, Colorado. September 2019. pdf, video
John Biddiscombe: HPX: High Performance Computing in C++, keynote talk at C++ Day 2018 held by the Italian C++ Community. Pavia, Italy. November 24, 2018. video
Madhavan Seshadri: HPXCL: Asynchronous integration of GPU computing with HPX many task processing, talk at FOSSASIA (2018), Singapore, March 23, 2018. video
Hartmut Kaiser: The Asynchronous C++ Parallel Programming Model, talk at CppCon (2017), Bellevue, Washington, September 2017. pdf, video
Lukas Troska, A HPX Backend for TensorFlow, talk to the STE||AR Group @ LSU, Baton Rouge, LA, April 5, 2017. pdf
Bryce Adelstein Lelbach, The C++17 Parallel Algorithms Library and Beyond, talk at CppCon (2016), Seattle, Washington, September 2016, pdf, video
Hartmut Kaiser: Parallelism in Modern C++, talk at CppCon (2016), Seattle, Washington, September 2016, pdf, video
Antoine Tran Tan: Extending C++ with Co-Array Semantics, invited talk at C++ Now (2016). pdf, video
Marcin Copik: HPX and GPU-parallelized STL, invited talk at C++ Now (2016). pdf, video
Anton Bikineev, HPX: A Runtime System for Parallel and Distributed Computing (HPX – рантайм-система для параллельных и распределенных вычислений), invited talk at CoreHard C++ Conference 2016, Minsk, Belarus, February 13, 2016. slides
Bryce Lelbach, HPX and More, CppCast, February 9, 2016, audio
Thomas Heller: C++ on Its Way to Exascale and Beyond. invited talk at Meeting C++. Berlin, Germany. December 5, 2015. video
Hartmut Kaiser: Parallelism in C++, invited talk at the Swiss National Supercomputing Center, Lugano, September 18, 2015. pdf
Anton Bikineev, HPX: A Runtime System for Parallel and Distributed Applications (HPX: C++11 рантайм-система для параллельных и распределённых вычислений), invited talk at C++ Siberia 2015, August 29, 2015. pdf, video
Grant Mercer and Daniel Bourgeois: Parallelizing the C++ Standard Template Library, invited talk at CppCon, Bellevue, Washington, September 22, 2015. pdf, video
Thomas Heller, Agustín Bergé, Hartmut Kaiser: Back to the Future, invited talk at C++Now (2015). pdf, video
Grant Mercer: Parallelizing the Standard Template Library (STL), invited talk at C++ Now (2015). pdf
Hartmut Kaiser, Asynchronous Programming, CppCast, April 22, 2015, audio
Zach Byerly: STORM: a Scalable Toolkit for an Open Community Supporting Near Realtime High Resolution Coastal Modeling, talk at ADCIRC Users Group Meeting, College Park, Maryland, March 31, 2015. pdf, video
Hartmut Kaiser: Plain Threads are the GOTO of Today’s Computing, Keynote at Meeting C++, December 6, 2014, pdf, video.
Thomas Heller: HPX by Example, talk at NERSC, October 2014, pdf
Martin Stumpf: Distributed GPGPU Computing, talk at LA-SiGMA TESC Meeting, LSU, Baton Rouge, Louisiana, September 25, 2014. pdf, video
Hartmut Kaiser: Asynchronous Computing in C++, talk at CppCon (2014), Seattle, Washington, September 2014, pdf, video
Hartmut Kaiser: HPX, talks at LA-SiGMA TESC Meeting, LSU, Baton Rouge, Louisiana. First talk: January 30, 2014, pdf; second talk: February 6, 2014, pdf, video
Hartmut Kaiser, Vinay Amatya: HPX: A C++ Standards Compliant Runtime System For Asynchronous Parallel And Distributed Computing, invited talk at C++Now (2013). pdf, video
Shuangyang Yang, W. Ligon III, M. Brodowicz, and H. Kaiser: A Persistent Storage Model for Extreme Scale, Scientific Computing Around Louisiana (February 2013 – New Orleans, Louisiana), pdf
B. Adelstein-Lelbach, Z. Byerly, D. Marcello, G. Clayton, and H. Kaiser, Octopus: A scalable AMR Toolkit for Astrophysics, Scientific Computing Around Louisiana (February 2013 – New Orleans, Louisiana), pdf
B. Adelstein-Lelbach, M. Anderson and H. Kaiser: HPX: A Parallel, Distributed C++11 Runtime System, C++Now! (May 2012 – Aspen, Colorado). pdf, video
M. Anderson, T. Sterling, H. Kaiser and D. Neilsen: Neutron Star Evolutions using Tabulated Equations of State with a New Execution Model, American Physical Society April 2012 Meeting (April 1, 2012 – Atlanta, Georgia). pdf
H. Kaiser: High Performance ParalleX (HPX), invited talk for the Center of Computation and Technology (CCT) Tech Talk Series (November 2011 – Baton Rouge, Louisiana). pdf, video
H. Kaiser: ParalleX – A Cure for Performance Impaired Applications, invited talk at the International Research Workshop for Advanced High Performance Computing Systems (2011). pdf

Posters

Karame Mohammadiporshokooh, Hartmut Kaiser: Exploring Performance Characteristics Variations Effects on Parallel Algorithms, Feb 22, 2023.
Bita Hasheminezhad, Hartmut Kaiser: “A Deep Learning High-Performance Backend”, vGHC Poster Session, Sept 30, 2020. link
Weile Wei, Maxwell Reeser, Hartmut Kaiser, Adrian Serio, Avah Banerjee, R. Tohid, “Distributed Object Abstraction in HPX”, Rocky Mountain Advanced Computing Consortium HPC Symposium 2019. Poster Session. May 21, 2019. Boulder, Colorado. pdf
Weile Wei, Rod Tohid, Bibek Wagle, Shahrzad Shirzad, Parsa Amini, Bita Hasheminezhad, Katy Williams, Adrian Serio, Hartmut Kaiser, “Performance Analysis of Machine Learning Algorithms for Phylanx: An Asynchronous Array Processing Toolkit”, Poster Session at The 1st R-CCS International Symposium 2019. Kobe, Japan. Feb 2019. pdf
Gregor Daiss, David Pfander, and Hartmut Kaiser, “Optimizing the Node-Level Performance of a Hydrodynamics Simulation Modeling Binary Star Systems on an Intel Knights Landing CPU”, SCALA 2018, Poster Session (February 2, 2018 – Baton Rouge, Louisiana). pdf
Zahra Khatami, Hartmut Kaiser and J. Ramanujam, “Improving the Parallel Performance of an NBody Application Using Adaptive Techniques in HPX”, The 19th IEEE International Conference on High Performance Computing and Communications (HPCC 2017), 2017. pdf
Jesse Goncalves, Hartmut Kaiser, “Leveraging HPX on a Cluster of Raspberry Pis”, LSU Summer Undergraduate Research Forum (SURF), July 28, 2017, Baton Rouge, Louisiana. pdf
Zahra Khatami, Lukas Troska, Hartmut Kaiser and J. Ramanujam, “Applying Machine Learning Techniques on HPX Parallel Algorithms”, 31st IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017 PhD Forum, 2017. pdf
Zahra Khatami, Hartmut Kaiser, and J. Ramanujam: “HPX Data Prefetching Iterator”, Women in High Performance Computing (WiHPC), The International Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016. pdf
Patricia Grubel, Bryce Lelbach, Hartmut Kaiser: “Performance Characterization of HPX – A Task-based Runtime System on the Xeon Phi Knights Landing”, Women in High Performance Computing (WiHPC), The International Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016. pdf
Z. Khatami, H. Kaiser, J. Ramanujam: Scalable Parallel Octree Using HPX With Hilbert Curve, SCALA 2016, Poster Session (February 12, 2016 – Baton Rouge, Louisiana). pdf
B. Wagle, S. Kellar, S.X. Yang, K.M. Tam, H. Kaiser, M. Jarrell, J. Moreno: HPX Implementation of Parquet Approximation, SCALA 2016, Poster Session (February 12, 2016 – Baton Rouge, Louisiana). pdf
Alice Koniges, Jayashree Ajay Candadai, Hartmut Kaiser, Kevin Huck, Jeremy Kemp, Thomas Heller, Matthew Anderson, Andrew Lumsdaine, Adrian Serio, Michael Wolf, Bryce Lelbach, Ron Brightwell, Thomas Sterling, HPX Applications and Performance Adaptation, SC15, Poster Session, pdf
H. Kaiser, J. Westerink, R. Luettich, C. Dawson, STORM: a Scalable Toolkit for an Open Community Supporting Near Real-time High Resolution Coastal Modeling, 2015 NSF SI2 PI Workshop (February 17-18, 2015 – Arlington, Virginia). pdf
B. Adelstein-Lelbach, H. Kaiser, H. Johansen, Performance Modeling of a Dependency-Driven Mini-App for Climate, CSSSP 2014 Poster Session (August 7, 2014 – Berkeley, California). pdf
Grant Mercer, Hartmut Kaiser: Writing C++ Standard Conforming Parallel Algorithms in HPX, CCT/LA-SiGMA REU Poster Session (August 1, 2014 – Baton Rouge, Louisiana). pdf
M. Stumpf, H. Kaiser, T. Heller: Implementing an interactive Mandelbrot Visualization on a GPGPU cluster using HPXCL, CCT/LA-SiGMA REU Poster Session (August 1, 2014 – Baton Rouge, Louisiana). pdf
D. Howard, H. Kaiser: Concurrent CPU and GPU Execution using HPX and the CUDA Driver API, SCALA 2014, Poster Session (February 21, 2014 – Baton Rouge, Louisiana). pdf
P. Grubel, J. Cook, H. Kaiser: Performance Studies Towards Adaptive Thread Scheduling, Early Research Showcase, Doctoral Showcase Program, SC 13 (November 2013 – Denver, Colorado). pdf
D. Howard, H. Kaiser: Scalability of Monte Carlo Pi Calculation Using HPX, SURF Poster Session (August 2, 2013 – Baton Rouge, Louisiana). pdf
S. Crillo, B. Adelstein-Lelbach, H. Kaiser: Parallelization of the Smith-Waterman Algorithm with HPX, CCT/LA-SiGMA REU Poster Session (July 27, 2012 – Baton Rouge, Louisiana). pdf
M. LeSane: Microsoft Visual C++ Demangling, CCT/LA-SiGMA REU Poster Session (July 27, 2012 – Baton Rouge, Louisiana). pdf
B. Adelstein-Lelbach: The Active Global Address Space, Scientific Computing Around Louisiana (January 20, 2012 – Baton Rouge, Louisiana). pdf
V. Amatya, B. Adelstein-Lelbach, M. Brodowicz and H. Kaiser: Parcel Routing Feature for AGAS, Scientific Computing Around Louisiana (January 20, 2012 – Baton Rouge, Louisiana). pdf
A. Serio and H. Kaiser: Solving N-body Problems using HPX, Scientific Computing Around Louisiana (January 20, 2012 – Baton Rouge, Louisiana). pdf
K. Kufahl, B. Adelstein-Lelbach and H. Kaiser: Performance Monitoring in Distributed ParalleX Applications, CCT/LA-SiGMA REU Poster Session (July 28, 2011 – Baton Rouge, Louisiana). pdf

Other Technical Publications

Diehl, P, Brandt, S, Morris, M, et al. Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java. arXiv, 2023. pdf
Diehl, P, Daiss, G, Huck, K, et al. Distributed, combined CPU and GPU profiling within HPX using APEX, arXiv, 2022. pdf
Hartmut Kaiser, Thomas Heller, Michael Wong, P0361R0: Invoking Algorithms asynchronously, Proposal to the ISO C++ Standardization Committee, May 2016, pdf
Jared Hoberock, Michael Garland, Olivier Giroux, Hartmut Kaiser, P0058R1: An Interface for Abstracting Execution, Proposal to the ISO C++ Standardization Committee, Feb. 2016, pdf
Michael Wong, Hartmut Kaiser, Thomas Heller, P0234R0: Towards Massive Parallelism (aka Heterogeneous Devices/Accelerators/GPGPU) support in C++, Proposal to the ISO C++ Standardization Committee, Feb. 2016, pdf
Grant Mercer, Agustín Bergé, Hartmut Kaiser, N4167: Transform Reduce, an Additional Algorithm for C++ Extensions for Parallelism, Proposal to the ISO C++ Standardization Committee, Nov. 2014, pdf.

Preprints

Patrick Diehl, Gregor Daiss, Steven R. Brandt, Alireza Kheirkhahan, Hartmut Kaiser, Christopher Taylor, John Leidel (2023). Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger. arXiv preprint arXiv:2309.06530
Diehl, P., Daiß, G., Huck, K., Marcello, D., Shiber, S., Kaiser, H., & Pflüger, D. (2023). Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku. arXiv preprint arXiv:2304.11002.
Daiß, Gregor and Singanaboina, Srinivas Yadav and Diehl, Patrick and Kaiser, Hartmut and Pflüger, Dirk, From Merging Frameworks to Merging Stars: Experiences using HPX, Kokkos and SIMD Types, arXiv, 2022. pdf
Daiß, Gregor and Diehl, Patrick and Marcello, Dominic and Kheirkhahan, Alireza and Kaiser, Hartmut and Pflüger, Dirk, From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels, arXiv, 2022. pdf
N. Wu, V. Castellana, H. Kaiser, “Towards Superior Software Portability with SHAD and HPX C++ Libraries”, accepted by 19th ACM International Conference on Computing Frontiers, 2022.
Patrick Diehl, Gregor Daiß, Dominic Marcello, Kevin Huck, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Dirk Pflüger. Octo-Tiger’s New Hydro Module and Performance Using HPX+CUDA on ORNL’s Summit. Accepted to IEEE Cluster, 2021. arXiv:2107.10987
Weile Wei, Eduardo D’Azevedo, Kevin Huck, Arghya Chatterjee, Oscar Hernandez, Hartmut Kaiser, “Memory Reduction using a Ring Abstraction over GPU RDMA for Distributed Quantum Monte Carlo Solver”, The Platform for Advanced Scientific Computing (PASC) Conference, July 5, 2021. pdf
Pranav Gadikar, Patrick Diehl, Prashant K. Jha. Load balancing for distributed nonlocal models within asynchronous many-task systems. 2021. pdf
Patrick Diehl, Dominic Marcello, Parsa Amini, Hartmut Kaiser, Sagiv Shiber, Geoffrey C. Clayton, Juhan Frank, Gregor Daiß, Dirk Pflüger, David Eder, Alice Koniges, Kevin Huck. Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application. 2021. pdf
Marcello, D.C., Shiber, S., De Marco, O., Frank, J., Clayton, G.C., Motl, P.M., Diehl, P., & Kaiser, H. Octo-Tiger: A New, 3D Hydrodynamic Code for Stellar Mergers that uses HPX Parallelisation. 2021. pdf
Nikunj Gupta, Steve R. Brandt, Bibek Wagle, Nanmiao Wu, Alireza Kheirkhahan, Patrick Diehl, Hartmut Kaiser, Felix W. Baumann: Deploying a Task-based Runtime System on Raspberry Pi Clusters. Oct 20, arXiv:2010.04106 [cs.DC]
Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hartmut Kaiser: Towards a Scalable and Distributed Infrastructure For Deep Learning Applications. Oct 20, arXiv:2010.03012 [cs.DC]
Bryce Adelstein-Lelbach, Patricia Grubel, Thomas Heller, Hartmut Kaiser, Jeanine Cook: Thread Management in the HPX Runtime: A Performance and Overhead Characterization. 2020. pdf
G. Laberge, S. Shirzad, P. Diehl, H. Kaiser, S. Prudhomme, A. Lemoine: Scheduling optimization of parallel linear algebra algorithms using Supervised Learning, submitted to the International Conference for High Performance Computing, Networking, Storage, and Analysis at Supercomputing 2019 (SC19). arxiv
Patrick Diehl, Prashant K. Jha, Hartmut Kaiser, Robert Lipton, Martin Levesque: Implementation of Peridynamics utilizing HPX — the C++ standard library for parallelism and concurrency, June 18, 2018. pdf
M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach and T. Sterling: Neutron Star Evolutions using Tabulated Equations of State with a New Execution Model, (2012). pdf, bibtex
M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach and T. Sterling: Adaptive Mesh Refinement for Astrophysics Applications with ParalleX, (2011). pdf, bibtex
M. Anderson, M. Brodowicz, H. Kaiser, and T. Sterling: An Application Driven Analysis of the ParalleX Execution Model, (2011). pdf, bibtex

Theses

Gonidelis, G. On The Performance Benefits of Porting NWGraph, a Modern C++ Graph Library to HPX. Louisiana State University, 2023.
Qiu, C. Performance analysis of parallel solver implemented using FLeSCI for application in structural dynamic problems. Louisiana State University, 2023. pdf
W. Wei, Optimizing the Performance of Parallel and Concurrent Applications Based on Asynchronous Many-Task Runtimes, Doctoral dissertation, Louisiana State University, 2022. pdf
N. Wu, “Performance Analysis and Improvement for Scalable and Distributed Applications Based on Asynchronous Many-Task Systems”, Doctoral dissertation, Louisiana State University, 2022. pdf
Parsa Amini, Adaptive Data Migration in Load-Imbalanced HPC Applications, Ph.D. dissertation defended at Louisiana State University, Oct 6, 2020. pdf
Shahrzad Shirzad, Optimizing the Performance of Multi-threaded Linear Algebra Libraries Based on Task Granularity, Ph.D. dissertation defended at Louisiana State University Oct 16, 2020. pdf
Weile Wei, Enabling Parallel Abstraction Layer to DCA++ Using HPX and GPUdirect, Master’s Thesis defended at Louisiana State University, May 2020. pdf
Maxwell Reeser, Extending Distributed Functionality in Phylanx. Master’s thesis defended at Louisiana State University, May 2020. pdf
Bibek Wagle: Managing Overheads in Asynchronous Many-Task Runtime Systems, Ph.D. defended at Louisiana State University, Nov 2019. pdf
Thomas Heller: Extending the C++ Asynchronous Programming Model with the HPX Runtime System for Distributed Memory Computing, Ph.D. thesis defended at Friedrich-Alexander-University of Erlangen-Nürnberg, Feb 28, 2019. pdf
Tianyi Zhang: hpxMP, An Implementation of OpenMP Using HPX, Master’s Thesis defended at Louisiana State University, 2019. pdf
Gregor Daiss: Octo-Tiger: Binary Star Systems with HPX on Nvidia P100, Master’s Thesis defended at the Universität Stuttgart, May 3, 2018. pdf
Zahra Khatami, Compiler and Runtime Optimization Techniques for Implementation Scalable Parallel Applications, Ph.D. defended at Louisiana State University, Aug 3, 2017. pdf
Lukas Troska: A HPX-based parallelization of a Navier-Stokes-solver, Bachelor’s Thesis defended at the Universität Bonn, August 2016. pdf
P. Grubel: Dynamic Adaptation in HPX – A Task-Based Parallel Runtime System, Ph.D. defended at New Mexico State University, August 2016. pdf
J. Wolf: Implementation of a backend to ISPC using HPX, bachelor’s thesis defended at Friedrich-Alexander Universität Erlangen-Nürnberg, July 13, 2016. pdf
R. Raj: Performance Analysis with HPX, Project Report at Louisiana State University, December, 2014. pdf
C. Guo: Implementing Asynchronous Prefix Scan Algorithm in HPX Execution Model, Project Report at Louisiana State University, December, 2014. pdf
B. Ghimire: Data Distribution in HPX, Thesis defended at Louisiana State University, December, 2014. pdf
V.C. Amatya: Parallel Processes in HPX: Designing an Infrastructure for Adaptive Resource Management, Ph.D. defended at Louisiana State University, Nov 11, 2014. pdf
S. Yang: A Persistent Storage Model for Extreme Computing, Ph.D. defended at Louisiana State University, Oct 31, 2014. pdf
J.A.E. Habraken: Adding capability-based security to high performance ParalleX, M.S. defended at the TU Eindhoven. Fac. Wiskunde en Informatica (Netherlands), October 7, 2013. pdf
T. Heller: Implementation of Data Flow for the ParalleX Execution Model, M.S. defended at Friedrich Alexander University Nuremberg (Germany), April 16, 2012. pdf
C. Dekate: Extreme Scale Parallel N-body Algorithm with an Event-driven Constraint-based Execution Model, defended at Louisiana State University, April 4, 2011. pdf
D. Stark: Advanced Semantics for Accelerated Graph Processing, defended at Louisiana State University, April 13, 2011. pdf

Related Work

Results created using STE||AR technology:

R. Lipton, E. Said, and P. Jha. “Free Damage Propagation with Memory.” Journal of Elasticity. 2018. DOI: 10.1007/s10659-018-9672-7
Patrick Diehl, Robert Lipton, and Marc Alexander Schweitzer. “Numerical verification of a bond-based softening peridynamic model for small displacements: Deducing material parameters from classical linear theory.” Institut für Numerische Simulation. 2016. pdf