OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg
| OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg |
| Autore | McIntosh-Smith Simon |
| Pubbl/distr/stampa | Cham, Switzerland : , : Springer International Publishing, , [2021] |
| Descrizione fisica | 1 online resource (231 pages) |
| Disciplina | 621.3916 |
| Collana | Lecture Notes in Computer Science |
| Soggetto topico |
Microprocessors - Computer-aided design
Logic design - Data processing |
| ISBN | 3-030-85262-8 |
| Formato | Materiale a stampa |
| Livello bibliografico | Monografia |
| Lingua di pubblicazione | eng |
| Nota di contenuto |
Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).
1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance. 3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogenous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work. 3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index. |
| Record Nr. | UNISA-996464509003316 |
McIntosh-Smith Simon
|
||
| Cham, Switzerland : , : Springer International Publishing, , [2021] | ||
| Lo trovi qui: Univ. di Salerno | ||
| ||
OpenMP: Advanced Task-Based, Device and Compiler Programming [[electronic resource] ] : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg
| OpenMP: Advanced Task-Based, Device and Compiler Programming [[electronic resource] ] : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg |
| Autore | McIntosh-Smith Simon |
| Edizione | [1st ed. 2023.] |
| Pubbl/distr/stampa | Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 |
| Descrizione fisica | 1 online resource (244 pages) |
| Disciplina | 005.275 |
| Altri autori (Persone) |
KlemmMichael
de SupinskiBronis R DeakinTom KlinkenbergJannis |
| Collana | Lecture Notes in Computer Science |
| Soggetto topico |
Microprocessors
Computer architecture Compilers (Computer programs) Microprogramming Computer input-output equipment Computers, Special purpose Computer systems Processor Architectures Compilers and Interpreters Control Structures and Microprogramming Input/Output and Data Communications Special Purpose and Application-Based Systems Computer System Implementation |
| ISBN | 3-031-40744-X |
| Formato | Materiale a stampa |
| Livello bibliografico | Monografia |
| Lingua di pubblicazione | eng |
| Nota di contenuto | OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls. |
| Record Nr. | UNISA-996546849103316 |
McIntosh-Smith Simon
|
||
| Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 | ||
| Lo trovi qui: Univ. di Salerno | ||
| ||
OpenMP: Advanced Task-Based, Device and Compiler Programming : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg
| OpenMP: Advanced Task-Based, Device and Compiler Programming : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg |
| Autore | McIntosh-Smith Simon |
| Edizione | [1st ed. 2023.] |
| Pubbl/distr/stampa | Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 |
| Descrizione fisica | 1 online resource (244 pages) |
| Disciplina | 005.275 |
| Altri autori (Persone) |
KlemmMichael
de SupinskiBronis R DeakinTom KlinkenbergJannis |
| Collana | Lecture Notes in Computer Science |
| Soggetto topico |
Microprocessors
Computer architecture Compilers (Computer programs) Microprogramming Computer input-output equipment Computers, Special purpose Computer systems Processor Architectures Compilers and Interpreters Control Structures and Microprogramming Input/Output and Data Communications Special Purpose and Application-Based Systems Computer System Implementation |
| ISBN |
9783031407444
303140744X |
| Formato | Materiale a stampa |
| Livello bibliografico | Monografia |
| Lingua di pubblicazione | eng |
| Nota di contenuto | OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- GeneralizingHierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls. |
| Record Nr. | UNINA-9910743696103321 |
McIntosh-Smith Simon
|
||
| Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 | ||
| Lo trovi qui: Univ. Federico II | ||
| ||
OpenMP: Enabling Massive Node-Level Parallelism : 17th International Workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14–16, 2021, Proceedings / / edited by Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg
| OpenMP: Enabling Massive Node-Level Parallelism : 17th International Workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14–16, 2021, Proceedings / / edited by Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg |
| Autore | McIntosh-Smith Simon |
| Edizione | [1st ed. 2021.] |
| Pubbl/distr/stampa | Cham : , : Springer International Publishing : , : Imprint : Springer, , 2021 |
| Descrizione fisica | 1 online resource (231 pages) |
| Disciplina | 621.3916 |
| Collana | Programming and Software Engineering |
| Soggetto topico |
Microprocessors
Computer architecture Logic design Computer programming Compilers (Computer programs) Operating systems (Computers) Processor Architectures Logic Design Programming Techniques Compilers and Interpreters Operating Systems |
| ISBN | 3-030-85262-8 |
| Formato | Materiale a stampa |
| Livello bibliografico | Monografia |
| Lingua di pubblicazione | eng |
| Nota di contenuto | Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- OpenMP Taskloop Dependences -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I) -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- An empirical investigation of OpenMP based implementation of Simplex Algorithm -- Task inefficiency patterns for a wave equation solver -- Case Studies -- Comparing OpenMP Implementations With Applications Across A64FX Platforms -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- Heterogenous Computing and Memory -- Experience Report: Writing A Portable GPU Runtime with OpenMP 5.1 -- FOTV: A generic device offloading framework for OpenMP -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- An OpenMP Free Agent threads implementation. |
| Record Nr. | UNINA-9910502669803321 |
McIntosh-Smith Simon
|
||
| Cham : , : Springer International Publishing : , : Imprint : Springer, , 2021 | ||
| Lo trovi qui: Univ. Federico II | ||
| ||