Advancing OpenMP for Future Accelerators : 20th International Workshop on OpenMP, IWOMP 2024, Perth, WA, Australia, September 23–25, 2024, Proceedings / edited by Alexis Espinosa, Michael Klemm, Bronis R. de Supinski, Maciej Cytowski, Jannis Klinkenberg |
Author | Espinosa, Alexis |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (230 pages) |
Discipline | 005.45 |
Other authors (Persons) |
Klemm, Michael
de Supinski, Bronis R.
Cytowski, Maciej
Klinkenberg, Jannis |
Series | Lecture Notes in Computer Science |
Topical subject |
Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation |
ISBN | 3-031-72567-0 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Current and Future OpenMP Optimization. -- Towards Locality-Aware Host-to-Device Offloading in OpenMP. -- Performance Porting the ExaStar Multi-physics App Thornado On Heterogeneous Systems - A Fortran-OpenMP Code-base Evaluation. -- Event-Based OpenMP Tasks for Time-Sensitive GPU-Accelerated Systems. -- Targeting More Devices. -- Integrating Multi-FPGA Acceleration to OpenMP Distributed Computing. -- Towards a Scalable and Efficient PGAS-based Distributed OpenMP. -- Multilayer Multipurpose Caches for OpenMP Target Regions on FPGAs. -- Best Practices. -- Survey of OpenMP Practice in General Open Source Software. -- CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations. -- Evaluation of Directive-based Programming Models for Stencil Computation on Current GPGPU Architectures. -- Tools. -- Finding Equivalent OpenMP Fortran and C/C++ Code Snippets Using Large Language Models. -- Visualizing Correctness Issues in OpenMP Programs. -- Developing an Interactive OpenMP Programming Book with Large Language Models. -- Simplifying Parallelization. -- Automatic Parallelization and OpenMP Offloading of Fortran Array Notation. -- Detrimental Task Execution Patterns in Mainstream OpenMP Runtimes. |
Record no. | UNINA-9910888598803321 |
Available at: Univ. Federico II | ||
|
OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg |
Author | McIntosh-Smith, Simon |
Publication/distribution/printing | Cham, Switzerland : Springer International Publishing, [2021] |
Physical description | 1 online resource (231 pages) |
Discipline | 621.3916 |
Series | Lecture Notes in Computer Science |
Topical subject |
Microprocessors - Computer-aided design
Logic design - Data processing |
ISBN | 3-030-85262-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).
1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance. 
3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogenous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work. 
3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index. |
Record no. | UNISA-996464509003316 |
Available at: Univ. di Salerno | ||
|
OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg |
Author | McIntosh-Smith, Simon |
Publication/distribution/printing | Cham, Switzerland : Springer International Publishing, [2021] |
Physical description | 1 online resource (231 pages) |
Discipline | 621.3916 |
Series | Lecture Notes in Computer Science |
Topical subject |
Microprocessors - Computer-aided design
Logic design - Data processing |
ISBN | 3-030-85262-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).
1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance. 
3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogenous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work. 
3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index. |
Record no. | UNINA-9910502669803321 |
Available at: Univ. Federico II | ||
|
OpenMP: Advanced Task-Based, Device and Compiler Programming [electronic resource] : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg |
Author | McIntosh-Smith, Simon |
Edition | [1st ed. 2023.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2023 |
Physical description | 1 online resource (244 pages) |
Discipline | 005.275 |
Other authors (Persons) |
Klemm, Michael
de Supinski, Bronis R.
Deakin, Tom
Klinkenberg, Jannis |
Series | Lecture Notes in Computer Science |
Topical subject |
Microprocessors
Computer architecture
Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Processor Architectures
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation |
ISBN | 3-031-40744-X |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls. |
Record no. | UNISA-996546849103316 |
Available at: Univ. di Salerno | ||
|
OpenMP: Advanced Task-Based, Device and Compiler Programming : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg |
Author | McIntosh-Smith, Simon |
Edition | [1st ed. 2023.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2023 |
Physical description | 1 online resource (244 pages) |
Discipline | 005.275 |
Other authors (Persons) |
Klemm, Michael
de Supinski, Bronis R.
Deakin, Tom
Klinkenberg, Jannis |
Series | Lecture Notes in Computer Science |
Topical subject |
Microprocessors
Computer architecture
Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Processor Architectures
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation |
ISBN | 3-031-40744-X |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls. |
Record no. | UNINA-9910743696103321 |
Available at: Univ. Federico II | ||
|
OpenMP: Portable Multi-Level Parallelism on Modern Systems [electronic resource] : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg |
Edition | [1st ed. 2020.] |
Publication/distribution/printing | Cham : Springer International Publishing : Imprint: Springer, 2020 |
Physical description | 1 online resource (XI, 344 p. 148 illus., 95 illus. in color.) |
Discipline | 004.1 |
Series | Programming and Software Engineering |
Topical subject |
Microprocessors
Computer programming
Programming languages (Electronic computers)
Logic design
Operating systems (Computers)
Architecture, Computer
Processor Architectures
Programming Techniques
Programming Languages, Compilers, Interpreters
Logic Design
Operating Systems
Computer System Implementation |
ISBN | 3-030-58144-6 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Performance Methodologies -- FAROS: A Framework To Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis -- Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures -- Applications -- A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload -- P-Aevol: an OpenMP Parallelization of a Biological Evolution Simulator, Through Decomposition in Multiple Loops -- Evaluating Performance of OpenMP Tasks in a Seismic Stencil Application -- OpenMP Extensions -- Unified Sequential Optimization Directives in OpenMP -- Support Data Shuffle Between Threads in OpenMP -- Performance Studies -- Performance Study of SpMV Towards an Auto-tuned and Task-based SpMV (LASs Library) -- A Case Study on Addressing Complex Load Imbalance in OpenMP -- Tools -- On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph -- AfterOMPT: An OMPT-based tool for Fine-Grained Tracing of Tasks and Loops -- Co-designing OpenMP Programming Model Features with OMPT and Simulation -- NUMA -- sOMP: Simulating OpenMP Task-based Applications with NUMA Effects -- Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications -- Compilation Techniques -- Using OpenMP to Detect and Speculate Dynamic DOALL Loops -- ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization -- Heterogeneous Computing -- OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure -- Data Transfer and Reuse Analysis Tool for GPU-offloading Using OpenMP -- Toward Supporting Multi-GPU Targets via Taskloop and User-defined Schedules -- Memory -- Preliminary Experience with OpenMP Memory Management Implementation -- Memory Anomalies in OpenMP. |
Record no. | UNISA-996418299303316 |
Available at: Univ. di Salerno | ||
|
OpenMP: Portable Multi-Level Parallelism on Modern Systems : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg |
Edition | [1st ed. 2020.] |
Publication/distribution/printing | Cham : Springer International Publishing : Imprint: Springer, 2020 |
Physical description | 1 online resource (XI, 344 p. 148 illus., 95 illus. in color.) |
Discipline |
004.1
005.1 |
Series | Programming and Software Engineering |
Topical subject |
Microprocessors
Computer programming
Programming languages (Electronic computers)
Logic design
Operating systems (Computers)
Computer architecture
Processor Architectures
Programming Techniques
Programming Languages, Compilers, Interpreters
Logic Design
Operating Systems
Computer System Implementation |
ISBN | 3-030-58144-6 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Performance Methodologies -- FAROS: A Framework To Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis -- Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures -- Applications -- A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload -- P-Aevol: an OpenMP Parallelization of a Biological Evolution Simulator, Through Decomposition in Multiple Loops -- Evaluating Performance of OpenMP Tasks in a Seismic Stencil Application -- OpenMP Extensions -- Unified Sequential Optimization Directives in OpenMP -- Support Data Shuffle Between Threads in OpenMP -- Performance Studies -- Performance Study of SpMV Towards an Auto-tuned and Task-based SpMV (LASs Library) -- A Case Study on Addressing Complex Load Imbalance in OpenMP -- Tools -- On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph -- AfterOMPT: An OMPT-based tool for Fine-Grained Tracing of Tasks and Loops -- Co-designing OpenMP Programming Model Features with OMPT and Simulation -- NUMA -- sOMP: Simulating OpenMP Task-based Applications with NUMA Effects -- Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications -- Compilation Techniques -- Using OpenMP to Detect and Speculate Dynamic DOALL Loops -- ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization -- Heterogeneous Computing -- OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure -- Data Transfer and Reuse Analysis Tool for GPU-offloading Using OpenMP -- Toward Supporting Multi-GPU Targets via Taskloop and User-defined Schedules -- Memory -- Preliminary Experience with OpenMP Memory Management Implementation -- Memory Anomalies in OpenMP. |
Record no. | UNINA-9910427719303321 |
Available at: Univ. Federico II | ||
|