Advancing OpenMP for Future Accelerators : 20th International Workshop on OpenMP, IWOMP 2024, Perth, WA, Australia, September 23–25, 2024, Proceedings / edited by Alexis Espinosa, Michael Klemm, Bronis R. de Supinski, Maciej Cytowski, Jannis Klinkenberg
Author Espinosa, Alexis
Edition [1st ed. 2024.]
Publication/distribution Cham : Springer Nature Switzerland : Imprint: Springer, 2024
Physical description 1 online resource (230 pages)
Classification 005.45
Other authors (persons) Klemm, Michael
de Supinski, Bronis R.
Cytowski, Maciej
Klinkenberg, Jannis
Series Lecture Notes in Computer Science
Topical subject Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation
ISBN 3-031-72567-0
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Current and Future OpenMP Optimization. -- Towards Locality-Aware Host-to-Device Offloading in OpenMP. -- Performance Porting the ExaStar Multi-physics App Thornado On Heterogeneous Systems - A Fortran-OpenMP Code-base Evaluation. -- Event-Based OpenMP Tasks for Time-Sensitive GPU-Accelerated Systems. -- Targeting More Devices. -- Integrating Multi-FPGA Acceleration to OpenMP Distributed Computing. -- Towards a Scalable and Efficient PGAS-based Distributed OpenMP. -- Multilayer Multipurpose Caches for OpenMP Target Regions on FPGAs. -- Best Practices. -- Survey of OpenMP Practice in General Open Source Software. -- CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations. -- Evaluation of Directive-based Programming Models for Stencil Computation on Current GPGPU Architectures. -- Tools. -- Finding Equivalent OpenMP Fortran and C/C++ Code Snippets Using Large Language Models. -- Visualizing Correctness Issues in OpenMP Programs. -- Developing an Interactive OpenMP Programming Book with Large Language Models. -- Simplifying Parallelization. -- Automatic Parallelization and OpenMP Offloading of Fortran Array Notation. -- Detrimental Task Execution Patterns in Mainstream OpenMP Runtimes.
Record Nr. UNINA-9910888598803321
Available at: Univ. Federico II
OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg
Author McIntosh-Smith, Simon
Publication/distribution Cham, Switzerland : Springer International Publishing, [2021]
Physical description 1 online resource (231 pages)
Classification 621.3916
Series Lecture Notes in Computer Science
Topical subject Microprocessors - Computer-aided design
Logic design - Data processing
ISBN 3-030-85262-8
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).
1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance.
3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogenous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work.
3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index.
Record Nr. UNISA-996464509003316
Available at: Univ. di Salerno
OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg
Author McIntosh-Smith, Simon
Publication/distribution Cham, Switzerland : Springer International Publishing, [2021]
Physical description 1 online resource (231 pages)
Classification 621.3916
Series Lecture Notes in Computer Science
Topical subject Microprocessors - Computer-aided design
Logic design - Data processing
ISBN 3-030-85262-8
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I).
1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance.
3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogenous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work.
3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index.
Record Nr. UNINA-9910502669803321
Available at: Univ. Federico II
OpenMP: Advanced Task-Based, Device and Compiler Programming [electronic resource] : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg
Author McIntosh-Smith, Simon
Edition [1st ed. 2023.]
Publication/distribution Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (244 pages)
Classification 005.275
Other authors (persons) Klemm, Michael
de Supinski, Bronis R.
Deakin, Tom
Klinkenberg, Jannis
Series Lecture Notes in Computer Science
Topical subject Microprocessors
Computer architecture
Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Processor Architectures
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation
ISBN 3-031-40744-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls.
Record Nr. UNISA-996546849103316
Available at: Univ. di Salerno
OpenMP: Advanced Task-Based, Device and Compiler Programming : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg
Author McIntosh-Smith, Simon
Edition [1st ed. 2023.]
Publication/distribution Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (244 pages)
Classification 005.275
Other authors (persons) Klemm, Michael
de Supinski, Bronis R.
Deakin, Tom
Klinkenberg, Jannis
Series Lecture Notes in Computer Science
Topical subject Microprocessors
Computer architecture
Compilers (Computer programs)
Microprogramming
Computer input-output equipment
Computers, Special purpose
Computer systems
Processor Architectures
Compilers and Interpreters
Control Structures and Microprogramming
Input/Output and Data Communications
Special Purpose and Application-Based Systems
Computer System Implementation
ISBN 3-031-40744-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls.
Record Nr. UNINA-9910743696103321
Available at: Univ. Federico II
OpenMP: Portable Multi-Level Parallelism on Modern Systems [electronic resource] : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg
Edition [1st ed. 2020.]
Publication/distribution Cham : Springer International Publishing : Imprint: Springer, 2020
Physical description 1 online resource (XI, 344 p. 148 illus., 95 illus. in color.)
Classification 004.1
Series Programming and Software Engineering
Topical subject Microprocessors
Computer programming
Programming languages (Electronic computers)
Logic design
Operating systems (Computers)
Architecture, Computer
Processor Architectures
Programming Techniques
Programming Languages, Compilers, Interpreters
Logic Design
Operating Systems
Computer System Implementation
ISBN 3-030-58144-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Performance Methodologies -- FAROS: A Framework To Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis -- Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures -- Applications -- A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload -- P-Aevol: an OpenMP Parallelization of a Biological Evolution Simulator, Through Decomposition in Multiple Loops -- Evaluating Performance of OpenMP Tasks in a Seismic Stencil Application -- OpenMP Extensions -- Unified Sequential Optimization Directives in OpenMP -- Support Data Shuffle Between Threads in OpenMP -- Performance Studies -- Performance Study of SpMV Towards an Auto-tuned and Task-based SpMV (LASs Library) -- A Case Study on Addressing Complex Load Imbalance in OpenMP -- Tools -- On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph -- AfterOMPT: An OMPT-based tool for Fine-Grained Tracing of Tasks and Loops -- Co-designing OpenMP Programming Model Features with OMPT and Simulation -- NUMA -- sOMP: Simulating OpenMP Task-based Applications with NUMA Effects -- Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications -- Compilation Techniques -- Using OpenMP to Detect and Speculate Dynamic DOALL Loops -- ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization -- Heterogeneous Computing -- OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure -- Data Transfer and Reuse Analysis Tool for GPU-offloading Using OpenMP -- Toward Supporting Multi-GPU Targets via Taskloop and User-defined Schedules -- Memory -- Preliminary Experience with OpenMP Memory Management Implementation -- Memory Anomalies in OpenMP.
Record Nr. UNISA-996418299303316
Available at: Univ. di Salerno
OpenMP: Portable Multi-Level Parallelism on Modern Systems : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg
Edition [1st ed. 2020.]
Publication/distribution Cham : Springer International Publishing : Imprint: Springer, 2020
Physical description 1 online resource (XI, 344 p. 148 illus., 95 illus. in color.)
Classification 004.1
005.1
Series Programming and Software Engineering
Topical subject Microprocessors
Computer programming
Programming languages (Electronic computers)
Logic design
Operating systems (Computers)
Computer architecture
Processor Architectures
Programming Techniques
Programming Languages, Compilers, Interpreters
Logic Design
Operating Systems
Computer System Implementation
ISBN 3-030-58144-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Performance Methodologies -- FAROS: A Framework To Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis -- Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures -- Applications -- A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload -- P-Aevol: an OpenMP Parallelization of a Biological Evolution Simulator, Through Decomposition in Multiple Loops -- Evaluating Performance of OpenMP Tasks in a Seismic Stencil Application -- OpenMP Extensions -- Unified Sequential Optimization Directives in OpenMP -- Support Data Shuffle Between Threads in OpenMP -- Performance Studies -- Performance Study of SpMV Towards an Auto-tuned and Task-based SpMV (LASs Library) -- A Case Study on Addressing Complex Load Imbalance in OpenMP -- Tools -- On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph -- AfterOMPT: An OMPT-based tool for Fine-Grained Tracing of Tasks and Loops -- Co-designing OpenMP Programming Model Features with OMPT and Simulation -- NUMA -- sOMP: Simulating OpenMP Task-based Applications with NUMA Effects -- Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications -- Compilation Techniques -- Using OpenMP to Detect and Speculate Dynamic DOALL Loops -- ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization -- Heterogeneous Computing -- OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure -- Data Transfer and Reuse Analysis Tool for GPU-offloading Using OpenMP -- Toward Supporting Multi-GPU Targets via Taskloop and User-defined Schedules -- Memory -- Preliminary Experience with OpenMP Memory Management Implementation -- Memory Anomalies in OpenMP.
Record Nr. UNINA-9910427719303321
Available at: Univ. Federico II