Share Catalogue


Advancing OpenMP for Future Accelerators : 20th International Workshop on OpenMP, IWOMP 2024, Perth, WA, Australia, September 23–25, 2024, Proceedings / edited by Alexis Espinosa, Michael Klemm, Bronis R. de Supinski, Maciej Cytowski, Jannis Klinkenberg

Espinosa, Alexis

Cham : Springer Nature Switzerland : Imprint: Springer, 2024

Printed material

Available at: Univ. Federico II

OpenMP : enabling massive node-level parallelism : 17th international workshop on OpenMP, IWOMP 2021, Bristol, UK, September 14-16, 2021 : proceedings / Simon McIntosh-Smith, Bronis R. de Supinski, Jannis Klinkenberg

McIntosh-Smith, Simon

Cham, Switzerland : Springer International Publishing, [2021]

Printed material

Available at: Univ. di Salerno

McIntosh-Smith, Simon

Cham, Switzerland : Springer International Publishing, [2021]

Printed material

Available at: Univ. Federico II

OpenMP: Advanced Task-Based, Device and Compiler Programming [electronic resource] : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg

McIntosh-Smith, Simon

Cham : Springer Nature Switzerland : Imprint: Springer, 2023

Printed material

Available at: Univ. di Salerno

OpenMP: Advanced Task-Based, Device and Compiler Programming : 19th International Workshop on OpenMP, IWOMP 2023, Bristol, UK, September 13–15, 2023, Proceedings / edited by Simon McIntosh-Smith, Michael Klemm, Bronis R. de Supinski, Tom Deakin, Jannis Klinkenberg

McIntosh-Smith, Simon

Cham : Springer Nature Switzerland : Imprint: Springer, 2023

Printed material

Available at: Univ. Federico II

OpenMP: Portable Multi-Level Parallelism on Modern Systems [electronic resource] : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg

Cham : Springer International Publishing : Imprint: Springer, 2020

Printed material

Available at: Univ. di Salerno

OpenMP: Portable Multi-Level Parallelism on Modern Systems : 16th International Workshop on OpenMP, IWOMP 2020, Austin, TX, USA, September 22–24, 2020, Proceedings / edited by Kent Milfeld, Bronis R. de Supinski, Lars Koesterke, Jannis Klinkenberg

Cham : Springer International Publishing : Imprint: Springer, 2020

Author	Espinosa, Alexis
Edition	[1st ed. 2024.]
Publication/distribution/printing	Cham : Springer Nature Switzerland : Imprint: Springer, 2024
Physical description	1 online resource (230 pages)
Discipline	005.45
Other authors (Persons)	Klemm, Michael; de Supinski, Bronis R.; Cytowski, Maciej; Klinkenberg, Jannis
Series	Lecture Notes in Computer Science
Topical subject	Compilers (Computer programs); Microprogramming; Computer input-output equipment; Computers, Special purpose; Computer systems; Compilers and Interpreters; Control Structures and Microprogramming; Input/Output and Data Communications; Special Purpose and Application-Based Systems; Computer System Implementation
ISBN	3-031-72567-0
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Current and Future OpenMP Optimization. -- Towards Locality-Aware Host-to-Device Offloading in OpenMP. -- Performance Porting the ExaStar Multi-physics App Thornado On Heterogeneous Systems - A Fortran-OpenMP Code-base Evaluation. -- Event-Based OpenMP Tasks for Time-Sensitive GPU-Accelerated Systems. -- Targeting More Devices. -- Integrating Multi-FPGA Acceleration to OpenMP Distributed Computing. -- Towards a Scalable and Efficient PGAS-based Distributed OpenMP. -- Multilayer Multipurpose Caches for OpenMP Target Regions on FPGAs. -- Best Practices. -- Survey of OpenMP Practice in General Open Source Software. -- CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations. -- Evaluation of Directive-based Programming Models for Stencil Computation on Current GPGPU Architectures. -- Tools. -- Finding Equivalent OpenMP Fortran and C/C++ Code Snippets Using Large Language Models. -- Visualizing Correctness Issues in OpenMP Programs. -- Developing an Interactive OpenMP Programming Book with Large Language Models. -- Simplifying Parallelization. -- Automatic Parallelization and OpenMP Offloading of Fortran Array Notation. -- Detrimental Task Execution Patterns in Mainstream OpenMP Runtimes.
Record no.	UNINA-9910888598803321

Author	McIntosh-Smith, Simon
Publication/distribution/printing	Cham, Switzerland : Springer International Publishing, [2021]
Physical description	1 online resource (231 pages)
Discipline	621.3916
Series	Lecture Notes in Computer Science
Topical subject	Microprocessors - Computer-aided design; Logic design - Data processing
ISBN	3-030-85262-8
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Intro -- Preface -- Organization -- Contents -- Synchronization and Data -- Improving Speculative taskloop in Hardware Transactional Memory -- 1 Introduction -- 2 Background and Related Work -- 2.1 Task-Based Parallelism -- 2.2 TLS on Hardware Transactional Memories -- 2.3 Speculative taskloop (STL) -- 2.4 Lost-Thread Effect -- 2.5 LLVM OpenMP Runtime Library -- 3 Implementation -- 3.1 First Attempt: Use priority Clause -- 3.2 Recursive Partition of Iterations -- 3.3 Immediate Execution When Deque is Full -- 3.4 Removal from Tail of Thread's Deque -- 4 Benchmarks, Methodology and Experimental Setup -- 5 Experimental Results and Analysis -- 6 Conclusions -- References -- Vectorized Barrier and Reduction in LLVM OpenMP Runtime -- 1 Introduction -- 2 Background and Related Work -- 2.1 Types of Barriers in Literature -- 2.2 Barriers and Reductions in OpenMP -- 3 Low Overhead Barrier and Reduction in OpenMP -- 3.1 Vectorized Barrier -- 3.2 Vectorized Reduction -- 4 Performance Results -- 4.1 Intel KNL -- 4.2 Fujitsu A64FX -- 5 Conclusions -- References -- Tasking Extensions I -- Enhancing OpenMP Tasking Model: Performance and Portability -- 1 Introduction -- 2 Motivation -- 3 The Taskgraph Model -- 3.1 The taskgraph Mechanism -- 3.2 Syntax of the taskgraph Clause -- 3.3 Semantics of the taskgraph Clause -- 3.4 Requirements of the taskgraph Region -- 4 Projected Results -- 4.1 Potential Performance Gain -- 4.2 The TDG: A Door for Expanding Portability -- 5 Related Work -- 6 Conclusion -- References -- OpenMP Taskloop Dependences -- 1 Introduction -- 2 Tasking Programmability Challenges -- 3 Related Work -- 4 Taskloop with Dependences -- 5 Implementation -- 6 Experiment Results -- 7 Conclusions and Future Work -- References -- Applications -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part I) -- 1 Introduction -- 2 Platforms Used -- 3 Application Experiences -- 3.1 BerkeleyGW -- 3.2 WDMApp -- References -- Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model (Part II) -- 1 Introduction -- 2 Application Experiences -- 2.1 GAMESS -- 2.2 GESTS -- 2.3 GridMini -- 3 Conclusions -- References -- An Empirical Investigation of OpenMP Based Implementation of Simplex Algorithm -- 1 Introduction -- 2 Serial Algorithm -- 3 Parallel Algorithm -- 3.1 Implementation -- 3.2 Optimization Strategies -- 3.3 Algorithm Analysis -- 4 Experimental Results and Observations -- 4.1 NETLIB Dataset -- 4.2 Variation of the Number of Variables -- 4.3 Variation of the Number of Constraints -- 4.4 Variation in Matrix Density -- 4.5 Discussion -- 5 Conclusion -- A Appendix: Serial Algorithm - Working Example -- References -- Task Inefficiency Patterns for a Wave Equation Solver -- 1 Introduction -- 2 Case Studies -- 3 Test Environment -- 4 Benchmarking and Task Runtime Modifications -- 4.1 Direct Translation of Enclave Tasking to OpenMP (native) -- 4.2 Manual Task Postponing (Hold-Back) -- 4.3 Manual Backfilling (Backfill) -- 5 Evaluation and Conclusion -- References -- Case Studies -- Comparing OpenMP Implementations with Applications Across A64FX Platforms -- 1 Introduction -- 1.1 The A64FX Processor -- 1.2 Paper's Contribution and Organization -- 2 List of Applications and Experimental Setup -- 2.1 List of Applications -- 2.2 Systems and Compilers -- 2.3 Runtime Environment -- 2.4 Compiler Options -- 3 Experimental Results -- 3.1 Ookami -- 3.2 Fugaku -- 4 Related Work -- 5 Conclusions and Future Work -- References -- A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation -- 1 Introduction -- 2 Case Study: Porting DCA++ to Wombat -- 2.1 Evaluation Environment -- 2.2 DCA++ -- 2.3 Baseline Performance -- 3 An LLVM Tool Methodology to Generate Efficient Vectorization -- 3.1 OpenMP SIMD -- 3.2 Using the Correct Compiler Flags -- 3.3 Loop Transformations -- 3.4 Results -- 4 Automating the Process: The OpenMP Advisor -- 5 Related Work -- 6 Conclusion -- References -- Heterogeneous Computing and Memory -- Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1 -- 1 Introduction -- 2 Background -- 2.1 Device Runtime Library -- 2.2 Compilation Flow of OpenMP Target Offloading in LLVM/Clang -- 2.3 Motivation -- 3 Implementation -- 3.1 Common Part -- 3.2 Target Specific Part -- 4 Evaluation -- 4.1 Code Comparison -- 4.2 Functional Testing -- 4.3 Performance Evaluation -- 5 Conclusions and Future Work -- References -- FOTV: A Generic Device Offloading Framework for OpenMP -- 1 Introduction -- 2 Background: OpenMP Offloading Infrastructure -- 2.1 Offloading Strategy -- 2.2 Advantages and Limitations -- 3 Architecture of the FOTV Generic Device Framework -- 3.1 The Runtime Library Components -- 3.2 The Code Extraction Tool -- 4 Device Management API Description -- 4.1 DeviceManagement Component -- 4.2 TgtRegionBase Component -- 5 Case Study: Running OpenCL Kernels as OpenMP Regions -- 5.1 The OpenCL Device Requirements -- 6 Results -- 7 Related Works -- 8 Conclusions and Future Works -- References -- Beyond Explicit Transfers: Shared and Managed Memory in OpenMP -- 1 Introduction -- 2 Current Support in OpenMP -- 2.1 Allocators -- 2.2 Host Memory -- 2.3 Device Memory -- 3 Survey -- 3.1 OpenCL -- 3.2 Level Zero -- 3.3 CUDA -- 3.4 HIP -- 4 Proposed OpenMP Extension -- 4.1 Memory Space Accessibility -- 4.2 Shared and Managed Memory -- 4.3 Memory Location Control -- 5 Evaluation -- 6 Conclusion -- References -- Tasking Extensions II -- Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications -- 1 Introduction -- 2 Related Work -- 3 Task Scheduling Strategy -- 3.1 Interoperation Between MPI and OpenMP Runtimes -- 3.2 Manual Policies -- 3.3 (Semi-)Automatic Policies -- 3.4 Summary -- 4 Implementation and Evaluation -- 4.1 Implementation -- 4.2 Evaluation Environment -- 4.3 Experimental Results -- 5 Conclusion and Future Work -- References -- An OpenMP Free Agent Threads Implementation -- 1 Introduction -- 2 Related Work -- 3 Proposal -- 3.1 Considered Aspects in the Design -- 3.2 The free_agent Task Clause -- 3.3 Proposed Mechanisms to Manage Free Agent Threads -- 4 Implementation -- 5 Evaluation -- 5.1 Use Case: Fixing Load Imbalance Between Parallel Regions -- 5.2 Use Case: Solving Load Imbalance in a Hybrid Application with DLB as an OMPT Tool -- 6 Conclusions and Future Work -- References -- Author Index.
Record no.	UNISA-996464509003316

Author	McIntosh-Smith, Simon
Publication/distribution/printing	Cham, Switzerland : Springer International Publishing, [2021]
Physical description	1 online resource (231 pages)
Discipline	621.3916
Series	Lecture Notes in Computer Science
Topical subject	Microprocessors - Computer-aided design; Logic design - Data processing
ISBN	3-030-85262-8
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Identical to the contents note of record UNISA-996464509003316 above.
Record no.	UNINA-9910502669803321

Author	McIntosh-Smith, Simon
Edition	[1st ed. 2023.]
Publication/distribution/printing	Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description	1 online resource (244 pages)
Discipline	005.275
Other authors (Persons)	Klemm, Michael; de Supinski, Bronis R.; Deakin, Tom; Klinkenberg, Jannis
Series	Lecture Notes in Computer Science
Topical subject	Microprocessors; Computer architecture; Compilers (Computer programs); Microprogramming; Computer input-output equipment; Computers, Special purpose; Computer systems; Processor Architectures; Compilers and Interpreters; Control Structures and Microprogramming; Input/Output and Data Communications; Special Purpose and Application-Based Systems; Computer System Implementation
ISBN	3-031-40744-X
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	OpenMP and AI: Advising OpenMP Parallelization via a Graph-Based Approach with Transformers -- Towards Effective Language Model Application in High-Performance Computing -- OpenMP Advisor: A Compiler Tool for Heterogeneous Architectures -- Tasking Extensions: Introducing Moldable Task in OpenMP -- Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct -- How to Efficiently Parallelize Irregular DOACROSS Loops Using Fine-Grained Granularity and OpenMP Tasks? The mcf Case -- OpenMP Offload Experiences: The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned -- Fine-Grained Parallelism on GPUs Using OpenMP Target Offloading -- Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies -- Beyond Explicit GPU Support: Multipurpose Cacheing to accelerate OpenMP Target Regions on FPGAs -- Generalizing Hierarchical Parallelism -- Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload -- OpenMP Infrastructure and Evaluation: Improving Simulations of Task-Based Applications on Complex NUMA Architectures -- Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support -- OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls.
Record no.	UNISA-996546849103316

Author	McIntosh-Smith, Simon
Edition	[1st ed. 2023.]
Publication/distribution/printing	Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description	1 online resource (244 pages)
Discipline	005.275
Other authors (Persons)	Klemm, Michael; de Supinski, Bronis R.; Deakin, Tom; Klinkenberg, Jannis
Series	Lecture Notes in Computer Science
Topical subject	Microprocessors; Computer architecture; Compilers (Computer programs); Microprogramming; Computer input-output equipment; Computers, Special purpose; Computer systems; Processor Architectures; Compilers and Interpreters; Control Structures and Microprogramming; Input/Output and Data Communications; Special Purpose and Application-Based Systems; Computer System Implementation
ISBN	3-031-40744-X
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Identical to the contents note of record UNISA-996546849103316 above.
Record no.	UNINA-9910743696103321


Edition	[1st ed. 2020.]
Publication/distribution/printing	Cham : Springer International Publishing : Imprint: Springer, 2020
Physical description	1 online resource (XI, 344 p. 148 illus., 95 illus. in color.)
Discipline	004.1
Series	Programming and Software Engineering
Topical subject	Microprocessors; Computer programming; Programming languages (Electronic computers); Logic design; Operating systems (Computers); Architecture, Computer; Processor Architectures; Programming Techniques; Programming Languages, Compilers, Interpreters; Logic Design; Operating Systems; Computer System Implementation
ISBN	3-030-58144-6
Format	Printed material
Bibliographic level	Monograph
Language of publication	eng
Contents note	Performance Methodologies -- FAROS: A Framework To Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis -- Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures -- Applications -- A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload -- P-Aevol: an OpenMP Parallelization of a Biological Evolution Simulator, Through Decomposition in Multiple Loops -- Evaluating Performance of OpenMP Tasks in a Seismic Stencil Application -- OpenMP Extensions -- Unified Sequential Optimization Directives in OpenMP -- Support Data Shuffle Between Threads in OpenMP -- Performance Studies -- Performance Study of SpMV Towards an Auto-tuned and Task-based SpMV (LASs Library) -- A Case Study on Addressing Complex Load Imbalance in OpenMP -- Tools -- On-the-fly Data Race Detection with the Enhanced OpenMP Series-Parallel Graph -- AfterOMPT: An OMPT-based tool for Fine-Grained Tracing of Tasks and Loops -- Co-designing OpenMP Programming Model Features with OMPT and Simulation -- NUMA -- sOMP: Simulating OpenMP Task-based Applications with NUMA Effects -- Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications -- Compilation Techniques -- Using OpenMP to Detect and Speculate Dynamic DOALL Loops -- ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization -- Heterogeneous Computing -- OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure -- Data Transfer and Reuse Analysis Tool for GPU-offloading Using OpenMP -- Toward Supporting Multi-GPU Targets via Taskloop and User-defined Schedules -- Memory -- Preliminary Experience with OpenMP Memory Management Implementation -- Memory Anomalies in OpenMP.
Record no.	UNISA-996418299303316