Euro-Par 2021: Parallel Processing [electronic resource] : 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings / edited by Leonel Sousa, Nuno Roma, Pedro Tomás |
Edition | [1st ed. 2021.] |
Publication/distribution/printing | Cham : Springer International Publishing : Imprint: Springer, 2021 |
Physical description | 1 online resource (652 pages) |
Discipline | 004.35 |
Series | Theoretical Computer Science and General Issues |
Topical subject | Software engineering; Computer engineering; Computer networks; Compilers (Computer programs); Computers; Operating systems (Computers); Software Engineering; Computer Engineering and Networks; Compilers and Interpreters; Computer Hardware; Operating Systems |
ISBN | 3-030-85665-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Compilers, Tools and Environments -- ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning -- Automatic low-overhead load-imbalance detection in MPI applications -- Performance and Power Modeling, Prediction and Evaluation -- Trace-driven Workload Generation and Execution -- Bilas Update on the Asymptotic Optimality of LPT -- E2EWatch: An End-to-end Anomaly Diagnosis Framework for Production HPC Systems -- Scheduling and Load Balancing -- Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing -- A Fixed-Parameter Algorithm for Scheduling Unit dependent Tasks with Unit Communication Delays -- Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers -- Taming Tail Latency in Key-Value Stores: a Scheduling Perspective -- A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource -- An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures -- Pipelined Model Parallelism: Complexity Results and Memory Considerations -- Data Management, Analytics and Machine Learning -- Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization -- A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks -- Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs -- Smart Distributed Data Sets for Stream Processing -- Cluster, Cloud and Edge Computing -- Colony: Parallel Functions as a Service on the Cloud-Edge Continuum -- Horizontal Scaling in Cloud using Contextual Bandits -- Geo-Distribute Cloud Application at the Edge -- A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-based Spot GPU Instances -- Sustaining Performance While Reducing Energy Consumption: A Control Theory Approach -- Theory and Algorithms for Parallel and Distributed Processing -- Algorithm design for Tensor Units -- A Scalable Approximation Algorithm for Weighted Longest Common Subsequence -- TSL Queue: An Efficient Lock-free Design for Priority Queues -- G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU -- Parallel and Distributed Programming, Interfaces, and Languages -- Accelerating Graph Applications Using Phased Transactional Memory -- Efficient GPU Computation using Task Graph Parallelism -- Towards High Performance Resilience using Performance Portable Abstractions -- Enhancing Load-Balancing of MPI Applications with Workshare -- Particle-In-Cell Simulation using Asynchronous Tasking -- Multicore and Manycore Parallelism -- Exploiting co-execution with one API: heterogeneity from a modern perspective -- Parallel Numerical Methods and Applications -- Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems -- Fault-tolerant LU factorization is low cost -- Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs -- Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic Levenberg-Marquardt Method -- GPU Accelerated Mahalanobis-average Hierarchical Clustering Analysis -- High performance architectures and accelerators -- PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy -- Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware. |
Record No. | UNISA-996464495103316 |
Find it here: Univ. di Salerno | ||
|
Euro-Par 2021: Parallel Processing : 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings / edited by Leonel Sousa, Nuno Roma, Pedro Tomás |
Edition | [1st ed. 2021.] |
Publication/distribution/printing | Cham : Springer International Publishing : Imprint: Springer, 2021 |
Physical description | 1 online resource (652 pages) |
Discipline | 004.35 |
Series | Theoretical Computer Science and General Issues |
Topical subject | Software engineering; Computer engineering; Computer networks; Compilers (Computer programs); Computers; Operating systems (Computers); Software Engineering; Computer Engineering and Networks; Compilers and Interpreters; Computer Hardware; Operating Systems |
ISBN | 3-030-85665-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Compilers, Tools and Environments -- ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning -- Automatic low-overhead load-imbalance detection in MPI applications -- Performance and Power Modeling, Prediction and Evaluation -- Trace-driven Workload Generation and Execution -- Bilas Update on the Asymptotic Optimality of LPT -- E2EWatch: An End-to-end Anomaly Diagnosis Framework for Production HPC Systems -- Scheduling and Load Balancing -- Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing -- A Fixed-Parameter Algorithm for Scheduling Unit dependent Tasks with Unit Communication Delays -- Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers -- Taming Tail Latency in Key-Value Stores: a Scheduling Perspective -- A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource -- An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures -- Pipelined Model Parallelism: Complexity Results and Memory Considerations -- Data Management, Analytics and Machine Learning -- Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization -- A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks -- Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs -- Smart Distributed Data Sets for Stream Processing -- Cluster, Cloud and Edge Computing -- Colony: Parallel Functions as a Service on the Cloud-Edge Continuum -- Horizontal Scaling in Cloud using Contextual Bandits -- Geo-Distribute Cloud Application at the Edge -- A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-based Spot GPU Instances -- Sustaining Performance While Reducing Energy Consumption: A Control Theory Approach -- Theory and Algorithms for Parallel and Distributed Processing -- Algorithm design for Tensor Units -- A Scalable Approximation Algorithm for Weighted Longest Common Subsequence -- TSL Queue: An Efficient Lock-free Design for Priority Queues -- G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU -- Parallel and Distributed Programming, Interfaces, and Languages -- Accelerating Graph Applications Using Phased Transactional Memory -- Efficient GPU Computation using Task Graph Parallelism -- Towards High Performance Resilience using Performance Portable Abstractions -- Enhancing Load-Balancing of MPI Applications with Workshare -- Particle-In-Cell Simulation using Asynchronous Tasking -- Multicore and Manycore Parallelism -- Exploiting co-execution with one API: heterogeneity from a modern perspective -- Parallel Numerical Methods and Applications -- Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems -- Fault-tolerant LU factorization is low cost -- Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs -- Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic Levenberg-Marquardt Method -- GPU Accelerated Mahalanobis-average Hierarchical Clustering Analysis -- High performance architectures and accelerators -- PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy -- Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware. |
Record No. | UNINA-9910495224803321 |
Find it here: Univ. Federico II | ||
|