LEADER 06694nam 22007335 450 001 996464495103316 005 20230330055700.0 010 $a3-030-85665-8 024 7 $a10.1007/978-3-030-85665-6 035 $a(CKB)5600000000003481 035 $a(MiAaPQ)EBC6714628 035 $a(Au-PeEL)EBL6714628 035 $a(DE-He213)978-3-030-85665-6 035 $a(PPN)257350802 035 $a(EXLCZ)995600000000003481 100 $a20210828d2021 u| 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aEuro-Par 2021: Parallel Processing$b[electronic resource] $e27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings /$fedited by Leonel Sousa, Nuno Roma, Pedro Tomás 205 $a1st ed. 2021. 210 1$aCham :$cSpringer International Publishing :$cImprint: Springer,$d2021. 215 $a1 online resource (652 pages) 225 1 $aTheoretical Computer Science and General Issues,$x2512-2029 ;$v12820 311 $a3-030-85664-X 320 $aIncludes bibliographical references and index. 327 $aCompilers, Tools and Environments -- ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning -- Automatic low-overhead load-imbalance detection in MPI applications -- Performance and Power Modeling, Prediction and Evaluation -- Trace-driven Workload Generation and Execution -- Bilas Update on the Asymptotic Optimality of LPT -- E2EWatch: An End-to-end Anomaly Diagnosis Framework for Production HPC Systems -- Scheduling and Load Balancing -- Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing -- A Fixed-Parameter Algorithm for Scheduling Unit dependent Tasks with Unit Communication Delays -- Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers -- Taming Tail Latency in Key-Value Stores: a Scheduling Perspective -- A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource -- An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures -- Pipelined Model Parallelism: Complexity Results and Memory 
Considerations -- Data Management, Analytics and Machine Learning -- Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization -- A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks -- Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs -- Smart Distributed Data Sets for Stream Processing -- Cluster, Cloud and Edge Computing -- Colony: Parallel Functions as a Service on the Cloud-Edge Continuum -- Horizontal Scaling in Cloud using Contextual Bandits -- Geo-Distribute Cloud Application at the Edge -- A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-based Spot GPU Instances -- Sustaining Performance While Reducing Energy Consumption: A Control Theory Approach -- Theory and Algorithms for Parallel and Distributed Processing -- Algorithm design for Tensor Units -- A Scalable Approximation Algorithm for Weighted Longest Common Subsequence -- TSL Queue: An Efficient Lock-free Design for Priority Queues -- G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU -- Parallel and Distributed Programming, Interfaces, and Languages -- Accelerating Graph Applications Using Phased Transactional Memory -- Efficient GPU Computation using Task Graph Parallelism -- Towards High Performance Resilience using Performance Portable Abstractions -- Enhancing Load-Balancing of MPI Applications with Workshare -- Particle-In-Cell Simulation using Asynchronous Tasking -- Multicore and Manycore Parallelism -- Exploiting co-execution with one API: heterogeneity from a modern perspective -- Parallel Numerical Methods and Applications -- Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems -- Fault-tolerant LU factorization is low cost -- Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs -- Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic 
Levenberg-Marquardt Method -- GPU Accelerated Mahalanobis-average Hierarchical Clustering Analysis -- High performance architectures and accelerators -- PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy -- Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware. 330 $aThis book constitutes the proceedings of the 27th International Conference on Parallel and Distributed Computing, Euro-Par 2021, held in Lisbon, Portugal, in August 2021. The conference was held virtually due to the COVID-19 pandemic. The 38 full papers presented in this volume were carefully reviewed and selected from 136 submissions. They deal with parallel and distributed computing in general, focusing on compilers, tools and environments; performance and power modeling, prediction and evaluation; scheduling and load balancing; data management, analytics and machine learning; cluster, cloud and edge computing; theory and algorithms for parallel and distributed processing; parallel and distributed programming, interfaces, and languages; parallel numerical methods and applications; and high performance architecture and accelerators. 410 0$aTheoretical Computer Science and General Issues,$x2512-2029 ;$v12820 606 $aSoftware engineering 606 $aComputer engineering 606 $aComputer networks 606 $aCompilers (Computer programs) 606 $aComputers 606 $aOperating systems (Computers) 606 $aSoftware Engineering 606 $aComputer Engineering and Networks 606 $aCompilers and Interpreters 606 $aComputer Hardware 606 $aOperating Systems 615 0$aSoftware engineering. 615 0$aComputer engineering. 615 0$aComputer networks. 615 0$aCompilers (Computer programs). 615 0$aComputers. 615 0$aOperating systems (Computers). 615 14$aSoftware Engineering. 615 24$aComputer Engineering and Networks. 615 24$aCompilers and Interpreters. 615 24$aComputer Hardware. 615 24$aOperating Systems. 
676 $a004.35 702 $aSousa$b Leonel 702 $aRoma$b Nuno 702 $aTomás$b Pedro 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a996464495103316 996 $aEuro-Par 2021: Parallel Processing$93091284 997 $aUNISA