LEADER 13632nam 22007455 450 001 996466292803316 005 20230223001420.0 010 $a3-030-34356-1 024 7 $a10.1007/978-3-030-34356-9 035 $a(CKB)4100000009939718 035 $a(DE-He213)978-3-030-34356-9 035 $a(MiAaPQ)EBC6284078 035 $a(MiAaPQ)EBC5989005 035 $a(Au-PeEL)EBL5989005 035 $a(OCoLC)1134073315 035 $a(PPN)242818404 035 $a(EXLCZ)994100000009939718 100 $a20191202d2019 u| 0 101 0 $aeng 135 $aurnn|008mamaa 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aHigh Performance Computing$b[electronic resource] $eISC High Performance 2019 International Workshops, Frankfurt, Germany, June 16-20, 2019, Revised Selected Papers /$fedited by Michèle Weiland, Guido Juckeland, Sadaf Alam, Heike Jagode 205 $a1st ed. 2019. 210 1$aCham :$cSpringer International Publishing :$cImprint: Springer,$d2019. 215 $a1 online resource (XXV, 659 p. 402 illus., 239 illus. in color.) 225 1 $aTheoretical Computer Science and General Issues,$x2512-2029 ;$v11887 311 $a3-030-34355-3 327 $aIntro -- Preface -- Organization -- Short Papers -- Preface to the First International Workshop on Legacy Software Refactoring for Performance -- P^3MA Workshop 2019 -- 4th International Workshop on In Situ Visualization (WOIV'19) -- Contents -- On the Use of Kernel Bypass Mechanisms for High-Performance Inter-container Communications -- 1 Introduction -- 2 Overview of Compared Solutions -- 3 Experimental Results -- 4 Related Work -- 5 Conclusions and Future Work -- References -- Continuous-Action Reinforcement Learning for Memory Allocation in Virtualized Servers -- 1 Introduction -- 2 Background -- 2.1 Memory Management in Virtualized Nodes -- 2.2 Reinforcement Learning: Markov Decision Process -- 3 CAVMem: Algorithm for Virtualized Memory Management -- 3.1 Decentralized Strategy for Memory Management -- 3.2 Formulating the Problem as an MDP -- 4 Experimental Framework -- 5 Results for Evaluation -- 5.1 Results for Scenario 1 -- 5.2 Results for Scenario 2 -- 5.3 Results for Scenario 3 -- 5.4 Discussion -- 6 
Related Work -- 7 Conclusions and Future Work -- References -- Container Orchestration on HPC Clusters -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Kubernetes -- 3.2 Kubernetes Deployment -- 4 Implementation -- 4.1 General Approach -- 4.2 Kubernetes Cluster Deployment -- 4.3 HPC Worker Node Software Prerequisites -- 4.4 Networking -- 4.5 GE Worker Setup and Tear down -- 4.6 Kubernetes Cluster Configuration -- 5 Evaluation -- 6 Discussion -- 7 Conclusion and Future Work -- References -- Data Pallets: Containerizing Storage for Reproducibility and Traceability -- 1 Introduction -- 2 Related Work -- 3 Design -- 3.1 Design and Implementation Challenges -- 3.2 Design and Implementation Details -- 3.3 Integration with Sandia Analysis Workbench (SAW) -- 4 Measurements -- 4.1 Time Overheads -- 4.2 Space Overheads -- 4.3 Discussion. 327 $a5 Integration with Sandia Analysis Workbench -- 6 Conclusions and Future Work -- References -- Sarus: Highly Scalable Docker Containers for HPC Systems -- 1 Introduction -- 2 Related Work -- 3 Sarus -- 3.1 Sarus Architecture -- 3.2 Container Creation -- 4 Extending Sarus with OCI Hooks -- 4.1 Native MPICH-Based MPI Support (H1) -- 4.2 NVIDIA GPU Support (H2) -- 4.3 SSH Connection Within Containers (H3) -- 4.4 Slurm Scheduler Synchronization (H4) -- 5 Performance Evaluation -- 5.1 Scientific Applications -- 6 Conclusions -- References -- Singularity GPU Containers Execution on HPC Cluster -- 1 Introduction -- 2 Singularity GPU Containers Building and Running -- 3 Benchmark -- 3.1 Systems Description -- 3.2 Test Case 1: Containerized Tensorflow Execution on GALILEO Versus Official Tensorflow Performance Data -- 3.3 Test Case 2: Containerized Versus Bare Metal Execution on GALILEO -- 4 Conclusion -- References -- A Multitenant Container Platform with OKD, Harbor Registry and ELK -- 1 Introduction -- 2 Past -- 2.1 Background -- 2.2 Challenges -- 3 Present -- 3.1 Evaluation of Container Orchestration Frameworks -- 3.2 
Observability: Logging and OKD -- 3.3 Observability: Monitoring and OKD -- 4 Future -- 4.1 Monitoring -- 4.2 Container Policy and OKD -- 4.3 Gitops and OKD -- 4.4 Continuous Delivery in OKD -- 4.5 OKD in the Cloud -- 5 Conclusion -- References -- Enabling GPU-Enhanced Computer Vision and Machine Learning Research Using Containers -- 1 Introduction -- 2 Defining the Base Container -- 2.1 System Setup: Ubuntu, CUDA, Docker, Nvidia-Docker -- 2.2 Docker and Container Runtime -- 2.3 TensorFlow -- 2.4 OpenCV -- 2.5 Cuda_tensorflow_opencv -- 3 Using the Base Container -- 3.1 Testing Code from a Bash Terminal -- 3.2 Integrating Darknet and Yolo V3 Python Bindings -- 4 Conclusion -- References. 327 $aSoftware and Hardware Co-design for Low-Power HPC Platforms -- 1 Introduction -- 2 Network Interface Primitives -- 3 HPC Prototype -- 4 User-Level Communication Library -- 5 MPI Implementation over the Proposed Architecture -- 6 Conclusions and Future Work -- References -- Modernizing Titan2D, a Parallel AMR Geophysical Flow Code to Support Multiple Rheologies and Extendability -- 1 Introduction -- 2 Titan2D and Benchmark Problem -- 3 Refactoring Strategies -- 3.1 Adopting a Python Interface -- 3.2 Merging Multiple Forks -- 3.3 Changing Data Layout for Modern CPU Architectures -- 3.4 Efficient Indexing for Elements/Nodes Addressing -- 3.5 Introducing OpenMP and Hybrid OpenMP/MPI Parallelization -- 4 Performance Improvement Evaluation -- 5 Conclusions and Future Plans -- References -- Asynchronous AMR on Multi-GPUs -- 1 Introduction -- 2 Execution on Heterogeneous Architectures -- 2.1 Data Model and CPU-GPU Communication -- 2.2 Scheduling on Heterogeneous Architectures -- 2.3 API -- 2.4 Multi-GPU Support -- 3 Evaluation -- 4 Conclusions -- References -- Batch Solution of Small PDEs with the OPS DSL -- 1 Introduction -- 2 The OPS DSL -- 3 Batching Support in OPS -- 3.1 Extending the Abstraction -- 3.2 Execution Schedule Transformation -- 3.3 Data Layout Transformation 
-- 3.4 Alternating Direction Implicit Solver -- 4 Evaluation -- 4.1 The Application -- 4.2 Experimental Set-Up -- 4.3 Results -- 5 Conclusions -- References -- Scalable Parallelization of Stencils Using MODA -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 MODA and User-Defined Indices -- 3.2 Using GGDML Indices -- 3.3 Communication Identification -- 4 Evaluation -- 4.1 Test Application -- 4.2 Test System -- 4.3 Experiments -- 5 Summary -- References -- Comparing High Performance Computing Accelerator Programming Models -- 1 Introduction -- 2 Motivation -- 3 Related Work. 327 $a4 Analysis -- 5 Discussion -- 5.1 BT Benchmark -- 5.2 SP Benchmark -- 5.3 LBM Benchmark -- 5.4 LBDC Benchmark -- 6 Conclusion -- References -- Tracking User-Perceived I/O Slowdown via Probing -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Probing -- 3.2 Data Reduction Using Statistics -- 3.3 Computing the Slowdown -- 4 Evaluation -- 4.1 Test Systems -- 4.2 Probing Tool -- 4.3 Timeseries of Individual Measurements -- 4.4 Host Variability -- 4.5 Understanding Application Behavior - The IO-500 -- 4.6 Long-Period -- 4.7 Slowdown -- 5 Conclusion -- References -- A Quantitative Approach to Architecting All-Flash Lustre File Systems -- 1 Introduction -- 2 Methods -- 3 File System Capacity -- 4 Drive Endurance -- 5 Metadata Configuration -- 5.1 MDT Capacity Required by DOM -- 5.2 MDT Capacity Required for Inodes -- 5.3 Overall MDT Capacity -- 6 Conclusion -- References -- MBWU: Benefit Quantification for Data Access Function Offloading -- 1 Introduction -- 2 The MBWU-Based Methodology -- 2.1 Background -- 2.2 What Is MBWU -- 2.3 How to Measure MBWU(s) -- 2.4 Evaluation Prototype -- 3 Evaluation -- 3.1 Infrastructure -- 3.2 Test Setup and Results -- 4 Related Work -- 5 Conclusion -- References -- Footprinting Parallel I/O - Machine Learning to Classify Application's I/O Behavior -- 1 Introduction -- 2 Related Work -- 3 DKRZ Monitoring -- 3.1 Metrics -- 4 Methodology -- 5 Test 
Data -- 5.1 Data Preparation -- 6 Evaluation -- 6.1 I/O Behavior Classification -- 6.2 Footprinting -- 7 Manual Identification of I/O Intensive Jobs -- 8 Summary and Conclusion -- References -- Adventures in NoSQL for Metadata Management -- 1 Introduction -- 2 Related Work -- 3 Metadata Model -- 3.1 Basic Metadata -- 3.2 Custom Metadata -- 4 Design -- 4.1 What Has the Right Features to Be Worth Testing? -- 4.2 What Is It Going to Take to Get It All Working at All?. 327 $a4.3 Can We Make Our Queries Work with Any Performance? -- 4.4 Battle Scars and Lessons for Our Next Battle Against Scale Out Computing Tools -- 5 Evaluation -- 5.1 Insert Time -- 5.2 Query Time -- 6 Conclusion and Future Work -- References -- Towards High Performance Data Analytics for Climate Change -- 1 Introduction -- 2 Main Challenges -- 3 The Ophidia Project -- 3.1 Multi-dimensional Storage Model -- 3.2 Array-Based Primitives and Parallel Operators -- 4 Benchmark and Experimental Results -- 4.1 Benchmark Definition -- 4.2 Test Environment -- 4.3 Experimental Results and Discussion -- 5 Related Work -- 6 Conclusions -- References -- An Architecture for High Performance Computing and Data Systems Using Byte-Addressable Persistent Memory -- 1 Introduction -- 2 Persistent Memory -- 2.1 Data Access -- 2.2 B-APM Modes of Operation -- 2.3 Non-volatile Memory Software Ecosystem -- 3 Opportunities for Exploiting B-APM for Computational Simulations and Data Analytics -- 3.1 Potential Caveats -- 4 Systemware Architecture -- 4.1 Job Scheduler -- 4.2 Data Scheduler -- 5 Performance Evaluation -- 6 Related Work -- 7 Summary -- References -- Mediating Data Center Storage Diversity in HPC Applications with FAODEL -- 1 Introduction -- 2 FAODEL Background -- 2.1 Kelpie -- 2.2 I/O Management (IOM) Modules -- 3 Mediating Storage Using Kelpie Object Naming -- 3.1 Kelpie Architectural Considerations -- 3.2 Annotating the Kelpie Namespace -- 3.3 Service-Initiated Mediation -- 3.4 Performance Considerations -- 4 
Related Work -- 5 Conclusion -- References -- Predicting File Lifetimes with Machine Learning -- 1 Introduction -- 2 Specifying the Problem and Building the Models -- 2.1 Problem Specification -- 2.2 Dataset -- 2.3 Data Preprocessing -- 2.4 Models -- 3 Results -- 3.1 Evaluation Methodology -- 3.2 Training Times and Model Sizes -- 3.3 Accuracy. 327 $a3.4 Error and Accuracy Distribution. 330 $aThis book constitutes the refereed post-conference proceedings of 13 workshops held at the 34th International ISC High Performance 2019 Conference, in Frankfurt, Germany, in June 2019: HPC I/O in the Data Center (HPC-IODC), Workshop on Performance & Scalability of Storage Systems (WOPSSS), 13th Workshop on Virtualization in High-Performance Cloud Computing (VHPC '18), 3rd International Workshop on In Situ Visualization: Introduction and Applications, ExaComm: Fourth International Workshop on Communication Architectures for HPC, Big Data, Deep Learning and Clouds at Extreme Scale, International Workshop on OpenPOWER for HPC (IWOPH18), IXPUG Workshop: Many-core Computing on Intel Processors: Applications, Performance and Best-Practice Solutions, Workshop on Sustainable Ultrascale Computing Systems, Approximate and Transprecision Computing on Emerging Technologies (ATCET), First Workshop on the Convergence of Large Scale Simulation and Artificial Intelligence, 3rd Workshop for Open Source Supercomputing (OpenSuCo), First Workshop on Interactive High-Performance Computing, Workshop on Performance Portable Programming Models for Accelerators (P^3MA). The 48 full papers included in this volume were carefully reviewed and selected. They cover all aspects of research, development, and application of large-scale, high performance experimental and commercial systems. 
Topics include HPC computer architecture and hardware; programming models, system software, and applications; solutions for heterogeneity, reliability, power efficiency of systems; virtualization and containerized environments; big data and cloud computing; and artificial intelligence. 410 0$aTheoretical Computer Science and General Issues,$x2512-2029 ;$v11887 606 $aComputer engineering 606 $aComputer networks 606 $aSoftware engineering 606 $aComputers 606 $aComputer Engineering and Networks 606 $aSoftware Engineering 606 $aComputer Hardware 606 $aComputing Milieux 615 0$aComputer engineering. 615 0$aComputer networks. 615 0$aSoftware engineering. 615 0$aComputers. 615 14$aComputer Engineering and Networks. 615 24$aSoftware Engineering. 615 24$aComputer Hardware. 615 24$aComputing Milieux. 676 $a004.3 702 $aWeiland$b Michèle$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aJuckeland$b Guido$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aAlam$b Sadaf$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aJagode$b Heike$4edt$4http://id.loc.gov/vocabulary/relators/edt 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a996466292803316 996 $aHigh Performance Computing$93000244 997 $aUNISA