LEADER 09187nam 2200505 450 001 9910559394303321 005 20231110224154.0 010 $a3-031-04209-3 035 $a(MiAaPQ)EBC6951414 035 $a(Au-PeEL)EBL6951414 035 $a(CKB)21502475700041 035 $a(PPN)26216857X 035 $a(EXLCZ)9921502475700041 100 $a20221117d2022 uy 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 00$aHigh performance computing $e8th Latin American conference, CARLA 2021, Guadalajara, Mexico, October 6-8, 2021, revised selected papers /$fedited by Isidoro Gitler, Carlos Jaime Barrios Hernández and Esteban Meneses 210 1$aCham, Switzerland :$cSpringer,$d[2022] 210 4$d©2022 215 $a1 online resource (273 pages) 225 1 $aCommunications in Computer and Information Science ;$vv.1540 311 08$aPrint version: Gitler, Isidoro High Performance Computing Cham : Springer International Publishing AG,c2022 9783031042089 320 $aIncludes bibliographical references and index. 327 $aIntro -- Preface -- Organization -- Contents -- High Performance Computing -- TEPUI: High-Performance Computing Infrastructure for Beamlines at LNLS/Sirius -- 1 Introduction -- 2 Cluster Subsystems -- 2.1 High-Performance Computing -- 2.2 Storage -- 3 System Access -- 3.1 Scheduling System -- 4 Monitoring System -- 4.1 Infrastructure -- 5 Conclusion -- References -- Energy Consumption Studies of WRF Executions with the LIMITLESS Monitor -- 1 Introduction -- 2 WRF: The Standard for Weather Simulations -- 3 The LIMITLESS Monitor -- 4 Methodology -- 5 Results -- 6 Conclusions and Future Work -- References -- Improving Performance of Long Short-Term Memory Networks for Sentiment Analysis Using Multicore and GPU Architectures -- 1 Introduction -- 2 Related Work -- 3 Implementation -- 3.1 Technical Specifications -- 3.2 Training Data Pre-processing -- 3.3 Model and Training -- 4 Results -- 4.1 Accuracy of the Sentiment Analysis Model -- 4.2 Performance Improvements -- 4.3 Proposal Validation -- 5 Conclusion and Future Works -- References -- A Methodology for Evaluating the
Energy Efficiency of Post-Moore Architectures -- 1 Introduction -- 2 State of the Art -- 3 ACDC Methodology Outline -- 3.1 Description of the ACDC Methodology -- 3.2 Case Study: Implementation of the ACDC Methodology -- 4 Conclusions -- References -- Understanding COVID-19 Epidemic in Costa Rica Through Network-Based Modeling -- 1 Introduction -- 2 Background -- 2.1 Epidemic Model Simulation -- 2.2 Corona++ Simulation Framework -- 2.3 Related Work -- 3 Modeling -- 3.1 Characterization -- 3.2 Calibration and Scenario Setup -- 4 Results -- 4.1 Experimental Setup -- 4.2 Validation -- 4.3 Evaluation of Scenarios -- 5 Conclusions and Further Work -- References -- An Efficient Vectorized Auction Algorithm for Many-Core and Multicore Architectures -- 1 Introduction -- 2 Related Work. 327 $a3 Auction Algorithm -- 4 Vectorization of the Auction Algorithm for Multicore and Many-Core Architecture -- 4.1 Vectorization of the bid Phase -- 4.2 Vectorization of the assign Phase -- 5 Parallelization of the Auction Algorithm -- 6 Experimental Analysis -- 6.1 Experimental Analysis of the Vectorization -- 6.2 Experimental Analysis of the Parallel Vectorization -- 7 Conclusions -- References -- Green Energy HPC Data Centers to Improve Processing Cost Efficiency -- 1 Introduction -- 2 Development -- 3 Results -- 4 Conclusions -- References -- DICE: Generic Data Abstraction for Enhancing the Convergence of HPC and Big Data -- 1 Introduction -- 2 Data Containers -- 2.1 Input Containers -- 2.2 Output Containers -- 2.3 Storage Support for Containers -- 2.4 Interface -- 3 Reduction of I/O Interference -- 4 Evaluation -- 5 Related Work -- 6 Conclusions -- References -- A Comparative Study of Consensus Algorithms for Distributed Systems -- 1 Introduction -- 2 Implementation -- 2.1 Paxos -- 2.2 Raft -- 2.3 Practical Byzantine Fault Tolerance (pBFT) -- 3 Results -- 4 Conclusion -- References -- OCFTL: An MPI Implementation-Independent Fault Tolerance Library for Task-Based Applications -- 1 
Introduction -- 2 Background and Motivation -- 2.1 Failure Detection and Propagation -- 2.2 Failure Mitigation -- 3 Related Work -- 4 An Implementation-Independent Fault Tolerance Library - OCFTL -- 4.1 Failure Detection -- 4.2 Handling Failures -- 4.3 Repairing Communicators -- 4.4 Gathering States -- 4.5 MPI Wrappers -- 4.6 Notification Callbacks -- 5 Experimental Results and Discussion -- 5.1 MPI Behavior -- 5.2 Empirical OCFTL Performance Evaluation -- 5.3 Internal Broadcast -- 5.4 Locality Problem -- 5.5 Checkpointing -- 5.6 Limitations -- 5.7 Stand-Alone Use -- 6 Conclusions and Future Work -- References -- Accelerating Smart City Simulations -- 1 Introduction. 327 $a2 Related Works -- 3 SimEDaPE: Simulation Estimation by Data Patterns Exploration -- 3.1 Time Series Extraction -- 3.2 Clustering -- 3.3 Estimation -- 4 Low-Level Performance Optimizations -- 4.1 Optimizations -- 5 Experimental Results -- 5.1 SimEDaPE -- 5.2 Low-Level Optimizations -- 6 Conclusions -- References -- High Performance Computing and Artificial Intelligence -- Distributed Artificial Intelligent Model Training and Evaluation -- 1 Introduction -- 2 Implementation Auto Tuning for ML Training -- 2.1 Front End -- 2.2 Back End -- 3 Accelerated Prediction/Inference Implementation -- 4 Results -- 4.1 Manual Adjustment Testing -- 4.2 Hyper-parameter Scopes -- 4.3 Model Variations -- 4.4 Inference Results -- 5 Conclusions -- References -- Large-Scale Distributed Deep Learning: A Study of Mechanisms and Trade-Offs with PyTorch -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Deep Learning Neural Network Models -- 3.2 Deep Learning Frameworks -- 3.3 Distributed Deep Learning Training -- 4 Methodology -- 5 Experimental Results -- 5.1 Scalability Study -- 5.2 Accuracy and Scaling Trade-Off -- 5.3 Mixed-Precision Distributed Training -- 5.4 Adaptive Summation in Distributed Training -- 6 Discussion -- 7 Concluding Remarks -- References -- Wind Prediction Using Deep Learning and High 
Performance Computing -- 1 Introduction -- 2 Related Work -- 3 Theoretical Background -- 3.1 Time Series Wind Prediction -- 3.2 Convolutional Networks -- 4 Experiment Setup -- 4.1 Data -- 4.2 Experimental Framework -- 4.3 Technical Assessment and HPC Requirements -- 4.4 Hyper-parameter Tuning and Optimization -- 5 Experimentation -- 5.1 Multiple Input Multiple Output (MIMO) Approach -- 5.2 Common Parameters in all Architectures -- 5.3 Classic Convolutional Architecture -- 5.4 Separable Convolutional Architecture. 327 $a5.5 Adding Skip and Residual Connections to the Models -- 5.6 Multi-head Architectures -- 6 Discussion and Conclusions -- 6.1 MIMO is a Reliable Approach for Deep Networks -- 6.2 Convolutional Networks are Superior to Baselines -- 7 Future Work -- References -- An Analysis of Neural Architecture Search and Hyper Parameter Optimization Methods -- 1 Introduction -- 2 Architecture and Hyper-parameters Optimization Overview -- 3 Optimization Methods -- 3.1 Bayesian Optimization -- 3.2 Population-Based Algorithms -- 3.3 Reinforcement Learning -- 3.4 Multi-objective Optimization -- 4 Occurrence Analysis -- 4.1 Cluster 1: Applications -- 4.2 Cluster 2: Learning -- 4.3 Cluster 3: Multi-objective -- 4.4 Cluster 4 and 5: Optimization Strategies -- 5 Conclusion -- References -- High Performance Computing Applications -- Solving the Heat Transfer Equation by a Finite Difference Method Using Multi-dimensional Arrays in CUDA as in Standard C -- 1 Introduction -- 2 Bidimensional Arrays -- 3 Tridimensional Arrays -- 4 Application to the Non-steady Heat Transport Equation -- 4.1 2D Case -- 4.2 3D Case -- 5 Performance Test -- 6 Conclusions -- References -- High-Throughput of Measure-Preserving Integrators Derived from the Liouville Operator for Molecular Dynamics Simulations on GPUs -- 1 Introduction -- 2 Methodology -- 3 Results -- 3.1 Code Validation -- 3.2 Code Performance -- 4 Discussion -- 5 Conclusions -- References -- An Efficient Parallel Model for Coupled 
Open-Porous Medium Problem Applied to Grain Drying Processing -- 1 Introduction -- 2 Numerical Simulation of Coupled Open-Porous Medium Problem -- 3 Methodology -- 3.1 The Algorithm -- 3.2 OpenMP Implementation -- 3.3 Experimental Setup -- 4 Experimental Results -- 4.1 Numerical Results -- 4.2 Preliminary Performance Analysis -- 4.3 Performance Evaluation -- 5 Conclusion and Future Works -- References. 327 $aAuthor Index. 410 0$aCommunications in Computer and Information Science 606 $aHigh performance computing 615 0$aHigh performance computing. 676 $a004.11 702 $aBarrios Hernandez$b Carlos Jaime 702 $aMeneses$b Esteban 702 $aGitler$b Isidoro 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910559394303321 996 $aHigh Performance Computing$93000244 997 $aUNINA