1.

Record No.

UNINA9910878979503321

Author

Huang De-Shuang

Title

Advanced Intelligent Computing Technology and Applications : 20th International Conference, ICIC 2024, Tianjin, China, August 5–8, 2024, Proceedings, Part XII / edited by De-Shuang Huang, Yijie Pan, Jiayang Guo

Publication/distribution/printing

Singapore : Springer Nature Singapore : Imprint: Springer, 2024

ISBN

9789819756155

9789819756148

Edition

[1st ed. 2024.]

Physical description

1 online resource (536 pages)

Series

Lecture Notes in Computer Science, 1611-3349 ; 14873

Discipline

006.3

Subjects

Computational intelligence

Machine learning

Computer networks

Application software

Computational Intelligence

Machine Learning

Computer Communication Networks

Computer and Information Systems Applications

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Contents note

Intro -- Preface -- Organization -- Contents - Part XII -- Intelligent Computing in Computer Vision -- A 6-DoF Grasping Network Using Feature Augmentation for Novel Domain Generalization -- 1 Introduction -- 2 Methodology -- 2.1 Gaussian Noise Mix -- 2.2 Resblock Module -- 2.3 Local Features Interpolation -- 3 Experiments -- 3.1 Comparison with the State-of-the-Art -- 3.2 Generalization Analysis of Novel Domain -- 3.3 Visualization -- 3.4 Ablation Study -- 3.5 Practical Evaluation -- 4 Conclusion -- References -- TC-YOLO: Enhanced Vehicle Detection Approach for Traffic Surveillance Cameras Based on YOLOv8 -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Network Structure -- 3.2 Deformable Convolution for Enhancing Spatial Deformation Adaptability -- 3.3 Global Attention is Used to Enhance Cross-Dimensional Interaction Features -- 3.4 An Enhanced Detecting Head -- 4 Experiments -- 4.1 Experimental Dataset -- 4.2 Experimental Environment and Configuration -- 4.3 Evaluation Metrics -- 4.4 Algorithm Comparison -- 4.5 Ablation Study -- 5 Conclusion -- References -- MineDet: A Real-Time Object Detection Framework Based Neural Architecture Search for Coal Mines -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection Based on NAS -- 2.2 Lightweight Model Design -- 3 Method -- 3.1 The Reparameterization Technique -- 3.2 Efficient Search Space -- 3.3 Search Algorithm -- 4 Experimental -- 4.1 Dataset and Implementation Details -- 4.2 Experimental Results -- 5 Conclusion -- References -- Multi-gait Synthesis Based on Convolutional Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Multi-gait Dataset -- 2.2 2D and 3D Convolution -- 2.3 Image Synthesis -- 2.4 Encoder and Decoder -- 2.5 Gait Recognition -- 3 Method -- 3.1 CNN Block -- 3.2 Encoder -- 3.3 Feature Merging -- 3.4 Decoder -- 3.5 Optimization Strategy -- 4 Experiment.

4.1 Datasets -- 4.2 Single Frame and Multi Frame -- 4.3 Gait Recognition and Similarity Comparison -- 5 Summary -- References -- Controlling Attention Map Better for Text-Guided Image Editing Diffusion Models -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 3.1 Diffusion Models -- 3.2 Inversion Methods -- 3.3 Attention Control Methods -- 4 Methodology -- 4.1 Motivation -- 4.2 Integrate Attention Control -- 5 Experiments -- 5.1 Benchmark -- 5.2 Implementation Details -- 5.3 Results -- 5.4 Ablation Study -- 6 Conclusion and Future Work -- References -- Spatial Group and Cross-Channel Attention: Make Smaller Models More Effective, Focus on High-Level Semantic Features -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Spatial Group and Cross-Channel Attention -- 3.2 Visualization and Interpretation -- 4 Experiments on Image Classification -- 4.1 Implementation Details -- 4.2 Image Classification -- 4.3 Parameter Experiment -- 5 Conclusion -- References -- YOLO-BS: A Better Object Detection Model for Real-Time Driver Behavior Detection -- 1 Introduction -- 2 Method -- 2.1 EVITS Module -- 2.2 ASPPMP Module -- 3 Experiments -- 3.1 Implementation Details -- 3.2 Datasets -- 3.3 Experimental Results -- 4 Conclusion -- References -- Fusion Attention Graph Convolutional Network with Hyperskeleton for UAV Action Recognition -- 1 Introduction -- 2 Proposed FA-GCN Method -- 2.1 The Network Architecture -- 2.2 Spatiotemporal Channel Fusion Attention Mechanism -- 2.3 Hyperskeleton Features -- 2.4 Gaussian Center Enhanced Interpolation Strategy -- 3 Experiments -- 3.1 Datasets and Experimental Setup Details -- 3.2 Ablation Studies and Comparative Analysis -- 3.3 Comparison with the State-of-the-Art -- 4 Conclusion -- References -- Enhancing Adversarial Robustness for Deep Metric Learning via Attention-Aware Knowledge Guidance -- 1 Introduction.

2 Related Work -- 3 Proposed Method -- 3.1 Preliminaries -- 3.2 Adversarial Attention-Aware Knowledge Guidance -- 3.3 Benign Attention-Aware Knowledge Guidance -- 3.4 Optimization -- 4 Experiments -- 4.1 Experiment Setup -- 4.2 Detailed Robustness Evaluation -- 5 Ablation and Discussions -- 5.1 Loss Function -- 5.2 Training Interval -- 5.3 Weak Robustness Subnet Width -- 5.4 Attention-Aware Knowledge Guidance -- 6 Conclusion -- References -- IMFA-Stereo: Domain Generalized Stereo Matching via Iterative Multimodal Feature Aggregation Cost Volume -- 1 Introduction -- 2 Related Work -- 2.1 Cost Filtering-Based Methods -- 2.2 Iterative Methods -- 3 Method -- 3.1 Multi-scale Feature Extractor -- 3.2 Initial Disparity Estimation -- 3.3 Aggregated Cost Volume -- 3.4 ConvGRU-Based Updater -- 3.5 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Ablation Study -- 4.3 Comparisons with State-of-the-Art -- 4.4 Cross-Domain Generalization -- 5 Conclusion -- References -- Anomaly Behavior Detection in Crowd via Lightweight 3D Convolution -- 1 Introduction -- 2 Method -- 2.1 Overall Framework -- 2.2 Channel-Only Polarized Self-attention -- 2.3 3D Separable Convolution -- 2.4 Truncated Singular Value Decomposition -- 3 Experiments and Analysis -- 3.1 Experimental Datasets and Preparation -- 3.2 Evaluation on Hajjv2 -- 3.3 Ablation Study -- 3.4 Validation on Benchmarks -- 3.5 Parameter Comparison and Results -- 4 Conclusion -- References -- Generating Graph-Based Rules for Enhancing Logical Reasoning -- 1 Introduction -- 2 Related Work -- 2.1 GNNs on Knowledge Graphs -- 2.2 Logical Rule Mining -- 3 Preliminary -- 4 Method -- 4.1 Graph-Based Rule Generator (GRG) -- 4.2 Subgraph Reasoning Module (SRM) -- 4.3 Loss Function -- 5 Experiments -- 5.1 Experiment Setup -- 5.2 Comparisons with Other Approaches -- 5.3 Ablation Studies.

5.4 Hyperparameter Analysis -- 5.5 Visualization Experiments -- 6 Conclusions -- References -- YOLO-PR: Multi Pose Object Detection Method for Underground Coal Mine -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Backbone Network Incorporating EPA Modules -- 3.2 Neck Network Integrating RFB Modules -- 3.3 Loss Function Based on PioU V2 -- 4 Experiments -- 4.1 Datasets and Evaluation Indicators -- 4.2 Result Analysis and Ablation Experiment -- 5 Conclusion -- References -- DSMENet: A Road Segmentation Network Based on Dual-Branch Dynamic Snake Convolutional Encoding and Multi-modal Information Iterative Enhancement -- 1 Introduction -- 2 Method -- 2.1 Overall Architecture -- 3 Dynamic Snake Convolution -- 3.1 Multi-modal Feature Fusion Module -- 3.2 Multi-modal Information Iterative Enhancement Module -- 4 Experiment -- 4.1 Datasets and Experimental Setup -- 4.2 Comparative Experiments -- 4.3 Ablation Study -- 5 Conclusion -- References -- MPRNet: Multi-scale Pointwise Regression Network for Crowd Counting and Localization -- 1 Introduction -- 2 Related Works -- 3 Proposed Approach -- 3.1 Overall Counting and Localization Workflow -- 3.2 Multi-Scale Feature Extractor -- 3.3 Regional Maximum Substitution -- 3.4 One-to-One Points Matching -- 3.5 Training Objective -- 4 Experiment -- 4.1 Datasets and Configurations -- 4.2 Evaluation Metrics and Results -- 4.3 Ablation Studies -- 5 Conclusion -- References -- Text-to-Image Generation with Multiscale Semantic Context-Aware Generative Adversarial Networks -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Model Overview -- 3.2 Semantic Adaptive Affine Fusion -- 3.3 CrossBlock Context Aware Encoding -- 3.4 Objective Function -- 4 Experiment -- 4.1 Quantitative Results -- 4.2 Qualitative Results -- 4.3 Ablation Studies -- 5 Future Work -- 6 Conclusion -- References.

CHMF: Colorful Human Reconstruction Based on Mesh Features -- 1 Introduction -- 2 Related Work -- 2.1 3D Human Color Estimation -- 2.2 3D Object Features Extraction -- 3 Method -- 3.1 Color Features Extraction and Mapping -- 3.2 Structural Features Extraction and Color Features Repair -- 3.3 Shape Features Extraction and Transformation -- 3.4 Features Decoding and Loss Functions -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Qualitative and Quantitative Comparisons -- 4.3 Ablation Study -- 4.4 Limitations -- 5 Conclusion -- References -- Face Swapping via Reverse Contrastive Learning and Explicit Identity-Attribute Disentanglement -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Reverse Contrastive Learning -- 3.2 Information Disentanglement -- 3.3 Loss Functions -- 4 Experiments -- 4.1 Experience Details -- 4.2 Comparison with Other Methods -- 4.3 Analysis of RCLSwap -- 5 Conclusion -- References -- OSFENet: Object Spatiotemporal Feature Enhanced Network for Surgical Phase Recognition -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Surgical Tool Alignment -- 3.2 Spatial Feature Encoder -- 3.3 Object Spatial Feature Enhanced Module -- 3.4 Object Temporal Feature Enhanced Module -- 3.5 Fusion Module -- 3.6 Loss Function -- 4 Experiments -- 4.1 Dataset -- 4.2 Experimental Settings -- 4.3 Online Surgical Phase Recognition Results -- 4.4 Offline Surgical Phase Recognition Results -- 4.5 Ablation Study -- 4.6 Qualitative Analysis -- 5 Conclusion -- References -- A Reinforced Passage Interactive Retrieval Framework Incorporating Implicit Knowledge for KB-VQA -- 1 Introduction -- 2 Related Work -- 2.1 Retrieval-Based Visual Question Answering Method -- 2.2 Large-Scale Model-Based Visual Question Answering Method -- 3 Methods -- 3.1 Implicit Knowledge-Driven Explicit Knowledge Retrieval -- 3.2 Passage Self-interaction -- 3.3 Model Training.

3.4 Retriever-Reader Generation.

Summary/abstract

This 13-volume set LNCS 14862-14874 constitutes, in conjunction with the 6-volume set LNAI 14875-14880 and the two-volume set LNBI 14881-14882, the refereed proceedings of the 20th International Conference on Intelligent Computing, ICIC 2024, held in Tianjin, China, during August 5-8, 2024. A total of 863 regular papers were carefully reviewed and selected from 2189 submissions. This year, the conference concentrated mainly on the theories and methodologies as well as the emerging applications of intelligent computing. Its aim was to unify the picture of contemporary intelligent computing techniques as an integral concept that highlights the trends in advanced computational intelligence and bridges theoretical research with applications. Therefore, the theme for this conference was "Advanced Intelligent Computing Technology and Applications". Papers that focused on this theme were solicited, addressing theories, methodologies, and applications in science and technology.