LEADER 13636nam 22008655 450 001 996587868803316 005 20231224180437.0 010 $a981-9985-49-8 024 7 $a10.1007/978-981-99-8549-4 035 $a(MiAaPQ)EBC31038248 035 $a(Au-PeEL)EBL31038248 035 $a(DE-He213)978-981-99-8549-4 035 $a(EXLCZ)9929451474400041 100 $a20231224d2024 u| 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aPattern Recognition and Computer Vision$b[electronic resource] $e6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part X /$fedited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji 205 $a1st ed. 2024. 210 1$aSingapore :$cSpringer Nature Singapore :$cImprint: Springer,$d2024. 215 $a1 online resource (509 pages) 225 1 $aLecture Notes in Computer Science,$x1611-3349 ;$v14434 311 08$aPrint version: Liu, Qingshan Pattern Recognition and Computer Vision Singapore : Springer Singapore Pte. Limited,c2024 9789819985487 327 $aIntro -- Preface -- Organization -- Contents - Part X -- Neural Network and Deep Learning III -- Dual-Stream Context-Aware Neural Network for Survival Prediction from Whole Slide Images -- 1 Introduction -- 2 Method -- 3 Experiments and Results -- 4 Conclusion -- References -- A Multi-label Image Recognition Algorithm Based on Spatial and Semantic Correlation Interaction -- 1 Introduction -- 2 Related Work -- 2.1 Correlation-Agnostic Algorithms -- 2.2 Spatial Correlation Algorithms -- 2.3 Semantic Correlation Algorithms -- 3 Methodology -- 3.1 Definition of Multi-label Image Recognition -- 3.2 The Framework of SSCI -- 3.3 Loss Function -- 4 Experiments -- 4.1 Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Other Mainstream Algorithms -- 4.4 Evaluation of the SSCI Effectiveness -- 5 Conclusion -- References -- Hierarchical Spatial-Temporal Network for Skeleton-Based Temporal Action Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Action Segmentation -- 2.2 
Skeleton-Based Action Recognition -- 3 Method -- 3.1 Network Architecture -- 3.2 Multi-Branch Transfer Fusion Module -- 3.3 Multi-Scale Temporal Convolution Module -- 3.4 Loss Function -- 4 Experiments -- 4.1 Setup -- 4.2 Effect of Hierarchical Model -- 4.3 Effect of Multiple Modalties -- 4.4 Effect of Multi-modal Fusion Methods -- 4.5 Effect of Multi-Scale Temporal Convolution -- 4.6 Comparision with State-of-the-Art -- 5 Conclusion -- References -- Multi-behavior Enhanced Graph Neural Networks for Social Recommendation -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Methodology -- 4.1 Embedding Layer -- 4.2 Propagation Layer -- 4.3 Multi-behavior Integration Layer -- 4.4 Prediction Layer -- 4.5 Model Training -- 5 Experiments -- 5.1 Experimental Settings -- 5.2 Performance Comparison (RQ1) -- 5.3 Ablation Study (RQ2). 327 $a5.4 Parameter Analysis (RQ3) -- 6 Conclusion and Future Work -- References -- A Complex-Valued Neural Network Based Robust Image Compression -- 1 Introduction -- 2 Related Works -- 2.1 Neural Image Compression -- 2.2 Adversarial Attack -- 2.3 Complex-Valued Convolutional Neural Networks -- 3 Proposed Method -- 3.1 Overall Framework -- 3.2 Nonlinear Transform -- 4 Experiment Results -- 4.1 Experiment Setup -- 4.2 Results and Comparison -- 4.3 Ablation Study -- 5 Conclusions -- References -- Binarizing Super-Resolution Neural Network Without Batch Normalization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Batch Normalization in SR Models -- 3.2 Channel-Wise Asymmetric Binarizer for Activations -- 3.3 Smoothness-Controlled Estimator -- 4 Experimentation -- 4.1 Experiment Setup -- 4.2 Ablation Study -- 4.3 Visualization -- 5 Conclusion -- References -- Infrared and Visible Image Fusion via Test-Time Training -- 1 Introduction -- 2 Method -- 2.1 Overall Framework -- 2.2 Training and Testing -- 3 Experiments -- 3.1 Experiment Configuration -- 3.2 Performance Comparison on TNO -- 3.3 Performance Comparison on VIFB -- 3.4 
Ablation Study -- 4 Conclusion -- References -- Graph-Based Dependency-Aware Non-Intrusive Load Monitoring -- 1 Introduction -- 2 Proposed Method -- 2.1 Problem Formulation -- 2.2 Co-occurrence Probability Graph -- 2.3 Graph Structure Learning -- 2.4 Graph Attention Neural Network -- 2.5 Encoder-Decoder Module -- 3 Numerical Studies and Discussions -- 3.1 Dataset and Experiment Setup -- 3.2 Metrics and Comparisons -- 4 Conclusion -- References -- Few-Shot Object Detection via Classify-Free RPN -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Object Detection -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Analysis of the Base Class Bias Issue in RPN -- 3.3 Classify-Free RPN. 327 $a4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparison with the State-of-the-Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- IPFR: Identity-Preserving Face Reenactment with Enhanced Domain Adversarial Training and Multi-level Identity Priors -- 1 Introduction -- 2 Methods -- 2.1 Target Motion Encoder and 3D Shape Encoder -- 2.2 3D Shape-Aware Warping Module -- 2.3 Identity-Aware Refining Module -- 2.4 Enhanced Domain Discriminator -- 2.5 Training -- 3 Experiment -- 3.1 Experimental Setup -- 3.2 Comparisons -- 3.3 Ablation Study -- 4 Limitation -- 5 Conclusion -- References -- L2MNet: Enhancing Continual Semantic Segmentation with Mask Matching -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries and Revisiting -- 3.2 Proposed Learn-to-Match Framework -- 3.3 Training Loss -- 4 Experiments -- 4.1 Experimental Setting -- 4.2 Quantitative Evaluation -- 4.3 Ablation Study -- 5 Conclusion -- References -- Adaptive Channel Pruning for Trainability Protection -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Method Framework and Motivation -- 3.2 Channel Similarity Calculation and Trainability Preservation -- 3.3 Sparse Control and Optimization -- 4 Experiments -- 4.1 Experiments Settings and Evaluation Metrics 
-- 4.2 Results on Imagenet -- 4.3 Results on Cifar-10 -- 4.4 Results on YOLOX-s -- 4.5 Ablation -- 5 Conclusion -- References -- Exploiting Adaptive Crop and Deformable Convolution for Road Damage Detection -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Adaptive Image Cropping Based on Vanishing Point Estimation -- 3.2 Feature Learning with Deformable Convolution -- 3.3 Diagonal Intersection over Union Loss Function -- 4 Experiment -- 4.1 Comparative Analysis of Different Datasets -- 4.2 Ablation Analysis -- 5 Conclusion -- References -- Cascaded-Scoring Tracklet Matching for Multi-object Tracking. 327 $a1 Introduction -- 2 Related Work -- 2.1 Tracking by Detection -- 2.2 Joint Detection and Tracking -- 3 Proposed Method -- 3.1 Cascaded-Scoring Tracklet Matching -- 3.2 Motion-Guided Based Target Aware -- 3.3 Appearance-Assisted Feature Warper -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Ablation Studies -- 4.3 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Boosting Generalization Performance in Person Re-identification -- 1 Introduction -- 2 Related Work -- 2.1 Generalizable Person ReID -- 2.2 Vision-Language Learning -- 3 Method -- 3.1 Review of CLIP -- 3.2 A Novel Cross-Modal Framework -- 3.3 Prompt Design Process -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets and Evaluation Protocols -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with State-of-the-Art Methods -- 4.5 Other Analysis -- 5 Conclusion -- References -- Self-guided Transformer for Video Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Video Super-Resolution -- 2.2 Vision Transformers -- 3 Our Method -- 3.1 Network Overview -- 3.2 Multi-headed Self-attention Module Based on Offset-Guided Window (OGW-MSA) -- 3.3 Feature Aggregation (FA) -- 4 Experiments -- 4.1 Datasets and Experimental Settings -- 4.2 Comparisons with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- SAMP: Sub-task Aware Model 
Pruning with Layer-Wise Channel Balancing for Person Search -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Framework Overview -- 3.2 Sub-task Aware Channel Importance Estimation -- 3.3 Layer-Wise Channel Balancing -- 3.4 Adaptive OIM Loss for Model Pruning and Finetuning -- 4 Experimental Results and Analysis -- 4.1 Dataset and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Comparison with the State-of-the-Art Approaches -- 4.4 Ablation Study -- 5 Conclusion. 327 $aReferences -- MKB: Multi-Kernel Bures Metric for Nighttime Aerial Tracking -- 1 Introduction -- 2 Methodology -- 2.1 Kernel Bures Metric -- 2.2 Multi-Kernel Bures Metric -- 2.3 Objective Loss -- 3 Experiments -- 3.1 Implementation Details -- 3.2 Evaluation Datasets -- 3.3 Comparison Results -- 3.4 Visualization -- 3.5 Ablation Study -- 4 Conclusion -- References -- Deep Arbitrary-Scale Unfolding Network for Color-Guided Depth Map Super-Resolution -- 1 Introduction -- 2 The Proposed Method -- 2.1 Problem Formulation -- 2.2 Algorithm Unfolding -- 2.3 Continuous Up-Sampling Fusion (CUSF) -- 2.4 Loss Function -- 3 Experimental Results -- 3.1 Implementation Details -- 3.2 The Quality Comparison of Different DSR Methods -- 3.3 Ablation Study -- 4 Conclusion -- References -- SSDD-Net: A Lightweight and Efficient Deep Learning Model for Steel Surface Defect Detection -- 1 Introduction -- 2 Methods -- 2.1 LMFE: Light Multiscale Feature Extraction Module -- 2.2 SEFF: Simple Effective Feature Fusion Network -- 2.3 SSDD-Net -- 3 Experiments and Analysis -- 3.1 Implementation Details -- 3.2 Evaluation Metrics -- 3.3 Dataset -- 3.4 Ablation Studies -- 3.5 Comparison with Other SOTA Methods -- 3.6 Comprehensive Performance of SSDD-Net -- 4 Conclusion -- References -- Effective Small Ship Detection with Enhanced-YOLOv7 -- 1 Introduction -- 2 Method -- 2.1 Small Object-Aware Feature Extraction Module (SOAFE) -- 2.2 Small Object-Friendly Scale-Insensitive Regression Scheme (SOFSIR) -- 2.3 
Geometric Constraint-Based Non-Maximum Suppression Method (GCNMS) -- 3 Experiments -- 3.1 Experimental Settings -- 3.2 Quantitative Analysis -- 3.3 Ablation Studies -- 3.4 Qualitative Analysis -- 4 Conclusion -- References -- PiDiNeXt: An Efficient Edge Detector Based on Parallel Pixel Difference Networks -- 1 Introduction -- 2 Related Work. 327 $a2.1 The Development of Deep Learning Based Edge Detection. 330 $aThe 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis. 
410 0$aLecture Notes in Computer Science,$x1611-3349 ;$v14434 606 $aImage processing$xDigital techniques 606 $aComputer vision 606 $aArtificial intelligence 606 $aApplication software 606 $aComputer networks 606 $aComputer systems 606 $aMachine learning 606 $aComputer Imaging, Vision, Pattern Recognition and Graphics 606 $aArtificial Intelligence 606 $aComputer and Information Systems Applications 606 $aComputer Communication Networks 606 $aComputer System Implementation 606 $aMachine Learning 615 0$aImage processing$xDigital techniques. 615 0$aComputer vision. 615 0$aArtificial intelligence. 615 0$aApplication software. 615 0$aComputer networks. 615 0$aComputer systems. 615 0$aMachine learning. 615 14$aComputer Imaging, Vision, Pattern Recognition and Graphics. 615 24$aArtificial Intelligence. 615 24$aComputer and Information Systems Applications. 615 24$aComputer Communication Networks. 615 24$aComputer System Implementation. 615 24$aMachine Learning. 676 $a006 700 $aLiu$b Qingshan$01586078 701 $aWang$b Hanzi$0927694 701 $aMa$b Zhanyu$01586079 701 $aZheng$b Weishi$01586080 701 $aZha$b Hongbin$01586081 701 $aChen$b Xilin$01586082 701 $aWang$b Liang$01071990 701 $aJi$b Rongrong$01586083 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a996587868803316 996 $aPattern Recognition and Computer Vision$93872352 997 $aUNISA