LEADER 13891nam 22008775 450 001 9910799206903321 005 20231227072732.0 010 $a981-9985-55-2 024 7 $a10.1007/978-981-99-8555-5 035 $a(CKB)29476200400041 035 $a(DE-He213)978-981-99-8555-5 035 $a(MiAaPQ)EBC31094238 035 $a(Au-PeEL)EBL31094238 035 $a(EXLCZ)9929476200400041 100 $a20231227d2024 u| 0 101 0 $aeng 135 $aur||||||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aPattern Recognition and Computer Vision$b[electronic resource] $e6th Chinese Conference, PRCV 2023, Xiamen, China, October 13-15, 2023, Proceedings, Part XII /$fedited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji 205 $a1st ed. 2024. 210 1$aSingapore :$cSpringer Nature Singapore :$cImprint: Springer,$d2024. 215 $a1 online resource (XIV, 523 p. 203 illus., 194 illus. in color.) 225 1 $aLecture Notes in Computer Science,$x1611-3349 ;$v14436 311 08$a9789819985548 327 $aIntro -- Preface -- Organization -- Contents - Part XII -- Object Detection, Tracking and Identification -- OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 LiDAR-Based 3D Object Detection -- 2.2 Object Shape Completion -- 3 Methodology -- 3.1 Overview -- 3.2 Occluded Keypoint Generation -- 3.3 Occluded Keypoint Refinement -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Evaluation on KITTI Dataset -- 4.4 Evaluation on Waymo Open Dataset -- 4.5 Model Efficiency -- 4.6 Ablation Studies -- 5 Conclusion -- References -- Camouflaged Object Segmentation Based on Fractional Edge Perception -- 1 Introduction -- 2 Related Work -- 3 Interactive Task Learning Network -- 3.1 Integral and Fractional Edge -- 3.2 Camouflaged Edge Detection Module -- 4 Performance Evaluation -- 4.1 Datasets and Experiment Settings -- 4.2 Quantitative Evaluation -- 4.3 Qualitative Evaluation -- 4.4 Generalization of Edge Detection -- 5 Conclusion -- References 
-- DecTrans: Person Re-identification with Multifaceted Part Features via Decomposed Transformer -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Vision Transformer as Feature Extractor -- 3.2 Token Decomposition (TD) Layer -- 3.3 Data Augmentation for TD Layer -- 3.4 Training and Inference -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparisons to State-of-the-arts -- 4.4 Ablation Study -- 5 Conclusion -- References -- AHT: A Novel Aggregation Hyper-transformer for Few-Shot Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Hypernetworks -- 3 Method -- 3.1 Preliminaries -- 3.2 Overview -- 3.3 Dynamic Aggregation Module -- 3.4 Conditional Adaptation Hypernetworks. 327 $a3.5 The Classification-Regression Detection Head -- 4 Experiments -- 4.1 Experimental Setting -- 4.2 Comparison Results -- 4.3 Ablation Study -- 4.4 Visualization of Our Module -- 5 Conclusion -- References -- Feature Refinement from Multiple Perspectives for High Performance Salient Object Detection -- 1 Introduction -- 2 Proposed Method -- 2.1 Overall Architecture -- 2.2 Attention-Guided Bi-directional Feature Refinement Module -- 2.3 Serial Atrous Fusion Module -- 2.4 Upsampling Feature Refinement Module -- 2.5 Objective Function -- 3 Experiments -- 3.1 Experimental Setup -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 4 Conclusion -- References -- Feature Disentanglement and Adaptive Fusion for Improving Multi-modal Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Multi-modal Tracking -- 2.2 Transformers Tracking -- 3 Methodology -- 3.1 Preliminary -- 3.2 Our Approach -- 3.3 Training and Inference -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-Arts Multi-modal Trackers -- 4.3 Ablation Study -- 5 Conclusion -- References -- Modality Balancing Mechanism for RGB-Infrared Object Detection in Aerial Image -- 1 Introduction -- 2 Related Work -- 
2.1 Object Detection in Aerial Images -- 2.2 RGB-Infrared Object Detection -- 3 Method -- 3.1 Overview -- 3.2 Modality Balancing Mechanism -- 3.3 Multimodal Feature Hybrid Sampling Module -- 4 Experiment -- 4.1 Settings -- 4.2 Comparison with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- Pacific Oyster Gonad Identification and Grayscale Calculation Based on Unapparent Object Detection -- 1 Introduction -- 2 Method -- 2.1 Compact Pyramid Refinement Module (CPRM) -- 2.2 Switchable Excitation Model (SEM) -- 3 Experiments and Analysis of Results -- 3.1 Establishment of the Datasets. 327 $a3.2 Experimental Environment and Evaluation Index -- 3.3 Ablation Experiments -- 3.4 Comparative Experiments and Analysis of Results -- 3.5 Visualization Results -- 3.6 Gray Value Calculation -- 4 Conclusion -- References -- Multi-task Self-supervised Few-Shot Detection -- 1 Introduction -- 2 Related Work -- 2.1 Self-supervised Learning -- 2.2 Few-Shot Object Detection -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Self-supervised Auxiliary Branch -- 3.3 Multi-Task Learning -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Few-Shot Object Detection Benchmarks -- 4.3 Ablation Analysis -- 4.4 Visualization -- 5 Conclusion -- References -- CSTrack: A Comprehensive and Concise Vision Transformer Tracker -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Overview -- 3.2 CSBlock -- 3.3 Prediction Head and Loss -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Comparisons with the State-of-the-Art Trackers -- 4.3 Ablation Study -- 4.4 Visualization of Attention Maps -- 4.5 Visualization of Tracking Performance -- 5 Conclusion -- References -- Feature Implicit Enhancement via Super-Resolution for Small Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Small Object Detection Based on Super-Resolution -- 3 Methods -- 3.1 Overall Architecture -- 3.2 Training -- 4 Experiments and Details -- 4.1 Dataset and 
Details -- 4.2 Ablation Study -- 4.3 Main Results -- 5 Conclusion -- References -- Improved Detection Method for SODL-YOLOv7 Intensive Juvenile Abalone -- 1 Introduction -- 2 Methods -- 2.1 SODL Small Target Detection Network -- 2.2 ACBAM Attention Module -- 3 Experimental Results and Analysis -- 3.1 Experimental Data Preprocessing -- 3.2 Experimental Environment and Evaluation Index -- 3.3 Experimental Results and Analysis -- 4 Conclusion -- References. 327 $aMVP-SEG: Multi-view Prompt Learning for Open-Vocabulary Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision-Language Models -- 2.2 Zero-Shot Segmentation -- 2.3 Prompt Learning -- 3 Method -- 3.1 Problem Definition -- 3.2 MVP-SEG -- 3.3 MVP-SEG+ -- 4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 4.4 Ablation Studies on MVP-SEG -- 4.5 Comparison with State-of-the-Art -- 5 Conclusion -- References -- Context-FPN and Memory Contrastive Learning for Partially Supervised Instance Segmentation -- 1 Introduction -- 2 Related Work -- 3 CCMask -- 3.1 Overview -- 3.2 Context-FPN -- 3.3 Memory Contrastive Learning Head -- 3.4 Loss Function -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Experimental Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- A Dynamic Tracking Framework Based on Scene Perception -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Easy-Hard Dual-Branch Network -- 3.2 Scene Router -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-arts -- 4.3 Ablation Study and Analysis -- 5 Conclusion -- References -- HPAN: A Hybrid Pose Attention Network for Person Re-Identification -- 1 Introduction -- 2 The Proposed Method -- 2.1 Local Key Point Features -- 2.2 Self-Attention -- 2.3 Hybrid Pose and Global Feature Fusion (HPGFF) -- 2.4 Loss Function -- 2.5 Training Strategy -- 3 Experiments -- 3.1 Datasets and Evaluation Metrics -- 3.2 Comparison with SOTA Methods -- 3.3 Ablation Studies -- 3.4 
Visualization of Attention Maps -- 4 Conclusion -- References -- SpectralTracker: Jointly High and Low-Frequency Modeling for Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Visual Tracking -- 2.2 Frequency Modeling in Visual Transformer -- 3 Method -- 3.1 Dual-Spectral Module -- 3.2 Dual-Spectral for Tracking -- 3.3 Prediction Head and Total Loss. 327 $a4 Experiments -- 4.1 Implementation Details -- 4.2 State-of-the-Art Comparison -- 4.3 Ablation Studies -- 5 Conclusion -- References -- DiffusionTracker: Targets Denoising Based on Diffusion Model for Visual Tracking -- 1 Introduction -- 2 Related Works -- 2.1 Visual Tracking Based on Siamese Network -- 2.2 Diffusion Model -- 3 Method -- 3.1 Architecture -- 3.2 Training Process -- 3.3 Inference Process -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Ablation Study -- 4.3 General Datasets Evaluation -- 4.4 Attributes Evaluation -- 4.5 Compatibility Experiment -- 5 Conclusion -- References -- Instance-Proxy Loss for Semi-supervised Learning with Coarse Labels -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Instance-Level Loss -- 3.2 Proxy-Level Loss -- 3.3 Instance-Proxy Loss -- 4 Experiments -- 4.1 Comparison to SOTA Methods -- 4.2 Ablation Study -- 5 Conclusion -- References -- FAFVTC: A Real-Time Network for Vehicle Tracking and Counting -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Backbone Network -- 3.2 Multi-spectral Channel and Spatial Attention (MCSA) -- 3.3 Data Association -- 3.4 Vehicle Counting -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Details -- 4.3 Comparison Experiments -- 4.4 Ablation Study -- 5 Conclusion -- References -- Ped-Mix: Mix Pedestrians for Occluded Person Re-identification -- 1 Introduction -- 2 Related Works -- 2.1 Occluded Person Re-identification -- 2.2 Data Augmentation and Training Loss -- 3 Proposed Method -- 3.1 Ped-Mix -- 3.2 Non-target Suppression Loss -- 3.3 Training Procedure -- 4 Experiment -- 4.1 Datasets and Evaluation 
Measures -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Comparison with State-of-the-Art Methods -- 4.5 Visualization -- 4.6 Why Random Masking -- 4.7 Results on Holistic Datasets -- 5 Conclusion -- References. 327 $aObject-Aware Transfer-Based Black-Box Adversarial Attack on Object Detector. 330 $aThe 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13-15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis.
410 0$aLecture Notes in Computer Science,$x1611-3349 ;$v14436 606 $aImage processing$xDigital techniques 606 $aComputer vision 606 $aArtificial intelligence 606 $aApplication software 606 $aComputer networks 606 $aComputer systems 606 $aMachine learning 606 $aComputer Imaging, Vision, Pattern Recognition and Graphics 606 $aArtificial Intelligence 606 $aComputer and Information Systems Applications 606 $aComputer Communication Networks 606 $aComputer System Implementation 606 $aMachine Learning 615 0$aImage processing$xDigital techniques. 615 0$aComputer vision. 615 0$aArtificial intelligence. 615 0$aApplication software. 615 0$aComputer networks. 615 0$aComputer systems. 615 0$aMachine learning. 615 14$aComputer Imaging, Vision, Pattern Recognition and Graphics. 615 24$aArtificial Intelligence. 615 24$aComputer and Information Systems Applications. 615 24$aComputer Communication Networks. 615 24$aComputer System Implementation. 615 24$aMachine Learning. 676 $a006 702 $aLiu$b Qingshan$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aWang$b Hanzi$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aMa$b Zhanyu$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aZheng$b Weishi$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aZha$b Hongbin$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aChen$b Xilin$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aWang$b Liang$4edt$4http://id.loc.gov/vocabulary/relators/edt 702 $aJi$b Rongrong$4edt$4http://id.loc.gov/vocabulary/relators/edt 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910799206903321 996 $aPattern recognition and computer vision$91972598 997 $aUNINA