Record no.

UNINA9910799206903321

Title

Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XII / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji

Publication/distribution/printing

Singapore : Springer Nature Singapore : Imprint: Springer, 2024

ISBN

981-9985-55-2

Edition

[1st ed. 2024.]

Physical description

1 online resource (XIV, 523 p. 203 illus., 194 illus. in color.)

Series

Lecture Notes in Computer Science, 1611-3349 ; 14436

Discipline

006

Subjects

Image processing - Digital techniques

Computer vision

Artificial intelligence

Application software

Computer networks

Computer systems

Machine learning

Computer Imaging, Vision, Pattern Recognition and Graphics

Artificial Intelligence

Computer and Information Systems Applications

Computer Communication Networks

Computer System Implementation

Machine Learning

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Contents note

Intro -- Preface -- Organization -- Contents - Part XII -- Object Detection, Tracking and Identification -- OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 LiDAR-Based 3D Object Detection -- 2.2 Object Shape Completion -- 3 Methodology -- 3.1 Overview -- 3.2 Occluded Keypoint Generation -- 3.3 Occluded Keypoint Refinement -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Evaluation on KITTI Dataset -- 4.4 Evaluation on Waymo Open Dataset -- 4.5 Model Efficiency -- 4.6 Ablation Studies -- 5 Conclusion -- References -- Camouflaged Object Segmentation Based on Fractional Edge Perception -- 1 Introduction -- 2 Related Work -- 3 Interactive Task Learning Network -- 3.1 Integral and Fractional Edge -- 3.2 Camouflaged Edge Detection Module -- 4 Performance Evaluation -- 4.1 Datasets and Experiment Settings -- 4.2 Quantitative Evaluation -- 4.3 Qualitative Evaluation -- 4.4 Generalization of Edge Detection -- 5 Conclusion -- References -- DecTrans: Person Re-identification with Multifaceted Part Features via Decomposed Transformer -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Vision Transformer as Feature Extractor -- 3.2 Token Decomposition (TD) Layer -- 3.3 Data Augmentation for TD Layer -- 3.4 Training and Inference -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparisons to State-of-the-arts -- 4.4 Ablation Study -- 5 Conclusion -- References -- AHT: A Novel Aggregation Hyper-transformer for Few-Shot Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Hypernetworks -- 3 Method -- 3.1 Preliminaries -- 3.2 Overview -- 3.3 Dynamic Aggregation Module -- 3.4 Conditional Adaptation Hypernetworks.

3.5 The Classification-Regression Detection Head -- 4 Experiments -- 4.1 Experimental Setting -- 4.2 Comparison Results -- 4.3 Ablation Study -- 4.4 Visualization of Our Module -- 5 Conclusion -- References -- Feature Refinement from Multiple Perspectives for High Performance Salient Object Detection -- 1 Introduction -- 2 Proposed Method -- 2.1 Overall Architecture -- 2.2 Attention-Guided Bi-directional Feature Refinement Module -- 2.3 Serial Atrous Fusion Module -- 2.4 Upsampling Feature Refinement Module -- 2.5 Objective Function -- 3 Experiments -- 3.1 Experimental Setup -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 4 Conclusion -- References -- Feature Disentanglement and Adaptive Fusion for Improving Multi-modal Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Multi-modal Tracking -- 2.2 Transformers Tracking -- 3 Methodology -- 3.1 Preliminary -- 3.2 Our Approach -- 3.3 Training and Inference -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-Arts Multi-modal Trackers -- 4.3 Ablation Study -- 5 Conclusion -- References -- Modality Balancing Mechanism for RGB-Infrared Object Detection in Aerial Image -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection in Aerial Images -- 2.2 RGB-Infrared Object Detection -- 3 Method -- 3.1 Overview -- 3.2 Modality Balancing Mechanism -- 3.3 Multimodal Feature Hybrid Sampling Module -- 4 Experiment -- 4.1 Settings -- 4.2 Comparison with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- Pacific Oyster Gonad Identification and Grayscale Calculation Based on Unapparent Object Detection -- 1 Introduction -- 2 Method -- 2.1 Compact Pyramid Refinement Module (CPRM) -- 2.2 Switchable Excitation Model (SEM) -- 3 Experiments and Analysis of Results -- 3.1 Establishment of the Datasets.

3.2 Experimental Environment and Evaluation Index -- 3.3 Ablation Experiments -- 3.4 Comparative Experiments and Analysis of Results -- 3.5 Visualization Results -- 3.6 Gray Value Calculation -- 4 Conclusion -- References -- Multi-task Self-supervised Few-Shot Detection -- 1 Introduction -- 2 Related Work -- 2.1 Self-supervised Learning -- 2.2 Few-Shot Object Detection -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Self-supervised Auxiliary Branch -- 3.3 Multi-Task Learning -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Few-Shot Object Detection Benchmarks -- 4.3 Ablation Analysis -- 4.4 Visualization -- 5 Conclusion -- References -- CSTrack: A Comprehensive and Concise Vision Transformer Tracker -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Overview -- 3.2 CSBlock -- 3.3 Prediction Head and Loss -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Comparisons with the State-of-the-Art Trackers -- 4.3 Ablation Study -- 4.4 Visualization of Attention Maps -- 4.5 Visualization of Tracking Performance -- 5 Conclusion -- References -- Feature Implicit Enhancement via Super-Resolution for Small Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Small Object Detection Based on Super-Resolution -- 3 Methods -- 3.1 Overall Architecture -- 3.2 Training -- 4 Experiments and Details -- 4.1 Dataset and Details -- 4.2 Ablation Study -- 4.3 Main Results -- 5 Conclusion -- References -- Improved Detection Method for SODL-YOLOv7 Intensive Juvenile Abalone -- 1 Introduction -- 2 Methods -- 2.1 SODL Small Target Detection Network -- 2.2 ACBAM Attention Module -- 3 Experimental Results and Analysis -- 3.1 Experimental Data Preprocessing -- 3.2 Experimental Environment and Evaluation Index -- 3.3 Experimental Results and Analysis -- 4 Conclusion -- References.

MVP-SEG: Multi-view Prompt Learning for Open-Vocabulary Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision-Language Models -- 2.2 Zero-Shot Segmentation -- 2.3 Prompt Learning -- 3 Method -- 3.1 Problem Definition -- 3.2 MVP-SEG -- 3.3 MVP-SEG+ -- 4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 4.4 Ablation Studies on MVP-SEG -- 4.5 Comparison with State-of-the-Art -- 5 Conclusion -- References -- Context-FPN and Memory Contrastive Learning for Partially Supervised Instance Segmentation -- 1 Introduction -- 2 Related Work -- 3 CCMask -- 3.1 Overview -- 3.2 Context-FPN -- 3.3 Memory Contrastive Learning Head -- 3.4 Loss Function -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Experimental Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- A Dynamic Tracking Framework Based on Scene Perception -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Easy-Hard Dual-Branch Network -- 3.2 Scene Router -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-arts -- 4.3 Ablation Study and Analysis -- 5 Conclusion -- References -- HPAN: A Hybrid Pose Attention Network for Person Re-Identification -- 1 Introduction -- 2 The Proposed Method -- 2.1 Local Key Point Features -- 2.2 Self-Attention -- 2.3 Hybrid Pose and Global Feature Fusion (HPGFF) -- 2.4 Loss Function -- 2.5 Training Strategy -- 3 Experiments -- 3.1 Datasets and Evaluation Metrics -- 3.2 Comparison with SOTA Methods -- 3.3 Ablation Studies -- 3.4 Visualization of Attention Maps -- 4 Conclusion -- References -- SpectralTracker: Jointly High and Low-Frequency Modeling for Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Visual Tracking -- 2.2 Frequency Modeling in Visual Transformer -- 3 Method -- 3.1 Dual-Spectral Module -- 3.2 Dual-Spectral for Tracking -- 3.3 Prediction Head and Total Loss.

4 Experiments -- 4.1 Implementation Details -- 4.2 State-of-the-Art Comparison -- 4.3 Ablation Studies -- 5 Conclusion -- References -- DiffusionTracker: Targets Denoising Based on Diffusion Model for Visual Tracking -- 1 Introduction -- 2 Related Works -- 2.1 Visual Tracking Based on Siamese Network -- 2.2 Diffusion Model -- 3 Method -- 3.1 Architecture -- 3.2 Training Process -- 3.3 Inference Process -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Ablation Study -- 4.3 General Datasets Evaluation -- 4.4 Attributes Evaluation -- 4.5 Compatibility Experiment -- 5 Conclusion -- References -- Instance-Proxy Loss for Semi-supervised Learning with Coarse Labels -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Instance-Level Loss -- 3.2 Proxy-Level Loss -- 3.3 Instance-Proxy Loss -- 4 Experiments -- 4.1 Comparison to SOTA Methods -- 4.2 Ablation Study -- 5 Conclusion -- References -- FAFVTC: A Real-Time Network for Vehicle Tracking and Counting -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Backbone Network -- 3.2 Multi-spectral Channel and Spatial Attention (MCSA) -- 3.3 Data Association -- 3.4 Vehicle Counting -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Details -- 4.3 Comparison Experiments -- 4.4 Ablation Study -- 5 Conclusion -- References -- Ped-Mix: Mix Pedestrians for Occluded Person Re-identification -- 1 Introduction -- 2 Related Works -- 2.1 Occluded Person Re-identification -- 2.2 Data Augmentation and Training Loss -- 3 Proposed Method -- 3.1 Ped-Mix -- 3.2 Non-target Suppression Loss -- 3.3 Training Procedure -- 4 Experiment -- 4.1 Datasets and Evaluation Measures -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Comparison with State-of-the-Art Methods -- 4.5 Visualization -- 4.6 Why Random Masking -- 4.7 Results on Holistic Datasets -- 5 Conclusion -- References.

Object-Aware Transfer-Based Black-Box Adversarial Attack on Object Detector.

Summary/abstract

The 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning Methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis.