LEADER 13868nam 22008775 450
001 996587868003316
005 20231228101139.0
010 $a981-9985-43-9
024 7 $a10.1007/978-981-99-8543-2
035 $a(CKB)29476193100041
035 $a(DE-He213)978-981-99-8543-2
035 $a(MiAaPQ)EBC31046393
035 $a(Au-PeEL)EBL31046393
035 $a(EXLCZ)9929476193100041
100 $a20231228d2024 u| 0
101 0 $aeng
135 $aur|||||||||||
181 $ctxt$2rdacontent
182 $cc$2rdamedia
183 $acr$2rdacarrier
200 10$aPattern Recognition and Computer Vision$b[electronic resource] $e6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part VIII /$fedited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
205 $a1st ed. 2024.
210 1$aSingapore :$cSpringer Nature Singapore :$cImprint: Springer,$d2024.
215 $a1 online resource (XIV, 513 p. 157 illus., 152 illus. in color.)
225 1 $aLecture Notes in Computer Science,$x1611-3349 ;$v14432
311 08$a9789819985425
327 $aIntro -- Preface -- Organization -- Contents - Part VIII -- Neural Network and Deep Learning I -- A Quantum-Based Attention Mechanism in Scene Text Detection -- 1 Introduction -- 2 Related Work -- 2.1 Attention Mechanism -- 2.2 Revisit Quantum-State-based Mapping -- 3 Approach -- 3.1 QSM-Based Channel Attention (QCA) Module and QSM-Based Spatial Attention (QSA) Module -- 3.2 Quantum-Based Convolutional Attention Module (QCAM) -- 3.3 Adaptive Channel Information Transfer Module (ACTM) -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Performance Comparison -- 4.3 Ablation Study -- 5 Discussion and Conclusion -- References -- NCMatch: Semi-supervised Learning with Noisy Labels via Noisy Sample Filter and Contrastive Learning -- 1 Introduction -- 2 Related Work -- 2.1 Semi-supervised Learning -- 2.2 Self-supervised Contrastive Learning -- 2.3 Learning with Noisy Labels -- 3 Method -- 3.1 Preliminaries -- 3.2 Overall Framework -- 3.3 Noisy Sample Filter (NSF) -- 3.4 Semi-supervised Contrastive Learning (SSCL) -- 4 Experiments -- 4.1 Datasets -- 4.2 Experimental for SSL -- 4.3 Experimental for SSLNL -- 4.4 Ablation Study -- 5 Conclusion -- References -- Data-Free Low-Bit Quantization via Dynamic Multi-teacher Knowledge Distillation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries -- 3.2 More Insight on 8-Bit Quantized Models -- 3.3 Dynamic Multi-teacher Knowledge Distillation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with Previous Data-Free Quantization Methods -- 4.3 Ablation Studies -- 5 Conclusion -- References -- LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Architecture of LeViT-UNet -- 3.2 LeViT as Encoder -- 3.3 CNNs as Decoder -- 4 Experiments and Results -- 4.1 Dataset -- 4.2 Implementation Details.
327 $a4.3 Experiment Results on Synapse Dataset -- 4.4 Experiment Results on ACDC Dataset -- 5 Conclusion -- References -- DUFormer: Solving Power Line Detection Task in Aerial Images Using Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer -- 2.2 Semantic Segmentation -- 3 Proposed Architecture -- 3.1 Overview -- 3.2 Double U Block (DUB) -- 3.3 Power Line Aware Block (PLAB) -- 3.4 BiscSE Block -- 3.5 Loss Function -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Comparative Experiments -- 4.3 Ablation Experiments -- 5 Conclusion -- References -- Space-Transform Margin Loss with Mixup for Long-Tailed Visual Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Mixup and Its Space Transformation -- 2.2 Long-Tailed Learning with Mixup -- 2.3 Re-balanced Loss Function Modification Methods -- 3 Method -- 3.1 Space Transformation in Mixup -- 3.2 Space-Transform Margin Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Feature Visualization and Analysis of STM Loss -- 4.5 Ablation Study -- 5 Conclusion -- References -- A Multi-perspective Squeeze Excitation Classifier Based on Vision Transformer for Few Shot Image Classification -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Definition -- 3.2 Meta-Training Phase -- 3.3 Meta-test Phase -- 4 Experimental Results -- 4.1 Datasets and Training Details -- 4.2 Evaluation Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- ITCNN: Incremental Learning Network Based on ITDA and Tree Hierarchical CNN -- 1 Introduction -- 2 Proposed Network -- 2.1 Network Structure -- 2.2 ITDA -- 2.3 Branch Route -- 2.4 Training Strategies -- 2.5 Optimization Strategies -- 3 Experiments and Results -- 3.1 Experiment on Classification -- 3.2 Experiment on CIL -- 4 Conclusion -- References.
327 $aPeriodic-Aware Network for Fine-Grained Action Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Skeleton-Based Action Recognition -- 2.2 Periodicity Estimation of Videos -- 2.3 Squeeze and Excitation Module -- 3 Method -- 3.1 3D-CNN Backbone -- 3.2 Periodicity Feature Extraction Module -- 3.3 Periodicity Fusion Module -- 4 Experiment -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Learning Domain-Invariant Representations from Text for Domain Generalization -- 1 Introduction -- 2 Related Work -- 2.1 Domain Generalization -- 2.2 CLIP in Domain Generalization -- 3 Method -- 3.1 Problem Formulation -- 3.2 Text Regularization -- 3.3 CLIP Representations -- 4 Experiments and Results -- 4.1 Datasets and Experimental Settings -- 4.2 Comparison with Existing DG Methods -- 4.3 Ablation Study -- 5 Conclusions -- References -- TSTD: A Cross-modal Two Stages Network with New Trans-decoder for Point Cloud Semantic Segmentation -- 1 Introduction -- 2 Related Works -- 2.1 Image Transformers -- 2.2 Point Cloud Transformer -- 2.3 Joint 2D-3D Network -- 3 Method -- 3.1 Overall Architecture -- 3.2 2D-3D Backprojection -- 3.3 Trans-Decoder -- 4 Experiments -- 4.1 Dataset and Metric -- 4.2 Performance Comparison -- 4.3 Ablation Experiment -- 5 Conclusion -- References -- NeuralMAE: Data-Efficient Neural Architecture Predictor with Masked Autoencoder -- 1 Introduction -- 2 Related Work -- 2.1 Neural Architecture Performance Predictors -- 2.2 Generative Self-supervised Learning -- 3 Method -- 3.1 Overall Framework -- 3.2 Pre-training -- 3.3 Fine-Tuning -- 3.4 Multi-head Attention-Masked Transformer -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Experiments on NAS-Bench-101 -- 4.3 Experiments on NAS-Bench-201 -- 4.4 Experiments on NAS-Bench-301.
327 $a4.5 Ablation Study -- 5 Conclusion -- References -- Co-regularized Facial Age Estimation with Graph-Causal Learning -- 1 Introduction -- 2 Method -- 2.1 Problem Formulation -- 2.2 Ordinal Decision Mapping -- 2.3 Bilateral Counterfactual Pooling -- 3 Experiments -- 3.1 Datasets and Evaluation Settings -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 3.4 Performance Under Out-of-Distribution Settings -- 3.5 Qualitative Results -- 4 Conclusion -- References -- Online Distillation and Preferences Fusion for Graph Convolutional Network-Based Sequential Recommendation -- 1 Introduction -- 2 Method -- 2.1 Graph Construction -- 2.2 Collaborative Learning -- 2.3 Feature Fusion -- 3 Experiment -- 3.1 Experimental Setup -- 3.2 Experimental Results -- 3.3 Ablation Studies -- 4 Conclusion -- References -- Grassmann Graph Embedding for Few-Shot Class Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Problem Definition -- 3.2 Overview -- 3.3 Grassmann Manifold Embedding -- 3.4 Graph Structure Preserving on Grassmann Manifold -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Global Variational Convolution Network for Semi-supervised Node Classification on Large-Scale Graphs -- 1 Introduction -- 2 Related Work -- 3 Proposed Methods -- 3.1 Positive Pointwise Mutual Information on Large-Scale Graphs -- 3.2 Global Variational Aggregation -- 3.3 Variational Convolution Kernels -- 4 Experiments -- 4.1 Comparison Experiments -- 4.2 Ablation Study -- 4.3 Runtime Study -- 5 Conclusion -- References -- Frequency Domain Distillation for Data-Free Quantization of Vision Transformer -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer (ViT) -- 2.2 Network Quantization -- 3 Preliminaries -- 3.1 Quantizer. 
327 $a3.2 Fast Fourier Transform (FFT) and Frequency Domain -- 4 Method -- 4.1 Our Insights -- 4.2 Frequency Domain Distillation -- 4.3 The Overall Pipeline -- 5 Experimentation -- 5.1 Comparison Experiments -- 5.2 Ablation Study -- 6 Conclusions -- References -- An ANN-Guided Approach to Task-Free Continual Learning with Spiking Neural Networks -- 1 Introduction -- 2 Related Works -- 2.1 Image Generation in SNNs -- 2.2 Continual Learning -- 3 Preliminary -- 3.1 The Referee Module: WGAN -- 3.2 The Player Module: FSVAE -- 4 Methodology -- 4.1 Problem Setting -- 4.2 Overview of Our Model -- 4.3 Adversarial Similarity Expansion -- 4.4 Precise Pruning -- 5 Experimental Results -- 5.1 Dataset Setup -- 5.2 Classification Tasks Under TFCL -- 5.3 The Impact of Different Thresholds and Buffer Sizes -- 5.4 ANN and SNN Under TFCL -- 6 Conclusion -- References -- Multi-adversarial Adaptive Transformers for Joint Multi-agent Trajectory Prediction -- 1 Introduction -- 2 Related Works -- 2.1 Multi-agent Trajectory Prediction -- 2.2 Domain Adaptation -- 3 Proposed Method -- 3.1 Encoder: Processing Multi-aspect Data -- 3.2 Decoder: Generating Multi-modal Trajectories -- 3.3 Adaptation: Learning Domain Invariant Feature -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset -- 4.2 Problem Setting -- 4.3 Evaluation Metrics -- 4.4 Implementation Details -- 4.5 Quantitative Analysis -- 4.6 Ablation Study -- 5 Conclusion -- References -- Enhancing Open-Set Object Detection via Uncertainty-Boxes Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminary -- 3.2 Baseline Setup -- 3.3 Pseudo Proposal Advisor -- 3.4 Uncertainty-Box Detection -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with Other Methods -- 4.3 Ablation Studies -- 4.4 Visualization and Qualitative Analysis -- 5 Conclusions -- References.
327 $aInterventional Supervised Learning for Person Re-identification.
330 $aThe 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis.
410 0$aLecture Notes in Computer Science,$x1611-3349 ;$v14432
606 $aImage processing$xDigital techniques
606 $aComputer vision
606 $aArtificial intelligence
606 $aApplication software
606 $aComputer networks
606 $aComputer systems
606 $aMachine learning
606 $aComputer Imaging, Vision, Pattern Recognition and Graphics
606 $aArtificial Intelligence
606 $aComputer and Information Systems Applications
606 $aComputer Communication Networks
606 $aComputer System Implementation
606 $aMachine Learning
615 0$aImage processing$xDigital techniques.
615 0$aComputer vision.
615 0$aArtificial intelligence.
615 0$aApplication software.
615 0$aComputer networks.
615 0$aComputer systems.
615 0$aMachine learning.
615 14$aComputer Imaging, Vision, Pattern Recognition and Graphics.
615 24$aArtificial Intelligence.
615 24$aComputer and Information Systems Applications.
615 24$aComputer Communication Networks.
615 24$aComputer System Implementation.
615 24$aMachine Learning.
676 $a006
702 $aLiu$b Qingshan$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aWang$b Hanzi$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aMa$b Zhanyu$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aZheng$b Weishi$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aZha$b Hongbin$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aChen$b Xilin$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aWang$b Liang$4edt$4http://id.loc.gov/vocabulary/relators/edt
702 $aJi$b Rongrong$4edt$4http://id.loc.gov/vocabulary/relators/edt
801 0$bMiAaPQ
801 1$bMiAaPQ
801 2$bMiAaPQ
906 $aBOOK
912 $a996587868003316
996 $aPattern recognition and computer vision$91972598
997 $aUNISA