| Title: |
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part VIII / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
|
| Publication: | Singapore : Springer Nature Singapore : Imprint: Springer, 2024 |
| Edition: | 1st ed. 2024. |
| Physical description: | 1 online resource (XIV, 513 p. 157 illus., 152 illus. in color.) |
| Discipline: | 006 |
| Topical subject: | Image processing - Digital techniques |
| Computer vision | |
| Artificial intelligence | |
| Application software | |
| Computer networks | |
| Computer systems | |
| Machine learning | |
| Computer Imaging, Vision, Pattern Recognition and Graphics | |
| Artificial Intelligence | |
| Computer and Information Systems Applications | |
| Computer Communication Networks | |
| Computer System Implementation | |
| Machine Learning | |
| Person (secondary responsibility): | Liu, Qingshan |
| Wang, Hanzi | |
| Ma, Zhanyu | |
| Zheng, Weishi | |
| Zha, Hongbin | |
| Chen, Xilin | |
| Wang, Liang | |
| Ji, Rongrong | |
| Contents note: | Intro -- Preface -- Organization -- Contents - Part VIII -- Neural Network and Deep Learning I -- A Quantum-Based Attention Mechanism in Scene Text Detection -- 1 Introduction -- 2 Related Work -- 2.1 Attention Mechanism -- 2.2 Revisit Quantum-State-based Mapping -- 3 Approach -- 3.1 QSM-Based Channel Attention (QCA) Module and QSM-Based Spatial Attention (QSA) Module -- 3.2 Quantum-Based Convolutional Attention Module (QCAM) -- 3.3 Adaptive Channel Information Transfer Module (ACTM) -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Performance Comparison -- 4.3 Ablation Study -- 5 Discussion and Conclusion -- References -- NCMatch: Semi-supervised Learning with Noisy Labels via Noisy Sample Filter and Contrastive Learning -- 1 Introduction -- 2 Related Work -- 2.1 Semi-supervised Learning -- 2.2 Self-supervised Contrastive Learning -- 2.3 Learning with Noisy Labels -- 3 Method -- 3.1 Preliminaries -- 3.2 Overall Framework -- 3.3 Noisy Sample Filter (NSF) -- 3.4 Semi-supervised Contrastive Learning (SSCL) -- 4 Experiments -- 4.1 Datasets -- 4.2 Experimental for SSL -- 4.3 Experimental for SSLNL -- 4.4 Ablation Study -- 5 Conclusion -- References -- Data-Free Low-Bit Quantization via Dynamic Multi-teacher Knowledge Distillation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries -- 3.2 More Insight on 8-Bit Quantized Models -- 3.3 Dynamic Multi-teacher Knowledge Distillation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with Previous Data-Free Quantization Methods -- 4.3 Ablation Studies -- 5 Conclusion -- References -- LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Architecture of LeViT-UNet -- 3.2 LeViT as Encoder -- 3.3 CNNs as Decoder -- 4 Experiments and Results -- 4.1 Dataset -- 4.2 Implementation Details. |
| 4.3 Experiment Results on Synapse Dataset -- 4.4 Experiment Results on ACDC Dataset -- 5 Conclusion -- References -- DUFormer: Solving Power Line Detection Task in Aerial Images Using Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer -- 2.2 Semantic Segmentation -- 3 Proposed Architecture -- 3.1 Overview -- 3.2 Double U Block (DUB) -- 3.3 Power Line Aware Block (PLAB) -- 3.4 BiscSE Block -- 3.5 Loss Function -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Comparative Experiments -- 4.3 Ablation Experiments -- 5 Conclusion -- References -- Space-Transform Margin Loss with Mixup for Long-Tailed Visual Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Mixup and Its Space Transformation -- 2.2 Long-Tailed Learning with Mixup -- 2.3 Re-balanced Loss Function Modification Methods -- 3 Method -- 3.1 Space Transformation in Mixup -- 3.2 Space-Transform Margin Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementations Details -- 4.3 Main Results -- 4.4 Feature Visualization and Analysis of STM Loss -- 4.5 Ablation Study -- 5 Conclusion -- References -- A Multi-perspective Squeeze Excitation Classifier Based on Vision Transformer for Few Shot Image Classification -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Definition -- 3.2 Meta-Training Phase -- 3.3 Meta-test Phase -- 4 Experimental Results -- 4.1 Datasets and Training Details -- 4.2 Evaluation Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- ITCNN: Incremental Learning Network Based on ITDA and Tree Hierarchical CNN -- 1 Introduction -- 2 Proposed Network -- 2.1 Network Structure -- 2.2 ITDA -- 2.3 Branch Route -- 2.4 Training Strategies -- 2.5 Optimization Strategies -- 3 Experiments and Results -- 3.1 Experiment on Classification -- 3.2 Experiment on CIL -- 4 Conclusion -- References. | |
| Periodic-Aware Network for Fine-Grained Action Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Skeleton-Based Action Recognition -- 2.2 Periodicity Estimation of Videos -- 2.3 Squeeze and Excitation Module -- 3 Method -- 3.1 3D-CNN Backbone -- 3.2 Periodicity Feature Extraction Module -- 3.3 Periodicity Fusion Module -- 4 Experiment -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Learning Domain-Invariant Representations from Text for Domain Generalization -- 1 Introduction -- 2 Related Work -- 2.1 Domain Generalization -- 2.2 CLIP in Domain Generalization -- 3 Method -- 3.1 Problem Formulation -- 3.2 Text Regularization -- 3.3 CLIP Representations -- 4 Experiments and Results -- 4.1 Datasets and Experimental Settings -- 4.2 Comparison with Existing DG Methods -- 4.3 Ablation Study -- 5 Conclusions -- References -- TSTD:A Cross-modal Two Stages Network with New Trans-decoder for Point Cloud Semantic Segmentation -- 1 Introduction -- 2 Related Works -- 2.1 Image Transformers -- 2.2 Point Cloud Transformer -- 2.3 Joint 2D-3D Network -- 3 Method -- 3.1 Overall Architecture -- 3.2 2D-3D Backprojection -- 3.3 Trans-Decoder -- 4 Experiments -- 4.1 Dataset and Metric -- 4.2 Performance Comparison -- 4.3 Ablation Experiment -- 5 Conclusion -- References -- NeuralMAE: Data-Efficient Neural Architecture Predictor with Masked Autoencoder -- 1 Introduction -- 2 Related Work -- 2.1 Neural Architecture Performance Predictors -- 2.2 Generative Self-supervised Learning -- 3 Method -- 3.1 Overall Framework -- 3.2 Pre-training -- 3.3 Fine-Tuning -- 3.4 Multi-head Attention-Masked Transformer -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Experiments on NAS-Bench-101 -- 4.3 Experiments on NAS-Bench-201 -- 4.4 Experiments on NAS-Bench-301. | |
| 4.5 Ablation Study -- 5 Conclusion -- References -- Co-regularized Facial Age Estimation with Graph-Causal Learning -- 1 Introduction -- 2 Method -- 2.1 Problem Formulation -- 2.2 Ordinal Decision Mapping -- 2.3 Bilateral Counterfactual Pooling -- 3 Experiments -- 3.1 Datasets and Evaluation Settings -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 3.4 Performance Under Out-of-Distribution Settings -- 3.5 Qualitative Results -- 4 Conclusion -- References -- Online Distillation and Preferences Fusion for Graph Convolutional Network-Based Sequential Recommendation -- 1 Introduction -- 2 Method -- 2.1 Graph Construction -- 2.2 Collaborative Learning -- 2.3 Feature Fusion -- 3 Experiment -- 3.1 Experimental Setup -- 3.2 Experimental Results -- 3.3 Ablation Studies -- 4 Conclusion -- References -- Grassmann Graph Embedding for Few-Shot Class Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Problem Definition -- 3.2 Overview -- 3.3 Grassmann Manifold Embedding -- 3.4 Graph Structure Preserving on Grassmann Manifold -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Global Variational Convolution Network for Semi-supervised Node Classification on Large-Scale Graphs -- 1 Introduction -- 2 Related Work -- 3 Proposed Methods -- 3.1 Positive Pointwise Mutual Information on Large-Scale Graphs -- 3.2 Global Variational Aggregation -- 3.3 Variational Convolution Kernels -- 4 Experiments -- 4.1 Comparison Experiments -- 4.2 Ablation Study -- 4.3 Runtime Study -- 5 Conclusion -- References -- Frequency Domain Distillation for Data-Free Quantization of Vision Transformer -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer (ViT) -- 2.2 Network Quantization -- 3 Preliminaries -- 3.1 Quantizer. | |
| 3.2 Fast Fourier Transform (FFT) and Frequency Domain -- 4 Method -- 4.1 Our Insights -- 4.2 Frequency Domain Distillation -- 4.3 The Overall Pipeline -- 5 Experimentation -- 5.1 Comparison Experiments -- 5.2 Ablation Study -- 6 Conclusions -- References -- An ANN-Guided Approach to Task-Free Continual Learning with Spiking Neural Networks -- 1 Introduction -- 2 Related Works -- 2.1 Image Generation in SNNs -- 2.2 Continual Learning -- 3 Preliminary -- 3.1 The Referee Module: WGAN -- 3.2 The Player Module: FSVAE -- 4 Methodology -- 4.1 Problem Setting -- 4.2 Overview of Our Model -- 4.3 Adversarial Similarity Expansion -- 4.4 Precise Pruning -- 5 Experimental Results -- 5.1 Dataset Setup -- 5.2 Classification Tasks Under TFCL -- 5.3 The Impact of Different Thresholds and Buffer Sizes -- 5.4 ANN and SNN Under TFCL -- 6 Conclusion -- References -- Multi-adversarial Adaptive Transformers for Joint Multi-agent Trajectory Prediction -- 1 Introduction -- 2 Related Works -- 2.1 Multi-agent Trajectory Prediction -- 2.2 Domain Adaptation -- 3 Proposed Method -- 3.1 Encoder: Processing Multi-aspect Data -- 3.2 Decoder: Generating Multi-modal Trajectories -- 3.3 Adaptation: Learning Doamin Invaint Feature -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset -- 4.2 Problem Setting -- 4.3 Evaluation Metrics -- 4.4 Implementation Details -- 4.5 Quantitative Analysis -- 4.6 Ablation Study -- 5 Conclusion -- References -- Enhancing Open-Set Object Detection via Uncertainty-Boxes Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminary -- 3.2 Baseline Setup -- 3.3 Pseudo Proposal Advisor -- 3.4 Uncertainty-Box Detection -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with Other Methods -- 4.3 Ablation Studies -- 4.4 Visualization and Qualitative Analysis -- 5 Conclusions -- References. | |
| Interventional Supervised Learning for Person Re-identification. | |
| Summary/abstract: | The 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning Methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis. |
| Authorized title: | Pattern recognition and computer vision |
| ISBN: | 981-9985-43-9 |
| Format: | Printed material |
| Bibliographic level: | Monograph |
| Language of publication: | English |
| Record no.: | 996587868003316 |
| Available at: | Univ. di Salerno |