Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part V / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Author Liu Qingshan
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (542 pages)
Discipline 621.39
004.6
Other authors (Persons) Wang Hanzi
Ma Zhanyu
Zheng Weishi
Zha Hongbin
Chen Xilin
Wang Liang
Ji Rongrong
Series Lecture Notes in Computer Science
Topical subject Computer engineering
Computer networks
Image processing - Digital techniques
Computer vision
Computer systems
Machine learning
Computer Engineering and Networks
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9984-69-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Biometric Recognition -- Face Recognition and Pose Recognition -- Structural Pattern Recognition.
Record no. UNISA-996587868903316
Located at: Univ. di Salerno
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part VII / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Author Liu Qingshan
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (525 pages)
Discipline 006
Other authors (Persons) Wang Hanzi
Ma Zhanyu
Zheng Weishi
Zha Hongbin
Chen Xilin
Wang Liang
Ji Rongrong
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-40-4
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Document Analysis and Recognition -- Feature Extraction and Feature Selection -- Multimedia Analysis and Reasoning.
Record no. UNISA-996587869003316
Located at: Univ. di Salerno
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part X / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Author Liu Qingshan
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (509 pages)
Discipline 006
Other authors (Persons) Wang Hanzi
Ma Zhanyu
Zheng Weishi
Zha Hongbin
Chen Xilin
Wang Liang
Ji Rongrong
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-49-8
Format Printed material
Bibliographic level Monograph
Language of publication eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part X -- Neural Network and Deep Learning III -- Dual-Stream Context-Aware Neural Network for Survival Prediction from Whole Slide Images -- 1 Introduction -- 2 Method -- 3 Experiments and Results -- 4 Conclusion -- References -- A Multi-label Image Recognition Algorithm Based on Spatial and Semantic Correlation Interaction -- 1 Introduction -- 2 Related Work -- 2.1 Correlation-Agnostic Algorithms -- 2.2 Spatial Correlation Algorithms -- 2.3 Semantic Correlation Algorithms -- 3 Methodology -- 3.1 Definition of Multi-label Image Recognition -- 3.2 The Framework of SSCI -- 3.3 Loss Function -- 4 Experiments -- 4.1 Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Other Mainstream Algorithms -- 4.4 Evaluation of the SSCI Effectiveness -- 5 Conclusion -- References -- Hierarchical Spatial-Temporal Network for Skeleton-Based Temporal Action Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Action Segmentation -- 2.2 Skeleton-Based Action Recognition -- 3 Method -- 3.1 Network Architecture -- 3.2 Multi-Branch Transfer Fusion Module -- 3.3 Multi-Scale Temporal Convolution Module -- 3.4 Loss Function -- 4 Experiments -- 4.1 Setup -- 4.2 Effect of Hierarchical Model -- 4.3 Effect of Multiple Modalties -- 4.4 Effect of Multi-modal Fusion Methods -- 4.5 Effect of Multi-Scale Temporal Convolution -- 4.6 Comparision with State-of-the-Art -- 5 Conclusion -- References -- Multi-behavior Enhanced Graph Neural Networks for Social Recommendation -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Methodology -- 4.1 Embedding Layer -- 4.2 Propagation Layer -- 4.3 Multi-behavior Integration Layer -- 4.4 Prediction Layer -- 4.5 Model Training -- 5 Experiments -- 5.1 Experimental Settings -- 5.2 Performance Comparison (RQ1) -- 5.3 Ablation Study (RQ2).
5.4 Parameter Analysis (RQ3) -- 6 Conclusion and Future Work -- References -- A Complex-Valued Neural Network Based Robust Image Compression -- 1 Introduction -- 2 Related Works -- 2.1 Neural Image Compression -- 2.2 Adversarial Attack -- 2.3 Complex-Valued Convolutional Neural Networks -- 3 Proposed Method -- 3.1 Overall Framework -- 3.2 Nonlinear Transform -- 4 Experiment Results -- 4.1 Experiment Setup -- 4.2 Results and Comparison -- 4.3 Ablation Study -- 5 Conclusions -- References -- Binarizing Super-Resolution Neural Network Without Batch Normalization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Batch Normalization in SR Models -- 3.2 Channel-Wise Asymmetric Binarizer for Activations -- 3.3 Smoothness-Controlled Estimator -- 4 Experimentation -- 4.1 Experiment Setup -- 4.2 Ablation Study -- 4.3 Visualization -- 5 Conclusion -- References -- Infrared and Visible Image Fusion via Test-Time Training -- 1 Introduction -- 2 Method -- 2.1 Overall Framework -- 2.2 Training and Testing -- 3 Experiments -- 3.1 Experiment Configuration -- 3.2 Performance Comparison on TNO -- 3.3 Performance Comparison on VIFB -- 3.4 Ablation Study -- 4 Conclusion -- References -- Graph-Based Dependency-Aware Non-Intrusive Load Monitoring -- 1 Introduction -- 2 Proposed Method -- 2.1 Problem Formulation -- 2.2 Co-occurrence Probability Graph -- 2.3 Graph Structure Learning -- 2.4 Graph Attention Neural Network -- 2.5 Encoder-Decoder Module -- 3 Numerical Studies and Discussions -- 3.1 Dataset and Experiment Setup -- 3.2 Metrics and Comparisons -- 4 Conclusion -- References -- Few-Shot Object Detection via Classify-Free RPN -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Object Detection -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Analysis of the Base Class Bias Issue in RPN -- 3.3 Classify-Free RPN.
4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparison with the State-of-the-Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- IPFR: Identity-Preserving Face Reenactment with Enhanced Domain Adversarial Training and Multi-level Identity Priors -- 1 Introduction -- 2 Methods -- 2.1 Target Motion Encoder and 3D Shape Encoder -- 2.2 3D Shape-Aware Warping Module -- 2.3 Identity-Aware Refining Module -- 2.4 Enhanced Domain Discriminator -- 2.5 Training -- 3 Experiment -- 3.1 Experimental Setup -- 3.2 Comparisons -- 3.3 Ablation Study -- 4 Limitation -- 5 Conclusion -- References -- L2MNet: Enhancing Continual Semantic Segmentation with Mask Matching -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries and Revisiting -- 3.2 Proposed Learn-to-Match Framework -- 3.3 Training Loss -- 4 Experiments -- 4.1 Experimental Setting -- 4.2 Quantitative Evaluation -- 4.3 Ablation Study -- 5 Conclusion -- References -- Adaptive Channel Pruning for Trainability Protection -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Method Framework and Motivation -- 3.2 Channel Similarity Calculation and Trainability Preservation -- 3.3 Sparse Control and Optimization -- 4 Experiments -- 4.1 Experiments Settings and Evaluation Metrics -- 4.2 Results on Imagenet -- 4.3 Results on Cifar-10 -- 4.4 Results on YOLOX-s -- 4.5 Ablation -- 5 Conclusion -- References -- Exploiting Adaptive Crop and Deformable Convolution for Road Damage Detection -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Adaptive Image Cropping Based on Vanishing Point Estimation -- 3.2 Feature Learning with Deformable Convolution -- 3.3 Diagonal Intersection over Union Loss Function -- 4 Experiment -- 4.1 Comparative Analysis of Different Datasets -- 4.2 Ablation Analysis -- 5 Conclusion -- References -- Cascaded-Scoring Tracklet Matching for Multi-object Tracking.
1 Introduction -- 2 Related Work -- 2.1 Tracking by Detection -- 2.2 Joint Detection and Tracking -- 3 Proposed Method -- 3.1 Cascaded-Scoring Tracklet Matching -- 3.2 Motion-Guided Based Target Aware -- 3.3 Appearance-Assisted Feature Warper -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Ablation Studies -- 4.3 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Boosting Generalization Performance in Person Re-identification -- 1 Introduction -- 2 Related Work -- 2.1 Generalizable Person ReID -- 2.2 Vision-Language Learning -- 3 Method -- 3.1 Review of CLIP -- 3.2 A Novel Cross-Modal Framework -- 3.3 Prompt Design Process -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets and Evaluation Protocols -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with State-of-the-Art Methods -- 4.5 Other Analysis -- 5 Conclusion -- References -- Self-guided Transformer for Video Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Video Super-Resolution -- 2.2 Vision Transformers -- 3 Our Method -- 3.1 Network Overview -- 3.2 Multi-headed Self-attention Module Based on Offset-Guided Window (OGW-MSA) -- 3.3 Feature Aggregation (FA) -- 4 Experiments -- 4.1 Datasets and Experimental Settings -- 4.2 Comparisons with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- SAMP: Sub-task Aware Model Pruning with Layer-Wise Channel Balancing for Person Search -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Framework Overview -- 3.2 Sub-task Aware Channel Importance Estimation -- 3.3 Layer-Wise Channel Balancing -- 3.4 Adaptive OIM Loss for Model Pruning and Finetuning -- 4 Experimental Results and Analysis -- 4.1 Dataset and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Comparison with the State-of-the-Art Approaches -- 4.4 Ablation Study -- 5 Conclusion.
References -- MKB: Multi-Kernel Bures Metric for Nighttime Aerial Tracking -- 1 Introduction -- 2 Methodology -- 2.1 Kernel Bures Metric -- 2.2 Multi-Kernel Bures Metric -- 2.3 Objective Loss -- 3 Experiments -- 3.1 Implementation Details -- 3.2 Evaluation Datasets -- 3.3 Comparison Results -- 3.4 Visualization -- 3.5 Ablation Study -- 4 Conclusion -- References -- Deep Arbitrary-Scale Unfolding Network for Color-Guided Depth Map Super-Resolution -- 1 Introduction -- 2 The Proposed Method -- 2.1 Problem Formulation -- 2.2 Algorithm Unfolding -- 2.3 Continuous Up-Sampling Fusion (CUSF) -- 2.4 Loss Function -- 3 Experimental Results -- 3.1 Implementation Details -- 3.2 The Quality Comparison of Different DSR Methods -- 3.3 Ablation Study -- 4 Conclusion -- References -- SSDD-Net: A Lightweight and Efficient Deep Learning Model for Steel Surface Defect Detection -- 1 Introduction -- 2 Methods -- 2.1 LMFE: Light Multiscale Feature Extraction Module -- 2.2 SEFF: Simple Effective Feature Fusion Network -- 2.3 SSDD-Net -- 3 Experiments and Analysis -- 3.1 Implementation Details -- 3.2 Evaluation Metrics -- 3.3 Dataset -- 3.4 Ablation Studies -- 3.5 Comparison with Other SOTA Methods -- 3.6 Comprehensive Performance of SSDD-Net -- 4 Conclusion -- References -- Effective Small Ship Detection with Enhanced-YOLOv7 -- 1 Introduction -- 2 Method -- 2.1 Small Object-Aware Feature Extraction Module (SOAFE) -- 2.2 Small Object-Friendly Scale-Insensitive Regression Scheme (SOFSIR) -- 2.3 Geometric Constraint-Based Non-Maximum Suppression Method (GCNMS) -- 3 Experiments -- 3.1 Experimental Settings -- 3.2 Quantitative Analysis -- 3.3 Ablation Studies -- 3.4 Qualitative Analysis -- 4 Conclusion -- References -- PiDiNeXt: An Efficient Edge Detector Based on Parallel Pixel Difference Networks -- 1 Introduction -- 2 Related Work.
2.1 The Development of Deep Learning Based Edge Detection.
Record no. UNISA-996587868803316
Located at: Univ. di Salerno
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XIII / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Author Liu Qingshan
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (524 pages)
Discipline 006
Other authors (Persons) Wang Hanzi
Ma Zhanyu
Zheng Weishi
Zha Hongbin
Chen Xilin
Wang Liang
Ji Rongrong
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-58-7
Format Printed material
Bibliographic level Monograph
Language of publication eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part XIII -- Medical Image Processing and Analysis -- Growth Simulation Network for Polyp Segmentation -- 1 Introduction -- 2 The Proposed Method -- 2.1 Gaussian Map and Body Map -- 2.2 Overall Architecture -- 2.3 Features Extraction and Fusion Module -- 2.4 Dynamic Attention Guidance Module -- 2.5 Dynamic Simulation Loss -- 3 Experiments -- 3.1 Settings -- 3.2 Comparisons with State-of-the-art -- 3.3 Ablation Study -- 4 Conclusion -- References -- Brain Diffuser: An End-to-End Brain Image to Brain Network Pipeline -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Feature Extraction Module -- 3.2 Brain Diffuser -- 3.3 GCN Classifier -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset and Preprocessing -- 4.2 Experiment Configuration -- 4.3 Results and Discussion -- 5 Conclusion -- References -- CCJ-SLC: A Skin Lesion Image Classification Method Based on Contrastive Clustering and Jigsaw Puzzle -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview of Our Method -- 3.2 Contrastive Clustering -- 3.3 Jigsaw Puzzle -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Baseline Performance -- 4.3 Ablation Experiment -- 4.4 Analysis -- 5 Conclusion -- References -- A Real-Time Network for Fast Breast Lesion Detection in Ultrasound Videos -- 1 Introduction -- 2 Method -- 2.1 Space Time Feature Aggregation (STA) Module -- 3 Experiments and Results -- 3.1 Comparisons with State-of-the-Arts -- 3.2 Ablation Study -- 3.3 Generalizability of Our Network -- 4 Conclusion -- References -- CBAV-Loss: Crossover and Branch Losses for Artery-Vein Segmentation in OCTA Images -- 1 Introduction -- 2 Methods -- 2.1 Overview -- 2.2 Crossover Loss and Branch Loss -- 2.3 Loss Function -- 3 Experiments -- 3.1 Data -- 3.2 Experimental Settings -- 3.3 Evaluation Metrics.
3.4 Ablation Study on CBAV-Loss -- 3.5 Influence of the Proposed Loss on Different Segmentation Networks -- 4 Conclusion -- References -- Leveraging Data Correlations for Skin Lesion Classification -- 1 Introduction -- 2 Related Work -- 2.1 Skin Lesion Classification -- 2.2 Correlation Mining -- 3 Methodology -- 3.1 Feature Enhancement Stage -- 3.2 Label Distribution Learning Stage -- 4 Experiments -- 4.1 Experiment Settings -- 4.2 Hyper Parameters Setting -- 4.3 Comparison with State-of-the-Art Methods -- 4.4 Ablation Studies -- 5 Conclusion -- References -- CheXNet: Combing Transformer and CNN for Thorax Disease Diagnosis from Chest X-ray Images -- 1 Introduction -- 2 Related Work -- 2.1 Label Dependency and Imbalance -- 2.2 Extensive Lesion Location -- 3 Approaches -- 3.1 Label Embedding and MSP Block -- 3.2 Inner Branch -- 3.3 C2T and T2C in IIM -- 4 Experiments -- 4.1 Dataset -- 4.2 Comparison to the State-of-the-Arts -- 4.3 Ablation Study -- 5 Conclusion -- References -- Cross Attention Multi Scale CNN-Transformer Hybrid Encoder Is General Medical Image Learner -- 1 Introduction -- 2 Methods -- 2.1 Dual Encoder -- 2.2 Shallow Fusion Module -- 2.3 Deep Fusion Module -- 2.4 Deep Supervision -- 3 Experiments and Results -- 3.1 Dateset -- 3.2 Implementation Details -- 3.3 Comparison with Other Methods -- 3.4 Ablation Studies -- 4 Conclusion -- References -- Weakly/Semi-supervised Left Ventricle Segmentation in 2D Echocardiography with Uncertain Region-Aware Contrastive Learning -- 1 Introduction -- 2 Methods -- 2.1 Multi-level Regularization of Semi-supervision -- 2.2 Uncertain Region-Aware Contrastive Learning -- 2.3 Differentiable Ejection Fraction Estimation of Weak Supervision -- 3 Datasets and Implementation Details -- 4 Results -- 5 Conclusion -- References.
Spatial-Temporal Graph Convolutional Network for Insomnia Classification via Brain Functional Connectivity Imaging of rs-fMRI -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Data Preprocessing -- 3.2 Data Augmentation -- 3.3 Construction of Spatio-Temporal Graph -- 3.4 Spatio-Temporal Graph Convolution (ST-GC) -- 3.5 ST-GCN Building -- 3.6 Edge Importance Learning -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Analysis of Different Sliding Window Step Size -- 4.4 Comparison with Other Methods -- 5 Conclusion -- References -- Probability-Based Nuclei Detection and Critical-Region Guided Instance Segmentation -- 1 Introduction -- 2 Related Works on Nucleus Instance Segmentation -- 2.1 Bounding Box-Based Methods -- 2.2 Boundary-Based Methods -- 2.3 Critical Region-Based Methods -- 3 CGIS Method and CPF Feature -- 3.1 Critical-Region Guided Instance Segmentation -- 3.2 Central Probability Field -- 3.3 Nuclear Classification -- 4 Experimental Verification and Analysis -- 4.1 Datasets and Evaluation Metrics -- 4.2 Parameters and Implementation Details -- 4.3 Comparisons with Other Methods -- 4.4 Ablation Study -- 5 Conclusion -- References -- FlashViT: A Flash Vision Transformer with Large-Scale Token Merging for Congenital Heart Disease Detection -- 1 Introduction -- 2 Method -- 2.1 Overview -- 2.2 FlashViT Block -- 2.3 Large-Scale Token Merging Module -- 2.4 Architecture Variants -- 3 Experiments -- 3.1 CHD Dataset -- 3.2 Evaluations on CHD Dataset -- 3.3 Homogenous Pre-training Strategy -- 3.4 Ablation Study -- 4 Conclusion -- References -- Semi-supervised Retinal Vessel Segmentation Through Point Consistency -- 1 Introduction -- 2 Method -- 2.1 Segmentation Module -- 2.2 Point Consistency Module -- 2.3 Semi-supervised Training Through Point Consistency -- 3 Experiments -- 3.1 Datasets -- 3.2 Implementation Details.
3.3 Experimental Results -- 4 Conclusion -- References -- Knowledge Distillation of Attention and Residual U-Net: Transfer from Deep to Shallow Models for Medical Image Classification -- 1 Introduction -- 2 Methods -- 2.1 Res-Transformer Teacher Model Based on U-Net Structure -- 2.2 ResU-Net Student Model Incorporates Residual -- 2.3 Knowledge Distillation -- 3 Data and Experiments -- 3.1 Datasets -- 3.2 Experimental Settings -- 3.3 Results -- 4 Conclusion -- References -- Two-Stage Deep Learning Segmentation for Tiny Brain Regions -- 1 Introduction -- 2 Method -- 2.1 Overall Workflow -- 2.2 Two-Stage Segmentation Network -- 2.3 Contrast Loss Function -- 2.4 Attention Modules -- 3 Experiments -- 3.1 Dataset and Metrics -- 3.2 Comparisons Experiments -- 4 Conclusion -- References -- Encoder Activation Diffusion and Decoder Transformer Fusion Network for Medical Image Segmentation -- 1 Introduction -- 2 Methodology -- 2.1 Lightweight Convolution Modulation -- 2.2 Encoder Activation Diffusion -- 2.3 Multi-scale Decoding Fusion with Transformer -- 3 Experiments -- 3.1 Datasets -- 3.2 Implementation Details -- 3.3 Evaluation Results -- 3.4 Ablation Study -- 4 Conclusion -- References -- Liver Segmentation via Learning Cross-Modality Content-Aware Representation -- 1 Introduce -- 2 Methodology -- 2.1 Overview -- 2.2 Image-to-Image Network -- 2.3 Peer-to-Peer Network -- 3 Experiments -- 3.1 Dataset -- 3.2 Setting -- 3.3 Result -- 4 Conclusion -- References -- Semi-supervised Medical Image Segmentation Based on Multi-scale Knowledge Discovery and Multi-task Ensemble -- 1 Introduction -- 2 Related Works on SSMIS -- 3 Proposed Method -- 3.1 Multi-scale Knowledge Discovery -- 3.2 Multi-task Ensemble Strategy -- 4 Experiments and Analysis -- 4.1 Datasets and Implementation Details -- 4.2 Comparisons with State-of-the-Art Methods -- 4.3 Ablation Studies.
5 Conclusion -- References -- LATrans-Unet: Improving CNN-Transformer with Location Adaptive for Medical Image Segmentation -- 1 Introduction -- 2 Method -- 2.1 Encoder-Decoder Architecture -- 2.2 Location-Adaptive Attention -- 2.3 SimAM-Skip Structure -- 3 Experiments -- 3.1 Dataset -- 3.2 Implementation Details -- 3.3 Evaluation Results -- 3.4 Ablation Study -- 3.5 Discussion -- 4 Conclusions -- References -- Adversarial Keyword Extraction and Semantic-Spatial Feature Aggregation for Clinical Report Guided Thyroid Nodule Segmentation -- 1 Introduction -- 2 Method -- 2.1 Adversarial Keyword Extraction (AKE) -- 2.2 Semantic-Spatial Features Aggregation (SSFA) -- 2.3 The Full Objective Functions -- 3 Experiment -- 3.1 Comparison with the State-of-the-Arts -- 3.2 Ablation Study -- 3.3 Visualization of Generated Keyword Masks -- 4 Conclusion -- References -- A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging -- 1 Introduction -- 2 Related Works -- 2.1 Disease Detection Methods with CT Images -- 2.2 Classification Methods with Time Sequence Videos -- 3 Method -- 3.1 CNN-Based Slice-Level Feature Extraction -- 3.2 Prompt Encoder -- 3.3 Cross-Modality Fusion Transformer -- 4 Experiments -- 4.1 Setting and Implementation -- 4.2 Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- Assessing the Social Skills of Children with Autism Spectrum Disorder via Language-Image Pre-training Models -- 1 Introduction -- 2 Related Works -- 2.1 Behavior Signal Processing System -- 2.2 Language-Image Pre-training Models -- 3 Methodology -- 3.1 Paradigm Design -- 3.2 Language-Image Based Method -- 4 Experimental Results -- 4.1 Database -- 4.2 Results -- 4.3 Discussion -- 5 Conclusion -- References -- PPS: Semi-supervised 3D Biomedical Image Segmentation via Pyramid Pseudo-Labeling Supervision -- 1 Introduction -- 2 Method.
2.1 Overview.
Record no. UNISA-996587868703316
Located at: Univ. di Salerno
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part VIII / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (XIV, 513 p. 157 illus., 152 illus. in color.)
Discipline 006
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-43-9
Format Printed material
Bibliographic level Monograph
Language of publication eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part VIII -- Neural Network and Deep Learning I -- A Quantum-Based Attention Mechanism in Scene Text Detection -- 1 Introduction -- 2 Related Work -- 2.1 Attention Mechanism -- 2.2 Revisit Quantum-State-based Mapping -- 3 Approach -- 3.1 QSM-Based Channel Attention (QCA) Module and QSM-Based Spatial Attention (QSA) Module -- 3.2 Quantum-Based Convolutional Attention Module (QCAM) -- 3.3 Adaptive Channel Information Transfer Module (ACTM) -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Performance Comparison -- 4.3 Ablation Study -- 5 Discussion and Conclusion -- References -- NCMatch: Semi-supervised Learning with Noisy Labels via Noisy Sample Filter and Contrastive Learning -- 1 Introduction -- 2 Related Work -- 2.1 Semi-supervised Learning -- 2.2 Self-supervised Contrastive Learning -- 2.3 Learning with Noisy Labels -- 3 Method -- 3.1 Preliminaries -- 3.2 Overall Framework -- 3.3 Noisy Sample Filter (NSF) -- 3.4 Semi-supervised Contrastive Learning (SSCL) -- 4 Experiments -- 4.1 Datasets -- 4.2 Experimental for SSL -- 4.3 Experimental for SSLNL -- 4.4 Ablation Study -- 5 Conclusion -- References -- Data-Free Low-Bit Quantization via Dynamic Multi-teacher Knowledge Distillation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries -- 3.2 More Insight on 8-Bit Quantized Models -- 3.3 Dynamic Multi-teacher Knowledge Distillation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with Previous Data-Free Quantization Methods -- 4.3 Ablation Studies -- 5 Conclusion -- References -- LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Architecture of LeViT-UNet -- 3.2 LeViT as Encoder -- 3.3 CNNs as Decoder -- 4 Experiments and Results -- 4.1 Dataset -- 4.2 Implementation Details.
4.3 Experiment Results on Synapse Dataset -- 4.4 Experiment Results on ACDC Dataset -- 5 Conclusion -- References -- DUFormer: Solving Power Line Detection Task in Aerial Images Using Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer -- 2.2 Semantic Segmentation -- 3 Proposed Architecture -- 3.1 Overview -- 3.2 Double U Block (DUB) -- 3.3 Power Line Aware Block (PLAB) -- 3.4 BiscSE Block -- 3.5 Loss Function -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Comparative Experiments -- 4.3 Ablation Experiments -- 5 Conclusion -- References -- Space-Transform Margin Loss with Mixup for Long-Tailed Visual Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Mixup and Its Space Transformation -- 2.2 Long-Tailed Learning with Mixup -- 2.3 Re-balanced Loss Function Modification Methods -- 3 Method -- 3.1 Space Transformation in Mixup -- 3.2 Space-Transform Margin Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementations Details -- 4.3 Main Results -- 4.4 Feature Visualization and Analysis of STM Loss -- 4.5 Ablation Study -- 5 Conclusion -- References -- A Multi-perspective Squeeze Excitation Classifier Based on Vision Transformer for Few Shot Image Classification -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Definition -- 3.2 Meta-Training Phase -- 3.3 Meta-test Phase -- 4 Experimental Results -- 4.1 Datasets and Training Details -- 4.2 Evaluation Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- ITCNN: Incremental Learning Network Based on ITDA and Tree Hierarchical CNN -- 1 Introduction -- 2 Proposed Network -- 2.1 Network Structure -- 2.2 ITDA -- 2.3 Branch Route -- 2.4 Training Strategies -- 2.5 Optimization Strategies -- 3 Experiments and Results -- 3.1 Experiment on Classification -- 3.2 Experiment on CIL -- 4 Conclusion -- References.
Periodic-Aware Network for Fine-Grained Action Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Skeleton-Based Action Recognition -- 2.2 Periodicity Estimation of Videos -- 2.3 Squeeze and Excitation Module -- 3 Method -- 3.1 3D-CNN Backbone -- 3.2 Periodicity Feature Extraction Module -- 3.3 Periodicity Fusion Module -- 4 Experiment -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Learning Domain-Invariant Representations from Text for Domain Generalization -- 1 Introduction -- 2 Related Work -- 2.1 Domain Generalization -- 2.2 CLIP in Domain Generalization -- 3 Method -- 3.1 Problem Formulation -- 3.2 Text Regularization -- 3.3 CLIP Representations -- 4 Experiments and Results -- 4.1 Datasets and Experimental Settings -- 4.2 Comparison with Existing DG Methods -- 4.3 Ablation Study -- 5 Conclusions -- References -- TSTD:A Cross-modal Two Stages Network with New Trans-decoder for Point Cloud Semantic Segmentation -- 1 Introduction -- 2 Related Works -- 2.1 Image Transformers -- 2.2 Point Cloud Transformer -- 2.3 Joint 2D-3D Network -- 3 Method -- 3.1 Overall Architecture -- 3.2 2D-3D Backprojection -- 3.3 Trans-Decoder -- 4 Experiments -- 4.1 Dataset and Metric -- 4.2 Performance Comparison -- 4.3 Ablation Experiment -- 5 Conclusion -- References -- NeuralMAE: Data-Efficient Neural Architecture Predictor with Masked Autoencoder -- 1 Introduction -- 2 Related Work -- 2.1 Neural Architecture Performance Predictors -- 2.2 Generative Self-supervised Learning -- 3 Method -- 3.1 Overall Framework -- 3.2 Pre-training -- 3.3 Fine-Tuning -- 3.4 Multi-head Attention-Masked Transformer -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Experiments on NAS-Bench-101 -- 4.3 Experiments on NAS-Bench-201 -- 4.4 Experiments on NAS-Bench-301.
4.5 Ablation Study -- 5 Conclusion -- References -- Co-regularized Facial Age Estimation with Graph-Causal Learning -- 1 Introduction -- 2 Method -- 2.1 Problem Formulation -- 2.2 Ordinal Decision Mapping -- 2.3 Bilateral Counterfactual Pooling -- 3 Experiments -- 3.1 Datasets and Evaluation Settings -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 3.4 Performance Under Out-of-Distribution Settings -- 3.5 Qualitative Results -- 4 Conclusion -- References -- Online Distillation and Preferences Fusion for Graph Convolutional Network-Based Sequential Recommendation -- 1 Introduction -- 2 Method -- 2.1 Graph Construction -- 2.2 Collaborative Learning -- 2.3 Feature Fusion -- 3 Experiment -- 3.1 Experimental Setup -- 3.2 Experimental Results -- 3.3 Ablation Studies -- 4 Conclusion -- References -- Grassmann Graph Embedding for Few-Shot Class Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 The Proposed Method -- 3.1 Problem Definition -- 3.2 Overview -- 3.3 Grassmann Manifold Embedding -- 3.4 Graph Structure Preserving on Grassmann Manifold -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Global Variational Convolution Network for Semi-supervised Node Classification on Large-Scale Graphs -- 1 Introduction -- 2 Related Work -- 3 Proposed Methods -- 3.1 Positive Pointwise Mutual Information on Large-Scale Graphs -- 3.2 Global Variational Aggregation -- 3.3 Variational Convolution Kernels -- 4 Experiments -- 4.1 Comparison Experiments -- 4.2 Ablation Study -- 4.3 Runtime Study -- 5 Conclusion -- References -- Frequency Domain Distillation for Data-Free Quantization of Vision Transformer -- 1 Introduction -- 2 Related Work -- 2.1 Vision Transformer (ViT) -- 2.2 Network Quantization -- 3 Preliminaries -- 3.1 Quantizer.
3.2 Fast Fourier Transform (FFT) and Frequency Domain -- 4 Method -- 4.1 Our Insights -- 4.2 Frequency Domain Distillation -- 4.3 The Overall Pipeline -- 5 Experimentation -- 5.1 Comparison Experiments -- 5.2 Ablation Study -- 6 Conclusions -- References -- An ANN-Guided Approach to Task-Free Continual Learning with Spiking Neural Networks -- 1 Introduction -- 2 Related Works -- 2.1 Image Generation in SNNs -- 2.2 Continual Learning -- 3 Preliminary -- 3.1 The Referee Module: WGAN -- 3.2 The Player Module: FSVAE -- 4 Methodology -- 4.1 Problem Setting -- 4.2 Overview of Our Model -- 4.3 Adversarial Similarity Expansion -- 4.4 Precise Pruning -- 5 Experimental Results -- 5.1 Dataset Setup -- 5.2 Classification Tasks Under TFCL -- 5.3 The Impact of Different Thresholds and Buffer Sizes -- 5.4 ANN and SNN Under TFCL -- 6 Conclusion -- References -- Multi-adversarial Adaptive Transformers for Joint Multi-agent Trajectory Prediction -- 1 Introduction -- 2 Related Works -- 2.1 Multi-agent Trajectory Prediction -- 2.2 Domain Adaptation -- 3 Proposed Method -- 3.1 Encoder: Processing Multi-aspect Data -- 3.2 Decoder: Generating Multi-modal Trajectories -- 3.3 Adaptation: Learning Doamin Invaint Feature -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset -- 4.2 Problem Setting -- 4.3 Evaluation Metrics -- 4.4 Implementation Details -- 4.5 Quantitative Analysis -- 4.6 Ablation Study -- 5 Conclusion -- References -- Enhancing Open-Set Object Detection via Uncertainty-Boxes Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminary -- 3.2 Baseline Setup -- 3.3 Pseudo Proposal Advisor -- 3.4 Uncertainty-Box Detection -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Comparison with Other Methods -- 4.3 Ablation Studies -- 4.4 Visualization and Qualitative Analysis -- 5 Conclusions -- References.
Interventional Supervised Learning for Person Re-identification.
Record no. UNINA-9910799218703321
Located at: Univ. Federico II
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part III / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (XIV, 521 p. 179 illus., 174 illus. in color.)
Discipline 006
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9984-35-1
Format Printed material
Bibliographic level Monograph
Language of publication eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part III -- Machine Learning -- Loss Filtering Factor for Crowd Counting -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Background and Motivation -- 3.2 Loss Filtering Factor -- 4 Experiments -- 4.1 Evaluation Metrics -- 4.2 Datasets -- 4.3 Neural Network Model -- 4.4 Experimental Evaluations -- 4.5 Key Issues and Discussion -- 5 Conclusions and Future Work -- References -- Classifier Decoupled Training for Black-Box Unsupervised Domain Adaptation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Definition -- 3.2 Overall Framework -- 3.3 Classifier Decoupled Training (CDT) -- 3.4 ETP-Entropy Sampling -- 4 Experiment -- 4.1 Setup -- 4.2 Performance Comparison -- 4.3 Analysis -- 5 Conclusion -- References -- Unsupervised Concept Drift Detection via Imbalanced Cluster Discriminator Learning -- 1 Introduction -- 2 Related Works -- 2.1 Concept Drift Detection -- 2.2 Imbalance Data Clustering -- 3 Propose Method -- 3.1 Imbalanced Distribution Learning -- 3.2 Multi-cluster Descriptor Training -- 3.3 Concept Drift Detection Based on MCD -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparative Results -- 4.3 Ablation Study -- 4.4 Study on Imbalance Rate and Drift Severity -- 5 Conclusion -- References -- Unsupervised Domain Adaptation for Optical Flow Estimation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Overview -- 3.2 Domain Adaptive Autoencoder -- 3.3 Incorporating with RAFT -- 3.4 Overall Objective -- 3.5 Network Architecture -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Experiment Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- Continuous Exploration via Multiple Perspectives in Sparse Reward Environment -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Continuous Exploration via Multiple Perspectives -- 3.2 Global Reward Model.
3.3 Local Reward Model -- 4 Experiment -- 4.1 Comparison Algorithms and Evaluation Metrics -- 4.2 Network Architectures and Hyperparameters -- 4.3 Experimental Results -- 5 Conclusion -- References -- Network Transplanting for the Functionally Modular Architecture -- 1 Introduction -- 2 Related Work -- 3 Network Transplanting -- 3.1 Space-Projection Problem of Standard Distillation and Jacobian Distillation -- 3.2 Solution: Learning with Back-Distillation -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Experimental Results and Analysis -- 5 Conclusion -- References -- TiAM-GAN: Titanium Alloy Microstructure Image Generation Network -- 1 Introduction -- 2 Related Work -- 2.1 Image Generation -- 2.2 Mixture Density Network -- 3 Method -- 3.1 Framework -- 3.2 Feature-Fusion CcGAN -- 3.3 Feature-Extraction-Mapping GAN -- 4 Experiments -- 4.1 Dataset -- 4.2 Metric -- 4.3 Comparison Experiment -- 4.4 Ablation Experiment -- 5 Conclusion -- References -- A Robust Detection and Correction Framework for GNN-Based Vertical Federated Learning -- 1 Introduction -- 2 Related Works -- 2.1 Attack and Defense in Graph Neural Networks -- 2.2 Attack and Defense in Vertical Federated Learning -- 2.3 Attack and Defense in GNN-Based Vertical Federated Learning -- 3 Methodology -- 3.1 GNN-Based Vertical Federated Learning -- 3.2 Threat Model -- 3.3 Framework Overview -- 3.4 Malicious Participant Detection -- 3.5 Malicious Embedding Correction -- 4 Experiment -- 4.1 Experiment Settings -- 4.2 Detection Performance(RQ1) -- 4.3 Defense Performance(RQ2-RQ3) -- 5 Conclusion -- References -- QEA-Net: Quantum-Effects-based Attention Networks -- 1 Introduction -- 2 Related Works -- 2.1 Revisiting QSM -- 2.2 Attention Mechanism in CNNs -- 3 Quantum-Effects-based Attention Networks -- 3.1 Spatial Attention Module Based on Quantum Effects -- 4 Experiments.
4.1 Implementation Details -- 4.2 Comparisons Using Different Deep CNNs -- 4.3 Comparisons Using MLP-Mixer -- 5 Conclusion -- References -- Learning Scene Graph for Better Cross-Domain Image Captioning -- 1 Introduction -- 2 Related Work -- 2.1 Scene Graph -- 2.2 Cross Domain Image Captioning -- 3 Methods -- 3.1 The Principle of SGCDIC -- 3.2 Parameters Updating -- 3.3 Object Geometry Consistency Losses and Semantic Similarity -- 4 Experiments and Results Analysis -- 4.1 Datasets and Implementation Details -- 4.2 Quantitative Comparison -- 4.3 Qualitative Comparison -- 5 Conclusion -- References -- Enhancing Rule Learning on Knowledge Graphs Through Joint Ontology and Instance Guidance -- 1 Introduction -- 2 Related Work -- 2.1 Reasoning with Embeddings -- 2.2 Reasoning with Rules -- 3 Methodology -- 3.1 Framework Details -- 4 Experiment -- 4.1 Experiment Settings -- 4.2 Results -- 5 Conclusions -- References -- Explore Across-Dimensional Feature Correlations for Few-Shot Learning -- 1 Introduction -- 2 Related Work -- 2.1 Few-Shot Learning -- 2.2 Attention Mechanisms in Few-Shot Learning -- 3 Methodology -- 3.1 Preliminary -- 3.2 Overall Framework -- 3.3 Three-Dimensional Offset Position Encoding (TOPE) -- 3.4 Across-Dimensional Attention (ADA) -- 3.5 Learning Object -- 4 Experiments -- 4.1 Experiment Setup -- 4.2 Comparison with State-of-the-art -- 4.3 Ablation Studies -- 4.4 Convergence Analysis -- 4.5 Visualization -- 5 Conclusion -- References -- Pairwise-Emotion Data Distribution Smoothing for Emotion Recognition -- 1 Introduction -- 2 Method -- 2.1 Pairwise Data Distribution Smoothing -- 2.2 CLTNet -- 3 Experiment -- 3.1 Dataset and Evaluation Metrics -- 3.2 Implementation Details -- 3.3 Validation Experiment -- 3.4 Ablation Study -- 4 Conclusion -- References -- SIEFusion: Infrared and Visible Image Fusion via Semantic Information Enhancement.
1 Introduction -- 2 Method -- 2.1 Problem Formulation -- 2.2 Network Architecture -- 2.3 Loss Function -- 3 Experiments -- 3.1 Experimental Configurations -- 3.2 Results and Analysis -- 3.3 Ablation Study -- 3.4 Segmentation Performance -- 4 Conclusion -- References -- DeepChrom: A Diffusion-Based Framework for Long-Tailed Chromatin State Prediction -- 1 Introduction -- 2 Related Work -- 2.1 Chromatin State Prediction -- 2.2 Long-Tailed Learning -- 3 Methods -- 3.1 Methodology Overview -- 3.2 Pseudo Sequences Generation -- 3.3 Chromatin State Prediction -- 3.4 Equalization Loss -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Effectiveness of Our Proposed Long-Tailed Learning Methods -- 4.3 Ablation Study -- 5 Conclusion and Discussion -- References -- Adaptable Conservative Q-Learning for Offline Reinforcement Learning -- 1 Introduction -- 2 Related Work -- 3 Prelinminaries -- 4 Methodology -- 4.1 Adaptable Conservative Q-Learning -- 4.2 Variants and Practical Object -- 4.3 Implementation Settings -- 5 Experiments -- 5.1 Experimental Details -- 5.2 Q-Value Distribution and Effect of the Percentile -- 5.3 Deep Offline RL Benchmarks -- 5.4 Ablation Study -- 6 Conclusion -- References -- Boosting Out-of-Distribution Detection with Sample Weighting -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Weighted Distance as Score Function -- 3.3 Contrastive Training for OOD Detection -- 4 Experiment -- 4.1 Common Setup -- 4.2 Main Results -- 4.3 Ablation Studies -- 5 Conclusion -- References -- Causal Discovery via the Subsample Based Reward and Punishment Mechanism -- 1 Introduction -- 2 Related Work -- 3 Introduction to the Algorithmic Framework -- 3.1 Introduction to SRPM Method -- 3.2 Correlation Measures and Hypothesis Testing -- 3.3 Skeleton Discovery Algorithm Based on SRPM Method (SRPM-SK).
4 Experimental Results and Analysis -- 4.1 Benchmark Network and Data Sets -- 4.2 High Dimensional Network Analysis -- 4.3 Real Data Analysis -- 5 Conclusion and Outlook -- References -- Local Neighbor Propagation Embedding -- 1 Introduction -- 2 Related Work -- 3 Local Neighbor Propagation Embedding -- 3.1 Motivation -- 3.2 Mathematical Background -- 3.3 Local Neighbor Propagation Framework -- 3.4 Computational Complexity -- 4 Experimental Results -- 4.1 Synthetic Datasets -- 4.2 Real-World Datasets -- 5 Conclusion -- References -- Inter-class Sparsity Based Non-negative Transition Sub-space Learning -- 1 Introduction -- 2 Related Work -- 2.1 Notations -- 2.2 StLSR -- 2.3 ICS_DLSR -- 2.4 SN-TSL -- 3 The Proposed Method -- 3.1 Problem Formulation and Learning Model -- 3.2 Solution to ICSN-TSL -- 3.3 Classification -- 3.4 Computational Time Complexity -- 3.5 Convergence Analysis -- 4 Experiments and Analysis -- 4.1 Data Sets -- 4.2 Experimental Results and Analysis -- 4.3 Parameter Sensitivity Analysis -- 4.4 Ablation Study -- 5 Conclusion -- References -- Incremental Learning Based on Dual-Branch Network -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Description -- 3.2 Baseline Method -- 3.3 Model Extension -- 3.4 Two Distillation -- 3.5 Two Stage Training -- 3.6 Sample Update Policy -- 4 Experience -- 4.1 Baseline Result -- 4.2 Result on Imagenet100 -- 5 Conclusion -- References -- Inter-image Discrepancy Knowledge Distillation for Semantic Segmentation -- 1 Introduction -- 2 Method -- 2.1 Notations -- 2.2 Overview -- 2.3 Attention Discrepancy Distillation -- 2.4 Soft Probability Distillation -- 2.5 Optimization -- 3 Experiments -- 3.1 Datasets and Setups -- 3.2 Comparisons with Recent Methods -- 3.3 Ablation Studies -- 4 Conclusion -- References -- Vision Problems in Robotics, Autonomous Driving.
Cascaded Bilinear Mapping Collaborative Hybrid Attention Modality Fusion Model.
Record no. UNINA-9910799221603321
Located at: Univ. Federico II
Pattern Recognition and Computer Vision [electronic resource] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part I / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edition [1st ed. 2024.]
Publication/distribution Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (XIV, 513 p. 159 illus., 152 illus. in color.)
Discipline 006
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9984-29-7
Format Printed material
Bibliographic level Monograph
Language of publication eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part I -- Action Recognition -- Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion Based Classification -- 1 Introduction -- 2 Related Work -- 3 Our Proposed Approach -- 3.1 Overview -- 3.2 Network Architecture -- 4 Experiment -- 4.1 Dataset and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Comparison with Other SOTA Algorithms -- 4.4 Ablation Study -- 4.5 Parameter Analysis -- 5 Conclusion -- References -- Multi-scale Dilated Attention Graph Convolutional Network for Skeleton-Based Action Recognition -- 1 Introduction -- 2 Related Works -- 2.1 Attention Mechanism -- 2.2 Lightweight Models -- 3 Method -- 3.1 Multi-Branch Fusion Module -- 3.2 Semantic Information -- 3.3 Graph Convolution Module -- 3.4 Time Convolution Module -- 4 Experiment -- 4.1 Dataset -- 4.2 Experimental Details -- 4.3 Ablation Experiment -- 4.4 Comparison with State-of-the-Art -- 5 Action Visualization -- 6 Conclusion -- References -- Auto-Learning-GCN: An Ingenious Framework for Skeleton-Based Action Recognition -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 GCN-Based Skeleton Processing -- 3.2 The AL-GCN Module -- 3.3 The Attention Correction and Jump Model -- 3.4 Multi-stream Gaussian Weight Selection Algorithm -- 4 Experimental Results and Analysis -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Compared with the State-of-the-Art Methods -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Skeleton-Based Action Recognition with Combined Part-Wise Topology Graph Convolutional Networks -- 1 Introduction -- 2 Related Work -- 2.1 Skeleton-Based Action Recognition -- 2.2 Partial Graph Convolution in Skeleton-Based Action Recognition -- 3 Methods -- 3.1 Preliminaries -- 3.2 Part-Wise Spatial Modeling -- 3.3 Part-Wise Spatio-Temporal Modeling.
3.4 Model Architecture -- 4 Experiments -- 4.1 Datasets -- 4.2 Training Details -- 4.3 Ablation Studies -- 4.4 Comparison with the State-of-the-Art -- 5 Conclusion -- References -- Segmenting Key Clues to Induce Human-Object Interaction Detection -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Key Features Segmentation-Based Module -- 3.2 Key Features Learning Encoder -- 3.3 Spatial Relationships Learning Graph-Based Module -- 3.4 Training and Inference -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Implementation Results -- 4.3 Ablation Study -- 4.4 Qualitative Results -- 5 Conclusion -- References -- Lightweight Multispectral Skeleton and Multi-stream Graph Attention Networks for Enhanced Action Prediction with Multiple Modalities -- 1 Introduction -- 2 Related Work -- 2.1 Skeleton-Based Action Recognition -- 2.2 Dynamic Graph Neural Network -- 3 Methods -- 3.1 Spatial Embedding Component -- 3.2 Temporal Embedding Component -- 3.3 Action Prediction -- 4 Experiments and Discussion -- 4.1 NTU RGB+D Dataset -- 4.2 Experiments Setting -- 4.3 Evaluation of Human Action Recognition -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Spatio-Temporal Self-supervision for Few-Shot Action Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Few-Shot Action Recognition -- 2.2 Self-supervised Learning (SSL)-Based Few-Shot Learning -- 3 Method -- 3.1 Problem Definition -- 3.2 Spatio-Temporal Self-supervision Framework -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Comparison with State-of-the-Art Methods -- 4.3 Ablation Studies -- 5 Conclusions -- References -- A Fuzzy Error Based Fine-Tune Method for Spatio-Temporal Recognition Model -- 1 Introduction -- 2 Related Work -- 2.1 Spatio-Temporal (3D) Convolution Networks -- 2.2 Clips Selection and Features Aggregation -- 3 Proposed Method -- 3.1 Problem Definition.
3.2 Fuzzy Target -- 3.3 Fine Tune Loss Function -- 4 Experiment -- 4.1 Datasets and Implementation Details -- 4.2 Performance Comparison -- 4.3 Discussion -- 5 Conclusion -- References -- Temporal-Channel Topology Enhanced Network for Skeleton-Based Action Recognition -- 1 Introduction -- 2 Proposed Method -- 2.1 Network Architecture -- 2.2 Temporal-Channel Focus Module -- 2.3 Dynamic Channel Topology Attention Module -- 3 Experiments -- 3.1 Datasets and Implementation Details -- 3.2 Ablation Study -- 3.3 Comparison with the State-of-the-Art -- 4 Conclusion -- References -- HFGCN-Based Action Recognition System for Figure Skating -- 1 Introduction -- 2 Figure Skating Hierarchical Dataset -- 3 Figure Skating Action Recognition System -- 3.1 Data Preprocessing -- 3.2 Multi-stream Generation -- 3.3 Hierarchical Fine-Grained Graph Convolutional Neural Network (HFGCN) -- 3.4 Decision Fusion Module -- 4 Experiments and Results -- 4.1 Experimental Environment -- 4.2 Experiment Results and Analysis -- 5 Conclusion -- References -- Multi-modal Information Processing -- Image Priors Assisted Pre-training for Point Cloud Shape Analysis -- 1 Introduction -- 2 Proposed Method -- 2.1 Problem Setting -- 2.2 Overview Framework -- 2.3 Multi-task Cross-Modal SSL -- 2.4 Objective Function -- 3 Experiments and Analysis -- 3.1 Pre-training Setup -- 3.2 Downstream Tasks -- 3.3 Ablation Study -- 4 Conclusion -- References -- AMM-GAN: Attribute-Matching Memory for Person Text-to-Image Generation -- 1 Introduction -- 2 Related Work -- 2.1 Text-to-image Generative Adversarial Network -- 2.2 GANs for Person Image -- 3 Method -- 3.1 Feature Extraction -- 3.2 Multi-scale Feature Fusion Generator -- 3.3 Real-Result-Driven Discriminator -- 3.4 Objective Functions -- 4 Experiment -- 4.1 Dataset -- 4.2 Implementation -- 4.3 Evaluation Metrics -- 4.4 Quantitative Evaluation.
4.5 Qualitative Evaluation -- 4.6 Ablation Study -- 5 Conclusion -- References -- RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Preliminaries -- 3.2 Model Architecture -- 3.3 Training Objectives -- 4 Experimental Setup -- 4.1 Dataset -- 4.2 Baselines -- 4.3 Evaluation Metric -- 4.4 Implementation Details -- 5 Results and Analysis -- 5.1 Main Results -- 5.2 Ablation Study -- 5.3 Attention Visualization -- 6 Conclusion -- References -- KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing -- 1 Introduction -- 2 Background -- 2.1 Text-to-Image Generation and Editing -- 2.2 Stable Diffusion Model -- 3 KV Inversion: Training-Free KV Embeddings Learning -- 3.1 Task Setting and Reason of Existing Problem -- 3.2 KV Inversion Overview -- 4 Experiments -- 4.1 Comparisons with Other Concurrent Works -- 4.2 Ablation Study -- 5 Limitations and Conclusion -- References -- Enhancing Text-Image Person Retrieval Through Nuances Varied Sample -- 1 Introduction -- 2 Relataed Work -- 2.1 Text-Image Retrieval -- 2.2 Text-Image Person Retrieval -- 3 Method -- 3.1 Feature Extraction and Alignment -- 3.2 Nuanced Variation Module -- 3.3 Image Text Matching Loss -- 3.4 Hard Negative Metric Loss -- 4 Experiment -- 4.1 Datasets and Evaluation Setting -- 4.2 Comparison with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- Unsupervised Prototype Adapter for Vision-Language Models -- 1 Introduction -- 2 Related Work -- 2.1 Large-Scale Pre-trained Vision-Language Models -- 2.2 Adaptation Methods for Vision-Language Models -- 2.3 Self-training with Pseudo-Labeling -- 3 Method -- 3.1 Background -- 3.2 Unsupervised Prototype Adapter -- 4 Experiments -- 4.1 Image Recognition -- 4.2 Domain Generalization.
4.3 Ablation Study -- 5 Conclusion -- References -- Multimodal Causal Relations Enhanced CLIP for Image-to-Text Retrieval -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Overview -- 3.2 MCD: Multimodal Causal Discovery -- 3.3 MMC-CLIP -- 3.4 Image-Text Alignment -- 4 Experiments -- 4.1 Datasets and Settings -- 4.2 Results on MSCOCO -- 4.3 Results on Flickr30K -- 4.4 Ablation Studies -- 5 Conclusion -- References -- Exploring Cross-Modal Inconsistency in Entities and Emotions for Multimodal Fake News Detection -- 1 Introduction -- 2 Related Works -- 2.1 Single-Modality Fake News Detection -- 2.2 Multimodal Fake News Detection -- 3 Methodology -- 3.1 Feature Extraction -- 3.2 Cross-Modal Contrastive Learning -- 3.3 Entity Consistency Learning -- 3.4 Emotional Consistency Learning -- 3.5 Multimodal Fake News Detector -- 4 Experiments -- 4.1 Experimental Configurations -- 4.2 Overall Performance -- 4.3 Ablation Studies -- 5 Conclusion -- References -- Deep Consistency Preserving Network for Unsupervised Cross-Modal Hashing -- 1 Introduction -- 2 The Proposed Method -- 2.1 Problem Definition -- 2.2 Deep Feature Extraction and Hashing Learning -- 2.3 Features Fusion and Similarity Matrix Construction -- 2.4 Hash Code Fusion and Reconstruction -- 2.5 Objective Function -- 3 Experiments -- 3.1 Datasets and Baselines -- 3.2 Implementation Details -- 3.3 Results and Analysis -- 4 Conclusion -- References -- Learning Adapters for Text-Guided Portrait Stylization with Pretrained Diffusion Models -- 1 Introduction -- 2 Related Work -- 2.1 Text-to-Image Diffusion Models -- 2.2 Control of Pretrained Diffusion Model -- 2.3 Text-Guided Portrait Stylizing -- 3 Method -- 3.1 Background and Preliminaries -- 3.2 Overview of Our Method -- 3.3 Portrait Stylization with Text Prompt -- 3.4 Convolution Adapter -- 3.5 Adapter Optimization -- 4 Experiments.
4.1 Implementation Settings.
Record Nr. UNINA-9910799221303321
Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Materiale a stampa
Lo trovi qui: Univ. Federico II
Opac: Controlla la disponibilità qui
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part II / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part II / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edizione [1st ed. 2024.]
Pubbl/distr/stampa Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Descrizione fisica 1 online resource (XIV, 509 p. 260 illus., 189 illus. in color.)
Disciplina 006
Collana Lecture Notes in Computer Science
Soggetto topico Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9984-32-7
Formato Materiale a stampa
Livello bibliografico Monografia
Lingua di pubblicazione eng
Nota di contenuto 3D Vision and Reconstruction -- Character Recognition -- Fundamental Theory of Computer Vision.
Record Nr. UNINA-9910799216003321
Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Materiale a stampa
Lo trovi qui: Univ. Federico II
Opac: Controlla la disponibilità qui
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XI / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XI / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edizione [1st ed. 2024.]
Pubbl/distr/stampa Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Descrizione fisica 1 online resource (XIV, 521 p. 207 illus., 202 illus. in color.)
Disciplina 006
Collana Lecture Notes in Computer Science
Soggetto topico Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-52-8
Formato Materiale a stampa
Livello bibliografico Monografia
Lingua di pubblicazione eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part XI -- Low-Level Vision and Image Processing -- Efficiently Amalgamated CNN-Transformer Network for Image Super-Resolution Reconstruction -- 1 Introduction -- 2 Related Work -- 2.1 CNN for SISR -- 2.2 Lightweight SISR -- 3 Method Overview -- 3.1 The Fundamentals of SISR -- 3.2 Network Structure -- 4 Experimental Results and Analysis -- 4.1 Training Details and Evaluation Metrics -- 4.2 Experimental Results and Analysis -- 5 Conclusion -- References -- A Hybrid Model for Video Compression Based on the Fusion of Feature Compression Framework and Multi-object Tracking Network -- 1 Introduction -- 2 Related Works -- 2.1 JDE (Joint Detection and Embedding Model) -- 2.2 DCT (Discrete Cosine Transform) Method -- 3 Methodology -- 3.1 Feature Extractor -- 3.2 Feature Reconstructor -- 3.3 Feature Encoder and Decoder -- 4 Experiment -- 4.1 The Architecture of the Hybrid Model -- 4.2 Training Details -- 4.3 Evaluation Results -- 5 Conclusions -- References -- Robust Degradation Representation via Efficient Diffusion Model for Blind Super-Resolution -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Lightweight Degradation Extractor (LDE) -- 3.2 Degradation-Aware Transformer (DAT) -- 3.3 Diffusion Model Training and Inference -- 4 Experiments -- 4.1 Training and Testing Datasets -- 4.2 Implementation and Training Details -- 4.3 Comparison with Existing Blind SR Methods -- 4.4 Ablation Study -- 5 Conclusion -- References -- MemDNet: Memorizing More Exogenous Information to Dehaze Natural Hazy Image -- 1 Introduction -- 2 Proposed Method -- 2.1 Dense Block -- 2.2 Enhanced Block -- 2.3 Memory Branch -- 3 Experiments -- 3.1 Experimental Settings -- 3.2 Comparison with SOTAs -- 3.3 Ablation Study -- 4 Conclusion -- References -- Technical Quality-Assisted Image Aesthetics Quality Assessment -- 1 Introduction.
2 Related Work -- 2.1 Technical Quality Assessment -- 2.2 Aesthetic Quality Assessment -- 3 Proposed Method -- 3.1 Theme-Aware Aesthetic Feature Extraction -- 3.2 Technical Quality Feature Extraction -- 3.3 Feature Fusion and Aesthetic Prediction -- 4 Experimental Results -- 4.1 Databases and Settings -- 4.2 Comparison with the State-of-the-Arts -- 4.3 Ablation Experiments -- 5 Conclusion -- References -- Self-supervised Low-Light Image Enhancement via Histogram Equalization Prior -- 1 Introduction -- 2 Methodology -- 2.1 Histogram Equalization Prior -- 2.2 Architecture -- 2.3 Loss Function -- 3 Experimental Validation -- 3.1 Implementation Details -- 3.2 Quantitative Evaluation -- 3.3 Qualitative Evaluation -- 3.4 Generalization Ability on Real-World Images -- 4 Ablation Studies -- 4.1 Comparison with Other Prior Information -- 4.2 The Effectiveness of Histogram Equalization Prior Loss -- 5 Conclusions -- References -- Enhancing GAN Compression by Image Probability Distribution Distillation -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Background -- 3.2 Image Probability Distribution Distillation -- 3.3 Asynchronous Weighted Discriminator -- 4 Experimentation -- 4.1 Experimental Settings -- 4.2 Result Comparison -- 4.3 Ablation Study -- 5 Conclusion -- References -- HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods -- 1 Introduction -- 2 Related Work -- 2.1 Talking Face Generation -- 2.2 Face Restoration -- 3 Method -- 3.1 Fine-Grained Feature Fusion -- 3.2 Decoder -- 3.3 Loss Function -- 4 Experiment -- 4.1 Experimental Settings -- 4.2 Experimental Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- Multi-stream-Based Low-Latency Viewport Switching Scheme for Panoramic Videos -- 1 Introduction -- 2 Related Works -- 2.1 Tile-Based Viewport Adaptive Streaming.
2.2 MPEG-DASH and OMAF -- 2.3 MCTS Coding Scheme -- 3 Methodology -- 3.1 Tile-Based Panoramic Video Encoding -- 3.2 Multiple High Quality Streams -- 4 Experimental Results and Discussion -- 4.1 Experiment Setup -- 4.2 Analysis of the Results -- 5 Conclusion -- References -- Large Kernel Convolutional Attention Based U-Net Network for Inpainting Oracle Bone Inscription -- 1 Introduction -- 2 Method -- 2.1 Overview -- 2.2 Large Kernel Attention Block -- 2.3 U-Net Inpainting Generative Network -- 2.4 Global and Local Discriminative Networks -- 2.5 Loss Functions -- 3 Experimentation -- 3.1 Experimental Datasets and Settings -- 3.2 Evaluation Metrics -- 3.3 Experimental Results and Quantitative Evaluations -- 3.4 Ablation Study -- 4 Conclusion -- References -- L2DM: A Diffusion Model for Low-Light Image Enhancement -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Preliminaries -- 3.2 Autoencoder Module -- 3.3 ViTCondNet -- 3.4 Main Architecture -- 4 Experiments -- 4.1 Setup -- 4.2 Comparison with SOTA Methods -- 4.3 Ablation Studies -- 5 Conclusion -- References -- Multi-domain Information Fusion for Key-Points Guided GAN Inversion -- 1 Introduction -- 2 Related Works -- 2.1 GAN Inversion -- 2.2 Latent Space Manipulation -- 3 Methodology -- 3.1 Overall Architecture -- 3.2 Unified Mapping Module -- 3.3 Multi Domain Information Fusion -- 3.4 Key-Point Patch Loss -- 3.5 Training Approaches for Inversion -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with Inversion Method -- 4.3 Ablation Study and Analysis -- 5 Conclusion -- References -- Adaptive Low-Light Image Enhancement Optimization Framework with Algorithm Unrolling -- 1 Introduction -- 2 LIE Optimization Framework with Algorithm Unrolling -- 2.1 Unrolling LIE-QE Module -- 2.2 Loss of the LIE Optimization Framework -- 3 Experiment.
3.1 Evaluation of the Proposed Framework -- 3.2 Evaluation of Unrolling Decomposition Module -- 3.3 Comparison with Related Methods -- 4 Conclusion -- References -- Feature Matching in the Changed Environments for Visual Localization -- 1 Introduction -- 2 Related Work -- 2.1 Room Layout Estimation -- 2.2 Local Feature Matching -- 2.3 Datasets for Matching -- 3 Image Matching Dataset for Changed Indoor Environments -- 3.1 Design of the Dataset -- 3.2 Detailed Specifications -- 3.3 Obtaining Ground Truth Camera Pose -- 4 Method -- 4.1 Network Architecture -- 4.2 Loss Function -- 5 Experiment -- 5.1 Metrics and Datasets -- 5.2 Results -- 5.3 Implementation Details -- 6 Conclusion -- References -- To Be Critical: Self-calibrated Weakly Supervised Learning for Salient Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 Salient Object Detection -- 2.2 Weakly Supervised Salient Object Detection -- 3 The Proposed Method -- 3.1 From Image-Level to Pixel-Level -- 3.2 Self-calibrated Training Strategy -- 3.3 Saliency Network -- 4 Dataset Construction -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Datasets and Evaluation Metrics -- 5.3 Comparison with State-of-the-Arts -- 5.4 Ablation Studies -- 6 Conclusion -- References -- Image Visual Complexity Evaluation Based on Deep Ordinal Regression -- 1 Introduction -- 2 Related Work -- 2.1 Image Complexity Evaluation -- 2.2 Ordinal Regression -- 3 The Proposed Method -- 3.1 Improved the ICNet Model -- 3.2 Ordinal Regression Model -- 3.3 Total Loss Function -- 4 Experiment and Results -- 4.1 Dataset -- 4.2 Experimental Setup -- 4.3 Experimental Results Analysis -- 5 Conclusions -- References -- Low-Light Image Enhancement Based on Mutual Guidance Between Enhancing Strength and Image Appearance -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 The Overall Framework of Our Model.
3.2 Mutual Guidance Module -- 3.3 Estimation of the Edge-Aware Lightness Map -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Experimental Results -- 4.3 Analysis of Our Method -- 5 Conclusion -- References -- Semantic-Guided Completion Network for Video Inpainting in Complex Urban Scene -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Problem Formulation -- 3.2 Semantic Video Completion Network -- 3.3 Video Synthesis Network -- 3.4 Loss Functions -- 4 Experiments -- 4.1 Benchmarks and Evaluation Metrics -- 4.2 Results and Discussion -- 4.3 Ablation Experiments -- 5 Conclusion -- References -- Anime Sketch Coloring Based on Self-attention Gate and Progressive PatchGAN -- 1 Introduction -- 2 Related Work -- 2.1 Style Transfer -- 2.2 Automatic Sketch Coloring -- 2.3 User-Guided Coloring -- 2.4 Reference-Based Sketch Image Coloring -- 3 Methodology -- 3.1 Overall Workflow -- 3.2 Self-attention Gate -- 3.3 Progressive PatchGAN -- 3.4 Loss Function -- 4 Experimental Results and Analysis -- 4.1 Implementation Details -- 4.2 Qualitative Evaluation -- 4.3 Quantitative Evaluation -- 4.4 Ablation Study -- 5 Conclusions -- References -- TransDDPM: Transformer-Based Denoising Diffusion Probabilistic Model for Image Restoration -- 1 Introduction -- 2 Related Work -- 2.1 Image Restoration -- 2.2 Denoising Diffusion Probabilistic Models -- 2.3 Diffusion Models for Image Restoration -- 3 Transformer-Based Denoising Diffusion Restoration Models -- 3.1 Overall Pipeline -- 3.2 Multi-Head Cross-Covariance Attention (MXCA) -- 3.3 Gated Feed-Forward Network (GFFN) -- 3.4 Accelerated with Implicit Sampling -- 4 Experiment -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Image Deraining Experiments -- 4.4 Image Dehazing Experiments -- 4.5 Motion Deblurring Experiments -- 4.6 Ablation Experiment -- 4.7 Limitations -- 5 Conclusion.
References.
Record Nr. UNINA-9910799216703321
Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Materiale a stampa
Lo trovi qui: Univ. Federico II
Opac: Controlla la disponibilità qui
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XII / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Pattern Recognition and Computer Vision [[electronic resource] ] : 6th Chinese Conference, PRCV 2023, Xiamen, China, October 13–15, 2023, Proceedings, Part XII / / edited by Qingshan Liu, Hanzi Wang, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang, Rongrong Ji
Edizione [1st ed. 2024.]
Pubbl/distr/stampa Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Descrizione fisica 1 online resource (XIV, 523 p. 203 illus., 194 illus. in color.)
Disciplina 006
Collana Lecture Notes in Computer Science
Soggetto topico Image processing - Digital techniques
Computer vision
Artificial intelligence
Application software
Computer networks
Computer systems
Machine learning
Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence
Computer and Information Systems Applications
Computer Communication Networks
Computer System Implementation
Machine Learning
ISBN 981-9985-55-2
Formato Materiale a stampa
Livello bibliografico Monografia
Lingua di pubblicazione eng
Nota di contenuto Intro -- Preface -- Organization -- Contents - Part XII -- Object Detection, Tracking and Identification -- OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 LiDAR-Based 3D Object Detection -- 2.2 Object Shape Completion -- 3 Methodology -- 3.1 Overview -- 3.2 Occluded Keypoint Generation -- 3.3 Occluded Keypoint Refinement -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Evaluation on KITTI Dataset -- 4.4 Evaluation on Waymo Open Dataset -- 4.5 Model Efficiency -- 4.6 Ablation Studies -- 5 Conclusion -- References -- Camouflaged Object Segmentation Based on Fractional Edge Perception -- 1 Introduction -- 2 Related Work -- 3 Interactive Task Learning Network -- 3.1 Integral and Fractional Edge -- 3.2 Camouflaged Edge Detection Module -- 4 Performance Evaluation -- 4.1 Datasets and Experiment Settings -- 4.2 Quantitative Evaluation -- 4.3 Qualitative Evaluation -- 4.4 Generalization of Edge Detection -- 5 Conclusion -- References -- DecTrans: Person Re-identification with Multifaceted Part Features via Decomposed Transformer -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Vision Transformer as Feature Extractor -- 3.2 Token Decomposition (TD) Layer -- 3.3 Data Augmentation for TD Layer -- 3.4 Training and Inference -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparisons to State-of-the-arts -- 4.4 Ablation Study -- 5 Conclusion -- References -- AHT: A Novel Aggregation Hyper-transformer for Few-Shot Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Hypernetworks -- 3 Method -- 3.1 Preliminaries -- 3.2 Overview -- 3.3 Dynamic Aggregation Module -- 3.4 Conditional Adaptation Hypernetworks.
3.5 The Classification-Regression Detection Head -- 4 Experiments -- 4.1 Experimental Setting -- 4.2 Comparison Results -- 4.3 Ablation Study -- 4.4 Visualization of Our Module -- 5 Conclusion -- References -- Feature Refinement from Multiple Perspectives for High Performance Salient Object Detection -- 1 Introduction -- 2 Proposed Method -- 2.1 Overall Architecture -- 2.2 Attention-Guided Bi-directional Feature Refinement Module -- 2.3 Serial Atrous Fusion Module -- 2.4 Upsampling Feature Refinement Module -- 2.5 Objective Function -- 3 Experiments -- 3.1 Experimental Setup -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Ablation Study -- 4 Conclusion -- References -- Feature Disentanglement and Adaptive Fusion for Improving Multi-modal Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Multi-modal Tracking -- 2.2 Transformers Tracking -- 3 Methodology -- 3.1 Preliminary -- 3.2 Our Approach -- 3.3 Training and Inference -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-Arts Multi-modal Trackers -- 4.3 Ablation Study -- 5 Conclusion -- References -- Modality Balancing Mechanism for RGB-Infrared Object Detection in Aerial Image -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection in Aerial Images -- 2.2 RGB-Infrared Object Detection -- 3 Method -- 3.1 Overview -- 3.2 Modality Balancing Mechanism -- 3.3 Multimodal Feature Hybrid Sampling Module -- 4 Experiment -- 4.1 Settings -- 4.2 Comparison with State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References -- Pacific Oyster Gonad Identification and Grayscale Calculation Based on Unapparent Object Detection -- 1 Introduction -- 2 Method -- 2.1 Compact Pyramid Refinement Module (CPRM) -- 2.2 Switchable Excitation Model (SEM) -- 3 Experiments and Analysis of Results -- 3.1 Establishment of the Datasets.
3.2 Experimental Environment and Evaluation Index -- 3.3 Ablation Experiments -- 3.4 Comparative Experiments and Analysis of Results -- 3.5 Visualization Results -- 3.6 Gray Value Calculation -- 4 Conclusion -- References -- Multi-task Self-supervised Few-Shot Detection -- 1 Introduction -- 2 Related Work -- 2.1 Self-supervised Learning -- 2.2 Few-Shot Object Detection -- 3 Methodology -- 3.1 Problem Setting -- 3.2 Self-supervised Auxiliary Branch -- 3.3 Multi-Task Learning -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Few-Shot Object Detection Benchmarks -- 4.3 Ablation Analysis -- 4.4 Visualization -- 5 Conclusion -- References -- CSTrack: A Comprehensive and Concise Vision Transformer Tracker -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Overview -- 3.2 CSBlock -- 3.3 Prediction Head and Loss -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Comparisons with the State-of-the-Art Trackers -- 4.3 Ablation Study -- 4.4 Visualization of Attention Maps -- 4.5 Visualization of Tracking Performance -- 5 Conclusion -- References -- Feature Implicit Enhancement via Super-Resolution for Small Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Small Object Detection Based on Super-Resolution -- 3 Methods -- 3.1 Overall Architecture -- 3.2 Training -- 4 Experiments and Details -- 4.1 Dataset and Details -- 4.2 Ablation Study -- 4.3 Main Results -- 5 Conclusion -- References -- Improved Detection Method for SODL-YOLOv7 Intensive Juvenile Abalone -- 1 Introduction -- 2 Methods -- 2.1 SODL Small Target Detection Network -- 2.2 ACBAM Attention Module -- 3 Experimental Results and Analysis -- 3.1 Experimental Data Preprocessing -- 3.2 Experimental Environment and Evaluation Index -- 3.3 Experimental Results and Analysis -- 4 Conclusion -- References.
MVP-SEG: Multi-view Prompt Learning for Open-Vocabulary Semantic Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Vision-Language Models -- 2.2 Zero-Shot Segmentation -- 2.3 Prompt Learning -- 3 Method -- 3.1 Problem Definition -- 3.2 MVP-SEG -- 3.3 MVP-SEG+ -- 4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 4.4 Ablation Studies on MVP-SEG -- 4.5 Comparison with State-of-the-Art -- 5 Conclusion -- References -- Context-FPN and Memory Contrastive Learning for Partially Supervised Instance Segmentation -- 1 Introduction -- 2 Related Work -- 3 CCMask -- 3.1 Overview -- 3.2 Context-FPN -- 3.3 Memory Contrastive Learning Head -- 3.4 Loss Function -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Experimental Results -- 4.3 Ablation Study -- 5 Conclusion -- References -- A Dynamic Tracking Framework Based on Scene Perception -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Easy-Hard Dual-Branch Network -- 3.2 Scene Router -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparison with State-of-the-arts -- 4.3 Ablation Study and Analysis -- 5 Conclusion -- References -- HPAN: A Hybrid Pose Attention Network for Person Re-Identification -- 1 Introduction -- 2 The Proposed Method -- 2.1 Local Key Point Features -- 2.2 Self-Attention -- 2.3 Hybrid Pose and Global Feature Fusion (HPGFF) -- 2.4 Loss Function -- 2.5 Training Strategy -- 3 Experiments -- 3.1 Datasets and Evaluation Metrics -- 3.2 Comparison with SOTA Methods -- 3.3 Ablation Studies -- 3.4 Visualization of Attention Maps -- 4 Conclusion -- References -- SpectralTracker: Jointly High and Low-Frequency Modeling for Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Visual Tracking -- 2.2 Frequency Modeling in Visual Transformer -- 3 Method -- 3.1 Dual-Spectral Module -- 3.2 Dual-Spectral for Tracking -- 3.3 Prediction Head and Total Loss.
4 Experiments -- 4.1 Implementation Details -- 4.2 State-of-the-Art Comparison -- 4.3 Ablation Studies -- 5 Conclusion -- References -- DiffusionTracker: Targets Denoising Based on Diffusion Model for Visual Tracking -- 1 Introduction -- 2 Related Works -- 2.1 Visual Tracking Based on Siamese Network -- 2.2 Diffusion Model -- 3 Method -- 3.1 Architecture -- 3.2 Training Process -- 3.3 Inference Process -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Ablation Study -- 4.3 General Datasets Evaluation -- 4.4 Attributes Evaluation -- 4.5 Compatibility Experiment -- 5 Conclusion -- References -- Instance-Proxy Loss for Semi-supervised Learning with Coarse Labels -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Instance-Level Loss -- 3.2 Proxy-Level Loss -- 3.3 Instance-Proxy Loss -- 4 Experiments -- 4.1 Comparison to SOTA Methods -- 4.2 Ablation Study -- 5 Conclusion -- References -- FAFVTC: A Real-Time Network for Vehicle Tracking and Counting -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Backbone Network -- 3.2 Multi-spectral Channel and Spatial Attention (MCSA) -- 3.3 Data Association -- 3.4 Vehicle Counting -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Details -- 4.3 Comparison Experiments -- 4.4 Ablation Study -- 5 Conclusion -- References -- Ped-Mix: Mix Pedestrians for Occluded Person Re-identification -- 1 Introduction -- 2 Related Works -- 2.1 Occluded Person Re-identification -- 2.2 Data Augmentation and Training Loss -- 3 Proposed Method -- 3.1 Ped-Mix -- 3.2 Non-target Suppression Loss -- 3.3 Training Procedure -- 4 Experiment -- 4.1 Datasets and Evaluation Measures -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Comparison with State-of-the-Art Methods -- 4.5 Visualization -- 4.6 Why Random Masking -- 4.7 Results on Holistic Datasets -- 5 Conclusion -- References.
Object-Aware Transfer-Based Black-Box Adversarial Attack on Object Detector.
Record Nr. UNINA-9910799206903321
Singapore : , : Springer Nature Singapore : , : Imprint : Springer, , 2024
Materiale a stampa
Lo trovi qui: Univ. Federico II
Opac: Controlla la disponibilità qui