Artificial Intelligence: Second CAAI International Conference, CICAI 2022, Beijing, China, August 27-28, 2022, Revised Selected Papers, Part I / edited by Lu Fang [and four others].
Cham, Switzerland: Springer, [2023], ©2023.
1 online resource (705 pages).
Series: Lecture Notes in Computer Science; v. 13604.
ISBN 3-031-20497-2 (electronic).
Print version: Fang, Lu. Artificial Intelligence. Cham: Springer, c2023. ISBN 9783031204968.
Language: English.
Includes bibliographical references and index.
Subjects: Artificial intelligence; Computational intelligence. Classification: 006.3.
Record identifiers: (MiAaPQ)EBC7158302; (Au-PeEL)EBL7158302; (CKB)25732544900041; (PPN)268726280; (EXLCZ)992573254490004; 996503469303316 (UNISA). Resource type: BOOK.

Contents:
Intro -- Preface -- Organization -- Contents - Part I -- Contents - Part II -- Contents - Part III
Computer Vision
Cross-Camera Deep Colorization -- 1 Introduction -- 2 Related Work -- 2.1 Automatic Image Colorization -- 2.2 Reference-based Image Colorization -- 2.3 Flow-based or Non-rigid Correspondences -- 3 Approach -- 3.1 Cross-camera Alignment Module -- 3.2 Hierarchical Fusion Module -- 3.3 Loss Function -- 4 Experiments -- 4.1 Dataset -- 4.2 Comparison to State-of-the-Art Methods -- 4.3 Ablation Study -- 5 Conclusion -- References
Attentive Cascaded Pyramid Network for Online Video Stabilization -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Offline 2D Video Stabilization -- 2.2 CNN Based Video Stabilization -- 3 Methods -- 3.1 RGB and Flow Feature Encoding -- 3.2 Flow-guided Quiescent Attention Module -- 3.3 Cascaded Pyramid Prediction Module -- 3.4 Training Objectives -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Quantitative Evaluation -- 4.3 Qualitative Evaluation -- 4.4 Ablation Study -- 4.5 User Study -- 5 Conclusions -- References
Amodal Layout Completion in Complex Outdoor Scenes -- 1 Introduction -- 2 Related Work -- 2.1 Amodal Perception -- 2.2 Synthesize Image from Layout -- 3 Pre-experiments -- 3.1 Data Augmentation -- 3.2 Experiments and Evaluation -- 4 Methodology -- 4.1 Divide-and-Conquer Strategy -- 4.2 ALCN Mainframe -- 4.3 New Indicators -- 5 Experiments -- 5.1 Datasets and Evaluation Metrics -- 5.2 Experiments Results on Amodal Layout Completion -- 5.3 Experiments Results on Layout-to-Image Generation -- 5.4 Ablation Study -- 6 Conclusion -- References
Exploring Hierarchical Prototypes for Few-Shot Segmentation -- 1 Introduction -- 2 Related Work -- 2.1 Few-shot Learning -- 2.2 Few-shot Segmentation -- 3 Method -- 3.1 Overview -- 3.2 Hierarchical Prototypes -- 3.3 Prototype Attention Module -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Performance Comparison -- 4.3 Ablation Study -- 5 Conclusion -- References
BSAM: Bidirectional Scene-Aware Mixup for Unsupervised Domain Adaptation in Semantic Segmentation -- 1 Introduction -- 2 Methods -- 2.1 Preliminaries -- 2.2 Stylized Source Domain -- 2.3 Bidirectional Scene-Aware Mixup -- 2.4 Self-training Under Mean-Teacher Framework -- 3 Experiments -- 3.1 Implementation Details -- 3.2 Comparison with State-of-the-Art Methods -- 3.3 Parameter Analysis and Ablation Study -- 4 Conclusion -- References
Triple GNN: A Pedestrian-Scene-Object Joint Model for Pedestrian Trajectory Prediction -- 1 Introduction -- 2 The Proposed Pedestrian-Scene-Object Joint Model for Pedestrian Trajectory Prediction -- 2.1 Problem Statements and the Framework of Our Work -- 2.2 Triple Feature Extraction -- 2.3 S-GNN with Two-stage Scheme -- 2.4 T-CNN with Dilated Convolution -- 3 Experiments, Results and Discussions -- 3.1 Experimental Settings -- 3.2 Overall Results -- 3.3 Ablation Experiments -- 3.4 Qualitative Evaluation -- 4 Conclusions -- References
Cross-domain Trajectory Prediction with CTP-Net -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Formulation -- 3.2 Cross-domain Trajectory Prediction Network -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Experimental Setup -- 4.3 Implementation Details -- 4.4 Results -- 4.5 Ablation and Visualization -- 5 Conclusion -- References
Spatial-Aware GAN for Instance-Guided Cross-Spectral Face Hallucination -- 1 Introduction -- 2 Method -- 2.1 Spatial-aware Instance-guided Cross-spectral Face Hallucination -- 2.2 Fine-grained Aligned Spatially Adaptive Normalization -- 2.3 Training Objective -- 3 Experiments -- 3.1 Quantitative Evaluations -- 3.2 Qualitative Evaluations -- 3.3 Ablation Studies -- 4 Conclusion -- References
Lightweight Image Compression Based on Deep Learning -- 1 Introduction -- 2 The Proposed Methods -- 2.1 Dynamic Concatenated Convolution (DCC) -- 2.2 Depthwise Separable Residual Block (DSRB) -- 3 Experiments -- 3.1 Implementation Details -- 3.2 Lightweight Ability Evaluation -- 3.3 Generalization Ability Evaluation -- 3.4 Ablation Study -- 4 Conclusion -- References
DGMLP: Deformable Gating MLP Sharing for Multi-Task Learning -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Overview -- 3.2 Feature Extractor -- 3.3 DGMLP Method -- 3.4 Loss Function -- 4 Experiments -- 4.1 Comparisons with State-of-the-art Models -- 4.2 Ablation Study -- 4.3 Visualization -- 5 Conclusion -- References
Monocular 3D Face Reconstruction with Joint 2D and 3D Constraints -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 3D Morphable Model -- 3.2 Joint 2D and 3D Optimization -- 4 Experimental Results -- 4.1 Datasets and Metrics -- 4.2 Ablation Study -- 4.3 Comparison -- 5 Conclusion and Discussion -- References
Scene Text Recognition with Single-Point Decoding Network -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Rectifier -- 3.2 Encoder -- 3.3 Decoder -- 3.4 Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Comparisons with State-of-the-Arts -- 4.4 Ablation Study -- 4.5 Visual Illustrations -- 5 Conclusion -- References
Unsupervised Domain Adaptation for Semantic Segmentation with Global and Local Consistency -- 1 Introduction -- 2 Related Work -- 2.1 Semantic Segmentation -- 2.2 Unsupervised Domain Adaptation -- 3 Methods -- 3.1 Global Consistency -- 3.2 Local Consistency -- 3.3 Overall Training Loss -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results -- 5 Conclusion -- References
Research on Multi-temporal Cloud Removal Using D-S Evidence Theory and Cloud Segmentation Model -- 1 Introduction -- 2 Related Work -- 2.1 Cloud Removal Methods -- 2.2 Cloud Detection Methods -- 3 Methods -- 3.1 Cloud-net -- 3.2 Color Prior Knowledge -- 3.3 D-S Evidence Theory -- 3.4 Evidence Fusion Method -- 4 Experiment -- 4.1 Training Cloud-net -- 4.2 Experiments with Color Prior Knowledge -- 4.3 Multi-temporal Remote Sensing Cloud Removal Experiments -- 4.4 Ablation Experiment -- 4.5 Comparison Against Other Cloud Removal Methods -- 5 Conclusion -- References
SASD: A Shape-Aware Saliency Object Detection Approach for RGB-D Images -- 1 Introduction -- 2 Related Work -- 3 The Proposed SASD Approach -- 3.1 Initial Saliency Map Calculation -- 3.2 Enhanced Saliency Map Calculation -- 3.3 Irregular Shape Display -- 4 Experiments and Discussion -- 5 Conclusion -- References
Dual Windows Are Significant: Learning from Mediastinal Window and Focusing on Lung Window -- 1 Introduction -- 2 Related Works -- 2.1 Computer-Aid Diagnosis Systems for Pneumonia -- 2.2 Attention Mechanisms -- 3 Methods -- 3.1 Dual Window Network -- 3.2 Lung Window Attention Block -- 3.3 Overall Loss -- 4 Experiments -- 4.1 Evaluation Dataset and Experimental Settings -- 4.2 Experimental Analysis -- 5 Discussion -- 6 Conclusions -- References
CDNeRF: A Multi-modal Feature Guided Neural Radiance Fields -- 1 Introduction -- 2 Related Work -- 2.1 Neural Radiance Field with Generality -- 2.2 Depth Prior to NeRF -- 3 Methods -- 3.1 Multi-modal Feature Extraction -- 3.2 Radiance Field Prediction and Volumetric Rendering -- 3.3 Optimization Functions -- 4 Experiments -- 4.1 Datasets and Evaluation -- 4.2 Implementation Details -- 4.3 Results -- 4.4 Ablation and Analysis -- 5 Conclusion -- References
MHPro: Multi-hypothesis Probabilistic Modeling for Human Mesh Recovery -- 1 Introduction -- 2 Related Work -- 2.1 Human Mesh Recovery from Monocular Images -- 2.2 Multi-hypothesis Methods -- 2.3 Transformer in Computer Vision -- 3 Method -- 3.1 Preliminary -- 3.2 Probabilistic Modeling -- 3.3 Intra-hypothesis Refinement -- 3.4 Inter-hypothesis Communication -- 3.5 Loss Function -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Comparison -- 4.3 Ablation Study -- 5 Conclusion -- References
Image Sampling for Machine Vision -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Motivation -- 3.2 Implementation -- 3.3 Visualization -- 3.4 Cost Assessment -- 4 Experiments -- 4.1 Object Detection -- 4.2 Image Classification -- 4.3 Ablation Experiments -- 4.4 Effect Visualization -- 5 Conclusion -- References
SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning -- 1 Introduction -- 2 Related Work -- 2.1 Filter Channel Pruning -- 2.2 Filter Weight Pruning -- 2.3 Hybrid Filter Pruning -- 3 Proposed Method -- 3.1 Filter Skeleton (FS) for Kernel Size Reduction -- 3.2 Filter Mask (FM) for Filter Number Reduction -- 3.3 Training and Inference -- 4 Experiments -- 4.1 ResNet56 on CIFAR-10 -- 4.2 ResNet18 on ImageNet -- 4.3 U-Net for Image Denoising -- 5 Conclusions -- References
H-ViT: Hybrid Vision Transformer for Multi-modal Vehicle Re-identification -- 1 Introduction -- 2 Related Work -- 2.1 Single-Modal Re-identification -- 2.2 Multi-modal Re-identification -- 2.3 Transformer in Re-identification Task -- 3 Method -- 3.1 Vision Transformer Branches and Loss Function -- 3.2 Modal-specific Controller -- 3.3 Modal Information Embedding -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Implementation -- 4.3 Comparison with State-of-the-Art -- 4.4 Analysis -- 5 Conclusion -- References
A Coarse-to-Fine Convolutional Neural Network for Light Field Angular Super-Resolution -- 1 Introduction -- 2 Proposed Method -- 2.1 Coarse-grained View Synthesis Sub-network -- 2.2 Fine-grained View Refinement Sub-network.