1.

Record No.

UNISA996490366003316

Title

Artificial neural networks and machine learning - ICANN 2022. Part III : 31st International Conference on Artificial Neural Networks, Bristol, UK, September 6-9, 2022, proceedings / Elias Pimenidis [and four others] (editors)

Publication/distribution/printing

Cham, Switzerland : Springer, [2022]

©2022

ISBN

3-031-15934-9

Physical description

1 online resource (835 pages)

Series

Lecture notes in computer science ; Volume 13531

Discipline

006.3

Subjects

Artificial intelligence

Machine learning

Neural networks (Computer science)

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Bibliography note

Includes bibliographical references and index.

Contents note

Intro -- Preface -- Organization -- Contents - Part III -- Adaptive Channel Encoding Transformer for Point Cloud Analysis -- 1 Introduction -- 2 Related Work -- 2.1 Point-Based Method -- 2.2 Transformer-Based Method -- 3 Method -- 3.1 Tce -- 3.2 Network Architecture -- 4 Experiments -- 4.1 Classification on ModelNet40 -- 4.2 Part Segmentation on ShapeNet -- 4.3 Classification on ScanObjectNN -- 4.4 Ablation Studies -- 4.5 Robustness Experiments -- 5 Conclusion -- References -- ARB U-Net: An Improved Neural Network for Suprapatellar Bursa Effusion Ultrasound Image Segmentation -- 1 Introduction -- 2 ARB U-Net Network Structure -- 2.1 Encoder -- 2.2 Decoder -- 2.3 Loss Function -- 3 Result -- 3.1 Experimental Environment and Dataset Annotation -- 3.2 Evaluation Index -- 3.3 Quantitative and Qualitative Analysis -- 4 Conclusion -- References -- BPGG: Bidirectional Prototype Generation and Guidance Network for Few-Shot Anomaly Localization -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Localization -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Segmentation -- 3 Method -- 3.1 Problem Definition -- 3.2 Method Overview -- 3.3 Prototype Generation and Guidance -- 3.4 Original Forward Branch -- 3.5 Adaptive Reverse Branch -- 4 Experiments -- 4.1 Experiment Setup -- 4.2 Qualitative Results -- 4.3 Ablation Study -- 5 Conclusions -- References -- CoPrGAN: Image-to-Image Translation via Content Preservation -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Model Overview -- 3.2 Dynamic Paths -- 3.3 Training Strategy -- 4 Experiments -- 4.1 Datasets -- 4.2 Baselines -- 4.3 Quality Comparison -- 4.4 Ablation Study -- 4.5 Discussions -- 5 Conclusion -- References -- Cross Domain Evaluation of Text Detection Models -- 1 Introduction -- 2 Related Work -- 2.1 Character-Based Detectors -- 2.2 Word-Based Detectors -- 2.3 Line-Based Detectors.

2.4 Segmentation-Based Detectors -- 3 Models -- 3.1 EAST -- 3.2 CRAFT -- 3.3 Tesseract -- 3.4 Outputs Ensemble -- 4 Experiment -- 4.1 Datasets -- 4.2 Experimental Set-Up -- 5 Results and Discussion -- 6 Conclusion -- References -- Cross-Domain Learning for Reference-Based Sketch Colorization with Structural and Colorific Strategy -- 1 Introduction -- 2 Related Work -- 2.1 Conditional Sketch Colorization -- 2.2 Reference-Based Colorization -- 3 Proposed Method -- 3.1 Domain Alignment Network -- 3.2 Coarse-to-Fine Generator -- 3.3 Structural and Colorific Strategy -- 3.4 Loss for Reference-Based Sketch Colorization -- 4 Experiments -- 4.1 Implementation -- 4.2 Datasets -- 4.3 Qualitative Comparison -- 4.4 Ablation Study -- 4.5 User Research -- 5 Conclusion -- References -- Data Augmented Dual-Attention Interactive Image Classification Network -- 1 Introduction -- 2 Related Work -- 2.1 Method Based on Strong Supervision -- 2.2 Method Based on Weak Supervision -- 3 Method -- 3.1 Data Augmentation -- 3.2 Dual Attention Network -- 3.3 Channel Interaction and Local Feature Fusion -- 4 Experimental Results and Analysis -- 4.1 Datasets and Implementation Details -- 4.2 Experimental Results on Three Fine-Grained Data Sets -- 4.3 Data Augmentation and Dual Attention Visualization -- 4.4 Ablation Experiments -- 5 Conclusions -- References -- Deep Dictionary Pair Learning for SAR Image Classification -- 1 Introduction -- 2 Related Work -- 2.1 Discriminative Dictionary Learning -- 2.2 Deep Convolutional Neural Network -- 3 Proposed Method -- 3.1 Projective Dictionary Pair Learning -- 3.2 Dictionary Learning Layers -- 3.3 Network Architecture -- 3.4 Training and Inference -- 4 Experiments -- 4.1 DataSet -- 4.2 Setting -- 4.3 Results -- 5 Discussion -- 5.1 Ablation Study -- 5.2 Parameter Comparison Study -- 6 Conclusion -- References.

Deepfake Video Detection Exploiting Binocular Synchronization -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Preprocessing -- 3.2 Architecture -- 4 Experiment and Analysis -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Feature Effectiveness -- 4.4 In-Dataset Evaluation -- 4.5 Cross-Dataset Evaluation -- 4.6 Ablation Study -- 4.7 Limitations -- 5 Conclusion -- References -- Dep-ViT: Uncertainty Suppression Model Based on Facial Expression Recognition in Depression Patients -- 1 Introduction -- 2 VFEM -- 2.1 Expression Collection -- 2.2 Emoticon Scoring -- 2.3 Facial Feature Extraction and Analysis -- 3 Dep-ViT -- 3.1 Encoder -- 3.2 SE and Self-attention Layer -- 3.3 Rank Regularization Based on KL Divergence and Relabeling -- 3.4 Loss Function Based on Manual Labeling -- 4 Experiment and Result Analysis -- 4.1 Parameter Settings -- 4.2 Experimental Results -- 4.3 Ablation Experiment -- 5 Limitation -- 6 Conclusion -- References -- Ensemble of One-Class Classifiers Based on Multi-level Hidden Representations Abstracted from Convolutional Autoencoder for Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 OCSVMs -- 2.2 Hybrid Approach -- 3 Ensemble of One-Class Classifiers -- 3.1 Extracting Different Levels of Image Semantic Features -- 3.2 Building Multiple Base Classifiers with Extracted Features -- 3.3 Classifier Fusion for Image Anomaly Evaluation -- 4 Experiments -- 4.1 Datasets and Setup -- 4.2 Baseline Methods -- 4.3 Model Configuration -- 4.4 Performance Evaluation Metric -- 4.5 Results and Analysis -- 5 Conclusion and Future Work -- References -- Images Structure Reconstruction from fMRI by Unsupervised Learning Based on VAE -- 1 Introduction -- 2 Method -- 2.1 Basic Model and Loss Function -- 2.2 Overview of the Proposed Framework -- 3 Experimental Results -- 3.1 Datasets and Evaluation.

3.2 Comparison of Images Reconstruction Performance with Others -- 3.3 Ablation Experiment -- 4 Conclusion -- References -- Inter-subtask Consistent Representation Learning for Visual Commonsense Reasoning -- 1 Introduction -- 2 Related Work -- 2.1 Visual Commonsense Reasoning -- 2.2 Siamese Network -- 2.3 Contrastive Learning -- 3 Proposed Approach -- 3.1 Joint Learning Framework -- 3.2 Feature Extraction -- 3.3 Multi-level Contrastive Learning -- 3.4 Classification and Loss -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Performance Comparison -- 4.3 Ablation Studies -- 5 Conclusion -- References -- InvisibiliTee: Angle-Agnostic Cloaking from Person-Tracking Systems with a Tee -- 1 Introduction -- 1.1 User Study -- 2 Literature Review -- 3 Method -- 3.1 Overview -- 3.2 Attack and Geometric Constraint Loss Functions -- 3.3 Geometric Warp and Masking -- 4 Attacks in the Digital World -- 4.1 Dataset and Experiment Setup -- 4.2 Experimental Results -- 4.3 Case Studies of Digital Attacks -- 5 Attacks in the Physical World -- 5.1 Additional Discussion -- 6 Conclusion -- References -- Makeup Transfer Based on Generative Adversarial Network for Large Angle Spatial Misalignment -- 1 Introduction -- 2 Related Work -- 2.1 Facial Makeup Transfer -- 2.2 Style Transfer -- 3 Related Work -- 3.1 Formulation -- 3.2 Framework -- 3.3 Neural Head Reenactment Module -- 3.4 Full Objective -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Comparisons -- 4.3 Ablation Studies -- 4.4 Controllable Makeup Transfer -- 4.5 Partial Makeup Transfer -- 4.6 Makeup Remove -- 5 Conclusion -- References -- Making Images Resilient to Adversarial Example Attacks -- 1 Introduction -- 2 Adversary-Proof Examples -- 2.1 Baseline: R-PGD -- 2.2 Advanced: ZigZag -- 3 Empirical Study -- 3.1 Experiment Settings -- 3.2 R-PGD Against FGSM -- 3.3 R-PGD Against PGD.

3.4 ZigZag Against FGSM -- 3.5 ZigZag Against PGD -- 3.6 ZigZag Against CW -- 3.7 Transferability Evaluation -- 4 Related Work -- 5 Concluding Remarks -- References -- Multi-Class Lane Semantic Segmentation of Expressway Dataset Based on Aerial View -- 1 Introduction -- 2 Related Work -- 2.1 Lane Detection Datasets -- 2.2 Semantic Segmentation Models Based on DCNNs -- 2.3 Hausdorff Distance Loss -- 2.4 Conditional Random Fields -- 3 Multi-class Lane Semantic Segmentation -- 3.1 Expressway Dataset Based on Aerial View -- 3.2 DeepLab-ERFC -- 3.3 Update Strategy -- 4 Experiment -- 4.1 Comparison Experiment -- 4.2 Evaluation on Expressway Dataset -- 5 Conclusion -- References -- Mutil-level Local Alignment and Semantic Matching Network for Image-Text Retrieval -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Generic Representation Extraction -- 3.2 Local Region-Word Alignment -- 3.3 Multi-level Semantic Matching -- 3.4 Loss Function -- 4 Experiments -- 4.1 Dataset and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Comparison with State-of-the-Art Methods -- 4.4 Ablation Study and Analysis -- 4.5 Visualization of Retrieval Results -- 5 Conclusion -- References -- NAS4FBP: Facial Beauty Prediction Based on Neural Architecture Search -- 1 Introduction -- 2 Related Work -- 2.1 Facial Beauty Prediction Based on Deep Learning -- 2.2 Neural Architecture Search -- 3 Method -- 3.1 Align-Crop -- 3.2 NAS for an FBP Backbone -- 3.3 Non-local Spatial Attention Module -- 3.4 Multi-task Learning Scheme with HBLoss -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Experiments of Applying NAS to FBP -- 4.3 Comparison with the Related State-of-the-Art Models -- 4.4 Ablation and Analysis -- 5 Conclusion -- References -- Object Detector with Recursive Feature Pyramid and Key Content-Only Attention -- 1 Introduction -- 2 Related Works.

3 Recursive Feature Pyramid.