Advanced topics in computer vision / Giovanni Maria Farinella, Sebastiano Battiato, Roberto Cipolla, editors
Edition [1st ed. 2013.]
Publication London : Springer, 2013
Physical description 1 online resource (xiv, 433 pages) : illustrations (some color)
Classification 006.42
Series Advances in Computer Vision and Pattern Recognition
Topical subject Computer vision
ISBN 1-4471-5520-3
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Visual Features: From Early Concepts to Modern Computer Vision -- Where Next in Object Recognition and How Much Supervision Do We Need? -- Recognizing Human Actions by Using Effective Codebooks and Tracking -- Evaluating and Extending Trajectory Features for Activity Recognition -- Co-Recognition of Images and Videos: Unsupervised Matching of Identical Object Patterns and its Applications -- Stereo Matching: State-of-the-Art and Research Challenges -- Visual Localization for Micro Aerial Vehicles in Urban Outdoor Environments -- Moment Constraints in Convex Optimization for Segmentation and Tracking -- Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets -- Top-Down Bayesian Inference of Indoor Scenes -- Efficient Loopy Belief Propagation Using the Four Color Theorem -- Boosting k-Nearest Neighbors Classification -- Learning Object Detectors in Stationary Environments -- Video Temporal Super-Resolution Based on Self-Similarity.
Record no. UNINA-9910437597203321
Held at: Univ. Federico II
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
Author Avidan Shai
Publication Cham : Springer, 2022
Physical description 1 online resource (803 pages)
Classification 006.37
Other authors (Persons) Brostow Gabriel
Cissé Moustapha
Farinella Giovanni Maria
Hassner Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-20053-5
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Foreword -- Preface -- Organization -- Contents - Part XXIV -- Improving Vision Transformers by Revisiting High-Frequency Components -- 1 Introduction -- 2 Related Work -- 3 Revisiting ViT Models from a Frequency Perspective -- 4 The Proposed Method -- 4.1 Adversarial Training with High-Frequency Perturbations -- 4.2 A Case Study Using ViT-B -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Results on ImageNet Classification -- 5.3 Results on Out-of-distribution Data -- 5.4 Transfer Learning to Downstream Tasks -- 5.5 Ablation Studies -- 5.6 Discussions -- 6 Conclusions and Future Work -- References -- Recurrent Bilinear Optimization for Binary Neural Networks -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Bilinear Model of BNNs -- 3.3 Recurrent Bilinear Optimization -- 3.4 Discussion -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Ablation Study -- 4.3 Image Classification -- 4.4 Object Detection -- 4.5 Deployment Efficiency -- 5 Conclusion -- References -- Neural Architecture Search for Spiking Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Spiking Neural Networks -- 2.2 Neural Architecture Search -- 3 Preliminaries -- 3.1 Leaky Integrate-and-Fire Neuron -- 3.2 NAS Without Training -- 4 Methodology -- 4.1 Linear Regions from LIF Neurons -- 4.2 Sparsity-Aware Hamming Distance -- 4.3 Searching Forward and Backward Connections -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Performance Comparison -- 5.3 Experimental Analysis -- 6 Conclusion -- References -- Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification -- 1 Introduction -- 2 Related Work -- 2.1 Fine-Grained Visual Classification -- 2.2 Human Attention in Vision -- 3 Approach -- 3.1 Overview -- 3.2 Region Feature Mining Module.
3.3 Cross-Hierarchical Orthogonal Fusion Module -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Hierarchy Interaction Analysis -- 4.3 Evaluation on Traditional FGVC Setting -- 4.4 Further Analysis -- 5 Conclusions -- References -- DaViT: Dual Attention Vision Transformers -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Spatial Window Attention -- 3.3 Channel Group Attention -- 3.4 Model Instantiation -- 4 Analysis -- 5 Experiments -- 5.1 Image Classification -- 5.2 Object Detection and Instance Segmentation -- 5.3 Semantic Segmentation on ADE20k -- 5.4 Ablation Study -- 6 Conclusion -- References -- Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation and Overview -- 3.2 Discrepancy Elimination Network (DEN) -- 3.3 Optimal-Transport Label Assignment (OTLA) -- 3.4 Prediction Alignment Learning (PAL) -- 3.5 Optimization -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Ablation Study -- 4.5 Discussion -- 5 Conclusion -- References -- Locality Guidance for Improving Vision Transformers on Tiny Datasets -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 The Overall Approach -- 3.2 Guidance Positions -- 3.3 Architecture of the CNN -- 4 Experiments -- 4.1 Main Results -- 4.2 Discussion -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neighborhood Collective Estimation for Noisy Label Identification and Correction -- 1 Introduction -- 2 Related Work -- 2.1 Noise Verification -- 2.2 Label Correction -- 3 The Proposed Method -- 3.1 Neighborhood Collective Noise Verification -- 3.2 Neighborhood Collective Label Correction -- 3.3 Training Objectives -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparisons with the State of the Art -- 4.3 Analysis.
5 Conclusions -- References -- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay -- 1 Introduction -- 2 Related Works -- 2.1 Class-Incremental Learning -- 2.2 Few-Shot Class-Incremental Learning -- 2.3 Data-Free Knowledge Distillation -- 3 Preliminaries -- 3.1 Problem Setting -- 3.2 Data-Free Replay -- 4 Methodology -- 4.1 Entropy-Regularized Data-Free Replay -- 4.2 Learning Incrementally with Uncertain Data -- 5 Experiments -- 5.1 Datasets -- 5.2 Implementation Details -- 5.3 Re-implementation of Replay-based Methods -- 5.4 Main Results and Comparison -- 5.5 Analysis -- 6 Conclusion -- References -- Anti-retroactive Interference for Lifelong Learning -- 1 Introduction -- 2 Related Work -- 2.1 Lifelong Learning -- 2.2 Adversarial Training -- 3 Proposed Method -- 3.1 Extracting Intra-Class Features -- 3.2 Generating and Fusing Task-Specific Models -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results and Comparison -- 4.4 Ablation Study -- 5 Conclusion -- References -- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Build vMF Classifier on Hyper-Sphere -- 3.2 Quantify Distribution Overlap Coefficient on Hyper-Sphere -- 3.3 Improve Representation of Feature and Classifier via o -- 3.4 Calibrate Classifier Weight Beyond Training via o -- 4 Experiments -- 4.1 Long-Tailed Image Classification Task -- 4.2 Long-Tailed Semantic and Instance Segmentation Task -- 4.3 Ablation Study -- 5 Conclusions -- References -- Dynamic Metric Learning with Cross-Level Concept Distillation -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Dynamic Metric Learning -- 3.2 Hierarchical Concept Refiner -- 3.3 Cross-Level Concept Distillation -- 3.4 Discussions -- 4 Experiments.
4.1 Datasets -- 4.2 Evaluation Protocol -- 4.3 Implementation Details -- 4.4 Main Results -- 4.5 Experimental Analysis -- 5 Conclusion -- References -- MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing -- 1 Introduction -- 2 Related Work -- 2.1 Event-Based Representations -- 2.2 Memory-Based Networks -- 3 Event Camera Model -- 4 Method -- 4.1 Dual-Branch Structure -- 4.2 Double Polarities Calculation Method -- 4.3 Point-Wise Memory Bank -- 4.4 Training and Testing Strategies -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Ablation Study -- 5.3 Object Recognition -- 5.4 Gesture Recognition -- 6 Conclusion -- References -- Out-of-distribution Detection with Boundary Aware Learning -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Boundary Aware Learning -- 4.1 Representation Extraction Module (REM) -- 4.2 Representation Sampling Module (RSM) -- 4.3 Representation Discrimination Module (RDM) -- 5 Experiments -- 5.1 Dataset -- 5.2 Experimental Setup -- 5.3 Ablation Study -- 5.4 Detection Results -- 5.5 Visualization of trivial and hard OOD features -- 6 Conclusion -- References -- Learning Hierarchy Aware Features for Reducing Mistake Severity -- 1 Introduction -- 2 Related Work -- 3 HAF: Proposed Approach -- 3.1 Fine Grained Cross-Entropy (LCEfine) -- 3.2 Soft Hierarchical Consistency (Lshc) -- 3.3 Margin Loss (Lm) -- 3.4 Geometric Consistency (Lgc) -- 4 Experiments and Results -- 4.1 Experimental Setup -- 4.2 Training Configurations -- 4.3 Results -- 4.4 Coarse Classification Accuracy -- 5 Analysis -- 5.1 Ablation Study -- 5.2 Mistakes Severity Plots -- 5.3 Discussion: Hierarchical Metrics -- 6 Conclusion -- References -- Learning to Detect Every Thing in an Open World -- 1 Introduction -- 2 Related Work -- 3 Learning to Detect Every Thing -- 3.1 Data Augmentation: Background Erasing (BackErase).
3.2 Decoupled Multi-domain Training -- 4 Experiments -- 4.1 Cross-category Generalization -- 4.2 Cross-Dataset Generalization -- 5 Conclusion -- References -- KVT: k-NN Attention for Boosting Vision Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Self-attention -- 2.2 Transformer for Vision -- 3 k-NN Attention -- 3.1 Vanilla Attention -- 3.2 k-NN Attention -- 3.3 Theoretical Analysis on k-NN Attention -- 4 Experiments for Vision Transformers -- 4.1 Experimental Settings -- 4.2 Results on ImageNet -- 4.3 The Impact of Number k -- 4.4 Convergence Speed of k-NN Attention -- 4.5 Other Properties of k-NN Attention -- 4.6 Comparisons with Temperature in Softmax -- 4.7 Visualization -- 4.8 Object Detection and Semantic Segmentation -- 5 Conclusion -- References -- Registration Based Few-Shot Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Anomaly Detection -- 3 Problem Setting -- 4 Method -- 4.1 Feature Registration Network -- 4.2 Normal Distribution Estimation -- 4.3 Inference -- 5 Experiments -- 5.1 Experimental Setups -- 5.2 Comparison with State-of-the-Art Methods -- 5.3 Ablation Studies -- 5.4 Visualization Analysis -- 6 Conclusion -- References -- Improving Robustness by Enhancing Weak Subnets -- 1 Introduction -- 2 Related Work -- 3 EWS: Training by Enhancing Weak Subnets -- 3.1 Subnet Construction and Impact on Overall Performance -- 3.2 Finding Particularly Weak Subnets -- 3.3 EWS: Enhancing Weak Subnets with Knowledge Distillation -- 3.4 Combining EWS with Adversarial Training -- 4 Experiments -- 4.1 Improving Corruption Robustness -- 4.2 Improving Adversarial Robustness -- 5 Ablation and Discussions -- 5.1 Search Strategies and Hyper-Parameters -- 5.2 Vulnerability of Blocks and Layers -- 6 Conclusion -- References.
Learning Invariant Visual Representations for Compositional Zero-Shot Learning.
Record no. UNISA-996500065903316
Held at: Univ. di Salerno
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
Author Avidan Shai
Publication Cham : Springer, 2022
Physical description 1 online resource (800 pages)
Classification 006.37
Other authors (Persons) Brostow Gabriel
Cissé Moustapha
Farinella Giovanni Maria
Hassner Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19790-9
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNISA-996495565403316
Held at: Univ. di Salerno
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
Author Avidan Shai
Publication Cham : Springer, 2022
Physical description 1 online resource (800 pages)
Classification 006.37
Other authors (Persons) Brostow Gabriel
Cissé Moustapha
Farinella Giovanni Maria
Hassner Tal
Series Lecture Notes in Computer Science
Topical subject Visió per ordinador
Reconeixement de formes (Informàtica)
Genre / form subject Congressos
Llibres electrònics
Uncontrolled subject Engineering
Technology & Engineering
ISBN 9783031197901
3031197909
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNINA-9910619273903321
Held at: Univ. Federico II
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
Author Avidan Shai
Publication Cham : Springer, 2022
Physical description 1 online resource (803 pages)
Classification 006.37
Other authors (Persons) Brostow Gabriel
Cissé Moustapha
Farinella Giovanni Maria
Hassner Tal
Series Lecture Notes in Computer Science
Topical subject Visió per ordinador
Reconeixement de formes (Informàtica)
Genre / form subject Congressos
Llibres electrònics
Uncontrolled subject Engineering
Technology & Engineering
ISBN 9783031200533
3031200535
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNINA-9910629291203321
Held at: Univ. Federico II
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
Edition [1st ed. 2022.]
Publication Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (801 pages)
Classification 006.37
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19833-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency -- Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation -- Spotting Temporally Precise, Fine-Grained Events in Video -- Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation -- Efficient Video Transformers with Spatial-Temporal Token Selection -- Long Movie Clip Classification with State-Space Video Models -- Prompting Visual-Language Models for Efficient Video Understanding -- Asymmetric Relation Consistency Reasoning for Video Relation Grounding -- Self-Supervised Social Relation Representation for Human Group Detection -- K-Centered Patch Sampling for Efficient Video Recognition -- A Deep Moving-Camera Background Model -- GraphVid: It Only Takes a Few Nodes to Understand a Video -- Delta Distillation for Efficient Video Processing -- MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning -- COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality -- E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context -- TDViT: Temporal Dilated Video Transformer for Dense Video Tasks -- Semi-Supervised Learning of Optical Flow by Flow Supervisor -- Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization -- Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion -- MaCLR: Motion-Aware Contrastive Learning of Representations for Videos -- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection -- Frozen CLIP Models Are Efficient Video Learners -- PIP: Physical Interaction Prediction via Mental Simulation with Span Selection -- Panoramic Vision Transformer for Saliency Detection in 360° Videos -- Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration -- Motion Sensitive Contrastive Learning for Self-Supervised Video Representation 
-- Dynamic Temporal Filtering In Video Models -- Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification -- Temporal Lift Pooling for Continuous Sign Language Recognition -- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes -- SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding -- Cross-Modal Prototype Driven Network for Radiology Report Generation -- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts -- SeqTR: A Simple Yet Universal Network for Visual Grounding -- VTC: Improving Video-Text Retrieval with User Comments -- FashionViL: Fashion-Focused Vision-and-Language Representation Learning -- Weakly Supervised Grounding for VQA in Vision-Language Transformers -- Automatic Dense Annotation of Large-Vocabulary Sign Language Videos -- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval -- GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval -- A Simple and Robust Correlation Filtering Method for Text-Based Person Search.
Record no. UNISA-996500066303316
Held at: Univ. di Salerno
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
Edition [1st ed. 2022.]
Publication Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (801 pages)
Classification 006.37
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19833-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record Nr. UNINA-9910629292203321
Cham, Switzerland : Springer, [2022]
Printed material
Held at: Univ. Federico II
Computer Vision, Imaging and Computer Graphics Theory and Applications : 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Lisbon, Portugal, February 19–21, 2023, Revised Selected Papers / edited by A. Augusto de Sousa, Thomas Bashford-Rogers, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Petia Radeva, Giovanni Maria Farinella, Kadi Bouatouch
Author de Sousa, A. Augusto
Edition [1st ed. 2024.]
Publication Cham : Springer Nature Switzerland : Imprint: Springer, 2024
Physical description 1 online resource (419 pages)
Discipline 006
Other authors (Persons) Bashford-Rogers, Thomas
Paljic, Alexis
Ziat, Mounia
Hurter, Christophe
Purchase, Helen
Radeva, Petia
Farinella, Giovanni Maria
Bouatouch, Kadi
Series Communications in Computer and Information Science
Topical subject Image processing - Digital techniques
Computer vision
Computer engineering
Computer networks
Artificial intelligence
Application software
User interfaces (Computer systems)
Human-computer interaction
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Engineering and Networks
Artificial Intelligence
Computer and Information Systems Applications
User Interfaces and Human Computer Interaction
ISBN 3-031-66743-3
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Analysis of Solar Radiation on Facades Using Mobile Augmented Reality -- Unified Shape Analysis and Synthesis via Deformable Voxel Grids -- Epipolar Equation Weighting for Accurate Camera Motion from Two Consecutive Frames -- Absolute ROMP Recovering Multi Person 3D Poses and Shapes with Absolute Scales from a Single RGB Image -- Semi Supervised Task Aware Image to Image Translation -- Deep Detection Dreams Enhancing Visualization Tools for Single Stage Object Detectors -- Approaches to Face Verification Through Attribute Based Attention -- Towards Fast Detection and Classification of Moving Objects -- ST SACLF Style Transfer Informed Self Attention Classifier for Bias Aware Painting Classification -- Attention to Emotions Body Emotion Recognition In The Wild Using Self Attention Transformer Network -- Linking Data Separation, Visual Separation, and Classifier Performance Using Multidimensional Projections -- Using Cockpit Interactions for Implicit Eye Tracking Calibration in a Flight Simulator -- Evaluation of Flexible Structured Light Calibration Using Circles -- GPS Enhanced RGB D IMU Calibration for Accurate Pose Estimation -- Application of Contrast Driven Color Class Assignment to Four Categorical Data Visualization Diagrams -- Measuring And Interpreting the Quality of 3D Projections of High Dimensional Data -- Visualizing Military Operations Extended Geospatial Temporal Survey.
Record Nr. UNINA-9910882886503321
Held at: Univ. Federico II
Computer Vision, Imaging and Computer Graphics Theory and Applications [electronic resource] : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch
Author de Sousa, A. Augusto
Edition [1st ed. 2023.]
Publication Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (343 pages)
Discipline 006
Other authors (Persons) Debattista, Kurt
Paljic, Alexis
Ziat, Mounia
Hurter, Christophe
Purchase, Helen
Farinella, Giovanni Maria
Radeva, Petia
Bouatouch, Kadi
Series Communications in Computer and Information Science
Topical subject Image processing - Digital techniques
Computer vision
Computer engineering
Computer networks
Artificial intelligence
Application software
User interfaces (Computer systems)
Human-computer interaction
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Engineering and Networks
Artificial Intelligence
Computer and Information Systems Applications
User Interfaces and Human Computer Interaction
ISBN 3-031-45725-0
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Contents -- Automatic Threshold RanSaC Algorithms for Pose Estimation Tasks -- 1 Introduction -- 2 RanSaC Methods -- 2.1 Notation -- 2.2 History of RanSaC Algorithms -- 3 Adaptative RanSaC Algorithms -- 4 Data Generation Methodology -- 4.1 Models and Estimators -- 4.2 Semi-artificial Data Generation Method -- 5 Benchmark and Results -- 5.1 Performance Measures -- 5.2 Parameters -- 5.3 Results -- 5.4 Analysis and Comparison -- 6 Conclusion -- References -- Semi-automated Generation of Accurate Ground-Truth for 3D Object Detection -- 1 Introduction -- 2 Related Work on 3D Object Detection -- 2.1 Techniques for Early Object Detection -- 2.2 CNN-Based 3D Object Detection -- 2.3 Conclusions on Related Work -- 3 Semi-automated 3D Dataset Generation -- 3.1 Orientation Estimation -- 3.2 3D Box Estimation -- 4 Experiments -- 4.1 Experimental Setup and Configuration -- 4.2 Evaluation 1: Annotation-Processing Chain -- 4.3 Evaluation 2: 3D Object Detector Trained on the Annotation-Processing Configurations -- 4.4 Cross-Validation on KITTI Dataset -- 4.5 Unsupervised Approach -- 5 Conclusion -- References -- A Quantitative and Qualitative Analysis on a GAN-Based Face Mask Removal on Masked Images and Videos -- 1 Introduction -- 2 Related Works -- 2.1 Inpainting -- 2.2 Face Completion -- 3 Method -- 3.1 Pix2pix-Based Inpainting -- 3.2 Custom Loss Function -- 3.3 System Overview -- 3.4 Predicting Feature Points on a Face -- 4 Experiment -- 4.1 Image Evaluation -- 4.2 Video Evaluation -- 5 Discussion -- 5.1 Quality of Generated Images -- 5.2 Discriminating Facial Expressions -- 5.3 Generating Smooth Videos -- 5.4 Additional Quantitative Analyses -- 6 Limitations -- 7 Conclusion -- References -- Dense Material Segmentation with Context-Aware Network -- 1 Introduction -- 2 Related Works -- 2.1 Material Segmentation Datasets.
2.2 Fully Convolutional Network -- 2.3 Material Segmentation with FCN -- 2.4 Global and Local Training -- 2.5 Boundary Refinement -- 2.6 Self-training -- 3 CAM-SegNet Architecture -- 3.1 Feature Sharing Connection -- 3.2 Context-Aware Dense Material Segmentation -- 3.3 Self-training Approach -- 4 CAM-SegNet Experiment Configurations -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 5 CAM-SegNet Performance Analysis -- 5.1 Quantitative Analysis -- 5.2 Qualitative Analysis -- 5.3 Ablation Study -- 6 Conclusions -- References -- Partial Alignment of Time Series for Action and Activity Prediction -- 1 Introduction -- 2 Related Work -- 3 Temporal Alignment of Action/Activity Sequences -- 3.1 Alignment Methods - Segmented Sequences -- 3.2 Alignment Methods - Unsegmented Sequences -- 3.3 Action and Activity Prediction -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Alignment-Based Prediction in Segmented Sequences -- 4.3 Alignment-Based Action Prediction in Unsegmented Sequences -- 4.4 Graph-Based Activity Prediction -- 4.5 Duration Prognosis -- 5 Conclusions -- References -- Automatic Bi-LSTM Architecture Search Using Bayesian Optimisation for Vehicle Activity Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Trajectory Representation and Analysis -- 2.2 Deep Neural Network Optimisation -- 3 Method -- 3.1 Qualitative Feature Representation -- 3.2 Automatic Bi-LSTM Architecture Search -- 3.3 Optimal Architecture Selection -- 3.4 VNet Modelling -- 4 Vehicle Activity Datasets -- 4.1 Highway Drone Dataset -- 4.2 Traffic Dataset -- 4.3 Vehicle Obstacle Interaction Dataset -- 4.4 Next Generation Simulation Dataset -- 4.5 Combined Dataset -- 5 Experiments and Results -- 5.1 Optimal Architecture Selection -- 5.2 Evaluation of the Optimal Architecture -- 6 Discussion -- 7 Conclusion -- References.
ANTENNA: Visual Analytics of Mobility Derived from Cellphone Data -- 1 Introduction -- 2 Related Work -- 2.1 Reconstruction and Extraction of Trajectories -- 2.2 Visual Analytics of Movement -- 3 System Overview -- 3.1 Backend and Frontend -- 4 Data -- 4.1 Database -- 4.2 Processing Pipeline -- 5 ANTENNA's Visualization -- 5.1 Tasks and Design Requirements -- 5.2 Visual Query -- 5.3 Grid Aggregation Mode -- 5.4 Road Aggregation Mode -- 6 Usage Scenarios -- 6.1 Scenario 1: Inter-Urban Movements -- 6.2 Scenario 2: Group Movements -- 7 User Testing -- 7.1 Methodology -- 7.2 Tasks -- 7.3 Results -- 8 Discussion -- 9 Conclusion -- References -- Influence of Errors on the Evaluation of Text Classification Systems -- 1 Introduction -- 2 Setup -- 2.1 Models and Dataset -- 2.2 Explanation Methods -- 2.3 Evaluation of the Models -- 2.4 System Output and Explanation Visualization -- 3 Experiment 1: Effect on the Evaluation of One System -- 3.1 Experiment Design -- 3.2 Task and Questionnaire -- 3.3 Participant Recruitment -- 3.4 Results -- 3.5 Qualitative Results -- 4 Experiment 2: Effect on the Comparison of Two Systems -- 4.1 Experiment Design -- 4.2 Task and Questionnaire -- 4.3 Participant Recruitment -- 4.4 Results -- 5 Experiment 3: Effect of the Comparison of Two Systems (Bias Error Pattern) -- 5.1 Experiment Design -- 5.2 Results -- 6 Experiment 4: Effect of Incorrect Examples (with a Different Language) -- 6.1 Experiment Design -- 6.2 Task and Questionnaire -- 6.3 Participant Recruitment -- 6.4 Translation -- 6.5 Results -- 6.6 Qualitative Results -- 7 Discussion -- 7.1 Limitations -- 8 Conclusion -- References -- Autonomous Navigation Method Considering Passenger Comfort Recognition for Personal Mobility Vehicles in Crowded Pedestrian Spaces -- 1 Introduction -- 2 Process of Passenger Comfort Recognition.
3 Investigation of Passenger Comfort Recognition -- 3.1 Passenger Comfort Evaluation Experiment -- 3.2 Effects of Current Situation on Comfort Recognition -- 3.3 Effects of Future Status on Comfort Recognition -- 3.4 Characteristics of Passenger Comfort Recognition -- 4 Proposal of an Autonomous Navigation Method Considering Passenger Comfort Recognition -- 4.1 Design -- 4.2 Validation -- 5 Conclusions -- References -- The Electrodermal Activity of Player Experience in Virtual Reality Games: An Extended Evaluation of the Phasic Component -- 1 Introduction -- 2 Background -- 2.1 Related Work -- 3 Methodology -- 3.1 EDA Data Capture and Phasic Component Calculation -- 3.2 Phasic Component Analysis -- 3.3 Game Experience Analysis -- 3.4 Statistical Analyses -- 3.5 Implementation Tools -- 3.6 Ethical Considerations -- 4 Results -- 4.1 Peaks per Minute -- 4.2 Average Peak Amplitude -- 4.3 Game Experience -- 4.4 Correlation Analysis -- 5 Discussion -- 6 Conclusion and Future Work -- References -- MinMax-CAM: Increasing Precision of Explaining Maps by Contrasting Gradient Signals and Regularizing Kernel Usage -- 1 Introduction -- 2 Related Work -- 3 Contrasting Class Gradient Information -- 3.1 Intuition -- 3.2 Definition -- 3.3 Reducing Noise by Removing Negative Contributions -- 4 Reducing Shared Information Between Classifiers -- 4.1 Counterbalancing Activation Vanishing -- 5 Experimental Setup -- 5.1 Evaluations over Architectures and Problem Domains -- 5.2 Training Procedure -- 5.3 Evaluation Metrics -- 6 Results -- 6.1 Comparison Between Architectures -- 6.2 Evaluation over Distinct Problem Domains -- 6.3 Kernel Usage Regularization -- 7 Conclusions -- References -- DIAR: Deep Image Alignment and Reconstruction Using Swin Transformers -- 1 Introduction -- 2 Related Work -- 3 Dataset -- 3.1 Aligned Dataset -- 3.2 Misaligned Dataset.
4 Deep Image Alignment -- 5 Architecture -- 5.1 Deep Residual Sets -- 5.2 Video Swin Transformer -- 5.3 Image Reconstruction Using Swin Transformers -- 5.4 Training -- 6 Evaluation -- 6.1 Aggregation -- 6.2 Image Reconstruction -- 6.3 Alignment and Reconstruction: -- 7 Conclusion -- References -- Active Learning with Data Augmentation Under Small vs Large Dataset Regimes for Semantic-KITTI Dataset -- 1 Introduction -- 1.1 State of the Art -- 2 Methodology -- 3 Validation and Results -- 3.1 Class Based Learning Efficiency -- 3.2 Dataset Size Growth: 1/4 Semantic-KITTI vs Full Semantic-KITTI -- 3.3 t-SNE Problem Analysis -- 4 Conclusion -- 4.1 Challenges and Future Scope -- References -- Transformers in Unsupervised Structure-from-Motion -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Monocular Unsupervised SfM -- 3.2 Architecture -- 3.3 Intrinsics -- 3.4 Appearance-Based Losses -- 4 Experiments -- 4.1 Datasets -- 4.2 Architecture -- 4.3 Implementation Details -- 4.4 Evaluation Metrics -- 4.5 Impact of Architecture -- 4.6 Generalizability -- 4.7 Auxiliary Tasks -- 4.8 Depth Estimation with Learned Camera Intrinsics -- 4.9 Efficiency -- 4.10 Comparing Performance -- 5 Conclusion -- References -- A Study of Aerial Image-Based 3D Reconstructions in a Metropolitan Area -- 1 Introduction -- 2 Previous Work -- 3 Urban Environment -- 3.1 Ground Truth -- 3.2 Image Sets -- 3.3 Urban Categorization -- 4 Experimental Setup -- 4.1 3D Reconstruction Techniques -- 4.2 Pipelines Under Study -- 4.3 Alignment -- 5 Experimental Results -- 5.1 Scene Level Evaluation -- 5.2 Urban Category Centric Evaluation -- 5.3 General Pipeline Evaluation -- 6 Conclusion -- References -- Author Index.
Record Nr. UNISA-996558568803316
Held at: Univ. di Salerno
Computer Vision, Imaging and Computer Graphics Theory and Applications : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch
Author de Sousa, A. Augusto
Edition [1st ed. 2023.]
Publication Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (343 pages)
Discipline 006
Other authors (Persons) Debattista, Kurt
Paljic, Alexis
Ziat, Mounia
Hurter, Christophe
Purchase, Helen
Farinella, Giovanni Maria
Radeva, Petia
Bouatouch, Kadi
Series Communications in Computer and Information Science
Topical subject Image processing - Digital techniques
Computer vision
Computer engineering
Computer networks
Artificial intelligence
Application software
User interfaces (Computer systems)
Human-computer interaction
Computer Imaging, Vision, Pattern Recognition and Graphics
Computer Engineering and Networks
Artificial Intelligence
Computer and Information Systems Applications
User Interfaces and Human Computer Interaction
ISBN 9783031457258
3031457250
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Contents -- Automatic Threshold RanSaC Algorithms for Pose Estimation Tasks -- 1 Introduction -- 2 RanSaC Methods -- 2.1 Notation -- 2.2 History of RanSaC Algorithms -- 3 Adaptative RanSaC Algorithms -- 4 Data Generation Methodology -- 4.1 Models and Estimators -- 4.2 Semi-artificial Data Generation Method -- 5 Benchmark and Results -- 5.1 Performance Measures -- 5.2 Parameters -- 5.3 Results -- 5.4 Analysis and Comparison -- 6 Conclusion -- References -- Semi-automated Generation of Accurate Ground-Truth for 3D Object Detection -- 1 Introduction -- 2 Related Work on 3D Object Detection -- 2.1 Techniques for Early Object Detection -- 2.2 CNN-Based 3D Object Detection -- 2.3 Conclusions on Related Work -- 3 Semi-automated 3D Dataset Generation -- 3.1 Orientation Estimation -- 3.2 3D Box Estimation -- 4 Experiments -- 4.1 Experimental Setup and Configuration -- 4.2 Evaluation 1: Annotation-Processing Chain -- 4.3 Evaluation 2: 3D Object Detector Trained on the Annotation-Processing Configurations -- 4.4 Cross-Validation on KITTI Dataset -- 4.5 Unsupervised Approach -- 5 Conclusion -- References -- A Quantitative and Qualitative Analysis on a GAN-Based Face Mask Removal on Masked Images and Videos -- 1 Introduction -- 2 Related Works -- 2.1 Inpainting -- 2.2 Face Completion -- 3 Method -- 3.1 Pix2pix-Based Inpainting -- 3.2 Custom Loss Function -- 3.3 System Overview -- 3.4 Predicting Feature Points on a Face -- 4 Experiment -- 4.1 Image Evaluation -- 4.2 Video Evaluation -- 5 Discussion -- 5.1 Quality of Generated Images -- 5.2 Discriminating Facial Expressions -- 5.3 Generating Smooth Videos -- 5.4 Additional Quantitative Analyses -- 6 Limitations -- 7 Conclusion -- References -- Dense Material Segmentation with Context-Aware Network -- 1 Introduction -- 2 Related Works -- 2.1 Material Segmentation Datasets.
2.2 Fully Convolutional Network -- 2.3 Material Segmentation with FCN -- 2.4 Global and Local Training -- 2.5 Boundary Refinement -- 2.6 Self-training -- 3 CAM-SegNet Architecture -- 3.1 Feature Sharing Connection -- 3.2 Context-Aware Dense Material Segmentation -- 3.3 Self-training Approach -- 4 CAM-SegNet Experiment Configurations -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 5 CAM-SegNet Performance Analysis -- 5.1 Quantitative Analysis -- 5.2 Qualitative Analysis -- 5.3 Ablation Study -- 6 Conclusions -- References -- Partial Alignment of Time Series for Action and Activity Prediction -- 1 Introduction -- 2 Related Work -- 3 Temporal Alignment of Action/Activity Sequences -- 3.1 Alignment Methods - Segmented Sequences -- 3.2 Alignment Methods - Unsegmented Sequences -- 3.3 Action and Activity Prediction -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Alignment-Based Prediction in Segmented Sequences -- 4.3 Alignment-Based Action Prediction in Unsegmented Sequences -- 4.4 Graph-Based Activity Prediction -- 4.5 Duration Prognosis -- 5 Conclusions -- References -- Automatic Bi-LSTM Architecture Search Using Bayesian Optimisation for Vehicle Activity Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Trajectory Representation and Analysis -- 2.2 Deep Neural Network Optimisation -- 3 Method -- 3.1 Qualitative Feature Representation -- 3.2 Automatic Bi-LSTM Architecture Search -- 3.3 Optimal Architecture Selection -- 3.4 VNet Modelling -- 4 Vehicle Activity Datasets -- 4.1 Highway Drone Dataset -- 4.2 Traffic Dataset -- 4.3 Vehicle Obstacle Interaction Dataset -- 4.4 Next Generation Simulation Dataset -- 4.5 Combined Dataset -- 5 Experiments and Results -- 5.1 Optimal Architecture Selection -- 5.2 Evaluation of the Optimal Architecture -- 6 Discussion -- 7 Conclusion -- References.
ANTENNA: Visual Analytics of Mobility Derived from Cellphone Data -- 1 Introduction -- 2 Related Work -- 2.1 Reconstruction and Extraction of Trajectories -- 2.2 Visual Analytics of Movement -- 3 System Overview -- 3.1 Backend and Frontend -- 4 Data -- 4.1 Database -- 4.2 Processing Pipeline -- 5 ANTENNA's Visualization -- 5.1 Tasks and Design Requirements -- 5.2 Visual Query -- 5.3 Grid Aggregation Mode -- 5.4 Road Aggregation Mode -- 6 Usage Scenarios -- 6.1 Scenario 1: Inter-Urban Movements -- 6.2 Scenario 2: Group Movements -- 7 User Testing -- 7.1 Methodology -- 7.2 Tasks -- 7.3 Results -- 8 Discussion -- 9 Conclusion -- References -- Influence of Errors on the Evaluation of Text Classification Systems -- 1 Introduction -- 2 Setup -- 2.1 Models and Dataset -- 2.2 Explanation Methods -- 2.3 Evaluation of the Models -- 2.4 System Output and Explanation Visualization -- 3 Experiment 1: Effect on the Evaluation of One System -- 3.1 Experiment Design -- 3.2 Task and Questionnaire -- 3.3 Participant Recruitment -- 3.4 Results -- 3.5 Qualitative Results -- 4 Experiment 2: Effect on the Comparison of Two Systems -- 4.1 Experiment Design -- 4.2 Task and Questionnaire -- 4.3 Participant Recruitment -- 4.4 Results -- 5 Experiment 3: Effect of the Comparison of Two Systems (Bias Error Pattern) -- 5.1 Experiment Design -- 5.2 Results -- 6 Experiment 4: Effect of Incorrect Examples (with a Different Language) -- 6.1 Experiment Design -- 6.2 Task and Questionnaire -- 6.3 Participant Recruitment -- 6.4 Translation -- 6.5 Results -- 6.6 Qualitative Results -- 7 Discussion -- 7.1 Limitations -- 8 Conclusion -- References -- Autonomous Navigation Method Considering Passenger Comfort Recognition for Personal Mobility Vehicles in Crowded Pedestrian Spaces -- 1 Introduction -- 2 Process of Passenger Comfort Recognition.
3 Investigation of Passenger Comfort Recognition -- 3.1 Passenger Comfort Evaluation Experiment -- 3.2 Effects of Current Situation on Comfort Recognition -- 3.3 Effects of Future Status on Comfort Recognition -- 3.4 Characteristics of Passenger Comfort Recognition -- 4 Proposal of an Autonomous Navigation Method Considering Passenger Comfort Recognition -- 4.1 Design -- 4.2 Validation -- 5 Conclusions -- References -- The Electrodermal Activity of Player Experience in Virtual Reality Games: An Extended Evaluation of the Phasic Component -- 1 Introduction -- 2 Background -- 2.1 Related Work -- 3 Methodology -- 3.1 EDA Data Capture and Phasic Component Calculation -- 3.2 Phasic Component Analysis -- 3.3 Game Experience Analysis -- 3.4 Statistical Analyses -- 3.5 Implementation Tools -- 3.6 Ethical Considerations -- 4 Results -- 4.1 Peaks per Minute -- 4.2 Average Peak Amplitude -- 4.3 Game Experience -- 4.4 Correlation Analysis -- 5 Discussion -- 6 Conclusion and Future Work -- References -- MinMax-CAM: Increasing Precision of Explaining Maps by Contrasting Gradient Signals and Regularizing Kernel Usage -- 1 Introduction -- 2 Related Work -- 3 Contrasting Class Gradient Information -- 3.1 Intuition -- 3.2 Definition -- 3.3 Reducing Noise by Removing Negative Contributions -- 4 Reducing Shared Information Between Classifiers -- 4.1 Counterbalancing Activation Vanishing -- 5 Experimental Setup -- 5.1 Evaluations over Architectures and Problem Domains -- 5.2 Training Procedure -- 5.3 Evaluation Metrics -- 6 Results -- 6.1 Comparison Between Architectures -- 6.2 Evaluation over Distinct Problem Domains -- 6.3 Kernel Usage Regularization -- 7 Conclusions -- References -- DIAR: Deep Image Alignment and Reconstruction Using Swin Transformers -- 1 Introduction -- 2 Related Work -- 3 Dataset -- 3.1 Aligned Dataset -- 3.2 Misaligned Dataset.
4 Deep Image Alignment -- 5 Architecture -- 5.1 Deep Residual Sets -- 5.2 Video Swin Transformer -- 5.3 Image Reconstruction Using Swin Transformers -- 5.4 Training -- 6 Evaluation -- 6.1 Aggregation -- 6.2 Image Reconstruction -- 6.3 Alignment and Reconstruction: -- 7 Conclusion -- References -- Active Learning with Data Augmentation Under Small vs Large Dataset Regimes for Semantic-KITTI Dataset -- 1 Introduction -- 1.1 State of the Art -- 2 Methodology -- 3 Validation and Results -- 3.1 Class Based Learning Efficiency -- 3.2 Dataset Size Growth: 1/4 Semantic-KITTI vs Full Semantic-KITTI -- 3.3 t-SNE Problem Analysis -- 4 Conclusion -- 4.1 Challenges and Future Scope -- References -- Transformers in Unsupervised Structure-from-Motion -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Monocular Unsupervised SfM -- 3.2 Architecture -- 3.3 Intrinsics -- 3.4 Appearance-Based Losses -- 4 Experiments -- 4.1 Datasets -- 4.2 Architecture -- 4.3 Implementation Details -- 4.4 Evaluation Metrics -- 4.5 Impact of Architecture -- 4.6 Generalizability -- 4.7 Auxiliary Tasks -- 4.8 Depth Estimation with Learned Camera Intrinsics -- 4.9 Efficiency -- 4.10 Comparing Performance -- 5 Conclusion -- References -- A Study of Aerial Image-Based 3D Reconstructions in a Metropolitan Area -- 1 Introduction -- 2 Previous Work -- 3 Urban Environment -- 3.1 Ground Truth -- 3.2 Image Sets -- 3.3 Urban Categorization -- 4 Experimental Setup -- 4.1 3D Reconstruction Techniques -- 4.2 Pipelines Under Study -- 4.3 Alignment -- 5 Experimental Results -- 5.1 Scene Level Evaluation -- 5.2 Urban Category Centric Evaluation -- 5.3 General Pipeline Evaluation -- 6 Conclusion -- References -- Author Index.
Record Nr. UNINA-9910754096903321
Held at: Univ. Federico II