Advanced topics in computer vision / Giovanni Maria Farinella, Sebastiano Battiato, Roberto Cipolla, editors
| Edition | [1st ed. 2013.] |
| Publication/distribution/printing | London : Springer, 2013 |
| Physical description | 1 online resource (xiv, 433 pages) : illustrations (some color) |
| Classification | 006.42 |
| Series | Advances in Computer Vision and Pattern Recognition |
| Topical subject | Computer vision |
| ISBN | 1-4471-5520-3 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Contents note | Visual Features: From Early Concepts to Modern Computer Vision -- Where Next in Object Recognition and How Much Supervision Do We Need? -- Recognizing Human Actions by Using Effective Codebooks and Tracking -- Evaluating and Extending Trajectory Features for Activity Recognition -- Co-Recognition of Images and Videos: Unsupervised Matching of Identical Object Patterns and its Applications -- Stereo Matching: State-of-the-Art and Research Challenges -- Visual Localization for Micro Aerial Vehicles in Urban Outdoor Environments -- Moment Constraints in Convex Optimization for Segmentation and Tracking -- Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets -- Top-Down Bayesian Inference of Indoor Scenes -- Efficient Loopy Belief Propagation Using the Four Color Theorem -- Boosting k-Nearest Neighbors Classification -- Learning Object Detectors in Stationary Environments -- Video Temporal Super-Resolution Based on Self-Similarity. |
| Record no. | UNINA-9910437597203321 |
| Available at: Univ. Federico II |
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
| Author | Avidan, Shai |
| Publication/distribution/printing | Cham : Springer, 2022 |
| Physical description | 1 online resource (803 pages) |
| Classification | 006.37 |
| Other authors (persons) | Brostow, Gabriel; Cissé, Moustapha; Farinella, Giovanni Maria; Hassner, Tal |
| Series | Lecture Notes in Computer Science |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 3-031-20053-5 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Contents note |
Intro -- Foreword -- Preface -- Organization -- Contents - Part XXIV -- Improving Vision Transformers by Revisiting High-Frequency Components -- 1 Introduction -- 2 Related Work -- 3 Revisiting ViT Models from a Frequency Perspective -- 4 The Proposed Method -- 4.1 Adversarial Training with High-Frequency Perturbations -- 4.2 A Case Study Using ViT-B -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Results on ImageNet Classification -- 5.3 Results on Out-of-distribution Data -- 5.4 Transfer Learning to Downstream Tasks -- 5.5 Ablation Studies -- 5.6 Discussions -- 6 Conclusions and Future Work -- References -- Recurrent Bilinear Optimization for Binary Neural Networks -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Bilinear Model of BNNs -- 3.3 Recurrent Bilinear Optimization -- 3.4 Discussion -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Ablation Study -- 4.3 Image Classification -- 4.4 Object Detection -- 4.5 Deployment Efficiency -- 5 Conclusion -- References -- Neural Architecture Search for Spiking Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Spiking Neural Networks -- 2.2 Neural Architecture Search -- 3 Preliminaries -- 3.1 Leaky Integrate-and-Fire Neuron -- 3.2 NAS Without Training -- 4 Methodology -- 4.1 Linear Regions from LIF Neurons -- 4.2 Sparsity-Aware Hamming Distance -- 4.3 Searching Forward and Backward Connections -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Performance Comparison -- 5.3 Experimental Analysis -- 6 Conclusion -- References -- Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification -- 1 Introduction -- 2 Related Work -- 2.1 Fine-Grained Visual Classification -- 2.2 Human Attention in Vision -- 3 Approach -- 3.1 Overview -- 3.2 Region Feature Mining Module.
3.3 Cross-Hierarchical Orthogonal Fusion Module -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Hierarchy Interaction Analysis -- 4.3 Evaluation on Traditional FGVC Setting -- 4.4 Further Analysis -- 5 Conclusions -- References -- DaViT: Dual Attention Vision Transformers -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Spatial Window Attention -- 3.3 Channel Group Attention -- 3.4 Model Instantiation -- 4 Analysis -- 5 Experiments -- 5.1 Image Classification -- 5.2 Object Detection and Instance Segmentation -- 5.3 Semantic Segmentation on ADE20k -- 5.4 Ablation Study -- 6 Conclusion -- References -- Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation and Overview -- 3.2 Discrepancy Elimination Network (DEN) -- 3.3 Optimal-Transport Label Assignment (OTLA) -- 3.4 Prediction Alignment Learning (PAL) -- 3.5 Optimization -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Ablation Study -- 4.5 Discussion -- 5 Conclusion -- References -- Locality Guidance for Improving Vision Transformers on Tiny Datasets -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 The Overall Approach -- 3.2 Guidance Positions -- 3.3 Architecture of the CNN -- 4 Experiments -- 4.1 Main Results -- 4.2 Discussion -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neighborhood Collective Estimation for Noisy Label Identification and Correction -- 1 Introduction -- 2 Related Work -- 2.1 Noise Verification -- 2.2 Label Correction -- 3 The Proposed Method -- 3.1 Neighborhood Collective Noise Verification -- 3.2 Neighborhood Collective Label Correction -- 3.3 Training Objectives -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparisons with the State of the Art -- 4.3 Analysis. 
5 Conclusions -- References -- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay -- 1 Introduction -- 2 Related Works -- 2.1 Class-Incremental Learning -- 2.2 Few-Shot Class-Incremental Learning -- 2.3 Data-Free Knowledge Distillation -- 3 Preliminaries -- 3.1 Problem Setting -- 3.2 Data-Free Replay -- 4 Methodology -- 4.1 Entropy-Regularized Data-Free Replay -- 4.2 Learning Incrementally with Uncertain Data -- 5 Experiments -- 5.1 Datasets -- 5.2 Implementation Details -- 5.3 Re-implementation of Replay-based Methods -- 5.4 Main Results and Comparison -- 5.5 Analysis -- 6 Conclusion -- References -- Anti-retroactive Interference for Lifelong Learning -- 1 Introduction -- 2 Related Work -- 2.1 Lifelong Learning -- 2.2 Adversarial Training -- 3 Proposed Method -- 3.1 Extracting Intra-Class Features -- 3.2 Generating and Fusing Task-Specific Models -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results and Comparison -- 4.4 Ablation Study -- 5 Conclusion -- References -- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Build vMF Classifier on Hyper-Sphere -- 3.2 Quantify Distribution Overlap Coefficient on Hyper-Sphere -- 3.3 Improve Representation of Feature and Classifier via o -- 3.4 Calibrate Classifier Weight Beyond Training via o -- 4 Experiments -- 4.1 Long-Tailed Image Classification Task -- 4.2 Long-Tailed Semantic and Instance Segmentation Task -- 4.3 Ablation Study -- 5 Conclusions -- References -- Dynamic Metric Learning with Cross-Level Concept Distillation -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Dynamic Metric Learning -- 3.2 Hierarchical Concept Refiner -- 3.3 Cross-Level Concept Distillation -- 3.4 Discussions -- 4 Experiments. 
4.1 Datasets -- 4.2 Evaluation Protocol -- 4.3 Implementation Details -- 4.4 Main Results -- 4.5 Experimental Analysis -- 5 Conclusion -- References -- MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing -- 1 Introduction -- 2 Related Work -- 2.1 Event-Based Representations -- 2.2 Memory-Based Networks -- 3 Event Camera Model -- 4 Method -- 4.1 Dual-Branch Structure -- 4.2 Double Polarities Calculation Method -- 4.3 Point-Wise Memory Bank -- 4.4 Training and Testing Strategies -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Ablation Study -- 5.3 Object Recognition -- 5.4 Gesture Recognition -- 6 Conclusion -- References -- Out-of-distribution Detection with Boundary Aware Learning -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Boundary Aware Learning -- 4.1 Representation Extraction Module (REM) -- 4.2 Representation Sampling Module (RSM) -- 4.3 Representation Discrimination Module (RDM) -- 5 Experiments -- 5.1 Dataset -- 5.2 Experimental Setup -- 5.3 Ablation Study -- 5.4 Detection Results -- 5.5 Visualization of trivial and hard OOD features -- 6 Conclusion -- References -- Learning Hierarchy Aware Features for Reducing Mistake Severity -- 1 Introduction -- 2 Related Work -- 3 HAF: Proposed Approach -- 3.1 Fine Grained Cross-Entropy (LCEfine) -- 3.2 Soft Hierarchical Consistency (Lshc) -- 3.3 Margin Loss (Lm) -- 3.4 Geometric Consistency (Lgc) -- 4 Experiments and Results -- 4.1 Experimental Setup -- 4.2 Training Configurations -- 4.3 Results -- 4.4 Coarse Classification Accuracy -- 5 Analysis -- 5.1 Ablation Study -- 5.2 Mistakes Severity Plots -- 5.3 Discussion: Hierarchical Metrics -- 6 Conclusion -- References -- Learning to Detect Every Thing in an Open World -- 1 Introduction -- 2 Related Work -- 3 Learning to Detect Every Thing -- 3.1 Data Augmentation: Background Erasing (BackErase). 
3.2 Decoupled Multi-domain Training -- 4 Experiments -- 4.1 Cross-category Generalization -- 4.2 Cross-Dataset Generalization -- 5 Conclusion -- References -- KVT: k-NN Attention for Boosting Vision Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Self-attention -- 2.2 Transformer for Vision -- 3 k-NN Attention -- 3.1 Vanilla Attention -- 3.2 k-NN Attention -- 3.3 Theoretical Analysis on k-NN Attention -- 4 Experiments for Vision Transformers -- 4.1 Experimental Settings -- 4.2 Results on ImageNet -- 4.3 The Impact of Number k -- 4.4 Convergence Speed of k-NN Attention -- 4.5 Other Properties of k-NN Attention -- 4.6 Comparisons with Temperature in Softmax -- 4.7 Visualization -- 4.8 Object Detection and Semantic Segmentation -- 5 Conclusion -- References -- Registration Based Few-Shot Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Anomaly Detection -- 3 Problem Setting -- 4 Method -- 4.1 Feature Registration Network -- 4.2 Normal Distribution Estimation -- 4.3 Inference -- 5 Experiments -- 5.1 Experimental Setups -- 5.2 Comparison with State-of-the-Art Methods -- 5.3 Ablation Studies -- 5.4 Visualization Analysis -- 6 Conclusion -- References -- Improving Robustness by Enhancing Weak Subnets -- 1 Introduction -- 2 Related Work -- 3 EWS: Training by Enhancing Weak Subnets -- 3.1 Subnet Construction and Impact on Overall Performance -- 3.2 Finding Particularly Weak Subnets -- 3.3 EWS: Enhancing Weak Subnets with Knowledge Distillation -- 3.4 Combining EWS with Adversarial Training -- 4 Experiments -- 4.1 Improving Corruption Robustness -- 4.2 Improving Adversarial Robustness -- 5 Ablation and Discussions -- 5.1 Search Strategies and Hyper-Parameters -- 5.2 Vulnerability of Blocks and Layers -- 6 Conclusion -- References. Learning Invariant Visual Representations for Compositional Zero-Shot Learning. |
| Record no. | UNISA-996500065903316 |
| Available at: Univ. di Salerno |
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
| Author | Avidan, Shai |
| Publication/distribution/printing | Cham : Springer, 2022 |
| Physical description | 1 online resource (800 pages) |
| Classification | 006.37 |
| Other authors (persons) | Brostow, Gabriel; Cissé, Moustapha; Farinella, Giovanni Maria; Hassner, Tal |
| Series | Lecture Notes in Computer Science |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 3-031-19790-9 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Record no. | UNISA-996495565403316 |
| Available at: Univ. di Salerno |
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
| Author | Avidan, Shai |
| Publication/distribution/printing | Cham : Springer, 2022 |
| Physical description | 1 online resource (800 pages) |
| Classification | 006.37 |
| Other authors (persons) | Brostow, Gabriel; Cissé, Moustapha; Farinella, Giovanni Maria; Hassner, Tal |
| Series | Lecture Notes in Computer Science |
| Topical subject | Visió per ordinador; Reconeixement de formes (Informàtica) |
| Genre/form subject | Congressos; Llibres electrònics |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 9783031197901; 3031197909 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Record no. | UNINA-9910619273903321 |
| Available at: Univ. Federico II |
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
| Author | Avidan, Shai |
| Publication/distribution/printing | Cham : Springer, 2022 |
| Physical description | 1 online resource (803 pages) |
| Classification | 006.37 |
| Other authors (persons) | Brostow, Gabriel; Cissé, Moustapha; Farinella, Giovanni Maria; Hassner, Tal |
| Series | Lecture Notes in Computer Science |
| Topical subject | Visió per ordinador; Reconeixement de formes (Informàtica) |
| Genre/form subject | Congressos; Llibres electrònics |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 9783031200533; 3031200535 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Contents note |
Intro -- Foreword -- Preface -- Organization -- Contents - Part XXIV -- Improving Vision Transformers by Revisiting High-Frequency Components -- 1 Introduction -- 2 Related Work -- 3 Revisiting ViT Models from a Frequency Perspective -- 4 The Proposed Method -- 4.1 Adversarial Training with High-Frequency Perturbations -- 4.2 A Case Study Using ViT-B -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Results on ImageNet Classification -- 5.3 Results on Out-of-distribution Data -- 5.4 Transfer Learning to Downstream Tasks -- 5.5 Ablation Studies -- 5.6 Discussions -- 6 Conclusions and Future Work -- References -- Recurrent Bilinear Optimization for Binary Neural Networks -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Bilinear Model of BNNs -- 3.3 Recurrent Bilinear Optimization -- 3.4 Discussion -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Ablation Study -- 4.3 Image Classification -- 4.4 Object Detection -- 4.5 Deployment Efficiency -- 5 Conclusion -- References -- Neural Architecture Search for Spiking Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Spiking Neural Networks -- 2.2 Neural Architecture Search -- 3 Preliminaries -- 3.1 Leaky Integrate-and-Fire Neuron -- 3.2 NAS Without Training -- 4 Methodology -- 4.1 Linear Regions from LIF Neurons -- 4.2 Sparsity-Aware Hamming Distance -- 4.3 Searching Forward and Backward Connections -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Performance Comparison -- 5.3 Experimental Analysis -- 6 Conclusion -- References -- Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification -- 1 Introduction -- 2 Related Work -- 2.1 Fine-Grained Visual Classification -- 2.2 Human Attention in Vision -- 3 Approach -- 3.1 Overview -- 3.2 Region Feature Mining Module.
3.3 Cross-Hierarchical Orthogonal Fusion Module -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Hierarchy Interaction Analysis -- 4.3 Evaluation on Traditional FGVC Setting -- 4.4 Further Analysis -- 5 Conclusions -- References -- DaViT: Dual Attention Vision Transformers -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Spatial Window Attention -- 3.3 Channel Group Attention -- 3.4 Model Instantiation -- 4 Analysis -- 5 Experiments -- 5.1 Image Classification -- 5.2 Object Detection and Instance Segmentation -- 5.3 Semantic Segmentation on ADE20k -- 5.4 Ablation Study -- 6 Conclusion -- References -- Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation and Overview -- 3.2 Discrepancy Elimination Network (DEN) -- 3.3 Optimal-Transport Label Assignment (OTLA) -- 3.4 Prediction Alignment Learning (PAL) -- 3.5 Optimization -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Ablation Study -- 4.5 Discussion -- 5 Conclusion -- References -- Locality Guidance for Improving Vision Transformers on Tiny Datasets -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 The Overall Approach -- 3.2 Guidance Positions -- 3.3 Architecture of the CNN -- 4 Experiments -- 4.1 Main Results -- 4.2 Discussion -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neighborhood Collective Estimation for Noisy Label Identification and Correction -- 1 Introduction -- 2 Related Work -- 2.1 Noise Verification -- 2.2 Label Correction -- 3 The Proposed Method -- 3.1 Neighborhood Collective Noise Verification -- 3.2 Neighborhood Collective Label Correction -- 3.3 Training Objectives -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparisons with the State of the Art -- 4.3 Analysis. 
5 Conclusions -- References -- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay -- 1 Introduction -- 2 Related Works -- 2.1 Class-Incremental Learning -- 2.2 Few-Shot Class-Incremental Learning -- 2.3 Data-Free Knowledge Distillation -- 3 Preliminaries -- 3.1 Problem Setting -- 3.2 Data-Free Replay -- 4 Methodology -- 4.1 Entropy-Regularized Data-Free Replay -- 4.2 Learning Incrementally with Uncertain Data -- 5 Experiments -- 5.1 Datasets -- 5.2 Implementation Details -- 5.3 Re-implementation of Replay-based Methods -- 5.4 Main Results and Comparison -- 5.5 Analysis -- 6 Conclusion -- References -- Anti-retroactive Interference for Lifelong Learning -- 1 Introduction -- 2 Related Work -- 2.1 Lifelong Learning -- 2.2 Adversarial Training -- 3 Proposed Method -- 3.1 Extracting Intra-Class Features -- 3.2 Generating and Fusing Task-Specific Models -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results and Comparison -- 4.4 Ablation Study -- 5 Conclusion -- References -- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Build vMF Classifier on Hyper-Sphere -- 3.2 Quantify Distribution Overlap Coefficient on Hyper-Sphere -- 3.3 Improve Representation of Feature and Classifier via o -- 3.4 Calibrate Classifier Weight Beyond Training via o -- 4 Experiments -- 4.1 Long-Tailed Image Classification Task -- 4.2 Long-Tailed Semantic and Instance Segmentation Task -- 4.3 Ablation Study -- 5 Conclusions -- References -- Dynamic Metric Learning with Cross-Level Concept Distillation -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Dynamic Metric Learning -- 3.2 Hierarchical Concept Refiner -- 3.3 Cross-Level Concept Distillation -- 3.4 Discussions -- 4 Experiments. 
4.1 Datasets -- 4.2 Evaluation Protocol -- 4.3 Implementation Details -- 4.4 Main Results -- 4.5 Experimental Analysis -- 5 Conclusion -- References -- MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing -- 1 Introduction -- 2 Related Work -- 2.1 Event-Based Representations -- 2.2 Memory-Based Networks -- 3 Event Camera Model -- 4 Method -- 4.1 Dual-Branch Structure -- 4.2 Double Polarities Calculation Method -- 4.3 Point-Wise Memory Bank -- 4.4 Training and Testing Strategies -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Ablation Study -- 5.3 Object Recognition -- 5.4 Gesture Recognition -- 6 Conclusion -- References -- Out-of-distribution Detection with Boundary Aware Learning -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Boundary Aware Learning -- 4.1 Representation Extraction Module (REM) -- 4.2 Representation Sampling Module (RSM) -- 4.3 Representation Discrimination Module (RDM) -- 5 Experiments -- 5.1 Dataset -- 5.2 Experimental Setup -- 5.3 Ablation Study -- 5.4 Detection Results -- 5.5 Visualization of trivial and hard OOD features -- 6 Conclusion -- References -- Learning Hierarchy Aware Features for Reducing Mistake Severity -- 1 Introduction -- 2 Related Work -- 3 HAF: Proposed Approach -- 3.1 Fine Grained Cross-Entropy (LCEfine) -- 3.2 Soft Hierarchical Consistency (Lshc) -- 3.3 Margin Loss (Lm) -- 3.4 Geometric Consistency (Lgc) -- 4 Experiments and Results -- 4.1 Experimental Setup -- 4.2 Training Configurations -- 4.3 Results -- 4.4 Coarse Classification Accuracy -- 5 Analysis -- 5.1 Ablation Study -- 5.2 Mistakes Severity Plots -- 5.3 Discussion: Hierarchical Metrics -- 6 Conclusion -- References -- Learning to Detect Every Thing in an Open World -- 1 Introduction -- 2 Related Work -- 3 Learning to Detect Every Thing -- 3.1 Data Augmentation: Background Erasing (BackErase). 
3.2 Decoupled Multi-domain Training -- 4 Experiments -- 4.1 Cross-category Generalization -- 4.2 Cross-Dataset Generalization -- 5 Conclusion -- References -- KVT: k-NN Attention for Boosting Vision Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Self-attention -- 2.2 Transformer for Vision -- 3 k-NN Attention -- 3.1 Vanilla Attention -- 3.2 k-NN Attention -- 3.3 Theoretical Analysis on k-NN Attention -- 4 Experiments for Vision Transformers -- 4.1 Experimental Settings -- 4.2 Results on ImageNet -- 4.3 The Impact of Number k -- 4.4 Convergence Speed of k-NN Attention -- 4.5 Other Properties of k-NN Attention -- 4.6 Comparisons with Temperature in Softmax -- 4.7 Visualization -- 4.8 Object Detection and Semantic Segmentation -- 5 Conclusion -- References -- Registration Based Few-Shot Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Anomaly Detection -- 3 Problem Setting -- 4 Method -- 4.1 Feature Registration Network -- 4.2 Normal Distribution Estimation -- 4.3 Inference -- 5 Experiments -- 5.1 Experimental Setups -- 5.2 Comparison with State-of-the-Art Methods -- 5.3 Ablation Studies -- 5.4 Visualization Analysis -- 6 Conclusion -- References -- Improving Robustness by Enhancing Weak Subnets -- 1 Introduction -- 2 Related Work -- 3 EWS: Training by Enhancing Weak Subnets -- 3.1 Subnet Construction and Impact on Overall Performance -- 3.2 Finding Particularly Weak Subnets -- 3.3 EWS: Enhancing Weak Subnets with Knowledge Distillation -- 3.4 Combining EWS with Adversarial Training -- 4 Experiments -- 4.1 Improving Corruption Robustness -- 4.2 Improving Adversarial Robustness -- 5 Ablation and Discussions -- 5.1 Search Strategies and Hyper-Parameters -- 5.2 Vulnerability of Blocks and Layers -- 6 Conclusion -- References. Learning Invariant Visual Representations for Compositional Zero-Shot Learning. |
| Record no. | UNINA-9910629291203321 |
| Available at: Univ. Federico II |
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
| Edition | [1st ed. 2022.] |
| Publication/distribution/printing | Cham, Switzerland : Springer, [2022] |
| Physical description | 1 online resource (801 pages) |
| Classification | 006.37 |
| Series | Lecture Notes in Computer Science |
| Topical subject | Computer vision; Pattern recognition systems |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 3-031-19833-6 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Contents note | Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency -- Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation -- Spotting Temporally Precise, Fine-Grained Events in Video -- Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation -- Efficient Video Transformers with Spatial-Temporal Token Selection -- Long Movie Clip Classification with State-Space Video Models -- Prompting Visual-Language Models for Efficient Video Understanding -- Asymmetric Relation Consistency Reasoning for Video Relation Grounding -- Self-Supervised Social Relation Representation for Human Group Detection -- K-Centered Patch Sampling for Efficient Video Recognition -- A Deep Moving-Camera Background Model -- GraphVid: It Only Takes a Few Nodes to Understand a Video -- Delta Distillation for Efficient Video Processing -- MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning -- COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality -- E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context -- TDViT: Temporal Dilated Video Transformer for Dense Video Tasks -- Semi-Supervised Learning of Optical Flow by Flow Supervisor -- Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization -- Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion -- MaCLR: Motion-Aware Contrastive Learning of Representations for Videos -- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection -- Frozen CLIP Models Are Efficient Video Learners -- PIP: Physical Interaction Prediction via Mental Simulation with Span Selection -- Panoramic Vision Transformer for Saliency Detection in 360° Videos -- Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration -- Motion Sensitive Contrastive Learning for Self-Supervised Video Representation -- Dynamic Temporal Filtering In Video Models -- Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification -- Temporal Lift Pooling for Continuous Sign Language Recognition -- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes -- SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding -- Cross-Modal Prototype Driven Network for Radiology Report Generation -- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts -- SeqTR: A Simple Yet Universal Network for Visual Grounding -- VTC: Improving Video-Text Retrieval with User Comments -- FashionViL: Fashion-Focused Vision-and-Language Representation Learning -- Weakly Supervised Grounding for VQA in Vision-Language Transformers -- Automatic Dense Annotation of Large-Vocabulary Sign Language Videos -- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval -- GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval -- A Simple and Robust Correlation Filtering Method for Text-Based Person Search. |
| Record no. | UNISA-996500066303316 |
| Available at: Univ. di Salerno |
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
| Edition | [1st ed. 2022.] |
| Publication/distribution/printing | Cham, Switzerland : Springer, [2022] |
| Physical description | 1 online resource (801 pages) |
| Classification | 006.37 |
| Series | Lecture Notes in Computer Science |
| Topical subject | Computer vision; Pattern recognition systems |
| Uncontrolled subject | Engineering; Technology & Engineering |
| ISBN | 3-031-19833-6 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Publication language | eng |
| Contents note | Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency -- Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation -- Spotting Temporally Precise, Fine-Grained Events in Video -- Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation -- Efficient Video Transformers with Spatial-Temporal Token Selection -- Long Movie Clip Classification with State-Space Video Models -- Prompting Visual-Language Models for Efficient Video Understanding -- Asymmetric Relation Consistency Reasoning for Video Relation Grounding -- Self-Supervised Social Relation Representation for Human Group Detection -- K-Centered Patch Sampling for Efficient Video Recognition -- A Deep Moving-Camera Background Model -- GraphVid: It Only Takes a Few Nodes to Understand a Video -- Delta Distillation for Efficient Video Processing -- MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning -- COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality -- E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context -- TDViT: Temporal Dilated Video Transformer for Dense Video Tasks -- Semi-Supervised Learning of Optical Flow by Flow Supervisor -- Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization -- Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion -- MaCLR: Motion-Aware Contrastive Learning of Representations for Videos -- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection -- Frozen CLIP Models Are Efficient Video Learners -- PIP: Physical Interaction Prediction via Mental Simulation with Span Selection -- Panoramic Vision Transformer for Saliency Detection in 360° Videos -- Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration -- Motion Sensitive Contrastive Learning for Self-Supervised Video Representation -- Dynamic Temporal Filtering In Video Models -- Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification -- Temporal Lift Pooling for Continuous Sign Language Recognition -- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes -- SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding -- Cross-Modal Prototype Driven Network for Radiology Report Generation -- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts -- SeqTR: A Simple Yet Universal Network for Visual Grounding -- VTC: Improving Video-Text Retrieval with User Comments -- FashionViL: Fashion-Focused Vision-and-Language Representation Learning -- Weakly Supervised Grounding for VQA in Vision-Language Transformers -- Automatic Dense Annotation of Large-Vocabulary Sign Language Videos -- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval -- GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval -- A Simple and Robust Correlation Filtering Method for Text-Based Person Search. |
| Record no. | UNINA-9910629292203321 |
| Cham, Switzerland : , : Springer, , [2022] | ||
| Find it here: Univ. Federico II | ||
| ||
Computer Vision, Imaging and Computer Graphics Theory and Applications : 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Lisbon, Portugal, February 19–21, 2023, Revised Selected Papers / / edited by A. Augusto de Sousa, Thomas Bashford-Rogers, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Petia Radeva, Giovanni Maria Farinella, Kadi Bouatouch
| Computer Vision, Imaging and Computer Graphics Theory and Applications : 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Lisbon, Portugal, February 19–21, 2023, Revised Selected Papers / / edited by A. Augusto de Sousa, Thomas Bashford-Rogers, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Petia Radeva, Giovanni Maria Farinella, Kadi Bouatouch |
| Author | de Sousa, A. Augusto |
| Edition | [1st ed. 2024.] |
| Publication/distribution/printing | Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2024 |
| Physical description | 1 online resource (419 pages) |
| Classification | 006 |
| Other authors (persons) | Bashford-Rogers, Thomas ; Paljic, Alexis ; Ziat, Mounia ; Hurter, Christophe ; Purchase, Helen ; Radeva, Petia ; Farinella, Giovanni Maria ; Bouatouch, Kadi |
| Series | Communications in Computer and Information Science |
| Topical subject | Image processing - Digital techniques ; Computer vision ; Computer engineering ; Computer networks ; Artificial intelligence ; Application software ; User interfaces (Computer systems) ; Human-computer interaction ; Computer Imaging, Vision, Pattern Recognition and Graphics ; Computer Engineering and Networks ; Artificial Intelligence ; Computer and Information Systems Applications ; User Interfaces and Human Computer Interaction |
| ISBN | 3-031-66743-3 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Language of publication | eng |
| Contents note | Analysis of Solar Radiation on Facades Using Mobile Augmented Reality -- Unified Shape Analysis and Synthesis via Deformable Voxel Grids -- Epipolar Equation Weighting for Accurate Camera Motion from Two Consecutive Frames -- Absolute ROMP Recovering Multi Person 3D Poses and Shapes with Absolute Scales from a Single RGB Image -- Semi Supervised Task Aware Image to Image Translation -- Deep Detection Dreams Enhancing Visualization Tools for Single Stage Object Detectors -- Approaches to Face Verification Through Attribute Based Attention -- Towards Fast Detection and Classification of Moving Objects -- ST SACLF Style Transfer Informed Self Attention Classifier for Bias Aware Painting Classification -- Attention to Emotions Body Emotion Recognition In The Wild Using Self Attention Transformer Network -- Linking Data Separation, Visual Separation, and Classifier Performance Using Multidimensional Projections -- Using Cockpit Interactions for Implicit Eye Tracking Calibration in a Flight Simulator -- Evaluation of Flexible Structured Light Calibration Using Circles -- GPS Enhanced RGB D IMU Calibration for Accurate Pose Estimation -- Application of Contrast Driven Color Class Assignment to Four Categorical Data Visualization Diagrams -- Measuring And Interpreting the Quality of 3D Projections of High Dimensional Data -- Visualizing Military Operations Extended Geospatial Temporal Survey. |
| Record no. | UNINA-9910882886503321 |
| Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2024 | ||
| Find it here: Univ. Federico II | ||
| ||
Computer Vision, Imaging and Computer Graphics Theory and Applications [electronic resource] : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch
| Computer Vision, Imaging and Computer Graphics Theory and Applications [electronic resource] : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch |
| Author | de Sousa, A. Augusto |
| Edition | [1st ed. 2023.] |
| Publication/distribution/printing | Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 |
| Physical description | 1 online resource (343 pages) |
| Classification | 006 |
| Other authors (persons) | Debattista, Kurt ; Paljic, Alexis ; Ziat, Mounia ; Hurter, Christophe ; Purchase, Helen ; Farinella, Giovanni Maria ; Radeva, Petia ; Bouatouch, Kadi |
| Series | Communications in Computer and Information Science |
| Topical subject | Image processing - Digital techniques ; Computer vision ; Computer engineering ; Computer networks ; Artificial intelligence ; Application software ; User interfaces (Computer systems) ; Human-computer interaction ; Computer Imaging, Vision, Pattern Recognition and Graphics ; Computer Engineering and Networks ; Artificial Intelligence ; Computer and Information Systems Applications ; User Interfaces and Human Computer Interaction |
| ISBN | 3-031-45725-0 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Language of publication | eng |
| Contents note |
Intro -- Preface -- Organization -- Contents -- Automatic Threshold RanSaC Algorithms for Pose Estimation Tasks -- 1 Introduction -- 2 RanSaC Methods -- 2.1 Notation -- 2.2 History of RanSaC Algorithms -- 3 Adaptative RanSaC Algorithms -- 4 Data Generation Methodology -- 4.1 Models and Estimators -- 4.2 Semi-artificial Data Generation Method -- 5 Benchmark and Results -- 5.1 Performance Measures -- 5.2 Parameters -- 5.3 Results -- 5.4 Analysis and Comparison -- 6 Conclusion -- References -- Semi-automated Generation of Accurate Ground-Truth for 3D Object Detection -- 1 Introduction -- 2 Related Work on 3D Object Detection -- 2.1 Techniques for Early Object Detection -- 2.2 CNN-Based 3D Object Detection -- 2.3 Conclusions on Related Work -- 3 Semi-automated 3D Dataset Generation -- 3.1 Orientation Estimation -- 3.2 3D Box Estimation -- 4 Experiments -- 4.1 Experimental Setup and Configuration -- 4.2 Evaluation 1: Annotation-Processing Chain -- 4.3 Evaluation 2: 3D Object Detector Trained on the Annotation-Processing Configurations -- 4.4 Cross-Validation on KITTI Dataset -- 4.5 Unsupervised Approach -- 5 Conclusion -- References -- A Quantitative and Qualitative Analysis on a GAN-Based Face Mask Removal on Masked Images and Videos -- 1 Introduction -- 2 Related Works -- 2.1 Inpainting -- 2.2 Face Completion -- 3 Method -- 3.1 Pix2pix-Based Inpainting -- 3.2 Custom Loss Function -- 3.3 System Overview -- 3.4 Predicting Feature Points on a Face -- 4 Experiment -- 4.1 Image Evaluation -- 4.2 Video Evaluation -- 5 Discussion -- 5.1 Quality of Generated Images -- 5.2 Discriminating Facial Expressions -- 5.3 Generating Smooth Videos -- 5.4 Additional Quantitative Analyses -- 6 Limitations -- 7 Conclusion -- References -- Dense Material Segmentation with Context-Aware Network -- 1 Introduction -- 2 Related Works -- 2.1 Material Segmentation Datasets.
2.2 Fully Convolutional Network -- 2.3 Material Segmentation with FCN -- 2.4 Global and Local Training -- 2.5 Boundary Refinement -- 2.6 Self-training -- 3 CAM-SegNet Architecture -- 3.1 Feature Sharing Connection -- 3.2 Context-Aware Dense Material Segmentation -- 3.3 Self-training Approach -- 4 CAM-SegNet Experiment Configurations -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 5 CAM-SegNet Performance Analysis -- 5.1 Quantitative Analysis -- 5.2 Qualitative Analysis -- 5.3 Ablation Study -- 6 Conclusions -- References -- Partial Alignment of Time Series for Action and Activity Prediction -- 1 Introduction -- 2 Related Work -- 3 Temporal Alignment of Action/Activity Sequences -- 3.1 Alignment Methods - Segmented Sequences -- 3.2 Alignment Methods - Unsegmented Sequences -- 3.3 Action and Activity Prediction -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Alignment-Based Prediction in Segmented Sequences -- 4.3 Alignment-Based Action Prediction in Unsegmented Sequences -- 4.4 Graph-Based Activity Prediction -- 4.5 Duration Prognosis -- 5 Conclusions -- References -- Automatic Bi-LSTM Architecture Search Using Bayesian Optimisation for Vehicle Activity Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Trajectory Representation and Analysis -- 2.2 Deep Neural Network Optimisation -- 3 Method -- 3.1 Qualitative Feature Representation -- 3.2 Automatic Bi-LSTM Architecture Search -- 3.3 Optimal Architecture Selection -- 3.4 VNet Modelling -- 4 Vehicle Activity Datasets -- 4.1 Highway Drone Dataset -- 4.2 Traffic Dataset -- 4.3 Vehicle Obstacle Interaction Dataset -- 4.4 Next Generation Simulation Dataset -- 4.5 Combined Dataset -- 5 Experiments and Results -- 5.1 Optimal Architecture Selection -- 5.2 Evaluation of the Optimal Architecture -- 6 Discussion -- 7 Conclusion -- References. 
ANTENNA: Visual Analytics of Mobility Derived from Cellphone Data -- 1 Introduction -- 2 Related Work -- 2.1 Reconstruction and Extraction of Trajectories -- 2.2 Visual Analytics of Movement -- 3 System Overview -- 3.1 Backend and Frontend -- 4 Data -- 4.1 Database -- 4.2 Processing Pipeline -- 5 ANTENNA's Visualization -- 5.1 Tasks and Design Requirements -- 5.2 Visual Query -- 5.3 Grid Aggregation Mode -- 5.4 Road Aggregation Mode -- 6 Usage Scenarios -- 6.1 Scenario 1: Inter-Urban Movements -- 6.2 Scenario 2: Group Movements -- 7 User Testing -- 7.1 Methodology -- 7.2 Tasks -- 7.3 Results -- 8 Discussion -- 9 Conclusion -- References -- Influence of Errors on the Evaluation of Text Classification Systems -- 1 Introduction -- 2 Setup -- 2.1 Models and Dataset -- 2.2 Explanation Methods -- 2.3 Evaluation of the Models -- 2.4 System Output and Explanation Visualization -- 3 Experiment 1: Effect on the Evaluation of One System -- 3.1 Experiment Design -- 3.2 Task and Questionnaire -- 3.3 Participant Recruitment -- 3.4 Results -- 3.5 Qualitative Results -- 4 Experiment 2: Effect on the Comparison of Two Systems -- 4.1 Experiment Design -- 4.2 Task and Questionnaire -- 4.3 Participant Recruitment -- 4.4 Results -- 5 Experiment 3: Effect of the Comparison of Two Systems (Bias Error Pattern) -- 5.1 Experiment Design -- 5.2 Results -- 6 Experiment 4: Effect of Incorrect Examples (with a Different Language) -- 6.1 Experiment Design -- 6.2 Task and Questionnaire -- 6.3 Participant Recruitment -- 6.4 Translation -- 6.5 Results -- 6.6 Qualitative Results -- 7 Discussion -- 7.1 Limitations -- 8 Conclusion -- References -- Autonomous Navigation Method Considering Passenger Comfort Recognition for Personal Mobility Vehicles in Crowded Pedestrian Spaces -- 1 Introduction -- 2 Process of Passenger Comfort Recognition. 
3 Investigation of Passenger Comfort Recognition -- 3.1 Passenger Comfort Evaluation Experiment -- 3.2 Effects of Current Situation on Comfort Recognition -- 3.3 Effects of Future Status on Comfort Recognition -- 3.4 Characteristics of Passenger Comfort Recognition -- 4 Proposal of an Autonomous Navigation Method Considering Passenger Comfort Recognition -- 4.1 Design -- 4.2 Validation -- 5 Conclusions -- References -- The Electrodermal Activity of Player Experience in Virtual Reality Games: An Extended Evaluation of the Phasic Component -- 1 Introduction -- 2 Background -- 2.1 Related Work -- 3 Methodology -- 3.1 EDA Data Capture and Phasic Component Calculation -- 3.2 Phasic Component Analysis -- 3.3 Game Experience Analysis -- 3.4 Statistical Analyses -- 3.5 Implementation Tools -- 3.6 Ethical Considerations -- 4 Results -- 4.1 Peaks per Minute -- 4.2 Average Peak Amplitude -- 4.3 Game Experience -- 4.4 Correlation Analysis -- 5 Discussion -- 6 Conclusion and Future Work -- References -- MinMax-CAM: Increasing Precision of Explaining Maps by Contrasting Gradient Signals and Regularizing Kernel Usage -- 1 Introduction -- 2 Related Work -- 3 Contrasting Class Gradient Information -- 3.1 Intuition -- 3.2 Definition -- 3.3 Reducing Noise by Removing Negative Contributions -- 4 Reducing Shared Information Between Classifiers -- 4.1 Counterbalancing Activation Vanishing -- 5 Experimental Setup -- 5.1 Evaluations over Architectures and Problem Domains -- 5.2 Training Procedure -- 5.3 Evaluation Metrics -- 6 Results -- 6.1 Comparison Between Architectures -- 6.2 Evaluation over Distinct Problem Domains -- 6.3 Kernel Usage Regularization -- 7 Conclusions -- References -- DIAR: Deep Image Alignment and Reconstruction Using Swin Transformers -- 1 Introduction -- 2 Related Work -- 3 Dataset -- 3.1 Aligned Dataset -- 3.2 Misaligned Dataset. 
4 Deep Image Alignment -- 5 Architecture -- 5.1 Deep Residual Sets -- 5.2 Video Swin Transformer -- 5.3 Image Reconstruction Using Swin Transformers -- 5.4 Training -- 6 Evaluation -- 6.1 Aggregation -- 6.2 Image Reconstruction -- 6.3 Alignment and Reconstruction: -- 7 Conclusion -- References -- Active Learning with Data Augmentation Under Small vs Large Dataset Regimes for Semantic-KITTI Dataset -- 1 Introduction -- 1.1 State of the Art -- 2 Methodology -- 3 Validation and Results -- 3.1 Class Based Learning Efficiency -- 3.2 Dataset Size Growth: 1/4 Semantic-KITTI vs Full Semantic-KITTI -- 3.3 t-SNE Problem Analysis -- 4 Conclusion -- 4.1 Challenges and Future Scope -- References -- Transformers in Unsupervised Structure-from-Motion -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Monocular Unsupervised SfM -- 3.2 Architecture -- 3.3 Intrinsics -- 3.4 Appearance-Based Losses -- 4 Experiments -- 4.1 Datasets -- 4.2 Architecture -- 4.3 Implementation Details -- 4.4 Evaluation Metrics -- 4.5 Impact of Architecture -- 4.6 Generalizability -- 4.7 Auxiliary Tasks -- 4.8 Depth Estimation with Learned Camera Intrinsics -- 4.9 Efficiency -- 4.10 Comparing Performance -- 5 Conclusion -- References -- A Study of Aerial Image-Based 3D Reconstructions in a Metropolitan Area -- 1 Introduction -- 2 Previous Work -- 3 Urban Environment -- 3.1 Ground Truth -- 3.2 Image Sets -- 3.3 Urban Categorization -- 4 Experimental Setup -- 4.1 3D Reconstruction Techniques -- 4.2 Pipelines Under Study -- 4.3 Alignment -- 5 Experimental Results -- 5.1 Scene Level Evaluation -- 5.2 Urban Category Centric Evaluation -- 5.3 General Pipeline Evaluation -- 6 Conclusion -- References -- Author Index. |
| Record no. | UNISA-996558568803316 |
| Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 | ||
| Find it here: Univ. di Salerno | ||
| ||
Computer Vision, Imaging and Computer Graphics Theory and Applications : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch
| Computer Vision, Imaging and Computer Graphics Theory and Applications : 17th International Joint Conference, VISIGRAPP 2022, Virtual Event, February 6–8, 2022, Revised Selected Papers / / edited by A. Augusto de Sousa, Kurt Debattista, Alexis Paljic, Mounia Ziat, Christophe Hurter, Helen Purchase, Giovanni Maria Farinella, Petia Radeva, Kadi Bouatouch |
| Author | de Sousa, A. Augusto |
| Edition | [1st ed. 2023.] |
| Publication/distribution/printing | Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 |
| Physical description | 1 online resource (343 pages) |
| Classification | 006 |
| Other authors (persons) | Debattista, Kurt ; Paljic, Alexis ; Ziat, Mounia ; Hurter, Christophe ; Purchase, Helen ; Farinella, Giovanni Maria ; Radeva, Petia ; Bouatouch, Kadi |
| Series | Communications in Computer and Information Science |
| Topical subject | Image processing - Digital techniques ; Computer vision ; Computer engineering ; Computer networks ; Artificial intelligence ; Application software ; User interfaces (Computer systems) ; Human-computer interaction ; Computer Imaging, Vision, Pattern Recognition and Graphics ; Computer Engineering and Networks ; Artificial Intelligence ; Computer and Information Systems Applications ; User Interfaces and Human Computer Interaction |
| ISBN | 9783031457258 ; 3031457250 |
| Format | Printed material |
| Bibliographic level | Monograph |
| Language of publication | eng |
| Contents note |
Intro -- Preface -- Organization -- Contents -- Automatic Threshold RanSaC Algorithms for Pose Estimation Tasks -- 1 Introduction -- 2 RanSaC Methods -- 2.1 Notation -- 2.2 History of RanSaC Algorithms -- 3 Adaptative RanSaC Algorithms -- 4 Data Generation Methodology -- 4.1 Models and Estimators -- 4.2 Semi-artificial Data Generation Method -- 5 Benchmark and Results -- 5.1 Performance Measures -- 5.2 Parameters -- 5.3 Results -- 5.4 Analysis and Comparison -- 6 Conclusion -- References -- Semi-automated Generation of Accurate Ground-Truth for 3D Object Detection -- 1 Introduction -- 2 Related Work on 3D Object Detection -- 2.1 Techniques for Early Object Detection -- 2.2 CNN-Based 3D Object Detection -- 2.3 Conclusions on Related Work -- 3 Semi-automated 3D Dataset Generation -- 3.1 Orientation Estimation -- 3.2 3D Box Estimation -- 4 Experiments -- 4.1 Experimental Setup and Configuration -- 4.2 Evaluation 1: Annotation-Processing Chain -- 4.3 Evaluation 2: 3D Object Detector Trained on the Annotation-Processing Configurations -- 4.4 Cross-Validation on KITTI Dataset -- 4.5 Unsupervised Approach -- 5 Conclusion -- References -- A Quantitative and Qualitative Analysis on a GAN-Based Face Mask Removal on Masked Images and Videos -- 1 Introduction -- 2 Related Works -- 2.1 Inpainting -- 2.2 Face Completion -- 3 Method -- 3.1 Pix2pix-Based Inpainting -- 3.2 Custom Loss Function -- 3.3 System Overview -- 3.4 Predicting Feature Points on a Face -- 4 Experiment -- 4.1 Image Evaluation -- 4.2 Video Evaluation -- 5 Discussion -- 5.1 Quality of Generated Images -- 5.2 Discriminating Facial Expressions -- 5.3 Generating Smooth Videos -- 5.4 Additional Quantitative Analyses -- 6 Limitations -- 7 Conclusion -- References -- Dense Material Segmentation with Context-Aware Network -- 1 Introduction -- 2 Related Works -- 2.1 Material Segmentation Datasets.
2.2 Fully Convolutional Network -- 2.3 Material Segmentation with FCN -- 2.4 Global and Local Training -- 2.5 Boundary Refinement -- 2.6 Self-training -- 3 CAM-SegNet Architecture -- 3.1 Feature Sharing Connection -- 3.2 Context-Aware Dense Material Segmentation -- 3.3 Self-training Approach -- 4 CAM-SegNet Experiment Configurations -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 5 CAM-SegNet Performance Analysis -- 5.1 Quantitative Analysis -- 5.2 Qualitative Analysis -- 5.3 Ablation Study -- 6 Conclusions -- References -- Partial Alignment of Time Series for Action and Activity Prediction -- 1 Introduction -- 2 Related Work -- 3 Temporal Alignment of Action/Activity Sequences -- 3.1 Alignment Methods - Segmented Sequences -- 3.2 Alignment Methods - Unsegmented Sequences -- 3.3 Action and Activity Prediction -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Alignment-Based Prediction in Segmented Sequences -- 4.3 Alignment-Based Action Prediction in Unsegmented Sequences -- 4.4 Graph-Based Activity Prediction -- 4.5 Duration Prognosis -- 5 Conclusions -- References -- Automatic Bi-LSTM Architecture Search Using Bayesian Optimisation for Vehicle Activity Recognition -- 1 Introduction -- 2 Related Work -- 2.1 Trajectory Representation and Analysis -- 2.2 Deep Neural Network Optimisation -- 3 Method -- 3.1 Qualitative Feature Representation -- 3.2 Automatic Bi-LSTM Architecture Search -- 3.3 Optimal Architecture Selection -- 3.4 VNet Modelling -- 4 Vehicle Activity Datasets -- 4.1 Highway Drone Dataset -- 4.2 Traffic Dataset -- 4.3 Vehicle Obstacle Interaction Dataset -- 4.4 Next Generation Simulation Dataset -- 4.5 Combined Dataset -- 5 Experiments and Results -- 5.1 Optimal Architecture Selection -- 5.2 Evaluation of the Optimal Architecture -- 6 Discussion -- 7 Conclusion -- References. 
ANTENNA: Visual Analytics of Mobility Derived from Cellphone Data -- 1 Introduction -- 2 Related Work -- 2.1 Reconstruction and Extraction of Trajectories -- 2.2 Visual Analytics of Movement -- 3 System Overview -- 3.1 Backend and Frontend -- 4 Data -- 4.1 Database -- 4.2 Processing Pipeline -- 5 ANTENNA's Visualization -- 5.1 Tasks and Design Requirements -- 5.2 Visual Query -- 5.3 Grid Aggregation Mode -- 5.4 Road Aggregation Mode -- 6 Usage Scenarios -- 6.1 Scenario 1: Inter-Urban Movements -- 6.2 Scenario 2: Group Movements -- 7 User Testing -- 7.1 Methodology -- 7.2 Tasks -- 7.3 Results -- 8 Discussion -- 9 Conclusion -- References -- Influence of Errors on the Evaluation of Text Classification Systems -- 1 Introduction -- 2 Setup -- 2.1 Models and Dataset -- 2.2 Explanation Methods -- 2.3 Evaluation of the Models -- 2.4 System Output and Explanation Visualization -- 3 Experiment 1: Effect on the Evaluation of One System -- 3.1 Experiment Design -- 3.2 Task and Questionnaire -- 3.3 Participant Recruitment -- 3.4 Results -- 3.5 Qualitative Results -- 4 Experiment 2: Effect on the Comparison of Two Systems -- 4.1 Experiment Design -- 4.2 Task and Questionnaire -- 4.3 Participant Recruitment -- 4.4 Results -- 5 Experiment 3: Effect of the Comparison of Two Systems (Bias Error Pattern) -- 5.1 Experiment Design -- 5.2 Results -- 6 Experiment 4: Effect of Incorrect Examples (with a Different Language) -- 6.1 Experiment Design -- 6.2 Task and Questionnaire -- 6.3 Participant Recruitment -- 6.4 Translation -- 6.5 Results -- 6.6 Qualitative Results -- 7 Discussion -- 7.1 Limitations -- 8 Conclusion -- References -- Autonomous Navigation Method Considering Passenger Comfort Recognition for Personal Mobility Vehicles in Crowded Pedestrian Spaces -- 1 Introduction -- 2 Process of Passenger Comfort Recognition. 
3 Investigation of Passenger Comfort Recognition -- 3.1 Passenger Comfort Evaluation Experiment -- 3.2 Effects of Current Situation on Comfort Recognition -- 3.3 Effects of Future Status on Comfort Recognition -- 3.4 Characteristics of Passenger Comfort Recognition -- 4 Proposal of an Autonomous Navigation Method Considering Passenger Comfort Recognition -- 4.1 Design -- 4.2 Validation -- 5 Conclusions -- References -- The Electrodermal Activity of Player Experience in Virtual Reality Games: An Extended Evaluation of the Phasic Component -- 1 Introduction -- 2 Background -- 2.1 Related Work -- 3 Methodology -- 3.1 EDA Data Capture and Phasic Component Calculation -- 3.2 Phasic Component Analysis -- 3.3 Game Experience Analysis -- 3.4 Statistical Analyses -- 3.5 Implementation Tools -- 3.6 Ethical Considerations -- 4 Results -- 4.1 Peaks per Minute -- 4.2 Average Peak Amplitude -- 4.3 Game Experience -- 4.4 Correlation Analysis -- 5 Discussion -- 6 Conclusion and Future Work -- References -- MinMax-CAM: Increasing Precision of Explaining Maps by Contrasting Gradient Signals and Regularizing Kernel Usage -- 1 Introduction -- 2 Related Work -- 3 Contrasting Class Gradient Information -- 3.1 Intuition -- 3.2 Definition -- 3.3 Reducing Noise by Removing Negative Contributions -- 4 Reducing Shared Information Between Classifiers -- 4.1 Counterbalancing Activation Vanishing -- 5 Experimental Setup -- 5.1 Evaluations over Architectures and Problem Domains -- 5.2 Training Procedure -- 5.3 Evaluation Metrics -- 6 Results -- 6.1 Comparison Between Architectures -- 6.2 Evaluation over Distinct Problem Domains -- 6.3 Kernel Usage Regularization -- 7 Conclusions -- References -- DIAR: Deep Image Alignment and Reconstruction Using Swin Transformers -- 1 Introduction -- 2 Related Work -- 3 Dataset -- 3.1 Aligned Dataset -- 3.2 Misaligned Dataset. 
4 Deep Image Alignment -- 5 Architecture -- 5.1 Deep Residual Sets -- 5.2 Video Swin Transformer -- 5.3 Image Reconstruction Using Swin Transformers -- 5.4 Training -- 6 Evaluation -- 6.1 Aggregation -- 6.2 Image Reconstruction -- 6.3 Alignment and Reconstruction: -- 7 Conclusion -- References -- Active Learning with Data Augmentation Under Small vs Large Dataset Regimes for Semantic-KITTI Dataset -- 1 Introduction -- 1.1 State of the Art -- 2 Methodology -- 3 Validation and Results -- 3.1 Class Based Learning Efficiency -- 3.2 Dataset Size Growth: 1/4 Semantic-KITTI vs Full Semantic-KITTI -- 3.3 t-SNE Problem Analysis -- 4 Conclusion -- 4.1 Challenges and Future Scope -- References -- Transformers in Unsupervised Structure-from-Motion -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Monocular Unsupervised SfM -- 3.2 Architecture -- 3.3 Intrinsics -- 3.4 Appearance-Based Losses -- 4 Experiments -- 4.1 Datasets -- 4.2 Architecture -- 4.3 Implementation Details -- 4.4 Evaluation Metrics -- 4.5 Impact of Architecture -- 4.6 Generalizability -- 4.7 Auxiliary Tasks -- 4.8 Depth Estimation with Learned Camera Intrinsics -- 4.9 Efficiency -- 4.10 Comparing Performance -- 5 Conclusion -- References -- A Study of Aerial Image-Based 3D Reconstructions in a Metropolitan Area -- 1 Introduction -- 2 Previous Work -- 3 Urban Environment -- 3.1 Ground Truth -- 3.2 Image Sets -- 3.3 Urban Categorization -- 4 Experimental Setup -- 4.1 3D Reconstruction Techniques -- 4.2 Pipelines Under Study -- 4.3 Alignment -- 5 Experimental Results -- 5.1 Scene Level Evaluation -- 5.2 Urban Category Centric Evaluation -- 5.3 General Pipeline Evaluation -- 6 Conclusion -- References -- Author Index. |
| Record no. | UNINA-9910754096903321 |
| Cham : , : Springer Nature Switzerland : , : Imprint : Springer, , 2023 | ||
| Find it here: Univ. Federico II | ||
| ||