Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
Author Avidan, Shai
Publication/distribution/printing Cham : Springer, 2022
Physical description 1 online resource (803 pages)
Discipline 006.37
Other authors (Persons) Brostow, Gabriel
Cissé, Moustapha
Farinella, Giovanni Maria
Hassner, Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-20053-5
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Foreword -- Preface -- Organization -- Contents - Part XXIV -- Improving Vision Transformers by Revisiting High-Frequency Components -- 1 Introduction -- 2 Related Work -- 3 Revisiting ViT Models from a Frequency Perspective -- 4 The Proposed Method -- 4.1 Adversarial Training with High-Frequency Perturbations -- 4.2 A Case Study Using ViT-B -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Results on ImageNet Classification -- 5.3 Results on Out-of-distribution Data -- 5.4 Transfer Learning to Downstream Tasks -- 5.5 Ablation Studies -- 5.6 Discussions -- 6 Conclusions and Future Work -- References -- Recurrent Bilinear Optimization for Binary Neural Networks -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Bilinear Model of BNNs -- 3.3 Recurrent Bilinear Optimization -- 3.4 Discussion -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Ablation Study -- 4.3 Image Classification -- 4.4 Object Detection -- 4.5 Deployment Efficiency -- 5 Conclusion -- References -- Neural Architecture Search for Spiking Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Spiking Neural Networks -- 2.2 Neural Architecture Search -- 3 Preliminaries -- 3.1 Leaky Integrate-and-Fire Neuron -- 3.2 NAS Without Training -- 4 Methodology -- 4.1 Linear Regions from LIF Neurons -- 4.2 Sparsity-Aware Hamming Distance -- 4.3 Searching Forward and Backward Connections -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Performance Comparison -- 5.3 Experimental Analysis -- 6 Conclusion -- References -- Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification -- 1 Introduction -- 2 Related Work -- 2.1 Fine-Grained Visual Classification -- 2.2 Human Attention in Vision -- 3 Approach -- 3.1 Overview -- 3.2 Region Feature Mining Module.
3.3 Cross-Hierarchical Orthogonal Fusion Module -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Hierarchy Interaction Analysis -- 4.3 Evaluation on Traditional FGVC Setting -- 4.4 Further Analysis -- 5 Conclusions -- References -- DaViT: Dual Attention Vision Transformers -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Spatial Window Attention -- 3.3 Channel Group Attention -- 3.4 Model Instantiation -- 4 Analysis -- 5 Experiments -- 5.1 Image Classification -- 5.2 Object Detection and Instance Segmentation -- 5.3 Semantic Segmentation on ADE20k -- 5.4 Ablation Study -- 6 Conclusion -- References -- Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation and Overview -- 3.2 Discrepancy Elimination Network (DEN) -- 3.3 Optimal-Transport Label Assignment (OTLA) -- 3.4 Prediction Alignment Learning (PAL) -- 3.5 Optimization -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Ablation Study -- 4.5 Discussion -- 5 Conclusion -- References -- Locality Guidance for Improving Vision Transformers on Tiny Datasets -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 The Overall Approach -- 3.2 Guidance Positions -- 3.3 Architecture of the CNN -- 4 Experiments -- 4.1 Main Results -- 4.2 Discussion -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neighborhood Collective Estimation for Noisy Label Identification and Correction -- 1 Introduction -- 2 Related Work -- 2.1 Noise Verification -- 2.2 Label Correction -- 3 The Proposed Method -- 3.1 Neighborhood Collective Noise Verification -- 3.2 Neighborhood Collective Label Correction -- 3.3 Training Objectives -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparisons with the State of the Art -- 4.3 Analysis.
5 Conclusions -- References -- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay -- 1 Introduction -- 2 Related Works -- 2.1 Class-Incremental Learning -- 2.2 Few-Shot Class-Incremental Learning -- 2.3 Data-Free Knowledge Distillation -- 3 Preliminaries -- 3.1 Problem Setting -- 3.2 Data-Free Replay -- 4 Methodology -- 4.1 Entropy-Regularized Data-Free Replay -- 4.2 Learning Incrementally with Uncertain Data -- 5 Experiments -- 5.1 Datasets -- 5.2 Implementation Details -- 5.3 Re-implementation of Replay-based Methods -- 5.4 Main Results and Comparison -- 5.5 Analysis -- 6 Conclusion -- References -- Anti-retroactive Interference for Lifelong Learning -- 1 Introduction -- 2 Related Work -- 2.1 Lifelong Learning -- 2.2 Adversarial Training -- 3 Proposed Method -- 3.1 Extracting Intra-Class Features -- 3.2 Generating and Fusing Task-Specific Models -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results and Comparison -- 4.4 Ablation Study -- 5 Conclusion -- References -- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Build vMF Classifier on Hyper-Sphere -- 3.2 Quantify Distribution Overlap Coefficient on Hyper-Sphere -- 3.3 Improve Representation of Feature and Classifier via o -- 3.4 Calibrate Classifier Weight Beyond Training via o -- 4 Experiments -- 4.1 Long-Tailed Image Classification Task -- 4.2 Long-Tailed Semantic and Instance Segmentation Task -- 4.3 Ablation Study -- 5 Conclusions -- References -- Dynamic Metric Learning with Cross-Level Concept Distillation -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Dynamic Metric Learning -- 3.2 Hierarchical Concept Refiner -- 3.3 Cross-Level Concept Distillation -- 3.4 Discussions -- 4 Experiments.
4.1 Datasets -- 4.2 Evaluation Protocol -- 4.3 Implementation Details -- 4.4 Main Results -- 4.5 Experimental Analysis -- 5 Conclusion -- References -- MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing -- 1 Introduction -- 2 Related Work -- 2.1 Event-Based Representations -- 2.2 Memory-Based Networks -- 3 Event Camera Model -- 4 Method -- 4.1 Dual-Branch Structure -- 4.2 Double Polarities Calculation Method -- 4.3 Point-Wise Memory Bank -- 4.4 Training and Testing Strategies -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Ablation Study -- 5.3 Object Recognition -- 5.4 Gesture Recognition -- 6 Conclusion -- References -- Out-of-distribution Detection with Boundary Aware Learning -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Boundary Aware Learning -- 4.1 Representation Extraction Module (REM) -- 4.2 Representation Sampling Module (RSM) -- 4.3 Representation Discrimination Module (RDM) -- 5 Experiments -- 5.1 Dataset -- 5.2 Experimental Setup -- 5.3 Ablation Study -- 5.4 Detection Results -- 5.5 Visualization of trivial and hard OOD features -- 6 Conclusion -- References -- Learning Hierarchy Aware Features for Reducing Mistake Severity -- 1 Introduction -- 2 Related Work -- 3 HAF: Proposed Approach -- 3.1 Fine Grained Cross-Entropy (LCEfine) -- 3.2 Soft Hierarchical Consistency (Lshc) -- 3.3 Margin Loss (Lm) -- 3.4 Geometric Consistency (Lgc) -- 4 Experiments and Results -- 4.1 Experimental Setup -- 4.2 Training Configurations -- 4.3 Results -- 4.4 Coarse Classification Accuracy -- 5 Analysis -- 5.1 Ablation Study -- 5.2 Mistakes Severity Plots -- 5.3 Discussion: Hierarchical Metrics -- 6 Conclusion -- References -- Learning to Detect Every Thing in an Open World -- 1 Introduction -- 2 Related Work -- 3 Learning to Detect Every Thing -- 3.1 Data Augmentation: Background Erasing (BackErase).
3.2 Decoupled Multi-domain Training -- 4 Experiments -- 4.1 Cross-category Generalization -- 4.2 Cross-Dataset Generalization -- 5 Conclusion -- References -- KVT: k-NN Attention for Boosting Vision Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Self-attention -- 2.2 Transformer for Vision -- 3 k-NN Attention -- 3.1 Vanilla Attention -- 3.2 k-NN Attention -- 3.3 Theoretical Analysis on k-NN Attention -- 4 Experiments for Vision Transformers -- 4.1 Experimental Settings -- 4.2 Results on ImageNet -- 4.3 The Impact of Number k -- 4.4 Convergence Speed of k-NN Attention -- 4.5 Other Properties of k-NN Attention -- 4.6 Comparisons with Temperature in Softmax -- 4.7 Visualization -- 4.8 Object Detection and Semantic Segmentation -- 5 Conclusion -- References -- Registration Based Few-Shot Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Anomaly Detection -- 3 Problem Setting -- 4 Method -- 4.1 Feature Registration Network -- 4.2 Normal Distribution Estimation -- 4.3 Inference -- 5 Experiments -- 5.1 Experimental Setups -- 5.2 Comparison with State-of-the-Art Methods -- 5.3 Ablation Studies -- 5.4 Visualization Analysis -- 6 Conclusion -- References -- Improving Robustness by Enhancing Weak Subnets -- 1 Introduction -- 2 Related Work -- 3 EWS: Training by Enhancing Weak Subnets -- 3.1 Subnet Construction and Impact on Overall Performance -- 3.2 Finding Particularly Weak Subnets -- 3.3 EWS: Enhancing Weak Subnets with Knowledge Distillation -- 3.4 Combining EWS with Adversarial Training -- 4 Experiments -- 4.1 Improving Corruption Robustness -- 4.2 Improving Adversarial Robustness -- 5 Ablation and Discussions -- 5.1 Search Strategies and Hyper-Parameters -- 5.2 Vulnerability of Blocks and Layers -- 6 Conclusion -- References.
Learning Invariant Visual Representations for Compositional Zero-Shot Learning.
Record no. UNISA-996500065903316
Avidan, Shai
Cham : Springer, 2022
Printed material
Available at: Univ. di Salerno
OPAC: Check availability here
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
Author Avidan, Shai
Publication/distribution/printing Cham : Springer, 2022
Physical description 1 online resource (800 pages)
Discipline 006.37
Other authors (Persons) Brostow, Gabriel
Cissé, Moustapha
Farinella, Giovanni Maria
Hassner, Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19790-9
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNISA-996495565403316
Avidan, Shai
Cham : Springer, 2022
Printed material
Available at: Univ. di Salerno
OPAC: Check availability here
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXIV
Author Avidan, Shai
Publication/distribution/printing Cham : Springer, 2022
Physical description 1 online resource (803 pages)
Discipline 006.37
Other authors (Persons) Brostow, Gabriel
Cissé, Moustapha
Farinella, Giovanni Maria
Hassner, Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-20053-5
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Foreword -- Preface -- Organization -- Contents - Part XXIV -- Improving Vision Transformers by Revisiting High-Frequency Components -- 1 Introduction -- 2 Related Work -- 3 Revisiting ViT Models from a Frequency Perspective -- 4 The Proposed Method -- 4.1 Adversarial Training with High-Frequency Perturbations -- 4.2 A Case Study Using ViT-B -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Results on ImageNet Classification -- 5.3 Results on Out-of-distribution Data -- 5.4 Transfer Learning to Downstream Tasks -- 5.5 Ablation Studies -- 5.6 Discussions -- 6 Conclusions and Future Work -- References -- Recurrent Bilinear Optimization for Binary Neural Networks -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Bilinear Model of BNNs -- 3.3 Recurrent Bilinear Optimization -- 3.4 Discussion -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Ablation Study -- 4.3 Image Classification -- 4.4 Object Detection -- 4.5 Deployment Efficiency -- 5 Conclusion -- References -- Neural Architecture Search for Spiking Neural Networks -- 1 Introduction -- 2 Related Work -- 2.1 Spiking Neural Networks -- 2.2 Neural Architecture Search -- 3 Preliminaries -- 3.1 Leaky Integrate-and-Fire Neuron -- 3.2 NAS Without Training -- 4 Methodology -- 4.1 Linear Regions from LIF Neurons -- 4.2 Sparsity-Aware Hamming Distance -- 4.3 Searching Forward and Backward Connections -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Performance Comparison -- 5.3 Experimental Analysis -- 6 Conclusion -- References -- Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification -- 1 Introduction -- 2 Related Work -- 2.1 Fine-Grained Visual Classification -- 2.2 Human Attention in Vision -- 3 Approach -- 3.1 Overview -- 3.2 Region Feature Mining Module.
3.3 Cross-Hierarchical Orthogonal Fusion Module -- 4 Experiments and Analysis -- 4.1 Datasets -- 4.2 Hierarchy Interaction Analysis -- 4.3 Evaluation on Traditional FGVC Setting -- 4.4 Further Analysis -- 5 Conclusions -- References -- DaViT: Dual Attention Vision Transformers -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Spatial Window Attention -- 3.3 Channel Group Attention -- 3.4 Model Instantiation -- 4 Analysis -- 5 Experiments -- 5.1 Image Classification -- 5.2 Object Detection and Instance Segmentation -- 5.3 Semantic Segmentation on ADE20k -- 5.4 Ablation Study -- 6 Conclusion -- References -- Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation and Overview -- 3.2 Discrepancy Elimination Network (DEN) -- 3.3 Optimal-Transport Label Assignment (OTLA) -- 3.4 Prediction Alignment Learning (PAL) -- 3.5 Optimization -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Implementation Details -- 4.3 Main Results -- 4.4 Ablation Study -- 4.5 Discussion -- 5 Conclusion -- References -- Locality Guidance for Improving Vision Transformers on Tiny Datasets -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 The Overall Approach -- 3.2 Guidance Positions -- 3.3 Architecture of the CNN -- 4 Experiments -- 4.1 Main Results -- 4.2 Discussion -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neighborhood Collective Estimation for Noisy Label Identification and Correction -- 1 Introduction -- 2 Related Work -- 2.1 Noise Verification -- 2.2 Label Correction -- 3 The Proposed Method -- 3.1 Neighborhood Collective Noise Verification -- 3.2 Neighborhood Collective Label Correction -- 3.3 Training Objectives -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Comparisons with the State of the Art -- 4.3 Analysis.
5 Conclusions -- References -- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay -- 1 Introduction -- 2 Related Works -- 2.1 Class-Incremental Learning -- 2.2 Few-Shot Class-Incremental Learning -- 2.3 Data-Free Knowledge Distillation -- 3 Preliminaries -- 3.1 Problem Setting -- 3.2 Data-Free Replay -- 4 Methodology -- 4.1 Entropy-Regularized Data-Free Replay -- 4.2 Learning Incrementally with Uncertain Data -- 5 Experiments -- 5.1 Datasets -- 5.2 Implementation Details -- 5.3 Re-implementation of Replay-based Methods -- 5.4 Main Results and Comparison -- 5.5 Analysis -- 6 Conclusion -- References -- Anti-retroactive Interference for Lifelong Learning -- 1 Introduction -- 2 Related Work -- 2.1 Lifelong Learning -- 2.2 Adversarial Training -- 3 Proposed Method -- 3.1 Extracting Intra-Class Features -- 3.2 Generating and Fusing Task-Specific Models -- 4 Experiments and Results -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Results and Comparison -- 4.4 Ablation Study -- 5 Conclusion -- References -- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-Tailed Learning -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Build vMF Classifier on Hyper-Sphere -- 3.2 Quantify Distribution Overlap Coefficient on Hyper-Sphere -- 3.3 Improve Representation of Feature and Classifier via o -- 3.4 Calibrate Classifier Weight Beyond Training via o -- 4 Experiments -- 4.1 Long-Tailed Image Classification Task -- 4.2 Long-Tailed Semantic and Instance Segmentation Task -- 4.3 Ablation Study -- 5 Conclusions -- References -- Dynamic Metric Learning with Cross-Level Concept Distillation -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Dynamic Metric Learning -- 3.2 Hierarchical Concept Refiner -- 3.3 Cross-Level Concept Distillation -- 3.4 Discussions -- 4 Experiments.
4.1 Datasets -- 4.2 Evaluation Protocol -- 4.3 Implementation Details -- 4.4 Main Results -- 4.5 Experimental Analysis -- 5 Conclusion -- References -- MENet: A Memory-Based Network with Dual-Branch for Efficient Event Stream Processing -- 1 Introduction -- 2 Related Work -- 2.1 Event-Based Representations -- 2.2 Memory-Based Networks -- 3 Event Camera Model -- 4 Method -- 4.1 Dual-Branch Structure -- 4.2 Double Polarities Calculation Method -- 4.3 Point-Wise Memory Bank -- 4.4 Training and Testing Strategies -- 5 Experiments -- 5.1 Experimental Setup -- 5.2 Ablation Study -- 5.3 Object Recognition -- 5.4 Gesture Recognition -- 6 Conclusion -- References -- Out-of-distribution Detection with Boundary Aware Learning -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Boundary Aware Learning -- 4.1 Representation Extraction Module (REM) -- 4.2 Representation Sampling Module (RSM) -- 4.3 Representation Discrimination Module (RDM) -- 5 Experiments -- 5.1 Dataset -- 5.2 Experimental Setup -- 5.3 Ablation Study -- 5.4 Detection Results -- 5.5 Visualization of trivial and hard OOD features -- 6 Conclusion -- References -- Learning Hierarchy Aware Features for Reducing Mistake Severity -- 1 Introduction -- 2 Related Work -- 3 HAF: Proposed Approach -- 3.1 Fine Grained Cross-Entropy (LCEfine) -- 3.2 Soft Hierarchical Consistency (Lshc) -- 3.3 Margin Loss (Lm) -- 3.4 Geometric Consistency (Lgc) -- 4 Experiments and Results -- 4.1 Experimental Setup -- 4.2 Training Configurations -- 4.3 Results -- 4.4 Coarse Classification Accuracy -- 5 Analysis -- 5.1 Ablation Study -- 5.2 Mistakes Severity Plots -- 5.3 Discussion: Hierarchical Metrics -- 6 Conclusion -- References -- Learning to Detect Every Thing in an Open World -- 1 Introduction -- 2 Related Work -- 3 Learning to Detect Every Thing -- 3.1 Data Augmentation: Background Erasing (BackErase).
3.2 Decoupled Multi-domain Training -- 4 Experiments -- 4.1 Cross-category Generalization -- 4.2 Cross-Dataset Generalization -- 5 Conclusion -- References -- KVT: k-NN Attention for Boosting Vision Transformers -- 1 Introduction -- 2 Related Work -- 2.1 Self-attention -- 2.2 Transformer for Vision -- 3 k-NN Attention -- 3.1 Vanilla Attention -- 3.2 k-NN Attention -- 3.3 Theoretical Analysis on k-NN Attention -- 4 Experiments for Vision Transformers -- 4.1 Experimental Settings -- 4.2 Results on ImageNet -- 4.3 The Impact of Number k -- 4.4 Convergence Speed of k-NN Attention -- 4.5 Other Properties of k-NN Attention -- 4.6 Comparisons with Temperature in Softmax -- 4.7 Visualization -- 4.8 Object Detection and Semantic Segmentation -- 5 Conclusion -- References -- Registration Based Few-Shot Anomaly Detection -- 1 Introduction -- 2 Related Work -- 2.1 Anomaly Detection -- 2.2 Few-Shot Learning -- 2.3 Few-Shot Anomaly Detection -- 3 Problem Setting -- 4 Method -- 4.1 Feature Registration Network -- 4.2 Normal Distribution Estimation -- 4.3 Inference -- 5 Experiments -- 5.1 Experimental Setups -- 5.2 Comparison with State-of-the-Art Methods -- 5.3 Ablation Studies -- 5.4 Visualization Analysis -- 6 Conclusion -- References -- Improving Robustness by Enhancing Weak Subnets -- 1 Introduction -- 2 Related Work -- 3 EWS: Training by Enhancing Weak Subnets -- 3.1 Subnet Construction and Impact on Overall Performance -- 3.2 Finding Particularly Weak Subnets -- 3.3 EWS: Enhancing Weak Subnets with Knowledge Distillation -- 3.4 Combining EWS with Adversarial Training -- 4 Experiments -- 4.1 Improving Corruption Robustness -- 4.2 Improving Adversarial Robustness -- 5 Ablation and Discussions -- 5.1 Search Strategies and Hyper-Parameters -- 5.2 Vulnerability of Blocks and Layers -- 6 Conclusion -- References.
Learning Invariant Visual Representations for Compositional Zero-Shot Learning.
Record no. UNINA-9910629291203321
Avidan, Shai
Cham : Springer, 2022
Printed material
Available at: Univ. Federico II
OPAC: Check availability here
Computer Vision - ECCV 2022 : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XVII
Author Avidan, Shai
Publication/distribution/printing Cham : Springer, 2022
Physical description 1 online resource (800 pages)
Discipline 006.37
Other authors (Persons) Brostow, Gabriel
Cissé, Moustapha
Farinella, Giovanni Maria
Hassner, Tal
Series Lecture Notes in Computer Science
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19790-9
Format Printed material
Bibliographic level Monograph
Language of publication eng
Record no. UNINA-9910619273903321
Avidan, Shai
Cham : Springer, 2022
Printed material
Available at: Univ. Federico II
OPAC: Check availability here
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
Edition [1st ed. 2022.]
Publication/distribution/printing Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (801 pages)
Discipline 006.37
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19833-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency -- Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation -- Spotting Temporally Precise, Fine-Grained Events in Video -- Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation -- Efficient Video Transformers with Spatial-Temporal Token Selection -- Long Movie Clip Classification with State-Space Video Models -- Prompting Visual-Language Models for Efficient Video Understanding -- Asymmetric Relation Consistency Reasoning for Video Relation Grounding -- Self-Supervised Social Relation Representation for Human Group Detection -- K-Centered Patch Sampling for Efficient Video Recognition -- A Deep Moving-Camera Background Model -- GraphVid: It Only Takes a Few Nodes to Understand a Video -- Delta Distillation for Efficient Video Processing -- MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning -- COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality -- E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context -- TDViT: Temporal Dilated Video Transformer for Dense Video Tasks -- Semi-Supervised Learning of Optical Flow by Flow Supervisor -- Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization -- Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion -- MaCLR: Motion-Aware Contrastive Learning of Representations for Videos -- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection -- Frozen CLIP Models Are Efficient Video Learners -- PIP: Physical Interaction Prediction via Mental Simulation with Span Selection -- Panoramic Vision Transformer for Saliency Detection in 360° Videos -- Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration -- Motion Sensitive Contrastive Learning for Self-Supervised Video Representation -- Dynamic Temporal Filtering In Video Models -- Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification -- Temporal Lift Pooling for Continuous Sign Language Recognition -- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes -- SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding -- Cross-Modal Prototype Driven Network for Radiology Report Generation -- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts -- SeqTR: A Simple Yet Universal Network for Visual Grounding -- VTC: Improving Video-Text Retrieval with User Comments -- FashionViL: Fashion-Focused Vision-and-Language Representation Learning -- Weakly Supervised Grounding for VQA in Vision-Language Transformers -- Automatic Dense Annotation of Large-Vocabulary Sign Language Videos -- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval -- GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval -- A Simple and Robust Correlation Filtering Method for Text-Based Person Search.
Record no. UNISA-996500066303316
Cham, Switzerland : Springer, [2022]
Printed material
Available at: Univ. di Salerno
OPAC: Check availability here
Computer vision - ECCV 2022. Part XXXV : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
Edition [1st ed. 2022.]
Publication/distribution/printing Cham, Switzerland : Springer, [2022]
Physical description 1 online resource (801 pages)
Discipline 006.37
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Uncontrolled subject Engineering
Technology & Engineering
ISBN 3-031-19833-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency -- Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation -- Spotting Temporally Precise, Fine-Grained Events in Video -- Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation -- Efficient Video Transformers with Spatial-Temporal Token Selection -- Long Movie Clip Classification with State-Space Video Models -- Prompting Visual-Language Models for Efficient Video Understanding -- Asymmetric Relation Consistency Reasoning for Video Relation Grounding -- Self-Supervised Social Relation Representation for Human Group Detection -- K-Centered Patch Sampling for Efficient Video Recognition -- A Deep Moving-Camera Background Model -- GraphVid: It Only Takes a Few Nodes to Understand a Video -- Delta Distillation for Efficient Video Processing -- MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning -- COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality -- E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context -- TDViT: Temporal Dilated Video Transformer for Dense Video Tasks -- Semi-Supervised Learning of Optical Flow by Flow Supervisor -- Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization -- Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion -- MaCLR: Motion-Aware Contrastive Learning of Representations for Videos -- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection -- Frozen CLIP Models Are Efficient Video Learners -- PIP: Physical Interaction Prediction via Mental Simulation with Span Selection -- Panoramic Vision Transformer for Saliency Detection in 360° Videos -- Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration -- Motion Sensitive Contrastive Learning for Self-Supervised Video Representation -- Dynamic Temporal Filtering In Video Models -- Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification -- Temporal Lift Pooling for Continuous Sign Language Recognition -- MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes -- SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding -- Cross-Modal Prototype Driven Network for Radiology Report Generation -- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts -- SeqTR: A Simple Yet Universal Network for Visual Grounding -- VTC: Improving Video-Text Retrieval with User Comments -- FashionViL: Fashion-Focused Vision-and-Language Representation Learning -- Weakly Supervised Grounding for VQA in Vision-Language Transformers -- Automatic Dense Annotation of Large-Vocabulary Sign Language Videos -- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval -- GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval -- A Simple and Robust Correlation Filtering Method for Text-Based Person Search.
Record no. UNINA-9910629292203321
Cham, Switzerland : Springer, [2022]
Printed material
Available at: Univ. Federico II
OPAC: Check availability here
Dense Image Correspondences for Computer Vision / edited by Tal Hassner, Ce Liu
Edition [1st ed. 2016.]
Publication/distribution/printing Cham : Springer International Publishing : Imprint: Springer, 2016
Physical description 1 online resource (302 p.)
Discipline 620
Topical subject Signal processing
Image processing
Speech processing systems
Optical data processing
Artificial intelligence
Electrical engineering
Signal, Image and Speech Processing
Image Processing and Computer Vision
Artificial Intelligence
Communications Engineering, Networks
ISBN 3-319-23048-4
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Introduction to Dense Optical Flow -- SIFT Flow: Dense Correspondence across Scenes and its Applications -- Dense, Scale-Less Descriptors -- Scale-Space SIFT Flow -- Dense Segmentation-aware Descriptors -- SIFTpack: A Compact Representation for Efficient SIFT Matching -- In Defense of Gradient-Based Alignment on Densely Sampled Sparse Features -- From Images to Depths and Back -- DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling -- Joint Inference in Image Datasets via Dense Correspondence -- Dense Correspondences and Ancient Texts.
Record no. UNINA-9910254206203321
Cham : Springer International Publishing : Imprint: Springer, 2016
Printed material
Available at: Univ. Federico II
OPAC: Check availability here