Title: | Computer vision - ECCV 2022. Part I : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022 : proceedings / Shai Avidan [and four others]
Publication: | Cham, Switzerland : Springer, [2022]
©2022
Physical description: | 1 online resource (803 pages)
Dewey classification: | 006.37
Topical subject: | Computer vision
Pattern recognition systems
Secondary responsibility (person): | Shai Avidan
Contents note: | Intro -- Foreword -- Preface -- Organization -- Contents - Part I -- Learning Depth from Focus in the Wild -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 A Network for Defocus Image Alignment -- 3.2 Focal Stack-oriented Feature Extraction -- 3.3 Aggregation and Refinement -- 4 Evaluation -- 4.1 Comparisons to State-of-the-art Methods -- 4.2 Ablation Studies -- 5 Conclusion -- References -- Learning-Based Point Cloud Registration for 6D Object Pose Estimation in the Real World -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Formulation -- 3.2 Method Overview -- 3.3 Match Normalization for Robust Feature Extraction -- 3.4 NLL Loss Function for Stable Training -- 3.5 Network Architectures -- 4 Experiments -- 4.1 Datasets and Training Parameters -- 4.2 Evaluation Metrics -- 4.3 Comparison with Existing Methods -- 4.4 Ablation Study -- 5 Conclusion -- References -- An End-to-End Transformer Model for Crowd Localization -- 1 Introduction -- 2 Related Works -- 2.1 Detection-Based Methods -- 2.2 Map-Based Methods -- 2.3 Regression-Based Methods -- 2.4 Visual Transformer -- 3 Our Method -- 3.1 Transformer Encoder -- 3.2 Transformer Decoder -- 3.3 KMO-Based Matcher -- 3.4 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Dataset -- 4.3 Evaluation Metrics -- 5 Results and Analysis -- 5.1 Crowd Localization -- 5.2 Crowd Counting -- 5.3 Visualizations -- 5.4 Ablation Studies -- 5.5 Limitations -- 6 Conclusion -- References -- Few-Shot Single-View 3D Reconstruction with Memory Prior Contrastive Network -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Memory Network -- 3.2 Prior Module -- 3.3 3D-Aware Contrastive Learning Method -- 3.4 Training Procedure in Few-Shot Settings -- 3.5 Architecture -- 3.6 Loss Function -- 4 Experiment -- 4.1 Experimental Setup -- 4.2 Results on ShapeNet Dataset.
4.3 Results on Real-world Dataset -- 5 Ablation Study -- 6 Conclusion -- References -- DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 LiDAR-Based 3D Object Detection -- 2.2 Monocular 3D Object Detection -- 2.3 Estimation of Instance Depth -- 3 Overview and Framework -- 4 Decoupled Instance Depth -- 4.1 Visual Depth -- 4.2 Attribute Depth -- 4.3 Data Augmentation -- 4.4 Depth Uncertainty and Aggregation -- 4.5 Loss Functions -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Dataset and Metrics -- 5.3 Performance on KITTI Benchmark -- 5.4 Ablation Study -- 6 Conclusion -- References -- Adaptive Co-teaching for Unsupervised Monocular Depth Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Unsupervised Monocular Depth Estimation -- 2.2 Knowledge Distillation -- 3 Problem Formulation -- 4 Methodology -- 4.1 MUSTNet: MUlti-STream Ensemble Network -- 4.2 Adaptive Co-teaching Framework -- 4.3 Implementation -- 5 Experiments -- 5.1 Datasets -- 5.2 Quantitative Evaluation -- 5.3 Ablation Studies -- 5.4 Extension of Our Work -- 5.5 Discussion -- 6 Conclusion -- References -- Fusing Local Similarities for Retrieval-Based 3D Orientation Estimation of Unseen Objects -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Formulation -- 3.2 Motivation -- 3.3 Multi-scale Patch-Level Image Comparison -- 3.4 Fast Retrieval -- 3.5 Training and Testing -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Experimental Setup -- 4.3 Experiments on LineMOD and LineMOD-O -- 4.4 Experiments on T-LESS -- 4.5 Ablation Studies -- 5 Conclusion -- References -- Lidar Point Cloud Guided Monocular 3D Object Detection -- 1 Introduction -- 2 Related Work -- 2.1 LiDAR-based 3D Object Detection -- 2.2 Image-Only-Based Monocular 3D Object Detection -- 2.3 Depth-Map-Based Monocular 3D Object Detection.
3 LiDAR Guided Monocular 3D Detection -- 3.1 High Accuracy Mode -- 3.2 Low Cost Mode -- 4 Applications in Real-World Self-driving System -- 5 Experiments -- 5.1 Implementation Details -- 5.2 Dataset and Metrics -- 5.3 Results on KITTI -- 5.4 Results on Waymo -- 5.5 Comparisons on Pseudo Labels and Manually Annotated Labels -- 5.6 Ablation Studies -- 6 Conclusion -- References -- Structural Causal 3D Reconstruction -- 1 Introduction -- 2 Related Work -- 3 Causal Ordering of Latent Factors Matters -- 3.1 A Motivating Example from Function Approximation -- 3.2 Expressiveness of Representing Conditional Distributions -- 3.3 Modeling Causality in Rendering-Based Decoding -- 3.4 Empirical Evidence on 3D Reconstruction -- 4 Learning Causal Ordering for 3D Reconstruction -- 4.1 General SCR Framework -- 4.2 Learning Dense SCR via Bayesian Optimization -- 4.3 Learning Generic SCR via Optimization Unrolling -- 4.4 Learning Dynamic SCR via Masked Self-attention -- 4.5 Insights and Discussion -- 5 Experiments and Results -- 5.1 Quantitative Results -- 5.2 Qualitative Results -- References -- 3D Human Pose Estimation Using Möbius Graph Convolutional Networks -- 1 Introduction -- 2 Related Work -- 3 Spectral Graph Convolutional Network -- 3.1 Graph Definitions -- 3.2 Graph Fourier Transform -- 3.3 Spectral Graph Convolutional Network -- 3.4 Spectral Graph Filter -- 4 MöbiusGCN -- 4.1 Möbius Transformation -- 4.2 MöbiusGCN -- 4.3 Why MöbiusGCN is a Light Architecture -- 4.4 Discontinuity -- 5 Experimental Results -- 5.1 Datasets and Evaluation Protocols -- 5.2 Implementation Details -- 5.3 Fully-Supervised MöbiusGCN -- 5.4 MöbiusGCN with Reduced Dataset -- 6 Conclusion and Discussion -- References -- Learning to Train a Point Cloud Reconstruction Network Without Matching -- 1 Introduction -- 2 Related Works -- 2.1 Optimization-Based Matching Losses.
2.2 Generative Adversarial Network -- 3 Methodology -- 3.1 The Architecture of PCLossNet -- 3.2 Training of the Reconstruction Network -- 3.3 Algorithm Analysis -- 4 Experiments -- 4.1 Datasets and Implementation Details -- 4.2 Comparisons with Basic Matching-Based Losses -- 4.3 Comparisons with Discriminators-Based Losses -- 4.4 Comparisons on Training Efficiency -- 4.5 How Is the Training Process Going? -- 4.6 Ablation Study -- 5 Conclusion -- References -- PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Panoramic Depth Estimation -- 2.2 Vision Transformer -- 3 PanoFormer -- 3.1 Architecture Overview -- 3.2 Transformer-Customized Spherical Token -- 3.3 Relative Position Embedding -- 3.4 Panorama Self-attention with Token Flow -- 3.5 Objective Function -- 4 Panorama-Specific Metrics -- 5 Experiments -- 5.1 Datasets and Implementations -- 5.2 Comparison Results -- 5.3 Ablation Study -- 5.4 Extensibility -- 6 Conclusion -- References -- Self-supervised Human Mesh Recovery with Cross-Representation Alignment -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Prerequisites -- 3.2 Training Data Synthesis -- 3.3 Individual Coarse-to-Fine Regression -- 3.4 Evidential Cross-Representation Alignment -- 3.5 Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Quantitative Results -- 4.4 Qualitative Results -- 5 Conclusion -- References -- AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Hand Pose Estimation -- 3.2 Object Pose Estimation -- 3.3 Hand and Object Shape Reconstruction -- 4 Experiments -- 4.1 Benchmarks -- 4.2 Evaluation Metrics -- 4.3 Implementation Details -- 4.4 Hand-Only Experiments on ObMan -- 4.5 Hand-Object Experiments on ObMan -- 4.6 Hand-Object Experiments on DexYCB.
5 Conclusion -- References -- A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation -- 1 Introduction -- 2 Prior Work -- 2.1 Image Features and Deviation Measures -- 2.2 Benchmarks -- 2.3 State-of-the-Art Systems -- 3 fR Method -- 3.1 Probabilistic Model -- 3.2 Parameter Search -- 3.3 Error Prediction -- 4 Datasets -- 5 Experiments -- 5.1 Evaluating Deviation Measures -- 5.2 Evaluating Line Segment Detectors -- 5.3 Comparison with State of the Art -- 5.4 Predicting Reliability -- 5.5 Run Time -- 6 Limitations -- 7 Conclusions -- References -- PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Overview -- 3.2 Stage I: Initial Shape Modeling -- 3.3 Stage II: Joint Optimization with Inverse Rendering -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Dataset -- 4.3 Comparison with MVPS Methods -- 4.4 Comparison with Neural Rendering Based Methods -- 4.5 Method Analysis -- 5 Conclusions -- References -- Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Structured Autoencoding -- 3.2 Unsupervised Learning with Cross-Instance Consistency -- 3.3 Alternate 3D and Pose Learning -- 4 Experiments -- 4.1 Evaluation on the ShapeNet Benchmark -- 4.2 Results on Real Images -- 4.3 Ablation Study -- 5 Conclusion -- References -- Towards Comprehensive Representation Enhancement in Semantics-Guided Self-supervised Monocular Depth Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Self-supervised Monocular Depth Estimation -- 2.2 Vision Transformer -- 2.3 Deep Metric Learning -- 3 Methods -- 3.1 Proposed Model -- 3.2 Photometric Loss and Edge-Aware Smoothness Loss -- 3.3 Hardest Non-boundary Triplet Loss with Minimum-Distance Based Candidate Mining Strategy.
4 Experiments.
Authorized title: | Computer Vision – ECCV 2022
ISBN: | 3-031-19769-0 |
Format: | Printed material
Bibliographic level: | Monograph
Publication language: | English
Record no.: | 996495567003316
Held by: | Univ. di Salerno