
Record no.

UNISA996495568603316

Title

Computer vision - ECCV 2022. Part XXII : 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, proceedings / Shai Avidan [and four others]

Publication/distribution/printing

Cham, Switzerland : Springer, [2022]

©2022

ISBN

3-031-20047-0

Physical description

1 online resource (828 pages)

Series

Lecture Notes in Computer Science

Discipline

006.4

Subjects

Computer vision

Pattern recognition systems

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Contents note

Intro -- Foreword -- Preface -- Organization -- Contents - Part XXII -- ByteTrack: Multi-object Tracking by Associating Every Detection Box -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection in MOT -- 2.2 Data Association -- 3 BYTE -- 4 Experiments -- 4.1 Setting -- 4.2 Ablation Studies on BYTE -- 4.3 Benchmark Evaluation -- 5 Conclusion -- References -- Robust Multi-object Tracking by Marginal Inference -- 1 Introduction -- 2 Related Work -- 2.1 Similarity Computation -- 2.2 Matching Strategy -- 3 Method -- 3.1 Problem Formulation -- 3.2 Our Solution -- 3.3 Tracking Algorithm -- 4 Experiments -- 4.1 MOT Benchmarks and Metrics -- 4.2 Implementation Details -- 4.3 Evaluation of the Marginal Probability -- 4.4 Ablation Studies -- 4.5 Benchmark Evaluation -- 5 Conclusion -- References -- PolarMOT: How Far Can Geometric Relations Take us in 3D Multi-object Tracking? -- 1 Introduction -- 2 Related Work -- 3 Message Passing Networks for Multi-object Tracking -- 4 PolarMOT -- 4.1 Method Overview -- 4.2 Message Passing on a Sparse Multiplex Graph -- 4.3 Localized Relational Polar Encoding -- 4.4 Online Graph Construction -- 4.5 Implementation Details -- 5 Experimental Evaluation -- 5.1 Evaluation Setting -- 5.2 Benchmark Results -- 5.3 Model Ablation -- 5.4 Generalization Study -- 6 Conclusion -- References -- Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories -- 1 Introduction -- 2 Related Work -- 2.1 Optical Flow -- 2.2 Feature Matching -- 2.3 Tracking with Temporal Priors -- 3 Persistent Independent Particles (PIPs) -- 3.1 Setup and Overview -- 3.2 Extracting Features -- 3.3 Initializing Each Target -- 3.4 Measuring Local Appearance Similarity -- 3.5 Iterative Updates -- 3.6 Supervision -- 3.7 Test-Time Trajectory Linking -- 4 Implementation Details -- 5 Experiments -- 5.1 Training Data: FlyingThings++.

5.2 Baselines -- 5.3 Trajectory Estimation in FlyingThings++ -- 5.4 Trajectory Estimation in KITTI -- 5.5 Trajectory Estimation in CroHD -- 5.6 Keypoint Propagation in BADJA -- 5.7 Limitations -- 6 Conclusion -- References -- Tracking Objects as Pixel-Wise Distributions -- 1 Introduction -- 2 Related Work -- 2.1 Transformer-Based Multiple-object Tracking -- 2.2 Conventional Multi-object Tracking -- 2.3 Transformer Revolution -- 2.4 Video Object Detection -- 2.5 Pixel-Wise Techniques -- 3 Pixel-Wise Propagation, Prediction and Association -- 3.1 Single-Frame Feature Extraction -- 3.2 Pixel-Wise Feature Propagation -- 3.3 Pixel-Wise Predictions -- 3.4 Training Targets -- 3.5 Pixel-Wise Association -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Comparisons on Public Benchmarks -- 4.4 Effectiveness of Pixel-Wise Techniques -- 4.5 Influence of Training Techniques -- 4.6 Generalizing to Other Trackers -- 4.7 Visualization of Results -- 5 Conclusion -- References -- CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds -- 1 Introduction -- 2 Related Works -- 2.1 2D Siamese Tracking -- 2.2 3D Single Object Tracking -- 2.3 Transformer for Point Cloud Analysis -- 3 Method -- 3.1 Feature Extraction -- 3.2 Two-Stage Template-Search Matching -- 3.3 Matching-Guided Feature Fusion -- 4 Experiments -- 4.1 Experimental Settings -- 4.2 Comparison Results -- 4.3 Ablation Study -- 5 Conclusions -- References -- Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline -- 1 Introduction -- 2 Related Work -- 3 Proposed Benchmark: Track-it-in-3D -- 3.1 Problem Formulation -- 3.2 Dataset Construction -- 3.3 Evaluation Protocols -- 3.4 Comparison with Related Tasks -- 4 Proposed Baseline: TrackIt3D -- 4.1 Network Architecture -- 4.2 Implementation Details -- 5 Experiments -- 5.1 Benchmark Settings -- 5.2 Benchmark Results.

5.3 Ablation Study -- 6 Conclusions -- References -- Hierarchical Latent Structure for Multi-modal Vehicle Trajectory Forecasting -- 1 Introduction -- 2 Related Works -- 2.1 Limitations of VAEs -- 2.2 Forecasting with Lane Geometry -- 2.3 Lane-Level Scene Context -- 3 Proposed Method -- 3.1 Problem Formulation -- 3.2 Forecasting Model with Hierarchical Latent Structure -- 3.3 Proposed Network Structure -- 3.4 Regularization Through GAN -- 3.5 Training Details -- 3.6 Inference -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metric -- 4.3 Ablation Study -- 4.4 Performance Evaluation -- 5 Conclusions -- References -- AiATrack: Attention in Attention for Transformer Visual Tracking -- 1 Introduction -- 2 Related Work -- 2.1 Visual Tracking -- 2.2 Attention Mechanism -- 2.3 Correlation as Feature -- 3 Method -- 3.1 Attention in Attention -- 3.2 Proposed Framework -- 3.3 Tracking with AiATrack -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Results and Comparisons -- 4.3 Ablation Studies -- 5 Conclusion -- References -- Disentangling Architecture and Training for Optical Flow -- 1 Introduction -- 2 Previous Work -- 3 Approach and Results -- 3.1 Models Evaluated -- 3.2 Pre-training -- 3.3 Fine-Tuning -- 3.4 Benchmark Results -- 3.5 Higher-Resolution Input, Inference Time and Memory -- 3.6 Discussion -- 4 Conclusions -- References -- A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow -- 1 Introduction -- 2 Related Work -- 3 Adversarial Attacks: Foundations and Notations -- 4 A Global Perturbation-Constrained Adversarial Attack for Optical Flow Networks -- 4.1 Attack Strength and Adversarial Robustness for Optical Flow -- 4.2 The Perturbation-Constrained Flow Attack -- 4.3 Joint and Universal Adversarial Perturbations -- 4.4 Design Overview and Comparison to Literature -- 5 Experiments.

5.1 Generating Strong Perturbations for Individual Frame Pairs -- 5.2 Joint and Universal Perturbations -- 5.3 Evaluating Quality and Robustness for Optical Flow Methods -- 6 Conclusions -- References -- Robust Landmark-Based Stent Tracking in X-ray Fluoroscopy -- 1 Introduction -- 2 Related Work -- 2.1 Digital Stent Enhancement -- 2.2 Balloon Marker Detection -- 2.3 Graph Based Object Tracking -- 3 Approach -- 3.1 Landmark Detection -- 3.2 Stent Proposal and Feature Extraction -- 3.3 Stent Tracking -- 3.4 Training -- 4 Experiments -- 4.1 Datasets -- 4.2 Comparative Models -- 4.3 Evaluation Metrics -- 4.4 Implementation Details -- 5 Results and Discussion -- 5.1 Main Results -- 5.2 Ablation Studies -- 6 Conclusion -- References -- Social ODE: Multi-agent Trajectory Forecasting with Neural Ordinary Differential Equations -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Trajectory Forecasting -- 3.2 Social ODE: Overview -- 3.3 Encoder: Spatio-Temporal Transformer -- 3.4 Decoder -- 3.5 Loss Function -- 3.6 Agent Controlling with Social ODE -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation and Training Details -- 4.3 Comparison Results -- 4.4 Agent Controlling with Social ODE -- 4.5 Ablation Study -- 5 Conclusion -- References -- Social-SSL: Self-supervised Cross-Sequence Representation Learning Based on Transformers for Multi-agent Trajectory Prediction -- 1 Introduction -- 2 Related Work -- 3 Social-SSL -- 3.1 Preliminary -- 3.2 Pretext Task -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Implementation Details -- 4.3 Quantitative Results -- 4.4 Qualitative Results -- 4.5 Ablation Study -- 5 Conclusions -- References -- Diverse Human Motion Prediction Guided by Multi-level Spatial-Temporal Anchors -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Multi-level Spatial-Temporal Anchor-Based Sampling.

3.2 Interaction-Enhanced Spatial-Temporal Graph Convolutional Network -- 4 Experiments -- 4.1 Experimental Setup for Diverse Prediction -- 4.2 Quantitative Results and Ablation of Diverse Prediction -- 4.3 Qualitative Results of Diverse Prediction -- 4.4 Effectiveness on Deterministic Prediction -- 5 Conclusion -- References -- Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction -- 1 Introduction -- 2 Related Works -- 2.1 Trajectory Prediction -- 2.2 Group-Aware Representation -- 2.3 Graph Node Pooling -- 3 Proposed Method -- 3.1 Problem Definition -- 3.2 Learning the Trajectory Grouping Network -- 3.3 Pedestrian Group Hierarchy Architecture -- 3.4 Implementation Details -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Quantitative Results -- 4.3 Qualitative Results -- 4.4 Ablation Study -- 5 Conclusion -- References -- Sequential Multi-view Fusion Network for Fast LiDAR Point Motion Estimation -- 1 Introduction -- 2 Related Work -- 3 Approach -- 3.1 Problem Definition -- 3.2 Multi-view Feature Encoding -- 3.3 Sequential Instance Copy-Paste -- 3.4 Optimization Objectives -- 4 Experiments -- 4.1 Datasets and Evaluation Metrics -- 4.2 Experimental Setup -- 4.3 Results on the SemanticKITTI -- 4.4 Results on the Waymo Open Dataset -- 5 Conclusion -- References -- E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs -- 1 Introduction -- 2 Related Work -- 3 Minimal Case in Orientation Estimation -- 3.1 Minimal Solution for Rotation -- 4 Extensibility Graph (E-Graph) -- 4.1 Landmarks from a RGB-D Frame -- 4.2 Data Association -- 5 Experiments -- 5.1 ICL NUIM Dataset -- 5.2 TUM RGB-D -- 6 Conclusion -- References -- Point Cloud Compression with Range Image-Based Entropy Model for Autonomous Driving -- 1 Introduction -- 2 Related Works -- 2.1 Point Cloud Compression Frameworks -- 2.2 3D and 2D Feature Extractors.

3 Our Approach.