Advances in Visual Computing : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16-18, 2023, Proceedings, Part I |
Author | Bebis, George |
Edition | [1st ed.] |
Publication/distribution/printing | Cham : Springer, 2024 |
Physical description | 1 online resource (630 pages) |
Other authors (persons) | Ghiasi, Golnaz; Fang, Yi; Sharf, Andrei; Dong, Yue; Weaver, Chris; Leo, Zhicheng; LaViola, Joseph J., Jr.; Kohli, Luv |
Series | Lecture Notes in Computer Science Series |
ISBN | 3-031-47969-6 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Keynote Talks -- Machine Learning for Scientific Data Analysis and Visualization -- Estimating the Structure and Motion of Biomolecules at Atomic Resolutions -- Curriculum Learning and Active Learning, for Visual Object Recognition when Data is Scarce -- Have We Solved Image Correspondences? -- Visual Content Manipulation by Learning Generative Models -- Lights, Camera, Animation! Adaptive Simulation Methods for Training and Entertainment -- Beyond the Specs: A Computational and Human-Centered Approach to Wearability in AR/VR -- Contents - Part I -- Contents - Part II -- ST: Biomedical Image Analysis Techniques for Cancer Detection, Diagnosis and Management -- Hybrid Region and Pixel-Level Adaptive Loss for Mass Segmentation on Whole Mammography Images -- 1 Introduction -- 2 Related Work -- 2.1 Mass Segmentation on Whole Mammograms -- 2.2 Loss for Medical Image Segmentation -- 3 Methodology -- 3.1 Hybrid Pixel-Level Loss -- 3.2 Hybrid Region-Level Loss -- 3.3 Density-Adaptive Sample-Level Prioritizing Loss -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Evaluation Metrics -- 4.3 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Deep Learning Based GABA Edited-MRS Signal Reconstruction -- 1 Introduction -- 2 Methods -- 2.1 Dataset -- 2.2 J-Difference Spectrum -- 2.3 Dual Branch Self-Attention Neural Network -- 2.4 Evaluation Metrics -- 3 Results and Discussion -- 4 Conclusion -- References -- Investigating the Impact of Attention on Mammogram Classification -- 1 Introduction -- 2 Data and Methods -- 2.1 Data Selection and Preprocessing -- 2.2 Selection of Models -- 2.3 Selection of Attention Methods -- 2.4 Training and Testing Process -- 3 Results and Discussion -- 3.1 Impact of Attention on CNN Performance -- 3.2 Impact of Model Architecture on Performance Differences.
3.3 Impact of Attention on Resolution -- 3.4 Impact of Attention on Abnormality Type -- 3.5 Relationship Between Model Activation and AU-ROC -- 4 Conclusions -- References -- ReFit: A Framework for Refinement of Weakly Supervised Semantic Segmentation Using Object Border Fitting for Medical Images -- 1 Introduction -- 2 Our ReFit Framework -- 2.1 Unsupervised Segment Detection -- 2.2 Class Activation Map - CAM -- 2.3 The BoundaryFit Module -- 3 Results and Discussion -- 3.1 Ablation Studies -- 4 Conclusion -- References -- A Data-Centric Approach for Pectoral Muscle Deep Learning Segmentation Enhancements in Mammography Images -- 1 Introduction -- 2 Related Work -- 3 Mammography Segmentation -- 3.1 Dataset -- 3.2 Model Training -- 3.3 Drawbacks -- 4 Data-Centric Model Optimization -- 4.1 Stage I: Annotation Correction -- 4.2 Stage II: Downsampling -- 5 Results -- 5.1 Evaluation Metrics -- 5.2 Evaluated Training Datasets -- 5.3 Intersection over Union Evaluation -- 5.4 Classification Metrics for Pectoral Muscle Detection in CC View -- 6 Conclusion -- References -- Visualization -- Visualizing Multimodal Time Series at Scale -- 1 Introduction -- 2 Related Work -- 3 Overview Scenario -- 4 Detail Methods and Implementation -- 4.1 Time Series Dataset -- 4.2 Exploiting Elasticsearch for Fast Search and Big Query -- 4.3 Visualizing Time Series -- 5 Exploring UMAFall Dataset with TimeXplore -- 6 Conclusions and Future Work -- References -- Hybrid Tree Visualizations for Analysis of Gerrymandering -- 1 Introduction -- 2 Related Work -- 3 Gerrymandering -- 4 Data Model in Gerrymandering -- 5 Visual Design -- 6 Analysis Examples -- 6.1 Evaluating the Efficiency Gap -- 6.2 Assessing Electoral Competition -- 7 Conclusion -- References -- ArcheryVis: A Tool for Analyzing and Visualizing Archery Performance Data -- 1 Introduction -- 2 Related Work. 
2.1 Archery Performance Analysis -- 2.2 Archery Scoring Apps -- 3 Data Collection, Processing, and Analysis -- 3.1 Data Collection -- 3.2 Ring and Center Detection -- 3.3 Shot Detection and Calibration -- 3.4 Scoring and Statistical Measures -- 4 Visual Interface and Interaction -- 5 Results and Discussion -- 5.1 Brushing and Filtering -- 5.2 Trainee Comparison -- 5.3 Statistical Measure as Performance Indicator -- 5.4 Empirical Evaluation -- 5.5 Limitations -- 6 Conclusions and Future Work -- References -- Spiro: Order-Preserving Visualization in High Performance Computing Monitoring -- 1 Introduction -- 2 Related Work -- 2.1 Spiral Layout in Visualization -- 2.2 Monitoring with Spiral Layout -- 3 Monitoring Tasks -- 4 Spiro Design -- 4.1 Design Rationales -- 4.2 Visual Encoding -- 5 Case Studies -- 5.1 Clustering on Compute Servers -- 5.2 Exploring Usage Behavior -- 6 Conclusion and Future Work -- References -- From Faces to Volumes - Measuring Volumetric Asymmetry in 3D Facial Palsy Scans -- 1 Introduction -- 2 Related Work -- 3 Data Acquisition -- 4 Methods -- 4.1 3D Landmark Extraction for Facial Palsy Patients -- 4.2 Radial Curves -- 4.3 Lateral Face Mesh Generation -- 4.4 Volume Estimation for Lateral Face Sides -- 4.5 Volumetric Difference Visualization -- 5 Volume Analysis During Dynamic Movements -- 6 Conclusions and Future Work -- References -- Video Analysis and Event Recognition -- Comparison of Autoencoder Models for Unsupervised Representation Learning of Skeleton Sequences -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Proposed Methods -- 4 Experiments -- 4.1 Datasets -- 4.2 Results Analysis and Comparisons -- 5 Conclusion and Future Works -- References -- Local and Global Context Reasoning for Spatio-Temporal Action Localization -- 1 Introduction -- 2 Related Works -- 3 Proposed Method -- 3.1 Overall Pipeline. 
3.2 Near-Actor Relation Network -- 4 Experiments on JHMDB21 -- 4.1 Implementation Details -- 4.2 Comparison on JHMDB21 -- 4.3 Ablation Study -- 4.4 Qualitative Results -- 5 Experiments on AVA -- 5.1 Implementation Details -- 5.2 Comparison on AVA -- 6 Conclusion -- References -- Zero-Shot Video Moment Retrieval Using BLIP-Based Models -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Computing Image and Text Embeddings -- 3.2 Sparse Frame-Sampling Strategies -- 3.3 Moment-Query Matching -- 4 Experiments -- 5 Results and Discussion -- 6 Conclusions and Future Work -- References -- Self-supervised Representation Learning for Fine Grained Human Hand Action Recognition in Industrial Assembly Lines -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Model Architecture -- 3.2 Masking Method -- 4 Experiments -- 4.1 Datasets -- 4.2 Model Training Environment -- 4.3 Self-supervised Pretraining and Downstream Task -- 5 Results and Analysis -- 5.1 Results Self-supervised Learning -- 5.2 Results Downstream Task -- 5.3 Analysis -- 6 Conclusion and Outlook -- References -- ST: Innovations in Computer Vision & Machine Learning for Critical & Civil Infrastructures -- Pretext Tasks in Bridge Defect Segmentation Within a ViT-Adapter Framework -- 1 Introduction -- 2 Methods -- 2.1 ViT-Adapter Model -- 2.2 Datasets -- 2.3 Supervised Learning (SL) Pre-training -- 2.4 Self- And Semi-Supervised Learning (SSL) Pre-training -- 2.5 Training Parameters -- 3 Results and Discussion -- 4 Conclusion -- References -- A Few-Shot Attention Recurrent Residual U-Net for Crack Segmentation -- 1 Introduction -- 1.1 Current Limitations and Our Contribution -- 2 Proposed Architecture -- 2.1 R2AU-Net Architecture for Road Crack Segmentation -- 2.2 Few-Shot Learning for Segmentation Refinement -- 3 Experimental Setup and Results -- 3.1 Dataset Description. 
3.2 Comparative Algorithms and Training Configuration -- 3.3 Experiments and Comparisons -- 4 Conclusions -- References -- Efficient Resource Provisioning in Critical Infrastructures Based on Multi-Agent Rollout Enabled by Deep Q-Learning -- 1 Introduction -- 2 Related Work -- 3 Workload Management in Critical Infrastructures -- 3.1 Infrastructure Model -- 3.2 Problem Formulation -- 3.3 Deterministic Markov Decision Process Model -- 3.4 Multi-Agent Rollout Enabled by Deep Q-Learning -- 4 Simulation Experiments -- 4.1 Experimental Setup -- 4.2 Evaluation Results -- 5 Conclusions -- References -- Video-Based Recognition of Aquatic Invasive Species Larvae Using Attention-LSTM Transformer -- 1 Introduction -- 1.1 Attention-LSTM -- 2 Related Work -- 3 Proposed Method -- 3.1 Model Architecture -- 3.2 Attention-LSTM Layer -- 3.3 Model Variations -- 4 Invasive Species Dataset -- 5 Empirical Evaluation -- 6 Conclusion -- References -- ST: Generalization in Visual Machine Learning -- Latent Space Navigation for Face Privacy: A Case Study on the MNIST Dataset -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 4 Experimental Result -- 5 Future Work -- 6 Conclusion -- References -- Domain Generalization for Foreground Segmentation Using Federated Learning -- 1 Introduction -- 2 Related Work -- 3 Proposed Work -- 3.1 Model Architecture -- 3.2 Training Technique -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Traditional Foreground Segmentation Experiment -- 4.4 Domain Generalization Experiment -- 4.5 Few-Shot Experiment -- 5 Conclusion and Future Work -- References -- Probabilistic Local Equivalence Certification for Robustness Evaluation -- 1 Introduction -- 2 Related Work -- 3 Probabilistic Local Equivalence Certification -- 3.1 Probabilistic Local Equivalence Certification -- 3.2 When Labels are Available. 3.3 The Case of Classification. |
Record no. | UNISA-996565867203316 |
Available at | Univ. di Salerno |

Advances in Visual Computing : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16-18, 2023, Proceedings, Part I |
Author | Bebis, George |
Edition | [1st ed.] |
Publication/distribution/printing | Cham : Springer, 2024 |
Physical description | 1 online resource (630 pages) |
Other authors (persons) | Ghiasi, Golnaz; Fang, Yi; Sharf, Andrei; Dong, Yue; Weaver, Chris; Leo, Zhicheng; LaViola, Joseph J., Jr.; Kohli, Luv |
Series | Lecture Notes in Computer Science Series |
ISBN | 3-031-47969-6 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Record no. | UNINA-9910767585603321 |
Available at | Univ. Federico II |

Advances in Visual Computing [electronic resource] : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16–18, 2023, Proceedings, Part II / edited by George Bebis, Golnaz Ghiasi, Yi Fang, Andrei Sharf, Yue Dong, Chris Weaver, Zhicheng Leo, Joseph J. LaViola Jr., Luv Kohli |
Author | Bebis, George |
Edition | [1st ed. 2023.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2023 |
Physical description | 1 online resource (506 pages) |
Classification | 006 |
Other authors (persons) | Ghiasi, Golnaz; Fang, Yi; Sharf, Andrei; Dong, Yue; Weaver, Chris; Leo, Zhicheng; LaViola, Joseph J., Jr.; Kohli, Luv |
Series | Lecture Notes in Computer Science |
Topical subject | Image processing - Digital techniques; Computer vision; Computer Imaging, Vision, Pattern Recognition and Graphics |
ISBN | 3-031-47966-1 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Virtual Reality -- A Pilot Study Comparing User Interactions Between Augmented and Virtual Reality -- Synthesizing Play-Ready VR Scenes with Natural Language Prompts through GPT API -- Emergent Individual Factors for AR Education and Training -- Segmentation -- ISLE: A Framework for Image Level Semantic Segmentation Ensemble -- Particulate Mapping Centerline Extraction (PMCE), a Novel Centerline Extraction Algorithm Based on Patterns in the Spatial Distribution of Aggregates -- Evaluating Segmentation Approaches on Digitized Herbarium Specimens -- Semantic Scene Filtering for Event Cameras in Long-Term Outdoor Monitoring Scenarios -- SODAWideNet - Salient Object Detection with an Attention augmented Wide Encoder Decoder network without ImageNet pre-training -- Applications -- Foil-Net: Deep Learning-Based Wave Classification for Hydrofoil Surfing -- Inpainting of Depth Images using Deep Neural Networks for Real-Time Applications -- Using 2D and 3D Face Representations to Generate Comprehensive Facial Electromyography Intensity Maps -- Real-world Image Deblurring via Unsupervised Domain Adaptation -- Object Detection and Recognition -- Reliable Matching by Combining Optimal Color and Intensity Information based on Relationships between Target and Surrounding Objects -- Regularized Meta-Training with Embedding Mixup for Improved Few-Shot Learning -- Visual Foreign Object Detection for Wireless Charging of Electric Vehicles -- Deep Representation Learning for License Plate Recognition in Low Quality Video Images -- Optimizing PnP-Algorithms for Limited Point Correspondences Using Spatial Constraints -- Deep Learning -- Unsupervised Deep-Learning Approach for Underwater Image Enhancement -- LaneNet++: Uncertainty-aware Lane Detection for Autonomous Vehicle -- Task-driven Compression for Collision Encoding based on Depth Images -- Eigenpatches - Adversarial Patches from Principal Components -- Edge-guided Image Inpainting with Transformer -- Poster 
-- Bayesian Fusion inspired 3D reconstruction via LiDAR-Stereo Camera Pair -- Marimba Mallet Placement Tracker -- DINO-CXR: A Self Supervised Method Based on Vision Transformer for Chest X-Ray Classification -- Social Bias and Image Tagging: Evaluation of Progress in State-of-the-Art Models -- L-TReiD: Logic Tensor Transformer for Re-Identification -- Retinal Disease Diagnosis with a Hybrid ResNet50-LSTM Deep Learning Model -- Pothole Segmentation and Area Estimation with Deep Neural Networks and Unmanned Aerial Vehicles -- Generation method of robot assembly motion considering physicality gap between humans and robots -- A Self-Supervised Pose Estimation Approach for Construction Machines -- Image Quality Improvement of Surveillance Camera Images by Learning Noise Removal Method Using Noise2Noise -- Automating Kernel Size Selection in MRI Reconstruction via a Transparent and Interpretable Search Approach -- Segmentation and Identification of Mediterranean Plant Species -- Exploiting Generative Adversarial Networks in Joint Sensitivity Encoding for Enhanced MRI Reconstruction -- Multisensory Modeling of Tabular Data for Enhanced Perception and Immersive User Experience -- Coping with Bullying Incidents by the Narrative and Multi-modal Interaction in Virtual Reality. |
Record no. | UNINA-9910767583103321 |
Available at | Univ. Federico II |

Advances in Visual Computing [electronic resource] : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16–18, 2023, Proceedings, Part II / edited by George Bebis, Golnaz Ghiasi, Yi Fang, Andrei Sharf, Yue Dong, Chris Weaver, Zhicheng Leo, Joseph J. LaViola Jr., Luv Kohli |
Author | Bebis, George |
Edition | [1st ed. 2023.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2023 |
Physical description | 1 online resource (506 pages) |
Classification | 006 |
Other authors (persons) | Ghiasi, Golnaz; Fang, Yi; Sharf, Andrei; Dong, Yue; Weaver, Chris; Leo, Zhicheng; LaViola, Joseph J., Jr.; Kohli, Luv |
Series | Lecture Notes in Computer Science |
Topical subject | Image processing - Digital techniques; Computer vision; Computer Imaging, Vision, Pattern Recognition and Graphics |
ISBN | 3-031-47966-1 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Record no. | UNISA-996574260103316 |
Available at | Univ. di Salerno |

Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part I / edited by Fang-Lue Zhang, Andrei Sharf |
Author | Zhang, Fang-Lue |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Singapore : Springer Nature Singapore : Imprint: Springer, 2024 |
Physical description | 1 online resource (331 pages) |
Classification | 006.37 |
Other authors (persons) | Sharf, Andrei |
Series | Lecture Notes in Computer Science |
Topical subject | Computer vision; Pattern recognition systems; Application software; Computer graphics; Artificial intelligence; Algorithms; Computer Vision; Automated Pattern Recognition; Computer and Information Systems Applications; Computer Graphics; Artificial Intelligence |
ISBN | 981-9720-95-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents - Part I -- Contents - Part II -- Reconstruction and Modelling -- PIFu for the Real World: A Self-supervised Framework to Reconstruct Dressed Human from Single-View Images -- 1 Introduction -- 2 Related Work -- 2.1 Singe-View Human Reconstruction -- 2.2 Single-View Depth Estimation -- 2.3 Self-supervised 3D Reconstruction -- 3 Method -- 3.1 Normal and Depth Estimation -- 3.2 SDF-Based Pixel-Aligned Implicit Function from Depth -- 3.3 Depth-Guided Self-supervised Learning -- 4 Experiments -- 4.1 Datasets, Metrics, and Implementation Details -- 4.2 Evaluations -- 4.3 Comparison with the State-of-the-Art -- 5 Conclusion -- References -- Sketchformer++: A Hierarchical Transformer Architecture for Vector Sketch Representation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Data Representation -- 3.2 Hierarchical Transformer Architecture -- 3.3 Training -- 4 Experiments -- 4.1 Sketch Reconstruction -- 4.2 Sketch Recognition -- 4.3 Sketch Semantic Segmentation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Leveraging Panoptic Prior for 3D Zero-Shot Semantic Understanding Within Language Embedded Radiance Fields -- 1 Introduction -- 2 Related Works -- 2.1 NeRF with Semantics -- 2.2 Panoptic Segmentation -- 2.3 Open-Vocabulary Object Detection -- 2.4 Zero-Shot Learning in 3D -- 2.5 Cross-Modal Knowledge Distillation -- 3 Method -- 3.1 Overview -- 3.2 Field Structure -- 3.3 Semantic Prior Extraction -- 3.4 CLIP Pyramid Reconstruction -- 3.5 Relevancy Evaluation Metric -- 4 Experiments -- 4.1 Settings -- 4.2 Qualitative Results -- 4.3 Ablation Study -- 5 Limitations -- 6 Conclusions -- References -- Multi-Scale Implicit Surface Reconstruction for Outdoor Scenes -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Multi-scale Rendering with SDF Representation -- 3.2 Dynamic Position Encoding.
3.3 Adaptive Sampling Strategy in Image Space -- 3.4 More Details -- 3.5 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative and Quantitative Comparisons -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neural Radiance Fields for Dynamic View Synthesis Using Local Temporal Priors -- 1 Introduction -- 2 Related Work -- 3 Overview -- 4 Dynamic Scene Representation -- 5 Local Temporal NeRF -- 5.1 Local Temporal Module -- 5.2 Loss Functions -- 5.3 Implementation -- 6 Results -- 6.1 Quantitative Evaluation -- 6.2 Qualitative Evaluation -- 6.3 Ablation Study -- 6.4 Additional Comparisons -- 7 Limitations and Discussion -- 8 Conclusion -- References -- Point Cloud -- Point Cloud Segmentation with Guided Sampling and Continuous Interpolation -- 1 Introduction -- 2 Related Work -- 2.1 Point Cloud Learning -- 2.2 Point Cloud Sampling -- 3 Method -- 3.1 Motivation -- 3.2 Guided Sampling -- 3.3 Continuous Interpolation -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Signal Reconstruction -- 4.3 Semantic Segmentation -- 4.4 Object Part Segmentation -- 4.5 Ablation Study -- 5 Conclusion and Discussion -- References -- TopFormer: Topology-Aware Transformer for Point Cloud Registration -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Definition -- 3.2 Local Feature Encoder -- 3.3 Topology-Aware Transformer -- 3.4 Sparse Point Matching -- 3.5 Dense Points Refinement -- 3.6 Loss Function -- 4 Experiments -- 4.1 Implementation -- 4.2 Indoor Scene: 3DMatch -- 4.3 Outdoor Scene Data: KITTI -- 4.4 Ablation Study -- 5 Conclusion -- References -- Adversarial Geometric Transformations of Point Clouds for Physical Attack -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Adversarial Geometric Transformations -- 3.3 Optimization -- 4 Experiments -- 4.1 Dataset and Settings. 
4.2 Evaluation on Adversarial Point Clouds -- 4.3 Evaluation on Shape and Physical Attack -- 4.4 Ablation Studies -- 5 Conclusions -- References -- SARNet: Semantic Augmented Registration of Large-Scale Urban Point Clouds -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Feature-Based Registration -- 2.2 Learning-Based Registration -- 2.3 3D Point Feature Learning -- 3 Problem Statement and Overview -- 4 Methodology -- 4.1 Semantic-Based Farthest Point Sampling -- 4.2 Semantic-Augmented Feature Extraction -- 4.3 Semantic-Refined Transformation Estimation -- 4.4 Loss Functions -- 4.5 Implementation Details -- 5 Experimental Results -- 5.1 Experimental Setup -- 5.2 Evaluation Metrics -- 5.3 Comparisons -- 5.4 Ablation Study -- 5.5 Limitations -- 6 Conclusion and Future Work -- References -- Rendering and Animation -- FASSET: Frame Supersampling and Extrapolation Using Implicit Neural Representations of Rendering Contents -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Motivation and Overview -- 3.2 Implicit Neural Representations of Rendering Contents -- 3.3 Frame Feature Extractor -- 3.4 Network Training -- 4 Experiments -- 4.1 Dataset -- 4.2 Baselines and Settings -- 4.3 Analysis of Runtime Performance and Model Efficiency -- 4.4 Ablation Study -- 4.5 Limitation -- 5 Conclusion -- References -- MatTrans: Material Reflectance Property Estimation of Complex Objects with Transformer -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Initial Estimation Network -- 3.2 Refined Estimation Network -- 3.3 Transformer Encoder -- 3.4 Dataset -- 3.5 Training -- 4 Experiments -- 4.1 Ablation Experiment -- 4.2 Generalization to Real Data -- 4.3 Comparison Experiment -- 5 Conclusion -- References -- Improved Text-Driven Human Motion Generation via Out-of-Distribution Detection and Rectification -- 1 Introduction -- 2 Related Work. 
3 The Proposed Method -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Experiment Configuration and Training Details -- 4.3 Comparisons Between Different Text-Driven Human Motion Generation Methods -- 4.4 Comparison Between Different Outlier Detection Algorithms -- 4.5 Evaluation of Different Thresholds for Outlier Detection -- 4.6 Ablation Study -- 5 Conclusion -- References -- User Interactions -- BK-Editer: Body-Keeping Text-Conditioned Real Image Editing -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Diffusion Model Training -- 3.2 DDIM Sampling and Inversion -- 3.3 Text Condition and Classifier-Free Guidance -- 3.4 Stable Diffusion Model -- 3.5 Task Setting and the Body-Keeping Problem -- 4 Method -- 4.1 Tuning Stage for Finetuning Network -- 4.2 Inversion Stage for Obtaining BK-Attn Embeddings -- 4.3 Edit Stage with Body-Keeping -- 5 Experiments -- 5.1 Comparisons with Other Concurrent Works -- 5.2 User Study -- 5.3 Ablation Study -- 6 Limitations and Conclusion -- References -- Walking Telescope: Exploring the Zooming Effect in Expanding Detection Threshold Range for Translation Gain -- 1 Introduction -- 2 Related Work -- 2.1 Translation Gain Detection Threshold -- 2.2 Impact of FoV Change on Distance Perception -- 2.3 Impact of Magnified View on Distance Perception -- 3 Method -- 3.1 Translation Gain -- 3.2 Motivation -- 3.3 Verification Experiment -- 4 Main Experiment -- 4.1 Design and Hypotheses -- 4.2 Apparatus -- 4.3 Participants -- 4.4 Procedure -- 5 Results -- 5.1 Direction Thresholds -- 5.2 Simulator Sickness -- 6 Discussion -- 7 Limitation and Future Work -- 8 Conclusion -- References -- A U-Shaped Spatio-Temporal Transformer as Solver for Motion Capture -- 1 Introduction -- 2 Related Work -- 2.1 MoCap Data Clean-Up and Solving -- 2.2 Smoothness -- 2.3 Rotation Representations -- 2.4 Attention Model. 
2.5 U-Net Architecture -- 3 Methodology -- 3.1 Problem Formulation -- 3.2 Overall Structure -- 4 Experiments and Evaluation -- 4.1 Experimental Settings -- 4.2 Quantitative and Qualitative Research -- 4.3 Ablation Study -- 5 Limitations and Future Work -- 6 Conclusion -- References -- ROSA-Net: Rotation-Robust Structure-Aware Network for Fine-Grained 3D Shape Retrieval -- 1 Introduction -- 2 Related Work -- 2.1 3D Shape Retrieval -- 2.2 Mesh-Based Representations -- 2.3 Rotation-Invariant Representations -- 3 ROSA-Net -- 3.1 Overview -- 3.2 Geometric Feature Representation -- 3.3 Part Geometry Attention Mechanism -- 3.4 Structural Information Representation -- 3.5 Geometry-Structure Attention Mechanism -- 3.6 Global Feature Encoding -- 3.7 Losses -- 3.8 Model Training and Shape Retrieval -- 4 Experimental Results -- 4.1 ROSA-Dataset -- 4.2 Fine-Grained Shape Retrieval -- 4.3 Weighted Features of Parts by Part-Geo Attention -- 4.4 Weighted Features by Geo-Struct Attention -- 4.5 Using Other Data Representation -- 4.6 Ablation Study -- 5 Conclusion -- References -- Author Index. |
Record No. | UNINA-9910847083303321 |
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Find it here: Univ. Federico II |
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part I / edited by Fang-Lue Zhang, Andrei Sharf |
Author | Zhang Fang-Lue |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Physical description | 1 online resource (331 pages) |
Discipline | 006.37 |
Other authors (Persons) | Sharf Andrei |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence |
ISBN | 981-9720-95-8 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents - Part I -- Contents - Part II -- Reconstruction and Modelling -- PIFu for the Real World: A Self-supervised Framework to Reconstruct Dressed Human from Single-View Images -- 1 Introduction -- 2 Related Work -- 2.1 Single-View Human Reconstruction -- 2.2 Single-View Depth Estimation -- 2.3 Self-supervised 3D Reconstruction -- 3 Method -- 3.1 Normal and Depth Estimation -- 3.2 SDF-Based Pixel-Aligned Implicit Function from Depth -- 3.3 Depth-Guided Self-supervised Learning -- 4 Experiments -- 4.1 Datasets, Metrics, and Implementation Details -- 4.2 Evaluations -- 4.3 Comparison with the State-of-the-Art -- 5 Conclusion -- References -- Sketchformer++: A Hierarchical Transformer Architecture for Vector Sketch Representation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Data Representation -- 3.2 Hierarchical Transformer Architecture -- 3.3 Training -- 4 Experiments -- 4.1 Sketch Reconstruction -- 4.2 Sketch Recognition -- 4.3 Sketch Semantic Segmentation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Leveraging Panoptic Prior for 3D Zero-Shot Semantic Understanding Within Language Embedded Radiance Fields -- 1 Introduction -- 2 Related Works -- 2.1 NeRF with Semantics -- 2.2 Panoptic Segmentation -- 2.3 Open-Vocabulary Object Detection -- 2.4 Zero-Shot Learning in 3D -- 2.5 Cross-Modal Knowledge Distillation -- 3 Method -- 3.1 Overview -- 3.2 Field Structure -- 3.3 Semantic Prior Extraction -- 3.4 CLIP Pyramid Reconstruction -- 3.5 Relevancy Evaluation Metric -- 4 Experiments -- 4.1 Settings -- 4.2 Qualitative Results -- 4.3 Ablation Study -- 5 Limitations -- 6 Conclusions -- References -- Multi-Scale Implicit Surface Reconstruction for Outdoor Scenes -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Multi-scale Rendering with SDF Representation -- 3.2 Dynamic Position Encoding.
3.3 Adaptive Sampling Strategy in Image Space -- 3.4 More Details -- 3.5 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative and Quantitative Comparisons -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neural Radiance Fields for Dynamic View Synthesis Using Local Temporal Priors -- 1 Introduction -- 2 Related Work -- 3 Overview -- 4 Dynamic Scene Representation -- 5 Local Temporal NeRF -- 5.1 Local Temporal Module -- 5.2 Loss Functions -- 5.3 Implementation -- 6 Results -- 6.1 Quantitative Evaluation -- 6.2 Qualitative Evaluation -- 6.3 Ablation Study -- 6.4 Additional Comparisons -- 7 Limitations and Discussion -- 8 Conclusion -- References -- Point Cloud -- Point Cloud Segmentation with Guided Sampling and Continuous Interpolation -- 1 Introduction -- 2 Related Work -- 2.1 Point Cloud Learning -- 2.2 Point Cloud Sampling -- 3 Method -- 3.1 Motivation -- 3.2 Guided Sampling -- 3.3 Continuous Interpolation -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Signal Reconstruction -- 4.3 Semantic Segmentation -- 4.4 Object Part Segmentation -- 4.5 Ablation Study -- 5 Conclusion and Discussion -- References -- TopFormer: Topology-Aware Transformer for Point Cloud Registration -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Definition -- 3.2 Local Feature Encoder -- 3.3 Topology-Aware Transformer -- 3.4 Sparse Point Matching -- 3.5 Dense Points Refinement -- 3.6 Loss Function -- 4 Experiments -- 4.1 Implementation -- 4.2 Indoor Scene: 3DMatch -- 4.3 Outdoor Scene Data: KITTI -- 4.4 Ablation Study -- 5 Conclusion -- References -- Adversarial Geometric Transformations of Point Clouds for Physical Attack -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Adversarial Geometric Transformations -- 3.3 Optimization -- 4 Experiments -- 4.1 Dataset and Settings. 
4.2 Evaluation on Adversarial Point Clouds -- 4.3 Evaluation on Shape and Physical Attack -- 4.4 Ablation Studies -- 5 Conclusions -- References -- SARNet: Semantic Augmented Registration of Large-Scale Urban Point Clouds -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Feature-Based Registration -- 2.2 Learning-Based Registration -- 2.3 3D Point Feature Learning -- 3 Problem Statement and Overview -- 4 Methodology -- 4.1 Semantic-Based Farthest Point Sampling -- 4.2 Semantic-Augmented Feature Extraction -- 4.3 Semantic-Refined Transformation Estimation -- 4.4 Loss Functions -- 4.5 Implementation Details -- 5 Experimental Results -- 5.1 Experimental Setup -- 5.2 Evaluation Metrics -- 5.3 Comparisons -- 5.4 Ablation Study -- 5.5 Limitations -- 6 Conclusion and Future Work -- References -- Rendering and Animation -- FASSET: Frame Supersampling and Extrapolation Using Implicit Neural Representations of Rendering Contents -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Motivation and Overview -- 3.2 Implicit Neural Representations of Rendering Contents -- 3.3 Frame Feature Extractor -- 3.4 Network Training -- 4 Experiments -- 4.1 Dataset -- 4.2 Baselines and Settings -- 4.3 Analysis of Runtime Performance and Model Efficiency -- 4.4 Ablation Study -- 4.5 Limitation -- 5 Conclusion -- References -- MatTrans: Material Reflectance Property Estimation of Complex Objects with Transformer -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Initial Estimation Network -- 3.2 Refined Estimation Network -- 3.3 Transformer Encoder -- 3.4 Dataset -- 3.5 Training -- 4 Experiments -- 4.1 Ablation Experiment -- 4.2 Generalization to Real Data -- 4.3 Comparison Experiment -- 5 Conclusion -- References -- Improved Text-Driven Human Motion Generation via Out-of-Distribution Detection and Rectification -- 1 Introduction -- 2 Related Work. 
3 The Proposed Method -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Experiment Configuration and Training Details -- 4.3 Comparisons Between Different Text-Driven Human Motion Generation Methods -- 4.4 Comparison Between Different Outlier Detection Algorithms -- 4.5 Evaluation of Different Thresholds for Outlier Detection -- 4.6 Ablation Study -- 5 Conclusion -- References -- User Interactions -- BK-Editer: Body-Keeping Text-Conditioned Real Image Editing -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Diffusion Model Training -- 3.2 DDIM Sampling and Inversion -- 3.3 Text Condition and Classifier-Free Guidance -- 3.4 Stable Diffusion Model -- 3.5 Task Setting and the Body-Keeping Problem -- 4 Method -- 4.1 Tuning Stage for Finetuning Network -- 4.2 Inversion Stage for Obtaining BK-Attn Embeddings -- 4.3 Edit Stage with Body-Keeping -- 5 Experiments -- 5.1 Comparisons with Other Concurrent Works -- 5.2 User Study -- 5.3 Ablation Study -- 6 Limitations and Conclusion -- References -- Walking Telescope: Exploring the Zooming Effect in Expanding Detection Threshold Range for Translation Gain -- 1 Introduction -- 2 Related Work -- 2.1 Translation Gain Detection Threshold -- 2.2 Impact of FoV Change on Distance Perception -- 2.3 Impact of Magnified View on Distance Perception -- 3 Method -- 3.1 Translation Gain -- 3.2 Motivation -- 3.3 Verification Experiment -- 4 Main Experiment -- 4.1 Design and Hypotheses -- 4.2 Apparatus -- 4.3 Participants -- 4.4 Procedure -- 5 Results -- 5.1 Direction Thresholds -- 5.2 Simulator Sickness -- 6 Discussion -- 7 Limitation and Future Work -- 8 Conclusion -- References -- A U-Shaped Spatio-Temporal Transformer as Solver for Motion Capture -- 1 Introduction -- 2 Related Work -- 2.1 MoCap Data Clean-Up and Solving -- 2.2 Smoothness -- 2.3 Rotation Representations -- 2.4 Attention Model. 
2.5 U-Net Architecture -- 3 Methodology -- 3.1 Problem Formulation -- 3.2 Overall Structure -- 4 Experiments and Evaluation -- 4.1 Experimental Settings -- 4.2 Quantitative and Qualitative Research -- 4.3 Ablation Study -- 5 Limitations and Future Work -- 6 Conclusion -- References -- ROSA-Net: Rotation-Robust Structure-Aware Network for Fine-Grained 3D Shape Retrieval -- 1 Introduction -- 2 Related Work -- 2.1 3D Shape Retrieval -- 2.2 Mesh-Based Representations -- 2.3 Rotation-Invariant Representations -- 3 ROSA-Net -- 3.1 Overview -- 3.2 Geometric Feature Representation -- 3.3 Part Geometry Attention Mechanism -- 3.4 Structural Information Representation -- 3.5 Geometry-Structure Attention Mechanism -- 3.6 Global Feature Encoding -- 3.7 Losses -- 3.8 Model Training and Shape Retrieval -- 4 Experimental Results -- 4.1 ROSA-Dataset -- 4.2 Fine-Grained Shape Retrieval -- 4.3 Weighted Features of Parts by Part-Geo Attention -- 4.4 Weighted Features by Geo-Struct Attention -- 4.5 Using Other Data Representation -- 4.6 Ablation Study -- 5 Conclusion -- References -- Author Index. |
Record No. | UNISA-996589543503316 |
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Find it here: Univ. di Salerno |
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part II / edited by Fang-Lue Zhang, Andrei Sharf |
Author | Zhang Fang-Lue |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Physical description | 1 online resource (384 pages) |
Discipline | 006.37 |
Other authors (Persons) | Sharf Andrei |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence |
ISBN | 981-9720-92-3 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents - Part II -- Contents - Part I -- Facial Images -- Zero-Shot Real Facial Attribute Separation and Transfer at Novel Views -- 1 Introduction -- 2 Related Works -- 2.1 Explicit Face Morphable Models -- 2.2 3D-Aware Implicit Models -- 2.3 Disentanglement Representation Learning -- 3 Method -- 3.1 Model Architecture -- 3.2 EM-Like Alternating Training Procedure -- 3.3 Model Parameters Initialization -- 3.4 Rendering Refinement with Blind Face Restoration -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Zero-Shot Attribute Separation from Single Image -- 4.3 Comparisons -- 4.4 Ablation Study -- 5 Conclusion -- 5.1 Limitation -- References -- Explore and Enhance the Generalization of Anomaly DeepFake Detection -- 1 Introduction -- 2 Related Work -- 2.1 Conventional DeepFake Detection -- 2.2 Anomaly DeepFake Detection -- 3 Approach -- 3.1 Overview -- 3.2 Review and Exploration of ADFD -- 3.3 Boundary Blur Mask Generator -- 3.4 Noise Refinement Strategy -- 3.5 Algorithm -- 4 Experiments -- 4.1 Experiments Setting -- 4.2 Exploration Experiments of ADFD Methods -- 4.3 Comparison Experiments -- 5 Conclusion -- References -- Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment -- 1 Introduction -- 2 Related Work -- 2.1 Image Quality Assessment -- 2.2 Face Image Quality Assessment -- 3 Method -- 3.1 Recognition-Oriented Non-reference Quality Measurement -- 3.2 Tiny Face Quality Network -- 3.3 Generating Training Dataset with Quality Labels -- 3.4 Data Sampling and Augmentation Strategy for Balancing the Distribution of Scores -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Datasets and Protocols -- 4.3 Visualization of Different FIQA Methods -- 4.4 Memory and Computation Costs -- 4.5 Quantitative Evaluation on IJB-B and IJB-C Datasets -- 4.6 Quantitative Evaluation on YTF Dataset.
4.7 Ablation Studies -- 5 Conclusion -- References -- Face Expression Recognition via Product-Cross Dual Attention and Neutral-Aware Anchor Loss -- 1 Introduction -- 2 Related Work -- 2.1 Landmark -- 2.2 Transformer in FER -- 2.3 Losses Used in FER -- 3 Our Method -- 3.1 Product-Cross Dual Attention Module -- 3.2 Neutral Expression Aware Anchor Loss -- 3.3 Total Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with the State-of-the-Art Methods -- 4.5 Comparison on Number of Parameters and Running Performance -- 5 Conclusion -- References -- Image Generation and Enhancement -- Deformable CNN with Position Encoding for Arbitrary-Scale Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Implicit Neural Representation -- 2.2 Single Image Super-Resolution (SISR) -- 2.3 Arbitrary-Scale Super-Resolution -- 3 Methods -- 3.1 Deformable Feature Unfolding (DFU) -- 3.2 Fusion with Learned Position Encoding (FPE) -- 3.3 Deep ResMLP -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Detail -- 4.3 Evaluation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Single-Video Temporal Consistency Enhancement with Rolling Guidance -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Consistency for Specific Tasks -- 2.2 Blind Video Temporal Consistency -- 2.3 Spatial Smoothing Filters and Rolling Guidance -- 3 Method -- 3.1 Overview -- 3.2 Constructing Coarse Guidance Video -- 3.3 Recovering Image Details -- 3.4 Global Refinement -- 3.5 Comparison with the Deflickering Algorithm -- 4 Experiment -- 4.1 Dataset -- 4.2 Quality Assessment -- 4.3 Comparison to State-of-the-Art Methods -- 4.4 Ablation Study -- 5 Discussion and Conclusion -- References -- GTLayout: Learning General Trees for Structured Grid Layout Generation -- 1 Introduction -- 2 Related Work -- 3 Method. 
3.1 Structural Layout Representation -- 3.2 Generative Model for Structured Grid Layouts -- 3.3 Training -- 4 Evaluation -- 4.1 Layout Generation -- 4.2 Layout Reconstruction -- 4.3 Layout Interpolation -- 5 Conclusion -- References -- Image Understanding -- Silhouette-Based 6D Object Pose Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Methods -- 2.2 Methods with Deep Learning -- 3 The Method -- 3.1 Problem Formulation and Notation -- 3.2 Dimensionality Reduction -- 3.3 Optimized Particle Swarm Optimization -- 4 Experiments -- 4.1 Experiments Setup -- 4.2 Comparison to State of the Art -- 4.3 Performance on YCB-V-NT and TR-RW -- 4.4 Silhouette Stability Experiments -- 4.5 Ablation Study on YCB-V -- 5 Conclusion and Outlook -- References -- Robust Light Field Depth Estimation over Occluded and Specular Regions -- 1 Introduction -- 2 Related Work -- 3 The Depth Estimation -- 3.1 Consistency Data and Confidence -- 3.2 NPCR Depth Estimation -- 3.3 Depth Refinement -- 4 Experiment -- 4.1 Occlusion Processing Comparisons -- 4.2 Specular Regions Processing -- 4.3 Depth Map -- 4.4 Computational Time -- 5 Conclusion and Limitation -- References -- Foreground and Background Separate Adaptive Equilibrium Gradients Loss for Long-Tail Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Long-Tail Object Detection -- 3 Methodology -- 3.1 Revisiting Sigmoid Cross-Entropy Loss -- 3.2 Foreground and Background Separate Adaptive Equilibrium Gradients Loss -- 4 Experiments on LVIS -- 4.1 Datasets and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Generalization on Stronger Models -- 4.5 Performance Analysis -- 4.6 Comparison with State-of-the-Art Methods -- 4.7 Evaluation on COCO-LT -- 4.8 Result Visualization -- 5 Conclusion -- References -- Stylization. 
Multi-level Patch Transformer for Style Transfer with Single Reference Image -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Multi-level Patch Transformer Encoder -- 3.2 Dynamic Filter-Based Decoder -- 3.3 Loss Functions -- 4 Experiments and Evaluations -- 4.1 Implementation Details -- 4.2 Qualitative Evaluation -- 4.3 Ablation Study -- 4.4 User Study -- 4.5 Quantitative Evaluations -- 4.6 Discussion CycleTransformer vs CycleGAN -- 5 Conclusion and Future Work -- References -- Palette-Based Content-Aware Image Recoloring -- 1 Introduction -- 2 Related Works -- 2.1 Palette-Based Image Recoloring -- 2.2 Edit Propagation (Stroke-Based Image Recoloring) -- 2.3 Style Transfer (Example-Based Image Recoloring) -- 3 Method -- 3.1 Overview -- 3.2 Palette Extraction -- 3.3 Content-Aware Recoloring -- 4 Experiments -- 4.1 Results -- 4.2 Evaluation -- 4.3 Comparisons -- 5 Conclusion, Limitation and Future Work -- References -- FreeStyler: A Free-Form Stylization Method via Multimodal Vector Quantization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Vector Quantization Framework -- 3.2 Pseudo-Paired Token Predictor -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative Results -- 4.3 Quantitative Results -- 4.4 Ablation Study -- 4.5 Applications -- 5 Limitations and Future Work -- 6 Conclusion -- References -- Vision Meets Graphics -- Denoised Dual-Level Contrastive Network for Weakly-Supervised Temporal Sentence Grounding -- 1 Introduction -- 2 Related Work -- 2.1 Weakly-Supervised Temporal Sentence Grounding -- 2.2 Contrastive Representation Learning -- 3 The Proposed Method -- 3.1 Problem Formulation -- 3.2 Visual-Text Feature Extraction -- 3.3 Gaussian-Based Proposal Generation -- 3.4 Intra-video Contrastive Learning -- 3.5 Inter-video Contrastive Learning -- 3.6 Pseudo-Label Noise Removal -- 3.7 Training and Inference. 
4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metric -- 4.3 Implementation Details -- 4.4 Comparisons with State-of-the-Art Methods -- 4.5 Ablation Study and Analysis -- 4.6 Qualitative Results -- 5 Conclusion -- References -- Isolation and Integration: A Strong Pre-trained Model-Based Paradigm for Class-Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Setting -- 3.2 A Simple Baseline -- 3.3 Dynamically Adaption and Aggregation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with State of the Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- Object Category-Based Visual Dialog for Effective Question Generation -- 1 Introduction -- 2 Related Work -- 3 Model -- 3.1 Object Information Extraction -- 3.2 Category Selection -- 3.3 Object Fusion Feature Update -- 3.4 Object-Self Difference Attention Module -- 3.5 Question Decoder -- 3.6 Object-Level Attention Update -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Experiment Settings -- 4.4 Results -- 5 Conclusions -- References -- AST: An Attention-Guided Segment Transformer for Drone-Based Cross-View Geo-Localization -- 1 Introduction -- 2 Related Work -- 2.1 Image-Based Cross-View Geo-Localization -- 2.2 Vision Transformer -- 3 Proposed Method -- 3.1 Problem Formulation -- 3.2 Vision Transformer for Cross-View Geo-Localization -- 3.3 Attention-Guided Segment Tokens -- 3.4 Loss Function and Training Strategy -- 4 Experiment -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Existing Methods -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Improved YOLOv5 Algorithm for Small Object Detection in Drone Images -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Small Object Detection -- 2.3 YOLOv5 -- 3 HTH-YOLOv5 -- 3.1 Hybrid Transformer Head. 3.2 Convolutional Attention Feature Fusion Module. |
Record No. | UNINA-9910847092103321 |
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Find it here: Univ. Federico II |
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part II / edited by Fang-Lue Zhang, Andrei Sharf |
Author | Zhang Fang-Lue |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Singapore : Springer Nature Singapore : Imprint : Springer, 2024 |
Physical description | 1 online resource (384 pages) |
Discipline | 006.37 |
Other authors (Persons) | Sharf Andrei |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence |
ISBN | 981-9720-92-3 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Intro -- Preface -- Organization -- Contents - Part II -- Contents - Part I -- Facial Images -- Zero-Shot Real Facial Attribute Separation and Transfer at Novel Views -- 1 Introduction -- 2 Related Works -- 2.1 Explicit Face Morphable Models -- 2.2 3D-Aware Implicit Models -- 2.3 Disentanglement Representation Learning -- 3 Method -- 3.1 Model Architecture -- 3.2 EM-Like Alternating Training Procedure -- 3.3 Model Parameters Initialization -- 3.4 Rendering Refinement with Blind Face Restoration -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Zero-Shot Attribute Separation from Single Image -- 4.3 Comparisons -- 4.4 Ablation Study -- 5 Conclusion -- 5.1 Limitation -- References -- Explore and Enhance the Generalization of Anomaly DeepFake Detection -- 1 Introduction -- 2 Related Work -- 2.1 Conventional DeepFake Detection -- 2.2 Anomaly DeepFake Detection -- 3 Approach -- 3.1 Overview -- 3.2 Review and Exploration of ADFD -- 3.3 Boundary Blur Mask Generator -- 3.4 Noise Refinement Strategy -- 3.5 Algorithm -- 4 Experiments -- 4.1 Experiments Setting -- 4.2 Exploration Experiments of ADFD Methods -- 4.3 Comparison Experiments -- 5 Conclusion -- References -- Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment -- 1 Introduction -- 2 Related Work -- 2.1 Image Quality Assessment -- 2.2 Face Image Quality Assessment -- 3 Method -- 3.1 Recognition-Oriented Non-reference Quality Measurement -- 3.2 Tiny Face Quality Network -- 3.3 Generating Training Dataset with Quality Labels -- 3.4 Data Sampling and Augmentation Strategy for Balancing the Distribution of Scores -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Datasets and Protocols -- 4.3 Visualization of Different FIQA Methods -- 4.4 Memory and Computation Costs -- 4.5 Quantitative Evaluation on IJB-B and IJB-C Datasets -- 4.6 Quantitative Evaluation on YTF Dataset.
4.7 Ablation Studies -- 5 Conclusion -- References -- Face Expression Recognition via Product-Cross Dual Attention and Neutral-Aware Anchor Loss -- 1 Introduction -- 2 Related Work -- 2.1 Landmark -- 2.2 Transformer in FER -- 2.3 Losses Used in FER -- 3 Our Method -- 3.1 Product-Cross Dual Attention Module -- 3.2 Neutral Expression Aware Anchor Loss -- 3.3 Total Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with the State-of-the-Art Methods -- 4.5 Comparison on Number of Parameters and Running Performance -- 5 Conclusion -- References -- Image Generation and Enhancement -- Deformable CNN with Position Encoding for Arbitrary-Scale Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Implicit Neural Representation -- 2.2 Single Image Super-Resolution (SISR) -- 2.3 Arbitrary-Scale Super-Resolution -- 3 Methods -- 3.1 Deformable Feature Unfolding (DFU) -- 3.2 Fusion with Learned Position Encoding (FPE) -- 3.3 Deep ResMLP -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Detail -- 4.3 Evaluation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Single-Video Temporal Consistency Enhancement with Rolling Guidance -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Consistency for Specific Tasks -- 2.2 Blind Video Temporal Consistency -- 2.3 Spatial Smoothing Filters and Rolling Guidance -- 3 Method -- 3.1 Overview -- 3.2 Constructing Coarse Guidance Video -- 3.3 Recovering Image Details -- 3.4 Global Refinement -- 3.5 Comparison with the Deflickering Algorithm -- 4 Experiment -- 4.1 Dataset -- 4.2 Quality Assessment -- 4.3 Comparison to State-of-the-Art Methods -- 4.4 Ablation Study -- 5 Discussion and Conclusion -- References -- GTLayout: Learning General Trees for Structured Grid Layout Generation -- 1 Introduction -- 2 Related Work -- 3 Method. 
3.1 Structural Layout Representation -- 3.2 Generative Model for Structured Grid Layouts -- 3.3 Training -- 4 Evaluation -- 4.1 Layout Generation -- 4.2 Layout Reconstruction -- 4.3 Layout Interpolation -- 5 Conclusion -- References -- Image Understanding -- Silhouette-Based 6D Object Pose Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Methods -- 2.2 Methods with Deep Learning -- 3 The Method -- 3.1 Problem Formulation and Notation -- 3.2 Dimensionality Reduction -- 3.3 Optimized Particle Swarm Optimization -- 4 Experiments -- 4.1 Experiments Setup -- 4.2 Comparison to State of the Art -- 4.3 Performance on YCB-V-NT and TR-RW -- 4.4 Silhouette Stability Experiments -- 4.5 Ablation Study on YCB-V -- 5 Conclusion and Outlook -- References -- Robust Light Field Depth Estimation over Occluded and Specular Regions -- 1 Introduction -- 2 Related Work -- 3 The Depth Estimation -- 3.1 Consistency Data and Confidence -- 3.2 NPCR Depth Estimation -- 3.3 Depth Refinement -- 4 Experiment -- 4.1 Occlusion Processing Comparisons -- 4.2 Specular Regions Processing -- 4.3 Depth Map -- 4.4 Computational Time -- 5 Conclusion and Limitation -- References -- Foreground and Background Separate Adaptive Equilibrium Gradients Loss for Long-Tail Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Long-Tail Object Detection -- 3 Methodology -- 3.1 Revisiting Sigmoid Cross-Entropy Loss -- 3.2 Foreground and Background Separate Adaptive Equilibrium Gradients Loss -- 4 Experiments on LVIS -- 4.1 Datasets and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Generalization on Stronger Models -- 4.5 Performance Analysis -- 4.6 Comparison with State-of-the-Art Methods -- 4.7 Evaluation on COCO-LT -- 4.8 Result Visualization -- 5 Conclusion -- References -- Stylization. 
Multi-level Patch Transformer for Style Transfer with Single Reference Image -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Multi-level Patch Transformer Encoder -- 3.2 Dynamic Filter-Based Decoder -- 3.3 Loss Functions -- 4 Experiments and Evaluations -- 4.1 Implementation Details -- 4.2 Qualitative Evaluation -- 4.3 Ablation Study -- 4.4 User Study -- 4.5 Quantitative Evaluations -- 4.6 Discussion CycleTransformer vs CycleGAN -- 5 Conclusion and Future Work -- References -- Palette-Based Content-Aware Image Recoloring -- 1 Introduction -- 2 Related Works -- 2.1 Palette-Based Image Recoloring -- 2.2 Edit Propagation (Stroke-Based Image Recoloring) -- 2.3 Style Transfer (Example-Based Image Recoloring) -- 3 Method -- 3.1 Overview -- 3.2 Palette Extraction -- 3.3 Content-Aware Recoloring -- 4 Experiments -- 4.1 Results -- 4.2 Evaluation -- 4.3 Comparisons -- 5 Conclusion, Limitation and Future Work -- References -- FreeStyler: A Free-Form Stylization Method via Multimodal Vector Quantization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Vector Quantization Framework -- 3.2 Pseudo-Paired Token Predictor -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative Results -- 4.3 Quantitative Results -- 4.4 Ablation Study -- 4.5 Applications -- 5 Limitations and Future Work -- 6 Conclusion -- References -- Vision Meets Graphics -- Denoised Dual-Level Contrastive Network for Weakly-Supervised Temporal Sentence Grounding -- 1 Introduction -- 2 Related Work -- 2.1 Weakly-Supervised Temporal Sentence Grounding -- 2.2 Contrastive Representation Learning -- 3 The Proposed Method -- 3.1 Problem Formulation -- 3.2 Visual-Text Feature Extraction -- 3.3 Gaussian-Based Proposal Generation -- 3.4 Intra-video Contrastive Learning -- 3.5 Inter-video Contrastive Learning -- 3.6 Pseudo-Label Noise Removal -- 3.7 Training and Inference. 
4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metric -- 4.3 Implementation Details -- 4.4 Comparisons with State-of-the-Art Methods -- 4.5 Ablation Study and Analysis -- 4.6 Qualitative Results -- 5 Conclusion -- References -- Isolation and Integration: A Strong Pre-trained Model-Based Paradigm for Class-Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Setting -- 3.2 A Simple Baseline -- 3.3 Dynamically Adaption and Aggregation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with State of the Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- Object Category-Based Visual Dialog for Effective Question Generation -- 1 Introduction -- 2 Related Work -- 3 Model -- 3.1 Object Information Extraction -- 3.2 Category Selection -- 3.3 Object Fusion Feature Update -- 3.4 Object-Self Difference Attention Module -- 3.5 Question Decoder -- 3.6 Object-Level Attention Update -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Experiment Settings -- 4.4 Results -- 5 Conclusions -- References -- AST: An Attention-Guided Segment Transformer for Drone-Based Cross-View Geo-Localization -- 1 Introduction -- 2 Related Work -- 2.1 Image-Based Cross-View Geo-Localization -- 2.2 Vision Transformer -- 3 Proposed Method -- 3.1 Problem Formulation -- 3.2 Vision Transformer for Cross-View Geo-Localization -- 3.3 Attention-Guided Segment Tokens -- 3.4 Loss Function and Training Strategy -- 4 Experiment -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Existing Methods -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Improved YOLOv5 Algorithm for Small Object Detection in Drone Images -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Small Object Detection -- 2.3 YOLOv5 -- 3 HTH-YOLOv5 -- 3.1 Hybrid Transformer Head -- 3.2 Convolutional Attention Feature Fusion Module. |
Record Nr. | UNISA-996589544103316 |
Zhang Fang-Lue |
Singapore : Springer Nature Singapore : Imprint: Springer, 2024 |
Available at: Univ. di Salerno |
|
SIGGRAPH Asia 2013 posters |
Publication/distribution/printing | [Place of publication not identified] : ACM, 2013 |
Physical description | 1 online resource (41 pages) |
Series | ACM Conferences |
Topical subject |
Engineering & Applied Sciences
Technology - General |
ISBN | 1-4503-2634-X |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Other variant titles | Special Interest Group on Computer Graphics and Interactive Techniques Asia 2013 posters |
Record Nr. | UNINA-9910375727103321 |
Available at: Univ. Federico II |
|
SIGGRAPH Asia 2013 technical briefs |
Publication/distribution/printing | [Place of publication not identified] : ACM, 2013 |
Physical description | 1 online resource (135 pages) |
Series | ACM Conferences |
Topical subject |
Engineering & Applied Sciences
Technology - General |
ISBN | 1-4503-2629-3 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Other variant titles | Special Interest Group on Computer Graphics and Interactive Techniques Asia 2013 technical briefs |
Record Nr. | UNINA-9910375727003321 |
Available at: Univ. Federico II |
|