Advances in Visual Computing : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16-18, 2023, Proceedings, Part I
Author Bebis George
Edition [1st ed.]
Publication/distribution/printing Cham : Springer, 2024
Physical description 1 online resource (630 pages)
Other authors (Persons) Ghiasi Golnaz
Fang Yi
Sharf Andrei
Dong Yue
Weaver Chris
Leo Zhicheng
LaViola Jr Joseph J
Kohli Luv
Series Lecture Notes in Computer Science Series
ISBN 3-031-47969-6
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Keynote Talks -- Machine Learning for Scientific Data Analysis and Visualization -- Estimating the Structure and Motion of Biomolecules at Atomic Resolutions -- Curriculum Learning and Active Learning, for Visual Object Recognition when Data is Scarce -- Have We Solved Image Correspondences? -- Visual Content Manipulation by Learning Generative Models -- Lights, Camera, Animation! Adaptive Simulation Methods for Training and Entertainment -- Beyond the Specs: A Computational and Human-Centered Approach to Wearability in AR/VR -- Contents - Part I -- Contents - Part II -- ST: Biomedical Image Analysis Techniques for Cancer Detection, Diagnosis and Management -- Hybrid Region and Pixel-Level Adaptive Loss for Mass Segmentation on Whole Mammography Images -- 1 Introduction -- 2 Related Work -- 2.1 Mass Segmentation on Whole Mammograms -- 2.2 Loss for Medical Image Segmentation -- 3 Methodology -- 3.1 Hybrid Pixel-Level Loss -- 3.2 Hybrid Region-Level Loss -- 3.3 Density-Adaptive Sample-Level Prioritizing Loss -- 4 Experimental Results -- 4.1 Datasets -- 4.2 Evaluation Metrics -- 4.3 Comparison with State-of-the-Art Methods -- 5 Conclusion -- References -- Deep Learning Based GABA Edited-MRS Signal Reconstruction -- 1 Introduction -- 2 Methods -- 2.1 Dataset -- 2.2 J-Difference Spectrum -- 2.3 Dual Branch Self-Attention Neural Network -- 2.4 Evaluation Metrics -- 3 Results and Discussion -- 4 Conclusion -- References -- Investigating the Impact of Attention on Mammogram Classification -- 1 Introduction -- 2 Data and Methods -- 2.1 Data Selection and Preprocessing -- 2.2 Selection of Models -- 2.3 Selection of Attention Methods -- 2.4 Training and Testing Process -- 3 Results and Discussion -- 3.1 Impact of Attention on CNN Performance -- 3.2 Impact of Model Architecture on Performance Differences.
3.3 Impact of Attention on Resolution -- 3.4 Impact of Attention on Abnormality Type -- 3.5 Relationship Between Model Activation and AU-ROC -- 4 Conclusions -- References -- ReFit: A Framework for Refinement of Weakly Supervised Semantic Segmentation Using Object Border Fitting for Medical Images -- 1 Introduction -- 2 Our ReFit Framework -- 2.1 Unsupervised Segment Detection -- 2.2 Class Activation Map - CAM -- 2.3 The BoundaryFit Module -- 3 Results and Discussion -- 3.1 Ablation Studies -- 4 Conclusion -- References -- A Data-Centric Approach for Pectoral Muscle Deep Learning Segmentation Enhancements in Mammography Images -- 1 Introduction -- 2 Related Work -- 3 Mammography Segmentation -- 3.1 Dataset -- 3.2 Model Training -- 3.3 Drawbacks -- 4 Data-Centric Model Optimization -- 4.1 Stage I: Annotation Correction -- 4.2 Stage II: Downsampling -- 5 Results -- 5.1 Evaluation Metrics -- 5.2 Evaluated Training Datasets -- 5.3 Intersection over Union Evaluation -- 5.4 Classification Metrics for Pectoral Muscle Detection in CC View -- 6 Conclusion -- References -- Visualization -- Visualizing Multimodal Time Series at Scale -- 1 Introduction -- 2 Related Work -- 3 Overview Scenario -- 4 Detail Methods and Implementation -- 4.1 Time Series Dataset -- 4.2 Exploiting Elasticsearch for Fast Search and Big Query -- 4.3 Visualizing Time Series -- 5 Exploring UMAFall Dataset with TimeXplore -- 6 Conclusions and Future Work -- References -- Hybrid Tree Visualizations for Analysis of Gerrymandering -- 1 Introduction -- 2 Related Work -- 3 Gerrymandering -- 4 Data Model in Gerrymandering -- 5 Visual Design -- 6 Analysis Examples -- 6.1 Evaluating the Efficiency Gap -- 6.2 Assessing Electoral Competition -- 7 Conclusion -- References -- ArcheryVis: A Tool for Analyzing and Visualizing Archery Performance Data -- 1 Introduction -- 2 Related Work.
2.1 Archery Performance Analysis -- 2.2 Archery Scoring Apps -- 3 Data Collection, Processing, and Analysis -- 3.1 Data Collection -- 3.2 Ring and Center Detection -- 3.3 Shot Detection and Calibration -- 3.4 Scoring and Statistical Measures -- 4 Visual Interface and Interaction -- 5 Results and Discussion -- 5.1 Brushing and Filtering -- 5.2 Trainee Comparison -- 5.3 Statistical Measure as Performance Indicator -- 5.4 Empirical Evaluation -- 5.5 Limitations -- 6 Conclusions and Future Work -- References -- Spiro: Order-Preserving Visualization in High Performance Computing Monitoring -- 1 Introduction -- 2 Related Work -- 2.1 Spiral Layout in Visualization -- 2.2 Monitoring with Spiral Layout -- 3 Monitoring Tasks -- 4 Spiro Design -- 4.1 Design Rationales -- 4.2 Visual Encoding -- 5 Case Studies -- 5.1 Clustering on Compute Servers -- 5.2 Exploring Usage Behavior -- 6 Conclusion and Future Work -- References -- From Faces to Volumes - Measuring Volumetric Asymmetry in 3D Facial Palsy Scans -- 1 Introduction -- 2 Related Work -- 3 Data Acquisition -- 4 Methods -- 4.1 3D Landmark Extraction for Facial Palsy Patients -- 4.2 Radial Curves -- 4.3 Lateral Face Mesh Generation -- 4.4 Volume Estimation for Lateral Face Sides -- 4.5 Volumetric Difference Visualization -- 5 Volume Analysis During Dynamic Movements -- 6 Conclusions and Future Work -- References -- Video Analysis and Event Recognition -- Comparison of Autoencoder Models for Unsupervised Representation Learning of Skeleton Sequences -- 1 Introduction -- 2 Related Work -- 3 Methods -- 3.1 Proposed Methods -- 4 Experiments -- 4.1 Datasets -- 4.2 Results Analysis and Comparisons -- 5 Conclusion and Future Works -- References -- Local and Global Context Reasoning for Spatio-Temporal Action Localization -- 1 Introduction -- 2 Related Works -- 3 Proposed Method -- 3.1 Overall Pipeline.
3.2 Near-Actor Relation Network -- 4 Experiments on JHMDB21 -- 4.1 Implementation Details -- 4.2 Comparison on JHMDB21 -- 4.3 Ablation Study -- 4.4 Qualitative Results -- 5 Experiments on AVA -- 5.1 Implementation Details -- 5.2 Comparison on AVA -- 6 Conclusion -- References -- Zero-Shot Video Moment Retrieval Using BLIP-Based Models -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Computing Image and Text Embeddings -- 3.2 Sparse Frame-Sampling Strategies -- 3.3 Moment-Query Matching -- 4 Experiments -- 5 Results and Discussion -- 6 Conclusions and Future Work -- References -- Self-supervised Representation Learning for Fine Grained Human Hand Action Recognition in Industrial Assembly Lines -- 1 Introduction -- 2 Related Work -- 3 Proposed Method -- 3.1 Model Architecture -- 3.2 Masking Method -- 4 Experiments -- 4.1 Datasets -- 4.2 Model Training Environment -- 4.3 Self-supervised Pretraining and Downstream Task -- 5 Results and Analysis -- 5.1 Results Self-supervised Learning -- 5.2 Results Downstream Task -- 5.3 Analysis -- 6 Conclusion and Outlook -- References -- ST: Innovations in Computer Vision & Machine Learning for Critical & Civil Infrastructures -- Pretext Tasks in Bridge Defect Segmentation Within a ViT-Adapter Framework -- 1 Introduction -- 2 Methods -- 2.1 ViT-Adapter Model -- 2.2 Datasets -- 2.3 Supervised Learning (SL) Pre-training -- 2.4 Self- And Semi-Supervised Learning (SSL) Pre-training -- 2.5 Training Parameters -- 3 Results and Discussion -- 4 Conclusion -- References -- A Few-Shot Attention Recurrent Residual U-Net for Crack Segmentation -- 1 Introduction -- 1.1 Current Limitations and Our Contribution -- 2 Proposed Architecture -- 2.1 R2AU-Net Architecture for Road Crack Segmentation -- 2.2 Few-Shot Learning for Segmentation Refinement -- 3 Experimental Setup and Results -- 3.1 Dataset Description.
3.2 Comparative Algorithms and Training Configuration -- 3.3 Experiments and Comparisons -- 4 Conclusions -- References -- Efficient Resource Provisioning in Critical Infrastructures Based on Multi-Agent Rollout Enabled by Deep Q-Learning -- 1 Introduction -- 2 Related Work -- 3 Workload Management in Critical Infrastructures -- 3.1 Infrastructure Model -- 3.2 Problem Formulation -- 3.3 Deterministic Markov Decision Process Model -- 3.4 Multi-Agent Rollout Enabled by Deep Q-Learning -- 4 Simulation Experiments -- 4.1 Experimental Setup -- 4.2 Evaluation Results -- 5 Conclusions -- References -- Video-Based Recognition of Aquatic Invasive Species Larvae Using Attention-LSTM Transformer -- 1 Introduction -- 1.1 Attention-LSTM -- 2 Related Work -- 3 Proposed Method -- 3.1 Model Architecture -- 3.2 Attention-LSTM Layer -- 3.3 Model Variations -- 4 Invasive Species Dataset -- 5 Empirical Evaluation -- 6 Conclusion -- References -- ST: Generalization in Visual Machine Learning -- Latent Space Navigation for Face Privacy: A Case Study on the MNIST Dataset -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 4 Experimental Result -- 5 Future Work -- 6 Conclusion -- References -- Domain Generalization for Foreground Segmentation Using Federated Learning -- 1 Introduction -- 2 Related Work -- 3 Proposed Work -- 3.1 Model Architecture -- 3.2 Training Technique -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Traditional Foreground Segmentation Experiment -- 4.4 Domain Generalization Experiment -- 4.5 Few-Shot Experiment -- 5 Conclusion and Future Work -- References -- Probabilistic Local Equivalence Certification for Robustness Evaluation -- 1 Introduction -- 2 Related Work -- 3 Probabilistic Local Equivalence Certification -- 3.1 Probabilistic Local Equivalence Certification -- 3.2 When Labels are Available.
3.3 The Case of Classification.
Record No. UNISA-996565867203316
Available at: Univ. di Salerno
Record No. UNINA-9910767585603321
Available at: Univ. Federico II
Advances in Visual Computing [electronic resource] : 18th International Symposium, ISVC 2023, Lake Tahoe, NV, USA, October 16–18, 2023, Proceedings, Part II / edited by George Bebis, Golnaz Ghiasi, Yi Fang, Andrei Sharf, Yue Dong, Chris Weaver, Zhicheng Leo, Joseph J. LaViola Jr., Luv Kohli
Author Bebis George
Edition [1st ed. 2023.]
Publication/distribution/printing Cham : Springer Nature Switzerland : Imprint: Springer, 2023
Physical description 1 online resource (506 pages)
Discipline 006
Other authors (Persons) Ghiasi Golnaz
Fang Yi
Sharf Andrei
Dong Yue
Weaver Chris
Leo Zhicheng
LaViola Jr Joseph J
Kohli Luv
Series Lecture Notes in Computer Science
Topical subject Image processing - Digital techniques
Computer vision
Computer Imaging, Vision, Pattern Recognition and Graphics
ISBN 3-031-47966-1
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Virtual Reality -- A Pilot Study Comparing User Interactions Between Augmented and Virtual Reality -- Synthesizing Play-Ready VR Scenes with Natural Language Prompts through GPT API -- Emergent Individual Factors for AR Education and Training -- Segmentation -- ISLE: A Framework for Image Level Semantic Segmentation Ensemble -- Particulate Mapping Centerline Extraction (PMCE), a Novel Centerline Extraction Algorithm Based on Patterns in the Spatial Distribution of Aggregates -- Evaluating Segmentation Approaches on Digitized Herbarium Specimens -- Semantic Scene Filtering for Event Cameras in Long-Term Outdoor Monitoring Scenarios -- SODAWideNet - Salient Object Detection with an Attention augmented Wide Encoder Decoder network without ImageNet pre-training -- Applications -- Foil-Net: Deep Learning-Based Wave Classification for Hydrofoil Surfing -- Inpainting of Depth Images using Deep Neural Networks for Real-Time Applications -- Using 2D and 3D Face Representations to Generate Comprehensive Facial Electromyography Intensity Maps -- Real-world Image Deblurring via Unsupervised Domain Adaptation -- Object Detection and Recognition -- Reliable Matching by Combining Optimal Color and Intensity Information based on Relationships between Target and Surrounding Objects -- Regularized Meta-Training with Embedding Mixup for Improved Few-Shot Learning -- Visual Foreign Object Detection for Wireless Charging of Electric Vehicles -- Deep Representation Learning for License Plate Recognition in Low Quality Video Images -- Optimizing PnP-Algorithms for Limited Point Correspondences Using Spatial Constraints -- Deep Learning -- Unsupervised Deep-Learning Approach for Underwater Image Enhancement -- LaneNet++: Uncertainty-aware Lane Detection for Autonomous Vehicle -- Task-driven Compression for Collision Encoding based on Depth Images -- Eigenpatches - Adversarial Patches from Principal Components -- Edge-guided Image Inpainting with Transformer -- Poster --
Bayesian Fusion inspired 3D reconstruction via LiDAR-Stereo Camera Pair -- Marimba Mallet Placement Tracker -- DINO-CXR: A Self Supervised Method Based on Vision Transformer for Chest X-Ray Classification -- Social Bias and Image Tagging: Evaluation of Progress in State-of-the-Art Models -- L-TReiD: Logic Tensor Transformer for Re-Identification -- Retinal Disease Diagnosis with a Hybrid ResNet50-LSTM Deep Learning Model -- Pothole Segmentation and Area Estimation with Deep Neural Networks and Unmanned Aerial Vehicles -- Generation method of robot assembly motion considering physicality gap between humans and robots -- A Self-Supervised Pose Estimation Approach for Construction Machines -- Image Quality Improvement of Surveillance Camera Images by Learning Noise Removal Method Using Noise2Noise -- Automating Kernel Size Selection in MRI Reconstruction via a Transparent and Interpretable Search Approach -- Segmentation and Identification of Mediterranean Plant Species -- Exploiting Generative Adversarial Networks in Joint Sensitivity Encoding for Enhanced MRI Reconstruction -- Multisensory Modeling of Tabular Data for Enhanced Perception and Immersive User Experience -- Coping with Bullying Incidents by the Narrative and Multi-modal Interaction in Virtual Reality.
Record No. UNINA-9910767583103321
Available at: Univ. Federico II
Record No. UNISA-996574260103316
Available at: Univ. di Salerno
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part I / edited by Fang-Lue Zhang, Andrei Sharf
Author Zhang Fang-Lue
Edition [1st ed. 2024.]
Publication/distribution/printing Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (331 pages)
Discipline 006.37
Other authors (Persons) Sharf Andrei
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence
ISBN 981-9720-95-8
Format Printed material
Bibliographic level Monograph
Language of publication eng
Contents note Intro -- Preface -- Organization -- Contents - Part I -- Contents - Part II -- Reconstruction and Modelling -- PIFu for the Real World: A Self-supervised Framework to Reconstruct Dressed Human from Single-View Images -- 1 Introduction -- 2 Related Work -- 2.1 Single-View Human Reconstruction -- 2.2 Single-View Depth Estimation -- 2.3 Self-supervised 3D Reconstruction -- 3 Method -- 3.1 Normal and Depth Estimation -- 3.2 SDF-Based Pixel-Aligned Implicit Function from Depth -- 3.3 Depth-Guided Self-supervised Learning -- 4 Experiments -- 4.1 Datasets, Metrics, and Implementation Details -- 4.2 Evaluations -- 4.3 Comparison with the State-of-the-Art -- 5 Conclusion -- References -- Sketchformer++: A Hierarchical Transformer Architecture for Vector Sketch Representation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Data Representation -- 3.2 Hierarchical Transformer Architecture -- 3.3 Training -- 4 Experiments -- 4.1 Sketch Reconstruction -- 4.2 Sketch Recognition -- 4.3 Sketch Semantic Segmentation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Leveraging Panoptic Prior for 3D Zero-Shot Semantic Understanding Within Language Embedded Radiance Fields -- 1 Introduction -- 2 Related Works -- 2.1 NeRF with Semantics -- 2.2 Panoptic Segmentation -- 2.3 Open-Vocabulary Object Detection -- 2.4 Zero-Shot Learning in 3D -- 2.5 Cross-Modal Knowledge Distillation -- 3 Method -- 3.1 Overview -- 3.2 Field Structure -- 3.3 Semantic Prior Extraction -- 3.4 CLIP Pyramid Reconstruction -- 3.5 Relevancy Evaluation Metric -- 4 Experiments -- 4.1 Settings -- 4.2 Qualitative Results -- 4.3 Ablation Study -- 5 Limitations -- 6 Conclusions -- References -- Multi-Scale Implicit Surface Reconstruction for Outdoor Scenes -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Multi-scale Rendering with SDF Representation -- 3.2 Dynamic Position Encoding.
3.3 Adaptive Sampling Strategy in Image Space -- 3.4 More Details -- 3.5 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative and Quantitative Comparisons -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neural Radiance Fields for Dynamic View Synthesis Using Local Temporal Priors -- 1 Introduction -- 2 Related Work -- 3 Overview -- 4 Dynamic Scene Representation -- 5 Local Temporal NeRF -- 5.1 Local Temporal Module -- 5.2 Loss Functions -- 5.3 Implementation -- 6 Results -- 6.1 Quantitative Evaluation -- 6.2 Qualitative Evaluation -- 6.3 Ablation Study -- 6.4 Additional Comparisons -- 7 Limitations and Discussion -- 8 Conclusion -- References -- Point Cloud -- Point Cloud Segmentation with Guided Sampling and Continuous Interpolation -- 1 Introduction -- 2 Related Work -- 2.1 Point Cloud Learning -- 2.2 Point Cloud Sampling -- 3 Method -- 3.1 Motivation -- 3.2 Guided Sampling -- 3.3 Continuous Interpolation -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Signal Reconstruction -- 4.3 Semantic Segmentation -- 4.4 Object Part Segmentation -- 4.5 Ablation Study -- 5 Conclusion and Discussion -- References -- TopFormer: Topology-Aware Transformer for Point Cloud Registration -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Definition -- 3.2 Local Feature Encoder -- 3.3 Topology-Aware Transformer -- 3.4 Sparse Point Matching -- 3.5 Dense Points Refinement -- 3.6 Loss Function -- 4 Experiments -- 4.1 Implementation -- 4.2 Indoor Scene: 3DMatch -- 4.3 Outdoor Scene Data: KITTI -- 4.4 Ablation Study -- 5 Conclusion -- References -- Adversarial Geometric Transformations of Point Clouds for Physical Attack -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Adversarial Geometric Transformations -- 3.3 Optimization -- 4 Experiments -- 4.1 Dataset and Settings.
4.2 Evaluation on Adversarial Point Clouds -- 4.3 Evaluation on Shape and Physical Attack -- 4.4 Ablation Studies -- 5 Conclusions -- References -- SARNet: Semantic Augmented Registration of Large-Scale Urban Point Clouds -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Feature-Based Registration -- 2.2 Learning-Based Registration -- 2.3 3D Point Feature Learning -- 3 Problem Statement and Overview -- 4 Methodology -- 4.1 Semantic-Based Farthest Point Sampling -- 4.2 Semantic-Augmented Feature Extraction -- 4.3 Semantic-Refined Transformation Estimation -- 4.4 Loss Functions -- 4.5 Implementation Details -- 5 Experimental Results -- 5.1 Experimental Setup -- 5.2 Evaluation Metrics -- 5.3 Comparisons -- 5.4 Ablation Study -- 5.5 Limitations -- 6 Conclusion and Future Work -- References -- Rendering and Animation -- FASSET: Frame Supersampling and Extrapolation Using Implicit Neural Representations of Rendering Contents -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Motivation and Overview -- 3.2 Implicit Neural Representations of Rendering Contents -- 3.3 Frame Feature Extractor -- 3.4 Network Training -- 4 Experiments -- 4.1 Dataset -- 4.2 Baselines and Settings -- 4.3 Analysis of Runtime Performance and Model Efficiency -- 4.4 Ablation Study -- 4.5 Limitation -- 5 Conclusion -- References -- MatTrans: Material Reflectance Property Estimation of Complex Objects with Transformer -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Initial Estimation Network -- 3.2 Refined Estimation Network -- 3.3 Transformer Encoder -- 3.4 Dataset -- 3.5 Training -- 4 Experiments -- 4.1 Ablation Experiment -- 4.2 Generalization to Real Data -- 4.3 Comparison Experiment -- 5 Conclusion -- References -- Improved Text-Driven Human Motion Generation via Out-of-Distribution Detection and Rectification -- 1 Introduction -- 2 Related Work.
3 The Proposed Method -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Experiment Configuration and Training Details -- 4.3 Comparisons Between Different Text-Driven Human Motion Generation Methods -- 4.4 Comparison Between Different Outlier Detection Algorithms -- 4.5 Evaluation of Different Thresholds for Outlier Detection -- 4.6 Ablation Study -- 5 Conclusion -- References -- User Interactions -- BK-Editer: Body-Keeping Text-Conditioned Real Image Editing -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Diffusion Model Training -- 3.2 DDIM Sampling and Inversion -- 3.3 Text Condition and Classifier-Free Guidance -- 3.4 Stable Diffusion Model -- 3.5 Task Setting and the Body-Keeping Problem -- 4 Method -- 4.1 Tuning Stage for Finetuning Network -- 4.2 Inversion Stage for Obtaining BK-Attn Embeddings -- 4.3 Edit Stage with Body-Keeping -- 5 Experiments -- 5.1 Comparisons with Other Concurrent Works -- 5.2 User Study -- 5.3 Ablation Study -- 6 Limitations and Conclusion -- References -- Walking Telescope: Exploring the Zooming Effect in Expanding Detection Threshold Range for Translation Gain -- 1 Introduction -- 2 Related Work -- 2.1 Translation Gain Detection Threshold -- 2.2 Impact of FoV Change on Distance Perception -- 2.3 Impact of Magnified View on Distance Perception -- 3 Method -- 3.1 Translation Gain -- 3.2 Motivation -- 3.3 Verification Experiment -- 4 Main Experiment -- 4.1 Design and Hypotheses -- 4.2 Apparatus -- 4.3 Participants -- 4.4 Procedure -- 5 Results -- 5.1 Direction Thresholds -- 5.2 Simulator Sickness -- 6 Discussion -- 7 Limitation and Future Work -- 8 Conclusion -- References -- A U-Shaped Spatio-Temporal Transformer as Solver for Motion Capture -- 1 Introduction -- 2 Related Work -- 2.1 MoCap Data Clean-Up and Solving -- 2.2 Smoothness -- 2.3 Rotation Representations -- 2.4 Attention Model.
2.5 U-Net Architecture -- 3 Methodology -- 3.1 Problem Formulation -- 3.2 Overall Structure -- 4 Experiments and Evaluation -- 4.1 Experimental Settings -- 4.2 Quantitative and Qualitative Research -- 4.3 Ablation Study -- 5 Limitations and Future Work -- 6 Conclusion -- References -- ROSA-Net: Rotation-Robust Structure-Aware Network for Fine-Grained 3D Shape Retrieval -- 1 Introduction -- 2 Related Work -- 2.1 3D Shape Retrieval -- 2.2 Mesh-Based Representations -- 2.3 Rotation-Invariant Representations -- 3 ROSA-Net -- 3.1 Overview -- 3.2 Geometric Feature Representation -- 3.3 Part Geometry Attention Mechanism -- 3.4 Structural Information Representation -- 3.5 Geometry-Structure Attention Mechanism -- 3.6 Global Feature Encoding -- 3.7 Losses -- 3.8 Model Training and Shape Retrieval -- 4 Experimental Results -- 4.1 ROSA-Dataset -- 4.2 Fine-Grained Shape Retrieval -- 4.3 Weighted Features of Parts by Part-Geo Attention -- 4.4 Weighted Features by Geo-Struct Attention -- 4.5 Using Other Data Representation -- 4.6 Ablation Study -- 5 Conclusion -- References -- Author Index.
Record Nr. UNINA-9910847083303321
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Printed material
Available at: Univ. Federico II
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part I / edited by Fang-Lue Zhang, Andrei Sharf
Author Zhang Fang-Lue
Edition [1st ed. 2024.]
Publication/distribution/printing Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (331 pages)
Discipline 006.37
Other authors (Persons) Sharf Andrei
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence
ISBN 981-9720-95-8
Format Printed material
Bibliographic level Monograph
Publication language eng
Contents note Intro -- Preface -- Organization -- Contents - Part I -- Contents - Part II -- Reconstruction and Modelling -- PIFu for the Real World: A Self-supervised Framework to Reconstruct Dressed Human from Single-View Images -- 1 Introduction -- 2 Related Work -- 2.1 Single-View Human Reconstruction -- 2.2 Single-View Depth Estimation -- 2.3 Self-supervised 3D Reconstruction -- 3 Method -- 3.1 Normal and Depth Estimation -- 3.2 SDF-Based Pixel-Aligned Implicit Function from Depth -- 3.3 Depth-Guided Self-supervised Learning -- 4 Experiments -- 4.1 Datasets, Metrics, and Implementation Details -- 4.2 Evaluations -- 4.3 Comparison with the State-of-the-Art -- 5 Conclusion -- References -- Sketchformer++: A Hierarchical Transformer Architecture for Vector Sketch Representation -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Data Representation -- 3.2 Hierarchical Transformer Architecture -- 3.3 Training -- 4 Experiments -- 4.1 Sketch Reconstruction -- 4.2 Sketch Recognition -- 4.3 Sketch Semantic Segmentation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Leveraging Panoptic Prior for 3D Zero-Shot Semantic Understanding Within Language Embedded Radiance Fields -- 1 Introduction -- 2 Related Works -- 2.1 NeRF with Semantics -- 2.2 Panoptic Segmentation -- 2.3 Open-Vocabulary Object Detection -- 2.4 Zero-Shot Learning in 3D -- 2.5 Cross-Modal Knowledge Distillation -- 3 Method -- 3.1 Overview -- 3.2 Field Structure -- 3.3 Semantic Prior Extraction -- 3.4 CLIP Pyramid Reconstruction -- 3.5 Relevancy Evaluation Metric -- 4 Experiments -- 4.1 Settings -- 4.2 Qualitative Results -- 4.3 Ablation Study -- 5 Limitations -- 6 Conclusions -- References -- Multi-Scale Implicit Surface Reconstruction for Outdoor Scenes -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Multi-scale Rendering with SDF Representation -- 3.2 Dynamic Position Encoding.
3.3 Adaptive Sampling Strategy in Image Space -- 3.4 More Details -- 3.5 Loss Function -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative and Quantitative Comparisons -- 4.3 Ablation Study -- 5 Conclusion -- References -- Neural Radiance Fields for Dynamic View Synthesis Using Local Temporal Priors -- 1 Introduction -- 2 Related Work -- 3 Overview -- 4 Dynamic Scene Representation -- 5 Local Temporal NeRF -- 5.1 Local Temporal Module -- 5.2 Loss Functions -- 5.3 Implementation -- 6 Results -- 6.1 Quantitative Evaluation -- 6.2 Qualitative Evaluation -- 6.3 Ablation Study -- 6.4 Additional Comparisons -- 7 Limitations and Discussion -- 8 Conclusion -- References -- Point Cloud -- Point Cloud Segmentation with Guided Sampling and Continuous Interpolation -- 1 Introduction -- 2 Related Work -- 2.1 Point Cloud Learning -- 2.2 Point Cloud Sampling -- 3 Method -- 3.1 Motivation -- 3.2 Guided Sampling -- 3.3 Continuous Interpolation -- 4 Experiments -- 4.1 Experimental Setup -- 4.2 Signal Reconstruction -- 4.3 Semantic Segmentation -- 4.4 Object Part Segmentation -- 4.5 Ablation Study -- 5 Conclusion and Discussion -- References -- TopFormer: Topology-Aware Transformer for Point Cloud Registration -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Problem Definition -- 3.2 Local Feature Encoder -- 3.3 Topology-Aware Transformer -- 3.4 Sparse Point Matching -- 3.5 Dense Points Refinement -- 3.6 Loss Function -- 4 Experiments -- 4.1 Implementation -- 4.2 Indoor Scene: 3DMatch -- 4.3 Outdoor Scene Data: KITTI -- 4.4 Ablation Study -- 5 Conclusion -- References -- Adversarial Geometric Transformations of Point Clouds for Physical Attack -- 1 Introduction -- 2 Related Works -- 3 Methodology -- 3.1 Preliminaries -- 3.2 Adversarial Geometric Transformations -- 3.3 Optimization -- 4 Experiments -- 4.1 Dataset and Settings.
4.2 Evaluation on Adversarial Point Clouds -- 4.3 Evaluation on Shape and Physical Attack -- 4.4 Ablation Studies -- 5 Conclusions -- References -- SARNet: Semantic Augmented Registration of Large-Scale Urban Point Clouds -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Feature-Based Registration -- 2.2 Learning-Based Registration -- 2.3 3D Point Feature Learning -- 3 Problem Statement and Overview -- 4 Methodology -- 4.1 Semantic-Based Farthest Point Sampling -- 4.2 Semantic-Augmented Feature Extraction -- 4.3 Semantic-Refined Transformation Estimation -- 4.4 Loss Functions -- 4.5 Implementation Details -- 5 Experimental Results -- 5.1 Experimental Setup -- 5.2 Evaluation Metrics -- 5.3 Comparisons -- 5.4 Ablation Study -- 5.5 Limitations -- 6 Conclusion and Future Work -- References -- Rendering and Animation -- FASSET: Frame Supersampling and Extrapolation Using Implicit Neural Representations of Rendering Contents -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Motivation and Overview -- 3.2 Implicit Neural Representations of Rendering Contents -- 3.3 Frame Feature Extractor -- 3.4 Network Training -- 4 Experiments -- 4.1 Dataset -- 4.2 Baselines and Settings -- 4.3 Analysis of Runtime Performance and Model Efficiency -- 4.4 Ablation Study -- 4.5 Limitation -- 5 Conclusion -- References -- MatTrans: Material Reflectance Property Estimation of Complex Objects with Transformer -- 1 Introduction -- 2 Related Works -- 3 Method -- 3.1 Initial Estimation Network -- 3.2 Refined Estimation Network -- 3.3 Transformer Encoder -- 3.4 Dataset -- 3.5 Training -- 4 Experiments -- 4.1 Ablation Experiment -- 4.2 Generalization to Real Data -- 4.3 Comparison Experiment -- 5 Conclusion -- References -- Improved Text-Driven Human Motion Generation via Out-of-Distribution Detection and Rectification -- 1 Introduction -- 2 Related Work.
3 The Proposed Method -- 4 Experiments -- 4.1 Dataset and Evaluation Metrics -- 4.2 Experiment Configuration and Training Details -- 4.3 Comparisons Between Different Text-Driven Human Motion Generation Methods -- 4.4 Comparison Between Different Outlier Detection Algorithms -- 4.5 Evaluation of Different Thresholds for Outlier Detection -- 4.6 Ablation Study -- 5 Conclusion -- References -- User Interactions -- BK-Editer: Body-Keeping Text-Conditioned Real Image Editing -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Diffusion Model Training -- 3.2 DDIM Sampling and Inversion -- 3.3 Text Condition and Classifier-Free Guidance -- 3.4 Stable Diffusion Model -- 3.5 Task Setting and the Body-Keeping Problem -- 4 Method -- 4.1 Tuning Stage for Finetuning Network -- 4.2 Inversion Stage for Obtaining BK-Attn Embeddings -- 4.3 Edit Stage with Body-Keeping -- 5 Experiments -- 5.1 Comparisons with Other Concurrent Works -- 5.2 User Study -- 5.3 Ablation Study -- 6 Limitations and Conclusion -- References -- Walking Telescope: Exploring the Zooming Effect in Expanding Detection Threshold Range for Translation Gain -- 1 Introduction -- 2 Related Work -- 2.1 Translation Gain Detection Threshold -- 2.2 Impact of FoV Change on Distance Perception -- 2.3 Impact of Magnified View on Distance Perception -- 3 Method -- 3.1 Translation Gain -- 3.2 Motivation -- 3.3 Verification Experiment -- 4 Main Experiment -- 4.1 Design and Hypotheses -- 4.2 Apparatus -- 4.3 Participants -- 4.4 Procedure -- 5 Results -- 5.1 Direction Thresholds -- 5.2 Simulator Sickness -- 6 Discussion -- 7 Limitation and Future Work -- 8 Conclusion -- References -- A U-Shaped Spatio-Temporal Transformer as Solver for Motion Capture -- 1 Introduction -- 2 Related Work -- 2.1 MoCap Data Clean-Up and Solving -- 2.2 Smoothness -- 2.3 Rotation Representations -- 2.4 Attention Model.
2.5 U-Net Architecture -- 3 Methodology -- 3.1 Problem Formulation -- 3.2 Overall Structure -- 4 Experiments and Evaluation -- 4.1 Experimental Settings -- 4.2 Quantitative and Qualitative Research -- 4.3 Ablation Study -- 5 Limitations and Future Work -- 6 Conclusion -- References -- ROSA-Net: Rotation-Robust Structure-Aware Network for Fine-Grained 3D Shape Retrieval -- 1 Introduction -- 2 Related Work -- 2.1 3D Shape Retrieval -- 2.2 Mesh-Based Representations -- 2.3 Rotation-Invariant Representations -- 3 ROSA-Net -- 3.1 Overview -- 3.2 Geometric Feature Representation -- 3.3 Part Geometry Attention Mechanism -- 3.4 Structural Information Representation -- 3.5 Geometry-Structure Attention Mechanism -- 3.6 Global Feature Encoding -- 3.7 Losses -- 3.8 Model Training and Shape Retrieval -- 4 Experimental Results -- 4.1 ROSA-Dataset -- 4.2 Fine-Grained Shape Retrieval -- 4.3 Weighted Features of Parts by Part-Geo Attention -- 4.4 Weighted Features by Geo-Struct Attention -- 4.5 Using Other Data Representation -- 4.6 Ablation Study -- 5 Conclusion -- References -- Author Index.
Record Nr. UNISA-996589543503316
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Printed material
Available at: Univ. di Salerno
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part II / edited by Fang-Lue Zhang, Andrei Sharf
Author Zhang Fang-Lue
Edition [1st ed. 2024.]
Publication/distribution/printing Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (384 pages)
Discipline 006.37
Other authors (Persons) Sharf Andrei
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence
ISBN 981-9720-92-3
Format Printed material
Bibliographic level Monograph
Publication language eng
Contents note Intro -- Preface -- Organization -- Contents - Part II -- Contents - Part I -- Facial Images -- Zero-Shot Real Facial Attribute Separation and Transfer at Novel Views -- 1 Introduction -- 2 Related Works -- 2.1 Explicit Face Morphable Models -- 2.2 3D-Aware Implicit Models -- 2.3 Disentanglement Representation Learning -- 3 Method -- 3.1 Model Architecture -- 3.2 EM-Like Alternating Training Procedure -- 3.3 Model Parameters Initialization -- 3.4 Rendering Refinement with Blind Face Restoration -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Zero-Shot Attribute Separation from Single Image -- 4.3 Comparisons -- 4.4 Ablation Study -- 5 Conclusion -- 5.1 Limitation -- References -- Explore and Enhance the Generalization of Anomaly DeepFake Detection -- 1 Introduction -- 2 Related Work -- 2.1 Conventional DeepFake Detection -- 2.2 Anomaly DeepFake Detection -- 3 Approach -- 3.1 Overview -- 3.2 Review and Exploration of ADFD -- 3.3 Boundary Blur Mask Generator -- 3.4 Noise Refinement Strategy -- 3.5 Algorithm -- 4 Experiments -- 4.1 Experiments Setting -- 4.2 Exploration Experiments of ADFD Methods -- 4.3 Comparison Experiments -- 5 Conclusion -- References -- Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment -- 1 Introduction -- 2 Related Work -- 2.1 Image Quality Assessment -- 2.2 Face Image Quality Assessment -- 3 Method -- 3.1 Recognition-Oriented Non-reference Quality Measurement -- 3.2 Tiny Face Quality Network -- 3.3 Generating Training Dataset with Quality Labels -- 3.4 Data Sampling and Augmentation Strategy for Balancing the Distribution of Scores -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Datasets and Protocols -- 4.3 Visualization of Different FIQA Methods -- 4.4 Memory and Computation Costs -- 4.5 Quantitative Evaluation on IJB-B and IJB-C Datasets -- 4.6 Quantitative Evaluation on YTF Dataset.
4.7 Ablation Studies -- 5 Conclusion -- References -- Face Expression Recognition via Product-Cross Dual Attention and Neutral-Aware Anchor Loss -- 1 Introduction -- 2 Related Work -- 2.1 Landmark -- 2.2 Transformer in FER -- 2.3 Losses Used in FER -- 3 Our Method -- 3.1 Product-Cross Dual Attention Module -- 3.2 Neutral Expression Aware Anchor Loss -- 3.3 Total Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with the State-of-the-Art Methods -- 4.5 Comparison on Number of Parameters and Running Performance -- 5 Conclusion -- References -- Image Generation and Enhancement -- Deformable CNN with Position Encoding for Arbitrary-Scale Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Implicit Neural Representation -- 2.2 Single Image Super-Resolution (SISR) -- 2.3 Arbitrary-Scale Super-Resolution -- 3 Methods -- 3.1 Deformable Feature Unfolding (DFU) -- 3.2 Fusion with Learned Position Encoding (FPE) -- 3.3 Deep ResMLP -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Detail -- 4.3 Evaluation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Single-Video Temporal Consistency Enhancement with Rolling Guidance -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Consistency for Specific Tasks -- 2.2 Blind Video Temporal Consistency -- 2.3 Spatial Smoothing Filters and Rolling Guidance -- 3 Method -- 3.1 Overview -- 3.2 Constructing Coarse Guidance Video -- 3.3 Recovering Image Details -- 3.4 Global Refinement -- 3.5 Comparison with the Deflickering Algorithm -- 4 Experiment -- 4.1 Dataset -- 4.2 Quality Assessment -- 4.3 Comparison to State-of-the-Art Methods -- 4.4 Ablation Study -- 5 Discussion and Conclusion -- References -- GTLayout: Learning General Trees for Structured Grid Layout Generation -- 1 Introduction -- 2 Related Work -- 3 Method.
3.1 Structural Layout Representation -- 3.2 Generative Model for Structured Grid Layouts -- 3.3 Training -- 4 Evaluation -- 4.1 Layout Generation -- 4.2 Layout Reconstruction -- 4.3 Layout Interpolation -- 5 Conclusion -- References -- Image Understanding -- Silhouette-Based 6D Object Pose Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Methods -- 2.2 Methods with Deep Learning -- 3 The Method -- 3.1 Problem Formulation and Notation -- 3.2 Dimensionality Reduction -- 3.3 Optimized Particle Swarm Optimization -- 4 Experiments -- 4.1 Experiments Setup -- 4.2 Comparison to State of the Art -- 4.3 Performance on YCB-V-NT and TR-RW -- 4.4 Silhouette Stability Experiments -- 4.5 Ablation Study on YCB-V -- 5 Conclusion and Outlook -- References -- Robust Light Field Depth Estimation over Occluded and Specular Regions -- 1 Introduction -- 2 Related Work -- 3 The Depth Estimation -- 3.1 Consistency Data and Confidence -- 3.2 NPCR Depth Estimation -- 3.3 Depth Refinement -- 4 Experiment -- 4.1 Occlusion Processing Comparisons -- 4.2 Specular Regions Processing -- 4.3 Depth Map -- 4.4 Computational Time -- 5 Conclusion and Limitation -- References -- Foreground and Background Separate Adaptive Equilibrium Gradients Loss for Long-Tail Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Long-Tail Object Detection -- 3 Methodology -- 3.1 Revisiting Sigmoid Cross-Entropy Loss -- 3.2 Foreground and Background Separate Adaptive Equilibrium Gradients Loss -- 4 Experiments on LVIS -- 4.1 Datasets and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Generalization on Stronger Models -- 4.5 Performance Analysis -- 4.6 Comparison with State-of-the-Art Methods -- 4.7 Evaluation on COCO-LT -- 4.8 Result Visualization -- 5 Conclusion -- References -- Stylization.
Multi-level Patch Transformer for Style Transfer with Single Reference Image -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Multi-level Patch Transformer Encoder -- 3.2 Dynamic Filter-Based Decoder -- 3.3 Loss Functions -- 4 Experiments and Evaluations -- 4.1 Implementation Details -- 4.2 Qualitative Evaluation -- 4.3 Ablation Study -- 4.4 User Study -- 4.5 Quantitative Evaluations -- 4.6 Discussion CycleTransformer vs CycleGAN -- 5 Conclusion and Future Work -- References -- Palette-Based Content-Aware Image Recoloring -- 1 Introduction -- 2 Related Works -- 2.1 Palette-Based Image Recoloring -- 2.2 Edit Propagation (Stroke-Based Image Recoloring) -- 2.3 Style Transfer (Example-Based Image Recoloring) -- 3 Method -- 3.1 Overview -- 3.2 Palette Extraction -- 3.3 Content-Aware Recoloring -- 4 Experiments -- 4.1 Results -- 4.2 Evaluation -- 4.3 Comparisons -- 5 Conclusion, Limitation and Future Work -- References -- FreeStyler: A Free-Form Stylization Method via Multimodal Vector Quantization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Vector Quantization Framework -- 3.2 Pseudo-Paired Token Predictor -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative Results -- 4.3 Quantitative Results -- 4.4 Ablation Study -- 4.5 Applications -- 5 Limitations and Future Work -- 6 Conclusion -- References -- Vision Meets Graphics -- Denoised Dual-Level Contrastive Network for Weakly-Supervised Temporal Sentence Grounding -- 1 Introduction -- 2 Related Work -- 2.1 Weakly-Supervised Temporal Sentence Grounding -- 2.2 Contrastive Representation Learning -- 3 The Proposed Method -- 3.1 Problem Formulation -- 3.2 Visual-Text Feature Extraction -- 3.3 Gaussian-Based Proposal Generation -- 3.4 Intra-video Contrastive Learning -- 3.5 Inter-video Contrastive Learning -- 3.6 Pseudo-Label Noise Removal -- 3.7 Training and Inference.
4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metric -- 4.3 Implementation Details -- 4.4 Comparisons with State-of-the-Art Methods -- 4.5 Ablation Study and Analysis -- 4.6 Qualitative Results -- 5 Conclusion -- References -- Isolation and Integration: A Strong Pre-trained Model-Based Paradigm for Class-Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Setting -- 3.2 A Simple Baseline -- 3.3 Dynamically Adaption and Aggregation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with State of the Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- Object Category-Based Visual Dialog for Effective Question Generation -- 1 Introduction -- 2 Related Work -- 3 Model -- 3.1 Object Information Extraction -- 3.2 Category Selection -- 3.3 Object Fusion Feature Update -- 3.4 Object-Self Difference Attention Module -- 3.5 Question Decoder -- 3.6 Object-Level Attention Update -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Experiment Settings -- 4.4 Results -- 5 Conclusions -- References -- AST: An Attention-Guided Segment Transformer for Drone-Based Cross-View Geo-Localization -- 1 Introduction -- 2 Related Work -- 2.1 Image-Based Cross-View Geo-Localization -- 2.2 Vision Transformer -- 3 Proposed Method -- 3.1 Problem Formulation -- 3.2 Vision Transformer for Cross-View Geo-Localization -- 3.3 Attention-Guided Segment Tokens -- 3.4 Loss Function and Training Strategy -- 4 Experiment -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Existing Methods -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Improved YOLOv5 Algorithm for Small Object Detection in Drone Images -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Small Object Detection -- 2.3 YOLOv5 -- 3 HTH-YOLOv5 -- 3.1 Hybrid Transformer Head.
3.2 Convolutional Attention Feature Fusion Module.
Record Nr. UNINA-9910847092103321
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Printed material
Available at: Univ. Federico II
Computational Visual Media : 12th International Conference, CVM 2024, Wellington, New Zealand, April 10–12, 2024, Proceedings, Part II / edited by Fang-Lue Zhang, Andrei Sharf
Author Zhang Fang-Lue
Edition [1st ed. 2024.]
Publication/distribution/printing Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Physical description 1 online resource (384 pages)
Discipline 006.37
Other authors (Persons) Sharf Andrei
Series Lecture Notes in Computer Science
Topical subject Computer vision
Pattern recognition systems
Application software
Computer graphics
Artificial intelligence
Algorithms
Computer Vision
Automated Pattern Recognition
Computer and Information Systems Applications
Computer Graphics
Artificial Intelligence
ISBN 981-9720-92-3
Format Printed material
Bibliographic level Monograph
Publication language eng
Contents note Intro -- Preface -- Organization -- Contents - Part II -- Contents - Part I -- Facial Images -- Zero-Shot Real Facial Attribute Separation and Transfer at Novel Views -- 1 Introduction -- 2 Related Works -- 2.1 Explicit Face Morphable Models -- 2.2 3D-Aware Implicit Models -- 2.3 Disentanglement Representation Learning -- 3 Method -- 3.1 Model Architecture -- 3.2 EM-Like Alternating Training Procedure -- 3.3 Model Parameters Initialization -- 3.4 Rendering Refinement with Blind Face Restoration -- 4 Experiment -- 4.1 Implementation Details -- 4.2 Zero-Shot Attribute Separation from Single Image -- 4.3 Comparisons -- 4.4 Ablation Study -- 5 Conclusion -- 5.1 Limitation -- References -- Explore and Enhance the Generalization of Anomaly DeepFake Detection -- 1 Introduction -- 2 Related Work -- 2.1 Conventional DeepFake Detection -- 2.2 Anomaly DeepFake Detection -- 3 Approach -- 3.1 Overview -- 3.2 Review and Exploration of ADFD -- 3.3 Boundary Blur Mask Generator -- 3.4 Noise Refinement Strategy -- 3.5 Algorithm -- 4 Experiments -- 4.1 Experiments Setting -- 4.2 Exploration Experiments of ADFD Methods -- 4.3 Comparison Experiments -- 5 Conclusion -- References -- Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment -- 1 Introduction -- 2 Related Work -- 2.1 Image Quality Assessment -- 2.2 Face Image Quality Assessment -- 3 Method -- 3.1 Recognition-Oriented Non-reference Quality Measurement -- 3.2 Tiny Face Quality Network -- 3.3 Generating Training Dataset with Quality Labels -- 3.4 Data Sampling and Augmentation Strategy for Balancing the Distribution of Scores -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Datasets and Protocols -- 4.3 Visualization of Different FIQA Methods -- 4.4 Memory and Computation Costs -- 4.5 Quantitative Evaluation on IJB-B and IJB-C Datasets -- 4.6 Quantitative Evaluation on YTF Dataset.
4.7 Ablation Studies -- 5 Conclusion -- References -- Face Expression Recognition via Product-Cross Dual Attention and Neutral-Aware Anchor Loss -- 1 Introduction -- 2 Related Work -- 2.1 Landmark -- 2.2 Transformer in FER -- 2.3 Losses Used in FER -- 3 Our Method -- 3.1 Product-Cross Dual Attention Module -- 3.2 Neutral Expression Aware Anchor Loss -- 3.3 Total Loss Function -- 4 Experiments -- 4.1 Datasets -- 4.2 Implementation Details -- 4.3 Ablation Study -- 4.4 Comparison with the State-of-the-Art Methods -- 4.5 Comparison on Number of Parameters and Running Performance -- 5 Conclusion -- References -- Image Generation and Enhancement -- Deformable CNN with Position Encoding for Arbitrary-Scale Super-Resolution -- 1 Introduction -- 2 Related Work -- 2.1 Implicit Neural Representation -- 2.2 Single Image Super-Resolution (SISR) -- 2.3 Arbitrary-Scale Super-Resolution -- 3 Methods -- 3.1 Deformable Feature Unfolding (DFU) -- 3.2 Fusion with Learned Position Encoding (FPE) -- 3.3 Deep ResMLP -- 4 Experiments -- 4.1 Datasets and Metrics -- 4.2 Implementation Detail -- 4.3 Evaluation -- 4.4 Ablation Study -- 5 Conclusion -- References -- Single-Video Temporal Consistency Enhancement with Rolling Guidance -- 1 Introduction -- 2 Related Work -- 2.1 Temporal Consistency for Specific Tasks -- 2.2 Blind Video Temporal Consistency -- 2.3 Spatial Smoothing Filters and Rolling Guidance -- 3 Method -- 3.1 Overview -- 3.2 Constructing Coarse Guidance Video -- 3.3 Recovering Image Details -- 3.4 Global Refinement -- 3.5 Comparison with the Deflickering Algorithm -- 4 Experiment -- 4.1 Dataset -- 4.2 Quality Assessment -- 4.3 Comparison to State-of-the-Art Methods -- 4.4 Ablation Study -- 5 Discussion and Conclusion -- References -- GTLayout: Learning General Trees for Structured Grid Layout Generation -- 1 Introduction -- 2 Related Work -- 3 Method.
3.1 Structural Layout Representation -- 3.2 Generative Model for Structured Grid Layouts -- 3.3 Training -- 4 Evaluation -- 4.1 Layout Generation -- 4.2 Layout Reconstruction -- 4.3 Layout Interpolation -- 5 Conclusion -- References -- Image Understanding -- Silhouette-Based 6D Object Pose Estimation -- 1 Introduction -- 2 Related Work -- 2.1 Traditional Methods -- 2.2 Methods with Deep Learning -- 3 The Method -- 3.1 Problem Formulation and Notation -- 3.2 Dimensionality Reduction -- 3.3 Optimized Particle Swarm Optimization -- 4 Experiments -- 4.1 Experiments Setup -- 4.2 Comparison to State of the Art -- 4.3 Performance on YCB-V-NT and TR-RW -- 4.4 Silhouette Stability Experiments -- 4.5 Ablation Study on YCB-V -- 5 Conclusion and Outlook -- References -- Robust Light Field Depth Estimation over Occluded and Specular Regions -- 1 Introduction -- 2 Related Work -- 3 The Depth Estimation -- 3.1 Consistency Data and Confidence -- 3.2 NPCR Depth Estimation -- 3.3 Depth Refinement -- 4 Experiment -- 4.1 Occlusion Processing Comparisons -- 4.2 Specular Regions Processing -- 4.3 Depth Map -- 4.4 Computational Time -- 5 Conclusion and Limitation -- References -- Foreground and Background Separate Adaptive Equilibrium Gradients Loss for Long-Tail Object Detection -- 1 Introduction -- 2 Related Works -- 2.1 General Object Detection -- 2.2 Long-Tail Object Detection -- 3 Methodology -- 3.1 Revisiting Sigmoid Cross-Entropy Loss -- 3.2 Foreground and Background Separate Adaptive Equilibrium Gradients Loss -- 4 Experiments on LVIS -- 4.1 Datasets and Evaluation Metric -- 4.2 Implementation Details -- 4.3 Ablation Studies -- 4.4 Generalization on Stronger Models -- 4.5 Performance Analysis -- 4.6 Comparison with State-of-the-Art Methods -- 4.7 Evaluation on COCO-LT -- 4.8 Result Visualization -- 5 Conclusion -- References -- Stylization.
Multi-level Patch Transformer for Style Transfer with Single Reference Image -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 Multi-level Patch Transformer Encoder -- 3.2 Dynamic Filter-Based Decoder -- 3.3 Loss Functions -- 4 Experiments and Evaluations -- 4.1 Implementation Details -- 4.2 Qualitative Evaluation -- 4.3 Ablation Study -- 4.4 User Study -- 4.5 Quantitative Evaluations -- 4.6 Discussion CycleTransformer vs CycleGAN -- 5 Conclusion and Future Work -- References -- Palette-Based Content-Aware Image Recoloring -- 1 Introduction -- 2 Related Works -- 2.1 Palette-Based Image Recoloring -- 2.2 Edit Propagation (Stroke-Based Image Recoloring) -- 2.3 Style Transfer (Example-Based Image Recoloring) -- 3 Method -- 3.1 Overview -- 3.2 Palette Extraction -- 3.3 Content-Aware Recoloring -- 4 Experiments -- 4.1 Results -- 4.2 Evaluation -- 4.3 Comparisons -- 5 Conclusion, Limitation and Future Work -- References -- FreeStyler: A Free-Form Stylization Method via Multimodal Vector Quantization -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Vector Quantization Framework -- 3.2 Pseudo-Paired Token Predictor -- 4 Experiments -- 4.1 Implementation Details -- 4.2 Qualitative Results -- 4.3 Quantitative Results -- 4.4 Ablation Study -- 4.5 Applications -- 5 Limitations and Future Work -- 6 Conclusion -- References -- Vision Meets Graphics -- Denoised Dual-Level Contrastive Network for Weakly-Supervised Temporal Sentence Grounding -- 1 Introduction -- 2 Related Work -- 2.1 Weakly-Supervised Temporal Sentence Grounding -- 2.2 Contrastive Representation Learning -- 3 The Proposed Method -- 3.1 Problem Formulation -- 3.2 Visual-Text Feature Extraction -- 3.3 Gaussian-Based Proposal Generation -- 3.4 Intra-video Contrastive Learning -- 3.5 Inter-video Contrastive Learning -- 3.6 Pseudo-Label Noise Removal -- 3.7 Training and Inference.
4 Experiments -- 4.1 Datasets -- 4.2 Evaluation Metric -- 4.3 Implementation Details -- 4.4 Comparisons with State-of-the-Art Methods -- 4.5 Ablation Study and Analysis -- 4.6 Qualitative Results -- 5 Conclusion -- References -- Isolation and Integration: A Strong Pre-trained Model-Based Paradigm for Class-Incremental Learning -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Problem Setting -- 3.2 A Simple Baseline -- 3.3 Dynamically Adaption and Aggregation -- 4 Experiments -- 4.1 Experimental Setups -- 4.2 Comparison with State of the Art -- 4.3 Ablation Study -- 5 Conclusion -- References -- Object Category-Based Visual Dialog for Effective Question Generation -- 1 Introduction -- 2 Related Work -- 3 Model -- 3.1 Object Information Extraction -- 3.2 Category Selection -- 3.3 Object Fusion Feature Update -- 3.4 Object-Self Difference Attention Module -- 3.5 Question Decoder -- 3.6 Object-Level Attention Update -- 4 Experiments -- 4.1 Dataset -- 4.2 Evaluation Metrics -- 4.3 Experiment Settings -- 4.4 Results -- 5 Conclusions -- References -- AST: An Attention-Guided Segment Transformer for Drone-Based Cross-View Geo-Localization -- 1 Introduction -- 2 Related Work -- 2.1 Image-Based Cross-View Geo-Localization -- 2.2 Vision Transformer -- 3 Proposed Method -- 3.1 Problem Formulation -- 3.2 Vision Transformer for Cross-View Geo-Localization -- 3.3 Attention-Guided Segment Tokens -- 3.4 Loss Function and Training Strategy -- 4 Experiment -- 4.1 Datasets and Evaluation Metrics -- 4.2 Implementation Details -- 4.3 Comparison with Existing Methods -- 4.4 Ablation Study -- 4.5 Visualization -- 5 Conclusion -- References -- Improved YOLOv5 Algorithm for Small Object Detection in Drone Images -- 1 Introduction -- 2 Related Work -- 2.1 Object Detection -- 2.2 Small Object Detection -- 2.3 YOLOv5 -- 3 HTH-YOLOv5 -- 3.1 Hybrid Transformer Head.
3.2 Convolutional Attention Feature Fusion Module.
Record no. UNISA-996589544103316
Zhang Fang-Lue
Singapore : Springer Nature Singapore : Imprint: Springer, 2024
Printed material
Find it here: Univ. di Salerno
SIGGRAPH Asia 2013 posters
Publication/distribution/printing [Place of publication not identified] : ACM, 2013
Physical description 1 online resource (41 pages)
Series ACM Conferences
Topical subject Engineering & Applied Sciences
Technology - General
ISBN 1-4503-2634-X
Format Printed material
Bibliographic level Monograph
Language of publication eng
Other variant titles Special Interest Group on Computer Graphics and Interactive Techniques Asia 2013 posters
Record no. UNINA-9910375727103321
Find it here: Univ. Federico II
SIGGRAPH Asia 2013 technical briefs
Publication/distribution/printing [Place of publication not identified] : ACM, 2013
Physical description 1 online resource (135 pages)
Series ACM Conferences
Topical subject Engineering & Applied Sciences
Technology - General
ISBN 1-4503-2629-3
Format Printed material
Bibliographic level Monograph
Language of publication eng
Other variant titles Special Interest Group on Computer Graphics and Interactive Techniques Asia 2013 technical briefs
Record no. UNINA-9910375727003321
Find it here: Univ. Federico II