MultiMedia Modeling [electronic resource] : 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part I / edited by Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata |
Author | Rudinac, Stevan |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (523 pages) |
Discipline | 006.37 |
Other authors (Persons) |
Hanjalic, Alan
Liem, Cynthia
Worring, Marcel
Jónsson, Björn Þór
Liu, Bei
Yamakata, Yoko |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Image processing
Pattern recognition systems
Application software
Information storage and retrieval systems
Machine learning
Computer Vision
Image Processing
Automated Pattern Recognition
Computer and Information Systems Applications
Information Storage and Retrieval
Machine Learning |
ISBN | 3-031-53305-4 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Where are Biases? Adversarial Debiasing with Spurious Feature Visualization -- Cross-Modal Hash Retrieval with Category Semantics -- Spatiotemporal Representation Enhanced ViT for Video Recognition -- SCFormer: A Vision Transformer with Split Channel in Sitting Posture Recognition -- Dive into Coarse-to-Fine Strategy in Single Image Deblurring -- TICondition: Expanding Control Capabilities for Text-to-Image Generation with Multi-Modal Conditions -- Enhancing Generative Generalized Zero Shot Learning via Multi-Space Constraints and Adaptive Integration -- Joint Image Data Hiding and Rate-Distortion Optimization in Neural Compressed Latent Representations -- GSUNet: A Brain Tumor Segmentation Method Based On 3D Ghost Shuffle U-Net -- ACT: Action-associated and Target-related Representations for Object Navigation -- Foreground Feature Enhancement and Peak & Background Suppression for Fine-Grained Visual Classification -- YOLOv5-SRR: Enhancing YOLOv5 for Effective Underwater Target Detection -- Image Clustering and Generation with HDGMVAE-I -- “Car or Bus?” CLearSeg: CLIP-enhanced Discrimination among Resembling Classes for Few-Shot Semantic Segmentation -- PANDA: Prompt-based Context- and Indoor-aware Pretraining for Vision and Language Navigation -- Cross-Modal Semantic Alignment Learning for Text-based Person Search -- Point Cloud Classification via Learnable Memory Bank -- Adversarially Regularized Low-Light Image Enhancement -- Advancing Incremental Few-shot Semantic Segmentation via Semantic-guided Relation Alignment and Adaptation -- PMGCN: Preserving measuring mapping prototype graph calibration network for few-shot learning -- ARE-CAM: An interpretable approach to quantitatively evaluating the adversarial robustness of deep models based on CAM -- SSK-Yolo: Global feature-driven small object detection network for images -- MetaVSR: A Novel Approach to Video Super-Resolution for Arbitrary Magnification -- From Skulls to Faces: A Deep Generative Framework for Realistic 3D Craniofacial Reconstruction -- Structure-aware Adaptive Hybrid Interaction Modeling for Image-Text Matching -- Using Saliency and Cropping to Improve Video Memorability -- Contextual Augmentation with Bias Adaptive for Few-shot Video Object Segmentation -- A lightweight local attention network for image super resolution -- Domain Adaptation for Speaker Verification Based on Self-Supervised Learning with Adversarial Training -- Quality Scalable Video Coding based on Neural Representation -- Hierarchical Bi-Directional Temporal Context Mining for Improved Video Compression -- MAMixer: Multivariate Time Series Forecasting via Multi-Axis Mixing -- A Custom GAN-based Robust Algorithm for Medical Image Watermarking -- A Detail-guided Multi-source Fusion Network for Remote Sensing Object Detection -- A Secure and Fair Federated Learning Protocol under the Universal Composability Framework -- Bi-directional Interaction and Dense Aggregation Network for RGB-D Salient Object Detection -- Face Forgery Detection via Texture and Saliency Enhancement. |
Record No. | UNISA-996587863403316 |
Available at: Univ. di Salerno | ||
|
MultiMedia Modeling [electronic resource] : 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part IV / edited by Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata |
Author | Rudinac, Stevan |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (419 pages) |
Discipline | 006.37 |
Other authors (Persons) |
Hanjalic, Alan
Liem, Cynthia
Worring, Marcel
Jónsson, Björn Þór
Liu, Bei
Yamakata, Yoko |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Image processing
Pattern recognition systems
Application software
Information storage and retrieval systems
Machine learning
Computer Vision
Image Processing
Automated Pattern Recognition
Computer and Information Systems Applications
Information Storage and Retrieval
Machine Learning |
ISBN | 3-031-53302-X |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | FMM: Special Session on Foundation Models for Multimedia -- Removing Stray-Light for Wild-Field Fundus Image Fusion based on Large Generative Models -- Training-free Region Prediction with Stable Diffusion -- Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites -- GDTNet: A Synergistic Dilated Transformer and CNN by Gate Attention for Abdominal Multi-organ Segmentation -- Fine-Grained Multi-Modal Fundus Image Generation Based on Diffusion Models for Glaucoma Classification -- Adapting Pretrained Large-Scale Vision Models for Face Forgery Detection -- ICDAR: Special Session on Intelligent Cross-Data Analysis and Retrieval -- Towards Cross-modal Point Cloud Retrieval for Indoor Scenes -- Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods -- IFI: Interpreting for Improving: a Multimodal Transformer with an Interpretability Technique for Recognition of Risk Events -- OOKPIK - A Collection of Out-of-Context Image-Caption Pairs -- LUMOS-DM: Landscape-based Multimodal Scene Retrieval Enhanced by Diffusion Model -- XR-MACCI: Special Session on eXtended Reality and Multimedia - Advancing Content Creation and Interaction -- Mining Landmark Images for Scene Reconstruction from Weakly Annotated Video Collections -- A framework for 3D modeling of construction sites using aerial imagery and semantic NeRFs -- Multimodal 3D Object Retrieval -- An Integrated System for Spatio-Temporal Summarization of 360-degrees Videos -- Brave New Ideas -- Mutant Texts: A Technique for Uncovering Unexpected Inconsistencies in Large-Scale Vision-language Models -- Exploring Artificial Intelligence for Advancing Performance Processes and Events in Io3MT -- Demonstrations -- Implementation of Melody Slot Machines -- E2Evideo: End to End Video and Image Pre-processing and Analysis Tool -- Augmented Reality Photo Presentation and Content-based Image Retrieval on Mobile Devices with AR-Explorer -- AI-Based Cropping of Soccer Videos for Different Social Media Representations -- Few-shot Object Detection as a Service: Facilitating Training and Deployment for Domain Experts -- DatAR: Supporting Neuroscience Literature Exploration by Finding Relations between Topics in Augmented Reality -- EmoAda: A Multimodal Emotion Interaction and Psychological Adaptation System -- Video Browser Showdown -- Waseda Meisei SoftBank at Video Browser Showdown 2024 -- Exploring Multimedia Vector Spaces with vitrivr-VR -- A new Retrieval Engine for vitrivr -- VISIONE 5.0: Enhanced User Interface and AI Models for VBS2024 -- PraK Tool: An Interactive Search Tool Based on Video Data Services -- Exquisitor at the Video Browser Showdown 2024: Relevance Feedback Meets Conversational Search -- VERGE in VBS 2024 -- Optimizing the Interactive Video Retrieval Tool Vibro for the Video Browser Showdown 2024 -- diveXplore at the Video Browser Showdown 2024 -- Leveraging LLMs and Generative Models for Interactive Known-Item Video Search -- TalkSee: Interactive Video Retrieval Engine Using Large Language Model -- VideoCLIP 2: An Interactive CLIP-based Video Retrieval System for Novice Users at VBS2024 -- ViewsInsight: Enhancing Video Retrieval for VBS 2024 with a User-Friendly Interaction Mechanism. |
Record No. | UNISA-996587863203316 |
Available at: Univ. di Salerno | ||
|
MultiMedia Modeling [electronic resource] : 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part V / edited by Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata |
Author | Rudinac, Stevan |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (125 pages) |
Discipline | 621.382 |
Other authors (Persons) |
Hanjalic, Alan
Liem, Cynthia
Worring, Marcel
Jónsson, Björn Þór
Liu, Bei
Yamakata, Yoko |
Series | Lecture Notes in Computer Science |
Topical subject |
Signal processing
Pattern recognition systems
Application software
Information storage and retrieval systems
Machine learning
Signal, Speech and Image Processing
Automated Pattern Recognition
Computer and Information Systems Applications
Information Storage and Retrieval
Machine Learning |
ISBN | 3-031-56435-9 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | RESET: Relational Similarity Extension for V3C1 Video Dataset -- A New Benchmark and OCR-free Method for Document Image Topic Classification -- The Rach3 Dataset: Towards Data-Driven Analysis of Piano Performance Rehearsal -- WikiMuTe: A web-sourced dataset of semantic descriptions for music audio -- PDTW150K: A Dataset for Patent Drawing Retrieval -- Interactive Question Answering for Multimodal Lifelog Retrieval -- Event Recognition in Laparoscopic Gynecology Videos with Hybrid Transformers -- GreenScreen: A Multimodal Dataset for Detecting Corporate Greenwashing in the Wild. |
Record No. | UNISA-996589544903316 |
Available at: Univ. di Salerno | ||
|
MultiMedia Modeling : 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part I / edited by Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata |
Author | Rudinac, Stevan |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (523 pages) |
Discipline | 006.37 |
Other authors (Persons) |
Hanjalic, Alan
Liem, Cynthia
Worring, Marcel
Jónsson, Björn Þór
Liu, Bei
Yamakata, Yoko |
Series | Lecture Notes in Computer Science |
Topical subject |
Computer vision
Image processing
Pattern recognition systems
Application software
Information storage and retrieval systems
Machine learning
Computer Vision
Image Processing
Automated Pattern Recognition
Computer and Information Systems Applications
Information Storage and Retrieval
Machine Learning |
ISBN | 3-031-53305-4 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | Where are Biases? Adversarial Debiasing with Spurious Feature Visualization -- Cross-Modal Hash Retrieval with Category Semantics -- Spatiotemporal Representation Enhanced ViT for Video Recognition -- SCFormer: A Vision Transformer with Split Channel in Sitting Posture Recognition -- Dive into Coarse-to-Fine Strategy in Single Image Deblurring -- TICondition: Expanding Control Capabilities for Text-to-Image Generation with Multi-Modal Conditions -- Enhancing Generative Generalized Zero Shot Learning via Multi-Space Constraints and Adaptive Integration -- Joint Image Data Hiding and Rate-Distortion Optimization in Neural Compressed Latent Representations -- GSUNet: A Brain Tumor Segmentation Method Based On 3D Ghost Shuffle U-Net -- ACT: Action-associated and Target-related Representations for Object Navigation -- Foreground Feature Enhancement and Peak & Background Suppression for Fine-Grained Visual Classification -- YOLOv5-SRR: Enhancing YOLOv5 for Effective Underwater Target Detection -- Image Clustering and Generation with HDGMVAE-I -- “Car or Bus?” CLearSeg: CLIP-enhanced Discrimination among Resembling Classes for Few-Shot Semantic Segmentation -- PANDA: Prompt-based Context- and Indoor-aware Pretraining for Vision and Language Navigation -- Cross-Modal Semantic Alignment Learning for Text-based Person Search -- Point Cloud Classification via Learnable Memory Bank -- Adversarially Regularized Low-Light Image Enhancement -- Advancing Incremental Few-shot Semantic Segmentation via Semantic-guided Relation Alignment and Adaptation -- PMGCN: Preserving measuring mapping prototype graph calibration network for few-shot learning -- ARE-CAM: An interpretable approach to quantitatively evaluating the adversarial robustness of deep models based on CAM -- SSK-Yolo: Global feature-driven small object detection network for images -- MetaVSR: A Novel Approach to Video Super-Resolution for Arbitrary Magnification -- From Skulls to Faces: A Deep Generative Framework for Realistic 3D Craniofacial Reconstruction -- Structure-aware Adaptive Hybrid Interaction Modeling for Image-Text Matching -- Using Saliency and Cropping to Improve Video Memorability -- Contextual Augmentation with Bias Adaptive for Few-shot Video Object Segmentation -- A lightweight local attention network for image super resolution -- Domain Adaptation for Speaker Verification Based on Self-Supervised Learning with Adversarial Training -- Quality Scalable Video Coding based on Neural Representation -- Hierarchical Bi-Directional Temporal Context Mining for Improved Video Compression -- MAMixer: Multivariate Time Series Forecasting via Multi-Axis Mixing -- A Custom GAN-based Robust Algorithm for Medical Image Watermarking -- A Detail-guided Multi-source Fusion Network for Remote Sensing Object Detection -- A Secure and Fair Federated Learning Protocol under the Universal Composability Framework -- Bi-directional Interaction and Dense Aggregation Network for RGB-D Salient Object Detection -- Face Forgery Detection via Texture and Saliency Enhancement. |
Record No. | UNINA-9910806199003321 |
Available at: Univ. Federico II | ||
|
MultiMedia Modeling : 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part V / edited by Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata |
Author | Rudinac, Stevan |
Edition | [1st ed. 2024.] |
Publication/distribution/printing | Cham : Springer Nature Switzerland : Imprint: Springer, 2024 |
Physical description | 1 online resource (125 pages) |
Discipline | 621.382 |
Other authors (Persons) |
Hanjalic, Alan
Liem, Cynthia
Worring, Marcel
Jónsson, Björn Þór
Liu, Bei
Yamakata, Yoko |
Series | Lecture Notes in Computer Science |
Topical subject |
Signal processing
Pattern recognition systems
Application software
Information storage and retrieval systems
Machine learning
Signal, Speech and Image Processing
Automated Pattern Recognition
Computer and Information Systems Applications
Information Storage and Retrieval
Machine Learning |
ISBN | 3-031-56435-9 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note | RESET: Relational Similarity Extension for V3C1 Video Dataset -- A New Benchmark and OCR-free Method for Document Image Topic Classification -- The Rach3 Dataset: Towards Data-Driven Analysis of Piano Performance Rehearsal -- WikiMuTe: A web-sourced dataset of semantic descriptions for music audio -- PDTW150K: A Dataset for Patent Drawing Retrieval -- Interactive Question Answering for Multimodal Lifelog Retrieval -- Event Recognition in Laparoscopic Gynecology Videos with Hybrid Transformers -- GreenScreen: A Multimodal Dataset for Detecting Corporate Greenwashing in the Wild. |
Record No. | UNINA-9910845096703321 |
Available at: Univ. Federico II | ||
|