
Big Data Analytics and Knowledge Discovery : 26th International Conference, DaWaK 2024, Naples, Italy, August 26–28, 2024, Proceedings / / edited by Robert Wrembel, Silvia Chiusano, Gabriele Kotsis, A Min Tjoa, Ismail Khalil




Title: Big Data Analytics and Knowledge Discovery : 26th International Conference, DaWaK 2024, Naples, Italy, August 26–28, 2024, Proceedings / edited by Robert Wrembel, Silvia Chiusano, Gabriele Kotsis, A Min Tjoa, Ismail Khalil
Publication: Cham : Springer Nature Switzerland : Imprint: Springer, 2024
Edition: 1st ed. 2024.
Physical description: 1 online resource (409 pages)
Classification: 005.7
Topical subject: Statistics
Data mining
Information technology - Management
Artificial intelligence
Data Mining and Knowledge Discovery
Computer Application in Administrative Data Processing
Artificial Intelligence
Dades massives
Mineria de dades
Intel·ligència artificial
Genre / form: Congressos
Llibres electrònics
Person (secondary responsibility): Wrembel, Robert
Bibliography note: Includes bibliographical references and index.
Contents note: Intro -- Preface -- Organization -- Abstracts of Keynote Talks -- Multimodal Deep Learning in Medical Imaging -- Digital Humanism as an Enabler for a Holistic Socio-Technical Approach to the Latest Developments in Computer Science and Artificial Intelligence -- Deep Entity Processing in the Era of Large Language Models: Challenges and Opportunities -- Contents -- Modeling and Design -- LiteSelect: A Lightweight Adaptive Learning Algorithm for Online Index Selection -- 1 Introduction -- 2 The Online Index Selection Problem -- 3 LiteSelect: An Lightweight Online Index Tuner -- 3.1 Algorithm LiteSelect -- 3.2 Fine Tuning LiteSelect -- 4 Experimental Evaluation -- 4.1 Experimental Setup -- 4.2 Parameter Impact Analysis -- 4.3 Index Tuning Performance Comparison -- 5 Related Work -- 6 Conclusion -- References -- IDAGEmb: An Incremental Data Alignment Based on Graph Embedding -- 1 Introduction -- 2 Background -- 2.1 Existing Data Alignment Approaches -- 2.2 Graph Embedding in Representation Learning -- 2.3 Discussion -- 3 Methodology -- 3.1 Research Design -- 3.2 Preliminaries -- 3.3 Adopted Algorithm for IDAGEmb -- 4 Experiments and Results -- 4.1 Experiment Configuration -- 4.2 Experiment #1: Embedding Method Selection -- 4.3 Experiment #2: Comparison with Static Methods (effectiveness and Efficiency) -- 4.4 Experiment #3: Model Sensitivity to Data Order Variation -- 5 Conclusion and Outlook -- References -- Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry -- 1 Introduction and Motivation -- 1.1 Research Questions (RQs) -- 1.2 Structure of Review -- 2 Literature Search Strategy -- 2.1 Quality Assessment Checks -- 2.2 Selection of Primary Studies -- 2.3 Data Synthesis and Analysis Approach -- 3 Reporting the Review -- 3.1 Overview of All Studies -- 3.2 Overview of All Primary Studies.
4 Evaluating the Research Questions -- 5 Discussion and Conclusion -- References -- Entity Matching and Similarity -- MultiMatch: Low-Resource Generalized Entity Matching Using Task-Conditioned Hyperadapters in Multitask Learning -- 1 Introduction -- 2 Background -- 2.1 Problem Formulation -- 2.2 Entity Matching with Single-task Objective Models -- 2.3 Fully Fine-tuning Methods -- 2.4 Parameter-Efficient Fine-tuning Methods -- 2.5 Entity Matching with Parameter-Efficient Multi-task Models -- 3 MultiMatch Training -- 4 Experiments -- 5 Analysis -- 5.1 Single Versus Multiple Objective Models -- 5.2 Task Ablation Experiments -- 6 Conclusions and Future Work -- References -- Embedding-Based Data Matching for Disparate Data Sources -- 1 Context and Main Issues -- 2 Proposed Framework -- 2.1 Problem Statement -- 2.2 Overview -- 3 Experiments -- 3.1 RQ1. Effectiveness and Stability -- 3.2 RQ2. Ablation -- 4 Conclusion -- References -- Subtree Similarity Search Based on Structure and Text -- 1 Introduction -- 2 Problem Definition -- 3 Related Works -- 3.1 Tree Edit Distance -- 3.2 Lower Bounds of Tree Edit Distance -- 3.3 Upper Bounds of Tree Edit Distance -- 3.4 Subtree Similarity Search -- 3.5 Other Related Problems -- 4 Preliminaries -- 5 Proposed Method -- 6 Experiments -- 6.1 Dataset -- 6.2 Methods -- 6.3 Effect of the Recall -- 6.4 Effect of the Document Size -- 6.5 Effect of the Query Size -- 6.6 Accuracy -- 7 Conclusion -- References -- Classification -- Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 4 Experimental Evaluations -- 4.1 Data Collection -- 4.2 Experimental Settings -- 4.3 Bootstrapping -- 4.4 Remarks -- 5 Conclusions -- References -- Evaluation of High Sparsity Strategies for Efficient Binary Classification -- 1 Introduction -- 2 Related Work.
3 Materials and Methods -- 4 Results and Discussion -- 5 Conclusions and Future Work -- References -- Incremental SMOTE with Control Coefficient for Classifiers in Data Starved Medical Applications -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 An Incremental Synthetic Data Generation System -- 4 Experiments -- 4.1 Datasets and Experiments Setup -- 4.2 Statistical Analysis -- 4.3 Performance Evaluation on Classifiers -- 5 Conclusions -- References -- Exploring Evaluation Metrics for Binary Classification in Data Analysis: the Worthiness Benchmark Concept -- 1 Introduction and Related Research -- 2 Methodology -- 3 Discussion and Conclusion -- References -- Machine Learning Methods and Applications -- Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs -- 1 Introduction -- 2 Related Work -- 3 Methodology -- 3.1 In-Chain Domain Knowledge -- 3.2 CK-CEVAE -- 3.3 Chained Prediction Unit -- 4 Experiments -- 4.1 Chains Acquisition -- 4.2 Domain Detection Model -- 4.3 Models Configurations -- 4.4 Overall Analysis -- 4.5 Ablation Study -- 5 Case Study: Understanding Semantic Continuity in Knowledge Graphs -- 6 Discussion -- 7 Conclusion -- References -- Towards Regional Explanations with Validity Domains for Local Explanations -- 1 Introduction -- 2 Related Work -- 2.1 Explanation Methods -- 2.2 Explanation Evaluation Metrics -- 2.3 Validity Domain of Models -- 3 Toy Example -- 4 Our Proposal -- 4.1 Validity Domain -- 4.2 Model Summary -- 4.3 Evaluation Metrics -- 5 Experiments -- 5.1 Protocol -- 5.2 Evaluation of Methods -- 5.3 Model Summary -- 5.4 Sensitivity Analysis -- 6 Discussion and Limits -- 7 Conclusion and Perspectives -- References -- Analyzing a Decade of Evolution: Trends in Natural Language Processing -- 1 Introduction -- 2 Methodology -- 2.1 PDF Parsing -- 3 Results -- 4 Conclusion.
5 Limitations -- References -- Improving Serendipity for Collaborative Metric Learning Based on Mutual Proximity -- 1 Introduction -- 2 Background -- 2.1 Serendipity -- 2.2 Collaborative Metric Learning (CML) -- 2.3 Mutual Proximity (MP) -- 2.4 Advantages and Originality of the Proposed Method -- 3 Methodology -- 3.1 Learning Embeddings -- 3.2 Searching Embedding Space and Recommending Items -- 4 Experiments -- 4.1 Datasets -- 4.2 Metrics -- 4.3 Results -- 5 Conclusions and Discussion -- References -- Ada2vec: Adaptive Representation Learning for Large-Scale Dynamic Heterogeneous Networks -- 1 Introduction -- 2 Related Work -- 3 Problem Definition -- 4 The Ada2vec Framework -- 4.1 Part 1 Dynamic -- 4.2 Part 2 Heterogeneity -- 4.3 Part 3 Change -- 5 Experimental Evaluations -- 5.1 Data -- 5.2 Benchmarks -- 5.3 Classification -- 5.4 Clustering -- 5.5 Performance Analysis -- 6 Conclusion and Future Work -- References -- Differentially-Private Neural Network Training with Private Features and Public Labels -- 1 Introduction -- 2 Background -- 2.1 Differential Privacy -- 2.2 DP-SGD -- 3 Related Work -- 4 Proposed Approach -- 4.1 Sanitization Layer -- 4.2 Bounding Sensitivity and Adding Noise -- 4.3 Design Choices and Tradeoffs -- 5 Experimental Evaluation -- 5.1 Experimental Settings -- 5.2 Results -- 6 Conclusion -- References -- Time Series -- Series2Graph++: Distributed Detection of Correlation Anomalies in Multivariate Time Series -- 1 Introduction -- 2 Related Work -- 3 Series2Graph++ -- 4 Experiments -- 5 Conclusion -- References -- Anomaly Detection from Time Series Under Uncertainty -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 4 Experiments -- 4.1 Uncertainty Quantification Evaluation -- 4.2 Model Performance -- 5 Conclusion -- References -- Comparison of Measures for Characterizing the Difficulty of Time Series Classification.
1 Introduction -- 2 Methodology -- 2.1 Data and Models -- 2.2 Complexity Measures -- 3 Analysis -- 3.1 Correlation Analysis -- 3.2 Relationships Between the Complexity Measures -- 4 Conclusion -- References -- Dynamic Time Warping for Phase Recognition in Tribological Sensor Data -- 1 Introduction -- 2 Related Work -- 3 Method -- 3.1 Dynamic Time Warping (DTW) -- 3.2 Tribological Use Case -- 3.3 Experiments -- 4 Results -- 4.1 Classification of the Whole Wear Phases -- 4.2 Partial Classification of the Wear Phases -- 5 Conclusion -- References -- Data Repositories -- Putting Co-Design-Supporting Data Lakes to the Test: An Evaluation on AEC Case Studies -- 1 Motivation: Data Management in AEC -- 2 ArchIBALD Architecture Development and Definition -- 2.1 Requirement Analysis -- 2.2 Design of the ArchIBALD Architecture -- 3 Scenario-Based Case Studies: Context and Overview -- 3.1 The livMatS Biomimetic Shell -- 3.2 Co-Design of Robotic Prefabrication -- 3.3 Co-Design of End-Effectors for On-Site Assembly -- 3.4 Co-Design of On-Site Planning and Execution -- 4 Evaluation -- 4.1 Case Study 1: Co-Design of Robotic Prefabrication -- 4.2 Case Study 2: Co-Design of End-Effectors -- 4.3 Case Study 3: Co-Design of On-Site Planning and Execution -- 5 Conclusion -- References -- Creating and Querying Data Cubes in Python Using PyCube -- 1 Introduction -- 2 Related Work -- 3 Preliminaries -- 4 Use Case -- 4.1 Initializing PyCube -- 4.2 Analyzing the Data in the View -- 5 Populating the View -- 5.1 Generating the SQL Query -- 5.2 Converting Result Sets to Dataframes -- 6 Experiments -- 6.1 Experimental Setup -- 6.2 Data Retrieval Speeds -- 6.3 Memory Usage -- 6.4 Code Comparison -- 7 Conclusion and Future Work -- References -- An E-Commerce Benchmark for Evaluating Performance Trade-Offs in Document Stores -- 1 Introduction -- 2 Benchmark Design.
2.1 E-Commerce Application.
Summary/abstract: This book constitutes the proceedings of the 26th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2024, which took place in Naples, Italy, during August 26-28, 2024. The 16 full and 20 short papers included in this book were carefully reviewed and selected from 83 submissions. They were organized in topical sections as follows: Modeling and design; entity matching and similarity; classification; machine learning methods and applications; time series; data repositories; optimization; and data quality and applications.
Authorized title: Big data analytics and knowledge discovery
ISBN: 9783031683237
9783031683220
Format: Printed material
Bibliographic level: Monograph
Language of publication: English
Record no.: 9910881092203321
Available at: Univ. Federico II
Series: Lecture Notes in Computer Science, ISSN 1611-3349 ; 14912