9910484457903321
ISBN: 3-319-27057-5
DOI: 10.1007/978-3-319-27057-9
(CKB)4340000000001231; (SSID)ssj0001584899; (PQKBManifestationID)16264199; (PQKBTitleCode)TC0001584899; (PQKBWorkID)14865540; (PQKB)10200698; (DE-He213)978-3-319-27057-9; (MiAaPQ)EBC5586775; (Au-PeEL)EBL5586775; (OCoLC)932170459; (PPN)190529644; (EXLCZ)994340000000001231
Big Data Analytics : 4th International Conference, BDA 2015, Hyderabad, India, December 15-18, 2015, Proceedings / edited by Naveen Kumar, Vasudha Bhatnagar
1st ed. 2015.
Cham : Springer International Publishing : Imprint: Springer, 2015.
1 online resource (XII, 267 p. 104 illus. in color.)
Information Systems and Applications, incl. Internet/Web, and HCI ; 9498
Bibliographic Level Mode of Issuance: Monograph
ISBN: 3-319-27056-7
Contents: Intro -- Preface -- Organization -- Contents -- Big Data: Security and Privacy -- Privacy Protection or Data Value: Can We Have Both? -- 1 Introduction -- 2 Big Data -- 3 Privacy -- 3.1 Individuals Own Data About Themselves -- 3.2 Corporations Own Data Collected -- 3.3 Shared Ownership of Data -- 3.4 Summary -- Ownership -- 4 Privacy Principles -- 5 A Call for a New Approach -- 5.1 User and Corporate Obligations -- 5.2 System Obligations -- 5.3 Data Acquisition Process -- 5.4 Managing Private Queries -- 6 Summary -- References -- Open Source Social Media Analytics for Intelligence and Security Informatics Applications -- 1 Introduction -- 1.1 Online Social Media -- 1.2 Intelligence and Security Informatics -- 2 Technical and Computational Challenges -- 3 Machine Learning and Data Mining Techniques -- 4 Case Studies -- 4.1 Identification of Extremist Content, Users and Communities -- 4.2 Event Forecasting for Civil Unrest Related Events -- 5 Characterization and Classification of Related Work -- 6 ISI Leading Conferences and Journals -- 7 Conclusions -- References -- Big Data in Commerce -- Information Exploration in E-Commerce Databases -- 1 Introduction -- 2 Search Interface in 
E-Commerce -- 3 Obstacles in Exploratory Search -- 3.1 Limited Feedback in the Query Panel -- 3.2 Low Adaptivity of the Result Panel -- 3.3 Information Loss from Hidden Attributes -- 4 Extensions to Facilitate Exploratory Search -- 4.1 Exploratory Search in Query Panel -- 4.2 Exploratory Search in Result Panel -- 5 Evaluation -- 6 Conclusion -- References -- A Framework to Harvest Page Views of Web for Banner Advertising -- 1 Introduction -- 2 Related Work -- 3 Proposed Approach -- 3.1 Existing Framework -- 3.2 Proposed Framework -- 4 Design Issues -- 5 Summary and Conclusions -- References -- Utility-Based Control Flow Discovery from Business Process Event Logs -- 1 Research Motivation and Aim -- 2 Related Work and Research Contributions -- 3 Research Framework and Solution Approach -- 4 Experimental Analysis and Results -- 4.1 Experimental Dataset -- 4.2 Experimental Results -- 4.3 Airport Dataset Results -- 4.4 BPI 2014 Dataset Results -- 5 Conclusion -- References -- An Efficient Algorithm for Mining High-Utility Itemsets with Discount Notion -- 1 Introduction -- 2 Related Work -- 3 Background -- 3.1 Preliminary -- 3.2 UP-Hist Tree and UP-Hist Algorithm -- 3.3 Utility-List Data Structure and FHM Algorithm -- 3.4 Three-Phase Algorithm for Mining High-Utility Itemsets with Discount Strategies -- 4 Integration of Discount Notion in State-of-the-art Algorithms -- 4.1 UP-Hist Discount Algorithm -- 4.2 FHM Discount Algorithm -- 5 Mining High-Utility Itemsets -- 6 Experiments and Results -- 7 Conclusion and Future Work -- References -- Big Data: Models and Algorithms -- Design of Algorithms for Big Data Analytics -- 1 Introduction -- 2 Building FCA Lattice -- 2.1 Inefficient Map-Reduce Based Design -- 2.2 Efficient Map-Reduce Based Design -- 3 High Velocity Datasets -- 4 Conclusions -- References -- Mobility Big Data Analysis and Visualization (Invited Talk) -- 1 Introduction -- 2 Public Transportation Data Analysis -- 2.1 Framework for Analyzing Smart Card Data -- 2.2 
Visual Fusion Analysis Environment -- 3 Vehicle Recorder Data Analysis -- 3.1 Relationships Between Drivers' Behaviors and Their Past Driving Histories -- 3.2 Visual Exploration of Caution Spots -- 4 Conclusions -- References -- Finding Periodic Patterns in Big Data -- 1 Introduction -- 2 Periodic Pattern Mining in Time Series -- 2.1 The Basic Model of Periodic Patterns -- 2.2 The Limitations of Basic Model -- 2.3 Research Efforts to Address the Limitations -- 3 Periodic Pattern Mining in Transactional Databases -- 3.1 The Basic Model of Periodic-Frequent Patterns -- 3.2 The Performance Issues of the Periodic-Frequent Pattern Model -- 3.3 Research Efforts to Address the Performance Issues -- 4 Experimental Results -- 4.1 Experimental Setup -- 4.2 Generation of Periodic-Frequent Patterns -- 4.3 Interesting Patterns Discovered in Accidents Database -- 4.4 Interesting Patterns Discovered in Shop-4 Database -- 5 Conclusions -- References -- VDMR-DBSCAN: Varied Density MapReduce DBSCAN -- 1 Introduction -- 2 Related Work -- 3 Proposed Work -- 3.1 Partitioning Phase -- 3.2 Map Phase -- 3.3 Reduce Phase -- 3.4 Merge Phase -- 3.5 Relabeling of Data Points -- 4 Results and Discussions -- 4.1 Experimental Settings -- 4.2 Experimental Results -- 4.3 Clustering Results on Zahn_compound Dataset (DS1) -- 4.4 Clustering Results on Spiral Dataset (DS2) -- 4.5 Large Synthetic Dataset (DS3) -- 5 Conclusions and Future Work -- References -- Concept Discovery from Un-Constrained Distributed Context -- Abstract -- 1 Introduction -- 2 Definitions and Properties in FCA -- 3 Related Work -- 4 Methodology -- 4.1 Data Format and Input -- 4.2 Data Aggregation -- 4.3 Concept Finding Approach -- 4.4 Additional Important Insights -- 5 Implementation Using Spark -- 6 Evaluation and Results -- 7 Discussion -- 7.1 Scalability with Respect to Context -- 7.2 Comparison with Distributed Frequent Item-Set Mining -- 7.3 Concept Explosion -- 7.4 Mining Links Between Concepts -- 8 Conclusion -- 
Acknowledgments -- References -- Khanan: Performance Comparison and Programming α-Miner Algorithm in Column-Oriented and Relational Database Query Languages -- 1 Research Motivation and Aim -- 2 Related Work and Novel Contributions -- 2.1 Implementation of Mining Algorithms in Row-Oriented Databases -- 2.2 Implementation of Mining Algorithms in Column-Oriented Databases -- 2.3 Performance Comparison of Mining Algorithms in Column-Oriented and Graph Databases -- 3 α-Miner Algorithm -- 4 Implementation of α-Miner Algorithm in SQL on Row-Oriented Database (MySQL) -- 5 Implementation of α-Miner Algorithm on NoSQL Column-Oriented Database Cassandra -- 6 Experimental Dataset -- 7 Benchmarking and Performance Comparison -- 8 Conclusion -- References -- A New Proposed Feature Subset Selection Algorithm Based on Maximization of Gain Ratio -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Theoretical Background -- 4 Algorithm and Analysis -- 5 Empirical Study -- 5.1 Datasets -- 5.2 Experimental Results and Comparisons -- 5.3 Dimension Wise Comparison of Feature Selection Algorithms -- 5.4 Domain Wise Comparison of Feature Selection Algorithms -- 6 Conclusion -- References -- Big Data in Medicine -- Genomics 3.0: Big-Data in Precision Medicine -- 1 Introduction -- 2 Genomics 3.0 -- 3 Dimensions of Big-data -- 4 Genomic Big-data -- 4.1 Patient Data -- 4.2 Background Databases -- 5 Exploratory Data Analysis -- 6 Big-data to Information -- 7 Information to Knowledge -- 8 Integrative Systems Biology -- 9 Patient Stratification -- 10 Characteristics and Life of Genomic Big-data -- 11 Mark-up Languages to Represent Biological Data -- 12 The Big-data Genomics Platform iOMICS -- 13 Live Genome Research Case -- 14 Live Genomics and Precision Medicine Case -- References -- CuraEx - Clinical Expert System Using Big-Data for Precision Medicine -- 1 Introduction -- 2 CuraEx Overview -- 3 Patient Specific Information -- 4 Analytical Engine -- 4.1 Cancer Staging -- 4.2 Prognosis -- 4.3 
Therapeutics -- 5 Knowledge Database -- 6 Software Stack and Data Management -- 7 Conclusion -- References -- Multi-omics Multi-scale Big Data Analytics for Cancer Genomics -- 1 Introduction -- 2 Available Data -- 2.1 DNA Level Data -- 2.2 RNA Level Data -- 2.3 Clinical Data -- 2.4 Background Databases -- 3 Key Aims -- 4 Exploratory Data Analysis -- 4.1 Mutation Association with Cancer State -- 4.2 Differential Gene Expression -- 4.3 Patient Stratification -- 5 Multi-scale Integrative Analysis -- 6 Data Integration and Network Analysis -- 6.1 Functional Characterization Databases -- 6.2 Metabolic Network Reconstruction and Protein Interactions -- 7 Conclusions -- References -- Class Aware Exemplar Discovery from Microarray Gene Expression Data -- Abstract -- 1 Introduction -- 2 Overview of Our Approach -- 2.1 Gene Data -- 2.2 Gene-Gene Similarity -- 2.3 Class Aware Preference -- 2.4 Class Aware Message Passing -- 3 Experimental Evaluation -- 3.1 Description of Experimental Datasets -- 3.2 Comparison of Class Aware Exemplar Discovery with Affinity Propagation -- 3.3 Comparison of Class Aware Exemplar Discovery with Standard Feature Subset Selection Techniques -- 3.4 Comparison of Class Aware Exemplar Discovery with All Features -- 4 Conclusions -- References -- Multistage Classification for Cardiovascular Disease Risk Prediction -- Abstract -- 1 Introduction -- 2 Related Work -- 3 Proposed Work -- 3.1 Data Set -- 4 Experiments and Results -- 4.1 Pre-processing -- 4.2 Feature Extraction -- 4.3 Classification -- 5 Conclusion -- Acknowledgement -- References -- Author Index.

This book constitutes the refereed conference proceedings of the Fourth International Conference on Big Data Analytics, BDA 2015, held in Hyderabad, India, in December 2015. 
The 9 revised full papers and 9 invited papers were carefully reviewed and selected from 61 submissions and cover topics on big data: security and privacy; big data in commerce; big data: models and algorithms; and big data in medicine.

Subjects: Data mining; Health informatics; Database management; Information storage and retrieval; Algorithms
Data Mining and Knowledge Discovery: https://scigraph.springernature.com/ontologies/product-market-codes/I18030
Health Informatics: https://scigraph.springernature.com/ontologies/product-market-codes/I23060
Database Management: https://scigraph.springernature.com/ontologies/product-market-codes/I18024
Information Storage and Retrieval: https://scigraph.springernature.com/ontologies/product-market-codes/I18032
Algorithm Analysis and Problem Complexity: https://scigraph.springernature.com/ontologies/product-market-codes/I16021
Dewey classification: 006.312
Kumar, Naveen (edt) http://id.loc.gov/vocabulary/relators/edt
Bhatnagar, Vasudha (edt) http://id.loc.gov/vocabulary/relators/edt
MiAaPQ; BOOK; 9910484457903321; Big data analytics; 1523196; UNINA