LEADER 05575nam 2200709 450 001 9910460169903321 005 20200520144314.0 010 $a1-118-86870-6 010 $a1-118-86867-6 035 $a(CKB)3710000000359043 035 $a(EBL)1895687 035 $a(SSID)ssj0001437790 035 $a(PQKBManifestationID)11774104 035 $a(PQKBTitleCode)TC0001437790 035 $a(PQKBWorkID)11377402 035 $a(PQKB)10669391 035 $a(MiAaPQ)EBC1895687 035 $a(CaSebORM)9781118868706 035 $a(PPN)197577806 035 $a(Au-PeEL)EBL1895687 035 $a(CaPaEBR)ebr11024581 035 $a(CaONFJC)MIL770045 035 $a(OCoLC)907093982 035 $a(EXLCZ)993710000000359043 100 $a20150308h20152015 uy 0 101 0 $aeng 135 $aur|n|---||||| 181 $ctxt 182 $cc 183 $acr 200 10$aData mining and predictive analytics /$fDaniel T. Larose, Chantal D. Larose 205 $aSecond edition. 210 1$aHoboken, New Jersey :$cJohn Wiley & Sons,$d2015. 210 4$d©2015 215 $a1 online resource (827 p.) 225 1 $aWiley Series on Methods and Applications in Data Mining 300 $aDescription based upon print version of record. 311 $a1-118-11619-4 320 $aIncludes bibliographical references and index. 327 $aCover; Contents; Preface; Acknowledgments; Part I Data Preparation; Chapter 1 An Introduction to Data Mining and Predictive Analytics; 1.1 What is Data Mining? 
What is Predictive Analytics?; 1.2 Wanted: Data Miners; 1.3 The Need for Human Direction of Data Mining; 1.4 The Cross-Industry Standard Process for Data Mining: CRISP-DM; 1.4.1 CRISP-DM: The Six Phases; 1.5 Fallacies of Data Mining; 1.6 What Tasks Can Data Mining Accomplish; 1.6.1 Description; 1.6.2 Estimation; 1.6.3 Prediction; 1.6.4 Classification; 1.6.5 Clustering; 1.6.6 Association; The R Zone; R References; Exercises 327 $aChapter 2 Data Preprocessing; 2.1 Why do We Need to Preprocess the Data?; 2.2 Data Cleaning; 2.3 Handling Missing Data; 2.4 Identifying Misclassifications; 2.5 Graphical Methods for Identifying Outliers; 2.6 Measures of Center and Spread; 2.7 Data Transformation; 2.8 Min-Max Normalization; 2.9 Z-Score Standardization; 2.10 Decimal Scaling; 2.11 Transformations to Achieve Normality; 2.12 Numerical Methods for Identifying Outliers; 2.13 Flag Variables; 2.14 Transforming Categorical Variables into Numerical Variables; 2.15 Binning Numerical Variables; 2.16 Reclassifying Categorical Variables 327 $a2.17 Adding an Index Field; 2.18 Removing Variables that are not Useful; 2.19 Variables that Should Probably not be Removed; 2.20 Removal of Duplicate Records; 2.21 A Word About ID Fields; The R Zone; R Reference; Exercises; Chapter 3 Exploratory Data Analysis; 3.1 Hypothesis Testing Versus Exploratory Data Analysis; 3.2 Getting to Know the Data Set; 3.3 Exploring Categorical Variables; 3.4 Exploring Numeric Variables; 3.5 Exploring Multivariate Relationships; 3.6 Selecting Interesting Subsets of the Data for Further Investigation; 3.7 Using EDA to Uncover Anomalous Fields 327 $a3.8 Binning Based on Predictive Value; 3.9 Deriving New Variables: Flag Variables; 3.10 Deriving New Variables: Numerical Variables; 3.11 Using EDA to Investigate Correlated Predictor Variables; 3.12 Summary of Our EDA; The R Zone; R References; Exercises; Chapter 4 Dimension-Reduction Methods; 4.1 Need for Dimension-Reduction in Data Mining; 4.2 Principal Components Analysis; 4.3 
Applying PCA to the Houses Data Set; 4.4 How Many Components Should We Extract?; 4.4.1 The Eigenvalue Criterion; 4.4.2 The Proportion of Variance Explained Criterion; 4.4.3 The Minimum Communality Criterion 327 $a4.4.4 The Scree Plot Criterion; 4.5 Profiling the Principal Components; 4.6 Communalities; 4.6.1 Minimum Communality Criterion; 4.7 Validation of the Principal Components; 4.8 Factor Analysis; 4.9 Applying Factor Analysis to the Adult Data Set; 4.10 Factor Rotation; 4.11 User-Defined Composites; 4.12 An Example of a User-Defined Composite; The R Zone; R References; Exercises; Part II Statistical Analysis; Chapter 5 Univariate Statistical Analysis; 5.1 Data Mining Tasks in Discovering Knowledge in Data; 5.2 Statistical Approaches to Estimation and Prediction; 5.3 Statistical Inference 327 $a5.4 How Confident are We in Our Estimates? 330 $a Learn methods of data analysis and their application to real-world data sets This updated second edition serves as an introduction to data mining methods and models, including association rules, clustering, neural networks, logistic regression, and multivariate analysis. The authors apply a unified "white box" approach to data mining methods and models. This approach is designed to walk readers through the operations and nuances of the various methods, using small data sets, so readers can gain an insight into the inner workings of the method under review. Chapters provide readers with hands 410 0$aWiley series on methods and applications in data mining. 606 $aData mining 606 $aPrediction theory 608 $aElectronic books. 615 0$aData mining. 615 0$aPrediction theory. 676 $a006.3/12 700 $aLarose$b Daniel T.$0497081 702 $aLarose$b Chantal D. 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910460169903321 996 $aData mining and predictive analytics$91947512 997 $aUNINA