Intro -- Contents -- Preface -- Acknowledgments -- Author Biography -- 1 Probability and Statistics: An Introduction -- 1.1 Introduction -- 1.1.1 The Interplay Between Probability, Statistics, and Machine Learning -- 1.1.2 Chapter Organization -- 1.2 Representing Data -- 1.2.1 Numeric Multidimensional Data -- 1.2.2 Categorical and Mixed Attribute Data -- 1.3 Summarizing and Visualizing Data -- 1.4 The Basics of Probability and Probability Distributions -- 1.4.1 Populations versus Samples -- 1.4.2 Modeling Populations with Samples -- 1.4.3 Handling Dependence in Data Samples -- 1.5 Hypothesis Testing -- 1.6 Basic Problems in Machine Learning -- 1.6.1 Clustering -- 1.6.2 Classification and Regression Modeling -- 1.6.2.1 Regression -- 1.6.3 Outlier Detection -- 1.7 Summary -- 1.8 Further Reading -- 1.9 Exercises -- 2 Summarizing and Visualizing Data -- 2.1 Introduction -- 2.1.1 Chapter Organization -- 2.2 Summarizing Data -- 2.2.1 Univariate Summarization -- 2.2.1.1 Measures of Central Tendency -- 2.2.1.2 Measures of Dispersion -- 2.2.2 Multivariate Summarization -- 2.2.2.1 Covariance and Correlation -- 2.2.2.2 Rank Correlation Measures -- 2.2.2.3 Correlations among Multiple Attributes -- 2.2.2.4 Contingency Tables for Categorical Data -- 2.3 Data Visualization -- 2.3.1 Univariate Visualization -- 2.3.1.1 Histogram -- 2.3.1.2 Box Plot -- 2.3.2 Multivariate Visualization -- 2.3.2.1 Line Plot -- 2.3.2.2 Scatter Plot -- 2.3.2.3 Bar Chart -- 2.4 Applications to Data Preprocessing -- 2.4.1 Univariate Preprocessing Methods -- 2.4.2 Whitening: A Multivariate Preprocessing Method -- 2.5 Summary -- 2.6 |