
Mathematical foundations for data analysis / Jeff M. Phillips




Author: Phillips, Jeff M.
Title: Mathematical foundations for data analysis / Jeff M. Phillips
Publication: Cham, Switzerland : Springer, [2021]
©2021
Physical description: 1 online resource (299 pages)
Discipline: 006.312
Topical subject: Data mining - Mathematics
Machine learning - Mathematics
Mineria de dades
Aprenentatge automàtic
Matemàtica
Genre / form subject: Llibres electrònics
Contents note: Intro -- Preface -- Acknowledgements -- Contents -- 1 Probability Review -- 1.1 Sample Spaces -- 1.2 Conditional Probability and Independence -- 1.3 Density Functions -- 1.4 Expected Value -- 1.5 Variance -- 1.6 Joint, Marginal, and Conditional Distributions -- 1.7 Bayes' Rule -- 1.7.1 Model Given Data -- 1.8 Bayesian Inference -- Exercises -- 2 Convergence and Sampling -- 2.1 Sampling and Estimation -- 2.2 Probably Approximately Correct (PAC) -- 2.3 Concentration of Measure -- 2.3.1 Markov Inequality -- 2.3.2 Chebyshev Inequality -- 2.3.3 Chernoff-Hoeffding Inequality -- 2.3.4 Union Bound and Examples -- 2.4 Importance Sampling -- 2.4.1 Sampling Without Replacement with Priority Sampling -- Exercises -- 3 Linear Algebra Review -- 3.1 Vectors and Matrices -- 3.2 Addition and Multiplication -- 3.3 Norms -- 3.4 Linear Independence -- 3.5 Rank -- 3.6 Square Matrices and Properties -- 3.7 Orthogonality -- Exercises -- 4 Distances and Nearest Neighbors -- 4.1 Metrics -- 4.2 Lp Distances and their Relatives -- 4.2.1 Lp Distances -- 4.2.2 Mahalanobis Distance -- 4.2.3 Cosine and Angular Distance -- 4.2.4 KL Divergence -- 4.3 Distances for Sets and Strings -- 4.3.1 Jaccard Distance -- 4.3.2 Edit Distance -- 4.4 Modeling Text with Distances -- 4.4.1 Bag-of-Words Vectors -- 4.4.2 k-Grams -- 4.5 Similarities -- 4.5.1 Set Similarities -- 4.5.2 Normed Similarities -- 4.5.3 Normed Similarities between Sets -- 4.6 Locality Sensitive Hashing -- 4.6.1 Properties of Locality Sensitive Hashing -- 4.6.2 Prototypical Tasks for LSH -- 4.6.3 Banding to Amplify LSH -- 4.6.4 LSH for Angular Distance -- 4.6.5 LSH for Euclidean Distance -- 4.6.6 Min Hashing as LSH for Jaccard Distance -- Exercises -- 5 Linear Regression -- 5.1 Simple Linear Regression -- 5.2 Linear Regression with Multiple Explanatory Variables -- 5.3 Polynomial Regression -- 5.4 Cross-Validation.
5.4.1 Other ways to Evaluate Linear Regression Models -- 5.5 Regularized Regression -- 5.5.1 Tikhonov Regularization for Ridge Regression -- 5.5.2 Lasso -- 5.5.3 Dual Constrained Formulation -- 5.5.4 Matching Pursuit -- Exercises -- 6 Gradient Descent -- 6.1 Functions -- 6.2 Gradients -- 6.3 Gradient Descent -- 6.3.1 Learning Rate -- 6.4 Fitting a Model to Data -- 6.4.1 Least Mean Squares Updates for Regression -- 6.4.2 Decomposable Functions -- Exercises -- 7 Dimensionality Reduction -- 7.1 Data Matrices -- 7.1.1 Projections -- 7.1.2 Sum of Squared Errors Goal -- 7.2 Singular Value Decomposition -- 7.2.1 Best Rank-k Approximation of a Matrix -- 7.3 Eigenvalues and Eigenvectors -- 7.4 The Power Method -- 7.5 Principal Component Analysis -- 7.6 Multidimensional Scaling -- 7.6.1 Why does Classical MDS work? -- 7.7 Linear Discriminant Analysis -- 7.8 Distance Metric Learning -- 7.9 Matrix Completion -- 7.10 Random Projections -- Exercises -- 8 Clustering -- 8.1 Voronoi Diagrams -- 8.1.1 Delaunay Triangulation -- 8.1.2 Connection to Assignment-Based Clustering -- 8.2 Gonzalez's Algorithm for k-Center Clustering -- 8.3 Lloyd's Algorithm for k-Means Clustering -- 8.3.1 Lloyd's Algorithm -- 8.3.2 k-Means++ -- 8.3.3 k-Mediod Clustering -- 8.3.4 Soft Clustering -- 8.4 Mixture of Gaussians -- 8.4.1 Expectation-Maximization -- 8.5 Hierarchical Clustering -- 8.6 Density-Based Clustering and Outliers -- 8.6.1 Outliers -- 8.7 Mean Shift Clustering -- Exercises -- 9 Classification -- 9.1 Linear Classifiers -- 9.1.1 Loss Functions -- 9.1.2 Cross-Validation and Regularization -- 9.2 Perceptron Algorithm -- 9.3 Support Vector Machines and Kernels -- 9.3.1 The Dual: Mistake Counter -- 9.3.2 Feature Expansion -- 9.3.3 Support Vector Machines -- 9.4 Learnability and VC dimension -- 9.5 kNN Classifiers -- 9.6 Decision Trees -- 9.7 Neural Networks.
9.7.1 Training with Back-propagation -- 10 Graph Structured Data -- 10.1 Markov Chains -- 10.1.1 Ergodic Markov Chains -- 10.1.2 Metropolis Algorithm -- 10.2 PageRank -- 10.3 Spectral Clustering on Graphs -- 10.3.1 Laplacians and their EigenStructures -- 10.4 Communities in Graphs -- 10.4.1 Preferential Attachment -- 10.4.2 Betweenness -- 10.4.3 Modularity -- Exercises -- 11 Big Data and Sketching -- 11.1 The Streaming Model -- 11.1.1 Mean and Variance -- 11.1.2 Reservoir Sampling -- 11.2 Frequent Items -- 11.2.1 Warm-Up: Majority -- 11.2.2 Misra-Gries Algorithm -- 11.2.3 Count-Min Sketch -- 11.2.4 Count Sketch -- 11.3 Matrix Sketching -- 11.3.1 Covariance Matrix Summation -- 11.3.2 Frequent Directions -- 11.3.3 Row Sampling -- 11.3.4 Random Projections and Count Sketch Hashing -- Exercises -- Index.
Authorized title: Mathematical foundations for data analysis
ISBN: 3-030-62341-6
Format: Printed material
Bibliographic level: Monograph
Language of publication: English
Record No.: 9910483358803321
Held at: Univ. Federico II
Series: Springer series in the data sciences.