
Machine Learning with R [electronic resource] / by Abhijit Ghatak




Author: Ghatak, Abhijit
Title: Machine Learning with R [electronic resource] / by Abhijit Ghatak
Publication: Singapore : Springer Singapore : Imprint: Springer, 2017
Edition: 1st ed. 2017.
Physical description: 1 online resource (XIX, 210 p., 56 illus.)
Classification: 519.502855133
Topical subject: Artificial intelligence
Computer programming
Programming languages (Electronic computers)
Database management
R (Computer program language)
Artificial Intelligence
Programming Techniques
Programming Languages, Compilers, Interpreters
Database Management
Contents note: Intro -- Preface -- The Data-Driven Universe -- Causality: The Cornerstone of Accountability -- The Growth of Machines -- What is Machine Learning? -- Intended Audience -- Acknowledgements -- Contents -- About the Author -- 1 Linear Algebra, Numerical Optimization, and Its Applications in Machine Learning -- 1.1 Scalars, Vectors, and Linear Functions -- 1.1.1 Scalars -- 1.1.2 Vectors -- 1.2 Linear Functions -- 1.3 Matrices -- 1.3.1 Transpose of a Matrix -- 1.3.2 Identity Matrix -- 1.3.3 Inverse of a Matrix -- 1.3.4 Representing Linear Equations in Matrix Form -- 1.4 Matrix Transformations -- 1.5 Norms -- 1.5.1 ℓ2 Optimization -- 1.5.2 ℓ1 Optimization -- 1.6 Rewriting the Regression Model in Matrix Notation -- 1.7 Cost of a n-Dimensional Function -- 1.8 Computing the Gradient of the Cost -- 1.8.1 Closed-Form Solution -- 1.8.2 Gradient Descent -- 1.9 An Example of Gradient Descent Optimization -- 1.10 Eigendecomposition -- 1.11 Singular Value Decomposition (SVD) -- 1.12 Principal Component Analysis (PCA) -- 1.12.1 PCA and SVD -- 1.13 Computational Errors -- 1.13.1 Rounding: Overflow and Underflow -- 1.13.2 Conditioning -- 1.14 Numerical Optimization -- 2 Probability and Distributions -- 2.1 Sources of Uncertainty -- 2.2 Random Experiment -- 2.3 Probability -- 2.3.1 Marginal Probability -- 2.3.2 Conditional Probability -- 2.3.3 The Chain Rule -- 2.4 Bayes' Rule -- 2.5 Probability Distribution -- 2.5.1 Discrete Probability Distribution -- 2.5.2 Continuous Probability Distribution -- 2.5.3 Cumulative Probability Distribution -- 2.5.4 Joint Probability Distribution -- 2.6 Measures of Central Tendency -- 2.7 Dispersion -- 2.8 Covariance and Correlation -- 2.9 Shape of a Distribution -- 2.10 Chebyshev's Inequality -- 2.11 Common Probability Distributions -- 2.11.1 Discrete Distributions -- 2.11.2 Continuous Distributions.
2.11.3 Summary of Probability Distributions -- 2.12 Tests for Fit -- 2.12.1 Chi-Square Distribution -- 2.12.2 Chi-Square Test -- 2.13 Ratio Distributions -- 2.13.1 Student's t-Distribution -- 2.13.2 F-Distribution -- 3 Introduction to Machine Learning -- 3.1 Scientific Enquiry -- 3.1.1 Empirical Science -- 3.1.2 Theoretical Science -- 3.1.3 Computational Science -- 3.1.4 e-Science -- 3.2 Machine Learning -- 3.2.1 A Learning Task -- 3.2.2 The Performance Measure -- 3.2.3 The Experience -- 3.3 Train and Test Data -- 3.3.1 Training Error, Generalization (True) Error, and Test Error -- 3.4 Irreducible Error, Bias, and Variance -- 3.5 Bias-Variance Trade-off -- 3.6 Deriving the Expected Prediction Error -- 3.7 Underfitting and Overfitting -- 3.8 Regularization -- 3.9 Hyperparameters -- 3.10 Cross-Validation -- 3.11 Maximum Likelihood Estimation -- 3.12 Gradient Descent -- 3.13 Building a Machine Learning Algorithm -- 3.13.1 Challenges in Learning Algorithms -- 3.13.2 Curse of Dimensionality and Feature Engineering -- 3.14 Conclusion -- 4 Regression -- 4.1 Linear Regression -- 4.1.1 Hypothesis Function -- 4.1.2 Cost Function -- 4.2 Linear Regression as Ordinary Least Squares -- 4.3 Linear Regression as Maximum Likelihood -- 4.4 Gradient Descent -- 4.4.1 Gradient of RSS -- 4.4.2 Closed Form Solution -- 4.4.3 Step-by-Step Batch Gradient Descent -- 4.4.4 Writing the Batch Gradient Descent Application -- 4.4.5 Writing the Stochastic Gradient Descent Application -- 4.5 Linear Regression Assumptions -- 4.6 Summary of Regression Outputs -- 4.7 Ridge Regression -- 4.7.1 Computing the Gradient of Ridge Regression -- 4.7.2 Writing the Ridge Regression Gradient Descent Application -- 4.8 Assessing Performance -- 4.8.1 Sources of Error Revisited -- 4.8.2 Bias-Variance Trade-Off in Ridge Regression -- 4.9 Lasso Regression.
4.9.1 Coordinate Descent for Least Squares Regression -- 4.9.2 Coordinate Descent for Lasso -- 4.9.3 Writing the Lasso Coordinate Descent Application -- 4.9.4 Implementing Coordinate Descent -- 4.9.5 Bias Variance Trade-Off in Lasso Regression -- 5 Classification -- 5.1 Linear Classifiers -- 5.1.1 Linear Classifier Model -- 5.1.2 Interpreting the Score -- 5.2 Logistic Regression -- 5.2.1 Likelihood Function -- 5.2.2 Model Selection with Log-Likelihood -- 5.2.3 Gradient Ascent to Find the Best Linear Classifier -- 5.2.4 Deriving the Log-Likelihood Function -- 5.2.5 Deriving the Gradient of Log-Likelihood -- 5.2.6 Gradient Ascent for Logistic Regression -- 5.2.7 Writing the Logistic Regression Application -- 5.2.8 A Comparison Using the BFGS Optimization Method -- 5.2.9 Regularization -- 5.2.10 ℓ2 Regularized Logistic Regression -- 5.2.11 ℓ2 Regularized Logistic Regression with Gradient Ascent -- 5.2.12 Writing the Ridge Logistic Regression with Gradient Ascent Application -- 5.2.13 Writing the Lasso Regularized Logistic Regression With Gradient Ascent Application -- 5.3 Decision Trees -- 5.3.1 Decision Tree Algorithm -- 5.3.2 Overfitting in Decision Trees -- 5.3.3 Control of Tree Parameters -- 5.3.4 Writing the Decision Tree Application -- 5.3.5 Unbalanced Data -- 5.4 Assessing Performance -- 5.4.1 Assessing Performance: Logistic Regression -- 5.5 Boosting -- 5.5.1 AdaBoost Learning Ensemble -- 5.5.2 AdaBoost: Learning from Weighted Data -- 5.5.3 AdaBoost: Updating the Weights -- 5.5.4 AdaBoost Algorithm -- 5.5.5 Writing the Weighted Decision Tree Algorithm -- 5.5.6 Writing the AdaBoost Application -- 5.5.7 Performance of our AdaBoost Algorithm -- 5.6 Other Variants -- 5.6.1 Bagging -- 5.6.2 Gradient Boosting -- 5.6.3 XGBoost -- 6 Clustering -- 6.1 The Clustering Algorithm -- 6.2 Clustering Algorithm as Coordinate Descent optimization.
6.3 An Introduction to Text mining -- 6.3.1 Text Mining Application: Reading Multiple Text Files from Multiple Directories -- 6.3.2 Text Mining Application: Creating a Weighted tf-idf Document-Term Matrix -- 6.3.3 Text Mining Application: Exploratory Analysis -- 6.4 Writing the Clustering Application -- 6.4.1 Smart Initialization of k-means -- 6.4.2 Writing the k-means++ Application -- 6.4.3 Finding the Optimal Number of Centroids -- 6.5 Topic Modeling -- 6.5.1 Clustering and Topic Modeling -- 6.5.2 Latent Dirichlet Allocation for Topic Modeling -- Appendix References and Further Reading.
Summary: This book helps readers understand the mathematics of machine learning and apply it in different situations. It is divided into two basic parts, the first of which introduces readers to the theory of linear algebra, probability, and data distributions and their applications to machine learning. It also includes a detailed introduction to the concepts and constraints of machine learning and what is involved in designing a learning algorithm. This part helps readers understand the mathematical and statistical aspects of machine learning. In turn, the second part discusses the algorithms used in supervised and unsupervised learning. It works out each learning algorithm mathematically and encodes it in R to produce customized learning applications. In the process, it touches upon the specifics of each algorithm and the science behind its formulation. The book includes a wealth of worked-out examples along with R code. It explains the code for each algorithm, and readers can modify the code to suit their own needs. The book will be of interest to all researchers who intend to use R for machine learning, and those who are interested in the practical aspects of implementing learning algorithms for data analysis. Further, it will be particularly useful and informative for anyone who has struggled to relate the concepts of mathematics and statistics to machine learning.
Authorized title: Machine Learning with R
ISBN: 981-10-6808-9
Format: Printed material
Bibliographic level: Monograph
Language of publication: English
Record No.: 9910254832103321
Available at: Univ. Federico II
Series: Springer computer science ebooks.