Vai al contenuto principale della pagina
Autore: | Rokach Lior |
Titolo: | Data mining with decision trees [[electronic resource] /] : theory and applications / / Lior Rokach, Oded Maimon |
Pubblicazione: | Singapore, : World Scientific, c2008 |
Descrizione fisica: | 1 online resource (263 p.) |
Disciplina: | 006.312 |
Soggetto topico: | Data mining |
Decision trees | |
Altri autori: | MaimonOded Z |
Note generali: | Description based upon print version of record. |
Nota di bibliografia: | Includes bibliographical references (p. 215-242) and index. |
Nota di contenuto: | Preface; Contents; 1. Introduction to Decision Trees; 1.1 Data Mining and Knowledge Discovery; 1.2 Taxonomy of Data Mining Methods; 1.3 Supervised Methods; 1.3.1 Overview; 1.4 Classification Trees; 1.5 Characteristics of Classification Trees; 1.5.1 Tree Size; 1.5.2 The hierarchical nature of decision trees; 1.6 Relation to Rule Induction; 2. Growing Decision Trees; 2.0.1 Training Set; 2.0.2 Definition of the Classification Problem; 2.0.3 Induction Algorithms; 2.0.4 Probability Estimation in Decision Trees; 2.0.4.1 Laplace Correction; 2.0.4.2 No Match |
2.1 Algorithmic Framework for Decision Trees2.2 Stopping Criteria; 3. Evaluation of Classification Trees; 3.1 Overview; 3.2 Generalization Error; 3.2.1 Theoretical Estimation of Generalization Error; 3.2.2 Empirical Estimation of Generalization Error; 3.2.3 Alternatives to the Accuracy Measure; 3.2.4 The F-Measure; 3.2.5 Confusion Matrix; 3.2.6 Classifier Evaluation under Limited Resources; 3.2.6.1 ROC Curves; 3.2.6.2 Hit Rate Curve; 3.2.6.3 Qrecall (Quota Recall); 3.2.6.4 Lift Curve; 3.2.6.5 Pearson Correlation Coegfficient; 3.2.6.6 Area Under Curve (AUC); 3.2.6.7 Average Hit Rate | |
3.2.6.8 Average Qrecall3.2.6.9 Potential Extract Measure (PEM); 3.2.7 Which Decision Tree Classifier is Better?; 3.2.7.1 McNemar's Test; 3.2.7.2 A Test for the Difference of Two Proportions; 3.2.7.3 The Resampled Paired t Test; 3.2.7.4 The k-fold Cross-validated Paired t Test; 3.3 Computational Complexity; 3.4 Comprehensibility; 3.5 Scalability to Large Datasets; 3.6 Robustness; 3.7 Stability; 3.8 Interestingness Measures; 3.9 Overfitting and Underfitting; 3.10 "No Free Lunch" Theorem; 4. Splitting Criteria; 4.1 Univariate Splitting Criteria; 4.1.1 Overview; 4.1.2 Impurity based Criteria | |
4.1.3 Information Gain4.1.4 Gini Index; 4.1.5 Likelihood Ratio Chi-squared Statistics; 4.1.6 DKM Criterion; 4.1.7 Normalized Impurity-based Criteria; 4.1.8 Gain Ratio; 4.1.9 Distance Measure; 4.1.10 Binary Criteria; 4.1.11 Twoing Criterion; 4.1.12 Orthogonal Criterion; 4.1.13 Kolmogorov-Smirnov Criterion; 4.1.14 AUC Splitting Criteria; 4.1.15 Other Univariate Splitting Criteria; 4.1.16 Comparison of Univariate Splitting Criteria; 4.2 Handling Missing Values; 5. Pruning Trees; 5.1 Stopping Criteria; 5.2 Heuristic Pruning; 5.2.1 Overview; 5.2.2 Cost Complexity Pruning | |
5.2.3 Reduced Error Pruning5.2.4 Minimum Error Pruning (MEP); 5.2.5 Pessimistic Pruning; 5.2.6 Error-Based Pruning (EBP); 5.2.7 Minimum Description Length (MDL) Pruning; 5.2.8 Other Pruning Methods; 5.2.9 Comparison of Pruning Methods; 5.3 Optimal Pruning; 6. Advanced Decision Trees; 6.1 Survey of Common Algorithms for Decision Tree Induction; 6.1.1 ID3; 6.1.2 C4.5; 6.1.3 CART; 6.1.4 CHAID; 6.1.5 QUEST.; 6.1.6 Reference to Other Algorithms; 6.1.7 Advantages and Disadvantages of Decision Trees; 6.1.8 Oblivious Decision Trees; 6.1.9 Decision Trees Inducers for Large Datasets | |
6.1.10 Online Adaptive Decision Trees | |
Sommario/riassunto: | This is the first comprehensive book dedicated entirely to the field of decision trees in data mining and covers all aspects of this important technique. Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science and technology of exploring large and complex bodies of data in order to discover useful patterns. The area is of great importance because it enables modeling and knowledge extraction from the abundance of data available. Both theoreticians and practitioners are continually seeking techniques to make the process more |
Titolo autorizzato: | Data mining with decision trees |
ISBN: | 1-281-91179-8 |
9786611911799 | |
981-277-172-7 | |
Formato: | Materiale a stampa |
Livello bibliografico | Monografia |
Lingua di pubblicazione: | Inglese |
Record Nr.: | 9910784996003321 |
Lo trovi qui: | Univ. Federico II |
Opac: | Controlla la disponibilità qui |