LEADER 05568nam 22006974a 450 001 996211655103316 005 20230617031032.0 010 $a1-280-36625-7 010 $a9786610366255 010 $a0-470-30781-1 010 $a0-471-45864-3 010 $a0-471-44835-4 035 $a(CKB)1000000000018977 035 $a(EBL)159847 035 $a(OCoLC)123112222 035 $a(SSID)ssj0000295984 035 $a(PQKBManifestationID)11250991 035 $a(PQKBTitleCode)TC0000295984 035 $a(PQKBWorkID)10322357 035 $a(PQKB)10628633 035 $a(MiAaPQ)EBC159847 035 $a(EXLCZ)991000000000018977 100 $a20021105d2003 uy 0 101 0 $aeng 135 $aur|n|---||||| 181 $ctxt 182 $cc 183 $acr 200 10$aExploratory data mining and data cleaning$b[electronic resource] /$fTamraparni Dasu, Theodore Johnson 210 $aNew York $cWiley-Interscience$d2003 215 $a1 online resource (226 p.) 225 1 $aWiley series in probability and statistics 300 $aDescription based upon print version of record. 311 $a0-471-26851-8 320 $aIncludes bibliographical references (p. 189-195) and index. 327 $aExploratory Data Mining and Data Cleaning; Contents; Preface; 1. Exploratory Data Mining and Data Cleaning: An Overview; 1.1 Introduction; 1.2 Cautionary Tales; 1.3 Taming the Data; 1.4 Challenges; 1.5 Methods; 1.6 EDM; 1.6.1 EDM Summaries-Parametric; 1.6.2 EDM Summaries-Nonparametric; 1.7 End-to-End Data Quality (DQ); 1.7.1 DQ in Data Preparation; 1.7.2 EDM and Data Glitches; 1.7.3 Tools for DQ; 1.7.4 End-to-End DQ: The Data Quality Continuum; 1.7.5 Measuring Data Quality; 1.8 Conclusion; 2. 
Exploratory Data Mining; 2.1 Introduction; 2.2 Uncertainty; 2.2.1 Annotated Bibliography 327 $a2.3 EDM: Exploratory Data Mining; 2.4 EDM Summaries; 2.4.1 Typical Values; 2.4.2 Attribute Variation; 2.4.3 Example; 2.4.4 Attribute Relationships; 2.4.5 Annotated Bibliography; 2.5 What Makes a Summary Useful?; 2.5.1 Statistical Properties; 2.5.2 Computational Criteria; 2.5.3 Annotated Bibliography; 2.6 Data-Driven Approach-Nonparametric Analysis; 2.6.1 The Joy of Counting; 2.6.2 Empirical Cumulative Distribution Function (ECDF); 2.6.3 Univariate Histograms; 2.6.4 Annotated Bibliography; 2.7 EDM in Higher Dimensions; 2.8 Rectilinear Histograms; 2.9 Depth and Multivariate Binning 327 $a2.9.1 Data Depth; 2.9.2 Aside: Depth-Related Topics; 2.9.3 Annotated Bibliography; 2.10 Conclusion; 3. Partitions and Piecewise Models; 3.1 Divide and Conquer; 3.1.1 Why Do We Need Partitions?; 3.1.2 Dividing Data; 3.1.3 Applications of Partition-Based EDM Summaries; 3.2 Axis-Aligned Partitions and Data Cubes; 3.2.1 Annotated Bibliography; 3.3 Nonlinear Partitions; 3.3.1 Annotated Bibliography; 3.4 DataSpheres (DS); 3.4.1 Layers; 3.4.2 Data Pyramids; 3.4.3 EDM Summaries; 3.4.4 Annotated Bibliography; 3.5 Set Comparison Using EDM Summaries; 3.5.1 Motivation; 3.5.2 Comparison Strategy 327 $a3.5.3 Statistical Tests for Change; 3.5.4 Application-Two Case Studies; 3.5.5 Annotated Bibliography; 3.6 Discovering Complex Structure in Data with EDM Summaries; 3.6.1 Exploratory Model Fitting in Interactive Response Time; 3.6.2 Annotated Bibliography; 3.7 Piecewise Linear Regression; 3.7.1 An Application; 3.7.2 Regression Coefficients; 3.7.3 Improvement in Fit; 3.7.4 Annotated Bibliography; 3.8 One-Pass Classification; 3.8.1 Quantile-Based Prediction with Piecewise Models; 3.8.2 Simulation Study; 3.8.3 Annotated Bibliography; 3.9 Conclusion; 4. 
Data Quality; 4.1 Introduction 327 $a4.2 The Meaning of Data Quality; 4.2.1 An Example; 4.2.2 Data Glitches; 4.2.3 Conventional Definition of DQ; 4.2.4 Times Have Changed; 4.2.5 Annotated Bibliography; 4.3 Updating DQ Metrics: Data Quality Continuum; 4.3.1 Data Gathering; 4.3.2 Data Delivery; 4.3.3 Data Monitoring; 4.3.4 Data Storage; 4.3.5 Data Integration; 4.3.6 Data Retrieval; 4.3.7 Data Mining/Analysis; 4.3.8 Annotated Bibliography; 4.4 The Meaning of Data Quality Revisited; 4.4.1 Data Interpretation; 4.4.2 Data Suitability; 4.4.3 Dataset Type; 4.4.4 Attribute Type; 4.4.5 Application Type 327 $a4.4.6 Data Quality-A Many Splendored Thing 330 $aWritten for practitioners of data mining, data cleaning and database management. Presents a technical treatment of data quality including process, metrics, tools and algorithms. Focuses on developing an evolving modeling strategy through an iterative data exploration loop and incorporation of domain knowledge. Addresses methods of detecting, quantifying and correcting data quality issues that can have a significant impact on findings and decisions, using commercially available tools as well as new algorithmic approaches. Uses case studies to illustrate applications in real 410 0$aWiley series in probability and statistics. 606 $aData mining 606 $aElectronic data processing$xData preparation 606 $aElectronic data processing$xQuality control 615 0$aData mining. 615 0$aElectronic data processing$xData preparation. 615 0$aElectronic data processing$xQuality control. 
676 $a005.741 676 $a006.3 676 $a006.312 700 $aDasu$b Tamraparni$0281835 701 $aJohnson$b Theodore$0281836 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a996211655103316 996 $aExploratory data mining and data cleaning$9673537 997 $aUNISA LEADER 01737nas 2200433 n 450 001 990008947580403321 005 20240229084333.0 011 $a0210-0746 035 $a000894758 035 $aFED01000894758 035 $a(Aleph)000894758FED01 035 $a000894758 091 $2CNR$aP 00067390 100 $a20090724b19711990km-y0itaa50------ba 101 0 $aspa 102 $aES 110 $aauu-------- 200 1 $aCuadernos de filología clásica 207 1$a1971-1990 210 $aMadrid$cUniversidad Complutense 326 $aAnnual 433 0$1001000894759$12001$aCuadernos de filología clásica. Estudios griegos e indoeuropeos 446 0$12001$aCuadernos de filología clásica. Estudios latinos 530 0 $aCuadernos de filología clásica 675 $a807 675 $a338 675 $a811.124 675 $a811.14 712 02$aUniversidad Complutense de Madrid.$bFacultad de Filología 801 0$aIT$bACNP$c20090723 859 4 $uhttp://acnp.cib.unibo.it/cgi-ser/start/it/cnr/dc-p1.tcl?catno=43670&person=false&language=ITALIANO&libr=&libr_th=unina1$zLibraries that hold the periodical 901 $aSE 912 $a990008947580403321 958 $aBRAU. Biblioteca di Ricerca di Area Umanistica$b1988-1990.$ePR 84 SPAGNA$fFLFBC 959 $aFLFBC 996 $aCuadernos de filología clásica$9797479 997 $aUNINA AP1 8 $6866-01$aNA072 BRAU. Biblioteca di Ricerca di Area Umanistica$bPR 84 SPAGNA$ePiazza Bellini 56/60, 80133 Napoli (NA)$m(081) 2533948$nit AP2 40$aacnp.cib.unibo.it$nACNP Italian Union Catalogue of Serials$uhttp://acnp.cib.unibo.it/cgi-ser/start/it/cnr/df-p.tcl?catno=43670&language=ITALIANO&libr=&person=&B=1&libr_th=unina&proposto=NO