LEADER 05376nam 2200685 450 001 9910816227103321 005 20200520144314.0 010 $a0-12-802091-1 035 $a(CKB)2670000000578822 035 $a(EBL)1875436 035 $a(SSID)ssj0001432549 035 $a(PQKBManifestationID)11778771 035 $a(PQKBTitleCode)TC0001432549 035 $a(PQKBWorkID)11406593 035 $a(PQKB)10787791 035 $a(Au-PeEL)EBL1875436 035 $a(CaPaEBR)ebr10997047 035 $a(CaONFJC)MIL666050 035 $a(OCoLC)900291708 035 $a(CaSebORM)9780128020449 035 $a(MiAaPQ)EBC1875436 035 $a(PPN)189085762 035 $a(EXLCZ)992670000000578822 100 $a20150108h20152015 uy 0 101 0 $aeng 135 $aur|n|---||||| 181 $ctxt 182 $cc 183 $acr 200 10$aData architecture $ea primer for the data scientist : big data, data warehouse and data vault /$fW. H. Inmon, Dan Linstedt ; Steven Elliot, executive editor ; Mark Rogers, designer 205 $a1st edition 210 1$aAmsterdam, Netherlands :$cMorgan Kaufmann,$d2015. 210 4$d©2015 215 $a1 online resource (378 p.) 300 $aIncludes index. 311 $a0-12-802044-X 311 $a1-322-34768-9 327 $aCover; Title Page; Copyright; Dedication; Contents; Preface; About the authors; 1.1 - Corporate data; The Totality of Data Across the Corporation; Dividing Unstructured Data; Business Relevancy; Big Data; The Great Divide; The Continental Divide; The Complete Picture; 1.2 - The data infrastructure; Two Types of Repetitive Data; Repetitive Structured Data; Repetitive Big Data; The Two Infrastructures; What's being Optimized?; Comparing the Two Infrastructures; 1.3 - The "great divide"; Classifying Corporate Data; The "Great Divide"; Repetitive Unstructured Data; Nonrepetitive Unstructured Data 327 $aDifferent Worlds; 1.4 - Demographics of corporate data; 1.5 - Corporate data analysis; 1.6 - The life cycle of data - understanding data over time; 1.7 - A brief history of data; Paper Tape and Punch Cards; Magnetic Tapes; Disk Storage; Database Management System; Coupled Processors; Online Transaction Processing; Data Warehouse; Parallel Data Management; Data Vault; Big Data; The Great Divide; 2.1 - A brief history of big data; An 
Analogy - Taking the High Ground; Taking the High Ground; Standardization with the 360; Online Transaction Processing 327 $aEnter Teradata and Massively Parallel Processing; Then Came Hadoop and Big Data; IBM and Hadoop; Holding the High Ground; 2.2 - What is big data?; Another Definition; Large Volumes; Inexpensive Storage; The Roman Census Approach; Unstructured Data; Data in Big Data; Context in Repetitive Data; Nonrepetitive Data; Context in Nonrepetitive Data; 2.3 - Parallel processing; 2.4 - Unstructured data; Textual Information Everywhere; Decisions Based on Structured Data; The Business Value Proposition; Repetitive and Nonrepetitive Unstructured Information; Ease of Analysis; Contextualization 327 $aSome Approaches to Contextualization; MapReduce; Manual Analysis; 2.5 - Contextualizing repetitive unstructured data; Parsing Repetitive Unstructured Data; Recasting the Output Data; 2.6 - Textual disambiguation; From Narrative into an Analytical Database; Input into Textual Disambiguation; Mapping; Input/Output; Document Fracturing/Named Value Processing; Preprocessing a Document; Emails - A Special Case; Spreadsheets; Report Decompilation; 2.7 - Taxonomies; Data Models and Taxonomies; Applicability of Taxonomies; What is a Taxonomy?; Taxonomies in Multiple Languages 327 $aDynamics of Taxonomies and Textual Disambiguation; Taxonomies and Textual Disambiguation - Separate Technologies; Different Types of Taxonomies; Taxonomies - Maintenance Over Time; 3.1 - A brief history of data warehouse; Early Applications; Online Applications; Extract Programs; 4GL Technology; Personal Computers; Spreadsheets; Integrity of Data; Spider-Web Systems; The Maintenance Backlog; The Data Warehouse; To an Architected Environment; To the CIF; DW 2.0; 3.2 - Integrated corporate data; Many Applications; Looking Across the Corporation; More Than One Analyst; ETL Technology 327 $aThe Challenges of Integration 330 $aToday, the world is trying to create and educate data scientists because of 
the phenomenon of Big Data. And everyone is looking deeply into this technology. But no one is looking at the larger architectural picture of how Big Data needs to fit within the existing systems (data warehousing systems). Taking a look at the larger picture into which Big Data fits gives the data scientist the necessary context for how pieces of the puzzle should fit together. Most references on Big Data look at only one tiny part of a much larger whole. Until data gathered can be put into an existing framework or a 606 $aData warehousing 606 $aBig data 615 0$aData warehousing. 615 0$aBig data. 676 $a005.745 700 $aInmon$b W. H.$01602310 702 $aLinstedt$b Dan 702 $aElliot$b Steven 702 $aRogers$b Mark 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910816227103321 996 $aData architecture$93926243 997 $aUNINA