LEADER 05618nam 2200709 450 001 9910807848203321 005 20230124190829.0 010 $a1-118-82418-0 010 $a1-118-61254-X 035 $a(CKB)2670000000421687 035 $a(EBL)1566514 035 $a(SSID)ssj0001166371 035 $a(PQKBManifestationID)11745514 035 $a(PQKBTitleCode)TC0001166371 035 $a(PQKBWorkID)11120555 035 $a(PQKB)10314084 035 $a(Au-PeEL)EBL1566514 035 $a(CaPaEBR)ebr10756837 035 $a(CaONFJC)MIL576363 035 $a(OCoLC)861529054 035 $a(CaSebORM)9781118824184 035 $a(MiAaPQ)EBC1566514 035 $a(EXLCZ)992670000000421687 100 $a20130815h20132013 uy| 0 101 0 $aeng 135 $aur|n|---||||| 181 $ctxt 182 $cc 183 $acr 200 10$aProfessional Hadoop solutions /$fBoris Lublinsky, Kevin T. Smith, Alexey Yakubovich 205 $a1st edition 210 1$aIndianapolis, IN :$cJohn Wiley and Sons,$d[2013] 210 4$d©2013 215 $a1 online resource (506 p.) 225 1 $aWrox Programmer to programmer 300 $aDescription based upon print version of record. 311 $a1-118-61193-4 320 $aIncludes bibliographical references and index. 327 $aProfessional Hadoop® Solutions; Copyright; Credits; About the Authors; About the Technical Editors; Acknowledgments; Contents; Introduction; Who This Book Is For; What This Book Covers; How This Book Is Structured; What You Need to Use This Book; Conventions; Source Code; Errata; P2P.Wrox.Com; Chapter 1: Big Data and the Hadoop Ecosystem; Big Data Meets Hadoop; Hadoop: Meeting the Big Data Challenge; Data Science in the Business World; The Hadoop Ecosystem; Hadoop Core Components; Hadoop Distributions; Developing Enterprise Applications with Hadoop; Summary; Chapter 2: Storing Data in Hadoop 327 $aHDFS; HDFS Architecture; Using HDFS Files; Hadoop-Specific File Types; HDFS Federation and High Availability; HBase; HBase Architecture; HBase Schema Design; Programming for HBase; New HBase Features; Combining HDFS and HBase for Effective Data Storage; Using Apache Avro; Managing Metadata with HCatalog; Choosing an Appropriate Hadoop Data Organization for Your Applications; Summary; Chapter 3: Processing Your Data with MapReduce; 
Getting to Know MapReduce; MapReduce Execution Pipeline; Runtime Coordination and Task Management in MapReduce; Your First MapReduce Application 327 $aBuilding and Executing MapReduce Programs; Designing MapReduce Implementations; Using MapReduce as a Framework for Parallel Processing; Simple Data Processing with MapReduce; Building Joins with MapReduce; Building Iterative MapReduce Applications; To MapReduce or Not to MapReduce?; Common MapReduce Design Gotchas; Summary; Chapter 4: Customizing MapReduce Execution; Controlling MapReduce Execution with InputFormat; Implementing InputFormat for Compute-Intensive Applications; Implementing InputFormat to Control the Number of Maps; Implementing InputFormat for Multiple HBase Tables 327 $aReading Data Your Way with Custom RecordReaders; Implementing a Queue-Based RecordReader; Implementing RecordReader for XML Data; Organizing Output Data with Custom Output Formats; Implementing OutputFormat for Splitting MapReduce Job's Output into Multiple Directories; Writing Data Your Way with Custom RecordWriters; Implementing a RecordWriter to Produce Output tar Files; Optimizing Your MapReduce Execution with a Combiner; Controlling Reducer Execution with Partitioners; Implementing a Custom Partitioner for One-to-Many Joins; Using Non-Java Code with Hadoop; Pipes; Hadoop Streaming 327 $aUsing JNI; Summary; Chapter 5: Building Reliable MapReduce Apps; Unit Testing MapReduce Applications; Testing Mappers; Testing Reducers; Integration Testing; Local Application Testing with Eclipse; Using Logging for Hadoop Testing; Processing Applications Logs; Reporting Metrics with Job Counters; Defensive Programming in MapReduce; Summary; Chapter 6: Automating Data Processing with Oozie; Getting to Know Oozie; Oozie Workflow; Executing Asynchronous Activities in Oozie Workflow; Oozie Recovery Capabilities; Oozie Workflow Job Life Cycle; Oozie Coordinator; Oozie Bundle 327 $aOozie Parameterization with Expression Language 330 $aThe go-to guidebook for 
deploying Big Data solutions with Hadoop. Today's enterprise architects need to understand how the Hadoop frameworks and APIs fit together, and how they can be integrated to deliver real-world solutions. This book is a practical, detailed guide to building and implementing those solutions, with code-level instruction in the popular Wrox tradition. It covers storing data with HDFS and HBase, processing data with MapReduce, and automating data processing with Oozie. Hadoop security, running Hadoop with Amazon Web Services, best practices, and automating Hadoop 606 $aElectronic data processing$xDistributed processing 606 $aFile organization (Computer science) 606 $aCloud computing 615 0$aElectronic data processing$xDistributed processing. 615 0$aFile organization (Computer science) 615 0$aCloud computing. 676 $a005.74 700 $aLublinsky$b Boris$0760142 701 $aSmith$b Kevin T$0760143 701 $aYakubovich$b Alexey$0760144 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910807848203321 996 $aProfessional Hadoop solutions$91537544 997 $aUNINA