LEADER 01083nam 2200361 450 001 9910583478703321 005 20230120002729.0 010 $a0-12-815337-7 010 $a0-12-815336-9 035 $a(CKB)4100000004096049 035 $a(MiAaPQ)EBC5404288 035 $a(EXLCZ)994100000004096049 100 $a20180618d2018 uy 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 12$aA practical guide to writing a Ruth L. Kirschstein NRSA grant /$fAndrew D. Hollenbach 210 1$aLondon :$cAcademic Press,$d[2018] 210 4$d©2018 215 $a1 online resource (146 pages) 606 $aScience$xResearch grants$vHandbooks, manuals, etc 615 0$aScience$xResearch grants 676 $a507.973 700 $aHollenbach$b Andrew D.$0976723 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9910583478703321 996 $aA practical guide to writing a Ruth L. Kirschstein NRSA grant$92224977 997 $aUNINA LEADER 05042nam 2200721 a 450 001 9911006763703321 005 20240313112325.0 010 $a1-62198-910-0 010 $a1-84951-913-7 010 $a1-299-18393-X 035 $a(CKB)2550000001005718 035 $a(EBL)1103987 035 $a(OCoLC)828794321 035 $a(SSID)ssj0000907255 035 $a(PQKBManifestationID)12469492 035 $a(PQKBTitleCode)TC0000907255 035 $a(PQKBWorkID)10884307 035 $a(PQKB)10906268 035 $a(MiAaPQ)EBC1103987 035 $a(CaSebORM)9781849519120 035 $a(Au-PeEL)EBL1103987 035 $a(CaPaEBR)ebr10659963 035 $a(CaONFJC)MIL449643 035 $z(PPN)22799860X 035 $a(PPN)167589679 035 $a(OCoLC)843959318 035 $a(OCoLC)ocn843959318 035 $a(EXLCZ)992550000001005718 100 $a20130307d2013 uy 0 101 0 $aeng 135 $aurunu||||| 181 $ctxt 182 $cc 183 $acr 200 10$aHadoop real-world solutions cookbook $eRealistic, simple code examples to solve problems at scale with Hadoop and related technologies /$fJonathan R. Owens, Jon Lentz, Brian Femiano 205 $a1st edition 210 $aBirmingham [England] $cPackt Pub.$d2013 215 $a1 online resource (316 p.) 300 $aIncludes index. 
311 $a1-84951-912-9 327 $aCover; Copyright; Credits; About the Authors; About the Reviewers; www.packtpub.com; Table of Contents; Preface; Chapter 1: Hadoop Distributed File System - Importing and Exporting Data; Introduction; Importing and exporting data into HDFS using Hadoop shell commands; Moving data efficiently between clusters using Distributed Copy; Importing data from MySQL into HDFS using Sqoop; Exporting data from HDFS into MySQL using Sqoop; Configuring Sqoop for Microsoft SQL Server; Exporting data from HDFS into MongoDB; Importing data from MongoDB into HDFS 327 $aExporting data from HDFS into MongoDB using Pig; Using HDFS in a Greenplum external table; Using Flume to load data into HDFS; Chapter 2: HDFS; Introduction; Reading and writing data to HDFS; Compressing data using LZO; Reading and writing data to SequenceFiles; Using Apache Avro to serialize data; Using Apache Thrift to serialize data; Using Protocol Buffers to serialize data; Setting the replication factor for HDFS; Setting the block size for HDFS; Chapter 3: Extracting and Transforming Data; Introduction; Transforming Apache logs into TSV format using MapReduce 327 $aUsing Apache Pig to filter bot traffic from web server logs; Using Apache Pig to sort web server log data by timestamp; Using Apache Pig to sessionize web server log data; Using Python to extend Apache Pig functionality; Using MapReduce and secondary sort to calculate page views; Using Hive and Python to clean and transform geographical event data; Using Python and Hadoop Streaming to perform a time series analytic; Using Multiple Outputs in MapReduce to name output files; Creating custom Hadoop Writable and InputFormat to read geographical event data 327 $aChapter 4: Performing Common Tasks Using Hive, Pig, and MapReduce; Introduction; Using Hive to map an external table over weblog data in HDFS; Using Hive to dynamically create tables from the results of a weblog query; Using the Hive string UDFs to concatenate fields in weblog 
data; Using Hive to intersect weblog IPs and determine the country; Generating n-grams over news archives using MapReduce; Using the distributed cache in MapReduce to find lines that contain matching keywords over news archives; Using Pig to load a table and perform a SELECT operation with GROUP BY 327 $aChapter 5: Advanced Joins; Introduction; Joining data in the Mapper using MapReduce; Joining data using Apache Pig replicated join; Joining sorted data using Apache Pig merge join; Joining skewed data using Apache Pig skewed join; Using a map-side join in Apache Hive to analyze geographical events; Using optimized full outer joins in Apache Hive to analyze geographical events; Joining data using an external key-value store (Redis); Chapter 6: Big Data Analysis; Introduction; Counting distinct IPs in web log data using MapReduce and Combiners 327 $aUsing Hive date UDFs to transform and sort event dates from geographic event data 330 $aRealistic, simple code examples to solve problems at scale with Hadoop and related technologies. 606 $aElectronic data processing$xDistributed processing 606 $aOpen source software 615 0$aElectronic data processing$xDistributed processing. 615 0$aOpen source software. 676 $a004.6 676 $a005.74 700 $aOwens$b Jonathan R$01823336 701 $aLentz$b Jon$0733032 701 $aFemiano$b Brian$01823337 801 0$bMiAaPQ 801 1$bMiAaPQ 801 2$bMiAaPQ 906 $aBOOK 912 $a9911006763703321 996 $aHadoop real-world solutions cookbook$94389933 997 $aUNINA