
Hadoop Beginner's Guide [electronic resource]




Author: Turkington, Garry
Title: Hadoop Beginner's Guide [electronic resource]
Publication: Birmingham : Packt Publishing, 2013
Edition: 1st edition
Physical description: 1 online resource (398 p.)
Discipline: 005.74
Topical subject: Apache Hadoop (Computer file)
Database management
Electronic data processing -- Distributed processing
File organization (Computer science)
Hadoop (Computer program)
General notes: Description based upon print version of record.
Contents note: Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: What It's All About; Big data processing; The value of data; Historically for the few and not the many; Classic data processing systems; Limiting factors; A different approach; All roads lead to scale-out; Share nothing; Expect failure; Smart software, dumb hardware; Move processing, not data; Build applications, not infrastructure; Hadoop; Thanks Google; Thanks Doug; Thanks Yahoo; Parts of Hadoop; Common building blocks; HDFS; MapReduce; Better together
Common architecture; What it is and isn't good for; Cloud computing with Amazon Web Services; Too many clouds; A third way; Different types of cost; AWS - infrastructure on demand from Amazon; Elastic Compute Cloud (EC2); Simple Storage Service (S3); Elastic MapReduce (EMR); What this book covers; A dual approach; Summary; Chapter 2: Getting Hadoop Up and Running; Hadoop on a local Ubuntu host; Other operating systems; Time for action - checking the prerequisites; Setting up Hadoop; A note on versions; Time for action - downloading Hadoop; Time for action - setting up SSH
Configuring and running Hadoop; Time for action - using Hadoop to calculate Pi; Three modes; Time for action - configuring the pseudo-distributed mode; Configuring the base directory and formatting the filesystem; Time for action - changing the base HDFS directory; Time for action - formatting the NameNode; Starting and using Hadoop; Time for action - starting Hadoop; Time for action - using HDFS; Time for action - WordCount, the Hello World of MapReduce; Monitoring Hadoop from the browser; The HDFS web UI; Using Elastic MapReduce; Setting up an account on Amazon Web Services
Creating an AWS account; Signing up for the necessary services; Time for action - WordCount on EMR using the management console; Other ways of using EMR; AWS credentials; The EMR command-line tools; The AWS ecosystem; Comparison of local versus EMR Hadoop; Summary; Chapter 3: Understanding MapReduce; Key/value pairs; What does it mean; Why key/value data?; Some real-world examples; MapReduce as a series of key/value transformations; The Hadoop Java API for MapReduce; The 0.20 MapReduce Java API; The Mapper class; The Reducer class; The Driver class; Writing MapReduce programs
Time for action - setting up the classpath; Time for action - implementing WordCount; Time for action - building a JAR file; Time for action - running WordCount on a local Hadoop cluster; Time for action - running WordCount on EMR; The pre-0.20 Java MapReduce API; Hadoop-provided mapper and reducer implementations; Time for action - WordCount the easy way; Walking through a run of WordCount; Startup; Splitting the input; Task assignment; Task startup; Ongoing JobTracker monitoring; Mapper input; Mapper execution; Mapper output and reduce input; Partitioning; The optional partition function
Reducer input
Summary: As a Packt Beginner's Guide, the book is packed with clear step-by-step instructions for performing the most useful tasks, getting you up and running quickly, and learning by doing. This book assumes no existing experience with Hadoop or cloud services. It assumes you have familiarity with a programming language such as Java or Ruby but gives you the needed background on the other topics.
Authorized title: Hadoop Beginner's Guide
ISBN: 1-5231-3215-9
1-68015-106-1
1-299-26161-2
1-84951-731-2
Format: Printed material
Bibliographic level: Monograph
Language of publication: English
Record no.: 9910792067903321
Available at: Univ. Federico II