Hadoop Beginner's Guide [electronic resource] |
Author | Turkington, Garry |
Edition | [1st edition] |
Publication/distribution/printing | Birmingham : Packt Publishing, 2013 |
Physical description | 1 online resource (398 p.) |
Discipline | 005.74 |
Topical subject |
Apache Hadoop (Computer file)
Database management
Electronic data processing -- Distributed processing
File organization (Computer science)
Hadoop (Computer program) |
Genre/form subject | Electronic books. |
ISBN |
1-5231-3215-9
1-68015-106-1
1-299-26161-2
1-84951-731-2 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Contents note |
Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface;
Chapter 1: What It's All About; Big data processing; The value of data; Historically for the few and not the many; Classic data processing systems; Limiting factors; A different approach; All roads lead to scale-out; Share nothing; Expect failure; Smart software, dumb hardware; Move processing, not data; Build applications, not infrastructure; Hadoop; Thanks Google; Thanks Doug; Thanks Yahoo; Parts of Hadoop; Common building blocks; HDFS; MapReduce; Better together; Common architecture; What it is and isn't good for; Cloud computing with Amazon Web Services; Too many clouds; A third way; Different types of cost; AWS - infrastructure on demand from Amazon; Elastic Compute Cloud (EC2); Simple Storage Service (S3); Elastic MapReduce (EMR); What this book covers; A dual approach; Summary;
Chapter 2: Getting Hadoop Up and Running; Hadoop on a local Ubuntu host; Other operating systems; Time for action - checking the prerequisites; Setting up Hadoop; A note on versions; Time for action - downloading Hadoop; Time for action - setting up SSH; Configuring and running Hadoop; Time for action - using Hadoop to calculate Pi; Three modes; Time for action - configuring the pseudo-distributed mode; Configuring the base directory and formatting the filesystem; Time for action - changing the base HDFS directory; Time for action - formatting the NameNode; Starting and using Hadoop; Time for action - starting Hadoop; Time for action - using HDFS; Time for action - WordCount, the Hello World of MapReduce; Monitoring Hadoop from the browser; The HDFS web UI; Using Elastic MapReduce; Setting up an account on Amazon Web Services; Creating an AWS account; Signing up for the necessary services; Time for action - WordCount on EMR using the management console; Other ways of using EMR; AWS credentials; The EMR command-line tools; The AWS ecosystem; Comparison of local versus EMR Hadoop; Summary;
Chapter 3: Understanding MapReduce; Key/value pairs; What does it mean; Why key/value data?; Some real-world examples; MapReduce as a series of key/value transformations; The Hadoop Java API for MapReduce; The 0.20 MapReduce Java API; The Mapper class; The Reducer class; The Driver class; Writing MapReduce programs; Time for action - setting up the classpath; Time for action - implementing WordCount; Time for action - building a JAR file; Time for action - running WordCount on a local Hadoop cluster; Time for action - running WordCount on EMR; The pre-0.20 Java MapReduce API; Hadoop-provided mapper and reducer implementations; Time for action - WordCount the easy way; Walking through a run of WordCount; Startup; Splitting the input; Task assignment; Task startup; Ongoing JobTracker monitoring; Mapper input; Mapper execution; Mapper output and reduce input; Partitioning; The optional partition function; Reducer input |
Record No. | UNINA-9910465656103321 |
Find it here: Univ. Federico II |
|
Hadoop Beginner's Guide [electronic resource] |
Author | Turkington, Garry |
Edition | [1st edition] |
Publication/distribution/printing | Birmingham : Packt Publishing, 2013 |
Physical description | 1 online resource (398 p.) |
Discipline | 005.74 |
Topical subject |
Apache Hadoop (Computer file)
Database management
Electronic data processing -- Distributed processing
File organization (Computer science)
Hadoop (Computer program) |
ISBN |
1-5231-3215-9
1-68015-106-1
1-299-26161-2
1-84951-731-2 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Record No. | UNINA-9910792067903321 |
Find it here: Univ. Federico II |
|
Hadoop Beginner's Guide |
Author | Turkington, Garry |
Edition | [1st edition] |
Publication/distribution/printing | Birmingham : Packt Publishing, 2013 |
Physical description | 1 online resource (398 p.) |
Discipline | 005.74 |
Topical subject |
Apache Hadoop (Computer file)
Database management
Electronic data processing -- Distributed processing
File organization (Computer science)
Hadoop (Computer program) |
ISBN |
1-5231-3215-9
1-68015-106-1
1-299-26161-2
1-84951-731-2 |
Format | Printed material |
Bibliographic level | Monograph |
Language of publication | eng |
Record No. | UNINA-9910823155903321 |
Find it here: Univ. Federico II |
|