1.

Record No.

UNINA9910254754603321

Author

Vohra, Deepak

Title

Practical Hadoop Ecosystem [electronic resource] : A Definitive Guide to Hadoop-Related Frameworks and Tools / by Deepak Vohra

Publication/distribution/printing

Berkeley, CA : Apress : Imprint: Apress, 2016

ISBN

1-4842-2199-0

Edition

[1st ed. 2016.]

Physical description

1 online resource (XX, 421 p. 311 illus., 293 illus. in color.)

Classification

004.36

Subjects

Big data

Database management

Big Data

Database Management

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

General notes

Includes index.

Contents note

Part I. Fundamentals -- Introduction -- 1. HDFS and MapReduce -- Part II. Storing & Querying -- 2. Apache Hive -- 3. Apache HBase -- Part III. Bulk Transferring & Streaming -- 4. Apache Sqoop -- 5. Apache Flume -- Part IV. Serializing -- 6. Apache Avro -- 7. Apache Parquet -- Part V. Messaging & Indexing -- 8. Apache Kafka -- 9. Apache Solr -- 10. Apache Mahout.

Summary/abstract

This book is a practical guide to using the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most are based on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform. What you'll learn: how to set up an environment in Linux for Hadoop projects using the Cloudera Hadoop distribution CDH 5; how to run a MapReduce job; how to store data with Apache Hive and Apache HBase; how to index data in HDFS with Apache Solr; how to develop a Kafka messaging system; how to develop a Mahout user recommender system; how to stream logs to HDFS with Apache Flume; how to transfer data from a MySQL database to Hive, HDFS, and HBase with Sqoop; and how to create a Hive table over Apache Solr.