1.

Record No.

UNINA9910818396703321

Author

Zhao Wenbing, Ph.D

Title

Building dependable distributed systems / Wenbing Zhao

Publication/distribution/printing

Hoboken, New Jersey : Scrivener Publishing : Wiley, 2014

©2014

ISBN

1-118-91263-2

1-118-91274-8

1-118-91270-5

Physical description

1 online resource (370 p.)

Series

Performability Engineering Series

Classification

COM051230

Discipline

004/.36

Subjects

Electronic data processing - Distributed processing

Computer systems - Design and construction

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

General notes

"Published simultaneously in Canada"--Title page verso.

Bibliography note

Includes bibliographical references and index.

Contents note

1. Introduction to Dependable Distributed Computing -- 2. Logging and Checkpointing -- 3. Recovery-Oriented Computing -- 4. Data and Service Replication -- 5. Group Communication Systems -- 6. Consensus and the Paxos Algorithms -- 7. Byzantine Fault Tolerance -- 8. Application-Aware Byzantine Fault Tolerance.

Summary/abstract

"This book covers the most essential techniques for designing and building dependable distributed systems. Instead of covering a broad range of research works for each dependability strategy, the book focuses on only a selected few (usually the most seminal works, the most practical approaches, or the first publication of each approach), which are explained in depth, usually with a comprehensive set of examples. The goal is to dissect each technique thoroughly so that readers who are not familiar with dependable distributed computing can actually grasp the technique after studying the book. The book contains eight chapters. The first chapter introduces the basic concepts and terminology of dependable distributed computing, and also provides an overview of the primary means for achieving dependability. The second chapter describes in detail the checkpointing and logging mechanisms, which are the most commonly used means to achieve a limited degree of fault tolerance. Such mechanisms also serve as the foundation for more sophisticated dependability solutions. Chapter three covers the works on recovery-oriented computing, which focus on practical techniques that reduce fault detection and recovery times for Internet-based applications. Chapter four outlines the replication techniques for data and service fault tolerance. This chapter also pays particular attention to optimistic replication and the CAP theorem. Chapter five explains a few seminal works on group communication systems. Chapter six introduces the distributed consensus problem and covers a number of Paxos family algorithms in depth. Chapter seven introduces the Byzantine generals problem and its latest solutions, including the seminal Practical Byzantine Fault Tolerance (PBFT) algorithm and a number of its derivatives. The final chapter covers the latest research results on application-aware Byzantine fault tolerance, which is an important step toward practical use of Byzantine fault tolerance techniques"--