Record No.

UNINA9910874658003321

Author

Schagaev, Igor

Title

Software Design for Resilient Computer Systems / by Igor Schagaev, Jürg Gutknecht

Publication/distribution/printing

Cham : Springer International Publishing : Imprint: Springer, 2024

ISBN

9783031551390

9783031551383

Edition

[3rd ed. 2024.]

Physical description

1 online resource (414 pages)

Other authors (Persons)

Gutknecht, Jürg

Discipline

005.30287

Subjects

Telecommunication

Electronic circuits

Software engineering

Computers

Security systems

Communications Engineering, Networks

Electronic Circuits and Systems

Software Engineering

Hardware Performance and Reliability

Security Science and Technology

Language of publication

English

Format

Printed material

Bibliographic level

Monograph

Contents note

Introduction -- Hardware faults -- Fault tolerance -- Generalized algorithm of fault tolerance -- GAFT generalization: a principle and model of active system safety -- System software support for hardware deficiency -- Testing, Checking and Hardware Syndrome -- Recovery preparation -- Searching and monitoring of correct software states -- Recovery algorithms.

Summary/abstract

This book addresses how system software should be designed to account for faults, and which fault tolerance features provide the highest reliability. This third edition of Software Design for Resilient Computer Systems is thoroughly updated with the newest advice on software resilience. With a new introductory chapter, the new edition is ideal for researchers and industry professionals. The authors first show how system software interacts with the hardware to tolerate faults. They analyze and further develop the theory of fault tolerance to understand the diverse ways to increase the reliability of a system, with special attention to the role of system software in this process. They introduce the theory of redundancy and its use for constructing a subsystem through the generalised algorithm of fault tolerance (GAFT), and apply it to distributed systems. The book’s approach is applied to various hardware subsystems, namely different structures of RAM and processor cores, and demonstrates exceptional performance, reliability, and energy efficiency. This third edition devotes substantial attention to system software for modern computers, including run-time systems, supporting recovery algorithms and their analysis, language aspects, and ways to improve reconfigurable and parallel computing. Because of its broad scope, the book is relevant to a host of industries and research areas, including military, aviation, intensive health care, industrial control, and space exploration.