LEADER 05201nam 22006855 450 001 9910366591603321 005 20200704044032.0 010 $a3-030-21244-0 024 7 $a10.1007/978-3-030-21244-5 035 $a(CKB)4100000008701645 035 $a(MiAaPQ)EBC5825094 035 $a(DE-He213)978-3-030-21244-5 035 $a(PPN)238492044 035 $a(EXLCZ)994100000008701645 100 $a20190709d2020 u| 0 101 0 $aeng 135 $aurcnu|||||||| 181 $ctxt$2rdacontent 182 $cc$2rdamedia 183 $acr$2rdacarrier 200 10$aSoftware Design for Resilient Computer Systems /$fby Igor Schagaev, Eugene Zouev, Kaegi Thomas 205 $a2nd ed. 2020. 210 1$aCham :$cSpringer International Publishing :$cImprint: Springer,$d2020. 215 $a1 online resource (xiv, 214 pages) $cillustrations 311 $a3-030-21243-2 327 $aIntroduction -- Hardware Faults -- Fault Tolerance: Theory and Concepts -- Generalized Algorithm of Fault Tolerance (GAFT) -- GAFT Generalization: A Principle and Model of Active System Safety -- System Software Support for Hardware Deficiency: Function and Features -- Testing and Checking -- Recovery Preparation -- Recovery: Searching and Monitoring of Correct Software States -- Recovery Algorithms: An Analysis -- Programming Language for Safety Critical Systems -- Proposed Runtime System Structure -- Proposed Runtime System vs. Existing Approaches -- Hardware: The ERRIC Architecture -- Architecture Comparison and Evaluation -- Reliability of ERRIC -- Performance of ERRIC -- ERRIC Software -- How about resilience at large -- Map of Resilience. 330 $aThis book addresses the question of how system software should be designed to account for faults, and which fault tolerance features it should provide for highest reliability. With this second edition of Software Design for Resilient Computer Systems the book is thoroughly updated to contain the newest advice regarding software resilience. With additional chapters on computer system performance and system resilience, as well as online resources, the new edition is ideal for researchers and industry professionals. 
The authors first show how the system software interacts with the hardware to tolerate faults. They analyze and further develop the theory of fault tolerance to understand the different ways to increase the reliability of a system, with special attention to the role of system software in this process. They further develop the generalized algorithm of fault tolerance (GAFT) with its three main processes: hardware checking, preparation for recovery, and the recovery procedure. For each of the three processes, they analyze the requirements and properties theoretically and give possible implementation scenarios and the system software support required. Based on the theoretical results, the authors derive an Oberon-based programming language with direct support for the three processes of GAFT. In the last part of this book, they introduce a simulator and use it as a proof-of-concept implementation of a novel fault-tolerant processor architecture (ERRIC) and its newly developed runtime system, in terms of both features and performance. Due to the wide-reaching nature of the content, this book applies to a host of industries and research areas, including military, aviation, intensive health care, industrial control, and space exploration. 
606 $aElectrical engineering 606 $aElectronic circuits 606 $aSoftware engineering 606 $aComputer software--Reusability 606 $aQuality control 606 $aReliability 606 $aIndustrial safety 606 $aCommunications Engineering, Networks$3https://scigraph.springernature.com/ontologies/product-market-codes/T24035 606 $aCircuits and Systems$3https://scigraph.springernature.com/ontologies/product-market-codes/T24068 606 $aSoftware Engineering$3https://scigraph.springernature.com/ontologies/product-market-codes/I14029 606 $aPerformance and Reliability$3https://scigraph.springernature.com/ontologies/product-market-codes/I12077 606 $aQuality Control, Reliability, Safety and Risk$3https://scigraph.springernature.com/ontologies/product-market-codes/T22032 615 0$aElectrical engineering. 615 0$aElectronic circuits. 615 0$aSoftware engineering. 615 0$aComputer software--Reusability. 615 0$aQuality control. 615 0$aReliability. 615 0$aIndustrial safety. 615 14$aCommunications Engineering, Networks. 615 24$aCircuits and Systems. 615 24$aSoftware Engineering. 615 24$aPerformance and Reliability. 615 24$aQuality Control, Reliability, Safety and Risk. 676 $a005 676 $a005 700 $aSchagaev$b Igor$4aut$4http://id.loc.gov/vocabulary/relators/aut$0720771 702 $aZouev$b Eugene$4aut$4http://id.loc.gov/vocabulary/relators/aut 702 $aThomas$b Kaegi$4aut$4http://id.loc.gov/vocabulary/relators/aut 906 $aBOOK 912 $a9910366591603321 996 $aSoftware Design for Resilient Computer Systems$92527431 997 $aUNINA