Reliability Analysis in Distributed Systems

C. S. Raghavendra, V. K. Prasanna Kumar, S. Hariri

Research output: Contribution to journal › Article › peer-review

55 Scopus citations

Abstract

The reliability of a distributed processing system is an important design parameter. It depends on the reliability of the processing elements and communication links, and on the redundancy of programs and data files. Traditional terminal-pair reliability does not capture the redundancy of programs and files in a distributed system. Two reliability measures are introduced: distributed program reliability, the probability that a program requiring the cooperation of several computers executes successfully, and distributed system reliability, the probability that all of the distributed programs specified for the system are operational. Both measures can be extended to incorporate the effects of user sites on reliability. We develop an efficient, unified approach based on graph traversal to evaluate the proposed reliability measures.
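
The two measures defined in the abstract can be illustrated with a small brute-force sketch. This is not the graph-traversal method of the paper; it simply enumerates every up/down combination of links, which is exponential in the number of links and only practical for toy networks. The model is an assumption made for illustration: nodes are taken to be perfectly reliable, links fail independently, and a program is considered operational when some node hosting it can reach at least one copy of every file it needs. All function and parameter names (`reachable`, `program_runs`, `system_reliability`, `program_reliability`, `link_rel`, `file_sites`) are hypothetical.

```python
from itertools import product


def reachable(start, up_links):
    """Return the set of nodes reachable from `start` over working links."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for a, b in up_links:
            # Links are undirected: follow the link from whichever end we are on.
            nxt = b if a == node else a if b == node else None
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def program_runs(hosts, needed_files, file_sites, up_links):
    """True if some host can reach at least one copy of every needed file."""
    for host in hosts:
        component = reachable(host, up_links)
        if all(file_sites[f] & component for f in needed_files):
            return True
    return False


def system_reliability(link_rel, programs, file_sites):
    """Probability that every program can run (exhaustive enumeration).

    link_rel:   dict mapping (node_a, node_b) -> probability the link works
    programs:   list of (hosts, needed_files) pairs
    file_sites: dict mapping file name -> set of nodes holding a copy
    """
    links = list(link_rel)
    total = 0.0
    for state in product([True, False], repeat=len(links)):
        prob = 1.0
        up_links = []
        for link, ok in zip(links, state):
            prob *= link_rel[link] if ok else 1.0 - link_rel[link]
            if ok:
                up_links.append(link)
        if all(program_runs(h, nf, file_sites, up_links) for h, nf in programs):
            total += prob
    return total


def program_reliability(link_rel, hosts, needed_files, file_sites):
    """Reliability of a single program: the one-program special case."""
    return system_reliability(link_rel, [(hosts, needed_files)], file_sites)


# Toy triangle network; every link works with probability 0.9 (assumed values).
triangle = {(1, 2): 0.9, (2, 3): 0.9, (1, 3): 0.9}
single_copy = program_reliability(triangle, {1}, ["F1"], {"F1": {3}})
two_copies = program_reliability(triangle, {1}, ["F1"], {"F1": {2, 3}})
```

With a single copy of `F1` on node 3, the result coincides with the terminal-pair reliability between nodes 1 and 3 (about 0.981). Adding a second copy on node 2 raises it to about 0.99, which is the kind of file redundancy that terminal-pair reliability alone does not express.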

Original language: English (US)
Pages (from-to): 352-358
Number of pages: 7
Journal: IEEE Transactions on Computers
Volume: 37
Issue number: 3
State: Published - Mar 1988
Externally published: Yes

Keywords

  • Allocation of files
  • distributed program
  • distributed system
  • file spanning tree
  • reliability analysis

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
