Message-Optimal Protocols for Fault-Tolerant Broadcasts/Multicasts in
Distributed Systems with Crash Failures
Hong-Yi Tzeng and Kai-Yeung Siu
IEEE TRANSACTIONS ON COMPUTERS, Vol.44, No. 2,
FEB,1995 page:346-352.
Abstract
An essential feature in any fault-tolerant design of distributed systems is a mechanism by which a process can reliably broadcast information to other processes of failures. This paper studies the message complexity of fault-tolerant broadcast protocols in weakly synchronous and totally asynchronous distributed systems with point-to-point communication links, where the system failures are caused by the processes but the communication links are completely reliable. We focus on the number of messages required of any fault-tolerant protocol in failure-free execution. We present protocols that, in an n-process system subject to at most t crash failures where 1<= t < (n-1), guarantee the delivery of a message from any process to other nonfaulty processes. In
the absence of crash failure, our protocols require (n+t-1) messages in the weakly synchronous model and (t+1)(n-1-(t/2)) messages in the totally asynchronous model.
Content:
Definitions and Notations
In our model, we assume that a set of n processes are connected by a complete point -to-point network such that any two processes can exchange messages directly, but each process can only deliver at most one message to another process at a time. .
On this paper we only consider crash failures.
Crash failures: At any time a process may stop sending and receiving message.
We now define the problem of reliable broadcast. To facilitate the presentation, we assume that each process p consists of a failure-free write-once storage W(p) which is always available even if p is faulty.
A protocol is called a reliable broadcast protocol if it satisfies the following (RB) conditions:
RB1. Agreement : if any process(correct of faulty) accepts M, all corrects accept M.
RB2. Validity: if a process p does not fail while broadcasting M or if any correct process receives M, all nonfaulty processes accept M.
RB3. Integrity: For any message M, every process accepts M at most once and only if some process broadcast M.
RB4. Termination: The above three conditions RB1.RB2.and RB3.are satisfied eventually.
A reliable broadcast protocol is said to be t-resilient if it can tolerate up to t crash failures and satisfy all the above conditions.
The Weakly synchronous model.
In the weakly synchronous model, one can assume a constant
d such that the processing and delivery of a message can be completed within d units of time.
The Totally asynchronous model.
Created by: Jain-Ming Chen
Date:Apr.17, 1997