Heartbeat network


In computer clusters, a heartbeat network is a private network that is shared only by the nodes in the cluster and is not accessible from outside it. The nodes use it to monitor one another's status and to exchange the messages needed to keep the cluster operating.
The heartbeat method relies on the FIFO (first in, first out) nature of the signals sent across the network. By making sure that all messages have been received, the system ensures that events can be properly ordered.
In this communications protocol, every node sends a message at a given interval, say delta, in effect confirming that it is alive and has a heartbeat. These messages are treated as control messages that help establish that no delayed messages remain in the network. A receiver node, called a "sync", maintains an ordered list of the received messages. Once a message with a timestamp later than a given marked time has been received from every node, the system concludes that all messages up to that time have been received, since the FIFO property ensures that each node's messages arrive in order.
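The completeness check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the names `Message`, `HeartbeatSink`, `complete_up_to` and `ordered_messages` are hypothetical, timestamps are plain numbers, and each node's channel to the receiver is assumed to deliver in FIFO order.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Message:
    # Only the timestamp takes part in ordering comparisons.
    timestamp: float
    sender: str = field(compare=False)
    # Hypothetical convention: a payload of None marks a pure heartbeat.
    payload: Optional[str] = field(default=None, compare=False)


class HeartbeatSink:
    """Receiver that tracks the latest timestamp seen from every node."""

    def __init__(self, nodes):
        # Before anything arrives, no node has confirmed any point in time.
        self.latest = {node: float("-inf") for node in nodes}
        self.received = []

    def receive(self, msg):
        # Record the newest timestamp seen from this sender.
        self.latest[msg.sender] = max(self.latest[msg.sender], msg.timestamp)
        if msg.payload is not None:
            self.received.append(msg)

    def complete_up_to(self, marked_time):
        # If every node has sent something later than marked_time, FIFO
        # delivery means nothing earlier can still be in flight.
        return all(ts > marked_time for ts in self.latest.values())

    def ordered_messages(self, marked_time):
        # Return the data messages up to marked_time in timestamp order,
        # or None while some node has not yet passed the marked time.
        if not self.complete_up_to(marked_time):
            return None
        return sorted(m for m in self.received if m.timestamp <= marked_time)
```

In this sketch, a node that has no data to send still advances its entry in `latest` by sending a heartbeat, which is what lets the receiver release the messages it has already buffered.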
In general, it is difficult to select a delta that is optimal for all applications. If delta is too small, the heartbeats add too much overhead; if it is too large, performance degrades because everything waits for the next heartbeat signal.
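The trade-off can be made concrete with a rough back-of-the-envelope calculation. The sketch below rests on simplifying assumptions that are not part of the protocol description: each node sends exactly one heartbeat per interval, network latency is ignored, and the function name `heartbeat_cost` is hypothetical.

```python
def heartbeat_cost(num_nodes, delta):
    """Return (heartbeats per second, approximate worst-case wait in seconds)."""
    # Assumption: one heartbeat per node per interval of length delta.
    messages_per_second = num_nodes / delta
    # Assumption: with latency ignored, a message waits at most one
    # interval for every node's next heartbeat to confirm it.
    worst_case_wait = delta
    return messages_per_second, worst_case_wait


for delta in (0.01, 0.1, 1.0):
    rate, wait = heartbeat_cost(num_nodes=8, delta=delta)
    print(f"delta={delta}s: {rate:.0f} heartbeats/s, wait up to about {wait}s")
```

Under these assumptions, shrinking delta by a factor of ten multiplies the control traffic by ten while cutting the wait by the same factor, which is why no single value suits every application.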