When configuring a VMware High Availability (HA) cluster, the configuration wizard lets you select one or more datastores as a secondary communication channel. VMware Datastore Heartbeating provides an additional way to determine whether a host is in a failed state or not.
Before vSphere 5.0, in vSphere 4.1, if a host had a hardware problem and failed, or if it was merely isolated on its management network, HA would restart the VMs that were running on that host. Even when the only problem was on the management network, HA would still trigger a restart of the VMs on another host in the cluster.
vSphere 5.0 introduced a new solution built around Fault Domain Manager (FDM), a Master and Slave architecture, and with it a more intelligent technique for detecting host failures: Datastore Heartbeat. VMware Datastore Heartbeating brings more resiliency.
If the Master cannot communicate with a slave (it does not receive the network heartbeat) but the heartbeat datastore still answers, the host is still working. In that case the host is partitioned from the network or isolated, not failed. The Datastore Heartbeat function helps greatly in telling the difference between a host that has failed and a host that has just been isolated from the others.
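The decision described above can be sketched in a few lines of Python. This is only a simplified illustration of the reasoning, not the actual FDM implementation; the names `HostState` and `classify_host` are made up for this example, and the two boolean inputs stand in for "the Master receives the network heartbeat" and "the heartbeat datastore answers".

```python
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


def classify_host(network_heartbeat: bool, datastore_heartbeat: bool) -> HostState:
    """Simplified view of how the Master could interpret the two heartbeats.

    Hypothetical helper: the real FDM logic involves more checks than this.
    """
    if network_heartbeat:
        # The Master still hears the slave over the management network.
        return HostState.LIVE
    if datastore_heartbeat:
        # No network heartbeat, but the heartbeat datastore still answers:
        # the host is running, so it is isolated or partitioned, not dead.
        return HostState.ISOLATED_OR_PARTITIONED
    # Neither channel answers: treat the host as failed.
    return HostState.FAILED
```

For example, `classify_host(False, True)` returns `HostState.ISOLATED_OR_PARTITIONED`, which is the case where restarting the VMs would be a false alarm.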
vCenter automatically selects at least two datastores from the shared datastores. It's preferable to have VMware Datastore Heartbeating selected on every NAS/SAN device you have. In my example above I have two shared datastores checked, each on a different storage device. vCenter gives you the option to specify alternative datastores, but you can only choose from datastores that are mounted by at least two hosts. You can always check later, in the properties of your HA cluster, to see which datastores have been selected and whether you can check additional datastores.
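As a rough illustration of the "mounted by at least two hosts" rule, here is a small Python sketch. It does not call any vSphere API; the function name and the datastore and host names are invented for the example, and the input is assumed to be a plain mapping from datastore name to the set of hosts that mount it.

```python
from typing import Dict, List, Set


def eligible_heartbeat_datastores(mounts: Dict[str, Set[str]]) -> List[str]:
    """Return the datastores mounted by at least two hosts, sorted by name.

    `mounts` maps a datastore name to the set of hosts that mount it.
    Hypothetical helper for illustration only.
    """
    return sorted(name for name, hosts in mounts.items() if len(hosts) >= 2)


# Made-up inventory: two shared datastores and one local datastore.
example_mounts = {
    "san-ds01": {"esxi-01", "esxi-02", "esxi-03"},
    "nas-ds01": {"esxi-01", "esxi-02"},
    "local-esxi-01": {"esxi-01"},
}
candidates = eligible_heartbeat_datastores(example_mounts)
```

Here the local datastore is filtered out because only one host mounts it, which matches why it would not be offered as a heartbeat datastore.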
Datastore Heartbeating helps avoid unnecessary restarts of VMs when only the management network has failed. The default number of heartbeat datastores is two, and the maximum valid value is five. You can override the default value with an advanced attribute: das.heartbeatdsperhost
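If you script your cluster settings, it can be handy to sanity-check the value before applying it. The sketch below only builds and validates a key/value pair for das.heartbeatdsperhost; it does not talk to vCenter. The function name is hypothetical, and treating two as the minimum is an assumption on my part, since the text above only states the default (two) and the maximum (five).

```python
from typing import Dict

DEFAULT_HEARTBEAT_DS = 2  # default stated in the article
MAX_HEARTBEAT_DS = 5      # maximum valid value stated in the article
MIN_HEARTBEAT_DS = 2      # assumption: the minimum equals the default


def heartbeat_ds_option(count: int = DEFAULT_HEARTBEAT_DS) -> Dict[str, str]:
    """Build the advanced option entry, rejecting out-of-range values.

    Hypothetical helper: it returns a plain dict and applies nothing.
    """
    if not MIN_HEARTBEAT_DS <= count <= MAX_HEARTBEAT_DS:
        raise ValueError(
            f"das.heartbeatdsperhost must be between {MIN_HEARTBEAT_DS} "
            f"and {MAX_HEARTBEAT_DS}, got {count}"
        )
    return {"das.heartbeatdsperhost": str(count)}
```

Calling `heartbeat_ds_option(4)` gives `{"das.heartbeatdsperhost": "4"}`, while a value such as 6 raises a `ValueError` instead of silently producing an invalid setting.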
venkat says
This is a very good article. Thanks for sharing this.
venkat says
I have a question.
How can the master node's datastore heartbeat work if the management network is down? Could you please help with this?
Horatiu says
Hi
I have a question about activating heartbeat datastores in the following scenario:
First of all, I have a new data center (Data center A) with 4 servers in a cluster configuration (HA, DRS and 2 datastores set for heartbeat) called Cluster A. All datastores are shared datastores, FC 16 GB. Everything works fine (HA/DRS). Now I have to create a new cluster (Cluster B) under the same data center (or a new one). That works fine until I need to choose which datastores will be used for HA, because I cannot choose any storage unit although it is visible in the configuration tab. For Cluster A I have 2 datastores for heartbeat (HB1/HB2, 10 GB) and for Cluster B I have 2 datastores for heartbeat (HB3/HB4, 10 GB). All of these heartbeat datastores are visible only in Cluster A (in the HA configuration); in Cluster B's HA configuration they are not visible. The Fibre Channel zoning configuration is the default (permit all hosts to the SAN).
Vladan SEGET says
Yes. It’s a per-cluster config, so basically in your situation you can pick 1 datastore per cluster to be used for heartbeating.
Mike says
Hello
We have 2 clusters and 5 datastores, visible from the heartbeat configuration tab. The option “Select any of the cluster datastores taking into account my preferences” is selected.
The question is: can I choose all 5 available datastores on each cluster, or is it better to choose datastores 1 and 2 for one cluster and datastores 3 and 4 for the second?
And what if two clusters are configured to share the same datastore for heartbeat?