SNAP HA on Azure
SnapReplicate™ lets you manually replicate volumes to another node when a problem occurs or maintenance is planned. SNAP HA™ automates this by establishing a heartbeat between the linked instances. If the heartbeat is not registered for more than a few moments, the other instance takes over, ensuring continued access to the provisioned data.
Configuring SnapReplicate™ is a prerequisite for setting up SNAP HA™. If SnapReplicate™ is not configured, the Add SNAP HA™ button is grayed out.
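The heartbeat behavior described above can be pictured with a minimal Python sketch. It is only an illustration of the general technique, not the product's implementation: the class name, function name, role strings, and the 5-second timeout are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from the peer node (illustrative only)."""
    timeout_s: float          # hypothetical: how long silence is tolerated
    last_seen_s: float = 0.0  # timestamp of the most recent heartbeat

    def record_heartbeat(self, now_s: float) -> None:
        # Called each time a heartbeat arrives from the peer.
        self.last_seen_s = now_s

    def peer_is_down(self, now_s: float) -> bool:
        # The peer counts as down once the silence exceeds the timeout.
        return (now_s - self.last_seen_s) > self.timeout_s


def next_role(current_role: str, monitor: HeartbeatMonitor, now_s: float) -> str:
    """A secondary promotes itself once the peer's heartbeat has lapsed."""
    if current_role == "secondary" and monitor.peer_is_down(now_s):
        return "primary"
    return current_role


# Example: last heartbeat at t=100 s, assumed timeout of 5 s.
monitor = HeartbeatMonitor(timeout_s=5.0)
monitor.record_heartbeat(100.0)
role_at_103 = next_role("secondary", monitor, 103.0)  # still within the timeout
role_at_106 = next_role("secondary", monitor, 106.0)  # heartbeat has lapsed
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to reason about.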
The update all subnets option is an advanced option that can be useful when you have a relatively small number of subnets. If hundreds of subnets are in use, leave this option unselected.
The Virtual IP on same subnet option should be checked only if you are using VNet peering. This is also an advanced option and is outside the scope of this guide.
Next, you can tune the behavior of your HA pairing.
You can set the maximum number of retries before your virtual machine fails over.
You can set the maximum time (in seconds) that storage can be unavailable before a failover is triggered.
You can also set a default for the maximum ioping request time, so that a failover is triggered more quickly in the event of a failure.
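As a rough illustration of how the three settings above might interact, the following Python sketch treats each one as an independent threshold. The field names, the example values, and the rule that exceeding any single threshold triggers a failover are assumptions for this example; they are not taken from the product and are not its defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaTuning:
    """Hypothetical container for the three tunable limits."""
    max_retries: int                  # failed checks tolerated before failover
    max_storage_unavailable_s: float  # seconds storage may be unavailable
    max_ioping_request_s: float       # longest acceptable ioping request time


def should_fail_over(tuning: HaTuning,
                     failed_checks: int,
                     storage_unavailable_s: float,
                     ioping_request_s: float) -> bool:
    """Return True when any observed value exceeds its configured limit.

    Assumption: the limits are combined with a logical OR and compared
    with a strict greater-than.
    """
    return (failed_checks > tuning.max_retries
            or storage_unavailable_s > tuning.max_storage_unavailable_s
            or ioping_request_s > tuning.max_ioping_request_s)


# Illustrative values only.
tuning = HaTuning(max_retries=3,
                  max_storage_unavailable_s=30.0,
                  max_ioping_request_s=5.0)
healthy = should_fail_over(tuning, failed_checks=1,
                           storage_unavailable_s=0.0, ioping_request_s=0.2)
slow_io = should_fail_over(tuning, failed_checks=0,
                           storage_unavailable_s=0.0, ioping_request_s=7.5)
```

Under these assumptions, lowering the ioping limit makes a failover trigger sooner when storage responds slowly, which matches the purpose described for that setting.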
Finally, you can choose how the failed node behaves during a failover.
Reboot - This is the default option. The failed node reboots and SNAP HA™ is reactivated with the original node set as secondary, allowing quicker recovery and re-establishment of high availability.
Shutdown - The failed node will remain shut down. You will need to reboot the instance manually to re-establish high availability.
None (No action taken) - This option is only for debug or support use. The failed node will remain in its current state.
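The three failed-node behaviors can be summarized in a small lookup, sketched here in Python. The enum and function names are hypothetical and exist only for this illustration; the outcome text paraphrases the option descriptions above.

```python
from enum import Enum


class FailedNodeAction(Enum):
    """Hypothetical names for the three options described above."""
    REBOOT = "reboot"
    SHUTDOWN = "shutdown"
    NONE = "none"


# Expected state of the failed node after a failover, per the descriptions above.
OUTCOME = {
    FailedNodeAction.REBOOT: "reboots and rejoins as secondary",
    FailedNodeAction.SHUTDOWN: "stays shut down until rebooted manually",
    FailedNodeAction.NONE: "remains in its current state (debug or support use)",
}


def describe(action: FailedNodeAction) -> str:
    """Return a one-line summary of what happens to the failed node."""
    return f"{action.value}: failed node {OUTCOME[action]}"


default_summary = describe(FailedNodeAction.REBOOT)
```

Reboot is the only option in this summary that restores high availability without manual intervention.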
Click Next to continue.
To test the configuration, shut down one of the instances. The other becomes primary after a few moments. Alternatively, select Actions, then Takeover, to simulate a failover. Afterward, follow the instructions in Recovering from a High Availability Failure to re-establish your highly available configuration.