Dual Controller HA™ is an extension to our existing SoftNAS Cloud® high availability solution, SNAP HA™. It is designed to provide high availability for a shared pool of object storage only.
Adding a device to a dedicated storage pool results in the pool being replicated in the usual way, via SyncImage and asynchronous SnapReplicate ZFS send/receive once per minute, ensuring a copy of the pool’s data is maintained on the target node. HA failover operates as always, with dedicated storage devices and pools on each node having their own distinct, non-shared data that requires replication for use in HA (original design of SNAP HA). SoftNAS SNAP HA™ provides NFS, CIFS and iSCSI services via redundant storage controllers. One controller is active, while another is a standby controller. As only one controller is active at a time, this can be considered single-controller HA.
Dual Controller HA™, on the other hand, applies only if a shared pool of object storage, such as AWS S3, or Azure Hot or Cool blob storage, is specified at storage pool creation. After adding object storage 'disks' via Disk Devices, and selecting Create in Storage Pools, the following dialog will appear. If Shared Storage is selected, Dual Controller HA™ will automatically be applied to the shared pool after SNAP HA™ is configured.
Shared pools operate very differently from dedicated pools from an HA perspective. First, underlying storage devices are shared across nodes. Such shared devices (e.g., S3 cloud disks, Azure Hot and Cool Blob storage) include their own data redundancy and are typically accessed over a network connection, which enables them to be shared across two or more nodes (only two nodes are currently supported).
A second major difference is the take-over process for shared pools. Volume configuration files are replicated between both the primary and secondary controller (hence Dual Controller). Failover is initiated at the point the primary controller fails to reply to an IO request within the expected time frame.
During a take-over event, the devices associated with a shared pool must first be mounted by the target node (and, if required by the device type, disconnected or unmounted from the original node). Next, the shared pool is imported using the ZFS import command, and the result is verified: the pool must have been imported successfully and must not be degraded or faulted. Both debug/trace and info/error logging are written to the existing HA log files, so that it is possible to troubleshoot and provide support in the field if errors or issues arise.
With this method of failover:
- Very little data needs to be transferred for fail-over to occur.
- There is no need to create duplicate pools of already resilient object storage.
- No potential loss of transactional data occurs due to standard SNAP HA asynchronous replication delays.
To determine if Dual Controller HA is right for your deployment, see /wiki/spaces/SD/pages/92995970.
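The import-and-verify part of the take-over sequence described above can be sketched as follows. This is a minimal illustration, not the SoftNAS implementation: the function names `run_cmd` and `take_over_pool` are hypothetical, device mounting is left out because it depends on the device type, and the only commands used are the standard `zpool import` and `zpool list -H -o health`.

```python
import subprocess


def run_cmd(args):
    """Run a command and return its stdout; raises if the command fails."""
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


def take_over_pool(pool, run=run_cmd):
    """Import a shared pool on the target node and verify its health.

    `run` is injectable so the logic can be exercised without ZFS installed.
    """
    # Step 1: import the pool (its devices are assumed to be attached already).
    run(["zpool", "import", pool])
    # Step 2: read the pool health; -H drops headers, -o selects the column.
    health = run(["zpool", "list", "-H", "-o", "health", pool]).strip()
    # Step 3: treat anything other than ONLINE (e.g. DEGRADED, FAULTED) as a failure.
    if health != "ONLINE":
        raise RuntimeError(f"pool {pool} imported but health is {health}")
    return health
```

On a node with ZFS installed, `take_over_pool("sharedpool")` would run the two commands in order; a real take-over would also log each step to the HA log files.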
No change to Dedicated Pools
As stated above, Dual Controller HA does not change the way SNAP HA is configured, nor does it change how it operates for dedicated pools. SoftNAS has worked very hard to ensure that this feature is a seamless addition, with little to no change to existing functionality or configuration.
Regardless of whether it is a shared pool or dedicated, the customer must first define a SnapReplicate™ relationship between the primary and secondary node, then add the SNAP HA relationship. In other words, there is no change to the SnapReplicate/SNAP HA process shown below.
Adding a device to a shared storage pool results in the pool being excluded (skipped) by SnapReplicate; i.e., the data on the underlying device is already shared across nodes, so there is no need to replicate shared storage pools. This involves a change in SnapReplicate’s “pool discovery” logic, forcing it to first read the sharedpools.xml file to get the list of shared pool names, then exclude those pools from the list of pools to be replicated (similar to how pool names not found on the target node get excluded).
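The exclusion step in pool discovery can be pictured with a short sketch. The layout of sharedpools.xml is not documented here, so the `<sharedpools><pool name="..."/></sharedpools>` structure below is an assumption made purely for illustration, as are the function names.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents; the real sharedpools.xml schema may differ.
EXAMPLE_XML = '<sharedpools><pool name="s3pool"/></sharedpools>'


def shared_pool_names(xml_text):
    """Return the set of pool names listed in the (assumed) XML layout."""
    root = ET.fromstring(xml_text)
    return {el.get("name") for el in root.iter("pool") if el.get("name")}


def pools_to_replicate(source_pools, target_pools, xml_text):
    """Keep only pools that are not shared and that also exist on the target."""
    shared = shared_pool_names(xml_text)
    return [p for p in source_pools if p not in shared and p in target_pools]
```

With source pools `["pool0", "s3pool", "pool1"]` and target pools `{"pool0", "s3pool"}`, only `pool0` remains: `s3pool` is skipped because it is shared, and `pool1` because it is not found on the target.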
Configuring SnapReplicate™
Having prepared the environment on both AWS SoftNAS instances, we can now set up high availability. The first step towards high availability in SoftNAS is to establish replication. SnapReplicate™ makes this as simple as completing a quick wizard.
To establish the secure SnapReplicate relationship between two SoftNAS Cloud® nodes, simply follow the steps given below:
- On your primary instance (source controller), navigate to SnapReplicate/SNAP HA.
- Click the Add Replication button in the Replication Control Panel.
- From the Add Replication wizard, read the Instructions and then click the Next button.
- Enter the IP address of your secondary instance (Target) in the IP address text box. This step specifies the network path that SnapReplicate™ traffic will take.
- Once done, click the Next button.
Note |
---|
The source node must be able to connect via HTTPS to the target node (similar to how the browser user logs into StorageCenter using HTTPS). HTTPS is used to create the initial SnapReplicate configuration. Next, several SSH sessions are established to ensure that two-way communication between the nodes is possible. When connecting two Amazon EC2 nodes, it is best to use the internal instance IP addresses, as traffic is routed internally by default between instances in EC2. |
Note |
---|
If you have not yet done so, the Security Group on each instance should be configured with the internal IP addresses of the paired instance (the source instance should recognize traffic from the target instance, and the target instance should recognize traffic from the source) to ensure both HTTPS and SSH traffic between instances is recognized. See AWS Getting Started - Network Configuration (Security Groups) to learn more. |
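Before running the wizard, it can help to confirm that the HTTPS and SSH ports on the paired node are reachable. The sketch below is a generic TCP reachability check, not a SoftNAS tool; the function names are hypothetical, and ports 443 and 22 are assumed to be the standard HTTPS and SSH ports.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_target(host, ports=(443, 22)):
    """Map each port (HTTPS and SSH by default) to its reachability."""
    return {port: port_open(host, port) for port in ports}
```

Run from the source node against the target's internal IP address, and again in the opposite direction, since the SSH sessions must work both ways. A `False` entry usually points to a Security Group rule that does not yet allow the paired instance.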
- Enter the username for the secondary instance (target) in the Remote admin user ID text box.
Info The username should be softnas.
- Enter the password for the secondary instance (target) and verify it in the appropriate text boxes.
Info Unless changed, by default, the password will be the Instance ID of the target instance.
- Once done, click the Next button.
- The IP address/DNS name and login credentials of the target node will be verified. If there is a problem, an error message will be displayed. Click the Previous button to make the necessary corrections and then click the Next button to continue.
- Read the Finish Replication Setup instructions and click the Finish button.
- The SnapReplicate relationship between the two SoftNAS controller nodes will be established. The corresponding SyncImage of the SnapReplicate will be displayed.
- After data from the volumes on the source node is mirrored to the target, once per minute SnapReplicate transfers keep the target node hot with data block changes from the source volumes.
- The tasks and an event log will be displayed in the Replication Control Panel section. This indicates that a SnapReplicate relationship is established and that replication should be taking place.
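For readers unfamiliar with ZFS send/receive, the once-per-minute transfer of block changes resembles a generic incremental ZFS replication. The sketch below only builds the command line for such a transfer. It is not SnapReplicate's actual command, and the dataset, snapshot names, and function name are hypothetical.

```python
import shlex


def incremental_send_command(dataset, prev_snap, new_snap, target_host):
    """Build a shell pipeline that sends the changes between two snapshots.

    `zfs send -i` emits only the blocks changed since prev_snap, and
    `zfs receive -F` applies them on the target, rolling back any local drift.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", target_host, "zfs", "receive", "-F", dataset]
    return shlex.join(send) + " | " + shlex.join(recv)


print(incremental_send_command("pool0/vol1", "t1", "t2", "10.0.0.2"))
```

Because only changed blocks travel over the network, each cycle stays small once the initial mirror has completed.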
Configuring SNAP HA™
SnapReplicate™ establishes a replication relationship, one that can be manually triggered or scheduled, but is not automated. For true high availability in a failover situation, SNAP HA™ must be configured as well.
- While still on the SnapReplicate/SNAP HA page, click the Add SNAP HA button in the Replication Control Panel.
Note |
---|
Configuration of SnapReplicate™ is a prerequisite to setup of SNAP HA™. If SnapReplicate™ is not configured, the Add SNAP HA™ button will be grayed out. |
Note |
---|
If you have not yet configured a notification email, you will need to provide one prior to continuing SNAP HA™. |
- Enter your e-mail into the Email text box.
- Once done, click the OK button.
- From the Add High Availability wizard, read the Instructions and then click the Next button.
- Enter a Virtual IP into the Virtual IP text box.
- Once done, click the Next button.
Note |
---|
The Virtual IP is a human-configured IP address (you choose the IP). It can be any IP address that falls outside the CIDR block of the IP addresses of the two SoftNAS HA paired nodes. |
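A candidate Virtual IP can be checked against the CIDR rule with Python's standard `ipaddress` module. This is a small helper sketch; the function name and the example addresses are illustrative only.

```python
import ipaddress


def vip_is_outside(vip, cidr_blocks):
    """Return True if the Virtual IP falls outside every given CIDR block."""
    addr = ipaddress.ip_address(vip)
    return all(addr not in ipaddress.ip_network(cidr) for cidr in cidr_blocks)


# Example: both nodes live in 10.0.0.0/16 (hypothetical VPC range).
print(vip_is_outside("50.50.50.50", ["10.0.0.0/16"]))  # True: acceptable
print(vip_is_outside("10.0.1.5", ["10.0.0.0/16"]))     # False: inside the block
```

Pass the CIDR block (or blocks) that contain the two paired nodes' IP addresses.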
Provide your AWS Access Key and AWS Secret Key in the appropriate text boxes.
Note In most cases, your IAM Policy will handle this and you will not need to enter anything.
- Once done, click the Next button.
- In the HA Num of retries text box, enter the max number of retries before your virtual machine fails over.
- In the Storage timeout text box, enter the max time (in seconds) that storage can be unavailable before a failover is triggered.
- In the Max. ioping request time (ms) text box, enter the maximum time (in milliseconds) allowed for ioping requests, to ensure that a failover is triggered more quickly in the event of failure.
- From the HA node recovery mode section, select the behavior of the failed node during a failover.
- Once done, click the Next button.
- Click the Install button to start the HA Installation.
- Once complete, click the Next button.
- Read the Finish HA Setup instructions and click the Finish button.
- After completion, the High Availability SoftNAS pair is successfully set up across Availability Zones.
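To give a feel for how the retry count and the ioping request time entered in the wizard might interact, here is one plausible reading expressed as code. The actual SNAP HA decision logic is not described in this document, so the rule below (fail over once more than `max_retries` consecutive probes are slow or unanswered) is an assumption, and all names are hypothetical.

```python
def should_fail_over(probe_ms, max_retries, max_ioping_ms):
    """Decide whether a series of IO probes would trigger a failover.

    probe_ms: probe latencies in milliseconds, with None meaning no reply.
    A probe fails if it got no reply or exceeded max_ioping_ms. Failover is
    signalled when more than max_retries probes fail in a row (assumed rule).
    """
    consecutive = 0
    for latency in probe_ms:
        if latency is None or latency > max_ioping_ms:
            consecutive += 1
            if consecutive > max_retries:
                return True
        else:
            # A healthy probe resets the failure streak.
            consecutive = 0
    return False
```

Under this reading, lower values make failover faster but more sensitive to brief storage or network hiccups, so they should be tuned against the observed latency of the deployment.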