Azure – Creating a SQL AlwaysOn Failover Cluster on Azure using Internal Load Balancer (ILB)

I’ve spent a lot of time trying to make the process of building a SQL AlwaysOn failover cluster in Azure.  Due to the limitations of having to use DHCP this caused all sorts of problems when trying to replicate On-Premise installations in Azure.  I have now got a nice build process, there is a mixture of PowerShell and GUI.  It could all be done with PowerShell but I like to see what is going on sometimes hence my approach.

First off, I am assuming that you’ve built the two SQL Server nodes that are going to form part of the AlwaysOn failover cluster.  For information if you are doing this for the first time you need to install 2 standalone SQL installations.  There are some complex decisions to be made regarding the SQL design but for this post I am going to stick to the simple installation process.  So for my setup I have two Windows 2012R2 servers running SQL 2014 Enterprise.  Both servers have a default installation of SQL with a single default instance.

Pre-requisites:

– Azure PowerShell installed and connected to your subscription
– Servers must be in the same cloud service

Ok, lets go!

As mentioned above you should now have a couple of Windows servers with SQL installed on them  Check that the SQL Agent is on Auto start as this is normally set to manual, and start the service.

Assign reserved IP addresses to the servers using the following Azure PowerShell

Get-AzureVM -ServiceName CLOUDNAME -Name VMNAME | Set-AzureStaticVNetIP -IPAddress 10.151.129.102 | Update-AzureVM

Install Failover Cluster services on all nodes that will be part of the cluster

Install-WindowsFeature -Name Failover-Clustering –IncludeManagementTools
Install-WindowsFeature –Name RSAT-Clustering-PowerShell
Install-WindowsFeature –Name RSAT-Clustering-CmdInterface

Next we need to configure the firewall on the servers, this is done on the Domain profile Only
– SQL Servers – TCP1433, TCP59999 and TCP5022 (In/Out)
– SharePoint – TCP1433 (in/out)

To make this easier run the following NetSh Command to set firewall on all nodes:

netsh advfirewall firewall add rule name=”SQL” dir=in action=allow protocol=TCP localport=1433 profile=domain
netsh advfirewall firewall add rule name=”AlwaysOn Endpoint” dir=in action=allow protocol=TCP localport=5022 profile=domain
netsh advfirewall firewall add rule name=”AlwaysOn Probe” dir=in action=allow protocol=TCP localport=59999 profile=domain

netsh advfirewall firewall add rule name=”SQL” dir=out action=allow protocol=TCP localport=1433 profile=domain
netsh advfirewall firewall add rule name=”AlwaysOn Endpoint” dir=out action=allow protocol=TCP localport=5022 profile=domain
netsh advfirewall firewall add rule name=”AlwaysOn Probe” dir=out action=allow protocol=TCP localport=59999 profile=domain

Now we need to build Internal Load Balancer (ILB) and join the nodes to it, create a PS1 file from the following (change variables where applicable)

# Define variables
$ServiceName = “devrobsql” # the name of the cloud service that contains the availability group nodes
$AGNodes = “embo”,”elan” # all availability group nodes containing replicas in the same cloud service, separated by commas
$EndpointName = “SQLEP1” # name of the endpoint
$EndpointPort = “1433” # public port to use for the endpoint
$ILBName = “SQLILB” # chosen name for the new ILB
$SubnetName = “RC-Subnet” # subnet name that the replicas use in the VNet
$ILBStaticIP = “10.151.129.103” # static IP address for the ILB in the subnet

Add-AzureInternalLoadBalancer -InternalLoadBalancerName $ILBName -SubnetName $SubnetName -ServiceName $ServiceName -StaticVNetIPAddress $ILBStaticIP

# Configure a load balanced endpoint for each node in $AGNodes using ILB
ForEach ($node in $AGNodes)
{
Get-AzureVM -ServiceName $ServiceName -Name $node | Add-AzureEndpoint -Name $EndpointName -LBSetName “$EndpointName-LB” -Protocol tcp -LocalPort $EndpointPort -PublicPort $EndpointPort -ProbePort 59999 -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -InternalLoadBalancerName $ILBName -DirectServerReturn $true | Update-AzureVM
}

It’s now time to build the failover cluster.  This is done via PowerShell because of the issues with setting up clusters with DHCP.

Connect to one of the SQL nodes that is to be part of the cluster and Run the following PS1 file to create the cluster nodes.

<#
* Copyright Microsoft Corporation
*
* Licensed under the Apache License, Version 2.0 (the “License”);
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an “AS IS” BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
#>

#########################################################################################################
#
# Powershell script to create a cluster in Azure environment
# It must be run on a machine which is going to become a node in the cluster
#
# Syntax:
#
#     .\CreateAzureFailoverCluster.ps1 -ClusterName “ClusterName” -ClusterNode “Node1″,”Node2”,… [-Force]
#
# Example 1
#     Running this command on Machine1 or Machine2 will create cluster “MyCluster”
#
#     .\CreateAzureFailoverCluster.ps1 -ClusterName “MyCluster” -ClusterNodes “Machine1”, “Machine2”
#
# Example 2
#     Running this command on Machine1 or Machine2 will create cluster “MyCluster”, erasing any previous cluster state that was on the nodes
#
#     .\CreateAzureFailoverCluster.ps1 -ClusterName “MyCluster” -ClusterNodes “Machine1”, “Machine2” -Force
#
#  Summary of the script:
#     1. Create a 1 node cluster on local node
#     2. Set cluster IP address to static IP 169.254.1.1
#     3. Online the “Cluster Name” so that it will create an entry in active directory
#     4. Add the other nodes to the cluster
#
#
#########################################################################################################
param(
[Parameter(Mandatory=$true)]
$ClusterName,
[Parameter(Mandatory=$true)]
$ClusterNodes,
[Parameter(Mandatory=$false)]
[Switch]$Force)
Import-Module ServerManager
$OsBuildNumber = [System.Environment]::OSVersion.Version.Build
if ($OsBuildNumber -lt 7600)
{
Write-Error “Not support on Windows Visa or lower”
exit 1
}
elseif ($OsBuildNumber -lt 9200)
{
Write-Output “Windows Server 2008 R2 detected”
$ClusterFeature = Get-WindowsFeature “Failover-Clustering”
$ClusterTools = Get-WindowsFeature “RSAT-Clustering”
if ($ClusterFeature.Installed -eq $false -or $ClusterFeature.Installed -eq $false)
{
Write-Output “Needed cluster features were not found on the machine. Please run the following command to install them:”
Write-Output “Add-WindowsFeature ‘Failover-Clustering’, ‘RSAT-Clustering'”
exit 1
}
}
else
{
Write-Output “Windows Server 2012 or above detected”
$ClusterFeature = Get-WindowsFeature “Failover-Clustering”
$ClusterPowerShellTools = Get-WindowsFeature “RSAT-Clustering-PowerShell”
$ClusterCmdTools = Get-WindowsFeature “RSAT-Clustering-CmdInterface”
if ($ClusterFeature.Installed -eq $false -or $ClusterPowerShellTools.Installed -eq $false -or $ClusterCmdTools.Installed -eq $false)
{
Write-Output “Needed cluster features were not found on the machine. Please run the following command to install them:”
Write-Output “Add-WindowsFeature ‘Failover-Clustering’, ‘RSAT-Clustering-PowerShell’, ‘RSAT-Clustering-CmdInterface'”
exit 1
}
}
Import-Module FailoverClusters
$LocalMachineName = $env:computername
$LocalNodePresent = $false
# The below line will make sure that the script is running on one of the specified cluster nodes
# The Spplit(“.”) is needed, because user might specify machines using their fully qualified domain name, but we only care about the machine name in the below verification
@($ClusterNodes) | Foreach-Object {
if ([string]::Compare(($_).Split(“.”)[0], $LocalMachineName, $true) -eq 0) {
$LocalNodePresent = $true } }
if ($LocalNodePresent -eq $false)
{
Write-Error “Local machine where this script is running, must be one of the cluster nodes”
exit 1
}

if ($Force)
{
Write-Output “Forcing cleanup of the specified nodes”
@($ClusterNodes) | Foreach-Object { Clear-ClusterNode “$_” -Force }
}
else
{
Write-Output “Making sure that there is no cluster currently running on the current node”
$CurrentCluster = $null
# In case there is no cluster presetn, we don’t want to show an ugly error message, so we eat it out by redirecting
# the error output to null
$CurrentCluster = Get-Cluster 2> $null
if ($CurrentCluster -ne $null)
{
Write-Error “There is an existing cluster on this machine. Please remove any existing cluster settings from the current machine before running this script”
exit 1
}
}
Write-Output “Trying to create a one node cluster on the current machine”
Sleep 5
New-Cluster -Name $ClusterName -NoStorage -Node $LocalMachineName
Write-Output “Verify that cluster is present after creation”
$CurrentCluster = $null
$CurrentCluster = Get-Cluster
if ($CurrentCluster -eq $null)
{
Write-Error “Cluster does not exist”
exit 1
}
Write-Output “Bring offline the cluster name resource”
Sleep 5
Stop-ClusterResource “Cluster Name”
Write-Output “Get all IP addresses associated with cluster group”
$AllClusterGroupIPs = Get-Cluster | Get-ClusterGroup | Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq “IP Address” -or $_.ResourceType.Name -eq “IPv6 Tunnel Address” -or $_.ResourceType.Name -eq “IPv6 Address”}
$NumberOfIPs = @($AllClusterGroupIPs).Count
Write-Output “Found $NumberOfIPs IP addresses”
Write-Output “Bringing all IPs offline”
Sleep 5
$AllClusterGroupIPs | Stop-ClusterResource
Write-Output “Get the first IPv4 resource”
$AllIPv4Resources = Get-Cluster | Get-ClusterGroup | Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq “IP Address”}
$FirstIPv4Resource = @($AllIPv4Resources)[0];
Write-Output “Removing all IPs except one IPv4 resource”
Sleep 5
$AllClusterGroupIPs | Where-Object {$_.Name -ne $FirstIPv4Resource.Name} | Remove-ClusterResource -Force
$NameOfIPv4Resource = $FirstIPv4Resource.Name
Write-Output “Setting the cluster IP address to a link local address”
Sleep 5
cluster res $NameOfIPv4Resource /priv enabledhcp=0 overrideaddressmatch=1 address=169.254.1.1 subnetmask=255.255.0.0
$ClusterNameResource = Get-ClusterResource “Cluster Name”
$ClusterNameResource | Start-ClusterResource -Wait 60
if ((Get-ClusterResource “Cluster Name”).State -ne “Online”)
{
Write-Error “There was an error onlining the cluster name resource”
exit 1
}
Write-Output “Adding other nodes to the cluster”
@($ClusterNodes) | Foreach-Object {
if ([string]::Compare(($_).Split(“.”)[0],$LocalMachineName, $true) -ne 0) {
Add-ClusterNode “$_” } }
Write-Output “Cluster creation finished !”
exit 0

This excellent script is taken from the Microsoft Script gallery here, I recommend checking for a later version. It’s also a great resource for PowerShell scripts.

SQL Configuration

VERY IMPORTANT! If the consuming applications are using service accounts then all nodes must have the accounts added to each instance of SQL with the same “Server” role permissions. The database permissions will get set when the DBs are sync’d.

Ensure that TCP/IP Protocol is enabled on all the SQL nodes

SQL Server AlwaysOn - Enable TCPIP

Enable “Always-On” through SQL Configuration Manager” (perform on all nodes of the cluster)

SQL Server AlwaysOn - Enable AlwaysOn availbility groups

The next steps require there to be a database present on the SQL server that is going to be the primary

Connect to the node that will be the primary in the cluster

Set user databases recovery model to “Full” (these are the databases you want to be part of the AlwaysOn Group)

SQL Server AlwaysOn - Set recovery mode to FULL

This little SQL query will help identify the databases that need changing

select [name], DATABASEPROPERTYEX([name],’recovery’)
from sysdatabases
where name not in (‘master’,’model’,’tempdb’,’msdb’)

Take a Full Backup of the user databases you’ve selected to be part of the AlwaysOn group (if you have a lot of databases it might be quicker to make a maintenance task to do this)

Create a share on this SQL server and make sure you can access the share from the other nodes.

Create the availability group on the node that is hosting the databases (in this example my consuming application is SharePoint)

– Specify a name:

SQL Server AlwaysOn - Set group name

– Select databases:

SQL Server AlwaysOn - Select databases

– Specify a replica, hit the “Add replica” button and select the other node(s)

SQL Server AlwaysOn - Add replicas

– Click on “EndPoint” tab and check the port matches the “AlwaysOn Endpoint” firewall we added earlier

SQL Server AlwaysOn - Adding endpoints

– Click the “Backup Preferences” and select an option that suits your environment (here I’ve gone for the default)

SQL Server AlwaysOn - Backup preferences

– On the “Listener” tab leave the option set to “Do not create an availability group listener now”

– Select the share you created earlier.

– Run the verification and all being well it’ll look like this (ignore the listener warning)

SQL Server AlwaysOn - Run validation

– Click “Finish” on the summary page

SQL Server AlwaysOn - Summary

The process will start to build the always on group now.  If all has gone to plan you’ll get a nice happy message like this

SQL Server AlwaysOn - Wizard completion

Click Close to finish

Create the listener

Before creating the listener make sure that the Cluster computer account has the permission to create computer objects within the “Computers” OU on the domain.

To do this open up “Active Directory Users and Computers” MMC and find the computer object created when the cluster was created.  In this case the object is called “SQLCLUSTER”
Right click on the “Computers” OU within AD and select “Delegate Control” -> Click “Add” and select the computer account.  (remember to add Computer objects type before trying to resolve the cluster name)

SQL Server AlwaysOn - Delegation

Select “Create a custom task to delegate”

SQL Server AlwaysOn - delegation task

Select “This folder, existing objects in this folder, and creation of new objects in this folder”

SQL Server AlwaysOn - delegation - this folder

Scroll down the permissions list and select “Create all Child Objects”

SQL Server AlwaysOn - Delegation permissions

Click “Finish” to exit

Create listener on the failover cluster

Open up the Failover Cluster manager on the primary node.

Expand Roles

Here you’ll see the AlwaysOn group that you created earlier (in my case “SharePointAO)

Right click the object and select “Add Resource” -> “Create Access Point”

SQL Server AlwaysOn - Client access point

Create a name for the listener, I am going to call it “SQLAlwaysOn”

SQL Server AlwaysOn - set name for client access point

Click through the wizard, if you get an error here check the AD permissions you set for the cluster computer object above.

Right Click the role (SharePointOA) again and at the bottom of the screen click the “Resources” tab

SQL Server AlwaysOn - Check resources tab

Expand the newly created resource (SQLALWAYSON)
Right-click the network address and select Properties

SQL Server AlwaysOn - Client access point properties

Change the IP address to Static IP Address and set the address to the one that was configured for the ILB above.  Also give it a name that is a bit more descriptive

SQL Server AlwaysOn - cluster IP address properties

Open a PowerShell command window with elevated privileges.
Run the command and make a note of the cluster network name.

Get-ClusterNetwork

Run the following PowerShell on each of the nodes (change variables as required)

– ClusterNetworkName is what you noted above
– IPResourceName is the name you gave the IP resource above (SQLALWAYSON-IP)
– ILBIP is the IP address of the Internal Load Balancer

$ClusterNetworkName = “Cluster Network 1”
$IPResourceName = “SQLAlwaysOn-IP”
$ILBIP = “10.151.129.103”

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{“Address”=”$ILBIP”;”ProbePort”=”59999″;”SubnetMask”=”255.255.255.255″;”Network”=”$ClusterNetworkName”;”OverrideAddressMatch”=1;”EnableDhcp”=0}

Go back to the Failover Cluster Manager

Click the Role (SharePointAO) and then click the Resources tab and finally right click the “Resource” and click “Bring Online”

SQL Server AlwaysOn - Bring resource online

If everything is done correctly you’ll see this

SQL Server AlwaysOn - confirmation that services are online

Right click the “SharePointAO” object under “Other Resources” and select “Properties”

SQL Server AlwaysOn - role properties

Navigate to the “Dependencies” tab

Select the “SQLALWAYSON” resource (not the IP) and click “OK”

SQL Server AlwaysOn - set dependancies

Now fire up SQL Management Studio

Select AlwaysOn High Availability -> Availability Groups -> <AvailabilityGroupName> -> Availability Group Listeners

SQL Server AlwaysOn - Select AG

Right click Properties and change the port to 1433 (or whatever the SQL is using)

SQL Server AlwaysOn - Change listener port

To test the listener is working correctly try connecting to is from SQL Management Studio by entering the listener name into the server name box

SQL Server AlwaysOn - Connect to listener

You can now re-point your application to the listener name rather than the SQL node name.  If you are using aliasing this is easy to do.
Once the application is using the listener you can test failover

Note: Quorum – to ensure proper failover you need to have an additional quorum vote (this could be a shared disk, file share).  Without this failover testing in the case of a machine failure will not be successful.

1 thought on “Azure – Creating a SQL AlwaysOn Failover Cluster on Azure using Internal Load Balancer (ILB)

  1. Pingback: SharePoint 2013 with SQL 2012 Always On in Azure | MyStar Blog

Leave a Reply

Your email address will not be published. Required fields are marked *