This post is a part of my series on Patching SQL Server AG using SCCM and Powershell
Please read this post for the complete overview
Checking the health of your Always On Availability Groups should always be included in your patching scripts, especially when failovers or even multiple reboots are required. You should only continue when everything is “green”!
There is a really good blog written on how to get the health status of your Always On configuration using Powershell. You can take it as far as you like (e.g.: report on individual database states), but I am sticking to the general health: Test-SqlAvailabilityGroup
When you add this to your scripts, you need to deal with some extra stuff to make this work nicely. In some cases the dashboard is reporting “Unhealthy” or “Warning” when there is actually nothing wrong…..yet. So you need some kind of retry mechanism.
For example: When you perform a failover and there are databases in the Availability Group that deal with lots of transactions, you may see that, after a failover, the database is still seeding data for a couple of minutes. (I dealt with a SAP database that needed 9 minutes of seeding after a failover). During the seeding, the dashboard shows a warning. In that case you will have to wait till the signal “Healthy” comes back.
Keep in mind: when using powershell to check for Health on your Availability Groups you should always connect to the primary replica, otherwise status “unkown” is returned!
In the example below:
– I am checking if $Server is the primary Replica. If not, we will switch context.
– Then, we will initiate a stopwatch and let it run for max 10 minutes. If the dashboard is not reporting “Healthy” within those 10 minutes, we will no longer wait and report “Unhealthy”.
$server='your servername' $srv = New-Object Microsoft.SqlServer.Management.Smo.Server $Server ## Dashboard only available on Primary Replica $Ags = $srv.AvailabilityGroups if ($AGs.PrimaryReplicaServerName -ne $Server) { $srv.ConnectionContext.Disconnect() $AGName = $AGs.Name $srv = New-Object Microsoft.SqlServer.Management.Smo.Server $AGs.PrimaryReplicaServerName $Ags = $srv.AvailabilityGroups[$AGName] } $Name = $srv.Name if ($srv.InstanceName -eq '') { $InstanceName = 'DEFAULT' } else { $InstanceName = $srv.InstanceName } $Path = "SQLSERVER:\Sql\$Name\$InstanceName\AvailabilityGroups\$AgName" $timeout = new-timespan -Minutes 10 $sw = [diagnostics.stopwatch]::StartNew() while ($sw.elapsed -lt $timeout) { if ( (Test-SqlAvailabilityGroup -Path $path -ErrorAction SilentlyContinue).HealthState -eq "Healthy") { write-verbose "Availabilty Group is reporting Healty" $TimedOut = $false $AgState= "Healthy" break } else { write-warning "Availabilty Group is not reporting Healty... retrying for max 10 minutes" } start-sleep -seconds 5 $TimedOut = $true } if($TimedOut){ $AgState= "UnHealthy" } $AGState $srv.ConnectionContext.Disconnect()
I hope my guide on Monitoring the Health of Always On with Powershell was helpful—feel free to leave questions in the comments!
If the task still seems a bit daunting, book a call with Data Masterminds. We have the expertise to execute this and other SQL Server management and troubleshooting tasks to save you the headache.