The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
Can a group of basic availability groups mimic an enterprise level availability group for far less cost?
With SQL Server 2016 we have finally been able to have some form of AlwaysOn AG with SQL Server Standard edition. Quite a relief since mirroring has been deprecated. At our site we have several AlwaysOn implementations already in place on Enterprise, but not every server justifies that kind of expense. However, just because we don’t want to spend several grand on licensing doesn’t mean we can do without HA/DR. So, I repeat, it was a bit of a relief when Basic Availability Groups were released with SQL 2016.
For a recent project that required HA/DR but couldn’t justify Enterprise edition we decided to take the plunge on 2016’s Basic Availability Groups.
For a quick rundown of the watered down feature set – basically what you don’t get with a Basic Availability Group (BAG) – the major points are as follows:
I find it funny that the technology is called availability groups inferring that all your protected databases can travel as a group, but the basic implementation on SQL Standard only supports one database. It’s no longer an availability group but a database with high availability. But, whatever. It’s better than nothing.
At first glance at this list of “can’t”s we thought however, this won’t work at all. One, we have four databases that support an SOA application (ignore the acronym redundancy please) – and we need to ensure that all four fail together. We don’t want to use mirroring, after all, it’s been deprecated. We do however want to remain on SQL Standard. Further, we need a listener, because our applications don’t respond well to failover-partner specifications in the connection string due to the lag time of figuring out who is in control of the pair. We’d also rather not deal with a witness.
What if we get creative? The question before the court is? Can you implement more than one BAG on a single instance that all use the same underlying cluster. Further, can you assign a listener? My thought is probably, after all, it’s just a virtual identity. But, would the other BAGs on the same pair of servers benefit if a listener were configured for just one of the BAGs. We could pick one as King BAG. Would it work?
The real world answer is yes, and yes. We just finished setting up four availability groups on a two node pair. There is a single cluster in place and all four BAG setups work just fine on that one cluster. We picked one BAG to be King BAG and configured a listener. Sure enough, all other BAGs on the instance can utilize that same listener, (like I said, it’s just a virtual identity). We just have to be careful that all primaries are always on the same node.
That leads us to the final question. How do we ensure they all travel together like a normal Enterprise Edition AG Group would (ignore acronym redundancy please).
Well, since it’s all on the same cluster, under 95 percent of the failure circumstances, the cluster will move them all – even though they are technically different AGs. For those rare cases where they might get out of sync, such as a manual move, we implemented a job that regularly checks where King BAG is. It then verifies that all other BAGs on the instance are also on the same primary instance as King BAG and if not, manually moves them. This is a running agent job that constantly checks.
Finally, this is all set up on a multi-subnet environment. One node is in one datacenter. The other node is in the other datacenter. So, a background concern if multisubnet would work with a BAG was also resolved. It does.
To be clear. It was a bit of a pain to set up. But now that it’s in place the only real perceptible difference between multiple BAGs on Standard as opposed to a single AG Group (ignore acronym redundancy please) on Enterprise is that we can’t use a read only secondary, and that our AG nodes have a few more entries in SSMS plus the overhead in management that requires on human staff. The underlying cluster still does a pretty good job of treating them like one group.
All in all, it’s pretty slick and allows us to achieve exactly what we wanted on the cheap. And, before you ask. Yes, we tested failovers. Both manual and forced (by killing the primary node).