Tag Archives: High Availability

#Azure: Virtual Machine Scale Sets

Microsoft Azure virtual machine scale sets are the next step in high availability and scalability for virtual machines. In Microsoft Azure, virtual machine high availability is achieved with availability sets. A virtual machine scale set is a group of identical compute resources deployed across multiple availability sets. It is a truly scalable auto-scaling model that can target large-scale services with big compute, large data, and containerized workloads. Because scale sets use multiple availability sets in the background, scale operations are implicitly balanced across fault and update domains. Scale sets use five fault domains and five update domains in each availability set. Each virtual machine scale set can host 0-1000 VMs based on platform images, or 0-300 VMs based on custom images.

To define an autoscale configuration for consistent application performance, many combinations of rules can be used. The most common rules are based on compute, memory, and disk I/O utilization. Apart from these common performance metrics, autoscaling of VMs can also depend on application response or a fixed schedule.

Note: Virtual machine scale sets can also be deployed with availability zones.

Now, let me explain how autoscaling works behind the scenes. When a new VM is added to the scale set, it is given a VM instance ID that is unique within the scale set. When you add or remove a virtual machine, the IDs of the existing VMs do not change. For example, suppose a scale set has 10 VMs (instance IDs 0 through 9). If 2 VMs are removed based on the configuration and need, and some time later 5 VMs are added based on the load, the new VMs get instance IDs 10, 11, 12, 13, and 14, in increasing order, and they are balanced across fault and update domains to maintain maximum availability.

Let’s see how to do it.
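The ID behaviour in the example above can be sketched in a few lines of Python. This is only a toy model, not how Azure implements it: the class name `ScaleSetSketch` and its methods are hypothetical, and the choice of which two VMs get removed is arbitrary.

```python
from __future__ import annotations


class ScaleSetSketch:
    """Toy model of instance ID assignment (hypothetical, not the Azure implementation)."""

    def __init__(self, initial_count: int) -> None:
        self.next_id = 0                      # next ID to hand out; never decreases
        self.instances: set[int] = set()      # IDs of VMs currently in the scale set
        self.scale_out(initial_count)

    def scale_out(self, count: int) -> list[int]:
        """Add `count` VMs and return their newly assigned instance IDs."""
        new_ids = list(range(self.next_id, self.next_id + count))
        self.next_id += count
        self.instances.update(new_ids)
        return new_ids

    def scale_in(self, ids: list[int]) -> None:
        """Remove the given VMs; the remaining VMs keep their IDs."""
        self.instances.difference_update(ids)


scale_set = ScaleSetSketch(initial_count=10)  # instance IDs 0..9
scale_set.scale_in([3, 7])                    # arbitrary pick of 2 VMs to remove
added = scale_set.scale_out(5)
print(added)  # [10, 11, 12, 13, 14]
```

The counter only moves forward, which is why the five new VMs continue from 10 even though two earlier IDs are free.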

Log in to the Azure portal, search for “scale” in the Azure Marketplace, and select “Virtual machine scale set”.

In the Virtual machine scale set panel, select “Create” to create a new Virtual machine scale set.

In the “Create virtual machine scale set” panel, fill in the basic information.

Virtual machine scale set name = Enter the name for your virtual machine scale set deployment.

Operating system disk image = Select the operating system disk image from the drop-down.

Subscription = Select your subscription.

Resource group = Create a new resource group or select an existing one.

Location = Select the Azure region from the drop-down.

User name = Enter the username that will be used for virtual machines.

Password = Enter the password for the user name.

Confirm password = Re-enter the password to confirm.

Scroll down and fill in the required details under “Instances and Load Balancer”.

Instance count = Enter a VM count between 0 and 100. If you enter a number greater than 100 (up to 1000), all configuration settings except instance size are disabled, because large scale sets with more than 100 VMs use managed disks by default and are deployed across multiple placement groups.

Instance size = Select the VM size based on your requirement.

Enable scaling beyond 100 instances = “No” by default. If you select “Yes”, the rest of the settings are disabled, as described under Instance count.

Use managed disks = By default “Yes”.

Public IP address name = Enter the name of the public IP address used by the load balancer that is placed in front of the scale set.

Public IP allocation method = Dynamic by default, but Static can be selected.

Domain name label = Domain name label for the load balancer in front of the scale set.

Autoscale = Disabled by default. If you enable this feature, you need to define the conditions for autoscaling.

If Autoscale is enabled, fill in the required details.

Minimum number of VMs = Enter the minimum number of VMs required in this scale set.

Maximum number of VMs = Enter the maximum number of VMs required in this scale set.

Scale out

CPU threshold (%) = Enter the CPU threshold above which VMs will be added.

Number of VMs to increase by = Enter the number of VMs that will be added when your running VMs reach the defined CPU threshold.

Scale in

CPU threshold (%) = Enter the CPU threshold below which VMs will be removed.

Number of VMs to decrease by = Enter the number of VMs that will be removed when your running VMs reach the defined CPU threshold.
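To see how these autoscale settings interact, here is a minimal Python sketch of the decision they describe. It is only an illustration, not the Azure autoscale engine: the function name, its parameters, and the sample values (2 to 10 VMs, 75% and 25% thresholds, steps of 1) are assumptions made for the example.

```python
def desired_instance_count(
    current: int,
    avg_cpu: float,
    *,
    min_vms: int,
    max_vms: int,
    scale_out_threshold: float,
    scale_out_by: int,
    scale_in_threshold: float,
    scale_in_by: int,
) -> int:
    """Return the VM count after applying one scale-out/scale-in evaluation."""
    if avg_cpu > scale_out_threshold:
        target = current + scale_out_by      # load is high: add VMs
    elif avg_cpu < scale_in_threshold:
        target = current - scale_in_by       # load is low: remove VMs
    else:
        target = current                     # within the band: no change
    # Never go below the minimum or above the maximum number of VMs.
    return max(min_vms, min(max_vms, target))


# Hypothetical settings, as they might be entered in the portal form.
settings = dict(
    min_vms=2, max_vms=10,
    scale_out_threshold=75, scale_out_by=1,
    scale_in_threshold=25, scale_in_by=1,
)

print(desired_instance_count(3, 90, **settings))   # 4  (scale out)
print(desired_instance_count(10, 90, **settings))  # 10 (already at maximum)
print(desired_instance_count(2, 10, **settings))   # 2  (already at minimum)
```

Keeping the scale-in threshold well below the scale-out threshold leaves a band where nothing happens, which helps avoid adding and removing VMs in quick succession.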

Once you have filled in all the details, click Create to start the deployment process.

As you may have observed, the process never asked for a virtual network or a storage account, because virtual machine scale sets take care of these behind the scenes based on the configuration. Therefore, you don’t really have to worry about them.


Windows Fabric and Server Placement – Part II

Part I of this article covers Windows Fabric basics. Part II describes how Lync Server 2013 and Skype for Business are tied to Windows Fabric, and calls out a few best practices to follow when placing Front End Servers in a virtualized environment.

In Lync Server 2013 and Skype for Business Server 2015, a pool can have a maximum of 12 Front End Servers. Lync Server 2013 works with Windows Fabric V1, while Skype for Business works with Windows Fabric V2/V3. Lync Server 2013 and Skype for Business Server Front End pools use a distributed systems model based on Windows Fabric. In this model, important user and conference data for each user is kept on as many as three Front End Servers in an FE pool. With this model, a minimum number of FE servers must be running for the pool to function. There are two modes in which quorum loss can occur for an FE pool.

Routing group level quorum loss: Every user is assigned to a particular routing group within an FE pool. Each routing group has three servers: one primary replica and two secondary replicas. When enough replica servers of a particular routing group become unavailable, the result is routing group level quorum loss.

Total number of servers in the pool     Number of servers that must be running for the pool to be started the first time
2                                       1
3                                       3
4                                       3
5                                       4
6                                       5
7                                       5
8                                       6
9                                       7
10                                      8
11                                      9
12                                      10


Pool level quorum loss: As with a Windows Server cluster, an FE pool needs a minimum of N/2+1 servers to be up and running. With an odd number of servers this is handled automatically, while with an even number the primary SQL database plays the role of witness. When the number of running servers in an FE pool falls below this minimum, the result is pool level quorum loss.

Total number of Front End Servers in the pool     Number of servers that must be running for pool to be functional
2                                                 1
3-4                                               Any 2
5-6                                               Any 3
7                                                 Any 4
8-9                                               Any 4 of the first 7 servers
10-12                                             Any 5 of the first 9 servers
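The pool level table above can be turned into a small lookup for planning purposes. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes servers are numbered from 1 in the order that “the first 7 servers” and “the first 9 servers” refer to.

```python
from __future__ import annotations

# (pool sizes, servers required, "of the first N servers" restriction or None)
POOL_QUORUM = [
    (range(2, 3), 1, None),
    (range(3, 5), 2, None),
    (range(5, 7), 3, None),
    (range(7, 8), 4, None),
    (range(8, 10), 4, 7),
    (range(10, 13), 5, 9),
]


def pool_level_minimum(total_servers: int) -> tuple[int, int | None]:
    """Return (servers required, restriction to the first N servers or None)."""
    for sizes, required, among_first in POOL_QUORUM:
        if total_servers in sizes:
            return required, among_first
    raise ValueError("the table covers pools of 2 to 12 Front End Servers")


def pool_is_functional(total_servers: int, running: set[int]) -> bool:
    """Check a set of running server positions (1-based) against the table."""
    required, among_first = pool_level_minimum(total_servers)
    if among_first is not None:
        # Only servers among the first N count toward the requirement.
        running = {s for s in running if s <= among_first}
    return len(running) >= required


print(pool_level_minimum(9))                  # (4, 7)
print(pool_is_functional(9, {1, 2, 3, 4}))    # True
print(pool_is_functional(9, {1, 2, 8, 9}))    # False: only 2 of the first 7 are up
```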


If an organization plans to deploy Skype for Business Server in a virtualized environment, FE Server placement is key to minimizing the impact on FE services if a host goes down. Placing more than one FE Server on the same host can result in routing group level quorum loss, and might also result in pool level quorum loss if the pool has only four FE Servers and the primary SQL database is on the host that is down.

The example above considers an SfB deployment with two physical hosts, where each host runs two FE Servers and one SQL Back End Server. If only a routing group goes down, it can be recovered with the Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery cmdlet. Note that if two FE Servers in a particular routing group go down, the users who belong to that routing group are downgraded to limited functionality until the FE Servers come back or the pool registrar state has been reset.
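A quick way to review a placement plan is to flag every host that carries more than one FE Server, since losing such a host could take down two replicas of the same routing group. The Python sketch below is a simple planning aid under that assumption; the function name and the host and server names are hypothetical.

```python
def hosts_risking_routing_group_quorum(placement: dict[str, list[str]]) -> list[str]:
    """Return hosts that run more than one Front End Server.

    `placement` maps a host name to the FE Servers placed on it.
    """
    return sorted(host for host, servers in placement.items() if len(servers) > 1)


# The deployment described above: two hosts, two FE Servers on each.
two_hosts = {
    "host1": ["fe1", "fe2"],
    "host2": ["fe3", "fe4"],
}
print(hosts_risking_routing_group_quorum(two_hosts))   # ['host1', 'host2']

# One FE Server per host: no single host failure removes two FE Servers.
spread_out = {"host1": ["fe1"], "host2": ["fe2"], "host3": ["fe3"]}
print(hosts_risking_routing_group_quorum(spread_out))  # []
```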

Windows Fabric and Server Placement – Part I

Windows Fabric plays a key role in Front End pool service availability in Lync Server 2013 and Skype for Business Server 2015. In Lync Server 2010, this responsibility was handled by Cluster Manager. Front End pool service availability in Lync Server 2013 and Skype for Business Server 2015 depends entirely on Windows Fabric and on the fault and upgrade domains provisioned by Topology Builder.

Lync Server 2013 and Skype for Business Server 2015 use the brick model, which is based on Windows Fabric, and use lazy writes to update the Back End Server databases. Windows Fabric is a distributed systems platform for building scalable applications. It is used in both on-premises and cloud scenarios. Windows Fabric starts independently, without any specific external configuration store. Its self-healing, decentralized design provides self-monitoring and automatic adjustment (load balancing) without any single point of failure. The Windows Fabric Host service (FabricHostSvc) is installed as part of “Setup or Remove Lync Server Components”. Windows Fabric also elects the primary, secondary, and backup secondary (tertiary) replicas, and maintains replication between the primary and secondary replicas. You can find the configuration file on each server at “C:\ProgramData\Windows Fabric\<ServerFQDN>\Fabric\ClusterManifest.current.xml”.

Below are the core services which use Windows Fabric:

  • Routing Services
  • Lync Storage Services
  • MCU Factory Services
  • Conferencing Data Services

Windows Fabric works much like a Windows Server cluster. As in a cluster, Windows Fabric works on majority, with every Front End Server serving as a voter. The majority for a Front End pool is calculated as N/2 + 1 for an even number of FE nodes and (N+1)/2 for an odd number of FE nodes.

Two major concepts are the fault domain and the upgrade domain. A fault domain corresponds to the underlying hardware, and matters most in virtualized deployments, where organizations or administrators may place more than one server of the same role on the same host. An upgrade domain is a logical set of nodes used for planning upgrades.
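The two majority formulas reduce to a single integer expression, which the short Python sketch below shows. The function name `majority` is only illustrative.

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a majority.

    Equals N/2 + 1 for even N and (N+1)/2 for odd N.
    """
    if voters < 1:
        raise ValueError("at least one voter is required")
    return voters // 2 + 1


for n in (3, 4, 5, 6):
    print(n, majority(n))
# 3 2
# 4 3
# 5 3
# 6 4
```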

Part II covers quorum loss modes, server placement, the SQL Server requirement for majority, and best practices.