One of the key benefits that the Windows Azure platform delivers is the ability to rapidly scale your application in the cloud in response to fluctuations in demand.
Normally people scale their websites or cloud services, but what if your applications are hosted on Azure VMs and you want to scale them horizontally? Well, it is also possible!
There are basically two steps: create a load-balanced web farm and then configure Autoscale.
Here you will find a step-by-step guide on how to do it.
Necessary Steps:
1. Create a Standard Tier VM and assign it to an availability set
2. Configure the machine as you need (IIS, application server, FTP, and so on)
3. Clone the VM
   3.1 - Sysprep
   3.2 - Capture
   3.3 - Recreate the original VM, adding all the needed endpoints
   3.4 - Create the second VM with no "extra endpoints"
   3.5 - Optional: repeat point 3.4 to create additional VMs
4. Balance VMs
   4.1 - Change the endpoint on the first VM to create a Load-Balanced set
   4.2 - Add the endpoint on the second (third, ...) VM to the created Load-Balanced set
   4.3 - Repeat 4.1 and 4.2 for all the endpoints you need to be balanced
   4.4 - Take care of session state (if needed)
5. Configure Autoscale
Details:
3.1 - Sysprep
The first thing to do when cloning a VM is "sysprepping" it. On Linux, there’s a similar option in the Azure agent. Sysprep generalizes the machine so that it can be cloned into a new machine that gets its own settings, such as a hostname and IP address. A machine that has not been sysprepped cannot be cloned this way.
After sysprepping the machine, shut it down. If you selected the shutdown option during sysprep, the machine will shut down automatically. Otherwise you can do so through remote desktop or SSH, or simply through the Azure portal.
3.2 - Capture
On the Windows Azure portal go to the VM dashboard page. Next, click the "Capture" button to create a disk image from this machine. Give it a name and check the "Yes, I’ve sysprepped the machine" checkbox in order to be able to continue.
After clicking the "OK" button, Azure will create an image of our first server.
3.3 - Recreate the original VM, adding all the needed endpoints
After the image has been created, you’ll notice that your first VM has disappeared! This is normal: the machine has been disemboweled in order to create a template from it. You can now simply re-create this machine using the same settings as before, except you can now base it on this newly created VM image instead of basing it off a VM template Microsoft provides.
In the endpoints configuration, make sure to add the HTTP endpoint listening on port 80 again, along with any other endpoints you need to access your applications.
3.4 - Create the second VM with no "extra endpoints"
To create the second machine in your webfarm, create a fresh virtual machine. As before, choose the image we’ve created earlier.
In step 4 of the machine creation, be sure to select the same "Cloud Service" as the first server and place the VM in the same availability set.
Don’t add the HTTP endpoint (or the other endpoints configured in step 3.3) to this machine just yet.
You now have two machines running, but they aren’t load balanced yet. You’ll notice that both machines are already behind the same hostname and share the same public virtual IP address. This is because we "linked" the machines earlier by placing them in the same cloud service. If you don’t do that, you will never be able to use the out-of-the-box load balancer that comes with Azure. It also means that the public remote desktop endpoint will be different for each machine: there’s only one IP address exposed to the outside world, so each machine needs its own public port and you’ll have to think about endpoints.
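To make the shared-IP point concrete, here is a minimal Python sketch of an endpoint table for one cloud service. It is only a mental model, not an Azure API: the VM names, the public port numbers and the `add_endpoint` helper are all hypothetical. It shows why two machines behind one public address cannot both expose remote desktop on the same public port.

```python
# Hypothetical endpoint table for ONE cloud service (one public virtual IP).
# Keys are public ports; values are (vm_name, private_port).
# All names and port numbers here are illustrative assumptions.
endpoints = {
    50001: ("web-vm-1", 3389),  # remote desktop to the first VM
    80: ("web-vm-1", 80),       # HTTP endpoint re-added in step 3.3
}


def add_endpoint(table, public_port, vm_name, private_port):
    """Register a stand-alone (not load-balanced) endpoint.

    A public port can only point at one machine, because every VM in
    the cloud service shares the same public IP address.
    """
    if public_port in table:
        owner = table[public_port][0]
        raise ValueError(f"public port {public_port} is already used by {owner}")
    table[public_port] = (vm_name, private_port)


# The second VM gets remote desktop on a different public port.
add_endpoint(endpoints, 50002, "web-vm-2", 3389)
```

Trying to register port 50001 again for `web-vm-2` would raise a `ValueError`. Sharing one public port between machines is exactly what a load-balanced set is for.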
4.1 - Change the endpoint on the first VM to create a Load-Balanced set
The last part of setting up our webfarm will be load balancing. This is in fact really, really easy.
First, go to the "Endpoints" page of the first (original) VM, choose the endpoint you want to balance and edit it.
Just check the "Create a Load-Balanced set" checkbox.
In step 2 of the edit, give the Load-Balanced set a name and configure the probe parameters. In my example, I'm configuring an HTTPS endpoint, so I want to check every 15 seconds whether port 443 answers. After 2 failed probes, the balancer stops sending traffic to that machine and switches to the other endpoint.
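The probe behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Azure implements its probes: `tcp_probe` and `ProbeState` are hypothetical names, and only the 15-second interval and the 2-failure threshold come from the example above.

```python
import socket
from dataclasses import dataclass

PROBE_INTERVAL_SECONDS = 15   # how often the probe runs (from the example)
FAILURES_BEFORE_REMOVAL = 2   # failed probes before a VM leaves rotation


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@dataclass
class ProbeState:
    """Tracks consecutive probe failures for one machine."""
    consecutive_failures: int = 0

    def record(self, success: bool) -> None:
        # A successful probe resets the counter; a failure increments it.
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def in_rotation(self) -> bool:
        return self.consecutive_failures < FAILURES_BEFORE_REMOVAL
```

In a real loop you would call something like `state.record(tcp_probe(host, 443))` once every `PROBE_INTERVAL_SECONDS` for each machine, and only send traffic to machines whose `in_rotation` is true.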
4.2 - Add the endpoint to the second (third, ...) VM to the created Load-Balanced set
Simply go to the second machine’s dashboard in the Azure portal and navigate to the Endpoints tab. We’ve already added the public HTTPS endpoint on our first machine, which means that for our second machine we can just add the endpoint to the existing Load-Balanced set.
Now we have free round-robin load balancing with checks every few seconds to ensure that all machines are up and running. And since we linked these machines through an availability set, they are on different fault domains in the datacenter, reducing the chance of errors due to malfunctioning hardware or maintenance. You can safely shut down a machine too. In short: anything you’d expect from a load balancer (except sticky sessions).
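As a rough illustration of what round-robin balancing with health checks means, here is a small Python sketch. It is a toy model and makes no claim about how the Azure load balancer distributes traffic internally; the `RoundRobinBalancer` class and the machine names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips machines marked as down."""

    def __init__(self, machines):
        self.machines = list(machines)
        self.healthy = set(self.machines)
        self._next = 0  # index of the next machine to try

    def mark_down(self, machine):
        self.healthy.discard(machine)

    def mark_up(self, machine):
        if machine in self.machines:
            self.healthy.add(machine)

    def pick(self):
        """Return the next healthy machine, cycling through the list."""
        for _ in range(len(self.machines)):
            machine = self.machines[self._next % len(self.machines)]
            self._next += 1
            if machine in self.healthy:
                return machine
        raise RuntimeError("no healthy machines available")


balancer = RoundRobinBalancer(["web-vm-1", "web-vm-2"])
first_four = [balancer.pick() for _ in range(4)]

# Shutting down one machine: all traffic goes to the remaining one.
balancer.mark_down("web-vm-1")
after_shutdown = [balancer.pick() for _ in range(3)]
```

Note that nothing in `pick` looks at who the client is, which is why consecutive requests from the same user can land on different machines (no sticky sessions).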
4.4 - Take care of session state (if needed)
Now that you have the VMs balanced, you have to think about how your applications manage the session state.
If you are deploying web servers with ASP.NET applications, for example, you’ll have to configure machine keys and session state in the same way you would on-premises. On Azure you can choose the "normal" database way (session state stored in an Azure database), Azure storage, or the new Azure cache.
You can visit this link on MSDN (http://blogs.msdn.com/b/cie/archive/2013/05/17/session-state-management-in-windows-azure-web-roles.aspx) for an overview of session state management on Azure.
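To see why this matters once requests are spread over several machines, consider this small Python sketch. It is not ASP.NET code and does not use any Azure service: the `Server` class is hypothetical, and a plain dictionary stands in for whatever shared store (database, storage or cache) you pick.

```python
class Server:
    """Toy web server that keeps session data in a dictionary.

    With no shared_store, each server has its own private sessions
    (in-process state). With a shared_store, all servers read and
    write the same session data.
    """

    def __init__(self, name, shared_store=None):
        self.name = name
        self.sessions = shared_store if shared_store is not None else {}

    def handle(self, session_id, key=None, value=None):
        session = self.sessions.setdefault(session_id, {})
        if key is not None:
            session[key] = value
        return dict(session)  # copy of what this server sees


# In-process state: the second server knows nothing about the cart.
a, b = Server("web-vm-1"), Server("web-vm-2")
a.handle("session-1", "cart_items", 3)
seen_without_sharing = b.handle("session-1")

# Shared state: both servers see the same session.
store = {}
c, d = Server("web-vm-1", store), Server("web-vm-2", store)
c.handle("session-1", "cart_items", 3)
seen_with_sharing = d.handle("session-1")
```

Without sharing, the second server sees an empty session for the same user. With the shared store it sees the cart that the first server saved.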
5 - Configure Autoscale
OK, finally let's configure Autoscale! Now we have some VMs running and balanced. But do we need all the VMs running at the same time? Maybe not. Maybe we need them running only during certain time periods, or only under load.
As you may remember, when you created the VMs you chose the same cloud service for all of them. To configure Autoscale on the VMs, just go to that Cloud Service and navigate to the "Scale" page.
Here you can choose the type of scaling you want: None (no scaling), by CPU, or by Queue.
In my case, I decided to scale using the CPU percentage as the metric. The "Target CPU" slider says that I want to scale up when the average CPU is over 80% and to scale down when it is under 60%.
I have only 2 VMs, so I can configure it so that normally only 1 is active and the second is activated when scaling up.
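The CPU rule described above boils down to a simple decision, sketched here in Python. This is only an illustration of the logic, not the Azure Autoscale engine: the function name, the one-instance step and the parameter names are assumptions, while the 80% and 60% thresholds and the 1 to 2 instance range come from my configuration.

```python
def desired_instances(current, avg_cpu, *, scale_up_above=80.0,
                      scale_down_below=60.0, minimum=1, maximum=2, step=1):
    """Return how many VMs should run, given the average CPU percentage.

    Above the upper threshold we add a VM, below the lower threshold we
    remove one, and in between we leave things alone. The result is
    always clamped to the [minimum, maximum] range.
    """
    if avg_cpu > scale_up_above:
        target = current + step
    elif avg_cpu < scale_down_below:
        target = current - step
    else:
        target = current
    return max(minimum, min(maximum, target))
```

For example, `desired_instances(1, 90)` asks for a second VM, `desired_instances(2, 50)` goes back to one, and `desired_instances(2, 70)` changes nothing because 70% sits between the two thresholds.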
You can also choose to scale based on time settings.