Geographically distributing Windows Azure applications using Traffic Manager

Edit on GitHub

With the downtime of Amazon EC2 this week, it seems a lot of websites “in the cloud” are down at the moment. Most comments I read on Twitter (and that I also made, jokingly :-)) are in the lines of “outrageous!” and “don’t go cloud!”. While I understand these comments, I think they are wrong. These “clouds” can fail. They are even designed to fail, and often provide components and services that allow you to cope with these failures. You just have to expect failure at some point in time and build it into your application.

Let me rephrase my introduction. I just told you to expect failure, but I actually believe that clouds don’t “fail”. Yes, you may think I’m lunatic there, providing you with two different and opposing views in 2 paragraphs. Allow me to explain: "a “failing” cloud is actually a “scaling” cloud, only thing is: it’s scaling down to zero. If you design your application so that it can scale out, you should also plan for scaling “in”, eventually to zero. Use different availability zones on Amazon, and if you’re a Windows Azure user: try the new Traffic Manager CTP!

The Windows Azure Traffic Manager provides several methods of distributing internet traffic among two or more hosted services, all accessible with the same URL, in one or more Windows Azure datacenters. It’s basically a distributed DNS service that knows which Windows Azure Services are sitting behind the traffic manager URL and distributes requests based on three possible profiles:

  • Failover: all traffic is mapped to one Windows Azure service, unless it fails. It then directs all traffic to the failover Windows Azure service.
  • Performance: all traffic is mapped to the Windows Azure service “closest” (in routing terms) to the client requesting it. This will direct users from the US to one of the US datacenters, European users will probably end up in one of the European datacenters and Asian users, well, somewhere in the Asian datacenters.
  • Round-robin: Just distribute requests between various Windows Azure services defined in the Traffic Manager policy

As a sample, I have uploaded my Windows Azure package to two datacenters: EU North and US North Central. Both have their own URL:

I have created a “performance” policy at the URL http://certgen.ctp.trafficmgr.com/, which redirects users to the nearest datacenter (and fails-over if one goes down):

Windows Azure Traffic Manager geo replicate

If one of the datacenters goes down, the other service will take over. And as a bonus, I get reduced latency because users use their nearest datacenter.

So what’s this have to do with my initial thoughts? Well: design to scale, using an appropriate technique to your specific situation. Use all the tools the platform has to offer, and prepare for scaling out and for scaling '”in”, even to zero instances. And as with backups: test your disaster strategy now and then.

PS: Artwork based on Josh Twist’s sketches

This is an imported post. It was imported from my old blog using an automated tool and may contain formatting errors and/or broken images.

Leave a Comment

avatar

2 responses

  1. Avatar for krishna kishore
    krishna kishore April 28th, 2011

    Hi Maarten,

    While the idea of using 2 (or more) data centers, with separate URLs and using Traffic Manager is good, what about your application&#39s data? I&#39m assuming that your data (be it SQL Azure or Azure Storage) is in the same data center as your application - which means, you have 2 copies of your data. How do you keep them in sync so that the other instance also participates - either because of failover, performance or round robin?
    How &#39expensive&#39 is the mechanism - twice the cost, plus cost of data moving up and down the wire?

    regards,
    kk

  2. Avatar for maartenbax
    maartenbax April 30th, 2011

    SQL Azure data sync will be a option, but yes, there&#39s the cost of keeping data "twice". No solution for that currently, except storing data twice.