Maarten Balliauw {blog}

ASP.NET MVC, Microsoft Azure, PHP, web development ...


Tales from the trenches: resizing a Windows Azure virtual disk the smooth way

We’ve all been there. Running a virtual machine on Windows Azure and all of a sudden you notice that a virtual disk is running full. Having no access to the hypervisor nor to its storage (directly), there’s no easy way out…

Big disclaimer: use the provided code at your own risk! I’m not responsible if something breaks! The provided code is as-is, without warranty! I have tested this on a couple of data disks without any problems. I’ve tested this on OS disks and it sometimes works, sometimes fails. Be warned.

Download/contribute: on GitHub

When searching for a solution to this issue, the typical answer you’ll find is the following:

  • Delete the VM
  • Download the .vhd
  • Resize the downloaded .vhd
  • Delete the original .vhd from blob storage
  • Upload the resized .vhd
  • Recreate the VM
  • Use diskpart to resize the partition

That’s a lot of work. Deleting and re-creating the VM isn’t that bad, it can be done pretty quickly. But doing a download of a 30GB disk, resizing the disk and re-uploading it is a serious PITA! Even if you do this on a temporary VM that sits in the same datacenter as your storage account.

Last Saturday, I was in this situation… A decision had to be made: spend an estimated 3 hours doing the entire download/resize/upload process, or read up on the VHD file format and find an easier way. With the possibility of having to fall back to doing the entire process anyway…

Now what!

Being a bit geeked out, I decided to read up on the VHD file format and download the specs.

Before we dive in: why would I even read up on the VHD file format? Well, since Windows Azure storage is used as the underlying store for Windows Azure Virtual Machine VHD’s and Windows Azure storage supports byte operations without having to download an entire file, it occurred to me that combining both would result in a less-than-one-second VHD resize. Or would it?

Note that if you’re just interested in the bits to “get it done”, check the last section of this post.

Researching the VHD file format specs

The specs for the VHD file format are publicly available, which means it shouldn’t be too hard to learn how VHD files, the underlying format for virtual disks on Windows Azure Virtual Machines, are structured. Fearing extremely complex file structures, I started reading and found that a VHD isn’t actually that complicated.

Apparently, VHD files created with Virtual PC 2004 are a bit different from newer VHD files. But hey, Microsoft will probably not use that old beast in their datacenters, right? Using that assumption and the assumption that VHD files for Windows Azure Virtual Machines are always fixed in size, I learnt the following over-generalized lesson:

A fixed-size VHD for Windows Azure Virtual Machines is a bunch of bytes representing the actual disk contents, followed by a 512-byte file footer that holds some metadata.
Maarten Balliauw – last Saturday

A-ha! So in short, if the size of the VHD file is known, the offset to the footer can be calculated and the entire footer can be read. And this footer is just a simple byte array. From the specs:

VHD footer specification

Let’s see what’s needed to do some dynamic VHD resizing…

Resizing a VHD file - take 1

My first approach to “fixing” this issue was simple:

  • Read the footer bytes
  • Write null values over it and resize the disk to (desired size + 512 bytes)
  • Write the footer in those last 512 bytes

Guess what? I tried mounting an updated VHD file in Windows, without any successful result. Time for some more reading… resulting in the big Eureka! scream: the “current size” field in the footer must be updated!

So I did that… and got failure again. But Eureka! again: the checksum must be updated so that the VHD driver can verify the footer is valid!

So I did that… and found more failure.

*sigh* – the fallback scenario of download/resize/update came to mind again…

Resizing a VHD file - take 2

Being a persistent developer, I decided to do some more searching. For most problems, at least a partial solution is available out there! And there was: CodePlex hosts a library called .NET DiscUtils which supports reading from and writing to a giant load of file container formats such as ISO, VHD, VDI, UDF, various file systems and much more!

Going through the sources and doing some research, I found the one missing piece from my first attempt: “geometry”. An old class on basic computer principles came to mind, where the professor taught us that disks have geometry: cylinder-head-sector or CHS information which the disk driver can use to determine the physical data blocks on the disk.
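For the curious: the VHD spec contains an algorithm that derives CHS values from a disk’s capacity. Roughly, and simplified into a C# sketch based on my reading of the spec (DiscUtils’ Geometry.FromCapacity implements the same idea, so use that rather than rolling your own):

static void CalculateGeometry(long capacityInBytes, out int cylinders, out int heads, out int sectorsPerTrack)
{
    // Work in 512-byte sectors and cap at the maximum CHS-addressable size
    long totalSectors = capacityInBytes / 512;
    if (totalSectors > 65535L * 16 * 255)
    {
        totalSectors = 65535L * 16 * 255;
    }

    long cylinderTimesHeads;
    if (totalSectors >= 65535L * 16 * 63)
    {
        sectorsPerTrack = 255;
        heads = 16;
        cylinderTimesHeads = totalSectors / sectorsPerTrack;
    }
    else
    {
        sectorsPerTrack = 17;
        cylinderTimesHeads = totalSectors / sectorsPerTrack;
        heads = (int)((cylinderTimesHeads + 1023) / 1024);
        if (heads < 4) heads = 4;

        if (cylinderTimesHeads >= heads * 1024L || heads > 16)
        {
            sectorsPerTrack = 31;
            heads = 16;
            cylinderTimesHeads = totalSectors / sectorsPerTrack;
        }
        if (cylinderTimesHeads >= heads * 1024L)
        {
            sectorsPerTrack = 63;
            heads = 16;
            cylinderTimesHeads = totalSectors / sectorsPerTrack;
        }
    }

    cylinders = (int)(cylinderTimesHeads / heads);
}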

Being lazy, I decided to copy-and-adapt the Footer class from this library. Why reinvent the wheel? Why risk going sub-zero on the Wife Acceptance Factor, since this was Saturday?

So I decided to generate a fresh VHD file in Windows and try to resize that one using this Footer class. Let’s start simple: specify the file to open, the desired new size and open a read/write stream to it.

1 string file = @"c:\temp\path\to\some.vhd"; 2 long newSize = 20971520; // resize to 20 MB 3 4 using (Stream stream = new FileStream(file, FileMode.OpenOrCreate, FileAccess.ReadWrite)) 5 { 6 // code goes here 7 }

Since we know the size of the file we’ve just opened, the footer is at length – 512, the Footer class takes these bytes and creates a .NET object for it:

stream.Seek(-512, SeekOrigin.End);
var currentFooterPosition = stream.Position;

// Read current footer
var footer = new byte[512];
stream.Read(footer, 0, 512);

var footerInstance = Footer.FromBytes(footer, 0);

Of course, we want to make sure we’re working on a fixed-size disk and that it’s smaller than the requested new size.

if (footerInstance.DiskType != FileType.Fixed
    || footerInstance.CurrentSize >= newSize)
{
    throw new Exception("You are one serious nutcase!");
}

If all is well, we can start resizing the disk. Simply writing a series of zeroes in the least optimal way will do:

// Write 0 values
stream.Seek(currentFooterPosition, SeekOrigin.Begin);
while (stream.Length < newSize)
{
    stream.WriteByte(0);
}

Now that we have a VHD file that holds the desired new size capacity, there’s one thing left: updating the VHD file footer. Again, the Footer class can help us here by updating the current size, original size, geometry and checksum fields:

// Change footer size values
footerInstance.CurrentSize = newSize;
footerInstance.OriginalSize = newSize;
footerInstance.Geometry = Geometry.FromCapacity(newSize);

footerInstance.UpdateChecksum();
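In case you’re wondering what that checksum actually is: according to the spec it’s simply the one’s complement of the sum of all bytes in the footer, computed with the checksum field itself (4 bytes at offset 64) zeroed out. The Footer class takes care of this for you, but a minimal sketch of the idea looks like this:

// Sketch: footer checksum as described in the VHD spec
static uint CalculateFooterChecksum(byte[] footer)
{
    uint sum = 0;
    for (int i = 0; i < footer.Length; i++)
    {
        // skip the 4 checksum bytes themselves
        if (i >= 64 && i < 68)
        {
            continue;
        }
        sum += footer[i];
    }

    return ~sum; // one's complement of the byte sum
}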

One thing left: writing the footer to our VHD file:

footer = new byte[512];
footerInstance.ToBytes(footer, 0);

// Write new footer
stream.Write(footer, 0, footer.Length);

That’s it. And my big surprise after running this? Great success! A VHD that doubled in size.

Resize VHD Windows Azure disk

So we can now resize VHD files in under a second. That’s much faster than any VHD resizer tool you’ll find out there! But still: what about the download/upload?

Resizing a VHD file stored in blob storage

Now that we have the code for resizing a local VHD, porting this to using blob storage and more specifically, the features provided for manipulating page blobs, is pretty straightforward. The Windows Azure Storage SDK gives us access to every single page of 512 bytes of a page blob, meaning we can work with files that span gigabytes of data while only downloading and uploading a couple of bytes…

Let’s give it a try. First of all, our file is now a URL to a blob:

var blob = new CloudPageBlob(
    "http://account.blob.core.windows.net/vhds/some.vhd",
    new StorageCredentials("accountname", "accountkey"));

Next, we can fetch the last page of this blob to read our VHD’s footer:

blob.FetchAttributes();
var originalLength = blob.Properties.Length;

var footer = new byte[512];
using (Stream stream = new MemoryStream())
{
    blob.DownloadRangeToStream(stream, originalLength - 512, 512);
    stream.Position = 0;
    stream.Read(footer, 0, 512);
    stream.Close();
}

var footerInstance = Footer.FromBytes(footer, 0);

After doing the check on disk type again (fixed and smaller than the desired new size), we can resize the VHD. This time not by writing zeroes to it, but by calling one simple method on the storage SDK.

blob.Resize(newSize + 512);

In theory, it’s not required to overwrite the current footer with zeroes, but let’s play it clean:

blob.ClearPages(originalLength - 512, 512);

Next, we can change our footer values again:

footerInstance.CurrentSize = newSize;
footerInstance.OriginalSize = newSize;
footerInstance.Geometry = Geometry.FromCapacity(newSize);

footerInstance.UpdateChecksum();

footer = new byte[512];
footerInstance.ToBytes(footer, 0);

And write them to the last page of our page blob:

using (Stream stream = new MemoryStream(footer))
{
    blob.WritePages(stream, newSize);
}

And that’s all, folks! Using this code you’ll be able to resize a VHD file stored on blob storage in less than a second without having to download and upload several gigabytes of data.

Meet WindowsAzureDiskResizer

Since resizing Windows Azure VHD files is a well-known missing feature, I decided to wrap all my code in a console application and share it on GitHub. Feel free to fork, contribute and so on. WindowsAzureDiskResizer takes at least two parameters: the desired new size (in bytes) and a blob URL to the VHD. This can be a URL containing a Shared Access Signature.
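An invocation would look something along these lines (the size and URL are made-up examples; argument order as described above):

WindowsAzureDiskResizer.exe 32212254720 http://account.blob.core.windows.net/vhds/some.vhd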

Resize windows azure VM disk

Now let’s resize a disk. Here are the steps to take:

  • Shutdown the VM
  • Delete the VM -or- detach the disk if it’s not the OS disk
  • In the Windows Azure portal, delete the disk (retain the data!) so that the lease Windows Azure has on it is removed
  • Run WindowsAzureDiskResizer
  • In the Windows Azure portal, recreate the disk based on the existing blob
  • Recreate the VM  -or- reattach the disk if it’s not the OS disk
  • Start the VM
  • Use diskpart / disk management to resize the partition

Here’s how fast the resizing happens:

VhdResizer

Woah! Enjoy!

We’re good for now, at least until Microsoft decides to switch to the newer VHDX file format…

Download/contribute: on GitHub or binaries: WindowsAzureDiskResizer-1.0.0.0.zip (831.69 kb)

Working with Windows Azure command line tools from within Visual Studio

Right after my last post (Working with Windows Azure command line tools from PhpStorm), the obvious question came to mind… Can I do Windows Azure things using the command line tools from within Visual Studio as well? Sure you can! At least if you have the NuGet Package Manager Console installed into your Visual Studio.

To be clear: you can use either the PowerShell cmdlets that are available or the Node-based tools (how-to). In this post we’ll be using the PowerShell cmdlets. And once those are installed… there’s nothing extra you have to do to get them working in Visual Studio!

The first thing we’ll have to do before being able to do anything with these cmdlets is make sure we can access the Windows Azure management service. Invoke the Get-AzurePublishSettingsFile command. This will open up a new browser window and generate a .publishsettings file. Save it somewhere and remember the full path to it. Next, we’ll have to import that file using the Import-AzurePublishSettingsFile <path to publishsettings file> command.

If everything went according to plan, we’ll now be able to do some really interesting things from inside our NuGet Package Manager console. Let’s see if we can list all Windows Azure Web Sites under our subscription… Get-AzureWebsite should do!

List Windows Azure Web Site from NuGet Package Manager console

And it did. Let’s scale our brewbuddy website and make use of 3 workers.

Scaling the brewbuddy web site to 3 workers from the NuGet Package Manager console

Whoa!

For reference, here’s the full list of supported cmdlets. There’s also Glenn Block’s post on some common recipes you can mash together using these cmdlets. Enjoy!

[edit] Sandrino Di Mattia has a take on this as well: http://fabriccontroller.net/blog/posts/using-the-windows-azure-cli-on-windows-and-from-within-visual-studio/

Working with Windows Azure from within PhpStorm

Working with Windows Azure and my new toy (PhpStorm), I wanted to have support for doing specific actions like creating a new web site or a new database from within the IDE. Since I’m not a Java guy, writing a plugin was not an option. Fortunately, PhpStorm (or WebStorm for that matter) provides support for issuing commands from the IDE. Which led me to think that it may be possible to hook up the Windows Azure Command Line Tools in my IDE… Let’s see what we can do…

First of all, we’ll need the ‘azure’ tools. These are available for download for Windows or Mac. If you happen to have Node and NPM installed, simply issue npm install azure-cli -g and we’re good to go.

Next, we’ll have to configure PhpStorm with a custom command so that we can invoke these commands from within our IDE. From the File > Settings menu find the Command Line Tool Support pane and add a new framework:

PhpStorm custom framework

Next, enter the following detail. Note that the tool path may be different on your machine. It should be the full path to the command line tools for Windows Azure, which on my machine is C:\Program Files (x86)\Microsoft SDKs\Windows Azure\CLI\0.6.9\wbin\azure.cmd.

PhpStorm custom framework settings

Click Ok, close the settings dialog and return to your working environment. From there, we can open a command line through the Tools > Run Command menu or by simply using the Ctrl+Shift+X keyboard combo. Let’s invoke the azure command:

Running Windows Azure bash tools in PhpStorm / WebStorm

Cool aye? Let’s see if we can actually do some things. The first thing we’ll have to do before being able to do anything with these tools is make sure we can access the Windows Azure management service. Invoke the azure account download command and save the generated .publishsettings file somewhere on your system. Next, we’ll have to import that file using the azure account import <path to publishsettings file> command.

If everything went according to plan, we’ll now be able to do some really interesting things from inside our PhpStorm IDE… How about we create a new Windows Azure Website named “GroovyBaby” in the West US datacenter, with Git support and a local clone linked to it? Here’s the command:

azure site create GroovyBaby --git --location "West US"

And here’s the result:

Create a new website in PhpStorm

I seriously love this stuff! For reference, here’s the complete list of available commands. And Glenn Block cooked up some cool commands too.

Windows Azure Websites and PhpStorm

In my new role as Technical Evangelist at JetBrains, I’ve been experimenting with one of our products a lot: PhpStorm. I was kind of curious how this tool would integrate with Windows Azure Web Sites. Now before you quit reading this post because of that acronym: if you are a Node-head you can also use WebStorm to do the same things I will describe in this post. Let’s see if we can get a simple PHP application running on Windows Azure right from within our IDE…

Setting up a Windows Azure Web Site

Let’s go through setting up a Windows Azure Web Site real quickly. If this is the first time you hear about Web Sites and want more detail on getting started, check the Windows Azure website for a detailed step-by-step explanation.

From the management portal, click the big “New” button and create a new web site. Use the “quick create” so you just have to specify a URL and select the datacenter location where you’ll be hosted. Click “Create” and endure the 4 second wait before everything is provisioned.

Create a Windows Azure web site

Next, make sure Git support is enabled. From your newly created web site, click “Enable Git publishing”. This will create a new Git repository for your web site.

Windows Azure git

From now on, we have a choice to make. We can opt to “deploy from GitHub”, which will link the web site to a project on GitHub and deploy fresh code on every change on a specific branch. It’s very easy to do that, but we’ll be taking the other option: let’s use our Windows Azure Web Site as a Git repository instead.

Creating a PhpStorm project

After starting PhpStorm, go to VCS > Checkout from Version Control > Git. For repository URL, enter the repository that is listed in the Windows Azure management portal. It’s probably similar to https://<yourusername>@<your web name>.scm.azurewebsites.net/stormy.git.

Windows Azure PHPStorm WebStorm

Once that’s done, simply click “Clone”. PhpStorm will ask for credentials, after which it will download the contents of your Windows Azure Web Site. For this post, we started from an empty web site, but if we had started by creating a web site from the gallery, PhpStorm would simply download the entire web site’s contents. After the cloning finishes, this should be your PhpStorm project:

PHPStorm clone web site

Let’s add a new file by right-clicking the project and clicking New > File. Name the file “index.php”, since that is one of the root documents recognized by Windows Azure Web Sites. If PhpStorm asks you whether you want to add the file to the Git repository, answer affirmatively. We want this file to end up being deployed some day.

The following code will do:

<?php echo "Hello world!";

Now let’s get this beauty online!

Publishing the application to Windows Azure

To commit the changes we’ve made earlier, press CTRL + K or use the menu VCS > Commit Changes. This will commit the created and modified files to our local copy of the remote Git repository.

Commit VCS changes PHPStorm

On the “Commit” button, click the little arrow and go with Commit and Push. This will make PhpStorm do two things at once: create a changeset containing our modifications and push it to Windows Azure Web Sites. We’ll be asked for a final confirmation:

Push to Windows Azure

After having clicked Push, PhpStorm will send our contents to Windows Azure Web Sites and create a new deployment as you can see from the management portal:

Windows Azure Web Sites deployment from PHPStorm

Guess what this all did? Our web site is now up and running at http://stormy.azurewebsites.net/.

Hello world! running on Windows Azure Web Sites

A non-Microsoft language on Windows Azure? A non-Microsoft IDE? It all works seamlessly together! Enjoy!

Storing user uploads in Windows Azure blob storage

On one of the mailing lists I follow, an interesting question came up: “We want to write a VSTO plugin for Outlook which copies attachments to blob storage. What’s the best way to do this? What about security?”. Shortly thereafter, an answer came around: “That can be done directly from the client. And storage credentials can be encrypted for use in your VSTO plugin.”

While that’s certainly a solution to the problem, it’s not the best. Let’s try and answer…

What’s the best way to upload data to blob storage directly from the client?

The first solution that comes to mind is implementing the following flow: the client authenticates and uploads data to your service which then stores the upload on blob storage.

Upload data to blob storage

While that is in fact a valid solution, think about the following: you are creating an expensive layer in your application that just sits there copying data from one network connection to another. If you have to scale this solution, you will have to scale out the service layer in between. If you want redundancy, you need at least two machines for doing this simple copy operation… A better approach would be one where the client authenticates with your service and then uploads the data directly to blob storage.

Upload data to blob storage using shared access signature

This approach allows you to have a “cheap” service layer: it can even run on the free version of Windows Azure Web Sites if you have a low traffic volume. You don’t have to scale out the service layer once your number of clients grows (at least, not for the uploading scenario). But how would you handle uploading to blob storage from a security point of view…

What about security? Shared access signatures!

The first suggested answer on the mailing list was this: “(…) storage credentials can be encrypted for use in your VSTO plugin.” That’s true, but you only have 2 access keys to storage. It’s like giving the master key of your house to someone you don’t know. It’s encrypted, sure, but still, the master key is at the client and that’s a potential risk. The solution? Using a shared access signature!

Shared access signatures (SAS) allow us to separate the code that signs a request from the code that executes it. It basically is a set of query string parameters attached to a blob (or container!) URL that serves as the authentication ticket to blob storage. Of course, these parameters are signed using the real storage access key, so that no-one can change this signature without knowing the master key. And that’s the scenario we want to support…

On the service side, the place where you’ll be authenticating your user, you can create a Web API method (or ASMX or WCF or whatever you feel like) similar to this one:

public class UploadController : ApiController
{
    [Authorize]
    public string Put(string fileName)
    {
        var account = CloudStorageAccount.DevelopmentStorageAccount;
        var blobClient = account.CreateCloudBlobClient();
        var blobContainer = blobClient.GetContainerReference("uploads");
        blobContainer.CreateIfNotExists();

        var blob = blobContainer.GetBlockBlobReference("customer1-" + fileName);

        var uriBuilder = new UriBuilder(blob.Uri);
        uriBuilder.Query = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Write,
            SharedAccessStartTime = DateTime.UtcNow,
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(5)
        }).Substring(1);

        return uriBuilder.ToString();
    }
}

This method does a couple of things:

  • Authenticate the client using your authentication mechanism
  • Create a blob reference (not the actual blob, just a URL)
  • Sign the blob URL with write access, allowed from now until now + 5 minutes. That should give the client 5 minutes to start the upload.

On the client side, in our VSTO plugin, the only thing to do now is call this method with a filename. The web service will create a shared access signature to a non-existing blob and returns that to the client. The VSTO plugin can then use this signed blob URL to perform the upload:

Uri url = new Uri("http://...../uploads/customer1-test.txt?sv=2012-02-12&st=2012-12-18T08%3A11%3A57Z&se=2012-12-18T08%3A16%3A57Z&sr=b&sp=w&sig=Rb5sHlwRAJp7mELGBiog%2F1t0qYcdA9glaJGryFocj88%3D");

var blob = new CloudBlockBlob(url);
blob.Properties.ContentType = "text/plain";

using (var data = new MemoryStream(
    Encoding.UTF8.GetBytes("Hello, world!")))
{
    blob.UploadFromStream(data);
}

Easy, secure and scalable. Enjoy!

Protecting your ASP.NET Web API using OAuth2 and the Windows Azure Access Control Service

An article I wrote a while ago has been posted on DeveloperFusion:

The world in which we live evolves at a vast speed. Today, many applications on the Internet expose an API which can be consumed by everyone using a web browser or a mobile application on their smartphone or tablet. How would you build your API if you want these apps to be a full-fledged front-end to your service without compromising security? In this article, I’ll dive into that. We’ll be using OAuth2 and the Windows Azure Access Control Service to secure our API yet provide access to all those apps out there.

Why would I need an API?

A couple of years ago, having a web-based application was enough. Users would navigate to it using their computer’s browser, do their dance and log out again. Nowadays, a web-based application isn’t enough anymore. People have smartphones, tablets and maybe even a refrigerator with Internet access on which applications can run. Applications or “apps”. We’re moving from the web towards apps.

If you want to expose your data and services to external third-parties, you may want to think about building an API. Having an API gives you a giant advantage on the Internet nowadays. Having an API will allow your web application to reach more users. App developers will jump onto your API and build their app around it. Other websites or apps will integrate with your services by consuming your API. The only thing you have to do is expose an API and get people to know it. Apps will come. Integration will come.

A great example of an API is Twitter. They have a massive data store containing tweets and data related to that. They have user profiles. And a web site. And an API. Are you using www.twitter.com to post tweets? I am using the website, maybe once a year. All other tweets come either from my Windows Phone 7’s Twitter application or through www.hootsuite.com, a third-party Twitter client which provides added value in the form of statistics and scheduling. Both the app on my phone as well as the third-party service are using the Twitter API. By exposing an API, Twitter has created a rich ecosystem which drives adoption of their service, reaches more users and adds to their real value: data which they can analyze and sell.

(…)

Getting to know OAuth2

If you decide that your API isn’t public or specific actions can only be done for a certain user (let that third party web site get me my tweets, Twitter!), you’ll be facing authentication and authorization problems. With ASP.NET Web API, this is simple: add an [Authorize] attribute on top of a controller or action method and you’re done, right? Well, sort of…

When using the out-of-the-box authentication/authorization mechanisms of ASP.NET Web API, you are relying on basic or Windows authentication. Both require the user to log in. While perfectly viable and a good way of securing your API, a good alternative may be to use delegation.

In many cases, typically with public API’s, your API user will not really be your user, but an application acting on behalf of that user. That means that the application should know the user’s credentials. In an ideal world, you would only give your username and password to the service you’re using rather than just trusting the third-party application or website with it. You’ll be delegating access to these third parties. If you look at Facebook for example, many apps and websites redirect you to Facebook to do the login there instead of through the app itself.

Head over to the original article for more! (I’ll also be doing a talk on this on some upcoming conferences)

Configuring IIS methods for ASP.NET Web API on Windows Azure Websites and elsewhere

That’s a pretty long title, I agree. When working on my implementation of RFC2324, also known as the HyperText Coffee Pot Control Protocol, I’ve been struggling with something that you will struggle with as well in your ASP.NET Web API’s: supporting additional HTTP methods like HEAD, PATCH or PROPFIND. ASP.NET Web API has no issue with those, but when hosting them on IIS you’ll find yourself in Yellow-screen-of-death heaven.

The reason why IIS blocks these methods (or fails to route them to ASP.NET) is because it may happen that your IIS installation has some configuration leftovers from another API: WebDAV. WebDAV allows you to work with a virtual filesystem (and others) using a HTTP API. IIS of course supports this (because flagship product “SharePoint” uses it, probably) and gets in the way of your API.

Bottom line of the story: if you need those methods or want to provide your own HTTP methods, here’s the bit of configuration to add to your Web.config file:

<?xml version="1.0" encoding="utf-8"?> <configuration> <!-- ... --> <system.webServer> <validation validateIntegratedModeConfiguration="false" /> <modules runAllManagedModulesForAllRequests="true"> <remove name="WebDAVModule" /> </modules> <security> <requestFiltering> <verbs applyToWebDAV="false"> <add verb="XYZ" allowed="true" /> </verbs> </requestFiltering> </security> <handlers> <remove name="ExtensionlessUrlHandler-ISAPI-4.0_32bit" /> <remove name="ExtensionlessUrlHandler-ISAPI-4.0_64bit" /> <remove name="ExtensionlessUrlHandler-Integrated-4.0" /> <add name="ExtensionlessUrlHandler-ISAPI-4.0_32bit" path="*." verb="GET,HEAD,POST,DEBUG,PUT,DELETE,PATCH,OPTIONS,XYZ" modules="IsapiModule" scriptProcessor="%windir%\Microsoft.NET\Framework\v4.0.30319\aspnet_isapi.dll" preCondition="classicMode,runtimeVersionv4.0,bitness32" responseBufferLimit="0" /> <add name="ExtensionlessUrlHandler-ISAPI-4.0_64bit" path="*." verb="GET,HEAD,POST,DEBUG,PUT,DELETE,PATCH,OPTIONS,XYZ" modules="IsapiModule" scriptProcessor="%windir%\Microsoft.NET\Framework64\v4.0.30319\aspnet_isapi.dll" preCondition="classicMode,runtimeVersionv4.0,bitness64" responseBufferLimit="0" /> <add name="ExtensionlessUrlHandler-Integrated-4.0" path="*." verb="GET,HEAD,POST,DEBUG,PUT,DELETE,PATCH,OPTIONS,XYZ" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" /> </handlers> </system.webServer> <!-- ... --> </configuration>

Here’s what each part does:

  • Under modules, the WebDAVModule is being removed. Just to make sure that it’s not going to get in our way ever again.
  • The security/requestFiltering element I’ve added only applies if you want to define your own HTTP methods. So unless you need the XYZ method I’ve defined here, don’t add it to your config.
  • Under handlers, I’m removing the default handlers that route into ASP.NET. Then, I’m adding them again. The important part? The verb attribute. You can provide a list of comma-separated methods that you want to route into ASP.NET. Again, I’ve added my XYZ method but you probably don’t need it.

This will work on any IIS server as well as on Windows Azure Websites. It will make your API… happy.

A phone call from the cloud: Windows Azure, SignalR & Twilio

Note: this blog post used to be an article for the Windows Azure Roadtrip website. Since that one no longer exists, I decided to post the articles on my blog as well. Find the source code for this post here: 05 ConfirmPhoneNumberDemo.zip (1.32 mb).
It was written earlier this year, so some versions of the packages used (like jQuery or SignalR) may be outdated in this post. Live with it.

In the previous blog post we saw how you can send e-mails from Windows Azure. Why not take communication a step further and make a phone call from Windows Azure? I’ve already mentioned that Windows Azure is a platform which will run your code, topped with some awesomesauce in the form of a large number of components that will speed up development. One of those components is the API provided by Twilio, a third-party service.

Twilio is a telephony web-service API that lets you use your existing web languages and skills to build voice and SMS applications. Twilio Voice allows your applications to make and receive phone calls. Twilio SMS allows your applications to make and receive SMS messages. We’ll use Twilio Voice in conjunction with jQuery and SignalR to spice up a sign-up process.

The scenario

The idea is simple: we want users to sign up using a username and password. In addition, they’ll have to provide their phone number. The user will submit the sign-up form and will be displayed a confirmation code. In the background, the user will be called and asked to enter this confirmation code in order to validate his phone number. Once finished, the browser will automatically continue the sign-up process. Here’s a visual:

clip_image002

Sounds too good to be true? Get ready, as it’s relatively simple using Windows Azure and Twilio.

Let’s start…

Before we begin, make sure you have a Twilio account. Twilio offers some free credits, enough to test with. After registering, make sure that you enable international calls and that your phone number is registered as a developer. Twilio takes this step in order to ensure that their service isn’t misused for making abusive phone calls using free developer accounts.

Next, create a Windows Azure project containing an ASP.NET MVC 4 web role. Install the following NuGet packages in it (right-click, Library Package Manager, go wild):

  • jQuery
  • jQuery.UI.Combined
  • jQuery.Validation
  • json2
  • Modernizr
  • SignalR
  • Twilio
  • Twilio.Mvc
  • Twilio.TwiML

It may also be useful to develop some familiarity with the concepts behind SignalR.

The registration form

Let’s create our form. Using a simple model class, SignUpModel, create the following action method:

public ActionResult Index() { return View(new SignUpModel()); }

This action method is accompanied with a view, a simple form requesting the required information from our user:

@using (Html.BeginForm("SignUp", "Home", FormMethod.Post))
{
    @Html.ValidationSummary(true)
    <fieldset>
        <legend>Sign Up for this awesome service</legend>

        @* etc etc etc *@

        <div class="editor-label">
            @Html.LabelFor(model => model.Phone)
        </div>
        <div class="editor-field">
            @Html.EditorFor(model => model.Phone)
            @Html.ValidationMessageFor(model => model.Phone)
        </div>

        <p>
            <input type="submit" value="Sign up!" />
        </p>
    </fieldset>
}

We’ll spice up this form with a dialog first. Using jQuery UI, we can create a simple <div> element which will serve as the dialog’s content. Note the ui-helper-hidden class which is used to make it invisible to view.

<div id="phoneDialog" class="ui-helper-hidden"> <h1>Keep an eye on your phone...</h1> <p>Pick up the phone and follow the instructions.</p> <p>You will be asked to enter the following code:</p> <h2>1743</h2> </div>

This is a simple dialog in which we’ll show a hardcoded confirmation code which the user will have to provide when called using Twilio.

Next, let’s code our JavaScript logic which will spice up this form. First, add the required JavaScript libraries for SignalR (more on that later):

<script src="@Url.Content("~/Scripts/jquery.signalR-0.5.0.min.js")" type="text/javascript"></script> <script src="@Url.Content("~/signalr/hubs")" type="text/javascript"></script>

Next, capture the form’s submit event and, if the phone number has not been validated yet, cancel the submit event and show our dialog instead:

$('form:first').submit(function (e) {
    if ($(this).valid() && $('#Phone').data('validated') != true) {
        // Show a dialog
        $('#phoneDialog').dialog({
            title: '',
            modal: true,
            width: 400,
            height: 400,
            resizable: false,
            beforeClose: function () {
                if ($('#Phone').data('validated') != true) {
                    return false;
                }
            }
        });

        // Don't submit. Yet.
        e.preventDefault();
    }
});

Nothing fancy yet. If you now run this code, you’ll see that a dialog opens and remains open for eternity. Let’s craft some SignalR code now. SignalR uses a concept of Hubs to enable client-server communication, but also server-client communication. We’ll need the latter to inform our view whenever the user has confirmed his phone number. In the project, add the following class:

[HubName("phonevalidator")] public class PhoneValidatorHub : Hub { public void StartValidation(string phoneNumber) { } }

This class defines a service that the client can call. SignalR will also keep the connection with the client open so that this PhoneValidatorHub can later send a message to the client as well. Let’s connect our view to this hub. In the form submit event handler, add the following line of code:

// Validate the phone number using Twilio
$.connection.phonevalidator.startValidation($('#Phone').val());

We’ve created a C# class with a StartValidation method and we’re calling the startValidation message from JavaScript. Coincidence? No. SignalR makes this possible. But we’re not finished yet. We can now call a method on the server side, but how would the server inform the client when the phone number has been validated? I’ll get to that point later. First, let’s make sure our JavaScript code can receive that call from the server. To do so, connect to the PhoneValidator hub and add a callback function to it:

var validatorHub = $.connection.phonevalidator;
validatorHub.validated = function (phoneNumber) {
    if (phoneNumber == $('#Phone').val()) {
        $('#Phone').data('validated', true);
        $('#phoneDialog').dialog('destroy');
        $('form:first').trigger('submit');
    }
};
$.connection.hub.start();

What we’re doing here is adding a client-side function named validated to the SignalR hub. We can call this function, sitting at the client side, from our server-side code later on. The function itself is easy: it checks whether the phone number that was validated matches the one the user entered and, if so, it submits the form and completes the signup.

All that’s left is calling the user and, when the confirmation succeeds, we’ll have to inform our client by calling the validated message on the hub.

Initiating a phone call

The phone call to our user will be initiated in the PhoneValidatorHub’s StartValidation method. Add the following code there:

var twilioClient = new TwilioRestClient("api user", "api password");

string url = "http://mas.cloudapp.net/Home/TwilioValidationMessage?passcode=1743"
    + "&phoneNumber=" + HttpContext.Current.Server.UrlEncode(phoneNumber);

// Instantiate the call options that are passed to the outbound call
CallOptions options = new CallOptions();
options.From = "+14155992671"; // Twilio's developer number
options.To = phoneNumber;
options.Url = url;

// Make the call.
twilioClient.InitiateOutboundCall(options);

Using the TwilioRestClient class, we create a request to Twilio. We also pass on a URL which points to our application. Twilio uses TwiML, an XML format to instruct their phone services. When calling the InitiateOutboundCall method, Twilio will issue a request to the URL we are hosting (http://.....cloudapp.net/Home/TwilioValidationMessage) to fetch the TwiML which tells Twilio what to say, ask, record, gather, … on the phone.

Next up: implementing the TwilioValidationMessage action method.

public ActionResult TwilioValidationMessage(string passcode, string phoneNumber)
{
    var response = new TwilioResponse();
    response.Say("Hi there, welcome to Maarten's Awesome Service.");
    response.Say("To validate your phone number, please enter the 4 digit"
                 + " passcode displayed on your screen followed by the pound sign.");
    response.BeginGather(new
    {
        numDigits = 4,
        action = "http://mas.cloudapp.net/Home/TwilioValidationCallback?phoneNumber="
                 + Server.UrlEncode(phoneNumber),
        method = "GET"
    });
    response.EndGather();

    return new TwiMLResult(response);
}
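For reference, the TwiML this action method sends back to Twilio looks roughly like this (a hand-written approximation, not the literal output):

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say>Hi there, welcome to Maarten's Awesome Service.</Say>
  <Say>To validate your phone number, please enter the 4 digit passcode displayed on your screen followed by the pound sign.</Say>
  <Gather numDigits="4" method="GET"
          action="http://mas.cloudapp.net/Home/TwilioValidationCallback?phoneNumber=..." />
</Response>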

That’s right. We’re creating some TwiML here. Our ASP.NET MVC action method is telling Twilio to say some text and to gather 4 digits from his phone pad. These 4 digits will be posted to the TwilioValidationCallback action method by the Twilio service. Which is the next method we’ll have to implement.

public ActionResult TwilioValidationCallback(string phoneNumber)
{
    var hubContext = GlobalHost.ConnectionManager.GetHubContext<PhoneValidatorHub>();
    hubContext.Clients.validated(phoneNumber);

    var response = new TwilioResponse();
    response.Say("Thank you! Your browser should automatically continue. Bye!");
    response.Hangup();

    return new TwiMLResult(response);
}

The TwilioValidationCallback action method does two things. First, it gets a reference to our SignalR hub and calls the validated function on it. As you may recall, we created this method on the hub’s client side, so in fact our ASP.NET MVC server application is calling a method on the client side. Doing this triggers the client to hide the validation dialog and complete the user sign-up process.

Another action we’re doing here is generating some more TwiML (it’s fun!). We thank the user for validating his phone number and, after that, we hang up the call.

You see? Working with voice (and text messages too, if you want) isn’t that hard. It enables additional scenarios that can make your application stand out from all the many others out there. Enjoy!

05 ConfirmPhoneNumberDemo.zip (1.32 mb)

Sending e-mail from Windows Azure

Note: this blog post used to be an article for the Windows Azure Roadtrip website. Since that one no longer exists, I decided to post the articles on my blog as well. Find the source code for this post here: 04 SendingEmailsFromTheCloud.zip (922.27 kb).

When a user subscribes, you send him a thank-you e-mail. When his account expires, you send him a warning message containing a link to purchase a new subscription. When he places an order, you send him an order confirmation. I think you get the picture: a fairly common scenario in almost any application is sending out e-mails.

Now, why would I spend a blog post on sending out an e-mail? Well, for two reasons. First, Windows Azure doesn’t have a built-in mail server. No reason to panic! I’ll explain why and how to overcome this in a minute. Second, I want to demonstrate a technique that will make your applications a lot more scalable and resilient to errors.

E-mail services for Windows Azure

Windows Azure doesn’t have a built-in mail server. And for good reasons: if you deploy your application to Windows Azure, it will be hosted on an IP address which previously belonged to someone else. It’s a shared platform, after all. Now, what if some obscure person used Windows Azure to send out a number of spam messages? Chances are your newly-acquired IP address has already been blacklisted, and any e-mail you send from it ends up in people’s spam filters.

All that is fine, but of course, you still want to send out e-mails. If you have your own SMTP server, you can simply configure your .NET application hosted on Windows Azure to make use of your own mail server. There are a number of so-called SMTP relay services out there as well. Even the Belgian hosters like Combell, Hostbasket or OVH offer this service. Microsoft has also partnered with SendGrid to have an officially-supported service for sending out e-mails too. Windows Azure customers receive a special offer of 25,000 free e-mails per month from them. It’s a great service to get started with sending e-mails from your applications: after all, you’ll be able to send 25,000 free e-mails every month. I’ll be using SendGrid in this blog post.

Asynchronous operations

I said earlier that I wanted to show you two things: sending e-mails and building scalable and fault-resilient applications. This can be done using asynchronous operations. No, I don’t mean AJAX. What I mean is that you should create loosely-coupled applications.

Imagine that I was to send out an e-mail whenever a user registers. If the mail server is not available for that millisecond when I want to use it, the send fails and I might have to show an error message to my user (or even worse: a YSOD). Why would that happen? Because my application logic expects that it can communicate with a mail server in a synchronous manner.

clip_image002

Now let’s remove that expectation. If we introduce a queue in between both services, the front-end can keep accepting registrations even when the mail server is down. And when it’s back up, the queue will be processed and e-mails will be sent out. Also, if you experience high loads, simply scale out the front-end and add more servers there. More e-mail messages will end up in the queue, but they are guaranteed to be processed in the future at the pace of the mail server. With synchronous communication, the mail service would probably experience high loads or even go down when a large number of front-end servers is added.

clip_image004

Show me the code!

Let’s combine the two approaches described earlier in this post: sending out e-mails over an asynchronous service. Before we start, make sure you have a SendGrid account (free!). Next, familiarise yourself with Windows Azure storage queues using this simple tutorial.

In a fresh Windows Azure web role, I’ve created a quick-and-dirty user registration form:

clip_image006

Nothing fancy, just a form that takes a post to an ASP.NET MVC action method. This action method stores the user in a database and adds a message to a queue named emailconfirmation. Here’s the code for this action method:

[HttpPost, ActionName("Register")] public ActionResult Register_Post(RegistrationModel model) { if (ModelState.IsValid) { // ... store the user in the database ... // serialize the model var serializer = new JavaScriptSerializer(); var modelAsString = serializer.Serialize(model); // emailconfirmation queue var account = CloudStorageAccount.FromConfigurationSetting("StorageConnection"); var queueClient = account.CreateCloudQueueClient(); var queue = queueClient.GetQueueReference("emailconfirmation"); queue.CreateIfNotExist(); queue.AddMessage(new CloudQueueMessage(modelAsString)); return RedirectToAction("Thanks"); } return View(model); }

As you can see, it’s not difficult to work with queues. You just enter some data in a message and push it onto the queue. In the code above, I’ve serialized the registration model containing my newly-created user’s name and e-mail to the JSON format (using JavaScriptSerializer). A message can contain binary or textual data: as long as it’s less than 64 KB in data size, the message can be added to a queue.

Being cheap with Web Workers

When boning up on Windows Azure, you’ve probably read about so-called Worker Roles, virtual machines that are able to run your back-end code. The problem I see with Worker Roles is that they are expensive to start with. If your application has 100 users and your back-end load is low, why would you reserve an entire server to run that back-end code? The cloud and Windows Azure are all about scalability and using a “Web Worker” will be much more cost-efficient to start with - until you have a large user base, that is.

A Worker Role consists of a class that inherits the RoleEntryPoint class. It looks something along these lines:

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // ...

        return base.OnStart();
    }

    public override void Run()
    {
        while (true)
        {
            // ...
        }
    }
}

You can run this same code in a Web Role too! And that’s what I mean by a Web Worker: by simply adding this class which inherits RoleEntryPoint to your Web Role, it will act as both a Web and Worker role in one machine.

Call me cheap, but I think this is a nice hidden gem. The best part about this is that whenever your application’s load requires a separate virtual machine running the worker role code, you can simply drag and drop this file from the Web Role to the Worker Role and scale out your application as it grows.

Did you send that e-mail already?

Now that we have a pending e-mail message in our queue and we know we can reduce costs using a web worker, let’s get our e-mail across the wire. First of all, using SendGrid as our external e-mail service offers us a giant development speed advantage, since they are distributing their API client as a NuGet package. In Visual Studio, right-click your web role project and click the “Library Package Manager” menu. In the dialog (shown below), search for Sendgrid and install the package found. This will take a couple of seconds: it will download the SendGrid API client and will add an assembly reference to your project.

clip_image008

All that’s left to do is write the code that reads out the messages from the queue and sends the e-mails using SendGrid. Here’s the queue reading:

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        CloudStorageAccount.SetConfigurationSettingPublisher((configName, configSetter) =>
        {
            string value = "";
            if (RoleEnvironment.IsAvailable)
            {
                value = RoleEnvironment.GetConfigurationSettingValue(configName);
            }
            else
            {
                value = ConfigurationManager.AppSettings[configName];
            }
            configSetter(value);
        });

        return base.OnStart();
    }

    public override void Run()
    {
        // emailconfirmation queue
        var account = CloudStorageAccount.FromConfigurationSetting("StorageConnection");
        var queueClient = account.CreateCloudQueueClient();
        var queue = queueClient.GetQueueReference("emailconfirmation");
        queue.CreateIfNotExist();

        while (true)
        {
            var message = queue.GetMessage();
            if (message != null)
            {
                // ...

                // mark the message as processed
                queue.DeleteMessage(message);
            }
            else
            {
                Thread.Sleep(TimeSpan.FromSeconds(30));
            }
        }
    }
}

As you can see, reading from the queue is very straightforward. You use a storage account, get a queue reference from it and then, in an infinite loop, you fetch a message from the queue. If a message is present, process it. If not, sleep for 30 seconds. On a side note: why wait 30 seconds for every poll? Well, Windows Azure will bill you per 100,000 requests to your storage account. It’s a small amount, around 0.01 cent, but it may add up quickly if this code is polling your queue continuously on an 8 core machine… Bottom line: on any cloud platform, try to architect for cost as well.

Now that we have our message, we can deserialize it and create a new e-mail that can be sent out using SendGrid:

// deserialize the model
var serializer = new JavaScriptSerializer();
var model = serializer.Deserialize<RegistrationModel>(message.AsString);

// create a new email object using SendGrid
var email = SendGrid.GenerateInstance();
email.From = new MailAddress("maarten@example.com", "Maarten");
email.AddTo(model.Email);
email.Subject = "Welcome to Maarten's Awesome Service!";
email.Html = string.Format(
    "<html><p>Hello {0},</p><p>Welcome to Maarten's Awesome Service!</p>"
    + "<p>Best regards, <br />Maarten</p></html>", model.Name);

var transportInstance = REST.GetInstance(new NetworkCredential("username", "password"));
transportInstance.Deliver(email);

// mark the message as processed
queue.DeleteMessage(message);

Sending e-mail using SendGrid is in fact getting a new e-mail message instance from the SendGrid API client, passing the e-mail details (from, to, body, etc.) on to it and handing it your SendGrid username and password upon sending.

One last thing: you notice we’re only deleting the message from the queue after processing it has succeeded. This is to ensure the message is actually processed. If for some reason the worker role crashes during processing, the message will become visible again on the queue and will be processed by a new worker role which processes this specific queue. That way, messages are never lost and always guaranteed to be processed at least once.
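If you want more control over how long a message stays invisible while you’re processing it, you can pass an explicit visibility timeout when fetching it. A small sketch, assuming the storage client you’re using exposes such an overload of GetMessage (the 5 minutes is just an example value):

// Keep the message invisible to other consumers for 5 minutes.
// If we don't delete it within that window (e.g. because we crashed),
// it becomes visible again and gets picked up by another worker.
var message = queue.GetMessage(TimeSpan.FromMinutes(5));
if (message != null)
{
    // ... process the message ...

    queue.DeleteMessage(message);
}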

What PartitionKey and RowKey are for in Windows Azure Table Storage

For the past few months, I’ve been coaching a “Microsoft Student Partner” (who has a great blog on Kinect for Windows by the way!) on Windows Azure. One of the questions he recently had was around PartitionKey and RowKey in Windows Azure Table Storage. What are these for? Do I have to specify them manually? Let’s explain…

Windows Azure storage partitions

All Windows Azure storage abstractions (Blob, Table, Queue) are built upon the same stack (whitepaper here). While there’s much more to tell about it, the reason why it scales is its partitioning logic. Whenever you store something on Windows Azure storage, it is located on some partition in the system. Partitions are used for scale out in the system. Imagine that there are only 3 physical machines that are used for storing data in Windows Azure storage:

Windows Azure Storage partition

Based on the size and load of a partition, partitions are fanned out across these machines. Whenever a partition gets a high load or grows in size, the Windows Azure storage management can kick in and move a partition to another machine:

Windows Azure storage partition

By doing this, Windows Azure can ensure a high throughput as well as its storage guarantees. If a partition gets busy, it’s moved to a server which can support the higher load. If it gets large, it’s moved to a location where there’s enough disk space available.

Partitions are different for every storage mechanism:

  • In blob storage, each blob is in a separate partition. This means that every blob can get the maximal throughput guaranteed by the system.
  • In queues, every queue is a separate partition.
  • In tables, it’s different: you decide how data is co-located in the system.

PartitionKey in Table Storage

In Table Storage, you have to decide on the PartitionKey yourself. In essence, you are responsible for the throughput you’ll get on your system. If you put every entity in the same partition (by using the same partition key), you’ll be limited to the size of the storage machines for the amount of storage you can use. Plus, you’ll be constraining the maximal throughput, as there are lots of entities in the same partition.

Should you set the PartitionKey to the same value for every entity stored? No. You’ll end up with scaling issues at some point.
Should you set the PartitionKey to a unique value for every entity stored? No. You can do this and every entity stored will end up in its own partition, but you’ll find that querying your data becomes more difficult. And that’s where our next concept kicks in…

RowKey in Table Storage

A RowKey in Table Storage is a very simple thing: it’s your “primary key” within a partition. PartitionKey + RowKey form the composite unique identifier for an entity. Within one PartitionKey, you can only have unique RowKeys. If you use multiple partitions, the same RowKey can be reused in every partition.

So in essence, a RowKey is just the identifier of an entity within a partition.

PartitionKey and RowKey and performance

Before building your code, it’s a good idea to think about both properties. Don’t just assign them a guid or a random string as it does matter for performance.

The fastest way of querying? Specifying both PartitionKey and RowKey. By doing this, table storage will immediately know which partition to query and can simply do an ID lookup on RowKey within that partition.

Less fast but still fast enough will be querying by specifying PartitionKey: table storage will know which partition to query.

Less fast: querying on only RowKey. Doing this will give table storage no pointer on which partition to search in, resulting in a query that possibly spans multiple partitions, possibly multiple storage nodes as well. Within a partition, searching on RowKey is still pretty fast as it’s a unique index.

Slow: searching on other properties (again, spans multiple partitions and properties).

Note that Windows Azure storage may decide to group partitions in so-called "Range partitions" - see http://msdn.microsoft.com/en-us/library/windowsazure/hh508997.aspx.

In order to improve query performance, think about your PartitionKey and RowKey upfront, as they are the fast way into your datasets.

Deciding on PartitionKey and RowKey

Here’s an exercise: say you want to store customers, orders and orderlines. What will you choose as the PartitionKey (PK) / RowKey (RK)?

Let’s use three tables: Customer, Order and Orderline.

An ideal setup may be this one, depending on how you want to query everything:

Customer (PK: sales region, RK: customer id) – it enables fast searches on region and on customer id
Order (PK: customer id, RK: order id) – it allows me to quickly fetch all orders for a specific customer (as they are colocated in one partition), while still allowing fast querying on a specific order id as well
Orderline (PK: order id, RK: order line id) – allows fast querying on both order id as well as order line id.

Of course, depending on the system you are building, the following may be a better setup:

Customer (PK: customer id, RK: display name) – it enables fast searches on customer id and display name
Order (PK: customer id, RK: order id) – it allows me to quickly fetch all orders for a specific customer (as they are colocated in one partition), while still allowing fast querying on a specific order id as well
Orderline (PK: order id, RK: item id) – allows fast querying on both order id as well as the item bought, of course given that one order can only contain one order line for a specific item (PK + RK should be unique)
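To make this a bit more tangible, here’s a minimal sketch of what such an Order entity and the fastest possible query (both PartitionKey and RowKey known) could look like using the Windows Azure Storage client’s table API. Class, property and key values are made up for the example:

public class OrderEntity : TableEntity
{
    public OrderEntity() { }

    public OrderEntity(string customerId, string orderId)
    {
        PartitionKey = customerId; // all orders for one customer live in one partition
        RowKey = orderId;          // unique within that partition
    }

    public DateTime OrderDate { get; set; }
    public double Total { get; set; }
}

// Point query: table storage knows exactly which partition and row to hit
var account = CloudStorageAccount.Parse("<your storage connection string>");
var tableClient = account.CreateCloudTableClient();
var orderTable = tableClient.GetTableReference("Order");
orderTable.CreateIfNotExists();

var retrieve = TableOperation.Retrieve<OrderEntity>("customer1", "order-001");
var order = (OrderEntity)orderTable.Execute(retrieve).Result;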

You see? Choose them wisely, depending on your queries. And maybe an important sidenote: don’t be afraid of denormalizing your data and storing data twice in a different format, supporting more query variations.

There’s one additional “index”

That’s right! People have been asking Microsoft for a secondary index. And it’s already there… The table name itself! Take our customer – order – orderline sample again…

Having a Customer table containing all customers may be interesting to search within that data. But having an Orders table containing every order for every customer may not be the ideal solution. Maybe you want to create an order table per customer? Doing that, you can easily query the order id (it’s the table name) and within the order table, you can have more detail in PK and RK.
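A rough sketch of that idea, here with one orders table per customer and assuming your customer identifiers happen to be valid table names (table names must be alphanumeric):

// One orders table per customer: the table name itself acts as the extra "index"
var account = CloudStorageAccount.Parse("<your storage connection string>");
var tableClient = account.CreateCloudTableClient();

var customerOrders = tableClient.GetTableReference("orders" + customerId);
customerOrders.CreateIfNotExists();

// Within this table, PartitionKey and RowKey can carry more detail,
// for example order id / order line id.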

And there's one more: your account name. Split data over multiple storage accounts and you have yet another "partition".

Conclusion

In conclusion? Choose PartitionKey and RowKey wisely. The more meaningful to your application or business domain, the faster querying will be and the more efficient table storage will work in the long run.