Wednesday, July 01, 2009

A Quick Look at Azure, Microsoft's Cloud

Last week at a ThoughtWorks home-office gathering in New York we had a presentation by one of Microsoft's Architecture Evangelists about their in-the-works cloud offering, Azure. I have to say I was pretty impressed. (If not for the whole Windows thing, I'd probably be trying it out instead of writing about it right now. If you aren't as bothered by Windows as I am, you can register and try it yourself.) Here are some fairly random impressions, mostly based on comparison to the two other best-known cloud contenders.

Azure conceptually leans more towards Google AppEngine's "scale me up; scale me down; hide the details" approach than Amazon EC2's easy-to-manage-infrastructure-as-a-service model, though it currently sits somewhere between them. There's not (yet, at least) any automatic scaling up and down, which surprised me. However what you manually configure is just the number of instances of each application, which is far simpler than manually spinning up new instances of EC2 images and getting them into your cluster (or what-have-you). (Although if you've set up your infrastructure right, your EC2 scaling activities could be just as simple. EC2 just gives you the freedom to do it badly.)

There are two-and-a-half types of application: web apps, worker apps, and web+worker apps. The worker concept is very cool and not something you can do with AppEngine. (The closest thing is their support for scheduled tasks, but these are limited to scheduled URL hits no more often than once a minute with the same restrictions as a normal web request.) Apps all run on something pretty close to the normal .NET framework (minus normal file system access and presumably some other areas--I forgot to ask about threading.)

There are three kinds of storage.

  • A file-system-like BLOB storage API, which involves saving individual blocks and then telling the system in what order to stitch them together. It seemed clunky, but I could see writing a simple abstraction over it for small stuff (where you'd usually only need one block) and for large stuff maybe being able to take advantage of already having some blocks uploaded when something gets interrupted and can resume later.
  • A BigTable-like "table" storage API. It isn't as rich as AppEngine's BigTable-backed Datastore in that entity properties are restricted to very simple types (no lists, for example) and there's no built-in support for relationships of any kind (even the simple hierarchy of Datastore). (Note there's nothing to stop you from storing keys of other entities as property values and rolling your own awkward relationships.) Similar to Datastore on AppEngine, you have to think about how you want your data partitioned and make some under-informed guesses about slow queries across multiple partitions vs parallel queries each in their own partition. (On Azure, the key for each entity includes an app-generated token naming the partition in which it should reside, which is a bit simpler than Datastore's parentage-based entity groups.) The story on transactions (being limited to a single entity group) seems to be similar as well. There's also currently no indexing by anything other than key, though it sounded like they'll allow you to add at least one additional index per table before they go proper-live.
  • Most interestingly, damn-near plain old SQL Server storage in the form of SQL Data Services, with just about every feature of MS SQL Server any normal application would use (except distributed transactions). I believe this is a separate service that you'll need to sign up for and eventually pay for separately. Regardless, it's a key element of the offering, because it provides a huge boost to the suggestion that you can migrate existing .NET applications into the cloud.

There are also persistent queues, which web apps can use to communicate to worker apps asynchronously. (I guess this qualifies as a fourth kind of storage.) They feature guaranteed delivery (with possible repeated delivery) of messages. This is very cool and not something AppEngine offers. (Even if you rolled your own queue-ish thing in datastore, there's no facility to consume messages intelligently.)

There are REST APIs (and SOAP as well, I think) for just about everything, so you have a lot of options for architectures (using heterogeneous technologies) that run partly on-premisis and partly in the Azure cloud.

There are a bunch of other services you can hook into, including ".NET Services" (low-level technical stuff including a Service Bus, Access Control, and a Workflow Service) and "Live Services" (higher-level end-user-meaningful services such as Contacts, Windows Live ID, and assorted Bing stuff).

There's also, surprisingly, the ability to run unmanaged code in your .NET application, AND some form of support for non-.NET stuff via IIS's FastCGI support.

Anyway, it looks like it'll be a compelling offering, competitive with EC2-hosted Windows, and thanks to SQL Data Services, much closer to allowing you to put a normal .NET web app into the cloud than AppEngine can do for Java web apps. For any organization willing to trust Microsoft with its infrastructure, Azure is a good reason to be excited. For the rest of the world, carefully executed EC2 still looks like the best bet for the Enterprise in the cloud.

1 comment:

Brent Stineman said...

your making the boundaries a bit fuzzy in your post.

Windows Azure is the PaaS offering. It is complimented by Azure Storage (which contains queue, blob, and table dynamics).

SQL Data Services is a seperate service that must be subscribed to. It along with .NET Services, Live Service, xRM, etc... comprise other products within the larger umbrella of Microsoft Azure Services (which includes Windows Azure).

May seem like I'm picking nits. :) But when your ordering off the menu, if you don't get specific, you might not get what you expected.