Episode 69 - SQL Azure Federations with George Huey

The Discussion

  • User profile image

    SQL Azure Federations could be great. What's the deal with having to specify which shard to access using the USE FEDERATION statement?

    As I see it, there isn't much difference between having to incorporate these statements in every SQL statement and having n different connection strings in your code. The only real functionality offered here is that it's easy to scale out your database to many shards, and that's it. Going from SQL Azure to SQL Azure Federations still requires you to change your code significantly.

    Honestly, I think that the SQL Azure Federation story today is weak. There needs to be a higher abstraction level so that you don't need to handle the different shards in your code. Isn't this obvious?

  • User profile image
    Cihan Biyikoglu

    Hi There, Thanks for the feedback.
    I do understand that folks who come from SQL Server with a single-database setup need to change code to move to federations. In the future we would like to make this much easier and more transparent, but the goal with federations is to address a scale target far beyond what a single database or a centralized scale-out database could achieve. Federations also need to deliver this with the new economics of the cloud: eliminating overcapacity through elasticity.
    I would love to hear your thoughts on what we should work on next in federations to get it closer to what you need.
    One part I am unclear on is the "n number of connection strings with USE FEDERATION" comment. We require a single connection string in the app, pointing to a single endpoint, and that is the root database. It would be great if you could expand on that. If you prefer a private email, I am cihan.biyikoglu@microsoft.com or @cihangirb on Twitter.
    Many Thanks

  • User profile image

    Hi Cihan! Thanks for answering my post :)

    1. What I want from SQL Azure Federations
      1. I want true elasticity - scale out and scale in - without downtime. As of now, it's only possible to scale-out without downtime. If you want to reduce the number of shards you need to take the production system offline.
      2. I want to be able to write simple sql commands like
        DELETE Customers WHERE CustomerID IN (X, Y, Z)
        without having to worry about whether X, Y and Z are in the same shard or not. Why not just distribute the SQL command to all shards and combine the results before my SqlCommand gets them back? I see that this might hurt performance, but not necessarily! If your dataset is huge, it's faster to let 3 servers process this command in parallel than to have it execute on only one server. At the very least, let there be an option to do it, and let developers choose between simplicity and performance.
    2. "Connection string vs. USE FEDERATION statements"
      1. What I meant here is simply that having 3 SQL Azure databases and 3 different connection strings in your production code to reach them is essentially the same as having a SQL Azure Federation with 3 shards and starting every SQL command with a "USE FEDERATION ..." statement to reach the correct database shard. From a practical standpoint, SQL Azure Federations doesn't add that much value in terms of the cost, implementation or maintenance of production systems, as I see it.
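The fan-out idea from point 1.2 above (send the same command to every shard, run it in parallel, and combine the results) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: three in-memory SQLite databases stand in for the shards, and the helper names `make_shard` and `delete_everywhere` are hypothetical. It is not how SQL Azure Federations works, and it ignores cross-shard transactions and partial failures.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(customer_ids):
    """Create an in-memory SQLite database standing in for one shard (assumption)."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO Customers VALUES (?)", [(i,) for i in customer_ids])
    conn.commit()
    return conn


def delete_everywhere(shards, customer_ids):
    """Send the same DELETE to every shard in parallel; return total rows deleted."""
    placeholders = ", ".join("?" for _ in customer_ids)
    sql = f"DELETE FROM Customers WHERE CustomerID IN ({placeholders})"

    def run(conn):
        # Each shard deletes only the rows it actually holds.
        cur = conn.execute(sql, list(customer_ids))
        conn.commit()
        return cur.rowcount

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(run, shards))


# Three shards, each holding a contiguous range of customer IDs.
shards = [
    make_shard(range(1, 101)),
    make_shard(range(101, 201)),
    make_shard(range(201, 301)),
]

# The caller does not need to know which shard holds which customer.
deleted = delete_everywhere(shards, [5, 150, 299])
remaining = [s.execute("SELECT COUNT(*) FROM Customers").fetchone()[0] for s in shards]
```

The trade-off mentioned above is visible here: every shard receives the statement even if it holds none of the keys, which is simple for the caller but costs one round trip per shard.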
  • User profile image

    @Cihan Biyikoglu: See my comment in the post below..  Forgot to reply to your post.

    Also, if it's possible at all, I would love to get some hints as to where SQL Azure Federations is going from now on. This could be a database-killer app as I see it!

  • User profile image
    Cihan Biyikoglu

    Thanks for the comments Hans.
    #1 above: absolutely. We'll certainly make these options available.
    #2: USE FED has a few benefits. One of them is preventing connection multiplexing or pool fragmentation, which is a really big problem when the compute and DB node counts get large: http://blogs.msdn.com/b/cbiyikoglu/archive/2012/02/08/connection-pool-fragmentation-scale-to-100s-of-nodes-with-federations-and-you-won-t-need-to-ever-learn-what-these-nasty-problems-are.aspx
    Another benefit of USE FED is the ability to switch connections without disconnects. A single thread does not have to keep handing connections around and can simply repeat USE FED to do this.
    Aside from all that, USE FED pays off when ONLINE repartitioning kicks in. There are a ton of race conditions around where to route your connection the moment a FED SPLIT or a FED DROP completes. USE FED makes that worry-free and guarantees you will be sent to the right DB at any moment in time. You can imagine how important that will be, especially when repartitioning becomes online.
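The routing point above (the caller names a key value, not a physical database, so the same call keeps working after a split) can be illustrated with a toy range map. This is a rough sketch under stated assumptions: `ShardMap`, `route`, `split` and the member names are hypothetical, and nothing here reflects how SQL Azure actually implements USE FEDERATION or repartitioning.

```python
import bisect


class ShardMap:
    """Toy range map: boundaries[i] is the lowest key served by members[i]."""

    def __init__(self, first_member):
        self.boundaries = [0]
        self.members = [first_member]

    def route(self, key):
        """Return the member whose key range contains `key`."""
        if key < self.boundaries[0]:
            raise KeyError(key)
        return self.members[bisect.bisect_right(self.boundaries, key) - 1]

    def split(self, at, new_member):
        """Split the range containing `at`; keys >= at move to `new_member`."""
        if at <= self.boundaries[0]:
            raise ValueError("split point must be above the lowest boundary")
        idx = bisect.bisect_right(self.boundaries, at)
        if self.boundaries[idx - 1] == at:
            raise ValueError("a boundary already exists at this key")
        self.boundaries.insert(idx, at)
        self.members.insert(idx, new_member)


shard_map = ShardMap("member_a")

# The caller only ever passes a key value, never a database name.
before = shard_map.route(150)

# A split happens; the caller's routing call does not change.
shard_map.split(100, "member_b")
after = shard_map.route(150)
still_a = shard_map.route(50)
```

With per-shard connection strings, the application would have to learn about the split and update its own mapping. Routing by key keeps that knowledge in one place.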

  • User profile image

    @Cihan Biyikoglu:

    Thanks for your answer and your excellent blog post! I'm currently working with a relatively small dataset (only an 80 GB on-premises database), and sharding up to hundreds of nodes is a whole other story. In that space SQL Azure Federations is truly a lifesaver!

    But SQL Azure Federations could become the cloud database solution numero uno, or the de facto standard, if it just worked out of the box for small federations of up to, say, 10 shards. I would guess that 90% of real-world databases would fit in this category, or maybe even need fewer shards.

    Anyways; Thanks a lot for the input, Cihan!
