SQL Database Sharding Patterns

Download this episode

Download Video

Download captions

Download Captions

Description

Join Scott Klein as he welcomes Shaun Tinline-Jones, Senior Program Manager for AzureCat, as they discuss patterns and practices for sharding Azure SQL Database. Scott has been working with Shaun and the AzureCAT team for some time now on a sample library that provides elastic scalability in SQL Database. Scott and Shaun talk about some of the complexities of database sharding but introduce and demo a library that help overcome those complexities and provide the elastic scale needed in SQL Database. Shaun highlights the key components and concepts when working with the sharding and then dives right into demoing the library to illustrate how the library can be used to implement sharding patterns into SQL Database. The sample library is available for customers to modify/extend/use and is compiled from engaging with customers who have taken on the task of creating customer sharding solutions.

Tags:

Azure, SQL, SQL Azure

Embed

Format

Available formats for this video:

Actual format may change based on video formats available and browser capability.

    The Discussion

    • User profile image
      gkaya

      Hi, thank you, it was great conversation, custom sharding is very important.

      When can we start using this library? is it ready now, or if not where can we find it in the future?

    • User profile image
      shaunbg

      @gkaya: Thank you for viewing the session.  The library and supporting guidance documents are undergoing peer and legal review. Our timelines are targeting July/August 2014.  Please do contact me if you would like to discuss earlier access.

    • User profile image
      pini

      Hi when we will have it?
      pini

    • User profile image
      shaunbg

      @pini: Thank you for viewing the session.  The library and supporting guidance documents are undergoing peer and legal review. Our timelines are targeting July/August 2014.  Please do contact me if you would like to discuss earlier access.  I wish I had a different answer for you but this is the required steps, and we are working at meeting this timeline.

    • User profile image
      pateketu

      I am user of Sql Federation and looking for alternatives now as federation is being dropped, and that's how I stumbled upon the session.

      So the session talks about creating/spiting/merging a shard which is sort of akin to Create/Split Federation etc. in Sql Azure. But how about connection filtering equivalent in Sql Azure federation, how that is going to work? Currently we have lots of stored procs where we don't have to explicitly do "WHERE CustomerId = @CustomerId" as Connection Filtering (i.e. USE FEDERATION.....) takes care of returning only customer specify data, is there some equivalent or guidance on how to migrate that will be available?

    • User profile image
      shaunbg

      @pateketu: IIRC Federations was achieving this functionality via the gateway.  This is not available to database calls, especially to non-Azure DB environments, and therefore really tricky (if even possible) to implement.  Our own personal research into this topic bubbled a number of challenges that are simply unwieldy to deal with in a "generic" way.  I am aware of one customer that achieves this by constraining all database interactions to EF calls.  This means that they can add the predicate as each query is created.  If you were willing to make strict rules, such as all SPs must have sharding key parameter, all T-SQL commands were first passed through the transact script dom to get the sharding key predicate, then this could be added to the sharding code.

      In summary, this functionality (connection filtering) is not available in the current work discussed in this recording, nor is it something we have found a reasonable resolution for.  It is certainly something that would be great to have, and I would be very interested in hearing ideas on how this can be achieved.

    • User profile image
      pateketu

      @Shaunbg, thanks for the response, I am in touch with someone in Azure CAT team (I am guessing it's probably same as your team) and hoping to get access to the library talked about in the recording soon'ish. I just looked up "transact script dom" and found TransactSql.ScriptDOM. Never knew of it's existence, it does sound like a promising approach to inject WHERE Predicts!

    • User profile image
      Jorge

      Any news on when the library will be avail? I need to do custom sharding but it looks likes federation will no more be supported. Thanks!

    • User profile image
      shaunbg

      I just returned from vacation and received the output from the editors who have been working on the refining final content.  The most recent dates that were shared with me are beginning September.  You can contact me offline if that will help you.

    • User profile image
      skrysmanski​tv

      Is there any news on when the library will be released?

    • User profile image
      Graham

      Any update, now end of september

    • User profile image
      JPChauhan

      Many thanks for the session.

       

      Few questions –

       

      1. When can we expect the library and supporting guidance documents?
      2. In the session, custom code (.net code) used for "Sharding", would it be possible to have consistent approach to achieve the same e.g. via SSMS?
      3. How the pricing policy for data transfer will be applied, in case we need to shard data on various data centers e.g. 2 data center in US, 1 in Europe, 2 in Asia and?

       Thanks,

    • User profile image
      shaunbg

      My apologies for delayed reply...this is what I know in terms of timing and current status.  A slew of documents are to be released shortly, I thought it was last week, so I guess it will be next week.  There is an intention to make the code available but it's dependant on a release cycle that I don't have control over.

      Sorry about the dragging of this.  There have been a number of significant org changes that impacted the outcome.  You are welcome to contact me via twitter on @shaunbg and I'll help you where I can.

    • User profile image
      shaunbg

      @JPChauhan:

      Q1: This is in the hands of the publishers, but I believe it's next week or 2 away;

      Q2: I think I need you to expand on this.  My initial reaction is that SSMS as a tool for doing is tough to visualize, but maybe I just need to understand the question better.

      Q3: Great question. In principle this is not an azure service, therefore any topics such as cost of egress is a cost you will bare.

      Hope this helps.

      Shaun

    • User profile image
      JPChauhan

      @shaunbg

      Thanks for answering above questions. Related to Q2

      Q2: I would like to know how to shard the data. Do we need to develop .NET solution as you have demonstrated in the session? Need to know step by step approach ( I believe same will be available in the supporting guidance documents)

      Thanks,

       

    • User profile image
      TorstenG

      Today, we announced the public preview for Azure SQL DB Elastic Scale which provides libraries following the sharding patterns Shaun discusses here. You can find the Channel9 video for it: http://channel9.msdn.com/Shows/Data-Exposed/Azure-SQL-Database-Elastic-Scale.

      Thanks,
      Torsten

       

    • User profile image
      shaunbg

      @JPChauhan Any application that decides to take on the "journey" of being sharded will inevitably need to be coded with awareness of sharding.  Using API libraries that abstract the complexity and take care of known patterns is an efficient way to tackle that.  Designing and taking into account the nuances of the SDLC of a sharded solution is something that is in the promised documentation.  Hopefully somebody at Microsoft can share when this content will be published.  It's difficult to fully appreciate the value of an API library without these insights.

    Comments closed

    Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.