Azure SQL Database Elastic Scale

Play Azure SQL Database Elastic Scale

Description

This week, Scott welcomes Torsten Grabs, a Principal Program Manager Lead on the SQL team focusing on elastic scale for Azure SQL Database. In this exciting and much-anticipated episode, Torsten introduces us to Azure SQL Database Elastic Scale; .NET client libraries and Azure cloud service packages providing the ability to easily develop, scale, and manage the stateful data tiers of your SQL Server applications. The goal with these libraries and services is to solve the challenges of elastic scale; growing and shrinking database capacity for applications running on Azure SQL DB.  Elastic Scale adopts sharding as the underlying concept to distribute data and processing across multiple scale units in Azure SQL DB – and gives you the libraries and tools for sharding so that you can focus on the business logic of your application. Torsten wraps up the episode by showing us an awesome demo on Elastic Scale and data dependent routing. There is plenty more to cover on this topic so stay tuned to this show for more exiting episodes on this topic!

Embed

Download

The Discussion

  • User profile image
    TorstenG

    To find out more, check out our Getting Started page for Azure SQL DB Elastic Scale.

    To download the Elastic Scale bits, visit us on Nuget and search for 'Elastic Scale'.

    Let us know your questions and feedback on the Azure SQL Database forum.

    Happy sharding!

    Torsten

     

  • User profile image
    HansOlavS

    Cool stuff, will there be SDKs and tooling for other platforms like Java?

  • User profile image
    Jeff

    This is an interesting framework on top of an old technology. Unfortunately, this does not benefit me when I cannot make cross-shard complex joins or ACID transactions that hit multiple shards. UNION ALL puts extra load on the application server. If I'm going to do fan-out queries, why not use a NoSQL product with ACID guarantees.. Has MS ever looked into making a NewSQL product? In my opinion, it's too much work to have to create shard maps and mappings as well as split/merge data. There are new age products out there that handle this kind of administrative tasks. I love Azure, but the lack of a true cloud SQL database is disappointing.

  • User profile image
    erix

    Jeff,
    DocumentDB is an example of a (document-oriented) NoSQL DB in Azure.
    There is also support for multiple OSS NoSQL databases in Aure.
    Erix.

  • User profile image
    JoseIdz

    @HansOlavS: Thank you for the comment.  We fully understand the scenario.  I highly encourage you to add your feature request to the Azure SQL DB Feedback Forum@

  • User profile image
    JoseIdz

    @Jeff: Thank you for your detailed feedback.  Teasing apart your comment, there are multiple feature requests:

    1. Cross-database query
    2. Cross-database transactions
    3. Service managed auto-scaling

    Did I understand your feedback correctly?  We'd really like to know more about your specific scenarios, how these features fit into your application and what you're going to work around these features. If you have not already done so, please add your feature request to the Azure SQL DB Feedback Forum under the topics of "cross-database reference", "Add DTC support for SQL Azure", and "Create Federations vNext for true web scalability" respectively or please create new feature requests as appropriate.@

     

  • User profile image
    Jeff

    @erix, DocumentDB can't do OrderBy or aggregate functions like Sum and Count or LIMITing results. I think transactions and joins are also limited to a single collection.

    @Joseldz, just created a new feature request: http://feedback.azure.com/forums/217321-sql-database/suggestions/6519687-cross-shard-transactions-and-joins-in-elastic-scal (This is similar to "Create Federations vNext for true web scalability")

    A simple scenario is social in nature. Suppose I shard data based off of UserID. User1 lives in Shard1, User2 in Shard2, User3 in Shard3, etc. I want to order User1's posts by the number of comments they received. With the current implementation of Elastic Scale, I'd have to fetch all User1 post GUIDs from Shard1 and then issue a query to the entire shard set passing the GUIDs as parameters. This could potentially be a lot of data movement across the wire depending on how many posts this user generated (multiply the count by 16 bytes). The results then have to be merged on the application side and then LINQ has to do OrderBy. (Of course, each post could have a NumberOfComments column to make things easier, but what if I want to make my query more complex such as filtering by commentator age, location, gender, etc. - not to mention I have to make sure the counts are updated correctly, which can be done via a cross-database transaction [e.g. insert comment in Shard3 and increment count in Shard1])

    There may be workarounds to my use cases, but other customers will have different needs.

  • User profile image
    lsjames

    This looks promising -- moreso than Federations did.  I do have a couple questions.

    Looking through the documentation, I might easily be missing it but I don't see anything relating to schema changes.  How does that work?  E.g. we use SSDT to manage our schemas and generate update scripts.  In non-production environments we even publish directly from SSDT.  Would we have to manually get a script and execute it against each shardlet?  Ideally the whole set could be managed as one entity, with schema changes applied across the board.

    Similar concerns exist for reference data, or any shared tables.  When that data changes, do we have to synchronize it across all shardlets?  A far better solution would be to allow queries to span databases.  This would both enable more complex fan-out scenarios (see Jeff's post above), but also would allow queries to join sharded tables to reference tables, with the reference tables being stored in a common, shared database.

  • User profile image
    TorstenG

    @lsjames: Schema maintenance is still an ongoing hot topic for us here on the engineering side. For one thing, the best approach for schema and reference data management also depends on the stack you are using for your application. Entity Framework for instance provides for an interesting approach by running Update-Database across your shards. We are looking at alternative, more general approaches for rolling out schema changes. If you are interested to provide more feedback directly on what we are considering on this front, please reach out directly to me at torsteng((at))microsoft.com to loop you into that discussion.

    Also, please vote in favor of the cross-database query processing improvement that Jeff filed for the previous post to capture the second part of your comment!

    Many thanks,
    Torsten

     

  • User profile image
    Eduardo

    I looking into writing a Multi-tenant SaaS app in Azure. Whats the difference between sharding and having a separated Azure Sql DB per tenant?

  • User profile image
    TorstenG

    @Eduardo: Having a separate database per tenant is a popular sharding model. See the section on common sharding patterns here: https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-introduction/. When you follow this tenancy model, you will soon find yourself in situations where the app needs to route requests to the right database, or where you need to run queries across some or all shards. This is typically the point where using a sharding infrastructure such as Elastic Scale help you save a good amount of work in your code.

    Hope this helps.

    Thanks,
    Torsten

  • User profile image
    Brent

    Here are a few more specific scenarios: http://www.tesora.com/blog/why-scaling-mysql-sharding-bad-idea

    Would love to see a SQL scale-out solution without the limitations of sharding

  • User profile image
    Daren

    I gave up waiting for Microsoft to come up with something better than sharding (or at least sharding that is easier to implement and flexible), so ended up using NuoDB, but would really like to get back to SQL Azure if only this could be improved upon somehow.

  • User profile image
    ddove66

    @HansOlavS:@Daren: Have you had the opportunity to take a quick look at Elastic Scale for Azure SQL Database. We specifically provided the core functionality through client libraries to expand on the flexibility and ease of implementation. We would love to see you come back to Azure SQL Databases, what are the improvements you are looking for?  I highly encourage you to add your feature request to the Azure SQL DB Feedback Forum

  • User profile image
    Burak

    DB5 [500-600] should be DB6 [500-600]
    DBn [n-n+100] should be DBn [(n-100)-n]

  • User profile image
    TorstenG

    @Burak: Yes, you are right. Thanks for catching this! Actually, to be precise, DBn should map to [(n-1)*100, n*100).

  • User profile image
    CapnMorgan

    Does sharding needed to be implemented for Azure Mobile Services?

    It mentions nothing of doing it for scaling the backed in this article https://azure.microsoft.com/en-us/documentation/articles/mobile-services-sql-scale-guidance/

    Thanks.

  • User profile image
    JoseIdz

    @CapnMorgan: There is nothing that we have found in Azure Mobile Services that would prevent the use of sharding for the database tier - for Entity Framework guidance with Elastic Scale, please see the following [link].  Azure Mobile Services is also an area that we have been looking into, I'd love to learn more about your application and requirements.  Please email me at joseidz at microsoft dot com.

  • User profile image
    Barony

    @Burak:As I know after completing a migration from Azure Managed cache to Azure Redis cache people should be aware that out of the box Redis is far less functional than the Azure Cache server and client.
    A lot of functionality we rely on and comes out of the box with Azure Managed Cache doesn't exist in Redis server or client so you need to roll your own. 

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.