Interview with Munil Shah (Safe Deployment Questions Answered)



In this interview, Director of Engineering Munil Shah returns to answer the questions asked about Safe Deployment of Visual Studio Team Services.

[00:32] Deployment Rings
[05:39] RM Deployment Definition
[08:30] Ring 4a
[11:35] Branching / CI
[14:14] Hot Fixes
[16:54] Cherrypicking changes
[17:46] We never delete branches
[20:10] Deployment frequency



VSTS, devops



The Discussion

  • User profile image

    TFVC isn't a "weird" configuration. TFVC is still a FULLY supported feature that our engineering team uses. :-/

  • User profile image
    Kevin Ham

    Asking again from the first video:
    How do you monitor the operational impact of deployment rollouts with dashboards like Grafana (to see things like CPU utilization, RAM utilization, and disk throughput)?

  • User profile image

    @Stalen: Hi Stalen, I could not agree more. "Weird" was a bad choice of words. The point we were trying to make was that, although we use VSTS for 48 hours before we release to the next ring, we do not test every scenario.

  • User profile image

    @Kevin: Yes, we have a very rich set of telemetry and dashboards to monitor tons of metrics, including the ones you mentioned. It's an internal system that our friends in Azure have built, and it includes some parts that are externally available, like Azure Log Analytics and Application Insights. Maybe we should do another video of our live site dashboards! :)

  • User profile image
    Ted W

    With the deployment rings, will VSTS be GDPR and C5 (Germany) compliant, similar to FedRAMP regulations?

  • User profile image

    Would love to get a deep dive on the updatebinaries task...that's where there must be some serious magic sauce.

  • User profile image

    600 pull requests per day on just master. I would love a video that digs deeper into how that is possible: 600 x 6 minutes = 3,600 minutes, 3,600 / 60 = 60 hours, and 60 / 24 = 2.5 days. So how is it possible to stay current EVERY day when, even if you run PRs 2x in parallel, that is still 30 hours of work, leaving PRs an additional 6 hours (0.25 days) behind "current"?

  • User profile image

    Wait, none of this makes sense to me. A release takes 10 days to be deployed to ring 4? What the heck is the definition of done for these teams if it doesn't include 'deployed to production'? Also, it sounds like the deployments aren't tied to the sprint schedule the engineering teams are executing. I would love an overview of how teams know they are 'done'.

  • User profile image

    Donovan, great conversation, but it was way too fast and hard to follow for an outsider.

    Can you both expand on the branching model and maybe show the process? It was mentioned that every sprint gets its own release branch, but that branch is cut off the master branch, where every team is committing daily. Or is it that every team has its own sprint branch that is merged into master at the end of the sprint, and then those branches are merged into the sprint release branch?

  • User profile image

    @Migas: Before replying, I took a peek, and we were running 10 builds in parallel at this very moment. I posted a pic here.

  • User profile image

    @Steve: Every ring is production. We just deploy to different rings to control exposure.

    You can learn more in RunAsRadio: DevOps Practices inside Microsoft with Donovan Brown and Martin Woodward, where Martin shares our definition of done (DoD) as:

    Delivered into production, collecting telemetry from customers that proves or disproves the hypothesis of why we added the feature.

    You can also watch Munil's first show about Safe Deployment.

  • User profile image
    Brian Lee

    @Donovan, surely for security the environments are treated differently, with different credentials, keys, secrets, and configurations. Could you discuss the security aspect of what you do to ensure higher levels of security between rings as you promote a release?

  • User profile image

    I love that this has turned into an impromptu series. Synchronised AT/DT deployment is something we are struggling with, and this show is providing a lot of insight. Looking forward to the DT specific show.

  • User profile image
    Doug Liberman

    @Donovan, if every ring is production, can you expand in the next interview in the series on how the devs actually do development? How do they do local dev (on their box)? How do they do dev before it goes to ring 0 (say they have to share a dependency)? Surely there are pre-production environments in place (like QA/Staging), or is everything just managed with thousands of feature flags and ifdefs? Great series so far.

  • User profile image

    I would like to know why bug fixes / hotfixes are not committed to the master branch first and then cherry-picked to the release branch. Munil mentioned they do it the other way around.



  • User profile image

    @Niner447018, there is usually urgency to get the hotfixes out through the release branch. Hence we do it that way.
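
The pull-request arithmetic raised earlier in the discussion (600 PRs per day at about 6 minutes each, and the reply mentioning 10 builds running in parallel) can be checked with a small Python sketch. It assumes every PR build takes exactly 6 minutes and that builds are spread evenly with no queuing. The function names are hypothetical and are not part of any VSTS tooling.

```python
import math


def build_hours(prs_per_day: int, minutes_per_pr: float) -> float:
    """Total build time in hours if every PR were validated one after another."""
    return prs_per_day * minutes_per_pr / 60


def wall_clock_hours(prs_per_day: int, minutes_per_pr: float, parallel_builds: int) -> float:
    """Elapsed hours when the same work is split evenly across parallel builds.

    Assumption: perfect load balancing and no queue wait, which is optimistic.
    """
    return build_hours(prs_per_day, minutes_per_pr) / parallel_builds


def min_parallel_builds(prs_per_day: int, minutes_per_pr: float, window_hours: float = 24.0) -> int:
    """Smallest number of parallel builds that fits a day's PRs into the window."""
    return math.ceil(build_hours(prs_per_day, minutes_per_pr) / window_hours)


if __name__ == "__main__":
    # Figures quoted in the discussion: 600 PRs per day, 6 minutes each.
    print(build_hours(600, 6))            # serial build time in hours
    print(wall_clock_hours(600, 6, 2))    # two builds in parallel
    print(wall_clock_hours(600, 6, 10))   # ten builds in parallel, as in the reply
    print(min_parallel_builds(600, 6))    # builds needed to finish within 24 hours
```

Under these simplifying assumptions, three parallel builds are already enough to fit 600 PRs into one day, and ten parallel builds bring the total down to about 6 hours of wall-clock time.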
