Dave Probert: Inside Windows 7 - User Mode Scheduler (UMS)



Here, we continue our exploration of the morphology of Windows 7 on Going Deep with Windows kernel architect Dave Probert. You may remember him from an earlier four-part episode of Going Deep, where he teaches us about general-purpose operating system architectures and history: Part 1, Part 2, Part 3, Part 4

That was a great conversation from a few years ago, and it's been way too long since we returned to the Windows kernel world to converse with and learn from Dr. Probert. Not surprisingly, Dave has been busy innovating in the Windows core.

Dave and team, working very closely with the Parallel Computing Platform people, have created a very compelling new user-mode thread scheduling/management system in Windows 7. In a nutshell, the User Mode Scheduler provides a new model for high-performance applications to control the execution of threads by allowing applications to schedule, throttle and control the overhead due to blocking system calls. In other words, applications can switch user threads completely in user mode without going through the kernel-level scheduler. This frees up the kernel thread scheduler from having to block unnecessarily, which is a very good thing as we move into the age of Many-Core... Speaking of Many-Core, remember the piece we did on the Concurrency Runtime (ConcRT)? ConcRT is built on top of UMS and is the most effective way to use this new user-mode thread scheduling model in Windows 7.
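
To make the idea of switching work in user mode a little more concrete, here is a minimal sketch of a cooperative run queue in standard C++. It is not the UMS API and not how ConcRT is implemented: `CooperativeScheduler`, `Task`, `make_worker` and `run_demo` are hypothetical names invented for illustration. Real UMS works with full threads and also deals with threads that block in system calls, which this sketch does not attempt.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A task is a resumable step function: each call runs one slice of work and
// returns true if it has more work to do, false once it has finished.
using Task = std::function<bool()>;

// Hypothetical round-robin scheduler that lives entirely in user mode.
class CooperativeScheduler {
public:
    void spawn(Task task) { ready_.push_back(std::move(task)); }

    // Runs until every task has finished. Each switch between tasks is a
    // plain function return and call; the kernel scheduler is not consulted.
    // Returns the number of slices executed.
    int run() {
        int slices = 0;
        while (!ready_.empty()) {
            Task task = std::move(ready_.front());
            ready_.pop_front();
            ++slices;
            if (task()) {
                ready_.push_back(std::move(task));  // yielded: requeue at the back
            }
        }
        return slices;
    }

private:
    std::deque<Task> ready_;
};

// Builds a task that appends `name` to `log` once per slice, for `steps` slices.
Task make_worker(std::string name, int steps, std::vector<std::string>& log) {
    assert(steps > 0);
    return [name = std::move(name), steps, &log]() mutable {
        log.push_back(name);
        return --steps > 0;
    };
}

// Interleaves a two-slice worker "A" with a three-slice worker "B".
std::vector<std::string> run_demo() {
    std::vector<std::string> log;
    CooperativeScheduler scheduler;
    scheduler.spawn(make_worker("A", 2, log));
    scheduler.spawn(make_worker("B", 3, log));
    scheduler.run();
    return log;
}
```

Calling `run_demo()` yields the order A, B, A, B, B. The point of the sketch is only that the policy (here, round robin) is ordinary application code, so a runtime is free to replace it with whatever ordering suits its workload.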

Make yourself comfortable and spend some time watching and listening to Dave make all of this crystal clear.

This is another great conversation with a fantastic OS architect and Windows kernel professor. Lots to learn here. Enjoy.




The Discussion

  • User profile image
    Very interesting video.

    So if I understand this correctly, this User-mode Scheduler feature of Windows 7 is basically about doing what Fibers do currently, but making them look and work like full threads from the user-mode code's point of view.

    It's a shame they couldn't figure out a way to do user-mode pre-emptive switching. That really would have been killer. Actually, all I'd need to implement something I've been thinking about (a special .NET virtual machine with super-lightweight user-mode threads) is a way to get the OS to periodically interrupt designated unblocked OS threads and jump (not call) to a pre-set address, having saved off the registers to a pre-set location. I say "all I need...", but there are probably a hundred big problems with implementing such a scheme, which is why I'm not a kernel developer and Dave is :)
  • User profile image

    >>(a special .NET Virtual machine with super-lightweight user-mode threads)

    My god, that's a terrible idea :(... One of the pluses of the Windows kernel is its approach to multithreading. You are proposing a way to create havoc inside an already butchered kernel, riddled with shortcuts to improve gaming and graphics :(

    The original NT kernel was beautiful, a bit slow, but perfect in its original design, light years from Unix. Now it's a mix of hacks and tricks... :(
    What Windows needs is to return to its origins: a clean and inspired kernel based on very good ideas from VMS.
  • User profile image
    Err, all I'm asking for is an entirely optional, lightweight way to get your thread interrupted periodically. It doesn't affect the way the kernel operates; it's just a different way to interface with existing timer event functionality.

    As for your views on the NT kernel's evolution, the thing is that speed matters a whole heck of a lot more than 'beauty' in the real world. Get used to it.
  • User profile image
    Charles I cannot thank you enough for this video.

    Ever since you posted the video with Mark Russinovich briefly talking about UMS I've been quite interested in it but couldn't find much detail about it, so this video is a godsend.

    Thanks a lot!  Keep up the great work Charles and the gang!
  • User profile image
    You're welcome!

    Keep on watching,
  • User profile image
    Yeah awesome video Charles, Dave. Digging into the kernel is always fun
  • User profile image
    Fascinating. The UMS stuff is very interesting, and you can never have enough of Dave Probert talking about the Windows kernel design.
  • User profile image
    I've a question: how is the TPL and PLINQ stuff related to the ConcRT stuff? Is the TPL like a wrapper around ConcRT, or are they parallel (haha) code bases, or what? Does the managed TPL use UMS in Windows 7 as well?
    Also, Joe Duffy and the TPL guys are part of the PCP team, right? I'd love to see an interview about the relation between the managed and unmanaged worlds here :)
  • User profile image
    This might be a little bit off-topic,

    But a constant I've heard, including this time, is optimizations to avoid hitting the kernel because of the sheer cost of context switching and crossing the kernel/user boundary, which is understandable given the number of operations this requires. But c'mon, this isn't a new problem; guys like Robert have been having this issue for at least 30 years. Besides adding GHz to the processors, what are they doing to ease such switching?

    Also, Mr. Robert talked about the origin of the process/thread abstraction, and I've heard from Unix folks (not only Linux bashers) that creating a process on Windows has a bigger impact than on *nix, where processes are very light. Perhaps Robert can shed some light on this.

    Don't know, Charles, if this can even be included in an upcoming Going Deep video.

  • User profile image

    How does TPL & PLINQ relate to ConcRT? 

    The easiest distinction between TPL/PLINQ and the Concurrency Runtime (ConcRT) is the target customer; TPL & PLINQ are built on .NET while ConcRT, the Parallel Pattern Library (PPL) and the Asynchronous Agents library are targeted to C++ customers.  All are available in the Visual Studio 2010 CTP.

    Many of the scenarios and use cases for TPL and PPL are very similar, particularly at a high level, i.e. both support task parallelism and parallel loops and have well-defined cancellation and exception handling support.   The runtimes are different: while TPL and PLINQ are built on top of the CLR and its thread pool, PPL and Agents are built on the Concurrency Runtime, which is a component of the C Runtime that is new to Visual Studio 2010.

    As you've noted, Joe, Steve, myself and the rest of the TPL, PLINQ and ConcRT "folks" are all on the Parallel Computing team. We talk very frequently and are incredibly cognizant of the places where the technologies and APIs differ; we try to ensure that the usage and semantics are similar wherever possible, to minimize the amount of time spent by you (our developers) keeping track of idiosyncrasies that aren’t inherent to the .NET & C++ programming model differences.


  • User profile image
    Dave (there is no Mr. Robert; if anything it would be Dr. Probert :)) says:

    The ring crossing overhead is an important consideration, because in the fine-grain, over-decomposed, task-based, concurrent execution world of ConcRT – the overheads can significantly limit just how fine-grained tasks can be.

    Reducing the cost of ring crossing (the kernel/user boundary) is something I would very much like to see.  Despite many improvements, it is still a very significant overhead, and hopefully someday will be much less than it is today.  But even if the ring crossing was free, there is an inherent advantage to UMS in that the scheduling decisions are made in the run-time rather than the kernel.  This allows the run-time (e.g. ConcRT) a great amount of flexibility in terms of how it optimizes its use of the CPUs.  Sometimes it is suggested that instead of user-mode scheduling, what is needed is a pluggable kernel scheduler.  But user-mode scheduling has two great advantages over that approach.  First it has access to whatever great wealth of metadata about the computation that the compiler has made available in the program, while the kernel has a much more limited/expensive interface to user-mode state.  Second, if the user-mode scheduler screws up it only crashes/hangs the app, not the system.

    Process creation on Windows is more expensive compared to UNIX.  This is because it doesn’t have to be as cheap.  Unlike traditional UNIX, the Windows thread represents scheduling of the CPU.  We keep threads pretty cheap in Windows, but have loaded up process creation with a lot of functionality (including stuff like shims for broken apps and implementation of the subsystem model).  Process launch is generally synonymous with application launch on Windows (especially on client systems).   App launch is relatively rare (generally a user has to click something), but thread creation is very common.  So the system is optimized for threads (including facilities like the Win32 thread pool, which allows rampant thread re-use to amortize the creation overhead and reduce application memory requirements).

    This doesn’t mean that I wouldn’t like for us to make improvements in process creation on aesthetic grounds.  But it isn’t a problem in a practical sense, so it is always down the list.

    Linux is somewhat different than traditional UNIX, or even a more modern UNIX like Solaris (which has real threads).  But my knowledge of Linux is limited, so I won’t try to explain how Linux uses a form of a process as a thread as I will get pieces of it wrong.  UNIX and NT (aka modern Windows) were designed at different times for different environments with different goals, so when they run on the same environment they often take very different points of view, and so direct comparisons can be misleading.

    Thanks Dave!!
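
Dave's remark about thread re-use amortizing creation overhead can be illustrated with a small sketch. This is not the Win32 thread pool API; it is a minimal pool written in standard C++, and `ThreadPool`, `submit` and `distinct_threads_used` are hypothetical names chosen for the example. The assumption it demonstrates is simply that a fixed set of threads, created once, can run an arbitrary number of jobs.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: threads are created once and then reused.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t worker_count) {
        assert(worker_count > 0);
        for (std::size_t i = 0; i < worker_count; ++i) {
            workers_.emplace_back([this] { worker_loop(); });
        }
    }

    // Lets the workers drain the queue, then joins every worker.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to run
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: started after the rest
};

// Runs `job_count` trivial jobs on `worker_count` reused threads and returns
// how many distinct threads actually executed them.
std::size_t distinct_threads_used(std::size_t worker_count, std::size_t job_count) {
    std::mutex ids_mutex;
    std::set<std::thread::id> ids;
    {
        ThreadPool pool(worker_count);
        for (std::size_t i = 0; i < job_count; ++i) {
            pool.submit([&] {
                std::lock_guard<std::mutex> lock(ids_mutex);
                ids.insert(std::this_thread::get_id());
            });
        }
    }  // the pool's destructor waits for all submitted jobs to finish
    return ids.size();
}
```

For example, `distinct_threads_used(4, 1000)` runs a thousand jobs but can never report more than four threads, so the cost of creating those four is spread across all the jobs.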

  • User profile image

    Does ConcRT work like SLI, where you multiplex CPUs to make them look like a single faster CPU?

  • User profile image

    Have you watched this? Or this? These should really help you understand. If you'd rather just read, then check this out.
  • User profile image

    Are there any plans to expose the UMS to .NET?


  • User profile image

    What's the reasoning behind not supporting UMS in 32-bit Windows?

    (as stated at http://msdn.microsoft.com/en-us/library/dd627187(VS.85).aspx)

  • User profile image

    What is the services work mentioned when Dave is talking about procrastination?

  • User profile image

    Hi, I'm trying to make UMS work, but I have problems I can't resolve myself.

    Look here for more details:





  • User profile image

    Can you elaborate here? UMS does not really provide a public API... You use ConcRT as the abstraction layer for native task-based concurrent programming on Windows.


    I've alerted the right folks to take a look at the problem you're running into (based on your code sample in the MSDN forums, which is where I pointed them to....). Again, to be clear, ConcRT is supposed to be the proxy you play with to get UMS goodness...

