Erika Parsons and Eric Eilebrecht : CLR 4 - Inside the Thread Pool

Download

Right click “Save as…”

  • High Quality WMV (PC)
  • MP3 (Audio only)
  • MP4 (iPhone, Android)
  • Mid Quality WMV (Lo-band, Mobile)
  • WMV (WMV Video)

General purpose thread pools are more complicated to get right than you may think. In CLR 4 (the next version of the VM that powers .NET), the thread pool has made some significant advances in performance and support for concurrency and parallelism.

Since V1, .NET programmers have been afforded the luxury of an automatic queue-dequeue-execute thread management infrastructure inside the CLR: .NET's Thread Pool.

As expected, the CLR's thread pool has improved with each iteration of the CLR (hey, V1 was, well, V1...). The goal has always been efficient, reliable, performant thread management. With CLR 4, the team that designs and implements the thread pool has made some truly compelling changes, which should add up to a very solid thread pool shipping with CLR 4. One of the big changes is the addition of work-stealing algorithms to support concurrency and parallelism. Indeed, CLR 4 has native support for the Parallel Computing Platform's Parallel Extensions for .NET. What does this mean, exactly? How does it work, exactly? What else is new in CLR 4's thread pool?
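
For readers who want to see where this lands in code, here is a minimal sketch (not from the episode) assuming the .NET 4 Task Parallel Library surface (Task.Factory.StartNew, Parallel.For, TaskCreationOptions.AttachedToParent); the comments describe the local-queue and work-stealing behavior the conversation covers.

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class ThreadPoolSketch
    {
        static void Main()
        {
            // The outer task is queued to the CLR 4 thread pool's global queue.
            // Child tasks created inside it go to the creating worker's local
            // queue; idle workers steal from the other end of those local queues.
            Task outer = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 8; i++)
                {
                    int n = i;
                    Task.Factory.StartNew(
                        () => Console.WriteLine("work item {0} on pool thread {1}",
                            n, Thread.CurrentThread.ManagedThreadId),
                        TaskCreationOptions.AttachedToParent);
                }
            });

            // Waiting on the parent also waits for its attached children.
            outer.Wait();

            // Data parallelism from Parallel Extensions rides on the same pool.
            Parallel.For(0, 8, i => Console.WriteLine("parallel body {0}", i));
        }
    }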

Meet developer Eric Eilebrecht and program manager Erika Parsons. Eric helped implement the thread pool (he's been doing this for multiple versions, actually). Erika, as PMs do, helped design the thread pool and ensured that the design and implementation meet the needs expressed by customers who rely on the thread pool.

Tune in. Lots to learn. You'll be impressed both by the enhancements and direction set forth for the future in CLR 4's thread pool.

Eric has some great blog posts on the new additions to the thread pool in CLR 4 that will be very useful for expanding on the knowledge you gain from this conversation.

Follow the Discussion

  • This is a really cool "PC" on his/her desk Wink Where can I buy this "PC"?

  • Charles (Welcome Change)

    Well, last I heard, Silverlight runs on OSX (and the CLR in Silverlight has a thread pool). It's funny though. I was in the room and didn't even notice it. Smiley

    C

  • I'm pretty sure that he is running Windows 7. Sorry for this "off topic" stuff...

    Some time ago you were talking to a kernel developer, and if I remember right, one of the things he was talking about was thread scheduling in user mode. The threads are managed in user mode, so there is no context switch to the kernel (better performance). So I think this is very much related to the work that the CLR team has done, right?

  • Charles (Welcome Change)

    Well, no. The User Mode Scheduler in Windows 7 (this is what you're talking about) was written by the Windows and PCP (Parallel Computing Platform) teams. Now, ConcRT (the Concurrency Runtime, which provides native APIs and also comes from the PCP team) is written on top of UMS. The CLR's thread pool implementation is not...

    C

  • Thank you for clarifying this. But would it make sense for the CLR to make use of UMS on Windows 7?

  • William Stacey (staceyw) Before C# there was darkness...

    "But would it make sense that the CLR take use of the UMS on windows 7?"

    Not so much. Because if you can stay in managed code, you don't have to context switch into native, which saves a lot of time and work. So keeping all locks, queues, and thread pools (as much as possible) in managed code is generally more efficient.

  • William Stacey (staceyw) Before C# there was darkness...

    Interesting. Thanks again.

    I think it will turn out that we don't need more threading abstractions, but fewer. In fact we need as close to zero as we can get. In most code, it turns out that blocking on IO is the major reason to spin up new threads, so you can wait on one thing and continue to do something else. If you can remove blocking, or at least make it appear to be gone, you can remove a lot of this. Take for example a common server app. You block waiting for a connection, then get the request, then block on some other IO during processing, then send a reply. During this cycle, you're doing a lot of thread management to keep things lively while also not creating a thread per request. However, a thread per request is exactly how you really want to program, because it's easier.

    This may be a language issue, but why can't blocking and callbacks and delegates/lambdas be further abstracted away? Take a simple read from a socket, write to disk, and write OK back to the client:

    byte[] request = mysock.Read();
    file.Write(request);
    mysock.Write("OK");

    So in today's world, we block on read, block on file write, and block on socket write. So we normally can't write it this way, because we are tying up a thread for each client request. So we fall back to async IO, which adds tons of complication and mess (a rough sketch of that callback style follows below). Now tools such as TPL and CCR try to address this and make it easier, but the model is still overly complex. We need a new model. Why can't the Read above fire off the request and transparently (to me) return the now-blocking thread to do other work for others and come back with the result when the request is done (kinda coroutine-ish)? Maybe the returning thread is different; we should not need to know or care. Same with the other methods above. So everything is actually async, but looks and feels sync. No callback blocks or other such goo. The runtime handles this in the background. It does require that all thread-dependency code be removed or abstracted for us in the language and BCL. The programmer should not have to think about that anyway. Couldn't this work?
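
    For contrast, a rough sketch of what that asynchronous fallback typically looks like today with the Begin/End (APM) pattern; mysock and file are the same hypothetical placeholders as in the snippet above, assumed here to be a connected Socket and an open FileStream:

    using System;
    using System.IO;
    using System.Net.Sockets;
    using System.Text;

    class ApmSketch
    {
        private Socket mysock;      // assumed already connected
        private FileStream file;    // assumed already open for writing
        private byte[] buffer = new byte[4096];

        public void Start()
        {
            // Step 1: read the request without blocking a thread.
            mysock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceived, null);
        }

        private void OnReceived(IAsyncResult ar)
        {
            int bytesRead = mysock.EndReceive(ar);
            // Step 2: write the request to disk, again via a callback.
            file.BeginWrite(buffer, 0, bytesRead, OnFileWritten, null);
        }

        private void OnFileWritten(IAsyncResult ar)
        {
            file.EndWrite(ar);
            // Step 3: send the reply.
            byte[] ok = Encoding.ASCII.GetBytes("OK");
            mysock.BeginSend(ok, 0, ok.Length, SocketFlags.None, OnReplySent, null);
        }

        private void OnReplySent(IAsyncResult ar)
        {
            mysock.EndSend(ar);
            // Three lines of logic became four methods, and real code would
            // also need error handling in every callback.
        }
    }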

  • Why can't the Read above fire off the request and transparently (to me) return the now-blocking thread to do other work for others and come back with the result when the request is done (kinda coroutine-ish)?

    Isn't that what it actually does? When a thread blocks on I/O, its CPU time slice is preempted by the OS scheduler and another thread immediately gets to run. It's exactly the behavior you described, except the thread object isn't reused for other work (which is hard, because a thread has a lot of context... the stack, to start with), but another thread can run. So no CPU time is "lost".

    The point is that this situation doesn't create parallelism. If your application wasn't written with some multithreading in mind, it may well have no work to do during the blocking call. This is something a compiler can't invent; you have to express your parallelism or "tasks" to some degree. For example, most apps today are written in a purely single-threaded way, so even if the scheduler wanted to re-use your thread during the blocking call, what task would it run?

    Another point is that the thread is actually *blocked*. Even if you have more work going on other threads, this may be bad... E.g. if this is your UI thread.

    Overall I think that the Tasks concept is a good move. It's a bit like LINQ: don't say how to do something, describe the result you want and let the "black box" operate. The applications I am working on could easily benefit from throwing tasks at the runtime on multicore machines.

    Of course, the hard problem which remains to be solved is how to handle concurrency when there is shared state (and there always is some). Correctness and performance are hard, and I would love to see some simplifications in that area (the concurrent collections and other primitives in .NET 4 are a good start, but I have the feeling this is not enough to ensure easy and safe multithreaded development; a small sketch of those primitives follows below).
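
    As a small illustration of those .NET 4 primitives (a sketch assuming System.Collections.Concurrent and Parallel.For are available), shared results can be collected without an explicit lock:

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    class SharedStateSketch
    {
        static void Main()
        {
            // The concurrent collection handles its own synchronization,
            // so the parallel body needs no explicit lock.
            var results = new ConcurrentDictionary<int, long>();

            Parallel.For(0, 100, i =>
            {
                long square = (long)i * i;   // stand-in for real work
                results.TryAdd(i, square);   // safe from many threads at once
            });

            Console.WriteLine("computed {0} results", results.Count);
        }
    }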

  • Charles (Welcome Change)

    jods, nice thinking. Have you checked out Axum? It provides a way to express parallelism and isolation using rather simple semantics. You should have a look.

    C

  • William Stacey (staceyw) Before C# there was darkness...

    "When a thread blocks on an I/O, its CPU time slice is preempted by the OS scheduler and another thread immediately gets to run."

    That is true. But my suggestion is for another reason. If you're doing a server, for example, you can't write code like that, because it's not just that you're blocking threads - it's that you could be using and blocking 1000s of threads - which is one of the problems Eric described. The workaround for this issue has typically been async methods, and we know this is a hard way to program. My thought experiment is to gain natural async behavior with a sync look, allowing the blocked thread to return to the pool, with the callback handled internally to continue executing the next step on the same thread or another from the pool. With the CCR, for example, you can accomplish this with the Iterator arbiter, but that also requires some code goo that is not that natural (though probably better than the alternatives to date). Basically all I am suggesting is the same idea, but with more language sugar that makes it look more sync (without any delegates) and with the same goal of not blocking any threads.

  • @Charles: I've read a bit about Axum, but haven't checked it in depth yet. Indeed, I think we'll probably have to introduce new concepts into our languages to be able to safely and easily code multithreaded applications.

    @staceyw: indeed, threads are an important resource; if you block 1000s of them, you need a better way to handle concurrency. You are totally correct: currently the solution is asynchronous programming, and current languages make it unnecessarily hard.

    Did you check the F# asynchronous features? It's very similar to the idea you described. Basically you write your code inside an async {} block, and you can write everything just like the usual synchronous code (with the exception of adding an exclamation mark (!) before asynchronous calls). The language takes care of all the details. It's a very nice feature.

  • I think I can help people with my private solution; think of it, my framework uses only a few BCL classes and lets us deliver the full power of multi-core, multi-threaded apps directly from .NET 3.5.

    I used only these BCL classes: AutoResetEvent, Monitor, ThreadPool and WaitHandle. The framework is designed for both kinds of parallelism: vertical parallelism (the number of concurrent items per processor) and horizontal parallelism (the number of processors consumed).

    It is so simple that the programmer only has to care about how to implement parallelism for a given algorithm using parallel work items. A programmer needs to implement a single work item for the algorithm itself and the generic type of data used in that work item.

    By default, the queue engine tries to scale the algorithm horizontally, processing one work item per virtual core (Environment.ProcessorCount), each on a single thread. Then, if that succeeds (that is, the overall number of estimated parallel work item tasks is greater than or equal to the required number of processors), it scales vertically: the core with the fewest tasks gets the highest priority for the next work item, so all cores are used roughly equally, depending on the algorithm. But nothing stops you from customizing the algorithm to run 85% of the ("cheap") parallel work items on a single core while the other 15% ("hard") of the work items run on all the other cores (for example, if you have an 8-core i7), and, believe me, it is very easy.

    It is only a couple of kilobytes and several lines of code, extremely easy to read and understand. I used Pex and Code Contracts. So it really makes me happy! And it just works!

    If somebody wants to get it, please leave requests here. Probably I will add it to my library http://plugins.codeplex.com

    That's all, folks!

  • I spent some time building a good, working solution for .NET 2.0 / 3.5, so you do not need to wait for and install .NET 4.0.

    Abstract:

        using System;
        using System.Collections.Generic;
        using System.Threading;

        // Base class: derive from it, implement ProcessItem(), then call Run().
        // Run() queues TaskCount copies of ProcessItem() to the ThreadPool and
        // blocks until every one of them has signalled completion.
        public abstract class PluginBaseWorkItem<T>
        {
            private readonly object _sync = new object();
            private T _item;

            public T Item { get { return _item; } }

            // Lock object for derived classes that read and update the shared result.
            protected object SyncRoot { get { return _sync; } }

            // By default, queue one work item per virtual core.
            protected virtual int TaskCount { get { return Environment.ProcessorCount; } }

            public void Run()
            {
                // One AutoResetEvent per queued work item. Note that
                // WaitHandle.WaitAll accepts at most 64 handles, so TaskCount
                // must stay below that limit.
                List<WaitHandle> waitHandles = new List<WaitHandle>();
                for (int i = 0; i < TaskCount; i++)
                {
                    waitHandles.Add(new AutoResetEvent(false));
                }

                for (int i = 0; i < TaskCount; i++)
                {
                    ThreadPool.QueueUserWorkItem(new WaitCallback(ProcessHandle), waitHandles[i]);
                }

                // Block until every work item has called Set() on its handle.
                WaitHandle.WaitAll(waitHandles.ToArray());
            }

            protected abstract void ProcessItem();

            protected void SetValue(T item)
            {
                _item = item;
            }

            protected T GetValue()
            {
                return _item;
            }

            private void ProcessHandle(object sync)
            {
                AutoResetEvent waitHandle = (AutoResetEvent)sync;
                ProcessItem();
                waitHandle.Set();
            }
        }

        // Concrete work item: each of the TaskCount runs generates a puzzle and
        // keeps the highest-rated one as the shared result.
        private class WorkItem : PluginBaseWorkItem<Puzzle>
        {
            protected override void ProcessItem()
            {
                DlxEngine engine = new DlxEngine();
                Puzzle value = engine.Generate(20);
                lock (SyncRoot)   // guard the shared best-result comparison
                {
                    Puzzle item = GetValue();
                    if (item == null || value.Rating > item.Rating)
                    {
                        SetValue(value);
                    }
                }
            }
        }

        public static IPuzzle Create()
        {
            WorkItem workItem = new WorkItem();
            workItem.Run();
            IPuzzle puzzle = workItem.Item;
            return Create(puzzle.Text, puzzle.Seed);
        }

  • sw1

    Are there any plans to have the ThreadPool take advantage of Windows 7 UMS?

    This would allow you to use a synchronous programming model for I/O and have the kernel notify the ThreadPool when blocking was occurring, allowing another work item to be scheduled.

    It would be immensely useful for my application.

  • Adolph.McLoud

    This really solved my problem, thanks!

Comments Closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.