Posted By: Charles | Jun 1st @ 9:46 AM | 43,038 Views | 14 Comments
General-purpose thread pools are more complicated to get right than you may think. In CLR 4 (the next version of the VM that powers .NET), the thread pool has made some significant advances in performance and support for concurrency and parallelism.

Since V1, .NET programmers have been afforded the luxury of an automatic queue-dequeue-execute-from-the-queue thread management infrastructure inside the CLR. This is .NET's Thread Pool.

As expected, the CLR's thread pool has improved with each iteration of the CLR (hey, V1 was, well, V1...). The goal has always been efficient, reliable, performant thread management. With CLR 4, the team that designs and implements the thread pool has made some truly compelling changes, which should add up to a very solid thread pool shipping with CLR 4. One of the big changes is the addition of work-stealing algorithms to support concurrency and parallelism. Indeed, CLR 4 has native support for the Parallel Computing Platform's Parallel Extensions for .NET. What does this mean, exactly? How does it work, exactly? What else is new in CLR 4's thread pool?

Meet developer Eric Eilebrecht and program manager Erika Parsons. Eric helped implement the thread pool (he's been doing this for multiple versions, actually). Erika, as PMs do, helped design the thread pool and ensured that the design and implementation meet the needs expressed by customers who rely on the thread pool.

Tune in. Lots to learn. You'll be impressed both by the enhancements and direction set forth for the future in CLR 4's thread pool.

Eric has some great blog posts on the new additions to the thread pool in CLR 4 that will be very useful for expanding on the knowledge you gain from this conversation.

This is a really cool "PC" on his/her desk ;) Where can I buy this "PC"?

I'm pretty sure that he is running Windows 7. Sorry for this "off topic" stuff...

Some time ago you were talking to a kernel developer, and if I remember right, one of the things he was talking about was thread scheduling in user mode. The threads are managed in user mode, so there is no context switch to the kernel (better performance). So I think this is very much related to the work that the CLR team has done, right?

Thank you for clarifying this. But would it make sense for the CLR to make use of UMS on Windows 7?

staceyw
Before C# there was darkness...

"But would it make sense for the CLR to make use of UMS on Windows 7?"

Not so much. If you can stay in managed code, you don't have to switch into native code, which saves a lot of time and work. So keeping all locks, queues, and thread pools in managed code (as much as possible) is generally more efficient.

staceyw

Interesting. Thanks again.

I think it will turn out that we don't need more threading abstractions, but fewer. In fact, we need as close to zero as we can get. In most code, it turns out that blocking on IO is the main reason to spin up new threads, so you can wait on one thing and continue to do something else. If you can remove blocking, or at least make it appear to be gone, you can remove a lot of this. Take, for example, a common server app. You block waiting for a connection, then get the request, then block on some other IO during processing, then send a reply. During this cycle, you're doing a lot of thread management to keep things lively while also not creating a thread per request. However, a thread per request is exactly how you really want to program, because it's easier.

This may be a language issue, but why can't blocking, callbacks, and delegates/lambdas be further abstracted away? Take a simple example: read from a socket, write to disk, and write OK back to the client:

byte[] request = mysock.Read();
file.Write(request);
mysock.Write("OK");

So in today's world, we block on the read, block on the file write, and block on the socket write. We normally can't write it this way because we would be tying up a thread for each client request. So we fall back to async IO, which adds tons of complication and mess. Tools such as TPL and CCR try to address this and make it easier, but the model is still overly complex. We need a new model. Why can't the Read above, fire off the request and transparently (to me) return the now blocking thread to do other work for others and come back with result when the request is done (kinda coroutineish)? Maybe the returning thread is different; we should not need to know or care. Same with the other methods above. So everything is actually async, but looks and feels sync. No callback blocks or other such goo. The runtime handles this in the background. It does require that all thread-dependency code be removed or abstracted away for us in the language and BCL. The programmer should not have to think about that anyway. Couldn't this work?
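For comparison, later versions of C# (5.0 and up) added async/await, which comes fairly close to this "looks sync, is async" model. The following is only a minimal sketch under assumptions: `RequestHandler`, `HandleRequestAsync`, the `path` parameter, and the 4096-byte buffer are hypothetical names and values chosen for illustration, and a single read is assumed to return the whole request.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class RequestHandler
{
    // Hypothetical handler: read one request from the client stream,
    // write it to a file, then reply "OK". At each await the thread is
    // released while the I/O is pending; the continuation may resume on
    // a different thread.
    public static async Task HandleRequestAsync(Stream client, string path)
    {
        var buffer = new byte[4096]; // assumed maximum request size
        int n = await client.ReadAsync(buffer, 0, buffer.Length);

        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write,
                                         FileShare.None, 4096, useAsync: true))
        {
            await file.WriteAsync(buffer, 0, n);
        }

        byte[] ok = Encoding.ASCII.GetBytes("OK");
        await client.WriteAsync(ok, 0, ok.Length);
    }
}
```

The method body has the same three steps as the snippet above, in the same order, with no explicit callbacks; the compiler generates the continuation plumbing.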

"Why can't the Read above, fire off the request and transparently (to me) return the now blocking thread to do other work for others and come back with result when the request is done (kinda coroutineish)"

Isn't that what it actually does? When a thread blocks on I/O, the OS scheduler takes it off the CPU and another ready thread immediately gets to run. It's exactly the behavior you described, except the thread object isn't reused for other work (which is hard because a thread has a lot of context... the stack to start with), but another thread can run. So no CPU time is "lost".

The point is that this situation doesn't create parallelism. If your application wasn't written with some multithreading in mind, it may well have no work to do during the blocking call. This is something a compiler can't invent; you have to express your parallelism or "tasks" to some degree. For example, most apps today are written in a purely single-threaded way, so even if the scheduler wanted to reuse your thread during the blocking call, what task would it use it for?

Another point is that the thread is actually *blocked*. Even if you have more work going on other threads, this may be bad... E.g. if this is your UI thread.

Overall I think that the Tasks concept is a good move. It's a bit like LINQ: don't say how to do something, describe the result you want and let the "black box" operate. The applications I am working on could easily benefit from throwing tasks at the runtime on multicore machines.
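To make "throwing tasks at the runtime" a bit more concrete, here is a minimal sketch using the .NET 4 Task API. The workload (a chunked sum of squares) and the names `TaskSketch` and `SumOfSquares` are hypothetical, picked only for illustration; the code describes the pieces of work and leaves the scheduling to the thread pool.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Hypothetical workload: sum of squares of 1..n, split into 'chunks' tasks.
    public static long SumOfSquares(int n, int chunks)
    {
        int size = (n + chunks - 1) / chunks; // ceiling division

        // Describe the work as tasks; the runtime decides where and when they run.
        Task<long>[] tasks = Enumerable.Range(0, chunks)
            .Select(c => Task.Factory.StartNew(() =>
            {
                long sum = 0;
                int start = c * size + 1;
                int end = Math.Min(n, (c + 1) * size);
                for (int i = start; i <= end; i++) sum += (long)i * i;
                return sum;
            }))
            .ToArray();

        Task.WaitAll(tasks);              // block until every chunk is done
        return tasks.Sum(t => t.Result);  // combine the partial sums
    }
}
```

Each task only touches its own local `sum`, so there is no shared state to protect here; the partial results are combined after all tasks finish.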

Of course, the hard problem which remains to be solved is how to handle concurrency when there is shared state (and there always is some). Correctness and performance are hard and I would love to see some simplifications in that area (the concurrent collections and other primitives in .NET 4 being a good start, but I have the feeling this is not enough to ensure easy and safe multithreaded development).
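As a small illustration of the concurrent collections mentioned above, here is a sketch of several threads updating shared state through `ConcurrentDictionary` from .NET 4. The word-counting scenario and the names `SharedStateSketch` and `CountWords` are hypothetical.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    // Hypothetical example: count word occurrences from multiple threads.
    public static ConcurrentDictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new ConcurrentDictionary<string, int>();

        Parallel.ForEach(words, w =>
        {
            // Insert 1 for a new key, or increment the existing count,
            // without taking an explicit lock in user code.
            counts.AddOrUpdate(w, 1, (key, old) => old + 1);
        });

        return counts;
    }
}
```

This covers a single update to a single key. The update delegate can be invoked more than once under contention, so it should be free of side effects, and anything that spans several keys or several collections still needs its own coordination, which is the "not enough" part.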
