@djmarcus:You are correct that this is a problem, but a lot of work has been going on in this space. In my Power Threading library, my AsyncEnumerator class (along with C#'s iterator feature) allows you to have a synchronous programming model for doing asynchronous operations. This has been available for about 5 years now. It has received such support and success that the .NET team at MS is adding very similar features to the next version of .NET. The new features will also support a deep call stack: you can initiate an async I/O at the bottom of the stack, return all the way up, and have execution continue at the bottom of the logical stack when the I/O completes.
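To make the "deep call stack" point concrete, here is a minimal sketch. It assumes the async/await syntax as it shipped in C# 5 and a compiler recent enough for top-level statements (C# 9 or later). The method names (ReadTextAsync, CountWordsAsync, DescribeFileAsync) are hypothetical and exist only for illustration. The I/O is started in the bottom method, yet every layer reads like ordinary synchronous code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Bottom of the logical stack: the only place real asynchronous I/O is started.
static async Task<string> ReadTextAsync(string path)
{
    using var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
        FileShare.Read, bufferSize: 4096, useAsync: true);
    using var reader = new StreamReader(stream, Encoding.UTF8);
    // If the read has not completed yet, the thread returns to its caller here;
    // the rest of this method runs when the I/O completes.
    return await reader.ReadToEndAsync();
}

// Middle layer: looks synchronous, but is suspended while the read is pending.
static async Task<int> CountWordsAsync(string path)
{
    string text = await ReadTextAsync(path);
    char[] separators = { ' ', '\t', '\r', '\n' };
    return text.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;
}

// Top layer: execution "returns all the way up" and later resumes below.
static async Task<string> DescribeFileAsync(string path)
{
    int words = await CountWordsAsync(path);
    return $"{Path.GetFileName(path)}: {words} words";
}

// Usage: write a small temporary file, then walk the three layers.
string demoPath = Path.GetTempFileName();
File.WriteAllText(demoPath, "async code that reads like sync code");
Console.WriteLine(await DescribeFileAsync(demoPath));
File.Delete(demoPath);
```

Each await is a point where the method can hand its thread back, which is the same role a yield return plays in an iterator driven by AsyncEnumerator.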
@gbrayut:Fibers are a pretty old technology. They were originally added to NT4 to ease porting of apps from other OSes to Windows NT. While SQL Server has a fiber mode, the mode is usually used for benchmarking and not for real day-to-day running of the DB. There was an attempt in .NET 2 to add fibers to the CLR, but the attempt failed because of the difficulty of separating state so that different fibers would give the impression of being different threads. Since fibers are not threads, they do not work like threads and using them can become quite difficult. This is why the CLR dropped the feature. Maybe on a platform where fibers are first-class citizens from the very beginning, they could work well.
You should be able to use processor groups today in .NET if you P/Invoke out to the native Windows APIs. The CLR team could wrap these for you but it is trivial for you to do it yourself. There are few machines with this many cores and Azure machines have no more than 8 cores, so this would be very low priority for the CLR team.
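As a rough illustration of how little wrapping is involved, the sketch below declares two kernel32 exports, GetActiveProcessorGroupCount and GetActiveProcessorCount, and uses them to list how many logical processors each group has. The helper name GetProcessorsPerGroup is hypothetical, and the code assumes a C# 9 or later compiler (top-level statements with extern local functions). It only queries the groups; to actually run a thread in a particular group you would wrap SetThreadGroupAffinity in the same way.

```csharp
using System;
using System.Runtime.InteropServices;

// Native declarations (Windows 7 / Windows Server 2008 R2 and later).
[DllImport("kernel32.dll", SetLastError = true)]
static extern ushort GetActiveProcessorGroupCount();

[DllImport("kernel32.dll", SetLastError = true)]
static extern uint GetActiveProcessorCount(ushort groupNumber);

// Hypothetical helper: logical processor count for each processor group.
static uint[] GetProcessorsPerGroup()
{
    // The APIs above exist only on Windows; report "no groups" elsewhere.
    if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        return Array.Empty<uint>();

    ushort groupCount = GetActiveProcessorGroupCount();
    var perGroup = new uint[groupCount];
    for (ushort g = 0; g < groupCount; g++)
        perGroup[g] = GetActiveProcessorCount(g);
    return perGroup;
}

// Usage: print one line per processor group.
uint[] groups = GetProcessorsPerGroup();
for (int i = 0; i < groups.Length; i++)
    Console.WriteLine($"Group {i}: {groups[i]} logical processors");
```

On a machine with 64 or fewer logical processors you should normally see a single group.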
I plan to revise my CLR via C# book for the next version of .NET but I haven't started working on it yet. I'll probably start when it enters beta.
@serializable:I am not a WCF expert and I don't have time to examine your code. So I'm not sure if the WCF infrastructure is implemented poorly or if your code is not using the infrastructure correctly. I do know that the WCF team cared a lot about async operations and so the most likely problem is that your code is not using it correctly.
Rick may have some valid ideas here but he is also assuming that every web request results in a DB request which, in many web servers, is not true. I have written web sites where many requests are handled from memory or from cache or possibly from a store other than a DB. In this case, there are NOT lots of threads blocking on the DB and the threads can do other work. In addition, his only argument for not making things async is really just to simplify the programming model. He's not suggesting that you make things sync because your service would be faster or use fewer resources (being slower or using more resources both hurt scalability). His suggestion is purely about simplifying the code. But with technologies like my Power Threading library's AsyncEnumerator, or with the new async/await feature being put into the next version of the .NET Fx, the programming model is much simpler than it has been in the past, and so a simplified programming model is a much less compelling argument.
All of Microsoft's hosting infrastructures -- ASP.NET, WCF, etc -- support asynchronous programming. The server gets a client request, the server makes a request to another server (like SQL) asynchronously, and the thread returns back to the thread pool. The hosting infrastructure knows NOT to send the response back to the client. When the server (SQL) responds, its response is put in the thread pool, another thread pool thread wakes up, and your code processes the response and returns. When the thread returns to the pool this time, the hosting infrastructure DOES send the response to the client. Look up how to implement your service asynchronously in whatever infrastructure is hosting you. For example, do a web search for "implementing ASP.NET Web form service asynchronously".
With each operation, you should always consider whether it is an I/O operation or a compute-bound operation. I/O operations do not use the CPU on the motherboard at all and so you scale them out using asynchronous I/O operations (Begin/End methods); do NOT use threads to perform I/O operations in parallel as this just wastes threads. Compute operations DO use the CPU on the motherboard and so you improve performance by having multiple threads (up to 1 thread per CPU in the machine) execute different pieces of the work concurrently. For compute operations, you can use QueueUserWorkItem (but this is fire & forget), or a delegate's BeginInvoke/EndInvoke methods, or create & start .NET 4's Task object. Delegate's BeginInvoke/EndInvoke allow you to have the same programming model for compute operations as you get for I/O operations.
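Here is a small sketch of that split, assuming a C# 9 or later compiler for the top-level statements. The names ReadFirstBytesAsync, SumOfSquares and ParallelSumOfSquares are made up for the example. The I/O half uses FileStream's BeginRead/EndRead pair, with a TaskCompletionSource added only so the demo can observe completion. The compute half uses .NET 4 Task objects, one per CPU. A delegate's BeginInvoke is not shown because it throws PlatformNotSupportedException on .NET Core and later runtimes.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// I/O-bound: start the read with BeginRead and finish it in the callback
// with EndRead. No thread waits while the read is in flight.
static Task<int> ReadFirstBytesAsync(string path, byte[] buffer)
{
    var done = new TaskCompletionSource<int>();
    var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
        FileShare.Read, 4096, FileOptions.Asynchronous);
    stream.BeginRead(buffer, 0, buffer.Length, ar =>
    {
        try { done.SetResult(stream.EndRead(ar)); }
        catch (Exception e) { done.SetException(e); }
        finally { stream.Dispose(); }
    }, null);
    return done.Task;
}

// Compute-bound: one piece of the work, run on whichever thread calls it.
static long SumOfSquares(int from, int to)
{
    long sum = 0;
    for (int i = from; i < to; i++) sum += (long)i * i;
    return sum;
}

// Compute-bound: split the range 0..n-1 across up to one task per CPU.
static long ParallelSumOfSquares(int n)
{
    int workers = Environment.ProcessorCount;
    int chunk = (n + workers - 1) / workers;
    Task<long>[] tasks = Enumerable.Range(0, workers)
        .Select(w => Task.Factory.StartNew(() =>
            SumOfSquares(Math.Min(n, w * chunk), Math.Min(n, (w + 1) * chunk))))
        .ToArray();
    Task.WaitAll(tasks);
    return tasks.Sum(t => t.Result);
}

// Usage: one asynchronous read, then one parallel computation.
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 1, 2, 3, 4, 5 });
var buffer = new byte[16];
int bytesRead = await ReadFirstBytesAsync(path, buffer);
Console.WriteLine($"Read {bytesRead} bytes; sum of squares = {ParallelSumOfSquares(1000)}");
File.Delete(path);
```

The thing to notice is what is absent: there is no thread created just to sit and wait on the file read.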
I cover a lot of this in my CLR via C#, 3rd Edition book. My Windows via C/C++ book also talks a lot about threads and their overhead. The Windows Internals book by Mark Russinovich & David Solomon also has a lot of info in it. There is no such thing as a "managed thread". In managed code, you can ask Windows to create a thread. Again, my CLR via C#, 3rd Edition book goes into all of this in great detail.
Memory-mapped files are all about making file I/O look like RAM, and so you can't work asynchronously with memory-mapped files. You are accepting reduced scalability in exchange for the simpler programming model (when accessing MMF data actually results in I/O, as opposed to accessing cached data in RAM).
All computer resources are finite. You are always trading off one thing for another thing. When calling a Begin* method, you give up 1MB+ of memory (the thread's stack) and replace it with an I/O request packet (maybe 100 bytes). This is a huge difference and allows your application to scale much, much better. If your service is busy enough, then 1 machine can't handle the load, and you need multiple machines to handle the load. Trading threads for I/O request packets allows you to defer adding machines or reduce the number of machines you need. Adding machines adds significant additional costs: hardware, electricity, software licenses, IT costs (backup, patching, maintenance), etc.
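To put numbers on that trade, here is a back-of-the-envelope sketch using only the two figures quoted above: 1 MB of stack per blocked thread and roughly 100 bytes per I/O request packet. The count of 10,000 pending requests is an assumption picked for illustration, and the helper names are hypothetical. It assumes a C# 9 or later compiler for the top-level statements.

```csharp
using System;

const long StackBytesPerThread = 1L * 1024 * 1024; // 1 MB thread stack (figure from the text)
const long BytesPerIoRequest = 100;                // rough I/O request packet size (figure from the text)

// Memory tied up when every pending request blocks its own thread.
static long BlockedThreadCost(int pending) => pending * StackBytesPerThread;

// Memory tied up when every pending request is just an outstanding async I/O.
static long AsyncRequestCost(int pending) => pending * BytesPerIoRequest;

int pendingRequests = 10_000; // assumed load, for illustration only
Console.WriteLine($"Blocked threads: {BlockedThreadCost(pendingRequests):N0} bytes");
Console.WriteLine($"Async requests:  {AsyncRequestCost(pendingRequests):N0} bytes");
```

With these inputs the synchronous design ties up 10,485,760,000 bytes of stack (about 10 GB), while the asynchronous design ties up 1,000,000 bytes (about 1 MB), a ratio of roughly 10,000 to 1.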
I certainly agree with your comments about server-side code. However, I disagree with your thoughts about client-side code. On my machine Outlook (just to pick some app as an example) has over 50 threads in it because the Outlook developers have the mindset
that it isn't bad to just spawn off a few more threads and allocate a few more megabytes of memory. However, this wastes a lot of resources and if Outlook is running on a terminal server machine hosting 100 client sessions, then that means 5,000 threads on
the system! In fact, because this problem is so bad, Windows 7 is doing a lot to reduce the number of threads used throughout system services and applications that ship with the OS so that Windows 7 will run well on small footprint machines with only 1GB of
RAM in them. Clearly the Windows team feels that all the threads they have been creating have been hurting them and they are finally doing something about it.
And, since I'd like to make things better for Windows users, I highly recommend asynchronous programming for client-side applications. Also, delegates DO support the APM via their BeginInvoke/EndInvoke methods and because of this, asynchronous delegate invocations
integrate quite nicely with my AsyncEnumerator. In addition, my AsyncEnumerator automatically marshals the result back to the GUI thread so there is no need to call Control's BeginInvoke method (for Windows Forms) or DispatcherObject's BeginInvoke method (for WPF).
Furthermore, my AE offers cancelation/timeout support as well as discard support which are all useful features for client-side applications that you do not get with the APM alone or by just spawning up another thread.
ASP.NET does let developers implement their web form or web service app asynchronously, but few developers take advantage of this. ASP.NET offers it only as an option; it does not simply give it to you by default. Also, it is
very useful to perform async programming when accessing a DB because the DB
IS a bottleneck (as you point out). If you make synchronous requests to a DB then your server will create a ton of threads which are all blocked; your server will handle just a few concurrent requests and memory consumption will be very high with thread
stacks which your code is not using at all.
In the real world, there are many developers who will avoid high-level synchronous libraries in order to gain scalability. I know that MS does avoid synchronous DB access for Hotmail and many of its other highly-scalable services. However, it is true that
if your service has few concurrent users, then you do get a lot of benefit from various synchronous abstractions (like Linq to SQL). In an ideal world, these abstractions that offer such productivity will offer asynchronous ways of using them. Linq to SQL
doesn't offer this today but it could in the future (or Linq to Entities could).
And, as for refactoring...Iterators do offer some challenges here, I agree. However, from an iterator, you can call compute-bound synchronous methods or methods that initiate an asynchronous compute-bound or I/O-bound operation that itself returns an IAsyncResult.
And, also my AsyncEnumerator does support composition where an iterator can invoke another iterator. For example, you could create an iterator that asynchronously queries a web server to get some data and process it. And then, you can create another iterator
that consumes the first in a loop to do asynchronous processing for several web sites. In fact, all of this processing can happen concurrently, allowing for phenomenal scaling. If your iterator methods get overwhelming, then you can resort to callback methods,
which is what all native applications have to do and what developers writing scalable managed applications have to do today. Iterators give you a new ability; they don't take away any of the old abilities. This ability has very few drawbacks but if you come
across one, then don't use it; go back to an older tried and true way.
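The composition described above can be sketched as follows. To be clear about the substitution: this uses async/await and Task.WhenAll rather than AsyncEnumerator iterators, so it shows the shape of the idea and not the library's API. Task.Delay stands in for the web request so the example runs without a network, and the method names and site names are hypothetical. It assumes a C# 9 or later compiler for the top-level statements.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Inner operation: stand-in for "query a web server and process the data".
static async Task<int> FetchAndProcessAsync(string site)
{
    await Task.Delay(50);   // simulated network latency; no thread is blocked
    return site.Length;     // simulated processing of the response
}

// Outer operation: consumes the inner one for several sites, all concurrently.
static async Task<int[]> ProcessSitesAsync(string[] sites)
{
    // Start every operation before awaiting any of them so that they overlap.
    Task<int>[] pending = sites.Select(FetchAndProcessAsync).ToArray();
    return await Task.WhenAll(pending); // results come back in input order
}

// Usage: process three hypothetical sites at once.
string[] demoSites = { "a.example", "bb.example", "ccc.example" };
int[] results = await ProcessSitesAsync(demoSites);
for (int i = 0; i < demoSites.Length; i++)
    Console.WriteLine($"{demoSites[i]} -> {results[i]}");
```

Because all of the inner operations are started before any is awaited, the total time is roughly that of the slowest one rather than the sum of all of them.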