Thanks for the excellent reply. I'm still a little hazy on the details, though.
The load balancing of MSMQ is performed by the WCF client, which will round-robin against servers that are running the OrderProcessor service host. This is built into the StockTrader AsyncOrderClient itself.
So, if I were to duplicate this without WCF, there would be code that essentially selects a different queue path for each request, thereby kind of load balancing. Fair enough, but how does this resolve the situation where the path you select points to a machine that is down?
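Just so we're talking about the same thing, here is a minimal sketch of what I mean by selecting a different queue path for each request. It is only my guess at the idea, not what the StockTrader client actually does; the server names, queue names, and the `RoundRobinSelector` class are all made up for illustration.

```python
from itertools import cycle

# Hypothetical queue paths; the machine and queue names are invented.
QUEUE_PATHS = [
    r"FormatName:DIRECT=OS:server1\private$\orders",
    r"FormatName:DIRECT=OS:server2\private$\orders",
    r"FormatName:DIRECT=OS:server3\private$\orders",
]


class RoundRobinSelector:
    """Hands out queue paths in a fixed rotation, one per request."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one queue path is required")
        self._paths = cycle(paths)

    def next_path(self):
        # Note: nothing here knows whether the target machine is up.
        return next(self._paths)


selector = RoundRobinSelector(QUEUE_PATHS)
picked = [selector.next_path() for _ in range(4)]
```

The fourth request wraps back around to the first path. The selector has no idea which machines are alive, which is exactly the gap I'm asking about.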
Due to the nature of MSMQ, a local message will be queued and will wait in the outgoing queue until the target machine comes back online. This provides resiliency, but not really fault tolerance and failover, because now I have an orphaned message that may be stuck in the outgoing queue forever if the target machine never comes back.
With WCF, actually, when using an MSMQ binding, the client app is not directly communicating with the OrderProcessor.exe service host; rather it is talking to MSMQ.
There is a fairly well-known workaround for the remote read/distributed tx issue with MSMQ 3.5, which calls for creation of a polling mechanism that essentially transfers the messages to the processing computer's local MSMQ as part of a distributed tx; that computer then reads them locally and processes them as part of another distributed tx.
Interesting. So you would have a central queue server that all message senders would send messages to. This queue server could be clustered and therefore is fault tolerant and can be failed over. Now this "polling mechanism", which would have to be running locally on the central queue server, would take the messages and send them to the target servers which are running the OrderProcessor.exe service. These servers would be able to then do local reads / transactions to process the messages. Is this correct?
If so, I am still confused as to how this is any different than before. You still have the problem of ensuring that the polling mechanism only sends to servers that are "up". Since MSMQ naturally abstracts this entire process away from you, you're still stuck with the "orphaned message" scenario if the target machine never comes back and the message is sitting in the outgoing queue.
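Here is how I picture that polling mechanism, as a rough in-memory sketch. Plain Python deques stand in for the central queue and the workers' local queues, there are no real transactions, and the `pump` function and worker names are my own invention rather than anything from StockTrader or MSMQ.

```python
from collections import deque


def pump(central, local_queues):
    """Move every message from the central queue to the workers' local
    queues in round-robin order. Returns the number of messages moved."""
    names = sorted(local_queues)
    moved = 0
    while central:
        msg = central.popleft()
        target = names[moved % len(names)]
        # In real MSMQ this send would succeed even if the target is down;
        # the message would simply wait in the outgoing queue.
        local_queues[target].append(msg)
        moved += 1
    return moved


central = deque(["order-1", "order-2", "order-3"])
local_queues = {"worker-a": deque(), "worker-b": deque()}
moved = pump(central, local_queues)
```

The pump forwards blindly. It never checks whether worker-a or worker-b is actually running, so as far as I can tell it just moves the orphaned message problem from the sender to the central server.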
Now, I suppose we could kind of solve this issue by simply using TimeToBeReceived/TimeToReachQueue timeouts on the sending machines. If the message times out, it would be placed in a dead letter queue and we could implement a service that then selects a new server to send the message to. Is that the strategy you use? Sorry if WCF resolves these issues... I'm not familiar with WCF. (Yet.)
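To make the timeout idea concrete, here is a small simulation of what I have in mind. Lists stand in for the outgoing and dead letter queues, and `Message`, `expire`, `reroute`, and `timeout_seconds` are hypothetical names of mine, not MSMQ APIs. Real MSMQ would do the expiry step itself; only the rerouting service would be code I'd have to write.

```python
from dataclasses import dataclass


@dataclass
class Message:
    body: str
    target: str       # server the message was addressed to
    sent_at: float    # time the message entered the outgoing queue
    attempts: int = 1


def expire(outgoing, dead_letter, now, timeout_seconds):
    """Move messages that waited at least timeout_seconds to dead_letter."""
    still_waiting = []
    for msg in outgoing:
        if now - msg.sent_at >= timeout_seconds:
            dead_letter.append(msg)
        else:
            still_waiting.append(msg)
    outgoing[:] = still_waiting


def reroute(dead_letter, outgoing, servers, now):
    """Resend each dead-lettered message to the server after its old target."""
    while dead_letter:
        msg = dead_letter.pop(0)
        next_server = servers[(servers.index(msg.target) + 1) % len(servers)]
        outgoing.append(Message(msg.body, next_server, now, msg.attempts + 1))


servers = ["server1", "server2"]
outgoing = [Message("order-1", "server1", sent_at=0.0)]
dead_letter = []

expire(outgoing, dead_letter, now=30.0, timeout_seconds=30.0)
reroute(dead_letter, outgoing, servers, now=30.0)
```

After one timeout the order is re-addressed to server2 with its attempt count bumped to 2. What this doesn't cover is the case where every server is down, or where the original message was actually delivered just before the timeout fired.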
Also, MSMQ will work with Windows Network Load Balancing; there are some good articles on MSDN about this.
It is my understanding that MSMQ works with NLB only if you're not doing transactional messaging. The session aspect of MSMQ prevents NLB from properly routing all the various packets to the proper machines since NLB is a network-level load balancing mechanism, not an application load balancer which understands the MSMQ protocol. If I'm wrong, I'll be very happy since this is a problem I've dealt with for a while.
I really appreciate you taking the time to explain these things to me. I feel like I've read all the available documentation and still have a lack of understanding.