At the risk of distracting from my main point, let me tell you what we do right now, and then I'll address your concerns about performance. The way the C# runtime binder works is that once it has located the correct "behavior" for a particular call site, it feeds that behavior back to the DLR as a rule, and the DLR manages the delegate. The DLR sequences these rules in the order that it receives them from the binder, and they do not interleave. I think that means that in your example, we "check each one repeatedly." Furthermore, at the level of delegate generation, there is no profile-guided optimization.
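To make the "check each one repeatedly" point concrete, here is a rough toy model of a call site that keeps its rules in arrival order. It is only a sketch and not how the DLR is actually written: `CallSite`, `add_binder`, and `tests_run` are names I made up for illustration, and I've used Python simply to keep it short.

```python
class CallSite:
    """Toy model of a dynamic call site. Hypothetical; not the DLR's real code."""

    def __init__(self, binder):
        self._binder = binder  # hypothetical binder: args -> (test, target)
        self._rules = []       # rules stay in the order the binder produced them
        self.tests_run = 0     # how many rule tests have executed so far

    def invoke(self, *args):
        # Walk the rules in arrival order. Nothing reorders them by hit
        # frequency, so a later rule always pays for the earlier tests.
        for test, target in self._rules:
            self.tests_run += 1
            if test(args):
                return target(*args)
        # Cache miss: ask the binder for a new rule and append it at the end.
        test, target = self._binder(args)
        self._rules.append((test, target))
        return target(*args)


def add_binder(args):
    """Hypothetical binder: builds a rule keyed on the exact argument types."""
    types = tuple(type(a) for a in args)

    def test(call_args):
        return tuple(type(a) for a in call_args) == types

    def target(x, y):
        return x + y

    return test, target


site = CallSite(add_binder)
site.invoke(1, 2)               # miss: a rule for (int, int) is appended
site.invoke("a", "b")           # int rule tested and fails, then a (str, str) rule is appended
result = site.invoke("c", "d")  # int rule tested again before the str rule matches
```

After these three calls the site holds two rules and has run three tests, because every string call has to fail the integer test first. A profile-guided scheme could move the hot rule to the front; the model above deliberately does not.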
But that doesn't address your deeper concern about performance. Let me assure you that we care very much about the performance of the code that we're going to generate and that you're going to use, and we plan to measure it, set goals, and make changes to achieve those goals, both in the C# runtime binder and in the DLR. So it may be the case that the behavior I just described will change before we release the final version, or it may be that it's irrelevant. Consider that the delegate is going to be jitted, and that the jitter could, in some cases, perform the first optimization that you mention without the DLR having to do anything. Obviously, in that case we wouldn't worry about it.
I can't say what particular changes, architectural or otherwise, will come about because of our performance investigations, but I can say that there will be some.