@Glen wrote: "If the problem can't be solved in a transparent way, either the platform model is too complicated or it will harm C++."
I get it -- you want it to be true that ISO C++'s great flexibility is so powerful that it's sufficient to directly express all object models. It's frustrating to be told by people who it seems ought to know better that it isn't that way. I understand; I wish it were true, too.
You may not realize how much I understand your frustration.
Let me tell you a story.
In the Way-Back Machine
In December 2003, an effort to revise the Managed Extensions to C++ was already underway (an effort that would later be called C++/CLI), but I was not participating in it at the time and in fact had heard little about it. I had joined Microsoft just the year before and was still figuring out the huge company and its platforms. I already had a lot of C++ experience and love, as did the other people on the team, but was still drinking from the fire hose of information and trying to absorb as fast as I could all the deep details of the platform and tools that people around me already knew.
That December, the C++/CLI effort had hit a technical and organizational roadblock, and people started coming to me to look for help. In the end I asked for, and was given, the job of leading the design team. As you can imagine, this always presents tensions and challenges for a project team already in midstream, and especially when the newcomer has some pretty hard-nosed design sensibilities about doing things the ISO C++ way.
For the first several months I made a real pain of myself, insisting on understanding the deep rationale for every single feature because I just couldn't believe such a set of language extensions was needed. For example, I asked "why do we need a new class category, why doesn't it just work to have 'class X : System::Object' be an inheritance tag from a magic type that the compiler can treat specially?" and spent several weeks trying to make it work. I asked several dozen similar questions, many of them similar to questions in this thread. Most of the time my probing was not specifically about whether a language extension was needed; the questions were deeper, aimed at finding the goals and constraints -- 'what semantics need to be achieved?', 'what does the code generation have to be?', 'what is the IL we have to map to and round-trip with?', 'how does the compacting garbage collector work?', 'why can't regular pointers point into the GC heap?', 'what are the exact semantics of pinning an object's memory?', and so forth.
In the end, once I grokked the main design constraints, I like to think that I did make a number of contributions that improved and simplified the design quite a lot. But at first I was really wishing it could be 'just ISO C++,' and it was painful when I really began to understand the deep reasons why wishing wouldn't make it so.
Over the course of those first several months, I gradually realized that a few patterns were emerging in my thinking, notably (but not only):
1. I kept trying to make a language feature look like a library, but it still required compiler knowledge and so it was still a language feature in library-syntax clothing, and I was only fooling myself that this somehow made it more like a library.
2. I kept discovering differences where this really was a foreign object model that couldn't be expressed directly in ISO C++ any more than full ISO C++ classes with virtual functions could be expressed directly in ISO C.
I see the same patterns here in this thread, and I understand completely because I said the same kinds of things for months in early 2004 during the C++/CLI design effort when I didn't understand the full problem. I appreciate the patience the team had with me as I absorbed it, but even then it took months in daily meetings in rooms with whiteboards and high-bandwidth interaction.
In this comment format, I can't possibly convey all that learning derived from months of face-to-face interaction with the best experts on the technologies, but maybe I can distill some key learnings and why I felt they were so important in changing my understanding of the nature of the problem.
Pursuing an Analogy: A General Answer to "Why Language Extensions"
Let me pursue the analogy with expressing virtual functions in ISO C, because it's a very good one that directly applies to expressing foreign object models in ISO C++.
In the past, people could have argued (and sometimes did argue) at great length that C is enough (heck, it's Turing-complete and you can write OSes in it), so "you don't need C++ classes and 'virtual' etc. added to the C language, you can write your own vtables in C, and look, here are three projects I can cite that did that..."
... Yes, clearly, you can. But that "you can" completely glosses over the costs of doing it by hand: the usability drawbacks, the quantity of code you write, the quality of the code you write, the quality of error messages, the run-time performance, and the tooling difficulty and quality. If that list sounds familiar because I've already mentioned those same things earlier in the thread, it's because it's what always happens when you fail to expose high-level abstractions and instead use low-level abstractions plus coding conventions.
So how would you answer someone who presses you with the question, "well, can you tell us why it isn't possible to just write our own vtables in C?" Of course it's possible -- and at this point, before you can say more, they may tend to interrupt with 'ha! it's possible, you admit it! so you can have no valid reason not to have done it in just ISO C.' But that doesn't change the fact that it's not a good solution, because you're missing abstractions that belong in the language because they require compiler knowledge.
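To make the comparison concrete, here is a minimal sketch of the two styles side by side. It is only an illustration: the names (ShapeC, ShapeVTable, RectC, make_rect, area_c, Shape, Rect, area_cpp) are hypothetical, and the "C-style" half is written in the C-compatible subset of C++ so that both halves fit in one file. It is not meant to reproduce how any particular project or compiler lays out its vtables.

```cpp
#include <cassert>

// --- Convention + discipline: a hand-written vtable, C style ---
struct ShapeC;                          // forward declaration

struct ShapeVTable {
    int (*area)(const ShapeC* self);    // one slot per "virtual" function
};

struct ShapeC {
    const ShapeVTable* vptr;            // must be wired up by hand for every object
};

struct RectC {
    ShapeC base;                        // "inheritance" by convention: base goes first
    int w, h;
};

int rect_area(const ShapeC* self) {
    // Downcast by convention; nothing checks that self really is a RectC.
    const RectC* r = reinterpret_cast<const RectC*>(self);
    return r->w * r->h;
}

const ShapeVTable rect_vtable = { rect_area };

RectC make_rect(int w, int h) {
    return RectC{ ShapeC{ &rect_vtable }, w, h };
}

int area_c(const ShapeC* s) { return s->vptr->area(s); }

// --- Language feature: the same thing with 'virtual' ---
struct Shape {
    virtual int area() const = 0;
    virtual ~Shape() = default;
};

struct Rect : Shape {
    int w, h;
    Rect(int w_, int h_) : w(w_), h(h_) {}
    int area() const override { return w * h; }
};

int area_cpp(const Shape& s) { return s.area(); }

// Usage: both versions compute the same answer.
void demo() {
    RectC rc = make_rect(3, 4);
    assert(area_c(&rc.base) == 12);

    Rect r(3, 4);
    assert(area_cpp(r) == 12);
}
```

Both halves work. The difference is that in the first half the table, the pointer wiring, and the downcast are all conventions the programmer must get right each time, while in the second half the compiler knows what is meant and writes that machinery itself.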
Key point: As soon as something requires compiler knowledge, it's a language feature. Pushing the language extensions somewhere else, like to a MIDL language+tool, is just squeezing the toothpaste tube and moving the extension to somewhere else in the design space; it's not removing the need for language extensions.
Corollary: Magic libraries and compiler code generation tools are language extensions. As soon as something requires compiler knowledge, it's not a normal library type, and wishing won't make it so. Neither will giving it a library-like syntax.
Note that the answer to nearly all questions of the form "why did you need a language extension for X" is "because X required compiler knowledge." The reason I say "nearly all" is:
- In a few cases the answer is "for usability." For example, usability is the reason C++11 added "auto" and "range-for" and C++/CLI and C++/CX added ^. They're only conveniences, but deemed important enough to bake into the language because their convenience is so useful and frequent that the sugar they provide yields a high benefit even though they're just sugar over other existing language features.
- But nearly always it's "because X required compiler knowledge." For example, that's the reason C++11 added decltype (the twin of auto), override, final, lambdas, nullptr, enum class, move semantics and rvalue references, uniform initialization, initializer lists, constexpr, variadic templates, and nearly everything else. Libraries simply would not do. (Before you say "but initializer_list is a library feature," I slyly and deliberately included that in the list to underscore the point -- initializer_list is not fully a library feature, because the language is aware of the type and gives it special semantics and code generation; you cannot write initializer_list as a library in a C++98 compiler and get its full semantics).
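The initializer_list point can be seen in a few lines. This is a minimal sketch with hypothetical names (sum, demo), assuming a C++11 or later compiler: the class template lives in a library header, but it is the compiler that recognizes a braced list, builds the hidden backing array, and deduces the type.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// An ordinary function that accepts the library type.
int sum(std::initializer_list<int> xs) {
    int total = 0;
    for (int x : xs) total += x;
    return total;
}

void demo() {
    // The language, not the library, turns the braced list into an
    // initializer_list backed by a compiler-generated array.
    auto xs = {1, 2, 3};
    static_assert(std::is_same<decltype(xs), std::initializer_list<int>>::value,
                  "auto deduces std::initializer_list<int> from a braced list");

    assert(xs.size() == 3);
    assert(sum(xs) == 6);
    assert(sum({4, 5, 6}) == 15);   // braced list converted at the call site
}
```

Nothing in the header alone could make `auto xs = {1, 2, 3};` mean this; the deduction rule and the backing array are language rules keyed to that specific type.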
Limitations of the World's Most Flexible Object Model
Finally, the thing that was perhaps hardest to accept was that this beautifully flexible ISO C++ object model that seemed like it could express anything, this so wonderfully controllable and flexible and efficient object model, could not after all directly express all interesting object models. Some object models would stay foreign (at least until added to some future ISO C++) -- not just dynamically typed object models, but even statically typed object models that have different rules and constraints. So the realization was the following:
Key point: There are interesting object models that are not directly expressible in C++. "Directly" means in the language without relying deeply on convention and discipline; you can express anything in assembler via convention and discipline, but that observation is neither helpful nor interesting. "Other object models" means ones that reflect different object layout assumptions (e.g., hiding data rather than exposing it the way the C++ object model assumes and requires), different static and dynamic restrictions, and even different language rules (e.g., deep virtual dispatch in constructors). When you need to bind to them, and you want to do it well with the best quality, performance, and toolability, you need language extensions.
Corollary: Using those object models well from C++ requires a language-level binding. It can be done without a language binding to the same degree as virtual calls can be done using hand-coded vtables in C.
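As one small illustration of "different language rules," consider virtual dispatch during construction as seen from the ISO C++ side. This is a minimal sketch with hypothetical names (Base, Derived, seen_in_ctor, demo). In ISO C++, a virtual call made while the Base constructor is running resolves to Base's version, because the Derived part of the object does not exist yet. An object model whose rule is "deep" dispatch to the most-derived override during construction is following a different rule, and an ordinary C++ class cannot follow both at once.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;

    Base() { seen_in_ctor = name(); }   // virtual call made during construction

    virtual std::string name() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

void demo() {
    Derived d;
    // During Base's constructor the call did not reach Derived::name().
    assert(d.seen_in_ctor == "Base");
    // After construction completes, dispatch reaches the override as usual.
    assert(d.name() == "Derived");
}
```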
Let's return one more time to virtual functions expressed in ISO C.
Consider what would happen if you really tried hard to do virtual functions in C without language extensions. You could imagine creating a wonderful set of C macros (just like #import-enhancing macros) that automates a lot of the repetitive coding for you, and the resulting code may even look reasonably nice. (After all, when we're in this "imagine if" mode before doing real detailed systems prototyping, things always look rosier and more feasible because you haven't found the problems and limits yet; but let's assume what I just described was actually possible in this case for C macros simulating virtual functions.) (And I said "you could imagine creating" on purpose -- that's what this thread is often doing.)
Here's what we know you will never get without a language feature:
- You will never get the code elegance and readability. People using those macros will always write more verbose code than just adding the one word "virtual" to a function signature. (The same is true in C++/CX of, say, "ref".)
- You will never get as robust code. The code that people write will definitely be more brittle -- if nothing else, there will be many more places to spell it wrong because it relies on more-complex conventions and so there are more points of typing/thinking failure. Further, when you say "virtual" you declare your intent and language rules kick in to do two things on your behalf: 1. The language rules do a lot of work and code generation automatically, and they'll get those detailed vtable mechanics right every time (modulo compiler bugs) because it's the compiler that's writing the code, not you. 2. The language rules can often make it impossible to write certain classes of errors even by mistake, because the language can often be designed to make it impossible to say some nonsensical things. When you roll your own vtables, you're not declaring any intent, all the code generation is by hand or at best semiautomated, and there are no language guard rails to help you do it right.
- You will never get the same high quality of error messages. When "virtual" is a keyword you are clearly declaring your intent, the language has rules that enforce you did it right, and the compiler can tell you far more exactly when you got it wrong -- at compile time. When you roll your own vtables, most errors will be at link time or run time rather than compile time, and will be far more cryptic and less actionable.
- You will never get the same run-time performance optimizability. It's important to think about why this is fundamentally true: It too derives from expressing intent. When you don't declare your intent in a language-understood way, you cripple your compiler's and optimizer's ability to help you. For example, in VC++11 we are doing aggressive devirtualization which can result in some major performance wins. Good luck getting that optimization without the "virtual" keyword and instead using hand-coded vtables in C, because the compiler and optimizer will have no idea what you're trying to do. (Note that this is the same reason why C++ templated algorithms taking iterators and lambda functions are routinely faster than hand-coded C loops with direct pointer increment and dereference instructions! That higher-level abstractions just naturally enable performance optimization is just a fact of life, an inherent benefit that results from the fact that letting your tools know more about what you're doing directly increases their ability to help you.)
- You will never get the ease of producing high-quality tooling. Yes, you can code your vtables in C by convention -- but don't expect to get as nice a debugger or object browser, and possibly no class wizard or class inspector at all.
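The "guard rails" and "error messages" points can be illustrated inside C++ itself, using the C++11 "override" keyword mentioned earlier. This is only a sketch with hypothetical names (Shape, Circle, Square, describe_via_base), and it shows the same principle on a smaller scale rather than the C-versus-C++ case directly: when intent is declared in a way the language understands, a mistake becomes a compile-time error instead of a silent run-time surprise.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual std::string describe() const { return "shape"; }
    virtual ~Shape() = default;
};

// Intent not declared: the missing 'const' makes this a brand-new function
// that hides Shape::describe() const instead of overriding it.
// The code compiles (some compilers may warn) and misbehaves quietly.
struct Circle : Shape {
    std::string describe() { return "circle"; }
    // With intent declared, the same slip is rejected at compile time:
    // std::string describe() override { return "circle"; }   // error
};

// Intent declared and signature correct.
struct Square : Shape {
    std::string describe() const override { return "square"; }
};

std::string describe_via_base(const Shape& s) { return s.describe(); }

void demo() {
    Circle c;
    Square s;
    assert(describe_via_base(c) == "shape");    // the mistake shows up only at run time
    assert(describe_via_base(s) == "square");   // the checked override works
}
```

With hand-rolled vtables there is no place to write the equivalent of "override" at all, so every such slip is of the first kind.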
That's what (well-designed) language abstractions always give you over coding convention + discipline: More concise, more elegant code. More robust code. Higher quality error messages. More performance optimization opportunities. Better toolability. This is not wishful idealism; it's inherently true, every time.
All of these things -- said above about language extensions to C for dealing with virtual functions -- apply to language extensions for binding C++ to foreign object models, for they are simply always true for all language extensions that eliminate conventions + discipline.
ISO C just isn't suited to natively doing OO programming with virtual functions, any more than ISO C++ is suited to doing a component object model with true encapsulation and ABI safety. ISO C and C++ are both incredibly useful, powerful, and flexible languages; but an important part of using a power tool is to understand its limits. Sometimes that takes time and deep experience at the boundaries.
@Glen wrote: "How many other languages are you going to extend or recommend get two notions of class?"
Let me ask you a counter-question, please.
What would you say if you were proposing virtual functions for C, and someone asked, "Why do you recommend getting two notions of functions? Functions are functions, darn it, everyone knows that, virtual functions are just functions and I'm sure you can express them in ISO C without extensions, we have function pointers and everything you need."
Two questions:
- How would you answer that?
- When your answer begins with, "Well yes, you can, but..." -- now how do you keep the person listening after that first four-word sound bite that may tell him what he wants to hear, but isn't anywhere close to being the complete truth?
It took me a while to get all this through my head, so I know it takes time.
I don't know if this will help, but maybe it will.