Excellent interview as always. It's great to see more of the C++ team.
I thought Charles read my mind here, as I was just thinking that it would be great to have some videos covering optimization.
If there are more videos on this theme, I'd be interested to hear which C++ code is difficult to optimize, and how best to write C++ so that the optimizer gets as much information as possible about the semantics of our code, perhaps covering __assume et al.
Thanks for another great In-Deep look into C++
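For readers who have not met the `__assume` hint mentioned above, here is a minimal sketch of how such a promise to the optimizer might be written. The `ASSUME` macro and the `sum_first` function are hypothetical names made up for this illustration. `__assume` is the MSVC intrinsic, and the other branches use the closest Clang and GCC builtins so the snippet compiles elsewhere. Whether any given compiler actually produces better code from the hint is not guaranteed.

```cpp
#include <cassert>

// Hypothetical portability wrapper (not from the video or the thread).
#if defined(_MSC_VER)
#  define ASSUME(cond) __assume(cond)            // MSVC intrinsic
#elif defined(__clang__)
#  define ASSUME(cond) __builtin_assume(cond)    // Clang builtin
#elif defined(__GNUC__)
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)                 // no hint available
#endif

// Sums the first n elements of data.
// Contract (an assumption of this sketch): the caller never passes n <= 0.
int sum_first(const int* data, int n) {
    assert(n > 0);   // debug builds verify the promise
    ASSUME(n > 0);   // optimized builds may rely on it, e.g. to drop the zero-trip check
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling `sum_first` on a four-element array holding 1, 2, 3, 4 with `n` set to 4 returns 10. The important caveat is that a false assumption is undefined behavior, which is why the sketch pairs the hint with an `assert` for debug builds.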
Hi! You bring up a good point. Because we generate code with this sequential translation approach, there are times when one optimization transformation exposes new opportunities for another. In general we call this the "Phase Ordering Problem". The traditional approach (which is what we do) is to re-run the transformations that are profitable when there's compiler throughput budget to do it. Of course, we can't iterate forever, and some of these problems won't converge anyway.
For the predicted cycles (and code size) of a particular instruction, we typically start getting a rough idea at lowering time, when we select a machine opcode. This becomes more concrete through register allocation and becomes very concrete, in compiler terms, at encoding time. The instruction the compiler selects has a processor-defined cost in terms of machine resources (e.g. execution elements or slots in the out-of-order buffer), and that cost is fixed. But then data dependencies (is the input value available at the right time in the schedule?) and micro-architectural issues intrude. Finally, as you say, the cache trumps the other issues.
I think it's important to note here that we're getting new machines all the time with different micro-architectures, in fact much more quickly than new compilers, and the compiler needs to try to make a single executable that gives good performance across a spectrum of machines. So some of the instruction performance characteristics are hard to know ahead of time. Of course, we work closely with our partners to make sure that future machines provide good performance for Microsoft apps and tools output.
Finally, with respect to predictability: if you know the machine, its micro-architecture, the working set, and the execution environment (OS and workload), you can theoretically predict the performance exactly (it's a digital computer, after all). In practice we model a "typical" case, maintain good engineering, and then do lots of benchmarking on real-world scenarios to ensure the performance of our output code.
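The phase ordering effect described at the start of this reply can be sketched at the source level. The following is only an illustration with hypothetical function names, written by hand to mimic what successive passes might produce. It is not a trace of what the Microsoft compiler actually does. The point is that inlining alone leaves foldable constants behind, so re-running constant folding and simplification afterwards finds work that was not visible before.

```cpp
#include <cassert>

// Stage 0: what the programmer wrote. Seen in isolation, scale() has
// nothing to fold because `doubled` is an unknown runtime value.
static int scale(int x, bool doubled) { return doubled ? x * 2 : x; }
int original(int x) { return scale(x, false) + scale(x, true); }

// Stage 1: after inlining only. The constants are now visible,
// but nobody has folded the conditionals yet.
int after_inlining(int x) {
    int a = false ? x * 2 : x;
    int b = true ? x * 2 : x;
    return a + b;
}

// Stage 2: after re-running constant folding and dead-branch removal.
int after_folding(int x) { return x + x * 2; }

// Stage 3: after re-running algebraic simplification.
int after_simplify(int x) { return x * 3; }

// Quick self-check that every hand-written stage computes the same value.
void check_stages(int x) {
    assert(original(x) == after_inlining(x));
    assert(after_inlining(x) == after_folding(x));
    assert(after_folding(x) == after_simplify(x));
}
```

If folding ran only before inlining, the code would be stuck at stage 1. Each extra round costs compile time, which is the throughput budget trade-off the reply mentions.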
Sniffy wrote: It would be nice if the interviewer was an actual C++ programmer. The interview is at too high a level for me; I've known for quite a while now that the compiler produces .obj's that get linked by the linker.
Charles wrote: You will see a C++ Expert to Expert interview on C9 in addition to the more broadly-scoped conversations.