Windows Shell Architecture
We recently caught up with Jim Hogg, program manager for the Phoenix Framework, a very robust backend compiler platform co-created by the VC++ team and MSR. Ever wanted to extend the functionality of a backend compiler? Well, now you can, and it's much easier than you'd think given the great work that's gone into "Phoenix" (it's a code name at this point...).
for that you use a linker
staceyw wrote:Very cool stuff. The whiteboard work was fantastic. So does this mean you can take a .net assem and output native exe that does not require the framework be installed?
The subject is slightly esoteric and might have benefited from having a possible internal/external customer for Phoenix to throw in a few leading questions. When the interviewer comes off as trying to think up a question on the spot, it just doesn't come out very well in terms of the audience's confidence in the information gained and time spent.
Though since I don't write compilers, I feel bad complaining here about the free peek into the latest research..
Thanks for this video, nice to see a fellow Scot flying the flag! I have to say this was one of the most intriguing interviews yet...and to be honest the real implications of this are eluding me somewhat.
I guess it means that...
a) you get fast response to new chip architectures - and so new platforms, probably most important for you guys with all the convergence that's going on.
b) you get more flexibility in analysing and tweaking code and introducing funky bits and bytes not handled by the source language.
c) The ability to optimise code in a bespoke manner if you are into highly specialised applications....
Are you going to be doing another interview focusing on managed code? I'd be interested to see what kind of improvements you can expect for managed code compared to C / C++.
Question time!*
1) Wouldn't it be possible to check for buffer overflows on the front end of the compiler? Maybe somewhere between the lexer/parser stages and the backend? Hopefully that is done before the optimization phase.
2) Any thoughts about running Phoenix itself through the Phoenix compiler?
3) How does it handle hand-optimized assembly code in the C++? I know that isn't done all that often anymore, but it does happen.
4) In theory, you could pretty much target any processor you want (not just x86 related ones). All you'd have to do is make the compiler emit its machine code into a text file and then take that file to whatever system you want. Er, right? While you're at it, do it for the 6502.
5) His diagram threw me off a bit. If the .NET code is run by the JIT part of Phoenix, it would not produce a machine executable, correct? I hope I phrased that right.
6) I sensed there was some sort of "reverse engineering" ability with it? Is that a correct assessment? I thought at one point in the video he talked about taking a binary executable as input. If that is the case, does it backtrack to the point where it will crank out C++ code given a particular binary executable for input? Isn't that opening up a whole Pandora's box if people start reverse engineering everything in sight?
7) I have not done assembly language level optimization in a long, long, long time (like the 6502 days). I have a mediocre handle on x86 assembly (and can figure it out if I'm asked to) but the way the Pentium is put together internally is sort of goofy...at least the way the registers were sort of "added on to" over the years in terms of bits. Is that the case with the multi-core processors, too? Sort of like multiplying that several times over? I know the how/why of the register design additions over the years, but I can't imagine having to write assembly for a multi-core system.
8) I think the way parsers work is rather archaic, but that's just me. It seems incredibly inefficient to process code one character at a time (and then string them together into tokens, and then compare those tokens to predefined grammar, and then...). Any thoughts about changing that in the future? I have some ideas on how to do it, and if I find the time I might start messing around with that.
* yes, I watched the video
Here's my writeup of this interview.
Phoenix IR is a "linear", assembly-like language. Internally, it's binary, of course. A doubly-linked list of instructions, where each instruction has two singly-linked lists hanging off it - one for source operands, one for destination operands.
Take a look at "Advanced Compiler Design" by Muchnick to get a flavour of what it is, and the sorts of analysis/transformations you can apply.
Jim
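For anyone trying to picture the IR shape Jim describes, here is a minimal Python sketch of that layout: a doubly-linked list of instructions, each carrying two singly-linked operand lists. This is only an illustration of the data structure; the names (`Operand`, `Instruction`, `InstructionList`, `link`, `names`, `show`) are made up for this example and are not the Phoenix API.

```python
class Operand:
    """Node in a singly-linked operand list (hypothetical name)."""
    def __init__(self, name, next=None):
        self.name = name
        self.next = next


def link(names_in_order):
    """Build a singly-linked operand list and return its head (or None)."""
    head = None
    for name in reversed(list(names_in_order)):
        head = Operand(name, head)
    return head


def names(head):
    """Walk an operand list and collect the operand names in order."""
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out


class Instruction:
    """Node in the doubly-linked instruction list (hypothetical name)."""
    def __init__(self, opcode, srcs=(), dsts=()):
        self.opcode = opcode
        self.srcs = link(srcs)   # head of the source-operand list
        self.dsts = link(dsts)   # head of the destination-operand list
        self.prev = None
        self.next = None


class InstructionList:
    """Doubly-linked list, so a pass can insert or delete in O(1)."""
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, instr):
        instr.prev, instr.next = self.tail, None
        if self.tail is None:
            self.head = instr
        else:
            self.tail.next = instr
        self.tail = instr
        return instr

    def insert_before(self, anchor, instr):
        instr.prev, instr.next = anchor.prev, anchor
        if anchor.prev is None:
            self.head = instr
        else:
            anchor.prev.next = instr
        anchor.prev = instr
        return instr

    def remove(self, instr):
        if instr.prev is None:
            self.head = instr.next
        else:
            instr.prev.next = instr.next
        if instr.next is None:
            self.tail = instr.prev
        else:
            instr.next.prev = instr.prev
        instr.prev = instr.next = None

    def __iter__(self):
        cur = self.head
        while cur is not None:
            yield cur
            cur = cur.next


def show(instr):
    """Render one instruction in an assembly-like 'dst = OP src, src' form."""
    return f"{', '.join(names(instr.dsts))} = {instr.opcode} {', '.join(names(instr.srcs))}"


code = InstructionList()
add = code.append(Instruction("ADD", ["a", "b"], ["t1"]))
mul = code.append(Instruction("MUL", ["t1", "c"], ["t2"]))
code.insert_before(mul, Instruction("MOV", ["4"], ["c"]))
listing = [show(i) for i in code]
```

The double links are what make a "linear" IR convenient for transformations: a pass can splice an instruction in front of any other, or delete one, without copying the rest of the stream.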
Target new chips (or extensions to an existing chip's instruction set, like when SSE2 came along)? Yes.
Think of the job: get existing user programs, written in C++ or Fortran or whatever, to run on a new chip - the results must be correct (no compiler codegen bugs!), and the code must run fast. Well, with a framework like Phoenix, you start that job 90% already done! Keep all of the analysis and optimizations and codegen it already provides (i.e., all those parts that are independent of target chip). Just define the new chip's instruction set (opcodes, addressing modes, etc) and plug it in. (How easy this is depends upon the chip - e.g., Itanium is much harder than the x64 chips from Intel and AMD.)
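As a rough illustration of the "keep the target-independent parts, plug in a chip description" idea, here is a toy Python sketch. Everything in it is assumed for the example: the tuple-based IR, the `TARGETS` tables, the made-up target names, and the functions `remove_self_moves`, `emit` and `compile_for` are hypothetical and much simpler than a real machine description.

```python
# Target-independent program: (opcode, destination, source operands...).
PROGRAM = [
    ("add", "t1", "a", "b"),
    ("mov", "t1", "t1"),        # redundant self-move, removed by the shared pass
    ("mul", "t2", "t1", "c"),
]

# Per-target description: here just a generic-opcode -> mnemonic table.
# Both targets are invented for this sketch.
TARGETS = {
    "toy_cisc": {"add": "addq", "mul": "imulq", "mov": "movq"},
    "toy_risc": {"add": "ADD", "mul": "MUL", "mov": "MOV"},
}


def remove_self_moves(program):
    """Shared optimization: independent of the target chip."""
    return [ins for ins in program
            if not (ins[0] == "mov" and ins[1] == ins[2])]


def emit(program, opcode_table):
    """Target-specific step: map generic opcodes through the target's table."""
    lines = []
    for op, dst, *srcs in program:
        lines.append(f"{opcode_table[op]} {dst}, {', '.join(srcs)}")
    return lines


def compile_for(target_name):
    """Run the shared pass, then emit using the chosen target description."""
    return emit(remove_self_moves(PROGRAM), TARGETS[target_name])


risc_listing = compile_for("toy_risc")
cisc_listing = compile_for("toy_cisc")
```

Supporting another chip in this toy means adding one more entry to `TARGETS`; `remove_self_moves` is reused unchanged, which is the point of the paragraph above.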
Can you add 'features' using Phoenix, that the front-end language does not provide? Yes. Think of things like adding code to gather runtime code-coverage by-function/by-block/by-edge/etc. Or all of the features attacked by "Aspect Oriented Programming".
Custom/Bespoke optimization? Yes - write a "plugin" that does what you need. Replace an existing phase, or provide new ones. Again, no need to build an entire optimizing compiler yourself from scratch before you get the job done - use 98% of what Phoenix already does - just add your bespoke 2%.
This is starting to sound like "Yes, Phoenix can do that - now what's the question". But it's not - it's just that so far, most of the questions have been directed-enough, that I can say "yes".
Jim
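To make the plugin idea a bit more concrete, here is a small Python sketch of a phase list that a plugin can extend or override, using by-block coverage counters as the added feature. It is a sketch under assumptions, not how Phoenix exposes phases: `Pipeline`, `insert_after`, `replace`, `fold_constants` and `add_block_coverage` are hypothetical names, and the IR is just a dict of basic blocks holding tuples.

```python
def fold_constants(blocks):
    """Stand-in for a built-in phase: fold ("add", dst, int, int) into a constant."""
    out = {}
    for label, instrs in blocks.items():
        folded = []
        for ins in instrs:
            if ins[0] == "add" and isinstance(ins[2], int) and isinstance(ins[3], int):
                folded.append(("const", ins[1], ins[2] + ins[3]))
            else:
                folded.append(ins)
        out[label] = folded
    return out


def add_block_coverage(blocks):
    """Plugin phase: put a ("count", label) instruction at the top of every block."""
    return {label: [("count", label)] + list(instrs)
            for label, instrs in blocks.items()}


class Pipeline:
    """Ordered list of (name, function) phases that plugins may edit."""
    def __init__(self, phases):
        self.phases = list(phases)

    def _index(self, name):
        for i, (phase_name, _) in enumerate(self.phases):
            if phase_name == name:
                return i
        raise KeyError(name)

    def insert_after(self, name, new_name, fn):
        """Provide a new phase that runs right after an existing one."""
        self.phases.insert(self._index(name) + 1, (new_name, fn))

    def replace(self, name, fn):
        """Swap out an existing phase for a bespoke implementation."""
        self.phases[self._index(name)] = (name, fn)

    def run(self, blocks):
        for _, fn in self.phases:
            blocks = fn(blocks)
        return blocks


program = {
    "entry": [("add", "x", 2, 3), ("jump", "exit")],
    "exit": [("ret", "x")],
}

pipeline = Pipeline([("fold_constants", fold_constants)])
pipeline.insert_after("fold_constants", "block_coverage", add_block_coverage)
result = pipeline.run(program)
```

The source language never mentions the counters; they exist only because a phase was added to the back end, which is the kind of "feature the front-end language does not provide" described above.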
Thanks for your response Jim,
I have to admit that compilers are somewhat of a black box to me... or rather I treat them as such. It's going to take a change in perspective and a bit of thinking on my (and others', I suspect) part to get the most out of this.
Very inspiring stuff though. I heard *rumours* that AMD are going to be making changes to their architecture for the "k9" that might take some of the sting out of coding parallel-processing-friendly apps. Is this going to be a future focus of effort between CPU designers and compiler specialists?
Guess you can't talk about that though. Keep up the good work!
billh wrote:
Charles, thanks again for this video. I think you need to interview compiler people more often.
jimhogg wrote:
Out of interest - why would you like to run programs without the .NET Framework installed? - I'm not suggesting it's a crazy idea, just wondering which of several possible reasons you're interested in.
Jim
Mmm ... obfuscation ... the system needs to keep a copy of the original metadata and msil around, even after Ngen is done. Because if any of its Ngen-time dependencies change (eg, new version of a dependent assembly), the CLR will silently fall back to JITting the methods you need (the assembly is also enqueued to be 'fixed' by the Ngen service, I believe - tho' I'm not up-to-date with the details).
No easy solution on that one (altho' there's research afoot to improve obfuscators)
Jim
schrepfler wrote:Let's say there's some code in places optimized to assembly code but for the PowerPC platform. Could Phoenix help a conversion of the pieces of code that target the PowerPC instruction set to the Intel instruction set? Theoretically of course, there's no need for such a thing I'm sure.