Jim Hogg: Phoenix Framework


We recently caught up with Jim Hogg, program manager for the Phoenix Framework, a very robust backend compiler platform co-created by the VC++ team and MSR. Ever wanted to extend the functionality of a backend compiler? Well, now you can, and it's much easier than you'd think given the great work that's gone into "Phoenix" (it's a code name at this point...).

Follow the Discussion

  • Is there a similar project available for other languages, e.g. C#?

    EDIT: Sorry, I thought the title meant something else. Is it possible to use Phoenix to extend an existing language like C++ and add new features to it?
  • Charles
    These questions are answered in the video. Why not just watch it?
    C
  • Chadk
    I will watch the video once I get home. I was just amazed at how many videos you guys are doing at the moment! Great stuff!

    Good work!
  • What does the I.R. look like?
  • staceyw (William Stacey)
    Very cool stuff. The whiteboard work was fantastic.  So does this mean you can take a .NET assembly and output a native exe that does not require the framework to be installed?
  • staceyw wrote:
    Very cool stuff. The whiteboard work was fantastic.  So does this mean you can take a .NET assembly and output a native exe that does not require the framework to be installed?
    For that you use a linker :) Check this.
  • I sent a letter about a year ago to join the program as an independent student, but since support from a professor was needed, in the end my application wasn't approved (either Java-oriented teachers who don't use .NET, or teachers who don't want to get exposed or get mixed up with the legal part of it; very frustrating altogether). However, I was very pleasantly surprised when I saw that it's now open for everyone to download.
    Thanks very much to MS for allowing this to happen, and I hope many great plugins for it come out.
    Before I download it: as was mentioned in the video, this is a framework for the backend. Does that mean support for things like parsers and lexers, which could help in developing a whole new language, is outside the scope of this framework?
  • The subject is slightly esoteric and might have benefited from having an internal/external customer of Phoenix throw in a few leading questions. When the interviewer comes off as trying to think up a question on the spot, it just doesn't come across very well in terms of the audience's confidence in the information gained and time spent.

    Though since I don't write compilers, I feel bad complaining here about a free peek into the latest research...

  • Charles
    androidi wrote:

    The subject is slightly esoteric and might have benefited from having an internal/external customer of Phoenix throw in a few leading questions. When the interviewer comes off as trying to think up a question on the spot, it just doesn't come across very well in terms of the audience's confidence in the information gained and time spent.

    Though since I don't write compilers, I feel bad complaining here about a free peek into the latest research...

    Typically, the interviewer (me) knows about as much of the details of a given subject as the audience, which is by design. We feel it makes for a more interesting video than having domain experts asking all the right questions...

    That said, I do like the idea of having internal/external consumers of a given platform technology involved in an interview focusing on the platform.

    Good feedback. Thanks.
    C
  • Thanks for this video, nice to see a fellow Scot flying the flag!  I have to say this was one of the most intriguing interviews yet...and to be honest the real implications of this are eluding me somewhat.

    I guess it means that...

    a) you get a fast response to new chip architectures - and so new platforms, probably most important for you guys with all the convergence that's going on.

    b) you get more flexibility in analysing and tweaking code and introducing funky bits and bytes not handled by the source language.

    c) The ability to optimise code in a bespoke manner if you are into highly specialised applications....

    Are you going to be doing another interview focusing on managed code? I'd be interested to see what kind of improvements you can expect for managed code compared to C / C++.

  • billh

    Question time!*

    1) Wouldn't it be possible to check for buffer overflows on the front end of the compiler? Maybe somewhere between the lexer/parser stages and the backend? Hopefully that is done before the optimization phase.

    2) Any thoughts about running Phoenix itself through the Phoenix compiler?

    3) How does it handle hand-optimized assembly code in the C++? I know that isn't done all that often anymore, but it does happen.

    4) In theory, you could pretty much target any processor you want (not just x86 related ones). All you'd have to do is make the compiler emit its machine code into a text file and then take that file to whatever system you want. Er, right? While you're at it, do it for the 6502.

    5) His diagram threw me off a bit. If the .NET code is run by the JIT part of Phoenix, it would not produce a machine executable, correct? I hope I phrased that right.

    6) I sensed there was some sort of "reverse engineering" ability with it? Is that a correct assessment? I thought at one point in the video he talked about taking a binary executable as input. If that is the case, does it backtrack to the point where it will crank out C++ code given a particular binary executable for input? Isn't that opening up a whole Pandora's box if people start reverse engineering everything in sight?

    7) I have not done assembly language level optimization in a long, long, long time (like the 6502 days). I have a mediocre handle on x86 assembly (and can figure it out if I'm asked to) but the way the Pentium is put together internally is sort of goofy...at least the way the registers were sort of "added on to" over the years in terms of bits. Is that the case with the multi-core processors, too? Sort of like multiplying that several times over? I know the how/why of the register design additions over the years, but I can't imagine having to write assembly for a multi-core system.

    8) I think the way parsers work is rather archaic, but that's just me. It seems incredibly inefficient to process code one character at a time (and then string them together into tokens, and then compare those tokens to predefined grammar, and then...). Any thoughts about changing that in the future? I have some ideas on how to do it, and if I find the time I might start messing around with that.

    * yes, I watched the video
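
    The character-at-a-time scanning described in question 8 can be sketched roughly as follows. This is only an illustrative toy in Python, not anything from Phoenix or from a real compiler front end; the `Token` type, the `tokenize` function and the three token kinds are all assumptions made up for the example.

```python
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "NUMBER", "IDENT" or "OP" (hypothetical token kinds)
    text: str


def tokenize(source: str) -> Iterator[Token]:
    """Scan the input one character at a time and group characters into tokens."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # whitespace separates tokens but produces none
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            yield Token("NUMBER", source[start:i])
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            yield Token("IDENT", source[start:i])
        elif ch in "+-*/=();":
            yield Token("OP", ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")


tokens = list(tokenize("bal = bal + 42;"))
```

    A parser would then consume `tokens` and match them against the grammar, which is the next step the question mentions.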

  • Here's my writeup of this interview.

  • Phoenix IR is a "linear", assembly-like language.  Internally, it's binary, of course.  A doubly-linked list of instructions, where each instruction has two single-linked lists hanging off - one for source operands, one for destination operands.

    Take a look at "Advanced Compiler Design" by Muchnick to get a flavour of what it is, and the sorts of analysis/transformations you can apply.

    Jim
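
    As a rough illustration of the layout described above (a doubly-linked list of instructions, each with one singly-linked list of source operands and one of destination operands), here is a minimal Python sketch. The class and function names (`Operand`, `Instruction`, `InstructionList`, `operand_list`, `operand_names`) and the opcodes are hypothetical, chosen for the example; they are not the Phoenix API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Operand:
    name: str
    next: Optional[Operand] = None  # singly linked: forward pointer only


@dataclass(eq=False)  # compare instructions by identity, not by walking links
class Instruction:
    opcode: str
    sources: Optional[Operand] = None       # head of the source-operand list
    destinations: Optional[Operand] = None  # head of the destination-operand list
    prev: Optional[Instruction] = None      # doubly linked: both directions
    next: Optional[Instruction] = None


def operand_list(*names: str) -> Optional[Operand]:
    """Build a singly-linked operand list that preserves the given order."""
    head: Optional[Operand] = None
    for name in reversed(names):
        head = Operand(name, head)
    return head


def operand_names(head: Optional[Operand]) -> List[str]:
    names = []
    while head is not None:
        names.append(head.name)
        head = head.next
    return names


class InstructionList:
    def __init__(self) -> None:
        self.head: Optional[Instruction] = None
        self.tail: Optional[Instruction] = None

    def append(self, instr: Instruction) -> None:
        instr.prev, instr.next = self.tail, None
        if self.tail is None:
            self.head = instr
        else:
            self.tail.next = instr
        self.tail = instr

    def forward(self) -> Iterator[Instruction]:
        node = self.head
        while node is not None:
            yield node
            node = node.next

    def backward(self) -> Iterator[Instruction]:
        node = self.tail
        while node is not None:
            yield node
            node = node.prev


code = InstructionList()
code.append(Instruction("ADD", operand_list("a", "b"), operand_list("t1")))
code.append(Instruction("MUL", operand_list("t1", "c"), operand_list("t2")))
code.append(Instruction("RET", operand_list("t2")))
```

    Walking `code.forward()` or `code.backward()` visits the instructions in either direction, which is the kind of traversal that forward and backward analyses typically need.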

  • Yes, I maybe forgot to say in the video - you can download the Phoenix RDK (Research Development Kit) from

    http://research.microsoft.com/phoenix

    You need a version of Visual Studio 2005 to use it - again, it works just fine with the Express editions - free downloads at:

    http://msdn.microsoft.com/vstudio/express

    We update the RDK every 6 months or so.  Next one is about ready, and should 'hit the streets' in a few weeks' time.

    [needless to say, the RDK is for experimenting/researching - you can't ship a commercial product based on it - it's not yet ready for prime time]

    Lexers and parsers - no, Phoenix doesn't help there.  That whole area is already well covered with lex/yacc, flex/bison - versions by the dozen for whatever language you want to build your front-end compiler with.

    Jim
  • Can Phoenix import a .NET assembly? - yes.

    Can Phoenix compile the il down to native code? - yes.

    Can that native code run without a .NET Framework? - no.


    Why?  Even "pure" native programs require runtime support.  For example, if your program writes a file, the OS does all the hard work, on your behalf, of finding free blocks on the disk to store that output, and remembering how to find them when you reopen the file tomorrow.

    And it's the same with .NET programs.  They can use features, provided (at runtime), by the OS.  But they also use a bunch of extra features provided by the .NET Framework - eg:
     
    a) memory management (allocation, and automatic garbage collection);

    b) code access security

    c) threadpool

    d) appdomains

    e) reflection

    and a bunch more.  Arguably, the most fundamental is the first - GC - but that's a different discussion.

    Oh - I've made it sound like Phoenix does all this clever stuff, of converting a .NET assembly into native code and saving it.  It's mostly done by the Ngen tool (also known, confusingly, as "prejit").  Phoenix can provide the IL-to-Native compile engine, but laying out classes, and persisting runtime data structures (think of vtables, and such-like), etc is done by Ngen.

    Out of interest - why would you like to run programs without the .NET Framework installed? - I'm not suggesting it's a crazy idea, just wondering which of several possible reasons you're interested in.

    Jim
  • Target new chips (or extensions to an existing chip's instruction set, like when SSE2 came along)?  Yes. 

    Think of the job: get existing user programs, written in C++ or Fortran or whatever, to run on a new chip - the results must be correct (no compiler codegen bugs!), and the code must run fast.  Well, with a framework like Phoenix, you start that job 90% already done!  Keep all of the analysis and optimizations and codegen it already provides (ie, all those parts that are independent of target chip).  Just define the new chip's instruction set (opcodes, addressing modes, etc) and plug it in.  (how easy this is depends upon the chip - eg Itanium is much harder than the x64 chips from Intel and AMD).


    Can you add 'features' using Phoenix, that the front-end language does not provide?  Yes.  Think of things like adding code to gather runtime code-coverage by-function/by-block/by-edge/etc.  Or all of the features attacked by "Aspect Oriented Programming".


    Custom/Bespoke optimization?  Yes - write a "plugin" that does what you need.  Replace an existing phase, or provide new ones.  Again, no need to build an entire optimizing compiler yourself from scratch before you get the job done - use 98% of what Phoenix already does - just add your bespoke 2%.


    This is starting to sound like "Yes, Phoenix can do that - now what's the question".  But it's not - it's just that so far, most of the questions have been directed-enough, that I can say "yes".

    Jim
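
    The "plugin" idea above (add a new phase to an existing phase list, or replace one, instead of writing a whole compiler) can be sketched in a few lines of Python. Everything here is a made-up toy: the tuple-based IR, the `PhaseList` class, and the phase names are assumptions for illustration and do not reflect how Phoenix plug-ins are actually written. The added phase inserts a per-block counter, in the spirit of the code-coverage example.

```python
from typing import Callable, List, Tuple

# Toy IR: a list of (opcode, argument) pairs. "LABEL" starts a basic block.
Instr = Tuple[str, str]
Phase = Callable[[List[Instr]], List[Instr]]


def remove_nops(ir: List[Instr]) -> List[Instr]:
    """Stand-in for an existing phase the framework already provides."""
    return [ins for ins in ir if ins[0] != "NOP"]


def add_block_coverage(ir: List[Instr]) -> List[Instr]:
    """Plug-in phase: insert a COUNT pseudo-instruction at the start of each block."""
    out: List[Instr] = []
    for opcode, arg in ir:
        out.append((opcode, arg))
        if opcode == "LABEL":
            out.append(("COUNT", arg))
    return out


class PhaseList:
    def __init__(self) -> None:
        self._phases: List[Tuple[str, Phase]] = []

    def _index(self, name: str) -> int:
        for i, (phase_name, _) in enumerate(self._phases):
            if phase_name == name:
                return i
        raise KeyError(name)

    def append(self, name: str, phase: Phase) -> None:
        self._phases.append((name, phase))

    def insert_after(self, existing: str, name: str, phase: Phase) -> None:
        self._phases.insert(self._index(existing) + 1, (name, phase))

    def replace(self, name: str, phase: Phase) -> None:
        self._phases[self._index(name)] = (name, phase)

    def names(self) -> List[str]:
        return [name for name, _ in self._phases]

    def run(self, ir: List[Instr]) -> List[Instr]:
        for _, phase in self._phases:
            ir = phase(ir)
        return ir


phases = PhaseList()
phases.append("cleanup", remove_nops)
phases.insert_after("cleanup", "coverage", add_block_coverage)

program = [("LABEL", "entry"), ("NOP", ""), ("LOAD", "x"),
           ("LABEL", "loop"), ("ADD", "x")]
instrumented = phases.run(program)
```

    The existing "cleanup" phase is reused untouched; only the small "coverage" phase is new, which is the "add your bespoke 2%" point.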

  • 1)  Check for buffer overruns?  Depends upon the front-end language - and whether it's "safe".  So the C library gives you a bunch of nifty string functions like gets.  But it doesn't ask, or let the programmer specify, the size of the destination buffer.  There's not much Phoenix can do to fix this "deficiency".  [But check out the new "_s" versions of these C functions; and the /GS qualifier - both explained somewhere on MSDN].

    On the other hand, if you use "safe" languages like C#, C++/CLI, etc then you're good.  They all demand you specify the size of buffers (in their verifiable subset).

    2)  Yes.  We call it "self-build" - done every full build (nightly).

    3)  We keep the assembly code as-is.  We don't mess with it.

    4)  In theory, for unmanaged code you want compiled to native, yes.  For example, with a C source program, run it thru our C front-end, and into Phoenix backend.  Then write yourself a "dump" phase that transforms Phoenix IR into assembly-source for your target chip.  You could insert that phase wherever you want - before we optimize, or after.  The task for a managed language, like C#, is "more challenging" - you'd also need to write the equivalent of a .NET Framework for the 6502  :)

    5)  No, all JITs for .NET are handed an MSIL method and told to compile it into native code.  There's an API that the JIT uses to ask the 'surrounding' .NET engine (the "CLR") for info it needs to do its job - things like "tell me at what byte-offset you have decided to locate the field called 'bal' in the class called 'AccountDetails'".  The CLR reuses the native code for that method, each time that method is called subsequently in this session.  But it does not save the result to-disk.  If you run the App over again, all those methods get JIT'd afresh.

    6)  Kind of.  Phoenix can read a .NET assembly, or a native binary, and convert into Phoenix IR (plus symbol table / type table).  You can then analyze it (eg build the flowgraphs) and view the info.

    7)  For multi-core - currently, it's a hardware feature.  How to take a program, written assuming a single cpu, either in assembler, or some high-level language, and change it to make use of as many cores as available - is a big challenge for the industry.  Initial steps will likely introduce language extensions so the programmer can direct/suggest what needs to be done.  Auto parallelization would magically figure this out without hints.

    8)  I can't think of ways to avoid the classic approach you describe: scan characters one-at-a-time and assemble into tokens, pass them into the parser, etc.  Improvements for the future, I'd suggest, lie in devising better languages.  Programming in today's OO languages is just too much typing - the compiler should figure out 50% of the nonsense we have to supply as programmers.  For example, why do I need to tell the compiler the type of every variable? - the compiler can infer the right answer most of the time.  Take a look at ML or Haskell - clear, powerful, no needless typing (in the sense of hitting the keyboard).

    Jim
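
    The "dump" phase mentioned in answer 4 (walk the IR and write out assembly source for the target chip) might look something like the sketch below. It is a toy under stated assumptions: the tuple IR with `ASSIGN` and `ADD`, the helper names `operand` and `dump_6502`, and the restriction to 8-bit values are all invented for the example, and the output is only simple 6502-style text, not something Phoenix produces.

```python
from typing import List, Tuple, Union

Value = Union[int, str]   # an 8-bit constant or the name of a memory variable
Instr = Tuple             # ("ASSIGN", dst, src) or ("ADD", dst, a, b)


def operand(value: Value) -> str:
    """Render a constant as an immediate (#n) and a variable as a symbol."""
    if isinstance(value, int):
        if not 0 <= value <= 255:
            raise ValueError(f"constant {value} does not fit in 8 bits")
        return f"#{value}"
    return value


def dump_6502(ir: List[Instr]) -> str:
    """Translate the toy IR into 6502-style assembly text, one IR op at a time."""
    lines: List[str] = []
    for instr in ir:
        op = instr[0]
        if op == "ASSIGN":
            _, dst, src = instr
            lines += [f"    LDA {operand(src)}",   # load accumulator
                      f"    STA {dst}"]            # store to destination
        elif op == "ADD":
            _, dst, a, b = instr
            lines += [f"    LDA {operand(a)}",
                      "    CLC",                   # clear carry before ADC
                      f"    ADC {operand(b)}",
                      f"    STA {dst}"]
        else:
            raise ValueError(f"no 6502 translation for {op!r}")
    return "\n".join(lines)


listing = dump_6502([("ASSIGN", "x", 5), ("ADD", "y", "x", 1)])
```

    The resulting `listing` is plain text that could be handed to an assembler on the target system, which is the workflow question 4 has in mind.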

  • Thanks for your response Jim,

    I have to admit that compilers are somewhat of a black box to me... or rather I treat them as such. It's going to take a change in perspective and a bit of thinking on my part (and others', I suspect) to get the most out of this.

    Very inspiring stuff though. I heard *rumours* that AMD are going to be making changes to their architecture for the "k9" that might take some of the sting out of coding parallel-processing-friendly apps.  Is this going to be a future focus of effort between cpu designers and compiler specialists?

    Guess you can't talk about that though ;) Keep up the good work!

     

  • billh
    Jim, you rock! Thanks for taking time out to reply to all of my questions. :)

    Charles, thanks again for this video. I think you need to interview compiler people more often.

    :O

    I love studying compilers (although I still don't have a total handle on yacc, bison, etc.). I'll experiment when I get some free time with other types of parsers. It's hard to explain what I am envisioning in a parser without some pictorial explanations or a working demo.

    I'm sure I'll come up with more questions, and maybe eventually I'll get around to downloading Phoenix.

    Thanks again!

  • Charles
    billh wrote:


    Charles, thanks again for this video. I think you need to interview compiler people more often.



    Actually, I interviewed one of the developers of VC++'s backend compiler (he's been working on the VC++ compiler team for about 15 years!) a few weeks ago. You'll meet him soon right here on C9.
  • staceyw (William Stacey)
    jimhogg wrote:


    Out of interest - why would you like to run programs without the .NET Framework installed? - I'm not suggesting it's a crazy idea, just wondering which of several possible reasons you're interested in.

    Jim


    Thanks Jim.  The framework does not bother me - I like it.  However, as an MVP and NG fan, the primary "shots" I see are:
    1) Easy to decompile and see code (i.e. obfuscation problem).
    2) The tax of installing the framework (and over again for each new version)

    So the primary question I see (as y'all do I am sure) is why not just compile everything needed to native code and ship a native exe to kill both issues above?  I have to admit at times that would be nice to have.   Not sure if just running ngen would solve #1 and/or be practical?  I am not complaining, I enjoy the .Net framework. 
  • Let's say there's some code that has been hand-optimized in places into assembly code, but for the PowerPC platform. Could Phoenix help convert the pieces of code that target the PowerPC instruction set to the Intel instruction set? Theoretically, of course; there's no need for such a thing, I'm sure.



  • Mmm ... obfuscation ... the system needs to keep a copy of the original metadata and msil around, even after Ngen is done.  Because if any of its Ngen-time dependencies change (eg, new version of a dependent assembly), the CLR will silently fall back to JITting the methods you need (the assembly is also enqueued to be 'fixed' by the Ngen service, I believe - tho' I'm not up-to-date with the details).

    No easy solution on that one (altho' there's research afoot to improve obfuscators)

    Jim
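
    The fallback behaviour described above (use the precompiled native image only while its Ngen-time dependencies are unchanged, otherwise JIT from the MSIL that was kept around) can be modelled loosely as follows. This is a simplified sketch, not the CLR's actual logic: `NativeImage`, `jit_compile`, `select_code` and the version strings are hypothetical names and data invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NativeImage:
    code: str
    # Versions of dependent assemblies observed when the image was precompiled.
    dependency_versions: Dict[str, str]


def jit_compile(il: str) -> str:
    """Stand-in for JIT compilation; needs the original IL to be available."""
    return f"jit({il})"


def select_code(il: str, image: Optional[NativeImage],
                installed: Dict[str, str]) -> str:
    """Use the native image only if every recorded dependency still matches."""
    if image is not None and all(
        installed.get(name) == version
        for name, version in image.dependency_versions.items()
    ):
        return image.code
    return jit_compile(il)  # silent fallback: the IL must still be around


image = NativeImage("native(Main)", {"LibA": "1.0"})
fresh = select_code("Main", image, {"LibA": "1.0"})
stale = select_code("Main", image, {"LibA": "1.1"})
```

    Because the fallback path takes the IL as input, the metadata and MSIL cannot simply be thrown away after precompilation, which is why Ngen alone does not solve the obfuscation concern.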

  • schrepfler wrote:
    Let's say there's some code that has been hand-optimized in places into assembly code, but for the PowerPC platform. Could Phoenix help convert the pieces of code that target the PowerPC instruction set to the Intel instruction set? Theoretically, of course; there's no need for such a thing, I'm sure.



    We're not aiming to do this.  Binary translation (yes, it's source text, but machine-specific, so effectively 'binary') presents a whole raft of different problems.  We already have enough challenges  :)

    Jim
