

Louis Lafreniere - VC++ backend compiler


Louis Lafreniere has been a developer on the VC++ compiler team for a long time; 15 years, to be exact. Specifically, Louis works on the backend compiler. What's a backend compiler? How's it evolved over the years? Where's it going? Watch and listen. Good stuff.


Follow the Discussion

  • PonPon
    Cool, rather interesting Smiley

    I must say, I enjoy these compiler videos. Keep 'em coming Big Smile

  • Great interview!

    Concerning the IA64 architecture, it was mentioned that the
    compiler had to do more of the smart work to optimize code layout.
    What was the reasoning for this change? Is this about making the
    architecture simpler? (Assuming it's more complex in other
    respects.)

    I also really appreciate the improvements in back-end code
    generation for VC++. It is nice to see a video like this, as there
    are good surprises in code generation that we could otherwise only
    discover by stepping through the disassembly window.

    Additions to the language, or new libraries, change the way we
    write code, but discovering new optimizations really gives a
    different perspective. For example, eliding the copy of an object
    returned from a function allows writing code that makes much more
    use of automatic variables (and therefore relieves a lot of
    pointer management).
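The copy elision mentioned above can be sketched as follows; this is a minimal illustration (the function and names are made up), showing a function that returns a container by value and lets the compiler's named-return-value optimization construct the result directly in the caller's storage:

```cpp
#include <string>
#include <vector>

// Build a vector and return it by value. With NRVO the compiler
// constructs "result" directly in the caller's storage, so no
// copy of the vector is made on return.
std::vector<std::string> make_names() {
    std::vector<std::string> result;   // candidate for NRVO
    result.push_back("Louis");
    result.push_back("Charles");
    return result;                     // elided copy with NRVO
}
```

This is the kind of optimization that encourages value-returning style instead of passing output pointers around.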

    I guess that someone who was writing C++ code 10 or 15 years ago,
    and is still doing so now, would certainly have the feeling he or
    she is using a different language, even though it's still C++.

    By the way, as more developers get familiar with C# coding style,
    it may be that more and more C++ classes will be written entirely
    in a header, rather than the usual .h/.cpp pair. If a Visual
    Studio person reads this, it would be nice to factor this into
    smart indenting.
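As a small illustration of the header-only style being described (the class and file name are hypothetical): member functions defined inside the class body are implicitly inline, so the whole class can live in a single header with no matching .cpp file:

```cpp
// point.h -- a C#-style, header-only class (hypothetical example)
#pragma once

class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }   // defined in-class => implicitly inline
    int y() const { return y_; }
private:
    int x_, y_;
};
```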



  • Sven Groot
    Interesting video. One thing though, Charles: in the beginning your wording kind of implies that this frontend/backend setup is something unique to C++, while in fact every compiler works this way. Heck, I wrote a compiler for a subset of Pascal in third-year Computer Science, and even that had a separate frontend and backend. Smiley I'm sure you didn't mean it like that though, it just sounded that way.

    In the end you talked about making the compiler multithreaded. I think it's worth mentioning that although the compiler in VS2005 isn't, msbuild is. If you have a solution with more than one project, msbuild/VS2005 will build more than one project at the same time (if possible, based on the projects' dependencies) according to the number of CPUs in your system.
  • Charles
    Sven Groot wrote:
    Interesting video. One thing though, Charles: in the beginning your wording kind of implies that this frontend/backend setup is something unique to C++, while in fact every compiler works this way. Heck, I wrote a compiler for a subset of Pascal in third-year Computer Science, and even that had a separate frontend and backend. I'm sure you didn't mean it like that though, it just sounded that way.


    The topic at hand is C++, so I related the frontend/backend statement to, well, C++... Smiley

    C
  • Hi Pierre,
    Glad you liked the interview.  

    Memory speed has not kept up with CPU speed increases over the past few decades, so memory latency has become a big bottleneck. There are currently two different ways of approaching this problem.

    One is to rely on the hardware to dynamically figure out the dependencies between instructions, and allow them to execute out of order as soon as their inputs are ready. This is the approach used by most chips today.

    The IA64 took a different approach, adding flexibility to the instruction set to give compilers the tools to schedule instructions so that loads can execute far from their uses. For example, for "if (x) { y = *p; }", the compiler would normally not be able to hoist the load of *p outside the if(), because the if() might be protecting a load that would otherwise fault. IA64 provides a way to hoist this load anyway and defer the exception until you get inside the if(); if you never do, no exception is raised.
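The control speculation described here can be sketched at the C++ level; this is only a shape-level mimic (the null check stands in for IA64's speculative load and deferred-fault check, ld.s/chk.s, and the function names are made up):

```cpp
// As written: the load of *p is guarded by the test of x, so the
// compiler normally cannot move it above the branch -- p might be
// invalid whenever x is false.
int guarded_load(int x, const int* p) {
    int y = 0;
    if (x) {
        y = *p;                    // load stays under the guard
    }
    return y;
}

// IA64-style control speculation, mimicked in C++: the load is
// hoisted above the test (like ld.s), and the check inside the if()
// (like chk.s) would recover if the speculative load had faulted.
// C++ has no deferred faults, so a null test stands in for that.
int speculated_load(int x, const int* p) {
    int speculative = p ? *p : 0;  // stands in for ld.s + deferred fault
    int y = 0;
    if (x) {
        y = speculative;           // stands in for chk.s + use
    }
    return y;
}
```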

    For "*q = x; y = *p;", the compiler would likewise not normally be able to hoist the *p load above the *q store, in case they point to the same address. The IA64, however, provides a way to do this load ahead of time and then check, at the y =, whether the load was invalidated by the subsequent store.
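A similar C++-level mimic of the advanced load (again, only the shape: the pointer-equality test stands in for the ALAT check done by IA64's chk.a, and it only catches exact aliasing; function names are made up):

```cpp
// As written: the load of *p cannot be moved above the store through
// q, because p and q might alias.
int store_then_load(int* q, int x, const int* p) {
    *q = x;
    return *p;
}

// IA64-style data speculation, mimicked in C++: perform the load
// early (like ld.a), do the store, then check whether the store
// invalidated the loaded value and reload if so (like chk.a).
int advanced_load(int* q, int x, const int* p) {
    int early = *p;        // advanced load, hoisted above the store
    *q = x;                // the store that might alias p
    if (p == q) {          // stands in for the chk.a ALAT check
        early = *p;        // recovery: reload the invalidated value
    }
    return early;
}
```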

    Branch misprediction is also a problem for CPUs with deep pipelines. But IA64 instructions can be conditionally executed based on true/false register predicates, which lets us generate straight-line code for if/else constructs if we want, avoiding the chance of mispredicted branches.
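The predication idea can be illustrated with a pattern compilers commonly turn into straight-line code; on x86 this typically becomes a CMOV rather than IA64 predicates (function names are made up):

```cpp
// Branchy form: a hard-to-predict condition can cost a pipeline
// flush on every mispredicted branch.
int max_branchy(int a, int b) {
    if (a > b) return a;
    else       return b;
}

// Predicated form: both values are available and one is selected
// without a branch -- straight-line code, nothing to mispredict.
int max_predicated(int a, int b) {
    return (a > b) ? a : b;   // typically compiled to a CMOV on x86
}
```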

    This approach does avoid a lot of the complexity of out-of-order execution, but these tools add a lot of complexity of their own.

    The belief back when the IA64 was designed was that x86 speed was approaching its peak, that out-of-order execution wouldn't be enough to avoid the memory bottleneck, and that the clock speed on x86 couldn't keep being cranked up. The thought was that it could be cranked up higher on IA64.

    But doing a good job of generating code for IA64 is a very hard problem. Using these "tools" isn't usually free, so they involve a lot of trade-offs. Profile-guided optimization does provide a lot of information to help the compiler make these decisions, but it is still very hard to take full advantage of the machine.


    -- Louis Lafreniere
  • billh

    Again, great video. More! You should interview some assembly language people...I would like to hear about the differences and changes over the years in the Pentium architecture and how your teams have adapted to that on very low levels. You kind of hit on that a bit with the multicore discussion here. I've thought a lot about getting back into some assembly programming just for fun (I did a fair amount of it back in the days of the 6502 chips), but am wondering how easy that will be considering the optimization that occurs on the chip itself, the caches, etc.

    Question: how do you target your compiler for different Pentium architectures? From what I remember, Intel seems to alter a few instructions with every generation (from the Pentium to the Pentium II, on up to the current ones). Does your compiler recognize the user's chip and pick the best optimization? How about for programs that are shipped? How do those recognize the user's chip? Or do you not take advantage of the latest additions made by Intel?

    Unfortunately, I do not own a copy of Visual Studio, so maybe those are options in the IDE, I don't know.

  • Hi Bill,
    We are working very closely with Intel and AMD to stay on top of the latest architecture changes, and adjust/tune the compiler accordingly.

    We've stopped giving customers the ability to pick which particular chip flavor they want to directly target, since most people want their apps to run fast on the variety of chips on people's desks at that time. So instead, we try to tune the compiler for the set of chips we think will be dominant not only after we ship, but after our customers ship their own apps. That usually means the chip Intel/AMD is currently working on, plus the current shipping generation, and maybe the one before that as well. We do provide the /arch:SSE and /arch:SSE2 switches to let the compiler use these newer instructions (as well as CMOV), but the generated program will not run on older architectures that don't support them.

    Tuning the generated code (or your assembly code) is a lot harder than it used to be, mainly because of out-of-order execution. Back in the 386/486 days, and even on the first-generation Pentiums, we could pick up the instruction manual and figure out exactly how many cycles a particular instruction sequence would take, but you can't do that anymore. You need to know how the machine works and identify the patterns that might cause problems in out-of-order execution.

    As far as runtime detection of the architecture we run on, the CRT does look at it and takes advantage of the SSE/SSE2 instructions when available, to speed up some computations and to move larger chunks of memory at a time. The code generated by the compiler doesn't do this, however. Doing so would cause a lot of code duplication, and our experience has shown that code size is very important for medium to large apps.
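The CRT-style dispatch described here can be sketched as a function pointer chosen once at startup. Everything below is hypothetical (the real CRT internals differ, and the feature test is stubbed out), but it shows the pattern, and why per-CPU paths mean duplicated code:

```cpp
#include <cstddef>
#include <cstring>

// Portable fallback copy routine.
static void copy_generic(void* dst, const void* src, std::size_t n) {
    std::memcpy(dst, src, n);
}

// Stand-in for a specialized routine that would move 16 bytes at a
// time with SSE2 -- a second, duplicated body for the same job.
static void copy_sse2(void* dst, const void* src, std::size_t n) {
    std::memcpy(dst, src, n);
}

// Stand-in for a CPUID feature test (always "no" in this sketch).
static bool cpu_has_sse2() {
    return false;
}

// Chosen once, at startup; every later call dispatches through it.
static void (*copy_impl)(void*, const void*, std::size_t) =
    cpu_has_sse2() ? copy_sse2 : copy_generic;
```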

    -- Louis Lafreniere
  • louisl wrote:


    As far as runtime detection of the architecture we run on, the CRT does look at it and takes advantage of the SSE/SSE2 instructions when available, to speed up some computations and to move larger chunks of memory at a time. The code generated by the compiler doesn't do this, however. Doing so would cause a lot of code duplication, and our experience has shown that code size is very important for medium to large apps.

    -- Louis Lafreniere


    How interesting. One would think the JIT should be able to take
    advantage of runtime detection of the hardware to generate code
    specific to the current processor. Still, as Brandon Bray pointed
    out, the JIT has stricter time constraints than a regular
    compiler, and therefore cannot spend too much time optimizing.
    One could also wonder how this would impact performance in
    general, as most of the time the difference should be small. (?)

    Are these considerations part of the Phoenix project?

  • Oops, sorry for the double message. Can be edited out?
    Yes, JIT throughput is very important; still, instruction selection is quick to do, and that would be quite appropriate for a JIT. The win wouldn't be very big, though, and I could be wrong, but I don't believe our JITs do any optimization dependent on the host CPU.

    We are currently working on the high-level optimizations in Phoenix, and will tune the low-level, machine-dependent code generation later on. This is certainly something we'll consider if we see opportunities.

       -- Louis Lafreniere
  • Hi all,
    Could anybody tell me which is better for developing applications for Windows Mobile Smartphone? We want better performance and more power.


Comments Closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.