
GoingNative 7: VC11 Auto-Vectorizer, C++ NOW, Lang.NEXT


In this installment of GoingNative, it's all about the latest C++ compiler technology from Microsoft. Most of the time is spent discussing VC11's Auto-Vectorizer, with a few short forays into other VC compiler improvements (like the Auto-Parallelizer). You meet the lead developer for VC11's backend compiler and the architect of the Auto-Vectorizer, Jim Radigan (who spends most of the time at the whiteboard). You also meet backend compiler PM Jim Hogg, a C9 veteran and one of the original folks behind the Phoenix Compiler Project.

In order to keep the conversation palatable to a large number of folks, we don't get into the math behind auto-vectorization. However, if this is something that really interests you, then we can get Jim to do a lecture on the internals (will take more than one session, of course—a lot of stuff goes on behind the scenes when you take a loop of arbitrary complexity and determine if it's vectorizable and then vectorize it with maximum efficiency...). Now, on to AutoVec.

The VC11 compiler includes a feature called Auto-Vectorization, or AutoVec for short. AutoVec tries to make loops in your code run faster by using the SSE, or vector, registers present in all current processors. The feature is on by default, so, like other optimizations the compiler performs, you don't need to do anything more to benefit. However, this session provides more background on what is going on, and digs a little into the kinds of sophisticated analyses that AutoVec performs and the loop patterns that it successfully speeds up.

Here's a trivial example of a loop that gets automatically vectorized in VC11 with significant performance gains:

for (int i = 0; i < 100000; i++)
{
    a[i] = b[i] + c[i];
}

The auto-vectorizer transforms the above tight loop into machine instructions that run the loop 4x faster on SIMD-capable (SSE/SSE2) processors. As Jim and Jim discuss, this is because each loop iteration simultaneously performs 4 computations using the modern CPU's vector registers. This is a great automatic optimization feature in VC11. Tune in and meet a couple of the key folks behind VC's Auto-Vec!
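
To make the transformation concrete, here is roughly what the vectorized loop amounts to, written by hand with SSE2 intrinsics. This is a sketch only, not the compiler's actual output; it assumes a, b, and c are int arrays, and vadd is an invented name:

#include <emmintrin.h>   // SSE2 intrinsics

// Hand-written sketch of the idea, not the compiler's actual output.
// Assumes int arrays whose length is a multiple of 4.
void vadd(int* a, const int* b, const int* c)
{
    for (int i = 0; i < 100000; i += 4)
    {
        __m128i vb = _mm_loadu_si128((const __m128i*)&b[i]); // load 4 ints from b
        __m128i vc = _mm_loadu_si128((const __m128i*)&c[i]); // load 4 ints from c
        __m128i va = _mm_add_epi32(vb, vc);                  // 4 additions in 1 instruction
        _mm_storeu_si128((__m128i*)&a[i], va);               // store 4 results to a
    }
}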

Table of Contents:

[00:00] Diego and Charles construct the show (C++NOW, Some news, Auto-Vectorizer in VC11 compiler)
[04:45] Charles interviews VC backend compiler lead developer Jim Radigan and backend compiler PM Jim Hogg
[52:03] Diego and Charles destruct the show (longer than usual, but worth the delay - Lang.NEXT, C&B 2012, C++ + XAML + DX, WRL Documentation)


Follow the Discussion

  • This is very interesting. And while I would like to see how my existing code would benefit from this, the fact that VS11 will not target Win XP will prevent me from using this in production code. It's really a shame. Maybe in a few years I'll finally be able to start using a lot of the new features VS11 C++ compiler is offering. I agree that talking with the compiler guys is very informative!
  • HomerSimpian wrote:

    This is very interesting. And while I would like to see how my existing code would benefit from this, the fact that VS11 will not target Win XP will prevent me from using this in production code. It's really a shame. Maybe in a few years I'll finally be able to start using a lot of the new features VS11 C++ compiler is offering. I agree that talking with the compiler guys is very informative!

    Be fair, Windows XP is more than 11 years old now. Back then, if you wanted parallel computing in the box, you went for some horribly expensive single-core SMP system. We've come a long way since then. You have to vote folks off the island eventually if you want the platform to go anywhere, otherwise you end up with a flat growth curve... wait a minute... Perplexed

  • Charles (Welcome Change)

    Please don't turn this thread into more no-VC11-targeting-XP gripes. It has nothing to do with the topic at hand. You can engage the XP topic all you care to on reddit, etc... Not here. OK?

    Please focus - and ask questions - on AutoVec, AutoP, and other VC11 compiler optimizations.

    Thanks,
    C

  • Fair enough. My gripe was that as much as I want to use this stuff, I cannot in a work setting, as we have to target XP.  But don't let that stop the progress in the compiler.  One day I'll be using it... just not soon enough Smiley

    I really do enjoy these topics and am happy that MS is tackling them.  How does AutoP work with C++ AMP?  Does AutoV/P work with C++11 iterations?  Does the compiler recognize only the 'for' keyword, or does it recognize other loop constructs such as 'while' or std::for_each()?

     

  • Very interesting show.  It's always good to get more performance-for-free features tossed in.

    I, for one, would very much like to see some special sessions taking a deep dive into the internals of auto-vectorization and the tricks used to vectorize code that doesn't obviously lend itself to that process.  The nature of my work often presents me with loop dependencies that do not vectorize/parallelize well.  Anything you can provide, either through human communication or through tooling, to help programmers code in more readily-vectorizable styles would be of value too.

    Whatever happened to project Phoenix anyway?  We haven't heard of it in a long time.  I assume it has been assimilated into other projects.  We can see pieces of it in things like Roslyn, but it seems like there was a lot more to it that hasn't popped back up.

     

    @Homer: AutoVec benefits any C++ code compiled with the VS11 compiler.  So, those parts of an AMP application that run on the CPU (ie, non-kernel routines) get that benefit.  As always, your mileage will vary: simple compute-bound loops that match the patterns that AutoVec recognizes can gain a good boost in performance; others will see no change.  How much does the performance improve overall?  See "Amdahl's Law".
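
    (For reference: Amdahl's Law says that if a fraction p of total run time is spent in the loops that vectorize, and those loops speed up by a factor s, the overall speedup is 1 / ((1 - p) + p/s). So loops accounting for 10% of run time, even at a perfect 4x, give only about a 1.08x overall gain.)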

    AutoVec will also recognize those while loops that can be transformed into an equivalent for loop.  For example: 

    int n = 0; 
    while (n < 99) 
    { 
       a[n] += b[n]; 
       ++n; 
    } 
    

    will vectorize nicely.
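
    (That is, the backend first rewrites it into the equivalent counted loop, something like:)

    for (int n = 0; n < 99; ++n)
    {
        a[n] += b[n];
    }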

    @ryanb: we are preparing a series of blogs that dig into some of AutoVec's "internals".  In particular, the shape of loops that AutoVec works on, and why.

    Phoenix? - yes, it still exists, and is being worked on, but it has been absorbed into other projects.

    Jim

     

  • Great interviews!

    Definitely, more compiler guys in the future sounds fantastic! Smiley

    I'll use this opportunity to ask a few questions -- feel free to sneak them into future interviews, although answers in the comments aren't bad at all, of course! I'll try to tie them to the video for better context Smiley


    0.  Loop vectorization limitations and requirements (the example for-loop; also reduce/sum around 26:50)
    Are there any divisibility requirements (XMM registers are 128-bit; what if we have more or less data than an amount divisible by this)? Is there support for automatic duplication/truncation/padding, or is such an irregular loop (i.e., one with a data length not satisfying the divisibility property) removed from the optimizer's considerations for now, or are there techniques to handle this?

    1. Beyond {SSE(1), SSE2, AVX}; in particular, SSE4.1 (mentioned around 22:50)

    a. Are there any plans for the SSE4.1 support?
    I admit I'm asking since I have a vested interest here, numerical linear algebra is quite useful in what I do and there are some interesting instructions in this set, such as:
    DPPS, DPPD (dot product a.k.a. inner product) // they're useful for a lot of applications, in fact: http://www.virtualdub.org/blog/pivot/entry.php?id=150
    // correspond to _mm_dp_ps, _mm_dp_pd intrinsics --  http://msdn.microsoft.com/en-us/library/bb514034%28v=vs.110%29.aspx

    Intel did a mini-benchmark some time ago demonstrating a speed-up of the dot product computation in the examples; already the SSE3 version (using HADDPS) was 26% faster, while the SSE4 version (using DPPS) was 72% faster than the base case:
    http://www.intel.com/technology/itj/2008/v12i3/3-paper/6-examples.htm

    This makes SSE4.x very exciting, AutoVec support would be great here!

    b. Another, perhaps more far-reaching question -- if/when this gets supported, will there be an integration with the STL, such that, say, std::inner_product would automatically make use of the above instructions where applicable?


    2. Comparison of the current features and future evolution thereof with GCC (benchmarking w/ GCC mentioned around 41:50)

    Some topics of interest:
    a. GCC Graphite comparison -- how does the AutoVec fare, relatively?

    Example: http://openwall.info/wiki/internal/gcc-local-build#Parallel-processing-with-GCC
    Features/flags" -floop-parallelize-all -ftree-parallelize-loops=8

    There's a nice discussion of some topics for GCC that could be interesting to relate to:
    - limitations // http://gcc.gnu.org/wiki/Graphite/Parallelization
    - behind the scenes // http://gcc.gnu.org/wiki/Graphite?action=AttachFile&do=view&target=graphite_lambda_tutorial.pdf
    What were the analogous implementation choices and the resulting limitations in AutoVec?

    // out of curiosity -- is the polyhedral model /* http://en.wikipedia.org/wiki/Frameworks_supporting_the_polyhedral_model */ also employed by AutoVec or is it something different here?

    b. Profile Mode // "Goal: Give performance improvement advice based on recognition of suboptimal usage patterns of the standard library."

    This is actually pretty cool and integrates nicely with the C++ STL -- e.g., if you try a sub-optimal insertion pattern with std::vector you'll get nice, human-readable advice suggesting std::list:
    http://gcc.gnu.org/onlinedocs/libstdc++/manual/profile_mode.html#manual.ext.profile_mode.using

    Is there a similar feature in plans?


    3. Compilation back-end parallelization and inlining

    Does the parallel compilation work with inlining? For instance, in the discussed case of the {main-foo-a1, main-bar-a2} call tree (around 39:20), if "foo" gets inlined or "a2" gets inlined (note the depth change), does the compiler have to recompile in either of these cases?


    4. Devirtualization (around 40:50) -- limits/changes.

    Some devirtualization was already available a while ago: http://msdn.microsoft.com/en-us/magazine/cc301407.aspx
    What are the most interesting changes in the current release / what limits have been pushed / what limits remain?


    Once again, thanks for the great episode!

  • STL

    > Another, perhaps more far-reaching question -- if/when this gets supported, will there be an integration with STL, such as, say, std::inner_product would automatically make use of the above instructions where applicable?

    One of my todos is to go through the STL and look for opportunities to make our algorithms more friendly to the autovectorizer.  I believe that this will be mechanically very simple, if all that's necessary is detecting raw pointers at compile-time (either all raw pointers, or just raw pointers to scalars) and using index-based instead of pointer-based loops.  Such compile-time logic is trivial in the Standard Library (we have tons and tons of it already, some to call memmove()/etc.), it's just that the STL has a lot of algorithms and auditing them to figure out which ones would benefit will take time.  I won't be able to get to this for VC11 because it's "nice to have" and I've been dealing with must-fix issues, but it is definitely on my radar.
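
    (To illustrate the kind of compile-time dispatch involved, here's a purely hypothetical sketch with invented names -- not actual STL code: detect raw pointers and take an index-based path.)

    #include <cstddef>
    #include <type_traits>

    // Purely hypothetical sketch, not actual STL code; 'add_n' is an invented name.
    template <typename It1, typename It2>
    void add_n_impl(It1 dest, It2 src, std::size_t n, std::true_type)
    {
        for (std::size_t i = 0; i < n; ++i)   // index-based loop: autovectorizer-friendly
            dest[i] += src[i];
    }

    template <typename It1, typename It2>
    void add_n_impl(It1 dest, It2 src, std::size_t n, std::false_type)
    {
        for (std::size_t i = 0; i < n; ++i, ++dest, ++src)   // generic iterator path
            *dest += *src;
    }

    template <typename It1, typename It2>
    void add_n(It1 dest, It2 src, std::size_t n)
    {
        // Dispatch at compile time on "both arguments are raw pointers".
        add_n_impl(dest, src, n,
            std::integral_constant<bool,
                std::is_pointer<It1>::value && std::is_pointer<It2>::value>());
    }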

    > Is there a similar feature in plans [to libstdc++'s profile mode]?

    I've heard of that, but I haven't heard of user experiences with it, nor am I convinced that it is actually useful in practice.

  • Windows XP is 11 years old, and VC is finally dropping support for it.

    SSE2 is also 11 years old, but VC is only starting to support vectorization.

    This begs the question: What changes in the landscape prompted Microsoft to make autovectorization a priority for VC11 now in 2012? Compared to other optimizations, why was the addition of autovectorization into VC not justified in previous releases? Especially considering that even the now 10-year-old Intel C++ 7.0 supported this performance feature.

    (20'30") VC11 recognizes non-unit stride array references. Does this imply that VC11 implements gather/scatter-style vectorization (movsd/movlpd + movhpd)?

    (23'30") VC11 is capable of replacing a loop with library calls. Besides memset/memcpy, what other idioms are recognized?

    (30'00") VC11 has an equivalent of Intel's SVML for vectorizing transcendental functions. What functions are covered? What are their accuracy in terms of ulps? How do they compare against SVML in performance? SVML requires the default floating point enviroment (rounding mode, etc.). Does VC11 have the same limitation?

    (35'20") The issue of data alignment was brought up. Does VC11 generate multiple versions of a loop when data alignment is unknown or simply use load/store sequences (movsd + movhpd or movupd) for unaligned data? How does it cater to microachitecture characteristics of processors of different generations and/or from different vendors, especially when their instruction set support is identical? For example, assuming only SSE2 support, pre-Nehalem Intel processors have great latencies with unaligned accesses, but K10+ AMD processors have no problem with that.

    (37'00") Charles asked about targeting GPU using pure C++ without language extensions. What is Microsoft's vision on OpenACC? Furthermore, OpenMP is expected to absorb OpenACC when the latter matures. Is Microsoft considering going beyond OpenMP 2.0 and supporting more declarative parallel programming models?

    (45'00") SPEC2006 was mentioned. How does VC11's generated code performance fare against state-of-the-art vectorizing compilers such as Intel C++ in SPEC2006? Besides benchmarks, how much benefit does autovectorization bring about when compiling Microsoft products?

     

  • felix9 (the cat that walked by itself)

    Maybe you could talk about MSTest-Native in the future. Smiley

  • felix9 (the cat that walked by itself)

    abcs wrote:

    This begs the question: What changes in the landscape prompted Microsoft to make autovectorization a priority for VC11 now in 2012?

    Perhaps the same reason as the whole 'C++ Renaissance': we need more performance when targeting mobile devices like phones.

     P.S. I believe Phoenix belongs to David Tarditi now.

  • C Huson

    Hi,

    Are you using runtime disambiguation of pointers? I remember that was a primary roadblock for our vectorizer targeting C and C++ ages ago. In your example above, if a, b and c are parameters of a method, you have to detect overlaps. (The same is true of auto-parallelization with relaxed memory models.) I was always surprised at the tricks people play with code.

    Fun to see someone else doing this after so long.

    Regards,
    Chris

  • Excellent interview and discussion.

    I'm encouraged by Jim's near-slip about getting a v2 of AV/AP out as quickly as possible after Dev 11 (in good agreement with Herb Sutter's hints that C++ interim releases may be coming).

    We harped on perf being important and this proves you listened.

    The tooling and logs Mr. Hogg imagines near the end will be critical to helping us know whether AV/AP is in effect, or how far off my code is from playing (ex. imagine a warning like "Autovectorization skipped for loop (line 245) in foo() because variable ptrA is not aligned on a 4 byte boundary...")

  • Chris

    I would like to point out that the compiler team has nothing to do with the CRT and MFC not supporting XP. That's the libraries team. However, the library developers don't make such decisions on their own, so the managers are responsible.

    Anyway, nice episode.

  • Robert

    I apologize if this is a stupid question, but this seems to imply that some level of thread synchronization will be needed. For example, if I've overloaded the index operator and it modifies some shared state, that member variable would need to be made atomic, otherwise the vectorization would try to modify it from four simultaneous locations. Am I misunderstanding this? I've never used something like this, so I can only imagine it functioning like four threads, and thus needing some level of synchronization.

  • @Matt.  Good questions.  I'll answer the easy ones, and leave the deeper ones for the upcoming blogs.

    0.  "divisibility".  AutoVec takes care of that.  So if we have a loop over an int[999 ] array, we vectorize the first 996 iterations into 996/4, and tidy-off with a scalar loop over the 3 elements left.

    1 a.  SSE4.1.  Maybe a glitch in what we said.  AutoVec does already make use of some SSE4.1 instructions.

    1 b.  See Stephan's reply above

    2.  GCC.  I think this was a misunderstanding that we did not explain clearly.  We were not talking about comparisons with GCC's auto-vectorizer.  Just talking about one of the tests in the Spec2006 benchmark, which is a compilation of some (old? - don't know) version of GCC.

    3.  Inlining.  If the inliner phase has decided that a particular callsite should be inlined, then AutoVec sets to work upon the function it is given.  In this sense, inlining and AutoVec work together.

    Jim

  • C Huson: Yes AutoVec of course includes runtime checks for aliasing (eg, where arrays a, b or c partially overlap).  C++ declares that such overlaps are legal, and so we HAVE to do this, else vectorization would give wrong results.
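
    In outline, the generated code guards the vector loop with a test along these lines (a simplified sketch for two n-element arrays, not actual compiler output):

    if (a + n <= b || b + n <= a)   // ranges [a, a+n) and [b, b+n) don't overlap
    {
        // ... run the vectorized loop ...
    }
    else
    {
        // ... fall back to the original scalar loop ...
    }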

    Jim

  • @Robert.  Nice question.  The explanation is a little involved.  Here goes:

    Compilers typically divide into two parts: a frontend that translates source text, such as C++, into some intermediate representation of the original program (VC++ uses tuples - think annotated, binary assembly code - as the intermediate rep), and a backend that consumes the tuples, optimizes, and generates corresponding machine code.

    So the original code might use std::vector<T>.  But the backend sees just tuples - where the C++ abstraction has been "lowered" to its concrete representation: a C-array, with a few extra locations used to track current size and capacity, and a method that's called when required to grow the array.  AutoVec can work on this, just as well as if it had been given a raw C-array.

    The same holds true for more exotic cases, such as when a user overloads the index operator to do something fancy.  By the time the code reaches the backend, it's been reduced to equivalent tuples.  We attempt to vectorize those tuples; that attempt either proves successful, and correct, or vectorization is not attempted, which is also correct.  All optimizations, including AutoVec, are always conservative, and thereby safe.
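
    For instance (an invented example), an index operator that updates shared state creates a dependence between loop iterations, so a conservative vectorizer would be expected to leave a loop like this one scalar:

    struct Counted
    {
        int data[1000];
        int hits;                                            // shared state
        int& operator[](int i) { ++hits; return data[i]; }   // side effect on every access
    };

    void fill(Counted& v)
    {
        for (int i = 0; i < 1000; i++)
            v[i] = i;   // after inlining, every iteration also increments v.hits
    }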

    Jim

     

  • @abcs.  Answers to some questions.  Some others I'll defer to future blogs.

    Does AutoVec perform gather/scatter?  Yes, under some circumstances.  For example, loops that reference a field in an array of structs.
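
    For instance (a made-up example), a loop that reads one field from an array of structs strides through memory sizeof(Particle) bytes at a time, so the elements must be gathered into a vector register:

    struct Particle { float x, y, z, mass; };   // invented example type

    void scale_masses(const Particle* p, float* weight, int n)
    {
        for (int i = 0; i < n; i++)
            weight[i] = p[i].mass * 9.8f;       // non-unit-stride reads of the mass field
    }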

    Vectorized math library?  Yes, we cover all of the functions in "math.h"

    Unaligned references?  We generate those versions of SSE instructions that support unaligned access.  (As I'm sure you know, checking statically which refs are aligned, in order to elide runtime alignment checks, is challenging).  Yes, glad to note that Nehalem+ microarchitectures have reduced the hit for unaligned accesses.

    Jim

  • How does AutoP compose with ConcRT? Will there be oversubscription?

  • @jimhogg: does VC correctly recognize the "exact overlap with restrict" idiom (even with tools like PGO)?

    void vadd1(T * restrict dest, const T * restrict src, const size_t n)
    {
        const T * s = dest == src ? dest : src;
    
        for (size_t i = 0; i != n; ++i) {
            *dest++ += *s++;
        }
    }


    s is either based on dest or based on src, meaning vadd1 supports either non-overlapping or exactly overlapping ranges (no partial overlap). If you wrote this instead:

    void vadd2(T * restrict dest, const T * restrict src, const size_t n)
    {
        for (size_t i = 0; i != n; ++i) {
            *dest++ += *src++;
        }
    }


    This code would be undefined when you pass in two exact overlapping ranges (i.e. vadd2 only supports non-overlapping ranges). In other words Visual Studio is not allowed to rewrite vadd1 as:

    const T * s = dest == src ? src : src;  // or, equivalently:
    const T * s = src;


    Because these expressions no longer carry the "based on dest" as the original vadd1 does and would be undefined for exact overlap.

  • Jay

    Nice work! I was wondering how the auto-parallelizer works. Does it create multiple threads and partition the workload over them? If so, how do you reduce the overhead of creating threads at every loop that is a candidate for the auto-parallelizer?

  • Charles (Welcome Change)

    @Jay: We didn't spend much time on AutoP, but I'd imagine Jim and company will be blogging about it. Poor Jim has many questions to answer - and a lot of real work to do, too Smiley

    C

  • Good to know software is finally making use of hardware that was available a decade ago.

    In the video, Jim R seems to imply that this helps Intel and ARM processors. Why did he miss AMD?

  • @msdivy - auto-vectorization works on AMD hardware too.  We just did not say so, since AMD and Intel chip architectures are so very similar.  (There is a handful of instructions unique to each, which we avoid generating.)

  • I read that article that he mentioned at the beginning...great stuff.

  • WaldemarWaldemar

    That was extremely cool!

  • Very interesting talk.  I'd love to see this worked into .NET as well.  Performance for free is always welcome.

  • So there was one thing I was thinking about during this whole interview which I don't think was addressed.  Basically, from what I've observed in the past, not all CPU instructions are created equal.  Some have a higher cost than others.

    I don't doubt that vectorizing and fitting multiple operations into a single specialized instruction is faster.  But is performing 4 addition operations (for example) in one instruction really 4 times faster than doing them one at a time?

  • @Steve Wortham . . . "But is performing 4 addition operations (for example) in one instruction really 4 times faster than doing them one at a time?"

    No!  As you suspect, the speedup achieved depends on several factors, including:

    1. The overhead of "loop mechanics" - increment counter, compare with limit, branch.  (Regular loop-unroll optimization attempts to reduce this very factor)
    2. How much computation goes on in the loop body.  If large, then it dominates item 1, improving the effective speedup.  Eg: float computation is heavier than analogous int computation - both vectorize, but the effective speedups differ
    3. Cache misses: if the arrays are large, then L1 cache misses can totally negate the optimizations otherwise achieved by vectorization
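
    To put rough numbers on these factors (invented purely for illustration): suppose a scalar iteration costs 1.0 unit of arithmetic plus 0.5 of loop mechanics, i.e. 1.5 units per element, while a 4-wide vector iteration pays roughly the same 1.5 units for 4 elements, i.e. 0.375 per element - the ideal 4x. Add a fixed 2.0 units per element of memory-stall time on a cache-missing array, though, and the comparison becomes 3.5 versus 2.375 per element - a speedup of only about 1.5x.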

    I'll add these as issues to take up later, in the blogs

     

  • @Derp.  This question ranges more broadly than auto-vectorization.  And I'm not sure I follow the question exactly.  But, some points/questions:

    • "restrict" is a C99 keword.  However, several/most C++ compilers have provided an analogous construct for several years.  For MSVC, it's __restrict (or __declspec(restrict))
    • with that nit out of the way, your first example specifies dest and src with restrict.  The compiler is not obligated to do anything with this assertion, but it may.  And if you call vadd1 with arrays that overlap, partially or in total, you just broke the restrict contract.  So compiler behavior is undefined - you may get the answer you would like; you may get an answer you dislike!

    Maybe I've misunderstood?  Certainly, if the first example did not use restrict, then I could see our discussion would be very different.

  • "we don't get into the math behind auto-vectorization. However, if this is something that really interests you, then we can get Jim to do a lecture on the internals"


    Please do this! I would be very interested!

  • LostInSpacebar (AdityaG) OMG VISTA FTW LOLZ!!1one

    Keep the C++ vids coming! 

    P.S. I love how awesomely cheesy Charles is sometimes... VECTORIZE!

  • felix9felix9 the cat that walked by itself

    btw, I believe Ted Neward talked a lot about the 'programming languages renaissance' around 2008

  • @jimhogg - Thanks for the response.  That makes some sense.

    So I'm sure you've thought about this.  But seeing as how C++ is used heavily in so many tools, libraries, and applications, the performance improvements you're enabling with these techniques are trickling down to power efficiency.  So you and your team are doing more to save the planet than Al Gore.  That's gotta feel good.

  • Jim, Jim and Charles - thanks for a great talk! I'd love to hear more from the compiler makers.

    Can argument decoration aid the vectorizer? 

    Maybe decorating an argument as __declspec(align(16)) can make the vectorizer load registers with aligned instructions?

    Does decorating arguments with __restrict have *any* impact currently?   Can you give an example of the effect it has?

  • Dave Abrahams

    Thanks for the mention of this year's non-profit C++Now! conference. I just wanted to mention to everyone that this year, in addition to general talks on advanced C++ and the Boost-related material, we have three keynote speakers and one whole week of C++11 tutorials. It's going to be epic.

  • Charles (Welcome Change)

    Ofek_Shilon wrote:

    Jim, Jim and Charles - thanks for a great talk! I'd love to hear more from the compiler makers.

    Can argument decoration aid the vectorizer? 

    Maybe decorating an argument as __declspec(align(16)) can make the vectorizer load registers with aligned instructions?

    Does decorating arguments with __restrict have *any* impact currently?   Can you give an example of the effect it has?



    Jim Radigan has agreed to another (deep) interview on the VC backend compiler. Filming in May. Thanks Jim!

    C

  • fdsf

    Nice work, but: when did Intel release the MMX extensions to the x86 architecture? 1996! How come it has taken so long for this kind of optimization to be implemented? Are these optimizations going to be picked up any time soon by the CLR team?

  • How does this vectorization handle alignment problems? What happens if I have an array not aligned to 16 bytes - can it still generate code that uses SSE?

  • @Lrdx

    Already discussed above:  "Unaligned references?  We generate those versions of SSE instructions that support unaligned access"

    Plumbing alignment checks into the compiler is not straightforward.  For example, a caller may correctly align his array, arr say, on a 16-byte boundary, but then call a function and pass it an argument of &arr[1].  Suddenly the callee must handle a pointer that is no longer 16-byte aligned.  The callee can check alignment at runtime, ok; but, back at compile time, it has to weigh whether to generate SSE instructions that assume unaligned data, or SSE instructions that assume aligned data (faster) - resulting in two versions of the code.
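
    A tiny made-up example of the problem:

    __declspec(align(16)) float arr[100];   // caller aligns arr on 16 bytes

    void callee(float* p);                  // compiled separately; p's alignment unknown

    void caller()
    {
        callee(arr);      // p happens to be 16-byte aligned
        callee(&arr[1]);  // same callee, but p is now only 4-byte aligned
    }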

  • Diego Villagra

    Hi, where can I find a good C++ 2010 tutorial? I have some knowledge of programming in C++, but I would like to increase it. Is there any free C++ tutorial you provide online that I can check out to start learning more and, in the future, succeed at creating my applications?

    Thank you very much for all your time!!
    Diego Villagra

  • @jimhogg: (I'm assuming the compiler extension __restrict uses the same exact semantics as specified in C99.)

    T foo;
    vadd1(&foo, &foo, 1);

    is NOT undefined. The C99 standard doesn't care what's assigned to a restricted pointer, it only cares about tracking the flow of "expressions based on restricted pointers" when it comes time to dereference them. Pay attention to the variable s in vadd1. Using the semantics specified in C99, s is treated as if it were "an expression based on the restricted pointer dest" or "an expression based on the restricted pointer src".

    e.g. This code is fine:
    T * restrict a = &whatever;
    T * b = a;
    *a = *b;

    But this code is not:
    T * restrict a = &whatever;
    T * restrict b = a;
    *a = *b;

    Because in the first case, b is treated as "an expression based on the restricted pointer a". In the second case, b is a new restrict pointer (and this only matters at dereference time, not at assignment time, i.e. if the last statement *a = *b weren't there, both would be fine).

    In the vadd1 example, we assign s (which is NOT a restrict pointer but an expression based on a restrict pointer) using the ternary operator to make it "an expression based on the restricted pointer src" or "an expression based on the restricted pointer dest", which is a subtle trick that makes this code valid for exact overlap.

    So my 4 questions were:

    1. Are the members of the Visual C++ compiler team aware of the "exact overlap using restrict" idiom and understand the subtlety it's based on?
    2. Is the Visual C++ optimizer written to NOT transform:
      T * restrict a = whatever1;
      const T * restrict b = whatever2;
      const T * c = (a == b ? a : b);
      *a = *c;

      As if you had written:
      T * restrict a = whatever1;
      const T * restrict b = whatever2;
      const T * c = b;
      *a = *c;

      What I mean is, does the Visual C++ optimizer algebraically simplify the expression (a == b ? a : b) to just b? If so, this is an invalid transformation because even though the values compare equal, the "based on-ness" of c would change from "either based on a or based on b" to "based on b" which would be wrong for exact overlap.

      In other words, C99 added an implicit property to pointers called "based on-ness" which compilers can use to augment their alias analysis code. If a compiler aggressively optimizes using restrict, it must carefully keep track of an expression's "based on-ness" in its optimization passes and NOT assume that because 2 pointer expressions evaluate to the same value, their based on-ness is the same too.

    3. If the team is aware of this, do they recognize this code as the programmer intending to inform the compiler that the code they are working on only handles exact overlap or complete non-overlap?

      In other words, I expect the compiler to recognize this idiom, elide the expression (a == b ? a : b) and then generate fast code which handles both exact overlap and complete non-overlap (and does something undefined with partial overlap).
    4. Is PGO aware of this idiom? I know it's notorious for screwing up tricky but valid code.

    My questions basically boil down to me wondering a. does Visual C++ handle "based on-ness" correctly in its optimization passes (especially PGO) b. does Visual C++ correctly handle the exact overlap idiom c. does Visual C++ generate great code for the exact overlap idiom (and if it generates great code, is it because it correctly recognized the exact overlap idiom or is it because it botched an optimization pass and generated code assuming no overlap which just happened to also work with exact overlap)?

  • @Derp - sorry for the late reply.  Let me see if I am following you right:

    If you are asking whether MSVC gets the right answer for the following snippet, the answer is yes - for both a Debug (/Od) and Release (/O2) build.  ie, it correctly handles both no-overlap, and exact-overlap.

    I'm not sure whether you are concerned that MSVC produces wrong answer in the presence of __restrict (we don't know of any).  Or whether we ignore opportunities for optimizations (as permitted by the standard) that __restrict makes possible?

    Jim

     

    int a[] = {1,2,3}; int b[] = {4,5,6};
    vadd1(a, b, 3); vadd1(a, a, 3); vadd1(b, b, 3);

     
  • Robin Davies

    Great feature! I'd very much like to see some breakdown on what can and cannot be vectorized. Conditionals? Strictly linear arithmetic? How far can this be pushed?

  • @Robin Davies:

    I've started a blog that discusses auto-vectorization in more depth.

  • @jimhogg:

    > If you are asking whether MSVC gets the right answer for the following snippet, the answer is yes - for both a Debug (/Od) and Release (/O2) build.  ie, it correctly handles both no-overlap, and exact-overlap.

    Okay, but did VC generate code which handles exact overlap because the compiler detected the exact-overlap idiom, or did the compiler generate code based on the assumption that the 2 arrays don't overlap at all, which just happens to work for exact overlap by coincidence?

    You can't answer this question by examining the assembly generated. You can only answer it by examining the passes inside the compiler and/or by asking the team members who implemented these passes if they're aware of the exact overlap idiom and if the passes they wrote implemented it correctly.

    e.g. if you ask a random compiler team member, "what is vadd1 supposed to do or why it was written that way?" could they answer "it adds arrays which either overlap exactly or don't overlap at all" and fully understand why?

    > I'm not sure whether you are concerned that MSVC produces wrong answer in the presence of __restrict (we don't know of any).  Or whether we ignore opportunities for optimizations (as permitted by the standard) that __restrict makes possible?

    My concern is that the standard uses very tricky wording and the exact overlap idiom is not that obvious, so I'm wondering if VC generated the correct code by design or by coincidence. I'm also trying to make the VC team members aware of this subtlety in the standard so they can continue to generate fast and correct code in the future.

  • Are there any plans for auto-vectorization in .NET JIT?



Comments Closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.