@chanmm: Slides for this presentation and a few others are available in PDF form at:
http://developer.amd.com/afds/pages/rebroadcast2.aspx, along with more presentations.
The future IS Fusion
Herb, for the record, I really dislike that (direct3d) is the keyword that enables the restriction. But yes, you're finally implementing my (old) idea of passing the compiler not only the architecture as a command-line target flag but also the device, so it knows the memory properties for execution: CPU or GPU. No matter what, there must be at least two dimensions, as you showed in the presentation. I believe BOTH must be passed to the compiler: first the architecture/execution style/depth of the computation, and then a second parameter describing which side of the scale the memory model sits on, large address spaces or small ones. Or let the guts of the compiler track the kinds of optimization it performs, such as auto-vectorization and loop-unrolling depth, and if it applies a certain combination of optimizations, it can tag the function as automatically restricted/enabled for candidate use in DirectCompute and AMP'd.
Can't wait for the day when the compiler will actually be able to infer both dials, the one for the architecture and the one for the memory model, turning them back and forth on a per-function basis, with everything linked together into a real masterpiece of a binary. It could work from code keywords, decorations, declarations, pragmas, or even the compiler's own auto-detection of valid optimizations during compilation. This may actually bear fruit where past research didn't succeed for lack of knowledge or resources, unlike now, when the timing is right.
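To make the two dials and the compiler-side tagging a bit more concrete, here is a rough sketch in plain C++ (not C++ AMP, and not how any real compiler does it). Every name in it (`ExecutionStyle`, `MemoryModel`, `Target`, `OptimizationRecord`, `kMinUnrollDepth`, `is_accelerator_candidate`, `infer_target`) is made up for illustration, and so are the unroll threshold and the pairing of the vector unit with the small-address-space side.

```cpp
#include <cassert>

// Hypothetical dial 1: the architecture / execution style of the target.
enum class ExecutionStyle { ScalarCpu, VectorUnit };

// Hypothetical dial 2: which side of the memory-model scale the device sits on.
enum class MemoryModel { LargeAddressSpace, SmallAddressSpace };

// Both dials together, as they might be handed to (or inferred by) a compiler.
struct Target {
    ExecutionStyle style;
    MemoryModel memory;
};

// What a compiler might record about the optimizations it applied to one function.
struct OptimizationRecord {
    bool auto_vectorized;
    int unroll_depth;
};

// Illustrative threshold only; an assumption, not taken from any real compiler.
constexpr int kMinUnrollDepth = 4;

// True when the recorded combination of optimizations would tag the
// function as a candidate for accelerator (restricted) use.
constexpr bool is_accelerator_candidate(OptimizationRecord rec) {
    return rec.auto_vectorized && rec.unroll_depth >= kMinUnrollDepth;
}

// Sets both dials for a single function based on its optimization record.
// Pairing VectorUnit with SmallAddressSpace is an assumption of this sketch.
inline Target infer_target(OptimizationRecord rec) {
    assert(rec.unroll_depth >= 0 && "unroll depth cannot be negative");
    if (is_accelerator_candidate(rec)) {
        return Target{ExecutionStyle::VectorUnit, MemoryModel::SmallAddressSpace};
    }
    return Target{ExecutionStyle::ScalarCpu, MemoryModel::LargeAddressSpace};
}

// Compile-time checks of the tagging rule.
static_assert(is_accelerator_candidate({true, 8}), "vectorized and deeply unrolled");
static_assert(!is_accelerator_candidate({false, 8}), "not vectorized");
```

In this sketch, `infer_target({true, 8})` lands on the vector-unit side, while `infer_target({true, 2})` stays on the CPU side because the unroll depth is below the made-up threshold. A real compiler would obviously need far more information than two fields to make that call.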
Maybe "restrict (vector_unit)" would be a more general, less product-advertising keyword to use instead of "direct3d", even if it relies 95% on the DX11 DirectCompute stuff, since generality = love just as much as an open standard = love.