DirectCompute Lecture Series 120: Basics of DirectCompute Application Development

Join James Fung from Developer Technology at NVIDIA as he covers the following advanced topics for DirectCompute:
Work distribution best practices
Compute shader code best practices
Algorithm selection best practices
Multi-GPU best practices
He also gives an example problem, scan/reduction, and then goes through various optimizations for that problem.
For more information on DirectCompute, download the PDC 2009 DirectCompute HOL, watch the DirectCompute Roundtable discussion, see the full DirectCompute lecture series, and download the slides for this lecture.
Where can we find the presentations?
The code is not visible in the videos.
The relevant links should be in the description to the right of the video. The link to the slide deck is https://code.msdn.microsoft.com/Project/Download/FileDownload.aspx?ProjectName=DirectComputeLecture&DownloadId=12929. The code that he uses is similar to the code included in the DirectCompute Hands-on Lab (HOL), which is also linked in the description. Hope that helps!
Here's an interesting question:
Won't we get the same problem on the GPU as on the CPU once all applications start using the GPU (contention, the GPU getting overloaded, which means delays, etc.)?
So, in other words, we are just moving or postponing the problem.
Do you have any thoughts about this?
I think overall it's letting the system make use of all its resources, GPU and CPU. Certainly it's possible that, down the line, applications could begin to contend for GPU resources. But there will be tasks that remain on the CPU and many that can make new use of the GPU (and often need it only intermittently), so overall it'll make better use of what's available.
Some slides (for example, #35 and #36) were apparently taken as-is from the older presentation by Mark Harris (http://developer.download.nvidia.com/compute/cuda/1_1/Website/projects/reduction/doc/reduction.pdf) and contain code from the CUDA version of the reduction optimization experiment.
In my opinion, a description of algorithm cascading along with Brent’s theorem is pretty important, but unfortunately it is missing from the current talk. Otherwise, great lecture, thank you James!
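For readers who have not seen algorithm cascading, here is a minimal CPU-side sketch of the idea in Python. It is not code from the lecture or from the Harris slides, and the names `cascaded_sum` and `num_threads` are made up for illustration. Each simulated thread first sums several elements sequentially, and only then do the per-thread partial sums go through a tree reduction. As I understand the Brent's theorem argument, using roughly N / log N threads that each sum about log N elements keeps the total work at O(N) while still finishing in O(log N) steps.

```python
def cascaded_sum(data, num_threads=8):
    """Simulate algorithm cascading: sequential per-thread sums, then a tree reduction.

    num_threads is a hypothetical stand-in for the number of GPU threads
    and must be a power of two so the tree halves evenly.
    """
    if num_threads < 1 or num_threads & (num_threads - 1):
        raise ValueError("num_threads must be a power of two")

    # Phase 1 (sequential part): simulated thread t sums elements
    # t, t + P, t + 2P, ... where P = num_threads.
    partial = [sum(data[t::num_threads]) for t in range(num_threads)]

    # Phase 2 (parallel part): tree reduction over the P partial sums.
    # On a GPU, each inner loop would be one parallel step across threads.
    stride = num_threads // 2
    while stride > 0:
        for t in range(stride):
            partial[t] += partial[t + stride]
        stride //= 2

    return partial[0]


total = cascaded_sum(list(range(1, 101)), num_threads=8)
```

With 100 elements and 8 simulated threads, phase 1 does 12 or 13 additions per thread and phase 2 needs only 3 tree steps, instead of the 7 steps a pure tree over 100 elements would take with most threads idle in the later steps.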