DirectCompute Lecture Series 210: GPU Optimizations and Performance
- Posted: Aug 04, 2010 at 8:31 AM
- 26,471 Views
- 5 Comments
Join James Fung from Developer Technology at NVIDIA as he covers the following advanced topics for DirectCompute:
- Work distribution best practices
- Compute shader code best practices
- Algorithm selection best practices
- Multi-GPU best practices
He also gives an example problem, scan/reduction, and then goes through various optimizations for that problem.
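For readers unfamiliar with the example problem: a reduction combines all elements of an array into a single value, such as a sum. The sketch below is not the code from the lecture. It is a CPU-side Python simulation of the tree-shaped pattern that GPU reductions commonly use, and the function name `tree_reduce_sum` is hypothetical.

```python
def tree_reduce_sum(values):
    """Sum a sequence with a tree-shaped (pairwise) reduction.

    Each pass of the while loop stands in for one parallel step in which
    every active "thread" i adds element i + stride into element i.
    """
    data = list(values)
    n = len(data)
    if n == 0:
        return 0
    stride = 1
    while stride < n:
        # On a GPU, all iterations of this inner loop could run concurrently,
        # followed by a barrier before the next stride.
        for i in range(0, n - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0]


if __name__ == "__main__":
    print(tree_reduce_sum([1, 2, 3, 4, 5]))
```

The number of passes grows with the logarithm of the input length, which is why this shape suits parallel hardware. The optimizations discussed in the lecture concern how such a pattern maps onto real GPU threads and memory, which this simulation does not model.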
For more information on DirectCompute, download the PDC 2009 DirectCompute HOL, watch the DirectCompute Roundtable discussion, see the full DirectCompute lecture series, and download the slides for this lecture.
Where can we find the presentations?
The code is not visible in the videos.
The relevant links should be in the description to the right of the video. The link to the slide deck is http://code.msdn.microsoft.com/Project/Download/FileDownload.aspx?ProjectName=DirectComputeLecture&DownloadId=12929. The code that he uses is similar to that included in the DirectCompute Hands-on Lab (HOL), which is also linked in the description. Hope that helps!
Here's an interesting question:
Won't we get the same problem on the GPU as on the CPU when all applications start using the GPU (contention, the GPU getting overloaded, which means delays, etc.)?
So, in other words, we are just moving or postponing the problem.
Do you have any thoughts about this?
I think overall it's letting the system make use of all resources, GPU and CPU. Certainly it's possible that, down the line, applications could begin to contend for GPU resources. But there will be tasks that remain on the CPU and many that can make new use of the GPU (and often need it only intermittently), so overall it'll make better use of what's available.
Some slides (for example, #35 and #36) were apparently taken as-is from an older presentation by Mark Harris (http://developer.download.nvidia.com/compute/cuda/1_1/Website/projects/reduction/doc/reduction.pdf) and contain code from the CUDA version of the reduction optimization experiment.
In my opinion, a description of algorithm cascading along with Brent's theorem is pretty important, but unfortunately it is missing from the current talk. Otherwise, great lecture, thank you James!
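As a rough illustration of the algorithm cascading mentioned in the comment above: each thread first sums several elements sequentially, and only the per-thread partial sums go through the parallel tree. Brent's theorem bounds the time on P processors by roughly W/P + D for total work W and depth D, which is the usual argument for using about n / log n threads. The sketch below is a CPU-side Python simulation under those assumptions, not code from the lecture or from the linked slides. The function name `cascaded_reduce_sum` and the default thread count are hypothetical choices.

```python
import math


def cascaded_reduce_sum(values, num_threads=None):
    """Simulate a cascaded reduction: sequential per-thread sums, then a tree."""
    data = list(values)
    n = len(data)
    if n == 0:
        return 0
    if num_threads is None:
        # Assumption: about n / log2(n) threads, so each thread sums
        # roughly log2(n) elements before the tree phase begins.
        num_threads = n // max(1, int(math.log2(n)))
    num_threads = max(1, min(num_threads, n))

    # Phase 1 (sequential within each thread): thread t sums the elements
    # t, t + P, t + 2P, ... where P is the thread count.
    partial = [sum(data[t::num_threads]) for t in range(num_threads)]

    # Phase 2 (parallel tree over the partial sums): each pass of the
    # while loop stands in for one parallel step.
    stride = 1
    while stride < len(partial):
        for i in range(0, len(partial) - stride, 2 * stride):
            partial[i] += partial[i + stride]
        stride *= 2
    return partial[0]


if __name__ == "__main__":
    print(cascaded_reduce_sum(range(1, 101)))
```

Compared with giving every element its own thread, this keeps the total number of additions proportional to n while the tree phase stays logarithmic in the thread count.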