Matrix Multiplication with C++ AMP
- Posted: Jul 03, 2012 at 5:38 PM
- 23,165 Views
- 8 Comments
Hi, I am Daniel Moth.
This screencast assumes knowledge of the C++ AMP API, i.e. the simple model and the tiled model. Here we take a common algorithm, matrix multiplication, and convert it from serial code on the CPU to code that runs on the GPU, first with the simple model and then with the tiled model.
To learn more please visit the C++ AMP blog, and we encourage C++ AMP questions in the Parallel Computing in C++ and Native Code MSDN forum.
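For readers who want something to start from before watching, here is a minimal sketch of the two ends of that conversion in plain standard C++. It is not the code from the screencast and it does not use C++ AMP. The function names `multiply_serial` and `multiply_blocked`, the flat row-major layout, and the parameter names are assumptions made for illustration. The first function is the usual serial triple loop. The second walks the output in TS x TS blocks and stages blocks of the inputs in small local arrays, which mirrors on the CPU the decomposition a tiled GPU version relies on. In C++ AMP the outer loops would be replaced by a `parallel_for_each` over the output extent, and the local arrays would be `tile_static`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// All matrices are row-major in flat vectors:
// a is M x W, b is W x N, and the result is M x N.

// Serial CPU baseline: one dot product per output element.
std::vector<float> multiply_serial(const std::vector<float>& a,
                                   const std::vector<float>& b,
                                   std::size_t M, std::size_t N, std::size_t W)
{
    assert(a.size() == M * W && b.size() == W * N);
    std::vector<float> out(M * N, 0.0f);
    for (std::size_t row = 0; row < M; ++row) {
        for (std::size_t col = 0; col < N; ++col) {
            float sum = 0.0f;
            for (std::size_t i = 0; i < W; ++i) {
                sum += a[row * W + i] * b[i * N + col];
            }
            out[row * N + col] = sum;
        }
    }
    return out;
}

// CPU-side blocked version (illustrative only, not C++ AMP).
// Each TS x TS output block accumulates products of TS x TS input blocks.
// This sketch assumes M, N and W are all multiples of TS.
template <std::size_t TS>
std::vector<float> multiply_blocked(const std::vector<float>& a,
                                    const std::vector<float>& b,
                                    std::size_t M, std::size_t N, std::size_t W)
{
    static_assert(TS > 0, "tile size must be positive");
    assert(a.size() == M * W && b.size() == W * N);
    assert(M % TS == 0 && N % TS == 0 && W % TS == 0);
    std::vector<float> out(M * N, 0.0f);
    for (std::size_t tile_row = 0; tile_row < M; tile_row += TS) {
        for (std::size_t tile_col = 0; tile_col < N; tile_col += TS) {
            for (std::size_t k = 0; k < W; k += TS) {
                // Stage one block of each input; stands in for tile_static storage.
                float local_a[TS][TS];
                float local_b[TS][TS];
                for (std::size_t r = 0; r < TS; ++r) {
                    for (std::size_t c = 0; c < TS; ++c) {
                        local_a[r][c] = a[(tile_row + r) * W + (k + c)];
                        local_b[r][c] = b[(k + r) * N + (tile_col + c)];
                    }
                }
                // Accumulate this block pair's contribution to the output block.
                for (std::size_t r = 0; r < TS; ++r) {
                    for (std::size_t c = 0; c < TS; ++c) {
                        float sum = 0.0f;
                        for (std::size_t i = 0; i < TS; ++i) {
                            sum += local_a[r][i] * local_b[i][c];
                        }
                        out[(tile_row + r) * N + (tile_col + c)] += sum;
                    }
                }
            }
        }
    }
    return out;
}
```

As a quick check, multiplying the 2 x 2 matrix {1, 2, 3, 4} by {5, 6, 7, 8} with either function should give {19, 22, 43, 50}. Comparing the two functions on larger inputs is a simple way to validate a tiled rewrite before timing it on your own hardware.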
Awesome
Tnx a lot for this videocast...
not niche!!
Thank you for this video lecture.
Great video!
It was very interesting to watch.
You should have shown the time it takes for each version at the end....
Now I have to write the code and try it. :)
Thanks!
@Spetum: @Aiboy: Thanks, glad you enjoyed it.
@Elad: Thanks, glad you found it interesting. Yes, I leave the time measurement to you so you can explore the benefits on your specific hardware.
very nice video! i have some spare time these days so i'm giving AMP a spin.
one question though: i'm not sure if i understood this correctly but does your tiled version require the input to be of even length (because of TS = 2)? I'm relatively new to AMP. comparing with what i know from using sse intrinsics for example (not really comparable, i know, but some concepts are at least similar) i'd expect some code for dealing with odd length vectors i.e. padding or computing the rest in a non-tiled way (painful). did i miss something here? im sure i did..
cheers!
martin
@martin w: Thanks, glad you enjoyed it!
The answer to your question is in this blog post and the links it points to: http://blogs.msdn.com/b/nativeconcurrency/archive/2012/02/26/divisibility-requirement-of-tiled-extent-in-c-amp.aspx
If you still have questions after reading those, please post to our forum: http://social.msdn.microsoft.com/Forums/en-US/parallelcppnative/threads
Many thanks for the awesome tutorials :)