C++ AMP core API introduction... from scratch

Hi, I am Daniel Moth.
This screencast assumes knowledge of the C++ AMP API, i.e. both the simple model and the tiled model. We take a common algorithm, matrix multiplication, and convert it from a serial CPU implementation to run on the GPU, first with the simple model and then with the tiled model.
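As a rough CPU-only illustration of that progression, the sketch below multiplies two row-major matrices first with the usual triple loop and then tile by tile. It is plain standard C++, not C++ AMP code and not the code from the screencast. The names `multiply_serial` and `multiply_tiled` and the tile size `TS` are assumptions made for this example. On the GPU each tile would be handled by a group of threads sharing `tile_static` memory; here the tiles are simply visited in loops, and the small local arrays stand in for that shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial baseline: C (M x N) = A (M x W) * B (W x N), all stored row-major.
std::vector<int> multiply_serial(const std::vector<int>& a,
                                 const std::vector<int>& b,
                                 int M, int N, int W) {
    std::vector<int> c(static_cast<std::size_t>(M) * N, 0);
    for (int row = 0; row < M; ++row) {
        for (int col = 0; col < N; ++col) {
            int sum = 0;
            for (int i = 0; i < W; ++i)
                sum += a[row * W + i] * b[i * N + col];
            c[row * N + col] = sum;
        }
    }
    return c;
}

// Blocked ("tiled") version of the same product. Hypothetical helper:
// it assumes M, N and W are all multiples of TS, mirroring the
// divisibility requirement of a tiled computation.
template <int TS>
std::vector<int> multiply_tiled(const std::vector<int>& a,
                                const std::vector<int>& b,
                                int M, int N, int W) {
    assert(M % TS == 0 && N % TS == 0 && W % TS == 0);
    std::vector<int> c(static_cast<std::size_t>(M) * N, 0);
    for (int tr = 0; tr < M; tr += TS) {          // tile row of C
        for (int tc = 0; tc < N; tc += TS) {      // tile column of C
            for (int k = 0; k < W; k += TS) {     // walk tiles along W
                // Stage one TS x TS tile of A and of B in local buffers
                // (a CPU stand-in for tile_static memory on the GPU).
                int ta[TS][TS];
                int tb[TS][TS];
                for (int r = 0; r < TS; ++r) {
                    for (int cc = 0; cc < TS; ++cc) {
                        ta[r][cc] = a[(tr + r) * W + (k + cc)];
                        tb[r][cc] = b[(k + r) * N + (tc + cc)];
                    }
                }
                // Accumulate this tile pair's partial products into C.
                for (int r = 0; r < TS; ++r)
                    for (int cc = 0; cc < TS; ++cc)
                        for (int i = 0; i < TS; ++i)
                            c[(tr + r) * N + (tc + cc)] += ta[r][i] * tb[i][cc];
            }
        }
    }
    return c;
}
```

Calling `multiply_tiled<2>(a, b, 4, 4, 4)` on two 4 x 4 matrices should give the same result as `multiply_serial(a, b, 4, 4, 4)`. The tiled form does the same arithmetic but reuses each staged tile element `TS` times, which is the property the tiled model exploits on the GPU.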
To learn more, please visit the C++ AMP blog. We also encourage you to post C++ AMP questions in the Parallel Computing in C++ and Native Code MSDN forum.
Awesome! Thanks a lot for this videocast...
not niche!!
Thank you for this video lecture.
Great video!
It was very interesting to watch.
You should have shown the time it takes for each version at the end....
Now I have to write the code and try it. :)
Thanks!
very nice video! i have some spare time these days so i'm giving AMP a spin.
one question though: i'm not sure if i understood this correctly, but does your tiled version require the input to be of even length (because of TS = 2)? I'm relatively new to AMP. comparing with what i know from using SSE intrinsics, for example (not really comparable, i know, but some concepts are at least similar), i'd expect some code for dealing with odd-length vectors, i.e. padding or computing the rest in a non-tiled way (painful). did i miss something here? i'm sure i did..
cheers!
martin
@martin w: Thanks, glad you enjoyed it!
The answer to your question is in this blog post and the links it points to: https://blogs.msdn.com/b/nativeconcurrency/archive/2012/02/26/divisibility-requirement-of-tiled-extent-in-c-amp.aspx
If you still have questions after reading those, please post to our forum: https://social.msdn.microsoft.com/Forums/en-US/parallelcppnative/threads
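A minimal sketch of the padding idea raised in the question, written in plain standard C++ rather than C++ AMP: the length is rounded up to the next multiple of the tile size, and each simulated thread checks that its global index is still inside the real data before touching it. The helpers `round_up` and `scale_in_tiles` are hypothetical names for this example only. As far as I recall, C++ AMP's `tiled_extent` offers `pad()` and `truncate()` for the same purpose, but treat that as an assumption and check the linked post and the documentation.

```cpp
#include <cassert>
#include <vector>

// Round n up to the next multiple of ts (assumes n >= 0 and ts > 0).
constexpr int round_up(int n, int ts) {
    return ((n + ts - 1) / ts) * ts;
}

// Doubles every element of v, visiting it in tiles of TS simulated threads.
// The padded range may overshoot v.size(), so each "thread" checks bounds.
// Returns how many padding threads were skipped by the guard.
template <int TS>
int scale_in_tiles(std::vector<int>& v) {
    const int n = static_cast<int>(v.size());
    const int padded = round_up(n, TS);
    int skipped = 0;
    for (int tile = 0; tile < padded; tile += TS) {
        for (int local = 0; local < TS; ++local) {
            const int global = tile + local;
            if (global >= n) {   // this thread exists only because of padding
                ++skipped;
                continue;
            }
            v[global] *= 2;
        }
    }
    return skipped;
}
```

With `TS = 2` and five elements, the range is padded to six, so exactly one simulated thread falls outside the data and does nothing. The alternative mentioned in the question, truncating to four and handling the last element separately, avoids the guard at the cost of a second code path.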
Many thanks for the awesome tutorials :)