parallel_for_each - C++ AMP - msdn mag companion part 3
- Posted: Apr 12, 2012 at 8:53 AM
- 2,773 Views
- 2 Comments
Hi, I am Daniel Moth.
This screencast is one part of a 4-part series accompanying my MSDN Magazine article that you can read online: A Code-Based Introduction to C++ AMP.
Please watch the screencasts that precede this part first, and then follow the links below to watch the ones that follow it.
To learn more, please visit the C++ AMP blog. We also encourage you to post C++ AMP questions in the Parallel Computing in C++ and Native Code MSDN forum.
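For readers who want something concrete next to the video, here is a minimal sketch of the kind of array addition discussed in the comments below: adding the corresponding elements of two arrays into a third with `parallel_for_each`. It is not the code from the screencast or the article. The function name `add_arrays` and the `HAVE_CPP_AMP` macro are hypothetical, and the serial fallback exists only so the sketch still compiles on toolchains that do not ship `<amp.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: use C++ AMP only when the toolchain actually ships <amp.h>.
#if defined(__has_include)
#  if __has_include(<amp.h>)
#    include <amp.h>
#    define HAVE_CPP_AMP 1  // hypothetical macro, local to this sketch
#  endif
#endif

// Hypothetical helper: returns the element-wise sum of a and b.
std::vector<int> add_arrays(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> sum(a.size());
    if (a.empty()) {
        return sum;  // an array_view needs a non-zero extent
    }
#ifdef HAVE_CPP_AMP
    using namespace concurrency;
    const int n = static_cast<int>(a.size());

    // array_view wraps existing host memory; data is copied to the
    // accelerator on demand.
    array_view<const int, 1> av(n, a);
    array_view<const int, 1> bv(n, b);
    array_view<int, 1> sv(n, sum);
    sv.discard_data();  // the initial contents of sum need not be copied over

    // One logical thread per element; restrict(amp) marks the lambda as
    // code that may run on the accelerator.
    parallel_for_each(sv.extent, [=](index<1> idx) restrict(amp) {
        sv[idx] = av[idx] + bv[idx];
    });
    sv.synchronize();  // copy the results back into sum
#else
    // Serial fallback when C++ AMP is unavailable.
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum[i] = a[i] + b[i];
    }
#endif
    return sum;
}
```

Calling `add_arrays` with two equally sized vectors returns their element-wise sum. The explicit `synchronize()` call makes the copy back to the host visible at a known point rather than relying on the `array_view` destructor.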
Nice video. Thank you for uploading these.
You know, the OpenCL 1.1 HelloWorld example from "OpenCL Programming Guide" by Aftab Munshi (editor of the OpenCL specification) is, after removing all the comments, 260+ lines plus a 10-line kernel, and it does exactly the same thing: adding the corresponding elements of two arrays and putting the results in a third array. After watching this video, I'm really questioning OpenCL's execution model and the need for all that explicit creation of contexts, command queues, program objects, memory objects and the like. If we can do the same thing with ~50 lines of code, and with the simplicity we see in this video, why should we even bother using an execution model like OpenCL's? A performance comparison between C++ AMP, OpenCL and CUDA would be great for answering that question (I haven't seen any yet). It would definitely tell us whether we really need a complicated execution model like OpenCL's, or maybe where we need such a model.
@AliKouhzadi: Glad you like it!
An even simpler example showing how productive you can be with C++ AMP is our "Hello World" example. We also have learning guides for those familiar with other programming models; please get them here: CUDA, OpenCL, DirectCompute.
The performance is comparable between all these approaches, and in our tests is not a factor for choosing one over the other, even now that the product is in Beta and we are still tuning the bits. Once we RTM, we invite anyone to measure the performance difference between C++ AMP and any other approach and share their workloads and results on a variety of hardware for comparison.