parallel_for_each - C++ AMP - msdn mag companion part 3

Hi, I am Daniel Moth.

This screencast is one part of a 4-part series accompanying my MSDN Magazine article that you can read online: A Code-Based Introduction to C++ AMP.

Please watch the screencasts that precede this part first, then follow the links below to the ones that come after it.

  1. Setup code - C++ AMP - msdn mag companion part 1
  2. array_view, extent, index - C++ AMP - msdn mag companion part 2
  3. parallel_for_each - C++ AMP - msdn mag companion part 3
  4. accelerator - C++ AMP - msdn mag companion part 4
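For orientation, here is a minimal sketch of the kind of computation the discussion below refers to: adding the corresponding elements of two arrays into a third with `parallel_for_each`. It is not the code from the screencast. The function name `add_arrays` is hypothetical, and the serial fallback branch is an assumption added only so the sketch also builds on compilers that do not ship `<amp.h>`.

```cpp
#include <cassert>
#include <vector>

// C++ AMP ships only with MSVC; fall back to a plain loop elsewhere (assumption
// made for portability of this sketch, not part of C++ AMP itself).
#if defined(_MSC_VER) && !defined(__clang__) && __has_include(<amp.h>)
#include <amp.h>
#define SKETCH_HAS_CPP_AMP 1
#endif

// Hypothetical helper: writes a[i] + b[i] into sum[i] for i in [0, n).
void add_arrays(int n, const int* a, const int* b, int* sum) {
#ifdef SKETCH_HAS_CPP_AMP
    using namespace concurrency;

    // array_view wraps host memory so the runtime can copy it to the
    // accelerator on demand.
    array_view<const int, 1> av(n, a);
    array_view<const int, 1> bv(n, b);
    array_view<int, 1> sv(n, sum);
    sv.discard_data();  // sum is write-only, so skip copying its old contents

    // One logical thread per element of sv.extent; restrict(amp) marks the
    // lambda as code that may run on the accelerator.
    parallel_for_each(sv.extent, [=](index<1> i) restrict(amp) {
        sv[i] = av[i] + bv[i];
    });

    sv.synchronize();  // copy the results back into sum
#else
    // Serial fallback with the same observable result.
    for (int i = 0; i < n; ++i) {
        sum[i] = a[i] + b[i];
    }
#endif
}
```

Calling `add_arrays(4, a.data(), b.data(), sum.data())` on three `std::vector<int>` buffers of length 4 fills `sum` with the element-wise totals. The lambda captures the `array_view` objects by value, which is how C++ AMP expects data to reach the kernel.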

To learn more, please visit the C++ AMP blog. We also encourage you to post C++ AMP questions in the Parallel Computing in C++ and Native Code MSDN forum.

Follow the Discussion

  • Nice video. Thank you for uploading these.

    You know, the OpenCL 1.1 HelloWorld example from "OpenCL Programming Guide" by Aftab Munshi (editor of the OpenCL specification) is, after removing all the comments, around 260+ lines plus a 10-line kernel, and it does exactly the same thing: adding the corresponding elements of two arrays and putting the results in a third array. After watching this video, I'm really questioning OpenCL's execution model and the need for all that explicit context creation, command queue creation, program object and memory object creation and the like. If we can do the same thing with ~50 lines of code, and with the simplicity we see in this video, why should we even bother using an execution model like OpenCL's? A performance comparison between C++ AMP, OpenCL and CUDA would be great to answer that question (I haven't seen any yet). It would definitely tell us whether we really need a complicated execution model like OpenCL's, or maybe where we need such a model.

  • @AliKouhzadi: Glad you like it!

    An even simpler example showing how productive you can be with C++ AMP is our "Hello World" example. We also have learning guides for those familiar with other programming models; please get them here: CUDA, OpenCL, DirectCompute.

    The performance is comparable between all these approaches, and in our tests it is not a factor for choosing one over the other, even now, while the product is in Beta and we are still tuning the bits. Once we RTM, we invite anyone to measure the performance difference between C++ AMP and any other approach, and to share their workloads and results on a variety of hardware for comparison.
