Tech Off Thread

From c++0x STL videos: Sharing experiments

  • User profile image

    Hello all. While watching the STL videos, some of us became interested in using SSE with C++ templates. Since Stephan did not cover allocators, I want to share some code with you guys.

    The last topic in the STL talks was shared_ptr. It is a very powerful tool that you can use to start working with aligned memory, which is essential for making good use of multimedia instructions. Another useful one is unique_ptr. The latter differs from the former in that it implements strict ownership, i.e. only one unique_ptr object can own the pointer at a time; it lets you use move semantics but hides the copy operations. Another nice thing about unique_ptr is that it has a specialization for arrays ( Type[] ), which makes life even easier.
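
    To make the ownership difference concrete, here is a minimal sketch. The helper names (make_floats, take_and_sum, ownership_moves, shared_owner_count) are hypothetical and only serve the illustration; it uses plain new[] rather than aligned memory to keep the focus on ownership.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical helper: returns `count` zero-initialized floats owned by a unique_ptr<float[]>.
std::unique_ptr<float[]> make_floats(std::size_t count)
{
    return std::unique_ptr<float[]>(new float[count]());
}

// Hypothetical helper: takes ownership by value, so a caller has to std::move into it.
float take_and_sum(std::unique_ptr<float[]> owned, std::size_t count)
{
    assert(owned != nullptr);
    float total = 0.f;
    for (std::size_t i = 0; i < count; ++i)
        total += owned[i];   // operator[] is provided by the Type[] specialization
    return total;            // `owned` goes out of scope here and calls delete[]
}

// Moving transfers ownership: the source is left empty.
bool ownership_moves()
{
    std::unique_ptr<float[]> first = make_floats(4);
    first[0] = 1.5f;
    // std::unique_ptr<float[]> copy = first;  // would not compile: copying is disabled
    std::unique_ptr<float[]> second = std::move(first);
    return first == nullptr && second[0] == 1.5f;
}

// shared_ptr, by contrast, can be copied; every copy shares ownership.
long shared_owner_count()
{
    std::shared_ptr<int> a = std::make_shared<int>(7);
    std::shared_ptr<int> b = a;   // copy is allowed
    return a.use_count();         // both a and b own the int
}
```

    Passing a unique_ptr by value, as take_and_sum does, is one common way to say "this function consumes the buffer"; the compiler then rejects any call that forgets std::move.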

    Let's go to a simple example: wrap an aligned block of memory for 16 floats and fill it from an unaligned array of 4 floats.

    #include <intrin.h>
    #include <malloc.h>
    #include <iostream>
    #include <ostream>
    #include <memory>
    #include <algorithm>
    using namespace std;
    int main(int argc, char* argv[])
    {
        __m128 a; // declared here, assigned below
        const char x = 'a'; // force an odd offset
        float data[] = {1.f, 2.f, 3.f, 4.f};
        // allocate and wrap storage for 16 floats, aligned to 16 bytes
        auto alignedBuffer = unique_ptr<float[], decltype(&::_aligned_free) >((float *)::_aligned_malloc(sizeof(float)*16, 16), ::_aligned_free);
        // load unaligned (note the little 'u' after 'load')
        a = _mm_loadu_ps(data);
        // store aligned data; the loop is unrolled by hand
        // Big note here: we are dealing with float[], so indices count floats;
        // take care if you use pointer arithmetic (obj.get())
        _mm_store_ps(&alignedBuffer[0], a);
        a = _mm_add_ps(a, a);
        _mm_store_ps(&alignedBuffer[4], a);
        a = _mm_add_ps(a, a);
        _mm_store_ps(&alignedBuffer[8], a);
        a = _mm_add_ps(a, a);
        _mm_store_ps(&alignedBuffer[12], a);
        // print all 16 floats; the end pointer is one past the last element
        for_each(&alignedBuffer[0], &alignedBuffer[0] + 16, [](float f)
        {
            cout << f << " ";
        });
        return 0;
    }

    unique_ptr and shared_ptr are safe to use inside a vector and other STL containers. You can make, for example, a vector of chunks of aligned data to be processed by your algorithm.
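
    A vector of aligned chunks could look roughly like the sketch below. All names in it (AlignedFree, AlignedChunk, make_chunk, make_chunks, is_aligned) are hypothetical. It assumes _aligned_malloc on Visual C++ and falls back to posix_memalign elsewhere, and it fills each chunk with its own index only so there is something to inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <stdlib.h>   // posix_memalign / free on POSIX systems
#include <utility>
#include <vector>
#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc / _aligned_free
#endif

const std::size_t kAlignment = 16;       // bytes, as required by aligned SSE loads and stores
const std::size_t kFloatsPerChunk = 16;  // same size as the buffer in the example above

// Deleter that matches the allocation function used in make_chunk.
struct AlignedFree
{
    void operator()(float* p) const
    {
#if defined(_MSC_VER)
        ::_aligned_free(p);
#else
        ::free(p);
#endif
    }
};

typedef std::unique_ptr<float[], AlignedFree> AlignedChunk;

bool is_aligned(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % kAlignment == 0;
}

// Allocates one 16-byte-aligned chunk of kFloatsPerChunk floats.
AlignedChunk make_chunk()
{
    void* raw = nullptr;
    const std::size_t bytes = sizeof(float) * kFloatsPerChunk;
#if defined(_MSC_VER)
    raw = ::_aligned_malloc(bytes, kAlignment);
#else
    if (::posix_memalign(&raw, kAlignment, bytes) != 0)
        raw = nullptr;
#endif
    if (raw == nullptr)
        throw std::bad_alloc();
    return AlignedChunk(static_cast<float*>(raw));
}

// Builds `count` chunks; chunk c is filled with the value c.
std::vector<AlignedChunk> make_chunks(std::size_t count)
{
    std::vector<AlignedChunk> chunks;
    chunks.reserve(count);
    for (std::size_t c = 0; c < count; ++c)
    {
        AlignedChunk chunk = make_chunk();
        assert(is_aligned(chunk.get()));
        for (std::size_t i = 0; i < kFloatsPerChunk; ++i)
            chunk[i] = static_cast<float>(c);
        chunks.push_back(std::move(chunk));  // unique_ptr is move-only
    }
    return chunks;
}
```

    Because the deleter is a stateless struct instead of a function pointer, each AlignedChunk is the size of a raw pointer, and the vector frees every chunk with the matching function when it is destroyed.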

    I'll come back to this thread later to refine it or add more examples. If you like it and wish to contribute, please feel free to add your 2¢.

  • User profile image

    I'll use this post to give some hints on how to debug the code above. While I'm taking screenshots (or making a video, thanks to Expression Encoder 4 'Express'), here are the steps as text:

    • Put a breakpoint somewhere (line 18, for example)
    • Start a debug session
    • Step Over until the unique_ptr is created, then use the Locals window to expand it and inspect the address it holds ( [ptr] )
    • Double-click the address and copy it (Ctrl-C)
    • Go to Debug > Windows > Memory > Memory 1, detach it to floating mode and increase its size (do not dock it)
    • In the Memory window, go to the Address box and paste (Ctrl-V)
    • Right-click the body and change the format to 32-bit Floating Point
    • Change Columns to 4
    • The memory at the address must be filled with Í (user committed writable) and some bytes before the address must contain í (user committed fill)*
    • Go to Debug > Windows > Registers (Alt+5) and detach it too
    • Right-click and choose SSE2, then resize the window

    Now, while you Step Over the code, you can see the SSE registers and the memory changing.


    *(see The Old New Thing for the symbols the compiler uses to mark the address space)
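
    If you prefer to check the same things from code rather than from the Memory window, a small helper along these lines can do it. This is only a sketch with hypothetical names (is_16_byte_aligned, dump_floats): it tests the alignment of an address and formats the floats four per row, like the Memory window set to 4 columns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// True when the address is a multiple of 16, as aligned SSE stores require.
bool is_16_byte_aligned(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Formats `count` floats, `columns` values per row, separated by single spaces.
std::string dump_floats(const float* data, std::size_t count, std::size_t columns = 4)
{
    assert(columns > 0);
    std::ostringstream out;
    for (std::size_t i = 0; i < count; ++i)
    {
        out << data[i];
        const bool end_of_row = (i + 1) % columns == 0 || i + 1 == count;
        out << (end_of_row ? '\n' : ' ');
    }
    return out.str();
}
```

    With the example from the first post you could call dump_floats(alignedBuffer.get(), 16) after the last store and send the result to cout; each row should then be double the previous one.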
