C9 Lectures: Stephan T Lavavej - Advanced STL, 1 of n

As promised, the great Stephan T. Lavavej is back! :) Tens of thousands of you have watched STL's (those are his initials, so that's what we call him) introductory series on the STL, or Standard Template Library. If you haven't, you should. This series, Advanced STL, will cover the gory details of the STL's implementation, so you will need to be versed in the basics of the STL, competent in C++ (of course), and able to pay attention! Stephan is a great teacher and we are so happy to have him on Channel 9, and C9 is the only place you'll find this level of technical detail regarding the internals of the STL. There are no books. There are no websites. This is Stephan taking us into what is uncharted territory for most of us, even those with a more advanced STL skill set.

In the first part of this n-part series, Stephan digs deeply into shared_ptr. As you already know (since you will have the prerequisites in place in your mind before watching this; remember, watch the intro series first), shared_ptr is a wrapper of sorts: it wraps a reference-counted smart pointer around a dynamically allocated object. shared_ptr is a class template (almost everything in the STL is a template, thus the name...) whose objects use reference counting to manage a resource of nearly any type (int, string, vector, etc.). A shared_ptr object effectively either holds a pointer to the resource that it owns or holds a null pointer. A resource can be owned by more than one shared_ptr object, and when the last shared_ptr object that owns a particular resource is destroyed, the resource is freed.
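
The ownership rule described above can be observed through use_count(). This is a minimal sketch, not taken from the video; the names OwnerCounts and observe_owner_counts are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Owner counts observed at three points in the life of one resource.
struct OwnerCounts {
    long after_creation;
    long after_copy;
    long after_copy_destroyed;
};

OwnerCounts observe_owner_counts() {
    OwnerCounts counts = {};
    std::shared_ptr<std::string> first = std::make_shared<std::string>("meow");
    counts.after_creation = first.use_count();        // one owner
    {
        std::shared_ptr<std::string> second = first;  // the copy shares ownership
        counts.after_copy = first.use_count();        // two owners
    }   // second is destroyed here; the string stays alive
    counts.after_copy_destroyed = first.use_count();  // back to one owner
    return counts;
}   // first is destroyed here, so the string is freed
```

Calling observe_owner_counts() should report 1, 2 and 1 owners in that order. A default-constructed shared_ptr holds a null pointer and owns nothing.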

You will also learn a lot about the beauty and the weirdness inside the STL. You should take Stephan's wisdom to heart and see if you can implement some of the patterns he shares with you in your own code, and you should of course take his advice about what NOT to do in your native compositions.

Welcome back, STL!!!

Tune in. Enjoy. Learn.

[Advanced STL]

Part 1 (shared_ptr - type erasure)

Part 2 (equal()/copy() - algorithm optimizations)

Part 3 (_ITERATOR_DEBUG_LEVEL, #pragma detect_mismatch, and /d1reportSingleClassLayout)

Part 4 (rvalue references v2.1 and associative container mischief)

Part 5 (deduplicator, using Boost.Bimap/Filesystem/ScopeExit) - see Stephan's deduplicate.cpp

Part 6 (container pretty printer) - see Stephan's pretty_printer.cpp

Follow the Discussion

  • Awesome, I've been really looking forward to this. Can't wait to see the rest of C++0x (language and library enhancements) make it into Visual Studio .NEXT.

  • Good Videos.

    Now watching.

    Thank you, S.T.L. :)

    Welcome.

  • Great video. Looking forward to the rest.

  • Marek

    You are great! In future I'd like to hear something about sorting, trees (maps) and some tricky algorithms.

  • Yes!  The new series is here!  And shared_ptr is a great place to start it.

  • Benjamin Kim

    So happy. Welcome back.

  • MikeK

    I thought that video was great.
    As for ideas for future videos, one on std::string would be great.

  • Finally some C++ goodness (took you a month and a half :P ;) ). Finally something I know will be worth the time. More C++0x please (btw Charles, I'm seeing a lot of system errors when trying to post comments today, is it fail fest over there? ;) :P )

  • Ben Craig

    This video was a bit easy for me, but I've dealt with shared_ptr implementations in detail before.
    Here are some topic ideas:
    How exceptions work under the hood (the exception object is stored on the stack!)
    How dynamic_cast works, especially in cross casting situations (I understand there are some nasty x64 hacks)
    How you prevent STL code bloat (are the void * + static_cast tricks still necessary)?

  • Hi STL, excellent video on shared_ptr. I always look forward to your STL videos. I have some STL questions I hope you can answer. A user of mine suggested that I implement iterators for iterating over elements in my open-source XML library. Is it possible to implement my own iterators to work with the STL algorithms in VC8, VC9 and VC10? It will be best if my custom iterators can work with algorithms in all STL implementations out there.

    I am thinking of rewriting my own next_combination algorithm to take advantage of bidirectional and random access iterators. Right now, it only uses the least-common denominator iterators (bidirectional iterators) which is slow because the algorithm has to increment the iterator 1 by 1, instead of 'jumping' to the required iterator. Is it possible to write my algorithm to work with all VC(8,9,10) STL iterators or all STL implementations? Are the iterator trait tags the same?

    Thanks!!

  • STL

    Thanks for watching, everyone!

    Marek> I'd like to hear something about sorting, trees (maps) and some tricky algorithms.

    Good ideas - I've definitely been planning to explore various algorithms, and our sorts and trees contain some of the most interesting machinery.

    MikeK> As for ideas for future videos, one on std::string would be great.

    std::string's Small String Optimization is worth looking at. And going through its support for move semantics would give me a chance to explain in detail the Standardization Committee's bug that we fixed right before VC10 RTM thanks to an observant customer. Also, I recently optimized string::resize() and string::erase(), and it might be useful to show what I did and how I looked at the generated assembly.

    Mr Crash> Finally some C++ goodness (took you a month and a half)

    Heh. The studio was still undergoing renovation in January, and also I've been very busy with VC11. Charles said that he wanted to get me in the studio every week, and I was all, "great idea, but I've got this day job that you might have heard about..." :->

    Ben Craig> This video was a bit easy for me, but I've dealt with shared_ptr implementations in detail before.

    I'll do my best to cover mind-bendingly complicated topics in the future. :->

    Ben Craig> How exceptions work under the hood (the exception object is stored on the stack!) How dynamic_cast works, especially in cross casting situations (I understand there are some nasty x64 hacks)

    (Un)fortunately, these things are beneath my level of abstraction, which is to say that they're deeply magical to me, I'm glad they just work, and I'm very glad that somebody else has to worry about their implementations. In particular, almost all of that machinery is in the compiler, not the libraries. It might appear that I know a lot about the compiler, but my knowledge mostly ends where the Standard ends.

    Ben Craig> How you prevent STL code bloat (are the void * + static_cast tricks still necessary)?

    We rely on /OPT:REF,ICF magic. That's basically guaranteed to merge stuff like vector<X *> and vector<Y *> (note: STL containers of owning raw pointers are leaktrocity, as I explained in the intro series, but STL containers of non-owning raw pointers are perfectly fine and sometimes useful). In 4 years of maintaining the STL, I haven't seen a single customer reporting code bloat problems.

    shaovoon> Is it possible to implement my own iterators to work with the STL algorithms in VC8, VC9 and VC10? It will be best if my custom iterators can work with algorithms in all STL implementations out there.

    Totally possible. That's the best thing about having an International Standard, and a library designed by a genius (Stepanov) with easy and efficient extensibility in mind.

    shaovoon> Right now, it only uses the least-common denominator iterators (bidirectional iterators) which is slow because the algorithm has to increment the iterator 1 by 1, instead of 'jumping' to the required iterator.

    That's what std::advance(), std::distance(), std::next(), and std::prev() are for. They're O(1) for random-access iterators and O(N) for weaker iterators. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf 24.4.4 [iterator.operations].
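
    As a rough illustration of the point above (a sketch, not code from the thread), the same call to std::advance() works for both vector and list iterators, and the iterator category tag tells an algorithm which kind it was given. The helper names element_at and is_random_access are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Returns the element n positions from the start of a container.
// The caller must ensure 0 <= n < size. std::advance() jumps directly for
// random-access iterators and steps one at a time for weaker ones.
template <typename Container>
typename Container::value_type element_at(const Container& c, std::ptrdiff_t n) {
    typename Container::const_iterator it = c.begin();
    std::advance(it, n);
    return *it;
}

// Reports whether an iterator advertises the random-access category.
template <typename Iterator>
bool is_random_access(Iterator) {
    typedef typename std::iterator_traits<Iterator>::iterator_category category;
    return std::is_base_of<std::random_access_iterator_tag, category>::value;
}
```

    An algorithm such as next_combination could call std::advance() or std::next() unconditionally and let the library pick the fast path when the iterators allow it.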

    shaovoon> Are the iterator trait tags the same?

    Yep, they're Standard.
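
    A minimal sketch of what such a custom iterator might look like, assuming a hypothetical Node type with a next_sibling link standing in for an XML element (none of these names come from shaovoon's library). The five nested typedefs are what std::iterator_traits reads.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

struct Node {                 // hypothetical stand-in for an XML element
    std::string name;
    Node* next_sibling;       // null for the last sibling
};

// Forward iterator over a chain of sibling nodes.
class NodeIterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef Node                      value_type;
    typedef std::ptrdiff_t            difference_type;
    typedef Node*                     pointer;
    typedef Node&                     reference;

    NodeIterator() : m_node(nullptr) {}                 // end iterator
    explicit NodeIterator(Node* node) : m_node(node) {}

    reference operator*() const { return *m_node; }
    pointer operator->() const { return m_node; }
    NodeIterator& operator++() { m_node = m_node->next_sibling; return *this; }
    NodeIterator operator++(int) { NodeIterator old(*this); ++*this; return old; }
    bool operator==(const NodeIterator& other) const { return m_node == other.m_node; }
    bool operator!=(const NodeIterator& other) const { return m_node != other.m_node; }

private:
    Node* m_node;
};

// Counts siblings with a given name using an unmodified Standard algorithm.
std::ptrdiff_t count_named(NodeIterator first, NodeIterator last, const std::string& name) {
    return std::count_if(first, last,
                         [&name](const Node& n) { return n.name == name; });
}
```

    Because the iterator only relies on Standard-mandated typedefs and operators, the same class should work with any conforming implementation of the algorithms.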

  • @STL: "I'll do my best to cover mind-bendingly complicated topics in the future. :->"

    Yes please!  :)  I do like to watch the "easier" stuff, but the mind-bending stuff is always fun too.  The deeper you dive the better.

    @Charles:  STL's comment brings another idea to mind, but I suspect it would be hard to pull off.  I would REALLY like to see a series with the compiler guys, digging into the gory details of the compiler.  (Maybe that is too proprietary to share.  I'm sure the compiler guys are busy too.)  Or if that is too specific, maybe a more general series on compilation-related topics, parsing, translation, etc.... whatever happened to Phoenix, C# compiler as a service, etc...   Just thinking out loud here.

    Ah, the download FINALLY completed (C9 is really slow tonight).  Off to watch this episode.

  • Charles

    Stephan, we want you every week, plus cohosting a monthly C++ TV, man :)

    C

  • Ashkan

    I've been always wondering why the C++ Standards Committee decided against adding intrusive_ptr to SC++L? A shared_ptr not only is sizeof( void* ) greater in size than intrusive_ptr, but it also requires an extra heap allocation for the ref count block (which can be avoided by using make_shared, but make_shared can not be used in scenarios where allocation needs to take place at a different site than pointer definition.) Besides, shared_ptr's approach to thread safety forces a design where reference counting must be done atomically regardless of whether the pointer is accessed by multiple threads or not. intrusive_ptr on the other hand, would allow for a design where such decisions could be made per object type. So my question is why?! Why don't we have a std::intrusive_ptr type like Boost?

  • Charles

    We could name it Native TV, but I don't want to offend my Native American brothers and sisters (and friends). Nor do I want to cause angst among the native developers out there who don't program in C++...

    C++ TV sound good, Niners?
    C

  • @Charles: Keep C++ TV video length at 40+ mins, ok? I can never wait till I get home to watch the STL videos, so I usually watch them at my workplace during lunchtime. 15 mins of lunch and 45 mins of STL goodness! :D

  • C++ TV sounds good to me Charles. 

    This is suddenly starting to sound very old-school 9.  Paging Robert Hess.  Somebody find Erica to read the news...  You should bring back the .Net Show as a weekly program (and not just for .net). 

  • Charles

    @ryanb: I like that idea. Dr. Hess?

    C

  • C++ TV! Awesome++

    Always excited to see videos on the STL by STL!!

  • fileoffset

    Interesting lecture.
    I think it would have been good if you went into the assignment or copy constructor of the shared_ptr so it's clear how the control block is passed around to the various copies of the shared/weak ptrs.
    Other than that it was an insightful view into the topic. It was also interesting how you used the preprocessor macros and the inclusion of the file multiple times to generate the different combinations of templates required (rather than using a code generation tool or similar).
    The fact it's included 11 times, does that mean that the STL version only supports up to 11 arguments for the constructor?

  • Allan Lindqvist

    Really cool stuff, I snuck a peek even though I'm only partway through the intro series :)

    Just on a C++ TV note, it would be really, really awesome to see more DirectX stuff on C9, and more stuff on managed C++ :) and maybe a better-together series with managed and unmanaged.

  • Charles

    @ryanb: That's good stuff, Ryan. As you say, the C++ compiler people are extraordinarily busy - but it's not inconceivable that we go and meet them, dig into how the front end and back end compilers parse, analyze, optimize, etc... There is a huge amount of stuff we need to do further up the stack at the language level as well.

    It's also clear that we should consider exploring the jewel in the haystack: the machine. After all, the notion of "native code" we interchangeably use when referring to C++ really means the high level syntax we humans compose in such a way that efficiently abstracts what the machine will eventually do with the processing instructions (machine code) created by the compiler (in C++, the back end compiler...). The argument on reddit about my stating that C/C++ is native code (in the description of the Mohsen and Craig interview) is entertaining, by the way. Love the passion out there :)

    C

    PS: Let's end this tangential conversation. We can move the C++ TV ideas to the Coffeehouse and leave the comments on this thread for the topic at hand: the STL's shared_ptr implementation. Stephan doesn't have much free time, so let's make it easy for him to parse this thread for related questions/comments. Also, if you feel compelled to debate the meaning of "native code" then Coffeehouse is the place. Thanks for your understanding :)

  • Here is something I used to wonder about for a long time. It is impossible to create an array of a user-defined type that has no default constructor (unless you explicitly initialize all elements via the array initializer, of course). How does std::vector pull it off even though it uses an array internally? So if it's not too trivial, you could talk about placement new and explicit destructor calls. (This would be a perfect place to discuss various memory management and object lifetime details/issues.)

    Also, I would love to see a guide through the implementation of unordered_set, provided there is any interesting "magic" going on.

    I guess <initializer_list> is not practical to talk about yet due to lack of support in VC, right?

  • JamesG

    Good presentation. Fix the blurry text problem with code in VS. Part of the time, I could not read the code.

  • Nicol Bolas

    One thing: why are shared pointers considered "advanced STL"? I know they're not as well-known as some of the other STL things, but they really shouldn't be considered advanced material.

    "How does std::vector pull it off even though it uses an array internally?"

    That's pretty easy. std::vector allocates bytes, not arrays. It doesn't internally do a "new ClassType[size]"; it just calls the allocator and asks for a block of memory with a size of "sizeof(ClassType) * size". When it adds an entry, it first constructs that piece of memory by calling a placement new, then issues the copy/move constructor.
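
    A minimal sketch of that idea, assuming a hypothetical NoDefault type and using raw operator new in place of an allocator. It is not how any particular std::vector is written, and it ignores exception safety.

```cpp
#include <cstddef>
#include <new>

// A type with no default constructor, so "new NoDefault[n]" would not compile.
struct NoDefault {
    explicit NoDefault(int v) : value(v) { ++live_count; }
    NoDefault(const NoDefault& other) : value(other.value) { ++live_count; }
    ~NoDefault() { --live_count; }
    int value;
    static int live_count;    // number of objects currently alive
};
int NoDefault::live_count = 0;

// Builds `count` objects in raw storage, sums their values, then tears
// everything down again.
int build_sum_and_destroy(std::size_t count) {
    // Step 1: get raw bytes; no constructors run here.
    void* raw = ::operator new(count * sizeof(NoDefault));
    NoDefault* elements = static_cast<NoDefault*>(raw);

    // Step 2: placement new constructs each object in the storage.
    for (std::size_t i = 0; i < count; ++i)
        new (elements + i) NoDefault(static_cast<int>(i) + 1);

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += elements[i].value;

    // Step 3: explicit destructor calls, in reverse order of construction.
    for (std::size_t i = count; i > 0; --i)
        elements[i - 1].~NoDefault();

    // Step 4: release the raw bytes; no destructors run here.
    ::operator delete(raw);
    return sum;
}
```

    Separating allocation from construction is also what lets a container reserve capacity without creating any elements.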

  • STL

    Ashkan> I've been always wondering why the C++ Standards Committee decided against adding intrusive_ptr to SC++L?

    I've been on the Committee's mailing lists for the last 4 years, but I haven't attended any of their meetings, so I can't answer this with perfect precision. (Also, while I worked on getting Dinkumware's implementation of TR1 into VC9 SP1, the design of TR1 happened before my time.) My understanding is that intrusive_ptr wasn't proposed for inclusion in TR1/C++0x, rather than being proposed and rejected. You might be able to get a more detailed answer by asking on the Boost mailing list.

    fileoffset> I think it would have been good if you went into the assignment or copy constructor of the shared_ptr so it's clear how the control block is passed around to the various copies of the shared/weak ptrs.

    Agreed - especially for the converting copy constructor from shared_ptr<Derived> to shared_ptr<Base>. However, I have to cram everything into 40-45 minutes, and there just wasn't time. I also had to spend some time explaining the overall series.
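
    For readers who want to see the converting copy constructor in action, here is a small sketch with hypothetical Base and Derived types (it shows observable behaviour only, not the library's internals).

```cpp
#include <memory>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};

struct Derived : Base {
    int id() const override { return 2; }
};

struct ConversionResult {
    int  id_through_base;   // result of a virtual call via shared_ptr<Base>
    long owners;            // owner count seen by the original shared_ptr
    bool same_object;       // both pointers refer to the same object
};

ConversionResult convert_derived_to_base() {
    std::shared_ptr<Derived> derived = std::make_shared<Derived>();
    // Converting copy constructor: shared_ptr<Derived> -> shared_ptr<Base>.
    // Both objects share one control block, so the count goes up by one.
    std::shared_ptr<Base> base = derived;
    ConversionResult result = {
        base->id(),
        derived.use_count(),
        base.get() == derived.get()
    };
    return result;
}
```

    The result should show a virtual call reaching Derived, two owners, and both pointers referring to the same object.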

    > The fact it's included 11 times, does that mean that the STL version only supports up to 11 arguments for the constructor?

    0 to 10. 10 is infinity. This Standard Library doesn't go to 11. :->

    NotFredSafe> So if it's not too trivial, you could talk about placement new and explicit destructor calls.

    I may be able to work that into a future part. My concern wouldn't be that it's trivial - it's actually rather complicated - but that it's not widely useful enough. Using Part 1 as an example, type erasure is an enormously powerful trick that can be used in lots of situations, and knowing about make_shared<T>()'s optimizations can help you to use the STL more effectively. Placement new seems very low-level, given the number of times I've had to explain it to people (not many). Still, when I start poking around the guts of containers I may have to mention it whether I like it or not.
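
    To make "type erasure" concrete, here is a deliberately tiny, single-threaded sketch of the general trick. It is an illustration under simplifying assumptions, not VC's shared_ptr: TinyShared, ControlBlockBase and ControlBlock are hypothetical names, the count is not atomic, and assignment and weak references are left out.

```cpp
// Non-template interface: TinyShared<T> only ever sees this type, so the
// deleter's type never appears in TinyShared<T>'s own type.
class ControlBlockBase {
public:
    ControlBlockBase() : m_owners(1) {}
    virtual ~ControlBlockBase() {}
    virtual void destroy_resource() = 0;
    long m_owners;            // plain long: this sketch is not thread safe
};

// Template implementation: remembers the concrete deleter type.
template <typename T, typename Deleter>
class ControlBlock : public ControlBlockBase {
public:
    ControlBlock(T* ptr, Deleter deleter) : m_ptr(ptr), m_deleter(deleter) {}
    void destroy_resource() override { m_deleter(m_ptr); }
private:
    T*      m_ptr;
    Deleter m_deleter;
};

template <typename T>
class TinyShared {
public:
    template <typename Deleter>
    TinyShared(T* ptr, Deleter deleter)
        : m_ptr(ptr), m_block(new ControlBlock<T, Deleter>(ptr, deleter)) {}

    TinyShared(const TinyShared& other)
        : m_ptr(other.m_ptr), m_block(other.m_block) { ++m_block->m_owners; }

    TinyShared& operator=(const TinyShared&) = delete;  // omitted for brevity

    ~TinyShared() {
        if (--m_block->m_owners == 0) {
            m_block->destroy_resource();   // virtual call reaches the deleter
            delete m_block;
        }
    }

    T& operator*() const { return *m_ptr; }

private:
    T*                m_ptr;
    ControlBlockBase* m_block;
};

// Two owners of one int; the lambda deleter should run exactly once.
int count_deleter_calls() {
    int calls = 0;
    {
        auto deleter = [&calls](int* p) { ++calls; delete p; };
        TinyShared<int> a(new int(42), deleter);
        TinyShared<int> b(a);   // same type as a, whatever the deleter was
    }
    return calls;
}
```

    The design choice worth noticing is that the virtual call hides the deleter's type behind ControlBlockBase, which is why two smart pointers built with different deleters can still have the same type.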

    > Also, I would love to see a guide through the implementation of unordered_set, provided there is any interesting "magic" going on.

    Some magic - actually, we've been squashing debug perf bugs there in VC11. (Debug perf isn't terribly important, except when it's so slow as to be un-debuggable!) I'm not too familiar with unordered_foo's machinery, but I could probably figure it out pretty quickly.

    > I guess <initializer_list> is not practical to talk about yet due to lack of support in VC, right?

    Correct, VC10 RTM doesn't support initializer lists. It contains a nonfunctional <initializer_list> header because I simply forgot to remove it. (I carefully scrubbed out Dinkumware's library support for initializer lists, but forgot about a whole header. Go figure.)

    JamesG> Fix the blurry text problem with code in VS. Part of the time, I could not read the code.

    What blurry text? I downloaded the High Quality WMV, viewed it at 100%, and forwarded to 30:28 - meow.cpp's Consolas font is ginormous (as intended), and I can even clearly read Intellisense's tooltip, which I feared would be invisible.

    Nicol Bolas> One thing: why are shared pointers considered "advanced STL"?

    Their *implementation* is advanced. Their interface is simple - I explained how to use them in Intro Part 3.

  • @JamesG:  Were you watching the streaming version?  Sounds like a smooth streaming issue.  There certainly wasn't anything blurry in the High Quality WMV.

     

  • "std::vector allocates bytes, not arrays. It doesn't internally do a "new ClassType[size]"; it just calls the allocator and asks for a block of memory with a size of "sizeof(ClassType) * size". When it adds an entry, it first constructs that piece of memory by calling a placement new, then issues the copy/move constructor."

    Yes, I know. That's why I said "used to wonder" and later mentioned placement new.

  • Matt

    You are an excellent presenter, Stephan.  This series is great to watch and follow even for more experienced developers.  Although I have been using the STL for years, its internals have always been intimidating.  But your clear presenting and expert knowledge give us mortals a much better understanding of the magic.  Keep up the good work.

  • Based on the previous STL videos and my interest in using templates to unroll loops, I have the following questions:

    1) What is the equivalent VS2010 C++ flag that corresponds to g++'s -ftemplate-depth=n ( http://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-ftemplate_002ddepth-156 )?  Can this flag be set in the Visual Studio's Project/Properties/Configuration Properties dialog box (or some other dialog box) or just at the command line?

    2) Since C++0x's maximum template instantiation depth seems to be implementation dependent ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Section 14.7.1 - Point 14 - Page 372), what is the maximum template instantiation depth of VS2010's templates?  Or is it template parameter dependent (i.e. compile-time stack dependent)?

    3) At compile-time, is there any way (even some weird compiler-dependent macro) to determine how deep into template instantiations we are without keeping track of that depth ourselves?

    4) for-loops seem to unroll themselves if the conditional expression can be determined at compile-time and if the number of iterations is small enough.  What is the maximum number of iterations that still allows for-loops to unroll?  In other words, when does for-loop unrolling end and assembly jumps begin?

    5) What improvements can be made to my loop unrolling code at the end of this post?

    Background:
    Previously, I asked if we could cover loop unrolling (esp. for assignment statements).  Normally, if I needed to repeatedly perform a few hundred thousand assignment statements (i.e. copying the buffer of images or video frames for processing) and I wanted to minimize the impact of the "i < size" and "++i" in that for-loop, then I would just write out a for-loop with a bunch of assignment statements in the for-loop body using a script and then copy & paste that code into the relevant cpp file by hand.  Of course, this manual for-loop unrolling assumed that the size of the loops (i.e. the size of the images or video frames) wasn't going to change from one compilation to another.  A few weeks back, I had to come up with something a little easier to work with since I was going to be dealing with a number of different buffer sizes (all still known at compile time ... no run-time querying).  With the help of pages 314-318 of C++ Templates: The Complete Guide, I ended up writing something like the following:

    #include <cstddef>
    #include <iostream>
    
    using namespace std;
    
    //=========================================================
    
    #ifdef _MSC_VER
    #define INLINE __forceinline
    #else
    #define INLINE inline
    #endif
    
    //primary template:
    template < typename ACTION_TYPE, typename TYPE, size_t n >
    struct unroll_loop_t {
        INLINE static void call ( TYPE * destination_ptr, TYPE const * source_ptr ) {
            ACTION_TYPE::call( destination_ptr, source_ptr );
            unroll_loop_t< ACTION_TYPE, TYPE, n - 1 >::call( destination_ptr + 1, source_ptr + 1 );
        }
    };
    
    //partial specialization template:
    template < typename ACTION_TYPE, typename TYPE >
    struct unroll_loop_t< ACTION_TYPE, TYPE, 1 > {
        INLINE static void call ( TYPE * destination_ptr, TYPE const * source_ptr ) {
            ACTION_TYPE::call( destination_ptr, source_ptr );
        }
    };
    
    //partial specialization template:
    template < typename ACTION_TYPE, typename TYPE >
    struct unroll_loop_t< ACTION_TYPE, TYPE, 0 > {
        INLINE static void call ( TYPE *, TYPE const * ) {
            // nothing
        }
    };
    
    //primary template:
    template < typename ACTION_TYPE, typename TYPE, size_t n >
    struct loop_t {
        INLINE static void call ( TYPE * destination, TYPE const * source ) {
            size_t const block_size = 512; // max number of iterations unrolled
            size_t const number_of_blocks = ( n / block_size ); // integer division
            size_t const partial_block_size = ( n % block_size );
            for ( size_t block = 0; block < number_of_blocks; ++block )
                unroll_loop_t< ACTION_TYPE, TYPE, block_size >::call(
                    &destination[ block * block_size ],
                    &source[ block * block_size ]
                );
            unroll_loop_t< ACTION_TYPE, TYPE, partial_block_size >::call(
                &destination[ number_of_blocks * block_size ],
                &source[ number_of_blocks * block_size ]
            );
        }
    };
    
    template < typename TYPE >
    struct assignment_t {
        INLINE static void call ( TYPE * destination_ptr, TYPE const * source_ptr ) {
            *destination_ptr = *source_ptr;
        }
    };
    
    //convenience function template:
    template < size_t n, typename TYPE >
    INLINE void assign( TYPE * destination, TYPE const * source ) {
        loop_t< assignment_t< TYPE >, TYPE, n >::call( destination, source );
    }
    
    //=========================================================
    
    void zeroize ( double *, size_t const );
    void capture ( double *, size_t const );
    void print ( char const *, double const *, size_t const );
    void print_is_equal ( char const *, double const *, double const *, size_t const );
    void process ( double *, size_t const );
    
    int main () {
        size_t const width = 640; // assume this comes in a .h file
        size_t const height = 480; // assume this comes in a .h file
        size_t const depth = 3; // assume this comes in a .h file
        size_t const size = width * height * depth; // assume this comes in a .h file
    
        double * image_buffer = new double[ size ];
        double * image_1 = new double[ size ];
        double * image_2 = new double[ size ];
    
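        // new double[ size ] leaves the elements uninitialized, so zero the
        // destination images before they are printed below
        zeroize( image_1, size );
        zeroize( image_2, size );

        capture( image_buffer, size );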
    
        print( "image_buffer", image_buffer, size );
        print( "image_1", image_1, size );
        print( "image_2", image_2, size );
    
        cout << endl;
        cout << "assign< size >( image_1, image_buffer );" << endl;
        cout << "assign< size >( image_2, image_buffer );" << endl;
        cout << endl;
    
        assign< size >( image_1, image_buffer );
        assign< size >( image_2, image_buffer );
        
        print( "image_buffer", image_buffer, size );
        print( "image_1", image_1, size );
        print( "image_2", image_2, size );
    
        cout << endl;
    
        print_is_equal( "image_1 == image_buffer", image_1, image_buffer, size );
        print_is_equal( "image_2 == image_buffer", image_2, image_buffer, size );
        print_is_equal( "image_1 == image_2", image_1, image_2, size );
    
        process( image_1, size );
        process( image_2, size );
    
        delete[] image_2;
        delete[] image_1;
        delete[] image_buffer;
        
        return 0;
    }
    
    void zeroize ( double * img, size_t const n ) {
        for ( size_t i = 0; i < n; ++i )
            img[ i ] = 0;
    }
    
    void capture ( double * img, size_t const n ) {
        // simulate capturing an image
        for ( size_t i = 0; i < n; ++i )
            img[ i ] = i + 1;
    }
    
    void print ( char const * name, double const * img, size_t const n ) {
        cout << name << ": [ ";
        size_t const max_length = 2;
        size_t const length = ( n <= max_length ? n : max_length );
        for ( size_t i = 0; i < length; ++i )
            cout << ( i == 0 ? "" : ", " ) << img[ i ];
        if ( length < n )
            cout << ", " << ( length + 1 == n ? "" : "..., " ) << img[ n - 1 ];
        cout << " ]" << endl;
    }
    
    void print_is_equal ( char const * str, double const * img_1, double const * img_2, size_t const n ) {
        bool is_equal = true;
        for ( size_t i = 0; i < n; ++i ) {
            if ( img_1[ i ] != img_2[ i ] ) {
                is_equal = false;
                break;
            }
        }
        cout << str << ": " << is_equal << endl;
    }
    
    void process ( double *, size_t const ) {
        // apply filters
    }
    
    This code can be compiled in g++ 4.5.2 using the following command line (assuming the code is in a file named main.cpp):
    g++ -o main.exe main.cpp -std=c++0x -O3 -Wall -Wextra -Werror
    or compiled in VS2010 using Warning Level 4.  For VS2010, it takes about two minutes to compile in Release mode if you are also producing the Assembly with Source Code ( /FAs ... or Properties / Configuration Properties / C/C++ / Output Files / Assembler Output ) as well.  This code produces the following output:
    image_buffer: [ 1, 2, ..., 921600 ]
    image_1: [ 0, 0, ..., 0 ]
    image_2: [ 0, 0, ..., 0 ]
    
    assign< size >( image_1, image_buffer );
    assign< size >( image_2, image_buffer );
    
    image_buffer: [ 1, 2, ..., 921600 ]
    image_1: [ 1, 2, ..., 921600 ]
    image_2: [ 1, 2, ..., 921600 ]
    
    image_1 == image_buffer: 1
    image_2 == image_buffer: 1
    image_1 == image_2: 1

    Thanks In Advance,
    Joshua Burkholder

  • STL

    Matt: Thanks! That's exactly what I'm trying to do here.

    Burkholder> 1) What is the equivalent VS2010 C++ flag that corresponds to g++'s -ftemplate-depth=n

    As far as I know, VC doesn't have a compiler option to control the maximum template instantiation depth.

    Burkholder> 2) Since C++0x's maximum template instantiation depth seems to be implementation dependent

    C++03 said "17 or more". C++0x says "1024 or more".

    Burkholder> what is the maximum template instantiation depth of VS2010's templates?

    VC10 RTM appears to believe that 500 is infinity:

    C:\Temp>type meow.cpp
    template <int N> struct X;
    
    template <> struct X<1> { };
    
    template <int N> struct X {
        X<N - 1> x;
    };
    
    template struct X<MEOW>;
    
    C:\Temp>cl /EHsc /nologo /W4 /c meow.cpp /DMEOW=500
    meow.cpp
    
    C:\Temp>cl /EHsc /nologo /W4 /c meow.cpp /DMEOW=501
    meow.cpp
    meow.cpp(6) : fatal error C1202: recursive type or function dependency context too complex
    ...
    

    Burkholder> Or is it template parameter dependent (i.e. compile-time stack dependent)?

    Increased template complexity may reduce this limit.

    Burkholder> 3) At compile-time, is there any way (even some weird compiler-dependent macro) to determine how deep into template instantiations we are without keeping track of that depth ourselves?

    No. I've never heard of any compiler having such an ability, and it would be extremely problematic.

    Burkholder> 4) for-loops seem to unroll themselves if the conditional expression can be determined at compile-time and if the number of iterations is small enough.  What is the maximum number of iterations that still allows for-loops to unroll?  In other words, when does for-loop unrolling end and assembly jumps begin?

    This is up to the optimizer and your optimization settings. Crazy magic happens here.

    Burkholder> 5) What improvements can be made to my loop unrolling code at the end of this post?

    Consider using SSE, etc. Video processing is a perfect scenario for vectorization.

    (Of course, for simple copying, just use memcpy()/memmove(). In fact, our implementation of std::copy() calls memmove() when it can get away with it - something I'm very likely to cover in future parts.)
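
    As a sketch of that suggestion (hypothetical function names, with std::vector standing in for the raw buffers above), both of the following copy a buffer of doubles without any hand-written unrolling. Whether std::copy() ends up calling memmove() is an implementation detail of the library in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies with the Standard algorithm; the library is free to pick a fast
// path for trivially copyable element types such as double.
std::vector<double> copy_with_algorithm(const std::vector<double>& source) {
    std::vector<double> destination(source.size());
    std::copy(source.begin(), source.end(), destination.begin());
    return destination;
}

// Copies raw bytes directly; valid here only because double is trivially
// copyable and the two buffers do not overlap.
std::vector<double> copy_with_memcpy(const std::vector<double>& source) {
    std::vector<double> destination(source.size());
    if (!source.empty())
        std::memcpy(&destination[0], &source[0], source.size() * sizeof(double));
    return destination;
}
```

    Either function could replace assign< size >() in the posted program; measuring both against the unrolled version would show whether the template machinery pays for its compile time.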

  • Philhippus

    Great lecture as ever from STL of the STL. The pace was spot on - if I needed something clarifying I could use the seek bar.
    The book 'C++ in Action' recommends using a leading underscore to name private data members (http://relisoft.com/book/lang/scopes/2local.html (scroll to bottom)) which I started doing but couldn't stand its ugliness after a while. Now I know there's an even stronger reason not to use this convention.
    I would like to see how the STL can be used to implement machine learning, search algorithm optimisation (e.g. iterative deepening, incurred cost estimation, etc) and other aspects in the AI field.
    I'll be happy with whatever direction you go in though...good work!!

  • WATCHER

    Would be nice if you got the watch window font to a size that can be seen. Be nice if you found out why the watch window was wrong, too, but I know it's too late to fix for 10. Maybe 11, then.

  • STL

    Philhippus> using a leading underscore to name private data members

    To clarify, only _Leading_underscore_capital and double__underscoreAnywhere names are reserved everywhere. _leading_underscore names are reserved in the global namespace, but users can use them in classes. (See N3225 17.6.3.3.2 [global.names].)

    In my opinion, _member and member_ are terribly ugly. I use m_foo for members, because the lifetime of a data member exceeds that of any individual member function, and it's important to be constantly reminded of that fact. (This isn't Hungarian notation, which is evil - that attempts to encode types into names.)

    Philhippus> I would like to see how the STL can be used to implement machine learning, search algorithm optimisation (e.g. iterative deepening, incurred cost estimation, etc) and other aspects in the AI field.

    I'm not familiar with those domains, sorry. As I explained back in Intro Part 1, the STL is a library for pure computation, so you get to figure out how to apply it to your field. :->

    (My Nurikabe solver was an attempt to demonstrate how the STL could be applied in a nontrivial program - but while I thought it was fascinating, I'm not sure how successful it was. It also took me weeks to write, something I can't easily do again. Even repurposing my code at home for data compression or font rendering would take a while.)

    WATCHER> Would be nice if you got the watch window font to a size that can be seen.

    I couldn't find an option for it. If somebody could find one, I'd be very grateful.

    WATCHER> Be nice if you found out why the watch window was wrong, too

    I think I'll reformat my laptop before filming Advanced Part 2. I may have messed with the visualizers in the past, but I thought I put everything back to its original state.

  • JamesGJamesG

    @ryanb:
    yes, I was watching the streaming version. I'll make it a point to look for a download next time.

  • CharlesCharles Welcome Change

    @JamesG: With Smooth Streaming, quality ranges from low to high depending on your network conditions. We are aware that this isn't a great experience when there's code on screen and your network isn't capable of a large data stream. The downloadable files are located under the Download section next to the inline player.

    C

  • John Melville-- MDJohn Melville-- MD Equality Through Technology

    Is "type erasure" another name for the GOF strategy pattern or is there a subtlety that I am missing here?

  • STLSTL

    Ugh, design patterns.  As far as I can tell (I'm looking at the book right now), "Strategy" means "customize behavior".  The STL does this in lots of places: functors given to algorithms, comparators given to maps, allocators given to containers.  Their bullet point "Strategies as template parameters." covers this.

    Now, compare how vector and shared_ptr handle custom allocators - they affect vector<T, MyAlloc<T>>'s type, but don't affect shared_ptr<T>'s type.  That's type erasure.
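
    A minimal sketch of that difference, assuming a made-up allocator named MyAlloc (it just forwards to std::allocator and exists only for illustration): the allocator shows up in vector's type, while a shared_ptr created through std::allocate_shared has exactly the same type as one created through std::make_shared.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical minimal allocator, for illustration only.
template <typename T>
struct MyAlloc {
    typedef T value_type;
    MyAlloc() {}
    template <typename U> MyAlloc(const MyAlloc<U>&) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <typename T, typename U>
bool operator==(const MyAlloc<T>&, const MyAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const MyAlloc<T>&, const MyAlloc<U>&) { return false; }

// The allocator is part of vector's type.
static_assert(!std::is_same<std::vector<int>,
                            std::vector<int, MyAlloc<int>>>::value,
              "vector's type depends on its allocator");

int use_erased() {
    auto plain  = std::make_shared<int>(1);
    auto custom = std::allocate_shared<int>(MyAlloc<int>(), 2);

    // The allocator has been erased: both are plain shared_ptr<int>.
    static_assert(std::is_same<decltype(plain), decltype(custom)>::value,
                  "shared_ptr's type does not depend on the allocator");

    plain = custom; // so they can be assigned to each other freely
    return *plain;
}
```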

  • felix9felix9 the cat that walked by itself

    @Charles: maybe you should use a 2-pane layout for lectures and show us the slides and code in another view, just like the PDC player or MSR lectures player. And there should be a formal section for download links to slides or other downloadable materials.

  • MariusMarius

    First: Thanks for a fantastic video, Stephan. Type erasure is used in many places now (std::function, boost::any...) and it's a fantastic idiom that helps decouple implementation details from interfaces.
    One question though: If the make_shared allocation allocates the object and the reference counting block together, do they also have to be freed together? If I have a very big object and no more "uses", but still weak links, will the memory not be freed?

  • CharlesCharles Welcome Change

    @felix9: In terms of the split-screen, possibly... Stephan, can you post your slides? I understand what you mean by formal, felix. Slides, code would live under Downloads. That's a good idea.

    C

  • MartyMarty

    Hi STL, love your videos, keep em coming :)

    I have a question that i think you can answer or at least clear up a bit.

    For some time now i have been confused about which of these to use for dynamic buffers that can be as small as 1 byte to over 500 megabytes and beyond, and modifiable. (But normally around 2 - 100 megabytes)
    vector<BYTE> buf; or
    unique_ptr<BYTE[]> buf(new BYTE[...]);

    Which one do i use and why. Performance is a priority.

  • STLSTL

    Marius> If the make_shared allocation allocates the object and the reference counting block together, do they also have to be freed together?

    Yep. When all of the shared_ptrs have been destroyed/reset/assigned/etc. the object will be destroyed, but the refcount control block containing space for the object will persist until all of the weak_ptrs have been destroyed/reset/assigned/etc.

    Marius> If I have a very big object and no more "uses", but still weak links, will the memory not be freed?

    Yep. This is the one scenario (big object, weak_ptrs) where traditional shared_ptr construction is better than make_shared<T>(). Of course, only sizeof(T) matters, not the size of any dynamically allocated memory it might contain - for example, vector<T> is small according to this metric (only 16 bytes in VC10 and 12 bytes in VC11).
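
    A small sketch of the observable half of this behavior, using a made-up Tracked type that records when its destructor runs. The object is destroyed as soon as the last shared_ptr goes away; the fact that the combined make_shared block stays allocated until the weak_ptr is gone cannot be observed portably, so it appears here only as a comment.

```cpp
#include <memory>

// Hypothetical type that records when its destructor runs.
struct Tracked {
    explicit Tracked(bool& flag) : destroyed(flag) { destroyed = false; }
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the object is destroyed when the last shared_ptr goes
// away, even though a weak_ptr still refers to the control block.
bool destroyed_while_weak_ptr_alive() {
    bool destroyed = false;
    std::shared_ptr<Tracked> sp = std::make_shared<Tracked>(destroyed);
    std::weak_ptr<Tracked> wp = sp;

    sp.reset(); // last owner gone: ~Tracked() runs here

    // wp is still alive, so the control block has not been deallocated.
    // With make_shared, that block also contains the (now dead) object's
    // storage, which is why sizeof(T) matters in this scenario.
    return destroyed && wp.expired() && !wp.lock();
}
```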

    Charles> Stephan, can you post your slides?

    They're here: http://cid-e66e02dc83efb165.office.live.com/view.aspx/docs/vis-1.1.pptx

    Marty> Which one do i use and why.

    Use vector<unsigned char> instead of unique_ptr<unsigned char[]> unless your scenario would strongly benefit from unique_ptr and you know exactly what you're doing. vector is much more powerful (and still insanely efficient), beginning with the fact that it knows its own length. vector stores 3 raw pointers compared to unique_ptr's 1, but that matters only when you've got a zillion of the things - and if your buffers are megabyte-size I can guarantee that you don't have a zillion of the things.

    Way back in Intro Part 1, I stressed that vector should be your container of first resort. That's still true.

  • MartyMarty

    STL: "Way back in Intro Part 1, I stressed that vector should be your container of first resort. That's still true."

    Yes i did see it.
    There are so many ways of doing the same thing with the stl, that it is sometimes hard / confusing to know what to use in which situation, etc.
    Your videos are helping in that area though. :)

    Is the above still true when you use that vector buffer, cast it to a structure (e.g. LPBITMAP), change values in it, and then save the buffer to a file again,
    and/or the other way: create a new vector to use as a temporary buffer, insert structure info/file headers, etc., and then save it to a file?

    "unique_ptr unless your scenario would strongly benefit from unique_ptr and you know exactly what you're doing"
    "...strongly benefit..." Can you elaborate on that ?

    I like to know both sides of the story before i choose a side.
    side1:pros/cons vs side2:pros/cons

    I like to be thorough, sorry if i'm a bother.

  • STLSTL

    Marty: Yep, still true. Also, vector works just fine with C-style APIs, where you can pass v.data() and v.size().
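
    For example, here is a minimal sketch of handing a vector to a C-style API. fill_buffer is a hypothetical stand-in for whatever pointer-plus-length function you actually need to call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C-style API: fills a caller-provided buffer with one value.
void fill_buffer(unsigned char* buf, std::size_t len, unsigned char value) {
    for (std::size_t i = 0; i < len; ++i) {
        buf[i] = value;
    }
}

std::vector<unsigned char> make_filled(std::size_t n, unsigned char value) {
    std::vector<unsigned char> v(n);        // the vector owns the memory
    fill_buffer(v.data(), v.size(), value); // the API sees pointer + length
    return v;                               // and the length travels with it
}
```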

    Marty> "...strongly benefit..." Can you elaborate on that ?

    Basically, if you have a zillion - like ten million - tiny buffers (so fixed overheads per buffer really matter), AND their sizes aren't known at compiletime (so you can't just use std::array), AND they're not growing/shrinking during execution (so you don't have to worry about performing reallocation yourself), AND you can determine their runtime size from other information you're already storing (otherwise, you'll need to store unique_ptr<T[]> and size_t, which is 8 bytes versus vector's soon-to-be 12). In that contrived case, unique_ptr might be better than vector.

    In all other cases, use vector. It's really that simple.

  • 哥哥

    Cannot watch this video.

  • CharlesCharles Welcome Change

    @哥哥: Where are you located? Do the Download links not work?

    C

  • MottiMotti

    Regarding your question about the pace of these shows, I listen to the audio while driving so the pace is fine for me. Perhaps if I were to watch the video it would be too slow but as it is I can usually follow everything that's going on without having too many fatal accidents.
    I do have a question regarding this show, when using make_shared the T must sometimes be destroyed before the reference counting block (if weak pointers exist). How is this managed? Do you use a char buffer with placement new and then call the destructor explicitly? If so how do you guarantee that the memory block to T is correctly aligned (is it as simple as it's the first part of the structure returned from new and therefore aligned to all types).

  • STLSTL

    Motti: See _Ref_count_obj in <memory>. Yes, we use placement new and explicit destructor calls. We also use a fancy bit of machinery called std::aligned_storage to guarantee alignment - as its name indicates, it's Standard machinery available for public consumption (by experts who know what they're doing).
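
    To give a feel for the technique (this is a simplified sketch, not the code in <memory>), here is a made-up ManualSlot class that holds raw aligned storage for one T and controls its lifetime by hand. Note that std::aligned_storage was later deprecated in C++23, so newer code might prefer an alignas'd byte array.

```cpp
#include <new>         // placement new
#include <string>
#include <type_traits> // std::aligned_storage, std::alignment_of

// Hypothetical holder: raw, suitably aligned storage for one T whose
// lifetime is controlled by hand.
template <typename T>
class ManualSlot {
public:
    ManualSlot() : m_alive(false) {}
    ~ManualSlot() { destroy(); }

    void construct(const T& value) {
        destroy();
        new (static_cast<void*>(&m_storage)) T(value); // placement new
        m_alive = true;
    }

    void destroy() {
        if (m_alive) {
            get().~T(); // explicit destructor call; storage stays put
            m_alive = false;
        }
    }

    bool alive() const { return m_alive; }
    T& get() { return *reinterpret_cast<T*>(&m_storage); }

private:
    typename std::aligned_storage<sizeof(T),
                                  std::alignment_of<T>::value>::type m_storage;
    bool m_alive;

    ManualSlot(const ManualSlot&);            // noncopyable
    ManualSlot& operator=(const ManualSlot&);
};
```

    The storage member outlives the T constructed inside it, which mirrors how the object can be destroyed while the surrounding block persists.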

  • Stephan: Excellent show!

    You have mentioned Standard machinery available to deal with object's alignment.
    It'd be great if you could cover the STL allocators and the new C++0x alignment specifiers with STL.
    Implementing an SSE-friendly-container guaranteed-alignment allocator, along the lines of estl:allocator discussed in N2271 ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html ) would be a fantastic example!

  • new2stlnew2stl

    @Matt_PD: In the comments on the previous video (the one about type_traits) I made a comment about std::aligned_storage. I'm almost sure I'm using it wrongly, but it worked fine and really does guarantee the alignment of subsequent data. To work with most multimedia instructions (SSE, AVX, etc.) the initial address must be aligned too. With the info from this series I'm now testing shared_ptr and unique_ptr with the compiler's aligned_alloc and aligned_free. That makes it easy to use std::vector to store, let's say, 4 or more small aligned buffers for audio/video processing, but it is compiler dependent. I believe using placement new and delete would help decouple/shield the code.
    Anyway, I'm loving the series, and the slides from STL are helping to demystify the source code we see when playing with the debugger. Let's hope the next version of VS makes it easier to add code for formatting debugger output.

  • I had posted this over in the C++ Blog.

     

    @Stephan

    To change the size of the text in the watch window do the following.

    Menu -> Tools -> Options..

    Options Dialog -> Environment -> Fonts and Colors

    Select the drop combo for Show settings for:

    Select Watch Windows

    Change the font size to what you want. This will make the lines of the watch window bigger.

  • ilcredoilcredo

    Keep them coming. The whole series is amazing. When you'll finish the STL you could put some BOOST lectures(i'm pretty sure you know that too). Good luck.

  • CharlesCharles Welcome Change

    Here are some topic ideas: How exceptions work under the hood...

     

    For SEH in Windows/Win32, specifically, this is the seminal paper on the topic of exceptions. Well worth a read if you haven't already: http://www.microsoft.com/msj/0197/exception/exception.aspx

    C

  • @STL:


    Burkholder> 5) What improvements can be made to my loop unrolling code at the end of this post?

    Consider using SSE, etc. Video processing is a perfect scenario for vectorization.

    (Of course, for simple copying, just use memcpy()/memmove(). In fact, our implementation of std::copy() calls memmove() when it can get away with it - something I'm very likely to cover in future parts.)

    Thanks for the AWESOME suggestions!!!  SSE, memcpy(), and memmove() are amazing!

    I'm a complete newbie to SSE ... but WOW ... it seems like using one instruction to load multiple floating point values into a 128-bit register and then using another instruction to store those values is a quicker way to go.  I have a few questions on SSE:

    1) Since I'm a newbie to SSE, I used the following type of code to copy memory from one place to another:

    #include <emmintrin.h>
    ...
    size_t const image_size = 640 * 480 * 3;
    double * image_buffer = new double[ image_size ];
    double * image_1 = new double[ image_size ];
    double * image_2 = new double[ image_size ];
    ...
    capture_image( image_buffer, image_size );
    ...
    __m128d sse_register;
    
    // Since two 64-bit doubles fit into one 128-bit sse register,
    // then our delta is 2.
    size_t const delta = 2;
    for ( size_t i = 0; i < image_size; i += delta ) {
        sse_register = _mm_load_pd( &image_buffer[ i ] );
        _mm_store_pd( &image_1[ i ], sse_register );
        _mm_store_pd( &image_2[ i ], sse_register );
    }
    In this case, is this the correct way to use SSE?  Or is there a better way?

    2) The following "normal" type of code (i.e. no _mm_xxxx_pd() stuff) also compiled and ran:

    #include <emmintrin.h>
    ...
    size_t const image_size = 640 * 480 * 3;
    double * image_buffer = new double[ image_size ];
    double * image_1 = new double[ image_size ];
    double * image_2 = new double[ image_size ];
    ...
    capture_image( image_buffer, image_size );
    ...
    // Since 128-bits is 2 * 64-bits, then ...
    size_t const img_size = image_size / 2;
    __m128 const * img_buffer = reinterpret_cast< __m128 const * >( &image_buffer[ 0 ] );
    __m128 * img_1 = reinterpret_cast< __m128 * >( &image_1[ 0 ] );
    __m128 * img_2 = reinterpret_cast< __m128 * >( &image_2[ 0 ] );
    for ( size_t i = 0; i < img_size; ++i ) {
        img_1[ i ] = img_buffer[ i ];
        img_2[ i ] = img_buffer[ i ];
    }
    
    Will this type of code (i.e. using __m128d * the same way I would double * or any other pointer) be valid in the future?  Or is this something that works in VS2010, but might not work in future versions?  ... If this will work in future versions, then what's up with all that _mm_xxxx_pd() stuff?

    3) I have no idea if I'm writing good or bad SSE code.  What are the suggested tutorials?  Are there any good books?

    Lastly, memcpy() and memmove() were faster at copying a single image than anything that I could write ... even with SSE and loop unrolling.  I could only beat memcpy() and memmove() when I took into account my specific situation ... copying a single image buffer into two images ... where I could use a single for-loop for both images (as above), vice a separate loop for each image.  So my question is:

    4) What is the "secret sauce" in memcpy() and memmove() that makes them so much faster?  Is the implementation of VS2010's memcpy() or memmove() available?  If so, where can I find that code?

    Definitely cover memmove() when you cover std::copy().

     

    Thanks In Advance,
    Joshua Burkholder

     

  • new2STLnew2STL

    If memory serves me well, since VS2008 memcpy and memmove already make use of SSE instructions (including cache bypass) when you compile in release mode and with SSE flags up (/arch:SSE2 or /arch:AVX etc), i ready about it on some specialized sites (new memory not nerve me :doh:)
    even string search functions can take advantages on SS4.2 (with specific str intrinsics).
    You don't need to move bytes manually in your code unless you're doing something specific, like moving and expanding YCbCr to RGB in the same pass.
    And if you like it, AVX (the new instructions from the 2nd generation of Core iX) is 256 bits wide.

  • new2STLnew2STL

    Sorry for the typing mistakes above. [old] "i ready about it on some specialized sites (new memory not nerve me :doh:) even string search functions can take advantages on SS4.2". [revised] "I read about it on some specialized sites (now memory not serve ...) ... SSE4.2". Another thing I want to add: you are using #include <emmintrin.h>, but you only need to include <intrin.h>. This header includes all the other headers and checks macros against architectures (some intrinsics are 64-bit or Itanium specific, and the macros null them out).

  • new2stlnew2stl

    you are using emmintrin.h, you need only use intrin.h (better I make an account to edit my posts :P). Sorry for the triple post.

  • @new2stl:Thanks for the information.  I will use the intrin.h header.

    Unfortunately, the machines that I'm writing for don't seem to have the AVX instruction set and its 256 bit registers.  Our machines are about three years old ... and it seems that AVX is relatively new.

    I am new to SSE.  The only info that I have read is the couple of MSDN Help webpages about MMX/SSE intrinsics and the one GCC webpage that I could find.  Where can I learn about how to program for the SSE instruction sets?  What are the good book titles?  What are the good websites/tutorials?

    FYI:  The reason that I am making copies of images is that I am at the start of a research project where I will get video streams from a couple of cameras.  I need to send each video stream to a couple of real-time algorithms that will execute concurrently in separate threads.  Since I don't know how destructive each algorithm will be to the image buffer, I just have another thread capture the images (i.e. fill the image buffer), make copies, and then let those algorithm threads loose on the copies ... where they can destructively edit those copies in complete isolation.  The real-time algorithms have changed a few times, so this way I have a system that works without the chance of one one thread stomping on another.  I'm sure that I'll revisit this copying-images code at the end of the project ... but by then, all the other algorithms will have been decided upon (and hopefully, set in stone).

     

    Thanks Again,
    Joshua Burkholder

  • new2STLnew2STL xkcd.com

    @Burkholder: (now I've made an account) Unfortunately I don't know of any book about SSE/AVX, at least none for programming (when you can find any, they are thousand-page manuals). Most of what I learned about intrinsics came from reading Gamasutra articles and Intel (a few times IBM) whitepapers. The Intel ones are complex but portable, as VS uses the same headers (GCC prefers to use vectorization classes).

    It is safe nowadays to use SSE2 code, and nearly safe to use SSE3. Some notable citizens: AMD CPU OpenCL and WARP (Windows Advanced Rasterization Platform). A footnote: intrinsics are mandatory when the compiler is targeting 64-bit.

    I need to find out whether there is any way to direct-mail or send a message to a Niner; then I can stop polluting the comments with off-topic posts.

  • @new2stl:Thanks for the information.  I will use the intrin.h header.

    Unfortunately, intrin.h does not seem to exist for GCC's g++ and MinGW's port of g++; however, emmintrin.h exists for both GCC's and MinGW's g++ ... and for Visual C++ 2010.  Here's the simple test that I ran:

    #include <iostream>
    #include <intrin.h>
    //#include <emmintrin.h>
    
    using namespace std;
    
    int main () {
        cout << sizeof( __m128d ) << endl;
        return 0;
    }
    
    ... and here's the command line:
    g++ -o main.exe main.cpp -std=c++0x -march=native -O3 -Wall -Wextra -Werror

    Joshua Burkholder

  • GordonGordon

    So when is the next video coming, i'm in withdrawal here.

  • theDUFFtheDUFF

    Well done STL@MSoft.  These videos are fantastic - as you said there are no books!
    I have a topic suggestion:
    Do you think it would be possible to cover allocators at some point? 
     
    SL99
     

  • new2STLnew2STL xkcd.com

    @Burkholder: Good that they added the XXXintrin.h headers; the last time I used gcc (MinGW) was a really long time ago.

     

    intrin.h is VS specific; it is just a huge all-in-one header that makes those architecture checks for you.

     

    On Intel's site, a guy talks about how his program (video processing, under en-us/blogs/2010/12/20/visual-studio-2010-built-in-cpu-acceleration/) got faster simply by using the SSE2 arch option in VS2010. And under (en-us/avx/) is Intel's source for articles; the page always points to the newest technology, but serves as a hub for the 'old' ones.

     

    All x86 instructions can be found here.

     

    FYI: I already worked on similar case, activating 720(4:3) composition on H323plus project.

  • @STL: I wasn't sure if you kept an eye on the c9 forum so here you go.

    I've created a thread for my problem:
    http://channel9.msdn.com/Forums/TechOff/C0x-vs2010-question-please-tell-me-what-im-doing-wrong

    What am i doing wrong ?
    Is it me or is it vs2010 ? Is it both ?
    Thanks for your time.

  • @Burkholder: Regarding SSE intrinsics learning materials -- see Agner's Optimization Manuals: http://www.agner.org/optimize/

    @new2stl: I've tried looking for your comments mentioning that on the "Standard Template Library (STL), 10 of 10" video (that's the one discussing type_traits) but I couldn't see them -- was I looking in the wrong place?

     

  • @Matt_PD:Thank you so much ... the optimizing_cpp.pdf is OUTSTANDING!

  • STLSTL

    Matt_PD> It'd be great if you could cover the STL allocators and the new C++0x alignment specifiers with STL.

    I might cover our allocator machinery. However, I can't cover C++0x features that aren't implemented in VC10!

    ilcredo> When you'll finish the STL you could put some BOOST lectures(i'm pretty sure you know that too).

    I'm familiar with some Boost libraries, and I might cover them in the future.

    Charles> For SEH in Windows/Win32, specfically, this is the seminal paper on the topic of exceptions.

    Disclaimer: Structured Exception Handling is totally different from C++ exception handling. (Implementation detail: VC implements C++ exceptions with SEH.)

    SEH is extremely low level, and in general programs shouldn't mess with it.

    Burkholder> I have a few questions on SSE:

    I know very little about SSE, other than the fact that it exists and what it's generally useful for.

    Burkholder> What is the "secret sauce" in memcpy() and memmove() that makes them so much faster?

    They have dedicated assembly implementations, and I believe they're constantly maintained by some combination of our compiler back-end devs and our Intel/AMD contacts. These assembly implementations know the fastest way to copy bytes from one location to another, which can be very processor-specific.

    Burkholder> Is the implementation of VS2010's memcpy() or memmove() available?

    My vague understanding is that there are actually 3 implementations of memcpy/memmove: assembly, "compiler intrinsic", and plain old C. I don't know how the compiler selects between the 3. See "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\crt\src\intel\memcpy.asm" and "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\crt\src\memcpy.c" for the first and third. The compiler intrinsic implementation, which I've been told exists, is a sequence of instructions that the compiler knows how to generate on demand.

    Gordon> So when is the next video coming, i'm in withdrawal here.

    Filming tomorrow. I finished setting my laptop up today.

    theDUFF> Do you think it would be possible to cover allocators at some point?

    Yes, although I'll have to think of something useful to say about them.

    Mr Crash> I wasn't sure if you kept an eye on the c9 forum so here you go.

    I don't monitor the Channel 9 forums, but I occasionally scan the MSDN Visual C++ forums.

    Mr Crash> What am i doing wrong ?

    It looks like you're trying to write a scope guard. I've written an implementation powered by std::function:

     

    C:\Temp>type meow.cpp
    #include <exception>  // for std::terminate()
    #include <functional> // for std::function
    #include <iostream>   // for std::cout in foo()
    #include <ostream>    // for std::endl in foo()
    #include <stdexcept>  // for std::runtime_error in foo()
    #include <utility>    // for std::forward()
    using namespace std;
    
    class HyperScopeGuard {
    public:
        template <typename F> explicit HyperScopeGuard(F&& f) {
            try {
                m_f = forward<F>(f);
            } catch (...) {
                try {
                    f();
                } catch (...) {
                    terminate();
                }
    
                throw;
            }
        }
    
        void dismiss() {
            m_f = nullptr;
        }
    
        ~HyperScopeGuard() {
            if (m_f) {
                try {
                    m_f();
                } catch (...) {
                    terminate();
                }
            }
        }
    
    private:
        function<void ()> m_f;
    
        HyperScopeGuard(const HyperScopeGuard&);
        HyperScopeGuard& operator=(const HyperScopeGuard&);
    };
    
    void foo() {
        HyperScopeGuard g0([]{ cout << "cute" << endl; });
        HyperScopeGuard g1([]{ cout << "fluffy" << endl; });
        HyperScopeGuard g2([]{ cout << "kittens" << endl; });
        HyperScopeGuard g3([]{ cout << "zombies" << endl; });
    
        cout << "Throwing exception." << endl;
    
        g3.dismiss();
    
        throw runtime_error("Too many puppies!");
    }
    
    int main() {
        try {
            foo();
        } catch (const runtime_error& e) {
            cout << "Caught exception: " << e.what() << endl;
        }
    }
    
    C:\Temp>cl /EHsc /nologo /W4 meow.cpp
    meow.cpp
    
    C:\Temp>meow
    Throwing exception.
    kittens
    fluffy
    cute
    Caught exception: Too many puppies!

     

    Notes:

    * I haven't extensively tested this, nor have I used it in production code.

    * Destructors must not emit exceptions, so if invoking m_f() in ~HyperScopeGuard() throws, we immediately terminate().

    * In HyperScopeGuard's constructor, storing f (via perfect forwarding) in m_f might throw, e.g. if the std::function tries to allocate memory and that throws bad_alloc. In that event, we invoke f() before rethrowing, so that we don't leak whatever we're trying to guard. There are a couple of subtleties here. First, f() itself might throw. That's bad (just like in HyperScopeGuard's destructor), so immediate termination is the answer. Second, there are subtleties involving moved-from functors, but I think I'm worrying about nothing there.

  • new2STLnew2STL xkcd.com

    @Matt_PD: My previous comments are a bit hard to track; only now am I using an account to post them. Plus, in old posts I wasn't writing code in comments because they were limited to 'wall-of-text'.

    On the video about type_traits you can use:

    #include <type_traits>
    typedef std::aligned_storage<16, 16>::type _m128_type_sz_align;

    With std::aligned_storage I can make the default allocators guarantee the alignment of subsequent data, but the start address still needs to be aligned too for SSE and SSE2. From inspecting the source code, allocators look for the info in std::aligned_storage when possible.

    Now, with a more in-depth understanding of std::XXX_ptr, I feel more comfortable appending a call to _aligned_free for the destructor.

    If @STL provides some more info about allocators, I believe the cycle will be complete by making a wrapper around _aligned_malloc.

    PS: all this is only necessary for MMX/SSE/SSE2 or for playing with cache lines; SSE3 and beyond include unaligned versions of the LD and MOV.
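
    As a portable alternative to the MSVC-specific _aligned_malloc/_aligned_free pair, here is a rough sketch that over-allocates a std::vector and uses the Standard's std::align to find a 16-byte-aligned start address inside it. The class name AlignedBytes and the helper is_aligned are made up for this example, and the alignment is assumed to be a power of two.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory> // std::align
#include <vector>

// Hypothetical helper: owns a byte buffer and exposes an aligned region of
// `size` bytes inside it. `alignment` must be a power of two.
class AlignedBytes {
public:
    explicit AlignedBytes(std::size_t size, std::size_t alignment = 16)
        : m_raw(size + alignment), m_ptr(nullptr) {
        void* p = m_raw.data();
        std::size_t space = m_raw.size();
        // std::align bumps p forward to the next aligned address, or
        // returns nullptr if `size` bytes no longer fit in `space`.
        m_ptr = static_cast<unsigned char*>(
            std::align(alignment, size, p, space));
    }

    unsigned char* data() const { return m_ptr; }

private:
    std::vector<unsigned char> m_raw;
    unsigned char* m_ptr;

    // Noncopyable: a copy's m_ptr would point into the original's buffer.
    AlignedBytes(const AlignedBytes&);
    AlignedBytes& operator=(const AlignedBytes&);
};

// Hypothetical check used to verify the result.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

    The extra `alignment` bytes are the price of not depending on a compiler-specific allocation function.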

  • @new2STL: Regarding "SSE3 and beyond include unaligned versions of the LD and MOV" -- would you happen to know what the costs are of using unaligned load/move in SSE3+?

     

    According to Intel, for SSE2, the costs are at least 40% slowdown (going up to possible 500%): "Empirical evaluation using a 2.8 Ghz Pentium® 4 processor system shows that an unaligned 16-byte load contained within one cache line (128 bytes) is only moderately slower–about 40%–compared to an aligned access. The cost rises sharply though when the 16-byte chunk crosses a cache line boundary. Such cache line splitting loads can be up to five times slower!"

    http://software.intel.com/en-us/articles/reducing-the-impact-of-misaligned-memory-accesses/

  • Why does the decltype of a function change from "void()" to "void(*)()" in the following code?

    Code:

    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    void f () {
        //
    }
    
    template < typename F >
    void check_similarity_1 ( F ) {
        cout << "is_same< F, void () >::value:   "
             << is_same< F, void () >::value << endl;// 0
        cout << "is_same< F, void(*)() >::value: "
             << is_same< F, void(*)() >::value << endl;// 1
    }
    
    template < typename F >
    void check_similarity_2 ( F f_ ) {
        static_cast< void >( f_ ); // to eliminate warnings
        cout << "is_same< decltype( f_ ), void () >::value:   "
             << is_same< decltype( f_ ), void () >::value << endl;// 0
        cout << "is_same< decltype( f_ ), void(*)() >::value: "
             << is_same< decltype( f_ ), void(*)() >::value << endl;// 1
    }
    
    int main () {
        cout << "is_same< decltype( f ), void () >::value:   "
             << is_same< decltype( f ), void () >::value << endl;// 1
        cout << "is_same< decltype( f ), void(*)() >::value: "
             << is_same< decltype( f ), void(*)() >::value << endl;// 0
        check_similarity_1( f );
        check_similarity_2( f );
        return 0;
    }
    
    The output of this code on both VC++2010 and g++ 4.5.2 is the following:
    is_same< decltype( f ), void () >::value:   1
    is_same< decltype( f ), void(*)() >::value: 0
    is_same< F, void () >::value:   0
    is_same< F, void(*)() >::value: 1
    is_same< decltype( f_ ), void () >::value:   0
    is_same< decltype( f_ ), void(*)() >::value: 1

    Thanks In Advance For Any Clarification,
    Joshua Burkholder

  • new2STLnew2STL xkcd.com

    @Matt_PD: SSE3 (and SSSE3) work better with the Core architecture (Core, Core2, Core i#). The Core architecture is different from Prescott (P4); it has a different way of handling cache and SIMD. Diary of X264 has some talks about cache and SIMD in Nehalem (Core i#).

    Here is an excerpt from Intel about SSE3's LDDQU: "... is a special 128-bit unaligned load designed to avoid cache-line splits. If the address of the load is aligned on a 16-byte boundary, LDDQU loads the 16 bytes requested. If the address of the load is not aligned on a 16-byte boundary, LDDQU loads a 32-byte block starting at the 16-byte aligned address immediately below the load request. It then extracts the requested 16 bytes. The instruction provides significant performance improvement on 128-bit unaligned memory accesses at the cost of some usage-model restrictions."

    If I find some numbers, I'll edit this post.

     

    @Burkholder: I could be wrong, but reading en.wikipedia.org/wiki/Decltype, semantic rule 3: the function void f(), as it is passed to decltype, is an lvalue, so decltype returns a reference to the function type (the "(*)" denotes an unnamed function type). You can see a lot of these declarations in OpenGL headers.

    void f() type is void (*) ()

  • STLSTL

    I filmed Part 2 today!

    Burkholder> Why does the decltype of a function change from "void()" to "void(*)()" in the following code?

    N3225 14.8.2.1 [temp.deduct.call]/2 describes how template argument deduction works by comparing a function parameter type P to a function argument type A:

    "If P is not a reference type:
    - If A is an array type, the pointer type produced by the array-to-pointer standard conversion (4.2) is used in place of A for type deduction; otherwise,
    - If A is a function type, the pointer type produced by the function-to-pointer standard conversion (4.3) is used in place of A for type deduction; otherwise,
    - If A is a cv-qualified type, the top level cv-qualifiers of A’s type are ignored for type deduction."

    The second bullet point applies here. When you pass an array or a function (like f) to a function template (like check_similarity_[12]) with a value parameter (like F f_), an array will decay to a pointer and a function will decay to a function pointer. These template argument deduction rules mirror how C and C++ work - any attempt to write a function that takes an array parameter or a function parameter is immediately and forcibly rewritten to take a pointer parameter or a function pointer parameter. (This rewriting is different from decay.) In C++ this is N3225 8.3.5 [dcl.fct]/5 "After determining the type of each parameter, any parameter of type “array of T” or “function returning T” is adjusted to be “pointer to T” or “pointer to function returning T,” respectively." and C has the same rule (C99 6.7.5.3/7). This is why array parameters are widely regarded to be a bad idea (anyone writing array parameters probably doesn't know what they're doing - fortunately almost nobody tries to write function parameters, which is why that rule is so obscure).

    The third bullet point, const-dropping, may seem weird but it makes sense. Given template <typename T> void foobar(T t), and const int c = 1729; calling foobar(c) deduces T to be int, not const int. That's because when passed by value, the constness of the source is unrelated to the constness of the destination. foobar()'s author may want to modify its copy of t. (Otherwise, they could write template <typename T> void foobar(const T t).)

    This stuff is somewhat subtle but fundamentally important, which is why I've explained at length - hopefully I haven't made things more confusing.

  • @new2STL: Since void f() isn't always void(*)(), let me clarify.  Why is the first std::is_same<...> in main() returning 1?

    is_same< decltype( f ), void () >::value:   1
    Similarly, why is the second std::is_same<...> in main() returning 0?
    is_same< decltype( f ), void(*)() >::value: 0
    It seems like this should be 0 then 1 (like the check_similarity_x() functions) rather than 1 then 0.  In order to get a 0 then 1 in main(), I have to do the following:
    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    void f () {
        //
    }
    
    template < typename F >
    void check_similarity_1 ( F ) {
        cout << "is_same< F, void () >::value:   "
             << is_same< F, void () >::value << endl; // 0
        cout << "is_same< F, void(*)() >::value: "
             << is_same< F, void(*)() >::value << endl; // 1
    }
    
    template < typename F >
    void check_similarity_2 ( F f_ ) {
        static_cast< void >( f_ ); // to eliminate warnings
        cout << "is_same< decltype( f_ ), void () >::value:   "
             << is_same< decltype( f_ ), void () >::value << endl; // 0
        cout << "is_same< decltype( f_ ), void(*)() >::value: "
             << is_same< decltype( f_ ), void(*)() >::value << endl; // 1
    }
    
    int main () {
        cout << "is_same< decltype( f ), void () >::value:   "
             << is_same< decltype( f ), void () >::value << endl; // 1
        cout << "is_same< decltype( f ), void(*)() >::value: "
             << is_same< decltype( f ), void(*)() >::value << endl; // 0
        cout << "is_same< decltype( &f ), void () >::value:   "
             << is_same< decltype( &f ), void () >::value << endl; // 0
        cout << "is_same< decltype( &f ), void(*)() >::value: "
             << is_same< decltype( &f ), void(*)() >::value << endl; // 1
        check_similarity_1( f );
        check_similarity_2( f );
        return 0;
    }
    
    Which produces the following output:
    is_same< decltype( f ), void () >::value:   1
    is_same< decltype( f ), void(*)() >::value: 0
    is_same< decltype( &f ), void () >::value:   0
    is_same< decltype( &f ), void(*)() >::value: 1
    is_same< F, void () >::value:   0
    is_same< F, void(*)() >::value: 1
    is_same< decltype( f_ ), void () >::value:   0
    is_same< decltype( f_ ), void(*)() >::value: 1
    Obviously, this is some rule that I didn't pay enough attention to when I was learning C++ ... or just didn't learn the right way ;) ... I'm just trying to figure out which rule this is.

    Joshua Burkholder

     

  • STL

    new2STL: Your explanation of what's happening with decltype is not correct.

    double (int): This is a function type, "function taking int and returning double".
    double (*)(int): This is a pointer to function type, "pointer to function taking int and returning double".
    double (&)(int): This is an lvalue reference to function type, "lvalue reference to function taking int and returning double".

    The (*) means "pointer". Here's a definition of a function pointer:

    double (*fp)(int) = &func;

    N3225 7.1.6.2 [dcl.type.simple]/4 specifies how decltype works:

    "The type denoted by decltype(e) is defined as follows:
    - if e is an unparenthesized id-expression or a class member access (5.2.5), decltype(e) is the type of the entity named by e. If there is no such entity, or if e names a set of overloaded functions, the program is ill-formed;
    - otherwise, if e is a function call (5.2.2) or an invocation of an overloaded operator (parentheses around e are ignored), decltype(e) is the return type of the statically chosen function;
    - otherwise, if e is an lvalue, decltype(e) is T&, where T is the type of e;
    - otherwise, decltype(e) is the type of e.
    The operand of the decltype specifier is an unevaluated operand (Clause 5)."

    Both decltype(f) and decltype(f_) activate bullet point #1, "unparenthesized id-expression", and return the type of f/f_ without modification.

    Bullet point #3 applies to things like decltype(ptr[index]). In this case, it turns out that adding an lvalue reference is desirable.

  • @STL: Looks like we were both typing at the same time.  ;)  Thank you very much for the explanation!  I'm getting closer to understanding.

    Is there any way to go from a "void(*)()" back to "void()"?  Or to avoid this rewriting?

    Lastly, why is the decltype( f ) in my first couple of std::is_same<...> lines in main() __not__ being rewritten to a pointer to function type once it is inside std::is_same<...>?  In other words, why is the following output being produced from main()?

    is_same< decltype( f ), void () >::value:   1
    is_same< decltype( f ), void(*)() >::value: 0
    is_same< decltype( &f ), void () >::value:   0
    is_same< decltype( &f ), void(*)() >::value: 1 

    Thanks In Advance For Clarification ... And Your Patience With My Dense-ness,
    Joshua Burkholder

  • @STL: Looks like we were typing at the same time again!  ;)

    I'm still not sure why the rewriting in the std::is_same<>'s in main isn't taking the function type down to a pointer to function type, but I figured out how to go from "void(*)()" back to "void()" ... which was so simple I'm a little embarrassed I asked the question ... just a simple typename std::remove_pointer<...>::type.  Here's the code:

    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    void f () {
        //
    }
    
    template < typename F >
    void check_similarity_1 ( F ) {
        cout << "is_same< F, void () >::value:   "
             << is_same< F, void () >::value << endl; // 0
        cout << "is_same< F, void(*)() >::value: "
             << is_same< F, void(*)() >::value << endl; // 1
    }
    
    template < typename F >
    void check_similarity_2 ( F f_ ) {
        static_cast< void >( f_ ); // to eliminate warnings
        cout << "is_same< decltype( f_ ), void () >::value:   "
             << is_same< decltype( f_ ), void () >::value << endl; // 0
        cout << "is_same< decltype( f_ ), void(*)() >::value: "
             << is_same< decltype( f_ ), void(*)() >::value << endl; // 1
    }
    template < typename F >
    void check_similarity_3 ( F ) {
        cout << "is_same< typename remove_pointer< F >::type, void () >::value:   "
             << is_same< typename remove_pointer< F >::type, void () >::value << endl; // 1
        cout << "is_same< typename remove_pointer< F >::type, void(*)() >::value: "
             << is_same< typename remove_pointer< F >::type, void(*)() >::value << endl; // 0
    }
    
    template < typename F >
    void check_similarity_4 ( F f_ ) {
        static_cast< void >( f_ ); // to eliminate warnings
        cout << "is_same< typename remove_pointer< decltype( f_ ) >::type, void () >::value:   "
             << is_same< typename remove_pointer< decltype( f_ ) >::type, void () >::value << endl; // 1
        cout << "is_same< typename remove_pointer< decltype( f_ ) >::type, void(*)() >::value: "
             << is_same< typename remove_pointer< decltype( f_ ) >::type, void(*)() >::value << endl; // 0
    }
    
    int main () {
        cout << "is_same< decltype( f ), void () >::value:   "
             << is_same< decltype( f ), void () >::value << endl; // 1
        cout << "is_same< decltype( f ), void(*)() >::value: "
             << is_same< decltype( f ), void(*)() >::value << endl; // 0
        cout << "is_same< decltype( &f ), void () >::value:   "
             << is_same< decltype( &f ), void () >::value << endl; // 0
        cout << "is_same< decltype( &f ), void(*)() >::value: "
             << is_same< decltype( &f ), void(*)() >::value << endl; // 1
        cout << "is_same< typename remove_pointer< decltype( f ) >::type, void () >::value:   "
             << is_same< typename remove_pointer< decltype( f ) >::type, void () >::value << endl; // 1
        cout << "is_same< typename remove_pointer< decltype( f ) >::type, void(*)() >::value: "
             << is_same< typename remove_pointer< decltype( f ) >::type, void(*)() >::value << endl; // 0
        cout << "is_same< typename remove_pointer< decltype( &f ) >::type, void () >::value:   "
             << is_same< typename remove_pointer< decltype( &f ) >::type, void () >::value << endl; // 1
        cout << "is_same< typename remove_pointer< decltype( &f ) >::type, void(*)() >::value: "
             << is_same< typename remove_pointer< decltype( &f ) >::type, void(*)() >::value << endl; // 0
        check_similarity_1( f );
        check_similarity_2( f );
        check_similarity_3( f );
        check_similarity_4( f );
        return 0;
    }
    
    Here's the output:
    is_same< decltype( f ), void () >::value:   1
    is_same< decltype( f ), void(*)() >::value: 0
    is_same< decltype( &f ), void () >::value:   0
    is_same< decltype( &f ), void(*)() >::value: 1
    is_same< typename remove_pointer< decltype( f ) >::type, void () >::value:   1
    is_same< typename remove_pointer< decltype( f ) >::type, void(*)() >::value: 0
    is_same< typename remove_pointer< decltype( &f ) >::type, void () >::value:   1
    is_same< typename remove_pointer< decltype( &f ) >::type, void(*)() >::value: 0
    is_same< F, void () >::value:   0
    is_same< F, void(*)() >::value: 1
    is_same< decltype( f_ ), void () >::value:   0
    is_same< decltype( f_ ), void(*)() >::value: 1
    is_same< typename remove_pointer< F >::type, void () >::value:   1
    is_same< typename remove_pointer< F >::type, void(*)() >::value: 0
    is_same< typename remove_pointer< decltype( f_ ) >::type, void () >::value:   1
    is_same< typename remove_pointer< decltype( f_ ) >::type, void(*)() >::value: 0

    Thanks For Any Clarification ... And Your Continued Patience ;),
    Joshua Burkholder

  • @STL: I am a bonehead! ... I finally understand your post ... with a little help from page 168 of "C++ Templates: The Complete Guide".

    Note To Self:  Implicitly deduced template arguments can decay.  Explicit template arguments cannot.

    In my search for clarity, I may have found a **** potential bug **** in VC++ 2010.  Here's the code that helped me out ... I'll explain the potential bug at the end:

    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    // decltype( f ): void ()
    void f () {
        //
    }
    
    template< typename F >
    void check_similarity_1 ( F f_ ) {
        static_cast< void >( f_ ); // to eliminate warnings
        // if __implicit__ template argument deduction, then:
        // decltype( f ) decays from void () to void (*) ();
        // therefore, decltype( f_ ) is void (*) () and
        // F is void (*) ()
        cout << "is_same< decltype( f_ ), void () >::value:     "
             << is_same< decltype( f_ ), void () >::value << endl;
        cout << "is_same< decltype( f_ ), void (*) () >::value: "
             << is_same< decltype( f_ ), void (*) () >::value << endl;
        cout << "is_same< decltype( f_ ), void (&) () >::value: "
             << is_same< decltype( f_ ), void (&) () >::value << endl;
        cout << "is_same< F, void () >::value:     "
             << is_same< F, void () >::value << endl;
        cout << "is_same< F, void (*) () >::value: "
             << is_same< F, void (*) () >::value << endl;
        cout << "is_same< F, void (&) () >::value: "
             << is_same< F, void (&) () >::value << endl;
    }
    
    template < typename F >
    void check_similarity_2 ( F * f_ptr ) {
        static_cast< void >( f_ptr ); // to eliminate warnings
        // if __implicit__ template argument deduction, then:
        // decltype( f ) decays from void () to void (*) ();
        // therefore, decltype( f_ptr ) is void (*) () and 
        // F is void ()
        cout << "is_same< decltype( f_ptr ), void () >::value:     "
             << is_same< decltype( f_ptr ), void () >::value << endl;
        cout << "is_same< decltype( f_ptr ), void (*) () >::value: "
             << is_same< decltype( f_ptr ), void (*) () >::value << endl;
        cout << "is_same< decltype( f_ptr ), void (&) () >::value: "
             << is_same< decltype( f_ptr ), void (&) () >::value << endl;
        cout << "is_same< F, void () >::value:     "
             << is_same< F, void () >::value << endl;
        cout << "is_same< F, void (*) () >::value: "
             << is_same< F, void (*) () >::value << endl;
        cout << "is_same< F, void (&) () >::value: "
             << is_same< F, void (&) () >::value << endl;
    }
    
    template < typename F >
    void check_similarity_3 ( F & f_ref ) {
        static_cast< void >( f_ref ); // to eliminate warnings
        // if __implicit__ template argument deduction, then:
        // decltype( f ) goes from void() to void (&) ();
        // therefore, decltype( f_ref ) is void (&) () and
        // F is void ()
        cout << "is_same< decltype( f_ref ), void () >::value:     "
             << is_same< decltype( f_ref ), void () >::value << endl;
        cout << "is_same< decltype( f_ref ), void (*) () >::value: "
             << is_same< decltype( f_ref ), void (*) () >::value << endl;
        cout << "is_same< decltype( f_ref ), void (&) () >::value: "
             << is_same< decltype( f_ref ), void (&) () >::value << endl;
        cout << "is_same< F, void () >::value:     "
             << is_same< F, void () >::value << endl;
        cout << "is_same< F, void (*) () >::value: "
             << is_same< F, void (*) () >::value << endl;
        cout << "is_same< F, void (&) () >::value: "
             << is_same< F, void (&) () >::value << endl;
    }
    
    int main () {
        cout << "main():" << endl;
        cout << "-------" << endl;
        // no __implicit__ template argument deduction ( std::is_same<...> 
        // uses directly deducible, __explicit__ template arguments ); hence, 
        // function types do not decay to pointer types:
        // decltype( f ) is still void () inside of std::is_same<...>
        cout << "is_same< decltype( f ), void () >::value:     "
             << is_same< decltype( f ), void () >::value << endl;
        cout << "is_same< decltype( f ), void (*) () >::value: "
             << is_same< decltype( f ), void (*) () >::value << endl;
        cout << "is_same< decltype( f ), void (&) () >::value: "
             << is_same< decltype( f ), void (&) () >::value << endl;
        cout << endl;
    
        cout << "check_similarity_1( F f_ ):" << endl;
        cout << "---------------------------" << endl;
        check_similarity_1( f );
        cout << endl;
    
        cout << "check_similarity_2( F * f_ptr ):" << endl;
        cout << "--------------------------------" << endl;
        check_similarity_2( f );
        cout << endl;
    
        cout << "check_similarity_3( F & f_ref ):" << endl;
        cout << "--------------------------------" << endl;
        check_similarity_3( f );
        cout << endl;
    
        // Does the following produce a bug in VC++ 2010?
        cout << "check_similarity_1< decltype( f ) >( F f_ ):" << endl;
        cout << "--------------------------------------------" << endl;
        check_similarity_1< decltype( f ) >( f );
        cout << endl;
    
        return 0;
    }
    
    In VC++ 2010, this produces the following output:
    main():
    -------
    is_same< decltype( f ), void () >::value:     1
    is_same< decltype( f ), void (*) () >::value: 0
    is_same< decltype( f ), void (&) () >::value: 0
    
    check_similarity_1( F f_ ):
    ---------------------------
    is_same< decltype( f_ ), void () >::value:     0
    is_same< decltype( f_ ), void (*) () >::value: 1
    is_same< decltype( f_ ), void (&) () >::value: 0
    is_same< F, void () >::value:     0
    is_same< F, void (*) () >::value: 1
    is_same< F, void (&) () >::value: 0
    
    check_similarity_2( F * f_ptr ):
    --------------------------------
    is_same< decltype( f_ptr ), void () >::value:     0
    is_same< decltype( f_ptr ), void (*) () >::value: 1
    is_same< decltype( f_ptr ), void (&) () >::value: 0
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    
    check_similarity_3( F & f_ref ):
    --------------------------------
    is_same< decltype( f_ref ), void () >::value:     0
    is_same< decltype( f_ref ), void (*) () >::value: 0
    is_same< decltype( f_ref ), void (&) () >::value: 1
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    
    check_similarity_1< decltype( f ) >( F f_ ):
    --------------------------------------------
    is_same< decltype( f_ ), void () >::value:     0
    is_same< decltype( f_ ), void (*) () >::value: 1
    is_same< decltype( f_ ), void (&) () >::value: 0
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    In g++ 4.5.2, this code produces the following output:
    main():
    -------
    is_same< decltype( f ), void () >::value:     1
    is_same< decltype( f ), void (*) () >::value: 0
    is_same< decltype( f ), void (&) () >::value: 0
    
    check_similarity_1( F f_ ):
    ---------------------------
    is_same< decltype( f_ ), void () >::value:     0
    is_same< decltype( f_ ), void (*) () >::value: 1
    is_same< decltype( f_ ), void (&) () >::value: 0
    is_same< F, void () >::value:     0
    is_same< F, void (*) () >::value: 1
    is_same< F, void (&) () >::value: 0
    
    check_similarity_2( F * f_ptr ):
    --------------------------------
    is_same< decltype( f_ptr ), void () >::value:     0
    is_same< decltype( f_ptr ), void (*) () >::value: 1
    is_same< decltype( f_ptr ), void (&) () >::value: 0
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    
    check_similarity_3( F & f_ref ):
    --------------------------------
    is_same< decltype( f_ref ), void () >::value:     0
    is_same< decltype( f_ref ), void (*) () >::value: 0
    is_same< decltype( f_ref ), void (&) () >::value: 1
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    
    check_similarity_1< decltype( f ) >( F f_ ):
    --------------------------------------------
    is_same< decltype( f_ ), void () >::value:     1
    is_same< decltype( f_ ), void (*) () >::value: 0
    is_same< decltype( f_ ), void (&) () >::value: 0
    is_same< F, void () >::value:     1
    is_same< F, void (*) () >::value: 0
    is_same< F, void (&) () >::value: 0
    Everything agrees between the output from VC++ 2010 and g++ 4.5.2, except the "check_similarity_1< decltype( f ) >( F f_ )" section ... and there is the potential bug.

    **** potential bug ****:  In VC++ 2010 when I make the call to check_similarity_1< decltype( f ) >( f ) inside of main(), is_same< decltype( f_ ), void (*) () >::value is 1 within that check_similarity_1 function ... even though I explicitly declared the template parameter F to be "void ()" (i.e. decltype( f ) ).  Shouldn't is_same< decltype( f_ ), void () >::value be 1 instead (esp. since is_same< F, void () >::value is 1 within that same function)?  In g++ 4.5.2, is_same< decltype( f_ ), void () >::value is 1.  It seems like VC++ 2010 is assuming that a function argument will always become a pointer ( i.e. "void ()" f will always go to "void (*) ()" f_ ) whether or not it agrees with its actual type ( F: void () ).

    Hope This Helps,
    Joshua Burkholder

  • To make this potential VC++ 2010 bug (explicit template argument type disagreement) a little more obvious, here is some much shorter code that pertains just to the bug:

    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    template < typename F >
    bool is_F_same_as_decltype_ff ( F ff ) {
        // to suppress unreferenced formal parameter warnings:
        static_cast< void >( ff );
        // check if F is the same as the type of ff
        return is_same< F, decltype( ff ) >::value;
    }
    
    template < typename F >
    bool is_F_same_as_void_void_func_type ( F ) {
        return is_same< F, void () >::value;
    }
    
    template < typename F >
    bool is_F_same_as_ptr_to_void_void_func_type ( F ) {
        return is_same< F, void (*) () >::value;
    }
    
    void f () {
        //
    }
    
    int main () {
        cout << "=================================================" << endl;
        cout << endl;
        // implicit template arguments:
        cout << "is_F_same_as_decltype_ff( f ): ";//vc++: 1, g++: 1
        cout << is_F_same_as_decltype_ff( f ) << endl;
        cout << endl;
        cout << "Is F void ()?     ";//vc++: 0, g++: 0
        cout << is_F_same_as_void_void_func_type( f ) << endl;
        cout << "Is F void (*) ()? ";//vc++: 1, g++: 1
        cout << is_F_same_as_ptr_to_void_void_func_type( f ) << endl;
        cout << endl;
        cout << "=================================================" << endl;
        cout << endl;
        // explicit template arguments:
        cout << "is_F_same_as_decltype_ff< decltype( f ) >( f ): ";//vc++: 0, g++: 1
        cout << is_F_same_as_decltype_ff< decltype( f ) >( f ) << endl;
        cout << endl;
        cout << "Is F void ()?     ";//vc++: 1, g++: 1
        cout << is_F_same_as_void_void_func_type< decltype( f ) >( f ) << endl;
        cout << "Is F void (*) ()? ";//vc++: 0, g++: 0
        cout << is_F_same_as_ptr_to_void_void_func_type< decltype( f ) >( f ) << endl;
        cout << endl;
        cout << "=================================================" << endl;
        cout << endl;
        // ... and just to be extra explicit
        // explicit template arguments:
        cout << "is_F_same_as_decltype_ff< void () >( f ): ";//vc++: 0, g++: 1
        cout << is_F_same_as_decltype_ff< void () >( f ) << endl;
        cout << endl;
        cout << "Is F void ()?     ";//vc++: 1, g++: 1
        cout << is_F_same_as_void_void_func_type< void () >( f ) << endl;
        cout << "Is F void (*) ()? ";//vc++: 0, g++: 0
        cout << is_F_same_as_ptr_to_void_void_func_type< void () >( f ) << endl;
        cout << endl;
        cout << "=================================================" << endl;
    
        return 0;
    }
    On VC++ 2010, this produces the following output:
    =================================================
    
    is_F_same_as_decltype_ff( f ): 1
    
    Is F void ()?     0
    Is F void (*) ()? 1
    
    =================================================
    
    is_F_same_as_decltype_ff< decltype( f ) >( f ): 0
    
    Is F void ()?     1
    Is F void (*) ()? 0
    
    =================================================
    
    is_F_same_as_decltype_ff< void () >( f ): 0
    
    Is F void ()?     1
    Is F void (*) ()? 0
    
    =================================================
    While on g++ 4.5.2, this produces the following output:
    =================================================
    
    is_F_same_as_decltype_ff( f ): 1
    
    Is F void ()?     0
    Is F void (*) ()? 1
    
    =================================================
    
    is_F_same_as_decltype_ff< decltype( f ) >( f ): 1
    
    Is F void ()?     1
    Is F void (*) ()? 0
    
    =================================================
    
    is_F_same_as_decltype_ff< void () >( f ): 1
    
    Is F void ()?     1
    Is F void (*) ()? 0
    
    =================================================
    Since we have F ff as the parameter of the is_F_same_as_decltype_ff() function, it seems like F should __always__ agree with decltype( ff ).  In other words, the bug is that there is type disagreement when explicit template arguments are used (esp. when there is no type disagreement when implicit template arguments are deduced).

    Hope This Clarifies,
    Joshua Burkholder

  • STL wrote

    It looks like you're trying to write a scope guard. I've written an implementation powered by std::function:

     

    Interesting code, though a bit heavy since it uses exception handling.

     

    Please check the thread again, I'd like to hear your opinion on the matter.

  • Petr Minar

    Great videos, thanks!
    Just today I saw a bit weird behavior. Consider this code:
    struct deleter { void operator() (int* p) { delete p; }};
    std::unique_ptr<int> a(new int);
    std::unique_ptr<int> b;
    std::unique_ptr<int, deleter> c;
    b=a; // does not compile
    c=a; // should not compile too; does not link
    I checked the stl source code that the operator = is private member of unique_ptr but the "c=a" still compiles in VC2010. Am I missing something?

  • Sorry for the poorly formatted previous post. Let me try it one more time. Consider this code:

     

    struct deleter {
        void operator() (int* p) { delete p; }
    };
    
    std::unique_ptr<int> a(new int);
    std::unique_ptr<int> b;
    std::unique_ptr<int, deleter> c;
    b=a; // does not compile
    c=a; // should not compile either, but it compiles and only fails to link

     

    I checked the STL source code and saw that operator= is a private member, but "c=a" still compiles in VC2010. Am I missing something? Thanks for any insight. Petr

  • @Burkholder: I filed this as a bug on Microsoft's Connect website.  The title of this bug report is "VC++ 2010: Explicit template arguments cause type disagreement for types that decay to pointers" (bug id: 647035) and it is under the "Visual Studio and .NET Framework" section.

    Note:  This bug affects anything that decays ... so both function types and array types.  If you start out with, say, "int arr[3];" and pass arr to those types of functions using implicit and explicit template arguments, then you get the same type of results ... type disagreement ( int[3] versus int * ) when explicit template arguments are used in VC++ 2010, but __no__ type disagreement in g++ 4.5.2.

    Hope This Helps,
    Joshua Burkholder

  • STL

    Mr Crash> Interesting code though a bit heavy since it use exception handling.

    Exception handling is part of the language, and is used by the STL.

    PetrM> I checked the stl source code that the operator = is private member but the "c=a" still compiles in VC2010.

    We've already changed unique_ptr such that VC11 emits "error C2679: binary '=' : no operator found which takes a right-hand operand of type 'std::unique_ptr<_Ty>' (or there is no acceptable conversion)".

    However, it appears that you've found a compiler bug. I've filed DevDiv#150368 "Access control mysteriously not applied for VC10 RTM's unique_ptr" with a minimal repro:

     

    C:\Temp>type meow.cpp
    template <typename T> class Unique {
    public:
        Unique() { }
        Unique(Unique&&) { }
        Unique& operator=(Unique&&) { return *this; }
        template <typename U> Unique(Unique<U>&&) { }
        template <typename U> Unique& operator=(Unique<U>&&) { return *this; }
    
    private:
        Unique(const Unique&);
        Unique& operator=(const Unique&);
        template <typename U> Unique(const Unique<U>&);
        template <typename U> Unique& operator=(const Unique<U>&);
    };
    
    int main() {
        Unique<int> a;
        Unique<double> b;
        a = b;
    }
    
    C:\Temp>cl /EHsc /nologo /W4 meow.cpp
    meow.cpp
    meow.obj : error LNK2019: unresolved external symbol "private: class Unique<int> & __thiscall Unique<int>::operator=<double>(class Unique<double> const &)" (??$?4N@?$Unique@H@@AAEAAV0@ABV?$Unique@N@@@Z) referenced in function _main
    meow.exe : fatal error LNK1120: 1 unresolved externals
    
    C:\Temp>g++ -Wall -Wextra -std=c++0x meow.cpp -o meow.exe
    meow.cpp: In function 'int main()':
    meow.cpp:13:35: error: 'Unique<T>& Unique<T>::operator=(const Unique<U>&) [with U = double, T = int, Unique<T> = Unique<int>]' is private
    meow.cpp:19:9: error: within this context


    Burkholder> http://connect.microsoft.com/VisualStudio/feedback/details/647035/vc-2010-explicit-template-arguments-cause-type-disagreement-for-types-that-decay-to-pointers

    I believe that VC's behavior is CORRECT. Here is a modified repro:

     

    C:\Temp>type meow.cpp
    #include <ios>
    #include <iostream>
    #include <ostream>
    #include <type_traits>
    using namespace std;
    template <typename T> struct stringify;
    template <> struct stringify<void ()> {
        static const char * str() { return "void ()"; }
    };
    template <> struct stringify<void (*)()> {
        static const char * str() { return "void (*)()"; }
    };
    template <typename F> void meow(F ff) {
        (void) ff;
        cout << "           F: " << stringify<F>::str() << endl;
        cout << "decltype(ff): " << stringify<decltype(ff)>::str() << endl;
        cout << "     is_same: " << is_same<F, decltype(ff)>::value << endl;
        cout << endl;
    }
    void f() { }
    int main() {
        cout << boolalpha;
        cout << "meow(f)" << endl;
        meow(f);
        cout << "meow<void (*)()>(f)" << endl;
        meow<void (*)()>(f);
        cout << "meow<decltype(f)>(f)" << endl;
        meow<decltype(f)>(f);
        cout << "meow<void ()>(f)" << endl;
        meow<void ()>(f);
    }
    C:\Temp>cl /EHsc /nologo /W4 meow.cpp
    meow.cpp
    C:\Temp>meow
    meow(f)
               F: void (*)()
    decltype(ff): void (*)()
         is_same: true
    meow<void (*)()>(f)
               F: void (*)()
    decltype(ff): void (*)()
         is_same: true
    meow<decltype(f)>(f)
               F: void ()
    decltype(ff): void (*)()
         is_same: false
    meow<void ()>(f)
               F: void ()
    decltype(ff): void (*)()
         is_same: false
    
    C:\Temp>g++ -Wall -Wextra -std=c++0x meow.cpp -o meow.exe
    C:\Temp>meow
    meow(f)
               F: void (*)()
    decltype(ff): void (*)()
         is_same: true
    meow<void (*)()>(f)
               F: void (*)()
    decltype(ff): void (*)()
         is_same: true
    meow<decltype(f)>(f)
               F: void ()
    decltype(ff): void ()
         is_same: true
    meow<void ()>(f)
               F: void ()
    decltype(ff): void ()
         is_same: true
    


    N3225 7.1.6.2 [dcl.type.simple]/4 says: "The type denoted by decltype(e) is defined as follows: — if e is an unparenthesized id-expression or a class member access (5.2.5), decltype(e) is the type of the entity named by e."

    8.3.5 [dcl.fct]/5 says: "After determining the type of each parameter, any parameter of type “array of T” or “function returning T” is adjusted to be “pointer to T” or “pointer to function returning T,” respectively."

    This "adjustment" happens before sizeof (which can easily be verified with arrays adjusted to pointers), so it should happen before decltype too.

    In fact, in the absence of templates, GCC believes that the adjustment happens before decltype:

     

    C:\Temp>type meow.cpp
    #include <stdio.h>
    template <typename T> struct stringify;
    template <> struct stringify<void ()> {
        static const char * str() { return "void ()"; }
    };
    template <> struct stringify<void (*)()> {
        static const char * str() { return "void (*)()"; }
    };
    void meow(void ff()) {
        (void) ff;
        printf("decltype(ff): %s\n", stringify<decltype(ff)>::str());
    }
    void f() { }
    int main() {
        meow(f);
    }
    C:\Temp>cl /EHsc /nologo /W4 meow.cpp
    meow.cpp
    C:\Temp>meow
    decltype(ff): void (*)()
    C:\Temp>g++ -Wall -Wextra -std=c++0x meow.cpp -o meow.exe
    C:\Temp>meow
    decltype(ff): void (*)()

  • @STL:

    Burkholder> http://connect.microsoft.com/VisualStudio/feedback/details/647035/vc-2010-explicit-template-arguments-cause-type-disagreement-for-types-that-decay-to-pointers

    I believe that VC's behavior is CORRECT. Here is a modified repro:

     

    meow<decltype(f)>(f)
               F: void ()
    decltype(ff): void (*)()
         is_same: false
    meow<void()>(f)
               F: void ()
    decltype(ff): void (*)()
         is_same: false
     


    N3225 7.1.6.2 [dcl.type.simple]/4 says: "The type denoted by decltype(e) is defined as follows: — if e is an unparenthesized id-expression or a class member access (5.2.5), decltype(e) is the type of the entity named by e."

    8.3.5 [dcl.fct]/5 says: "After determining the type of each parameter, any parameter of type “array of T” or “function returning T” is adjusted to be “pointer to T” or “pointer to function returning T,” respectively."

    This "adjustment" happens before sizeof (which can easily be verified with arrays adjusted to pointers), so it should happen before decltype too.

    If decltype(ff) has to be adjusted to a pointer type, then why doesn't F also have to be adjusted to a pointer type (regardless of the explicit template arguments) ... like it's adjusted in the implicit template arguments case?

    If (as you are suggesting) this type-disagreement behavior is intended by the standard, then why?  What does this behavior enable?  Or prevent?  I ask because conflicting types ( i.e. F not being the same type as decltype( ff ) even though we declared F ff ) doesn't make much sense to me (esp. if ... say ... F is char[8] and decltype( ff ) adjusts down to char * where sizeof( F ) would be 8 and sizeof( decltype( ff ) ) would be 4 ).

    Joshua Burkholder

     

  • STL

    There are several things going on here, so it's helpful to go step by step.

    First, template argument deduction. This happens when you call a function template without providing explicit template arguments (or providing some-but-not-all). Being called in this manner is how most function templates are intended to be used, so template argument deduction is a very important process. (Indeed, providing explicit template arguments when you shouldn't is a subtle way to misuse C++.)

    I quoted the relevant rules for this above, N3225 14.8.2.1 [temp.deduct.call]/2. (There are other rules not relevant here.) This says that given "template <typename T> whatever foobar(T t)" and "double func(int)" and "foobar(func)", T is deduced to be double (*)(int), i.e. a function pointer type. That's just how C++ works. Now, there's a reason for these rules - you can't pass around functions, but you can pass around function pointers, so when func is passed around by value, it makes sense for T to be a function pointer. (Hypothetically, the language could be restrictive and simply ban foobar(func), requiring you to say foobar(&func) in which there should be absolutely no mystery whatsoever - but recall that C++ is extremely permissive and allows programmers to say lots of things, then tries to figure out what they meant.)

    After template argument deduction runs, what happens is the same as if explicit template arguments were used. So "foobar(func)" is exactly equivalent to "foobar<double (*)(int)>(func)".
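A minimal sketch of the deduction described above, reusing the post's placeholder names foobar and func. The bool return value and the check_deduction helper are additions for illustration only, not part of the original discussion.

```cpp
#include <cassert>
#include <type_traits>

// foobar reports whether its template parameter T is double (*)(int).
template <typename T>
bool foobar(T) {
    return std::is_same<T, double (*)(int)>::value;
}

double func(int x) { return x * 0.5; }

// Hypothetical helper: exercises both ways of calling foobar.
void check_deduction() {
    // Deduction: func is passed by value, so T becomes a function pointer type.
    assert(foobar(func));
    // Spelling out the same template argument selects the same specialization.
    assert(foobar<double (*)(int)>(func));
}
```

Calling check_deduction() from main() should complete silently if both calls instantiate foobar with T = double (*)(int).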

    The second thing is function parameter adjustment. This rule is ANCIENT, literally, because it comes from C. This says that functions declared as taking arrays or functions are immediately rewritten, or adjusted, to take pointers or function pointers instead. That's just how C works (and how C++ works). Now, there's a reason for THESE rules - in C you can't pass around arrays or functions, but you can pass around pointers or function pointers. So when the language sees a function declared as taking something "impossible", it just says, "okay, you can think it works like that, but I need to compile this into something possible". Personally, I take an extremely harsh view of this syntax, as I mentioned earlier - but the rules are what they are.
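Function parameter adjustment can be observed without any templates. The sketch below is illustrative only; twice, apply_to_first and check_adjustment are hypothetical names. It covers the non-template case, where the earlier "absence of templates" example shows VC and GCC agreeing.

```cpp
#include <cassert>
#include <type_traits>

int twice(int x) { return 2 * x; }

// Declared as taking a function and an array; both parameters are adjusted.
int apply_to_first(int fn(int), int values[3]) {
    static_assert(std::is_same<decltype(fn), int (*)(int)>::value,
                  "fn is really a function pointer");
    static_assert(std::is_same<decltype(values), int *>::value,
                  "values is really a pointer");
    return fn(values[0]);
}

// The adjustment is also visible in the type of the function itself.
static_assert(std::is_same<decltype(apply_to_first),
                           int(int (*)(int), int *)>::value,
              "the function type uses the adjusted parameter types");

// Hypothetical helper: calls the function with a real array and a real function.
void check_adjustment() {
    int data[3] = { 21, 0, 0 };
    assert(apply_to_first(twice, data) == 42);
}
```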

    The rules for template argument deduction and function parameter adjustment are totally different - they occur in different clauses of the Standard - and yet similar, because they are both dealing with the same thing - you can't pass arrays and functions by value.

    Now, you've taken it to the next level by mixing templates and function parameters of function types.

    > why doesn't F also have to be adjusted to a pointer type (regardless of the explicit template arguments)

    Explicit template arguments don't get messed with. (I believe I'm glossing over a couple of subtleties here, mostly with template non-type arguments - please don't ask about those - but for the most part this is true.) If you tell the compiler that F needs to be a function type instead of a function pointer type, then a function type it shall be.

    That still doesn't stop the compiler from performing function parameter adjustment, though - that process is unstoppable.

    As you can see, this is subtle enough that two compilers written by experts disagree, but I believe that the Standard speaks with a clear voice here. (Sometimes it's worse and the Standard itself is ambiguous, requiring a Core Language Issue to be filed. Even Standards have bugs.) I believe that function parameter adjustment should happen before decltype inspects the type. I could be wrong - I have been known to be wrong about the Core Language in the past.

    > I ask because conflicting types ( i.e. F not being the same type as decltype( ff ) even though we declared F ff )

    Take a look at my "absence of templates" example above where VC and GCC are in agreement. There, ff is declared to have function type, but it actually has function pointer type.

    Perhaps another example will illustrate why I believe that GCC is incorrect and inconsistent.

     

    C:\Temp>type meow.cpp
    #include <iostream>
    #include <ostream>
    using namespace std;
    
    template <typename T> void meow(T t) {
        (void) t;
        cout << "          sizeof(t): " << sizeof(t) << endl;
        cout << "sizeof(decltype(t)): " << sizeof(decltype(t)) << endl;
    }
    
    void purr(int x[3]) {
        (void) x;
        cout << "          sizeof(x): " << sizeof(x) << endl;
        cout << "sizeof(decltype(x)): " << sizeof(decltype(x)) << endl;
    }
    
    int main() {
        int arr[3] = { 1, 2, 3 };
        meow(arr);
        meow<int[3]>(arr);
        purr(arr);
    }
    
    C:\Temp>cl /EHsc /nologo /W4 meow.cpp
    meow.cpp
    
    C:\Temp>meow
              sizeof(t): 4
    sizeof(decltype(t)): 4
              sizeof(t): 4
    sizeof(decltype(t)): 4
              sizeof(x): 4
    sizeof(decltype(x)): 4
    
    C:\Temp>g++ -Wall -Wextra -std=c++0x meow.cpp -o meow.exe
    
    C:\Temp>meow
              sizeof(t): 4
    sizeof(decltype(t)): 4
              sizeof(t): 4
    sizeof(decltype(t)): 12
              sizeof(x): 4
    sizeof(decltype(x)): 4
    

     

    Given meow<int[3]>(arr), GCC believes that t is 4 bytes (incontrovertibly correct, it is a pointer), but that t's declared type is 12 bytes. Yet given purr(arr), where x is declared in the source code to be int[3], GCC believes that both x and x's declared type are 4 bytes.

    I cannot imagine any possible interpretation of the Standard that permits GCC's behavior here.

  • @STL: Wow ... time to submit a bug report on g++!  Or have you already done so?

    I hear what you are saying about C++ staying consistent with C via adjustment ... which is why void ( char[123] ) is the same as void ( char * ) ... and void (*) ( char[123] ) is the same as void (*) ( char * ); however, VC++ seems to be just as inconsistent as g++ ... only in a slightly different way.  Here's the inconsistency (switching to character arrays ... because sizeof( void () ) is not allowed by the standard):

    #include <iostream>
    #include <type_traits>
    
    using namespace std;
    
    template < typename T >
    void check_size_of ( T t ) {
        (void) t; // suppress unreferenced formal parameter warnings
        cout << "    sizeof( T ):             " << sizeof( T )             << endl;
        cout << "    sizeof( t ):             " << sizeof( t )             << endl;
        cout << "    sizeof( decltype( t ) ): " << sizeof( decltype( t ) ) << endl;
    }
    
    int main () {
    
        cout << "template < typename T >" << endl;
        cout << "void check_size_of ( T t );" << endl;
        cout << endl;
    
        char a[] = "My type is char[20]";
    
        cout << "char a[] = \"" << a << "\";" << endl;
        cout << "Is decltype( a ) char[20]? " << is_same< decltype( a ), char[20] >::value << endl;
        cout << "Is decltype( a ) char *?   " << is_same< decltype( a ), char* >::value << endl;
        cout << endl;
    
        cout << "check_size_of( a ):" << endl;
        check_size_of( a );
        cout << endl;
    
        cout << "========================================================" << endl;
        cout << "***** Inconsistent on both VC++ 2010 and g++ 4.5.2 *****" << endl;
        cout << "========================================================" << endl;
        cout << "check_size_of< decltype( a ) >( a ):" << endl;
        check_size_of< decltype( a ) >( a );
        cout << "========================================================" << endl;
        cout << endl;
        
        cout << "check_size_of< decltype( a ) & >( a ):" << endl;
        check_size_of< decltype( a ) & >( a );
        cout << endl;
        
        return 0;
    }
    
    VC++ 2010 gives the following output:
    template < typename T >
    void check_size_of ( T t );
    
    char a[] = "My type is char[20]";
    Is decltype( a ) char[20]? 1
    Is decltype( a ) char *?   0
    
    check_size_of( a ):
        sizeof( T ):             4
        sizeof( t ):             4
        sizeof( decltype( t ) ): 4
    
    ========================================================
    ***** Inconsistent on both VC++ 2010 and g++ 4.5.2 *****
    ========================================================
    check_size_of< decltype( a ) >( a ):
        sizeof( T ):             20
        sizeof( t ):             4
        sizeof( decltype( t ) ): 4
    ========================================================
    
    check_size_of< decltype( a ) & >( a ):
        sizeof( T ):             20
        sizeof( t ):             20
        sizeof( decltype( t ) ): 20
    g++ 4.5.2 gives the following output:
    template < typename T >
    void check_size_of ( T t );
    
    char a[] = "My type is char[20]";
    Is decltype( a ) char[20]? 1
    Is decltype( a ) char *?   0
    
    check_size_of( a ):
        sizeof( T ):             4
        sizeof( t ):             4
        sizeof( decltype( t ) ): 4
    
    ========================================================
    ***** Inconsistent on both VC++ 2010 and g++ 4.5.2 *****
    ========================================================
    check_size_of< decltype( a ) >( a ):
        sizeof( T ):             20
        sizeof( t ):             4
        sizeof( decltype( t ) ): 20
    ========================================================
    
    check_size_of< decltype( a ) & >( a ):
        sizeof( T ):             20
        sizeof( t ):             20
        sizeof( decltype( t ) ): 20
    If the currently proposed standard allows either one of these inconsistencies, then maybe some compiler warnings should happen ... or just fix the proposed standard before it is finalized. :D

    Lastly, you wrote:
    "Indeed, providing explicit template arguments when you shouldn't is a subtle way to misuse C++"

    I'm gonna go with "what?" on this one. ;)  Using implicit or explicit template arguments should produce exactly the same results ... because that would make the most sense (i.e. the whole "intuitive" thing).  Don't you agree?

    Joshua Burkholder

  • STL

    > VC++ seems to be just as inconsistent as g++ ... only in a slightly different way.

    VC's working correctly there. In the case you're looking at (where VC prints 20 for T and 4 for t and decltype(t)), you've explicitly specified T to be char[20]. Function parameter adjustment makes t a char *, which is why it's 4 bytes.

    Function parameter adjustment does not affect template parameters.

    In particular, this behavior (sizeof(T) is 20, sizeof(t) is 4) is shared by VC and GCC, and mandated by C++98/03. It's not new.
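The split between the template parameter and the function parameter can be checked directly. This is a small sketch, not the original test program: the sizes struct and check_sizes are hypothetical, and the pointer size is written as sizeof(char *) instead of 4 so the check also holds on 64-bit targets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of what the template sees.
struct sizes {
    std::size_t of_T;  // sizeof the template parameter T
    std::size_t of_t;  // sizeof the function parameter t
};

template <typename T>
sizes check_size_of(T t) {
    (void) t;
    sizes s = { sizeof(T), sizeof(t) };
    return s;
}

// Hypothetical helper: compares the deduced and the explicit call.
void check_sizes() {
    char a[] = "My type is char[20]";

    // Deduced: T is char *, so both sizes are the size of a pointer.
    sizes deduced = check_size_of(a);
    assert(deduced.of_T == sizeof(char *));
    assert(deduced.of_t == sizeof(char *));

    // Explicit: T stays char[20], but t is still adjusted to char *.
    sizes explicit_arg = check_size_of<char[20]>(a);
    assert(explicit_arg.of_T == 20);
    assert(explicit_arg.of_t == sizeof(char *));
}
```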

    > Using implicit or explicit template arguments should produce exactly the same results

    It mostly does, when you're careful to specify exactly the same template argument that automatic deduction would have chosen for you - but then, why bother? And if you specify something different, now you're forcing the template into an unusual mode of operation.

    (I'm aware of one case where you can specify explicit template arguments identical to what template argument deduction would have chosen, and yet the compilation explodes. This happens when people use explicit template arguments with swap(), which is WRONG and BAD and WRONG. The problem is that there are many swap() overloads, and while the provided explicit template arguments will work for the overload desired by the programmer, the compiler has to look at the *other* overloads too, and plugging those explicit template arguments in can cause a hard error. In contrast, when you rely on template argument deduction like you're supposed to, the undesired overloads fail out of deduction and are silently removed from the overload set. Again, this is subtle - if you don't understand it, simply remember that you shouldn't use explicit template arguments unless the function is documented as being called like that, as with make_shared<T>() for the first template argument.)

  • > Function parameter adjustment does not affect template parameters.

    Where in the proposed C++0x standard does it state that function parameter adjustment does not affect explicit template parameters? My interpretation of 14.8.2.3 (PDF page 383) and the explicit template arguments in its example seem to suggest that it does (esp. #2, where f<const int> implies that T and decltype( t ) are both const int, but the signature of the explicit f<const int> is adjusted to void(*)(int) ).  In other words, T agrees with decltype( t ) in spite of adjustment ... but my interpretation could be wrong.

    > In particular, this behavior (sizeof(T) is 20, sizeof(t) is 4) is shared by VC and GCC, and mandated by C++98/03.

    I don't have a copy of the C++98/03 standard handy.  Is this still in the proposed C++0x standard?  If so, where can I look this up?  I did a quick search of "sizeof" in the PDF and I didn't see anything applicable to this situation ... but I might have read right over it.

    > Using implicit or explicit template arguments should produce exactly the same results

    It mostly does, when you're careful to specify exactly the same template argument that automatic deduction would have chosen for you - but then, why bother? And if you specify something different, now you're forcing the template into an unusual mode of operation.

    In order to pass functions around, function types must be known at compile time.  If you have a function template, then you have to explicitly instantiate it in order to pass it around.  Contrived example:
    #include <iostream>
    
    using namespace std;
    
    template < typename F, typename P >
    class delayed_call_t {
        private:
            F m_f;
            P m_p;
        public:
            delayed_call_t ( F f, P p ) : m_f( f ), m_p( p ) {}
            void call () { m_f( m_p ); }
    };
    
    template < typename F, typename P >
    delayed_call_t< F, P > make_delayed_call ( F f, P p ) {
        return delayed_call_t< F, P >( f, p );
    }
    
    template < typename T >
    void f ( T t ) {
        cout << "t:                       " << t << '\n';
        cout << "sizeof( T ):             " << sizeof( T )             << '\n';
        cout << "sizeof( t ):             " << sizeof( t )             << '\n';
        cout << "sizeof( decltype( t ) ): " << sizeof( decltype( t ) ) << '\n';
    }
    
    int main () {
        char s[] = "Hello, World!";
        auto delayed = make_delayed_call( f< decltype( s ) >, s );
        delayed.call();
        return 0;
    }
    

    Joshua Burkholder

  • STL

    > Where in the proposed C++0x standard does it state that function parameter adjustment does not affect explicit template parameters?

    Function parameters and template parameters are totally different, so function parameter adjustment can't affect template parameters. The Standard/Working Paper doesn't need to explicitly say this, even in a footnote.

    > My interpretation of 14.8.2.3 (PDF page 383)

    Here's a tip to avoid confusion: when citing the Standard/Working Paper, always mention what you're citing (e.g. C++03 or N3225) and both the numeric and alphabetic section IDs (e.g. 14.8.2.3 [temp.deduct.conv]). Knowing what document is being cited avoids the pitfall of looking at different Working Papers and being confused by wording changes between them. As for the section IDs, numeric IDs are easy to find through the bookmark tree, but are occasionally renumbered as sections are added, removed, or moved. The alphabetic IDs are provided because they're more stable (although very rarely they are modified, as happened to the Standard Library after C++03).

    Most importantly, don't mix section numbers and paragraph numbers! You appear to be referring to N3225 14.8.2 [temp.deduct]/3 (that is, section 14.8.2, paragraph 3), not 14.8.2.3 [temp.deduct.conv].

    > and the explicit template arguments in it's example seems to suggest that it does

    Those examples are depicting what happens to the function parameter types (which affect the overall function type).

    Perhaps there's terminology confusion here. In "template <class T> void f(T * p);" the "T" is a template parameter. It'll be given a template argument, either explicitly or through template argument deduction. The "p" is a function parameter, and its type is "T *".

    The same applies to "f(T t)". The template type parameter (on the left) and the function parameter type (on the right) are still distinct things, although both appear as "T" in the source code. Function parameter adjustment affects the latter, but not the former.

    > (esp. #2, where f<const int> implies that T and decltype( t ) are both const int, but the signature of the explicit f<const int> is adjusted to void(*)(int) ).

    That one's special - I've tried to avoid mentioning every possible scenario in the interests of reducing complexity. The thing about const value parameters is that they don't affect the callers of a function, but they do affect the function itself (where the const value parameter cannot be modified). Therefore, const value parameters are stripped out of function types, but still affect function definitions. Note the differences between this and what happens to array/function parameters. For THOSE, they get adjusted to pointers/function pointers in function types, AND this affects function definitions.

    > In other words, T agrees with decltype( t ) in spite of adjustment

    In that case, yes, because the adjustment has deliberately not been performed on the function definition (where const value parameters still matter).
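The const case can be sketched the same way. The names param_is_const and check_const_param are hypothetical; the code only illustrates that a top-level const is dropped from the function type while the parameter inside the definition stays const.

```cpp
#include <cassert>
#include <type_traits>

// Callers cannot tell these apart: top-level const is not part of the function type.
static_assert(std::is_same<void(const int), void(int)>::value,
              "const value parameters are stripped from function types");

// Reports whether the parameter t is const inside the function definition.
template <typename T>
bool param_is_const(T t) {
    (void) t;
    return std::is_const<decltype(t)>::value;
}

// Hypothetical helper: compares explicit and deduced template arguments.
void check_const_param() {
    // Explicit const int: t is const inside the definition.
    assert(param_is_const<const int>(1));
    // Explicit int: t is not const.
    assert(!param_is_const<int>(1));
    // Deduction drops top-level const, so T is int even for a const argument.
    const int ci = 1;
    assert(!param_is_const(ci));
}
```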

    > I don't have a copy of the C++98/03 standard handy.  Is this still in the proposed C++0x standard?

    Yes, same behavior. There are breaking changes between C++03 and C++0x, but not many (especially in the Core Language).

    > If so, where can I look this up?

    N3225 5.3.3 [expr.sizeof]/1: "The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id."

    When 8.3.5 [dcl.fct]/5 says "After determining the type of each parameter, any parameter of type “array of T” or “function returning T” is adjusted to be “pointer to T” or “pointer to function returning T,” respectively." it's talking about function parameters.

    Because the function parameter has been adjusted to be a pointer, sizeof(t) is 4. t behaves as a pointer in every other respect (e.g. when being passed to other templates).

    The template parameter is unaffected - it is still an array type. sizeof(T) therefore returns how many bytes would be in such an array, which is 20.

    > If you have a templated function, then we have to explicitly instaniate the function template in order to pass it around.

    Ah, but there's a better way to do that (one that avoids the pitfall I mentioned earlier, where explicit template arguments make the compilation explode).

    Instead of "f<decltype(s)>" you can pass "static_cast<void (*)(decltype(s))>(f)". (Yes, it's more typing, but it doesn't explode. I'll construct an example if you really want one.) Thanks to N3225 13.4 [over.over], when faced with overloaded and/or templated functions, you can use static_cast to disambiguate exactly which one you want. (This is one of the few good uses of casts).

    Note that this will change the output, because as soon as you say static_cast<void (*)(decltype(s))>, which is static_cast<void (*)(char[14])>, the compiler adjusts that function pointer type to void (*)(char *), so T is deduced to be char *.
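A small sketch of the static_cast technique, under the assumption that a size-reporting template is an acceptable stand-in for the f in the contrived example. The names size_of_param_type, takes_char_ptr and check_cast_selection are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the function template being passed around; reports sizeof(T).
template <typename T>
std::size_t size_of_param_type(T) {
    return sizeof(T);
}

typedef std::size_t (*takes_char_ptr)(char *);

// The target type std::size_t (*)(char[14]) is adjusted to
// std::size_t (*)(char *), so the cast selects the T = char * specialization.
takes_char_ptr pick_specialization() {
    return static_cast<std::size_t (*)(char[14])>(size_of_param_type);
}

// Hypothetical helper: shows how the two spellings differ.
void check_cast_selection() {
    char s[] = "Hello, World!";
    // Via the cast: T is char *.
    assert(pick_specialization()(s) == sizeof(char *));
    // Via an explicit template argument: T stays char[14].
    assert(size_of_param_type<char[14]>(s) == 14);
}
```

The cast names the full function pointer type, so overload resolution and deduction pick the specialization; no template argument is plugged into unrelated overloads.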

  • @STL: Well, I have to say that I'm really learning a lot about C++0x through this discussion.  Hopefully, you're getting something out of this as well ... so that I'm not just irritating you.  ;)

    > Where in the proposed C++0x standard does it state that function parameter adjustment does not affect explicit template parameters?

    Function parameters and template parameters are totally different, so function parameter adjustment can't affect template parameters. The Standard/Working Paper doesn't need to explicitly say this, even in a footnote.

    Sorry about that.  I didn't mean to confuse you.  I should have written "explicit template __arguments__" instead of "parameters".  Since we were previously writing about the interaction between explicit template arguments for templated functions ( the A in "f<A>( a )" where f is "template<T>void f( T t )" ) and what the function parameters eventually get adjusted into ( where f<A> is adjusted to type "void ( adjusted(A) )" ), I figured that you would understand what I was writing about ... even though I slacked on the terminology.  All apologies.

    > If so, where can I look this up?

    N3225 5.3.3 [expr.sizeof]/1: "The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id."

    When 8.3.5 [dcl.fct]/5 says "After determining the type of each parameter, any parameter of type “array of T” or “function returning T” is adjusted to be “pointer to T” or “pointer to function returning T,” respectively." it's talking about function parameters.

    Because the function parameter has been adjusted to be a pointer, sizeof(t) is 4. t behaves as a pointer in every other respect (e.g. when being passed to other templates).

    The template parameter is unaffected - it is still an array type. sizeof(T) therefore returns how many bytes would be in such an array, which is 20.

    Thanks!  I completely forgot about that in C++98/03!!!

    I think that I'm starting to get a clearer picture of what is going wrong here ... and what is going right.  In order to clarify things even more, I'm trying to test something out using decltype() ... but VC++ 2010 is not cooperating.  If I have the following function template and regular function:

    template < typename T >
    void f ( T t ) {
        (void) t;
    }
    
    void g ( char s[20] ) {
        (void) s;
    }
    
    g++ 4.5.2 compiles the following decltype of the instantiated function template just fine, but I cannot get VC++ 2010 to compile the same code (I get the following error in VC++:  error C3556: 'f': incorrect argument to 'decltype'):
    decltype( f<char[20]> )
    Note: I can put regular functions in decltype in VC++ 2010 with no issues.  In other words, I __can__ compile the following in VC++ 2010:
    decltype( g )
    Is decltype fully baked in VC++ 2010?  Or are there known limitations?

    Joshua Burkholder

    P.S. - I would love more info and examples of that static_cast< function pointer >( function template name ) thing that you were writing about.

  • new2STL xkcd.com

    Related to this interesting debate about templates and functions: The Visual C++ Weekly Vol. 1 Issue 9 (Feb 26, 2011) includes a link titled Expressive C++: Fun With Function Composition (cpp-next.com/archive/2010/11/expressive-c-fun-with-function-composition/). It covers function composition in C++, in the style of the composition operator . ("dot") in Haskell.

    The examples illustrate template metaprogramming and recursion, the result_of protocol, and the Boost equivalents (for pre-C++0x compilers), and they touch on the type decay discussed by @Burkholder and @STL.

    Fun reading ;)

  • 2 things:

    • Where's the next video?
    • The conversation above is great! Keep it up. Would've missed something if the conversation were E-Mails.
  • Quick question

     

    What is the function type (i think it's called that) for this functor

    struct S_FUNC {
        void operator()(int i) const {
        }
    };
    
    S_FUNC func;
    
    guard<void (*)(int)> g(func);
    

    For some reason, vs2010 doesn't think it is "void (*)(int)"
    so if it's not that then what is it ?

     

  • STL

    Burkholder> Well, I have to say that I'm really learning a lot about C++0x through this discussion.

    Cool!

    Burkholder> Hopefully, you're getting something out of this as well ... so that I'm not just irritating you.

    Very few things irritate me - chief among them is when people are wasting my time. But when I'm explaining something and people are listening, I'm never wasting my time.

    Burkholder> I should have written "explicit template __arguments__", vice "parameters".

    Precise terminology is indeed important. It appears that this doesn't affect my response, though. Basically, you've got a function template, like "template <typename T> void f(T& r, T v)", with template parameters (like "T") and function parameters (like "T& r" and "T v"). First, this needs to be fed template arguments. It can get them implicitly (through template argument deduction) or explicitly (through explicit template arguments). Template argument deduction follows certain rules, while explicit template arguments are used as-is. After template arguments have been determined, they're plugged ("substituted") into the function template, in order to instantiate a real function. Suppose that f<int[3]>(blah, blah) has been called. In this case, T is int[3], end of line. When substituted into the signature, we get (int (&r)[3], int v[3]). The former is cool, but the latter is not, so it gets adjusted, and we end up with (int (&r)[3], int * v). Those are the function parameters that the function will use. This is what sizeof(r) and sizeof(v) see, and I claim (and VC agrees) that the same should be true for decltype.
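The f<int[3]> walkthrough can be written down as a sketch. The seen struct and check_substitution are hypothetical additions used only to report what the function sees.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical record of what f observes about its parameters.
struct seen {
    std::size_t r_size;   // sizeof(r): r is a reference to the whole array
    std::size_t v_size;   // sizeof(v): v has been adjusted to a pointer
    bool        T_is_array;
};

template <typename T>
seen f(T& r, T v) {
    (void) r;
    (void) v;
    seen s = { sizeof(r), sizeof(v), std::is_array<T>::value };
    return s;
}

// Hypothetical helper. The explicit argument is needed here: deduction alone
// would fail, because r suggests T = int[3] while v suggests T = int *.
void check_substitution() {
    int arr[3] = { 1, 2, 3 };
    seen s = f<int[3]>(arr, arr);
    assert(s.T_is_array);                    // T is int[3], end of line
    assert(s.r_size == 3 * sizeof(int));     // int (&r)[3]
    assert(s.v_size == sizeof(int *));       // int * v after adjustment
}
```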

    Burkholder> I cannot get VC++ 2010 to compile the same code

    That's clearly a bug. I've filed this as DevDiv#151929 "decltype(f<int>) emits bogus error C3556: 'f': incorrect argument to 'decltype'".

    Burkholder> Is decltype fully baked in VC++ 2010?  Or are there known limitations?

    There are bugs, but there are always bugs. So far they seem to be relatively rare and relatively minor (for example, decltype(expr1, expr2, etc, exprN) wasn't working properly, and Dinkumware wanted that badly, so we got the compiler fixed).

    Burkholder> I would love more info and examples of that static_cast< function pointer >( function template name ) thing that you were writing about.

    This is an exercise left to the reader:

    C:\Temp>type kitty.cpp
    #include <algorithm>
    #include <iostream>
    #include <ostream>
    #include <vector> // Contributes to the explosion.
    using namespace std;
    
    int main() {
        int x = 1701;
        int y = 1729;
        cout << x << ", " << y << endl;
    
        #ifdef BOOM
            swap<int>(x, y); // BUG - ARGH - DO NOT WRITE THIS CODE!
        #else
            swap(x, y); // GOOD - WRITE THIS!
        #endif
    
        cout << x << ", " << y << endl;
    }
    
    C:\Temp>cl /EHsc /nologo /W4 kitty.cpp
    kitty.cpp
    
    C:\Temp>kitty
    1701, 1729
    1729, 1701
    
    C:\Temp>cl /EHsc /nologo /W4 /DBOOM kitty.cpp
    kitty.cpp
    C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\vector(1544) : error C2825: '_Alloc': must be a class or namespace when followed by '::'
            C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\vector(1589) : see reference to class template instantiation 'std::_Vb_iter_base<_Alloc>' being compiled
            with
            [
                _Alloc=int
            ]
            kitty.cpp(13) : see reference to class template instantiation 'std::_Vb_reference<_Alloc>' being compiled
            with
            [
                _Alloc=int
            ]
    [...etc...]

    Deraynger> Where's the next video?

    In the encoding pipeline.

    Renumbered> What is the function type (i think it's called that) for this functor

    I don't know what "guard" is - I'm not supposed to use my psychic powers in public. If it's like std::function, you want "void (int)". That's a function type taking int and returning void.

  • @STL: Sorry, I thought my telepathic powers worked at that distance.

    Here's guard:

    template <typename F>
    class guard {
        F m_func;
    public:
        explicit guard(F func) : m_func(func) {}
        ~guard() {
            m_func(123);
        }
    };

  • C64

    STL wrote:

    Matt_PD> It'd be great if you could cover the STL allocators and the new C++0x alignment specifiers with STL.

    I might cover our allocator machinery. However, I can't cover C++0x features that aren't implemented in VC10!

    theDUFF> Do you think it would be possible to cover allocators at some point?

    Yes, although I'll have to think of something useful to say about them.

    It would be appreciated if you could convert the string pool allocator implemented here:

    http://blogs.msdn.com/b/oldnewthing/archive/2005/05/19/420038.aspx

    to C++/STL code (guiding us through the process in a video).

    It would be interesting and useful if you could develop a generic PoolAllocator<T>, generalizing the above string pool allocator.

    Thanks much.

  • Coconut

    It seems STL on Windows CE is lagging behind the desktop version. Any idea why?

  • Charles

    Part 2: http://channel9.msdn.com/Shows/Going+Deep/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-2-of-n

    C

  • STL

    Renumbered: You want guard<S_FUNC>.

    Coconut: That's maintained by another team, you'll have to ask them. Dinkumware and I maintain the One True STL in Visual C++, which other Microsoft toolsets (e.g. the Xbox Development Kit) are derived from.

    Please note that I'm now monitoring Advanced Part 2 for comments.

  • 张鹏

    Haha, I am the first Chinese to catch the sofa!!!
    I beg for your guidance!!

  • Ted

    Just finished watching this video on shared_ptr.
    As an experienced educator myself, I'd say well done.  As an old programmer, though, I'd have wanted something a bit more challenging.  ;-)
    Anyway, I wonder if you would comment, in a bit more detail, about how this 'type forgetting' works, particularly with regard to inheritance trees.  I have often had to deal with complex inheritance trees for modelling ecosystems (and often would have base classes representing families or genera of related organisms and derived classes representing species: so a genus class might represent canids, and from it would be derived classes representing wolves, dogs, foxes, coyotes, &c.).  All of these classes would be ultimately derived from an abstract base class (with only a small number of data members and perhaps a dozen pure virtual functions).  The simulation engine would have a std::vector containing boost::shared_ptr instances.  The most basic base class has an empty but virtual destructor to ensure that the right destructor is always used when instances are destroyed.  In a complete, but simple, model codebase, there could well be hundreds of derived classes (which will inevitably grow as more life forms get modelled), and a model that is running would have thousands of instances of these.  When a new object is created, the value returned by operator new is cast to the base class.  I rely on this, and the fact that the destructor is virtual, to ensure that each is cleaned up properly.
    If you ever saw my production code, you'd find that all pointers are immediately handed over to the most appropriate smart pointer the instant operator new returns it.  Except way back when I first started using C++, you would never find a naked pointer in my code.
    This is the context, and why I try to encourage my junior colleagues to develop a habit of making a virtual destructor whenever they find themselves adding a virtual member function to whatever class they've been assigned to write.
    My first question is how would your method of 'type forgetting' fit into the context I often face.  And my second is like the first:  Why?  Or what does this type forgetting provide that I don't already have with virtual destructors and a vector of boost::shared_ptr  (I am not eccentric enough to even consider using malloc/free in my C++ code, so your example left me wanting more).
    My last question relates to habits to be encouraged among junior programmers being mentored by old fossils like me.  As I said, I encourage kids to add public virtual destructors whenever they add virtual member functions to a class.  But I have been told, recently, by equally old programmers that it is better to encourage a habit of making destructors protected and non-virtual.  What I have not been able to get from these guys is an explanation of what significant downside there may be from the habit I encourage or what the upside is for the practice they recommend.
    I am not omniscient, so I will acknowledge there's plenty I don't know, but at the same time, I don't do things just because I can, but rather because there is a demonstrable benefit for doing it.  My objective is always stable, fast and correct production code.
    Can you contribute to my education on this matter?
    Thanks
    Ted

  • @Ted:

    Hey Ted, you should either comment in Part 2 (or soon Part 3) or e-mail him. I am not sure, but he mentioned his e-mail address in a previous post (on another video).

    I'll try to answer the question about the protected/public virtual dtors. I am quite a novice, so please correct me (anyone) if I say something wrong.

    If you have an (obviously public) virtual dtor, you can destroy the derived class through the base class pointer:

    CBase* b = new CDerived();
    delete b;

    Calls ctor/dtors like this:

    CBase()

    CDerived()

    ~CDerived()

    ~CBase()

    With a protected dtor, the delete won't be possible through the base class pointer (from code outside the class hierarchy).

    CBase* b = new CDerived();
    delete b; // Won't compile! 

    Assignment will work:

    CDerived* d = new CDerived();
    CBase* b = d; // Works
    // delete b; // Won't compile!
    delete d; // Works

    I don't know whether this was suggested because the deletion is perhaps being managed from elsewhere. Another reason (I'm not sure about this, though) could be that the vtable will be bigger if you have a virtual dtor.

    Creating and destroying the derived class will work in both cases (public virtual dtor or protected dtor) as usual:

    CDerived* d = new CDerived();
    delete d;

    Calls ctor/dtors like this:

    CBase()

    CDerived()

    ~CDerived()

    ~CBase()

     

    Hope that was of any help, and hope even more that it's all correct as I explained it.

  • Ted

    Thanks Deraynger,
    You have it right, as far as you go.
    It is true that having a virtual destructor increases the memory consumed by vtable by the size of one pointer, but that is hardly significant on a machine with 8 GB RAM, and pointers to objects that can consume several kilobytes.
    The problem that the recommendation of making destructors protected creates is that it becomes impossible to make a std::vector<boost::shared_ptr<Base> > instance that holds instances of boost::shared_ptr<Derived>, so that all objects held by the vector are properly deleted.
    boost::shared_ptr<Derived1> is a different type from boost::shared_ptr<Derived2>.
    If you have only two derived types, this isn't much of a problem as you can have two vectors; one for each derived type.  But if you have thousands of derived types, it becomes a nightmare.  Having a public virtual destructor guarantees that one vector containing instances of all derived types through pointers to the base class is sufficient.
    But in the context of an event driven application where the user can set up a simulation by adding, modifying or removing instances of the derived classes, there are numerous places where these instances can be either created or deleted.  THAT is a second reason why it is so useful to have all instances of the derived classes managed in instances of shared_ptr containing pointers to the base class.
    If I understood Stephan correctly, the latest incarnation(s) of shared_ptr provides a way to make the base class destructor protected and still store pointers to the derived class in shared_ptr, in turn in a std::vector, and still have things properly deleted.  But what I don't see is what benefit this extra magic provides.
    Thanks
    Ted

  • @Ted:

    Ok, I get it now, and great that I got my part right (though you seem to know it all already, and better than me ;-) )

    Regarding a shared_ptr of a base class with a protected dtor holding instances of derived classes, I have no idea; you'll have to ask STL (e-mail or comment in the newest video thread: http://channel9.msdn.com/Shows/Going+Deep/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-3-of-n). All I can think of is that maybe it doesn't use a vtable (I'm not sure of the implementation of shared_ptr), and maybe the only benefit is that it won't leave a memory leak, as opposed to not being able to delete the derived class :-D

    Ray
