    @Dexter: The code in my post was just an example of compiler optimization, showing how a good template class should work.

    Both your code and Burkholder's code have significant overhead.
    Is there a way to minimize the overhead, as in my example?