Sometimes i do trust the compiler a bit too much thinking its smart enough to
understand what i'm doing when it can be really stupid and stubborn sometimes.

Here's version 3:

#include <algorithm>
#include <vector>
#include <iostream>
#include <iterator>

using namespace std;

template <typename FwdIt, typename OutIt, typename Predicate>
inline FwdIt move_if(FwdIt first, FwdIt last, OutIt out, Predicate pred)
{
    first = find_if(first, last, pred);
    if (first == last)
        return first; // Nothing to move.

    for (auto i = first; i != last; ++i)
    {
        if (pred(*i)) // Should it move it ?
            *out++ = move(*i); // Move it.
        else
            *first++ = move(*i);
    }
    return first;
}

int main()
{
    vector<int> v(10);
    generate(v.begin(), v.end(), [] () -> int {
        static size_t i = 1;
        return i++ * 11;
    });

    auto is_even = [](const int i){return (i % 2)==0;};
    
    //const size_t size = count_if(v.cbegin(), v.cend(), is_even);
    //vector<int> n(size);
    //auto new_end = move_if(v.begin(), v.end(), n.begin(), is_even);

    // This is slower then the code above when 'vector<int> v(10000);'
    // Naively timed test only using QueryPerformanceCounter
    // Is this what you meant ?
    vector<int> n;
    n.reserve(v.size());
    auto new_end = move_if(v.begin(), v.end(), back_inserter(n), is_even);

    v.erase(new_end, v.end());

    auto print_vec_int = [](const vector<int> &v) {
        cout << "[" << v.size() << "]";
        for_each(v.cbegin(), v.cend(), [](const int i) {
            cout << " " << i;
        });
        cout << endl;
    };

    print_vec_int(v);
    print_vec_int(n);
}

/* Output:
[5] 11 33 55 77 99
[5] 22 44 66 88 110
*/



When ever i see 'back_inserter' i think of performance loss. Is that incorrect ?

What do you think about that 'find_if' "optimization" ?

5. I know its bad to write code when tired but coding just flows when tired. Oh the dilemma Perplexed

6. Nice catch. Btw how big of an issue is that ? Cant the compiler optimize that away ?

7. I'm not used to doing that kind of math but i understand what you mean.
Naively i just run a few tests and see which runs the fastest.
The only goal i have is to make it a performant and correct as possible.
Learning something new in the process is always fun too.
"reallocation overhead (which usually isn't a big deal)"
That surprised me a bit. Isn't (&&) and std::move's main reason for existing just
because of reducing reallocation overhead ?
Now i'm a bit confused. When writing algorithms shouldn't you have reallocation
overhead in mind or are there some simple guidelines you can follow ?

8. See code, please.

9. I wasn't really concentrating on optimizing the debug print code but i would have
changed it eventually anyway.
Performance issues bug me until i fix them.

10. But it will not hurt performance right or is it bad in some way ?

"Yes, that was intentional."
Explain more, please. Explain your thinking.
If the algorithm didnt clean up after itself then the programer would need to do extra work
Good example here would be 'remove_if' algorithm
Also it would leave the input container in a less performant state.
That is using it as is would eat extra cpu cycles.
 
How big of an issue is it really that move_if use forward iterators ?
 
I'm beginning to understand the many decisions you must be making working on the STL
for every single function. Your C++ kung fu must indeed be strong ;)
How do you know when to stop: optimizing / correction cleaning and move on to the next thing ?