Have you tried running a binary for Windows compiled with GCC?
Or the VC-built version under Wine on Ubuntu?
No. How would I go about doing that?
By the way, I reran the tests on Windows in single-threaded mode, with the processor affinity set so that the process would not jump from core to core, and started my program from Visual Studio without debugging. The numbers after doing that were 5 seconds for
0 - 2281, 17 seconds for 0 - 3217, 12 seconds for 2281 - 3217, and 20 seconds for M21701. These numbers look even worse for Windows (the previous ones were slanted in Windows' favor, as they were taken with the multithreaded version), but it is an apples to