GordonBGood

Niner since 2011


  • C++ and Beyond 2011: Herb Sutter - Why C++?

    Charles wrote:

    @GordonBGood: C++ is a widely used general purpose programming language that is at a level of maturity in C++11 that makes it more productive than you're giving it credit for. As Herb states, he and the standards folks need to put more emphasis on libraries, standard libraries for C++. Of course, this means they are, they will. A native BCL doesn't make much sense, but a set of portable, standard, efficient (of course, this is C++!) domain-specific libraries certainly does.

    What is the C++ Renaissance in your opinion? How would you define it? I'm not sure it's the same as we, Microsoft, intended. Please elaborate.

    I'll grant you everything you say in the first paragraph.  C++11 indeed is much easier to use than any previous C++ standard, and from the Windows 8 Developer Preview and Visual Studio Developer Preview we can see that progress is being made on developing libraries to make it easier to do most of the things that one can already do very easily in equivalent generations of C# (or now VB).

    As per Herb's talk here, Microsoft's view seems to be that C++ is making a comeback or "Renaissance" in that it will become the programming language of choice for almost all applications programming due to the need for "more bang for the watt".  At least some of his extrapolations are flawed, such as the claim that mobile device application programming is moving to C++.  As I stated in a previous post, Apple has always been pro native language programming, and for the other (two) major programming environment providers (other than Microsoft), C++ is at most a somewhat poorly supported secondary programming environment as compared to the "Coffee type" languages, meant more to be called through p/invoke (actually JNI) calls.  Thus, there is no trend or movement to C++ there.  As to use for server code, the point is likely well made that there is a requirement for the higher efficiency of C++, although ASP.NET might still be desired for its more tightly controlled security model.

    Where there is a movement toward better native C++ support or a "C++ Renaissance" is within Microsoft itself, with native code C++ support in Visual Studio now making it feel like a first class language as compared to a distant cousin.  This goes hand in hand with the support for WinRT, which of course is actually a native code implementation under the covers.  With this movement, C++/CLI now feels like a distant cousin without much of a future, as it is one of the languages without a formal connection to the new WinRT "Metro style" apps.

    The view expressed in this video seems to be that higher and higher percentages of application programming will be done in native C++ over the next ten years due to this need for more "bang for the watt".  I don't see that happening.  As always, one will choose the appropriate language for the job.  For faster development times where maximum performance isn't crucial, languages like C# (or even the quite low performance F#) will be chosen; where there are limitations in a VM environment, either a "pure" C++ development will be chosen or one will use a hybrid of VM languages calling more efficient libraries written in C++ as required to overcome those limitations or for maximum performance.  Nothing has changed, other than that C++, especially for Microsoft environments, has been made a little easier to use.  If Microsoft really believed that most programming for the future needs to be done in C++, then those other languages would be de-emphasized and ways of porting existing code to native would be offered.

    So that's my view:  Thank you Microsoft for an improved C++(11) including conformance to the new standard(s) and libraries that support previously sorely lacking features such as new asynchronous programming models and even advanced concepts such as AMP; however, even with all of this it isn't going to replace C# (and maybe VB) where programming productivity matters.  As to other programming environments, one does what one must.  I will likely (reluctantly) develop an Android app completely in C/C++ in order to overcome some limitations of the Android Java environment restricting memory use, but the majority of applications will continue to be written in Java in spite of the potential performance hit; for other applications such as games, C/C++ will be chosen over Java for performance.  I would think that the same decisions would be made when developing applications for Windows tablets or for a new version of Microsoft Phone (assuming it might support writing apps in native code), with a decision to use native C++ eased somewhat by the fact that Microsoft has provided a much easier to use and more powerful set of C++ development tools for Microsoft products than those provided with Android's NDK.


  • C++ and Beyond 2011: Herb Sutter - Why C++?

    The Schickster wrote:

    Okay so I'm sitting here wondering, after having watched the video and having read most of the comments, why doesn't Microsoft rewrite all of those OS libraries that were written in managed code in C++? Why not write the .NET framework in C++ while keeping the APIs as they are? Then the only meta data that is needed is for the .NET app itself, not the entire framework.

    If you have developer productivity managed apps that are running as a thin layer on top highly optimized C++ doesn't everybody win? What am I missing?

    I think you are missing that the entire Win32 underpinning of Windows still is and has always been native code (C++), and that everything new in Windows 8, as per the WinRT libraries that support the Metro applications, is also native code with a thin covering of metadata so that any of the supported languages can call through to it, including the managed ones.  The only libraries that are not in native code are the DotNet Framework ones, and as most of those call through to the Win32 API as required, there would be no real performance advantage in re-writing them as was done for WinRT.  In fact, there is a performance disadvantage to repeatedly calling through to native code implemented as for WinRT:  its methods are modified COM interface methods, which are artificial to managed languages and cost performance when repeatedly crossing the gap between managed and unmanaged code.  Also, some of the most crucial DotNet libraries are actually in native code format, as they have been ngen installed into the Global Assembly Cache.

    Managed languages are slower than native code mostly because of the delays of JIT compilation (often negligible but sometimes very onerous, as for complex unrolled state machine operations) and the limits on the optimization that can be done by such "on-the-fly" compilation as compared to the multi-pass compilation and in-depth optimization of a native code compiler such as Microsoft's C++ compiler.  As well, the managed system has the overhead of managing the managed heap (mostly garbage collection).

    As one who has fairly frequently converted code between C# and C++, a more interesting concept would be to offer a more efficient multi-pass native code compiler for C# code.  This partly already exists as the ngen capability of DotNet, which generates special pre-compiled files for use as shared system files, but it is not likely as efficient as native code from the C++ compiler.  What I observe is that in many ways the external capabilities of C++11 and C# have become very similar:  generics are a somewhat different but similar-in-use form of templates, C++ smart pointers replace the need for a managed heap by providing automatic reference counting and release, the compiler could be set to "safe" or "unsafe" mode as to checking array bounds, pointers or pointer type conversions, etcetera.  This is made more and more necessary because both are now forced to support the same common library, WinRT.  In fact, C++ still lags the managed languages in full support of futures (i.e. Promises, Tasks, whatever) and asynchronous programming models; for instance, even some of the BUILD conference async demos don't currently run on the Windows 8 Developer Preview (and will require major work on the underlying libraries), whereas almost all of the managed code ones do (one with a very slight modification but using existing libraries).  In effect, I am proposing the reverse of the C++ Renaissance:  rather than adapting the C++ language to conform more to modern language forms and syntax, make C# in native mode produce code that is identical to what would be produced by C++, with an unmanaged heap, automatic reference counting under the covers, and so on.
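
    To make the smart pointer point concrete, here is a minimal C++11 sketch of automatic reference counting and release with std::shared_ptr.  The Buffer type, the Counts struct and the demo_ref_counting function are hypothetical names made up for this illustration; they are not from any library or from the talk.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type used only for this illustration.
struct Buffer {
    static int live;          // number of Buffer objects currently alive
    Buffer()  { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Observations collected by demo_ref_counting().
struct Counts {
    long owners_inside;  // use_count() while two shared_ptrs own the object
    int  live_inside;    // live objects after one owner has gone away
    int  live_after;     // live objects after the last owner has gone away
};

Counts demo_ref_counting() {
    Counts c{};
    {
        std::shared_ptr<Buffer> a = std::make_shared<Buffer>();
        assert(a.use_count() == 1);          // single owner so far
        {
            std::shared_ptr<Buffer> b = a;   // second owner, count becomes 2
            c.owners_inside = a.use_count();
        }                                    // b destroyed, count back to 1
        c.live_inside = Buffer::live;        // object is still alive
    }                                        // last owner destroyed, destructor runs
    c.live_after = Buffer::live;
    return c;
}
```

    No garbage collector is involved:  the object is released deterministically at the closing brace where the last owner goes out of scope.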

    The fact that C++ produces much higher performance code is then offset by the fact that for production programming, it is much harder to write bug free code in C++ than in C# (or VB for those who still like that syntax).  To say that C++ is undergoing a Renaissance because the only way we will be able to write code that gets more done per watt is by all of us writing C++ code is stretching things!  Come now, Herb!  With properly written programs and apps that use good asynchronous programming models (note that C++ still doesn't support that very well!), much of the time in the UI front end of Win8 Metro apps is spent doing nothing, as the work has already been farmed off to another thread, which probably has as much if not more to do with the power consumed by the actual front end of the apps than the language does.  Then, let C++ get on with those parts at which it excels, as in crunching the numbers, running the tight loops, and so on.

  • C++ and Beyond 2011: Herb Sutter - Why C++?


    If the performance is the only criteria which matter in programming, then we will have "Assembler Renaissance" after that "Machine Code Renaissance" and probably "Microcode Renaissance".

    I agree with your implication that this wasn't one of Herb Sutter's best presentations (his two presentations at the BUILD conference were excellent, but he didn't dwell on the "C++ Renaissance"), and that the real news is just the re-entry (or rather acceptance) of C++ native code compilers as one of the first class supported languages of Windows 8, specifically Metro.  However, I have done a lot of tests comparing VS C++ native code generation with what I can do by hand generating assembly language code.  If one doesn't miss a single trick in the "book", one can sometimes squeeze out an extra 10% to 20% for very tight loop performance (if one does miss a trick, it can sometimes go the other way).  I don't think any of us want to go back to that, or even lower to microcode, for such small gains except for very small portions of critical code.

    However, there is an interesting new capability that is (currently) only included with the C++ native code platform:  the Accelerated Massive Parallelism (AMP) code generation that can run computations using the Graphics Processing Unit.  Microsoft will offer this as a standard so that eventually it will be adopted by other compilers and (hopefully) by the ISO standards committee.  This seems to be a better solution than the RenderScript that Android has come up with under Java.  It could be very interesting for computationally intensive applications, and could make much more of a performance difference than just some tens of percent.

    If successful, I suppose that this might get ported to other languages.

  • C++ and Beyond 2011: Herb Sutter - Why C++?


    @dave and @gordon. Interesting argument over which is best C++ or C#. Many might think, just use the one you like where you like. A possibly more interesting question would then be: can you actually do that now and in the future? When you pick a favourite language, can it actually be used where you like for what you like, or will some big company make that too hard for you or limit you in self serving ways? If that were to happen to your favourite language it wouldn't matter which one was best would it? What's your thoughts on that?

    I suppose an example of what you describe is that it was harder to write C++ native code including interfaces to window controls with Visual Studio 2010 than it was to do much of that work using C# or another CLI language.  With Microsoft's C++ "Renaissance", this has been fixed, at least for Windows 8 style Metro apps.  Another example is that it is much harder to use C/C++ native code in Android than it is just to code with Java.

    When it is more difficult to code, one uses the easiest tools at hand to get the job done, but if those tools can't get the job done one then looks for whatever it takes to do it.  Previous to this "Renaissance", one called C++ code as necessary from DotNet CLI.  Now we have the option to code an entire metro application in C++, but as I find C# easier to code, I will continue to stick with it for most apps unless I need performance, in which case I will call C++ native code for those critical portions, just as I do for Android.  That is the approach of most programmers except for those very familiar with C++.

  • C++ and Beyond 2011: Herb Sutter - Why C++?


    If you try it with an array of longs, the unsafe code is always more than twice as fast as the safe one.

    You are correct that the unsafe code is much faster than the safe code when dealing with eight byte data rather than one byte data.  This is because the JIT compiler isn't very optimized for dealing with these, whereas the pointer arithmetic uses the built in machine instructions directly.

    However, the real reason for this argument is your feeling that "unsafe C# code can be just as fast as C++ code".  It's not.  Coding just the accumulation of an array of longs, the unsafe C# version takes 71 milliseconds on my machine in 64 bit Release mode with Optimization On; C++ takes 31 milliseconds to do the same job.

    This is because of the differences in the generated native code:

    The C# generated native code has 11 machine instructions in its inner loop, of which 8 read or write memory.  The C++ compiled native code has 7 machine instructions, of which four are memory reads, per four loop iterations, as the compiler automatically unrolls the loop to that extent; that is an average of one memory read and 0.75 other register based instructions per value added.  Interestingly, the pointer version of the loop actually takes longer in C++, and even longer than that for 32 bit code; however, the very slowest implementation of the C++ code is never slower than the very best of the C# code.
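
    As a rough picture of what that automatic unrolling amounts to, here is a hand-written C++ sketch of the same idea.  This is my own illustration, not the compiler's actual output; the function names sum_simple and sum_unrolled4 are made up, and an optimizing compiler may well unroll or vectorize the simple form on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain form: one add, one compare and one branch per element.
std::int64_t sum_simple(const std::vector<std::int64_t>& v) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Unrolled by four: the loop overhead (increment, compare, branch)
// is paid once per four elements instead of once per element.
std::int64_t sum_unrolled4(const std::vector<std::int64_t>& v) {
    const std::size_t n  = v.size();
    const std::size_t n4 = n - (n % 4);   // largest multiple of 4 not above n
    assert(n4 <= n);
    std::int64_t total = 0;
    std::size_t i = 0;
    for (; i < n4; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; ++i) total += v[i];     // leftover 0 to 3 elements
    return total;
}
```

    Both functions return the same sum; only the amount of loop bookkeeping per element differs.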

    And that was the point I was trying to get across:  that there are many cases where C# will not perform quite as well as C++ by a factor of two or a little more, and in these cases one should likely either write in C++ in the first place or p/invoke the C++ code from C# for those critical parts.  One does not get these performance gains by just writing C# unsafe code using pointers.

    EDIT ADD (next day):  I've been thinking about this matter of C# performance as compared to C++ native code a little more and do concede that using unsafe context can speed up code in some but not all situations.  To take this a little further, rather than just do the trivial task of summing an array, I decided to look at using a Look Up Table (LUT) array as is often used to do fast data transforms, in this case to count the number of bits in the array.

    The LUT was generated as follows:

    byte[] LUT = new byte[256];
    for (int i = 0; i < LUT.Length; ++i) {
      byte n = 0;
      // count the set bits of i by shifting them out one at a time
      for (int j = i; j != 0; j >>= 1) if ((j & 1) != 0) ++n;
      LUT[i] = n;
    }


    Now, calculate the sum of the bits in the byte[] array as follows:

    fixed (byte* ptr = buf) {
      // manually unrolled by four; assumes buf.Length is a multiple of 4
      for (byte* j = ptr, lim = ptr + buf.Length; j < lim; j+=4) {
        cnt += LUT[j[0]];
        cnt += LUT[j[1]];
        cnt += LUT[j[2]];
        cnt += LUT[j[3]];
      }
    }

    Fully optimized in 64 bit Release mode, this code takes 97 milliseconds on my machine, whereas the same algorithm in C++ takes 75 milliseconds.  While the above pointer based code is much faster than "safe" code because of the elimination of the multiple unavoidable array bounds checks, it still isn't as fast as the highly compiler optimized C++ code.  BTW and interestingly, using pointers for the LUT reference in the above manually unrolled loop actually takes about 10% longer than the above code.  Also, automatic array bounds checks are only eliminated if the variables are local non-static variables, which means that they would never be eliminated for your original console application, where the tests are done inside the static Main method of the Program class.

    In short, use the appropriate language and language implementation for the job.  Although it performs very well for a "Coffee" style language, Microsoft's implementation of C# can't quite keep up in performance with Microsoft's implementation of C++ no matter what optimizations are made, if the same optimizations are allowed for both languages.  However, you have made me reconsider testing whether using the C# unsafe context will get some code "close enough" to C++ that it isn't worth the bother of using C++ for an additional 30% gain in performance.
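
    For comparison, a C++ version of the same look up table algorithm might look like the sketch below.  It is not the exact code I timed; the names make_bit_count_lut and count_bits are made up for this post, and unlike the C# fragment it handles buffer lengths that are not a multiple of four.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the 256 entry table: lut[i] is the number of set bits in i.
std::array<std::uint8_t, 256> make_bit_count_lut() {
    std::array<std::uint8_t, 256> lut{};
    for (int i = 0; i < 256; ++i) {
        std::uint8_t n = 0;
        for (int j = i; j != 0; j >>= 1)
            if (j & 1) ++n;
        lut[static_cast<std::size_t>(i)] = n;
    }
    return lut;
}

// Counts the set bits in buf using the table, manually unrolled by four.
std::uint64_t count_bits(const std::vector<std::uint8_t>& buf) {
    static const std::array<std::uint8_t, 256> lut = make_bit_count_lut();
    const std::uint8_t* p = buf.data();
    const std::size_t n  = buf.size();
    const std::size_t n4 = n - (n % 4);
    assert(n4 <= n);
    std::uint64_t cnt = 0;
    std::size_t i = 0;
    for (; i < n4; i += 4) {
        cnt += lut[p[i]];
        cnt += lut[p[i + 1]];
        cnt += lut[p[i + 2]];
        cnt += lut[p[i + 3]];
    }
    for (; i < n; ++i) cnt += lut[p[i]];   // leftover 0 to 3 bytes
    return cnt;
}
```

    The table lookup replaces a per-bit loop with one indexed read per byte, which is why this kind of transform is sensitive to how well the compiler schedules memory accesses.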


  • C++ and Beyond 2011: Herb Sutter - Why C++?


    It was obvious to me right after reading your comment, that you messed up some settings with the project. Either compiled Debug version instead of Release, or somehow explicitly set the project not to optimize code.

    Regardless, did the test, and _never_ saw unsafe code run slower than safe one.

    Here's the code.

    It's actually pretty easy for the compiler to bypass range checking here even for safe code, and it sometimes does it, sometimes doesn't.
    On my machine, the 2nd and 3rd calculations run at the exact same speed, ~48ms, while the 1st one runs between around 48ms-90ms. Never faster than the unsafe ones. (the JIT usually managed to optimize it right after startup and at times when I made it run continuously by holding the enter)

    Come now, Dave, I do know the difference between Debug and Release settings and to check that the compiler Optimize Code switch was on.  However, I was running on a 64 bit machine and there is a difference (both for Release mode), as for your example code:

    For 64 bit code, all routines, whether the safe loop outside or inside the unsafe context or the unsafe loop using pointers, ran at about the same speed of about 71 milliseconds on my 2.3 GHz machine.

    When forced to 32 bit x86 code, both the safe loop inside the unsafe context and the unsafe pointer loop ran at about the same speed as above, but the safe loop outside the unsafe context ran slower at about 109 milliseconds.  Inspection of the code reveals this is because the safe loop outside the unsafe context was not optimized as well and was using more memory operations rather than register operations, which is also why the 64 bit version of the loop outside the unsafe context runs faster:  it uses register operations in both cases.

    Inspection of the code shows that none of the "safe" loops eliminate buffer range checking, but that isn't the biggest consumption of machine cycles; rather it is the question of how well ordered the instructions are and how efficiently they use registers.  As to why one optimization is different than another, it is likely the same reason why the array bounds checking is sometimes eliminated and sometimes not:  a question of what scope the various variables have.  Range checking is only eliminated when all variables have the right scope and the loop has exactly the right form.

    I didn't say that the unsafe code would run slower than the safe code, just that you won't get the gains you think you will as compared to C++, and often the speed is just about the same.

  • C++ and Beyond 2011: Herb Sutter - Why C++?

    @Dave wrote:


    "Sorry, Dave, although you can use unsafe code context and thus use pointers in C# without range checking, there is (by design) no performance gain in doing this"


    Well, I don't know if they designed it with performance in mind, but I know that IN PRACTICE you can achieve huge performance benefits using unsafe context in the right places.

    Even the MSDN documentation says:

    "As examples, using an unsafe context to allow pointers is warranted by the following cases:

    ....Performance-critical code"


    Dave, I only know what native code the compiler generates as one can view with Visual Studio (even the Developer Express VS - view Assembly code after a breakpoint Alt-8) and the actual measured results, both with optimization turned on in full release mode, as follows:

    1)  The following classic loop summing the contents of an array takes about 245 milliseconds to run 10,000 times on my machine with an 8K buffer array, much of which time is revealed to be used by array bounds checking (the cnt and buf variables are both local to the scope of the loop):

    for (int j = 0; j < buf.Length; j++) cnt += buf[j];

    2) The following code enclosed in an unsafe block with a fixed block pinning the array to the byte pointer "ptr" runs in about 285 milliseconds for exactly the same conditions:

    for (byte* j = ptr, lim = ptr + SIZE; j < lim; j++) cnt += *j;

    Inspection of the native code produced by the JIT compiler reveals that although there is no array bounds checking for this unsafe code, the code is quite poorly optimized with repeated reloading of contents of registers even when they already contain the right data.

    I had already read this to be the case:  that less optimization is done for unsafe code than usual.  I confirmed the same results with both Visual Studio 2010 and the new Developer release of VS 11.

    Although MSDN is correct on the first two reasons for using unsafe mode, they are wrong that one uses it for performance.

    This makes sense as if the differences in performance were so extreme, programmers would be tempted to use unsafe mode as a matter of course by default.  Basically, one uses it when one can't get the job done another way or where the other way would be inefficient.  There may be situations where the use of unsafe code and pointers does aid in performance gains, but tight loops certainly isn't one of them as demonstrated above.

    C++ compiled with full optimization to release runs the first loop in 42 milliseconds and the second style in 62 milliseconds, both due to very efficient loop optimization.  The second form suffers from implementing the extra indirection through pointers, with both forms unrolling the loop by a small amount.  Unrolling the loop doesn't help C# in the first case because most of the time is still spent in array bounds checking, and in the second case because the pointers are reloaded into the registers for every access.  In actual practice, both forms are very close in time for C++ but very much optimized as compared to the C# forms, with small further gains if one manually unrolls the loops further.  Remember, the C++ code is running an average of just over one machine clock cycle per memory access, so very little additional optimization can be done no matter the form!

    In conclusion, C# runs very tight loops many times slower than C++, partly due to the required array bounds checking but also due to the C# JIT compiler not optimizing as highly.  In practical use, however, the same fast algorithms in C# run about two times slower on average than in C++, as not all of the processing time can be boiled down to such trivial tight loops as this.  As I said before, if one requires the very utmost in performance, use or call C++, but that comes at the cost of more "unsafe" code.
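
    For anyone who wants to repeat the C++ side of this comparison, the two loop styles can be sketched as below.  This is a reconstruction for illustration, not the exact code I timed; the names sum_indexed, sum_pointer and time_ms are made up, and the timings you get will depend on compiler, flags and machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// First style: indexed access, like the C# array loop.
std::uint64_t sum_indexed(const std::vector<std::uint8_t>& buf) {
    std::uint64_t cnt = 0;
    for (std::size_t j = 0; j < buf.size(); ++j) cnt += buf[j];
    return cnt;
}

// Second style: pointer walk, like the C# unsafe loop.
std::uint64_t sum_pointer(const std::vector<std::uint8_t>& buf) {
    std::uint64_t cnt = 0;
    const std::uint8_t* lim = buf.data() + buf.size();
    for (const std::uint8_t* j = buf.data(); j < lim; ++j) cnt += *j;
    return cnt;
}

// Runs f(buf) reps times and returns the elapsed milliseconds.
// Results are accumulated into sink so the work cannot be optimized away.
template <typename F>
double time_ms(F f, const std::vector<std::uint8_t>& buf, int reps,
               std::uint64_t& sink) {
    assert(reps >= 0);
    const auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) sink += f(buf);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

    Calling time_ms(sum_indexed, buf, 10000, sink) and time_ms(sum_pointer, buf, 10000, sink) on an 8K buffer mirrors the test conditions described above.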


  • C++ and Beyond 2011: Herb Sutter - Why C++?

    @Dave wrote:

    First off, you can use unsafe context in C#, which will remove the performance overhead from array-range checks, enables you to use (limited-) pointers, and –if used correctly- removes most overhead from automatic memory management.

    Sorry, Dave, although you can use unsafe code context and thus use pointers in C# without range checking, there is (by design) no performance gain in doing this due to there being less code optimization done in unsafe mode:  it's about a wash in performance between the techniques.

    Unsafe mode is there so that one can do things that might not be possible from normal safe mode, but the reason is never for performance gains else many programmers would be tempted to use it as a matter of course.

    But of course we have the option you raised in your previous post of p/invoking a native module written in a language such as C++ to do the performance intensive tasks, with the parts where performance isn't generally an issue, such as the UI, more easily written in C#.
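
    As a sketch of what the native side of such a p/invoke arrangement could look like, here is a C++ function exported with C linkage.  The function name sum_bytes, the library name native_sum and the C# declaration shown in the comment are all assumptions for illustration, not anything taken from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Export with C linkage so the name is not mangled and can be found
// by name from managed code.
#if defined(_WIN32)
#  define NATIVE_EXPORT extern "C" __declspec(dllexport)
#else
#  define NATIVE_EXPORT extern "C"
#endif

// Hypothetical native entry point: sums len bytes starting at buf.
//
// An assumed matching C# declaration would be along these lines:
//   [DllImport("native_sum", CallingConvention = CallingConvention.Cdecl)]
//   static extern long sum_bytes(byte[] buf, int len);
NATIVE_EXPORT std::int64_t sum_bytes(const std::uint8_t* buf, std::int32_t len) {
    assert(len >= 0);
    std::int64_t total = 0;
    for (std::int32_t i = 0; i < len; ++i) total += buf[i];
    return total;
}
```

    Each call crosses the managed to unmanaged boundary, so it pays to hand the native side a whole buffer per call rather than calling once per element.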

  • C++ and Beyond 2011: Herb Sutter - Why C++?

    @Dave wrote:

    To all those people bashing the performance of C#:

    It’s true, most of the features of the C# language aren’t made with performance in mind as the first priority, normally code quality precedes it, and it makes them less suited for performance-critical code, you’re not forced to use them.

    Even for most high-performance applications, in practice, a relatively little share of the codebase is responsible for the actual performance.
    If you need to write a highly efficient application, in C#, you can write the majority of the code in high quality, and resort to low-quality, high-performance code in the few critical sections. In C++, the quality of performance-critical sections is better (need to make much lower trade-offs in quality), but the quality of the majority of the codebase is lower.

    Agreed, but it should be pointed out that, using the same algorithms, C# code will be up to about 2 to 2.5 times slower, and mostly only for very tight loops of a few instructions or so.  If one looks at the generated native code, most of that loss is due to the compulsory array bounds checking, with a much smaller share (perhaps a factor of 1.1 to 1.2) due to the better efficiency of the C++ compiler.  The array bounds checking is necessary for safe code, so one has a choice:  safe code with C# or, when highly optimized, the faster but unsafe code of C++.
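
    The same trade-off can be seen inside C++ itself, which makes for a compact illustration:  std::vector::at checks the index on every access and throws on an overrun, while operator[] does no check at all.  The function names below are made up for this sketch; it shows the safety difference only and makes no claim about timings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked access: every v.at(i) compares i against v.size()
// and throws std::out_of_range on an overrun.
long sum_checked(const std::vector<int>& v, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += v.at(i);
    return total;
}

// Unchecked access: no per-element test, so the caller must
// guarantee count <= v.size() or the behavior is undefined.
long sum_unchecked(const std::vector<int>& v, std::size_t count) {
    assert(count <= v.size());   // active only in builds without NDEBUG
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += v[i];
    return total;
}

// Returns true if reading one element past the end is caught
// by the checked version.
bool overrun_is_caught(const std::vector<int>& v) {
    try {
        sum_checked(v, v.size() + 1);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

    The checked form is the closest C++ analogue of what safe C# does on every array access; the unchecked form is what optimized C++ normally runs.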

    For instance I wrote a highly optimized benchmark using C that could calculate the number of primes in the 32 bit number range in about a second using one core with most of the time spent culling composite numbers at about 2.5 machine cycles per cull.  The C# version using exactly the same array based algorithm respecting the cache sizes of the CPU took about 6 machine cycles per cull.  Upon inspection of the generated code, the difference was primarily the C# array bounds check, which could not be written around and still use the highly efficient algorithm.

    It should be noted that the original C program from which I derived the work took about 50 times as long as the final C version, and therefore about 25 times longer than the final C# version, because it did not respect the size of the cache and just used a huge linear array rather than paging the work and bit compressing the prime candidates.  The point is that in any language, well written code using better algorithms for the specific purpose is more important than the choice of language.
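
    To show what "paging the work" means, here is a much simplified segmented sieve sketch in C++.  It is not my benchmark program:  it uses one byte per candidate instead of bit compression, does no other tuning, and the function name count_primes and the default 32K segment size are assumptions for illustration.  It also assumes limit is small enough that limit plus the segment size does not overflow 64 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the primes <= limit by sieving one cache sized segment at a time.
std::uint64_t count_primes(std::uint64_t limit,
                           std::size_t segment_size = 32 * 1024) {
    assert(segment_size > 0);
    if (limit < 2) return 0;

    // Step 1: find the base primes up to sqrt(limit) with a plain sieve.
    std::uint64_t root = 1;
    while ((root + 1) * (root + 1) <= limit) ++root;
    std::vector<bool> small(static_cast<std::size_t>(root) + 1, true);
    std::vector<std::uint64_t> base;
    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!small[static_cast<std::size_t>(p)]) continue;
        base.push_back(p);
        for (std::uint64_t m = p * p; m <= root; m += p)
            small[static_cast<std::size_t>(m)] = false;
    }

    // Step 2: sieve the range [2, limit] in segments [low, high] that are
    // small enough to stay in cache while every base prime is applied.
    std::uint64_t count = 0;
    std::vector<std::uint8_t> seg(segment_size);
    for (std::uint64_t low = 2; low <= limit; low += segment_size) {
        const std::uint64_t high =
            std::min<std::uint64_t>(low + segment_size - 1, limit);
        std::fill(seg.begin(), seg.end(), std::uint8_t{1});
        for (std::uint64_t p : base) {
            if (p * p > high) break;
            // First multiple of p in the segment, but never below p * p.
            const std::uint64_t first = (low + p - 1) / p * p;
            for (std::uint64_t m = std::max(p * p, first); m <= high; m += p)
                seg[static_cast<std::size_t>(m - low)] = 0;
        }
        for (std::uint64_t n = low; n <= high; ++n)
            count += seg[static_cast<std::size_t>(n - low)];
    }
    return count;
}
```

    The huge linear array version touches the whole array once per base prime; here each segment is finished completely before moving on, so the inner culling loop works almost entirely out of cache.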

  • C++ and Beyond 2011: Herb Sutter - Why C++?

    @Dennis wrote:

    I was like "OMG. YES!!!" when Herb Sutter showed the ios, android and wp overview. For one second I believed he is gonna unveil something ("native is coming for wp") right there...
    Oh well, I keep dreaming

    I think that, although the point that Microsoft is better supporting C++ and making it easier to use is valid, Herb Sutter is making quite a stretch to say that this is necessary only due to the power and performance it provides as compared to "Coffee languages", and in trying to infer that there is a trend towards it.  For example, his attempt to show a trend towards C++ on smartphones was completely phony, as follows:

    1)  Apple has always tended to use native code compilers and has a company culture/bias against VM's.

    2)  If Herb knew enough about the Android Native Development Kit, he would know that it has limits and is not recommended for anything other than augmenting the capabilities of the Java platform (as in to be p/invoked = called through JNI).  Google has the following to say:

    "The NDK is *not* a good way to write generic native code that runs on Android devices. In particular, your applications should still be written in the Java programming language, handle Android system events appropriately to avoid the "Application Not Responding" dialog or deal with the Android application life-cycle."


    3)  Further, with the performance gains made since about five Android versions ago (the first Gingerbread), the NDK has not been updated since, perhaps because it is somewhat less necessary.  It is stated that perhaps its main use would be for large bitmap buffers or other memory intensive operations that exceed the current limited size of the application heap space, or to augment performance for such specific compute intensive tasks as pixel manipulations.

    4)  That doesn't a trend make.

    Herb conveniently didn't mention RIM's BlackBerrys, which have always used Java and will likely always use Java.

    It also still isn't clear whether Microsoft's Phone OS will allow p/invoke calls to native code.

    Where's the trend?

    As Herb does state correctly, each of the language approaches has its place; however, it is easier to p/invoke specific C++ modules from C# in order to enjoy these performance gains where required than it is to call C# modules from C++.