

Stephan T. Lavavej: Digging into C++ Technical Report 1 (TR1)

Download

Right-click “Save as…”

  • MP3 (Audio only)
  • Mid Quality WMV (Lo-band, Mobile)
  • MP4 (iPhone, Android)
  • High Quality MP4 (iPad, PC, Xbox)
  • Mid Quality MP4 (Windows Phone, HTML5, iPhone)
  • WMV (WMV Video)
From Effective C++, Third Edition:

TR1 ("Technical Report 1") is a specification for new functionality being added to C++'s standard library. This functionality takes the form of new class and function templates for things like hash tables, reference-counting smart pointers, regular expressions, and more.

So, what does this mean for Microsoft's Visual C++? What's being added to our suite of native libraries to support TR1? What new features will you find most useful?

Stephan T. Lavavej is a developer on the VC++ team who is implementing some of the features that will ship as part of a TR1 VC++ "Feature Pack" some time in the near future... He's really passionate about C++ and here we dig into some of his favorite new TR1 features and he explains how they work and why C++ developers should use them by default. Much of the time is spent on the whiteboard.

One of the really interesting new features of TR1 is shared_ptr (shared pointer), which provides some useful automation (memory safety) for native developers working with pointers. shared_ptr is performant as well as predictable. Good stuff.
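To make that automation concrete, here is a minimal sketch of TR1 shared_ptr in action (the Widget type is illustrative; shared_ptr itself lives in <memory> with the TR1 Feature Pack):

```cpp
#include <memory>    // std::tr1::shared_ptr in the VC++ TR1 Feature Pack
#include <iostream>

struct Widget {
    ~Widget() { std::cout << "Widget destroyed\n"; }
};

int main() {
    std::tr1::shared_ptr<Widget> a(new Widget);
    {
        std::tr1::shared_ptr<Widget> b = a;          // reference count rises to 2
        std::cout << "count: " << a.use_count() << "\n";
    }                                                // b leaves scope, count back to 1
    return 0;                                        // a leaves scope, Widget deleted
}
```

The last owner going out of scope deletes the object, deterministically; no explicit delete appears anywhere.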

Learn.

Enjoy.


Click here for the low-res download file.


Follow the Discussion

  • dcuccia
    [6:17]: "Mallocate"...I like it. Wink
  • obrienslalom
    Where is the end of the video? (maybe he needed another mt dew).

    With mention of functional stuff in C++ and Boost in this video, I was recently encouraged to check out FC++.  I'm sure many of you have run across this already, but it seems on topic for those of you who haven't. http://www.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.html
  • Charles (Welcome Change)
    obrienslalom wrote:
    Where is the end of the video? (maybe he needed another mt dew).

    With mention of functional stuff in C++ and Boost in this video, I was recently encouraged to check out FC++.  I'm sure many of you have run across this already, but it seems on topic for those of you who haven't. http://www.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.html


    Not sure what happened at the end... Looking into it. Sorry 'bout that.
    C
  • evildictaitor (Devil's advocate)
    dcuccia wrote:
    [6:17]: "Mallocate"...I like it.


    That's standard C-speak don't-cha-know.


    Oh, and it's an awesome first-half. Let us know when you get the next half up Charles. I'm wanting another hour Tongue Out
  • This video has really nothing new for C++ developers, but I think it is a good introduction for those who use other languages and are thinking of dipping their feet into C++.

    To be a little cliché here, C++'s biggest strength is also its biggest weakness: the sheer power of it all.  It typically takes newbies years before they realize the full strength of it, and even after that you find yourself learning new useful things all the time.  The result is that many will take a quick glance at it, label it overly complex, and proudly tell everyone they meet who mentions it how horrible it is.

    Rarely will newbies be told of its strong benefits (like RAII), and Stephan very clearly explains a lot of them.  Don't get me wrong - C++ is not the be-all-end-all of languages, but it definitely fits in a lot more places than some would give it credit for.

    He does go off on a few tangents, but the meat of the interview is good enough to look past that!
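As a concrete illustration of the RAII benefit mentioned above (a sketch, not code from the video): cleanup lives in the destructor, so the file below is closed on every exit path, including exceptions.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Read the first line of a file; the ifstream's destructor closes the
// handle on every path out of the function, including thrown exceptions.
std::string first_line(const char* path) {
    std::ifstream in(path);          // resource acquired in the constructor
    if (!in) throw std::runtime_error("cannot open file");
    std::string line;
    std::getline(in, line);
    return line;
}                                    // ~ifstream runs here: no leak, no finally block
```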
  • Not meaning to downplay how cool these additions sound for C++, but as a C# developer, it raises a serious question: is it possible to enable a more declarative syntax in C# that allows you to create a special mode in the managed .NET environment where you can bypass garbage collection?  If you can prove to the compiler that you will have deterministic destruction via these "shared pointers", enforced by the runtime and reference counted, I think that will lessen the burden of garbage collecting those individual resources, speed up your code, and be more memory efficient.

    I really like all the portability of MSIL being able to run managed code using Mono, and having the JIT compile to either 32-bit or 64-bit on the fly.  However, one of my biggest annoyances with the managed environment is that there is currently no way to prove to the compiler or the runtime, with verifiable proof, that basically says: hey, look at this code, you can see that I am following the rules of this abstraction.  Let me set up this declarative, optional feature that allows deterministic destruction of these resources, and I will not only be more composable but my code will still be bound by .NET type safety.  At the same time, this will also negate the need to have parallel garbage collection, which, by the way, may never be solved without more declarative honesty built into the language.

    You have to get this guy talking with the guys from the Haskell camp, with Erik Meijer and Gilad Bracha, and maybe even with Anders Hejlsberg as a mediator, discussing the future of .NET and how to add declarative directives to the language and/or foundation, and debating whether it can improve both performance and composability.  I think it will be a way to increase the honesty factor of .NET as well.  I really want to see how they can hash this all out.

    As a side note, I think it's an interesting idea to create a compiler that detects certain code patterns and converts them automatically to a more efficient managed reference-counted pointer model while providing all the benefits of a managed language.  It could be an option on your compiler optimizer.

  • evildictaitor (Devil's advocate)
    Wizarr wrote:
    

    Not meaning to downplay how cool these additions sound for C++, but as a C# developer, it raises a serious question: is it possible to enable a more declarative syntax in C# that allows you to create a special mode in the managed .NET environment where you can bypass garbage collection?  If you can prove to the compiler that you will have deterministic destruction via these "shared pointers", enforced by the runtime and reference counted, I think that will lessen the burden of garbage collecting those individual resources, speed up your code, and be more memory efficient.

    Sure it's possible, but this problem is being arrived at from different perspectives. In dotNET pretty much everything lives on the heap. When you do a new object() or a new array[] construction, the item is built onto the heap, and the thing you are storing in your variable is a reference to the object. In C++ when you use a non-indirected structure (such as T value or vector<X> values) you are creating the object on the stack. This means that with C++ it is plausible for the compiler to know which objects need to be disposed as you leave scope, whereas in dotNET leaving scope is independent of the objects on the heap.

    This being said, deterministic garbage collection is possible for C#, but it is in general more expensive in terms of CPU cycles. Note that you can bypass the GC entirely by using unsafe code.
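A minimal sketch of the scope point above (illustrative code, not from the thread): in C++ the compiler statically knows which automatics die at the closing brace, which is what makes deterministic disposal cheap.

```cpp
#include <vector>

void f() {
    std::vector<int> values(1000);   // automatic object owning heap storage
    // ... use values ...
}                                    // ~vector runs here, at a point the compiler
                                     // knows at compile time; the equivalent .NET
                                     // array lives on the GC heap until a
                                     // collection discovers it is unreachable
```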

    Wizarr wrote:
    

    As a side note, I think it's an interesting idea to create a compiler that detects certain code patterns and converts them automatically to a more efficient managed reference-counted pointer model while providing all the benefits of a managed language.  It could be an option on your compiler optimizer.



    It's not entirely clear what you're getting at here. Please elaborate.
  • The following point, made in this video, needs to be stressed:

    Deterministic reference counting is a universal resource-management technique, whereas garbage collection is useless for anything other than memory.  For that reason, GC is simply not a replacement for ref counting, no matter how much Patrick Dussud congratulates himself.

    Cycles can be worked around, but the total lack of support for managing non-memory resources in .NET can't be; you are back in the C world of malloc (new)/free (dispose) with absolutely no help from the compiler or runtime.
  • > garbage collection is useless for anything other than memory.

    No.  Garbage collection is useless for anything that cannot be released lazily based on reachability.  Sometimes not even memory fits here, but many things other than memory work just fine.

    > GC is simply not a replacement for ref counting, no matter how much Patrick Dussud congratulates himself.

    True, but that doesn't mean it isn't useful.  Use the right tool for the job.  There are cases where reference counting is better.  But in my experience, if a garbage collector is available to you, you'd be an idiot not to take advantage of it, and the garbage collector is the right tool for the job quite often.

    > Cycles can be worked around, but the total lack of support for managing non-memory resources in .NET can't be; you are back in the C world of malloc (new)/free (dispose) with absolutely no help from the compiler or runtime.

    Not true.  Reference counting can be used if you have an appropriate and well-adopted smart pointer library (if you think malloc/free is bad, you should try getting different reference-counting libraries to work together correctly).  Cycles can be worked around if you are exceedingly clever, meticulous, and have a good weak-reference library available (again, have fun with the integration issues).  Not to say it is unusable, just pointing out that it isn't as easy as pie (and if you think it is, you haven't done anything sufficiently complex with reference counting).  And to make all of this work, you have to think very carefully about how to handle your destructors, make a class for every allocated resource, etc.

    Garbage collection isn't without its problems either, but to say there isn't compiler or runtime support is not entirely accurate. The compiler supports it with the lock, using, and try/finally constructs. The runtime supports it with the IDisposable interface. There are still two primary weaknesses (lack of "Disposable" containers, and inability to mark a type as disposable after it has been released).

    Performance is a completely separate discussion, and for every example you give me where reference counting wins, I'll give you one where garbage collection wins. Usability, programmer efficiency, and type safety are also areas where there are tradeoffs. There's always a pro and a con.

    Just use the right tool for the job. For c++, the arrival of shared_ptr in VC 9.0 is very welcome because it is the right tool for a lot of jobs. But garbage collection is also the right tool for a lot of jobs.
     Don't dismiss it just because you don't understand it (and if you're dismissing it, you really don't understand it).
  • evildictaitor wrote:
    
     

    Sure it's possible, but this problem is being arrived at from different perspectives. In dotNET pretty much everything lives on the heap. When you do a new object() or a new array[] construction, the item is built onto the heap, and the thing you are storing in your variable is a reference to the object. In C++ when you use a non-indirected structure (such as T value or vector<X> values) you are creating the object on the stack. This means that with C++ it is plausible for the compiler to know which objects need to be disposed as you leave scope, whereas in dotNET leaving scope is independent of the objects on the heap.

    This being said, deterministic garbage collection is possible for C#, but it is in general more expensive in terms of CPU cycles. Note that you can bypass the GC entirely by using unsafe code.



    I am very clear about everything living on the heap in a managed world.  What I am talking about is shifting the concept of a managed environment to include these shared pointers in a way that can have access to the heap - maybe a separate heap, or the same one with ownership properties - saying that regular variables not belonging to this shared-pointer structure can't have access to this memory, and being able to isolate the memory via the runtime.

    The advantage to this is that memory the GC doesn't have to collect means faster performance, at the cost of internal structure overhead to keep track of who owns what.  However, I think there is plenty of proof out there that deterministic destruction always wins in performance.  Now sure, the GC has really shined over the years, but if you can more declaratively state to the .NET environment that you can guarantee that memory will be destructed, there will be no need to use unsafe code.  This is a method where you can still code safely (both type safe and memory-cleanup safe, if you will).

    This idea mostly stems from the other Channel 9 videos on language development, and their talk of how the .NET Framework could have been even better with dynamic language support without adding a DLR library on top.  This is an idea of changing the .NET Framework to incorporate new technologies of declarative programming and inject them into .NET.  Even with .NET 3.5 we are still using the .NET 2.0 runtime, basically.  This would basically be a heavy internal structural change while keeping the current GC intact for backwards compatibility.

    I noticed that with the shared pointers there is a heavy level of management done on the part of the pointer; why can't we integrate the same level of management into .NET?  Garbage collection was a very easy method at the time, compared with integrating the concepts of shared pointers.  Maybe have a new keyword called "spnew" that replaces or adds to the "new" keyword and automatically handles where the memory gets allocated from.  Then the type can be inferred by the spnew operation so that it bypasses the GC, but add attributes to tell the GC the scope for deletion.  I haven't really looked too hard at how best to describe this feature, but I think you can have them coexist.

    Basically it comes down to this: if you don't care, just use the normal new method and you will have big brother (the GC) come and clean up your mess; however, if you want more control over performance, you have to show the compiler through some declarative programming that it is safe, and it will reference count for you internally and you get a speed boost.

    Another point to look at is composability.  Forget current programming languages for the moment.  Is it not more honest if you can tell the compiler declaratively where all the points of destruction are, semantically?  From that point of view, C++ is actually being more honest about destruction, from that one paradigm of using shared pointers.  I think .NET needs to do the same thing.

    If we can make .NET more honest, I think we will have more acceptance of things like Haskell and other functional programming languages, while at the same time being able to have a hybrid version that can handle the problems of Haskell, such as passing around state.

    So to conclude, these shared pointers are at the very beginning of what they are capable of.  They are most easily added to an unmanaged language because of the lack of integration with the type safety that .NET has to provide.  However, I think the potential of shared pointers is just reaching the beginning of what is possible.

    As far as clarifying my comment about compiler optimization support: I was just thinking out loud, stating that if the compiler were smart enough, then we wouldn't need to declaratively state that we want this reference counted; it could infer it automatically if all the conditions of isolation are met.  Like all other optimizations, we never tell the compiler to change our code, but it does so if it thinks it can guarantee the same output but faster.
  • Charles (Welcome Change)
    Well, I am unable to locate the last few minutes of the interview (and it only lasted a few minutes longer, I promise)... Just use your imagination Smiley

    C
  • STL

    [dcook]
    > And to make all of this work, you have to think
    > very carefully about how to handle your
    > destructors

    Destructors are almost always simple to write.  Often, you don't need to write them at all, when memberwise destruction does exactly what you need.

    > make a class for every allocated resource, etc.

    Every resource should be encapsulated by a class.  "End of line", as the Master Control Program would say.  If you have non-encapsulated resources, exceptions trigger leaktrocity.

    Stephan T. Lavavej
    Visual C++ Libraries Developer
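A minimal sketch of that encapsulation rule (an illustration, assuming a C stdio FILE* as the resource): once the resource lives in a class with a releasing destructor, an exception between acquire and release can no longer leak it.

```cpp
#include <cstdio>
#include <stdexcept>

// Wrap the raw resource in a class whose destructor releases it (RAII).
class File {
public:
    explicit File(const char* name) : fp_(std::fopen(name, "rb")) {
        if (!fp_) throw std::runtime_error("fopen failed");
    }
    ~File() { std::fclose(fp_); }          // release is automatic and exception-safe
    std::FILE* get() const { return fp_; }
private:
    File(const File&);                     // noncopyable, C++03 style
    File& operator=(const File&);
    std::FILE* fp_;
};
```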

  • evildictaitor (Devil's advocate)
    Wizarr wrote:
    

    Maybe have a new keyword called "spnew" that replaces or adds to the "new" keyword and automatically handles where the memory gets allocated from.  Then the type can be inferred by the spnew operation so that it bypasses the GC, but add attributes to tell the GC the scope for deletion.  I haven't really looked too hard at how best to describe this feature, but I think you can have them coexist.



    I think you should have a look at managed C++ - we have two keywords: new works the same as in normal C++, and gcnew tells the garbage collector that it is responsible for cleaning up the object.  This allows you to use shared-pointer semantics for C++ types, and use the garbage collector for the .NET components.
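A rough sketch of that two-keyword model in C++/CLI (compiled with /clr; the types are illustrative):

```cpp
#include <vector>

ref class ManagedThing {};                // lifetime owned by the .NET GC

void demo() {
    // Native heap: deterministic; you delete it (or hold it in a smart pointer).
    std::vector<int>* native = new std::vector<int>();

    // GC heap: the collector reclaims it once it is unreachable.
    ManagedThing^ managed = gcnew ManagedThing();

    delete native;                        // deterministic release
}                                         // 'managed' is collected later
```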

    Wizarr wrote:
    

    If we can make .NET more honest, I think we will have more acceptance of things like Haskell and other functional programming languages, while at the same time being able to have a hybrid version that can handle the problems of Haskell, such as passing around state.



    I think we're abusing the term "honest" here.  If you want contractual honesty inside a program, this can already be achieved with Spec#.  The reason Haskell isn't widely accepted is that saving state is somewhat messy, and, let's face it, most programs are about interacting with the user, which is much more difficult within a functional language.

    Wizarr wrote:
    

    As far as clarifying my comment about compiler optimization support: I was just thinking out loud, stating that if the compiler were smart enough, then we wouldn't need to declaratively state that we want this reference counted; it could infer it automatically if all the conditions of isolation are met.  Like all other optimizations, we never tell the compiler to change our code, but it does so if it thinks it can guarantee the same output but faster.



    I suspect we'll have to wait for TR1 to become part of the C++ standard before anything like this happens, but certainly compilers will (and already do) infer usage patterns from your code and replace or insert mechanisms that achieve the same result in less memory, code, or time.
  • evildictaitor wrote:
    

    I think you should have a look at managed C++ - we have two keywords: new works the same as in normal C++, and gcnew tells the garbage collector that it is responsible for cleaning up the object.  This allows you to use shared-pointer semantics for C++ types, and use the garbage collector for the .NET components.



    You are correct if I wanted to use the shared-pointer semantics today, but I drank the .NET Kool-Aid a long time ago Smiley  For the most part I was describing a theoretical managed environment where you might not even need a GC, or at least be able to pre-mark the memory for reclaim, where a parallel GC would be possible without blocking all threads while it searches and reclaims memory.  The only time I use managed C++ is when I am importing native DLLs into .NET, using it only as a wrapper DLL to managed code when it's too hard to convert to P/Invoke method signatures.

  • I think you guys are giving GC a bad rap (and I say this as a person who works in kernel land exclusively in C, not even C++).  
     
    If you wish to write a multi-threaded program, maintaining reference counts becomes quite costly with the large number of interlocked operations you need to perform.  I think shared_ptr uses atomics or locks by default (according to the Boost synopsis).  Allocating memory is also a pain because you have to lock down pieces of your heap, and you get some internal and external fragmentation from your allocator.  This is okay for certain types of programs because they only deal with objects of certain sizes and so can pull tricks like per-thread memory caching, but you probably have to implement this yourself by overriding operator new.

    GC doesn't suffer from this problem.  Allocation is handled by incrementing a pointer (this one does have to be done interlocked), but further accesses to the object and copying of references do not require further synchronization.  As an additional optimization, Sun's GC creates per-thread first-gen memory pools called TLABs that allow you to make small allocations without any synchronization at all until you exhaust the TLAB block.  I'm not sure whether or not .NET does this yet, but it's a relatively straightforward optimization.

    Stephan mentions the memory efficiency of shared_ptr-based C++ solutions, but I don't think that's the case.  Looking at the constructor implementation of shared_ptr from the Dr. Dobb's article[1], shared_ptr uses a pointer-sized element to refer to the object (unavoidable) plus a pointer to a reference count block and the reference count block itself which is new()ed up from the heap at 16-byte allocation granularity (3/4th of it is wasted space).  So for an object with only one reference you take up 32 bytes of memory plus the object itself and likely pull in three CPU cache lines (1 on the stack, one for the refcount, and one for the object itself that you're accessing).  And your allocated objects are sparser in memory because of the normal fragmentation effects that can't be avoided except by mark-and-sweep GCs.  I wouldn't necessarily call this memory efficiency.

    GC certainly appears to allocate bigger segments (such as 16 MB default generation segments... or even larger ones), but the OS memory manager is smart enough, thanks to Landy Wang, that it won't actually do anything to give you that memory until you touch it.  So if the GC can keep things tightly compacted (i.e. you have well-defined cycles of allocations and frees), you get better cache locality (references are only pointer-sized and objects can be packed densely with a 12-byte overhead for a sync structure and a method table pointer, a thing you'd need for any polymorphic C++ type) and you shouldn't really take a much higher working-set cost.

    I buy the Patrick Dussud Kool-Aid and think for memory, GC is clearly superior.  The arguments against GC for other resources are totally valid, and GC languages would do well to incorporate RAII-style syntax for dispose that's even more automatic than "using".  C++/CLI has this already (just declare your managed variable as a stack type and you get Dispose called automatically).

    If you make a GCed runtime that does not attempt to add extra safety guarantees like verifiable code or type-checking on cast, I think simple code would run faster than a transliteration of the code to simple TR1-style C++ without custom allocation schemes or complicated wizardry.

    [1] http://www.ddj.com/cpp/184401507
     
  • STL

    There's a problem with your comparison between shared_ptr and garbage collection: in a large majority of C++ programs, the large majority of objects will NOT be immediately held by shared_ptr.  The usual mechanisms of automatics, members, and containers continue to perform most of the resource management work.  shared_ptr simply spices things up by permitting polymorphic and noncopyable objects to attend the RAII party (and yes, it also enables sharing scenarios - but as I always emphasize, this is not the most interesting thing about shared_ptr).
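To illustrate that division of labor (a sketch with made-up types): automatics and containers handle the bulk, and shared_ptr steps in where polymorphism or noncopyability would otherwise force a manual delete.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};

void scene() {
    std::string name("scene");                       // automatic: no smart pointer needed
    std::vector<int> counts(10);                     // container owns its elements
    std::vector<std::tr1::shared_ptr<Shape> > shapes;
    shapes.push_back(std::tr1::shared_ptr<Shape>(new Circle));  // polymorphic owner
}                                                    // all three clean up deterministically
```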

    > I think shared_ptr uses atomics or locks by
    > default (according to the boost synopsis).

    TR1 shared_ptr uses InterlockedFoo().

    > Allocating memory is also a pain because you
    > have to lock down pieces of your heap

    Not if you have per-thread heaps.  shared_ptr is completely agnostic to how you get your memory (and we provide C++0x allocator support for controlling how shared_ptr gets its own memory). 

    > further accesses to the object and copying of
    > references do not require further synchronization.

    Accessing an object held by shared_ptr is zero overhead - no interlocked operations are performed.

    > Stephan mentions the memory efficiency of
    > shared_ptr-based C++ solutions, but I don't think
    > that's the case.

    See, for example: http://cs.umass.edu/~emery/pubs/gcvsmalloc.pdf

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • STL
    Yes.  See slide 43 "shared_ptr Thread Safety" of shared_ptr.pptx at http://blogs.msdn.com/vcblog/archive/2008/02/22/tr1-slide-decks.aspx .

    Stephan T. Lavavej, Visual C++ Libraries Developer
  • So Stephan,

    [question you already answered]

    Now that shared_ptr<> is thread-aware, does the same go for other parts of the language?  The C++ standard does not mention the word "thread" even once, quite deliberately, I suspect.  Is the initialization of function-local statics now thread-safe?
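For readers unfamiliar with the construct being asked about, a minimal sketch: under C++03, two threads entering this function concurrently race on the one-time construction.

```cpp
struct Registry { /* ... */ };

Registry& instance() {
    static Registry r;   // constructed on first call; under C++03 this
                         // initialization is not guaranteed thread-safe
    return r;
}
```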

  • STL

    C++98/03 (and TR1, which is a library-only addition to C++03) doesn't recognize the existence of threads.  C++0x does.  This work is still ongoing; you can follow the various proposals and changes to the Working Paper at http://www.open-std.org/jtc1/sc22/wg21/ .

    For function-local statics, see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2513.html .

  • I have to apologize for turning this into a GC-vs-refcount thread. I am overjoyed that Visual C++ is getting support for this, because it has been sorely needed for a long time. (I still prefer to use intrusive reference counting where possible, but that's another issue.)

    Nevertheless, I always find myself compelled to stand up for GC when I see people dismissing it for invalid reasons. It makes me think of people who refuse to use Windows because "Linux is teh roxxors and M$ sux". Just as there are places where Windows is better than Linux (and vice-versa), there are places where GC is appropriate and useful, and there are places where it is not. It's just another tool to have on hand, and I'm always looking for new and useful tools to make me a more productive developer.

    @neerajsi: YAY! Somebody who is willing to actually discuss the issues and keep an open mind. Very rare. Good for you.

    Regarding the GC vs. Malloc paper that STL has referenced, I take exception to the methodology used -- there are a number of fundamental flaws in their analysis that stack the deck against GC.  That's not to say that GC is necessarily always better, just that it isn't nearly as bad as indicated in their paper.

    In the referenced paper, they took some traces from some programs that used garbage collection, then simulated the performance of the programs with various memory management mechanisms. This kind of simulation has a lot of potential and is very interesting, but their application of it misses some key issues. For example, it assumes that it is trivial for a non-garbage-collected system to track live/free resources (they simply inserted "delete" calls into the trace at the point where each object became unreachable), and this is patently false. It is often quite expensive to track items that need to be deleted (interlocked reference count updates, allocation of reference counts on the heap, keeping lists of items to be deleted, ownership tracking, more complicated error handling, etc.). Many papers have researched this issue, and in some cases for complicated systems, 50% or more of the code and data is devoted to tracking object lifetimes. The paper STL referenced assumed that all of this came for free, so the results are not entirely valid.

    There are many real-world cases where GC has led to some significant performance wins.  Some server applications I've investigated had serious memory issues that turned out to be due to heap fragmentation.  Nasty hacks and tricks were done to reduce heap fragmentation, including things like having two heaps and switching back and forth between them (allowing one heap to empty, essentially defragmenting it).  After moving to GC, these issues simply went away, performance became significantly better all around, and no nasty hacks were needed.  Other cases involve argument passing and complicated ownership management protocols, often involving unnecessary copies of long strings -- these all go away when you can let the GC take care of it.

    As an aside, I've never had a performance or usability issue in my NT programs turn out to be caused by GC, so using performance as a primary argument against GC seems pretty worthless. In terms of developer productivity, I've found using GC to be a huge win -- there are certainly some places where it takes a step back (we need something better than IDisposable and using!), but it takes 10 or 20 steps forward for each step back.

    As I said before, there are ways to come up with examples where GC is a huge win, and there are ways to come up with examples where manual or reference-counted resource tracking wins. This applies to performance, reliability, scalability, ease of implementation, and last but definitely not least, ease of integration with your existing systems. You have to keep an open mind, understand the pros and cons of each, and be able to choose the right tool for the job without letting personal biases get into the decision. I switch back and forth between C# and C++ on a daily basis, even within the same project. I try hard to use the right tool for the job. Generally, the right tool is C# with GC, and I think my results have been good.

    In general, here are my rules of thumb:

    If I can assume that .NET is installed on the target computer already, I am working in user mode on NT, the program I'm writing won't need to load and unload repeatedly (process creation is about 4X slower if .NET is involved, and that can be a significant factor for programs that are run 1000s of times per minute), and I don't have any other specific compelling reason such as interoperability, I use C#; otherwise, I usually use C++.  I really like C# -- the standard library, the GC, and the language make me much more productive, and performance is not a problem.  I miss some things like templates and deterministic destructors, but the benefit far outweighs the cost for most cases.

    shared_ptr is great because it makes me much more productive in C++ for the cases where C# isn't appropriate. So I'm first in line to be happy that this and other much-needed libraries are being added.

    Nearly every time I talk to somebody who thinks GC is a bad thing, they haven't used C#/.NET/etc. for anything significant. They look at it, find a few things they don't like about it, and use them as the reason for not using it. They've let a personal bias color their judgement. This is known as prejudice. There are a few people who have used C# for major programs and still don't like it; these guys have usually had to work with a poorly-designed program -- perhaps somebody assumed at first that GC makes all problems go away, painted themselves into a corner with a bad design, and blamed the GC when the real problem was a poor program design. Finally, there are people who used C# when it wasn't appropriate. Again, they blame C# instead of realizing that no single tool is best for all jobs. You can find a number of bad C++ apps too, but nobody blames C++ for that. A lot of people hate X because they've run into problems with it, where X can be GC, reference counting, or one of any number of other technologies. Don't blame the technology if you run into trouble when it is misused.

    Just use the right tool for the job. What else can I say?
  • STL

    > I still prefer to use intrusive reference counting
    > where possible, but that's another issue.

    C++0x shared_ptr Object Creation support (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2351.htm#creation ), voted into the Working Paper, will provide the efficiency advantages of intrusive reference counting without the usability penalties, by allocating a single chunk of memory for the object and its reference count.  Doing this properly requires variadic templates and rvalue references to solve the forwarding problem.
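A sketch of what that single-allocation creation looks like with the make_shared that grew out of this proposal (a C++0x facility, not part of the TR1 Feature Pack; the Session type is illustrative):

```cpp
#include <memory>
#include <string>

struct Session {
    Session(int id, const std::string& user) : id(id), user(user) {}
    int id;
    std::string user;
};

// One heap allocation holds both the Session and its reference-count
// block; the constructor arguments are forwarded through.
std::shared_ptr<Session> s = std::make_shared<Session>(42, "demo");
```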

    > there are places where GC is appropriate and
    > useful, and there are places where it is not.

    I am not opposed to garbage collecting some types.  I am strongly opposed to universal garbage collection.

    > It is often quite expensive to track items that
    > need to be deleted

    Remember: most objects in most C++ programs will NOT be immediately held by shared_ptr.  Automatics, members, and elements of containers incur no such expenses.  shared_ptr is the icing, not the cake.

    Native and managed code involves completely different worldviews (I find most managed-talk incomprehensible, and I've found it hard to explain the native ways of doing things to managed programmers).  In native code, you can express things like "non-owning reference" - that's just a built-in reference or pointer.  Getting a non-owning reference from a shared_ptr involves zero overhead, and passing it around involves zero overhead.  Managed code doesn't make such an important distinction between owning and non-owning references ("let's make everything owning, and clean up the resulting cycles").  It may seem easier to be able to ignore whether something is owning or not - but the price is being unable to handle non-memory resources properly.  That price is absolutely unacceptable to me.
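A sketch of that owning/non-owning distinction (illustrative types): the callee takes a plain reference and never touches the reference count.

```cpp
#include <memory>

struct Texture { int id; };

void draw(const Texture&) { /* render it */ }   // non-owning parameter

void render(const std::tr1::shared_ptr<Texture>& tex) {
    draw(*tex);   // zero overhead: no interlocked count traffic here
}
```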

    > so using performance as a primary argument
    > against GC seems pretty worthless.

    My primary argument against GC is that it doesn't do anything for non-memory resources (finalizers are an abomination).

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • >My primary argument against GC is that it doesn't do anything
    >for non-memory resources (finalizers are an abomination).

    Why is there any debate?  Ref counting is universal (release-when-done works with any resource type); GC is only for memory (release on memory pressure).  Therefore GC cannot be a drop-in replacement for ref counting, case closed.  There is no amount of performance or any other straw man that can change that.

    Yet that's precisely what was done from VB6 to VB.NET.

    Since you are on the inside, can you find out why Microsoft's .NET people appear not to understand this?  Do they really believe that another abomination, the dispose pattern (which is nothing more than manual malloc/free), is really a "solution"?  Are there plans to change that?
  • Interesting vid. Like a few have said, there's probably not a huge amount of new or unfamiliar material here for your average C++ dev since usage of boost's shared_ptr is already fairly widespread, but it's still worth watching. If you're just wanting to take a quick dip into C++ with this vid then Stephan probably glosses over a lot of detail which would make it a little hard to follow at times. There's a lot of explanation in the vid that assumes you've already made the conceptual leap into thinking in C++ mode.

    Is Microsoft's implementation of TR1 based on Dinkum's implementation like the STL?
  • STL

    [TicklishPenguin]
    > Interesting vid. Like a few have said, there's
    > probably not a huge amount of new or unfamiliar
    > material here for your average C++ dev since
    > usage of boost's shared_ptr is already fairly
    > widespread, but it's still worth watching.
     
    If you want lots and lots of detail, check out http://blogs.msdn.com/vcblog/archive/2008/02/22/tr1-slide-decks.aspx .

    Given only an hour (truncated, even), there's a lot of tension between explaining things from first principles, and explaining things in enough detail for advanced C++ programmers.  In this video, I think I stuck more to the "first principles" side of things, although I can definitely understand how those wanting a "quick dip" could be confused by even that.

    > Is Microsoft's implementation of TR1 based on
    > Dinkum's implementation like the STL?

    Yes.  We licensed Dinkumware's implementation and have been working closely with them to fix lots of bugs and integrate it into VC.

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • > C++0x shared_ptr Object Creation support (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2351.htm#creation ), voted into the Working Paper, will provide the efficiency advantages of intrusive reference counting without the usability penalties, by allocating a single chunk of memory for the object and its reference count.  Doing this properly requires variadic templates and rvalue references to solve the forwarding problem.

    Cool! Good to know.
    > > It is often quite expensive to track items that
    > > need to be deleted

    > Remember: most objects in most C++ programs will NOT be immediately held by shared_ptr.  Automatics, members, and elements of containers incur no such expenses.  shared_ptr is the icing, not the cake.
    Ok, sure, most resources don't have to go into shared_ptr. But that wasn't what I was saying. The fact remains that tracking object lifetime is not free. shared_ptr makes it easier in some cases, and potentially even makes it more efficient, but with or without shared_ptr, tracking object lifetime is still a non-trivial cost, and manually tracking it (even with shared_ptr) is hard to get right. To repeat myself, several studies have found that 30-60% of a typical non-trivial program's CPU and memory usage is spent tracking and managing object lifetime (the kind of thing that goes away with GC), and that doesn't count the time spent in delete. You cited a particular paper as an argument that GC has lousy performance, but that paper ignored the costs of manual lifetime tracking (it assumed they were 0), so the paper's conclusions need to be adjusted appropriately before they are used to make any decisions. Even automatics, members, container cleanup and destructors have a cost, and the IF statements used to control flow and do cleanup at function exit also have a cost. This cost has to be factored into any comparison with GC's performance, and once that cost has been factored in, manual lifetime management and GC end up being fairly close in runtime cost.
    > (I find most managed-talk incomprehensible, and I've found it hard to explain the native ways of doing things to managed programmers).
    I find it very frustrating when people who only have experience with one side of things criticize the other way without understanding it (and trying it for a while) first.
    > It may seem easier to be able to ignore whether something is owning or not - but the price is being unable to handle non-memory resources properly.  That price is absolutely unacceptable to me.
    If that were actually the price, it would be unacceptable to me (and everybody else) too. Fortunately for managed runtimes, that isn't actually the case. There is a price, but it isn't nearly as dramatic as you make it out to be. With the current GC-based runtimes, critical resources need to be tracked separately from object lifetime. In C++, the release of critical resources is conflated with object destruction. That has some advantages, but it is not the only way to do it.

    The "managed way" is to let the runtime handle object lifetime and let the developer handle the release of critical resources, with finalization as a backstop in case the developer screws up. This assumes that there are a lot more objects than critical resources, and that seems to be true in my experience. There is certainly some room for improvement in the "managed way", but it certainly isn't an abomination or a catastrophe. If it were, nobody would be using it.
    > My primary argument against GC is that it doesn't do anything for non-memory resources (finalizers are an abomination).
    Ah, now we're getting somewhere. You're right -- GC itself doesn't do much for non-memory resources (though it does do something -- it gives you a chance to do something at some point after the object becomes unreachable). GC solves an object lifetime issue and is mostly agnostic about critical resources. (Note that "non-memory" is not the best description here, since there are some memory-based resources that must be freed deterministically, and there are some non-memory resources that don't need deterministic release.)

    So basically, I think it boils down to this: with GC, you get automatic object lifetime management, but there is no support for deterministic critical resource management (finalizers are the backstop so that critical resources do eventually get freed, but they are tricky and should be used as little as possible). With C++, you have to manage object lifetime yourself, but you get a lot of help from the language and the runtime, and critical resource management is tied to the object lifetime. Neither system is perfect. Everybody will have a preference. But neither system is fatally flawed.

    Another thing I'd like to point out is that the real issue here is provable type safety, which actually can't be achieved in the general case with manual or RAII-based resource management.  If there's any way for a program to access a type that has been released, type safety goes out the window.  That's the real limitation here.  If this were just an issue of garbage collection, it could be resolved by various clever schemes that allow both GC and non-GC resources to co-exist.  However, since the actual goal is type safety (which is a very nice thing!), those clever schemes don't work very well.

    There is still some room for other clever schemes -- using, finally, and IDisposable are a start, but I would love to see more support from the language, runtime, and code analysis tools.

    Calling finalizers an abomination doesn't make it so. They are best avoided as much as possible, but they aren't inherently evil, and there are (rare) cases when letting the finalizer take care of things is really the best thing to do. If abused, they can certainly cause you no end of trouble. But the same goes for a lot of other things in software development. I'm not sure what makes them deserve the label "abomination". Is it the sharp edges? shared_ptr has a few sharp edges of its own -- accidentally turning a raw pointer into a shared_ptr twice leads to double-free, use of shared_ptr as a temporary leads to trouble, etc.
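That first sharp edge is easy to show in a deliberately broken sketch (illustrative, not from the thread): two shared_ptr owner groups around one raw pointer mean a double delete.

```cpp
#include <memory>

struct Widget {};

void oops() {
    Widget* raw = new Widget;
    std::tr1::shared_ptr<Widget> p1(raw);
    std::tr1::shared_ptr<Widget> p2(raw);   // BUG: second, independent count block
}                                            // p2 deletes raw, then p1 deletes it
                                             // again: undefined behavior
```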
  • I really don't know where to start, and this will be long and rough and rant-like... But I am only trying to help see the next hop and point to some very simple facts about the C++ object model vs. GC.


    I am certainly not fazed by any talk of GC, especially in the context of .NET.

    The committee should rapidly back off the idea of introducing one by default.

    I have worked with VMs for over a decade and can see why they will never be appropriate for a number of tasks that are becoming so important it will irreversibly differentiate value vs. reference paradigms.  Within a couple of years it will also define the 'what now', i.e. the managed 'upgrade-mentality suicide' story ending.

    I am aiming at the next phase widely accepted as the demise of Murphy's hypothesis.

    Before going on with sarcasm and one tiny detail, I would first like to congratulate the VC++ team for coming out of the trenches after a decade of starvation and Microsoft's reinvention-in-Java and investment in .NET (primarily marketing and poor language design, i.e. C# generics, .NET collections, the LINQ abomination of 'thinking in sets', aka Bruce-Event-Lee De-lphins, and plenty more not worth listing here).

    On a more technical note, avalanche destructors, interlocked ref-counting and many other criticisms of 'all for GC'-style are going against the nature of all modern hardware and language constructs of a tool that caters for multi-paradigm and built-in expression power to bypass any deficiency presented. All of them are easily catered for by language constructs and compiler extension (and you'll see concepts play a huge role here). That's the next phase and you better follow boost people for the next step in this evolution rather than blogs about .NET.

    Once .NET people (especially those who converted to productivity candy, just like many of us did back in 1996 with Java and 2000 with .NET) finally get disappointed and realise the advance is not going to happen in VMs but in environments that can be catered for without breaking threading, memory, ownership, non-ownership and many other models that we can go back and discuss:

    Why GC will eventually be slaughtered?

     

    Meanwhile, the best .NET libraries out there consistently revert to native code to make up for all the mess GC and runtime environments are very fond of (not to mention first-language education damage in managed-land).

    Another comment was on proving type-safety, you know, the proof. Please look at MSR and other projects that do better code analysis than any bytecode tool is capable of today. Actual samples are right there, today, and that's for C Driver Code. No hmm?

    Okay, please don't look puzzled (I know it is hard for a die-hard managed mentality); just boot your WPF app or pump a large dataset into your WinForms or ASP.NET app and you'll see the catastrophe that will soon scale as badly as JavaScript, without any threading notion at all.

    It will probably hurt to realise this.  Google is blasting away with this concept and has pretty bad C++ programmers, judging by their blogs.  But they try hard where it matters.  It is just a fact you cannot avoid, and when you see it, and fight trying to achieve the same with .NET for over 5 years with memory barriers, with help of IOCP, with help of everything, and fail, then you wonder:

    Isn’t it about time to move on and make a better environment?

    One without runtime overhead?

    That is the genius of Bjarne, and C and .NET and Java guys better learn fast as the next barrier and clean-up, and rewrite is quite near.

    I've been waiting for .NET and Java hype to materialise for next generation computing for a decade now. And it is constantly disappointing with abstractions that leak and diverge from reality.

    Point of no return has been passed though, aka memory latency kills.

    I mean, is it so hard for people to realise that all those VMs are written in pretty average C++, and that by induction it satisfies all the models you are currently working with (including simple/managed/.NET, OO, functional, parallel, you name it)?

    Why is it so hard to get this I wonder?

    Any comment against C++ or Boost or TR1 is a suicide for your next version of VM, and I have learned to immediately find all of them suspect, even if they are backed up by some surface-level research. The same fact applies to GC protagonists.

    Please see that C++ model is what is underneath you and it is advancing at the pace you will not be able to ignore if you care about your work, which VM guys consistently show they don't as they complete their work and boast about how easy it all was.

    If history teaches anything, nothing too easy was ever good enough.

    And for anti-interlocked and anti-ref-counter guys, no one is forcing you to do anything like it, and you can easily bypass it and blast any GC or VM:

    Use const on everything!

    Something modern VM and language 'heroes happen there' folks couldn't accept was the ultimate solution.

    Go back and read what Bjarne and Sutter are doing. It will help you beat all the .NET, Java, AOP, declarative, Erlang and Haskell people with a blink once you dive in.

    You'll be capable of building generations of frameworks (if you have to, but that's not the goal), not just use a single one that is so inefficient in expressing ultimate machine abstraction:

    C


    And it is evolving too, in parallel to C++, just like your hardware guys. PFX or PLINQ or similar will not help you here.

    And none of this is relevant to languages per se, as many follow similar syntax and even translatable semantic in at least one direction. The point is the managed world cannot satisfy some basic models and is starting to break down rapidly.

  • STL

    [dcook]
    > I find it very frustrating when people who only have
    > experience with one side of things criticize the other
    > way without understanding it (and trying it for a while) first.

    I understand how universally GCed languages work - I actually learned Java before C++.  Like learning C before C++, it was useful - it taught me how programming languages can fail programmers miserably.

    > In C++, the release of critical resources is conflated with object destruction.

    Conflated? No. The word is "encapsulated".

    That, more than anything else, summarizes the gulf between the native and managed worldviews.  I've never figured out how to get through to someone who doesn't grok destructors.

    I also note your terminology of "critical". There's nothing special about files, sockets, textures, fonts, and so forth. They're just more resources, to be nuked from destructors. Only in managed-world are they "critically" important to remember, because you have to control their lifetimes by hand (with abominations like finally-blocks, which RAII vastly supersedes) as if in C.

    [vvldb]
    > The committee should rapidly back off the idea of introducing one by default.

    GC has been cut from C++0x.  And there was much rejoicing.

    (However, they'll keep trying.  GC's proponents are nothing if not persistent.)

    > I would first like to congratulate the VC++ team for coming out of the trenches after a decade of starvation

    Actually, VC has been continually improving since the release of The Infinite Enemy, VC6.  Each successive release (VC7.0, VC7.1, VC8, and to some extent VC9) has significantly improved language and library conformance.  These conformance improvements have made Boost's, and every other modern C++ programmer's, life much easier.  (I am deliberately leaving out the IDE here.)

    If you want a picture of a boot stomping on a human face, imagine VC6 - forever.

    > not to mention first-language education damage in managed-land

    I found the article at http://www.stsc.hill.af.mil/CrossTalk/2008/01/0801DewarSchonberg.html to be very interesting (note that I do not agree with all of its points, and the authors have an Ada bias).

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • Is TR1 C++ managed code?  I followed every example and MSDN article, etc., I could find, and I have MSVS Team System 2008 Development Edition - ENU Service Pack 1 (KB945140) installed.

    I still get the following link error: Error    1    error LNK2019: unresolved external symbol "__declspec(dllimport) void __cdecl std::tr1::_Xbad(enum std::tr1::regex_constants::error_type)" (__imp_?_Xbad@tr1@std@@YAXW4error_type@regex_constants@12@@Z) referenced in function "public: static unsigned int __cdecl std::tr1::_Regex_traits<char>::length(char const *)" (?length@?$_Regex_traits@D@tr1@std@@SAIPBD@Z)    event.obj    EndPointService

    Does anyone know why?  I searched the Internet and found no comments on this.

    Any help would be appreciated.

    If this is the wrong place to ask, please redirect me to a good place to ask.
  • I have the same problem.


Comments Closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.