Joe Duffy, Huseyin Yildiz, Daan Leijen, Stephen Toub - Parallel Extensions: Inside the…
obrienslalom wrote:Where is the end of the video? (maybe he needed another mt dew).
With mention of functional stuff in C++ and Boost in this video, I was recently encouraged to check out FC++. I'm sure many of you have run across this already, but it seems on topic for those of you who haven't. http://www.cc.gatech.edu/~yannis/fc++/boostpaper/fcpp.html
dcuccia wrote:[6:17]: "Mallocate"...I like it.
Not meaning to downplay how cool these additions sound for C++, but as a C# developer, it raises the question: is it possible to add a more declarative syntax to C# that enables a special mode in the managed .net environment where you can bypass the garbage collector? If you can prove to the compiler that you will have deterministic destruction via these "shared pointers", enforced by the runtime and reference counted, I think that would lessen the burden of garbage collection for those individual resources, speed up your code, and be more memory efficient.
I really like the portability of MSIL: being able to run managed code using Mono, and having the JIT compile to either 32-bit or 64-bit on the fly. However, one of my biggest annoyances with the managed environment is that there is currently no way to prove to the compiler or the runtime, with verifiable proof, something that basically says: hey, look at this code, you can see that I am following the rules of this abstraction. Let me set up this declarative, optional feature that allows deterministic destruction of these resources, and I will not only be more composable, but my code will still be bound by .net type safety. At the same time, this would also negate the need for parallel garbage collection, which, by the way, may never be solved without more declarative honesty built into the language.
You have to get this guy talking with the guys from the Haskell camp, with Erik Meijer and Gilad Bracha, and maybe even with Anders Hejlsberg as a mediator, discussing the future of dotnet and how to add declarative directives to the language and/or foundation, and debate whether it can improve both performance and composability. I think it would be a way to increase the honesty factor of dotnet as well. I really want to see how they hash this all out.
As a side note, I think it's an interesting idea to create a compiler that detects certain code patterns and converts them automatically to a more efficient managed reference-counting model while providing all the benefits of a managed language. It could be an option in your compiler's optimizer.
Sure it's possible, but this problem is being arrived at from different perspectives. In dotNET pretty much everything lives on the heap. When you do a new object() or a new array[] construction, the item is built on the heap, and the thing you are storing in your variable is a reference to the object. In C++ when you use a non-indirected structure (such as T value or vector<X> values) you are creating the object on the stack. This means that with C++ it is plausible for the compiler to know which objects need to be disposed as you leave scope, whereas in dotNET leaving scope is independent of the objects on the heap.

Wizarr wrote:
Not meaning to downplay how cool these additions sound to c++, but as a c# developer, it raises the question. Is it possible to enable a more declarative syntax in c# that allows you to create a special mode in the managed .net environment where you can bypass the garbage collection? If you can prove to the compiler that you will have deterministic destruction via these "shared pointers" that will be enforced by the runtime and reference counted, I think that will lessen the burden of garbage collecting those individual resources and speed up your code and be more memory efficient.
Wizarr wrote:
As a side note, I think it's an interesting idea of creating a compiler to determine certain code patterns and convert them automatically to be a more efficient managed reference pointing model while providing all the benefits of a managed language. It could be an option on your compiler optimizer.
> garbage collection is useless for anything other than memory.
> GC is simply not a replacement for ref counting no matter how much Patrick Dussud congratulates himself.

True, but that doesn't mean it isn't useful. Use the right tool for the job. There are cases where reference counting is better. But in my experience, if a garbage collector is available to you, you'd be an idiot not to take advantage of it, and the garbage collector is the right tool for the job quite often.

> Cycles can be worked around but total lack of support for managing non-memory resources in .net can't be; you are back in the C world of malloc (new), free (dispose) with absolutely no help from the compiler/runtime.

Not true. Reference counting can be used if you have an appropriate and well-adopted smart pointer library (if you think malloc/free is bad, you should try getting different reference counting libraries to work together correctly). Cycles can be worked around if you are exceedingly clever, carefully meticulous, and have a good weak reference library available (again, have fun with the integration issues). Not to say it is unusable, just pointing out that it isn't as easy as pie (and if you think it is, you haven't done anything sufficiently complex with reference counting). And to make all of this work, you have to think very carefully about how to handle your destructors, make a class for every allocated resource, etc.
evildictaitor wrote:Sure it's possible, but this problem is being arrived at from different perspectives. In dotNET pretty much everything lives on the heap. When you do a new object() or a new array[] construction, the item is built onto the heap, and the thing you are storing in your variable is a reference to the object. In C++ when you use a non-indirected structure (such as T value or vector<X> values) you are creating the object on the stack. This means that with C++ it is plausible for the compiler to know which objects need to be disposed as you leave scope, whereas in dotNET leaving scope is independent of the objects on the heap.
This being said, deterministic garbage collection is possible for C#, but it is in general more expensive in terms of CPU cycles. Note that you can bypass the GC entirely by using unsafe code.
[dcook]
> And to make all of this work, you have to think
> very carefully about how to handle your
> destructors
Destructors are almost always simple to write. Often, you don't need to write them at all, when memberwise destruction does exactly what you need.
> make a class for every allocated resource, etc.
Every resource should be encapsulated by a class. "End of line", as the Master Control Program would say. If you have non-encapsulated resources, exceptions trigger leaktrocity.
Stephan T. Lavavej
Visual C++ Libraries Developer
Wizarr wrote:
Maybe have a new keyword called "spnew" that replaces or adds to the "new" keyword, where it automatically handles where the allocation comes from. Then the type can be inferred by the spnew operation so that it bypasses the GC, but adds attributes to tell the GC the scope for deletion. I haven't really looked too hard at how best to describe this feature, but I think you can have them coexist.
Wizarr wrote:
If we can make dotnet more honest, I think we will see more acceptance of things like Haskell and other functional programming languages, while at the same time being able to have a hybrid version that can handle the problems of Haskell, such as how state is passed around and handled.
Wizarr wrote:
As for clarifying my comment about compiler optimization support: I was just thinking out loud. If the compiler were smart enough, then we wouldn't need to declaratively state that we want something reference counted; it could infer that automatically if certain conditions of isolation are met. Like all other optimizations, we never tell the compiler to change our code, but it does so if it thinks it can guarantee the same output but faster.
evildictaitor wrote:
I think you should have a look at managed C++ - we have two keywords: new works the same as in normal C++, and gcnew tells the garbage collector that it is responsible for cleaning up the object. This allows you to use shared-pointer semantics for C++ types, and the garbage collector for the .NET components.
You are correct, if I wanted to use the shared pointer semantics today, but I drank the dotnet Kool-Aid a long time ago.
For the most part I was describing a theoretical managed environment where you might not even need a GC, or at least could premark memory for reclamation, so that a parallel GC would be possible without blocking all threads while it searches and reclaims memory. The only time I use managed C++ is when I am importing native DLLs into dotnet, using it only as a wrapper DLL to managed code when it's too hard to convert to P/Invoke method signatures.
There's a problem with your comparison between shared_ptr and garbage collection: in a large majority of C++ programs, the large majority of objects will NOT be immediately held by shared_ptr. The usual mechanisms of automatics, members, and containers
continue to perform most of the resource management work. shared_ptr simply spices things up by permitting polymorphic and noncopyable objects to attend the RAII party (and yes, it also enables sharing scenarios - but as I always emphasize, this is not the
most interesting thing about shared_ptr).
> I think shared_ptr uses atomics or locks by
> default (according to the boost synopsis).
TR1 shared_ptr uses InterlockedFoo().
> Allocating memory is also a pain because you
> have to lock down pieces of your heap
Not if you have per-thread heaps. shared_ptr is completely agnostic to how you get your memory (and we provide C++0x allocator support for controlling how shared_ptr gets its own memory).
> further accesses to the object and copying of
> references do not require further synchronization.
Accessing an object held by shared_ptr is zero overhead - no interlocked operations are performed.
> Stephan mentions the memory efficiency of
> shared_ptr-based C++ solutions, but I don't think
> that's the case.
See, for example: http://cs.umass.edu/~emery/pubs/gcvsmalloc.pdf
Stephan T. Lavavej, Visual C++ Libraries Developer
So Stephan, [question you already answered]
Now that shared_ptr<> is thread-aware, does the same go for other parts of the language? The C++ standard does not mention the word "thread" even once, quite deliberately, I suspect. Is the initialization of function-local statics now thread safe?
C++98/03 (and TR1, which is a library-only addition to C++03) doesn't recognize the existence of threads. C++0x does. This work is still ongoing; you can follow the various proposals and changes to the Working Paper at
http://www.open-std.org/jtc1/sc22/wg21/ .
For function-local statics, see
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2513.html .
> I still prefer to use intrusive reference counting
> where possible, but that's another issue.
C++0x shared_ptr Object Creation support (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2351.htm#creation ), voted into the Working Paper, will provide the efficiency
advantages of intrusive reference counting without the usability penalties, by allocating a single chunk of memory for the object and its reference count. Doing this properly requires variadic templates and rvalue references to solve the forwarding problem.
> there are places where GC is appropriate and
> useful, and there are places where it is not.
I am not opposed to garbage collecting some types. I am strongly opposed to universal garbage collection.
> It is often quite expensive to track items that
> need to be deleted
Remember: most objects in most C++ programs will NOT be immediately held by shared_ptr. Automatics, members, and elements of containers incur no such expenses. shared_ptr is the icing, not the cake.
Native and managed code involves completely different worldviews (I find most managed-talk incomprehensible, and I've found it hard to explain the native ways of doing things to managed programmers). In native code, you can express things like "non-owning
reference" - that's just a built-in reference or pointer. Getting a non-owning reference from a shared_ptr involves zero overhead, and passing it around involves zero overhead. Managed code doesn't make such an important distinction between owning and non-owning
references ("let's make everything owning, and clean up the resulting cycles"). It may seem easier to be able to ignore whether something is owning or not - but the price is being unable to handle non-memory resources properly. That price is absolutely unacceptable
to me.
> so using performance as a primary argument
> against GC seems pretty worthless.
My primary argument against GC is that it doesn't do anything for non-memory resources (finalizers are an abomination).
Stephan T. Lavavej, Visual C++ Libraries Developer
[TicklishPenguin]
> Interesting vid. Like a few have said, there's
> probably not a huge amount of new or unfamiliar
> material here for your average C++ dev since
> usage of boost's shared_ptr is already fairly
> widespread, but it's still worth watching.
If you want lots and lots of detail, check out
https://blogs.msdn.com/vcblog/archive/2008/02/22/tr1-slide-decks.aspx .
Given only an hour (truncated, even), there's a lot of tension between explaining things from first principles, and explaining things in enough detail for advanced C++ programmers. In this video, I think I stuck more to the "first principles" side of things,
although I can definitely understand how those wanting a "quick dip" could be confused by even that.
> Is Microsoft's implementation of TR1 based on
> Dinkum's implementation like the STL?
Yes. We licensed Dinkumware's implementation and have been working closely with them to fix lots of bugs and integrate it into VC.
Stephan T. Lavavej, Visual C++ Libraries Developer
> C++0x shared_ptr Object Creation support (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2351.htm#creation ), voted into the Working Paper, will provide the efficiency advantages of intrusive reference counting without the usability penalties, by allocating a single chunk of memory for the object and its reference count. Doing this properly requires variadic templates and rvalue references to solve the forwarding problem.

Cool! Good to know.
> It is often quite expensive to track items that
> need to be deleted
>
> Remember: most objects in most C++ programs will NOT be immediately held by shared_ptr. Automatics, members, and elements of containers incur no such expenses. shared_ptr is the icing, not the cake.

Ok, sure, most resources don't have to go into shared_ptr. But that wasn't what I was saying. The fact remains that tracking object lifetime is not free. shared_ptr makes it easier in some cases, and potentially even makes it more efficient, but with or without shared_ptr, tracking object lifetime is still a non-trivial cost, and manually tracking it (even with shared_ptr) is hard to get right. To repeat myself, several studies have found that 30-60% of a typical non-trivial program's CPU and memory usage is spent tracking and managing object lifetime (the kind of thing that goes away with GC), and that doesn't count the time spent in delete. You cited a particular paper as an argument that GC has lousy performance, but that paper ignored the costs of manual lifetime tracking (it assumed they were 0), so the paper's conclusions need to be adjusted appropriately before they are used to make any decisions. Even automatics, members, container cleanup and destructors have a cost, and the IF statements used to control flow and do cleanup at function exit also have a cost. This cost has to be factored into any comparison with GC's performance, and once that cost has been factored in, manual lifetime management and GC end up being fairly close in runtime cost.
> (I find most managed-talk incomprehensible, and I've found it hard to explain the native ways of doing things to managed programmers).

I find it very frustrating when people who only have experience with one side of things criticize the other way without understanding it (and trying it for a while) first.
> It may seem easier to be able to ignore whether something is owning or not - but the price is being unable to handle non-memory resources properly. That price is absolutely unacceptable to me.

If that were actually the price, it would be unacceptable to me (and everybody else) too. Fortunately for managed runtimes, that isn't actually the case. There is a price, but it isn't nearly as dramatic as you make it out to be. With the current GC-based runtimes, critical resources need to be tracked separately from object lifetime. In C++, the release of critical resources is conflated with object destruction. That has some advantages, but it is not the only way to do it.
> My primary argument against GC is that it doesn't do anything for non-memory resources (finalizers are an abomination).

Ah, now we're getting somewhere. You're right -- GC itself doesn't do much for non-memory resources (though it does do something -- it gives you a chance to do something at some point after the object becomes unreachable). GC solves an object lifetime issue and is mostly agnostic about critical resources. (Note that "non-memory" is not the best description here, since there are some memory-based resources that must be freed deterministically, and there are some non-memory resources that don't need deterministic release.)
I really don't know where to start and this will be long and rough and rant-like.. But I am only trying to help see the next hop and point to some very simple facts about C++ object model vs GC.
I am certainly not fazed by any talk of GC, especially in the context of .NET.
The committee should rapidly back off the idea of introducing one by default.
I have worked with VMs for over a decade and can see why they will never be appropriate for a number of tasks that are becoming so important it will irreversibly differentiate value vs reference paradigms. Within a couple of years it will also define the 'what now', i.e. the managed 'upgrade-mentality suicide' story ending.
I am aiming at the next phase widely accepted as the demise of Murphy's hypothesis.
Before going on with sarcasm and one tiny detail, I would first like to congratulate the VC++ team for coming out of the trenches after a decade of starvation and Microsoft's reinvention-in-Java, and investment in .NET (primarily marketing and poor language design ie. C# generics, .NET collections, LINQ abomination of 'thinking in sets', aka Bruce-Event-Lee De-lphins and plenty more to bother with listing here).
On a more technical note, avalanche destructors, interlocked ref-counting and many other criticisms of 'all for GC'-style are going against the nature of all modern hardware and language constructs of a tool that caters for multi-paradigm and built-in expression power to bypass any deficiency presented. All of them are easily catered for by language constructs and compiler extension (and you'll see concepts play a huge role here). That's the next phase and you better follow boost people for the next step in this evolution rather than blogs about .NET.
Once .NET people (especially those who converted to productivity candy just like many of us did back in 1996 with Java and 2000 with .NET) finally get disappointed and realise the advance is not going to happen in VMs but environments that can be catered for without breaking threading, memory, ownership, non-ownership and many other models that we can go back and discuss :
Why GC will eventually be slaughtered?
Meanwhile, the best .NET libraries out there are consistently reverting to native code to make up for all the mess GC and runtime environments are very fond of (not to mention first-language education damage in managed-land).
Another comment was on proving type-safety, you know, the proof. Please look at MSR and other projects that do better code analysis than any bytecode tool is capable of today. Actual samples are right there, today, and that's for C Driver Code. No hmm?
Okay, please don't look puzzled (I know it is hard for a die-hard managed mentality), just boot your WPF app or pump a large dataset into your WinForm or ASP.NET app and you'll see the catastrophe that will soon scale as bad as JavaScript without any threading notion at all.
It will probably hurt to realise this. Google is blasting away with this concept, and has pretty bad C++ programmers judging by their blogs. But they try hard where it matters. It is just a fact you cannot avoid, and when you see it, and fight trying to achieve the same with .NET for over 5 years with memory barriers, with help of IOCP, with help of everything, and fail, then you wonder:
Isn’t it about time to move on and make a better environment?
One without runtime overhead?
That is the genius of Bjarne, and C and .NET and Java guys better learn fast as the next barrier and clean-up, and rewrite is quite near.
I've been waiting for .NET and Java hype to materialise for next generation computing for a decade now. And it is constantly disappointing with abstractions that leak and diverge from reality.
Point of no return has been passed though, aka memory latency kills.
I mean is it so hard for people to realise all those VMs are written in pretty average C++, and by induction it satisfies all models you are currently working with (including simple/managed/.NET, OO, functional, parallel, you name it).
Why is it so hard to get this I wonder?
Any comment against C++ or Boost or TR1 is a suicide for your next version of VM, and I have learned to immediately find all of them suspect, even if they are backed up by some surface-level research. The same fact applies to GC protagonists.
Please see that C++ model is what is underneath you and it is advancing at the pace you will not be able to ignore if you care about your work, which VM guys consistently show they don't as they complete their work and boast about how easy it all was.
If history teaches anything, nothing too easy was ever good enough.
And for anti-interlocked and anti-ref-counter guys, no one is forcing you to do anything like it, and you can easily bypass it and blast any GC or VM:
Use const on everything!
Something modern VM and language 'heroes happen there' folks couldn't accept was the ultimate solution.
Go back and read what Bjarne and Sutter are doing. It will help you beat all the .NET, Java, AOP, declarative, Erlang and Haskell people with a blink once you dive in.
You'll be capable of building generations of frameworks (if you have to, but that's not the goal), not just use a single one that is so inefficient in expressing ultimate machine abstraction:
C
And it is evolving too, in parallel to C++, just like your hardware guys. PFX or PLINQ or similar will not help you here.
And none of this is relevant to languages per se, as many follow similar syntax and even translatable semantic in at least one direction. The point is the managed world cannot satisfy some basic models and is starting to break down rapidly.
[dcook]
> I find it very frustrating when people who only have
> experience with one side of things criticize the other
> way without understanding it (and trying it for a while) first.
I understand how universally GCed languages work - I actually learned Java before C++. Like learning C before C++, it was useful - it taught me how programming languages can fail programmers miserably.
> In C++, the release of critical resources is conflated with object destruction.
Conflated? No. The word is "encapsulated".
That, more than anything else, summarizes the gulf between the native and managed worldviews. I've never figured out how to get through to someone who doesn't grok destructors.
I also note your terminology of "critical". There's nothing special about files, sockets, textures, fonts, and so forth. They're just more resources, to be nuked from destructors. Only in managed-world are they "critically" important to remember, because you have to control their lifetimes by hand (with abominations like finally-blocks, which RAII vastly supersedes) as if in C.
[vvldb]
> The committee should rapidly back off the idea of introducing one by default.
GC has been cut from C++0x. And there was much rejoicing.
(However, they'll keep trying. GC's proponents are nothing if not persistent.)
> I would first like to congratulate the VC++ team for coming out of the trenches after a decade of starvation
Actually, VC has been continually improving since the release of The Infinite Enemy, VC6. Each successive release (VC7.0, VC7.1, VC8, and to some extent VC9) has significantly improved language and library conformance. These conformance improvements have made Boost's, and every other modern C++ programmer's, life much easier. (I am deliberately leaving out the IDE here.)
If you want a picture of a boot stomping on a human face, imagine VC6 - forever.
> not to mention first-language education damage in managed-land
I found the article at http://www.stsc.hill.af.mil/CrossTalk/2008/01/0801DewarSchonberg.html to be very interesting (note that I do not agree with all of its points, and the authors have an Ada bias).
Stephan T. Lavavej, Visual C++ Libraries Developer