Inheritance Is The Base Class of Evil

Implementing non-intrusive runtime polymorphic objects with value-semantics, and multiple-undo in 20 minutes.

Follow the Discussion

  • Sorry if that was too fast! Feel free to ask questions here.

  • cjlove

    Enjoyed the (light-speed) talk, but am looking forward to digesting the slides and online video when available <grin>

  • Sean,

    Would you recommend this nonintrusive approach for types used to inject test seams into code? E.g. consider a program which depends on the file system, or a database, or similar, which needs to be replaced with something else during testing.

    Would you recommend this value-semantics behavior when injecting these kinds of seams that expose data external to the program that can't be copied? (E.g. we can copy a file handle, but we can't copy the underlying structures backing it because those are outside our program's control.)

    Thanks!

  • Sean, could you provide a download link for the source in the examples? I'd love to spend more time with it and didn't get a chance to transcribe it all in the audience.

  • I found that Sean did pretty much the same presentation before, and the source code for that one is on github here: https://github.com/boostcon/cppnow_presentations_2012/blob/master/fri/value_semantics/value_semantics.cpp 

    Sean, is there any difference in the code now? And thank you for the great presentation!

  • @SeanParent: The talk was extremely interesting but unfortunately it was too fast. I'd love to see the updated source code so that I can "study" it.

  • @sbehnke15: Recent code can be found here: https://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations

    This version is nearly identical to the code presented and much more recent than the code linked to by @jdkoftinoff.

    @vittorioromeo, I believe this also answers your question.

  • @BillyONeal: The technique can be used to build an object holding anything that models a file system, file, or database, so you can mock an interface at runtime, though I'm not sure a runtime polymorphic approach is the right choice for a unit test.

    Sometimes when an object is external, there is no way to avoid reference semantics. I still prefer having the handle packaged with RAII for those cases.

  • Thank you @SeanParent for the source code. Is there a version available with `unique_ptr` instead of `shared_ptr`?

  • @vittorioromeo: I updated the above page to include the unique_ptr version of the code.

  • scott

    That was absolutely brilliant. That one talk just changed the whole way I program. I can't believe I didn't see how you could accomplish that before... Just amazing.

  • Very nice, thank you Sean! Do you think it is still appropriate to (try to) use this pattern in a tightly coupled hierarchy, where the objects have references to the parent object or to the document? Value semantics for objects in such a hierarchy may not function as well.

    Regarding performance, since the concept_t uses virtual dispatch anyway, it should have the same perf characteristics as if the document stored the unique_ptr<concept_t> directly, or am I overlooking something?

  • Thiago Adams

    Very good, thank you Sean!

    "It is my hope that the language (and libraries) will evolve to make creating polymorphic types with value semantics easier"

    It is my hope too. Especially for decoupling polymorphism.
    I think we already know what we want for the client side.
    Now, we need an easy way to express the container.



  • Thanks for the talk, Sean! It was nice to have the material presented in a version where the audio is much much better than in the Boostcon 2012 video.

    Any tips on how an object with more complicated behaviors could be implemented more conveniently? The function "draw" is involved 4 times in the object_t example class (not counting the additional stand-alone function implementation required). I've even considered whether I should write a single "really generic" function that does this and that and the kitchen sink, but I somehow feel that wouldn't really be in the spirit of the technique presented... ;)

  • @Xenakios: There are library approaches to try to reduce the boilerplate - see Steven Watanabe's Boost type-erasure library as the most complete example <http://www.boost.org/doc/libs/1_54_0/doc/html/boost_typeerasure.html>.

    Until we have some better compile time introspection in the language, library solutions will continue to be fairly complex to use. That said, in my experience when polymorphism is no longer intrusive and can be added on demand, I find you need much less of it and it doesn't permeate your entire design. Either coding the polymorphism manually as I show, or using a library such as Steven's can be effective.

  • axel

    Great talk Sean, I have learned so much. However, I did not understand in detail why you prefer unique_ptr to shared_ptr. Doesn't this introduce unnecessary copies of the models and their data when you commit a history state?
    It would be great if you could enlighten me on that.

  • @axel: Although copy is important, it is not always needed. When a copy is needed, incrementing a reference count for a shared pointer is somewhat expensive (you can copy about 10 words in the same amount of time as an atomic increment). A std::shared_ptr<> is two words in size, so copying and moving it is more expensive, and if you do small object optimizations it pushes more cases to the heap. With a shared pointer you don't have mutability, or if you do, you get it through copy-on-write, which has semantics that are surprising enough that I recommend making the request-write operation explicit (a write will invalidate any references to the object). Although the decision is rarely clear-cut, I've found that just using unique_ptr and copying in the implementation is simple and efficient for most use cases.

  • Nawaz

    @SeanParent

    At 19:11, commit() has this line:

    x.push_back(x.back());

    Is this safe? I know MSVC++10 implementation of vector::push_back guarantees it to be safe. But does the specification require it to be safe? I think it would invoke undefined behavior if the vector resizes itself, as a result of which the reference returned by `x.back()` would die before its copy is added to the vector.

  • scoopr

    Am I understanding it right that if you had a deeper tree-like structure in such a structural-sharing scheme, a single change in a leaf would mean copying all the parents of that node? That could incur many, even if small, heap allocations for a relatively small change. In that respect the parallels with the git data model are uncanny, except that instead of pointers it uses content-addressable hashing (with the added bonus of deduplicating identical objects).

  • @Nawaz: The consensus is that this is guaranteed to work by the standard. Here is a quote from a conversation I had with Howard Hinnant on the topic:

    "Yes, it is allowed and guaranteed to give the expected answer.  There's not a particular place I can quote.  The general reasoning is that nowhere in the standard does it say it is disallowed.  Compare that with Table 100:

    a.insert(p, i, j)

    which says:

    pre: i and j are not iterators into a.
    Inserts copies of elements in [i, j) before p

    It is pretty easy for implementations to guarantee this behavior:

    In the has-capacity-case it is very easy to see there is no problem.

    In the doesn't-have-capacity-case, the new buffer must be created and filled and all the implementation has to do is to make sure the new value is constructed prior to deleting the old buffer.

    v.insert(p, v[i]) is more interesting to implement, but is also guaranteed to work."

    Sean
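
    For readers following along, here is a minimal sketch of the commit/undo pattern the question refers to. It is not the code from the talk: document_t is a stand-in (a vector of strings), and current() is a hypothetical helper added for illustration. The only point it demonstrates is that commit() passes a reference to an element of the vector back into push_back on the same vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in types (assumptions): the talk's document holds type-erased objects.
using document_t = std::vector<std::string>;
using history_t  = std::vector<document_t>;

// Hypothetical helper: the state currently being edited is the newest entry.
document_t& current(history_t& x) {
    assert(!x.empty());
    return x.back();
}

// Save a checkpoint by copying the newest state onto the end of the history.
// The argument aliases an element of x, which is the case asked about above;
// the implementation has to copy it before releasing any old buffer.
void commit(history_t& x) {
    assert(!x.empty());
    x.push_back(x.back());
}

// Drop the newest state, returning to the previous checkpoint.
void undo(history_t& x) {
    assert(!x.empty());
    x.pop_back();
}
```

    Starting from a history with a single element (so capacity is likely 1), the first commit() forces a reallocation, which is exactly the doesn't-have-capacity case described in the quote.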

  • @scoopr: It may mean that, though there are structures that can limit the impact. For example, it is possible to implement a rope structure so that the blocks within the rope are shared (Google "Hans Boehm Rope").

  • scoopr

    I believe the sample code has an error, in that the model should inherit the concept publicly :)

  • @scoopr: It does inherit publicly - structs default to public members and inheritance (a difference between struct and class).

  • Joe

    What exactly is the language defect related to move assignment? I tried a quick search but couldn't find anything. Does something like

    class Foo
    {
    /* ... */
    Foo& operator=(Foo o) & { member = move(o.member); return *this; }
    };

    not work?

  • @Joe: The issue is that if you took your class Foo [for clarity let's write your assignment as:

    Foo& operator=(Foo o) noexcept { member = move(o.member); return *this; }

    ] and put it in a struct:

    struct wrap { Foo m_; };

    Then wrap will not get a default move assignment. For wrap to get a default noexcept move assignment, all members must have a noexcept move assignment - this determination is made by signature. That is, the standard says that for wrap to get a default noexcept move assignment, all members must have a move assignment with the signature T& operator=(T&&) noexcept.

    The fix is to rephrase the requirement so it says that a struct or class will get a default noexcept move assignment if all members satisfy is_nothrow_move_assignable<T> - which the above does. That is, we want to define the requirement in terms of the concept, or operation semantics, and not in terms of matching an exact signature.
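
    A small sketch for trying this out on a given compiler. The member type (std::string) and the explicitly defaulted constructors are assumptions added so the example is complete; the function names are hypothetical. The trait on Foo is checked at compile time, while the result for wrap is only reported and deliberately not asserted, since the thread notes that compilers have disagreed on it.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct Foo {
    Foo() = default;
    Foo(const Foo&) = default;
    Foo(Foo&&) noexcept = default;
    // By-value assignment: serves as both copy and move assignment,
    // but does not have the signature Foo& operator=(Foo&&).
    Foo& operator=(Foo o) noexcept { member = std::move(o.member); return *this; }
    std::string member;  // assumed member type, for illustration only
};

struct wrap { Foo m_; };

// In terms of operation semantics, Foo is nothrow move assignable.
static_assert(std::is_nothrow_move_assignable<Foo>::value,
              "assigning from an rvalue Foo cannot throw");

// What the compiler concludes for wrap. Not asserted here: the answer
// depends on how the implementation reads the rule under discussion.
bool wrap_is_nothrow_move_assignable() {
    return std::is_nothrow_move_assignable<wrap>::value;
}

// Assigning from an rvalue wrap compiles and transfers the value either way;
// only the exception specification is in question.
void move_through_wrap() {
    wrap a;
    a.m_.member = "x";
    wrap b;
    b = std::move(a);
    assert(b.m_.member == "x");
}
```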

  • Joe

    @SeanParent: Thank you for the quick response. I checked the N3337 draft, and I assume you are referring to 12.8 §19 and §23, correct? In any case, GCC 4.8 seems to ignore that rule, as wrap is reported to be nothrow-move-assignable which is why I didn't notice this before.

  • SeanParent wrote:

    @sbehnke15: Recent code can be found here: https://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations

    This version is nearly identical to the code presented and much more recent than the code linked to by @jdkoftinoff.

    @vittorioromeo, I believe this also answers your question.

    Thank you for sharing this. Are there any other public-facing examples of your current coding style available? You don't perchance have a version of your personal 'coding guidelines' in text form you could share?

  • @tomkirbygreen: I don't have a "coding guidelines" document in text form. The work from STLab can be found at stlab.adobe.com where you can find the ASL libraries. There are also docs, papers, and presentations on the wiki: stlab.adobe.com/wiki. I also can't recommend working your way through "Elements of Programming" enough.

    The STLab has been gone for a few years now - and ASL has been decaying a bit. I'm trying to get development moved over to github, and you can find the latest code here: github.com/stlab/legacy (the reason it is in a "legacy" repo is that the plan is to clean it up and break it into separate libraries that will go into the adobe-source-libraries repo and/or get submitted to boost).

    I'm also looking for a decent blog space to make some of my coding tips public - github pages hasn't worked for me in my attempts to get that going.

    Jaakko Järvi and I have collaborated on a number of articles (we're working on another now) and he and his students have done some related work - you can find more papers here: http://faculty.cs.tamu.edu/jarvi/publications/Author/JARVI-J.html

    You can also Google me to find some other video presentations that have been recorded over the years.

  • @Joe: Yes - that is the correct citation in the standard. The version of clang I use obeys the rule - but gets the trait wrong, which makes things even worse. I posted a test case with my results here:

    https://gist.github.com/sean-parent/6612672

  • When you add my_class_t and the draw function, how do they make it possible to insert into the document? I'm lost at the most interesting part :(

  • @McHalls: It works like this:

    • When the instance of my_class_t is placed into the document, an object_t is constructed through the template constructor.
    • The constructor instantiates an instance of model<my_class_t>
    • The instance of model<my_class_t> has a virtual draw method, that forwards to the draw function taking a my_class_t instance
    • The object_t is not a templated type, but holds a pointer to a concept_t - from which the model<> is derived. The technique is known as "type-erasure" and is used by std::function<>, boost::any, and other libraries.

    So long as there is a stand alone draw function (or the class is serializable through stream out - so it picks up the default implementation of draw), then it can be stored in an object_t, and into the document. No (client-side) inheritance required.
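
    The steps above can be sketched as follows. This is a condensed illustration in the spirit of the talk's object_t, not the published code: it uses the unique_ptr variant with a copy_() virtual, and the exact member names and the assert on a moved-from object are assumptions made here.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Default draw: anything that can be streamed out can be drawn.
template <typename T>
void draw(const T& x, std::ostream& out) { out << x << '\n'; }

class object_t {
public:
    // Template constructor: wraps any T in a model<T> (the type is "erased").
    template <typename T>
    object_t(T x) : self_(new model<T>(std::move(x))) {}

    object_t(const object_t& x) : self_(x.self_->copy_()) {}
    object_t(object_t&&) noexcept = default;
    object_t& operator=(object_t x) noexcept { self_ = std::move(x.self_); return *this; }

    friend void draw(const object_t& x, std::ostream& out) {
        assert(x.self_);  // a moved-from object_t must not be drawn
        x.self_->draw_(out);
    }

private:
    struct concept_t {
        virtual ~concept_t() = default;
        virtual concept_t* copy_() const = 0;
        virtual void draw_(std::ostream&) const = 0;
    };
    template <typename T>
    struct model : concept_t {  // struct: inherits publicly by default
        model(T x) : data_(std::move(x)) {}
        concept_t* copy_() const override { return new model(*this); }
        // Forwards to whichever draw() overload fits T.
        void draw_(std::ostream& out) const override { draw(data_, out); }
        T data_;
    };
    std::unique_ptr<const concept_t> self_;
};

// A client type with no base class, only a free draw function.
struct my_class_t {};
void draw(const my_class_t&, std::ostream& out) { out << "my_class_t\n"; }

using document_t = std::vector<object_t>;

void draw(const document_t& x, std::ostream& out) {
    out << "<document>\n";
    for (const auto& e : x) draw(e, out);
    out << "</document>\n";
}
```

    With these definitions, an int, a std::string, and a my_class_t can all be placed in the same document_t; the int and string pick up the default streaming draw, while my_class_t uses its own overload.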

  • Thanks Sean!  I've learnt a lot from your talk.  
    I wasn't able to understand the last part on using shared_ptr to eliminate copying.
    Is it some sort of copy-on-write mechanism that is taking place?
    And what is the significance of the const in shared_ptr?

  • @sutm: I certainly can't speak for Sean, but there was a very similar StackOverflow question asked recently that may give some insights:

    http://stackoverflow.com/a/18803611/245869

  • @sutm: The answer given by @bkuhns on Stack Overflow is correct. I would phrase it a bit differently. The axioms of copy are that copies are equal and modifying the copy does not modify the original. By convention, a const value is not modified in any apparent way (though there are many ways that const can be circumvented, doing so violates convention). So as long as the value referred to has a lifetime at least as long as the reference, a const reference has value semantics.

    This is why we are able to pass arguments by const& to avoid a copy. The caller guarantees, by convention, that the value is not modified and will survive for the duration of the call. With a shared_ptr, the lifetime of the object is ensured by the ref-counted pointer.

    There is a subtle difference with equality - comparing shared_ptr for equality, with respect to the shared object, is comparing identity. This doesn't violate the axioms of copy - copies are equal because they are identical - but is typically not the desired comparison. But since we don't expose the "raw" shared_ptr from our object_t, we can trivially provide an equality operator on object_t which returns true iff the contained objects are equal.

    Elements of Programming (EoP) has a great discussion on the topics of copy and equality and precisely defines the semantics of both operations. You might also look at this paper:

    http://www.stepanovpapers.com/DeSt98.pdf

    I believe this was the first paper to define the term "regular type" - and I still consider it a must-read.
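
    To make the identity-versus-equality point about shared_ptr concrete, here is a small sketch. The class shared_value_t and its member names are hypothetical and exist only for this illustration; it holds a shared_ptr to const data, so copies share one immutable object, and it defines operator== on the contained value instead of on the pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical wrapper: value semantics on top of shared, immutable data.
class shared_value_t {
public:
    explicit shared_value_t(std::string s)
        : self_(std::make_shared<std::string>(std::move(s))) {}

    // Copying (compiler-generated) only bumps the reference count. Because
    // the pointee is const, no holder can modify it behind another's back.
    const std::string& get() const {
        assert(self_);
        return *self_;
    }

    // Identity: do both wrappers point at the very same object?
    bool same_object(const shared_value_t& other) const { return self_ == other.self_; }

    // Equality: compare the contained values, not the pointers.
    friend bool operator==(const shared_value_t& a, const shared_value_t& b) {
        return *a.self_ == *b.self_;
    }

private:
    std::shared_ptr<const std::string> self_;
};
```

    A copy is both identical and equal to its source, while two wrappers built separately from the same string are equal without being identical.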

     

  • @bkuhns: Thanks for pointing me to Stack Overflow! @Sean: Thanks for taking your time to explain it here again. Will definitely take a look at Stepanov's papers and his EoP book. :)

  • @SeanParent: I meditated on your code snippet for object_t and could not find a good solution to implement serialization. The serialize part is easy (as always) but I was not capable of finding a cool way to deserialize. I ended with a system where each valid type is registered in a map and the system dispatches on formerly stored RTTI. How did you solve this?

  • @MarkusWerle: Much the same way - but I build the lookup table as just a static table harvested from the code sorted by the type name. Another approach I've used for messages is to defer deserialization until code extracts the value (at which point the class being extracted is known). It avoids the global registry but requires that the code extracting has full information about the type.

    The global registry problem is a classic "there is no good way to do that in C++" problem.
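
    A rough sketch of the sorted-static-table idea, under several assumptions: the stream format (a type name followed by the payload), the result type value_t, and all function names are invented for this illustration. In a real system the readers would produce an object_t, and the table would be generated from the code rather than written by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical result type standing in for object_t.
struct value_t { std::string type; std::string text; };

using reader_t = value_t (*)(std::istream&);

value_t read_int(std::istream& in)    { int x = 0; in >> x; return {"int", std::to_string(x)}; }
value_t read_string(std::istream& in) { std::string s; in >> s; return {"string", s}; }

struct entry_t { const char* name; reader_t read; };

// Static table; it must stay sorted by name for the binary search below.
const entry_t table[] = {
    {"int", read_int},
    {"string", read_string},
};

// Binary search by type name; returns nullptr for an unknown type.
const entry_t* find_reader(const std::string& name) {
    assert(std::is_sorted(std::begin(table), std::end(table),
        [](const entry_t& a, const entry_t& b) { return std::strcmp(a.name, b.name) < 0; }));
    auto p = std::lower_bound(std::begin(table), std::end(table), name,
        [](const entry_t& e, const std::string& n) { return e.name < n; });
    return (p != std::end(table) && p->name == name) ? p : nullptr;
}

// Reads "<type name> <payload>" and dispatches on the stored name.
bool deserialize(std::istream& in, value_t& out) {
    std::string name;
    if (!(in >> name)) return false;
    const entry_t* e = find_reader(name);
    if (!e) return false;
    out = e->read(in);
    return true;
}
```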

  • Will

    Thanks very much for the talk, slides, and code!

    About the move assignment operator issue: will it be fixed in C++14?

    I've read Core Language Defect Report 1402. The proposed resolution presented in N3667 was incorporated into the Working Draft N3691 several months ago (and is still in the latest draft, N3797), but I see no mention of "is_nothrow_move_assignable". So is there still a problem with the "noexcept exact signature" issue in the current draft wording?

  • @Will: From how the current wording is phrased, I believe it is still an issue. The problem is the use of the term move assignment operator in the added section of 12.8.20 instead of using is_move_assignable. I have not been following the issue with the committee and do not know if there is an objection to changing the wording to be concept based.

  • Steve Robb (Fuz)

    Awesome talk.  However, is emplace_back of the vector into itself guaranteed to work?  I think it might only be working because of an implementation which doesn't increment its internal size until after construction, rather than incrementing and rolling back if an exception is thrown (which I don't believe is forbidden by the standard).

    Similarly, an emplace into the middle of the vector with itself seems like it could never work.

    I can see how push_back/insert would work, because the copies would be made at the call site, before the vector is modified.

    Just seems a little dangerous, unless I'm missing something.

  • @Fuz: I've discussed this issue with several folks on the committee in the past; the consensus is that it is guaranteed because it isn't explicitly disallowed. From Howard Hinnant:

    ---

    Yes, it is allowed and guaranteed to give the expected answer.  There's not a particular place I can quote.  The general reasoning is that nowhere in the standard does it say it is disallowed.  Compare that with Table 100:

    a.insert(p, i, j)

    which says:

    pre: i and j are not iterators into a.
    Inserts copies of elements in [i, j) before p

    It is pretty easy for implementations to guarantee this behavior:

    In the has-capacity-case it is very easy to see there is no problem.

    In the doesn't-have-capacity-case, the new buffer must be created and filled and all the implementation has to do is to make sure the new value is constructed prior to deleting the old buffer.

    v.insert(p, v[i]) is more interesting to implement, but is also guaranteed to work.

    ----
