Tech Off Thread

57 posts

How about some syntax sugar for IEnumerable<>?

  • dpratt71

    I posted this to my nascent blog, but I'd probably reach a wider audience broadcasting to Mars, so...

    I find myself wanting to use IEnumerable<> in my own API, but it's not very readable. I'd love it if MS stole an idea from themselves, and gave us some syntax sugar for this useful interface. In other words, instead of IEnumerable<Foo>, I'd like to see just Foo*. What do you think?

  • W3bbo

    Foo* would be a pointer to a Foo ;)

    Anyway, I'm not sure if I totally understand what you're aiming for. 'foreach' is the syntactic sugar around IEnumerable, since it manages the whole GetEnumerator() / while (enumerator.MoveNext()) / Current thing for you.
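
    A rough sketch of what 'foreach' expands to over an IEnumerable<T>: the enumerator is fetched once, then MoveNext() and Current drive the loop. The class and method names here are made up for illustration, and the real compiler output differs in detail.

    ```csharp
    using System.Collections.Generic;

    static class ForeachSketch
    {
        // The sugared form.
        public static int SumWithForeach(IEnumerable<int> source)
        {
            int total = 0;
            foreach (int item in source)
            {
                total += item;
            }
            return total;
        }

        // Approximately the same loop written by hand.
        public static int SumByHand(IEnumerable<int> source)
        {
            int total = 0;
            // GetEnumerator() is called once; the enumerator is disposed at the end.
            using (IEnumerator<int> e = source.GetEnumerator())
            {
                while (e.MoveNext())
                {
                    total += e.Current;
                }
            }
            return total;
        }
    }
    ```

    Both methods visit the same items in the same order; the second just spells out the plumbing.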

  • dpratt71

    W3bbo said:
    *snip*
    Thanks for the reply, W3bbo. To better explain myself, suppose that you want to define a property which is a collection of "Foo" objects, but you don't want to allow adding or removing items via the property (i.e. the collection is managed internally). In such a case, it would be appropriate for the type of the property to be IEnumerable<Foo>. But that's hard to read. So instead of this:

    public class SomeClass
    {
        ...
        public IEnumerable<Foo> Foos
        {
            get { ... }
        }
    }

    I would rather see this:

    public class SomeClass
    {
        ...
        public Foo* Foos
        {
            get { ... }
        }
    }

    Of course, as you pointed out, Foo* is currently interpreted as pointer-to-Foo (at least in an unsafe context), which almost certainly precludes using that particular syntax. My overall point is that it would be nice to have some simplified syntax for representing IEnumerable<Foo> (perhaps Foo+).

    I do lament the fact that "*" is already taken. The "*" character is used in many programming contexts to represent "0 to n" number of "things".
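
    Short of new syntax, a using alias already gives a shorter name within one source file. A minimal sketch, assuming a placeholder Foo class and a hypothetical alias name FooSeq; the alias is only visible in the file that declares it, so it does not change how the API reads to callers.

    ```csharp
    using System.Collections.Generic;
    // Hypothetical alias name; the right-hand side is written fully qualified.
    using FooSeq = System.Collections.Generic.IEnumerable<Foo>;

    public class Foo { }

    public class SomeClass
    {
        private readonly List<Foo> items = new List<Foo> { new Foo(), new Foo() };

        // Declared as FooSeq, but every caller still sees IEnumerable<Foo>.
        public FooSeq Foos
        {
            get
            {
                // The iterator block hands out items without exposing the list itself.
                foreach (Foo item in items)
                {
                    yield return item;
                }
            }
        }
    }
    ```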

  • W3bbo

    dpratt71 said:
    *snip*
    Java uses the ellipsis in a similar way to C#'s 'params' keyword to represent variable-parameter functions, Mathcad uses the semicolon, and there's a whole load of unassigned punctuation that could be used.

    But I'm wary of adding in too many language features, otherwise the language becomes the antithesis of what it was in the beginning, like how Firefox is now the bloated browser it set out to replace. I propose C# adopt a 'modular' approach like XHTML or CSS (this would require a compiler rearchitecture, but it would release the language from inevitable bloat).

    As for your problem, it reminds me of Delphi's support for indexer properties (similar to C#'s this[], but named).

  • Minh

    dpratt71 said:
    *snip*
    I'm all for clearer intention... so what if we add new keywords? The compiler is supposed to work for us...

    I want syntactic sugar for this...

    private void RunMe()
    {
        if (!InvokeRequired)
        {
            myLabel.Text = "You pushed the button!";
        }
        else
        {
            Invoke(new ThreadStart(RunMe));
        }
    }
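
    Extension methods can already hide most of this pattern. A sketch only: InvokeIfRequired is a hypothetical helper written against ISynchronizeInvoke (the interface that Windows Forms controls implement), and FakeTarget is a stand-in so the example runs without Windows Forms.

    ```csharp
    using System;
    using System.ComponentModel;

    static class InvokeSugar
    {
        // Hypothetical helper: runs the action directly when no marshalling is
        // needed, otherwise routes it through Invoke.
        public static void InvokeIfRequired(this ISynchronizeInvoke target, Action action)
        {
            if (target.InvokeRequired)
            {
                target.Invoke(action, null);
            }
            else
            {
                action();
            }
        }
    }

    // Stand-in for a Control; it only counts how often Invoke was used.
    sealed class FakeTarget : ISynchronizeInvoke
    {
        public bool InvokeRequired { get; set; }
        public int MarshalledCalls { get; private set; }

        public object Invoke(Delegate method, object[] args)
        {
            MarshalledCalls++;
            return method.DynamicInvoke(args);
        }

        public IAsyncResult BeginInvoke(Delegate method, object[] args)
        {
            throw new NotSupportedException();
        }

        public object EndInvoke(IAsyncResult result)
        {
            throw new NotSupportedException();
        }
    }
    ```

    With a helper like this, the body of RunMe would shrink to something like myLabel.InvokeIfRequired(() => myLabel.Text = "You pushed the button!");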

  • wkempf

    1.  There are other "readonly collection" types, ReadOnlyCollection<T> for instance.  It's not always appropriate to expose IEnumerable only because you want a "readonly collection" type.

    2.  IEnumerable<Foo> isn't the ugly syntax you want to make it out to be.  Foo* or Foo+ may be shorter, but they are also less meaningful.  I think it was at least a little questionable that we added the Foo? syntax, when this is nothing but sugar for Nullable<Foo>.  My gut tells me it's even more questionable to do the same in this case.  Obviously that's only an opinion.
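
    For reference, a small sketch of the existing Foo? sugar mentioned in point 2, using int as the value type. The class and method names are illustrative only.

    ```csharp
    using System;

    static class NullableSugar
    {
        public static bool SameType()
        {
            // int? and Nullable<int> name the same type.
            return typeof(int?) == typeof(Nullable<int>);
        }

        public static int OrDefault(int? value, int fallback)
        {
            // Equivalent to: value.HasValue ? value.Value : fallback
            return value ?? fallback;
        }
    }
    ```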

  • dpratt71

    wkempf said:
    *snip*
    While ReadOnlyCollection<T> may be fine for "managing" a read-only collection, I don't think it a good choice for exposing a read-only collection. For one thing, ReadOnlyCollection<T> implements IList<T>, which, of course, has methods/properties for modifying the collection.

    Another point in favor of IEnumerable<> that I forgot to mention is flexibility. In other words, you're able to change the implementation later on and perhaps choose a different underlying container.

    I'm not sure what you mean when you say "...they are also less meaningful...". Why? If you program in C#, then you are very accustomed to using symbols to refer to various concepts. Even terms like "object" and "int" are really symbols representing some "real" type. I'm proposing that there be a symbol that means "some number of things".

  • W3bbo

    dpratt71 said:
    *snip*
    ReadOnlyCollection<T> implements IList<T>'s mutating members only explicitly, so they are inaccessible unless the ROC is cast to IList<T>, and even then every mutator operation throws an exception.

    ROC is meant to be a Decorator wrapper around another list (which can be modified). I think ROCs work best with C#'s auto-implemented properties:

    using System;
    using System.Collections.Generic;
    using System.Collections.ObjectModel;

    // I often alias generic types like this for readability and semantics.
    class PublicFacingCollection : ReadOnlyCollection<Object> {
        // ReadOnlyCollection<T> has no parameterless constructor, so the wrapped list is passed through.
        public PublicFacingCollection(IList<Object> list) : base(list) { }
    }

    class Foo {
        public PublicFacingCollection TheCollection { get; private set; }
        private List<Object> underlyingCollection;

        public Foo() {
            underlyingCollection = new List<Object>();

            // The PublicFacingCollection instance no longer needs to be dealt with by the Foo owner class.
            // The underlyingCollection can be modified and the changes show up in the TheCollection property instantly.
            // Consumers of the Foo type do not see any mutator methods of the ROC as they are implemented explicitly.
            TheCollection = new PublicFacingCollection(underlyingCollection);
        }
    }
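
    A short sketch of the two behaviours described above, using a plain ReadOnlyCollection<string>: the wrapper is a live view over the list it wraps, and Add is reachable only through the IList<T> interface, where it throws. The class and method names are illustrative.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Collections.ObjectModel;

    static class ReadOnlyWrapperDemo
    {
        public static bool WrapperSeesLaterChanges()
        {
            var inner = new List<string> { "a" };
            var wrapper = new ReadOnlyCollection<string>(inner);
            inner.Add("b");               // modify the wrapped list afterwards
            return wrapper.Count == 2;    // the wrapper is a view, not a copy
        }

        public static bool AddThroughInterfaceThrows()
        {
            var wrapper = new ReadOnlyCollection<string>(new List<string>());
            IList<string> asList = wrapper;   // Add is only reachable via the interface
            try
            {
                asList.Add("x");
                return false;
            }
            catch (NotSupportedException)
            {
                return true;
            }
        }
    }
    ```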

  • dpratt71

    W3bbo said:
    *snip*

    I am at least somewhat acquainted with ReadOnlyCollection. In any case, I consider ReadOnlyCollection vs. IEnumerable as mostly apples vs. oranges. Changing nothing else about your example, what would be the drawback to defining TheCollection as type IEnumerable<Object>?

  • CannotResolveSymbol

    dpratt71 said:
    *snip*

    I'm assuming you mean like this:

    class Foo {
        public IEnumerable<Object> TheCollection { get; private set; }
        private List<Object> underlyingCollection;

        public Foo() {
            underlyingCollection = new List<Object>();
            TheCollection = (IEnumerable<Object>) underlyingCollection;
        }
    }

    In this case, your collection isn't actually read-only.  A user of your library can cast TheCollection back to List<Object> and modify the contents of the list, which is bad in most situations (since you're directly accessing a private data member).
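
    To make the difference concrete, here is a sketch with two hypothetical classes: Leaky hands out its private list typed as IEnumerable<object>, while Wrapped hands out a ReadOnlyCollection<object> around it. TryAddViaCast plays the careless consumer.

    ```csharp
    using System.Collections.Generic;
    using System.Collections.ObjectModel;

    class Leaky
    {
        private readonly List<object> items = new List<object>();

        // The caller receives the List<object> itself, merely typed as IEnumerable<object>.
        public IEnumerable<object> TheCollection { get { return items; } }
        public int Count { get { return items.Count; } }
    }

    class Wrapped
    {
        private readonly List<object> items = new List<object>();

        // The caller receives a wrapper; the list itself never escapes.
        public IEnumerable<object> TheCollection
        {
            get { return new ReadOnlyCollection<object>(items); }
        }
        public int Count { get { return items.Count; } }
    }

    static class CastBackDemo
    {
        // Attempts the downcast and reports whether it succeeded.
        public static bool TryAddViaCast(IEnumerable<object> exposed)
        {
            var list = exposed as List<object>;
            if (list == null) return false;
            list.Add("injected");
            return true;
        }
    }
    ```

    The downcast succeeds against Leaky and changes its private state; against Wrapped it fails because the exposed object is not a List<object>.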

  • TommyCarlier

    One thing I sometimes miss is an interface between IEnumerable<T> and ICollection<T> that represents a read-only collection (implements IEnumerable<T>) with a Count-property. Something like this:

    public interface IReadOnlyCollection<T>
     : IEnumerable<T>
    {
        int Count { get; }
    }

  • stevo_

    TommyCarlier said:
    *snip*
    If it supports Count, would it not also support index-based access? And I agree: personally, ReadOnlyCollection makes me cringe. I don't like exposing it, and it can only ever act as runtime protection. I would love some read-only interfaces for data structures, and perhaps even some immutable data structure interfaces and implementations (surely the implementations already exist in F#, for example).

  • TommyCarlier

    stevo_ said:
    *snip*

    No, it wouldn't have to support index-based access. Just look at ICollection<T>: you can enumerate it (via IEnumerable<T>), it has a Count-property, but also an Add(T)-method and some additional members (like Remove() and IsReadOnly). Yet it doesn't have index-based access. That's what IList<T> is for.

    Now: IEnumerable<T> « ICollection<T> « IList<T>
    What I'd like: IEnumerable<T> « ICountEnumerable<T> « ICollection<T> « IList<T>
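
    A sketch of how such an in-between interface could look, together with an adapter over any ICollection<T>. Both ICountEnumerable<T> (named as in the line above) and CountEnumerable<T> are hypothetical; neither is a framework type.

    ```csharp
    using System.Collections;
    using System.Collections.Generic;

    // Hypothetical interface: enumeration plus a cheap Count, nothing else.
    public interface ICountEnumerable<T> : IEnumerable<T>
    {
        int Count { get; }
    }

    // Hypothetical adapter exposing any ICollection<T> through the narrower interface.
    public sealed class CountEnumerable<T> : ICountEnumerable<T>
    {
        private readonly ICollection<T> inner;

        public CountEnumerable(ICollection<T> inner)
        {
            this.inner = inner;
        }

        public int Count { get { return inner.Count; } }

        public IEnumerator<T> GetEnumerator()
        {
            return inner.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
    ```

    The adapter exposes no Add, Remove or indexer, and it cannot be cast to ICollection<T> or IList<T>. Later framework versions (.NET 4.5 onwards) ship System.Collections.Generic.IReadOnlyCollection<T> with this same shape.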

  • dpratt71

    CannotResolveSymbol said:
    *snip*
    I'm sorry, that's not what I meant (for the reason you stated). I meant only change the type of the public "TheCollection" property.

  • dpratt71

    TommyCarlier said:
    *snip*
    Of course, thanks to the magic of extension methods, IEnumerable now effectively does have a Count method. I pointed Reflector at the implementation: As you might expect, if the source implements ICollection, then ICollection.Count is returned. Otherwise it enumerates over the collection, incrementing a counter along the way.
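
    The behaviour described above can be sketched roughly as follows. This is a hypothetical re-implementation for illustration (CountOf and Lazy are made-up names), not the framework source.

    ```csharp
    using System;
    using System.Collections.Generic;

    static class CountSketch
    {
        public static int CountOf<T>(IEnumerable<T> source)
        {
            if (source == null) throw new ArgumentNullException("source");

            // Fast path: the collection already knows its size.
            var collection = source as ICollection<T>;
            if (collection != null) return collection.Count;

            // Slow path: walk the whole sequence, which is O(n).
            int count = 0;
            using (IEnumerator<T> e = source.GetEnumerator())
            {
                while (e.MoveNext()) count++;
            }
            return count;
        }

        // An iterator block is not an ICollection<T>, so it takes the slow path.
        public static IEnumerable<int> Lazy(int n)
        {
            for (int i = 0; i < n; i++) yield return i;
        }
    }
    ```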

  • wkempf

    TommyCarlier said:
    *snip*
    There's lots you lose with IEnumerable<T>.  Count and random access are the most obvious.  LINQ sort of gives this back to you.  The Count() method will give you the count, and is optimized to try and cast the IEnumerable<T> to an ICollection<T> in order to get the count without having to iterate over the collection.  However, I believe that is bad design.  If the underlying collection is changed to no longer be an ICollection<T> the complexity of Count() has radically changed.  Since the public interface only exposed IEnumerable<T>, I have to expect that sort of behavior.

    The ReadOnlyCollection types aren't ideal.  I'd much prefer it if the collection designs had started with IReadOnlyCollection and ICollection had derived from there.  IOW, I share your consternation about the existence of non-readonly methods that throw exceptions.  That said, however, we have what we have, and in practice there's seldom if ever an issue with the IsReadOnly/exception design of ReadOnlyCollection types.

  • evildictaitor

    wkempf said:
    *snip*
    Be aware that the absence of Count and random access from IEnumerable<> is quite deliberate: whereas random access in a List<T> is fast, random access in a LinkedList<T> is not, and counting an arbitrary sequence can mean walking all of it. Furthermore, IEnumerable<T> allows some interesting properties: for example, there's no reason why an IEnumerable<T> should ever terminate, and it certainly need not know if (or when) it's going to terminate when it starts.
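
    As an illustration of a sequence that never terminates, here is a minimal sketch using an iterator block. Endless, Naturals and FirstFive are made-up names.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;

    static class Endless
    {
        // Never terminates on its own; values are produced only on demand.
        public static IEnumerable<int> Naturals()
        {
            int n = 0;
            while (true)
            {
                yield return n;
                n++;
            }
        }

        public static List<int> FirstFive()
        {
            // Take(5) stops pulling after five items, so this returns.
            return Naturals().Take(5).ToList();
        }
    }
    ```

    Asking such a sequence for its Count() would never return, which is one reason the interface makes no promise about it.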

    For example, if you're reading through a series of lines in a text file, there's a good argument to say that you only need to read them one at a time. Finding out how many lines there are requires you to read through the entire file first (and the file may be big), but processing the lines one at a time never requires holding the whole file in memory. Thus by using IEnumerable<T> rather than a Collection<T>, some benefit has been achieved.
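
    A minimal sketch of the line-at-a-time idea, assuming a hypothetical LazyLines.ReadLines helper over any TextReader (a StreamReader over a file would work the same way):

    ```csharp
    using System.Collections.Generic;
    using System.IO;

    static class LazyLines
    {
        // Yields one line at a time; only the current line is held by the iterator.
        public static IEnumerable<string> ReadLines(TextReader reader)
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
    ```

    A consumer can stop early, for instance after finding the first matching line, and the rest of the input is never read.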

    Another example is when tokenising an HTML stream to be rendered: if we were to insist on knowing how many elements there were before we started, then we could not start parsing until the entire page contents had been returned. If on the other hand we do not need to do this (by using an IEnumerable<T> rather than an ICollection<T>) we can start processing the page BEFORE the entire thing has been downloaded, allowing for a speedup of the effective render time for the page.

    One thing that slightly concerns me about some of the comments on this page is people noting that if you have a List<T> and expose it as an IEnumerable<T>, it can (in principle) be cast back to a List<T> and modified. As soon as I see this I think "that is certainly true, but if a programmer is willing to forgo safe programming practices, is it not his/her fault when it all goes wrong?". Certainly in my code I expect that when I expose an IEnumerable<T>, neither I nor any other programmer will ever try to cast it back to a List<T>. If they need random access they can use the constructor of List<T> that takes an IEnumerable<T>. The contract that the method gives does not say that the IEnumerable<T> is castable to a List<T>, and to assume that it is would be unsafe: if I later change the implementation to a LinkedList<T>, their code will break even though the method signature remains the same.

    The question you then need to ask is WHY and for WHOM you are adding these protections. If you're a component maker who's exposing these properties to a customer, then sure, go right ahead and protect these things with ReadOnlyCollections. If, on the other hand, you're programming to prevent yourself or your colleagues from making dodgy assumptions, then perhaps you and your colleagues need to sit down and discuss whether there is a better way of doing what you're doing.

    Remember:

    // Class and property names are placeholders.
    class Holder<T> {
      List<T> _val = new List<T>();
      public IEnumerable<T> Values {
        get { return _val; }
      }
    }

    has a computational complexity of O(1) and a memory overhead of 0.

    // Class and property names are placeholders.
    class Holder<T> {
      List<T> _val = new List<T>();
      public IEnumerable<T> Values {
        get { return new ReadOnlyCollection<T>(_val); }
      }
    }

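    allocates a new wrapper object on every call. The wrapper itself is cheap, since ReadOnlyCollection<T> wraps the list rather than copying it; it is a defensive copy such as new List<T>(_val) that has a computational complexity and memory overhead of O(_val.Count), slowing and bloating your program at the same time.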

  • wkempf

    evildictaitor said:
    *snip*
    I always wonder about posts like this.  It's obvious from the post you're replying to that I fully understand why IEnumerable<T> doesn't have a Count property or provide random access.  That kinda was the entire point of my post, don't ya think?
