Tech Off Thread

12 posts

Forum Read Only

This forum has been made read only by the site admins. No new threads or comments can be added.

LINQ Where; fold vs bind...

Back to Forum: Tech Off
  • User profile image
    Thorium

    Hello!

    This post refers to some Going Deep videos...

    Greg told in his video about shape, wrap and roll.

    LINQ Select many is "wrap" and Aggregate is "roll", right?

    LINQ Where have this kind of signature:

    public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) 
    

    So... As Bart said in his video, "where" could be done with SelectMany.

    • public static IEnumerable<TResult> SelectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector);

    I'll try...It will be something like this:

    return source.SelectMany(s => Enumerable.Repeat(s, predicate(s) ? 1 : 0));

    But.... as Ralf said in his video, "where" could be done with Aggregate.

    • public static TAccumulate Aggregate<TSource, TAccumulate>(this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func);

    I'll try...It will be something like this:

    return source.Aggregate((a, s) => predicate(s) ? a.Concat(Enumerable.Repeat(s, 1)):a);

    Now... Can we do Aggregate with SelectMany?

    No? Can you prove it?

    It shouldn't be possible, as monads are about shap, wrap (=bind, here: SelectMany) and roll (here: Aggregate)?

    SelectMany seems to be easy to do with Aggregate:

    return source.Aggregate((a, s) => a.Concat(selector(s)));

    Am I cheating here by using Concat? That will aggregate again inside the same container(monad). Will it use SelectMany inside?

    I could do SelectMany with Aggregate by exiting the monad and go back again (so I will cause the side effect to exit the monad and break the continuation), right? So:

    // SelectMany:
    M<A> -> (A -> M<B>) -> M<B>
    
    // Aggregate:
    M<A> -> B -> (B -> A -> B) -> B
    // Aggregate (if accumulator would be type of M<B>):
    M<A> -> M<B> -> (M<B> -> A -> M<B>) -> M<B>
    // this Aggregate without initial value:
    M<A> -> (M<B> -> A -> M<B>) -> M<B>
    // This is very near to select many, you can handle the extra parameter in the inner func as you like...
    

    So I can use IEnumerable<T> as accumulator (or could even a Func<...>). On the other hand, LINQ-methods are not pure... e.g. SelectMany is not pure monadic bind: it has some overloads.

     

  • User profile image
    JohnAskew

    SelectMany seems to be easy to do with Aggregate:

    1
    return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));

    Am I cheating here by using Concat? Will it use SelectMany inside?

    What is 'selector(s)' here?

    In your post, 'predicate(s)' and 'selector(s)' are placeholders for code that isn't yet written, correct?

     

    return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));
     

    Your seed here is an empty enumerable, so this is equivalent to

    return source.Aggregate((a, s) => selector(s));

    This theoretical 'selector(s)' bothers me. Let's use real code instead to insure we're on the same page and understand each other.

     

    return source.Aggregate((a, s) => s.StartsWith("A"));

    The specified function ('selector(s)') is used to select the result value. The result values are a select-many subset (e.g. StartsWith("A")). These result values are accumulated and returned. There is nothing extra required.

     

    LINQ is a monad. The two basic concepts about a monad is that it's (1) a sequence of operations and (2) there's a context that flows from operation to operation.

    There is a keyword that is used in LINQ that you seem to be seeking to understand. Research 'yield' & 'yield return' to find out how the monad is re-entrant (2) so that select-many operations work with functions such as Aggregate. It is the context that is free to flow from operation to operation which are all embedded in a single call.

    Is this helpful? I'm guessing at what you are looking for here.

    Best wishes.

     

    My favorite online LINQ reference:

    http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b

    definitions:

    http://msdn.microsoft.com/en-us/library/bb397926

  • User profile image
    Thorium

    Maybe I wasn't clear...

    The "predicate(s)" is just anything that holds the definition of the Where predicate:

     Func<TSource, bool> predicate

    The "selector(s)" is just anything that holds the definition of the SelectMany selector:

    Func<TSource, IEnumerable<TResult>> selector

    The concrete implementation doesn't matter.

    ---

    Fold / Aggregate is a kind of visitor pattern to visit every item of the container (monad).

    ---

    The SelectMany (monadic bind) syntax is something like this:

    M<A> -> (A -> M<B>) -> M<B>

    LINQ deals with immutable collections. So M<A> is not altered/transformed. Instead someone will know how to create a new M<B>. And applying this (A->M<B>) will mean that there must be a way to get A from M<A>. That will need some kind of visitor pattern. 

     

  • User profile image
    wkempf

    The implementation for SelectMany using Aggregate would actually be:

    return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));

    The original post had the seed as Enumerable.Empty<TSource>() which isn't correct. Regardless, though, the seed would have to be supplied as our TAccumulate is not the same as TSource.

  • User profile image
    JohnAskew

    , Thorium wrote

    Maybe I wasn't clear...

    The "predicate(s)" is just anything that holds the definition of the Where predicate:

     Func<TSource, bool> predicate

    The "selector(s)" is just anything that holds the definition of the SelectMany selector:

    Func<TSource, IEnumerable<TResult>> selector

    The concrete implementation doesn't matter.

    ---

    Fold / Aggregate is a kind of visitor pattern to visit every item of the container (monad).

    ---

    The SelectMany (monadic bind) syntax is something like this:

    M<A> -> (A -> M<B>) -> M<B>

    LINQ deals with immutable collections. So M<A> is not altered/transformed. Instead someone will know how to create a new M<B>. And applying this (A->M<B>) will mean that there must be a way to get A from M<A>. That will need some kind of visitor pattern. 

     

    You are right. I don't understand the question.

    LINQ iteration over M<A> gives us A.   What is the question? I have no idea what you are asking.

    The 'yield' keyword may be what you are seeking regarding the visitor pattern.

    Are you looking for a way to do a specific task or is this academic research?

    BTW, Aggregate does have another signature which allows a transformation of the resultset.

  • User profile image
    Thorium

    wkempf, right, that was a typo. But as JohnAskew stated out, the initial value can be left away. Thanks for both, I fixed this to the orginal message.

    Videos like this state that "Bind is the mother of all operators". But is that true, if you can do bind with Aggregate (=fold)?

    I'm not looking to make a specific task, I just want to make this clear to me.

     

  • User profile image
    JohnAskew

    I was reading about how Aggregate is a 'fold' operation in functional programming terms.

    Aggregate 'folds' the resultset into a single return value, and it is not the same as a scalar function, from what I've been reading... Now what a 'right fold' really is I can't tell you...

    Another piece of information: Another possible explanation for [LINQ Aggregate] not being called fold is that fold generally implies a recursive operation. Linq is mostly implemented with iterators. This may be something of interest to you. There is no recursive 'fold' operation in LINQ, apparently, Aggregate simulates a fold but is implemented with iterators. Does this mean that you were looking for a true 'fold' operation for a purpose which LINQ is not able to support? I think you're too smart for me.

    Wink

    This article is by Bart De Smet and is worth a look: http://dotnet.dzone.com/news/linq-folding-left-right-and-th

    Here's another nice article about folds in LINQ: http://weblogs.asp.net/podwysocki/archive/2009/02/26/functional-c-fun-with-folds.aspx

  • User profile image
    wkempf

    John, he's not asking for a solution to anything. He's trying to understand the concept of monads. Unfortunately, I'm sitting at about a 70% understanding right now, which is maybe a little less than Thorium already has.

    Thorium, I'm just going to make a guess here. The reality is that you can implement pretty much any part of LINQ given just one or two existing operators. However, from the viewpoint of monads, only SelectMany represents the "bind operation" (from Wikipedia):

    A binding operation of polymorphic type (M t)→(t→M u)→(M u), which Haskell represents by the infix operator >>=. Its first argument is a value in a monadic type, its second argument is a function that maps from the underlying type of the first argument to another monadic type, and its result is in that other monadic type. Typically, the binding operation can be understood as having four stages:

    1. The monad-related structure on the first argument is "pierced" to expose any number of values in the underlying type t.
    2. The given function is applied to all of those values to obtain values of type (M u).
    3. The monad-related structure on those values is also pierced, exposing values of type u.
    4. Finally, the monad-related structure is reassembled over all of the results, giving a single value of type (M u).
  • User profile image
    JohnAskew

    I see a woman in the avatar... a Swedish flag... a reference to an element discovered by a Swede...

    SelectMany

    For a user-provided mapping from collection elements to collections, semantically two steps are performed. First, every element is mapped to its corresponding collection. Second, the result of the first step is flattened by one level. Note: Select and Filter are both implementable in terms of SelectMany, as long as singleton and empty collections are available. The translation rules mentioned above still make it mandatory for a LINQ provider to provide the other two operators.

    http://en.wikipedia.org/wiki/Language_Integrated_Query

    It does not list Aggregate as "implementable in terms of SelectMany".

    Perhaps the projection in step (2) is what Aggregate does not do? It certainly returns a single value...

     

  • User profile image
    Thorium

    The source to try as LINQPad program (sorry for one-liner):

    void Main(){   var l = new[]{new []{1,2,3},new []{4,5,6}};   var res1 = l.SelectMany(s => s.Select(i => i+1));   res1.Dump();      var res2 = l.MySelectMany(s => s.Select(i => i+1));   res2.Dump();}public static class myExtensions{   public static IEnumerable<TResult> MySelectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector){      return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));   }}

     

     

    , JohnAskew wrote

    This article is by Bart De Smet and is worth a look: http://dotnet.dzone.com/news/linq-folding-left-right-and-th

    Here's another nice article about folds in LINQ: http://weblogs.asp.net/podwysocki/archive/2009/02/26/functional-c-fun-with-folds.aspx

    Thanks for the links... Matthew there wrote how to make .Concat() with .Aggregate(), so that is ok to use I guess...

     

  • User profile image
    Thorium

    While Bind (SelectMany) and Fold (Aggregate) are pretty similar, there is a slightly different point of view...

    Matthew's Concat uses recursive aggregate and function as accumulator value.

    • To replace Bind we need many aggregates... (or do we, maybe not?)
    • By default, Aggregate/fold will exit the monad (so we "reify" (=execute), get the side effects and break the continuation)
    • By using func here, we can remain the continuation.
    • But still we kind of exit and enter the monad many times, which could affect the performance.

    Bind is not totally "pure" as outer function will produce set of results but the outer function doesn't actually know (by type syntax) how to combine different results produced by the inner function. So it has to know something about these results that the function syntax won't tell us.

    In theory Bind could match without running through items. Then it would be different. See this short video. Maybe do the concatenation without creating a new instance... (or with immutable data structures this means just using the pointer to old when creating a "new" one). This way we could get e.g. effective parallel execution. I think the current implementations, when they flatten, can reuse items but not the inner collections (/monads).

  • User profile image
    JohnAskew

    Maybe do the concatenation without creating a new instance... (or with immutable data structures this means just using the pointer to old when creating a "new" one).

    @Thorium: Are you working on a Master's thesis or a PhD in Computer Science?

    I'm not wanting to optimize LINQ... but to learn to use it better.

    In the short video, I see how matching 'has to know something about these results that the function syntax won't tell us'. I imagine reflection might be used for this. Regarding memory allocation, it is what it is for me ~ I'm not asking for changes so that LINQ can fold more purely and quickly.

    Surely you are in school somewhere... ?

    Cheers.

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.