Tech Off Thread
12 postsForum Read Only
This forum has been made read only by the site admins. No new threads or comments can be added.
LINQ Where; fold vs bind...

Hello!
This post refers to some Going Deep videos...
Greg told in his video about shape, wrap and roll.
LINQ Select many is "wrap" and Aggregate is "roll", right?
LINQ Where have this kind of signature:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
So... As Bart said in his video, "where" could be done with SelectMany.
 public static IEnumerable<TResult> SelectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector);
I'll try...It will be something like this:
return source.SelectMany(s => Enumerable.Repeat(s, predicate(s) ? 1 : 0));
But.... as Ralf said in his video, "where" could be done with Aggregate.
 public static TAccumulate Aggregate<TSource, TAccumulate>(this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func);
I'll try...It will be something like this:
return source.Aggregate((a, s) => predicate(s) ? a.Concat(Enumerable.Repeat(s, 1)):a);
Now... Can we do Aggregate with SelectMany?
No? Can you prove it?
It shouldn't be possible, as monads are about shap, wrap (=bind, here: SelectMany) and roll (here: Aggregate)?
SelectMany seems to be easy to do with Aggregate:
return source.Aggregate((a, s) => a.Concat(selector(s)));
Am I cheating here by using Concat? That will aggregate again inside the same container(monad). Will it use SelectMany inside?
I could do SelectMany with Aggregate by exiting the monad and go back again (so I will cause the side effect to exit the monad and break the continuation), right? So:
// SelectMany: M<A> > (A > M<B>) > M<B> // Aggregate: M<A> > B > (B > A > B) > B // Aggregate (if accumulator would be type of M<B>): M<A> > M<B> > (M<B> > A > M<B>) > M<B> // this Aggregate without initial value: M<A> > (M<B> > A > M<B>) > M<B> // This is very near to select many, you can handle the extra parameter in the inner func as you like...
So I can use IEnumerable<T> as accumulator (or could even a Func<...>). On the other hand, LINQmethods are not pure... e.g. SelectMany is not pure monadic bind: it has some overloads.

SelectMany seems to be easy to do with Aggregate:
1return
source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));
Am I cheating here by using Concat? Will it use SelectMany inside?
What is 'selector(s)' here?
In your post, 'predicate(s)' and 'selector(s)' are placeholders for code that isn't yet written, correct?
return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));
Your seed here is an empty enumerable, so this is equivalent to
return source.Aggregate((a, s) => selector(s));
This theoretical 'selector(s)' bothers me. Let's use real code instead to insure we're on the same page and understand each other.
return source.Aggregate((a, s) => s.StartsWith("A"));
The specified function ('selector(s)') is used to select the result value. The result values are a selectmany subset (e.g. StartsWith("A")). These result values are accumulated and returned. There is nothing extra required.
LINQ is a monad. The two basic concepts about a monad is that it's (1) a sequence of operations and (2) there's a context that flows from operation to operation.
There is a keyword that is used in LINQ that you seem to be seeking to understand. Research 'yield' & 'yield return' to find out how the monad is reentrant (2) so that selectmany operations work with functions such as Aggregate. It is the context that is free to flow from operation to operation which are all embedded in a single call.
Is this helpful? I'm guessing at what you are looking for here.
Best wishes.
My favorite online LINQ reference:
http://code.msdn.microsoft.com/101LINQSamples3fb9811b
definitions:

Maybe I wasn't clear...
The "predicate(s)" is just anything that holds the definition of the Where predicate:
Func<TSource, bool> predicate
The "selector(s)" is just anything that holds the definition of the SelectMany selector:
Func<TSource, IEnumerable<TResult>> selector
The concrete implementation doesn't matter.

Fold / Aggregate is a kind of visitor pattern to visit every item of the container (monad).

The SelectMany (monadic bind) syntax is something like this:
M<A> > (A > M<B>) > M<B>
LINQ deals with immutable collections. So M<A> is not altered/transformed. Instead someone will know how to create a new M<B>. And applying this (A>M<B>) will mean that there must be a way to get A from M<A>. That will need some kind of visitor pattern.

The implementation for SelectMany using Aggregate would actually be:
return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s)));
The original post had the seed as Enumerable.Empty<TSource>() which isn't correct. Regardless, though, the seed would have to be supplied as our TAccumulate is not the same as TSource.

27 minutes ago, Thorium wrote
Maybe I wasn't clear...
The "predicate(s)" is just anything that holds the definition of the Where predicate:
Func<TSource, bool> predicate
The "selector(s)" is just anything that holds the definition of the SelectMany selector:
Func<TSource, IEnumerable<TResult>> selector
The concrete implementation doesn't matter.

Fold / Aggregate is a kind of visitor pattern to visit every item of the container (monad).

The SelectMany (monadic bind) syntax is something like this:
M<A> > (A > M<B>) > M<B>
LINQ deals with immutable collections. So M<A> is not altered/transformed. Instead someone will know how to create a new M<B>. And applying this (A>M<B>) will mean that there must be a way to get A from M<A>. That will need some kind of visitor pattern.
You are right. I don't understand the question.
LINQ iteration over M<A> gives us A. What is the question? I have no idea what you are asking.
The 'yield' keyword may be what you are seeking regarding the visitor pattern.
Are you looking for a way to do a specific task or is this academic research?
BTW, Aggregate does have another signature which allows a transformation of the resultset.

wkempf, right, that was a typo. But as JohnAskew stated out, the initial value can be left away. Thanks for both, I fixed this to the orginal message.
Videos like this state that "Bind is the mother of all operators". But is that true, if you can do bind with Aggregate (=fold)?
I'm not looking to make a specific task, I just want to make this clear to me.

I was reading about how Aggregate is a 'fold' operation in functional programming terms.
Aggregate 'folds' the resultset into a single return value, and it is not the same as a scalar function, from what I've been reading... Now what a 'right fold' really is I can't tell you...
Another piece of information: Another possible explanation for [LINQ Aggregate] not being called fold is that fold generally implies a recursive operation. Linq is mostly implemented with iterators. This may be something of interest to you. There is no recursive 'fold' operation in LINQ, apparently, Aggregate simulates a fold but is implemented with iterators. Does this mean that you were looking for a true 'fold' operation for a purpose which LINQ is not able to support? I think you're too smart for me.
This article is by Bart De Smet and is worth a look: http://dotnet.dzone.com/news/linqfoldingleftrightandth
Here's another nice article about folds in LINQ: http://weblogs.asp.net/podwysocki/archive/2009/02/26/functionalcfunwithfolds.aspx

John, he's not asking for a solution to anything. He's trying to understand the concept of monads. Unfortunately, I'm sitting at about a 70% understanding right now, which is maybe a little less than Thorium already has.
Thorium, I'm just going to make a guess here. The reality is that you can implement pretty much any part of LINQ given just one or two existing operators. However, from the viewpoint of monads, only SelectMany represents the "bind operation" (from Wikipedia):
A binding operation of polymorphic type (M t)→(t→M u)→(M u), which Haskell represents by the infix operator
>>=
. Its first argument is a value in a monadic type, its second argument is a function that maps from the underlying type of the first argument to another monadic type, and its result is in that other monadic type. Typically, the binding operation can be understood as having four stages: The monadrelated structure on the first argument is "pierced" to expose any number of values in the underlying type t.
 The given function is applied to all of those values to obtain values of type (M u).
 The monadrelated structure on those values is also pierced, exposing values of type u.
 Finally, the monadrelated structure is reassembled over all of the results, giving a single value of type (M u).

I see a woman in the avatar... a Swedish flag... a reference to an element discovered by a Swede...
SelectMany
Further information: Bind (higherorder function)For a userprovided mapping from collection elements to collections, semantically two steps are performed. First, every element is mapped to its corresponding collection. Second, the result of the first step is flattened by one level. Note: Select and Filter are both implementable in terms of SelectMany, as long as singleton and empty collections are available. The translation rules mentioned above still make it mandatory for a LINQ provider to provide the other two operators.
http://en.wikipedia.org/wiki/Language_Integrated_Query
It does not list Aggregate as "implementable in terms of SelectMany".
Perhaps the projection in step (2) is what Aggregate does not do? It certainly returns a single value...

The source to try as LINQPad program (sorry for oneliner):
void Main(){ var l = new[]{new []{1,2,3},new []{4,5,6}}; var res1 = l.SelectMany(s => s.Select(i => i+1)); res1.Dump(); var res2 = l.MySelectMany(s => s.Select(i => i+1)); res2.Dump();}public static class myExtensions{ public static IEnumerable<TResult> MySelectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector){ return source.Aggregate(Enumerable.Empty<TResult>(), (a, s) => a.Concat(selector(s))); }}
3 hours ago, JohnAskew wrote
This article is by Bart De Smet and is worth a look: http://dotnet.dzone.com/news/linqfoldingleftrightandth
Here's another nice article about folds in LINQ: http://weblogs.asp.net/podwysocki/archive/2009/02/26/functionalcfunwithfolds.aspx
Thanks for the links... Matthew there wrote how to make .Concat() with .Aggregate(), so that is ok to use I guess...

While Bind (SelectMany) and Fold (Aggregate) are pretty similar, there is a slightly different point of view...
Matthew's Concat uses recursive aggregate and function as accumulator value.
 To replace Bind we need many aggregates... (or do we, maybe not?)
 By default, Aggregate/fold will exit the monad (so we "reify" (=execute), get the side effects and break the continuation)
 By using func here, we can remain the continuation.
 But still we kind of exit and enter the monad many times, which could affect the performance.
Bind is not totally "pure" as outer function will produce set of results but the outer function doesn't actually know (by type syntax) how to combine different results produced by the inner function. So it has to know something about these results that the function syntax won't tell us.
In theory Bind could match without running through items. Then it would be different. See this short video. Maybe do the concatenation without creating a new instance... (or with immutable data structures this means just using the pointer to old when creating a "new" one). This way we could get e.g. effective parallel execution. I think the current implementations, when they flatten, can reuse items but not the inner collections (/monads).

Maybe do the concatenation without creating a new instance... (or with immutable data structures this means just using the pointer to old when creating a "new" one).
@Thorium: Are you working on a Master's thesis or a PhD in Computer Science?
I'm not wanting to optimize LINQ... but to learn to use it better.
In the short video, I see how matching 'has to know something about these results that the function syntax won't tell us'. I imagine reflection might be used for this. Regarding memory allocation, it is what it is for me ~ I'm not asking for changes so that LINQ can fold more purely and quickly.
Surely you are in school somewhere... ?
Cheers.
Conversation locked
This conversation has been locked by the site admins. No new comments can be made.