E2E: Brian Beckman and Erik Meijer  Co/Contravariance in Physics and Programming, 3 of n
Description
You learned about Brian Beckman's perspective on covariance and contravariance in physics. Erik Meijer found this topic to be incredibly interesting and the two geniuses decided to take a stab at identifying the relationship between co/contra in two different domains: physics and programming.
What will they discover at the whiteboards?
Tune in to find out in this npart series (part 1 here, part 2 here) with two of Channel 9's and Microsoft's most famous and respected software practitioners. Will there be a part 4? Perhaps you can help Brian and Erik find an answer to this interesting problem. They're real close. Niners can help reach the end line (if there is in fact one). It is highly recommended that you watch the first parts before watching this one!
Thinking caps on? Go!
Download
Right click or Alt+Enter to download this episode
 High Quality MP4 (834.0 MB)
 Mid Quality MP4 (526.3 MB)
 High Quality WMV (816.0 MB)
 MP3 (16.5 MB)
 Low Quality MP4 (340.2 MB)
 Mid Quality WMV (453.5 MB)
The Discussion

You know I saw this coming down in iTunes first and I thought: "it has to be a mistake!"  but apparently not. What a completely wonderful surprise. Let's hope this is an unbounded series...

I think there will be a part 4... The problem is big enough to keep the series alive for a while.
C

Now this is just a wild guess from someone who's extremely novice in Category Theory, but since the functions from P to {X} and from P to {Y} were treated specially, perhaps the functor from slice category Set/{X} to slice category Set/{Y} might be interesting to study?

Downloading this one to get the best of the colourful personalities. 7 minutes ....

*scratches head*
Must repeat the first episodes.
A great show as usual folks.
And: category theorists, show yourselves!
It's a pretty fun concept: realtime* theoretical investigation.
Well, it was real at the time

First of all I would like to state that I am not a physicist and I have almost zero knowledge of cathegory theory, but I wanted to share my view on the issue as I've really found these sessions very inspiring (and also some others, particularly what Bart de Smet is doing ... but I like you too, Brian and Erik ).
I think there is no correspondence between this particular programming and physics sample. Let me explain why.
Programming sample:
We have functions f and g and we are exploring what happens to their composition after applying Select functor.
Here are our types:
g :: A > B
f :: B > C
Select :: IEnumerable<a> > IEnumerable<b>  maybe it should say (a>IE<a>)>(b>IE<b>), but I wanted to capture the essence
xs :: A > IEnumerable<A>  lets assume xs is a monadic constructor or how that thing was called
Composition (f o g) under Select...
xs.Select(x => f(g(x)))  gives us IEnumerable<C> (or C>IE<C>)And our covariance...
xs.Select(x => g(x))  gives us IEnumerable<B> (or B>IE<B>)
.Select(x => f(x))  gives us IEnumerable<C> (or C>IE<C>)
Notice that we are always in IEnumerable<a> monad.

Now, to the physics sample:
There is our manifold P and its coordinatization (hope I spelled it right) cx which goes to our coordinate system {X} (cx: P>{X}). In the programming example image of cx would be equivalent to xs, P would be A, and {X} would be IEnumberable<A>.
Let's denote the image of cx as {X}<P>.
Now, the physics example's covariance and contravariance was about dealing with how functions p and f behave under changing the coordinate system (let's call it ToY:{X}>{Y} and ToX:{Y}>{X}).
But going from {X}<P> to {Y}<P> is fundamentally different than going from IEnumerable<A> to IEnumberable<B> (which is happening in our example). Going from {X}<P> to {Y}<P> is more like (in my opinion) going from IEnumberable<A> to IObservable<A>.
In the physics sample we have transformations ToY :: {X} > {Y} and ToX :: {Y} > {X}. In the programming world we have ToObservable() (~ ToY) and ToEnumerable() (~ ToX). And what you can then explore is how certain functions behave under (after applying) ToObservable() and ToEnumberable(). Unfortunately in this programming example with Select there is no direct correspondence between programming's f and g, and physic's p and f (because in physics P is A. Then we can say that T is B and F is C. Then g_{programming} = p^(1) and f_{programming} = ?). At this moment no example comes into my mind. But...
...the fundametal question is:
Why do we want to change coordinate systems in physics?
I am no physicist, but I believe that the answer is that because sometimes it's easier to calculate stuff in system {Y} instead of in system {X}.
Well, and why do we want to change from IEnumerable to IObservable  because it's better to do some stuff in IO than in IE.
Now this, I think, is a really beautiful and fundamental corrspondence.

Of course might be completely wrong, just wanted to share my 2 cents 
Disclaimer: I'm a software assembler not a programmer, but...
Now that the squiggly arrow from T to {x} is reversed, I’m curious why the straight arrows only have to go from T > P > F where they should be able to be T <> P <> F (Given fuel burn rate and 2 Position calc the Time you get there) which I think should solve:
C(t) =
C(P>T)=
C((P~>{x})>({x}~>T))=
C(P)>C(T)=
Which would now make both calcs covariant?

Go Niners! Go!
Love this!
C

Monadic turbulence in the conceptual boundary layer?
Monad. Monad. Monad.
C

Kyrae  this is a good one and fits my hunch that monads and coordinate systems are similar, that is, the coordinate transforms from {X} to {Y} and back are similar to the monad transformations from IEnumerable to IObservable and back. I'll think about this some more.

N2Cheval  yes, you're right: spotting the "signature" of contravariance is totally sensitive to the directions of the arrows. We will have to do better to nail that down. Too squishy so far But free jazz is like that sometimes.

I don't think the T > IEnumerable<T> is relevant to the physics problem, since there are no untyped values there. my guess is you need something to switch the coordinate systems, something like
FG :: (F A > F B) > (G A > G B)
then you don't need different types of arrows. you can just traverse the blue & green graphs by composing lists of those "monad transformers" ??

Well I think it is relevant, because you need to go from the manifold to the coordinate system. Of course your manifold would be stronly typed (`landmark`) but then again so is xs (where T would be int, string, w/e).
But I really think the important thing is going from {X} to {Y} and back as going from IE to IO.

Glad you like the idea

Maybe dumb commentary, but anyway.
I think that we should operate not on values but on functions, and there should be different type constructors when we're talking about time and fuel.
Something like this, e.g. for fuel CF<P,C(P)> = (P > F) > (C(P) > F) .
Values of such types should describe how function from position to fuel changes when we switch coordinate system from P to C(P)
And then we should prove that corresponding functor is covariant.
Or, probably that's overcomplicated. Maybe type constructor CF<P> = P > F is sufficient. Then to define functor we need mapping CF:: P > (P > F) that is for each coordinate system we should have function from position to fuel. And then we need to show that CF:: (P > CP) > ((P>F) > (CP>F)) is covariant functor. Hm, type parameter P is in input position. There should be contravariance then.

What ever happened to Brian's suprise that was coming a year ago? Still working on it or did something change or did I miss it? tia

It's already running in production. It's just that you can't know what it is yet
In time, grasshopper.
C 
So P is a collection of countries, T is the time it takes to fly there and F is the fuel it takes to fly there and {x} is the coordinate of the country?
I think that what maybe needed is a "coordinization" which takes coordinates from {x} to coordinates in {y} that is to say a functor which takes the function of a>b to a function using their coordinates in x and y. (Otherwise the coordinatization of T and F both to the same coordinate system will obviously just give exactly the same coordinates.)
Roughly speaking...
(function) TimeToFuel= T>F
(function) XcoordsOfTimeToYcoordsOfFuel = xCoordFromTime(T)>yCoordFromFuel(F) = xToYFunctor[ T>F ]
then also the reverse: timeToFuelFunctor[ x>y] = timeFromXCoord(T) > fuelFromYCoord(F)
The coordinization in your video (C) would here be the xToYFunctor.
Something like that anyway.
In code it might be(???) wildy guessing here:
complexVectorToPolarArray = Vector<complex> > Array<polar> = vectorToArray[ complex>polar ]
I give up.

Just another thought but...
I think the diagram doesn't serve you well. How I visualise it would be more like a childs pivoting seesaw. (Can't seem to get a image to insert here.) Where T is on the left end of the horizontal bar and F is on the right and P is an equilateral triange pivot in the middle. So a change in T are two end points in the yaxis and change in F are the same on the right. (think vertical up and down double ended arrows) The change in P though are the two end points of the bottom side of the triangle. (so bigger distance = bigger triangle.)
So with that in mind, covarince would be same change in the yaxis of both T and F due to the change in the height of P (ie. the shorter the distance reduces the time and fuel burn in the same vertical direction) and contravariance would be keeping the same distance P but changing the fuel burn F (or time when you get fast enough) to inversely change time T.
I hope that makes sense. Now to think of how to display the functional code for it... maybe my part 3?!?

I would have to agree that the "coordinization" of P to {x} feels a lot like the case of going from T to IEnumerable<T>. The only problem is that it may be from a collection of instances of T not the actual class of T itself.P is a set of landmarks, which is ordered, discrete, and cannot be used for calculating differentials or integrals. You can however convert each member of P into a LatLongAlt coordinate representation and then fill in the gaps between point using a path approximation function. The path approximation function lets you turn a discrete list of landmarks into a continuous infinite set of points that can be used for calculations.In the programming world T represents a Class, not a set of instances of that class. Once you have a set of ordered, distinct instances of T you could use an IEnumerable to go from the distinct set to a continuous set where interim results between distinct points are calculated using some predefined function that takes the previous and next discrete instances of T as it's input parameters and produces an estimated output value between the to points. The problem is that {x} is an infinite set/function based on time where as IEnumerable is a bounded set with fixed increments of time (ie: each call to Next() corresponds to 1 second)The part that still confuses me is Erik's remark about "Operations on noncoordinated things" and "Operations on coordinated things" being commutable. The coordinated things simply means you have a value (real or computed) for all inputs, which basically means the IEnumerable would always return a value when calling Next() and would never throw an exception. But what is the correlation for operations on noncoordinated (discrete) things in programing? The only operations I could think of are type casting, which take a T and return a new type. If so then you are looking for an operation in physics that takes P and turns it into something else. Does P > F count?
I would have to agree that the "coordinization" of P to {x} feels a lot like the case of going from T to IEnumerable<T>. It may however be from a collection of instances of T not the actual class of T itself. P is a set of landmarks, not the definition or "class type" of a landmark. This is easier to see for the other two sets which have well defined class types, such as F which is a set of real numbers corresponding to fuel consumption (ie: contains instances of the real number class)
P is a set of landmarks, which is ordered, discrete, and cannot be used for calculating differentials or integrals. You can however convert each member of P into a LatLongAlt coordinate representation and then fill in the gaps between points using a path approximation function. The path approximation function lets you turn a discrete list of landmarks into a continuous infinite set of points that can be used for calculations.
In the programming world T represents a Class, not a set of instances of that class. I guess you could view T as the abstract set of all possible instances of T, but that would really be comparable to the domain of all landmarks not the set of specific landmarks described by P in the example. Once you have a set of ordered, distinct instances of T you could use an IEnumerable to go from the distinct set to a continuous set where interim results between distinct points are calculated using some predefined function that takes the previous and next discrete instances of T as it's input parameters and produces an estimated output value between the to points. The problem is that {x} is an infinite set/function based on time where as IEnumerable is a bounded set with fixed increments of time (ie: each call to Next() corresponds to 1 second). You would end up with something like:
var x =PositionFromTime(t); if(x != null){ if(x == EndPosition){ return PositionToCord(x); } else { yield return PositionToCord(x); } } else { yield return ComputeCord(LastPosition, NextPosition, tFromLastPosition); }
The part that still confuses me is Erik's remark about "Operations on noncoordinated things" and "Operations on coordinated things" being commutable. The "coordinated things" simply means you have a value (real or computed) for all inputs, which basically means the IEnumerable would always return a value when calling Next() and would never throw an exception. But what is the correlation for operations on noncoordinated things (discrete instance) in programing? The only operations I could think of are type casting, which take a T and return a new type. If so then you are looking for an operation in physics that takes P and turns it into something else. Does P > F count?
Maybe someone should post this to MathOverflow.net and see if they can come up with anything

I think that there are two different things that got mixed:
1) similarity of definitions of co and contravariance in physics and programming
2) similarity of coordinatization to going to Monad
It should be possible to prove first without considering second. When we do Select on IEnumerable<> we're allowed to change type of element. Coordinatization might be treated similarly to such type change.
If we're talking about second then we probably need to take into account not only coordinatization of positions but also linear approximation needed to make continuous versions of f and p.

Great synergy . Great video.
I've been watching C9 for a long time and a big fan.
if possible, waiting for another E2E with other field area to discuss ( who knows what will get out !?!?! )

I suspect that the "key" of the Beckman category is in the physical domain.
At first sight, the properties of this category seem similar to the Wave/Particle duality (http://en.wikipedia.org/wiki/Wave–particle_duality)
The key => particles are "enumerable" and waves are "observable".
Both domains (the Schrodinger domain for particle and Maxwell for waves) are using the "curl" operator but in a dual (and not compatible) way.
In the physical domain
The only things what we know on a (long living) particle is his spacecoordinate (http://en.wikipedia.org/wiki/Identity_of_indiscernibles and http://en.wikipedia.org/wiki/Indistinguishable_particles) =>
coordinization (first sense) = "position enumeration"
In the "wave domain" the referential are transparent (time relative) => the wave doesn't have a local positionstate but only a timing (for wave events such wave absorption and wave emission).
In the computer domain
=> the root observer is the type "T" (the basic observation = computer clock = now and everywhere)
(composing observers = dividing the clock)
=> the root enumerator is the type "P" (the basic enumeration = here and every time)
(composing enumerator = filling the memory)
The flight path permits to convert "locally" the type T and P and F (but not globally) 
My understanding is that "normal" vectors (such as position and velocity) are said to be "contravariant" with respect to changes in coordinate systems because the coordinates of the vector transform in the opposite direction to the change of basis. Imagine holding a clock face in your hands, with the minute hand representing a vector pointing in a certain direction. Now turn the entire clock face 90 degrees clockwise. This represents a change in basis. In order to adjust the minute hand to point to its original position, it must be rotated counterclockwise by 90 degrees – the change in vector coordinates contravaries with the change in basis.
So now we want to express this idea in terms of functions and types. Following Brian’s idea that “coordinitization” is the functor in this problem:
public class Vector { }
public delegate double[] ToCoords<in TBasis>(Vector vector, TBasis basis);
Here Vector is the type for coordinatefree vectors, and ToCoords maps a vector to a set of vector coordinates given a vector basis (i.e. a coordinate system). This delegate type is contravariant in both the basis and vector arguments.
Now suppose we have a ToCoords<A> for a certain vector basis A. What happens if we apply a transformation from basis A to basis B? We want to compute ToCoords<B> in order to obtain vector coordinates for the new basis. Here’s how ToCoords changes with a change in basis:
public static ToCoords<B> TransformCoords<A, B>(this ToCoords<A> toCoordsA, Func<B, A> basisB2A)
{
return (vector, basis_b) => toCoordsA(vector, basisB2A(basis_b));
}
Notice that in order to go from a ToCoords<A> to a ToCoords<B> we need a function from basis B to basis A. This is the inverse of the A > B transformation we applied to start things off.
Now suppose we apply two transformations in a row, one from A to B and then one from B to C. In order to “back out” the change to the vector coordinates, we need to apply the inverse transformations in reverse order. So the following two expressions are equivalent:
ToCoords<C> tcc1 = tca.TransformCoords<A, C>(x => f(g(x)));
ToCoords<C> tcc2 = tca.TransformCoords(f).TransformCoords(g);
This is the opposite behaviour under composition to that of Enumerable.Select that Erik gave in the video, since Select is covariant in T and ToCoords is contravariant in TBasis.
Now there are some more “exotic” things in physics that are covariant with respect to a change in basis. They are called variously covectors, pseudovectors, dual vectors, row vectors, bivectors (my personal favourite), or even “linear functionals”. The last of these is probably the easiest to explain.
A linear functional is a function that takes a set of vector coordinates and returns a scalar (a number) through a process that looks a lot like a dot product of vectors. It’s easier to explain in code:
public delegate double LinearFunctional<out TBasis>(Vector vector, ToCoords<TBasis> toCoords);
public static LinearFunctional<TBasis> CreateLinearFunctional<TBasis>(TBasis basis, double[] coords)
{
return (vector, toCoords) => toCoords(vector, basis).Zip(coords, (x, y) => x * y).Sum();
}
Instead of providing the linear functional with a set of vector coordinates to operate on, I provided a vector and a way of getting coordinates from the vector in the form of a ToCoords delegate.
Notice that LinearFunctional is covariant in TBasis (i.e. <out TBasis>). This is due to the fact that ToCoords is <in TBasis> and I’m passing one in. Somehow two ins make an out. Maybe Erik can explain that...
So what happens to LinearFunctional<A> under a basis change from A to B? This one writes itself:
public static LinearFunctional<B> TransformLinearFunctional<A, B>(this LinearFunctional<A> lfa, Func<A, B> basisA2B)
{
return (vector, toCoords) => lfa(vector, TransformCoords(toCoords, basisA2B));
}
In order to go from a LinearFunctional<A> to a LinearFunctional<B> we need a function from A to B. This is the opposite of ToCoords. The behaviour under composition is also opposite to ToCoords (and the same as Select). The following expressions are equivalent:
LinearFunctional<C> lfc1 = lfa.TransformLinearFunctional(x => g(f(x)));
LinearFunctional<C> lfc2 = lfa.TransformLinearFunctional(f).TransformLinearFunctional(g);
Now once you get to tensors you can get even more exotic things that have both covariant and contravariant parts. I suspect this is the kind of thing Brian was trying to work with.
What have I missed? Is this too simplistic?
Steve.

I've not read all the comments, but i have a question/suggestion.
May it be that we are missign the coordinization of TIME and FUEL?
Let's make more simple:
10:00 AM : LONDON
11:00 AM : PARIS
12:AM : ROME
In this case TIME is not a "continous variable", it's discrete.
So, to be able to move to algebra to make calculus we need a function to move TIME frome discrete to continous, as we did with the POSTISIONS (cities).
EDIT: to clarify. TIME in the timesheet is only a set of values. We can call them TIME_A, TIME_B, TIME_C. We need to apply a coordinization to TIME and FUEL as well. I think that the misleading is that that function is identity.
So to answer Erik question:
F :: (A > B) > (FA > FB)
i think A > B is the lookup table itself.
Then FA > FB is the function that allows to compute position from time in the algebric way.
In this way F, the coordinization, should be well defined.
P.S.: sorry if i mispelled somthing, but is the first time ever that i talk math in english
P.S.S.: sorry if i wrote a lot of rubbish.

Wow! This is all great stuff! I have been thinking of this in terms of complex maps, where you take the function of something like sin(z) and map it out using the color map explorer. The interesting thing I have been thinking of is that the map is like your corridnate system, each point has an RGB value from 0 to 255. Values on the map can be transformed back to a Real and Imaginary pair for a pixel on the map. Well, if you take the map, and use a pixel shader to transform the RGB values you could then create a pixle shader that transforms the corridanes of your manifold to what ever mapping you like just by supplying the map from RGBx => RGBy. Totally covariant. Nifty thing is you can creat pixel shaders that do all kinds of analysis on the manifold, like a pixelshader that just modifies all the R values in the map, when you do the on a complex z map it produces some very interesting results. I did this with the sin(z) and then just used a pixelshader in photoshop and watch the map change in real time, I could instantly see when the value of R being modified caused the polar shift in the map to be inverse sin(z). I will work on a prototype demo in silverlight to show how this actaully works. The code for the complex mapper I found on The Codeproject: http://www.codeproject.com/KB/recipes/ComplexExplorer.aspx I'm going to integrate this project with an exsisting Silverlight Caculator with graphing: http://blogs.zibmak.net/ If I can wrap my head around the concept of compisition, I will attempt to create an interface where you can dynamically create a covariant pixelshader for math functions and add them to a moniod?!?! Just don't know if I have totally wrapped my head around the concept of moniods and the other one?!?! "maniods?" ~err the one that takes the compisition and applies a functor to it, in this case the funcrot would be a pixelshader that does mapping from RGBx => RRGBy on a manifold expresed as a complex color map.... Am I totally in another ball park? Or is this in line with what this series is attempting to accomplish?

I tried to come up with some comparable examples in code. We have already seen covariance in the shows, but I think what I believe is a contrvariant example is interesting because it has a pattern similar to the physics examples.
// Here is a simple select to get the fuel spent
fuelUsed = times.Select(ToLatLon).Select(FuelSpentAt);// Now suppose the function we have the get the fuel spent needs XYZ
// then we can compose FuelSpentAt with ToXYZ
fuelUsed = times.Select(ToLatLon).Select(latLon => FuelSpentAt(ToXYZ(latLon)));// Now if we separate the composition we see it is covariant (yes this was already covered)
fuelUsed = times.Select(ToLatLon).Select(ToXYZ).Select(FuelSpentAt);// FuelSpentAt • ToXYZ • ToLatLon
IEnumerable<XYZ> orderedXYZs;//IEnumerable<LatLon> orderedLatLon;
// Now lets say we want to get the XYZs ordered by the rate that fuel was being consumed at that point. This is pretty straight forward with a function that takes XYZ and returns a rate.
orderedXYZs = times.Select(ToXYZ).OrderBy(FuelRateAt);
// Suppose though we only have a FuelRateAt function that takes LatLon in this case we can compose ToLatLon with FuelRateAt inside the OrderBy
orderedXYZs = times.Select(ToXYZ).OrderBy(xyz => FuelRateAt(ToLatLon(xyz)));// If we separate the composition we get the following. Now I beliebe OrderBy is contravariant since IComparable is contravariant, but I might be wrong. But the interesting thing is that this pattern is like the pattern Brian Beckman had in the physics example
orderedXYZs = times.Select(ToXYZ).Select(ToLatLon).OrderBy(FuelRateAt).Select(ToXYZ);
// ToXYZ • FuelRateAt • ToLatLon • ToXYZ
// Note the inverse (ToXYZ) is composed to the left of the FuelRateAt function

the whole thing answered in a tweet :)
https://twitter.com/TopologyFact/status/944239945148960773 
Studying this in 2018, 8 years after the series was created, and just read Brian's last comment. This was all brilliant and inspiring. Thanks all.

Attached lecture by Bartosz Milewski builds up to the definition of co and contravariance in category theory at the end of the lecture. Thirty minutes, highly recommended, worthy of your time: https://www.youtube.com/watch?v=TtvVHokhSoM