Defered evalution is definately a very interesting subject. The work being done with LINQ is along the same lines. The compiler generates an expression tree that can then be passed around as data and transformed before evaluation.
I wonder if its possible to take expression trees generated by LINQ and transform them into parallelisable computations. I suppose it really comes down to "map" and "reduce" functions in the end. Whilst you are kind of limited to pure arithmetic operations in the GPU, the future of multi-cores certainly could widen the scope.
Of course, I can't talk about abstract syntax trees without once again mentioning syntactic macros

It would be interesting to look at using syntactic macros to perform staged computation. I'm sure some of the parallelising of operations can be decided at compile time. That could make for even more performance increases since you can take some weight off the JIT compiler.

Anyway... Awesome work and great video Charles.
BTW: Charles, you need to get a secondary job at MSR being "social glue"! We need to get all these academics down to the bar to mix their ideas.