Microsoft Research recently announced the availability, under Academic Licensing, of Dryad, an infrastructure that lets a programmer use the resources of a computer cluster or a data center to run data-parallel programs. A Dryad programmer can use thousands of machines, each with multiple processors or cores, without knowing anything about concurrent programming.
DryadLINQ is the managed, high-level programming abstraction used to compose the Dryad vertex topology graphs that the Dryad infrastructure uses to partition parallel computations. Here, Erik Meijer and Dryad team member Roger Barga discuss Dryad and DryadLINQ at a high level so that most of us can understand the implications, history and future of Dryad. This is an introductory piece. Erik and I will dive deep into Dryad with one of the scientists behind it in the second part of this Expert to Expert miniseries on Dryad. UPDATE: The Going Deep episode on Dryad is now live.
Enjoy! This is incredible and important technology for simplifying the inherent complexity of distributed computation in the cloud. In essence, DryadLINQ offers a sequential programming experience for code that will execute concurrently across potentially thousands of machines (depending upon the computational complexity of the program). Much to learn here. Channel 9 will help teach.
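To make the "sequential experience, parallel execution" idea concrete, here is a minimal sketch in Python. It is not DryadLINQ (which is a managed .NET/LINQ abstraction) and does not use any Dryad API; a local thread pool stands in for cluster machines, and the names count_words and run_partitioned are hypothetical, chosen only for illustration. The point is that the per-partition logic is ordinary sequential code, while the partitioning and merging happen around it.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(partition):
    """Ordinary sequential code: count the words in a list of lines."""
    return Counter(word for line in partition for word in line.split())


def run_partitioned(lines, num_partitions=4):
    """Hypothetical stand-in for a runtime: split, run in parallel, merge.

    Assumption: threads on one machine play the role of cluster nodes.
    """
    # Split the input into roughly equal partitions (some may be empty).
    partitions = [lines[i::num_partitions] for i in range(num_partitions)]
    # Run the same sequential function on every partition concurrently.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Merge the partial results into a single answer.
    return reduce(lambda a, b: a + b, partial_counts, Counter())


lines = ["dryad runs graphs", "dryadlinq builds graphs", "graphs of vertices"]
sequential = count_words(lines)      # plain single-threaded run
distributed = run_partitioned(lines)  # same logic, partitioned run
```

Both calls produce the same counts; the programmer only wrote count_words, and the partitioned run reuses it unchanged.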
Editorial note: When we discuss native code and the implementation of Dryad, the focus is on DryadLINQ, not the Dryad infrastructure and low-level vertex APIs, which are written in C++. Just to be clear...
Useful links:
Connect site: http://connect.microsoft.com/site/sitehome.aspx?SiteID=891
ER Website on Academic Use: http://research.microsoft.com/en-us/collaboration/tools/dryad.aspx
MSR Info: http://research.microsoft.com/en-us/projects/dryadlinq/