Expert to Expert: Erik Meijer and Michael Isard - Inside Dryad

1 hour, 6 minutes, 33 seconds



Microsoft Research recently announced the availability, under Academic Licensing, of Dryad, an infrastructure which allows a programmer to use the resources of a computer cluster or a data center for running data-parallel programs.

A Dryad programmer can use thousands of machines, each of them with multiple processors or cores, without knowing anything about concurrent programming.

That's a pretty heady statement. What does Dryad do, exactly, to enable this level of abstraction, shielding programmers from the incredibly complex world of distributed parallel computing? Does the level of abstraction impact the degree to which sophisticated programmers can interact with and control some of the low level mechanisms of the Dryad runtime? What is it about LINQ that made it the no-brainer managed programming abstraction for Dryad?

Simply, how does Dryad work? This is the core question that Erik and I had after our conversation with Roger Barga (part one of this E2E mini-series on Dryad and DryadLINQ - perhaps we should focus just on DryadLINQ next time, but for now, all the information in this conversation is certain to keep you very busy and answer many questions you may have after learning about Dryad in part one...). 

Lots of whiteboarding here. Put on your thinking caps!



Follow the discussion

  • SoftwareWarrior Software Science

    I really, really like the idea of Dryad. Any idea when we can get our hands on the bits? Not just the client but the whole thing. In academia we are starting to test Hadoop, Eucalyptus, and the like. I would love to add Dryad to the mix. Any chance of this happening in the near future?

  • Charles Welcome Change

    Grab what you need here: http://research.microsoft.com/en-us/downloads/03960cab-bb92-4c5c-be23-ce51aee0792c/default.aspx


  • SoftwareWarrior Software Science

    Thank you Charles, this is going to be a lot of fun!

  • SoftwareWarrior Software Science

    Quick question: is a "Windows HPC Server 2008" cluster the same as a "Windows Server 2008" cluster? If not, can we use a "Windows Server 2008" cluster? For that matter, can we use any Windows machine?

  • Charles Welcome Change

    The requirement is Windows HPC Server 2008. This is a framework and runtime that is for use in an HPC cluster. Good question in terms of whether or not a Windows Server cluster would suffice, but it won't.


  • SoftwareWarrior Software Science

    So close and yet so far. Implementation-wise, vanilla Windows boxes would have more traction. Everyone has Windows boxes around. Thank you all the same.

  • William Stacey (staceyw) Before C# there was darkness...

    @softwarewarrior: "So close and yet so far. Implementation-wise, vanilla Windows boxes would have more traction. Everyone has Windows boxes around."


    I was thinking the same thing. Installing multiple HPC servers is a pretty high bar just to kick the tires, or even for research purposes. I'm not even sure why that is required. Why does it have a dependency on HPC? IMO, it should be able to work on any Windows system (XP and above) by just installing a listener/worker process on the system (even dynamically via rexec.exe) that waits for directions. The high-road solution would be to enlist any or all Windows computers in your org and bring them up and down as needed, creating virtual compute farms.

  • Bass Knows the way the wind is flowing.

    Damn right and license it for non-academia while you are at it. I really think it's incredibly lame to discriminate like that. Fabulous job there on completely alienating commercial developers who need clustering and would love access to something like this. You know, the kind of people that actually make you money in the form of increased Windows sales.


    Yes, it does piss me off. What reaction do you expect when you show me something interesting and make it illegal for me to use it? Gratification? Hell no. This is really so lame.


    I knew I should have stuck with Java. I'd actually have a real clustering solution today (Hadoop). Really, .NET could be so much better if you all stopped with your overzealous software-hoarding mindsets. But Java is looking better every day.

  • Charles Welcome Change

    Good questions, all. You'll soon hear from the people who can best answer them.


  • We are already working with the Dryad team on a version that will not require Windows HPCS. We had to make a decision early in the project whether to leverage HPCS for the initial implementation or base it on Windows and build the necessary scheduler, monitoring utilities, and file metadata management. The decision to start with the HPCS implementation was to get Dryad released ASAP, and we can deploy this on HPCS clusters that we have set up at universities worldwide. And since Windows HPCS is free through the MSDN Academic Alliance, this would not cost our academic partners a dime. All in all, we thought this would be an effective way to get started.


    As for the choice of license, our group in Microsoft Research is responsible for university relations and partnerships with academic researchers. Hence this is the community our small team can support during this initial release. We are working on a broader release under a more permissive license, with the goal of an open source release of Dryad, in collaboration with the MSR-SV team.


    One step at a time...  

  • William Stacey (staceyw) Before C# there was darkness...

    Thanks, Roger. That makes sense. I totally understand this stuff takes time to bake. Nice job, btw.

    Thinking about it, something like Mesh could make a good model of how to add computers into your compute "circle of trust". Also, Mesh already has a host process you could hook onto somehow. Now just tell the Oz-man his world is about to change again :)

  • Bass Knows the way the wind is flowing.

    Thank you for addressing my concern. I'm sorry if I sounded a bit rude, it's just something I am very interested in for my own (non-academic) projects.


    Hopefully it won't be long before I can use this EXTREMELY USEFUL (!!!!) technology in my own projects. The confines of a single computer are becoming really limiting. I really do not want to switch to Java, or have to re-implement my own Dryad/Hadoop instead of focusing on the real problem. So it's quite frustrating.

  • Very exciting, guys. Thanks for your increasing commitment to the academic and open source communities. Hopefully, we'll see an MS-PL license for this work someday, too.


    I do have an (admittedly biased) comment on the recent C9 "many-core" discussions. I really appreciate your encouraging discussion on many-core, Charles, in this interview and others. There seems, however, to be a supposition in the interviews that "one day in the future" we'll have many-core at our disposal. We obviously already have extremely powerful many-core processors in our everyday gaming rigs: NVIDIA and AMD have revolutionized a large set of scientific computing problems, and a cross-platform stream computing language has been ratified and adopted by many of the major players. Stream computing seems to be almost intentionally absent from the discussion. It's especially poignant to me in these two videos when discussing Dryad's capabilities for re-structuring expression tree nodes or pipelining computations based on the skew of the data. Couldn't these strategies also help when targeting architectures such as the GPU (or Cell...I guess)?


    I know you've covered the Microsoft Accelerator project in the distant past, and I know Dryad was built to target systems without shared memory, and I know parts of PFX are built specifically with CLR architectures in mind, and I know current GPU architectures have PCI-Express bandwidth limitations...but Map-Reduce has been shown to map to the GPU quite readily for some applications, the Brahma and C$ languages have shown you can efficiently map high-level lambdas in managed code to GPU code, and DX Compute (though unmanaged) is here.


    So, in the end, I guess my question is: Can we have a C9 discussion about how I might soon write something as "simple" as:


    myVeryLargeArray.AsParallel().OnGPU().Select( x => a*x +b); ?


    Thanks for listening - keep up the great work!!

  • bump - any thoughts?

  • sla

    Thanks, Charles.

    This one is a much better talk about Dryad than the previous one with Roger Barga :) More details, less advertisement.


    Bring us more interesting talks :)
