Cloud Cover Episode 16 - Big Compute with Full Monte

Description

Join Ryan and Steve each week as they cover the Microsoft cloud. You can follow and interact with the show at @cloudcovershow

In this episode:  
  • Discuss the lessons learned in building the HPC-style sample called Full Monte.
  • For 'big compute' apps, learn the gotchas around partitioning work, message overhead, aggregation, and sending results.
  • Listen in on how to effectively use queues in Windows Azure.
  • Discover a tip for keeping your hosted services stable during upgrades.
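The partition/queue/aggregate pattern the episode covers can be sketched in miniature. This is an illustrative fan-out/fan-in example using Python's standard-library queues and threads, not the actual Full Monte code (which uses Windows Azure queues and worker roles); the chunk size, worker count, and stand-in computation are all made up for demonstration.

```python
import queue
import threading

work_queue = queue.Queue()     # plays the role of the Azure work queue
results_queue = queue.Queue()  # plays the role of the results queue

def worker():
    # Each worker drains the queue until it sees a sentinel,
    # pushing a partial result per chunk of work.
    while True:
        chunk = work_queue.get()
        if chunk is None:
            work_queue.task_done()
            break
        partial = sum(x * x for x in chunk)  # stand-in computation
        results_queue.put(partial)
        work_queue.task_done()

# Partition the work: 1..1000 split into chunks of 100
# (the "message size" in queue terms).
data = list(range(1, 1001))
chunk_size = 100
for i in range(0, len(data), chunk_size):
    work_queue.put(data[i:i + chunk_size])

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for _ in threads:
    work_queue.put(None)  # one sentinel per worker
for t in threads:
    t.join()

# Aggregate the partial results into the final answer.
total = sum(results_queue.get() for _ in range(results_queue.qsize()))
print(total)
```

The same gotchas the episode mentions show up even here: pick the chunk size too small and you pay queue overhead per message; too large and a single failed worker loses a big slice of the work.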

Show Links:

Using Affinity Groups in Windows Azure
Enzo SQL Shard (SQL Azure Sharding)
CloudCache (Memcached in Windows Azure)
Windows Azure Training Kit - June Release
Full Monte Sample

The Discussion

  • dsghi

    I'm not sure why, but I chuckle every time I hear "lease the blob"... :)

  • Pablo Marambio

    That was an awesome presentation. I really enjoyed the theory-oriented approach this episode took... I guess I like episodes like this --with less code-- because I can get into the details afterward, if I really need to --thing is, this time I actually had to delve into the details, but I liked being able to choose... long live freedom!

     

    Anyway, great show.  Now, I want to ask where I can find the work-partitioning theory you talked about today... I know it's a pretty common thing in computer science (at least nowadays, when parallelization is more the rule than the choice), but... what book/paper/source would you recommend on this topic?

  • tamerbinto

    Where can I find the code for the Full Monte HPC example?

  • dunnry

    I updated the post above to link to it.  I really should have done that to begin with.  Sorry!

  • dunnry

    Hi Pablo - thanks for watching!  Glad you liked the show.  I haven't seen much in the way of books or papers on partitioning.  What Steve and I were talking about came from experience trying to build this ourselves.  I suspect that how you partition your work will be highly dependent not only on the application, but on the characteristics of the cloud you are working on.  In general, you have to balance your work-to-overhead ratio.  You have conflicting goals in some cases - maximizing available work (and hence parallelization) while minimizing the performance lost to the overhead of managing more work items.

     

    For Full Monte, we literally tested a number of ratios of work to message size to find the right balance.  We did not want to starve roles or lose huge chunks of work in failure, but we also did not want to spend more time on overhead than computation.  Turns out it was a tricky balancing act.
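The tradeoff dunnry describes can be captured in a toy cost model: every queue message carries a fixed overhead, so tiny chunks waste time on messaging while huge chunks limit parallelism and magnify the cost of a lost message. This sketch is purely illustrative; the function name, the cost constants, and the linear cost model are all invented for demonstration and are not from the Full Monte sample.

```python
def estimated_runtime(total_items, chunk_size, workers,
                      per_item_cost=0.001, per_message_overhead=0.05):
    """Toy estimate of total runtime (seconds) for a given chunk size.

    Assumes compute parallelizes perfectly across workers while
    per-message overhead is paid once per queue message.
    """
    messages = -(-total_items // chunk_size)        # ceiling division
    compute = total_items * per_item_cost / workers  # parallel compute time
    overhead = messages * per_message_overhead       # serialized messaging cost
    return compute + overhead

# Sweep chunk sizes to see how the balance shifts, as was done
# empirically for Full Monte (with real measurements, not a model).
for chunk in (10, 100, 1000, 10000):
    print(chunk, round(estimated_runtime(100_000, chunk, workers=8), 2))
```

In this simple model, bigger chunks always look better, which is why the real balancing act also has to weigh the failure cost: the larger the chunk, the more work is lost when a role dies mid-message and the item reappears on the queue.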

     

    Take a look at the code (linked above now) and see how we did it.  That might be a good start.  Thanks again!

     
