Frank McSherry: Introduction to Naiad and Differential Dataflow

Naiad is an investigation of data-parallel dataflow computation in the spirit of Dryad and DryadLINQ, but with a focus on incremental computation. Naiad introduces a new computational model, differential dataflow, operating over collections of differences rather than collections of records, and resulting in very efficient implementations of programming patterns that are expensive in existing systems. [Source: Microsoft Research]
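To make "collections of differences" concrete, here is a minimal Python sketch of the general idea only. It is not Naiad's API, and the function name `apply_differences` and the `(record, delta)` representation are assumptions made for illustration. Input changes arrive as signed deltas, and a per-record count is updated by touching only the records that changed, rather than recomputing over the whole collection. It does not attempt the iterative or nested parts of the model.

```python
def apply_differences(counts, differences):
    """Update per-record counts in place from (record, delta) pairs.

    Hypothetical helper: returns only the records whose count changed,
    mapped to their new count (0 means the record disappeared).
    """
    changed = {}
    for record, delta in differences:
        new = counts.get(record, 0) + delta
        if new:
            counts[record] = new
        else:
            # A count of zero means the record is no longer in the collection.
            counts.pop(record, None)
        changed[record] = new
    return changed


counts = {}
# Initial load expressed as differences: every record arrives with +1.
first = apply_differences(counts, [("a", 1), ("b", 1), ("a", 1)])
# A later update: one "a" is retracted and a "c" is added.
second = apply_differences(counts, [("a", -1), ("c", 1)])
print(first, second, counts)
```

The second call does work proportional to the size of the change (two records), not to the size of the accumulated collection, which is the property the description above attributes to operating over differences.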

"Our goal with Naiad was to address one of the recurring requests for systems like Dryad and DryadLINQ, incremental recomputation, but in so doing found that the necessary mechanisms gave rise to a new computational model, differential dataflow, capable of efficiently processing substantially more complex computations than current systems support, namely incremental and arbitrarily nested iterative dataflow computation."

Microsoft Researcher Frank McSherry joins us to discuss what this all means and how it would be useful in the big data problem space (a big problem space...). Demos included, of course.

Resources:

Download Naiad.

Read the Naiad tech report.

Learn more about Naiad on the MSR SVC Big Data blog.

Follow the Discussion

  • contextfree

    looks like a really interesting project, thanks

  • Allan Lindqvist (aL_)

    That's some heavy stuff once you start thinking about it :) naiad + azure = skynet?

    Hope to see more naiad on c9!

  • Can you explain more about the .FixedPoint() linq method?  I found the naiad.pptx presentation with a lot of animations, but I'm unable to follow it without the narrative that I presume went with it.  I'll check out the paper, but if it's anything like this video, perhaps there are some prerequisite materials you can recommend?

    From a practical standpoint, I'm interested in how to integrate this with a datasource like SQL Server for persistence.  To maintain a twitter-like service optimized for individual user/viewer queries, I assume that we'd have the data in an intuitive, application-specific schema of Users, Tweets, and Mentions.  As changes are committed to the persisted data store (SQL Server), we must independently notify the Naiad cluster of the change - and then the Naiad cluster members (or is it just the Controllers?) can service user queries quickly.  Is it advisable to allow this notification to Naiad to come from the database layer rather than from the application layer, using something like SQL Query Notifications?  Or does the high volume of changes that might be expected rule out Query Notifications?  The benefit of being able to listen to SQL Server directly for relevant changes would be relative transparency to existing applications, less chance of data getting out of sync due to a forgotten call to Naiad.

    Alternatively, perhaps Naiad becomes the primary datastore and is augmented to persist its Dataflow dataset in a Naiad-friendly relational schema?
