Frank McSherry: Introduction to Naiad and Differential Dataflow
- Posted: Nov 08, 2012 at 10:29 AM
- 25,705 Views
- 3 Comments
Naiad is an investigation of data-parallel dataflow computation in the spirit of Dryad and DryadLINQ, but with a focus on incremental computation. Naiad introduces a new computational model, differential dataflow, operating over collections of differences rather than collections of records, and resulting in very efficient implementations of programming patterns that are expensive in existing systems. [Source: Microsoft Research]
"Our goal with Naiad was to address one of the recurring requests for systems like Dryad and DryadLINQ, incremental recomputation, but in so doing found that the necessary mechanisms gave rise to a new computational model, differential dataflow, capable of efficiently processing substantially more complex computations than current systems support, namely incremental and arbitrarily nested iterative dataflow computation."
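To make "collections of differences" concrete, here is a minimal sketch in Python. It is not Naiad's API or implementation: the `IncrementalCount` class and its `update` method are hypothetical names invented for illustration. The sketch assumes each change arrives as a `(record, delta)` pair, where `+1` adds a record and `-1` removes it, and shows an operator that emits output differences only for the keys an update actually touches, rather than recomputing every count.

```python
from collections import Counter


class IncrementalCount:
    """Toy operator (hypothetical, not part of Naiad): keeps a count per key
    and reports changes to those counts as differences."""

    def __init__(self, key):
        self.key = key            # function mapping a record to its key
        self.counts = Counter()   # current count for each key

    def update(self, differences):
        """Consume (record, delta) pairs; return (output_record, delta) pairs."""
        old_counts = {}
        for record, delta in differences:
            k = self.key(record)
            if k not in old_counts:
                old_counts[k] = self.counts[k]  # count before this batch
            self.counts[k] += delta

        output = []
        for k, old in old_counts.items():
            new = self.counts[k]
            if new != old:
                if old != 0:
                    output.append(((k, old), -1))  # retract the stale result
                if new != 0:
                    output.append(((k, new), +1))  # assert the new result
            if new == 0:
                del self.counts[k]
        return output


# Count outgoing edges per node in a small graph.
out_degree = IncrementalCount(key=lambda edge: edge[0])
first = out_degree.update([(("a", "b"), +1), (("a", "c"), +1), (("b", "c"), +1)])
second = out_degree.update([(("a", "c"), -1)])  # remove a single edge
print(first)
print(second)
```

Removing one edge in the second call produces work only for node `a`; node `b` is never revisited. The model described above goes much further (nested iteration in particular), which this toy does not attempt to capture.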
Microsoft Researcher Frank McSherry joins us to discuss what this all means and how it would be useful in the big data problem space (a big problem space...). Demos included, of course.
Resources:
looks like a really interesting project, thanks
That's some heavy stuff once you start thinking about it
naiad + azure = skynet?
Hope to see more Naiad on C9!
Can you explain more about the .FixedPoint() LINQ method? I found the naiad.pptx presentation with a lot of animations, but I'm unable to follow it without the narrative that I presume went with it. I'll check out the paper, but if it's anything like this video, perhaps there are some prerequisite materials you can recommend?
From a practical standpoint, I'm interested in how to integrate this with a data source like SQL Server for persistence. To maintain a Twitter-like service optimized for individual user/viewer queries, I assume that we'd have the data in an intuitive, application-specific schema of Users, Tweets, and Mentions. As changes are committed to the persisted data store (SQL Server), we must independently notify the Naiad cluster of the change, and then the Naiad cluster members (or is it just the Controllers?) can service user queries quickly. Is it advisable to allow this notification to Naiad to come from the database layer rather than from the application layer, using something like SQL Query Notifications? Or does the high volume of changes that might be expected rule out Query Notifications? The benefit of being able to listen to SQL Server directly for relevant changes would be relative transparency to existing applications and less chance of data getting out of sync due to a forgotten call to Naiad.
Alternatively, perhaps Naiad becomes the primary datastore and is augmented to persist its Dataflow dataset in a Naiad-friendly relational schema?