<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" media="screen" href="/styles/xslt/rss.xslt"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:c9="http://channel9.msdn.com">
<channel>
	<title>Comment Feed for Channel 9 - Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
	<atom:link rel="self" type="application/rss+xml" href="http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/RSS"></atom:link>
	<image>
		<url>http://ecn.channel9.msdn.com/o9/previewImages/100/249611_100x75.jpg</url>
		<title>Channel 9 - Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<link></link>
	</image>
	<description>Burton Smith is&amp;nbsp;a Technical Fellow at Microsoft who thinks about ways in which our platform needs to be structured to support general purpose computers that will soon have
 clustered super computer processing power as we move closer to manycore everywhere (not too far off into the future...). Burton is a parallel computing expert, an industry thought leader in high performance, massively parallel distributed (aka super)&amp;nbsp;computing.
 Winner of the Seymour Cray Computer Engineering Award, Burton knows a thing or two about how to architect and implement software systems that can succeed in the Age of Manycore.
This is a long and great&amp;nbsp;conversation, unedited of course. You&#39;ll want to make some time for this and listen carefully to what Burton says. This is a very important general introduction to parallelism and high performance computing.&amp;nbsp;As always, we can&#39;t talk
 about super computing without&amp;nbsp;addressing&amp;nbsp;programming language evolution in the context of manycore (you&#39;ve seen this quite a bit on C9 over the years). We cover a lot of ground here including Burton&#39;s insights into&amp;nbsp;functional programming, transactions, compatibility,
 shared mutable state, operating systems, technical redundancy and the role of Technical Fellows in the post-Bill era. Enjoy this great introduction to&amp;nbsp;parallelism and the future&amp;nbsp;of our platform technologies and tools as we head into the age of manycore. This is the first in a series of several interviews covering parallel computing and Microsoft&#39;s Parallel Computing Platform
 technologies, specifically. Low res file for the bandwidth-challenged.
</description>
	<link></link>
	<language>en</language>
	<pubDate>Sun, 19 May 2013 01:11:38 GMT</pubDate>
	<lastBuildDate>Sun, 19 May 2013 01:11:38 GMT</lastBuildDate>
	<generator>Rev9</generator>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[I'm such a pedant, but &quot;Low res file for the bandwidth challenged&quot; means that the file was challenged, not that the people are bandwidth-challenged. You need a hyphen or the entire sentence means something else. (I sat here for a minute wondering why the
 hell someone would challenge the file)<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385198790000000</link>
		<pubDate>Wed, 13 Feb 2008 17:17:59 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385198790000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">I'm such a pedant, but &quot;Low res file for the bandwidth challenged&quot; means that the file was challenged, not that the people are bandwidth-challenged. You need a hyphen or the entire sentence means something else. (I sat here for a minute
 wondering why the hell someone would challenge the file)<br /></div>
</blockquote>
<br /><br />Fixed. Now, watch the interview. <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-1.gif' alt='Smiley' /><br />C<p>posted by Charles</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385199900000000</link>
		<pubDate>Wed, 13 Feb 2008 17:19:50 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385199900000000</guid>
		<dc:creator>Charles</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[You might want to run a spell checker over the accompanying text, hint &quot;clustered&quot; <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-5.gif' alt='Wink' /><br /><br />Great video though!<p>posted by tomkirbygreen</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385249660000000</link>
		<pubDate>Wed, 13 Feb 2008 18:42:46 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385249660000000</guid>
		<dc:creator>tomkirbygreen</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[I think most of us managed to get the meaning. Anyhow - excellent video! <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-1.gif' alt='Smiley' /><p>posted by esoteric</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385286830000000</link>
		<pubDate>Wed, 13 Feb 2008 19:44:43 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385286830000000</guid>
		<dc:creator>esoteric</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Good to hear from a fellow HPC geek;)<br /><br />I'm not sure if silos have to be eliminated in quite the way that Burton mentions. Sometimes (in my experience most of the time) one group needs a component before another group will be ready to deliver it, even if they have the &quot;winning solution&quot;. Say for example
 the winner is in the middle of a reliability scrum so&nbsp;they aren't going to be coding anything for the next 1-2 weeks. That will push any component of any meaningful size off for at least a month.<br /><br />In my view it isn't so much picking the winner up front,&nbsp;but getting a rough API design and letting whoever needs it first code it. Then everyone needs to know that feature has been made so when they run into a need for it they can &quot;cut and paste&quot; it into their
 app. Otherwise what happens is you pick a winner and the other projects get a stop sign placed on their Gantt charts at the point where the project leaders can deliver that component.<br /><br />I would like to see a little bit of what I'd like to term &quot;architect in a cloud&quot; where an architect isn't tasked to a particular product group but is free to roam&nbsp;(Bill Gates probably has demonstrated that role best in the past). That way if they have the winning
 idea but another group than the one they currently are in has spare dev resources or an earlier dependence on the component the architect can float on over to that group and lead the project rather than wait for the resources to be free in his department or
 steal the resources from another department.<br /><br />Three arguments for it:<br /><br />1) It is easier to move one person than a dev team.<br />2) Architects spending more time circulating about will have a deeper grasp of the company and a better idea of who the &quot;thought leaders&quot; are in a bunch of different niches.<br />3) Components that are depended on by multiple projects get started earlier.<p>posted by deltalmg911</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385345010000000</link>
		<pubDate>Wed, 13 Feb 2008 21:21:41 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385345010000000</guid>
		<dc:creator>deltalmg911</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Just got round to watching it, and I'm impressed. Burton mentions a lot of very important things (like that functional programs can be easily serialized) and goes into a lot of depth here about why functional-geeks care about functional languages.<br /><br />Can I just help to clarify for some people - when Burton goes on about everything in functional languages being constant and making things out of these constants, a much better way of putting this is that they are composed of functions which are
<i>constant with respect to their arguments</i>, and this allows for more benefits than just compile-time (and just-in-time) compilation for parallelisation - it also has serious impacts as to how you compute them, so it's
<i>even more </i>important than he was suggesting.<br /><br />Anyway, this was an excellent video, and I hope to see more where this came from in the near future! Good work Burton! (and Charles, yeah. shouldn't forget him <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-4.gif' alt='Tongue Out' />)<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385381730000000</link>
		<pubDate>Wed, 13 Feb 2008 22:22:53 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385381730000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Great video! Reassuring! He seems to really &quot;get it&quot; (IMO) so it's nice to know that someone like that is walking the halls and nudging things in the right direction.<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385384990000000</link>
		<pubDate>Wed, 13 Feb 2008 22:28:19 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385384990000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[SQL is a fine example of a functional language that most of us are used to already... but even it can become complex when considering a database getting hit by many users simultaneously with row/table locks and buffering modes. It doesn't really solve the
 race condition that crops up in parallel computing.<p>posted by Dark_Halmut</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385521950000000</link>
		<pubDate>Thu, 14 Feb 2008 02:16:35 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385521950000000</guid>
		<dc:creator>Dark_Halmut</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">Dark_Halmut wrote:</div>
<div class="quoteBody">SQL is a fine example of a functional language that most of us are used to already... but even it can become complex when considering a database getting hit by many users simultaneously with row/table locks and buffering modes. It doesn't
 really solve the race condition that crops up in parallel computing.</div>
</blockquote>
<br /><br />The bit where the locks are is where the functional bit (SQL) is put down in state - i.e. where SQL stops being functional (storing things is a non-functional thing to do). Of course, if we didn't do that bit, it would never bother to store the data (which
 would suck as databases go). If your database was immutable (i.e. you never wrote to it), SQL would be purely functional, and achieves massive parallelism.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385567030000000</link>
		<pubDate>Thu, 14 Feb 2008 03:31:43 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385567030000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>Now the spelling is ok, the download link doesn't work. <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-7.gif' alt='Perplexed' /></p>
<p>posted by klinkby</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385730460000000</link>
		<pubDate>Thu, 14 Feb 2008 08:04:06 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385730460000000</guid>
		<dc:creator>klinkby</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[So Excel spreadsheets/SQL are functional programming. Indisputable, but I'm ashamed to admit that I never thought of it that way.<br /><br />Burton has been around for about 2 years, how many other Technical Fellows have <i>
not </i>been on Channel 9 (consensual of course)? Please try to get as many as possible, irrespective of their field, for a session. The richness and depth of experience these 'Fellows' have means hearing from them is like receiving 'Manna From Heaven'.<br /><br /><br />It's good to hear from Microsofties, and non-Microsofties like Gilad Bracha for equilibrium. Bravo!<br /><p>posted by vesuvius</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385742050000000</link>
		<pubDate>Thu, 14 Feb 2008 08:23:25 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385742050000000</guid>
		<dc:creator>vesuvius</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">If your database was immutable (i.e. you never wrote to it), SQL would be purely functional, and achieves massive parallelism.</div>
</blockquote>
<br />And be nowhere near as useful as a database where you could change data.<p>posted by JChung2006</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385785950000000</link>
		<pubDate>Thu, 14 Feb 2008 09:36:35 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385785950000000</guid>
		<dc:creator>JChung2006</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">klinkby wrote:</div>
<div class="quoteBody">
<p>Now the spelling is ok, the download link doesn't work. <img src="/emoticons/emotion-7.gif" border="0"></p>
</div>
</blockquote>
<br /><br />The link is not broken.<br />C<p>posted by Charles</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385938900000000</link>
		<pubDate>Thu, 14 Feb 2008 13:51:30 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385938900000000</guid>
		<dc:creator>Charles</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[It was a super pleasure to listen to the thoughts &amp; techniques and the way things have evolved in the Parallel WORLD.....by Burton Smith..<br />Especially about how we can make Functional languages to truly evolve by adding sort of Transaction thing...( which has been well done in data world )<br /><br />It was super great ...&amp; just put his name in ur pocket Charles, so that we can have more of him in future...and maybe if u can find more architects ..., then it will be super awesome..<br /><br />F# &amp; talk about Don Syme is also nice ..., who built generics for .NET<br />F# is a really super cool language and I have started learning it, it's just awesome and as a Programmer you really get to learn something that really develops you..in back...<br /><br />Once again thx to Burton Smith for such an awesome&nbsp;talk..<br /><br />cheers Charles..<br /><br /><p>posted by gaurav.net</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385949270000000</link>
		<pubDate>Thu, 14 Feb 2008 14:08:47 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633385949270000000</guid>
		<dc:creator>gaurav.net</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[The interview is memorable because of Burton's ability to express complex issues in a simple way. Like others, I like the linking of Excel/SQL/deferred constants to functional programming.<br /><p>posted by BSalita</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386028930000000</link>
		<pubDate>Thu, 14 Feb 2008 16:21:33 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386028930000000</guid>
		<dc:creator>BSalita</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>So Charles, just why is David Cutler so reluctant to appear on C9? Are we so monstrous? <img src='http://ecn.channel9.msdn.com/o9/content/images/emoticons/emotion-5.gif' alt='Wink' /></p>
<p>Also, would it be possible to include links to papers or people referred to in your interviews? I often find myself hitting pause and web-grepping for the resource in question, it would be great to have them up front.</p>
<p>posted by tomkirbygreen</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386116590000000</link>
		<pubDate>Thu, 14 Feb 2008 18:47:39 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386116590000000</guid>
		<dc:creator>tomkirbygreen</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">gaurav.net wrote:</div>
<div class="quoteBody"><br />Especially about how we can make Functional languages to truly evolve by adding sort of Transaction thing...( which has been well done in data world )<br /></div>
</blockquote>
<br /><br />You may be interested to hear that Haskell has had transactions since 2005. In fact, there's a channel 9 video about them with Simon Peyton Jones and Tim Harris (though it's more general, and not specifically about their Haskell implementation).<br /><br />And yes, it does indeed work very well. For most cases you would try data parallelism, if that doesn't work you'd try task parallelism, if that doesn't work you would try threads and message passing, and if that doesn't work either you could try using shared memory
 and then the transactions are really helpful. They round off the &quot;portfolio&quot; of strategies to use when doing parallel programming nicely, as there are in practice a few things where we really need a large shared data set that can be modified by thousands of
 agents at once without having to do coarse-grained locking as that would cripple performance (consider a game, for example).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386119670000000</link>
		<pubDate>Thu, 14 Feb 2008 18:52:47 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386119670000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Nice.&nbsp; Thanks C and Burton.&nbsp; Top video.<p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386316090000000</link>
		<pubDate>Fri, 15 Feb 2008 00:20:09 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386316090000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>gaurav.net wrote:</strong>
<hr size="1">
<i><br />Especially about how we can make Functional languages to truly evolve by adding sort of Transaction thing...( which has been well done in data world )<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />You may be interested to hear that Haskell has had transactions since 2005. In fact, there's a channel 9 video about them with Simon Peyton Jones and Tim Harris (though it's more general, and not specifically about their Haskell implementation).<br /></div>
</blockquote>
<br /><br />Just checking out that video of Peyton Jones and Tim Harris...<br />Thx for the info...<br /><br />This is very interesting and the languages like F# also popping up...<p>posted by gaurav.net</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386710760000000</link>
		<pubDate>Fri, 15 Feb 2008 11:17:56 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386710760000000</guid>
		<dc:creator>gaurav.net</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Excellent nothing. That was a brilliant video.<br /><br />Thank you Burton, and thank you Charles.<br /><br /><br />PS: As for Dave Cutler. I certainly don't want to disrespect him in any way, but my mischievous side just can't resist: My take is that he spends a lot of his time in the primitive world of the Operating System Kernel, etc, and has thus adopted primitive beliefs
 about Cameras/Pictures stealing people's souls... Either that or he can't trust himself to keep a secret...[A]<br /><br /><br /><p>posted by RichardRudek</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386716060000000</link>
		<pubDate>Fri, 15 Feb 2008 11:26:46 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386716060000000</guid>
		<dc:creator>RichardRudek</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>Glad you liked this conversation. Burton's a very friendly and engaging person.</p>
<p>Re: Cutler, he's just not interested in this type of thing (being &quot;interviewed&quot; - these are conversations, really, not
<em>interviews</em> per se...). There's nothing more to it...</p>
<p>I'll try and provide more links to content mentioned in conversations going forward. Thanks for the feedback.</p>
<p>C</p>
<p>posted by Charles</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386759030000000</link>
		<pubDate>Fri, 15 Feb 2008 12:38:23 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386759030000000</guid>
		<dc:creator>Charles</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Hi Charles,<br /><br />Thanks to you and Burton Smith for the conversation. I would like to hear more about anything at Microsoft having to do with isolation via message passing and capability security, similar to the E language:<br /><br /><a href="http://www.erights.org/">http://www.erights.org/</a><br /><br />I know we have CCR, but the syntax is clumsy, messaging does not have first class status, nor is there any isolation. What kind of similar things to E are happening at Microsoft, if any?<br /><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /><br />Thanks,<br />Frank<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386817180000000</link>
		<pubDate>Fri, 15 Feb 2008 14:15:18 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633386817180000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody"><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /></div>
</blockquote>
<br /><br />Why not have both?<br />In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects modify the data). Do you really want each to pass through a protocol with
 just a single thread &quot;owning&quot; that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without
 interfering with the execution of any other threads.<br /><br />I agree that message passing is ideal <i>when suitable</i>, but sometimes it just isn't. Also, while threads are sometimes excellent abstractions (even for things where you *don't* care about concurrency they may be the right model), sometimes they just suck.
 They basically suffer from the same problem that &quot;goto&quot; does: obfuscation of the structure of the program (you have to jump back and forth through messages in different threads, which one may not be known until you run the program, to understand what the program
 does).<br /><br />Now I freely confess that I hadn't seen E (I'll look into it now) so if they give some new cool abstraction that solves all of this I'll recant my statements, but for now I'll say this: there is no one solution, we need multiple solutions to the problem of
 parallelism/concurrency. In my book the bare minimum is: Nested data parallelism, task-based (purely functional) parallelism, threads with messages, and threads with shared state (here's where you need transactions).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387022810000000</link>
		<pubDate>Fri, 15 Feb 2008 19:58:01 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387022810000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody">Hi Charles,<br /><br />Thanks to you and Burton Smith for the conversation. I would like to hear more about anything at Microsoft having to do with isolation via message passing and capability security, similar to the E language:<br /><br /><a href="http://www.erights.org/">http://www.erights.org/</a><br /><br />I know we have CCR, but the syntax is clumsy, messaging does not have first class status, nor is there any isolation. What kind of similar things to E are happening at Microsoft, if any?<br /><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /><br />Thanks,<br />Frank</div>
</blockquote>
<br /><br />I have come to the same conclusion.&nbsp; Looked at some of that E stuff.&nbsp; They have the right idea, but it looks very confusing. They make up a lot of unneeded terminology.&nbsp; It needs to be much simpler than that, and can be.&nbsp; The abstraction has to be correct by construction
 and simple, and has to be an explicit opt-out model (a la unsafe), methinks.<p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387204840000000</link>
		<pubDate>Sat, 16 Feb 2008 01:01:24 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387204840000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>Frank Hileman wrote:</strong>
<hr size="1">
<i><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />Why not have both?<br />In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects modify the data). Do you really want each to pass through a protocol with
 just a single thread &quot;owning&quot; that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without
 interfering with the execution of any other threads.<br />...<br /></div>
</blockquote>
<br /><br />I see what you're saying, but let's analyze that.&nbsp; Essentially, that is a classic ReaderWriter lock sample.&nbsp; You need the sync, because you never know whether the last action was by a reader or a writer.&nbsp; So you still need some kind of sync primitive to protect the invariants.<br /><br />One may say, let's keep the message queue for writers, but let readers read properties directly. But here we have a couple of issues.&nbsp; First, you still need a lock to ensure atomic reads (i.e. non-cached and non-torn).&nbsp; Second, you have possible coordination issues the runtime
 could never know and only your logic knows.&nbsp; For example, your object may intentionally not be popping the queue (sort of a working blocking operation) until it completes work for the last task - as maybe order is important or maybe it is waiting on various
 replies.&nbsp; There are all kinds of reasons order could be important, and you can't reliably short-circuit that in the general case.&nbsp; So only the object knows best.&nbsp; As far as perf goes, yes, message passing has some cost.&nbsp; But so do ReaderWriters and Monitors.&nbsp;
 When you look at the &quot;net&quot; performance, I think it probably is a wash or better with message passing.&nbsp; Costs such as complexity and correctness proofs are much lower from the start.&nbsp; There are many other benefits you can't get with locks or even transactional memory.&nbsp;
 You can get ordering semantics, self messaging, throttling,&nbsp;natural asynchrony, pipelining,&nbsp;natural composition, loose binding, and others.<br /><br />I think Juval Lowy nailed it here. Every class needs to be a service:<br /><a href="http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561">http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561</a><p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387225120000000</link>
		<pubDate>Sat, 16 Feb 2008 01:35:12 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387225120000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">staceyw wrote:</div>
<div class="quoteBody"><br />I see what you're saying, but let's analyze that.&nbsp; Essentially, that is a classic ReaderWriter lock sample.&nbsp; You need the sync, because you never know whether the last action was by a reader or a writer.&nbsp; So you still need some kind of sync primitive to protect the invariants.<br /><br />One may say, let's keep the message queue for writers, but let readers read properties directly. But here we have a couple of issues.&nbsp; First, you still need a lock to ensure atomic reads (i.e. non-cached and non-torn).&nbsp; Second, you have possible coordination issues the runtime
 could never know and only your logic knows.&nbsp; For example, your object may intentionally not be popping the queue (sort of a working blocking operation) until it completes work for the last task - as maybe order is important or maybe it is waiting on various
 replies.&nbsp; There are all kinds of reasons order could be important, and you can't reliably short-circuit that in the general case.&nbsp; So only the object knows best.&nbsp; As far as perf goes, yes, message passing has some cost.&nbsp; But so do ReaderWriters and Monitors.&nbsp;
 When you look at the &quot;net&quot; performance, I think it probably is a wash or better with message passing.&nbsp; Costs such as complexity and correctness proofs are much lower from the start.&nbsp; There are many other benefits you can't get with locks or even transactional memory.&nbsp;
 You can get ordering semantics, self messaging, throttling,&nbsp;natural asynchrony, pipelining,&nbsp;natural composition, loose binding, and others.<br /><br />I think Juval Lowy nailed it here. Every class needs to be a service:<br /><a href="http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561">http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561</a></div>
</blockquote>
<br /><br />The problem is that the action you want to do may be &quot;read value, do lots of complicated logic that takes a fairly long time, then you may decide to either do nothing, or update the value&quot;. If you're going to have a single owner of the data that handles this
 as a service (with an ad-hoc transactional protocol) you'll get contention. Let's for the sake of argument say that the computation that needs to be done before we know if we need to update takes a full second and you'll see how performance would be dreadful
 with even a few dozen threads accessing this data (even if actual updates are highly unlikely).<br /><br />You can't just separate reading/writing into two separate requests either because you *need* transactional semantics. You need to know for sure that you read a value, then you decide to update, and the value won't have changed. With message passing that basically
 means &quot;one client at a time please&quot;. I.e. horrible contention. You could do the equivalent of the usual locks/monitors business but that's what we were trying to get away from in the first place (they don't compose, and they're a nightmare to get right under
 complicated - i.e. realistic - scenarios where the set of locks you need to take depend on computations you can only do once you've already taken a bunch of other locks)!<br /><br />With transactions the operations on the objects would still be owned by the objects (because it &quot;knows best&quot;), but the actual code would run on the caller's thread, meaning that all your clients can run at the same time and in 99.9999% of the cases there is
 zero contention, and everything scales almost linearly with the number of cores, and we're happy.<br /><br />I agree that totally isolated threads are better than shared state threads for correctness (but if all you want is parallelism, then tasks are even better, and data parallelism is even better than that), but there are *many* real world scenarios where shared
 state is crucial. Only having message passing is a non-starter for general purpose programming. We NEED a practical and composable (i.e. not locks) way of synchronising access to shared mutable state. Transactions are the only technology I'm aware of that does
 this currently.<br /><br /><br />EDIT: Oh, and consider the problem of granularity. Take the example of a game. You're running a thread for an AI character who wants to perform an action on the game world atomically. How do you solve that? Do you ask the World object to retrieve whatever it
 is the AI wants to access? In other words is the World object's service going to act as basically having one big lock on it for any atomic updates that AI characters want to do? Or do you put the implicit &quot;lock&quot; somewhere further down in the hierarchy (e.g.
 on individual objects in the world)? If so, how is this different from just having locks on the individual objects? How do you solve the problem of not knowing up front which objects you want to modify (because the set of objects you need depends on the values
 you find in the first couple of objects you examine)? Aren't we back in the old hornet's nest of locks and condition variables again?<br />So basically we either have to let the top-level object provide a transactional interface (amounting to a big implicit lock), and basically eliminate any chances of parallelism, or we go back to horribly impractical fine-grained locking with all the problems
 that poses. In this scenario, message passing gives no improvement to us, whereas transactional memory &quot;just works&quot; without any synchronisation burden placed on the programmer at all!<p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387257170000000</link>
		<pubDate>Sat, 16 Feb 2008 02:28:37 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387257170000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[[quote user=&quot;sylvan&quot;]&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>staceyw wrote:</strong>
<hr size="1">
<i><br />I see what you're saying, but let's analyze that.&nbsp; Essentially, that is a classic ReaderWriter lock sample.&nbsp; You need the sync, because you never know whether the last action was by a reader or a writer.&nbsp; So you still need some kind of sync primitive to protect the invariants.<br /><br />One may say, let's keep the message queue for writers, but let readers read properties directly. But here we have a couple of issues.&nbsp; First, you still need a lock to ensure atomic reads (i.e. non-cached and non-torn).&nbsp; Second, you have possible coordination issues the runtime
 could never know and only your logic knows.&nbsp; For example, your object may intentionally not be popping the queue (sort of a working blocking operation) until it completes work for the last task - as maybe order is important or maybe it is waiting on various
 replies.&nbsp; There are all kinds of reasons order could be important, and you can't reliably short-circuit that in the general case.&nbsp; So only the object knows best.&nbsp; As far as perf goes, yes, message passing has some cost.&nbsp; But so do ReaderWriters and Monitors.&nbsp;
 When you look at the &quot;net&quot; performance, I think it probably is a wash or better with message passing.&nbsp; Costs such as complexity and correctness proofs are much lower from the start.&nbsp; There are many other benefits you can't get with locks or even transactional memory.&nbsp;
 You can get ordering semantics, self messaging, throttling,&nbsp;natural asynchrony, pipelining,&nbsp;natural composition, loose binding, and others.<br /><br />I think Juval Lowy nailed it here. Every class needs to be a service:<br /><a href="http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561">http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561</a></i></td>
</tr>
</tbody>
</table>
</blockquote>
<p><br /><br /><em>The problem is that the action you want to do may be &quot;read value, do lots of complicated logic that takes a fairly long time, then you may decide to either do nothing, or update the value&quot;. If you're going to have a single owner of the data that handles
 this as a service (with an ad-hoc transactional protocol) you'll get contention. Let's for the sake of argument say that the computation that needs to be done before we know if we need to update takes a full second and you'll see how performance would be dreadful
 with even a few dozen threads accessing this data (even if actual updates are highly unlikely).<br /><br />You can't just separate reading/writing into two separate requests either because you *need* transactional semantics. You need to know for sure that you read a value, then you decide to update, and the value won't have changed. With message passing that basically
 means &quot;one client at a time please&quot;. I.e. horrible contention. You could do the equivalent of the usual locks/monitors business but that's what we were trying to get away from in the first place (they don't compose, and they're a nightmare to get right under
 complicated - i.e. realistic - scenarios where the set of locks you need to take depend on computations you can only do once you've already taken a bunch of other locks)!<br /><br />With transactions the operations on the objects would still be owned by the objects (because it &quot;knows best&quot;), but the actual code would run on the caller's thread, meaning that all your clients can run at the same time and in 99.9999% of the cases there is
 zero contention, and everything scales almost linearly with the number of cores, and we're happy.<br /><br />I agree that totally isolated threads are better than shared state threads for correctness (but if all you want is parallelism, then tasks are even better, and data parallelism is even better than that), but there are *many* real world scenarios where shared
 state is crucial. Only having message passing is a non-starter for general purpose programming. We NEED a practical and composable (i.e. not locks) way of synchronising access to shared mutable state. Transactions are the only technology I'm aware of that does
 this currently.</em><br />...quote]<br /><br /><br />I agree transactions are a very important abstraction here.&nbsp; And they need to work with db transactions also. So there is some interesting work there.&nbsp; I am not sure STM is required to pull this off.&nbsp; The new runtime/language abstractions&nbsp;could do this by wrapping
 access to classes with proper behaviors.&nbsp; I would think we need optimistic concurrency transactions so remote and local behavior could be the same abstraction.<br /><br />class MyClass<br />{<br />&nbsp;&nbsp; string Name;<br />&nbsp;&nbsp; int Count;<br />}</p>
<p>MyClass mc = ...<br />MyClass mc2 = ...<br /><br />tryTransaction(mc,mc2) // OCC trans started on mc,mc2. mc's inside block are a copy.<br />{<br />&nbsp;&nbsp; mc.Name = &quot;joe&quot;;<br />&nbsp;&nbsp; mc2.Count&#43;&#43;;&nbsp;&nbsp;&nbsp;&nbsp; <br />} // Commit. Runtime updates mc in batch and uses memory barrier to flush writes. Skipped if no writes.
<br />catch(Exception ex)<br />{<br />&nbsp; // trans is rolled back. <br />}</p>
<p>mc.Count&#43;&#43;;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // Error. Writes must be inside OCC trans.<br />int i = mc.Count; // Reads ok outside trans.<br /><br />TMK, expensive memory barriers would still be needed, but they are still cheaper than contended explicit locks.&nbsp; However, at least the runtime does this for us and could dynamically pick the fastest way (e.g. Interlocked operations, Thread.MemoryBarrier, etc.).&nbsp; They could push this
 down even farther in the stack by changing the CLR memory model somehow to address this.</p>
<p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387862410000000</link>
		<pubDate>Sat, 16 Feb 2008 19:17:21 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387862410000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">staceyw wrote:</div>
<div class="quoteBody"><br /><p>I agree transactions are a very important abstraction here.&nbsp; And they need to work with db transactions also. So there is some interesting work there.&nbsp; I am not sure STM is required to pull this off.&nbsp; The new runtime/language abstractions&nbsp;could do this by
 wrapping access to classes with proper behaviors.&nbsp; I would think we need optimistic concurrency transactions so remote and local behavior could be the same abstraction.<br /><br />class MyClass<br />{<br />&nbsp;&nbsp; string Name;<br />&nbsp;&nbsp; int Count;<br />}</p>
<p>MyClass mc = ...<br />MyClass mc2 = ...<br /><br />tryTransaction(mc,mc2) // OCC trans started on mc,mc2. mc's inside block are a copy.<br />{<br />&nbsp;&nbsp; mc.Name = &quot;joe&quot;;<br />&nbsp;&nbsp; mc2.Count&#43;&#43;;&nbsp;&nbsp;&nbsp;&nbsp; <br />} // Commit. Runtime updates mc in batch and uses memory barrier to flush writes. Skipped if no writes.
<br />catch(Exception ex)<br />{<br />&nbsp; // trans is rolled back. <br />}</p>
<p>mc.Count&#43;&#43;;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // Error. Writes must be inside OCC trans.<br />int i = mc.Count; // Reads ok outside trans.<br /><br />TMK, expensive memory barriers would still be needed, but they are still cheaper than contended explicit locks.&nbsp; However, at least the runtime does this for us and could dynamically pick the fastest way (e.g. Interlocked operations, Thread.MemoryBarrier, etc.).&nbsp; They could push this
 down even farther in the stack by changing the CLR memory model somehow to address this.</p>
</div>
</blockquote>
<br /><br /><br />Isn't this system just a less flexible version of STM? How is this different from STM, aside from not supporting &quot;retry&quot; and &quot;orelse&quot;? I assume you don't really want to specify the objects that you keep track of up front either (since that brings us back to
 the original problem where we don't really know what variables we need to update until we've started the transaction).<br />Also, don't know why reads need to be okay outside transactions, since you could easily just do a single-statement transaction doing the read.<br /><br /><br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387895410000000</link>
		<pubDate>Sat, 16 Feb 2008 20:12:21 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633387895410000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>&quot;Isn't this system just a less flexible version of STM? How is this different from STM, aside from not supporting &quot;retry&quot; and &quot;orelse&quot;?&quot;</p>
<p>TMK, STM is still an open research problem (at least from MS perspective). IMO, STM proper is more an implementation detail, I am still at the Abstraction level here. As Burton says, STM is the big hammer. It tries to solve the larger, more general problem
 for all memory access. Maybe it is used below the abstraction (if it ever gets done), or maybe not. But, in concert with the compiler and runtime, I think they can pick out the core ideas and make it work now with a combination of language, compiler, and runtime.</p>
<p>Method above would also support orElse and maybe condition variables at top of block.</p>
<p>tryTransaction(class)<br />{<br />&nbsp; // stuff.<br />}<br />orElse<br />{<br />&nbsp; // other stuff.<br />}<br />orElse<br />{<br />&nbsp;&nbsp; retry 5; // retry 5 times or fail. TM keeps count down.<br />}<br />catch<br />{<br />}</p>
<p>My experiment here is that we handle invariants at the *class* level, not the variable level as such. There is no change log, there is only the original copy of the class(es) at the start of the transaction, stored by the TM. So all invariants of the class are always
 true for Readers and you can't get &quot;temporary&quot; inconsistent state inside transactions as shown in
<a href="http://en.wikipedia.org/wiki/Memory_transactions">http://en.wikipedia.org/wiki/Memory_transactions</a> in &quot;Implementation issues&quot;. If a writer transaction fails, the original copy is put back atomically. A writer is always first a reader of the class(es)
 in the transaction.<br /><br />&quot;I assume you don't really want to specify the objects that you keep track of up front either (since that brings us back to the original problem where we don't really know what variables we need to update until we've started the transaction).&quot;</p>
<p>As above, this copies the object upfront. And all changes are applied or none of them are.</p>
<p>&quot;Also, don't know why reads need to be okay outside transactions, since you could easily just do a single-statement transaction doing the read.&quot;</p>
<p>You can do reads inside a trans. In fact, that would be the only way to ensure consistent reads of all the members of a class as you get a &quot;snapshot&quot; of the class. However single reads outside (i.e. Length, Count, etc) could be allowed as&nbsp;a transaction does
 impose overhead that may not be needed in some cases such as just reading current state for display purposes (i.e. Progress). That said, it may be too easy to shoot yourself in the foot if this was allowed. Not sure on that one.</p>
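<p>To make the class-level snapshot idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: VersionedBox, snapshot, try_commit and transact are hypothetical names, and the version counter used to detect conflicts is an assumption, since the description above does not say how the TM would notice a competing writer. A transaction works on a private copy and the whole object is swapped in at commit, or not at all.</p>

```python
import copy
import threading


class VersionedBox:
    """Holds one object; writers commit whole-object replacements."""

    def __init__(self, value):
        self._lock = threading.Lock()  # held only during snapshot and commit
        self._value = value
        self._version = 0              # assumption: bumped on every commit

    def snapshot(self):
        """Return (version, private deep copy) of the current state."""
        with self._lock:
            return self._version, copy.deepcopy(self._value)

    def try_commit(self, seen_version, new_value):
        """Install new_value only if nobody committed since seen_version."""
        with self._lock:
            if self._version != seen_version:
                return False  # conflict: shared state is left untouched
            self._value = copy.deepcopy(new_value)
            self._version += 1
            return True


def transact(box, update, retries=5):
    """Run update(working_copy) optimistically, retrying on conflict."""
    for _ in range(retries):
        version, working = box.snapshot()
        update(working)  # mutates the private copy only
        if box.try_commit(version, working):
            return True
    return False
```

<p>Readers that call snapshot() always see a state that some transaction committed as a whole, which is the "invariants always hold for readers" property. The cost is the one raised elsewhere in this thread: every transaction copies the whole object, whether or not it ends up writing.</p>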
<p>Interesting topic. Hope they are having fun.</p>
<p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388208140000000</link>
		<pubDate>Sun, 17 Feb 2008 04:53:34 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388208140000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">staceyw wrote:</div>
<div class="quoteBody">&#65279;
<p>As above, this copies the object upfront. And all changes are applied or none of them are.</p>
</div>
</blockquote>
<p></p>
<p>Well that would be the deal breaker then. It seems to me that many of these transactions would have a &quot;potential&quot; set of affected objects that is
<i>much</i> larger than the <i>actual </i>set for any given transaction (as the transaction would be likely to have control flow in it). So the overhead of copying objects you didn't actually need, and copying objects back that you didn't actually write to
 (which is what I assume you're talking about, as keeping track of the objects you actually write to amounts to basically a transaction log, and then why not just use STM?) would be very high indeed.<br /></p>
<p>So again this is only a tiny improvement on locks really, since a transaction that commits would cause all other in-progress transactions to be aborted,
<i>even if they don't actually overlap</i> (which is similar to how taking all the locks you
<i>may</i> need causes contention even in cases where there are no <i>actual</i> conflicts). My intuition tells me that for &quot;real&quot; applications (i.e. the ones where locks and monitors simply aren't tractable) many transactions would suffer from this.<br /></p>
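<p>A small Python sketch of the potential-versus-actual distinction, purely for illustration: must_abort and the example variable sets are made up, and this is not taken from any real STM implementation. It compares a conservative check against the set a transaction might touch with a precise check against what it really touched.</p>

```python
def must_abort(committed_writes, touched):
    """True if a commit overwrote something the other transaction touched."""
    return bool(set(committed_writes) & set(touched))


# Hypothetical transaction B: it *may* touch x, y and z, but on this run
# its control flow only reads x.
b_potential = {"x", "y", "z"}
b_actual = {"x"}

# Transaction A commits a write to y only.
a_writes = {"y"}

conservative = must_abort(a_writes, b_potential)  # based on the declared set
precise = must_abort(a_writes, b_actual)          # based on what B really did

print("conservative check aborts B:", conservative)
print("precise check aborts B:", precise)
```

<p>The conservative check throws away B's work even though A and B never touched the same variable, which is the artificial contention described above. The precise check needs a record of what B actually read and wrote, i.e. something like a transaction log.</p>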
<p>So again, I think we need a solution where you can read and write to variables however you want, and have the conflicts resolved only when they actually occur, rather than doing some sort of conservative estimate. This is key IMO. Amdahl's law requires us to be
 very careful about not artificially limiting the amount of parallelism we can get by doing these kinds of conservative estimations (because if we do so, then that will quickly become our bottleneck when the number of threads increases).<br /></p>
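<p>For reference, Amdahl's law bounds the speedup on n workers at 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. A quick Python sketch (the function name and the sample values of p and n are chosen only for illustration) shows how fast a small serialized fraction starts to dominate:</p>

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only parallel_fraction of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)


# Illustrative values only: compare 1% and 10% serialized work.
for p in (0.99, 0.90):
    for n in (4, 64, 1024):
        print(f"p={p:.2f} n={n:5d} speedup<={amdahl_speedup(p, n):.1f}")
```

<p>With 10% of the work serialized (for example by conservative aborts), the bound can never reach 10x no matter how many threads are added.</p>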
<p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388510140000000</link>
		<pubDate>Sun, 17 Feb 2008 13:16:54 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388510140000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>staceyw wrote:</strong>
<hr size="1">
<i>&#65279;</i>
<p><i>As above, this copies the object upfront. And all changes are applied or none of them are.</i></p>
</td>
</tr>
</tbody>
</table>
</blockquote>
<p></p>
<p>Well that would be the deal breaker then. It seems to me that many of these transactions would have a &quot;potential&quot; set of affected objects that is
<i>much</i> larger than the <i>actual </i>set for any given transaction.</p>
</div>
</blockquote>
<p></p>
Who says you copy it before it's used. Surely you'd only copy an object at the point where you know you're using it (because you need it now). You're copying it before it's used, but you're not copying things you don't need to copy.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388534140000000</link>
		<pubDate>Sun, 17 Feb 2008 13:56:54 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388534140000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;</i>
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>staceyw wrote:</strong>
<hr size="1">
<i>&#65279;</i>
<p><i>As above, this copies the object upfront. And all changes are applied or none of them are.</i></p>
</td>
</tr>
</tbody>
</table>
</blockquote>
<p></p>
<p><i>Well that would be the deal breaker then. It seems to me that many of these transactions would have a &quot;potential&quot; set of affected objects that is
<i>much</i> larger than the <i>actual </i>set for any given transaction.</i></p>
</td>
</tr>
</tbody>
</table>
</blockquote>
<p></p>
Who says you copy it before it's used. Surely you'd only copy an object at the point where you know you're using it (because you need it now). You're copying it before it's used, but you're not copying things you don't need to copy.<br /></div>
</blockquote>
<br /><br />Well, staceyw said you copy it before it's used, for one. Doesn't really matter though, the main problem is that transactions would be rolled back when another transaction commits, just because their
<i>potential</i> set of objects overlap (even if they don't actually interfere). Or if you do check for actual overlap before rolling back, then you lose isolation since a potentially overlapping transaction can survive the commit (you can read variable X before
 a transaction commits a change to X and Y, and then you read Y afterwards. But now the X and Y that you've read aren't consistent anymore, since X is from before the last change, and Y is from after), how would you solve that without a transaction log to check
 at the end of the transaction to verify that the state that was read is consistent?
<br />The way I understand the proposed alternative here is that you DO copy all the objects up front to get isolation (this copy has to happen under a lock, btw, which introduces additional overhead), and then at the end you copy them back (again, under a lock),
 and any transaction that happens to overlap with the potential set of touched variables needs to be aborted (to ensure consistency, as it may end up relying on the values written, we just don't know).<br />I don't see how you can make it work any other way. If you allow other transactions to survive, you need a transaction log at the end so you can see which variables have been changed by other transactions since you started yours (and thereby catch collisions).<br /><br /><br />It just seems to me that this proposed mechanism just introduces additional overhead, and inhibits parallelism for no good reason. Where are the
<i>benefits </i>compared to regular STM?<br /><br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388611650000000</link>
		<pubDate>Sun, 17 Feb 2008 16:06:05 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388611650000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Well, staceyw said you copy it before it's used, for one.<br /></div>
</blockquote>
<br />Copying before doesn't mean doing redundant copies or copying all at once.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Or if you do check for actual overlap before rolling back, then you lose isolation since a potentially overlapping transaction can survive the commit (you can read variable X before a transaction commits a change to X and Y, and then you read Y afterwards.
 But now the X and Y that you've read aren't consistent anymore, since X is from before the last change, and Y is from after), how would you solve that without a transaction log to check at the end of the transaction to verify that the state that was read is
 consistent? <br /></div>
</blockquote>
<br />That's pretty easy. You can check at compile time what the read/write state is on various variables and associate them with the instruction pointer in the function, and whenever a transaction is committed elsewhere you check that the set of variables
 being written doesn't overlap (bitwise AND) the set of variables that have been read, using the following code:<br /><br /><font color="#000000" face="Courier New">extern int numThreads;<br />extern struct thread** threads;</font><br /><br /><font face="Courier New">void doRollbacks(struct transaction* t, struct transaction** ts, int len){<br />&nbsp; // this <i>must </i>be atomic!<br />&nbsp; pause_all_threads();<br />&nbsp; exclusive_lock_begin();<br />&nbsp; <br />&nbsp; int i;<br />&nbsp; for(i=0;i&lt;len;i&#43;&#43;){<br />&nbsp;&nbsp;&nbsp; // conflict: our writes overlap its reads, or our reads overlap its writes<br />&nbsp;&nbsp;&nbsp; if( (*t-&gt;write[get_ip(t)] &amp; *(ts[i]-&gt;read[get_ip(ts[i])])) || (*t-&gt;read[get_ip(t)] &amp; *(ts[i]-&gt;write[get_ip(ts[i])])) )<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rollback(ts[i]);<br />&nbsp; }<br /><br />&nbsp; exclusive_lock_end();<br />&nbsp; resume_all_threads();<br />}<br /><br />// on each thread:<br />void rollback(struct transaction *t){<br />&nbsp; int i;<br />&nbsp; for(i=0;i&lt;t-&gt;write_cache_len;i&#43;&#43;)<br />&nbsp;&nbsp;&nbsp; // roll back all writes<br />&nbsp;&nbsp;&nbsp; memcpy(t-&gt;write_cache[i]-&gt;ptr, t-&gt;write_cache[i]-&gt;buffer, t-&gt;write_cache[i]-&gt;len);<br /><br />&nbsp; // roll back the stack:<br />&nbsp; int x = t-&gt;stackptr;<br />&nbsp; int v = t-&gt;transaction_start;<br />&nbsp; __asm {<br />&nbsp;&nbsp;&nbsp; add ebp %x&nbsp;&nbsp; ; rollback the eval stack<br />&nbsp;&nbsp;&nbsp; jmp %v&nbsp; ; rollback the function pointer.<br />&nbsp; }<br />}<br /><br />int get_ip(struct transaction *t){<br />&nbsp; int r = 0;<br />&nbsp; __asm {<br />&nbsp;&nbsp;&nbsp; mov eax ebp<br />&nbsp;&nbsp;&nbsp; mov eax [eax]<br />&nbsp;&nbsp;&nbsp; add eax 4<br />&nbsp;&nbsp;&nbsp; mov eax [eax]<br />&nbsp;&nbsp;&nbsp; mov %r eax<br />&nbsp; }<br />&nbsp; return r;<br />}<br /><br />void exclusive_lock_begin(){
<br />&nbsp; __asm {<br />&nbsp;&nbsp;&nbsp; pushf&nbsp; ; save the flags first<br />&nbsp;&nbsp;&nbsp; cli&nbsp; ; then disable interrupts<br />&nbsp; }<br />}<br />void exclusive_lock_end(){
<br />&nbsp; __asm {<br />&nbsp;&nbsp;&nbsp; popf&nbsp; ; restore the saved flags<br />&nbsp; }<br />}<br /><br />void pause_all_threads(){<br />&nbsp; int i;<br />&nbsp; for(i=0; i&lt; numThreads; i&#43;&#43;)<br />&nbsp;&nbsp;&nbsp; if(threads[i] != get_current_thread())<br />&nbsp;&nbsp; &nbsp;&nbsp; pause(threads[i]);<br />}<br /><br />void resume_all_threads(){<br />&nbsp; int i;<br />&nbsp; for(i=0; i&lt; numThreads; i&#43;&#43;)<br />&nbsp;&nbsp;&nbsp; if(threads[i] != get_current_thread())<br />&nbsp;&nbsp; &nbsp;&nbsp; resume(threads[i]);<br />}</font><br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388840550000000</link>
		<pubDate>Sun, 17 Feb 2008 22:27:35 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388840550000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;<br />Well, staceyw said you copy it before it's used, for one.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />Copying before doesn't mean doing redundant copies or copying all at once.<br /></div>
</blockquote>
<br /><br />Well the wording was &quot;up front&quot;, I believe, which I would take to mean &quot;copy everything at once before the transaction starts&quot;.<br /><br /><blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;<br /><blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;<br />Or if you do check for actual overlap before rolling back, then you lose isolation since a potentially overlapping transaction can survive the commit (you can read variable X before a transaction commits a change to X and Y, and then you read Y afterwards.
 But now the X and Y that you've read aren't consistent anymore, since X is from before the last change, and Y is from after), how would you solve that without a transaction log to check at the end of the transaction to verify that the state that was read is
 consistent? <br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />That's pretty easy. You can check at compile time what the read/write state is on various variables and associate them with the instruction pointer in the function<br /></div>
</blockquote>
<br /><br />Can you though?<br />What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know,
 just from the instruction pointer, which variables have been read if the IP points to some code after the if statement? How could you easily know what path through the preceding code you've taken? Wouldn't that have to be a very conservative estimate (again,
 needlessly inhibiting parallelism)?<br /><br />In fact, if my IP is right at the final &quot;return&quot; statement of a transaction, then wouldn't this bit field be the same as my &quot;potential set&quot; of touched variables, even though the
<i>actual</i> set of touched variables could be empty?<br /><br />If there's one thing I think we should avoid like the plague, it's introducing anything which may reduce the degree of parallelism we can see in a program. Amdahl's law scares me. We may not have hit it yet, but it's coming at us at 300mph and it's like a
 big brick wall on the horizon.<br /><br />Without actually logging what gets done, I don't see how we could easily check whether the transaction
<i>actually </i>conflicts. Though I would be interested in seeing an approach to solving this issue (a transaction seeing inconsistent state) in a regular STM system using a similar approach (i.e. right after a commit, check any potentially overlapping transaction's
 current log, and if the read set in those overlaps the write set in the transaction you're committing, you can restart it). I think the implementation in GHC just relies on the commit of the inconsistent transaction to catch this itself, where the odd cases
 of the inconsistency leading to non-termination are handled by checking it on each GC or something like that. I suspect the reasons for doing it this way are performance related (because it certainly seems obvious that a committing transaction would kill any
 other transactions that it has invalidated, so I can't see that they've all just missed it), which tells me that perhaps this may have terrible performance characteristics (e.g. due to that big global locking you're doing every time something commits - ouch!),
 as well as being too conservative.<br /><br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388858420000000</link>
		<pubDate>Sun, 17 Feb 2008 22:57:22 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388858420000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Can you though?<br />What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know,
 just from the instruction pointer, which variables have been read if the IP points to some code after the if statement? How could you easily know what path through the preceding code you've taken? Wouldn't that have to be a very conservative estimate (again,
 needlessly inhibiting parallelism)?<br /></div>
</blockquote>
<br />The short answer is yes. Particularly in managed languages, but also in C&#43;&#43; and C to a lesser extent this is not only feasible but currently done - on x64 architectures C&#43;&#43; programs compiled with CRT checking implement try...catch...finally blocks as exactly
 this so that when an exception is thrown it can &quot;roll back&quot; the function pointer to the appropriate stage and it uses a similar lookup table to determine which objects on the stack were created and thus need to be finalized.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />In fact, if my IP is right at the final &quot;return&quot; statement of a transaction, then wouldn't this bit field be the same as my &quot;potential set&quot; of touched variables, even though the
<i>actual</i> set of touched variables could be empty?<br /></div>
</blockquote>
<br />You seriously underestimate the power of modern compiler theory. You might only have one return statement, but the compiler will have a lot. You need to do truly appalling things to C or C&#43;&#43; before these kind of optimisations become concerned (I mean breaking
 into asm or jump to void pointer kind of nasty code).<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />If there's one thing I think we should avoid like the plague, it's introducing anything which may reduce the degree of parallelism we can see in a program. Amdahl's law scares me. We may not have hit it yet, but it's coming at us at 300mph and it's like a
 big brick wall on the horizon.<br /></div>
</blockquote>
<br />I agree. Hopefully such advancements will allow us to get greater parallelisation with less programmer manual intervention.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />Without actually logging what gets done, I don't see how we could easily check whether the transaction
<i>actually </i>conflicts. Though I would be interested in seeing an approach to solving this issue (a transaction seeing inconsistent state) in a regular STM system using a similar approach (i.e. right after a commit, check any potentially overlapping transaction's
 current log, and if the read set in those overlaps the write set in the transaction you're committing, you can restart it).<br /></div>
</blockquote>
<br />We do similar things to databases - why should imperative code be any different? The only big concern is when rolling back very large or very old transactions, and even these are at a cost that is feasible.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />... which tells me that perhaps this may have terrible performance characteristics (e.g. due to that big global locking you're doing every time something commits - ouch!), as well as being too conservative.<br /></div>
</blockquote>
<br />Determining whether or not a transaction collision occurs by definition needs a global lock on the transactions and a pause on the threads. Any other course of action could end up with a race condition on the rollback. As you can see, the time to commit is linear
 in the number of threads and objects, and the cost of each rollback is linear in the number of mutated objects.<br /><br />Consequently the average case is a global lock over a function of complexity O(n log n)
<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388934670000000</link>
		<pubDate>Mon, 18 Feb 2008 01:04:27 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633388934670000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;<br />Can you though?<br />What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know,
 just from the instruction pointer, which variables have been read if the IP points to some code after the if statement? How could you easily know what path through the preceding code you've taken? Wouldn't that have to be a very conservative estimate (again,
 needlessly inhibiting parallelism)?<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />The short answer is yes. Particularly in managed languages, but also in C&#43;&#43; and C to a lesser extent this is not only feasible but currently done - on x64 architectures C&#43;&#43; programs compiled with CRT checking implement try...catch...finally blocks as exactly
 this so that when an exception is thrown it can &quot;roll back&quot; the function pointer to the appropriate stage and it uses a similar lookup table to determine which objects on the stack were created and thus need to be finalized.<br /></div>
</blockquote>
<br /><br />How would this work exactly?<br />Are you saying that each &quot;path&quot; through the code gets transformed into a &quot;tree&quot;-like shape by duplicating any code that happens after a branch so that each option gets its own copy of the following statements?<br />e.g.<br /><br />if (x&gt;0)<br />{<br />&nbsp;&nbsp;&nbsp; foo();<br />}<br />bar();<br /><br />turns into:<br /><br />if (x&gt;0)<br />{<br />&nbsp;&nbsp;&nbsp; foo();<br />&nbsp;&nbsp;&nbsp; bar();<br />}<br />else<br />{<br />&nbsp;&nbsp;&nbsp; bar();<br />}<br /><br />I could see how doing something like that could indeed give you a way of checking the instruction pointer for the exact set of objects used, but wouldn't code bloat be quite horrific?<br /><br /><blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;<br />Determining whether or not a transaction collision occurs by definition needs a global lock on the transactions and a pause on the threads.<br /></div>
</blockquote>
<br /><br /><br />Couldn't you just (write-)lock the data you've read/written when trying to commit (in some global ordering)? That way you wouldn't lock *all* transactions, only the ones who try to do anything to the data you've needed.<br />You still need to &quot;fix&quot; any transactions that get into an infinite loop or something due to an inconsistent view of the world, but you could do&nbsp; a check on each thread switch, or GC or something, and of course in their own commit (if they get that far).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633389202330000000</link>
		<pubDate>Mon, 18 Feb 2008 08:30:33 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633389202330000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />I could see how doing something like that could indeed give you a way of checking the instruction pointer for the exact set of objects used, but wouldn't code bloat be quite horrific?<br /></div>
</blockquote>
<br />You certainly end up with bigger emitted code if you do it that way, but it does make transactions and exceptions easier to cope with. It's worth pointing out that this does not make the program
<i>slower</i>, merely <i>bigger</i>.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Couldn't you just (write-)lock the data you've read/written when trying to commit (in some global ordering)? That way you wouldn't lock *all* transactions, only the ones who try to do anything to the data you've needed.<br /></div>
</blockquote>
<br />That would be a better system, I agree. I was trying to avoid using locks for the sake of simplicity.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />You still need to &quot;fix&quot; any transactions that get into an infinite loop or something due to an inconsistent view of the world, but you could do&nbsp; a check on each thread switch, or GC or something, and of course in their own commit (if they get that far).<br /></div>
</blockquote>
<br /><br />Detecting whether a program is in an infinite loop is (in general) impossible. Happily however, a transaction in an infinite loop will never commit (by definition), and therefore will keep running &quot;out-of-the-way&quot; or will be eventually reset by another transaction
 commit that overwrites data that the infinitely-looping transaction has read from or written to.<br /><br />You suggest, however, that the infinitely-running transaction might commit, but this would signal the immediate end of the transaction - a transaction can be thought of as a boolean method on its own thread (or a Turing machine): if it commits, it returns
 true and terminates, and if it aborts, it returns false and terminates.<br /><br /><br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633389712580000000</link>
		<pubDate>Mon, 18 Feb 2008 22:40:58 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633389712580000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>Frank Hileman wrote:</strong>
<hr size="1">
<i><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />Why not have both?<br />In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects modify the data). Do you really want each to pass through a protocol with
 just a single thread &quot;owning&quot; that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without
 interfering with the execution of any other threads.<br /><br />I agree that message passing is ideal <i>when suitable</i>, but sometimes it just isn't. Also, while threads are sometimes excellent abstractions (even for things where you *don't* care about concurrency they may be the right model), sometimes they just suck.
 They basically suffer from the same problem that &quot;goto&quot; does: obfuscation of the structure of the program (you have to jump back and forth through messages in different threads, which one may not be known until you run the program, to understand what the program
 does).<br /><br />Now I freely confess that I hadn't seen E (I'll look into it now) so if they give some new cool abstraction that solves all of this I'll recant my statements, but for now I'll say this: there is no one solution, we need multiple solutions to the problem of
 parallelism/concurrency. In my book the bare minimum is: Nested data parallelism, task-based (purely functional) parallelism, threads with messages, and threads with shared state (here's where you need transactions).<br /></div>
</blockquote>
<br /><br />In message oriented systems, erlang, scala, etc, the isolated, message passing units are more lightweight than threads. The idea is to get away from the traditional thread with its heavy stack. Message passing is a robust, proven way of isolating state, that
 can scale well. Most people believe message passing is easier than locks, which do not scale well. Message passing in conjunction with functional programming (erlang) has proven successful for difficult concurrent telecommunications applications.<br /><br />Transactions will always be needed for some types of applications, but transactional memory is a new thing, that to me, seems more of a hack to keep holding onto older ways of working.<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390372720000000</link>
		<pubDate>Tue, 19 Feb 2008 17:01:12 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390372720000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">staceyw wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>Frank Hileman wrote:</strong>
<hr size="1">
<i><br />For me, isolation via message passing sounds more interesting than transactional memory. It has&nbsp;proven useful for decades.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />Why not have both?<br />In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects modify the data). Do you really want each to pass through a protocol with
 just a single thread &quot;owning&quot; that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without
 interfering with the execution of any other threads.<br />...<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />I see what you're saying, but let's analyze that.&nbsp; Essentially, that is a classic ReaderWriter lock sample.&nbsp; You need the sync, because you never know what the last action was, a read or a write.&nbsp; So you still need some kind of sync primitive to protect the invariants.<br /><br />One may say, let's keep the message queue for writers, but let readers read properties directly. But here we have a couple of issues.&nbsp; First, you still need a lock to ensure atomic reads (i.e. non-cached and non-torn).&nbsp; Second, you have possible coordination issues the runtime
 could never know and only your logic knows.&nbsp; For example, your object may intentionally not be popping the queue (sort of a working blocking operation) until it completes work for the last task - as maybe order is important or maybe it is waiting on various
 replies.&nbsp; There are all kinds of reasons order could be important and you can't reliably short-circuit that in the general case.&nbsp; So only the Object knows best.&nbsp; As far as Perf goes, yes the message passing has some cost.&nbsp; But so do ReaderWriters and Monitors.&nbsp;
 When you look at the &quot;net&quot; performance, I think it probably is a wash or better with message passing.&nbsp; Costs such as complexity and correctness proof are much less from the start.&nbsp; There are many other benefits you can't get with locks or even trans memory.&nbsp;
 You can get ordering semantics, self messaging, throttling,&nbsp;naturally async, pipelining,&nbsp;natural composition, loose binding, and others.<br /><br />Think Juval Lowy nailed it here. Every class needs to be a service:<br /><a href="http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561">http&#58;&#47;&#47;channel9.msdn.com&#47;ShowPost.aspx&#63;PostID&#61;349561&#35;349561</a></div>
</blockquote>
<br /><br />If you think about &quot;processes&quot; that are lighter than a thread, and do not need a stack, messages can possibly be as fast as ordinary method calls, with the added difficulty of enregistration of parameters. A stack is used to save the function call frames. A
 message queue saves message parameters. In terms of storage, there is not a big difference, it is primarily a question of creating a very fast queue and dispatching mechanism.&nbsp;<br /><br />I think it is best to assume the creators of erlang, scala, E, etc have or will address performance, and avoid assumptions about implementation or performance.
<br /><br />Just looking at the advantages, I agree with you, it is a great way to work.<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390380720000000</link>
		<pubDate>Tue, 19 Feb 2008 17:14:32 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390380720000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279; <br /><br />Detecting whether a program is in an infinite loop is (in general) impossible. <br /><br /></div>
</blockquote>
<br /><br />Well you only need to detect problems that have resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!<br />With a log-based transactional system this is trivial: just check that the read values in the log match the actual values. If not, then the actual values must have been updated by another transaction, and any transaction that has read from them is invalid. Normally
 the transaction could detect this inconsistency when it validates before it commits, but if the inconsistency happens to cause it to diverge then you need to check it somewhere else too (e.g. on thread switching) as it will never commit.<br /><br />This way you never have to look at any other transactions when committing, you just make sure that
<i>you</i> are consistent and then commit. And in most cases a slight inconsistency won't lead to a transaction diverging, so that transaction can check itself when it reaches its commit. And if it does diverge it should be rare, so we can just validate the
 &quot;read&quot; entries in the log periodically.<br /><br /><br /><blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody">In message oriented systems, erlang, scala, etc, the isolated, message passing units are more lightweight than threads. The idea is to get away from the traditional thread with its heavy stack. Message passing is a robust, proven way
 of isolating state, that can scale well. Most people believe message passing is easier than locks, which do not scale well. Message passing in conjunction with functional programming (erlang) has proven successful for difficult concurrent telecommunications
 applications.<br /><br />Transactions will always be needed for some types of applications, but transactional memory is a new thing, that to me, seems more of a hack to keep holding onto older ways of working.<br /></div>
</blockquote>
<br /><br />The performance issue with threads isn't just that threads have overhead themselves; it's that many applications simply don't map well to the concept of &quot;threads and messages&quot;. This leads to massive contention. I gave several examples in my previous posts,
 for example a big game world that all your objects want to update (each of them needs atomic updates). With message passing these accesses will be serialised, as only one object at a time can access the world atomically (unless you break the world up into
 smaller pieces with one &quot;service&quot; each, which means you're effectively simulating locks using threads, so you get all the usual locking issues - deadlock, race conditions, etc.). In situations such as this, when there is high contention for a resource (even
 if it's &quot;large&quot;) where atomic access is needed (even if the accesses are usually independent), threads simply don't give you any parallelism.<br /><br />Threads are great when your problem happens to map nicely to small independent chunks that can communicate via messages. Many problems simply cannot be written in this way without horrible contention.<br />So for a general purpose language we can't rely entirely on threads and messages. We need good support for threads and message passing too, but we can't afford to just leave things running sequentially because our programming model doesn't support parallelizing
 them (Amdahl's law tells us that if we do, then this bit of sequential code will become our bottleneck in no time).<br /><br /><br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390453690000000</link>
		<pubDate>Tue, 19 Feb 2008 19:16:09 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390453690000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />Well you only need to detect problems that have resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!<br /></div>
</blockquote>
<br /><br />The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it
<i>commits</i> does it write to the shared memory. You can then cancel any transactions which might have read or written to these values (and would have inconsistent state if they continued), and simply free their memory and restart them. This has the nicety
 that you never need to check for your own consistency, because it is guaranteed when you start and while you are running: any possible overwrite of data that might render your world view inconsistent will result in your own immediate termination and
 rescheduling. You would however need to use an appropriate weighting function to avoid livelock.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390467830000000</link>
		<pubDate>Tue, 19 Feb 2008 19:39:43 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390467830000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>evildictaitor wrote:</strong>
<hr size="1">
<i><br />The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it
<i>commits</i> does it write to the shared memory.&nbsp; You can then cancel any transactions which might have read or written to these values (and would have inconsistent state if they continued), and simply free their memory and restart them.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />That's <i>one</i> implementation strategy, yes. What I was talking about was the implementation strategy used in e.g. Haskell STM (which is the other way around), where you have a transaction log that you fill in as you go along, and then in the commit you simply
 check your read values for consistency with actual &quot;live&quot; values, and if they haven't changed you're good to write out your changes.<br />This <i>can</i> lead to inconsistencies. So the key is to kill off these invalid transactions somehow. You
<i>could </i>do it by having each transaction kill off any conflicting transactions when it commits (your approach), but in GHC what they do (as far as I can tell) is let those transactions detect that they're inconsistent when
<i>they</i> commit. That leaves the corner case of a transaction that diverges (and thus would never commit) due to an inconsistency. These should be rare, and can be checked as a special case periodically (e.g. on GC or thread switching).<br />The benefit of this approach, as far as I can tell, is that each transaction only ever looks at its own stuff, and won't have to go look through a bunch of other transactions just to verify that none of them conflict (which they usually won't). Either
 approach would work, which is why I said earlier that I would be interested in seeing the other way tried in e.g. Haskell to see if this approach would have benefits (e.g. because invalid transactions would be terminated immediately).<br /><br />(There are variations on this idea, e.g. look up Tim Harris' research and you'll see an approach where the writes in the transactions actually write directly to the &quot;live&quot; objects, optimizing for successful commits, and the transaction logs are only used for rolling
 back if the commit fails)<br /><br />However, the suggested strategy of specifying transactional variables up front would have problems with scenarios where the transaction variables aren't known at compile time.<br />E.g. if you read transactional variables from a transactional channel, and do transactional updates on them. Since those transactional variables can come from any other thread that has access to the channel (and depend on user input or whatever), and you don't
 even know how many of them you'll get, there's no way of producing a static lookup of which variables have been read based on the instruction pointer. As far as I can see you either need a transaction log, or you disallow transactional variables as first class
 citizens. So I'm not sure how well the log-less idea would work in practice (it seems to me that storing transactional variables inside other transactional variables would be very common, e.g. when you send a message to another thread and also supply a message
 channel for that thread to send its reply).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390555090000000</link>
		<pubDate>Tue, 19 Feb 2008 22:05:09 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390555090000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[duplicate<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390559890000000</link>
		<pubDate>Tue, 19 Feb 2008 22:13:09 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390559890000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />there are variations on this idea, e.g. look up Tim Harris' research and you'll see an approach where the writes in the transactions actually write directly to the &quot;live&quot; objects - optimizing for successful commits, and the transaction logs are only used for rolling
 back if the commit fails<br /></div>
</blockquote>
<br />It would surprise me if this sped things up, since most objects are accessed via pointer (and certainly are in .NET) and thus whether you copy first and mutate on the new object or make a backup so you can rollback is merely a question of whether you swap the
 pointers at the point of a transactional write. But maybe there's some other benefit to this, I don't know.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />However, the suggested strategy of specifying transactional variables up front would have problems with scenarios where the transaction variables aren't known at compile time.<br /></div>
</blockquote>
<br />I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />E.g. if you read transactional variables from a transactional channel, and do transactional updates on them. Since those transactional variables can come from any other thread that has access to the channel (and depend on user input or whatever), and you don't
 even know how many of them you'll get, there's no way of producing a static lookup of which variables have been read based on the instruction pointer.<br /></div>
</blockquote>
<br />One of the points of transactions is they are (effectively) synchronous. That is to say that if a transactional message is passed to a function, either that function must run synchronously and return, or it must guarantee that the source thread relinquishes
 control of the object and that the result of the computation is non-side-effecting.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />As far as I can see you either need a transaction log, or you disallow transactional variables as first class citizens. So I'm not sure how well the log-less idea would work in practice (it seems to me that storing transactional variables inside other transactional
 variables would be very common, e.g. when you send a message to another thread and also supply a message channel for that thread to send its reply).<br /></div>
</blockquote>
<br />Again, I can't think of a good reason to spawn multiple threads within a transaction. If you start to get to that level of complexity, the effect of a rollback or commit would be effectively to cancel a large number of transactions and do many copy-backs at
 the point of commit (or abort), which strikes me as needing a model other than transactions for efficient use.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390665250000000</link>
		<pubDate>Wed, 20 Feb 2008 01:08:45 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390665250000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i><br />Well you only need to detect problems that have resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it
<i>commits</i> does it write to the shared memory. You can then cancel any transactions which might have read or written to these values (and would have inconsistent state if they continued), and simply free their memory and restart them. This has the nicety
 that you never need to check for your own consistency, because it is guaranteed when you start and while you are running: any possible overwrite of data that might render your world view inconsistent will result in your own immediate termination and
 rescheduling. You would however need to use an appropriate weighting function to avoid livelock.<br /></div>
</blockquote>
<br /><br />I agree.&nbsp; This also is relatively easy to reason about, which is important as code gets complex.<p>posted by staceyw</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390682720000000</link>
		<pubDate>Wed, 20 Feb 2008 01:37:52 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390682720000000</guid>
		<dc:creator>staceyw</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody"><br />EDIT: Oh, and consider the problem of granularity. Take the example of a game. You're running a thread for an AI character who wants to perform an action on the game world atomically. How do you solve that? Do you ask the World object to retrieve whatever it
 is the AI wants to access? In other words is the World object's service going to act as basically having one big lock on it for any atomic updates that AI characters want to do? Or do you put the implicit &quot;lock&quot; somewhere further down in the hierarchy (e.g.
 on individual objects in the world)? If so, how is this different from just having locks on the individual objects? How do you solve the problem of not knowing up front which objects you want to modify (because the set of objects you need depends on the values
 you find in the first couple of objects you examine)? Aren't we back in the old hornets nest of locks and condition variables again?<br />So basically we either have to let the toplevel object provide a transactional interface (amounting to a big implicit lock), and basically eliminate any chances of parallelism, or we go back to horribly impractical fine grained locking with all the problems
 that poses. In this scenario, message passing gives no improvement to us, whereas transactional memory &quot;just works&quot; without any synchronisation burden placed on the programmer at all!</div>
</blockquote>
<br /><br />Ignoring the &quot;thread&quot; implementation idea (since the isolated units are lighter than threads), and other implementation assumptions, I assume you are stating the problem as: world, one big unit containing smaller pieces of mutating data, and AI characters, many
 smaller units that interact with this. Let's call the units processes (not processes in the OS sense). What interaction do you see between these processes?<br /><br />I am not sure how you would define the interaction, but intuitively, I think you selected an example where message passing works well (and is probably why massively multiplayer games use messaging). Let's suppose each AI process receives message G to retrieve
 data from the world, and sends message S to mutate the world.<br /><br />Each message is queued. Assume the world has a set of messages queued, of type S. Assume it is sending out G in between S processing work.<br /><br />AI1 requested data and got a message G. It sends out S based on the values in G; let's call that world state 1. In the meantime the world has changed. By the time the world processes that S, it is in world state 2.<br /><br />1) How can an AI reliably compute the data for S if the world is constantly changing state?<br /><br />2) How much parallelism have we lost by putting so much mutable data in the large world process?<br /><br />The answer to question 1 is usually in the design of the algorithms. Ideally the difference between world state 1 and 2 does not affect the validity of the AI message S to the world. That is, it may be something like &quot;add this amount of energy to a particle&quot;,
 not &quot;set the absolute value of the energy of this particle&quot;. A delta, for a game, is probably more appropriate.<br /><br />If the validity of the message is dependent upon the world state, message S could include as data a &quot;world state stamp&quot;, that is, an identifier showing that S is only valid if the world is still in state 1. Message G, from the world to the AI, would include
 this stamp.<br /><br />The world process would then ignore S messages with past-due stamps, resending G to the sender AI processes with ignored S messages, so they could recompute.<br /><br />This would be a contention problem with one giant world process. Which leads to the second question, granularity. If the world is broken up into many smaller processes, the contention could potentially go away, assuming not every AI is working with a world
 state dependent algorithm, with each trying to modify the same small mutable state with that same algorithm.<br /><br />Messaging does not make contention problems go away, it is true. You must still design the system correctly to avoid that.<br /><br />I would argue messaging makes contention problems easier to identify and analyze, because now you deal only with the contention problem itself, instead of synthetic problems introduced with traditional multi-threading and locks.<br /><p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390811920000000</link>
		<pubDate>Wed, 20 Feb 2008 05:13:12 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390811920000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects modify the data). Do you really want each to pass
 through a protocol with just a single thread &quot;owning&quot; that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading
 it wants to without interfering with the execution of any other threads.<br /></div>
</blockquote>
<br /><br />Again, assuming you do not mean &quot;threads&quot; but some lightweight process concept, this is a problem for the message dispatcher, not the programmer. The dispatcher can recognize a sequence of many messages retrieving data, and they can all be run in parallel.
 There is no need to serialize the message processing until state is mutated. That is why messaging and functional programming work so well together. Any locking or true OS threads are hidden from the programmer.<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390820860000000</link>
		<pubDate>Wed, 20 Feb 2008 05:28:06 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390820860000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[Anyone interested in comparing message&nbsp;passing concurrency to other techniques may wish to read this:<br /><br /><a href="http://www.info.ucl.ac.be/~pvr/bookcc.html">http://www.info.ucl.ac.be/~pvr/bookcc.html</a><br /><br />And in particular this paper:<br /><br /><a href="http://www.info.ucl.ac.be/~pvr/flopsPVRarticle.pdf">http://www.info.ucl.ac.be/~pvr/flopsPVRarticle.pdf</a><p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390908100000000</link>
		<pubDate>Wed, 20 Feb 2008 07:53:30 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390908100000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i><br />there are variations on this idea, e.g. look up Tim Harris' research and you'll see an approach where the writes in the transactions actually write directly to the &quot;live&quot; objects - optimizing for successful commits, and the transaction logs are only used for rolling
 back if the commit fails<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />It would surprise me if this sped things up, since most objects are accessed via pointer (and certainly are in .NET) and thus whether you copy first and mutate on the new object or make a backup so you can rollback is merely a question of whether you swap the
 pointers at the point of a transactional write. But maybe there's some other benefit to this, I don't know.<br /></div>
</blockquote>
<br /><br />It does indeed seem to speed things up. He gets only about 50% overhead (compared to no transactions) for short-lived transactions, which is much better than locks in the benchmarks in the paper and very impressive. There are other optimizations too, though.<br />
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody"><br />I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).<br /></div>
</blockquote>
<br />Why could they not be on the heap?<br />You could read a TVar containing a filter function, and another TVar containing a list of TVars, and then filter this list based on the filter function to get a reduced list of TVars, and then modify each of them based on the value of yet another TVar, etc. etc.
 How could you possibly know which values have been written to when the IP points &quot;past&quot; this code? The number of TVars that you have modified is not statically known, the filter function is not statically known, and the list of TVars itself is not statically
 known.<br /><br /><blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody"><br />Again, I can't think of a good reason to spawn multiple threads within a transaction.<br /></div>
</blockquote>
<br /><br />I haven't said anything about spawning multiple threads within a transaction?<br /><br /><br /><br /><blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody"><br />Again, assuming you do not mean &quot;threads&quot; but some lightweight process concept, this is a problem for the message dispatcher, not the programmer. The dispatcher can recognize a sequence of many messages retrieving data, and they can all be run in parallel.
 There is no need to serialize the message processing until state is mutated. That is why messaging and functional programming work so well together. Any locking or true OS threads are hidden from the programmer.<br /></div>
</blockquote>
<br /><br />How could these messages be run in parallel, if each of these messages requires atomic updates? I.e. I need to do &quot;Find position of explosive barrel, if I collide with it then explode it&quot;; it's no good if someone else does this at the same time and moves the barrel
 after I've read the position but before I explode it! How could you possibly know that two threads that both read data from the game world (for example) will not decide to write to the same place as a result of that read? Atomic access is key, and with message
 passing that means the accesses get serialised.<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390923920000000</link>
		<pubDate>Wed, 20 Feb 2008 08:19:52 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633390923920000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">How could these messages be run in parallel, if each of these messages requires atomic updates? I.e. I need to do &quot;Find position of explosive barrel, if I collide with it then explode it&quot;; it's no good if someone else does this at the same
 time and moves the barrel after I've read the position but before I explode it! How could you possibly know that two threads who both read data from the game world (for example) will not decide to write to the same place as a result of that read? Atomic access
 is key, and with message passing that means the accesses get serialised.<br /></div>
</blockquote>
<br /><br />If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (i.e. are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not
 matter what form of concurrency you use (locks, message passing, transactions): it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.<br /><br />The description I gave previously of the AI and world processes applies to your barrel scenario. Ideally the game is designed so that serialization is not needed by design -- i.e. the barrel explodes regardless of whether it has moved. If you decide to create
 a choke point in your design, and all processes are&nbsp;hitting that&nbsp;spot at the same time,&nbsp;you have designed something that does not parallelize well, regardless of the concurrency mechanism.<br /><br />Massively multiplayer games use messaging extensively, and probably avoid that type of &quot;serialization by design&quot;.<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391208250000000</link>
		<pubDate>Wed, 20 Feb 2008 16:13:45 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391208250000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody"><br /><br />If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (i.e. are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not
 matter what form of concurrency you use, locks, message passing, transactions, it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.</div>
</blockquote>
<br /><br />No, it's not serialized by design, in fact it's (deliberately) extremely parallel by design, with the occasional rare conflict. You have tens of thousands of objects, most of which don't care one bit about that barrel, but sometimes one of them does, and even
 more rarely two or more of them do. <br />The point is that the mere infinitesimal <i>possibility </i>of conflicts causes 100% serialization when you use message passing, whereas with transactions you can run in parallel, and deal with those rare cases of conflicts when and only when they actually occur.<br /><br />If it's just the case of a single barrel you may be able to hack your own optimistic transactional memory on top of the messages (e.g. you have one message which does not block that you can use to check if you need to update the barrel, and if so you just do
 it again with the atomic version - that way 99.9% of the objects would just decide that they don't care about the barrel at all and leave it alone), but it gets much worse in real world scenarios. In practice you'll often have each object want to read N unspecified
 objects from the world, and modify M other unspecified objects in the world (which may or may not overlap with the N that you read). There is no way to know up front which objects you need to read/modify; you only know the exact set of objects that was needed
<i>after</i> the operation has occurred. All this has to happen atomically, naturally, which means that with message passing you'll be forced to have a single service guarding &quot;The World&quot;, and each object's operations on the world will be entirely serialized.
 It's simply impossible to do this concurrently if your world is guarded by a message process, even though the number of
<i>actual</i> conflicts that these atomic operations have are very very low.<br />And again, with transactional memory, the problem simply disappears and you get near linear speedup as you add more CPU:s.<br /><br />Also, I didn't &quot;design&quot; the problem to be difficult for message passing, it just <i>
was</i> difficult for message passing all by itself. Sometimes the thing you're simulating just isn't suitable to message passing. You can't blame the problem because the language doesn't offer a good way of solving it!<br /><br />Look, I'm the biggest FP advocate there is. I like Erlang et al. as much as the next guy (though my favourite language is Haskell), but the fact of the matter is that there are real problems that can not be solved with message passing. In my experience, most
 applications that are actually concurrent by nature (servers, etc.) can use message passing good effect, but when you try to speed up non-concurrent applications the instances where your problems map nicely to threads and messages start to become more rare.
 We can't just ignore these problems (again, Almdahl's law won't let us), we need to provide a solution for them too. That's why we need many ways of doing these things. In most cases you can be data parallel, in some cases you need task based parallelism,
 and in yet fewer cases you can use threads and message passing, and in fewer cases still you need transactional memory. We can't leave any of these out though, as that would disqualify the language from being considered &quot;general purpose&quot; w.r.t. concurrency/parallelism,
 IMO.<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391286310000000</link>
		<pubDate>Wed, 20 Feb 2008 18:23:51 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391286310000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>evildictaitor wrote:</strong>
<hr size="1">
<i>&#65279;<br />I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />Why could they not be on the heap?<br /></div>
</blockquote>
<br />Because you need to keep a pointer to that object on the heap in order to use it. Or put another way, an object on the heap that is not referenced inside the transitive closure of the variables on the stack is eligible for disposal.<br /><br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Yould read a TVar containing a filter function, and another TVar containing a list of TVars, and then filter this list based on the filter function to get a reduced list of TVars, and then modify each of them based on the value of yet another TVar, etc. etc.
 How could you possibly know which values have been written to when the IP points &quot;past&quot; this code? The number of TVars that you have modified is not statically known, the filter function is not statically known, and the list of TVars itself is not statically
 known.<br /></div>
</blockquote>
<br />No, but you know that the list of TVars was read, and you have the TVars site referenced locally in the function locals list. You can see then that the TVars have been changed, and that when you want to do a mutate on the TVars object you will need to make
 a copy of the array (which is an array of pointers and thus not expensive) for either backup or for preemptive mutation.<br /><br />As to whether or not the filter function is known, it is clear that the filter function
<i>is </i>known statically, since in managed languages you must define all of the code that may run over the course of the program statically prior to program execution.<br /><br /><blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br /><blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>evildictaitor wrote:</strong>
<hr size="1">
<i>&#65279;<br />Again, I can't think of a good reason to spawn multiple threads within a transaction.<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />I haven't said anything about spawning multiple threads within a transaction?<br /></div>
</blockquote>
<br />In which case your mechanism of passing messages must either be considered to be a side-effect (which is another problem, but one that can also be solved) or just a linear sequential operation, which is no different to any other transaction based function call.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391311710000000</link>
		<pubDate>Wed, 20 Feb 2008 19:06:11 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391311710000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>evildictaitor wrote:</strong>
<hr size="1">
<i>&#65279;<br />I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />Why could they not be on the heap?<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br />Because you need to keep a pointer to that object on the heap in order to use it. Or put another way, an object on the heap that is not referenced inside the transitive closure of the variables on the stack is eligible for disposal.<br /></div>
</blockquote>
<br /><br />Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is &quot;known at compile time&quot;?
<br />If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out
<i>dynamically</i>, but it's not something that can be figured out <i>statically </i>
at compile time anymore (which was my whole point).<br /><br />So to support first class transactional variables you cannot say &quot;my program counter is X, therefore here's my set of touched TVars&quot; in O(1) anymore. And if the set of TVars depends on e.g. a filter function applied to a list, then you need to run the filter
 function <i>again</i> to find out which TVars were actually written to (and imagine that the function is very expensive...), or of course assume that all of them were (which would cause needless inhibition of parallelism).<br /><br />So I think my point remains: The proposed alternative implementation would have difficulties with this, whereas just keeping a transaction log would handle this a lot more easily (as you just store which variables you've accessed when you access them).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391336410000000</link>
		<pubDate>Wed, 20 Feb 2008 19:47:21 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391336410000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;<br />Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is &quot;known at compile time&quot;?<br />If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out
<i>dynamically</i>, but it's not something that can be figured out <i>statically </i>
at compile time anymore (which was my whole point).<br /></div>
</blockquote>
<br /><br />No, you only look at variable <i>sites</i>, you don't look at items on the heap.<br /><br />While you may think that you are trying to make your point, I do have to point out again that this technology is not new or theoretical. It is
<i>currently being used</i> in various architectures and in similar mechanisms, such as the CRT exception handler model for x64 processors.<br /><p>posted by evildictaitor</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391500760000000</link>
		<pubDate>Thu, 21 Feb 2008 00:21:16 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391500760000000</guid>
		<dc:creator>evildictaitor</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">sylvan wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>Frank Hileman wrote:</strong>
<hr size="1">
<i>&#65279;<br /><br />If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not
 matter what form of concurrency you use, locks, message passing, transactions, it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.</i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />No, it's not serialized by design, in fact it's (deliberately) extremely parallel by design, with the occasional rare conflict. You have tens of thousands of objects, most of which don't care one bit about that barrel, but sometimes one of them does, and even
 more rarely two or more of them do. <br />The point is that the mere infinitesimal <i>possibility </i>of conflicts causes 100% serialization when you use message passing, whereas with transactions you can run in parallel, and deal with those rare cases of conflicts when and only when they actually occur.<br /><br />If it's just the case of a single barrel you may be able to hack your own optimistic transactional memory on top of the messages (e.g. you have one message which does not block that you can use to check if you need to update the barrel, and if so you just do
 it again with the atomic version - that way 99.9% of the objects would just decide that they don't care about the barrel at all and leave it alone), but it gets much worse in real world scenarios. In practice you'll often have each object want to read N unspecified
 objects from the world, and modify M other unspecified objects in the world (which may or may not overlap with the N that you read). There is no way to know up front which objects you need to read/modify, you only know the exact set of objects that was needed
<i>after</i> the operation has occurred. All this has to happen atomically, naturally, which means that with message passing you'll be forced to have a single service guarding &quot;The World&quot;, and each object's operations on the world will be entirely serialized.
 It's simply impossible to do this concurrently if your world is guarded by a message process, even though the number of
<i>actual</i> conflicts that these atomic operations have is very, very low.<br />And again, with transactional memory, the problem simply disappears and you get near linear speedup as you add more CPUs.<br /><br />Also, I didn't &quot;design&quot; the problem to be difficult for message passing, it just <i>
was</i> difficult for message passing all by itself. Sometimes the thing you're simulating just isn't suited to message passing. You can't blame the problem because the language doesn't offer a good way of solving it!<br /><br />Look, I'm the biggest FP advocate there is. I like Erlang et al. as much as the next guy (though my favourite language is Haskell), but the fact of the matter is that there are real problems that cannot be solved with message passing. In my experience, most
 applications that are actually concurrent by nature (servers, etc.) can use message passing to good effect, but when you try to speed up non-concurrent applications the instances where your problems map nicely to threads and messages start to become more rare.
 We can't just ignore these problems (again, Amdahl's law won't let us), we need to provide a solution for them too. That's why we need many ways of doing these things. In most cases you can be data parallel, in some cases you need task based parallelism,
 and in yet fewer cases you can use threads and message passing, and in fewer cases still you need transactional memory. We can't leave any of these out though, as that would disqualify the language from being considered &quot;general purpose&quot; w.r.t. concurrency/parallelism,
 IMO.<br /></div>
</blockquote>
<br /><br />Games work exceptionally well with message passing. As I discussed regarding your previous AI scenario, if the message to the barrel (change state) includes a &quot;barrel state stamp&quot; then the barrel knows it can change state, assuming that stamp matches the current
 barrel state. If this is hard to envision, imagine the barrel increments a private counter every time it changes important state (important to the message sender). That counter is the state stamp.<br /><br />When the barrel receives your state changing message, it can process it&nbsp;as long as there are no other similar messages competing. If the state stamp has changed it&nbsp;must&nbsp;tell the sender that the message was discarded, as it is no longer valid. Then the&nbsp;sender can
 recompute or abandon.<br /><br />This is essentially what happens with transactions as well. The messages as I describe are a type of optimistic transaction. Scalability in games is probably achieved by minimizing choke points, regardless of the concurrency mechanism used.<br /><br />There is no more serialization with message passing than with transactions. If you have many processes modifying the same mutable state in your barrel, and all these modifications are dependent on the state of the barrel (ie invalid if state has changed) you
 have a contention problem that is not solved by any concurrency mechanism. Messaging does not make this worse. If most processes are not modifying your barrel state, they are not barrel state dependent, and there is no serialization problem as you describe.
 Then both messaging and transactions work well.<br /><br />Your second argument, regarding N reads and M writes, you claim is solved better by a transaction. All you are doing is breaking up the granularity. You can do the exact same thing with messages. Instead of&nbsp;treating the whole world as one process, break up the
 writes into messages&nbsp;to M processes. If all must be done atomically, then you do need a transactional system built using messages. Ultimately&nbsp;such a&nbsp;transactional system must commit writes.
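<br /><br />As a minimal sketch of the state-stamp scheme described above (not code from any real game; the <code>Barrel</code> class, the message names and the queue-based mailbox are all assumptions made for illustration), an optimistic update over messages might look roughly like this in Python:

```python
import queue
import threading


class Barrel:
    """Hypothetical actor that owns its state and a private change counter."""

    def __init__(self, health):
        self.health = health
        self.stamp = 0                    # incremented on every state change
        self.inbox = queue.Queue()

    def run(self):
        # Messages are handled one at a time, so no lock is needed.
        while True:
            msg = self.inbox.get()
            if msg is None:               # shutdown signal
                break
            kind, payload, reply = msg
            if kind == "read":
                reply.put((self.stamp, self.health))
            elif kind == "write":
                seen_stamp, new_health = payload
                if seen_stamp == self.stamp:
                    self.health = new_health
                    self.stamp += 1
                    reply.put(True)       # accepted
                else:
                    reply.put(False)      # stale: sender recomputes or abandons


def damage(barrel, amount):
    """Read, compute, then write with the stamp; retry if the write was stale."""
    reply = queue.Queue()
    while True:
        barrel.inbox.put(("read", None, reply))
        stamp, health = reply.get()
        barrel.inbox.put(("write", (stamp, health - amount), reply))
        if reply.get():
            return


barrel = Barrel(health=100)
actor = threading.Thread(target=barrel.run)
actor.start()
senders = [threading.Thread(target=damage, args=(barrel, 1)) for _ in range(20)]
for t in senders:
    t.start()
for t in senders:
    t.join()
barrel.inbox.put(None)
actor.join()
print(barrel.health, barrel.stamp)
```

Every sender eventually gets exactly one write accepted, so this run should print <code>80 20</code>. Senders that never touch the barrel never queue behind it.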
<br /><br />One way you might do it is a two stage commit. First the supervisor (modifying) process sends a message to each M process acquiring a lock. Assuming a message is sent back with succeed or fail, the next step is to send a message to each M process to actually
 mutate data. During that time each M process cannot be modified by anything else (ie is temporarily owned by the supervisor process). After mutation is complete, the lock is freed. This only requires two messages to each M process and one message back from
 each M process. This is a fine grained form of locking and does not block any other processes from modifying any other mutable data in the meantime. Nor do the locks prevent reading messages from being processed. Only a writing or lock acquiring message would
 fail, and only to those specific pieces of data.<br /><br />The point is you can do anything you wish with message passing. It is a fundamental building block, and can scale as well as your design permits. If your design has no need for atomic composite commits, you can do that. If you do need atomic composite commits,
 you can do that as well.<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391746390000000</link>
		<pubDate>Thu, 21 Feb 2008 07:10:39 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391746390000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">evildictaitor wrote:</div>
<div class="quoteBody">&#65279;
<blockquote>
<table class="quoteTable">
<tbody>
<tr>
<td valign="top" width="10"><img src="/Themes/AlmostGlass/images/icon-quote.gif"></td>
<td class="txt3"><strong>sylvan wrote:</strong>
<hr size="1">
<i>&#65279;<br />Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is &quot;known at compile time&quot;?<br />If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out
<i>dynamically</i>, but it's not something that can be figured out <i>statically </i>
at compile time anymore (which was my whole point).<br /></i></td>
</tr>
</tbody>
</table>
</blockquote>
<br /><br />No, you only look at variable <i>sites</i>, you don't look at items on the heap.<br /><br />While you may think that you are trying to make your point, I do have to point out again that this technology is not new or theoretical. It is
<i>currently being used</i> in various architectures and in similar mechanisms, such as the CRT exception handler model for x64 processors.<br /></div>
</blockquote>
<br /><br />Well you would have to look at variables on the heap, if that's where the transactional variables are.<br />Again, I'm not saying that it wouldn't work, just that it wouldn't work very <i>well.</i> Using it for exception handling is obviously quite different (you don't frequently need a list of all variables touched, all you need is a list of the root pointers for
 the current stack frame so you can call destructors, right?).<br /><br />The fact remains, that with your approach you have an unbounded amount of work to do to find which transactional variables have been used, which is my point. Using a log you already have them available immediately.<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391784960000000</link>
		<pubDate>Thu, 21 Feb 2008 08:14:56 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391784960000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody">&#65279;<br />Games work exceptionally well with message passing. <br /></div>
</blockquote>
<br /><br />Actually no they don't. I do this for a living, which is precisely why I used them as an example.<br /><br /><blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody">&#65279;<br /><br /><br />Your second argument, regarding N reads and M writes, you claim is solved better by a transaction. All you are doing is breaking up the granularity. You can do the exact same thing with messages. Instead&nbsp;treating the whole world as one process, break up the
 writes into messages&nbsp;to M processes. <br /></div>
</blockquote>
<br /><br />But the N and M will be <i>different</i> for each object. One thread might grab all the barrels in the game and do something to them, another might grab all the players
<i>and</i> barrels in the game, etc. etc. So just statically breaking the world up into M processes won't work well.<br /><br />Essentially what you have is a big set of tens of thousands of objects, each of these must be able to atomically modify small sets of the others. There is no simple way of splitting this up into sub-processes (and even if you do you would need to &quot;lock&quot; each
 sub-process you're interested in, which scales poorly). Simple locking (even if implemented with threads) leads to deadlock etc, here. With a simple message passing system you will end up serialising all of these updates.<br /><br /><blockquote>
<div class="quoteAuthor">Frank Hileman wrote:</div>
<div class="quoteBody">&#65279;<br />The point is you can do anything you wish with message passing. It is a fundamental building block, and can scale as well as your design permits. If your design has no need for atomic composite commits, you can do that. If you do need atomic composite commits,
 you can do that as well.</div>
</blockquote>
<br /><br />Well obviously you can use threads to simulate locks, and then use that to build an STM library, but that wasn't really the point. The point was that message passing itself isn't really suitable for all problems. If you just end up building ad hoc (inefficient)
 transactional systems on top of the message passing, then obviously you would be better off to have an efficient system instead (however cheap the threads are, they're not cheap enough to simulate a lock efficiently).<br /><p>posted by sylvan</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391791890000000</link>
		<pubDate>Thu, 21 Feb 2008 08:26:29 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633391791890000000</guid>
		<dc:creator>sylvan</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>There is no way to tell if an efficient message passing system would be as fast or faster than transactional memory for your scenario, without building and trying it out. But I suspect there is some bias against the idea of lightweight processes and efficient
 message passing, because we see so few common implementations with that level of efficiency.<br /><br />If you read some of the links I pointed out earlier, they explain how message passing is a lower level building block than transactional memory. Being lower level, it can be faster as well, when you do not need full transactions.<br /><br />I have nothing against transactional memory except that it helps preserve existing serial ways of thinking. Share nothing, message passing concurrency, seems to balance and scale almost automatically.</p>
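<p>To make the "lower level building block" point concrete, here is a rough sketch of the two stage commit over messages described earlier in this thread. It is only an illustration under stated assumptions: the <code>Cell</code> class, the message names and the retry loop are hypothetical, and the <code>release</code> message for a failed lock attempt is an extra assumption that the earlier description did not spell out.</p>

```python
import queue
import threading


class Cell:
    """Hypothetical process guarding one piece of mutable data."""

    def __init__(self, value):
        self.value = value
        self.owner = None                 # supervisor currently holding the lock
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self.run)
        self.thread.start()

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:               # shutdown signal
                break
            kind, sender, payload, reply = msg
            if kind == "read":            # reads are served even while locked
                reply.put(self.value)
            elif kind == "lock":
                granted = self.owner is None
                if granted:
                    self.owner = sender
                reply.put(granted)
            elif kind == "mutate" and self.owner == sender:
                self.value = payload(self.value)
                self.owner = None         # lock is freed once mutation is done
            elif kind == "release" and self.owner == sender:
                self.owner = None         # assumption: undo path for a failed attempt


def atomic_update(name, cells, change):
    """Stage 1: lock every cell. Stage 2: mutate them all, or back off."""
    reply = queue.Queue()
    held = []
    for cell in cells:
        cell.inbox.put(("lock", name, None, reply))
        if reply.get():
            held.append(cell)
        else:
            for other in held:
                other.inbox.put(("release", name, None, None))
            return False
    for cell in cells:
        cell.inbox.put(("mutate", name, change, None))
    return True


def supervisor(name, cells, rounds):
    done = 0
    while done != rounds:                 # retry until every update has gone through
        if atomic_update(name, cells, lambda v: v + 1):
            done += 1


a, b = Cell(0), Cell(0)
workers = [threading.Thread(target=supervisor, args=(f"s{i}", [a, b], 5))
           for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
for cell in (a, b):
    cell.inbox.put(None)                  # shut the cell processes down
    cell.thread.join()
print(a.value, b.value)
```

<p>Four supervisors each apply five composite updates, so this run should print <code>20 20</code>. Lock requests fail fast instead of blocking, which avoids deadlock at the cost of retries under contention.</p>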
<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633392519140000000</link>
		<pubDate>Fri, 22 Feb 2008 04:38:34 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633392519140000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
	<item>
		<title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title>
		<description>
			<![CDATA[
<p>Message passing in games:</p>
<p><a href="http://softwaremaven.innerbrane.com/2007/12/python-versus-erlang-for-mmog.html">http://softwaremaven.innerbrane.com/2007/12/python-versus-erlang-for-mmog.html</a><br /><br />That is only distributed server stuff. It seems message passing would work well on clients as well, given that the game might be modeled as many independent actors interacting with one another.</p>
<p>posted by Frank Hileman</p>]]>
		</description>
		<link>http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633392526620000000</link>
		<pubDate>Fri, 22 Feb 2008 04:51:02 GMT</pubDate>
		<guid isPermaLink="true">http://channel9.msdn.com/Shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism#c633392526620000000</guid>
		<dc:creator>Frank Hileman</dc:creator>
	</item>
</channel>
</rss>