<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:evnet="http://www.mscommunities.com/rssmodule/"><channel><title>Comment Feed for Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism (Going Deep on Channel 9)</title><atom:link rel="self" type="application/rss+xml" href="http://channel9.msdn.com/shows/going+deep/burton-smith-on-general-purpose-super-computing-and-the-history-and-future-of-parallelism/rss/default.aspx" /><image><url>http://mschnlnine.vo.llnwd.net/d1/Dev/App_Themes/C9/images/feedimage.png</url><title>Comment Feed for Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism (Going Deep on Channel 9)</title><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/</link></image><description>Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</description><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/</link><language>en-us</language><pubDate>Fri, 22 Feb 2008 04:51:02 GMT</pubDate><lastBuildDate>Fri, 22 Feb 2008 04:51:02 GMT</lastBuildDate><generator>EvNet (EvNet, Version=1.0.3243.35083, Culture=neutral, PublicKeyToken=null)</generator><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;p&gt;Message passing in games:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://softwaremaven.innerbrane.com/2007/12/python-versus-erlang-for-mmog.html"&gt;http://softwaremaven.innerbrane.com/2007/12/python-versus-erlang-for-mmog.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;That is only distributed server stuff. It seems message passing would work well on clients as well, given that the game might be modeled as many independent actors interacting with one another.&lt;/p&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390340</link><pubDate>Fri, 22 Feb 2008 04:51:02 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390340</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/390340/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>Message passing in games:
http://softwaremaven.innerbrane.com/2007/12/python-versus-erlang-for-mmog.htmlThat is only distributed server stuff. It seems message passing would work well on clients as well, given that the game might be modeled as many independent actors interacting with one another.</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/390340/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;p&gt;There is no way to tell if an efficient message passing system would be as fast or faster than transactional memory for your scenario, without building and trying it out. But I suspect there is some bias against the idea of lightweight processes and efficient message passing, because we see so few common implementations with that level of efficiency.&lt;br /&gt;&lt;br /&gt;If you read some of the links I pointed out earlier, they explain how message passing is a lower level building block than transactional memory. Being lower level, it can be faster as well, when you do not need full transactions.&lt;br /&gt;&lt;br /&gt;I have nothing against transactional memory except that it helps preserve existing serial ways of thinking. Share nothing, message passing concurrency, seems to balance and scale almost automatically.&lt;/p&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390334</link><pubDate>Fri, 22 Feb 2008 04:38:34 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390334</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/390334/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>There is no way to tell if an efficient message passing system would be as fast or faster than transactional memory for your scenario, without building and trying it out. But I suspect there is some bias against the idea of lightweight processes and efficient message passing, because we see so few&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/390334/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;Games work exceptionally well with message passing. &lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Actually no they don't. I do this for a living, which is precisely why I used them as an example.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Your second argument, regarding N reads and M writes, you claim is solved better by a transaction. All you are doing is breaking up the granularity. You can do the exact same thing with messages. Instead&amp;nbsp;treating the whole world as one process, break up the writes into messages&amp;nbsp;to M processes. &lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;But the N and M will be &lt;i&gt;different&lt;/i&gt; for each object. On thread might grab all the barrels in the game an do something to them, another might grab all the players &lt;i&gt;and&lt;/i&gt; barrels in the games, etc. etc. So just statically breaking the world up into M processes won't work well.&lt;br /&gt;&lt;br /&gt;Essentially what you have is a big set of tens of thousands of objects, each of these must be able to atomically modifiy small sets of the others. There is no simple way of splitting this up into sub-processes (and even if you do you would need to "lock" each sub-process you're interested in, which scales poorly). Simple locking (even if implemented with threads) leads to deadlock etc, here. With a simple message passing system you will end up serialising all of these updates.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;The point is you can do anything you wish with message passing. It is a fundamental building block, and can scale as well as your design permits. If your design has no need for atomic composite commits, you can do that. If you do need atomic composite commits, you can do that as well.&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Well obviously you can use threads to simulate locks, and then use that to build an STM library, but that wasn't really the point. The point was that message passing itself isn't really suitable for all problems. If you just end up building ad hoc (inefficient) transactional systems on top of the message passing, then obviously you would be better off to have an efficient system instead (however cheap the threads are, they're not cheap enough to simulate a lock efficiently).&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390037</link><pubDate>Thu, 21 Feb 2008 08:26:29 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390037</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/390037/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	Frank Hileman wrote:
				﻿Games work exceptionally well with message passing. 
		
		
		Actually no they don't. I do this for a living, which is precisely why I used them as an example.Frank Hileman wrote:﻿Your second argument, regarding N reads and M writes, you claim is solved better by a&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/390037/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is "known at compile time"?&lt;br /&gt;If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out &lt;i&gt;dynamically&lt;/i&gt;, but it's not something that can be figured out &lt;i&gt;statically &lt;/i&gt;at compile time anymore (which was my whole point).&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;No, you only look at variable &lt;i&gt;sites&lt;/i&gt;, you don't look at items on the heap.&lt;br /&gt;&lt;br /&gt;While you may think that you are trying to make your point, I do have to point out again that this technology is not new or theoretical. It is &lt;i&gt;currently being used&lt;/i&gt; in various architectures and in simmilar mechanisms, such as the CRT exception handler model for x64 processors.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Well you would have to look at variables on the heap, if that's where the transactional variables are.&lt;br /&gt;Again, I'm not saying that it wouldn't work, just that it wouldn't work very &lt;i&gt;well.&lt;/i&gt; Using it for exception handling is obviously quite different (you don't frequently need a list of all variables touched, all you need is a list of the root pointers for the current stack frame so you can call destructors, right?).&lt;br /&gt;&lt;br /&gt;The fact remains, that with your approach you have an unbounded amount of work to do to find which transactional variables have been used, which is my point. Using a log you already have them available immediately.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390035</link><pubDate>Thu, 21 Feb 2008 08:14:56 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390035</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/390035/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿sylvan wrote:﻿Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is "known at compile time"?If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/390035/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿ 
&lt;blockquote&gt;
&lt;table&gt;

&lt;tr&gt;
&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Frank Hileman wrote:&lt;/strong&gt; 

&lt;i&gt;﻿&lt;br /&gt;&lt;br /&gt;If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not matter what form of concurrency you use, locks, message passing, transactions, it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;No, it's not serialized by design, in fact it's (deliberately) extremely parallel by design, with the occasional rare conflict. You have tens of thousands of objects, most of which don't care one bit about that barrel, but sometimes one of them does, and even more rarely two or more of them do. &lt;br /&gt;The point is that the mere infinitismal &lt;i&gt;possibility &lt;/i&gt;of conflicts cause 100% serialization when you use message passing, whereas with transactions you can run in parallel, and deal with those rare cases of conflicts when and only when they actually occur.&lt;br /&gt;&lt;br /&gt;If it's just the case of a single barrel you may be able to hack your own optimistic transactional memory on top of the messages (e.g. you have one message which does not block that you can use to check if you need to update the barrel, and if so you just do it again with the atomic version - that way 99.9% of the objects would just decide that they don't care about the barrel at all and leave it alone), but it gets much worse in real world scenarios. In practice you'll often have each object want read N unspecified objects from the world, and modify M other unspecified objects in the world (which may or may not overlap with the N that you read). There is no way to know up front which objects you need to read/modify, you only know the exact set of objects that was needed &lt;i&gt;after&lt;/i&gt; the operation has occured. All this has to happen atomically, naturally, which means that with message passing you'll be forced to have a single service guarding "The World", and each object's operations on the world will be entirely serialized. It's simply impossible to do this concurrently if your world is guarded by a message process, even though the number of &lt;i&gt;actual&lt;/i&gt; conflicts that these atomic operations have are very very low.&lt;br /&gt;And again, with transactional memory, the problem simply disappears and you get near linear speedup as you add more CPU:s.&lt;br /&gt;&lt;br /&gt;Also, I didn't "design" the problem to be difficult for message passing, it just &lt;i&gt;was&lt;/i&gt; difficult for message passing all by itself. Sometimes the thing you're simulating just isn't suitable to message passing. You can't blame the problem because the language doesn't offer a good way of solving it!&lt;br /&gt;&lt;br /&gt;Look, I'm the biggest FP advocate there is. I like Erlang et al. as much as the next guy (though my favourite language is Haskell), but the fact of the matter is that there are real problems that can not be solved with message passing. In my experience, most applications that are actually concurrent by nature (servers, etc.) can use message passing good effect, but when you try to speed up non-concurrent applications the instances where your problems map nicely to threads and messages start to become more rare. We can't just ignore these problems (again, Almdahl's law won't let us), we need to provide a solution for them too. That's why we need many ways of doing these things. In most cases you can be data parallel, in some cases you need task based parallelism, and in yet fewer cases you can use threads and message passing, and in fewer cases still you need transactional memory. We can't leave any of these out though, as that would disqualify the language from being considered "general purpose" w.r.t. concurrency/parallelism, IMO.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Games work exceptionally well with message passing. As I discussed regarding your previous ai scenario, if the message to the barrel (change state) includes a "barrel state stamp" then the barrel knows it can change state, assuming that stamp matches the current barrel state. If this is hard to envision, imagine the barrel increments a private counter every time it changes important state (important to the message sender). That counter is the state stamp.&lt;br /&gt;&lt;br /&gt;When the barrel receives your state changing message, it can process it&amp;nbsp;as long as there is not other similar messages competing. If the state stamp has changed it&amp;nbsp;must&amp;nbsp;tell the sender that message was discarded, as it is no longer valid. Then the&amp;nbsp;sender can recompute or abandon.&lt;br /&gt;&lt;br /&gt;This is essentially what happens with transactions as well. The messages as I describe are a type of optimistic transaction. Scalability in games is probably acheived by minimizing choke points, regardless of the concurrency mechanmism used.&lt;br /&gt;&lt;br /&gt;There is no more serialization with message passing than with transactions. If you have many processes modifying the same mutable state in your barrel, and all these modifications are dependent on the state of the barrel (ie invalid if state has changed) you have a contention problem that is not solved by any concurrency mechanism. Messaging does not make this worse. If most processes are not modifying your barrel state, they are not barrel state dependent, and there is no serialization problem as you describe. Then both messaging and transations work well.&lt;br /&gt;&lt;br /&gt;Your second argument, regarding N reads and M writes, you claim is solved better by a transaction. All you are doing is breaking up the granularity. You can do the exact same thing with messages. Instead&amp;nbsp;treating the whole world as one process, break up the writes into messages&amp;nbsp;to M processes. If all must be done atomically, then you do need a transactional system built using messages. Ultimately&amp;nbsp;such a&amp;nbsp;transactional system must commit writes. &lt;br /&gt;&lt;br /&gt;One way you might do it is a two stage commit. First the supervisor (modifying) process sends a message to each M process acquiring a lock. Assuming a message is sent back with succeed or fail, the next step is to send a message to each M process to actually mutate data. During that time each M process cannot be modified by anything else (ie is temporarily owned by the supervisor process). After mutation is complete, the lock is freed. This only requires two messages to each M process and one message back from each M process. This is a fine grained form of locking and does not block any other processes from modifying any other mutable data in the meantime. Nor do the locks prevent reading messages from being processed. Only a writing or lock acquiring message would fail, and only to those specific pieces of data.&lt;br /&gt;&lt;br /&gt;The point is you can do anything you wish with message passing. It is a fundamental building block, and can scale as well as your design permits. If your design has no need for atomic composite commits, you can do that. If you do need atomic composite commits, you can do that as well.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390030</link><pubDate>Thu, 21 Feb 2008 07:10:39 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=390030</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/390030/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿ 





Frank Hileman wrote: 

﻿If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/390030/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is "known at compile time"?&lt;br /&gt;If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out &lt;i&gt;dynamically&lt;/i&gt;, but it's not something that can be figured out &lt;i&gt;statically &lt;/i&gt;at compile time anymore (which was my whole point).&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;No, you only look at variable &lt;i&gt;sites&lt;/i&gt;, you don't look at items on the heap.&lt;br /&gt;&lt;br /&gt;While you may think that you are trying to make your point, I do have to point out again that this technology is not new or theoretical. It is &lt;i&gt;currently being used&lt;/i&gt; in various architectures and in simmilar mechanisms, such as the CRT exception handler model for x64 processors.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389988</link><pubDate>Thu, 21 Feb 2008 00:21:16 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389988</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389988/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is "known at compile time"?If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified,&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389988/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;evildictaitor wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;Why could they not be on the heap?&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;Because you need to keep a pointer to that object on the heap in order to use it. Or put another way, an object on the heap that is not referenced inside the transitive closure of the variables on the stack is elegiable for disposal.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Yes, of course, but at what point should we stop saying that the set of transactional variables that has been read/written is "known at compile time"? &lt;br /&gt;If you actually have to trace down arbitrarily deep into a data structure in order to find the TVar that was actually modified, then I would say that the set of TVars can be figured out &lt;i&gt;dynamically&lt;/i&gt;, but it's not something that can be figured out &lt;i&gt;statically &lt;/i&gt;at compile time anymore (which was my whole point).&lt;br /&gt;&lt;br /&gt;So to support first class transactional variable you cannot say "my program counter is X, therefore here's my set of touched TVars" in O(1) anymore. And if the set of TVars depend on e.g. a filter function applied to a list, then you need to run the filter function &lt;i&gt;again&lt;/i&gt; to find out which TVars where actually written to (and imagine that the function is very expensive...), or of course assume that all of them were (which would cause needless inhibition of parallelism).&lt;br /&gt;&lt;br /&gt;So I think my point remains: The proposed alternative implementation would have difficulties with this, whereas just keeping a transaction log would handle this a lot easier (as you just store which variables you've accessed when you access them).&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389910</link><pubDate>Wed, 20 Feb 2008 19:47:21 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389910</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389910/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿sylvan wrote:﻿evildictaitor wrote:﻿I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).Why could they not be on the heap?Because you need to keep a pointer to that object on the heap in order to use it. Or put&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389910/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;evildictaitor wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;Why could they not be on the heap?&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;Because you need to keep a pointer to that object on the heap in order to use it. Or put another way, an object on the heap that is not referenced inside the transitive closure of the variables on the stack is elegiable for disposal.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;
Yould read a TVar containing a filter function, and another TVar
containing a list of TVars, and then filter this list based on the
filter function to get a reduced list of TVars, and then modify each of
them based on the value of yet another TVar, etc. etc. How could you
possibly know which values have been written to when the IP points
"past" this code? The number of TVars that you have modified is not
statically known, the filter function is not statically known, and the
list of TVars itself is not statically known.&lt;br /&gt;
&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;
No, but you know that the list of TVars was read, and you have the
TVars site referenced locally in the function locals list. You can see
then that the TVars have been changed, and that when you want to do a
mutate on the TVars object you will need to make a copy of the array
(which is an array of pointers and thus not expensive) for either
backup or for preemptive mutation.&lt;br /&gt;
&lt;br /&gt;
As to whether or not the filter function is known, it is clear that the filter function &lt;i&gt;is &lt;/i&gt;known
statically, since in managed languages you must define all of the code
that may run over the course of the program statically prior the
program execution.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;evildictaitor wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;Again, I can't think of a good reason to spawn multiple threads within a transaction.&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I haven't said anything about spawning multiple threads within a transaction?&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;In which case your mechanism of passing messages must either be considered to be a side-effect (which is another problem, but one that can also be solved) or just a linear sequential operation, which is no different to any other transaction based function call.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389900</link><pubDate>Wed, 20 Feb 2008 19:06:11 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389900</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389900/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿evildictaitor wrote:﻿I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).Why could they not be on the heap?
		
		Because you need to keep a pointer to that object on the heap in order to use it. Or put another way, an&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389900/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;&lt;br /&gt;If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not matter what form of concurrency you use, locks, message passing, transactions, it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;No, it's not serialized by design, in fact it's (deliberately) extremely parallel by design, with the occasional rare conflict. You have tens of thousands of objects, most of which don't care one bit about that barrel, but sometimes one of them does, and even more rarely two or more of them do. &lt;br /&gt;The point is that the mere infinitismal &lt;i&gt;possibility &lt;/i&gt;of conflicts cause 100% serialization when you use message passing, whereas with transactions you can run in parallel, and deal with those rare cases of conflicts when and only when they actually occur.&lt;br /&gt;&lt;br /&gt;If it's just the case of a single barrel you may be able to hack your own optimistic transactional memory on top of the messages (e.g. you have one message which does not block that you can use to check if you need to update the barrel, and if so you just do it again with the atomic version - that way 99.9% of the objects would just decide that they don't care about the barrel at all and leave it alone), but it gets much worse in real world scenarios. In practice you'll often have each object want read N unspecified objects from the world, and modify M other unspecified objects in the world (which may or may not overlap with the N that you read). There is no way to know up front which objects you need to read/modify, you only know the exact set of objects that was needed &lt;i&gt;after&lt;/i&gt; the operation has occured. All this has to happen atomically, naturally, which means that with message passing you'll be forced to have a single service guarding "The World", and each object's operations on the world will be entirely serialized. It's simply impossible to do this concurrently if your world is guarded by a message process, even though the number of &lt;i&gt;actual&lt;/i&gt; conflicts that these atomic operations have are very very low.&lt;br /&gt;And again, with transactional memory, the problem simply disappears and you get near linear speedup as you add more CPU:s.&lt;br /&gt;&lt;br /&gt;Also, I didn't "design" the problem to be difficult for message passing, it just &lt;i&gt;was&lt;/i&gt; difficult for message passing all by itself. Sometimes the thing you're simulating just isn't suitable to message passing. You can't blame the problem because the language doesn't offer a good way of solving it!&lt;br /&gt;&lt;br /&gt;Look, I'm the biggest FP advocate there is. I like Erlang et al. as much as the next guy (though my favourite language is Haskell), but the fact of the matter is that there are real problems that can not be solved with message passing. In my experience, most applications that are actually concurrent by nature (servers, etc.) can use message passing good effect, but when you try to speed up non-concurrent applications the instances where your problems map nicely to threads and messages start to become more rare. We can't just ignore these problems (again, Almdahl's law won't let us), we need to provide a solution for them too. That's why we need many ways of doing these things. In most cases you can be data parallel, in some cases you need task based parallelism, and in yet fewer cases you can use threads and message passing, and in fewer cases still you need transactional memory. We can't leave any of these out though, as that would disqualify the language from being considered "general purpose" w.r.t. concurrency/parallelism, IMO.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389886</link><pubDate>Wed, 20 Feb 2008 18:23:51 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389886</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389886/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	Frank Hileman wrote:
				﻿If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not matter what&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389886/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿How could these message be run in parallel, if each of these message requires atomic updates? I.e I need to do "Find position of explosive barrel, if I collide with it then explod it", it's no good if someon else does this at the same time and moves the barrel after I've read the position but before I explode it! How could you possibly know that two threads who both read data from the game world (for example) will not decide to write to the same place as a result of that read? Atomic access is key, and with message passing that means the accesses get serialised.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;If you have decided by design that all messages to the barrel modify its state, and that all messages are dependent upon the state of the barrel (ie are invalid if the barrel has been modified), you have serialized access to the barrel by design. It does not matter what form of concurrency you use, locks, message passing, transactions, it is the same problem, and is the same problem a CPU has when determining the dispatch order of instructions that write to a memory location.&lt;br /&gt;&lt;br /&gt;The description I gave previously of the ai and world processes applies to your barrel scenario. Ideally the game is designed so that serialization is not needed by design -- ie the barrel explodes regardless of whether it has moved. If you decide to create a choke point in your design, and all processes are&amp;nbsp;hitting that&amp;nbsp;spot at the same time,&amp;nbsp;you have designed something that does not parallelize well, regardless of the concurrency mechanism.&lt;br /&gt;&lt;br /&gt;Massively multiplayer games use messaging extensively, and probably avoid that type of "serialization by design".</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389855</link><pubDate>Wed, 20 Feb 2008 16:13:45 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389855</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389855/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿How could these message be run in parallel, if each of these message requires atomic updates? I.e I need to do "Find position of explosive barrel, if I collide with it then explod it", it's no good if someon else does this at the same time and moves the barrel after I've read&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389855/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;there are variations on this idea, e.g. lookup Tim Harris' research you'll see an approach
where the writes in the transactions actually write directly to the
"live" objects - optimizing for successful commits, and the transation
logs are only used for rolling back if the commit fails&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;It would surprise me if this sped things up, since most objects are accessed via pointer (and certainly are in .NET) and thus whether you copy first and mutate on the new object or make a backup so you can rollback is merely a question of whether you swap the pointers at the point of a transactional write. But maybe there's some other benefit to this, I don't know.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;It does indeed seem to speed things up. He gets only about 50% overhead (compared to no transactions) for short lived transactions, which is much better than locks in the benchmarks in the paper and very impressive. There are other optimizations too, though.&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;div&gt;evildictaitor wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;Why could they not be on the heap?&lt;br /&gt;Yould read a TVar containing a filter function, and another TVar containing a list of TVars, and then filter this list based on the filter function to get a reduced list of TVars, and then modify each of them based on the value of yet another TVar, etc. etc. How could you possibly know which values have been written to when the IP points "past" this code? The number of TVars that you have modified is not statically known, the filter function is not statically known, and the list of TVars itself is not statically known.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;evildictaitor wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;Again, I can't think of a good reason to spawn multiple threads within a transaction.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I haven't said anything about spawning multiple threads within a transaction?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;Again, assuming you do not mean "threads" but some lightweight
process concept, this is a problem for the message dispatcher, not the
programmer. The dispatcher can recognize a sequence of many messages
retrieving data, and they can all be run in parallel. There is no need
to serialize the message processing until state is mutated. That is why
messaging and functional programming work so well together. Any locking
or true OS threads are hidden to the programmer.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;How could these message be run in parallel, if each of these message requires atomic updates? I.e I need to do "Find position of explosive barrel, if I collide with it then explod it", it's no good if someon else does this at the same time and moves the barrel after I've read the position but before I explode it! How could you possibly know that two threads who both read data from the game world (for example) will not decide to write to the same place as a result of that read? Atomic access is key, and with message passing that means the accesses get serialised.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389781</link><pubDate>Wed, 20 Feb 2008 08:19:52 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389781</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389781/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿sylvan wrote:﻿there are variations on this idea, e.g. lookup Tim Harris' research you'll see an approach
where the writes in the transactions actually write directly to the
"live" objects - optimizing for successful commits, and the transation
logs are only used for&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389781/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>Anyone interested in comparing message&amp;nbsp;passing concurrency to other techniques may wish to read this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.info.ucl.ac.be/~pvr/bookcc.html"&gt;http://www.info.ucl.ac.be/~pvr/bookcc.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And in particular this paper:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.info.ucl.ac.be/~pvr/flopsPVRarticle.pdf"&gt;http://www.info.ucl.ac.be/~pvr/flopsPVRarticle.pdf&lt;/a&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389777</link><pubDate>Wed, 20 Feb 2008 07:53:30 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389777</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389777/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>Anyone interested in comparing message&amp;nbsp;passing concurrency to other techniques may wish to read this:http://www.info.ucl.ac.be/~pvr/bookcc.htmlAnd in particular this paper:http://www.info.ucl.ac.be/~pvr/flopsPVRarticle.pdf</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389777/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects of those modifies the data). Do you really want each to pass through a protocol with just a single thread "owning" that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without interfering with the execution of any other threads.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Again, assuming you do not mean "threads" but some lightweight process concept, this is a problem for the message dispatcher, not the programmer. The dispatcher can recognize a sequence of many messages retrieving data, and they can all be run in parallel. There is no need to serialize the message processing until state is mutated. That is why messaging and functional programming work so well together. Any locking or true OS threads are hidden to the programmer.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389750</link><pubDate>Wed, 20 Feb 2008 05:28:06 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389750</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389750/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects of those modifies the data). Do you really want each to pass through a protocol with just a single thread&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389750/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;EDIT: Oh, and consider the problem of granularity. Take the example of a game. You're running a thread for an AI character who wants to perform an action on the game world atomically. How do you solve that? Do you ask the World object to retrieve whatever it is the AI wants to access? In other words is the World object's service going to act as basically having one big lock on it for any atomic updates that AI characters want to do? Or do you put the implicit "lock" somewhere further down in the hierarchy (e.g. on individual objects in the world)? If so, how is this different from just having locks on the individual objects? How do you solve the problem of not knowing up front which objects you want to modify (because the set of objects you need depends on the values you find in the first couple of objects you examine)? Aren't we back in the old hornets nest of locks and condition variables again?&lt;br /&gt;So basically we either have to let the toplevel object provide a transactional interface (amounting to a big implicit lock), and basically eliminate any chances of parallelism, or we go back to horribly impractical fine grained locking with all the problems that poses. In this scenario, message passing gives no improvement to us, whereas transactional memory "just works" without any synchronisation burden placed on the programmer at all!&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Ignoring the "thread" implementation idea (since the isolated units are less than threads), and other implementation assumptions,&amp;nbsp;I assume you are stating the problems as: world, one big unit containing&amp;nbsp;smaller pieces of mutating data, and ai characters, many smaller units that interact with this. Let's call the units&amp;nbsp;processes (not process in the OS sense). What interaction do you see between these processes?&lt;br /&gt;&lt;br /&gt;I am not sure how you would define the interaction, but intuitively, I think you selected an&amp;nbsp;example where message passing works well (and is probably why massive multiplayer games use messaging). Lets suppose each ai process receives&amp;nbsp; message&amp;nbsp;G to retrieve data from the world, and sends message&amp;nbsp;S to mutate the world.&lt;br /&gt;&lt;br /&gt;Each message is queued. Assume the world has a set of messages queued, of type S. Assume it is sending out G in between S processing work.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;AI1 requested data and got a message G. It sends out S based on the values in G, lets call it world state 1. In the mean time the world has changed. By the time it processes that S, it is in world state 2.&lt;br /&gt;&amp;nbsp;&lt;br /&gt;1) How can an ai reliably compute the data for S if the world is constantly changing state?&lt;br /&gt;&lt;br /&gt;2) How much parallelism have we lost by putting so much mutable&amp;nbsp;data in the large world process?&lt;br /&gt;&lt;br /&gt;The answer to question 1 is usually in the design of the algorithms. Ideally the difference between world state 1 and 2 does not affect the validity of the ai message S to the world. That is, it may be something like, add this amount of energy to a particle, not, set the absolute value of the energy of this particle. A delta, for a game, is probably more appropriate.&lt;br /&gt;&lt;br /&gt;If the validity of the message is&amp;nbsp;dependent upon&amp;nbsp;the world state, message S could include as data a "world state stamp", that is an identifier showing that S is only valid if the world is still in state 1. Message G, from the world to the state would include this stamp.&lt;br /&gt;&lt;br /&gt;The world process would then ignore S messages with past due stamps, resending G to the sender ai processes with ignored S messages, so they could recompute.&lt;br /&gt;&lt;br /&gt;This would be a contention problem with one giant world process. Which leads to the second question, granularity. If the world is broken up into many smaller processes, the contention could potentially go away, assuming every ai is not working with a&amp;nbsp;world state dependent algorithm, with each&amp;nbsp;trying to modify the same small mutable state with that same algorithm.&lt;br /&gt;&lt;br /&gt;Messaging does not make contention problems go away, it is true. You must still design the system correctly to avoid that.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;I would argue messaging makes contention problems easier to identify and analyze, because now you deal only with the contention problem itself, instead of synthetic problems introduced with traditional muti-threading and locks.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389744</link><pubDate>Wed, 20 Feb 2008 05:13:12 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389744</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389744/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿EDIT: Oh, and consider the problem of granularity. Take the example of a game. You're running a thread for an AI character who wants to perform an action on the game world atomically. How do you solve that? Do you ask the World object to retrieve whatever it is the AI wants to&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389744/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿
&lt;blockquote&gt;
&lt;table&gt;

&lt;tr&gt;
&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;

&lt;i&gt;﻿&lt;br /&gt;Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it &lt;i&gt;commits&lt;/i&gt; does it write to the shared memory. You can then cancel any transactions which might have read or written to these values (and would have inconsistent state if they continued), and simply free their memory and restart them. This has the nicety that you never need to check for your own consistency because it is guarranteed when you start, while you are running, and that any possibly overwrite of data that might render your world view inconsistent will result in your own immediate termination and rescheduling. You would however need to use an appropriate weighting function to avoid livelock.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;I agree.&amp;nbsp; This also is relatively easy to reason about, which is important as code gets complex.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389683</link><pubDate>Wed, 20 Feb 2008 01:37:52 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389683</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389683/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿





sylvan wrote:

﻿Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!The point of transactions is that you never have inconsistent state. Whenever&amp;#8230;</evnet:previewtext><dc:creator>staceyw</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389683/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;there are variations on this idea, e.g. lookup Tim Harris' research you'll see an approach
where the writes in the transactions actually write directly to the
"live" objects - optimizing for successful commits, and the transation
logs are only used for rolling back if the commit fails&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;It would surprise me if this sped things up, since most objects are accessed via pointer (and certainly are in .NET) and thus whether you copy first and mutate on the new object or make a backup so you can rollback is merely a question of whether you swap the pointers at the point of a transactional write. But maybe there's some other benefit to this, I don't know.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;However, the suggested strategy of specifying transactional variables up front would have problems with scenarios where the transaction variables aren't known at compile time.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;I fail to see why. All of the variable sites are known at compile time (by definition they are on the stack).&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;E.g. if you read transactional variables from a transactional channel, and do transactional updates on them. Since those transactional variables can come from any other thread that has access to the channel (and depend on user input or whatever), and you don't even know how many of them you'll get, there's no way of producing a static lookup of which variables have been read based on the instruction pointer.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;One of the points of transactions is they are (effectively) syncronous. That is to say that if a transactional message is passed to a function, either that function must run syncronously and return, or it must guarrantee that the source thread relinquishes control of the object and that the result of the computation is non-side-effecting.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;As far as I can see you either need a transaction log, or you disallow transactional variables as first class citizens. So I'm not sure how well the log-less idea would work in practice (it seems to me that storing transactional variables inside other transactional variables would be very common, e.g. when you send a message to another thread and also supply a message channel for that thread to send its reply).&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;Again, I can't think of a good reason to spawn multiple threads within a transaction. If you start to get to that level of complexity, the effect of a rollback or commit would be effectively to cancel a large number of transactions and do many copy-backs at the point of commit (or abort), which strikes me as needing an alternative model than transactions for efficient use.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389672</link><pubDate>Wed, 20 Feb 2008 01:08:45 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389672</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389672/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿there are variations on this idea, e.g. lookup Tim Harris' research you'll see an approach
where the writes in the transactions actually write directly to the
"live" objects - optimizing for successful commits, and the transation
logs are only used for rolling back if the&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389672/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>duplicate&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389608</link><pubDate>Tue, 19 Feb 2008 22:13:09 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389608</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389608/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>duplicate</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389608/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;table&gt;
						
								&lt;tr&gt;
										&lt;td&gt;
												&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;
										&lt;/td&gt;
										&lt;td&gt;
												&lt;strong&gt;evildictaitor wrote:&lt;/strong&gt;
												
												&lt;i&gt;﻿&lt;br /&gt;The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it &lt;i&gt;commits&lt;/i&gt; does it write to the shared memory.&amp;nbsp; You can then cancel any transactions which might have read or written
to these values (and would have inconsistent state if they continued),
and simply free their memory and restart them.&lt;br /&gt;&lt;/i&gt;
										&lt;/td&gt;
								&lt;/tr&gt;
						
				&lt;/table&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;That's &lt;i&gt;one&lt;/i&gt; implementation strategy, yes, I was talking about was the implementation strategy used in e.g. Haskell STM (which is the other way around) where you have a transaction log that you fill in as you go along, and then in the commit you simply check your read values for consistency with actual "live" values, and if they haven't changed you're good to write out your changes.&lt;br /&gt;This &lt;i&gt;can&lt;/i&gt; lead to inconsistencies. So the key is to kill off these invalid transactions somehow. You &lt;i&gt;could &lt;/i&gt;do it by having each transaction kill off any conflicting transactions when it commits (your approach), but in GHC what they do (as far as I can tell) is that they let those transactions detect that they're inconsistent when &lt;i&gt;they&lt;/i&gt; commit. That leaves the corner case of a transaction that diverges (and thus would never commit) due to an inconsistency. These should be rare, and can be checked as a special case periodically (e.g. on GC or thread switching).&lt;br /&gt;The benefits of this approach, as far as I can tell, is that each transaction only ever looks at its own stuff, and won't have to go look through a bunch of other transactions just to verify that none of them conflict (which they usually won't have). Either approach would work, which is why I said earlier that I would be interested in seeing the other way tried in e.g. Haskell to see if this approach would have benefits (e.g. because invalid transactions would be terminated immediately).&lt;br /&gt;&lt;br /&gt;(there are variations on this idea, e.g. lookup Tim Harris' research you'll see an approach
where the writes in the transactions actually write directly to the
"live" objects - optimizing for successful commits, and the transation
logs are only used for rolling back if the commit fails)&lt;br /&gt;&lt;br /&gt;However, the suggested strategy of specifying transactional variables up front would have problems with scenarios where the transaction variables aren't known at compile time.&lt;br /&gt;E.g. if you read transactional variables from a transactional channel, and do transactional updates on them. Since those transactional variables can come from any other thread that has access to the channel (and depend on user input or whatever), and you don't even know how many of them you'll get, there's no way of producing a static lookup of which variables have been read based on the instruction pointer. As far as I can see you either need a transaction log, or you disallow transactional variables as first class citizens. So I'm not sure how well the log-less idea would work in practice (it seems to me that storing transactional variables inside other transactional variables would be very common, e.g. when you send a message to another thread and also supply a message channel for that thread to send its reply).&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389609</link><pubDate>Tue, 19 Feb 2008 22:05:09 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389609</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389609/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	
						
								
										
												
										
										
												evildictaitor wrote:
												
												﻿The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389609/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;The point of transactions is that you never have inconsistent state. Whenever a transaction does a write, it clones the object and writes to the clone, and only when it &lt;i&gt;commits&lt;/i&gt; does it write to the shared memory. You can then cancel any transactions which might have read or written to these values (and would have inconsistent state if they continued), and simply free their memory and restart them. This has the nicety that you never need to check for your own consistency because it is guarranteed when you start, while you are running, and that any possibly overwrite of data that might render your world view inconsistent will result in your own immediate termination and rescheduling. You would however need to use an appropriate weighting function to avoid livelock.&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389581</link><pubDate>Tue, 19 Feb 2008 19:39:43 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389581</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389581/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!
		
		
		The point of transactions is that you never have inconsistent state. Whenever a transaction does a&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389581/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿ &lt;br /&gt;&lt;br /&gt;Detecting whether a program is in an infinite loop is (in general) impossible. &lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!&lt;br /&gt;With a log-based transactional system this is trivial, just check that the read values in the log match the actual values. If not, then the actual values must have been updated by another transaction, and any transaction that has read from it is invalid. Normally the transaction could detect this inconsistency when it validates before it commits, but if the inconsistency happesn to cause it to diverge then you need to check it somewhere else too (e.g. on thread switching) as it will never commit.&lt;br /&gt;&lt;br /&gt;This way you never have to look at any other transactions when committing, you just make sure that &lt;i&gt;you&lt;/i&gt; are consistent and then commit. And in most cases a slight inconsistency won't lead to a transaction diverging, so that transaction can check itself when it reaches its commit. And if it does diverge it should be rare, so we can just validate the "read" entries in the log periodically.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;Frank Hileman wrote:&lt;/div&gt;&lt;div&gt; In message oriented systems, erlang, scala, etc, the isolated,
message passing units are more lightweight than threads. The idea is to
get away from the traditional thread with its heavy stack. Message
passing is a robust, proven way of isolating state, that can scale
well. Most people believe message passing is easier than locks, which
do not scale well. Message passing in conjunction with functional
programming (erlang) has proven successful for difficult concurrent
telecommunications applications.&lt;br /&gt;&lt;br /&gt;Transactions will always be
needed for some types of applications, but transactional memory is a
new thing, that to me, seems more of a hack to keep holding onto older
ways of working.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;The performance issues with threads isn't just that threads have overhead themselves, it's that many applications simply don't map well to the concept of "threads and messages". This leads to massive contention. I gave several examples in my previous posts, like for example a big game world that all your objects want to update (each of them needs atomic updates). With message passing these accesses will be serialised, as only one object a time can access the world atomically (unless you break the world up into smaller pieces with one "service" each, which means you're effectively simulating locks using threads, so you get all the usual locking issues - deadlock, race conditions etc.). In situations such as this, when there is high contention for a resource (even if it's "large")where atomic access is needed (even the accesses are usually independent), threads simply doesn't give you any parallelism.&lt;br /&gt;&lt;br /&gt;Threads are great when you problem happens to map nicely to small indpendent chunks that can communicate via messages. Many problems simply cannot be written in this way without horrible contention.&lt;br /&gt;So for a general purpose language we can't rely entirely on threads and messages. We need good support for threads and message passing too, but we can't afford to just leave things running sequentially because our programming model doesn't support parallelizing it (almdahl's law tells us that if we do, then this bit of sequential code will become our bottleneck in no time).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389568</link><pubDate>Tue, 19 Feb 2008 19:16:09 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389568</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389568/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿ Detecting whether a program is in an infinite loop is (in general) impossible. 
		
		
		Well you only need to detect problems that has resulted from the program reading inconsistent state. If it diverges for any other reason it's someone else's problem!With a&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389568/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;staceyw wrote:&lt;/div&gt;
				&lt;div&gt;﻿
&lt;blockquote&gt;
&lt;table&gt;

&lt;tr&gt;
&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;

&lt;i&gt;﻿ 
&lt;blockquote&gt;
&lt;table&gt;

&lt;tr&gt;
&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Frank Hileman wrote:&lt;/strong&gt; 

&lt;i&gt;&lt;br /&gt;For me, isolation via message passing sounds more interesting than transactional memory. It has&amp;nbsp;proven useful for decades.&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Why not have both?&lt;br /&gt;In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects of those modifies the data). Do you really want each to pass through a protocol with just a single thread "owning" that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without interfering with the execution of any other threads.&lt;br /&gt;...&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I see what your saying, but lets analyze that.&amp;nbsp; Essentially, that is a classic ReaderWriter lock sample.&amp;nbsp; You need the sync, because you never know who the last action was, a reader or writer.&amp;nbsp; So you still need some kind of sync primitive to protect the invariants.&lt;br /&gt;&lt;br /&gt;One may say, lets keep the message queue for writers, but let readers read properties directly. But here we have couple issues.&amp;nbsp; You still need lock to insure atomic reads (i.e non cache and non torn).&amp;nbsp; Second, you have possible coordination issues the runtime could never know and only your logic knows.&amp;nbsp; For example, your object may intentionally not be popping the queue (sort of a working blocking operation) queue until it completes work for last task - as maybe order is important or maybe is waiting on various replies.&amp;nbsp; There is all kinds of reasons order could be important and you can't reliably short circut that in general case.&amp;nbsp; So only the Object knows best.&amp;nbsp; As far as Perf goes, yes the message passing has some cost.&amp;nbsp; But so do ReaderWriters and Monitors.&amp;nbsp; When you look at the "net" performance, I think it probably is a wash or better with message passing.&amp;nbsp; Costs such as complexity and correctness proof are much less from the start.&amp;nbsp; There is many other benefits you can't get with locks or even trans memory.&amp;nbsp; You can get order symatics, self messaging, throttling,&amp;nbsp;naturally async, pipelining,&amp;nbsp;natural composition, loose binding, and others.&lt;br /&gt;&lt;br /&gt;Think Juval Lowy nailed it here. Every class needs to be a service:&lt;br /&gt;&lt;a href="http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561&gt;http://channel9.msdn.com/ShowPost.aspx?PostID=349561#349561&lt;/a&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;If you think about "processes" that are lighter than a thread, and do not need a stack, messages can possibly be as fast as ordinary method calls, with the added difficulty of enregistration of parameters. A stack is used to save the function call frames. A message queue saves message parameters. In terms of storage, there is not a big difference, it is primarily a question of creating a very fast queue and dispatching mechanism.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;I think it is best to assume the creators of erlang, scala, E, etc have or will address performance, and avoid assumptions about implementation or performance. &lt;br /&gt;&lt;br /&gt;Just looking at the advantages, I agree with you, it is a great way to work.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389542</link><pubDate>Tue, 19 Feb 2008 17:14:32 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389542</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389542/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	staceyw wrote:
				﻿





sylvan wrote:

﻿ 





Frank Hileman wrote: 

For me, isolation via message passing sounds more interesting than transactional memory. It has&amp;nbsp;proven useful for decades.Why not have both?In some cases message passing just doesn't work well at all. For&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389542/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿ 
&lt;blockquote&gt;
&lt;table&gt;

&lt;tr&gt;
&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Frank Hileman wrote:&lt;/strong&gt; 

&lt;i&gt;&lt;br /&gt;For me, isolation via message passing sounds more interesting than transactional memory. It has&amp;nbsp;proven useful for decades.&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Why not have both?&lt;br /&gt;In some cases message passing just doesn't work well at all. For example you may want 10000 objects to be able to query the same data (but only one or two of those objects of those modifies the data). Do you really want each to pass through a protocol with just a single thread "owning" that data? Sounds like a recipe for disaster w.r.t. performance to me. Transactions scale very well in situations like that (which is extremely common in practice), because each thread can do all the reading it wants to without interfering with the execution of any other threads.&lt;br /&gt;&lt;br /&gt;I agree that message passing is ideal &lt;i&gt;when suitable&lt;/i&gt;, but sometimes it just isn't. Also, while threads are sometimes excellent abstractions (even for things where you *don't* care about concurrency they may be the right model), sometimes they just suck. They basically suffer from the same problem that "goto" does: obfuscation of the structure of the program (you have to jump back and forth through messages in different threads, which one may not be known until you run the program, to understand what the program does).&lt;br /&gt;&lt;br /&gt;Now I freely confess that I hadn't seen E (I'll look into it now) so if they give some new cool abstraction that solves all of this I'll recant my statements, but for now I'll say this: there is no one solution, we need multiple solutions to the problem of parallelism/concurrency. In my book the bara minimum is: Nested data parallelism, task-based (purely functional) parallelism, threads with messages, and threads with shared state (here's where you need transactions).&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;In message oriented systems, erlang, scala, etc, the isolated, message passing units are more lightweight than threads. The idea is to get away from the traditional thread with its heavy stack. Message passing is a robust, proven way of isolating state, that can scale well. Most people believe message passing is easier than locks, which do not scale well. Message passing in conjunction with functional programming (erlang) has proven successful for difficult concurrent telecommunications applications.&lt;br /&gt;&lt;br /&gt;Transactions will always be needed for some types of applications, but transactional memory is a new thing, that to me, seems more of a hack to keep holding onto older ways of working.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389539</link><pubDate>Tue, 19 Feb 2008 17:01:12 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389539</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389539/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿ 





Frank Hileman wrote: 

For me, isolation via message passing sounds more interesting than transactional memory. It has&amp;nbsp;proven useful for decades.Why not have both?In some cases message passing just doesn't work well at all. For example you may want 10000&amp;#8230;</evnet:previewtext><dc:creator>Frank Hileman</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389539/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;I could see how doing something like that could indeed give you a way of checking the instruction pointer for the exact set of objects used, but wouldn't code bloat be quite horrific?&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;You certainly end up with bigger emitted code if you do it that way, but it does make transactions and exceptions easier to cope with. It's worth pointing out that this does not make the program &lt;i&gt;slower&lt;/i&gt;, merely &lt;i&gt;bigger&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;Couldn't you just (write-)lock the data you've read/written when trying to commit (in some global ordering)? That way you wouldn't lock *all* transactions, only the ones who try to do anything to the data you've needed.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;That would be a better system, I agree. I was trying to avoid using locks for the sake of simplicity.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;You still need to "fix" any transactions that get into an infinite loop or something due to an inconsistent view of the world, but you could do&amp;nbsp; a check on each thread switch, or GC or something, and of course in their own commit (if they get that far).&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Detecting whether a program is in an infinite loop is (in general) impossible. Happily however, a transaction in an infinite loop will never commit (by definition), and therefore will keep running "out-of-the-way" or will be eventually reset by another transaction commit that overwrites data that the infinitely-looping transaction has read from or written to.&lt;br /&gt;&lt;br /&gt;You suggest however that the infintely-running transaction might commit, however this would signal the immediate end of the transaction - a transaction can be thought of as a boolean method on it's own thread (or a turing machine), and it commits, it returns true and terminates, and if it aborts, it returns false and terminates.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389360</link><pubDate>Mon, 18 Feb 2008 22:40:58 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389360</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389360/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿I could see how doing something like that could indeed give you a way of checking the instruction pointer for the exact set of objects used, but wouldn't code bloat be quite horrific?
		
		You certainly end up with bigger emitted code if you do it that way, but it does make&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389360/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;evildictaitor wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img src="http://channel9.msdn.com/Themes/AlmostGlass/images/icon-quote.gif /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;sylvan wrote:&lt;/strong&gt;&lt;i&gt;﻿&lt;br /&gt;Can you though?&lt;br /&gt;What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know, just from the instruction pointer, which variables have been read if the IP points to some code after the if statement? How could you easily know what path through the preceeding code you've taken? Wouldn't that have to be a very conservative estimate (again, needlessly inhibiting parallelism)?&lt;br /&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/blockquote&gt;&lt;br /&gt;The short answer is yes. Particularly in managed languages, but also in C++ and C to a lesser extent this is not only feasible but currently done - on x64 architectures C++ programs compiled with CRT checking implement try...catch...finally blocks as exactly this so that when an exception is thrown it can "rollback" the function pointer to the appropriate stage and it uses a simmilar lookup table to determine which objects on the stack were created and thus need to be finalized.&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;
		&lt;br /&gt;How would this work exactly?&lt;br /&gt;Are you saying that each "path" through the code gets transformed into a "tree" like shape by duplicating any code that happens after a branch so that each option gets it's own copy of the following statements?&lt;br /&gt;e.g.&lt;br /&gt;&lt;br /&gt;if (x&amp;gt;0)&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; foo();&lt;br /&gt;}&lt;br /&gt;bar();&lt;br /&gt;&lt;br /&gt;turns into:&lt;br /&gt;&lt;br /&gt;if (x&amp;gt;0)&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; foo();&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; bar();&lt;br /&gt;}&lt;br /&gt;else&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; bar();&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;I could see how doing something like that could indeed give you a way of checking the instruction pointer for the exact set of objects used, but wouldn't code bloat be quite horrific?&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;evildictaitor wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;Determining whether or not a transaction collision occurs by definition needs a global lock on the transactions and a pause on the threads.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Couldn't you just (write-)lock the data you've read/written when trying to commit (in some global ordering)? That way you wouldn't lock *all* transactions, only the ones who try to do anything to the data you've needed.&lt;br /&gt;You still need to "fix" any transactions that get into an infinite loop or something due to an inconsistent view of the world, but you could do&amp;nbsp; a check on each thread switch, or GC or something, and of course in their own commit (if they get that far).&lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389228</link><pubDate>Mon, 18 Feb 2008 08:30:33 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389228</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389228/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	evildictaitor wrote:
				﻿sylvan wrote:﻿Can you though?What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389228/Trackback.aspx</trackback:ping></item><item><title>Re: Burton Smith: On General Purpose Super Computing and the History and Future of Parallelism</title><description>&lt;blockquote&gt;
				&lt;div&gt;sylvan wrote:&lt;/div&gt;
				&lt;div&gt;﻿&lt;br /&gt;Can you though?&lt;br /&gt;What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know, just from the instruction pointer, which variables have been read if the IP points to some code after the if statement? How could you easily know what path through the preceeding code you've taken? Wouldn't that have to be a very conservative estimate (again, needlessly inhibiting parallelism)?&lt;br /&gt;&lt;/div&gt;
		&lt;/blockquote&gt;
		&lt;br /&gt;The short answer is yes. Particularly in managed languages, but also in C++ and C to a lesser extent this is not only feasible but currently done - on x64 architectures C++ programs compiled with CRT checking implement try...catch...finally blocks as exactly this so that when an exception is thrown it can "rollback" the function pointer to the appropriate stage and it uses a simmilar lookup table to determine which objects on the stack were created and thus need to be finalized.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;In fact, if my IP is right at the final "return" statement of a transaction, then wouldn't this bit field be the same as my "potential set" of touched variables, even though the &lt;i&gt;actual&lt;/i&gt; set of touched variables could be empty?&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;You seriously underestimate the power of modern compiler theory. You might only have one return statement, but the compiler will have a lot. You need to do truly appalling things to C or C++ before these kind of optimisations become concerned (I mean breaking into asm or jump to void pointer kind of nasty code).&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;If there's one thing I think we should avoid like the plague, it's introducing anything which may reduce the degree of parallelism we can see in a program. Almdahl's law scares me. We may not have hit it yet, but it's coming at us at 300mph and it's like a big brick wall on the horizon.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;I agree. Hopefully such advancements will allow us to get greater parallelisation with less programmer manual intervention.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;Without actually logging what gets done, I don't see how we could easily check wether the transaction &lt;i&gt;actually &lt;/i&gt;conflicts. Though I would be interested in seeing an approach to solving this issue (a transaction seeing inconsistent state) in a regular STM system using a simliar approach (i.e. right after a commit, check any potentially overlapping transaction's current log, and if the read set in those overlaps the write set in the transaction you're commiting, you can restart it).&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;We do simmilar things to databases - why should imperative code be any different? The only big concern is when rolling back very large or very old transactions, and even these are at a cost that is feasible.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div&gt;sylvan wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;br /&gt;... which tells me that perhaps this may have terrible performance characteristics (e.g. due to that big global locking you're doing everytime something commits - ouch!), as well as being too conservative.&lt;br /&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;Determining whether or not a transaction collision occurs by definition needs a global lock on the transactions and a pause on the threads. Any other course of action could end up with a race-condition on the rollback. As you can see the time to commit is linear time complexity with respect to the number of threads and objects, with the cost of a rollback being linear time complexity (each) with respect to the the number of mutated objects.&lt;br /&gt;&lt;br /&gt;Consequently the average case is a global lock over a function of complexity O(n log n) &lt;br /&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389181</link><pubDate>Mon, 18 Feb 2008 01:04:27 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Burton-Smith-On-General-Purpose-Super-Computing-and-the-History-and-Future-of-Parallelism/?CommentID=389181</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/389181/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>	sylvan wrote:
				﻿Can you though?What if I read a variable, or maybe it's an input parameter to the function, and in an if-statement checking this value for 0 or whatever, I may read/update a bunch of transactional variables, then after that I go on to do something else. How would you know, just&amp;#8230;</evnet:previewtext><dc:creator>evildictaitor</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/389181/Trackback.aspx</trackback:ping></item></channel></rss>