<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" media="screen" href="/App_Themes/default/rss.xslt"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:evnet="http://www.mscommunities.com/rssmodule/"><channel><title>Comment Feed for Programming in the Age of Concurrency: Software Transactional Memory (Going Deep on Channel 9)</title><atom:link rel="self" type="application/rss+xml" href="http://channel9.msdn.com/shows/going+deep/programming-in-the-age-of-concurrency-software-transactional-memory/rss/default.aspx" /><image><url>http://mschnlnine.vo.llnwd.net/d1/Dev/App_Themes/C9/images/feedimage.png</url><title>Comment Feed for Programming in the Age of Concurrency: Software Transactional Memory (Going Deep on Channel 9)</title><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/</link></image><description>Programming in the Age of Concurrency: Software Transactional Memory</description><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/</link><language>en-us</language><pubDate>Sun, 26 Nov 2006 13:47:55 GMT</pubDate><lastBuildDate>Sun, 26 Nov 2006 13:47:55 GMT</lastBuildDate><generator>EvNet (EvNet, Version=1.0.3608.3122, Culture=neutral, PublicKeyToken=null)</generator><item><title>Definition of atomicity of transactions</title><description>How is atomicity of transactions defined?&amp;nbsp; Basically you define atomicity with respect to other things.&amp;nbsp; Write transations are obviously atomic w.r.t other write transactions, but what about reads?&amp;nbsp; With single reads obviously you can't tell but what if you were traversing a linked list that had concurrent write updates being applied to it for example?&amp;nbsp; On such a traversal, would you see all or none of any write transaction, or would you see parts of write transactions depending on which nodes in the list got updated and where you were in the traversal when the updates happened?&amp;nbsp; Or are there read transactions as well?&amp;nbsp; So before you traverse through the list, you start a read transaction so you get a consistent version of the list.&amp;nbsp; And do you have retries on read transactions or do you use the memory management scheme to keep things consistent?&lt;br&gt;&lt;br&gt;I had done some investigation is this area myself when I was looking on a couple of occasions at what it would take to make TDB (Trivial DataBase), which is used by Samba, lock-free using RCU for preemptive user threads (or something very much like it).&amp;nbsp; On the second occasion I got around to looking at the actual code and realized that TDB had transactions.&amp;nbsp;&amp;nbsp; So I had to figure out how to do read lock-free transactions.&amp;nbsp; Ultimately, too much work was involved for not very well defined benefits.&amp;nbsp; But bascially when you do this kind of stuff (lock-free), you become aware of the importance of well defined semantics.&lt;br&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267945</link><pubDate>Sun, 26 Nov 2006 13:47:55 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267945</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267945/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>How is atomicity of transactions defined?&amp;nbsp; Basically you define atomicity with respect to other things.&amp;nbsp; Write transations are obviously atomic w.r.t other write transactions, but what about reads?&amp;nbsp; With single reads obviously you can't tell but what if you were traversing a linked&amp;#8230;</evnet:previewtext><dc:creator>jseigh</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267945/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;DAMN! This fourm software mess my post again!!!&lt;BR&gt;&lt;BR&gt;ArGH! Let be try one more time!&lt;BR&gt;&lt;BR&gt;&lt;/P&gt;
&lt;P&gt;Here they are the introductory papers I used for my vZOOM coolthreads project...&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/round-1.pdf&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/round-2.pdf&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzdoc/atomic/static-init/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;This is a major alternative to STM. It outperforms it by orders of magnitude. The video should have mentioned other ways of doing things.&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Any thoughts? &lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;-- P.S. this is in regard to the SUN CoolThreads contest:&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;https://coolthreads.dev.java.net/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267867</link><pubDate>Sat, 25 Nov 2006 19:26:05 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267867</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267867/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>DAMN! This fourm software mess my post again!!!ArGH! Let be try one more time!
Here they are the introductory papers I used for my vZOOM coolthreads&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267867/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;Here they are the introductory papers I used for my vZOOM coolthreads project...&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/round-1.pdf&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzoom/round-2.pdf&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://appcore.home.comcast.net/vzdoc/atomic/static-init/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;This is a major alternative to STM. It outperforms it by orders of magnitude. The video should have mentioned other ways of doing things.&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Any thoughts? &lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;-- P.S. this is in regard to the SUN CoolThreads contest:&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;https://coolthreads.dev.java.net/&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267866</link><pubDate>Sat, 25 Nov 2006 19:24:56 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267866</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267866/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>Here they are the introductory papers I used for my vZOOM coolthreads&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267866/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;I almost forgot that the video mentions that they have to use Garbage Collector. I have STM that does not depend on a GC at all. Funny… I wonder why Simon and Tim depend on a GC? Its clearly not needed… A lightweight PDR scheme can implement STM…&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267860</link><pubDate>Sat, 25 Nov 2006 18:30:23 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267860</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267860/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>I almost forgot that the video mentions that they have to use Garbage Collector. I have STM that does not depend on a GC at all. Funny… I wonder why Simon and Tim depend on a GC? Its clearly not needed… A lightweight PDR scheme can implement STM…</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267860/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>Can you use STM to efficently solve this problem:&lt;BR&gt;&lt;BR&gt;&lt;a href="http://appcore.home.comcast.net/vzdoc/atomic/static-init/"&gt;http://appcore.home.comcast.net/vzdoc/atomic/static-init/&lt;/a&gt;&lt;BR&gt;&lt;BR&gt;Any thoughts?</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267854</link><pubDate>Sat, 25 Nov 2006 17:58:21 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267854</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267854/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>Can you use STM to efficently solve this problem:http://appcore.home.comcast.net/vzdoc/atomic/static-init/Any thoughts?</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267854/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>For the record, I did not call you stupid. I want to know how you would address the Cell SPU with STM, that all...&lt;BR&gt;&lt;BR&gt;:^)</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267834</link><pubDate>Sat, 25 Nov 2006 15:18:26 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267834</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267834/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>For the record, I did not call you stupid. I want to know how you would address the Cell SPU with STM, that all...:^)</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267834/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;“As for me not wanting to learn anything new… Well you wrong. I implemented STM. I know all of the issues. I have to make sure that I let everybody know what’s up, and let them get mislead by the marketing for STM."&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;ACK... &lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;I meant: &lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;As for me not wanting to learn anything new… Well you wrong. I actually implemented several versions of&amp;nbsp;STM; I know all of the issues. I have to make sure that I let everybody know what’s up, and NOT let them get mislead by the marketing for STM.&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267833</link><pubDate>Sat, 25 Nov 2006 15:11:58 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267833</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267833/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>“As for me not wanting to learn anything new… Well you wrong. I implemented STM. I know all of the issues. I have to make sure that I let everybody know what’s up, and let them get mislead by the marketing for STM."
&amp;nbsp;
ACK... 
&amp;nbsp;
I meant: 
&amp;nbsp;
As for me not wanting to learn anything&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267833/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;"Again, if you validate x and y before computing the if statement in A, it can't go wrong. This is not an unavoidable problem."&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Okay, well that makes STM doubly expensive. You have to see if you can commit, before you try to commit? What if your validation contains thousands of read transactions? How to address very large shared linked data-structures in STM without livelock and explicit contention management?&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267832</link><pubDate>Sat, 25 Nov 2006 15:06:28 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267832</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267832/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>"Again, if you validate x and y before computing the if statement in A, it can't go wrong. This is not an unavoidable problem."
&amp;nbsp;
Okay, well that makes STM doubly expensive. You have to see if you can commit, before you try to commit? What if your validation contains thousands of read&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267832/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;Why would you use STM in a game engine? And how would use use STM on the Cell? Perhaps I, or somebody else at comp.programming.theads can help you get a MUCH better design for any engine you may be working on.&lt;/P&gt;
&lt;P&gt;﻿ &lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;"You do know that the founder and technical director (Tim Sweeney)of Epic games (recently released Gears of War for the XBox 360) is a heavy advocate for purely functional programming and STM for games, right? If not, check out "The Next Mainstream Programing Language, A Game Developer's Perspective". But I'm sure you know more than both him and me."&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Yeah. Well, if Tim says so... I happen to completely disagree with him. BTW... Sadly, a lot of game programmers are not talented multi-threaded programmers:&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://groups.google.com/group/comp.programming.threads/msg/2c477272953fc5e0&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://groups.google.com/group/comp.programming.threads/browse_frm/thread/6fe2dd51631afee5&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://groups.google.com/group/comp.arch/search?hl=en&amp;amp;group=comp.arch&amp;amp;q=games+multi-core&amp;amp;qt_g=1&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;That’s the sad truth. STM is not going to help them learn how to leverage lock-free programming against their very high-end game engines synchronization needs. I bet Tim doesn’t let his programmers use any Assembly Language either? I doubt it. I also doubt that he would choose a STM based engine, over a lock-free engine that outperforms it by an order of magnitude. He would have to be crazy to do something like that.&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;IMHO, game programming is all about squeezing all of the horsepower you can out of the architecture. STM prevents this because it will be flooding the cache coherency system with boat loads of unneeded traffic. Humm… Assembly language and game engines… Do they go together? Yes. Big time. Its going to be interesting to see how the gaming industry programs the Cell with STM… Wow. Its basically impossible. How to address the SPU with STM?&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;As for me not wanting to learn anything new… Well you wrong. I implemented STM. I know all of the issues. I have to make sure that I let everybody know what’s up, and let them get mislead by the marketing for STM.&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267831</link><pubDate>Sat, 25 Nov 2006 15:02:51 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267831</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267831/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>Why would you use STM in a game engine? And how would use use STM on the Cell? Perhaps I, or somebody else at comp.programming.theads can help you get a MUCH better design for any engine you may be working on.
﻿ 
&amp;nbsp;
&amp;nbsp;
"You do know that the founder and technical director (Tim Sweeney)of&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267831/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;You use the splendid "you must just be stupid" argument, which is just great. So game engines are not difficult to parallize if you're smart enough, thus using any technology which would make this easier is an admission of stupidity. Well that's just great. Have fun with that.&lt;BR&gt;&lt;BR&gt;You do know that the founder and technical director (Tim Sweeney)of Epic games (recently released Gears of War for the XBox 360) is a heavy advocate for purely functional programming and STM for games, right? If not, check out "The Next Mainstream Programing Language, A Game Developer's Perspective". But I'm sure you know more than both him and me.&lt;BR&gt;&lt;BR&gt;Also, I'm just going to stop arguing with you right here. I dislike your argumentation style, and you don't seem interested in learning anything new.&amp;nbsp;Though I do think it's funny that you pretend to be an expert on game engines, and then proceed to share your wisdom on the subject to educate poor me in case I ever want to work on games (for the record, I actually work on game engine technology for a living, for the XBox 360).&lt;BR&gt;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267829</link><pubDate>Sat, 25 Nov 2006 14:45:59 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267829</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267829/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>You use the splendid "you must just be stupid" argument, which is just great. So game engines are not difficult to parallize if you're smart enough, thus using any technology which would make this easier is an admission of stupidity. Well that's just great. Have fun with that.You do know that the&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267829/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿
&lt;P&gt;One problem with implementing software transactional memory is that it's possible for a transaction to read inconsistent state (that is, to read a mixture of old and new values written by another transaction). Such a transaction is doomed to abort if it ever tries to commit, but it's possible for inconsistent state to cause a transaction to trigger a fatal exceptional condition such as a segmentation fault or even enter an endless loop, as in the following contrived example from Figure 4 of "Language Support for Lightweight Transactions":&lt;/P&gt;
&lt;TABLE&gt;

&lt;TR&gt;
&lt;TD&gt;&lt;PRE&gt;&lt;B&gt;atomic&lt;/B&gt; {
    if (x != y) 
        while (true) { }
}
&lt;/PRE&gt;
&lt;P&gt;Transaction A&lt;/P&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;PRE&gt;&lt;B&gt;atomic&lt;/B&gt; {
    x++;
    y++;
}
&lt;/PRE&gt;
&lt;P&gt;Transaction B&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TABLE&gt;
&lt;P&gt;Provided &lt;I&gt;x&lt;/I&gt;=&lt;I&gt;y&lt;/I&gt; initially, neither transaction above alters this invariant, but it's possible transaction A will read &lt;I&gt;x&lt;/I&gt; after transaction B updates it but read &lt;I&gt;y&lt;/I&gt; before transaction B updates it, causing it to enter an infinite loop. The usual strategy for dealing with this is to intercept any fatal exceptions and periodically check if each transaction is valid; if not, it can be aborted immediately, since it's doomed to fail in any case.&lt;/P&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;Again, if you validate x and y before computing the if statement in A, it can't go wrong. This is not an unavoidable problem.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267828</link><pubDate>Sat, 25 Nov 2006 14:37:42 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267828</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267828/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿
One problem with implementing software transactional memory is that it's possible for a transaction to read inconsistent state (that is, to read a mixture of old and new values written by another transaction). Such a transaction is doomed to abort if it ever tries to commit, but&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267828/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;“E.g. let's say you have a game world, and 10000 game objects that interact with it in complex and data dependent ways (i.e. there is no way to partition the world into lockable regions). Each of these objects interact with about half a dozen of the other objects, and they do so concurrently, and the interactions must be atomic. How do you solve that with lock free data structures? The game world consists of multiple different data structures nested in complex ways (oct-trees, portal cell graphs, BSP trees, etc.). How do you solve it with locks? You really don't, unless you stick a big lock on the entire world, or use some global locking order or something equally detrimental to composability and productivity. STM in this case "Just Works".”&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;If you don’t know how to write a high-end multi-threaded game engine without using STM, that’s your problem. Game engines HEAVILY benefit from assembly language and other assorted low-level tricks and techniques. Low-overhead lock-free algorithms are a must. STM in a game engine? No way. I want my game to be very fast. STM would prevent the game engine from realizing the full potential of today’s high-and architectures…&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;BTW, how would STM work on the Cell Processor? I don’t think it’s even possible for STM to even being to make use of the Cells SPU’s. Ha…. You want to program games? Do you know assembly language inside in out? Do you know how to use DMA with the Cells AltiVec SPU to shunt from shared memory to local SPU memory? &amp;nbsp;Do you know the PPC instruction set? Do you know distributed/network programming paradigm for the Cell? What about the XBox 360? Why bog down its PowerPC with all of the coherency traffic generated from the crappy TM?&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;You want to get into game engines? I suggest posting something overing comp.programming.threads and/or comp.arch. BTW, have you ever programmer the IBM Cell Processor or have you done any lock-free synchronization algorithms that use the PPC instruction set? Do you know the AltiVec instruction set? I have experience with all of these. I would not want you on the design team if alls you have to say to synchronization issues in the game engine is STM… No way no how!&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;How skilled are you wrt high-end game engine synchronization issues? They are basically the same issues that exist in the fast-pathed part of a database server… Anyway, please follow-ups in comp.programming.threads and/or comp.arch. This forum seems to get too crowed...&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Humm…&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267827</link><pubDate>Sat, 25 Nov 2006 14:28:12 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267827</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267827/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>“E.g. let's say you have a game world, and 10000 game objects that interact with it in complex and data dependent ways (i.e. there is no way to partition the world into lockable regions). Each of these objects interact with about half a dozen of the other objects, and they do so concurrently, and&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267827/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;One problem with implementing software transactional memory is that it's possible for a transaction to read inconsistent state (that is, to read a mixture of old and new values written by another transaction). Such a transaction is doomed to abort if it ever tries to commit, but it's possible for inconsistent state to cause a transaction to trigger a fatal exceptional condition such as a segmentation fault or even enter an endless loop, as in the following contrived example from Figure 4 of "Language Support for Lightweight Transactions":&lt;/P&gt;
&lt;TABLE&gt;

&lt;TR&gt;
&lt;TD&gt;&lt;PRE&gt;&lt;B&gt;atomic&lt;/B&gt; {
    if (x != y) 
        while (true) { }
}
&lt;/PRE&gt;
&lt;P&gt;Transaction A&lt;/P&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;PRE&gt;&lt;B&gt;atomic&lt;/B&gt; {
    x++;
    y++;
}
&lt;/PRE&gt;
&lt;P&gt;Transaction B&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TABLE&gt;
&lt;P&gt;Provided &lt;I&gt;x&lt;/I&gt;=&lt;I&gt;y&lt;/I&gt; initially, neither transaction above alters this invariant, but it's possible transaction A will read &lt;I&gt;x&lt;/I&gt; after transaction B updates it but read &lt;I&gt;y&lt;/I&gt; before transaction B updates it, causing it to enter an infinite loop. The usual strategy for dealing with this is to intercept any fatal exceptions and periodically check if each transaction is valid; if not, it can be aborted immediately, since it's doomed to fail in any case.&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267826</link><pubDate>Sat, 25 Nov 2006 14:15:48 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267826</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267826/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>One problem with implementing software transactional memory is that it's possible for a transaction to read inconsistent state (that is, to read a mixture of old and new values written by another transaction). Such a transaction is doomed to abort if it ever tries to commit, but it's possible for&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267826/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿One other MAJOR *point about TM... STM or HTM, my implmentation or not, both "suffer" from a not so well known phenomena: &lt;BR&gt;
&lt;P&gt;Inconsistent data can, and will, be accessed in a transactional atomic block! A users algorithm for a transactional based system is based solely inside of an atomic block, e.g., &lt;BR&gt;
&lt;P&gt;&lt;BR&gt;atomic { // start transactional atomic block&amp;nbsp;&lt;BR&gt;&amp;nbsp;&lt;BR&gt;users_algorithm{ &lt;BR&gt;&amp;nbsp; &amp;nbsp;// Access transactional data-structures &lt;BR&gt;&amp;nbsp;} &lt;BR&gt;
&lt;P&gt;
&lt;DIV&gt;} // try to commit transactional atomic block &lt;BR&gt;&lt;BR&gt;&lt;BR&gt;&lt;/DIV&gt;This means that the users algorithm can actually "operate" on inconsistent data... No kidding, not at all... therefore, a users algorithm behavior is basically undefined, whether the programmer likes it or not... Think about it for a minute... If your algorithm is fed with "no-so coherent" data, then your algorithm will be subject to undefined behavior by default... &lt;BR&gt;
&lt;P&gt;Wow... The implementations that I have seen have to actually catch any and all fatal exceptions that can, and WILL, arise when a user algorithm is subjected to inconsistent data... Sometimes your algorithm crashes right off the bat and gets intercepted by the STM where the transaction will be aborted, and retried... Sometimes your algorithm will be stuck in an infinite loop (most common problem, IMHO) , inside the atomic block... STM has to watch out for this kind of stuff... IMHO, this is a MAJOR drawback and a down right dangerous aspect of transactional memory... This behavior is inherent in basically any TM implementation; hardware or software... &lt;BR&gt;
&lt;P&gt;Need to think some more on this troubling subject... Now you know just some of the reasons why I don't like TM... &lt;BR&gt;
&lt;P&gt;Yikes! &lt;BR&gt;
&lt;P&gt;:O &lt;BR&gt;
&lt;P&gt;* I can't believe I forgot to mention this; totally slipped my mind. Right when I remembered this I went over to Wikipedia to see if anybody bring this fact up; somebody did... IMO, its vital that anybody who thinks about using TM completely understands this particular caveat! &lt;BR&gt;&lt;/P&gt;
&lt;P&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;This may be a problem in C, but not in Haskell. &lt;BR&gt;Haskell uses lazy evaluation, i.e. it's demand-driven. Nothing will be computed unless it's needed, and before any of writes are performed or data returned and forced by the&amp;nbsp;caller (which is the only two things that can force evaluation) the accessed transactional variables are checked for consistency.&lt;BR&gt;&lt;BR&gt;I.e. if you do something like the following in Haskell:&lt;/P&gt;
&lt;P&gt;atomically&amp;nbsp;$ do {&amp;nbsp;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;v1 &amp;lt;- readTVar t1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;v2 &amp;lt;- readTVar t2;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;writeTVar t3 ( willDivergeIfInconsistent v1 v2 )&lt;BR&gt;}&lt;BR&gt;&lt;BR&gt;t1 and t2 will be validated before t3 is written to, and the writing of t3 is the only thing that will force willDivergeIfInconsistent to be evaluated, and at that point v1 and v2 are guaranteed to be valid, thus nothing will ever be evaluated with inconsistent data.&lt;BR&gt;This holds for every atomic computation, even with interleaved reads and writes, since all the data used by a function will be validated before it can be retrieved and thus must be at least consistent enough to execute the call properly. &lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267825</link><pubDate>Sat, 25 Nov 2006 14:11:05 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267825</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267825/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿One other MAJOR *point about TM... STM or HTM, my implmentation or not, both "suffer" from a not so well known phenomena: 
Inconsistent data can, and will, be accessed in a transactional atomic block! A users algorithm for a transactional based system is based solely inside of an&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267825/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>One other MAJOR *point about TM... STM or HTM, my implmentation or not, both "suffer" from a not so well known phenomena: &lt;BR&gt;
&lt;P&gt;Inconsistent data can, and will, be accessed in a transactional atomic block! A users algorithm for a transactional based system is based solely inside of an atomic block, e.g., &lt;BR&gt;
&lt;P&gt;&lt;BR&gt;atomic { // start transactional atomic block&amp;nbsp;&lt;BR&gt;&amp;nbsp;&lt;BR&gt;users_algorithm{ &lt;BR&gt;&amp;nbsp; &amp;nbsp;// Access transactional data-structures &lt;BR&gt;&amp;nbsp;} &lt;BR&gt;
&lt;P&gt;
&lt;DIV&gt;} // try to commit transactional atomic block &lt;BR&gt;&lt;BR&gt;&lt;BR&gt;&lt;/DIV&gt;This means that the users algorithm can actually "operate" on inconsistent data... No kidding, not at all... therefore, a users algorithm behavior is basically undefined, whether the programmer likes it or not... Think about it for a minute... If your algorithm is fed with "no-so coherent" data, then your algorithm will be subject to undefined behavior by default... &lt;BR&gt;
&lt;P&gt;Wow... The implementations that I have seen have to actually catch any and all fatal exceptions that can, and WILL, arise when a user algorithm is subjected to inconsistent data... Sometimes your algorithm crashes right off the bat and gets intercepted by the STM where the transaction will be aborted, and retried... Sometimes your algorithm will be stuck in an infinite loop (most common problem, IMHO) , inside the atomic block... STM has to watch out for this kind of stuff... IMHO, this is a MAJOR drawback and a down right dangerous aspect of transactional memory... This behavior is inherent in basically any TM implementation; hardware or software... &lt;BR&gt;
&lt;P&gt;Need to think some more on this troubling subject... Now you know just some of the reasons why I don't like TM... &lt;BR&gt;
&lt;P&gt;Yikes! &lt;BR&gt;
&lt;P&gt;:O &lt;BR&gt;
&lt;P&gt;* I can't believe I forgot to mention this; totally slipped my mind. Right when I remembered this I went over to Wikipedia to see if anybody bring this fact up; somebody did... IMO, its vital that anybody who thinks about using TM completely understands this particular caveat! &lt;BR&gt;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267821</link><pubDate>Sat, 25 Nov 2006 13:43:54 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267821</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267821/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>One other MAJOR *point about TM... STM or HTM, my implmentation or not, both "suffer" from a not so well known phenomena: 
Inconsistent data can, and will, be accessed in a transactional atomic block! A users algorithm for a transactional based system is based solely inside of an atomic block,&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267821/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿
&lt;P&gt;ACK!!!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. "&lt;BR&gt;&lt;BR&gt;Is suppose to read as:&lt;BR&gt;&lt;BR&gt;&lt;/P&gt;
&lt;P&gt;I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures&amp;nbsp;"WITHOUT"&amp;nbsp;atomic operations and/or memory barriers. &lt;BR&gt;&lt;BR&gt;&lt;BR&gt;&lt;BR&gt;Sorry for any confusion.&lt;/P&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;Again, what does this prove? Nothing!&lt;BR&gt;"I have this prime number generator written in asm which is much faster than the same algorithm produces when compiled with VC, ergo C is useless."&lt;BR&gt;&lt;BR&gt;Again, you point out specific cases where existing technology works well, and conclude that STM is useless. The point here though is that there are plenty of applications which are essentially single threaded because splitting them up in multiple threads is too difficult to be a feasible proposition.&amp;nbsp; It's for these difficult cases, cases where the traditional methods are too difficult to use, that STM will help.&lt;BR&gt;&lt;BR&gt;The point is that STM is really nice and elegant for the &lt;EM&gt;general&lt;/EM&gt; case. Your whole argument basically boils down to premature optimization. Use STM, if it's too slow somewhere, and you can find a way to get it faster (most likely by compromising your ability to use this abstraction in a composable way in the future - which may be a completely valid tradeoff) then you are free to do so. Just like you can use inline asm in C where it's needed.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267820</link><pubDate>Sat, 25 Nov 2006 13:30:41 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267820</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267820/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿
ACK!!!
&amp;nbsp;
"I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. "Is suppose to read as:
I am mostly talking about database&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267820/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;"Actually I'm really not. I've posted twice why I don't think this makes sense in an imperative language."&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;I think we have to agree to disagree on this. I have no problems with using C for creating high-end multi-threaded applications.&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"I would recommend you post it and point it in the direction of Simpon Peyton Jones and Tim Harris though. If your implementation really is as good as you say, and it does what the Haskell one does, I'm sure they will be willing to do it your way instead."&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Its definitely as good as I say it is, however, it doesn’t have some of the features that Haskell has. You should think of my STM as a “lean-and-mean” minimalist version. I was trying to go for extreme speed in the STM implementation itself, some of the stuff that Haskell uses makes this impossible.&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;;^(…&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Anyway… I will think this over. To patent, or not to patent. I wonder how popular a fairly speedy C/C++ STM alternative to Haskell would actually be. I personally would not even want to use my STM in any of my application infrastructures’. But if most of the programmers go for STM, I could do well by providing a top-notch minimalist alternative in C/C++.&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Humm..&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267819</link><pubDate>Sat, 25 Nov 2006 13:25:28 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267819</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267819/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>"Actually I'm really not. I've posted twice why I don't think this makes sense in an imperative language."
&amp;nbsp;
I think we have to agree to disagree on this. I have no problems with using C for creating high-end multi-threaded applications.
&amp;nbsp;
"I would recommend you post it and point it in&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267819/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;ACK!!!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. "&lt;BR&gt;&lt;BR&gt;Is suppose to read as:&lt;BR&gt;&lt;BR&gt;&lt;/P&gt;
&lt;P&gt;I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures&amp;nbsp;"WITHOUT"&amp;nbsp;atomic operations and/or memory barriers. &lt;BR&gt;&lt;BR&gt;&lt;BR&gt;&lt;BR&gt;Sorry for any confusion.&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267817</link><pubDate>Sat, 25 Nov 2006 13:16:24 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267817</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267817/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>ACK!!!
&amp;nbsp;
"I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. "Is suppose to read as:
I am mostly talking about database servers the make heavy&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267817/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;"And you don't have to. Writing a web server probably doesn't require STM at all."&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. Again, this is impossible with STM. Using STM would make that fast-path run very slow because of the vast amount of coherency traffic that the TM logic generates.&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267816</link><pubDate>Sat, 25 Nov 2006 13:12:54 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267816</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267816/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>"And you don't have to. Writing a web server probably doesn't require STM at all."
&amp;nbsp;
I am mostly talking about database servers the make heavy use of shared memory for fast-pathing database queries. I can traverse mutable data-structures with atomic operations and/or memory barriers. Again,&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267816/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿
&lt;P&gt;C is fine. You can create STM implementation in C/Assembly. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;...&lt;BR&gt;&lt;/P&gt;
&lt;P&gt;Interested?&lt;/P&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;Actually I'm really not. I've posted twice why I don't think this makes sense in an imperative language. I would recommend you post it and point it in the direction of Simpon Peyton Jones and Tim Harris though. If your implementation really is as good as you say, and it does what the Haskell one does, I'm sure they will be willing to do it your way instead.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267815</link><pubDate>Sat, 25 Nov 2006 13:12:05 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267815</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267815/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿
C is fine. You can create STM implementation in C/Assembly. 
&amp;nbsp;
...
Interested?Actually I'm really not. I've posted twice why I don't think this makes sense in an imperative language. I would recommend you post it and point it in the direction of Simpon Peyton Jones and Tim&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267815/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿&lt;BR&gt;I am not willing to sacrifice my server applications performance. Sorry, not sold.&lt;BR&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;And you don't have to.&amp;nbsp;Writing a web server probably doesn't require STM at all.&lt;BR&gt;&lt;BR&gt;The whole point here is, again, that you take some very simple cases say "X is faster than STM in case x, Y is faster than STM in case y" and you ignore all the difficult cases alltogether, or the fact that STM is an easy to use technology which Just Works for both x and y without having to do anything special&amp;nbsp;(again, nobody has claimed that it isn't possible to write multithreaded programs without STM, please try to understand that very important point).&lt;BR&gt;&lt;BR&gt;Embarassingly parallel applications (like e.g. a web sever) are not hard to write, and you can get a way with very light weight abstractions that are very fast indeed. But try writing, say, a game engine which scales to hundreds or thousands of threads and you'll understand that a couple of clever lock-free data structures is not a viable solution for parallellizing problems &lt;EM&gt;in general&lt;/EM&gt;.&lt;BR&gt;Again, if you're going to compare against "alternatives", please do so in an intellectually honest way, rather than cherry picking specific examples where there are existing data structures that work well. The whole point here was that with STM it's all of a sudden a reasonable proposition to split something like a game engine up into hundreds and thousands of threads - something which is very difficult to do without something as general and flexible.&lt;BR&gt;&lt;BR&gt;Your argument is akin to someone arguing that assembly is better than C because a couple of small programs he's written were faster when written in hand tuned assembly than when compiled with a C compiler. It might be true, but that doesn't mean assembly is a better choice in general. For sufficiently large applications you're not going to be able to do any meassuring of the difference in performance, because only one of the version will actually run properly (the one which sacrifices speed for usability).</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267813</link><pubDate>Sat, 25 Nov 2006 13:06:22 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267813</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267813/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿I am not willing to sacrifice my server applications performance. Sorry, not sold.And you don't have to.&amp;nbsp;Writing a web server probably doesn't require STM at all.The whole point here is, again, that you take some very simple cases say "X is faster than STM in case x, Y is faster&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267813/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;C is fine. You can create STM implementation in C/Assembly. Just follow the rules:&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;http://groups-beta.google.com/group/comp.programming.threads/msg/423df394a0370fa6&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Okay… If I had to use STM, I would use it in C, not Haskell. Humm, I have created experimental STM that try to get around some of the overheads. Still not sure I want to Patent it or not. Perhaps I should patent it, and sell it as a much faster alternative to most of the STM implementations’. Humm. A couple of implementations of mine beat Haskell by a wide margin. Indeed.&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Okay… If I can’t convince you the STM is not a good solution at all… How would you feel about using a STM algorithm with a pure C interface? I have several solutions. I created them way back when I was experimenting/inventing my vZOOM algorithm. I needed something to test it against. I was very disappointed with every STM implementation out there. So, I invented a couple of new ways to implement STM… I guess if STM is going to be rammed down programmer’s throats, I might as well get the experimental prototypes’ and post them here…&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;I have 2 STM word-based implementations. One is written is about 75% IA-32 assembly language, and the other is in about 90% SPARC V9 assembly language… The all have C API/ABI. Standard transaction API (e.g., begin, load, store, commit, abort, validate), and they all outperform Haskel.&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Interested?&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267812</link><pubDate>Sat, 25 Nov 2006 13:03:39 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267812</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267812/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>C is fine. You can create STM implementation in C/Assembly. Just follow the rules:
&amp;nbsp;
http://groups-beta.google.com/group/comp.programming.threads/msg/423df394a0370fa6
&amp;nbsp;
Okay… If I had to use STM, I would use it in C, not Haskell. Humm, I have created experimental STM that try to get&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267812/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;"I really think you're not really trying to see the point here. It also seems that you may have a financial interest in not seeing the point. Which makes arguing with you pretty much a waste of time."&lt;/P&gt;
&lt;P&gt;IMHO, STM is a deceptive marking ploy. They never get into the alternatives’. BTW, I mention many different ways to do things. I did not just refer you to my vZOOM algorithm; I also refer you to Joe Seighs RCU+SMR algorithm, and some others. So you wrong again… Unlike the video, I point out all of the caveats, and explain them in detail.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"Look at the video again. They specifically say that skilled programmers can use existing techniques to write correct code - they're not claiming that it's not possible to work around the difficulties of multi threaded programming with existing techniques. The point here, though, is that it is difficult. Your own posts is are a case in point. Why are you so proud of your inventions of various lock-free data structures if it's so easy?"&lt;/P&gt;
&lt;P&gt;The complexity can be hidden behind a simple and portable API, that anybody can use. Like the pthreads api. We all know that a particular pthread implementation may be very complex, and may very well use some lock-free algorithms. But, does the average user actually care about this kind of stuff? My point is, again "The complexity can be hidden behind a simple API". Really. &lt;/P&gt;
&lt;P&gt;Would there be bugs... Well, maybe; it depends on the implementer. There may be very subtle bugs in your pthread implementation as well... &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"Because it isn't very easy, which is the whole reason for why you want STM instead."&lt;/P&gt;
&lt;P&gt;No. STM is NOT a good solution if you’re interested in execellent scalability and performance.&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"Getting everything right with locks or lock-free programming techniques is difficult - if you're selling software based on this I can understand why you would argue otherwise, but I'm not going to waste my time arguing against a paycheck (everyone whose salary isn't based on pretending it's easy will agree that it's difficult). Not impossible, but difficult. Your argument that it's possible to use other strategies is completely pointless. It's possible to write a full comercial business application using nothing but assembly as well, but is it practical? No not really, and as concurrency becomes more common using the sophisticated and clever lock-free programming techniques will stop being practical as well."&lt;/P&gt;
&lt;P&gt;Utter nonsense. The power of lock-free programming techniques will be in demand. Man. Concurrency has been common for a very long time now. Ever heard of RCU? Heck, this stuff, including STM and HTM has been around for the past 25 years at least. Don't try to act like this is something new. Lock-free has been around for ages, and it not going to go out of practice. Its going to be in demand. That is, if you want your application to have excellent scalability and throughput characteristics&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"What if I want to read 4 elements, or none at all, from your nice lock free FIFO queue?"&lt;/P&gt;
&lt;P&gt;Simple, you use an eventcount.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"What if I want to do that operation on two different FIFOs &amp;gt; (from different vendors), or none at all? What if I want to combine that with some operations of my&amp;nbsp; &amp;gt; own, or do none at all?"&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;This is example of bad programming practice. Having to force things together is a sign that there is a MUCH better way. The overhead involved with your request is too great. Again, I urge to you to post to comp.programming.theads and comp.arch. Please. &lt;BR&gt;Anyway, you can use the eventcount to add fine-grain ultra low-overhead conditional blocking and coordination of your data with lock-free collections. Too much to post here. The technique deserves its own thread. STM can’t compare with this wrt scalability; its dead in the water. The performance increases of most alternatives to STM are usually an order of magnitude.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"STM composes because you can create new concurrency abstractions from existing ones without having to have access to the source code of the original abstractions."&lt;/P&gt;
&lt;P&gt;I am not willing to sacrifice my server applications performance. Sorry, not sold.&lt;BR&gt;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267810</link><pubDate>Sat, 25 Nov 2006 12:42:23 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267810</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267810/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>"I really think you're not really trying to see the point here. It also seems that you may have a financial interest in not seeing the point. Which makes arguing with you pretty much a waste of time."
IMHO, STM is a deceptive marking ploy. They never get into the alternatives’. BTW, I mention many&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267810/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;BLOCKQUOTE&gt;&lt;div&gt;cristom wrote:&lt;/div&gt;&lt;div&gt;﻿ 
&lt;P&gt;For those who don't follow links, here are some important snippets from some of them:&lt;BR&gt;&lt;/P&gt;&lt;BR&gt;SNIP!&lt;BR&gt;&lt;BR&gt;&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;BR&gt;&lt;BR&gt;Okay, this is only slightly less annoying/insulting than&amp;nbsp;outright spam. Clicking links was not the problem, having to deduce your points from multiple related discussions, was. You're not that important to me. I don't care about you enough to read everything you've posted on related subjects and from there figure out where you stand on &lt;EM&gt;this&lt;/EM&gt; issue.&lt;BR&gt;&lt;BR&gt;If you have a specific point that has to do with STM, please make it. If you have a specific point to anything someone else posted here, please make it.&lt;BR&gt;If, however, you don't feel like you have time and energy&amp;nbsp;to extend the common courtesy of doing so in a coherent way&amp;nbsp;by formulating it specifically for this discussion, in a way which flows naturally together with the topic of discussion, then please note that posting here is completely volontary.&lt;BR&gt;&lt;BR&gt;Also, please note that examples are not proof. Yes there are, as said many times already, cases where STM will be slower than other methods, but listing them is completely irrelevant. The point here is that there are plenty of cases where STM helps out a lot, and is probaby the only practical way of writing software in certain circumstances. E.g. let's say you have a game world, and 10000 game objects that interact with it in complex and data dependent ways (i.e. there is no way to partition the world into lockable regions). Each of these objects interact with about half a dozen of the other objects, and they do so concurrently, and the interactions must be atomic. How do you solve that with lock free data structures? The game world consists of multiple different data structures nested in complex ways (oct-trees, portal cell graphs, BSP trees, etc.). How do you solve it with locks? You really don't, unless you stick a big lock on the entire world, or use some global locking order or something equally detrimental to composability and productivity. STM in this case "Just Works".&lt;BR&gt;&lt;BR&gt;Yeah, STM for a FIFO is overkill, but showing how a non-STM implementation of a FIFO is faster than an STM based one does not prove that STM is a waste of time. Again, you're pitting STM up against other approaches only in very simple cases, which proves exactly nothing.</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267809</link><pubDate>Sat, 25 Nov 2006 12:42:05 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267809</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267809/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>cristom wrote:﻿ 
For those who don't follow links, here are some important snippets from some of them:SNIP!Okay, this is only slightly less annoying/insulting than&amp;nbsp;outright spam. Clicking links was not the problem, having to deduce your points from multiple related discussions, was. You're not&amp;#8230;</evnet:previewtext><dc:creator>sylvan</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267809/Trackback.aspx</trackback:ping></item><item><title>Re: Programming in the Age of Concurrency: Software Transactional Memory</title><description>&lt;P&gt;For those who don't follow links, here are some important snippets from some of them:&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; I am wondering where RW locks are appropriate ?? &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Basically, they work fairly well with "read-mostly" data-structures. If you &lt;BR&gt;don't really care about performance and scalability, then most rw-mutex &lt;BR&gt;implementations should work perfectly fine; however, in some cases they are &lt;BR&gt;definitely not an "optimal solution". For instance, IMHO a rw-mutex would &lt;BR&gt;probably not be an ideal candidate for protecting a "critical shared &lt;BR&gt;collection" than can experience episodic periods of sustained &lt;BR&gt;"moderate-to-heavy write activity". Reader threads that use rw-mutexs to &lt;BR&gt;protect such a container would probably not perform and/or scale very well &lt;BR&gt;at all with regard to adding cpus. One of the reasons why is usually due to &lt;BR&gt;the fact that most rw-mutex implementations force reader threads to use some &lt;BR&gt;rather expensive atomic/membar/blocking methods in order to enforce &lt;BR&gt;synchronization and proper data-coherency among the participating &lt;BR&gt;processors; this is not very cache friendly, not at all. Reader threads &lt;BR&gt;would not really be able to keep the processors "caches primed" and probably &lt;BR&gt;would not experience the full benefits of a modern multi-processor system. &lt;BR&gt;For instance, a reader thread that "frequently" executes a store/load style &lt;BR&gt;membar would constantly be "invalidating and thrashing the cache"; not good &lt;BR&gt;at all! The thread could spend most of its time "waiting" for the &lt;BR&gt;modifications to the "rw-mutexs internal state" to become "fully visible" to &lt;BR&gt;all of processors involved. This process is usually made up of fairly &lt;BR&gt;expensive operation(s) that may involve the overhead of cache snooping &lt;BR&gt;and/or locking the system bus. The overhead involved could actually prevent &lt;BR&gt;reader threads from truly exeuting in parallel; not good. &lt;/P&gt;
&lt;P&gt;Therefore, IMHO, a rw-mutex is generally a bad solution for a critical &lt;BR&gt;shared data-structure if the overhead of acquiring and releasing it is &lt;BR&gt;greater than the overhead of "reading the data-structure in a &lt;BR&gt;single-threaded application"... &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt; is it appropriate when only you don't want 'dirty' reads?? &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Not really, but yes they do eliminate the possibility of processing "stale &lt;BR&gt;data"; however, as stated above the methods involved are usually expensive. &lt;BR&gt;Fortunately, there is a more advanced class of algorithms out there that do &lt;BR&gt;not tend to add a boatload of extra overhead and can allow reader and writer &lt;BR&gt;threads to execute in parallel. Please note that this behavior alone can be &lt;BR&gt;beneficial to many application designs. For instance, since readers no &lt;BR&gt;longer block writers an existing synchronization design can usually be based &lt;BR&gt;on a "much simpler locking scheme"; however, stale reads can and will occur &lt;BR&gt;in this type of setup. Luckily, one can address stale/dirty reads in a &lt;BR&gt;number of interesting ways: &lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.programming.threads/msg/523aa1e77"&gt;http://groups.google.com/group/comp.programming.threads/msg/523aa1e77&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Another solution for stale reads involves writer threads making copies of &lt;BR&gt;existing shared data-structures and then atomically swapping them with the &lt;BR&gt;existing ones. A garbage collection scheme can be used to free the "old" &lt;BR&gt;data-structures. This solution does not add overhead to reader threads, &lt;BR&gt;unless, the method of garbage collection used is sub-optimal... &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"John Hickin" &amp;lt;&lt;a href="http://channel9.msdn.commailto:hic...@nortelnetworks.com&gt;hic...@nortelnetworks.com&lt;/a&gt;&amp;gt; wrote in message &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&lt;a&gt;news:ec4u46$hvv$1@zcars129.ca.nortel.com&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; "David Schwartz" &amp;lt;&lt;a href="http://channel9.msdn.commailto:dav...@webmaster.com&gt;dav...@webmaster.com&lt;/a&gt;&amp;gt; wrote in message &lt;BR&gt;&amp;gt; &lt;a&gt;news:1155889979.905637.238660@i42g2000cwa.googlegroups.com&lt;/a&gt;... &lt;BR&gt;[...] &lt;/P&gt;
&lt;P&gt;&lt;a href="http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&amp;amp;pName=computer_level1_article&amp;amp;TheCat=1005&amp;amp;path=computer/homepage/0506&amp;amp;file=cover.xml&amp;amp;xsl=article.xsl"&gt;http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&amp;amp;pName=computer_level1_article&amp;amp;TheCat=1005&amp;amp;path=computer/homepage/0506&amp;amp;file=cover.xml&amp;amp;xsl=article.xsl&lt;/a&gt;&amp;amp;&lt;BR&gt;("The Problem with Threads" - IEEE's Computer magazine article)&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; This example is quite good and it explains what I don't like with threads. &lt;BR&gt;&amp;gt; In the real world, 8 of the 10 people in your example will go on to do &lt;BR&gt;&amp;gt; something else. In a multi-threaded world, 8 threads are blocked. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.programming.threads/msg/a3e38e27d"&gt;http://groups.google.com/group/comp.programming.threads/msg/a3e38e27d&lt;/a&gt;... &lt;BR&gt;( refer to the pseudo-code at the bottom of the msg ) &lt;/P&gt;
&lt;P&gt;Any thoughts? &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt; Unless, of &lt;BR&gt;&amp;gt; course, you use some sort of lock-free algorithm. Unless I've missed the &lt;BR&gt;&amp;gt; mark, lock-free is quite avant guard, and not taught at the undergraduate &lt;BR&gt;&amp;gt; level. It may be time for a change there. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Totally agreed: &lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.lang.c++.moderated/msg/d07b79e963"&gt;http://groups.google.com/group/comp.lang.c++.moderated/msg/d07b79e963&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;FWIW: &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;I don't agree with the commonly assertion that lock-free programming is far &lt;BR&gt;to complicate for a "normal" programmer to learn. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;I also don't agree with the assertion that threads are vastly too difficulty &lt;BR&gt;and fragile for any programmer to use... The paper the OP linked to seems to &lt;BR&gt;attempt to make this type of assertion. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt; Anyway, the article was interesting to read. The author is an IEEE Fellow, &lt;BR&gt;&amp;gt; and you don't get that designation without earning it. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Humm... Did you read the part where is basically says that programmers that &lt;BR&gt;use threads are insane? He points to a "folk saying", &lt;/P&gt;
&lt;P&gt;'repeatedly performing the same action and expecting different results &lt;BR&gt;defines a form of insanity' &lt;BR&gt;(paraphrase) &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;The he says this, again paraphrasing &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;'if programmers that use threads were not insane, they simply would not be &lt;BR&gt;able to understand any of their programs' &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;I wonder if he was being serious... Anyway.... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Yikes! I guess that I am INSANE!!!!!!!!! &lt;BR&gt;YeeeHaw.... Weeeee..... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;;) &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;I find it amusing that the IEEE Fellow seems to not be able to create a &lt;BR&gt;thread-safe observer pattern... I have several lock-free observer patterns &lt;BR&gt;that can rely on my vZOOM or Joe Seighs RCU-SMR algorithms. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;For some reason, I don't think that the author could create a highly &lt;BR&gt;scaleable thread-safe, lock-based or lock-free, observer pattern &lt;BR&gt;implementation if his life depended on it??? &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;:O &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;(simply EXECELLENT Wheeler posts!!!)&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;"Chris Thomasson" &amp;lt;&lt;a href="http://channel9.msdn.commailto:_no_spam_cristom@no_spam_comcast._net&gt;_no_spam_cristom@no_spam_comcast._net&lt;/a&gt;&amp;gt; writes: &lt;BR&gt;&amp;gt; IMHO, I believe its too early in the game for actual hardware &lt;BR&gt;&amp;gt; support, even though they really do need it in order for TM to &lt;BR&gt;&amp;gt; become really useful. All of the software emulations I have seen are &lt;BR&gt;&amp;gt; loaded with expensive atomic operations; mutexs seem more &lt;BR&gt;&amp;gt; efficient... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;original 801 had a form ... it was used by journal filesystem &lt;BR&gt;for aix in rs/6000. &lt;/P&gt;
&lt;P&gt;filesystem metadata was organized ... and the memory system caught all &lt;BR&gt;changes ... so basically you avoided having to do all the explicit &lt;BR&gt;logging/locking statements in the filesystem code ... you did have to &lt;BR&gt;add commit calls. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;there was some disagreement whether the overhead of explicit &lt;BR&gt;logs/locks were more expensive than the implicit overhead involved in &lt;BR&gt;transaction memory. turns out there was the overhead of scanning the &lt;BR&gt;whole memory area for changes at commits. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;a portable version of the journal filesystem code with explicit &lt;BR&gt;lock/logging changes in the code turned up the explicit lock/logging &lt;BR&gt;calls had lower overhead than transactional memory (at least in the &lt;BR&gt;journal filesystem case). &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;and unrelated old reference from about same period as aixv3 ... the &lt;BR&gt;Proceedings of the 20th International Symposium on Fault-Tolerant &lt;BR&gt;Computing Systems, pp 89--96, Newcastle, June 1990. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;A Fault Tolerant Tightly Coupled Multiprocessor Architecture based on &lt;BR&gt;Stable Transactional Memory &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Authors: Michel Banatre and Philippe Joubert &lt;BR&gt;Abstract: &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Traditionally, tightly coupled multiprocessors allow data sharing &lt;BR&gt;between multiple caches by keeping cached copies of memory blocks &lt;BR&gt;coherent with respect to shared memory. This is difficult to achieve &lt;BR&gt;in a fault tolerant environment due to the need to save global &lt;BR&gt;checkpoints in shared memory from where consistent cache states can be &lt;BR&gt;recovered after a failure. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;The architecture presented in this report solves this problem by &lt;BR&gt;encapsulating the memory modifications done by a process into an &lt;BR&gt;atomic transaction. Caches record dependencies between the &lt;BR&gt;transactions associated with processes modifying the same memory &lt;BR&gt;blocks. Dependent transactions may then be atomically committed. Such &lt;BR&gt;an operation requires a cache coherence protocol responsible for &lt;BR&gt;recording process dependencies as well as keeping coherent cached &lt;BR&gt;copies of blocks and a shared Stable Transactional Memory owing to &lt;BR&gt;which all memory updates are atomic to allow recovery after a &lt;BR&gt;processor failure. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;-- &lt;BR&gt;Anne &amp;amp; Lynn Wheeler | &lt;a href="http://www.garlic.com/~lynn/"&gt;http://www.garlic.com/~lynn/&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"Chris Thomasson" &amp;lt;&lt;a href="http://channel9.msdn.commailto:cris...@comcast.net&gt;cris...@comcast.net&lt;/a&gt;&amp;gt; writes: &lt;BR&gt;&amp;gt; Good point. Do you happen to know if they implemented it so they could &lt;BR&gt;&amp;gt; successfully pop from a lock-free stack? I wonder.... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;charlie had invented compare&amp;amp;swap as part of his work on fine-grain &lt;BR&gt;locking (leading to some number of lock free operations) for cp67 &lt;BR&gt;(360/67) mutliprocessor support at the science center &lt;BR&gt;&lt;a href="http://www.garlic.com/~lynn/subtopic.html#545tech"&gt;http://www.garlic.com/~lynn/subtopic.html#545tech&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;the trick then was to come up with a mnemonic that matched Charlie' &lt;BR&gt;initials, CAS. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;the attempt was then made to get the instruction into the up and &lt;BR&gt;coming 370 architecture. working with ron smith in the pok 370 &lt;BR&gt;architecture group (they owned the 370 architecture "red book"), the &lt;BR&gt;response was that 370 didn't need another multiprocessor specific &lt;BR&gt;instruction, that the test and set from 360 was sufficient.&amp;nbsp;&amp;nbsp; &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;to get compare and swap into the 370 architecture we had to come up &lt;BR&gt;with useage for compare&amp;amp;swap that wasn't multiprocessor specific. &lt;BR&gt;thus was born some number of examples that were applicable to &lt;BR&gt;multi-threaded applications that might be running enabled for &lt;BR&gt;interrupts ... independent of whether the machine was single processor &lt;BR&gt;or multiprocessor. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;originally in the 370 principles of operation, the examples were part &lt;BR&gt;of the programming notes that were part of the compare&amp;amp;swap &lt;BR&gt;instruction. in subsequent version of the principle of operations the &lt;BR&gt;examples were moved to a section in the appendix. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;also as part of this activity, compare&amp;amp;swap double instruction was &lt;BR&gt;added in addition to compar&amp;amp;swap. that resulted in two instructions &lt;BR&gt;for 370, compare&amp;amp;swap along with compare&amp;amp;swap double ...&amp;nbsp; so the &lt;BR&gt;instruction mnemonics become CS and CDS (instead of CAS ... defeating &lt;BR&gt;the original objective of coming up with instruction name &lt;BR&gt;compare&amp;amp;swap). &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;total topic drift ... science center was like that ... GML &lt;BR&gt;&lt;a href="http://www.garlic.com/~lynn/subtopic.html#sgml"&gt;http://www.garlic.com/~lynn/subtopic.html#sgml&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;precusor to SGML, HTML, XML, etc ... also invented at the science &lt;BR&gt;center, actually are the first initials of the last name of the three &lt;BR&gt;inventors (and you probably thot it stood for generalized markup &lt;BR&gt;language). &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;misc. past posts on multiprocessor support and/or compare&amp;amp;swap &lt;BR&gt;instruction &lt;BR&gt;&lt;a href="http://www.garlic.com/~lynn/subtopic.html#smp"&gt;http://www.garlic.com/~lynn/subtopic.html#smp&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;esa/390 principles of operation appendix for multiprogramming &lt;BR&gt;(i.e. multithread) and multiprocessing &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;cs &amp;amp; cds appendix: &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;bypassing post and wait &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;lock/unlock: &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;free pool manipulation &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;and more recent z/architecture (64-bit) principles of operation &lt;BR&gt;multiprogramming and multiprocessing examples appendix &lt;BR&gt;&lt;a href="http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9ZR003/A"&gt;http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9ZR003/A&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;note that the above also includes discussion of the newer PLO ...perform lock &lt;BR&gt;operation instruction &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-- &lt;BR&gt;Anne &amp;amp; Lynn Wheeler | &lt;a href="http://www.garlic.com/~lynn/"&gt;http://www.garlic.com/~lynn/&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- conversation with Andy...&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/browse_frm/thread/71f8e0094e353e5/a3ebfe80363a4399?lnk=gst&amp;amp;q=why+do+double+wide+compare+and+swap&amp;amp;rnum=1#a3ebfe80363a4399"&gt;http://groups.google.com/group/comp.arch/browse_frm/thread/71f8e0094e353e5/a3ebfe80363a4399?lnk=gst&amp;amp;q=why+do+double+wide+compare+and+swap&amp;amp;rnum=1#a3ebfe80363a4399&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"Andy Glew" &amp;lt;&lt;a href="http://channel9.msdn.commailto:first.l...@employer.domain&gt;first.l...@employer.domain&lt;/a&gt;&amp;gt; wrote in message &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&lt;a&gt;news:peyp8xnzl9g2.fsf@PXPL8591.amr.corp.intel.com&lt;/a&gt;... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt;&amp;gt; &amp;gt; Are there a rule of thumb as to when LL/SC starts showing &lt;BR&gt;&amp;gt;&amp;gt; &amp;gt; scalability issues? &lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;gt; IMHO there is no rule of thumb. Scalability issues are not typically &lt;BR&gt;&amp;gt;&amp;gt; induced by LL/SC itself. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; I disagree. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; Although maybe it's not really scalability issues.&amp;nbsp; LL/SC starts &lt;BR&gt;&amp;gt; introducing deadlock / forward progress issues the moment you have &lt;BR&gt;&amp;gt; more than one processor on any interconnect that is not a simple &lt;BR&gt;&amp;gt; snoopy bus, and they just get worse as you add more, and have more &lt;BR&gt;&amp;gt; complicated interconnects. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Something like this?: &lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/d40cf092b679169d"&gt;http://groups.google.com/group/comp.arch/msg/d40cf092b679169d&lt;/a&gt; &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Coarse reservation granularity can cause live-lock in LL/SC via. simple &lt;BR&gt;false-sharing. I really do believe that HTM is going to suffer from the same &lt;BR&gt;thing... Multiple processors can live-lock if they keep interfering with the &lt;BR&gt;reservations that track the TM. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Humm... Kind of like the live-lock that is inherent in any obstruction-free &lt;BR&gt;algorithm; they need to use "backoff-and-retry" algorithms to keep up &lt;BR&gt;forward-progress... &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good response post form Intel Architect Andy Glew:&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; Humm... Kind of like the live-lock that is inherent in any obstruction-free &lt;BR&gt;&amp;gt; algorithm; they need to use "backoff-and-retry" algorithms to keep up &lt;BR&gt;&amp;gt; forward-progress... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Yeah, I was thinking of correcting my older post to try to get to this. &lt;/P&gt;
&lt;P&gt;It is not fundamentally LL/SC vs. CAS and other atomic RMWs. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;It is more that LL/SC was originally proposed and implemented via an &lt;BR&gt;"optimistic concurrency" type of address "link", whereas CAS and other &lt;BR&gt;atomic RMWs are traditionally implemented via a "lock" - whether bus &lt;BR&gt;lock, cache lock, or address lock. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;If other people try to access the lock while the RMW is in flight they &lt;BR&gt;are stalled or told to go away, come back later. This guarantees &lt;BR&gt;forward progress. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Whereas if other people access the link between the LL and SC, it is &lt;BR&gt;the SC that fails. In the presence of an adversary doing ordinary &lt;BR&gt;stores, the LL/SC might never complete. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;But these are just implementation artifacts.&amp;nbsp; E.g. you can implement &lt;BR&gt;CAS or an atomic RMW with LL/SC, whether in user code or microcode - &lt;BR&gt;and such an RMW would have the same forward progress problems as &lt;BR&gt;LL/SC. You can implement a hybrid - try the optimistic concurreny &lt;BR&gt;approach first, and then try the lock approach. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Similarly, you can implement LL/SC by acquiring a lock, guaranteeing &lt;BR&gt;that the SC will finish.&amp;nbsp; But then you need some way of backing out, &lt;BR&gt;protecting yourself against a malicious user who does the LL but never &lt;BR&gt;the SC. E.g. the Intel 860 released such a lock after 32 instructions. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Chris Thomasson wrote: &lt;BR&gt;&amp;gt; "Joe Seigh" &amp;lt;&lt;a href="http://channel9.msdn.commailto:jseigh...@xemaps.com&gt;jseigh...@xemaps.com&lt;/a&gt;&amp;gt; wrote &lt;BR&gt;&amp;gt;&amp;gt;What's with the Sparc architects? &lt;/P&gt;
&lt;P&gt;&amp;gt; IMHO, I believe the major reason 32-bit architectures support DWCAS was to &lt;BR&gt;&amp;gt; be "64-bit ready"; DWCAS gets converted to normal CAS on 64-bit systems. So, &lt;BR&gt;&amp;gt; if your 32-bit applications used DWCAS, then they can be run 64-bit arch &lt;BR&gt;&amp;gt; without changes... Of course there is a caveat... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; A 32-bit application cannot use DWCAS to manipulate a pointer along with &lt;BR&gt;&amp;gt; another contiguous variable... &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; struct dosent_work_on_64_bit_s { &lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp; void *a; &lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp; whatever b; &lt;BR&gt;&amp;gt; }; &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; IMO, the hardware architects' did NOT have lock-free algorithms in mind when &lt;BR&gt;&amp;gt; they decided to put DWCAS in their 32-bit instruction sets. The fact that &lt;BR&gt;&amp;gt; 128-bit CAS is not reliably supported seems to support my opinion... &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;IBM S370 had double wide compare and swap long before 64 bit support &lt;BR&gt;became an issue. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt;&amp;gt;There's a bit of a disconnect here I think. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;gt; Perhaps they are designing the lock-free algorithms for a upcoming processor &lt;BR&gt;&amp;gt; design. Maybe one that has hardware transactional memory... That could be &lt;BR&gt;&amp;gt; the reason they designed KCSS: &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;They've done a bit on STM (software transactional memory).&amp;nbsp; If they did &lt;BR&gt;come out with hardware based transactional memory it would be after the &lt;BR&gt;fact of 64 bit sparc and wouldn't be generally available.&amp;nbsp; So it would &lt;BR&gt;have to be hidden behind some system or library api with alternate &lt;BR&gt;implementations on non TM platforms.&amp;nbsp; That would limit its applicability. &lt;BR&gt;For example, you couldn't have atomically thread-safe refcounted smart &lt;BR&gt;pointers.&amp;nbsp; Well, not without RCU and/or SMR.&amp;nbsp; If they go with STM to &lt;BR&gt;compliment hw TM, it's only going to work at a lower contention level &lt;BR&gt;if the STM algorihtm is only obstruction-free. &lt;/P&gt;
&lt;P&gt;It's a big problem.&amp;nbsp; You can design efficient synchronization mechanisms &lt;BR&gt;that are portable if you ignore Sparc.&amp;nbsp; If you want portability to &lt;BR&gt;Sparc you have to sacrifice performance or functionality. &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;If Sun has a 64 bit JVM implementation, I wonder what they do for &lt;BR&gt;the AtomicMarkedReference and AtomicStampedReference implementations &lt;BR&gt;for non Sparc platforms.&amp;nbsp; Cripple them so they perform as poorly as &lt;BR&gt;the Sparc implementations? &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;-- &lt;BR&gt;Joe Seigh &lt;/P&gt;
&lt;P&gt;&lt;BR&gt;When you get lemons, you make lemonade. &lt;BR&gt;When you get hardware, you make software. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;Text taken from:&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/e1e391055c7acecb"&gt;http://groups.google.com/group/comp.arch/msg/e1e391055c7acecb&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.programming.threads/browse_frm/thread/9c572b709248ae64"&gt;http://groups.google.com/group/comp.programming.threads/browse_frm/thread/9c572b709248ae64&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.programming.threads/browse_frm/thread/f6399b3b837b0a40"&gt;http://groups.google.com/group/comp.programming.threads/browse_frm/thread/f6399b3b837b0a40&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/1b9e405080e93149"&gt;http://groups.google.com/group/comp.arch/msg/1b9e405080e93149&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/bbbc035cf1a8502c"&gt;http://groups.google.com/group/comp.arch/msg/bbbc035cf1a8502c&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/11b14c4bda2d5d82"&gt;http://groups.google.com/group/comp.arch/msg/11b14c4bda2d5d82&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/9b00fda2752966f9"&gt;http://groups.google.com/group/comp.arch/msg/9b00fda2752966f9&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/335aeb22fd6fe526"&gt;http://groups.google.com/group/comp.arch/msg/335aeb22fd6fe526&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/1ace9400b1b16cd4"&gt;http://groups.google.com/group/comp.arch/msg/1ace9400b1b16cd4&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&lt;a href="http://groups.google.com/group/comp.arch/msg/995379a16beb3b69"&gt;http://groups.google.com/group/comp.arch/msg/995379a16beb3b69&lt;/a&gt;&lt;BR&gt;(simply excellent Wheeler post)&lt;/P&gt;
&lt;P&gt;and more...&lt;BR&gt;&lt;BR&gt;Any thoughts?&lt;BR&gt;;^)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&amp;nbsp;&lt;/P&gt;</description><comments></comments><link>http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267800</link><pubDate>Sat, 25 Nov 2006 12:16:44 GMT</pubDate><guid isPermaLink="false">http://channel9.msdn.com/shows/Going+Deep/Programming-in-the-Age-of-Concurrency-Software-Transactional-Memory/?CommentID=267800</guid><evnet:views>0</evnet:views><evnet:viewtrackingurl>http://channel9.msdn.com/267800/WebViewBug.aspx?EVT=0</evnet:viewtrackingurl><evnet:previewtext>For those who don't follow links, here are some important snippets from some of them:
&amp;gt; I am wondering where RW locks are appropriate ?? 
Basically, they work fairly well with "read-mostly" data-structures. If you don't really care about performance and scalability, then most rw-mutex&amp;#8230;</evnet:previewtext><dc:creator>cristom</dc:creator><slash:comments>0</slash:comments><wfw:commentRss></wfw:commentRss><trackback:ping>http://channel9.msdn.com/267800/Trackback.aspx</trackback:ping></item></channel></rss>