Entries:
Comments:
Posts:

Loading User Information from Channel 9

Something went wrong getting user information from Channel 9

Latest Achievement:

Loading User Information from MSDN

Something went wrong getting user information from MSDN

Visual Studio Achievements

Latest Achievement:

Loading Visual Studio Achievements

Something went wrong getting the Visual Studio Achievements

Brian Beckman: Hidden Markov Models, Viterbi Algorithm, LINQ, Rx and Higgs Boson

Download

Right click “Save as…”

It's been WAY too long since we've had Brian Beckman sharing knowledge, insights and perspectives on Channel 9. This changes now! Smiley

Needless to say, I was incredibly happy to spend an hour with Brian learning all about what he's up to these days. Not surprisingly, he's writing code and employing Rx and monads to solve very interesting problems. In this conversation (a code lesson, algorithm survey, a splash of random topical diversion), Brian explains and demonstrates his latest endeavor: implementing the Viterbi algorithm in C#. What's the Viterbi algorithm, Brian? What are hidden Markov models? What are you using this stuff for? Where does Rx fit into this? What's going on? By the way, it's awesome to learn that a Niner has been sharing C# monadic implementations with Brian (state monad, maybe monad). Smiley

Of course, no conversation with Brian - a physicist by training and a software architect at Microsoft - is complete without talking about some current physics problem: Finding the elusive Higgs Boson is all the rage these days, so we talk about what it means.

Brian also shares insights on Haskell, functional and hybrid programming languages (C# is imperative, but it provides functional capabilities like LINQ, for example, upon which Rx is built (Rx is LINQ-to-Streams or observable sequences of events, really)...). We also finally discuss his previous work at MS that we never got a chance to talk to him about while he was doing it. Before joining the Bing Mobile team, Brian was working on a project to create a new functional programming language. What was it?

Thank you, Brian!

Happy holidays from Channel 9 wherever you are and whatever, if anything, you're celebrating!


Notes and More:

The code Brian demos (download it, unzip it, launch VS, open the solution, then watch this video and play along): https://github.com/rebcabin/DotNetExtensionsImproved

From Wikipedia - information on Markov and Viterbi:

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states

The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states – called the Viterbi path – that results in a sequence of observed events, especially in the context of Markov information sources, and more generally, hidden Markov models.

Tags:

Follow the Discussion

  • martinminemartinmine I eat C# for breakfast

    Awesome! Its amazing I actually understood all what he said about probablillity Smiley Also, interesting talk around the Higgs boson which I find very interesting myself! Happy holidays!

  • everydaypanoseverydaypan​os

    Heavy stuff

  • Allan LindqvistaL_ Kinect ftw

    yay, brian Big Smile

  • Brian - Thanks for the plug!

    All - The implementation of the Maybe monad used in this project is a much older implementation.  I've continued to improve it, adding operators, and I made a huge investment in improving the performance a number of months ago.  I'm going to drop the latest version of the Maybe monad implementation into this project and see what the performance numbers come out as.  I'll send Brian a pull request so you can see the results.

    Beyond the Maybe monad implementation, I'm working on two others: Outcome and Result (which really derive from the Writer monad).  My blog has an *older* post explaining the Maybe monad and it's implementation (I'm in the process of writing an updated post), and you can see the code for all of it on GitHub and try it out through Nuget.  Watch my blog for more info on the Maybe, Outcome, and Result monad implementations.

    - Jordan

     

  • CharlesCharles Welcome Change

    @jordanterrell: Thank you, Jordan.

    C

  • felix9felix9 the cat that walked by itself

    beautiful code Big Smile

  • Both a "Checking In" && "Going Deep" released on Christmas Day. What a good gift. Both, out of the park. Happy Holidays guys.

  • bryaneddsbryanedds An ​individuali​st is he who is saving himself from all those who are saving the world.

    I am a huge BrianMcn fan, but I wonder why he seems generally unsatisfied with immutable Maps a la F#. You can transform such persistent data structures typically with log n performance. When you consider that Dictionary is constant time, but with a comparatively big constant and memory footprint, the race seems a little more even.

    EDIT: Meant Brian Beckman fan Smiley But also McN fan too still Smiley

  • @bryanedds : I've had a couple of good experiences with F# (simulations of card and dice games, just for fun) but I need some more "seat time" with it before weighing in.  I'm particularly interested in particle filters (/slaps forehead), as well as Kalman filters (Unscented, Extended, and others) in reactive form.  I can hammer them out with confidence in C# or take the deep dive and try them in F#.  We'll see Smiley

    My first job out of grad school involved "heavy industrial" Kalman in Fortran II at Jet Propulsion Lab, where we used them to track everything we could track in the solar system. Daily runs had around 1,000 parameters and 25 million observations, but that was decades ago, and that kind of thing can be easily done on a windows phone now (required liquid nitrogren back then Smiley  Imagine the applications!

  • Bent Rasmussenexoteric stuck in a loop, for a while

    Elegant and cool as usual. (And the blackboard color crayon style looks nice.)

    I like the way the monadic laws are being expressed here:

    https://github.com/iSynaptic/Monad-Comonad-Precis/blob/master/Precis.cs

  • bryaneddsbryanedds An ​individuali​st is he who is saving himself from all those who are saving the world.

    @brianbec:Awesome - I can't wait to hear your thoughts on F#. We can trivially implement a nice Rx monad for F#'s computation comprehensions. However, it looks like there are still a couple implementation issues with it that the Rx team might look at.

    I'm just now exploring what reactive facilities that F# conveniently exposes. I'm using it for declarative simulation building (think objects that transform themselves through time according to their environment without any manual mutation or effectful message passing). Because there is no classical mutation (and no mutation at all in debug mode), it should allow automatic parallelization of the simulation by dependency analysis. It also should make reasoning about large simulations tractable (you should see the simulation engine behind the Sims... a gargantuan mess from the eyes of a functional programmer, yet taken for granted by OO people). For further detail, AML and DOL.

  • Wow, what a great video!

  • @exoteric:the blackboard style is a cheat Smiley  try this (on Win7)

    Windows key & "+" -- turn on built-in magnifier

    Windows key & "-" -- shrink back down

    Control-Alt "i" -- inverse video your whole world

  • Adam SpeightAdam​Speight2008 The Bandito Coder

    What advantages does the MayBe monad have over an Option<T> ?

    Option<T> can be thought as nullable over any "kind" of type (ref or val)

  • SimonSimon

    What about http://msdn.microsoft.com/en-us/library/bb531208.aspx instead of Brian's AddUnconditionally ?

  • @Simon: that will work great if you have explicit data that you can represent in-line in the code (and it so happens I do in my Wikipedia example).  But I don't see how to take advantage of these collection initializers when reading data from an external source.  For an application like that, I can still Aggregate over a composable dictionary, as in lines 168 through 174 in my unit test DictionaryExtensionsTests.cs... like this

    const int range = 1280;
    var kvps = Enumerable
        .Range(0, range)
        .Select(i => new KeyValuePair<int, string>(i, Convert.ToChar(i).ToString()))
        ;
    
    dict = kvps.Aggregate(dict, (d, kvp) => d.AddUnconditionally(kvp));
    

  • @AdamSpeight2008:looks like Option<T> is F#'s implementation of the Maybe monad -- different in name only, so far as I can see.  They even have a good fraction of the LINQ operators implemented on it.  I hope to be able to say more about F#, in general, after getting some more seat time with it.

  • MartinMartin

    I've watched this three times now - it's that good! - downloaded the source and had a play with it.

    It was great to be able to look at the code and play with it. The functional aspects notwithstanding, it was great to look at some C# that uses some of what I consider more advanced language features (such as Linq, generics and extension methods) and also using Test projects in a real world context.

    I'm still coming to terms with Functional programming, but videos like this are certainly a great help. One take home message I got from this video and Jordan and Brian's work is this: investing time in learning Functional techniques to begin with will pay dividends in the future, not least because once you start using functional techniques your code becomes far easier to write (fewer try...catch blocks all over the place etc.) and therefore you become more productive.

    Also, where did Brian get that great keyboard? I want one!

  • Here is an F# version  - just for kicks

    http://fsviterbi.codeplex.com

    See F# source here: http://fsviterbi.codeplex.com/SourceControl/changeset/view/64363#1117175

    Test script: http://fsviterbi.codeplex.com/SourceControl/changeset/view/64363#1117178

    Some differences:

    Actual code (excluding comments) is slighly shorter due to F#'s better type inference and buit-in tuple support.

    Path copying is avoided due to 'structural sharing' of F# collections (list in this case)

    The double loop calculation is parallelized using TPL wapper "PSeq"  but can be easily serialized. This could be useful for very large state sets on multi-core machines.

  • I also would prefer F#.

    But how about using ReSharper with C#?

     

  • Brian - I dropped in the latest Maybe implementation and the performance improved slightly, but not enough to get excited by.  It will have to stay "art" code - for now. There is a pull request in your GitHub queue to integration these changes, sourced from this repo: https://github.com/iSynaptic/DotNetExtensionsImproved

    All - That said, it shouldn't deter anyone from using monads in general or the Maybe implementation.  My team is using it in performance critical production code with no problems.  It entirely depends on your use case.  If you interacting with web services, databases, or other IO, using monads is most likely not going to have a detrimental impact.  However, if you doing computationally intensive work, my suggestion is you profile your code to see where it is spending CPU cycles.  You may, or may not, find that usage of monads isn't the biggest bottleneck.

  • @jordanterrell: Yes, yes, yes, PROFILE to find perf bottlenecks!  I jumped straight from the art code to the opposite extreme of in-place mutation, but did not profile the art code, just assumed that spinning up closures in State was the culprit.

    And 100% agree that the provability and reliability of monadic functional code is worth it in many (most?) circumstances.  On the perf critical path, profile and compromise as needed, focusing on implementation innards.

  • @fwaris:Very nice!

    Anyone taking up my invitation to try a spelling corrector with this?

  • Joshua RossJoshRoss Drinking Ovaltine since 2004

    @brianbec: I'm looking for a good word frequency list, and this is what I've found.  Code soon to follow, I hope.

    -Josh

  • PCBPCB

    @brianbec: in AddConditionally, why TryGetValue rather than just ContainsKey?  Was it just for symmetry with the "art" code?  Smiley

  • @PCB: yup, just for symmetry (euphemism for "copy-paste" code Smiley  ContainsKey would have been better.

  • Jan de VaanJan de Vaan


    Early in the video Brian mentioned he would like to make his Viterbi algorithm generic with respect to the numeric type used (double, float, decimal,.... ). The article below summarises a potential solution for this deficit in clr generics.

    I dont know if this technique was ever used in production code but it deserves to be mentioned.
    http://www.codeproject.com/KB/cs/genericnumerics.aspx

  • @JoshRoss: you will also need "transition probabilities," which you can get from digram frequencies (frequencies of word pairs). One really cool way to get these is to analyze free texts on project gutenberg (public-domain ebooks as plain text!). 

  • @Jan de Vaan: Yes, this looks like a promising direction. http://www.codeproject.com/KB/cs/genericnumerics.aspx

    I also found out it's possible to go generic on numeric types in F# IFF the methods are INLINED. That's because the compiler just "pastes" the inlined code after resolving the generic type, and then the normal method lookup finds the appropriate operator implementations. Not sure if this approach is available in C#.

  • Joshua RossJoshRoss Drinking Ovaltine since 2004

    @brianbec: Or better yet, it seems that getting a plain text dump of wikipedia, isn't so difficult.

    -Josh

  • @JoshRoss: or even a precomputed corpus from the same folks Smiley

    http://blog.afterthedeadline.com/2010/07/20/after-the-deadline-bigram-corpus-our-gift-to-you/ 

  • Richard Anthony HeinRichard.Hein Stay on Target

    Awesome!  My internet has been intermittent for 4 days, and was out completely for 2 ... now I finally have access again (hopefully not intermittent) and have to catch up on all these awesome Christmas presents ... thanks Charles and Brian!  Smiley

  • CKoenigCbrAInK Carsten

    finally ... cannot wait to see this ... downloading now - enjoying after work ... thank you guys

  • Very clever guy. I am learning to code with C++ for flight simulation. It is quite a subject but it has helped me out no end. Superb article!

  • Just had another little thought -- Since Viterbi is modifying its internal variables V and Path as observations present themselves in OnNext, Viterbi is, itself a little monad.  It could inherit from the state monad or just built up on its own, but the point is that the OnNext function on the surface would internally call "Bind" or "SelectMany" on the Viterbi monad, that is, OnNext would just call the LINQ provider of Viterbi!  This might shrink and robustify the code even more.

  • Can Brian please put his dot files on GitHub?  Would love to switch to another awesome dark theme.

  • Even with C# and LINQ a compact functional implementation is possible:

    //using alias directive for P: using P = Double;
    static IObservable<IDictionary<S, Tuple<P, S>>> ViterbiRx<O, S>
    (IObservable<Tuple<O, IEnumerable<S>>> obsStates, Func<S, P> startingProb, Func<S, S, P> transitionProb, Func<S, O, P> emissionProb)
    {
        return obsStates.Scan((IDictionary<S, Tuple<P, S>>)null, (v, o) => o.Item2.ToDictionary(target => target, target =>
            v == null ? Tuple.Create(startingProb(target) * emissionProb(target, o.Item1), target)
            : v.Select(source => Tuple.Create(source.Value.Item1 * transitionProb(source.Key, target) * emissionProb(target, o.Item1), source.Key)).Max()));
    }
    

  • Suspended threadSuspended thread

    Nice video, clear content, great speaker:
    Any change we get Brian for a series on:
    "Implementing well known algorithms in functional style
    with an IObservable topping"

  • Richard Anthony HeinRichard.Hein Stay on Target

    Hi Brian, you said you had some problems with generic operators ... there's a library here (a friend of mine just told me about it) that you may find useful in the future:  http://www.yoda.arachsys.com/csharp/miscutil/usage/genericoperators.html

     

  • bubblebubble

    I'd like to see this in C++11, please!

  • , SteffenZeidler wrote

    Hi, Steffen -- I was after something a little deeper, such as a user-defined numerical class (sketching, now), say logarithmic doubles, in which operator * is mapped to arithmetic + on doubles.  I couldn't find a way to make Viterbi generic over both user-supplied classes with operator overloads AND over built-in types, where operators are not the same kind of thing as operator overloads.  I could wrapper all the built-in types with operator * that maps to * etc., but that's  a lot of work.  That's essentially implementing my own "Numeric Tower" over the CLR basic types, and it didn't seem worth it to me to get a corner case like logarithmic double.  Maybe there *is* a way, I just couldn't find it.  Now I'm going to look at Richard.Hein's link Smiley

     

    Edit: just an aside about why logarithmic doubles might be valuable: multiplying a bunch of small probabilities over and over eventually underflows doubles.  Bad.  Sometimes better to add logarithms of probabilities to get the log of the product of the probabilities. 

  • Hi Brian -- For a simple generic numerical solution could be used an additional function parameter mul.

    static IObservable<IDictionary<S, Tuple<P, S>>> ViterbiRx<O, S, P>
    (IObservable<Tuple<O, IEnumerable<S>>> obsStates, Func<S, P> startingProb, Func<S, S, P> transitionProb, Func<S, O, P> emissionProb, Func<P, P, P> mul)
    {
        return obsStates.Scan((IDictionary<S, Tuple<P, S>>)null, (v, o) => o.Item2.ToDictionary(target => target, target =>
            v == null ? Tuple.Create(mul(startingProb(target), emissionProb(target, o.Item1)), target)
            : v.Select(source => Tuple.Create(mul(mul(source.Value.Item1, transitionProb(source.Key, target)), emissionProb(target, o.Item1)), source.Key)).Max()));
    }
    

    Test:

    var obsResults = ViterbiRx<string,string,double>(mul: (x1, x2) => x1 * x2, ...);
    var res = obsResults.ToListObservable();
    Debug.Assert(Math.Round(res.Value[Rainy].Item1, 6) == 0.013440);
    Debug.Assert(Math.Round(res.Value[Sunny].Item1, 6) == 0.002592);
    Debug.Assert(res.Value.OrderBy(x => x.Value).Last().Key == Rainy);
    Debug.Assert(res[2][Rainy].Item2 == Rainy);
    Debug.Assert(res[1][Rainy].Item2 == Sunny);
    

Remove this comment

Remove this thread

close

Comments Closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.