Must review my Arabic ... انا تلميذ حديد :)
4 days ago
@Richard -- here you go; feedback always welcome!
Jan 12, 2012 at 6:56 PM
Hi, Steffen -- I was after something a little deeper, such as a user-defined numerical class (sketching, now), say logarithmic doubles, in which operator * is mapped to arithmetic + on doubles. I couldn't find a way to make Viterbi generic over both user-supplied classes with operator overloads AND over built-in types, where operators are not the same kind of thing as operator overloads. I could wrap all the built-in types in classes whose operator * maps to *, etc., but that's a lot of work. That's essentially implementing my own "Numeric Tower" over the CLR basic types, and it didn't seem worth it to me just to get a corner case like logarithmic double. Maybe there *is* a way, I just couldn't find it. Now I'm going to look at Richard.Hein's link.
Edit: just an aside about why logarithmic doubles might be valuable: multiplying a bunch of small probabilities over and over eventually underflows doubles. Bad. Sometimes better to add logarithms of probabilities to get the log of the product of the probabilities.
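To make the underflow point concrete, here is a minimal sketch in Python rather than C#, so it sidesteps the generic-operator problem instead of solving it. The names `LogProb`, `from_prob`, and `product` are hypothetical, made up for this illustration; the only idea taken from the comment above is that `*` on the wrapper is mapped to `+` on the stored logarithms.

```python
import math


class LogProb:
    """Hypothetical 'logarithmic double': stores log(p) instead of p."""

    def __init__(self, log_value):
        self.log_value = log_value

    @classmethod
    def from_prob(cls, p):
        # Assumes p > 0; log(0) is not handled in this sketch.
        return cls(math.log(p))

    def __mul__(self, other):
        # Multiplying probabilities == adding their logarithms.
        return LogProb(self.log_value + other.log_value)

    def __lt__(self, other):
        # log is monotonic, so comparing logs orders the probabilities.
        return self.log_value < other.log_value


def product(factors, one):
    """Multiply factors with *, starting from a caller-supplied identity."""
    result = one
    for f in factors:
        result = result * f
    return result


# One hundred factors of 1e-5: the true product is 1e-500, far below
# the smallest positive double, so the plain version collapses to 0.0.
plain = product([1e-5] * 100, 1.0)
logged = product([LogProb.from_prob(1e-5)] * 100, LogProb(0.0))

print(plain)             # 0.0
print(logged.log_value)  # roughly -1151.29, still perfectly usable
```

The same `product` function runs over both plain floats and the wrapper only because Python resolves `*` at run time; that is exactly the part that is hard to express with C# generics.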
Jan 02, 2012 at 2:50 PM
Just had another little thought -- since Viterbi is modifying its internal variables V and Path as observations present themselves in OnNext, Viterbi is itself a little monad. It could inherit from the state monad or just be built up on its own, but the point is that the OnNext function on the surface would internally call "Bind" or "SelectMany" on the Viterbi monad; that is, OnNext would just call the LINQ provider of Viterbi! This might shrink and robustify the code even more.
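A rough way to see the "state threaded through each observation" idea, sketched in Python and not taken from the post's C# code: write one pure step function that maps (V, Path) plus an observation to a new (V, Path), then fold it over the observations. The function names and the toy model (two states, three kinds of observation, all probabilities) are assumptions chosen for illustration only.

```python
from functools import reduce


def viterbi_step(state, obs, states, trans_p, emit_p):
    """Pure step: old (V, path) + one observation -> new (V, path)."""
    V, path = state
    new_V, new_path = {}, {}
    for s in states:
        # Best predecessor for s given the new observation.
        prob, prev = max(
            (V[p] * trans_p[p][s] * emit_p[s][obs], p) for p in states
        )
        new_V[s] = prob
        new_path[s] = path[prev] + [s]
    return new_V, new_path


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Fold viterbi_step over the observations (assumes at least one)."""
    first, rest = observations[0], observations[1:]
    init = (
        {s: start_p[s] * emit_p[s][first] for s in states},
        {s: [s] for s in states},
    )
    V, path = reduce(
        lambda st, o: viterbi_step(st, o, states, trans_p, emit_p), rest, init
    )
    best = max(states, key=lambda s: V[s])
    return V[best], path[best]


# Toy model, hypothetical numbers.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

prob, best_path = viterbi(
    ["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p
)
print(best_path)  # ['Healthy', 'Healthy', 'Fever']
```

Here `reduce` plays the role that Bind/SelectMany would play in a monadic version: each step receives the previous state and hands on the next one, and nothing is mutated in place.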
Dec 31, 2011 at 10:42 AM
@JoshRoss: or even a precomputed corpus from the same folks
Dec 30, 2011 at 3:03 PM
@Jan de Vaan: Yes, this looks like a promising direction. http://www.codeproject.com/KB/cs/genericnumerics.aspx
I also found out it's possible to go generic on numeric types in F# IFF the methods are INLINED. That's because the compiler just "pastes" the inlined code after resolving the generic type, and then the normal method lookup finds the appropriate operator implementations. Not sure if this approach is available in C#.
Dec 30, 2011 at 2:59 PM
@JoshRoss: you will also need "transition probabilities," which you can get from digram frequencies (frequencies of word pairs). One really cool way to get these is to analyze free texts on Project Gutenberg (public-domain ebooks as plain text!).
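In case it helps, here is a minimal Python sketch of turning word-pair counts into transition probabilities. The function name `transition_probabilities`, the crude regex tokenizer, and the sample sentence are all assumptions for illustration; a real corpus would need proper tokenization and some smoothing for pairs that never occur.

```python
import re
from collections import Counter, defaultdict


def transition_probabilities(text):
    """Estimate P(next word | word) from adjacent word pairs in text."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    pair_counts = Counter(zip(words, words[1:]))
    # The last word never starts a pair, so leave it out of the totals.
    totals = Counter(words[:-1])
    probs = defaultdict(dict)
    for (w1, w2), n in pair_counts.items():
        probs[w1][w2] = n / totals[w1]
    return probs


sample = "the cat sat on the mat the cat ran"
table = transition_probabilities(sample)

# "the" is followed by "cat" twice and "mat" once in the sample.
print(table["the"])
```

For a plain-text ebook you would read the file into a string and pass it in place of `sample`.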
Dec 30, 2011 at 2:45 PM
@PCB: yup, just for symmetry (a euphemism for "copy-paste" code). ContainsKey would have been better.
Dec 29, 2011 at 7:04 AM
Anyone taking up my invitation to try a spelling corrector with this?