C9 Lectures: Dr. Erik Meijer - Functional Programming Fundamentals Chapter 7 of 13

Chapter 8, Functional Parsers, is all about parsing and parsers. A parser is a program that analyses a piece of text to determine its syntactic structure. In a functional language such as Haskell, parsers can naturally be viewed as functions:
type Parser = String -> Tree
A parser is a function that takes a string and returns some form of tree.
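The lecture refines this type step by step: a parser should also return the unconsumed remainder of the input, and it may fail, so the final type is a function from a string to a list of (result, leftover) pairs. A minimal sketch (the `digit'` parser is my own illustration, not from the lecture):

```haskell
import Data.Char (isDigit)

-- The refined type from the lecture: failure is the empty list, and
-- each success pairs a result with the unconsumed rest of the input.
type Parser a = String -> [(a, String)]

-- A minimal illustrative parser: consume one digit, or fail.
digit' :: Parser Char
digit' (c:cs) | isDigit c = [(c, cs)]
digit' _                  = []
```

For example, `digit' "123"` yields `[('1',"23")]`, while `digit' "abc"` yields `[]`.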
You should watch these in sequence (or skip around depending on your current level of knowledge in this domain):
Get the presentation slides here
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Very nice shirt
Hope you get the time to do an introduction on QuickCheck. This is one of the things in Haskell that seems the most interesting to me, and I think it is pretty unique to functional programming languages, if I understand it correctly.
Excellent lecture as always. Keep up the good work.
In the book Hutton didn't explain why he chose the list type instead of the Maybe type. I'm glad you touched on that subject in the lecture.
Erik, could you make the generalization offered by lists over optional types more statically correct with dependent typing, by saying that Maybe is just a list of zero or one elements?
In the ideal world we want to have lists as a generalization of Maybe, but we don't want to lose the type safety that Maybe offers.
This problem is complete déjà vu for me: I encountered the same tension between the option type and the list type in an attempt to implement a lazy functional L-system library in haXe back in 2007.
Thanks for the "epic fail" parser, haha.
Déjà vu again! Brian Beckman repeated this so many times it was burned into my memory: the state monad is a function from a state to a state/contents pair. This looks exactly like what we see here: you start with a string and get back a string and a tree.
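The parallel can be made concrete with a short sketch (the names `State'`, `tick`, and `item` are my own, for illustration):

```haskell
-- A state transformer maps a state to a (result, new state) pair;
-- the parser type maps an input string to (result, remaining input)
-- pairs, using a list so that failure can be the empty list.
type State' s a = s -> (a, s)
type Parser a   = String -> [(a, String)]

-- Both thread the "leftover" part through to the next computation.
tick :: State' Int Int
tick n = (n, n + 1)        -- result is the old state, new state is n+1

item :: Parser Char
item []     = []
item (x:xs) = [(x, xs)]    -- result is the first char, leftover is the rest
```

Here `tick 5` gives `(5, 6)` and `item "hi"` gives `[('h',"i")]`: the same shape, modulo the list.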
Erik, why do you not like the layout rule in this case, using explicit delimiters instead? I don't think you mentioned this.
This whole parsing business is reminiscent of unfolding. Expansion of some input into a more complex form. Composition, application and unfolding, hmm...
While parser combinators can be expressed functionally on .NET, even in C#, after having written a few such combinator libraries I've found that Pratt parsers strike a better balance between simplicity and performance on the CLR. Pratt parsers also naturally handle operator precedence, while parser combinator libraries require more complicated structures to express these relationships. The extensible parser in the above link will be available in an upcoming release of my open source Sasa class library, so check it out if you're interested in this domain.
I've seen your blog before naasking and will check this out, thanks!
Excellent as always, Erik.
Perhaps like exoteric, I felt a little dissatisfied with the return type of Parser being "list of..." rather than "Maybe ...". I've come to admire the expressive power of Haskell types, and this seems like a bit of a hack. Considering, as you pointed out, that both the list and Maybe types are monads, I am wondering where the advantage lies in using the list type. Monads allow for map, filter, and lift operations, no?
The list type is generally used for parser combinators to return multiple possible parses.
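A list-returning parser can report every viable parse of an ambiguous grammar, which a Maybe-returning one cannot. A small sketch of my own (not from the lecture):

```haskell
type Parser a = String -> [(a, String)]

-- Parse a (possibly empty) run of 'a's, returning every way to split:
-- each result pairs a consumed prefix with the leftover input.
runOfAs :: Parser String
runOfAs inp = [ splitAt n inp
              | n <- [0 .. length (takeWhile (== 'a') inp)] ]
```

For example, `runOfAs "aab"` returns `[("","aab"),("a","ab"),("aa","b")]`; with Maybe, only one of those three parses could survive.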
I don't own Hutton's book, but I can think of a few reasons for not using Maybe. It would require introducing a new datatype to the reader, while the reader is already familiar with the list type. Also, it takes more characters to write when pattern matching on a Maybe value.
Now, in a real project you'd use the appropriate data type. If you know that at most a single answer will be produced, use Maybe.
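For the record, Data.Maybe in the standard library already captures the "Maybe is a list of zero or one elements" view:

```haskell
import Data.Maybe (maybeToList, listToMaybe)

-- maybeToList embeds Maybe into the list type:
--   Nothing  becomes []
--   Just x   becomes [x]
-- listToMaybe collapses the other way, keeping only the first element,
-- which is exactly how you would truncate a list of parses to one.
demo :: ([Int], Maybe Int)
demo = (maybeToList (Just 1), listToMaybe [2, 3])
```

Here `demo` evaluates to `([1], Just 2)`.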
In general, this tension between structural abstraction and type-safety is conceptually dissatisfying.
There must be a way to use the same list abstraction, only constrained to apply to [n,m] elements. Such value restrictions upheld by a type system appear to be what dependent typing provides. This adds an expressive dimension which is otherwise missing when you want to type programs precisely without losing the abstraction that lists provide over options. The ideal would be a way to move fluently from intensional to extensional semantics by explicitly coercing a type into the extensional domain of structural compatibility.
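Haskell can approximate this constrained-list idea today without full dependent types. Here is a sketch of my own using GADTs, where the list's type carries an upper bound on its length, so "Maybe" is literally the list type bounded at one element:

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

-- Type-level naturals used as length bounds.
data N = Z | S N

-- A list whose type records an upper bound on its length:
-- Cons raises the bound by one, Nil fits under any bound.
data BoundedList (n :: N) a where
  Nil  :: BoundedList n a
  Cons :: a -> BoundedList n a -> BoundedList ('S n) a

-- A list of at most one element is exactly Maybe.
type AtMostOne a = BoundedList ('S 'Z) a

fromMaybe' :: Maybe a -> AtMostOne a
fromMaybe' Nothing  = Nil
fromMaybe' (Just x) = Cons x Nil

-- Forget the bound, recovering an ordinary list.
toList' :: BoundedList n a -> [a]
toList' Nil         = []
toList' (Cons x xs) = x : toList' xs
```

The type checker now rejects `Cons 1 (Cons 2 Nil) :: AtMostOne Int`, which is the type safety Maybe offers, while `toList'` coerces back into the unconstrained list world.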
It also reminds me of template metaprogramming. Someone built a compile-time regexp module in D, and there's also a compile-time raytracer in D. In the latter case you may question its applicability, but at least it shows the expressive power.
http://www.digitalmars.com/d/2.0/templates-revisited.html
Now time to play with parsers in C# using LINQ...
I'm mystified by Erik's comment that inserting calls to parse "makes the type conversion explicit".
I can certainly see that:
parse p "something"
is preferable to
p "something"
However, to me it seems that it makes the semantics explicit rather than the type conversion.
Erik hardly seems more likely to be confused here than I am, so I think I'm missing some fundamental alternative way of viewing this.
As I'm writing this I think I'm starting to get it. Using parse makes it explicit that p is a parser. Since applying a parser to a value always returns a value with a type that has the same relationship to the input value's type, using parse makes both the semantics and the type conversions explicit.
Am I understanding this correctly?
If not, I'd appreciate it if someone could expand upon Erik's statements.
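One way to see Erik's point: in the chapter's library, Parser is a newtype wrapping the underlying function, so a parser cannot be applied to a string directly; parse unwraps it, converting the abstract Parser type back into a plain function. A sketch along those lines (abbreviated from the chapter's definitions):

```haskell
newtype Parser a = P (String -> [(a, String)])

-- 'parse' removes the P wrapper: the explicit conversion from the
-- abstract Parser type to the raw function type it hides.
parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P (\inp -> case inp of
                     []     -> []
                     (x:xs) -> [(x, xs)])
```

With this definition, `parse item "abc"` gives `[('a',"bc")]`, while `item "abc"` is a type error, since item is not itself a function on strings.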
The balance between too little and too much typing is subtle. You can go all the way and have your types be a very precise specification of your code http://www.cs.chalmers.se/~bengt/talks/tt-snu.pdf or have no static types at all. I think Hindley-Milner is a good compromise, although I think something like Cayenne http://www.cs.chalmers.se/~augustss/cayenne/ should work in practice as well.
Hello, I've tried the following code with Hugs:
type Parser a = String -> [(a,String)]

item :: Parser Char
item = \inp -> case inp of
                  [] -> []
                  (x:xs) -> [(x,xs)]

return' :: a -> Parser a
return' v = \inp -> [(v,inp)]

p' :: Parser (Char,Char)
p' = do x <- item
        z <- item
        y <- item
        return' (x,y)
It gives the following error:
Prelude> p' "12345"
ERROR file:{Hugs}\packages\hugsbase\Hugs.hs:94 - Type error in explicitly typed binding
*** Term           : p'
*** Type           : [Char] -> [(([(Char,[Char])],[(Char,[Char])]),[Char])]
*** Does not match : Parser (Char,Char)
Has anyone had this problem? Thanks in advance for any help.
I must be going blind, but I can't seem to find the slides anywhere. Also, did anyone manage to collect all the links Erik pointed-to/talked-about at the beginning of the presentation?
I also couldn't find the slides.
One of the links is this: http://www.comlab.ox.ac.uk/people/Richard.Bird/index.html
Another one is this: http://portal.acm.org/portal.cfm?coll=GUIDE&dl=GUIDE&CFID=63594272&CFTOKEN=61762843
Regards
Slides for the series: http://www.cs.nott.ac.uk/~gmh/book.html#slides
C
Thank you Charles. The source code, available at the same link, is great. Now I understand why my code wasn't running.
Regards
homework:
2) Extend the parser for arithmetic expressions to support subtraction and division, based upon the following extensions to the grammar:
expr ::= term (+ expr | - expr | e)
term ::= factor (* term | / term | e)
expr' :: Parser Int
expr' = do t <- term'
           do symbol "+"
              e <- expr'
              return (t + e)
            +++ do symbol "-"
                   e <- expr'
                   return (t - e)
            +++ return t

term' :: Parser Int
term' = do f <- factor'
           do symbol "*"
              t <- term'
              return (f * t)
            +++ do symbol "/"
                   t <- term'
                   return (f `div` t)
            +++ return f

factor' :: Parser Int
factor' = do symbol "("
             e <- expr'
             symbol ")"
             return e
           +++ natural

eval' :: String -> Int
eval' xs = case parse expr' xs of
              [(n,[])]  -> n
              [(_,out)] -> error ("unused input " ++ out)
              []        -> error "invalid input"
I'm getting the same error from my code. How did you fix your code?
I cannot find Gilat Braga's blog (I am not sure if I am writing it correctly)... Can anyone help me?
Hi,
His name is Gilad Bracha. His blog is http://gbracha.blogspot.com/
You can also watch him on C9: https://channel9.msdn.com/tags/Gilad+Bracha
C
The second paper is "The Genuine Sieve of Eratosthenes" by Melissa O'Neill:
http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
I cannot agree with the others as far as this one presentation goes... I really appreciated all the other ones, but here...
I lost you, or you lost me, or you lost it... whichever.
With
p :: Parser (Char,Char)
p = do x <- item
       item
       y <- item
       return (x,y)
I can see that item selects the first character. But what happens then? What is item operating on the second time, and how does it get to operate on that? The same for the third time: what is item operating on there, and how does it get that as an argument?
'item' must operate on the "remainder string" each time, but how does that happen? Where is it set up to do that? In the bind definition below there is parse (f v) out. So with the second item this becomes parse (item v) out (?), but what is item v? And how is the result applied to out?
p >>= f = P (\inp -> case parse p inp of
                        [] -> []
                        [(v,out)] -> parse (f v) out)
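To trace the threading concretely, here is a sketch using the unwrapped parser type (I drop the P constructor for brevity, and `bind`/`ret` are my own names for the monad operations):

```haskell
type Parser a = String -> [(a, String)]

item :: Parser Char
item []     = []
item (x:xs) = [(x, xs)]

ret :: a -> Parser a
ret v inp = [(v, inp)]

-- The leftover input 'out' of each step becomes the input of the next
-- step: that is where the second and third item get their argument.
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f inp = case p inp of
                 []         -> []
                 [(v, out)] -> f v out

-- The do-block from the question, desugared into nested binds.
-- Note that f is not item itself: it is the *rest of the do-block*,
-- a function that, given the value just parsed, yields the next parser.
p :: Parser (Char, Char)
p = item `bind` \x ->
    item `bind` \_ ->        -- result of the middle item is discarded
    item `bind` \y ->
    ret (x, y)
```

Tracing `p "abcde"`: the first item yields `[('a',"bcde")]`; bind feeds "bcde" to the rest of the block, whose item yields `[('b',"cde")]`; "cde" then goes to the third item, giving 'c' and "de"; finally ret pairs ('a','c') with the leftover "de", so the result is `[(('a','c'),"de")]`.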