Many parser combinator libraries in Haskell (such as parsec) have both a `Monad`

interface as well as an `Applicative`

interface. (Actually, to be really useful, you also need `MonadPlus`

along with `Monad`

, or `Alternative`

along with `Applicative`

, in order to encode choice; from now on when I talk about `Monad`

and `Applicative`

note that I really have `MonadPlus`

or `Alternative`

in mind as well.) The `Applicative`

interface is often nicer, but it is less powerful than the `Monad`

interface: in particular, using `Applicative`

you can only parse context-free languages, whereas `Monad`

lets you parse arbitrary context-sensitive languages. Intuitively, this is because the structure of `Applicative`

computations cannot depend on intermediate results, whereas `Monad`

computations allow you to choose which computation to run next based on intermediate results.

This is a good bit of intuition, with only one minor caveat: it isn’t true! I believe it was two years ago, during the second Hac phi, when I first learned from Edward Kmett how `Applicative`

(by which I mean, of course, `Alternative`

) can be used to parse arbitrary context-sensitive languages. The question just came up again in the `#haskell`

IRC channel, and I figured it would be useful to have this all written down somewhere. In particular, Reid Barton gave a nice sketch which I decided to turn into some working code.

Here’s the key insight: normally, grammars are defined as *finite* objects: a finite set of terminals, a finite set of nonterminals, and a finite set of productions. However, Haskell’s general recursion means that we can write down a "grammar" with an *infinite* set of production rules. This is what lets us get away with parsing context-sensitive languages with `Applicative`

: we just make a different production rule for every possible input!

First, some imports. Notice that I do *not* import `Control.Monad`

.

```
> import Text.Parsec
> import Text.Parsec.String
> import Control.Arrow ((&&&))
> import Control.Applicative hiding ((<|>))
> import Data.List (group)
```

The usual `guard`

function is for `MonadPlus`

, but we can make something equivalent for `Alternative`

.

```
> guard' :: Alternative f => Bool -> f ()
> guard' True = pure ()
> guard' False = empty
```

And now for the meat of the example. `parseArbitrary`

takes an arbitrary predicate on `Strings`

built from lowercase letters and turns it into a parser. The created parser will accept `Strings`

for which the predicate evaluates to `True`

(returning `()`

) and fail for any other string.

```
> parseArbitrary :: (String -> Bool) -> Parser ()
> parseArbitrary p =
```

If we encounter `eof`

, we simply ensure that the predicate holds of the empty string.

```
> (eof <* guard' (p []))
```

Otherwise, we choose between 26 alternatives based on the next character in the input. If the character `c`

is encountered, we make a recursive call to `parseArbitrary (p . (c:))`

. The remainder of the input must satisfy `(p . (c:))`

, that is, it must consist of some `String`

`s`

such that `(c:s)`

satisfies the predicate `p`

.

```
> <|> foldr (<|>) parserZero
> (map (\c -> char c *>
> parseArbitrary (p . (c:))
> )
> ['a'..'z']
> )
```

For any given predicate `p`

, you can think of `parseArbitrary p`

as an infinite tree with a 26-way branch at each node. Each node "remembers" the path taken to reach it from the root of the tree, in the form of prepend functions composed with the original predicate. We have constructed an infinite grammar: each node in the tree corresponds to a production, one for every possible input prefix.

Let’s try it out. Here’s a function which only accepts `String`

s of the form `"aaabbbccc"`

, with an equal number of a’s, b’s, and c’s. This is a well-known example of a language which is not context-free (easily shown using the pumping lemma for context-free languages).

```
> f :: String -> Bool
> f s
> | [('a',na), ('b',nb), ('c',nc)]
> <- map (head &&& length). group $ s
>
> = na == nb && nb == nc
>
> | otherwise = False
```

Now we make `f`

into a parser and test it on some example inputs:

```
> p = parseArbitrary f
>
> main = do
> parseTest p "aaa"
> parseTest p "aaabbbcc"
> parseTest p "aaaabcccc"
> parseTest p "aaaaabbbbbccccc"
```

The last test succeeds by returning `()`

. For the first three, we get error messages like this:

```
parse error at (line 1, column 4):
unexpected end of input
expecting "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y" or "z"
```

Obviously, these are not very helpful. But what were you expecting? ~~After all, this is one of those things that is interesting in theory, but in practice amounts to an awful hack that no one would ever want to use in real code.~~

~~In the end, it’s still true that ~~`Applicative`

can only parse context-free languages *as long as we restrict ourselves to finite grammars*—which any sensible programmer would do anyway.

[**ETA**: it looks like using infinite grammars is not as impractical as I thought—see the comments below!]

I’m not sure, but does this imply that there is a principled way to embed arbitrary context-sensitive grammars *inside* context-free grammars?

I’m not sure I understand what you’re asking here.

Well, consider this implementation of a^n b^n c^n:

abcCount n = (char ‘a’ *> abcCount (n+1)) string (replicate n ‘b’ ++ replicate n ‘c’)

p = abcCount 0

This grammar still has an infinite number of productions, but they are done in a principled way– a family of productions abcCount_n indexed by a natural number. I think you will find you get much better error messages with this implementation.

Ah, neat, you’re right!

Haven’t you shown here that it handles all the recursively enumerable languages too? or those decidable at least if we assume the (String -> Bool) predicate to be total?

Also I thought Edward Kmett had some reasonable use for infinite grammars, but they seem to be handy in Agda anyhow:

http://www.cse.chalmers.se/~nad/publications/danielsson-parser-combinators.html

Yes, you’re right. I was using “context-sensitive” as if it meant “not context-free” but I see now it is narrower than that. And thanks for the link.

Infinite grammars are perfectly useful, provided that we acknowledge the structure they have. This is just like the reason why PDAs are useful: technically push-down automata are infinite-state machines, but by encoding them as a finite-state machine plus a stack we make explicit the subregularities of the infinite-state machine. With PDAs we’re saying that infinite graphs (i.e., infinite DFAs) are helpful, because we can finitize them; with your example in mind, we’re saying that infinite hypergraphs (i.e., infinite CFGs) are helpful— which should intuitively follow from the facts that hypergraphs are helpful and that infinite graphs are helpful.

This is the same intuition behind why Ryan Ingram’s implementation gives better error messages: his version makes the subregularities defining the a^n b^n c^n language explicit; that is, he makes the finitization of the infinite hypergraph explicit. We can do this for a large collection of languages simply by adding indices/parameters to the nonterminals of a CFG —which is akin to giving a stack to a DFA— and this is what is done in many grammar formalisms more powerful than context-free. It’s the explicit finitization which enables giving decent error messages.

Though your example is quite nice because it shows how we can take an arbitrary characteristic function and convert it into an Applicative parser. I’m loathe to call a characteristic function a “grammar” for precisely the same reason why you get awful error messages— namely, that characteristic functions have no structure, whereas I’d define grammars as being a means of explaining the structure behind (particular classes of) characteristic functions.

To pile on a bit more…. (sorry)

When we translate context free grammars into the typical combinator libraries, we arguably make use of infinite grammars all the time. For instance, to translate:

S ::= a S b | e

We write

s = char ‘a’ *> s <* char 'b' return ()

or something of the sort. This looks just like the context free grammar, but it’s actually self-referential at the Haskell expression level, meaning that it’s just as infinite as:

naturals = 0 : map (+1) naturals

We just don’t typically think of it that way (although if you write parser combinator libraries in, say, Scala, you’ll bang up against this, as if you don’t make your combinators sufficiently lazy, definitions like the above will simply blow up), because it looks so similar to the CFG. But most libraries don’t deal in terminals and non-terminals, they deal in infinite trees of terminals glued together with concatenation and alternation.

So infinite grammars are actually ubiquitous in the typical parser combinator approach.

Pingback: How do you identify monadic design patterns? | PHP Developer Resource

If the set of tokens is infinite, then monadic combinators can, at least in some cases, be more expressive. For instance, consider the language { nn | n ∈ ℕ }. The approach outlined in your post does not work for this language. I discuss this in more detail in the paper mentioned above (Total Parser Combinators).

Cool, thanks for the pointer!