Stephanie Weirich, Tim Sheard and I recently submitted a paper to ICFP entitled *Binders Unbound*. (You can read a draft here.) It’s about our ~~kick-ass~~, I mean, expressive and flexible library, unbound (note: GHC 7 required), for generically dealing with names and binding structure when writing programs (compilers, interpreters, refactorers, proof assistants…) that work with syntax. Let’s look at a small example of representing untyped lambda calculus terms. This post is working Literate Haskell, feel free to save it to a `.lhs`

file and play around with it yourself!

First, we need to enable lots of wonderful GHC extensions:

```
> {-# LANGUAGE MultiParamTypeClasses
> , TemplateHaskell
> , ScopedTypeVariables
> , FlexibleInstances
> , FlexibleContexts
> , UndecidableInstances
> #-}
```

Now to import the library and a few other things we’ll need:

```
> import Unbound.LocallyNameless
>
> import Control.Applicative
> import Control.Arrow ((+++))
> import Control.Monad
> import Control.Monad.Trans.Maybe
>
> import Text.Parsec hiding ((<|>), Empty)
> import qualified Text.Parsec.Token as P
> import Text.Parsec.Language (haskellDef)
```

We now declare a `Term`

data type to represent lambda calculus terms.

```
> data Term = Var (Name Term)
> | App Term Term
> | Lam (Bind (Name Term) Term)
> deriving Show
```

The `App`

constructor is straightforward, but the other two constructors are worth discussing in more detail. First, the `Var`

constructor holds a `Name Term`

. `Name`

is an abstract type for representing names, provided by Unbound. `Name`

s are indexed by the sorts of things to which they can refer (or more precisely, the sorts of things which can be substituted for them). Here, a variable is simply a name for some `Term`

, so we use the type `Name Term`

.

Lambdas are where names are *bound*, so we use the special `Bind`

combinator, also provided by the library. Something of type `Bind p b`

represents a pair consisting of a *pattern* `p`

and a *body* `b`

. The pattern may bind names which occur in `b`

. Here is where the power of generic programming comes into play: we may use (almost) any types at all as patterns and bodies, and Unbound will be able to handle it with very little extra guidance from us. In this particular case, a lambda simply binds a single name, so the pattern is just a `Name Term`

, and the body is just another `Term`

.

Now we use Template Haskell to automatically derive a generic representation for `Term`

:

```
> $(derive [''Term])
```

There are just a couple more things we need to do. First, we make `Term`

an instance of `Alpha`

, which provides most of the methods we will need for working with the variables and binders within `Term`

s.

```
> instance Alpha Term
```

What, no method definitions? Nope! In this case (and in most cases) the default implementations, implemented in terms of automatically-derived generic representations, work just fine.

We also need to provide a `Subst Term Term`

instance. In general, an instance for `Subst b a`

means that we can use the `subst`

function to substitute things of type `b`

for `Name`

s occurring in things of type `a`

. We override the `isvar`

method so the library knows which constructor(s) of our type represent variables which can be substituted for.

```
> instance Subst Term Term where
> isvar (Var v) = Just (SubstName v)
> isvar _ = Nothing
```

OK, now that we’ve got the necessary preliminaries set up, what can we do with this? Here’s a little lambda-calculus evaluator:

```
> done :: MonadPlus m => m a
> done = mzero
>
> step :: Term -> MaybeT FreshM Term
> step (Var _) = done
> step (Lam _) = done
> step (App (Lam b) t2) = do
> (x,t1) <- unbind b
> return $ subst x t2 t1
> step (App t1 t2) =
> App <$> step t1 <*> pure t2
> <|> App <$> pure t1 <*> step t2
>
> tc :: (Monad m, Functor m) => (a -> MaybeT m a) -> (a -> m a)
> tc f a = do
> ma' <- runMaybeT (f a)
> case ma' of
> Just a' -> tc f a'
> Nothing -> return a
>
> eval :: Term -> Term
> eval x = runFreshM (tc step x)
```

Note how we use `unbind`

to take bindings apart safely, using the the `FreshM`

monad (also provided by Unbound) for generating fresh names. We also get to use `subst`

for capture-avoiding substitution. All without ever having to touch a de Bruijn index!

OK, but does it work? First, a little Parsec parser:

```
> lam :: String -> Term -> Term
> lam x t = Lam $ bind (string2Name x) t
>
> var :: String -> Term
> var = Var . string2Name
>
> lexer = P.makeTokenParser haskellDef
> parens = P.parens lexer
> brackets = P.brackets lexer
> ident = P.identifier lexer
>
> parseTerm = parseAtom `chainl1` (pure App)
>
> parseAtom = parens parseTerm
> <|> var <$> ident
> <|> lam <$> (brackets ident) <*> parseTerm
>
> runTerm :: String -> Either ParseError Term
> runTerm = (id +++ eval) . parse parseTerm ""
```

Now let’s try some arithmetic:

```
*Main> runTerm "([m][n][s][z] m s (n s z))
([s] [z] s (s z))
([s] [z] s (s (s z)))
s z"
Right (App (Var s) (App (Var s) (App (Var s)
(App (Var s) (App (Var s) (Var z))))))
```

2 + 3 is still 5, and all is right with the world.

This blog post has only scratched the surface of what’s possible. There are several other combinators other than just `Bind`

for expressing binding structure: for example, nested bindings, recursive bindings and embedding terms within patterns are all supported. There are also other operations provided, such as free variable calculation, simultaneous unbinding, and name permutation. To learn more, see the package documentation, or read the paper!

This is super awesome!

btw, is there a particular reason the FreshM is not a member of the MonadFix typeclass?

Thanks!

No reason other than simple oversight. I’ve just uploaded a new version (0.2.2) with MonadFix instances for FreshM and LFreshM. Thanks for the suggestion!

Some related work you seem not to be aware of: Dave Herman, in his recent thesis at Northeastern, provided an expressive binding specification language for specifying and typechecking Scheme macros. In particular, see Herman and Wand, A Theory of Hygienic Macros, ESOP 2008, and his dissertation, 2010, both available here: http://www.ccs.neu.edu/home/dherman/

Thanks! We were indeed not aware of that work, we’ll take a look.

Wow, really nice work. This is exactly the sort of thing that should make my life easier: I’m implementing a Haskell dialect with numeric type indices, and binding-related woes are commonplace.

Trying out the library, it seems to give incomprehensible error messages in the TH-spliced code if my expression type is a GADT or existential. Are these not supported, or am I simply missing something? If the former, would it be possible to implement support for them?

Hi Adam, unfortunately, GADTs and existential types are not supported at the moment (primarily because they are not supported by RepLib). However, they are on our to-do list. Actually, it’s more of a priority queue, and knowing you want them has just increased their priority. I make no promises but I’ll take a look and let you know.

Pingback: Unbound now supports “set” binders and GADTs « blog :: Brent -> [String]