Binders Unbound

Stephanie Weirich, Tim Sheard and I recently submitted a paper to ICFP entitled Binders Unbound. (You can read a draft here.) It’s about our kick-ass, I mean, expressive and flexible library, unbound (note: GHC 7 required), for generically dealing with names and binding structure when writing programs (compilers, interpreters, refactorers, proof assistants…) that work with syntax. Let’s look at a small example of representing untyped lambda calculus terms. This post is working Literate Haskell, feel free to save it to a .lhs file and play around with it yourself!

First, we need to enable lots of wonderful GHC extensions:

> {-# LANGUAGE MultiParamTypeClasses
>            , TemplateHaskell
>            , ScopedTypeVariables
>            , FlexibleInstances
>            , FlexibleContexts
>            , UndecidableInstances
>   #-}

Now to import the library and a few other things we’ll need:

> import Unbound.LocallyNameless
> 
> import Control.Applicative
> import Control.Arrow ((+++))
> import Control.Monad
> import Control.Monad.Trans.Maybe
> 
> import Text.Parsec hiding ((<|>), Empty)
> import qualified Text.Parsec.Token as P
> import Text.Parsec.Language (haskellDef)

We now declare a Term data type to represent lambda calculus terms.

> data Term = Var (Name Term)
>           | App Term Term
>           | Lam (Bind (Name Term) Term)
>   deriving Show

The App constructor is straightforward, but the other two constructors are worth discussing in more detail. First, the Var constructor holds a Name Term. Name is an abstract type for representing names, provided by Unbound. Names are indexed by the sorts of things to which they can refer (or more precisely, the sorts of things which can be substituted for them). Here, a variable is simply a name for some Term, so we use the type Name Term.

Lambdas are where names are bound, so we use the special Bind combinator, also provided by the library. Something of type Bind p b represents a pair consisting of a pattern p and a body b. The pattern may bind names which occur in b. Here is where the power of generic programming comes into play: we may use (almost) any types at all as patterns and bodies, and Unbound will be able to handle it with very little extra guidance from us. In this particular case, a lambda simply binds a single name, so the pattern is just a Name Term, and the body is just another Term.

Now we use Template Haskell to automatically derive a generic representation for Term:

> $(derive [''Term])

There are just a couple more things we need to do. First, we make Term an instance of Alpha, which provides most of the methods we will need for working with the variables and binders within Terms.

> instance Alpha Term

What, no method definitions? Nope! In this case (and in most cases) the default implementations, implemented in terms of automatically-derived generic representations, work just fine.

We also need to provide a Subst Term Term instance. In general, an instance for Subst b a means that we can use the subst function to substitute things of type b for Names occurring in things of type a. We override the isvar method so the library knows which constructor(s) of our type represent variables which can be substituted for.

> instance Subst Term Term where
>   isvar (Var v) = Just (SubstName v)
>   isvar _       = Nothing

OK, now that we’ve got the necessary preliminaries set up, what can we do with this? Here’s a little lambda-calculus evaluator:

> done :: MonadPlus m => m a
> done = mzero
> 
> step :: Term -> MaybeT FreshM Term
> step (Var _) = done
> step (Lam _) = done
> step (App (Lam b) t2) = do
>   (x,t1) <- unbind b
>   return $ subst x t2 t1
> step (App t1 t2) =
>       App <$> step t1 <*> pure t2
>   <|> App <$> pure t1 <*> step t2
> 
> tc :: (Monad m, Functor m) => (a -> MaybeT m a) -> (a -> m a)
> tc f a = do
>   ma' <- runMaybeT (f a)
>   case ma' of
>     Just a' -> tc f a'
>     Nothing -> return a
> 
> eval :: Term -> Term
> eval x = runFreshM (tc step x)

Note how we use unbind to take bindings apart safely, using the the FreshM monad (also provided by Unbound) for generating fresh names. We also get to use subst for capture-avoiding substitution. All without ever having to touch a de Bruijn index!

OK, but does it work? First, a little Parsec parser:

> lam :: String -> Term -> Term
> lam x t = Lam $ bind (string2Name x) t
> 
> var :: String -> Term
> var = Var . string2Name
> 
> lexer    = P.makeTokenParser haskellDef
> parens   = P.parens lexer
> brackets = P.brackets lexer
> ident    = P.identifier lexer
> 
> parseTerm = parseAtom `chainl1` (pure App)
> 
> parseAtom = parens parseTerm
>         <|> var <$> ident
>         <|> lam <$> (brackets ident) <*> parseTerm
> 
> runTerm :: String -> Either ParseError Term
> runTerm = (id +++ eval) . parse parseTerm ""

Now let’s try some arithmetic:

*Main> runTerm "([m][n][s][z] m s (n s z)) 
                ([s] [z] s (s z)) 
                ([s] [z] s (s (s z))) 
                s z"
Right (App (Var s) (App (Var s) (App (Var s) 
        (App (Var s) (App (Var s) (Var z))))))

2 + 3 is still 5, and all is right with the world.

This blog post has only scratched the surface of what’s possible. There are several other combinators other than just Bind for expressing binding structure: for example, nested bindings, recursive bindings and embedding terms within patterns are all supported. There are also other operations provided, such as free variable calculation, simultaneous unbinding, and name permutation. To learn more, see the package documentation, or read the paper!

About these ads
This entry was posted in grad school, haskell, projects, writing and tagged , , , , , . Bookmark the permalink.

7 Responses to Binders Unbound

  1. Job Vranish says:

    This is super awesome!

    btw, is there a particular reason the FreshM is not a member of the MonadFix typeclass?

    • Brent says:

      Thanks!

      No reason other than simple oversight. I’ve just uploaded a new version (0.2.2) with MonadFix instances for FreshM and LFreshM. Thanks for the suggestion!

  2. Sam Tobin-Hochstadt says:

    Some related work you seem not to be aware of: Dave Herman, in his recent thesis at Northeastern, provided an expressive binding specification language for specifying and typechecking Scheme macros. In particular, see Herman and Wand, A Theory of Hygienic Macros, ESOP 2008, and his dissertation, 2010, both available here: http://www.ccs.neu.edu/home/dherman/

  3. Adam Gundry says:

    Wow, really nice work. This is exactly the sort of thing that should make my life easier: I’m implementing a Haskell dialect with numeric type indices, and binding-related woes are commonplace.

    Trying out the library, it seems to give incomprehensible error messages in the TH-spliced code if my expression type is a GADT or existential. Are these not supported, or am I simply missing something? If the former, would it be possible to implement support for them?

    • Brent says:

      Hi Adam, unfortunately, GADTs and existential types are not supported at the moment (primarily because they are not supported by RepLib). However, they are on our to-do list. Actually, it’s more of a priority queue, and knowing you want them has just increased their priority. I make no promises but I’ll take a look and let you know.

  4. Pingback: Unbound now supports “set” binders and GADTs « blog :: Brent -> [String]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s