Typed type-level programming in Haskell, part III: I can haz typs plz?

July 19, 2010

In Part II, I showed how type families can be used to do type-level programming in a functional style. For example, here is addition of natural numbers again:

  data Z
  data S n

  type family Plus m n :: *
  type instance Plus Z n = n
  type instance Plus (S m) n = S (Plus m n)

Now, why might we want to do such a thing? One example (I know, I know, this is always the example… but hey, it’s a good example) is if we wanted to have a type of polymorphic length-indexed vectors (or as they are sometimes known, “Length-Observed Lists”) where the type of a vector includes its length. Using a generalized algebraic data type (GADT), we can write something like this:

  data LOL :: * -> * -> * where
    KThxBye :: LOL Z a
    Moar    :: a -> LOL n a -> LOL (S n) a

This says that

  1. LOL is a type constructor of kind * -> * -> *, that is, it takes two type arguments of kind * and produces a type of kind *. The intention is that the first argument records the length, and the second records the type of the elements.
  2. KThxBye constructs a vector of length zero.
  3. Given an element of type a and a vector of as of length n, Moar constructors a vector of length S n.

The type-level function Plus comes in when we implement an append function for our length-indexed vectors: in order to express the type of append we have to add the lengths of the input vectors.

  append :: LOL m a -> LOL n a -> LOL (Plus m n) a
  append KThxBye     v = v
  append (Moar x xs) v = Moar x (append xs v)

If you haven't already seen things like this, it's a good exercise to figure out why this definition of append typechecks (and why it wouldn't typecheck if we put anything other than Plus m n as the length of the output).

OK, great! We can make GHC check the lengths of our lists at compile time. So what's the problem? Well, there are (at least) three obvious things which this code leaves to be desired:

  1. It doesn't matter whether we have already declared a Nat type with constructors Z and S; we have to redeclare some empty types Z and S to represent our type-level natural number "values". And declaring empty types to use like "values" seems silly anyway.
  2. It also doesn't matter whether we've already implemented a plus function for our Nat values; we must re-code the addition algorithm at the type level with the type family Plus. Especially irksome is the fact that these definitions will be virtually identical.
  3. Finally, and most insidiously, LOL is essentially untyped. Look again at the kind of LOL :: * -> * -> *. There's nothing in the kind of LOL that tells us the first argument is supposed to be a type-level number. Nothing prevents us from accidentally writing the type LOL Int (S Z) -- we'll only run into (potentially confusing) problems later when we try to write down a value with this type.

Wouldn't it be nice if we could reuse (1) values and (2) functions at the type level, and (3) get more informative kinds in the bargain? Indeed, inspired by Conor McBride's SHE, our work aims precisely to enable (1) and (3) in GHC as a start, and hopefully eventually (2) (and other features) as well. Hopefully soon, you'll be able to write this:

  data Nat = Z | S Nat

  type family Plus (m::Nat) (n::Nat) :: Nat
  type instance Plus Z n = n
  type instance Plus (S m) n = S (Plus m n)

  data LOL :: Nat -> * -> * where
    KThxBye :: LOL Z a
    Moar    :: a -> LOL n a -> LOL (S n) a

  append :: ...  -- exactly the same as before

...or even this:

  data Nat = Z | S Nat

  plus :: Nat -> Nat -> Nat
  plus Z n = n
  plus (S m) n = S (plus m n)

  data LOL :: Nat -> * -> * where ... -- same as above

  append :: LOL m a -> LOL n a -> LOL (plus m n) a
  append = ...  -- same as before

In another post I'll explain what the above fantasy code would be doing in a bit more detail, talk about precisely how we propose to accomplish this, and discuss why we might want to do things this way, rather than introducing full dependent types (or just chucking Haskell and all moving to Agda).


Typed type-level programming in Haskell, part II: type families

July 6, 2010

In my previous post, we saw how multi-parameter type classes with functional dependencies in Haskell allow us to do type-level programming in a logic programming style. (If you’re not clear on why this corresponds to a logic programming style, see the ensuing discussion on reddit, where others explained it much better than I did in my post.)

However, MPTCs + FDs weren’t the last word on type-level programming. In 2007, along came type families.

Essentially, type families allow us to write functions on types. For example, here’s how we would implement the same Plus function from the last post, this time using type families:

  data Z
  data S n

  type family Plus m n :: *
  type instance Plus Z n = n
  type instance Plus (S m) n = S (Plus m n)

This says that for any types m and n, Plus m n is type of kind *. But it isn’t a new type, it’s just an alias for some existing type. It’s instructive to think carefully about the difference between this and type synonyms. After all, using a type synonym declaration, we can already make Plus m n an alias for some existing type, right?

Well, yes, but the difference is that a type synonym doesn’t get to look at its arguments. The technical term for this is that type synonyms must be parametric. So, for example, we can say

  type Foo m n = [(m, Maybe n)]

which defines the type synonym Foo uniformly for all arguments m and n, but using only type synonyms we cannot say

  type Foo m Int = [m]
  type Foo m Char = Maybe m

where Foo acts differently depending on what its second argument is. However, this is precisely what type families allow us to do — to declare type synonyms that do pattern-matching on their type arguments. Looking back at the Plus example above, we can see that it evaluates to different types depending on whether its first argument is Z or S n. Notice also that it is essentially identical to the way we would implement addition on regular value-level natural numbers, using pattern-matching on the first argument and a recursive call in the successor case:

  data Nat = Z | S Nat

  plus :: Nat -> Nat -> Nat
  plus Z n = n
  plus (S m) n = S (plus m n)

Let’s check that Plus works as advertised:

  *Main> :t undefined :: Plus (S Z) (S Z)
  undefined :: Plus (S Z) (S Z) :: Plus (S Z) (S Z)

Well, unfortunately, as a minor technical point, we can see from the above that ghci doesn’t expand the type family for us. The only way I currently know how to force it to expand the type family is to generate a suitable error message:

  *Main> undefined :: Plus (S Z) (S Z)

  ...No instance for (Show (S (S Z)))...

This is ugly, but it works: S (S Z) is the reduced form of Plus (S Z) (S Z).

So type families let us program in a functional style. This is nice — I daresay most Haskell programmers will be more comfortable only having to use a single coding style for both the value level and the type level. There are a few cases where a logic programming style can be quite convenient (for example, with an additional functional dependency we can use the Plus type class from the last post to compute both addition and subtraction), but in my opinion, the functional style is a huge win in most cases. (And, don’t worry, FDs and TFs are equivalent in expressiveness.)

Of course, there is a lot more to all of this; for example, I haven’t even mentioned data families or associated types. For more, I recommend reading the excellent tutorial by Oleg Kiselyov, Ken Shan, and Simon Peyton Jones, or the page on the GHC wiki. For full technical details, you can look at the System FC paper.

Nothing is ever perfect, though — in my next post, I’ll explain what type families still leave to be desired, and what we’re doing to improve things.


Follow

Get every new post delivered to your Inbox.