Back from Baltimore

October 2, 2010

I’m finally back in Philly after attending ICFP, the Haskell Symposium, and the Haskell Implementors’ Workshop in Baltimore. I enjoyed meeting every person I met, so if I met you please feel free to instantiate the universal quantifier.

  • Here’s a link to my Haskell Symposium paper, Species and Functors and Types, Oh My!. My talk seemed to be well-received, and it was videotaped, but unfortunately I think it’s going into the ACM Digital Library, which you need a subscription to access. Here are the slides, at least, although they don’t make all that much sense by themselves. I had several people ask me what I used to make the slides; it was a combination of inkscape and my diagrams EDSL.
  • Here are links to the slides and video of my talk from the Haskell Implementors’ Workshop, Typed type-level functional programming with GHC.

Typed type-level programming in Haskell, part IV: collapsing types and kinds

August 5, 2010

In Part III, we saw how the current state of the art in Haskell type-level programming leaves some things to be desired: it requires duplicating both data declarations and code, and even worse, it’s untyped. What to do?

Currently, GHC’s core language has three “levels”:

  1. Expressions: these include things like term variables, lambdas, applications, and case expressions.
  2. Types: e.g. type variables, base types, forall, function types.
  3. Kinds: * and arrow kinds.

Types classify expressions (the compiler accepts only well-typed expressions), and kinds classify types (the compiler accepts only well-kinded types). For example,

3 :: Int,
Int :: *,
Maybe :: * -> *,

and so on.

The basic idea is to allow the automatic lifting of types to kinds, and their data constructors to type constructors. For example, assuming again that we have

data Nat = Z | S Nat

we can view Z :: Nat as either the data constructor Z with type Nat, or as the type Z with kind Nat. Likewise, S :: Nat -> Nat can be viewed either as a data constructor and a type, or a type constructor and a kind.

One obvious question: if Z is a type, and types classify expressions, then what expressions have type Z? The answer: there aren’t any. But this makes sense. We want to be able to use Z as a type-level “value”, and don’t really care about whether it classifies any expressions. And indeed, without this auto-lifting, if we wanted to have a type-level Z we would have declared an empty data type data Z.

Notice we have much richer kinds now, since we are basically importing an entire copy of the type level into the kind level. But that also means we will need something to classify kinds as well, so we need another level… and what’s to stop us from lifting kinds up to the next level, and so on? We would end up with an infinite hierarchy of levels. In fact, this is exactly what Omega does.

But in our case we can do something much simpler: we simply collapse the type and kind levels into a single level so that types and kinds are now the same thing, which I will call typekinds (for lack of a better term). We just take the ordinary syntax of types that we already had, and the only things we need to add are lifted data constructors and *. (There are still some questions about whether we represent arrow kinds using the arrow type constructor or forall, but I’ll leave that aside for the moment.) To tie the knot, we add the axiom that the typekind * is classified by itself. It is well-known that this allows the encoding of set-theoretic paradoxes that render the type system inconsistent when viewed as a logic — but Haskell’s type system is already an inconsistent logic anyway, because of general recursion, so who cares?

So, what are the difficult issues remaining?

  • Coercions: GHC’s core language includes a syntax of coercions for explicitly casting between equivalent types. Making the type system richer requires more sophisticated types of coercions and makes it harder to prove that everything still works out. But I think we have this mostly ironed out.
  • Surface syntax: Suppose in addition to Z :: Nat we also declare a type called Z. This is legal, since expressions and types inhabit different namespaces. But now suppose GHC sees Z in a type. How does it know which Z we want? Is it the type Z, or is it the data constructor Z lifted to a type? There has to be a way for the programmer to specify what they mean. I think we have a solution to this that is simple to understand and not too heavyweight — I can write about this in more detail if anyone wants.
  • Type inference: This probably makes type inference a lot harder. But I wouldn’t know for sure since that’s the one thing we haven’t really thought too hard about yet.

One final question that may be bothering some: why not just go all the way and collapse all the levels, and have a true dependently-typed language? It’s a valid question, and there are of course languages, notably Agda, Coq, and Epigram, which take this approach. However, one benefit of maintaining a separation between the expression and typekind levels is that it enables a simple erasure semantics: everything at the expression level is needed at runtime, and everything at the typekind level can be erased at compile time since it has no computational significance. Erasure analysis for languages with collapsed expression and type levels is still very much an active area of research.

There’s more to say, but at this point it’s probably easiest to just open things up to questions/comments/feature requests and I’ll write more about whatever comes up! I should probably give more examples as well, which I’ll try to do soon.


Typed type-level programming in Haskell, part II: type families

July 6, 2010

In my previous post, we saw how multi-parameter type classes with functional dependencies in Haskell allow us to do type-level programming in a logic programming style. (If you’re not clear on why this corresponds to a logic programming style, see the ensuing discussion on reddit, where others explained it much better than I did in my post.)

However, MPTCs + FDs weren’t the last word on type-level programming. In 2007, along came type families.

Essentially, type families allow us to write functions on types. For example, here’s how we would implement the same Plus function from the last post, this time using type families:

  data Z
  data S n

  type family Plus m n :: *
  type instance Plus Z n = n
  type instance Plus (S m) n = S (Plus m n)

This says that for any types m and n, Plus m n is type of kind *. But it isn’t a new type, it’s just an alias for some existing type. It’s instructive to think carefully about the difference between this and type synonyms. After all, using a type synonym declaration, we can already make Plus m n an alias for some existing type, right?

Well, yes, but the difference is that a type synonym doesn’t get to look at its arguments. The technical term for this is that type synonyms must be parametric. So, for example, we can say

  type Foo m n = [(m, Maybe n)]

which defines the type synonym Foo uniformly for all arguments m and n, but using only type synonyms we cannot say

  type Foo m Int = [m]
  type Foo m Char = Maybe m

where Foo acts differently depending on what its second argument is. However, this is precisely what type families allow us to do — to declare type synonyms that do pattern-matching on their type arguments. Looking back at the Plus example above, we can see that it evaluates to different types depending on whether its first argument is Z or S n. Notice also that it is essentially identical to the way we would implement addition on regular value-level natural numbers, using pattern-matching on the first argument and a recursive call in the successor case:

  data Nat = Z | S Nat

  plus :: Nat -> Nat -> Nat
  plus Z n = n
  plus (S m) n = S (plus m n)

Let’s check that Plus works as advertised:

  *Main> :t undefined :: Plus (S Z) (S Z)
  undefined :: Plus (S Z) (S Z) :: Plus (S Z) (S Z)

Well, unfortunately, as a minor technical point, we can see from the above that ghci doesn’t expand the type family for us. The only way I currently know how to force it to expand the type family is to generate a suitable error message:

  *Main> undefined :: Plus (S Z) (S Z)

  ...No instance for (Show (S (S Z)))...

This is ugly, but it works: S (S Z) is the reduced form of Plus (S Z) (S Z).

So type families let us program in a functional style. This is nice — I daresay most Haskell programmers will be more comfortable only having to use a single coding style for both the value level and the type level. There are a few cases where a logic programming style can be quite convenient (for example, with an additional functional dependency we can use the Plus type class from the last post to compute both addition and subtraction), but in my opinion, the functional style is a huge win in most cases. (And, don’t worry, FDs and TFs are equivalent in expressiveness.)

Of course, there is a lot more to all of this; for example, I haven’t even mentioned data families or associated types. For more, I recommend reading the excellent tutorial by Oleg Kiselyov, Ken Shan, and Simon Peyton Jones, or the page on the GHC wiki. For full technical details, you can look at the System FC paper.

Nothing is ever perfect, though — in my next post, I’ll explain what type families still leave to be desired, and what we’re doing to improve things.


Typed type-level programming in Haskell, part I: functional dependencies

June 29, 2010

The other project I’m working on at MSR this summer is a bit more ambitious: our headline goal is to extend GHC to enable typed, functional, type-level programming. What’s that, you ask? Well, first, let me tell you a little story…

Once upon a time there was a lazy*, pure, functional programming language called Haskell. It was very careful to always keep its values and types strictly separated. So of course “type-level programming” was completely out of the question! …or was it?

In 1997, along came multi-parameter type classes, soon followed by functional dependencies. Suddenly, type-level programming became possible (and even fun and profitable, depending on your point of view). How did this work?

Whereas normal type classes represent predicates on types (each type is either an instance of a type class or it isn’t), multi-parameter type classes represent relations on types. For example, if we create some types to represent natural numbers,

  data Z
  data S n

we can define a multi-parameter type class Plus which encodes the addition relation on natural numbers:

  class Plus m n r

  instance Plus Z n n
  instance (Plus m n r) => Plus (S m) n (S r)

This says that for any types m, n, and r, (Z,n,n) are in the Plus relation, and (S m, n, S r) are in the Plus relation whenever (m,n,r) are. We can load this into ghci (after enabling a few extensions, namely MultiParamTypeClasses, FlexibleInstances, and EmptyDataDecls), but unfortunately we can’t yet actually use the Plus relation to do any type-level computation:

  *Main> :t undefined :: (Plus (S Z) (S Z) r) => r
  undefined :: (Plus (S Z) (S Z) r) => r :: (Plus (S Z) (S Z) r) => r

We asked for the type of something which has type r, given that the relation Plus (S Z) (S Z) r holds — but notice that ghci was not willing to simplify that constraint at all. The reason is that type classes are open — there could be lots of instances of the form Plus (S Z) (S Z) r for many different types r, and ghci has no idea which one we might want.

To the rescue come functional dependencies, which let us specify that some type class parameters are determined by others — in other words, that the relation determined by a multi-parameter type class is actually a function.

  class Plus m n r | m n -> r
  instance Plus Z n n
  instance (Plus m n r) => Plus (S m) n (S r)

Here we’ve added the functional dependency m n -> r, which says that knowing m and n also determines r. In practice, this means that we are only allowed to have a single instance of Plus for any particular combination of m and n, and if ghc can determine m and n and finds an instance matching them, it will assume it is the only one and pick r to match. Now we can actually do some computation (after enabling UndecidableInstances):

  *Main> :t undefined :: (Plus (S Z) (S Z) r) => r
  undefined :: (Plus (S Z) (S Z) r) => r :: S (S Z)

Aha! So 1 + 1 = 2, at the type level!

Type-level programming with multi-parameter type classes and functional dependencies has a strong resemblance to logic programming in languages like Prolog. We declare rules defining a number of relations, and then “running” a program consists of searching through the rules to find solutions for unconstrained variables in a given relation. (The one major difference is that GHC’s type class instance search doesn’t (yet?) do any backtracking.)

This is getting a bit long so I’ll break it up into a few posts. In the next installment, I’ll explain type families, which are a newer, alternative mechanism for type-level programming in Haskell.

* OK, OK, non-strict.


2009 ICFP programming contest reflections

June 29, 2009

This year, unlike last year, I had the good fortune to be physically located in the same place as several other people interested in competing in the 2009 ICFP Programming Contest. We only ended up with three people—we could have used more—but it was quite a lot of fun.

The problem this year (controlling some satellites to accomplish certain goals) was a good one. I especially liked the approach of distributing binaries to be run on a virtual machine, and requiring execution traces as submissions—as opposed to last year’s contest, it removed all the ickiness with wrangling execution platforms and ensuring your code would run on the organizers’ servers. And the problem itself was nifty, easy to get started on, difficult to finish, and always left room for improvements. My only complaint is that I would have been more excited about something with a more discrete flavor, since this was the second contest in a row with a simulation of physical objects, vector math, trig, and so on—but that’s only a small complaint. We did respectably, solving 8 of the 16 scenarios (100x and 200x)—although of course hindsight is 20/20, and I think in particular we could have pretty easily solved all the 300x scenarios. We ended up spending too much time trying to work things out mathematically and not enough just doing simulation and search (although I’m definitely biased since I’m the one who wrote a physics simulator module… =).

We used Haskell, of course, which seemed to mostly work well. I wrote our VM, and although it wasn’t blazing fast, it was fast enough for what we ended up doing with it. The only truly annoying part was serializing and deserializing IEEE doubles, since Data.Binary doesn’t use IEEE format, and I didn’t know about the data-binary-ieee754 package until too late. I ended up doing some ugliness involving ByteStrings, foreign pointers, peek and poke… yuck!

The first scenario was easy enough, using the provided information about Hohmann transfer orbits and looking up some math on computing the required vectors (my teammate Julien did most of the heavy lifting mathematics-research-wise… I just typed what he told me to =). For the second scenario, we were able to figure out how to compute the correct angle at which to initiate a Hohmann transfer in order to meet up with the second satellite, but the problem at first was that it wasn’t accurate enough—but we finally got it to work doing some simple searches with a physics simulator (after I spent two hours figuring out that negating an angle is NOT the same thing as adding pi to it!). I’m confident we could have gotten the third scenario to work similarly, by first transferring to a circular orbit tangent to an apogee of the target orbit (doing some sort of calculation/simulation to work out the timing correctly), but oh well, we didn’t get there. This is also where having more people would have helped, both to be able to code more stuff in parallel and for some fresh ideas.

But, all in all, a most enjoyable experience, and I look forward to putting together a (hopefully larger) team again next summer!


Follow

Get every new post delivered to your Inbox.