Introducing Math.Combinatorics.Species!

I have just uploaded to Hackage version 0.1 of the species package, a Haskell library for computing with combinatorial species. Much like David Amos’s great series of posts introducing his Haskell for Maths library, I plan to write a series of posts over the next week or two introducing the library, explaining how it works, and showing off some interesting examples.

But, first things first: if you’d like to install the library and play along, just

cabal update; cabal install species

(Man, do I ever love cabal-install! But I digress.)

Combinatorial what?

So, what are combinatorial species? Intuitively, a species describes a certain combinatorial structure: given an underlying set of labels, it specifies a set of structures which can be built using those labels. For example, L, the species of lists, when applied to an underlying set of labels U yields the set of all linear orderings of the elements of U. So in general a species can be viewed as a function which takes any set (the labels) and produces another set (of structures built out of those labels).

The species L of lists.

The species L of lists.

Actually, this isn’t quite enough to capture our intuition about what a species ought to be: we want a species to work “independently” of the underlying set; which labels we use shouldn’t matter at all when it comes to describing structures. So, additionally, we require that if F is a species, any bijection \sigma : U \to T between two sets of labels can be “lifted” to a bijection between sets of F-structures, F[\sigma] : F[U] \to F[T], in a way that respects composition of bijections. (Of course, the categorists ought to be jumping out of their seats right now: all this just amounts to saying that a species is an endofunctor F : \mathbb{B} \to \mathbb{B} on the category of sets with bijections.) Importantly, it is not too hard to see that this requirement means that for any species F, the size of F[U] depends only on the size of U, and not on the actual elements of U.

Counting labelled structures

So, let’s see some examples already! What sorts of things might we want to compute about species?

First, we of course want to be able to count how many structures are generated by a species. As a first example, consider again the species L of lists. Given an underlying set U of size n, how many lists L[U] are there? That’s easy: n!.

[brent@euclid:~]$ ghci -XNoImplicitPrelude
> :m +Math.Combinatorics.Species
> take 10 $ labelled lists

The function labelled takes a combinatorial species F as an argument, and computes an infinite list where the entry at index n is the number of labelled F-structures on an underlying set of size n.

(This is also a good time to mention that the species library depends on the Numeric Prelude, an alternative Haskell Prelude with a mathematically sane hierarchy of numeric types; hence we must pass ghci the -XNoImplicitPrelude flag so we don’t get lots of horrible name clashes. I’ll write some additional thoughts on the Numeric Prelude in a future post.)

Now, so far this is nothing new: Dan Piponi wrote a blog post about a Haskell DSL for counting labelled structures back in 2007, and in fact, that post was part of my inspiration for this library. Counting labelled structures works by associating exponential generating functions to species. (More on this in a future post.) But we can do more than that!

Counting unlabelled structures

For one, we can also count unlabelled structures. What’s an unlabelled structure? Intuitively, it’s a structure where you can’t tell the difference between the elements of the underlying set; formally, it’s an equivalence class of labelled structures, where two labelled structures are equivalent if one can be transformed into the other by permuting the labels.

So, how about unlabelled lists?

> take 10 $ unlabelled lists

Boring! This makes sense, though: there’s only one way to make a list out of n identical objects.

The species of lists on indistinguishable labels

The species of lists on indistinguishable labels

But how about something a bit less trivial?

> take 10 $ labelled partitions
> take 10 $ unlabelled partitions
> :m +Math.OEIS
> description `fmap` (lookupSequence . take 10 $ labelled partitions)
Just "Bell or exponential numbers: ways of placing n labeled balls into n indistinguishable boxes."
> description `fmap` (lookupSequence . take 10 $ unlabelled partitions)
Just "a(n) = number of partitions of n (the partition numbers)."

Unlabelled partitions

Unlabelled partitions

(I couldn’t resist sneaking in a little plug for my Math.OEIS module there too. =) So, how does this work? Well… it’s a bit more complicated! But I’ll explain it in a future post, too.

Generating structures

But that’s not all! Not only can we count structures, we can generate them, too:

> generate lists ([1..3] :: [Int])
> generate partitions ([1..3] :: [Int])

This is a bit magical, and of course I will… explain it in a future post. For now, I leave you with this challenge: can you figure out what the asterisks are doing there? (Hint: the curly brackets denote a cycle…)

Of course, no DSL would be complete without operations with which to build up more complicated structures from simpler ones; in my next post I’ll talk about operations on combinatorial species.

Further reading

If you just can’t wait for my next post and want to read more about combinatorial species, I recommend reading the Wikipedia article, which is OK, this fantastic blog post, which is what introduced me to the wonderful world of species, or for a great dead-tree reference (whence I’m getting most of my information), check out Combinatorial Species and Tree-Like Structures by Bergeron, Labelle, and Leroux.

About Brent

Assistant Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in combinatorics, haskell, math, projects and tagged , . Bookmark the permalink.

14 Responses to Introducing Math.Combinatorics.Species!

  1. Andrea Vezzosi says:

    in the generate examples, why do you get all the permutations for lists but not for partitions?

  2. Brent says:

    Because the partitions are set partitions; for example, [[1,2],[3]] denotes a set of sets. I guess the notation I chose is not the best, this is something I can improve in future versions. =)

  3. Dan Piponi says:

    Cool! I’ve been waiting for someone to write a nice library like this, especially the generation part.

  4. Ani says:

    I got the following error while installing:
    [10:52 PM]$ sudo cabal install species
    Resolving dependencies…
    cabal: cannot configure unamb-0.2.2. It requires base ==4.*
    There is no available version of base that satisfies ==4.*
    Any ideas on how to solve this?

    • Brent says:

      What version of ghc do you have? base-4 comes with ghc-6.10.x, so it appears that (for now) the species library will only compile with ghc-6.10.1 or later. I may not ultimately need the unamb dependency, though.

  5. A while ago, inspired by a conversation with Wouter Swierstra, I decided to implement generic data generation based on combinatorial species theory.

    For instance, for the family of datatypes
    data Expr = Const Int
    | Add Expr Expr
    | Mul Expr Expr
    | EVar Var
    | Let Decl Expr

    data Decl = Var := Expr
    | Seq Decl Decl
    | None

    type Var = String

    we can generate declarations up to size 6:
    > test (elements Decl) 6
    0 (0):
    1 (1): None

    2 (0):
    3 (1): Seq None None

    4 (2): “” := Const 0
    “” := EVar “”

    5 (2): Seq None (Seq None None)
    Seq (Seq None None) None

    or expressions:
    > test (elements Expr) 6
    0 (0):
    1 (0):
    2 (2): Const 0
    EVar “”

    3 (0):
    4 (2): Let None (Const 0)
    Let None (EVar “”)

    5 (8): Add (Const 0) (Const 0)
    Add (Const 0) (EVar “”)
    Add (EVar “”) (Const 0)
    Add (EVar “”) (EVar “”)
    Mul (Const 0) (Const 0)
    Mul (Const 0) (EVar “”)
    Mul (EVar “”) (Const 0)
    Mul (EVar “”) (EVar “”)

    The example code is here, but this really needs memoization for any serious use.

  6. Pingback: Primitive species and species operations « blog :: Brent -> [String]

  7. Hi Brent,

    When you explained this in the vegetarian restaurant in Edinburgh, I mentioned that there was a talk at ICFP that seemed related. Here is the paper:

    It defines a “fan”: “fan.F of a datatype F is as a non-deterministic program that, given a so-called seed, constructs an arbitrary F structure in which the only stored value is the seed.” Which sounds really similar to what your unlabelled does.

    • Brent says:

      Thanks Sjoerd! I’ll be sure to give it a read.

      It was good to meet you at ICFP–I look forward to seeing you at future conferences as well.

  8. David Sankel says:

    I’m tried running some of your inline examples and the “generate” and “lists” bindings are no longer in the Math.Combinatorics.Species namespace. Are there equivalents in version 0.3.2?

    • Brent says:

      Hi David, sorry for the delayed response. Yes, generate and lists have been renamed to ‘enumerate’ and ‘linOrds’ respectively (since these names are more accurate).

  9. Pingback: And now, back to your regularly scheduled combinatorial species | blog :: Brent -> [String]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s