I have just uploaded to Hackage version 0.1 of the species package, a Haskell library for computing with combinatorial species. Much like David Amos’s great series of posts introducing his Haskell for Maths library, I plan to write a series of posts over the next week or two introducing the library, explaining how it works, and showing off some interesting examples.
But, first things first: if you’d like to install the library and play along, just
cabal update; cabal install species
(Man, do I ever love cabal-install! But I digress.)
Combinatorial what?
So, what are combinatorial species? Intuitively, a species describes a certain combinatorial structure: given an underlying set of labels, it specifies a set of structures which can be built using those labels. For example, , the species of lists, when applied to an underlying set of labels yields the set of all linear orderings of the elements of . So in general a species can be viewed as a function which takes any set (the labels) and produces another set (of structures built out of those labels).
Actually, this isn’t quite enough to capture our intuition about what a species ought to be: we want a species to work “independently” of the underlying set; which labels we use shouldn’t matter at all when it comes to describing structures. So, additionally, we require that if is a species, any bijection between two sets of labels can be “lifted” to a bijection between sets of -structures, , in a way that respects composition of bijections. (Of course, the categorists ought to be jumping out of their seats right now: all this just amounts to saying that a species is an endofunctor on the category of sets with bijections.) Importantly, it is not too hard to see that this requirement means that for any species , the size of depends only on the size of , and not on the actual elements of .
Counting labelled structures
So, let’s see some examples already! What sorts of things might we want to compute about species?
First, we of course want to be able to count how many structures are generated by a species. As a first example, consider again the species of lists. Given an underlying set of size , how many lists are there? That’s easy: .
[brent@euclid:~]$ ghci -XNoImplicitPrelude
> :m +Math.Combinatorics.Species
> take 10 $ labelled lists
[1,1,2,6,24,120,720,5040,40320,362880]
The function labelled
takes a combinatorial species as an argument, and computes an infinite list where the entry at index is the number of labelled -structures on an underlying set of size .
(This is also a good time to mention that the species library depends on the Numeric Prelude, an alternative Haskell Prelude with a mathematically sane hierarchy of numeric types; hence we must pass ghci the -XNoImplicitPrelude flag so we don’t get lots of horrible name clashes. I’ll write some additional thoughts on the Numeric Prelude in a future post.)
Now, so far this is nothing new: Dan Piponi wrote a blog post about a Haskell DSL for counting labelled structures back in 2007, and in fact, that post was part of my inspiration for this library. Counting labelled structures works by associating exponential generating functions to species. (More on this in a future post.) But we can do more than that!
Counting unlabelled structures
For one, we can also count unlabelled structures. What’s an unlabelled structure? Intuitively, it’s a structure where you can’t tell the difference between the elements of the underlying set; formally, it’s an equivalence class of labelled structures, where two labelled structures are equivalent if one can be transformed into the other by permuting the labels.
So, how about unlabelled lists?
> take 10 $ unlabelled lists
[1,1,1,1,1,1,1,1,1,1]
Boring! This makes sense, though: there’s only one way to make a list out of n identical objects.
But how about something a bit less trivial?
> take 10 $ labelled partitions
[1,1,2,5,15,52,203,877,4140,21147]
> take 10 $ unlabelled partitions
[1,1,2,3,5,7,11,15,22,30]
> :m +Math.OEIS
> description `fmap` (lookupSequence . take 10 $ labelled partitions)
Just "Bell or exponential numbers: ways of placing n labeled balls into n indistinguishable boxes."
> description `fmap` (lookupSequence . take 10 $ unlabelled partitions)
Just "a(n) = number of partitions of n (the partition numbers)."
(I couldn’t resist sneaking in a little plug for my Math.OEIS module there too. =) So, how does this work? Well… it’s a bit more complicated! But I’ll explain it in a future post, too.
Generating structures
But that’s not all! Not only can we count structures, we can generate them, too:
> generate lists ([1..3] :: [Int])
[{*,1,2,3},{*,1,3,2},{*,2,1,3},{*,2,3,1},{*,3,1,2},{*,3,2,1}]
> generate partitions ([1..3] :: [Int])
[[[1,2,3]],[[1,2],[3]],[[1,3],[2]],[[1],[2,3]],[[1],[2],[3]]]
This is a bit magical, and of course I will… explain it in a future post. For now, I leave you with this challenge: can you figure out what the asterisks are doing there? (Hint: the curly brackets denote a cycle…)
Of course, no DSL would be complete without operations with which to build up more complicated structures from simpler ones; in my next post I’ll talk about operations on combinatorial species.
Further reading
If you just can’t wait for my next post and want to read more about combinatorial species, I recommend reading the Wikipedia article, which is OK, this fantastic blog post, which is what introduced me to the wonderful world of species, or for a great dead-tree reference (whence I’m getting most of my information), check out Combinatorial Species and Tree-Like Structures by Bergeron, Labelle, and Leroux.
in the generate examples, why do you get all the permutations for lists but not for partitions?
Because the partitions are set partitions; for example, [[1,2],[3]] denotes a set of sets. I guess the notation I chose is not the best, this is something I can improve in future versions. =)
Cool! I’ve been waiting for someone to write a nice library like this, especially the generation part.
Thanks Dan! I’d love any feedback (or even code contributions! =) you might have.
I got the following error while installing:
[10:52 PM]$ sudo cabal install species
Resolving dependencies…
cabal: cannot configure unamb-0.2.2. It requires base ==4.*
There is no available version of base that satisfies ==4.*
Any ideas on how to solve this?
What version of ghc do you have? base-4 comes with ghc-6.10.x, so it appears that (for now) the species library will only compile with ghc-6.10.1 or later. I may not ultimately need the unamb dependency, though.
Ah thanks for quick response. I was using 6.8.2. I changed it to 6.10.4 and that fixed the problem :)
A while ago, inspired by a conversation with Wouter Swierstra, I decided to implement generic data generation based on combinatorial species theory.
For instance, for the family of datatypes
data Expr = Const Int
| Add Expr Expr
| Mul Expr Expr
| EVar Var
| Let Decl Expr
data Decl = Var := Expr
| Seq Decl Decl
| None
type Var = String
we can generate declarations up to size 6:
> test (elements Decl) 6
0 (0):
1 (1): None
2 (0):
3 (1): Seq None None
4 (2): “” := Const 0
“” := EVar “”
5 (2): Seq None (Seq None None)
Seq (Seq None None) None
or expressions:
> test (elements Expr) 6
0 (0):
1 (0):
2 (2): Const 0
EVar “”
3 (0):
4 (2): Let None (Const 0)
Let None (EVar “”)
5 (8): Add (Const 0) (Const 0)
Add (Const 0) (EVar “”)
Add (EVar “”) (Const 0)
Add (EVar “”) (EVar “”)
Mul (Const 0) (Const 0)
Mul (Const 0) (EVar “”)
Mul (EVar “”) (Const 0)
Mul (EVar “”) (EVar “”)
The example code is here http://hpaste.org/fastcgi/hpaste.fcgi/view?id=7569#a7569, but this really needs memoization for any serious use.
Pingback: Primitive species and species operations « blog :: Brent -> [String]
Hi Brent,
When you explained this in the vegetarian restaurant in Edinburgh, I mentioned that there was a talk at ICFP that seemed related. Here is the paper:
Click to access FanClub.pdf
It defines a “fan”: “fan.F of a datatype F is as a non-deterministic program that, given a so-called seed, constructs an arbitrary F structure in which the only stored value is the seed.” Which sounds really similar to what your unlabelled does.
Thanks Sjoerd! I’ll be sure to give it a read.
It was good to meet you at ICFP–I look forward to seeing you at future conferences as well.
I’m tried running some of your inline examples and the “generate” and “lists” bindings are no longer in the Math.Combinatorics.Species namespace. Are there equivalents in version 0.3.2?
Hi David, sorry for the delayed response. Yes, generate and lists have been renamed to ‘enumerate’ and ‘linOrds’ respectively (since these names are more accurate).
Pingback: And now, back to your regularly scheduled combinatorial species | blog :: Brent -> [String]