Species operations: differentiation

August 5, 2009

Continuing my series describing my new combinatorial species library, today we’ll take a look at the operation of differentiation.

You may remember that the Species type class has an Algebra.Differential constraint, which, as I previously explained, transitively implies an Algebra.Ring constraint. But we haven’t yet talked about the Differential contraint itself, which requires a method differentiate :: Species s => s -> s (which I will abbreviate using the standard “prime” notation), which should satisfy

(x * y)' \equiv x' * y + x * y'

(up to isomorphism). Okay, this is just the normal product rule for differentiation, from calculus—but what on earth could such a thing mean combinatorially?

There is actually a nice, simple answer: an F'-structure on the underlying set U consists of an F-structure on U \cup \{*\}, where * is a distinguished element distinct from all the elements of U. To make the connection to data type differentiation, we can also think of * as a “hole”.

Species differentiation

Species differentiation

The above diagram illustrates the situation: an F'-structure on \{1,2,3,4,5\} is an F-structure on \{1,2,3,4,5,*\}.

And how about the law (F * G)' \equiv F' * G + F * G'? Does this make combinatorial sense? (You may want to stop and think about it before reading on!)

By definition, an (F * G)'-structure on U is an (F*G)-structure on U \cup \{*\}, which is a pair of an F-structure and a G-structure on a splitting (a two-partition) of U \cup \{*\}. The distinguished * label must end up on one side or the other, so an (F*G)'-structure can arise in one of two ways: it is either an F'-structure paired with a G-structure, or an F-structure paired with a G'-structure, depending on where the * ends up. But this is precisely saying that (F * G)' \equiv F' * G + F * G'!

Where does species differentiation show up? The most well-known place is in defining the species L of lists (linear orderings). In fact,

L = C',

that is, the species L is the derivative of the species C of cycles. A cycle defines an ordering, but there is no distinguished beginning or end; by making a cycle out of some elements with a distinguished extra element *, we uniquely identify a beginning/end of the ordering on the original elements: a list!

Differentiating a cycle to get a list

Differentiating a cycle to get a list


> take 10 . labelled $ lists
[1,1,2,6,24,120,720,5040,40320,362880]
> take 10 . labelled $ oneHole cycles
[1,1,2,6,24,120,720,5040,40320,362880]
> generate lists ([1..3] :: [Int])
[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
> generate (oneHole cycles) ([1..3] :: [Int])
[<*,1,2,3>,<*,1,3,2>,<*,2,1,3>,<*,2,3,1>,<*,3,1,2>,<*,3,2,1>]

Here’s an example of differentiation in action. In the species library, the function oneHole is provided as a synonym for differentiate. The session above shows that there are the same number of labelled lists as labelled one-hole cycles: this isn’t surprising given the discussion above, and in fact, list is actually implemented as oneHole cycle. Actually, this is a tiny lie, as the rest of the session shows: since lists are such a common combinatorial structure, there is a special case for them in the generation code. But we can explicitly generate one-hole cycles as above; it’s easy to see that they are in one-to-one correspondence with the lists.

To finish off this post, a few exercises for you (you can check your answers with the species library):

  1. Describe the species 1'.
  2. Describe the species X'.
  3. Describe the species E'.
  4. Does differentiation distribute over addition? That is, is it true that (F + G)' \equiv F' + G' for any species F and G? Give a combinatorial interpretation of this identity, or say why it does not hold.
  5. Describe the species L'.
  6. Describe the species C^{(n)} (i.e. the nth derivative of the species of cycles).

Primitive species and species operations, part II

July 31, 2009

In my previous post, I began describing the primitive species and species operations supported by my combinatorial species library; we looked at the ring structure on species, that is, the primitive species 0 and 1, and the operations of species sum and product. Today we’ll continue by looking at a few more primitive species: singletons, sets, and cycles.

[By the way, all the diagrams for this post and the previous one were generated programmatically using my diagrams library. I'll probably put the code up as an example on the diagrams website sometime in the near future.]

X

The species X, also known as the species of singletons, is picky in a way similar to the species 1. Whereas 1 only puts a structure on the empty set of labels, X only puts a (single) structure on singleton label sets. If you give it more than one label, or none, it turns up its nose and refuses to do anything.

The species X of singletons

The species X of singletons


> take 10 $ labelled singleton
[0,1,0,0,0,0,0,0,0,0]
> generate singleton (['a'] :: [Char])
['a']
> generate singleton ("abc" :: [Char])
[]

A few exercises: try to work them out yourself, then use the species library to check if you are correct!

  1. Describe the species X + X. Show that it is isomorphic to the species 2 * X.
  2. Describe the species X * X.

E

The species E of sets, on the other hand, isn’t picky at all: it will happily put a singleton structure on any label set. Usually we identify this structure with the label set itself; that is, the only E-structure on a label set U is U itself.

The species E of sets

The species E of sets


> take 10 $ labelled sets
[1,1,1,1,1,1,1,1,1,1]
> take 10 $ unlabelled sets
[1,1,1,1,1,1,1,1,1,1]
> generate set ([1..3] :: [Int])
[{1,2,3}]
> generate set ([] :: [Int])
[{}]

We can now also describe the derived species X * E of elements, also known as the species of pointed sets. The only way to get any X * E structures is by partitioning the label set U into a singleton and all the rest, in which case we get exactly one structure; so there is one X * E structure for each element of U.


> take 10 $ labelled (x * set)
[0,1,2,3,4,5,6,7,8,9]
> take 10 $ unlabelled (x * set)
[0,1,1,1,1,1,1,1,1,1]
> generate (x * set) ([1..3] :: [Int])
[(1,{2,3}),(2,{1,3}),(3,{1,2})]

(x is just a synonym for singleton.) Noteworthy is the fact that this is the first species we’ve looked at which has different numbers of labelled and unlabelled structures! This makes sense: there are n labelled (X * E)-structures on a size n set; but if we can’t tell the difference between the labels, any one of them is just as good as any other, so we only get one unlabelled structure (unless the label set is empty, when we don’t get any structures: the X still requires us to have at least one element!). Note also that element is a special synonym for x * set with a special semantics under generate: if we really want to pick elements of the label set, then we probably don’t want to actually see each element paired with a set of the leftover elements, we just want to see the element itself:


> generate elements ([1..3]::[Int])
[1,2,3]

C

The final primitive species—and the only one so far that doesn’t feel quite so utterly trivial—is the species C of cycles. C puts no structures on an empty label set, but given any non-empty label set, C generates the set of all cyclical orderings of the labels.

The species C of cycles

The species C of cycles

Of course, the above diagram only shows six of the cycle structures on five labels; the ellipsis is meant to suggest the others not shown. So… how many labelled cycle structures are there on five labels, or generally on n labels? (Of course there is only one unlabelled n-cycle.) I’ll leave it as a (hopefully easy) exercise; and of course you know how to check your answer!


> generate cycles ([1..4] :: [Int])
[<1,2,3,4>,<1,2,4,3>,<1,3,2,4>,<1,3,4,2>,<1,4,2,3>,<1,4,3,2>]

As you can see, cycles are indicated with angle brackets; it is understood that <1,2,3,4>, <2,3,4,1>, <3,4,1,2>, and <4,1,2,3> are all equivalent.

At this point, you’re probably thinking about a certain species and wondering why I haven’t mentioned it yet—it seems pretty fundamental and primitive. Are you thinking of… the species of lists? It turns out that we don’t have to take lists as primitive—we can define the species of lists as the derivative of the species of cycles! The derivative of a regular type is its type of one-hole… but I’m getting ahead of myself. We’ll look at species differentiation (along with several other operations on species) in the next post!


Primitive species and species operations

July 30, 2009

In this second post about my new combinatorial species library, I plan to begin writing about the species DSL itself: what are the primitive combinatorial species and the primitive operations on species? (The first post described the concept of combinatorial species in general. Also, for those following along at home, I’ve just uploaded version 0.2.1 of the species library, which is a vast improvement over 0.1, with many new features and a few bug fixes; just cabal update && cabal upgrade species. Also also, note that it currently only builds on ghc 6.10.x.)

The Species type class

The central point of the combinatorial species formalism is that there is a deep and fruitful analogy between species and certain types of generating functions: every species corresponds to (several different types of) generating functions, and every species operation corresponds, in a fairly natural way, to operations on generating functions. For example, “adding” two species has the combinatorial interpretation of disjoint sum, but also corresponds to generating function addition—that is, the generating function of a sum of species is the sum of their generating functions. Like every good generating function, these generating functions encode various sorts of information about species (such as counting labelled or unlabelled structures), so once we have written down a description of a combinatorial structure using primitive species and species operations, we can use the generating function analogy to compute various properties of the species.

So, how to represent combinatorial species in our library? With a Species type class, of course! The type class mechanism is perfectly suited to this situation—we have an abstract algebraic structure (species and species operations) which can be interpreted in several ways. So we can write down an expression of type Species s => s, and then choose to compute certain things about it simply by choosing what type it should be. Without further ado, let’s see (an idealized version of) the type class itself, defined in Math.Combinatorics.Species.Class:

class (Algebra.Differential.C s) => Species s where
  singleton :: s
  set       :: s
  cycle     :: s

  o         :: s -> s -> s
  cartesian :: s -> s -> s
  fcomp     :: s -> s -> s
  ofSize    :: s -> (Integer -> Bool) -> s

(I’ve actually left out a few methods, but they all share the property that they have default implementations in terms of the other methods, and are only in the class so they can be given specialized implementations in certain instances. I’ve left them out for now to simplify the discussion.)

So we can now write expressions like

(set `ofSize` (>3)) `fcomp` (cycle `o` singleton) :: Species s => s

but what does it mean? (Actually, this particular example is pretty meaningless. =) And what’s that Algebra.Differential.C constraint? Let’s start at the beginning.

0

The Algebra.Differential.C constraint requires any instance of Species to be a differentiable ring. In particular, it (transitively) implies the constraint Algebra.Additive.C, which means that instances of Species must form an additive group: there must be a species operation (+), and a species 0 which is the identity for (+). (It also requires an operation negate which produces additive inverses, but that isn’t implemented yet!) Let’s see what these correspond to.

The species 0 is the Scrooge of the species world: it refuses to create a single structure, no matter how many labels you give it!

The species 0

The species 0

Let’s see how to use this species with the library:

> take 10 $ labelled 0
[0,0,0,0,0,0,0,0,0,0]
> take 10 $ unlabelled 0
[0,0,0,0,0,0,0,0,0,0]
> generate 0 ([1..3] :: [Int])
[]

Pretty boring, huh? Well, it’s supposed to be. 0 doesn’t get explicitly used very much, but it’s nice to know it’s there.

(Also, remember that to follow along, you’ll have to start ghci with the -XNoImplicitPrelude flag, then remove the loaded Prelude module with :m -Prelude, and then load MyPrelude (from the NumericPrelude library) and the species library: :m +MyPrelude Math.Combinatorics.Species.)

Species sum

And what about species addition? Addition just corresponds to disjoint (i.e. tagged) union: an (F+G)-structure is either an F-structure or a G-structure, along with a tag so you know which it is. If you have m F-structures and n G-structures, then you have m + n (F+G)-structures.


> take 10 $ labelled lists
[1,1,2,6,24,120,720,5040,40320,362880]
> take 10 $ labelled octopi
[0,1,3,14,90,744,7560,91440,1285200,20603520]
> take 10 $ labelled (lists + octopi)
[1,2,5,20,114,864,8280,96480,1325520,20966400]
> generate (lists + octopi) ([1,2] :: [Int])
[inl([1,2]),inl([2,1]),inr(<[1,2]>),inr(<[2,1]>),
inr(<[1],[2]>)]

Do you see why the 0 species is the identity element for species sum? If you have a structure of the species 0 + F, it must be either a 0-structure, or an F-structure: but there are no 0-structures! Now, you may complain that 0 is not really an identity, since the addition still introduces an extra tag:


> generate subsets ([1..3] :: [Int])
[{1,2,3},{1,2},{1,3},{1},{2,3},{2},{3},{}]
> generate (0 + subsets) ([1..3] :: [Int])
[inr({1,2,3}),inr({1,2}),inr({1,3}),inr({1}),
inr({2,3}),inr({2}),inr({3}),inr({})]

That’s true, but we really only care about species identity up to isomorphism, and the species F, 0 + F, and F + 0 are clearly all isomorphic for any species F, even if they are not identical.

1

The Algebra.Differential.C constraint also implies a Algebra.Ring.C constraint, which requires a multiplication operation (*) and identity element 1.

So, what is the species 1? It puts a singleton structure on the empty set of labels, but no structures on any nonempty label sets:

The species 1

The species 1


> take 10 $ labelled 1
[1,0,0,0,0,0,0,0,0,0]
> take 10 $ unlabelled 1
[1,0,0,0,0,0,0,0,0,0]
> generate 1 ([] :: [Int])
[1]
> generate 1 ([1..3] :: [Int])
[]

So you can see that on the empty set, 1 generates a single structure which is also called 1 (although it could be called anything, really).

Species product

And species product? An (F*G)-structure on a set of labels is a pair consisting of an F-structure on a subset of the labels, and a G-structure on whatever labels are left over. In other words, to form all (F*G)-structures on a set of labels U, we first partition U into an ordered pair of subsets in all possible ways, and for each pair, take all possible combinations of an F-structure on the first subset, and a G-structure on the second subset. For example:


> generate (list * list) ([1..3] :: [Int])
[([1,2,3],[]),([1,3,2],[]),([2,1,3],[]),([2,3,1],[]),([3,1,2],[]),([3,2,1],[]),([1,2],[3]),([2,1],[3]),([1,3],[2]),([3,1],[2]),([1],[2,3]),([1],[3,2]),([2,3],[1]),([3,2],[1]),([2],[1,3]),([2],[3,1]),([3],[1,2]),([3],[2,1]),([],[1,2,3]),([],[1,3,2]),([],[2,1,3]),([],[2,3,1]),([],[3,1,2]),([],[3,2,1])]

Can you see why 1 is the identity element for this operation? The only partition of the label set that will produce any (1*F)-structures is (\emptyset, U): in any other case, 1 produces no structures. But a 1-structure paired with an F-structure on U is really just an F-structure on U, since there is only one 1-structure.

As an exercise, can you figure out what the species 2, 3, … ought to be?

I think I’ll stop there for now. In my next post, I’ll talk about the other primitive species in the Species type class: singletons, sets, and cycles.


Introducing Math.Combinatorics.Species!

July 24, 2009

I have just uploaded to Hackage version 0.1 of the species package, a Haskell library for computing with combinatorial species. Much like David Amos’s great series of posts introducing his Haskell for Maths library, I plan to write a series of posts over the next week or two introducing the library, explaining how it works, and showing off some interesting examples.

But, first things first: if you’d like to install the library and play along, just

cabal update; cabal install species

(Man, do I ever love cabal-install! But I digress.)

Combinatorial what?

So, what are combinatorial species? Intuitively, a species describes a certain combinatorial structure: given an underlying set of labels, it specifies a set of structures which can be built using those labels. For example, L, the species of lists, when applied to an underlying set of labels U yields the set of all linear orderings of the elements of U. So in general a species can be viewed as a function which takes any set (the labels) and produces another set (of structures built out of those labels).

The species L of lists.

The species L of lists.

Actually, this isn’t quite enough to capture our intuition about what a species ought to be: we want a species to work “independently” of the underlying set; which labels we use shouldn’t matter at all when it comes to describing structures. So, additionally, we require that if F is a species, any bijection \sigma : U \to T between two sets of labels can be “lifted” to a bijection between sets of F-structures, F[\sigma] : F[U] \to F[T], in a way that respects composition of bijections. (Of course, the categorists ought to be jumping out of their seats right now: all this just amounts to saying that a species is an endofunctor F : \mathbb{B} \to \mathbb{B} on the category of sets with bijections.) Importantly, it is not too hard to see that this requirement means that for any species F, the size of F[U] depends only on the size of U, and not on the actual elements of U.

Counting labelled structures

So, let’s see some examples already! What sorts of things might we want to compute about species?

First, we of course want to be able to count how many structures are generated by a species. As a first example, consider again the species L of lists. Given an underlying set U of size n, how many lists L[U] are there? That’s easy: n!.


[brent@euclid:~]$ ghci -XNoImplicitPrelude
> :m +Math.Combinatorics.Species
> take 10 $ labelled lists
[1,1,2,6,24,120,720,5040,40320,362880]

The function labelled takes a combinatorial species F as an argument, and computes an infinite list where the entry at index n is the number of labelled F-structures on an underlying set of size n.

(This is also a good time to mention that the species library depends on the Numeric Prelude, an alternative Haskell Prelude with a mathematically sane hierarchy of numeric types; hence we must pass ghci the -XNoImplicitPrelude flag so we don’t get lots of horrible name clashes. I’ll write some additional thoughts on the Numeric Prelude in a future post.)

Now, so far this is nothing new: Dan Piponi wrote a blog post about a Haskell DSL for counting labelled structures back in 2007, and in fact, that post was part of my inspiration for this library. Counting labelled structures works by associating exponential generating functions to species. (More on this in a future post.) But we can do more than that!

Counting unlabelled structures

For one, we can also count unlabelled structures. What’s an unlabelled structure? Intuitively, it’s a structure where you can’t tell the difference between the elements of the underlying set; formally, it’s an equivalence class of labelled structures, where two labelled structures are equivalent if one can be transformed into the other by permuting the labels.

So, how about unlabelled lists?


> take 10 $ unlabelled lists
[1,1,1,1,1,1,1,1,1,1]

Boring! This makes sense, though: there’s only one way to make a list out of n identical objects.

The species of lists on indistinguishable labels

The species of lists on indistinguishable labels

But how about something a bit less trivial?


> take 10 $ labelled partitions
[1,1,2,5,15,52,203,877,4140,21147]
> take 10 $ unlabelled partitions
[1,1,2,3,5,7,11,15,22,30]
> :m +Math.OEIS
> description `fmap` (lookupSequence . take 10 $ labelled partitions)
Just "Bell or exponential numbers: ways of placing n labeled balls into n indistinguishable boxes."
> description `fmap` (lookupSequence . take 10 $ unlabelled partitions)
Just "a(n) = number of partitions of n (the partition numbers)."

Unlabelled partitions

Unlabelled partitions

(I couldn’t resist sneaking in a little plug for my Math.OEIS module there too. =) So, how does this work? Well… it’s a bit more complicated! But I’ll explain it in a future post, too.

Generating structures

But that’s not all! Not only can we count structures, we can generate them, too:


> generate lists ([1..3] :: [Int])
[{*,1,2,3},{*,1,3,2},{*,2,1,3},{*,2,3,1},{*,3,1,2},{*,3,2,1}]
> generate partitions ([1..3] :: [Int])
[[[1,2,3]],[[1,2],[3]],[[1,3],[2]],[[1],[2,3]],[[1],[2],[3]]]

This is a bit magical, and of course I will… explain it in a future post. For now, I leave you with this challenge: can you figure out what the asterisks are doing there? (Hint: the curly brackets denote a cycle…)

Of course, no DSL would be complete without operations with which to build up more complicated structures from simpler ones; in my next post I’ll talk about operations on combinatorial species.

Further reading

If you just can’t wait for my next post and want to read more about combinatorial species, I recommend reading the Wikipedia article, which is OK, this fantastic blog post, which is what introduced me to the wonderful world of species, or for a great dead-tree reference (whence I’m getting most of my information), check out Combinatorial Species and Tree-Like Structures by Bergeron, Labelle, and Leroux.


Monad.Reader article (Multiset partitions)

September 28, 2007

I imagine that most who read this blog have already seen it, but for those who haven’t, my article Generating Multiset Partitions has been published in Issue #8 of The Monad.Reader.  In it, I discuss the problem of generating multiset partitions in a functional context, and give an implementation in Haskell.  The final published version contains many major improvements over the draft version I posted here a while ago; give it a read and let me know what you think if you’re interested!