Primitive species and species operations, part II

July 31, 2009

In my previous post, I began describing the primitive species and species operations supported by my combinatorial species library; we looked at the ring structure on species, that is, the primitive species 0 and 1, and the operations of species sum and product. Today we’ll continue by looking at a few more primitive species: singletons, sets, and cycles.

[By the way, all the diagrams for this post and the previous one were generated programmatically using my diagrams library. I'll probably put the code up as an example on the diagrams website sometime in the near future.]

X

The species X, also known as the species of singletons, is picky in a way similar to the species 1. Whereas 1 only puts a structure on the empty set of labels, X only puts a (single) structure on singleton label sets. If you give it more than one label, or none, it turns up its nose and refuses to do anything.

The species X of singletons

The species X of singletons


> take 10 $ labelled singleton
[0,1,0,0,0,0,0,0,0,0]
> generate singleton (['a'] :: [Char])
['a']
> generate singleton ("abc" :: [Char])
[]

A few exercises: try to work them out yourself, then use the species library to check if you are correct!

  1. Describe the species X + X. Show that it is isomorphic to the species 2 * X.
  2. Describe the species X * X.

E

The species E of sets, on the other hand, isn’t picky at all: it will happily put a singleton structure on any label set. Usually we identify this structure with the label set itself; that is, the only E-structure on a label set U is U itself.

The species E of sets

The species E of sets


> take 10 $ labelled sets
[1,1,1,1,1,1,1,1,1,1]
> take 10 $ unlabelled sets
[1,1,1,1,1,1,1,1,1,1]
> generate set ([1..3] :: [Int])
[{1,2,3}]
> generate set ([] :: [Int])
[{}]

We can now also describe the derived species X * E of elements, also known as the species of pointed sets. The only way to get any X * E structures is by partitioning the label set U into a singleton and all the rest, in which case we get exactly one structure; so there is one X * E structure for each element of U.


> take 10 $ labelled (x * set)
[0,1,2,3,4,5,6,7,8,9]
> take 10 $ unlabelled (x * set)
[0,1,1,1,1,1,1,1,1,1]
> generate (x * set) ([1..3] :: [Int])
[(1,{2,3}),(2,{1,3}),(3,{1,2})]

(x is just a synonym for singleton.) Noteworthy is the fact that this is the first species we’ve looked at which has different numbers of labelled and unlabelled structures! This makes sense: there are n labelled (X * E)-structures on a size n set; but if we can’t tell the difference between the labels, any one of them is just as good as any other, so we only get one unlabelled structure (unless the label set is empty, when we don’t get any structures: the X still requires us to have at least one element!). Note also that element is a special synonym for x * set with a special semantics under generate: if we really want to pick elements of the label set, then we probably don’t want to actually see each element paired with a set of the leftover elements, we just want to see the element itself:


> generate elements ([1..3]::[Int])
[1,2,3]

C

The final primitive species—and the only one so far that doesn’t feel quite so utterly trivial—is the species C of cycles. C puts no structures on an empty label set, but given any non-empty label set, C generates the set of all cyclical orderings of the labels.

The species C of cycles

The species C of cycles

Of course, the above diagram only shows six of the cycle structures on five labels; the ellipsis is meant to suggest the others not shown. So… how many labelled cycle structures are there on five labels, or generally on n labels? (Of course there is only one unlabelled n-cycle.) I’ll leave it as a (hopefully easy) exercise; and of course you know how to check your answer!


> generate cycles ([1..4] :: [Int])
[<1,2,3,4>,<1,2,4,3>,<1,3,2,4>,<1,3,4,2>,<1,4,2,3>,<1,4,3,2>]

As you can see, cycles are indicated with angle brackets; it is understood that <1,2,3,4>, <2,3,4,1>, <3,4,1,2>, and <4,1,2,3> are all equivalent.

At this point, you’re probably thinking about a certain species and wondering why I haven’t mentioned it yet—it seems pretty fundamental and primitive. Are you thinking of… the species of lists? It turns out that we don’t have to take lists as primitive—we can define the species of lists as the derivative of the species of cycles! The derivative of a regular type is its type of one-hole… but I’m getting ahead of myself. We’ll look at species differentiation (along with several other operations on species) in the next post!


Primitive species and species operations

July 30, 2009

In this second post about my new combinatorial species library, I plan to begin writing about the species DSL itself: what are the primitive combinatorial species and the primitive operations on species? (The first post described the concept of combinatorial species in general. Also, for those following along at home, I’ve just uploaded version 0.2.1 of the species library, which is a vast improvement over 0.1, with many new features and a few bug fixes; just cabal update && cabal upgrade species. Also also, note that it currently only builds on ghc 6.10.x.)

The Species type class

The central point of the combinatorial species formalism is that there is a deep and fruitful analogy between species and certain types of generating functions: every species corresponds to (several different types of) generating functions, and every species operation corresponds, in a fairly natural way, to operations on generating functions. For example, “adding” two species has the combinatorial interpretation of disjoint sum, but also corresponds to generating function addition—that is, the generating function of a sum of species is the sum of their generating functions. Like every good generating function, these generating functions encode various sorts of information about species (such as counting labelled or unlabelled structures), so once we have written down a description of a combinatorial structure using primitive species and species operations, we can use the generating function analogy to compute various properties of the species.

So, how to represent combinatorial species in our library? With a Species type class, of course! The type class mechanism is perfectly suited to this situation—we have an abstract algebraic structure (species and species operations) which can be interpreted in several ways. So we can write down an expression of type Species s => s, and then choose to compute certain things about it simply by choosing what type it should be. Without further ado, let’s see (an idealized version of) the type class itself, defined in Math.Combinatorics.Species.Class:

class (Algebra.Differential.C s) => Species s where
  singleton :: s
  set       :: s
  cycle     :: s

  o         :: s -> s -> s
  cartesian :: s -> s -> s
  fcomp     :: s -> s -> s
  ofSize    :: s -> (Integer -> Bool) -> s

(I’ve actually left out a few methods, but they all share the property that they have default implementations in terms of the other methods, and are only in the class so they can be given specialized implementations in certain instances. I’ve left them out for now to simplify the discussion.)

So we can now write expressions like

(set `ofSize` (>3)) `fcomp` (cycle `o` singleton) :: Species s => s

but what does it mean? (Actually, this particular example is pretty meaningless. =) And what’s that Algebra.Differential.C constraint? Let’s start at the beginning.

0

The Algebra.Differential.C constraint requires any instance of Species to be a differentiable ring. In particular, it (transitively) implies the constraint Algebra.Additive.C, which means that instances of Species must form an additive group: there must be a species operation (+), and a species 0 which is the identity for (+). (It also requires an operation negate which produces additive inverses, but that isn’t implemented yet!) Let’s see what these correspond to.

The species 0 is the Scrooge of the species world: it refuses to create a single structure, no matter how many labels you give it!

The species 0

The species 0

Let’s see how to use this species with the library:

> take 10 $ labelled 0
[0,0,0,0,0,0,0,0,0,0]
> take 10 $ unlabelled 0
[0,0,0,0,0,0,0,0,0,0]
> generate 0 ([1..3] :: [Int])
[]

Pretty boring, huh? Well, it’s supposed to be. 0 doesn’t get explicitly used very much, but it’s nice to know it’s there.

(Also, remember that to follow along, you’ll have to start ghci with the -XNoImplicitPrelude flag, then remove the loaded Prelude module with :m -Prelude, and then load MyPrelude (from the NumericPrelude library) and the species library: :m +MyPrelude Math.Combinatorics.Species.)

Species sum

And what about species addition? Addition just corresponds to disjoint (i.e. tagged) union: an (F+G)-structure is either an F-structure or a G-structure, along with a tag so you know which it is. If you have m F-structures and n G-structures, then you have m + n (F+G)-structures.


> take 10 $ labelled lists
[1,1,2,6,24,120,720,5040,40320,362880]
> take 10 $ labelled octopi
[0,1,3,14,90,744,7560,91440,1285200,20603520]
> take 10 $ labelled (lists + octopi)
[1,2,5,20,114,864,8280,96480,1325520,20966400]
> generate (lists + octopi) ([1,2] :: [Int])
[inl([1,2]),inl([2,1]),inr(<[1,2]>),inr(<[2,1]>),
inr(<[1],[2]>)]

Do you see why the 0 species is the identity element for species sum? If you have a structure of the species 0 + F, it must be either a 0-structure, or an F-structure: but there are no 0-structures! Now, you may complain that 0 is not really an identity, since the addition still introduces an extra tag:


> generate subsets ([1..3] :: [Int])
[{1,2,3},{1,2},{1,3},{1},{2,3},{2},{3},{}]
> generate (0 + subsets) ([1..3] :: [Int])
[inr({1,2,3}),inr({1,2}),inr({1,3}),inr({1}),
inr({2,3}),inr({2}),inr({3}),inr({})]

That’s true, but we really only care about species identity up to isomorphism, and the species F, 0 + F, and F + 0 are clearly all isomorphic for any species F, even if they are not identical.

1

The Algebra.Differential.C constraint also implies a Algebra.Ring.C constraint, which requires a multiplication operation (*) and identity element 1.

So, what is the species 1? It puts a singleton structure on the empty set of labels, but no structures on any nonempty label sets:

The species 1

The species 1


> take 10 $ labelled 1
[1,0,0,0,0,0,0,0,0,0]
> take 10 $ unlabelled 1
[1,0,0,0,0,0,0,0,0,0]
> generate 1 ([] :: [Int])
[1]
> generate 1 ([1..3] :: [Int])
[]

So you can see that on the empty set, 1 generates a single structure which is also called 1 (although it could be called anything, really).

Species product

And species product? An (F*G)-structure on a set of labels is a pair consisting of an F-structure on a subset of the labels, and a G-structure on whatever labels are left over. In other words, to form all (F*G)-structures on a set of labels U, we first partition U into an ordered pair of subsets in all possible ways, and for each pair, take all possible combinations of an F-structure on the first subset, and a G-structure on the second subset. For example:


> generate (list * list) ([1..3] :: [Int])
[([1,2,3],[]),([1,3,2],[]),([2,1,3],[]),([2,3,1],[]),([3,1,2],[]),([3,2,1],[]),([1,2],[3]),([2,1],[3]),([1,3],[2]),([3,1],[2]),([1],[2,3]),([1],[3,2]),([2,3],[1]),([3,2],[1]),([2],[1,3]),([2],[3,1]),([3],[1,2]),([3],[2,1]),([],[1,2,3]),([],[1,3,2]),([],[2,1,3]),([],[2,3,1]),([],[3,1,2]),([],[3,2,1])]

Can you see why 1 is the identity element for this operation? The only partition of the label set that will produce any (1*F)-structures is (\emptyset, U): in any other case, 1 produces no structures. But a 1-structure paired with an F-structure on U is really just an F-structure on U, since there is only one 1-structure.

As an exercise, can you figure out what the species 2, 3, … ought to be?

I think I’ll stop there for now. In my next post, I’ll talk about the other primitive species in the Species type class: singletons, sets, and cycles.


Hac φ roundup

July 29, 2009

By all accounts, Hac φ was a great success! 25 people attended (from as far away as Chicago, Toronto, and Utrecht) and we had a great time hanging out, eating good food, listening to some interesting talks, and, oh yes, hacking. There was a wide range of Haskell experience represented among the participants, which made for a great atmosphere of collaboration and learning. A collection of my pictures can be found here; Shae’s pictures can be found here, here, here, and here. Unfortunately there don’t seem to be any pictures of when we went out for dinner and beer, but I can assure you that much fun was had by all. I also have another post describing the talks on Saturday afternoon.

Here’s a quick rundown of some of the things people worked on (by no means complete):

  • Chris Eidhof (chr1s) and doug Beardsley (mightybyte) worked on the formlets library, making a bunch of internal changes.
  • Tracy Wadleigh (twadleigh) worked on his mathlink library for implementing Mathematica packages in Haskell.
  • Greg Collins (gcollins) made a number of improvements and bugfixes to the OSX installer for the Haskell Platform.
  • Chris Casinghino (ccasin) worked on creating some open-source tools for crossword puzzle construction and solving.
  • Edward Kmett (edwardk) and Shae Erisson (shapr) worked on Edward’s small, mostly untyped, lazy functional programming language Kata; Shae also did a bit of work on parsing FLAC files.
  • Ravi Nanavati (ravi_n) collaborated with Edward Kmett to design a monoid for error message collection and started planning for a Boston hackathon in January (?), among other things.
  • Mike Burns (mike-burns) finished a minimally working program called “taggert” that provides for a tagged view of a directory, using gtk2hs and HDBC-Sqlite3.
  • Gershom Bazerman (sclv) worked on releasing his jmacro library for programmatic JavaScript generation.
  • Dan DaCosta (chaosape) worked on learning Haskell and designing a library for collection, aggregation, and analysis of network traffic, to be used for his Master’s thesis.
  • Daniel Wagner (dmwit) worked on a library for parsing .sgf files.
  • Yours truly (byorgey) worked on a new splitting method for my Data.List.Split library, applied a few xmonad patches, and worked a bunch on my new combinatorial species library.

Those are the projects I know about; there were also a number of other attendees, whose accomplishments were surely no less interesting even though I don’t know what they were: Adam Turoff (\z), Anton van Straaten (mmmdonuts), Luke Hoersten (Luke), Reece Heineke, Sasha Rush, Malik Hamro (malikh), Andrew Robbins (adu), Andrew Wagner (chessguy), Aleks Jakulin (AleksJ), Hristo Asenov (hasenov), and Ken Takusagawa. Please feel free to leave comments with information about projects I left out, more pictures, blog posts, or whatever!

Towards the end of the hackathon several people asked me whether we would be hosting another one next year. We hadn’t planned that far ahead, of course—but judging by how much fun this one was (and how unstressful it was to organize (although that was largely thanks to our amazing administrative coordinator, Cheryl Hickey, who handled a lot of the details!)), it’s a distinct possibility! In fact, since it looks like ICFP will be in Baltimore next year, perhaps we can help organize another hackathon in conjunction with that. We shall see!


More from Hac φ

July 25, 2009

Yet more pictures from Hac φ!

This afternoon we were treated to talks by Dimitry Golubovsky (typed JavaScript generation), Edward Kmett (monoids library and monoidal parsing), Gershom Bazerman (DSL and quasiquoting for JavaScript generation), and Anton van Straaten (Gitit as an application framework).


Hac φ day 2

July 25, 2009

A few pictures from Day 2 of Hac φ:

Dan DaCosta ponders.

Tracy Wadleigh, Aleks Jakulin, Chris Casinghino, Ravi Nanavati

DSC_0193

DSC_0194

DSC_0195

DSC_0196


Introducing Math.Combinatorics.Species!

July 24, 2009

I have just uploaded to Hackage version 0.1 of the species package, a Haskell library for computing with combinatorial species. Much like David Amos’s great series of posts introducing his Haskell for Maths library, I plan to write a series of posts over the next week or two introducing the library, explaining how it works, and showing off some interesting examples.

But, first things first: if you’d like to install the library and play along, just

cabal update; cabal install species

(Man, do I ever love cabal-install! But I digress.)

Combinatorial what?

So, what are combinatorial species? Intuitively, a species describes a certain combinatorial structure: given an underlying set of labels, it specifies a set of structures which can be built using those labels. For example, L, the species of lists, when applied to an underlying set of labels U yields the set of all linear orderings of the elements of U. So in general a species can be viewed as a function which takes any set (the labels) and produces another set (of structures built out of those labels).

The species L of lists.

The species L of lists.

Actually, this isn’t quite enough to capture our intuition about what a species ought to be: we want a species to work “independently” of the underlying set; which labels we use shouldn’t matter at all when it comes to describing structures. So, additionally, we require that if F is a species, any bijection \sigma : U \to T between two sets of labels can be “lifted” to a bijection between sets of F-structures, F[\sigma] : F[U] \to F[T], in a way that respects composition of bijections. (Of course, the categorists ought to be jumping out of their seats right now: all this just amounts to saying that a species is an endofunctor F : \mathbb{B} \to \mathbb{B} on the category of sets with bijections.) Importantly, it is not too hard to see that this requirement means that for any species F, the size of F[U] depends only on the size of U, and not on the actual elements of U.

Counting labelled structures

So, let’s see some examples already! What sorts of things might we want to compute about species?

First, we of course want to be able to count how many structures are generated by a species. As a first example, consider again the species L of lists. Given an underlying set U of size n, how many lists L[U] are there? That’s easy: n!.


[brent@euclid:~]$ ghci -XNoImplicitPrelude
> :m +Math.Combinatorics.Species
> take 10 $ labelled lists
[1,1,2,6,24,120,720,5040,40320,362880]

The function labelled takes a combinatorial species F as an argument, and computes an infinite list where the entry at index n is the number of labelled F-structures on an underlying set of size n.

(This is also a good time to mention that the species library depends on the Numeric Prelude, an alternative Haskell Prelude with a mathematically sane hierarchy of numeric types; hence we must pass ghci the -XNoImplicitPrelude flag so we don’t get lots of horrible name clashes. I’ll write some additional thoughts on the Numeric Prelude in a future post.)

Now, so far this is nothing new: Dan Piponi wrote a blog post about a Haskell DSL for counting labelled structures back in 2007, and in fact, that post was part of my inspiration for this library. Counting labelled structures works by associating exponential generating functions to species. (More on this in a future post.) But we can do more than that!

Counting unlabelled structures

For one, we can also count unlabelled structures. What’s an unlabelled structure? Intuitively, it’s a structure where you can’t tell the difference between the elements of the underlying set; formally, it’s an equivalence class of labelled structures, where two labelled structures are equivalent if one can be transformed into the other by permuting the labels.

So, how about unlabelled lists?


> take 10 $ unlabelled lists
[1,1,1,1,1,1,1,1,1,1]

Boring! This makes sense, though: there’s only one way to make a list out of n identical objects.

The species of lists on indistinguishable labels

The species of lists on indistinguishable labels

But how about something a bit less trivial?


> take 10 $ labelled partitions
[1,1,2,5,15,52,203,877,4140,21147]
> take 10 $ unlabelled partitions
[1,1,2,3,5,7,11,15,22,30]
> :m +Math.OEIS
> description `fmap` (lookupSequence . take 10 $ labelled partitions)
Just "Bell or exponential numbers: ways of placing n labeled balls into n indistinguishable boxes."
> description `fmap` (lookupSequence . take 10 $ unlabelled partitions)
Just "a(n) = number of partitions of n (the partition numbers)."

Unlabelled partitions

Unlabelled partitions

(I couldn’t resist sneaking in a little plug for my Math.OEIS module there too. =) So, how does this work? Well… it’s a bit more complicated! But I’ll explain it in a future post, too.

Generating structures

But that’s not all! Not only can we count structures, we can generate them, too:


> generate lists ([1..3] :: [Int])
[{*,1,2,3},{*,1,3,2},{*,2,1,3},{*,2,3,1},{*,3,1,2},{*,3,2,1}]
> generate partitions ([1..3] :: [Int])
[[[1,2,3]],[[1,2],[3]],[[1,3],[2]],[[1],[2,3]],[[1],[2],[3]]]

This is a bit magical, and of course I will… explain it in a future post. For now, I leave you with this challenge: can you figure out what the asterisks are doing there? (Hint: the curly brackets denote a cycle…)

Of course, no DSL would be complete without operations with which to build up more complicated structures from simpler ones; in my next post I’ll talk about operations on combinatorial species.

Further reading

If you just can’t wait for my next post and want to read more about combinatorial species, I recommend reading the Wikipedia article, which is OK, this fantastic blog post, which is what introduced me to the wonderful world of species, or for a great dead-tree reference (whence I’m getting most of my information), check out Combinatorial Species and Tree-Like Structures by Bergeron, Labelle, and Leroux.


Hac φ is underway!

July 24, 2009

Hac φ is now in full swing, with almost 20 people hacking away and more coming tomorrow. More news and pictures to come!