Back when I was writing my PhD thesis on combinatorial species, I was aware that André Joyal’s original papers introducing combinatorial species are written in French, which I don’t read. I figured this was no big deal, since there is plenty of secondary literature on species in English (most notably Bergeron et al., which, though originally written in French, has been translated into English by Margaret Readdy). But at some point I asked a question on MathOverflow to which I hadn’t been able to find an answer, and was told that the answer was already in one of Joyal’s original papers!
So I set out to try to read Joyal’s original papers in French (there are two in particular: Une théorie combinatoire des séries formelles, and Foncteurs analytiques et espèces de structures), and found out that it was actually possible since (a) they are mathematics papers, not high literature; (b) I already understand a lot of the mathematics; and (c) these days, there are many easily accessible digital tools to help with the task of translation.
However, although it was possible for me to read them, it was still hard work, and for someone without my background in combinatorics it would be very tough going—which is a shame since the papers are really very beautiful. So I decided to do something to help make the papers and their ideas more widely accessible. In particular, I’m making an English translation of the papers^{1}—or at least of the first one, for now—interspersed with my own commentary to fill in more background, give additional examples, make connections to computation and type theory, or offer additional perspective. I hope it will be valuable to those in the English-speaking mathematics and computer science communities who want to learn more about species or gain more appreciation for a beautiful piece of mathematical history.
This is a long-term project, and not a high priority at the moment; I plan to work on it slowly but steadily. I’ve only worked on the first paper so far, and I’m at least far enough along that I’m not completely embarrassed to publicize it (but not much more than that). I decided to publicize my effort now, instead of waiting until I’m done, for several reasons: first, it may be a very long time before I’m really “done”, and some people may find it helpful or interesting before it gets to that point. Second, I would welcome collaboration, whether in the form of help with the translation itself, editing or extending the commentary, or simply offering feedback on early drafts or fixing typos. You can find an automatically updated PDF with the latest draft here, and the github repo is here. There are also simple instructions for compiling the paper yourself (using stack) should you want to do that.
And yes, I checked carefully, and this is explicitly allowed by the copyright holder (Elsevier) as long as I put certain notices on the first page.
Suppose you have a full binary tree and you do an operation on every node, where the cost of the operation is proportional to the height of that node. That is, the cost for each of the leaves is $0$, for each of the nodes in the next level up the cost is $1$, and so on. We can visualize the scenario like this:
As a function of the total number of nodes $n$, how expensive is this? We can see that $O(n \log n)$ is an upper bound, since there are $n$ nodes and the height of each node is at most $\log_2 n$. But it seems like it might actually be faster than this in reality, since, intuitively, most of the nodes have a height which is much smaller than $\log_2 n$.
(One specific motivation for this scenario is that we can build a binary heap from an arbitrary set of data by looping over the nodes from the bottom up and calling reheapDown on each; in the worst case reheapDown takes time proportional to the height of the node, as in this scenario. But it doesn’t matter if you don’t know about binary heaps.)
Let’s take the same tree and put a dollar at every node, for a total of $\$n$:
Now imagine sliding all the money as far up and to the right as it will go. That is, we take each dollar, and keep moving it up as long as it is a left child. As soon as we reach a node which is a right child we stop. The tree ends up looking like this:
Now take each pile of money and move it up one step to its parent, except the money at the root of the tree, which you can put in your pocket.
And voilà! We now have exactly enough money at each node to pay for the cost of the operations, and we even have a bit left over (which we can use to buy coffee). But we started with $\$n$ and only shuffled money around; this shows that the total cost is actually $O(n)$.
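The accounting can be sanity-checked numerically. The following is only a sketch under stated assumptions: the tree is perfect (every level is completely filled), leaves have height 0, and the names `nodeCount`, `totalCost` and `leftover` are made up for this illustration.

```haskell
-- A perfect binary tree is described here only by the height of its
-- root (leaves have height 0), which is all the cost argument needs.

-- Number of nodes in a perfect binary tree whose root has height h.
nodeCount :: Int -> Int
nodeCount h = 2 ^ (h + 1) - 1

-- Total cost when every node pays its own height:
-- there are 2^(h - k) nodes at height k.
totalCost :: Int -> Int
totalCost h = sum [ k * 2 ^ (h - k) | k <- [0 .. h] ]

-- Dollars left over after paying: the pile that ends up at the root.
leftover :: Int -> Int
leftover h = nodeCount h - totalCost h
```

For a root of height 3 this gives 15 nodes, a total cost of 11, and 4 dollars left over; in general the leftover is the root's height plus one, so the total cost stays below the number of nodes.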
Exercise for the reader: what does this have to do with the number of bit flips needed to count from $1$ to $n$ with a binary counter?
]]>If you missed seeing me at ICFP, this is why.
In honor of my son’s birth (he will need to learn the alphabet and Haskell soon)—and at the instigation of Kenny Foner—I revived the Haskell Alphabet by converting it to modern Hakyll and updating some of the broken or outdated links. Some of it is a bit outdated (I wrote it seven years ago), but it’s still a fun little piece of Haskell history. Enjoy!
]]>I am always on the lookout for more exercises to add and for more links to interesting further reading. If you know of a cool exercise or a cool paper or blog post that helps explain/illustrate/apply a standard Haskell type class, please let me know (or just add it yourself, it’s a wiki!). And, of course, the same goes if you notice any errors or confusing bits.
Happy Haskelling!
]]>Feel free to use any of the lecture notes, assignments, or even exam questions. I didn’t leave the exams linked by accident; I use an exam format where the students have a week or so to prepare solutions to the exam, using any resources they want, and then have to come in on exam day and write down their solutions without referring to anything (I got this idea from Scott Weinstein). So leaving the exams posted publicly on the web isn’t a problem for me.
Please don’t ask for solutions; I won’t give any, even if you are an instructor. But questions, comments, bug reports, etc. are most welcome.
]]>The workshop will be co-located with ICFP in Oxford, and is devoted to the use of “static type information…used effectively in the development of computer programs”, construed broadly (see the CFP for more specific examples of what is in scope). Last year’s workshop drew a relatively large crowd and had a lot of really great papers and talks, and I expect this year to be no different! Andrew Kennedy (Facebook UK) will also be giving an invited keynote talk.
Please consider submitting something! We are looking for both full papers and two-page extended abstracts reporting work in progress. The submission deadline is 24 May for regular papers and 7 June for extended abstracts.
]]>My deep work sessions are still going strong, though having this opportunity to reflect has been good: I realized that over the months I have become more lax about using my computer and about what sort of things I am willing to do during my “deep work” sessions. It’s too easy to let them become just a block of time I can use to get done all the urgent things I think I need to get done. Of course, sometimes there are truly urgent things to get done, and having a big block of time to work on them can be great, especially if they require focused effort. But it pays to be more intentional about using the deep work blocks to work on big, long-term projects. The myriad little urgent things will get taken care of some other time, if they’re truly important (or maybe they won’t, if they’re not).
Since I’m only teaching two classes this semester, both of which I have taught before, I thought I would have more time for deep work sessions this semester than last, but for some reason it seems I have less. I’m not yet sure whether there’s something I could have done about that, or if the semester just looks different than I expected. This semester has also seen more unavoidable conflicts with my deep work blocks. Usually, I try to keep my scheduled deep work blocks sacrosanct, but I have made some exceptions this semester: for example, search committee meetings are quite important and also extremely difficult to schedule, so I let them be scheduled over top of my deep work blocks if necessary. (But it sure does wreak havoc on my work that week.)
I’m also still blocking my email before 4pm. On the one hand, I know this is helping a lot with my productivity and general level of anxiety. Recently I needed to (or thought I needed to!) briefly unblock my email during the day to check whether I had received a certain reply, and I specifically noticed how my anxiety level shot up as soon as I opened my inbox and saw all the messages there—a good reminder of why I have my email blocked in the first place. On the other hand, it can be frustrating, since the hour from 4-5 is often taken up with other things, so email gets pushed to the evening, or to the next day. When this goes on several days in a row it really doesn’t help my anxiety level to know there are emails sitting there that I ought to respond to. So perhaps there might be a better time to process my email than 4-5, but to be honest I am not sure what it would be. I certainly don’t want to do it first thing in the morning, and the middle of the day is not really any better, schedule-wise, than the end. In any case, I intend to keep doing it until a better idea comes along.
]]>When are two species isomorphic? Since species are, by definition, functors $\mathbb{B} \to \mathbf{FinSet}$, the obvious answer is “when they are isomorphic as functors”, that is, when there is a natural isomorphism between them.
Let’s unpack definitions a bit. Recall that $\mathbb{B}$ is the groupoid (i.e. category with all morphisms invertible) of finite sets and bijections, and $\mathbf{FinSet}$ is the category of finite sets and total functions.
Given two functors $F, G : \mathbb{B} \to \mathbf{FinSet}$, a natural isomorphism is some family of bijections $\psi_U : F[U] \to G[U]$, one for each finite set $U$, such that for any bijection $\sigma : U \to V$, the following square commutes, that is, $\psi_V \circ F[\sigma] = G[\sigma] \circ \psi_U$:
Think of $\sigma$ as a relabelling, that is, a 1-1 correspondence between labels in $U$ and labels in $V$. By functoriality of $F$ and $G$, we can lift $\sigma$ to relabel whole structures, via $F[\sigma]$ and $G[\sigma]$. The whole square then says that the family of correspondences $\psi$ “commutes with relabelling”; that is, intuitively, $\psi$ can’t “look at” the labels at all, because it has to “do the same thing” even when we change the labels out from under it. It operates based solely on the structure present in the $F$- or $G$-structures (which by functoriality must be independent of the labels) and not based on the identity or structure of the set of labels itself.
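To make “commutes with relabelling” concrete, here is a small sketch using Haskell lists as a stand-in for the species of linear orders. The helper names `squareCommutes` and `swap13` are hypothetical, and checking one relabelling on one structure only tests the square at that point; it does not prove naturality.

```haskell
import Data.List (sort)

type Label = Int

-- Check the naturality square for one relabelling and one structure:
-- relabel-then-transform must equal transform-then-relabel.
squareCommutes :: ([Label] -> [Label]) -> (Label -> Label) -> [Label] -> Bool
squareCommutes psi sigma xs = psi (map sigma xs) == map sigma (psi xs)

-- A hypothetical relabelling of {1,2,3}: swap the labels 1 and 3.
swap13 :: Label -> Label
swap13 1 = 3
swap13 3 = 1
swap13 x = x
```

With these definitions, `squareCommutes reverse swap13 [1,2,3]` holds, because `reverse` never inspects the labels, while `squareCommutes sort swap13 [2,1,3]` fails, because `sort` depends on the ordering of the labels themselves.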
All this generalizes readily to virtual species: recall that virtual species consist of pairs of species $(P, N)$, where we define $(P_1, N_1)$ to be equivalent to $(P_2, N_2)$ if and only if $P_1 + N_2$ is (naturally) isomorphic to $P_2 + N_1$. That is, isomorphism of virtual species already has natural isomorphism of regular species baked into it by definition.
Now, recall the signed involution from my previous post. Since it depends on a chosen linear order, it is not a natural isomorphism: an arbitrary relabelling certainly need not preserve the ordering on the labels, so the involution is not preserved under relabelling.
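This brings up the natural (haha) question: is it possible to give a natural isomorphism between signed sets and signed ballots? If so, we would certainly prefer it to this involution that makes use of a linear order. But sadly, the answer, it turns out, is no. Let $k$ range from $0$ to $n$, and write $E_n$ for the species of sets of size $n$ and $\mathrm{Bal}_{n,k}$ for the species of ballots on $n$ labels with $k$ parts. We are looking for a natural isomorphism
$$E_n + \sum_{k \text{ even}} \mathrm{Bal}_{n,k} \cong \sum_{k \text{ odd}} \mathrm{Bal}_{n,k}$$
in the case that $n$ is odd, or
$$\sum_{k \text{ even}} \mathrm{Bal}_{n,k} \cong E_n + \sum_{k \text{ odd}} \mathrm{Bal}_{n,k}$$
when $n$ is even.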
Notice that in any case, whether positive or negative, the single set structure (the $E$-structure) will be fixed by any relabelling which is a permutation (because permuting the elements of a set does not change the set). Hence, any natural isomorphism must send it to some structure which is also fixed by all permutations. But the only such ballot structure is the one consisting of a single part containing all the elements. This ballot has an odd number of parts, so there cannot be a natural isomorphism in the case that $n$ is even: we would have to match the set structure with some even-sized ballot which is fixed by all permutations of the labels, but there are none. Hence there is no natural isomorphism, which by definition would have to work for all $n$.
But what about for odd $n$? Can we have a natural isomorphism between the structures restricted to the case when $n$ is odd? In fact, no: we can make a more general argument that applies for any $n > 1$. Consider the $n!$ different ballots consisting of $n$ separate singleton parts. Each of these is fixed by no permutations other than the identity. So, under a natural isomorphism they would all have to map to ballots of the opposite parity which are also fixed by no permutations other than the identity: but by the Pigeonhole Principle, any ballot with fewer than $n$ parts must have at least one part containing more than one element, and will therefore be fixed under any permutation that only touches the elements in that part.
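Both symmetry claims can be checked by brute force for small label sets. This is a sketch only: ballots are represented as lists of sorted parts over the labels 1..n, a permutation is given as the list of images of 1..n, and all function names here are illustrative assumptions.

```haskell
import Data.List (permutations, sort, subsequences, (\\))

type Ballot = [[Int]]

-- All ballots (ordered set partitions) of a sorted list of distinct labels.
ballots :: [Int] -> [Ballot]
ballots [] = [[]]
ballots ls = [ p : b | p <- tail (subsequences ls), b <- ballots (ls \\ p) ]

-- Relabel a ballot by a permutation given as the images of 1..n,
-- re-sorting each part so that equal ballots compare equal.
relabel :: [Int] -> Ballot -> Ballot
relabel perm = map (sort . map (\x -> perm !! (x - 1)))

-- The permutations of 1..n that leave a ballot unchanged.
stabilizer :: Int -> Ballot -> [[Int]]
stabilizer n b = [ p | p <- permutations [1 .. n], relabel p b == b ]

-- Ballots fixed by every permutation of the labels.
fixedByAll :: Int -> [Ballot]
fixedByAll n =
  [ b | b <- ballots [1 .. n], length (stabilizer n b) == product [1 .. n] ]

-- Ballots fixed by no permutation other than the identity.
fixedByIdentityOnly :: Int -> [Ballot]
fixedByIdentityOnly n =
  [ b | b <- ballots [1 .. n], length (stabilizer n b) == 1 ]
```

For three labels, `fixedByAll 3` contains only the single-part ballot, and `fixedByIdentityOnly 3` contains exactly the six ballots made of three singleton parts, matching the two arguments above.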
In this situation (when there is a 1-1 correspondence between species, but not a natural one) we say the species are equipotent but not isomorphic. The species have the same number of structures of each size, and hence exactly the same generating function, but they are not (naturally) isomorphic: it is possible to tell them apart by looking at how structures behave under relabelling. Another classic example of this phenomenon is the species of lists and the species of permutations: each has exactly $n!$ labelled structures of size $n$, but they are not naturally isomorphic. Lists have no symmetry at all, that is, they are fixed by no permutations other than the identity, but permutations, in general, do have some symmetry. For example, any permutation is fixed when the labels are permuted by itself: the permutation $(1\,2\,3)$ is the same as the permutation $(2\,3\,1)$. The classic combinatorial proof that there are the same number of lists and permutations also uses an extra linear order on the labels.
A classic lemma in combinatorics states that any nonempty finite set has the same number of even and odd subsets. In fact, I recently wrote a proof of this on my other blog. Since it’s written for a more general audience, it’s spelled out there in quite a lot of detail; but if you understand the idea of signed involutions, the classic combinatorial proof is quite simple to state: given a set $X$, pick some $a \in X$, and define a signed involution on subsets of $X$ (signed according to the parity of their size) by “toggling” the presence of $a$. That is, given $S \subseteq X$, if $a \in S$ then take it out, and if $a \notin S$ then add it in. This is clearly an involution and sends even subsets to odd and vice versa.
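The toggling involution is short enough to write down directly. In this sketch, subsets are represented as sorted lists of `Int` labels, and the names `toggle`, `subsetsOf` and `evenOddCounts` are chosen just for the illustration.

```haskell
import Data.List (delete, insert, subsequences)

-- Toggle the presence of a chosen element x in a subset
-- (subsets are represented as sorted lists).
toggle :: Int -> [Int] -> [Int]
toggle x s
  | x `elem` s = delete x s
  | otherwise  = insert x s

-- All subsets of {1, ..., n}.
subsetsOf :: Int -> [[Int]]
subsetsOf n = subsequences [1 .. n]

-- Number of even-sized and odd-sized subsets of {1, ..., n}.
evenOddCounts :: Int -> (Int, Int)
evenOddCounts n = (length (filter even sizes), length (filter odd sizes))
  where sizes = map length (subsetsOf n)
```

Applying `toggle 1` twice returns the original subset, and each application changes the size by exactly one, so it pairs every even subset with an odd one. Note that the chosen element is an explicit argument, which is exactly the extra choice the proof relies on.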
However, this involution is not natural: it depends on the choice of the element to toggle. Is it possible to prove it via a natural correspondence? In fact, that’s one of the things Anders Claesson’s original post (the one that got me started down this whole rabbit hole) was trying to do. Unfortunately, hidden in the middle of his proof was an assumed correspondence between signed sets and signed ballots, and as we’ve seen, this correspondence actually cannot be proved naturally. (I should reiterate that in no way do I mean to disparage his post; it’s still a great post with a lot of cool insights, not to mention a nice framework for thinking about multiplicative inverses of species. It just doesn’t quite accomplish one of the things he thought it was accomplishing!)
Now, at this point, all we know is that the particular argument used in that post is not, in fact, natural. But that doesn’t necessarily mean a natural correspondence between even and odd subsets is impossible. However, it turns out that it is (mostly) impossible: we can give a more direct argument that, in fact, there is no natural proof—that is, the species of odd subsets and the species of even subsets are equipotent but not naturally isomorphic.
The proof is quite simple, and along similar lines as the proof for signed sets and ballots. Note that the empty subset is fixed by all permutations, as is the maximal subset, and these are clearly the only such subsets. So any natural correspondence must match these subsets with each other; but when $n$ is even they have the same parity.
What about if we restrict to the case when $n$ is odd? Unlike the case of signed sets and ballots, it turns out that we actually can give a natural proof in this case! The involution to use is the one that simply sends any subset $S$ of the label set $X$ to its complement $X \setminus S$. This is clearly an involution, and since $n$ is odd, it reverses the parity as required. It also does not depend at all on the elements of $X$ or any assumed structure on them, that is, it commutes perfectly well with relabelling.
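Both properties of the complement map can be checked exhaustively for small label sets. As before this is a sketch with assumed conventions: labels are 1..n, subsets are sorted lists, a permutation is the list of images of 1..n, and the function names are invented here.

```haskell
import Data.List (permutations, sort, subsequences, (\\))

-- Complement of a subset within {1, ..., n}.
complementIn :: Int -> [Int] -> [Int]
complementIn n s = [1 .. n] \\ s

-- Relabel a subset by a permutation given as the images of 1..n.
relabelSet :: [Int] -> [Int] -> [Int]
relabelSet perm = sort . map (\x -> perm !! (x - 1))

-- Complement commutes with every relabelling of {1, ..., n}.
complementIsNatural :: Int -> Bool
complementIsNatural n = and
  [ complementIn n (relabelSet p s) == relabelSet p (complementIn n s)
  | p <- permutations [1 .. n], s <- subsequences [1 .. n] ]

-- Complement sends even subsets to odd ones and vice versa.
complementFlipsParity :: Int -> Bool
complementFlipsParity n =
  all (\s -> even (length s) /= even (length (complementIn n s)))
      (subsequences [1 .. n])
```

The naturality check succeeds for every size tried, but `complementFlipsParity` holds only for odd sizes such as 3 and 5 and fails for 4, which is the point where the argument for even $n$ breaks down.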
This corresponds nicely with an observation we can make about Pascal’s triangle: it is easy to see that the alternating sum of any odd row is $0$, for example, $1 - 5 + 10 - 10 + 5 - 1 = 0$, since the entries are all duplicated, with one positive and one negative version of each. However, for even rows it is not quite so obvious: why is $1 - 4 + 6 - 4 + 1 = 0$? To show this cancels to give 0 we must “split up” some of the numbers so that one part of them cancels with one number and another part cancels with another; that is, we cannot necessarily treat all the subsets of a given size uniformly. But subsets of a given size are completely indistinguishable up to relabelling; hence the necessity of some extra structure to allow us to make the necessary distinctions.
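The observation about Pascal’s triangle is easy to test. The sketch below builds rows by the usual “add a row to a shifted copy of itself” rule; `pascalRow` and `alternatingSum` are names made up for this example.

```haskell
-- Row n of Pascal's triangle (row 0 is [1]), built by repeatedly
-- adding a row to a shifted copy of itself.
pascalRow :: Int -> [Integer]
pascalRow n = iterate step [1] !! n
  where step row = zipWith (+) (0 : row) (row ++ [0])

-- Sum of the entries with alternating signs, starting with +.
alternatingSum :: [Integer] -> Integer
alternatingSum = sum . zipWith (*) (cycle [1, -1])
```

Every row from 1 onward has alternating sum 0, whether the row number is odd or even; only row 0, which corresponds to the empty set, gives 1.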
]]>So, how should such a proof look? For a given number of labels $n$, there is a single signed set structure, which is just the set of labels itself (with a sign depending on the parity of $n$). On the other hand, there are lots of ballots on $n$ labels; the key is that some are positive and some are negative, since the sign of the ballots depends on the parity of the number of parts, not the number of labels. For example, consider $n = 3$. There is a single (negative) signed set structure:
(I will use a dashed blue line to indicate negative things, and a solid black line for positive things.)
On the other hand, as we saw last time, there are 13 ballot structures on 3 labels, some positive and some negative:
In this example, it is easy to see that most of the positives and negatives cancel, with exactly one negative ballot left over, which corresponds with the one negative set. As another example, when $n = 4$, there is a single positive set, and 75 signed ballots:
This time it is not quite so easy to tell at a glance (at least not the way I have arranged the ballots in the above picture!), but in fact one can verify that there are exactly 37 negative ballots and 38 positive ones, again cancelling to match the one positive set.
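These counts can be reproduced by enumerating the ballots and splitting them by sign. This is a brute-force sketch, assuming labels 1..n and the convention from the text that a ballot is positive when it has an even number of parts; `ballots` and `signCounts` are illustrative names.

```haskell
import Data.List (subsequences, (\\))

type Ballot = [[Int]]

-- All ballots (ordered set partitions) of a sorted list of distinct labels:
-- choose a nonempty first part, then a ballot on the remaining labels.
ballots :: [Int] -> [Ballot]
ballots [] = [[]]
ballots ls = [ p : b | p <- tail (subsequences ls), b <- ballots (ls \\ p) ]

-- (number of positive ballots, number of negative ballots) on n labels;
-- the sign is determined by the parity of the number of parts.
signCounts :: Int -> (Int, Int)
signCounts n = (length (filter even parts), length (filter odd parts))
  where parts = map length (ballots [1 .. n])
```

Here `signCounts 3` is `(6, 7)`, leaving one negative ballot over, and `signCounts 4` is `(38, 37)`, leaving one positive ballot over.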
What we need to show, then, is that we can pair up the ballots in such a way that positive ballots are matched with negative ballots, with exactly one ballot of the appropriate sign left to be matched with the one signed set. This is known as a signed involution: an involution is a function which is its own inverse, so it matches things up in pairs; a signed involution sends positive things to negative things and vice versa, except for any fixed points.
In order to do this, we will start by assuming the set of labels is linearly ordered. In one sense this is no big deal, since for any finite set of labels we can always just pick an arbitrary ordering, if there isn’t an “obvious” ordering to use already. On the other hand, it means that the correspondence will be specific to the chosen linear ordering. All other things being equal, we would prefer a correspondence that depends solely on the structure of the ballots, and not on any structure inherent to the labels. I will have quite a bit more to say about this in my third and (probably) final post on the topic. But for today, let’s just see how the correspondence works, given the assumption of a linear order on the labels. I came up with this proof independently while contemplating Anders Claesson’s post, though it turns out that the exact same proof is already in a paper by Claesson and Hannah (in any case it is really just a small lemma, the sort of thing you might give as a homework problem in an undergraduate course on combinatorics).
Given some ballot, find the smallest label. For example, if the labels are $\{1, \dots, n\}$ as in the examples so far, we will find the label $1$.
If the smallest label is contained in some part together with at least one other label, separate it out into its own part by itself, and put it to the right of its former part. Like this:
On the other hand, if the smallest label is in a part by itself, merge it with the part on the left (if one exists). This is clearly the inverse of the above operation.
The only case we haven’t handled is when the smallest label is in a part by itself which is the leftmost part in the ballot. In that case, we leave that part alone, switch to considering the second-smallest label, and recursively carry out the involution on the remainder of the ballot.
For example:
In this case we find the smallest label (1) in a part by itself in the leftmost position, so we leave it where it is and recurse on the remainder of the ballot. Again, we find the smallest remaining label (2) by itself and leftmost, so we recurse again. This time, we find the smallest remaining label (3) in a part with one other label, so we separate it out and place it to the right.
This transformation on ballots is clearly reversible. The only ballots it doesn’t change are ballots with each label in its own singleton part, sorted from smallest to biggest, like this:
In this case the algorithm recurses through the whole ballot and finds each smallest remaining label in the leftmost position, ultimately doing nothing. Notice that a sorted ballot of singletons has the same sign as the signed set on the same labels, namely, $(-1)^n$. In any other case, we can see that the algorithm matches positive ballots to negative and vice versa, since it always changes the number of parts by 1, either splitting one part into two or merging two parts into one.
Here’s my implementation of the involution in Haskell:
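import Data.List (delete, sort)

-- A ballot is a list of nonempty parts; the labels are assumed to be 1..n.
type Ballot = [[Int]]

ballotInv :: Ballot -> Ballot
ballotInv = go 1
  where
    go _ [] = []
    -- Smallest label alone in the leftmost part: keep it, recurse on the rest.
    go s ([a]:ps)
      | s == a = [a] : go (s+1) ps
    -- Smallest label shares a part: split it off to the right.
    go s (p:ps)
      | s `elem` p = delete s p : [s] : ps
    -- Smallest label alone, with a part to its left: merge it leftwards.
    go s (p:[a]:ps)
      | s == a = sort (a:p) : ps
    -- Otherwise keep looking further right.
    go s (p:ps) = p : go s ps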
(The call to sort is not strictly necessary, but I like to keep each part canonically sorted.)
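The claimed properties of the involution can be checked exhaustively on small cases. The sketch below repeats `ballotInv` so that it is self-contained, and adds a brute-force enumeration `ballots` plus three checks; everything other than `ballotInv` is an assumption introduced for this test, and labels are taken to be 1..n with each part sorted.

```haskell
import Data.List (delete, sort, subsequences, (\\))

type Ballot = [[Int]]

-- The involution from the post, repeated so this block stands alone.
ballotInv :: Ballot -> Ballot
ballotInv = go 1
  where
    go _ [] = []
    go s ([a]:ps)
      | s == a = [a] : go (s+1) ps
    go s (p:ps)
      | s `elem` p = delete s p : [s] : ps
    go s (p:[a]:ps)
      | s == a = sort (a:p) : ps
    go s (p:ps) = p : go s ps

-- All ballots on a sorted list of distinct labels, with sorted parts.
ballots :: [Int] -> [Ballot]
ballots [] = [[]]
ballots ls = [ p : b | p <- tail (subsequences ls), b <- ballots (ls \\ p) ]

-- Applying the map twice gives back the original ballot.
isInvolution :: Int -> Bool
isInvolution n = all (\b -> ballotInv (ballotInv b) == b) (ballots [1 .. n])

-- Ballots that the map leaves unchanged.
fixedPoints :: Int -> [Ballot]
fixedPoints n = [ b | b <- ballots [1 .. n], ballotInv b == b ]

-- Every ballot that is moved changes the parity of its number of parts.
flipsSign :: Int -> Bool
flipsSign n = and
  [ even (length b) /= even (length b')
  | b <- ballots [1 .. n], let b' = ballotInv b, b' /= b ]
```

For example, `ballotInv [[1],[2],[3,4]]` gives `[[1],[2],[4],[3]]`, and `fixedPoints 3` contains only the sorted ballot of singletons `[[1],[2],[3]]`.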
Here again are the 13 signed ballots for $n = 3$, this time arranged so that the pair of ballots in each row correspond to each other under the involution, with the leftover, sorted ballot by itself at the top.
If you’d like to see an illustration of the correspondence for $n = 4$, you can find it here (I didn’t want to include it inline since it’s somewhat large).
This completes the proof that signed sets and signed ballots correspond. But did we really need that linear order on the labels? Tune in next time to find out!
]]>