Deep work and email habits: a followup

About six months ago I wrote a post about some new work habits I adopted, inspired by Cal Newport’s blog and by his book, Deep Work. First, I began scheduling blocks of “deep work” time during the week, when I go to the library or a coffee shop and work intensely for several hours with no distractions or interruptions. Second, I decided to read and write email only after 4 pm each day. In my calendar, I put a reminder to write a followup blog post six months later, and here we are! (Has it been six months already!?)

My deep work sessions are still going strong, though having this opportunity to reflect has been good: I realized that over the months I have become more lax about using my computer and about what sort of things I am willing to do during my “deep work” sessions. It’s too easy to let them become just a block of time I can use to get done all the urgent things I think I need to get done. Of course, sometimes there are truly urgent things to get done, and having a big block of time to work on them can be great, especially if they require focused effort. But it pays to be more intentional about using the deep work blocks to work on big, long-term projects. The myriad little urgent things will get taken care of some other time, if they’re truly important (or maybe they won’t, if they’re not).

Since I’m only teaching two classes this semester, both of which I have taught before, I thought I would have more time for deep work sessions this semester than last, but for some reason it seems I have less. I’m not yet sure whether there’s something I could have done about that, or if the semester just looks different than I expected. This semester has also seen more unavoidable conflicts with my deep work blocks. Usually, I try to keep my scheduled deep work blocks sacrosanct, but I have made some exceptions this semester: for example, search committee meetings are quite important and also extremely difficult to schedule, so I let them be scheduled over top of my deep work blocks if necessary. (But it sure does wreak havoc on my work that week.)

I’m also still blocking my email before 4pm. On the one hand, I know this is helping a lot with my productivity and general level of anxiety. Recently I needed to (or thought I needed to!) briefly unblock my email during the day to check whether I had received a certain reply, and I specifically noticed how my anxiety level shot up as soon as I opened my inbox and saw all the messages there—a good reminder of why I have my email blocked in the first place. On the other hand, it can be frustrating, since the hour from 4-5 is often taken up with other things, so email gets pushed to the evening, or to the next day. When this goes on several days in a row it really doesn’t help my anxiety level to know there are emails sitting there that I ought to respond to. So perhaps there might be a better time to process my email than 4-5, but to be honest I am not sure what it would be. I certainly don’t want to do it first thing in the morning, and the middle of the day is not really any better, schedule-wise, than the end. In any case, I intend to keep doing it until a better idea comes along.


Signed sets and ballots and naturality

This is blog post #3 in a series on signed sets and ballots (the previous posts are here and here).

Naturality, isomorphism, and equipotence

When are two species isomorphic? Since species are, by definition, functors \mathbb{B} \to \mathbf{FinSet}, the obvious answer is “when they are isomorphic as functors”, that is, when there is a natural isomorphism between them.

Let’s unpack definitions a bit. Recall that \mathbb{B} is the groupoid (i.e. category with all morphisms invertible) of finite sets and bijections, and \mathbf{FinSet} is the category of finite sets and total functions.
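
In Haskell terms, a species is roughly a Functor, with fmap playing the role of relabelling. Here is a minimal sketch (the class and instance are my own illustration, and the types cannot enforce that label sets are finite or that relabellings are bijections):

-- Rough Haskell analogue of a species: relabelling acts functorially
-- on structures. Only a sketch: finiteness of the label set and
-- bijectivity of the relabelling are not captured by the types.
class SpeciesSketch f where
  relabel :: (u -> v) -> f u -> f v

-- For example, the species of lists: relabelling maps over the list.
instance SpeciesSketch [] where
  relabel = map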

Given two functors F, G : \mathbb{B} \to \mathbf{FinSet}, a natural isomorphism between them is a family of bijections \psi_U : F[U] \leftrightarrow G[U], one for each finite set U, such that for any bijection \varphi : U \leftrightarrow V, the following square commutes:
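
In equational form, commutativity of the square says that applying \psi and then relabelling is the same as relabelling and then applying \psi:

\displaystyle G[\varphi] \circ \psi_U = \psi_V \circ F[\varphi]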

Think of \varphi as a relabelling—that is, a 1-1 correspondence between labels in U and labels in V. By functoriality of F and G, we can lift \varphi to relabel whole structures, via F[\varphi] and G[\varphi]. The whole square then says that the family of correspondences \psi “commutes with relabelling”—that is, intuitively, \psi can’t “look at” the labels at all, because it has to “do the same thing” even when we change the labels out from under it. It operates based solely on the structure present in the F- or G-structures—which by functoriality must be independent of the labels—and not based on the identity or structure of the set of labels itself.

All this generalizes readily to virtual species: recall that virtual species consist of pairs of species (P,N), where we define (P_1, N_1) to be equivalent to (P_2, N_2) if and only if P_1 + N_2 is (naturally) isomorphic to N_1 + P_2. That is, isomorphism of virtual species already has natural isomorphism of regular species baked into it by definition.

Now, recall the signed involution from my previous post. Since it depends on a chosen linear order, it is not a natural isomorphism: an arbitrary relabelling certainly need not preserve the ordering on the labels, so the involution is not preserved under relabelling.

This brings up the natural (haha) question: is it possible to give a natural isomorphism between signed sets and signed ballots? If so, we would certainly prefer it to this involution that makes use of a linear order. But sadly, the answer turns out to be no. Restricting everything to structures on n labels, and letting k range from 0 to n, we are looking for a natural isomorphism

\displaystyle E_n + \sum_{k \text{ even}} (E_+)^k \cong \sum_{k \text{ odd}} (E_+)^k

in the case that n is odd, or

\displaystyle \sum_{k \text{ even}} (E_+)^k \cong E_n + \sum_{k \text{ odd}} (E_+)^k

when n is even.

Notice that in any case, whether positive or negative, the E_n-structure will be fixed by any relabelling which is a permutation (because permuting the elements of a set does not change the set). Hence, any natural isomorphism must send it to some structure which is also fixed by all permutations. But the only such ballot structure is the one consisting of a single part containing all the elements. This ballot has an odd number of parts, so there cannot be a natural isomorphism in the case that n is even—we would have to match the E_n structure with some ballot having an even number of parts which is fixed by all permutations of the labels, but there are none. Hence there is no natural isomorphism, which by definition would have to work for all n.

But what about for odd n? Can we have a natural isomorphism between the structures restricted to the case when n is odd? In fact, no: we can make a more general argument that applies for any n > 1. Consider the n! different ballots consisting of n separate singleton parts. Each of these is fixed by no permutations other than the identity. So, under a natural isomorphism they would all have to map to ballots of the opposite parity which are also fixed by no permutations other than the identity: but by the Pigeonhole Principle, any ballot with fewer than n parts must have at least one part containing more than one element, and is therefore fixed by any permutation that shuffles the elements of that part among themselves and leaves everything else alone.

In this situation—when there is a 1-1 correspondence between species, but not a natural one—we say the species are equipotent but not isomorphic. The species have the same number of structures of each size, and hence exactly the same generating function, but they are not (naturally) isomorphic: it is possible to tell them apart by looking at how structures behave under relabelling. Another classic example of this phenomenon is the species of lists and the species of permutations: each has exactly n! labelled structures of size n, but they are not naturally isomorphic. Lists have no symmetry at all, that is, they are fixed by no permutations other than the identity, but permutations, in general, do have some symmetry. For example, any permutation p is fixed when the labels are permuted by p itself: the permutation (123) is the same as the permutation (231). The classic combinatorial proof that there are the same number of lists and permutations also uses an extra linear order on the labels.

Subset parity

A classic lemma in combinatorics states that any nonempty finite set has the same number of even and odd subsets. In fact, I recently wrote a proof of this on my other blog. Since it’s written for a more general audience, it’s spelled out there in quite a lot of detail; but if you understand the idea of signed involutions, the classic combinatorial proof is quite simple to state: given a set S, pick some a \in S, and define a signed involution on subsets of S (signed according to the parity of their size) by “toggling” the presence of a. That is, given X \subseteq S, if a \in X then take it out, and if a \notin X then add it in. This is clearly an involution and sends even subsets to odd and vice versa.
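
Here is the toggle involution as a quick Haskell sketch (the function name is mine, not from the post):

import Data.Set (Set)
import qualified Data.Set as Set

-- The signed involution: toggle the chosen element a in or out of X.
-- Adding or removing a single element flips the parity of the size,
-- so even subsets are matched with odd ones, with no fixed points.
toggle :: Ord a => a -> Set a -> Set a
toggle a x
  | a `Set.member` x = Set.delete a x
  | otherwise        = Set.insert a x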

However, this involution is not natural—it depends on the choice of a.[1] Is it possible to prove it via a natural correspondence? In fact, that’s one of the things Anders Claesson’s original post—the one that got me started down this whole rabbit hole—was trying to do. Unfortunately, hidden in the middle of his proof was an assumed correspondence between signed sets and signed ballots, and as we’ve seen, this correspondence actually cannot be proved naturally. (I should reiterate that in no way do I mean to disparage his post—it’s still a great post with a lot of cool insights, not to mention a nice framework for thinking about multiplicative inverses of species. It just doesn’t quite accomplish one of the things he thought it was accomplishing!)

Now, at this point, all we know is that the particular argument used in that post is not, in fact, natural. But that doesn’t necessarily mean a natural correspondence between even and odd subsets is impossible. However, it turns out that it is (mostly) impossible: we can give a more direct argument that, in fact, there is no natural proof—that is, the species of odd subsets and the species of even subsets are equipotent but not naturally isomorphic.

The proof is quite simple, and along similar lines to the proof for signed sets and ballots. Note that the empty subset is fixed by all permutations, as is the maximal subset—and these are clearly the only such subsets. So any natural correspondence must match these two subsets with each other—but a correspondence between even and odd subsets must reverse parity, and when n is even the empty subset (size 0) and the maximal subset (size n) have the same parity.

What about if we restrict to the case when n is odd? Unlike the case of signed sets and ballots, it turns out that we actually can give a natural proof in this case! The involution to use is the one that simply sends any X \subseteq S to its complement S - X. This is clearly an involution, and since |S| is odd, it reverses the parity as required. It also does not depend at all on the elements of S or any assumed structure on them, that is, it commutes perfectly well with relabelling.
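
In the same Haskell setting as the sketch above, the natural involution is just set difference (again, the name is mine):

-- Complementation within s: an involution on subsets of s which
-- flips parity whenever |s| is odd, and which makes no arbitrary
-- choices, so it commutes with relabelling.
complementIn :: Ord a => Set a -> Set a -> Set a
complementIn s x = s `Set.difference` x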

This corresponds nicely with an observation we can make about Pascal’s triangle: it is easy to see that the alternating sum of any odd row is 0, for example, 1 - 5 + 10 - 10 + 5 - 1 = 0, since the entries are all duplicated, with one positive and one negative version of each. However, for even rows it is not quite so obvious: why is 1 - 4 + 6 - 4 + 1 = 0? To show this cancels to give 0 we must “split up” some of the numbers so that one part of them cancels with one number and another part cancels with another; that is, we cannot necessarily treat all the subsets of a given size uniformly. But subsets of a given size are completely indistinguishable up to relabelling—hence the necessity of some extra structure to allow us to make the necessary distinctions.


  1. Interestingly, it does not depend on a linear order—just on a chosen element. I wonder if anyone has ever thought about “classifying” different equipotences by the “strength” of the extra structure needed to prove them.


Signed sets and ballots, part 2

Recall, from my previous post, that our goal is to find a combinatorial proof showing the correspondence between signed sets and signed ballots, where a signed set is just a set of n elements, considered positive or negative according to the parity of n, and a signed ballot is an ordered list of sets, considered positive or negative according to the parity of the number of sets.

So, how should such a proof look? For a given number of labels n, there is a single signed set structure, which is just the set of labels itself (with a sign depending on the parity of n). On the other hand, there are lots of ballots on n labels; the key is that some are positive and some are negative, since the sign of the ballots depends on the parity of the number of parts, not the number of labels. For example, consider n = 3. There is a single (negative) signed set structure:

(I will use a dashed blue line to indicate negative things, and a solid black line for positive things.)

On the other hand, as we saw last time, there are 13 ballot structures on 3 labels, some positive and some negative:

In this example, it is easy to see that most of the positives and negatives cancel, with exactly one negative ballot left over, which corresponds with the one negative set. As another example, when n = 4, there is a single positive set, and 75 signed ballots:

This time it is not quite so easy to tell at a glance (at least not the way I have arranged the ballots in the above picture!), but in fact one can verify that there are exactly 37 negative ballots and 38 positive ones, again cancelling to match the one positive set.

What we need to show, then, is that we can pair up the ballots in such a way that positive ballots are matched with negative ballots, with exactly one ballot of the appropriate sign left to be matched with the one signed set. This is known as a signed involution: an involution is a function which is its own inverse, so it matches things up in pairs; a signed involution sends positive things to negative things and vice versa, except for any fixed points.

In order to do this, we will start by assuming the set of labels is linearly ordered. In one sense this is no big deal, since for any finite set of labels we can always just pick an arbitrary ordering, if there isn’t an “obvious” ordering to use already. On the other hand, it means that the correspondence will be specific to the chosen linear ordering. All other things being equal, we would prefer a correspondence that depends solely on the structure of the ballots, and not on any structure inherent to the labels. I will have quite a bit more to say about this in my third and (probably) final post on the topic. But for today, let’s just see how the correspondence works, given the assumption of a linear order on the labels. I came up with this proof independently while contemplating Anders Claesson’s post, though it turns out that the exact same proof is already in a paper by Claesson and Hannah (in any case it is really just a small lemma, the sort of thing you might give as a homework problem in an undergraduate course on combinatorics).

Given some ballot, find the smallest label. For example, if the labels are \{1, \dots, n\} as in the examples so far, we will find the label 1.

  • If the smallest label is contained in some part together with at least one other label, separate it out into its own part by itself, and put it to the right of its former part. Like this:

  • On the other hand, if the smallest label is in a part by itself, merge it with the part on the left (if one exists). This is clearly the inverse of the above operation.

  • The only case we haven’t handled is when the smallest label is in a part by itself which is the leftmost part in the ballot. In that case, we leave that part alone, switch to considering the second-smallest label, and recursively carry out the involution on the remainder of the ballot.

    For example:

    In this case we find the smallest label (1) in a part by itself in the leftmost position, so we leave it where it is and recurse on the remainder of the ballot. Again, we find the smallest remaining label (2) by itself and leftmost, so we recurse again. This time, we find the smallest remaining label (3) in a part with one other label, so we separate it out and place it to the right.

This transformation on ballots is clearly reversible. The only ballots it doesn’t change are ballots with each label in its own singleton part, sorted from smallest to biggest, like this:

In this case the algorithm recurses through the whole ballot and finds each smallest remaining label in the leftmost position, ultimately doing nothing. Notice that a sorted ballot of singletons has the same sign as the signed set on the same labels, namely, (-1)^n. In any other case, we can see that the algorithm matches positive ballots to negative and vice versa, since it always changes the number of parts by 1, either splitting one part into two or merging two parts into one.

Here’s my implementation of the involution in Haskell:

import Data.List (delete, sort)

type Ballot = [[Int]]

ballotInv :: Ballot -> Ballot
ballotInv = go 1
  where
    go _ [] = []
    -- The smallest label is alone in the leftmost part: leave it, and
    -- recurse on the rest of the ballot with the next-smallest label.
    go s ([a]:ps)
      | s == a = [a] : go (s+1) ps
    -- The smallest label shares a part with other labels: separate it
    -- out into its own part, placed just to the right.
    go s (p:ps)
      | s `elem` p = delete s p : [s] : ps
    -- The smallest label is alone with a part to its left: merge them.
    go s (p:[a]:ps)
      | s == a = sort (a:p) : ps
    -- Otherwise, the smallest label is further along: skip this part.
    go s (p:ps) = p : go s ps

(The call to sort is not strictly necessary, but I like to keep each part canonically sorted.)
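
For example, starting from a ballot where the smallest label shares a part with another label, the involution splits it out to the right, and applying it again merges it back:

ghci> ballotInv [[1,2],[3]]
  [[2],[1],[3]]

ghci> ballotInv [[2],[1],[3]]
  [[1,2],[3]]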

Here again are the 13 signed ballots for n = 3, this time arranged so that the pair of ballots in each row correspond to each other under the involution, with the leftover, sorted ballot by itself at the top.

If you’d like to see an illustration of the correspondence for n = 4, you can find it here (I didn’t want to include it inline since it’s somewhat large).

This completes the proof that signed sets and signed ballots correspond. But did we really need that linear order on the labels? Tune in next time to find out!


Signed sets and ballots, part 1

The other day, Anders Claesson wrote a very nice blog post explaining a more combinatorial way to understand multiplicative inverses of virtual species (as opposed to the rather algebraic way I explained it in my previous post). In the middle of his post he makes an offhanded assumption which I stubbornly refused to take at face value; after thinking about it for a while and discussing it with Anders, I’m very glad I did, because there’s definitely more going on here than meets the eye and it’s given me a lot of interesting things to think about.

Recall that E denotes the species of sets, defined by E[U] = \{U\}, that is, the only E-structure on a given label set U is the set of labels itself. Recall also that the exponential generating function of a species F is given by

\displaystyle F(x) = \sum_{n \geq 0} f_n \frac{x^n}{n!}

where f_n counts the number of labelled F-structures of size n. In the case of E, we have f_n = 1 for all n, so

\displaystyle E(x) = \sum_{n \geq 0} \frac{x^n}{n!} = e^x.

(This is why E is such a good name for the species of sets—though, in a fantastic coincidence, it seems originally to come from the French word for set, ensemble, rather than from the fact that E(x) = e^x. Then again, calling it a “coincidence” is probably too strong, since Joyal must surely have picked the notation with the generating function already in mind!)

Now, from my previous post we know that

\displaystyle E^{-1} = (1 + E_+)^{-1} = \sum_{k \geq 0} (-1)^k (E_+)^k.

Let’s first consider \sum_k (E_+)^k (without the (-1)^k). This means that we have, for some k \geq 0, a k-ary product of E_+ structures—in other words, a list of nonempty sets. This is the species of ballots, also known as ordered set partitions, and can also be written L \circ E_+. As an example, here is a ballot on the set of labels \{1, \dots, 8\}:

The order of the parts matters, so this is a different ballot:

But the order of labels within each part doesn’t matter (since each part is a set). As another example, here is the complete collection of ballot structures on the labels \{1,2,3\}:

We can see that there are 13 in total: six where the labels are each in their own separate part (corresponding to the six possible permutations of the labels); six where two labels share a part and the other label is a singleton part (corresponding to the three ways to choose the solitary label, times the two ways to order the parts); and one final ballot where all three labels are grouped in the same part. (As an exercise, can you verify that there are 75 different ballot structures on a set of four labels?)
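
If you would rather let the computer do the counting, here is one simple way to enumerate ballots in Haskell (my own sketch, not from the post): choose a nonempty subset as the first part, then recursively build a ballot from the remaining labels.

import Data.List (subsequences, (\\))

-- All ballots (ordered set partitions) of a list of distinct labels.
ballots :: Eq a => [a] -> [[[a]]]
ballots [] = [[]]
ballots xs =
  [ p : rest
  | p <- filter (not . null) (subsequences xs)
  , rest <- ballots (xs \\ p)
  ]

ghci> length (ballots [1,2,3])
  13

ghci> length (ballots [1,2,3,4])
  75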

Returning to E^{-1} = \sum_k (-1)^k (E_+)^k, we can see that it consists of signed ballots, where the sign of a ballot is the parity of its number of parts, that is, a ballot with k parts has sign (-1)^k. The second half of Anders’ post gives a nice combinatorial proof that E \cdot E^{-1} = 1, via a sign-reversing involution: if we consider E \cdot E^{-1}-structures, i.e. pairs of sets and signed ballots, there is a natural[1] way to pair them up, matching positive and negative structures so everything cancels (except in the case of the empty label set, which is why we get 1 instead of 0).

However, Anders is trying to do more than that. Note first that since multiplication of species corresponds to multiplication of their EGFs, the EGF for E^{-1} is of course 1/e^x = e^{-x}. But this ought to also be the EGF for the virtual species E(-X), and the rest of his post hinges on identifying E(-X) and E^{-1}. As Anders and I discovered, however, this is precisely the point where it is worth being more careful.

First of all, what is E(-X)? Intuitively, an E(-X) structure consists of a set of negative atoms; since each set can be thought of as an (unordered) product of atoms, the whole set acquires a sign given by the parity of the number of atoms. In other words, intuitively it seems that E(-X) should be the species of signed sets, where an even-sized set is considered positive and an odd-sized set negative. That is,

\displaystyle E(-X) = \sum_{n \geq 0} (-1)^n E_n,

where E_n denotes the species of sets of size exactly n. As a sanity check, this makes sense as an EGF equation too, since the EGF of E_n is just x^n/n! and indeed

\displaystyle e^{-x} = \sum_{n \geq 0} \frac{(-x)^n}{n!} = \sum_{n \geq 0} (-1)^n \frac{x^n}{n!}.

But hold on a minute, what does E(-X) really mean, formally? It is the composition of the species E with the virtual species -X, and it turns out that it is not at all a priori obvious how to define composition for virtual species! We can find the definition on p. 127 of Bergeron et al. A special case (which is enough for our present purposes) is

\displaystyle \Phi(X - Y) = \Phi(X + Y) \times (E(X)E^{-1}(Y))

where X and Y are two sorts of atoms, and \times denotes Cartesian product of species. In our case,

\displaystyle E(0 - X) = E(0 + X) \times (E(0) E^{-1}(X)) = E(X) \times E^{-1}(X) = E^{-1}(X)

since E is the identity for Cartesian product (overlaying an additional E structure on a set of labels does not add any structure, since there is only one possible E-structure).

All of this is to say, E(-X) is actually defined as E^{-1}! So at first glance it may seem we actually have nothing to prove: E(-X) and E^{-1} are the same by definition, end of story. …but in fact, all we have done is shift the burden of proof elsewhere: now it is our intuitive idea of E(-X) representing signed sets that requires proof!

To sum up, we know that E(-X) = E^{-1} = \sum_k (-1)^k (E_+)^k is the species of signed ballots, with sign given by parity of the number of parts; and intuitively, we also believe that E(-X) should correspond to parity-signed sets, \sum_n (-1)^n E_n. So, is there a nice combinatorial proof showing the correspondence between signed sets and signed ballots?

One can use the law of excluded middle to show that the answer must be “yes”: suppose the answer were “no”; but then I would not be writing this blog post, which is a contradiction since I am writing this blog post. But is there a constructive proof? Fear not! This blog post has gotten long enough, so I will stop here for now and let interested readers puzzle over it; in my next post I will explain what I came up with, along with some musings on linear orders and naturality.


  1. I am indeed using the word natural in a technical, categorical sense here! This will play an important role in my second post…


Virtual species suffice

Over six years ago, I wrote a post explaining how virtual species are defined. Ever since then (time flies!) I’ve been meaning to write a follow-up post explaining a bit more about virtual species and how they actually suffice to give us not just additive inverses, but also (somewhat surprisingly) multiplicative inverses.

Recall that the intuitive idea of a combinatorial species is a family of labelled structures which are invariant under relabelling. If you’ve never seen the formal definition before, don’t worry: just think “data structures” or “algebraic data types” for now.

The basic idea of virtual species is to work with pairs of species (P,N) where P is considered “positive” and N “negative”. Formally, we consider equivalence classes of such pairs under the equivalence relation defined by (P_1, N_1) \cong (P_2, N_2) iff P_1 + N_2 = P_2 + N_1.[1] This parallels the way one typically gives a formal definition of the integers starting from the natural numbers (the “Grothendieck group” construction); see my previous post for more details.

Intuition

How can we build intuition for virtual species, and for additive inverses of species in particular? To be honest I have been struggling with this question for many years.

Multiplicative inverses are much simpler to think about: they are like matter and antimatter. Having both an F-structure and an F^{-1} structure is the same as having nothing; they annihilate each other. By “having nothing” we mean “having no information”, that is, having a unit value: F F^{-1} = 1.

What about additive inverses? Note first that the 0 species does not correspond to having nothing; the word “nothing” corresponds to the 1 (i.e. unit) species. Instead the 0 (i.e. uninhabited) species corresponds to (logical) impossibility. So to interpret -F we have to imagine something where having either F or -F is impossible.

…yeah, me neither. This seems deeply strange. If someone says, “I either have an F or a -F”, you can confidently call them a liar, because it is impossible to have either an F or a -F; that is, F - F = 0. But surely if you actually have an F-structure, it should also be true to say “I have either an F or a G”? Well, that works for normal, positive species—in which case we can define a canonical injection F \to F + G. But once we introduce negative species this completely breaks down. As another example, if someone truthfully says, “I have either a tree or a negative non-empty tree”, you should be able to say, “Aha! I know what you have—it must be an empty tree.” In general, it’s strange that expressing a disjunction can cause some possibilities to be ruled out. Normally, we are used to disjunctions only increasing the number of possibilities.

Inspired by James and Sabry’s really cool paper The Two Dualities of Computation: Negative and Fractional Types, I have thought a bit about whether there is some plausible interpretation involving travelling backwards in time, but I haven’t been able to come up with one. I can’t quite seem to make the formalism of the paper match up with my intuition about species (though this may just be a failure of my imagination or intuition).

Multiplicative Inverses

In any case, let’s see why the ring of virtual species actually has multiplicative inverses—at least, all the ones we could possibly hope for. This is somewhat surprising, since when we build integers from natural numbers by considering equivalence classes of pairs, we certainly don’t get any multiplicative inverses, only additive ones. To get multiplicative inverses we have to do the same process a second time, building the rational numbers as equivalence classes of pairs of integers. But species already have enough extra structure that throwing in additive inverses is all it takes.

First, a caveat: we don’t get multiplicative inverses for all species, but only those species G such that G(0) = 1: that is, species G with only a single structure of size zero, which are of the form G = 1 + X(\dots). With any constant term other than 1, we clearly have no hope of finding another species H such that GH = 1, since the constant term of GH will be a multiple of G’s constant term.

So given such a G, write G = 1 + G_+, where G_+ denotes “non-empty G-structures”. Then we can define the multiplicative inverse of G as follows:

\displaystyle G^{-1} = \sum_{k \geq 0} (-1)^k (G_+)^k = 1 - G_+ + G_+^2 - G_+^3 + \dots

That is, a G^{-1}-structure consists of a list of nonempty G-structures, except that even-length lists are considered “positive” and odd-length lists considered “negative”.

We can easily check that this indeed defines a multiplicative inverse for G:

\displaystyle \begin{array}{rcl}G G^{-1} &=& (1 + G_+) (1 - G_+ + G_+^2 - G_+^3 + \dots) \\[0.5em] &=& (1 - G_+ + G_+^2 - G_+^3 + \dots) + (G_+ - G_+^2 + G_+^3 - G_+^4 + \dots) \\[0.5em] &=& 1 \end{array}

The infinite sums telescope down to leave only 1. Notice this isn’t really about species in particular, but about infinite power series (of which species are the categorification): any infinite power series with integer coefficients and a constant term of 1 has a multiplicative inverse which is also such a power series.

As an example, consider 1/(1-X) = (1-X)^{-1}. We know this is “supposed” to be the species of lists (since it results from solving L = 1 + XL for L), but let’s see what happens. In this case G = 1-X and G_+ = -X. So the inverse ought to be

\displaystyle (1-X)^{-1} = \sum_{k \geq 0} (-1)^k (-X)^k = \sum_{k \geq 0} X^k = 1 + X + X^2 + X^3 + \dots

And hey, look at that! Lists!
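
We can even watch this happen with concrete coefficients. Rearranging G G^{-1} = (1 + G_+) G^{-1} = 1 gives G^{-1} = 1 - G_+ G^{-1}, which translates directly into a corecursive definition on coefficient lists. Here is a small sketch (the helper names are mine, not from the post):

type PS = [Integer]   -- power series coefficients, constant term first

-- Add two series, padding the shorter one with zeros.
addPS :: PS -> PS -> PS
addPS xs     []     = xs
addPS []     ys     = ys
addPS (x:xs) (y:ys) = x + y : addPS xs ys

-- Cauchy product: (f + x*F) * G = f*G + x*(F*G).
mulPS :: PS -> PS -> PS
mulPS []     _         = []
mulPS _      []        = []
mulPS (f:fs) g@(g0:gs) = f * g0 : addPS (map (f *) gs) (mulPS fs g)

-- Inverse of a series with constant term 1, via G^(-1) = 1 - G_+ G^(-1).
invPS :: PS -> PS
invPS (1:fs) = r where r = 1 : map negate (mulPS fs r)

Sure enough, inverting 1 - x produces the coefficient list of the species of lists:

ghci> take 6 (invPS [1,-1])
  [1,1,1,1,1,1]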

A field of species?

So what would we need to get a true field, i.e. a multiplicative inverse for every nonzero species? Well, for that we would need to throw in rational coefficients. I forget exactly where I read this—some paper by Baez and Dolan, most likely—but I believe the proper way to interpret this would be as groupoid-valued species, since there is a sense in which the “cardinality” of groupoids can be interpreted as rational numbers. But to be honest I have no idea where this leads.


  1. Note that species sum is cancellative—that is, if A + C = B + C then A = B—so this is a coherent definition. This cancellative property is probably worth another post of its own since the reason for it is not entirely trivial.


Advent of code #16 solution: an algebra of bitstrings

I had fun this past December solving Advent of Code problems in Haskell. I was particularly proud of my solution to one particular problem involving generating and processing large bitstrings, which I’d like to share here. I think it really shows off the power of an algebraic, DSL-based approach to problem solving.

This post is literate Haskell—download it and play along!

> {-# LANGUAGE GADTs #-}
> 
> import Control.Arrow   ((***))
> import Data.Bits       (xor)
> import Data.List       (unfoldr)
> import Data.List.Split (chunksOf)
> import Data.Maybe      (fromJust)

The problem

You can go read the problem description if you like, but it’s rather verbose—I’ll try to give a more concise description here, illustrated with Haskell code.

The problem is concerned with strings of bits:

> type BitString = [Bool]

We’ll start just by defining a few utility functions to view and input bitstrings conveniently.

> readbits :: String -> BitString
> readbits = map (=='1')
> 
> showbits :: BitString -> String
> showbits = map (\b -> if b then '1' else '0')
> 
> withbits :: (BitString -> BitString) -> String -> String
> withbits f = showbits . f . readbits

Now on to the problem proper. There is a central operation—which I’ll call the “dragon transform”—which makes a longer bitstring from a shorter one. Given a bitstring s, append a 0 to the end, and then append a reversed and inverted version of s (where “invert” means to flip all the bits). Like so:

> invert :: BitString -> BitString
> invert = map not
> 
> dragon :: BitString -> BitString
> dragon s = s ++ [False] ++ invert (reverse s)

For example,

ghci> withbits dragon "1"
  "100"

ghci> withbits dragon "1101111"
  "110111100000100"

(Incidentally, this operation is called dragon since it is related to the classic dragon curve. Hint: interpret 0 as a left turn and 1 as a right turn.)

Given a starting bitstring, and a target length, we are supposed to iterate dragon until we have at least the number of target bits, and then truncate the string to the desired length:

> fill :: Int -> BitString -> BitString
> fill len = take len . head . dropWhile ((< len) . length) . iterate dragon

For example, if we start with 1, we have to iterate dragon three times to end up with at least ten bits.

ghci> map showbits . take 4 $ iterate dragon [True]
  ["1","100","1000110","100011001001110"]

ghci> withbits (fill 10) "1"
  "1000110010"

Finally, after extending an initial bitstring to a given length, we perform a checksum operation:

  • If there are an odd number of bits, we are done.
  • Otherwise, take the bits two at a time and compute the negation of their exclusive or: that is, 1 if the bits are the same and 0 if they are different (otherwise known as (==)). This results in a bitstring half as long. Now repeat the process, continuing to halve the length until we have an odd number of bits remaining.

In code:

> checksum :: BitString -> BitString
> checksum a
>   | odd (length a) = a
>   | otherwise = checksum . map xnor . chunksOf 2 $ a
>   where
>     xnor [x,y] = x == y

The first task

So, we now have a simple reference implementation that directly follows the specification. We can use this to solve the first task, which just asks to start with a given short bitstring, extend it to length 272, and then compute the checksum. I think different logged-in users get different starting strings, but mine was 01000100010010111:

> input = "01000100010010111"

ghci> withbits (checksum . fill 272) input
  "10010010110011010"

Notice that 272 = 17 \cdot 2^4, so after expanding to that length and then repeatedly halving the length, we end up with a checksum of length 17.

The second task

That was easy. Bring on the second task! Well… of course, it is much bigger. It asks to use the same starting bitstring, but this time extend it to length 35651584 = 17 \cdot 2^{21} before computing the checksum (which will again end up having length 17). The naive, unoptimized implementation completely blows up: it turns out that generating a list of 35 million booleans is really not a good idea. Using actual lists with a cons cell for each bit incurs a whole lot of memory and allocation overhead; it just made my computer grind to a halt.

As you may realize, there is a lot of low-hanging fruit here: for example, we can use an unboxed Vector instead of a list, or even do some deforestation to avoid allocation (the former code is by Eric Mertens aka glguy, the latter by Daniel Wagner aka dmwit). Using techniques like that, it’s possible to get the runtime and memory requirements down to something reasonable. But that’s not what I want to talk about. Though more efficient, those solutions are still actually computing every single bit. It seemed to me we shouldn’t have to do that: the computation has a lot of nice structure, and seemingly a lot of opportunity for sharing intermediate results. I went off in search of a way to compute the correct checksum without actually generating the entire intermediate bitstring.

Interlude: xnor

The first order of business was to work out an algebraic understanding of the xnor operation, which I will denote \overline{x \oplus y} (the circled plus operator \oplus denotes xor, and the overbar denotes logical negation). One fundamental fact is that

\displaystyle \overline{x \oplus y} = \overline{x} \oplus y = x \oplus \overline{y}

(checking whether x and y are equal is the same as first negating one and then checking whether they are unequal). From this, and the fact that \oplus is associative, we can prove associativity of xnor:

\displaystyle \overline{\overline{x \oplus y} \oplus z} = (\overline{x} \oplus y) \oplus \overline{z} = \overline{x} \oplus (y \oplus \overline{z}) = \overline{x \oplus \overline{y \oplus z}}

Associativity, along with the fact that 1 is an identity for the operation, means it forms a monoid. When we repeatedly take the xnor of adjacent bits, we are therefore basically doing an mconcat using a strictly balanced combining scheme. But associativity means we can be freer about the order in which we do the combining. If we start with a bitstring of length n \cdot 2^k, the checksumming operation iterates k times, and each consecutive sequence of 2^k bits gets folded down into a single bit via mconcat. In other words, the checksum operation can be reimplemented like this:

> checksum2 :: BitString -> BitString
> checksum2 a = map combine . chunksOf (powTwo (length a)) $ a
>   where
>     combine = foldr (==) True
> 
>     -- Find the biggest power of two that divides n
>     powTwo n
>       | odd n     = 1
>       | otherwise = 2 * powTwo (n `div` 2)

Let’s check that this works:

ghci> withbits (checksum2 . fill 272) input
  "10010010110011010"

ghci> let bits = fill 272 (readbits input) in checksum bits == checksum2 bits
  True

Now, this isn’t really any faster yet; but this idea will be important later!

There’s one more thing we can observe about xnor: if we fold an odd number of bits with xnor, it’s the same as taking the xor of all the bits; if we fold an even number of bits, it’s the same as taking the xor of all the bits and then negating the result. That is,

\displaystyle \begin{array}{rcl} \overline{x_1 \oplus x_2} &=& \overline{x_1 \oplus x_2} \\[0.5em] \overline{x_1 \oplus \overline{x_2 \oplus x_3}} &=& x_1 \oplus x_2 \oplus x_3 \\[0.5em] \overline{x_1 \oplus \overline{x_2 \oplus \overline{x_3 \oplus x_4}}} &=& \overline{x_1 \oplus x_2 \oplus x_3 \oplus x_4} \\[0.5em] \overline{x_1 \oplus \overline{x_2 \oplus \overline{x_3 \oplus \overline{x_4 \oplus x_5}}}} &=& x_1 \oplus x_2 \oplus x_3 \oplus x_4 \oplus x_5 \end{array}

and so on. The proof is a simple induction argument, making use of the relation \overline{x \oplus y} = \overline{x} \oplus y we noted before. So when folding xnor, as a simple optimization, we can avoid doing a lot of negations by just computing the xor and then negating appropriately based on the parity of the number of bits.
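
As a quick sanity check, this parity rule can be stated as a QuickCheck property (a standalone snippet, separate from the literate program in this post):

import Data.Bits (xor)
import Test.QuickCheck

-- Folding xnor over a bitstring equals the xor of all the bits,
-- negated exactly when the number of bits is even.
prop_xnorFold :: [Bool] -> Bool
prop_xnorFold bs = foldr (==) True bs == adjust (foldr xor False bs)
  where
    adjust = if even (length bs) then not else id

Running quickCheck prop_xnorFold should then pass.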

The algebra of bitstrings

With that under our belts, we can move on to the real meat of the solution. The central idea is that instead of representing bitstrings directly as lists (or vectors, or whatever) of bits, we represent them using a deep embedding of a little bitstring algebra (aka DSL). That is, we represent each bitstring operation as a constructor of an algebraic data type, which allows us to directly manipulate bitstring expressions. The point is that this algebra/DSL has a lot of nice structure that allows us to work at an abstract, algebraic level instead of working directly with bits.

There’s one more twist to note before actually seeing the data type definition. We know that we will need to talk about the length of bitstrings as well as their xnor/xor. Instead of having to recalculate these every time we need them, we can cache them at each node of a bitstring expression. We’ll see how these cached values come in handy later.

> data BitExpr where

So, what does our algebra of bitstrings need? First, it’s useful to have an explicit representation of the empty bitstring, as well as a singleton bit. We don’t need to cache length or xor values here, since they are obvious and can be computed in constant time.

>   Emp :: BitExpr
>   Bit :: Bool -> BitExpr

Next, we need to be able to append bitstrings. Notice the Bool, which represents the cached xor of the entire bitstring, as well as the Integer which represents the length.

>   App :: !Bool -> !Integer -> BitExpr -> BitExpr -> BitExpr

Finally, we need three unary operations on bitstrings: invert, reverse, and dragon. Each also carries a cached length and xor.

>   Inv :: !Bool -> !Integer -> BitExpr -> BitExpr
>   Rev :: !Bool -> !Integer -> BitExpr -> BitExpr
>   Drg :: !Bool -> !Integer -> BitExpr -> BitExpr
> 
>   deriving Show

Note that Drg is redundant in some sense, since the dragon transform can be encoded in terms of append, inverse, and reverse. However, it’s critical that we include it explicitly: since the dragon transform uses the input bitstring twice, expanding an iterated application of Drg in terms of the other constructors would result in an exponential blowup in the size of the expression.

To be concrete, let’s write a straightforward interpreter which formally connects a bitstring expression with its intended semantics as a bitstring. This comes in handy for testing, but other than testing, the whole point is that we will not use this—we want to solve the problem at the level of bitstring expressions, without ever actually generating their corresponding bitstrings.

> toBits :: BitExpr -> BitString
> toBits Emp = []
> toBits (Bit b) = [b]
> toBits (App _ _ s1 s2) = toBits s1 ++ toBits s2
> toBits (Inv _ _ s) = invert  (toBits s)
> toBits (Rev _ _ s) = reverse (toBits s)
> toBits (Drg _ _ s) = dragon  (toBits s)

Next, let’s write some simple utility functions to extract the cached length or xor from the root of a bitstring expression:

> bsLen :: BitExpr -> Integer
> bsLen Emp           = 0
> bsLen (Bit _)       = 1
> bsLen (App _ l _ _) = l
> bsLen (Inv _ l _)   = l
> bsLen (Rev _ l _)   = l
> bsLen (Drg _ l _)   = l
> 
> bsXor :: BitExpr -> Bool
> bsXor Emp           = False
> bsXor (Bit b)       = b
> bsXor (App b _ _ _) = b
> bsXor (Inv b _ _)   = b
> bsXor (Rev b _ _)   = b
> bsXor (Drg b _ _)   = b

Next, we’ll write some smart constructors which automatically take care of properly computing the cached length and xor.

> bit :: Bool -> BitExpr
> bit = Bit

Appending combines xor values with xor and adds lengths. app also does a bit of optimization when appending with the empty bitstring. For convenience, we can also use app to create a function bits to convert a literal bitstring into a BitExpr.

> app :: BitExpr -> BitExpr -> BitExpr
> app s1 Emp = s1
> app s1 s2 = App (bsXor s1 `xor` bsXor s2) (bsLen s1 + bsLen s2) s1 s2
> 
> bits :: String -> BitExpr
> bits = foldr (app . bit . (=='1')) Emp

Inverting a bitstring preserves the xor when it has even length, and inverts the xor when it has odd length. Note how we make use of both the cached xor and length values to compute the new cached xor.

> inv :: BitExpr -> BitExpr
> inv s = Inv (if even (bsLen s) then bsXor s else not (bsXor s))
>             (bsLen s)
>             s

Reversing preserves xor and length.

> rev :: BitExpr -> BitExpr
> rev s = Rev (bsXor s) (bsLen s) s

Finally, the dragon operation: the xor of dragon s is the xor of s combined with the xor of inv s; the length is one more than twice the length of s.

> drg :: BitExpr -> BitExpr
> drg s = Drg (bsXor s `xor` bsXor (inv s)) (2*(bsLen s) + 1) s

We can test these:

ghci> let t = drg (bits "11" `app` inv (bits "10000"))
ghci> showbits . toBits $ t
  "110111100000100"

ghci> bsLen t
  15

Splitting

Remember that our high-level goal is to take the expanded version of our bitstring, split it into blocks of length 2^k, and then separately reduce each block with xnor. It turns out that we have enough information to split a bitstring expression into two bitstring expressions which correspond to splitting off a block of a given size from the beginning of the corresponding bitstring. That is, we will write a function splitBits :: Integer -> BitExpr -> (BitExpr, BitExpr) which works like splitAt, but on bitstring expressions instead of bitstrings. In other words, it will satisfy the property

splitAt n . toBits == (toBits *** toBits) . splitBits n

We’ll go through the implementation case by case. You might like to try implementing splitBits yourself before peeking at mine; it makes for a nice exercise.

> splitBits :: Integer -> BitExpr -> (BitExpr, BitExpr)

In the base cases: to split zero bits off the front of a bitstring, we pair the empty bitstring expression with the whole thing; and if we are asked to split off at least as many bits as there are, we pair the whole thing with the empty bitstring expression.

> splitBits 0 s                = (Emp, s)
> splitBits n s | n >= bsLen s = (s, Emp)

To split an App node, compare the number of bits we want to split off with the length of the first bitstring, and recursively split in either the left or right side appropriately, remembering to subtract the length of the first bitstring from the number of bits to split if we recurse on the right side.

> splitBits n (App _ _ s1 s2)
>   | n < bsLen s1
>     = let (s1a, s1b) = splitBits n s1 in (s1a, s1b `app` s2)
>   | otherwise
>     = let (s2a, s2b) = splitBits (n - bsLen s1) s2 in (s1 `app` s2a, s2b)

Inverting commutes with splitting, so to split an Inv node, we can just split recursively and then rewrap the results with inv.

> splitBits n (Inv _ _ s) = (inv *** inv) $ splitBits n s

To split Rev and Drg nodes, we expand the expressions a bit to get rid of the top-level constructor before re-calling splitBits.

> splitBits n (Rev _ _ s) = splitBits n (pushRev s)
> splitBits n (Drg _ _ s) = splitBits n (expandDragon s)

In the case of Rev, we can “push the reverse through” one level, transforming it into an equivalent expression which no longer has a Rev node at the top. We make use of some nice algebraic properties governing the interaction of reverse with the other operations:

  • Reversing an empty or singleton bitstring does nothing.
  • reverse (s1 ++ s2) == reverse s2 ++ reverse s1
  • reverse . invert = invert . reverse
  • reverse . reverse = id
  • Finally, reverse . dragon = dragon . invert, which can be easily proved by expanding dragon in terms of the other operations and then applying the above algebraic laws.

Using these properties, we can implement pushRev as follows:

> pushRev :: BitExpr -> BitExpr
> pushRev Emp     = Emp
> pushRev (Bit b) = Bit b
> pushRev (App _ _ s1 s2) = rev s2 `app` rev s1
> pushRev (Inv _ _ s) = inv (rev s)
> pushRev (Rev _ _ s) = s
> pushRev (Drg _ _ s) = drg (inv s)

Finally, expandDragon just expands a dragon operation in terms of the other operations. Although this approximately doubles the size of the bitstring expression, we only do this lazily, when we are actually trying to split the result of a dragon transform. It’s only natural that splitting an expression results in somewhat larger expressions.

> expandDragon :: BitExpr -> BitExpr
> expandDragon s = s `app` (bit False `app` inv (rev s))

Filling and checksumming

We’re almost there! We can now implement the fill and checksum operations at the level of bitstring expressions.

fill is straightforward: keep applying the drg smart constructor until the cached length is sufficient, then use splitBits to create an expression corresponding to only the first n bits.

> fillE :: Integer -> String -> BitExpr
> fillE n str = fst . splitBits n $ go (bits str)
>   where
>     go s | bsLen s >= n = s
>          | otherwise    = go (drg s)

Finally, we can implement checksumE using the same pattern as checksum2, where we break up the string into chunks of size 2^k and then reduce each chunk. The only difference is that now we use splitBits to split, and the cached xor to compute the reduction. We know each of the blocks has an even length, so the xnor is just the negation of the cached xor.

> checksumE :: BitExpr -> BitString
> checksumE s = map (not . bsXor) . unfoldr doSplit $ s
>   where
>     doSplit Emp = Nothing
>     doSplit s   = Just (splitBits blockLen s)
>     blockLen = powTwo (bsLen s)
>     powTwo n
>       | odd n     = 1
>       | otherwise = 2 * powTwo (n `div` 2)

Let’s check that we get the same answer for the first task:

ghci> showbits $ checksumE (fillE 272 input)
  "10010010110011010"

ghci> withbits (checksum . fill 272) input
  "10010010110011010"

Great! And now for the second task:

ghci> showbits $ checksumE (fillE (17 * 2^21) input)
  "01010100101011100"

On my machine this finishes pretty much instantaneously, taking only 0.02 seconds. In order to generate enough bits, the dragon transform must be applied 21 times, but that just generates a small expression with 21 Drg constructors. Splitting into chunks of length 2^{21} certainly expands the size of the expressions a bit, but everything stays nice and logarithmic since many of the Drg constructors can remain unexpanded.

In fact, this can easily handle MUCH larger problem instances. For example:

ghci> showbits $ checksumE (fillE (17 * 2^80) input)
  "10000100010001100"

ghci> showbits $ checksumE (fillE (17 * 2^81) input)
  "01010100101011100"

Semantically, this corresponds to generating yottabytes worth of bits (I had to look up the proper prefix) and then checksumming them; operationally, though, these are still basically instantaneous. (Interestingly, I also tried 17 \cdot 2^{200}, and it instantaneously printed the first 11 bits of the answer and then segfaulted. Perhaps I have found a bug in GHC 8.0.2.)

Notice that the checksum for 17 \cdot 2^{81} is actually the same as that for 17 \cdot 2^{21}. After playing around with it a bit, the checksums for 17 \cdot 2^k seem to have a period of 12, but I’m not sure how to prove it!


My new programming languages course

tl;dr: my new PL course is now finished, and all the course materials are freely available. Working through all the exercises should be a great option for anyone wishing to learn some basics of programming language design and implementation.

Last May, I wrote about my ideas for designing a new PL course, and got a lot of great comments and feedback. Well, somehow I survived the semester, and the course is now over. In the end I’m pretty happy with how it went (though of course there are always things that can be improved next time).

I decided to use class time in an unconventional way: for each class meeting I created a “module”, consisting of a literate Haskell file with some example code, explanatory text, and lots of holes where students needed to write answers to exercises or fill in code. I split the students into groups, and they spent class time just working through the module. Instead of standing at the front lecturing, I just wandered around watching them work and answering questions. It took a bit of getting used to—for the first few classes I couldn’t shake the feeling that I wasn’t really doing my job—but it quickly became clear that the students were really learning and engaging with the material in a way that they would not have been able to if I had just lectured.

A happy byproduct of this approach is that the modules are fairly self-contained and can now be used by anyone to learn the material. Reading through all the modules and working through the exercises should be a great option for anyone wishing to learn some basics of programming language design and implementation. For example, I know I will probably reuse it to get summer research students up to speed. Note that the course assumes no knowledge of Haskell (so those familiar with Haskell can safely skip the first few modules), but introduces just enough to get where I want to go.

I don’t plan to release any solutions, so don’t ask. But other than that, questions, comments, bug reports, etc. are welcome!
