A strange representation of Z6

On my other blog I am writing about a proof of the Lucas-Lehmer test, and today in the course of working up some examples I stumbled across this little gem.

Let M be a monoid, and let M^* denote the subset of elements of M which actually have an inverse. Then it is not hard to show that M^* is a group: the identity is its own inverse and hence is in M^*; it is closed under the monoid operation since if a and b have inverses then so does ab (namely, b^{-1}a^{-1}); and clearly the inverse of every element in M^* is also in M^*, because being an inverse also implies having one.

Now let M = \{a + b\sqrt{3} \mid 0 \leq a,b < 3\}, where the operation is multiplication, but the coefficients a and b are reduced modulo 3. For example, (2 + \sqrt 3)(2 + 2 \sqrt 3) = (4 + 6) + (4 + 2) \sqrt 3 = 1 + 0 \sqrt 3. This does turn out to be associative, and is clearly commutative; and 1 = 1 + 0\sqrt 3 is the identity. I wrote a little program to see which elements have inverses, and it turns out that the three elements with a = 0 do not, but the other six do. So this is an Abelian group of order 6; but there’s only one such group, namely, the cyclic group \mathbb{Z}_6. And, sure enough, M^* turns out to be generated by 2 + \sqrt 3 and 2 + 2 \sqrt 3.

Posted in math | Tagged , , , | 11 Comments

ICFP roundup

Well, ICFP 2015 in Vancouver is already two months in the past… I’ve been meaning for a while to write a post reflecting on the experience. Better late than never!

The Place

I had never been to Vancouver before; it seems like a beautiful and fun city. One afternoon I skipped all the talks and went for a long hike—I ended up walking around the entire perimeter of the Stanley Park seawall, which was gorgeous. The banquet was at the (really cool) aquarium—definitely the first time I have eaten dinner while being watched by an octopus.

The People

Instead of staying in the conference hotel, four of us (me, Ryan Yates, Ryan Trinkle, and Michael Sloan) rented an apartment through AirBnB.1 The apartment was really great, it ended up being cheaper per person than sharing two hotel rooms, and it was a lot of fun to have a comfortable place to hang out in the evenings—where we could sit around in our pajamas, and talk, or write code, or whatever, without having to be “on”.

I met some new people, including Aahlad Gogineni from Tufts (along with another Tufts student whose name I unfortunately forget); Zac Slade and Boyd Smith from my new home state of Arkansas; and some folks from Vancouver whose names I am also blanking on at the moment. I also met a few people in person for the first time I had previously only communicated with electronically, like Rein Heinrich and Chris Smith.

I also saw lots of old friends—way too many to list. It once again reminded me how thankful I am to be part of such a great community. Of course, the community is also far from perfect; towards that end I really enjoyed and appreciated the ally skills tutorial taught by Valerie Aurora (which probably deserves its own post).

The Content

Here are just a few of my favorite talks:

I had a lot of great discussions relating to diagrams. For example:

  • I talked with Alan Zimmerman about using his and Matthew Pickering’s great work on ghc-exactprint with an eye towards shipping future diagrams releases along with an automated refactoring tool for updating users’ diagrams code.

  • After talking a bit with Michael Sloan I got a much better sense for the ways stack can support our development and CI testing process.

  • I had a lot of fun talking with Ryan Yates about various things, including projecting diagrams from 3D into 2D, and reworking the semantics and API for diagrams’ paths and trails to be more elegant and consistent. We gave a presentation at FARM which seemed to be well-received.

I got another peek at how well Idris is coming along, including a few personal demonstrations from David Christiansen (thanks David!). I am quite impressed, and plan to look into using it during the last few weeks of my functional programming course this spring (in the past I have used Agda).

If I had written this as soon as I got back, I probably could have remembered a lot more; oh well. All in all, a wonderful week, and I’m looking forward to Japan next year!

  1. Yes, I know that hotel bookings help pay for the conference, and I admit to feeling somewhat conflicted about this.

  2. I asked him afterwards how he made the great animations in his slides, and sadly it seems he tediously constructed them using PowerPoint. Someday, it will be possible to do this with diagrams!

Posted in haskell | Tagged , , | 2 Comments

Experience report: oral final exam

This past spring I taught a standard data structures course (stacks, queues, binary trees, heaps, asymptotic analysis, that kind of thing). Inspired by a group I participated in exploring pedagogy and course design—led by the wonderful Betsy Burris—I decided to give an oral final exam instead of a typical written exam. For whatever reason, oral exams don’t seem very popular in the US these days, but overall I think my experiment was a success and I definitely hope to repeat it in the future. This post reports on how I organized things, what went well and poorly, and what I learned overall. I’d love to hear from anyone else who tries something like this in their own course. I’m also happy to answer questions, either in a comment on this post or via email.

Exam content

I prepared three questions for the exam. The first was fairly simple (“explain algorithm X and analyze its time complexity”) and I actually told the students ahead of time what it would be—to help them feel more comfortable and prepared. The other questions were a bit more open-ended:

  • The second question was of the form “I want to store X information and do operations Y and Z on it. What sorts of data structure(s) might you use, and what would be the tradeoffs?” There were then a couple rounds of “OK, now I want to add another operation W. How does that change your analysis?” In answering this I expected them to deploy metrics like code complexity, time and memory usage etc. to compare different data structures. I wanted to see them think about a lot of the different data structures we had discussed over the course of the semester and their advantages and disadvantages at a high level.

  • The final question was of the form “Here is some code. What does it do? What is its time complexity? Now please design a more efficient version that does the same thing.” With some students there was enough time to have them actually write code, with other students I just had them talk through the design of an algorithm. This question got more at their ability to design and analyze appropriate algorithms on data structures. The algorithm I asked them to develop was not something they had seen before, but it was similar to other things they had seen, put together in a new way.

Overall I was happy with the questions and the quality of the responses they elicited. If I do this again I would use similar sorts of questions.


You might well be wondering how long all of this took. I had about 30 students. I planned for the exam to take 30 minutes, and blocked out 45-minute chunks of time (to allow time for transitioning and for the exam to go a bit over 30 minutes if necessary; in practice the exams always went at least 40 minutes and I was scrambling at the end to jot down final thoughts before the next students showed up). I allowed them to choose whether to come in by themselves or with a partner (more on this later). As seems typical, about 1/3 of them chose to come by themselves, and the other 2/3 in pairs, for a total of about 20 exam slots. 20 slots at 45 minutes per slot comes out to 15 hours, or 3 hours per day for a week. This might sound like a lot, but if you compare it to the time required for a traditional written exam it compares quite favorably. First of all, I spent only two or three hours preparing the exam, whereas I estimate I would easily spend 5 or 10 hours preparing a written exam—a written exam has to be very precise in explaining what is wanted and in trying to anticipate potential questions and confusions. When you are asking the questions in person, it is easy to just clear up these confusions as they arise. Second, I was mostly grading students during their exam (more on this in the next section) so that by about five minutes after the end of their slot I had their exam completely graded. With a written exam, I could easily have spent at least 15 hours just grading all the exams.

So overall, the oral exam took up less of my time, and I can tell you, hands down, that my time was spent much more enjoyably than it would have been with a written exam. It was really fun to have each student come into my office, to get a chance to talk with them individually (or as a pair) and see what they had learned. It felt like a fitting end to the semester.


In order to assess the students, I prepared a detailed rubric beforehand, which was really critical. With a written exam you can just give the exam and then later come up with a rubric when you go to grade them (although I think even written exams are usually improved by coming up with a rubric beforehand, as part of the exam design process—it helps you to analyze whether your exam is really assessing the things you want it to). For an oral exam, this is impossible: there is no way to remember all of the responses that each student gives, and even if you write down a bunch of notes during or after each exam, you would probably find later that you didn’t write down everything that you should have.

In any case, it worked pretty well to have a rubric in front of me, where I could check things off or jot down quick notes in real time.

Social aspects

People are often surprised when I say that I allowed the students to come in pairs. My reasons were as follows:

  • For many students, being able to come with a partner may help ease their anxiety about the exam. I was especially sensitive to this since for many of them an oral exam was a new experience. (I asked beforehand, and only two or three of them had ever had an oral exam, and those were in foreign language classes.)
  • Computer science, like most disciplines, is often collaborative. No matter whether a student ends up going into industry or academia, it is very likely that any data structure design or analysis that they end up doing in “real life” will be in collaboration with others. So in this sense a collaborative oral exam is fairly authentic—much more so than a written exam. That is, the skills, activities, and modes of thinking and communicating required by a collaborative oral exam are actually fairly similar to what might be required of the students in the “real world”.
  • There was also, of course, the simple reason of expedience: it cut down on the amount of time I had to spend administering the exam.

Overall I was really happy with the result. Many of the students had been working with a particular partner on their labs for the whole semester and came to the exam with that same partner. For quite a few pairs this obviously worked well for them: it was really fun to watch the back-and-forth between them as they suggested different ideas, debated, corrected each other, and occasionally even seemed to forget that I was in the room.

One might worry about mismatched pairs, where one person does all of the talking and the other is just along for the ride. I only had this happen to some extent with one or two pairs. I told all the students up front that I would take points off in this sort of situation (I ended up taking off 10%). In the end this almost certainly meant that one member of the pair still ended up with a higher grade than they would have had they taken the exam individually. I decided I just didn’t care. I imagine I might rethink this for an individual class where there were many of these sorts of pairings going on during the semester—but in that case I would also try to do something about it before the final exam.

Another interesting social aspect of the process was figuring out what to do when students were floundering. One explicit thing one can do is to offer a hint in exchange for a certain number of points off, but I only ended up using this explicit option a few times. More often, after the right amount of time, I simply guided them on to the next part, either by suggesting that we move on in the interest of time, or by giving them whatever part of the answer they needed to move on to the next part of the question. I then took off points appropriately in my grading.

It was difficult figuring out how to verbally respond to students: on the one hand, stony-faced silence would be unnatural and unnerving; on the other hand, responding enthusiastically when they said something correct would give too much away (i.e. by the absence of such a response when they said something incorrect). As the exams went on I got better (I think) at giving interested-yet-non-committal sorts of responses that encouraged the students but didn’t give too much away. But I still found this to be one of the most perplexing challenges of the whole process.


One might wonder how much of the material from an entire semester can really be covered in a 30-minute conversation. Of course, you most certainly cannot cover every single detail. But you can actually cover quite a lot of the important ideas, along with enough details to get a sense for whether a student understands the details or not. In the end, after all, I don’t care whether a student remembers all the details from my course. Heck, I don’t even remember all the details from my course. But I care a great deal about whether they remember the big ideas, how the details fit, and how to re-derive or look up the details that they have forgotten. Overall, I am happy with the way the exam was able to cover the high points of the syllabus and to test students’ grasp of its breadth.

My one regret, content-wise, is that with only 30 minutes, it’s not really possible to put truly difficult questions on the exam—the sorts of questions that students might have to wrestle with for ten or twenty minutes before getting a handle on them.

Next time

Would I do this again? Absolutely, given the right circumstances. But there are probably a few things I would change or experiment with. Here are a few off the top of my head:

  • This time, I only decided to do an oral final exam quite late in the semester. Next time, with more planning, I would like to think carefully about how to explicitly prepare students for this type of assessment, i.e. to give them chances to practice the necessary skills and to receive feedback. I am not yet sure what that would look like.
  • I might try something like a system for e-mailing students a harder question a certain amount of time before their slot—the idea being that they would have that time to think about and prepare their answer to the question, which they could then present during the slot (in addition to answering some questions on the spot). However, a potential major downside is that students who struggle with the hard question might come in feeling demoralized and do even worse on the rest of the exam.
  • Next time I will definitely be more careful and intentional about writing down a script for exactly what I want to say at the start of each exam. There were a lot of things I wanted to remember to say, and specific ways I wanted to frame the time, and at times it seemed a bit haphazard and inconsistent.

Again, I’m happy to answer questions in the comments or by email. If you are inspired to also try giving an oral exam, let me know how it goes!

Posted in teaching | Tagged , , , , | 5 Comments

Meeting new people at ICFP

This afternoon I’ll be getting on a plane to Vancouver for ICFP. I’m looking forward to seeing many friends, of course, but I also enjoy meeting new people—whether or not they are “famous”, whether or not I think they can “advance my career”. So I’ll just throw this out there: if you will be in Vancouver this week and would like to meet me, just leave a comment and I will make a point of trying to find you to chat! I’ll be attending the Haskell Implementor’s Workshop, the Ally Skills Tutorial, ICFP itself, the Haskell Symposium, and FARM, but there’s also plenty of time to chat in the hallway or over a meal.

Posted in haskell | 1 Comment

Catsters guide is complete!

About a year and a half ago I announced that I had started creating a guide to the excellent series of category theory YouTube videos by the Catsters (aka Eugenia Cheng and Simon Willerton). I am happy to report that as of today, the guide is finally complete!

As far as possible, I have tried to arrange the order so that each video only depends on concepts from earlier ones. (If you have any suggestions for improving the ordering, I would love to hear them!) Along with each video you can also find my cryptic notes; I make no guarantee that they will be useful to anyone (even me!), but hopefully they will at least give you an idea of what is in each video.

If and when they post any new videos (pretty please?) I will try to keep it updated.

Posted in haskell, math, teaching | Tagged | 2 Comments

The Species of Bracelets

Summary: The species package now has support for bracelets, i.e. equivalence classes of lists up to rotation and reversal. I show some examples of their use and then explain the (very interesting!) mathematics behind their implementation.

I recently released a new version of my species package which adds support for the species of bracelets. A bracelet is a (nonempty) sequence of items which is considered equivalent up to rotation and reversal. For example, the two structures illustrated below are considered equivalent as bracelets, since you can transform one into the other by a rotation and a flip:

In other words, a bracelet has the same symmetry as a regular polygon—that is, its symmetry group is the dihedral group D_{2n}. (Actually, this is only true for n \geq 3—I’ll say more about this later.)

Bracelets came up for me recently in relation to a fun side project (more on that soon), and I am told they also show up in applications in biology and chemistry (for example, bracelet symmetry shows up in molecules with cycles, which are common in organic chemistry). There was no way to derive the species of bracelets from what was already in the library, so I added them as a new primitive.

Let’s see some examples (later I discuss how they work). First, we set some options and imports.

ghci> :set -XNoImplicitPrelude
ghci> :m +NumericPrelude
ghci> :m +Math.Combinatorics.Species

Unlabelled bracelets, by themselves, are completely uninteresting: there is only a single unlabelled bracelet shape of any positive size. (Unlabelled species built using bracelets can be interesting, however; we’ll see an example in just a bit). We can ask the library to tell us how many distinct size-n unlabelled bracelets there are for n \in \{0, \dots, 9\}:

ghci> take 10 $ unlabelled bracelets

Labelled bracelets are a bit more interesting. For n \geq 3 there are (n-1)!/2 labelled bracelets of size n: there are (n-1)! cycles of size n (there are n! lists, which counts each cycle n times, once for each rotation), and counting cycles exactly double counts bracelets, since each bracelet can be flipped in one of two ways. For example, there are (5-1)!/2 = 24/2 = 12 labelled bracelets of size 5.

ghci> take 10 $ labelled bracelets

In addition to counting these, we can exhaustively generate them (this is a bit annoying with the current API; I hope to improve it):

ghci> enumerate bracelets [0,1] :: [Bracelet Int]

ghci> enumerate bracelets [0..2] :: [Bracelet Int]

ghci> enumerate bracelets [0..3] :: [Bracelet Int]

And here are all 12 of the size-5 bracelets, where I’ve used a different color to represent each label (see here for the code used to generate them):

As a final example, consider the species B \times E^2, the Cartesian product of bracelets with ordered pairs of sets. That is, given a set of labels, we simultaneously give the labels a bracelet structure and also partition them into two (distinguishable) sets. Considering unlabelled structures of this species—that is, equivalence classes of labelled structures under relabelling—means that we can’t tell the labels apart, other than the fact that we can still tell which are in the first set and which are in the second. So, if we call the first set “purple” and the second “green”, we are counting the number of bracelets made from (otherwise indistinguishable) purple and green beads. Let’s call these binary bracelets. Here’s how many there are of sizes 0 through 14:

ghci> let biBracelets = bracelet >< (set * set)
ghci> take 15 $ unlabelled biBracelets

Let’s use the OEIS to check that we’re on the right track:

ghci> :m +Math.OEIS
ghci> let res = lookupSequence (drop 1 . take 10 $ unlabelled biBracelets)
ghci> fmap description res
  Just "Number of necklaces with n beads of 2 colors, allowing turning over."

Unfortunately the species library can’t currently enumerate unlabelled structures of species involving Cartesian product, though I hope to fix that. But for now we can draw these purple-green bracelets with some custom enumeration code. You can see the numbers 2, 3, 4, 6, 8, 13 show up here, and it’s not too hard to convince yourself that each row contains all possible binary bracelets of a given size.

If you’re just interested in what you can do with bracelets, you can stop reading now. If you’re interested in the mathematical and algorithmic details of how they are implemented, read on!

Exponential generating functions

The exponential generating function (egf) associated to a combinatorial species F is defined by

\displaystyle F(x) = \sum_{n \geq 0} |F[n]| \frac{x^n}{n!}.

That is, the egf is an (infinite) formal power series where the coefficient of x^n/n! is the number of distinct labelled F-structures on n labels. We saw above that for n \geq 3 there are (n-1)!/2 labelled bracelets of size n, and there is one bracelet each of sizes 1 and 2. The egf for bracelets is thus:

\displaystyle B(x) = x + \frac{x^2}{2} + \sum_{n \geq 3} \frac{(n-1)!}{2} \frac{x^n}{n!} = x + \frac{x^2}{2} + \sum_{n \geq 3} \frac{x^n}{2n}.

(Challenge: show this is also equivalent to \frac{1}{2}(x + x^2/2 - \ln(1-x)).) This egf is directly encoded in the species library, and this is what is being used to evaluate labelled bracelets in the example above.

Incidentally, the reason (n-1)!/2 only works for n \geq 3 is in some sense due to the fact that the dihedral groups D_2 = Z_2 and D_4 = K_4 are a bit weird: every dihedral group D_{2n} is a subgroup of the symmetric group \mathcal{S}_n except for D_2 and D_4. The problem is that for n < 3, “flips” actually have no effect, as you can see below:

So, for example, D_4 has 4 elements, corresponding to the identity, a 180 degree rotation, a flip, and a rotation + flip; but the symmetric group \mathcal{S}_2 only has two elements, in this case corresponding to the identity and a 180 degree rotation. The reason (n-1)!/2 doesn’t work, then, is that the division by two is superfluous: for n < 3, counting cycles doesn’t actually overcount bracelets, because every cycle is already a flipped version of itself. So it would also be correct (if rather baroque) to say that for n < 3 there are actually (n-1)! bracelets.

I find this fascinating; it’s almost as if for bigger n the dihedral symmetry has “enough room to breathe” whereas for small n it doesn’t have enough space and gets crushed and folded in on itself, causing weird things to happen. It makes me wonder whether there are other sorts of symmetry with a transition from irregularity to regularity at even bigger n. Probably this is an easy question for a group theorist to answer but I’ve never thought about it before.

Ordinary generating functions

The ordinary generating function (ogf) associated to a species F is defined by

\displaystyle \tilde F(x) = \sum_{n \geq 0} |F[n]/\mathord{\sim}| x^n

where \sim is the equivalence relation on F-structures induced by permuting the labels. That is, the coefficient of x^n is the number of equivalence classes of F-structures on n labels up to relabelling. There is only one unlabelled bracelet of any size n \geq 1, that is, any bracelet of size n can be transformed into any other just by switching labels around. The unique unlabelled bracelet of a given size can be visualized as a bracelet of uniform beads:

though it’s occasionally important to keep in mind the more formal definition as an equivalence class of labelled bracelets. Since there’s just one unlabelled bracelet of each size, the ogf for bracelets is rather boring:

\displaystyle \tilde B(x) = x + x^2 + x^3 + \dots = \frac{x}{x - 1}.

This is encoded in the species library too, and was used to compute unlabelled bracelets above.

egfs, ogfs, and homomorphisms

egfs are quite natural (in fact, species can be seen as a categorification of egfs), and the mapping from species to their associated egf is a homomorphism that preserves many operations such as sum, product, Cartesian product, composition, and derivative. ogfs, however, are not as nice. The mapping from species to ogfs preserves sum and product but does not, in general, preserve other operations like Cartesian product, composition or derivative. In some sense ogfs throw away too much information. Here’s a simple example to illustrate this: although the ogfs for bracelets and cycles are the same, namely, x/(1-x) (there is only one unlabelled bracelet or cycle of each size), the ogfs for binary bracelets and binary cycles are different:

ghci> -- recall biBracelets = bracelet >< (set * set)
ghci> let biCycles = cycles >< (set * set)
ghci> take 15 $ unlabelled biBracelets

ghci> take 15 $ unlabelled biCycles

(Puzzle: why are these the same up through n=5? Find the unique pair of distinct binary 6-cycles which are equivalent as bracelets.)

Clearly, there is no way to take equal ogfs, apply the same operation to both, and get different results out. So the species library cannot be working directly with ogfs in the example above—something else must be going on. That something else is cycle index series, which generalize both egfs and ogfs, and retain enough information that they once again preserve many of the operations we care about.

Cycle index series

Let \mathcal{S}_n denote the symmetric group of order n, that is, the group of permutations on \{1, \dots, n\} under composition. It is well-known that every permutation \sigma \in \mathcal{S}_n can be uniquely decomposed as a product of disjoint cycles. The cycle type of \sigma is the sequence of natural numbers \sigma_1, \sigma_2, \sigma_3, \dots where \sigma_i is the number of i-cycles in the cycle decomposition of \sigma. For example, the permutation (132)(45)(78)(6) has cycle type 1,2,1,0,0,0,\dots since it has one 1-cycle, two 2-cycles, and one 3-cycle.

For a species F and a permutation \sigma \in \mathcal{S}_n, let \mathrm{fix}\ F[\sigma] denote the number of F-structures that are fixed by the action of \sigma, that is,

\displaystyle \mathrm{fix}\ F[\sigma] = \#\{ f \in F[n] \mid F[\sigma] f = f \}.

The cycle index series of a combinatorial species F is a formal power series in an infinite set of variables x_1, x_2, x_3, \dots defined by

\displaystyle Z_F(x_1, x_2, x_3, \dots) = \sum_{n \geq 0} \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} \mathrm{fix}\ F[\sigma] x_1^{\sigma_1} x_2^{\sigma_2} x_3^{\sigma_3} \dots

We also sometimes write x^\sigma as an abbreviation for x_1^{\sigma_1} x_2^{\sigma_2} x_3^{\sigma_3} \dots. As a simple example, consider the species of lists, i.e. linear orderings. For each n, the identity permutation (with cycle type n,0,0,\dots) fixes all n! lists of length n, whereas all other permutations do not fix any lists. Therefore

\displaystyle Z_L(x_1, x_2, x_3, \dots) = \sum_{n \geq 0} \frac{1}{n!} n! x_1^n = \sum_{n \geq 0} x_1^n = \frac{1}{1 - x_1}.

(This is not really that great of an example, though—since lists are regular species, that is, they have no nontrivial symmetry, their cycle index series, egf, and ogf are all essentially the same.)

Cycle index series are linked to both egfs and ogfs by the identities

\displaystyle \begin{array}{rcl}F(x) &=& Z_F(x,0,0,\dots) \\ \tilde F(x) &=& Z_F(x,x^2,x^3, \dots)\end{array}

To show the first, note that setting all x_i to 0 other than x_1 means that the only terms that survive are terms with only x_1 raised to some power. These correspond to permutations with only 1-cycles, that is, identity permutations. Identity permutations fix all F-structures of a given size, so we have

\begin{array}{rcl}  Z_F(x,0,0,\dots) &=& \displaystyle \sum_{n \geq 0} \frac{1}{n!} \mathrm{fix}\ F[\mathit{id}] x^n \\  &=& \displaystyle \sum_{n \geq 0} |F[n]| \frac{x^n}{n!}. \end{array}

To prove the link to ogfs, note first that for any permutation \sigma \in \mathcal{S}_n with cycle type \sigma_1,\sigma_2,\sigma_3,\dots we have \sigma_1 + 2\sigma_2 + 3\sigma_3 + \dots = n. Thus:

\begin{array}{rcl}  Z_F(x,x^2,x^3,\dots) &=& \displaystyle \sum_{n \geq 0} \frac{1}{n!} \sum_{\sigma  \in \mathcal{S}_n} \mathrm{fix}\ F[\sigma]  x^{\sigma_1}x^{2\sigma_2}x^{3\sigma_3}\dots \\  &=& \displaystyle \sum_{n \geq 0} \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} \mathrm{fix}\ F[\sigma] x^n \\  &=& \displaystyle \sum_{n \geq 0} |F[n]/\mathord{\sim}| x^n \end{array}

where the final step is an application of Burnside’s Lemma.

The important point is that the mapping from species to cycle index series is again a homomorphism for many of the operations we care about, including Cartesian product and composition. So in order to compute an ogf for some species defined in terms of operations that are not compatible with ogfs, one can start out computing with cycle index series and then project down to an ogf at the end.

Cycle index series for bracelets

Let’s now see how to work out the cycle index series for the species of bracelets. For n=1, the single bracelet is fixed by the only element of \mathcal{S}_1, giving a term of x_1. For n=2, the single bracelet is fixed by both elements of \mathcal{S}_2, one of which has cycle type 2,0,0,\dots and the other 0,1,0,\dots. Bracelets of size n \geq 3, as discussed previously, have the dihedral group D_{2n} as their symmetry group. That is, every one of the (n-1)!/2 size-n bracelets is fixed by the action of each element of D_{2n}, and no bracelets are fixed by the action of any other permutation. Putting this all together, we obtain

\begin{array}{rcl} Z_B(x_1, x_2, x_3, \dots) &=& \displaystyle \sum_{n \geq 0}  \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} \mathrm{fix}\ B[\sigma]  x_1^{\sigma_1}x_2^{\sigma_2}x_3^{\sigma_3}\dots \\ &=& \displaystyle x_1 + \frac{1}{2}(x_1^2 + x_2) + \sum_{n \geq 3}  \frac{1}{n!} \sum_{\sigma \in D_{2n}} \frac{(n-1)!}{2} x^\sigma \\ &=& \displaystyle x_1 + \frac{1}{2}(x_1^2 + x_2) + \sum_{n \geq 3}  \frac{1}{2n} \sum_{\sigma \in D_{2n}} x^\sigma. \end{array}

Our remaining task is thus to compute \sum_{\sigma \in D_{2n}} x^\sigma, that is, to compute the cycle types of elements of D_{2n} for n \geq 3. I don’t know whether there’s a nice closed form for Z_B, but for our purposes it doesn’t matter: it suffices to come up with a finite algorithm to generate all its terms with their coefficients. A closed form might be important if we want to compute with Z_B symbolically, but if we just want to generate coefficients, an algorithm is good enough.

In general, D_{2n} has n elements corresponding to rotations (including the identity element, which we think of as a rotation by 0 degrees) and n elements corresponding to reflections across some axis. Below I’ve drawn illustrations showing the symmetries of bracelets of size 5 and 6; each symmetry corresponds to an element of D_{2n}.

The lines indicate reflections. You can see that in general there are n lines of reflection. The curved arrows indicate clockwise rotations; taking any number of consecutive arrows from 0 to n-1 gives a distinct rotational symmetry. Let’s label the rotations R_k (for k \in \{0, \dots, n-1\}), where R_k indicates a rotation by k/n of a turn (so R_0 is the identity element). We won’t bother labelling the reflections since it’s not clear how we would choose canonical names for them, and in any case (as we’ll see) we don’t have as much of a need to give them names as we do for the rotations. The only thing we will note is that for even n there are two distinct types of reflections, as illustrated by the dark and light blue lines on the right: the dark blue lines pass through two vertices, and the light blue ones pass through two edges. In the odd case, on the other hand, every line of reflection passes through one vertex and one edge. If you haven’t studied dihedral groups before, you might want to take a minute to convince yourself that this covers all the possible symmetries. It’s clear that a rotation followed by a rotation is again a rotation; what may be less intuitively clear is that a reflection followed by a reflection is a rotation, and that a rotation followed by a reflection is a reflection.

So the name of the game is to consider each group element as a permutation of the labels, and compute the cycle type of the permutation. Let’s tackle the reflections first; we have to separately consider the cases when n is odd and even. We saw above that when n = 2m+1 is odd, each line of reflection passes through exactly one vertex. As a permutation, that means the reflection will fix the label at the vertex it passes through, and swap the labels on other vertices in pairs, as shown in the leftmost diagram below:

So the permutation has cycle type 1,m,0,0,\dots. There is one 1-cycle, and the remaining n-1 = 2m elements are paired off in 2-cycles. There are n of these reflections in total, yielding a term of nx_1x_2^m (where m = \lfloor n/2 \rfloor).

When n = 2m is even, half of the reflections (the light blue ones) have no fixed points, as in the middle diagram above; they put everything in 2-cycles. The other half of the even reflections fix two vertices, with the rest in 2-cycles, as in the rightmost diagram above. In all, this yields terms mx_1^2x_2^{m-1} + mx_2^m.

Now let’s tackle the rotations. One could be forgiven for initially assuming that each rotation will just yield one big n-cycle… a rotation is just cycling the vertices, right? But it is a bit more subtle than that. Let’s look at some examples. In each example below, the green curved arrow indicates a rotation R_k applied to the bracelet. As you can check, the other arrows show the resulting permutation on the labels, that is, each arrow points from one node to the node where it ends up under the action of the rotation.

Do you see the pattern? In the case when k=1 (the first example above), or more generally when n and k are relatively prime (the second example above, with n=5 and k=2), R_k indeed generates a single n-cycle. But when n and k are not relatively prime, it generates multiple cycles. By symmetry the cycles must all be the same size; in general, the rotation R_k generates (n,k) cycles of size n/(n,k) (where (n,k) denotes the greatest common divisor of n and k). So, for example, 2 cycles are generated when n=6 and k=2 or k=4 (the next two examples above). The last example shows n=12 and k=3; we can see that three 4-cycles are generated. Note this even works when k=0: we have (n,0) = n, so we get n cycles of size n/n = 1, i.e. the identity permutation.

So R_k contributes a term x_{n/(n,k)}^{(n,k)}. However, we can say something a bit more concise than this. Note, for example, when n=12, as the contribution of all the R_k we get

x_1^{12} + x_{12} + x_6^2 + x_4^3 + x_3^4 + x_{12} + x_2^6 + x_{12} + x_3^4 + x_4^3 + x_6^2 + x_{12}

but we can collect like terms to get

x_1^{12} + x_2^6 + 2x_3^4 + 2x_4^3 + 2x_6^2 + 4x_{12}

For a given divisor d \mid n, the coefficient of x_{n/d}^d is the number of nonnegative integers less than n whose \gcd with n is equal to d. For example, the coefficient of x_{6}^2 is 2, since there are two values of k for which (n,k) = 12/6 = 2 and hence generate a six-cycle, namely, 2 and 10. So as the contribution of the R_k we could write something like

\displaystyle \sum_{d \mid n} \left( \#\{ 1 \leq k \leq n \mid (k,n) = d \} \right) x_{n/d}^d

but there is a better way. Note that

\displaystyle \#\{ 1 \leq k \leq n \mid (k,n) = d \} = \#\{ 1 \leq j \leq n/d \mid (j,n/d) = 1 \}

since multiplying and dividing by d establishes a bijection between the two sets. For example, we saw that 2 and 10 are the two numbers whose \gcd with 12 is 2; this corresponds to the fact that 2/2 = 1 and 10/2 = 5 are relatively prime to 12/2 = 6.

But counting relatively prime numbers is precisely what Euler’s totient function (usually written \varphi) does. So we can rewrite the coefficient of x_{n/d}^d as

\displaystyle \varphi(n/d) x_{n/d}^d.

Finally, since we are adding up these terms for all divisors d \mid n, we can swap d and n/d (divisors of n always come in pairs whose product is n), and rewrite this as

\displaystyle \varphi(d) x_d^{n/d}.

To sum up, then, we have for each n \geq 3:

  1. When n is odd, n x_1 x_2^{\lfloor n/2 \rfloor}.
  2. When n is even, (n/2) x_1^2 x_2^{n/2 - 1} + (n/2)x_2^{n/2}.
  3. For each d \mid n, we have \varphi(d) x_d^{n/d}.

The only overlap is between (2) and (3): when d = 2 both generate x_2^{n/2} terms. Using Iverson brackets (the notation [P] is equal to 1 if the predicate P is true, and 0 if it is false), we can thus write the sum of the above for a particular n as

\displaystyle [n\text{ odd}](n x_1 x_2^{\lfloor n/2 \rfloor}) + [n\text{ even}]\left(\frac n 2 x_1^2 x_2^{n/2 - 1}\right) + \sum_{d \mid n} \left(\varphi(d) + [d = 2]\frac n 2\right) x_d^{n/d}.

Substituting this for \displaystyle \sum_{\sigma \in D_{2n}} x^\sigma yields a full definition of Z_B. You can see the result encoded in the species library here. Here’s the beginning of the full expanded series:

ghci> :m +Math.Combinatorics.Species.Types
ghci> take 107 $ show (bracelets :: CycleIndex)
  "CI x1 + 1 % 2 x2 + 1 % 2 x1^2 + 1 % 3 x3 + 1 % 2 x1 x2 + 1 % 6 x1^3 + 1 % 4 x4 + 3 % 8 x2^2 + 1 % 4 x1^2 x2"

This, then, is how unlabelled biBracelets (for example) is calculated, where biBracelets = bracelet >< (set * set). The cycle index series for bracelet and set are combined according to the operations on cycle index series corresponding to * and ><, and then the resulting cycle index series is mapped down to an ogf by substituting x^k for each x_k.

Bracelet generation

The final thing to mention is how bracelet generation works. Of course we can’t really generate actual bracelets, but only lists. Since bracelets can be thought of as equivalence classes of lists (under rotation and reversal), the idea is to pick a canonical representative element of each equivalence class, and generate those. A natural candidate is the lexicographically smallest among all rotations and reversals (assuming the labels have an ordering; if they don’t we can pick an ordering arbitrarily). One easy solution would be to generate all possible lists and throw out the redundant ones, but that would be rather inefficient. It is surprisingly tricky to do this efficiently. Fortunately, there is a series of papers by Joe Sawada (Generating bracelets with fixed content; A fast algorithm to generate necklaces with fixed content; Generating bracelets in constant amortized time) describing (and proving correct) some efficient algorithms for generating things like cycles and bracelets. In fact, they are as efficient as possible, theoretically speaking: they do only O(1) work per cycle or bracelet generated. One problem is that the algorithms are very imperative, so they cannot be directly transcribed into Haskell. But I played around with benchmarking various formulations in Haskell and got it as fast as I could. (Interestingly, using STUArray was a lot slower in practice than a simple functional implementation, even though the imperative solution is asymptotically faster in theory—my functional implementation is at least O(\lg n) per bracelet, and quite possibly O(n), though since n is typically quite small it doesn’t really matter very much. Of course it’s also quite possible that there are tricks to make the array version go faster that I don’t know about.) The result is released in the multiset-comb package; you can see the bracelet generation code here.

Posted in combinatorics, haskell, math, species | Tagged , , | 5 Comments

Ally Skills Tutorial at ICFP

I just signed up for the Ally Skills Tutorial at ICFP, and if you are (1) a man who (2) will be at ICFP in Vancouver, I encourage you to sign up, too! From the website:

The Ally Skills Tutorial teaches men simple, everyday ways to support women in their workplaces and communities. Participants learn techniques that work at the office, in classrooms, at conferences, and online. The skills we teach are relevant everywhere, including skills particularly relevant to open technology and culture communities. At the end of the tutorial, participants will feel more confident in speaking up to support women, be more aware of the challenges facing women in their workplaces and communities, and have closer relationships with the other participants.

This sounds super helpful—I suspect there is often a large gap between the extent to which I want to support women and the extent to which I actually know, practically, how to do so. The workshop will be taught by Valerie Aurora, Linux filesystem developer and Ada Initiative co-founder; I expect it will be high quality!

Posted in haskell | Tagged , , , , | 2 Comments