Polynomial Functors Constrained by Regular Expressions

I’ve now finished revising the paper that Dan Piponi and I had accepted to MPC 2015; you can find a PDF here:

Polynomial Functors Constrained by Regular Expressions

Here’s the 2-minute version: certain operations or restrictions on functors can be described by regular expressions, where the elements of the alphabet correspond to type arguments. The idea is to restrict to only those structures for which an inorder traversal yields a sequence of types matching the regular expression. For example, (aa)^* gives you even-size things; a^*ha^* gives you the derivative (the structure has a bunch of values of type a, a single hole of type h, and then more values of type a), and b^*ha^* the dissection.
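To make the three examples concrete, here is a minimal Haskell sketch specialized to ordinary lists, probably the simplest functor with an obvious inorder traversal. The type and function names (EvenList, ListDeriv, ListDissect, evenToList, plug) are made up for illustration and are not taken from the paper.

```haskell
-- Lists constrained by (aa)^*: elements arrive in pairs, so the length is even.
data EvenList a = ENil | ECons a a (EvenList a)
  deriving (Show, Eq)

-- Lists constrained by a^* h a^*: some a's, exactly one hole value, more a's.
data ListDeriv h a = ListDeriv [a] h [a]
  deriving (Show, Eq)

-- Lists constrained by b^* h a^*: like the derivative, except that the
-- elements to the left of the hole have a different type b.
data ListDissect b h a = ListDissect [b] h [a]
  deriving (Show, Eq)

-- Forget the pairing and recover an ordinary list.
evenToList :: EvenList a -> [a]
evenToList ENil             = []
evenToList (ECons x y rest) = x : y : evenToList rest

-- Fill the hole with an ordinary element, giving back a plain list.
plug :: ListDeriv a a -> [a]
plug (ListDeriv before x after) = before ++ [x] ++ after
```

Every value of EvenList has even length by construction, which is the sense in which the regular expression acts as a constraint on the shape.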


The punchline is that we show how to use the machinery of semirings, finite automata, and some basic matrix algebra to automatically derive an algebraic description of any functor constrained by any regular expression. This gives a nice unified way to view differentiation and dissection; we also draw some connections to the theory of divided differences.

I’m still open to discussion, suggestions, typo fixes, etc., though at this point they won’t make it into the proceedings. There’s certainly a lot more that could be said or ways this could be extended further.

Posted in math, writing

Blogging again, & some major life events

It’s been a long time since I’ve written anything here; the blog was on hold while I was finishing my PhD and on the academic job market. Now that things have settled down a bit I plan to get back to blogging.

For starters, here are a few of the major events that have happened in the meantime, that readers of this blog might care about:

  • I successfully defended my PhD dissertation in October, officially graduated in December, and got an actual diploma in the mail a few weeks ago. I’ll be back in Philadelphia for the official graduation ceremony in May.
  • I accepted a tenure-track position at Hendrix College in Conway, Arkansas, and will be moving there this summer.
  • Dan Piponi and I had a paper accepted to MPC 2015. Here’s the github repo, and I plan to post a PDF copy here soon (once I get around to incorporating feedback from the reviewers). I look forward to seeing a bunch of folks (Volk?) in Königswinter this summer; I already have my plane tickets (CIU -> DTW -> AMS -> CGN, it’s a long story).
  • Work on diagrams continues strong (no thanks to me!), and we are aiming for a big new release soon—I will certainly post about that here as well.
Posted in diagrams, meta

Maniac week postmortem

My maniac week was a great success! First things first: here’s a time-lapse video[1] (I recommend watching it at the full size, 1280×720).

Some statistics[2]:

  • Total hours of productive work: 55.5 (74 pings)
  • Average hours of work per day[3]: 11
  • Average hours of sleep per night: 7.8 (52 pings over 5 nights)[4]
  • Total hours not working or sleeping: 27.75 (37 pings)
  • Average hours not working per day: 5.5
  • Pages of dissertation written: 24 (157 to 181)

[I was planning to also make a visualization of my TagTime data showing when I was sleeping, working, or not-working, but putting together the video and this blog post has taken long enough already! Perhaps I’ll get around to it later.]

Overall, I would call the experiment a huge success—although as you can see, I was a full 2.5 hours per day off my target of 13.5 hours of productive work each day. What with eating, showering, making lunch, getting dinner, taking breaks (both intentional breaks as well as slacking off), and a few miscellaneous things I had to take care of like taking the car to get the tire pressure adjusted… it all adds up surprisingly fast. I think this was one of the biggest revelations for me; going into it I thought 3 hours of not-work per day was extremely generous. I now think three hours of not-work per day is probably within reach for me but would be extremely difficult, and would probably require things like planning out meals ahead of time. In any case, 55 hours of actual, focused work is still fantastic.

Some random observations/thoughts:

  • Having multiple projects to work on was really valuable; when I got tired of working on one thing I could often just switch to something else instead of taking an actual break. I can imagine this might be different if I were working on a big coding project (as most of the other maniac weeks have been). The big project would itself provide multiple different subtasks to work on, but more importantly, coding provides immediate feedback that is really addictive. Code a new feature, and you can actually run the new code! And it does something cool! That it didn’t do before! In contrast, when I write another page of my dissertation I just have… another page of my dissertation. I am, in fact, relatively excited about my dissertation, but it can’t provide that same sort of immediate reinforcing feedback, and it was difficult to keep going at times.

  • I found that having music playing really helped me get into a state of “flow”. The first few days I would play some album and then it would stop and I wouldn’t think to put on more. Later in the week I would just queue up many hours of music at a time and that worked great.

  • I was definitely feeling worn out by the end of the week—the last two days in particular, it felt a lot harder to get into a flow. I think I felt so good the first few days that I became overconfident—which is good to keep in mind if I do this again. The evening of 12 August was particularly bad; I just couldn’t focus. It might have been better in the long run to just go home and read a book or something; I’m just not sure how to tell in the moment when I should push through and when it’s better to cut my losses.

  • Blocking Facebook, turning off email notifications, etc. was really helpful. I did end up allowing myself to check email using my phone (I edited the rules a few hours before I started) and I think it was a good idea—I ended up still needing to communicate with some people, so it was very convenient and not too distracting.

  • Note there are two places on Tuesday afternoon where you can see the clock jump ahead by an hour or so; of course those are times when I turned off the recording. One corresponded to a time when I needed to read and write some sensitive emails; during the other, I was putting student pictures into an anki deck, and turned off the recording to avoid running afoul of FERPA.

That’s all I can think of for now; questions or comments, of course, are welcome.

  1. Some technical notes (don’t try this at home; see http://expost.padm.us/maniactech for some recommendations on making your own timelapse). To record and create the video I used a homegrown concoction of scrot, streamer, ImageMagick, ffmpeg, with some zsh and Haskell scripts to tie it all together, and using diagrams to generate the clock and tag displays. I took about 3GB worth of raw screenshots, and it takes probably about a half hour to process all of it into a video.

  2. These statistics are according to TagTime, i.e. gathered via random sampling, so there is a bit of inherent uncertainty. I leave it as an exercise for the reader to calculate the proper error bars on these times (given that I use a standard ping interval of 45 minutes).

  3. Computed as 74/(171 – 9) pings multiplied by 24 hours; 9 pings occurred on Sunday morning which I did not count as part of the maniac week.

  4. This is somewhat inflated by Saturday night/Sunday morning, when I both slept in and got a higher-than-average number of pings; the average excluding that night is 6.75 hours, which sounds about right.
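The arithmetic behind these figures can be sketched in a few lines of Haskell. It assumes each ping stands for 45 minutes (the interval mentioned in note 2). The error estimate additionally assumes that ping counts are Poisson-distributed, so that a count of n has a standard deviation of roughly sqrt n; that modelling assumption is not stated anywhere above, so treat the error figure as a rough guide only. The function names are illustrative.

```haskell
-- Hours represented by a number of pings, assuming a 45-minute mean interval.
pingHours :: Int -> Double
pingHours n = fromIntegral n * 0.75

-- Average hours per day: the fraction of all counted pings, scaled to 24 hours.
hoursPerDay :: Int -> Int -> Double
hoursPerDay pings totalPings =
  fromIntegral pings / fromIntegral totalPings * 24

-- Rough one-sigma uncertainty in hours, ASSUMING Poisson-distributed counts
-- (standard deviation of a count n is about sqrt n).
pingHoursError :: Int -> Double
pingHoursError n = sqrt (fromIntegral n) * 0.75
```

For example, pingHours 74 gives the 55.5 hours of productive work, and hoursPerDay 74 (171 - 9) gives just under 11 hours per day, matching note 3.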

Posted in meta

Readers wanted!

tl;dr: Read a draft of my thesis and send me your feedback by September 9!

Over the past year I’ve had several people say things along the lines of, “let me know if you want me to read through your thesis”. I never took them all that seriously (it’s easy to say you are willing to read a 200-page document…), but it never hurts to ask, right?

My thesis defense is scheduled for October 14, and I’m currently undertaking a massive writing/editing push to try to get as much of it wrapped up as I can before classes start on September 4. So, if there’s anyone out there actually interested in reading a draft and giving feedback, now is your chance!

The basic idea of my dissertation is to put combinatorial species and related variants (including a port of the theory to HoTT) in a common categorical framework, and then be able to use them for working with/talking about data types. If you’re brave enough to read it, you’ll find lots of category theory and type theory, and very little code—but I can promise lots of examples and pretty pictures. I’ve tried to make it somewhat self-contained, so it may be a good way to learn a bit of category theory or homotopy type theory, if you’ve been curious to learn more about those topics.

You can find the latest draft here (auto-updated every time I commit); more generally, you can find the git repo here. If you notice any typos or grammatical errors, feel free to open a pull request. For anything more substantial—thoughts on the organization, notes or questions about things you found confusing, suggestions for improvement, pointers to other references—please send me an email (first initial last name at gmail). And finally, please send me any feedback by September 9 at the latest (but the earlier the better). I need to have a final version to my committee by September 23.

Last but not least, if you’re interested to read it but don’t have the time or inclination to provide feedback on a draft, never fear—I’ll post an announcement when the final version is ready for your perusal!

Posted in category theory, combinatorics, diagrams, grad school, math, species, writing

Maniac week

Inspired by Bethany Soule (and indirectly by Nick Winter, and also by the fact that my dissertation defense and the start of the semester are looming), I am planning a “maniac week” while Joyia and Noah will be at the beach with my family (I will join them just for the weekend). The idea is to eliminate as many distractions as possible and to do a ton of focused work. Publicly committing (like this) to a time frame, ground rules, and to putting up a time-lapse video of it afterwards are what actually make it work—if I don’t succeed I’ll have to admit it here on my blog; if I waste time on Facebook the whole internet will see it in the video; etc. (There’s actually no danger of wasting time on Facebook in particular since I have it blocked, but you get the idea.)

Here are the rules:

  • I will start at 6pm (or thereabouts) on Friday, August 8.
  • I will continue until 10pm on Wednesday, August 13, with the exception of the morning of Sunday, August 10 (until 2pm).
  • I will get at least 7.5 hours of sleep each night.
  • I will not eat cereal for any meal other than breakfast.
  • I will reserve 3 hours per day for things like showering, eating, and just plain resting.  Such things will be tracked by the TagTime tag “notwork”.
  • I will spend the remaining 13.5 hours per day working productively. Things that will count as productive work:
    • Working on my dissertation
    • Course prep for CS 354 (lecture and assignment planning, etc.) and CS 134 (reading through the textbook); making anki decks with names and faces for both courses
    • Updating my academic website (finish converting to Hakyll 4; add potential research and independent study topics for undergraduates)
    • Processing FogBugz tickets
    • I may work on other research or coding projects (e.g. diagrams) each day, but only after spending at least 7 hours on my dissertation.
  • I will not go on IRC at all during the week.  I will disable email notifications on my phone (but keep the phone around for TagTime), and close and block gmail in my browser.  I will also disable the program I use to check my UPenn email account.
  • For FogBugz tickets which require responding to emails, I will simply write the email in a text file and send it later.
  • I may read incoming email and write short replies on my phone, but will keep it to a bare minimum.
  • I will not read any RSS feeds during the week.  I will block feedly in my browser.
  • On August 18 I will post a time-lapse video of August 8-13.  I’ll probably also write a post-mortem blog post, if I feel like I have anything interesting to say.
  • I reserve the right to tweak these rules (by editing this post) up until August 8 at 6pm.  After that point it’s shut up and work time, and I cannot change the rules any more.

And no, I’m not crazy. You (yes, you) could do this too.

Posted in grad school, meta, writing


This is part four in a series of posts on avoiding the axiom of choice (part one, part two, part three).

In my previous post, we considered the “Axiom of Protoequivalence”—that is, the statement that every fully faithful, essentially surjective functor (i.e. every protoequivalence) is an equivalence—and I claimed that in a traditional setting this is equivalent to the axiom of choice. However, intuitively it feels like AP “ought to” be true, whereas AC must be rejected in constructive logic.

One way around this is by generalizing functors to anafunctors, which were introduced by Makkai (1996). The original paper is difficult going, since it is full of tons of detail, poorly typeset, and can only be downloaded as seven separate postscript files. There is also quite a lot of legitimate depth to the paper, which requires significant categorical sophistication (more than I possess) to fully understand. However, the basic ideas are not too hard to grok, and that’s what I will present here.

It’s important to note at the outset that anafunctors are much more than just a technical device enabling the Axiom of Protoequivalence. More generally, if everything in category theory is supposed to be done “up to isomorphism”, it is a bit suspect that functors have to be defined for objects on the nose. Anafunctors can be seen as a generalization of functors, where each object in the source category is sent not just to a single object, but to an entire isomorphism class of objects, without privileging any particular object in the class. In other words, anafunctors are functors whose “values are specified only up to unique isomorphism”.

Such functors represent a many-to-many relationship between objects of \mathbb{C} and objects of \mathbb{D}. Normal functors, as with any function, may of course map multiple objects of \mathbb{C} to the same object in \mathbb{D}. The novel aspect is the ability to have a single object of \mathbb{C} correspond to multiple objects of \mathbb{D}. The key idea is to add a class of “specifications” which mediate the relationship between objects in the source and target categories, in exactly the same way that a “junction table” must be added to support a many-to-many relationship in a database schema, as illustrated below:

On the left is a many-to-many relation between a set of shapes and a set of numbers. On the right, this relation has been mediated by a “junction table” containing a set of “specifications”—in this case, each specification is simply a pair of a shape and a number—together with two mappings (one-to-many relations) from the specifications to both of the original sets, such that a specification maps to a shape s and number n if and only if s and n were originally related.

In particular, an anafunctor F : \mathbb{C} \to \mathbb{D} is defined as follows.

  • There is a class S of specifications.
  • There are two functions \mathrm{Ob}\ \mathbb{C}  \stackrel{\overleftarrow{F}}{\longleftarrow} S  \stackrel{\overrightarrow{F}}{\longrightarrow} \mathrm{Ob}\ \mathbb{D} mapping specifications to objects of \mathbb{C} and \mathbb{D}.

S, \overleftarrow{F}, and \overrightarrow{F} together define a many-to-many relationship between objects of \mathbb{C} and objects of \mathbb{D}. D \in \mathbb{D} is called a specified value of F at C if there is some specification s \in S such that \overleftarrow{F}(s) = C and \overrightarrow{F}(s) = D, in which case we write F_s(C) = D. Moreover, D is a value of F at C (not necessarily a specified one) if there is some s for which D \cong F_s(C).
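The junction-table picture of S, \overleftarrow{F}, and \overrightarrow{F} can be sketched in Haskell for a small finite example. This is only a model of the many-to-many relation on objects: it ignores morphisms entirely, so it says nothing about whether the related values are actually isomorphic. The shapes, numbers, and function names are hypothetical example data.

```haskell
-- A finite, objects-only sketch of the "junction table" idea.
data Shape = Circle | Square | Triangle
  deriving (Show, Eq)

-- A specification is simply a (shape, number) pair.
type Spec = (Shape, Int)

-- Hypothetical example data: Circle is related to 1 and 2,
-- Square to 2, and Triangle to 3.
specs :: [Spec]
specs = [(Circle, 1), (Circle, 2), (Square, 2), (Triangle, 3)]

-- The two legs of the span: the left leg sends a specification to its
-- source object, the right leg sends it to its target object.
leftLeg :: Spec -> Shape
leftLeg = fst

rightLeg :: Spec -> Int
rightLeg = snd

-- All specified values at a given source object.
specifiedValues :: Shape -> [Int]
specifiedValues c = [ rightLeg s | s <- specs, leftLeg s == c ]

-- Every source object should have at least one specification,
-- i.e. the left leg should be surjective.
leftLegSurjective :: Bool
leftLegSurjective =
  all (not . null . specifiedValues) [Circle, Square, Triangle]
```

Here Circle has two specified values while Square and Circle share the value 2, which is exactly the many-to-many behaviour a plain function on objects cannot express.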

The idea now is to impose additional conditions which ensure that F “acts like” a regular functor \mathbb{C} \to \mathbb{D}.

  • Functors are defined on all objects; so we require each object of \mathbb{C} to have at least one specification s which corresponds to it—that is, \overleftarrow{F} must be surjective.
  • Functors transport morphisms as well as objects. For each s,t \in S (the middle of the below diagram) and each f : \overleftarrow{F}(s) \to \overleftarrow{F}(t) in \mathbb{C} (the left-hand side below), there must be a morphism F_{s,t}(f) : \overrightarrow{F}(s) \to \overrightarrow{F}(t) in \mathbb{D} (the right-hand side):

  • Functors preserve identities: for each s \in S we should have F_{s,s}(\mathit{id}_{\overleftarrow{F}(s)}) = \mathit{id}_{\overrightarrow{F}(s)}.
  • Finally, functors preserve composition: for all s,t,u \in S (in the middle below), f : \overleftarrow{F}(s) \to \overleftarrow{F}(t), and g : \overleftarrow{F}(t) \to \overleftarrow{F}(u) (the left side below), it must be the case that F_{s,u}(f ; g) = F_{s,t}(f) ; F_{t,u}(g):

Our initial intuition was that an anafunctor should map objects of \mathbb{C} to isomorphism classes of objects in \mathbb{D}. This may not be immediately apparent from the definition, but is in fact the case. In particular, the identity morphism \mathit{id}_C maps to isomorphisms between specified values of C; that is, under the action of an anafunctor, an object C together with its identity morphism “blow up” into an isomorphism class (aka a clique). To see this, let s,t \in S be two different specifications corresponding to C, that is, \overleftarrow{F}(s) = \overleftarrow{F}(t) = C. Then by preservation of composition and identities, we have F_{s,t}(\mathit{id}_C) ; F_{t,s}(\mathit{id}_C) = F_{s,s}(\mathit{id}_C ; \mathit{id}_C) = F_{s,s}(\mathit{id}_C) = \mathit{id}_{\overrightarrow{F}(s)}, and symmetrically F_{t,s}(\mathit{id}_C) ; F_{s,t}(\mathit{id}_C) = \mathit{id}_{\overrightarrow{F}(t)}, so F_{s,t}(\mathit{id}_C) and F_{t,s}(\mathit{id}_C) constitute an isomorphism between F_s(C) and F_t(C).

There is an alternative, equivalent definition of anafunctors, which is somewhat less intuitive but usually more convenient to work with: an anafunctor F : \mathbb{C} \to \mathbb{D} is a category of specifications \mathbb{S} together with a span of functors \mathbb{C} \stackrel{\overleftarrow{F}}{\longleftarrow} \mathbb{S} \stackrel{\overrightarrow{F}}{\longrightarrow} \mathbb{D} where \overleftarrow{F} is fully faithful and (strictly) surjective on objects.

Note that in this definition, \overleftarrow{F} must be strictly (as opposed to essentially) surjective on objects, that is, for every C \in \mathbb{C} there is some S \in \mathbb{S} such that \overleftarrow{F}(S) = C, rather than only requiring \overleftarrow{F}(S) \cong C. Given this strict surjectivity on objects, it is equivalent to require \overleftarrow F to be full, as in the definition above, or to be (strictly) surjective on the class of all morphisms.

We are punning on notation a bit here: in the original definition of anafunctor, S is a set and \overleftarrow{F} and \overrightarrow{F} are functions on objects, whereas in this more abstract definition \mathbb{S} is a category and \overleftarrow{F} and \overrightarrow{F} are functors. Of course, the two are closely related: given a span of functors \mathbb{C} \stackrel{\overleftarrow{F}}{\longleftarrow} \mathbb{S} \stackrel{\overrightarrow{F}}{\longrightarrow} \mathbb{D}, we may simply take the objects of \mathbb{S} as the class of specifications S, and the actions of the functors \overleftarrow{F} and \overrightarrow{F} on objects as the functions from specifications to objects of \mathbb{C} and \mathbb{D}. Conversely, given a class of specifications S and functions \overleftarrow{F} and \overrightarrow{F}, we may construct the category \mathbb{S} with \mathrm{Ob}\ \mathbb{S} = S and with morphisms \overleftarrow{F}(s) \to \overleftarrow{F}(t) in \mathbb{C} acting as morphisms s \to t in \mathbb{S}. From \mathbb{S} to \mathbb{C}, we construct the functor given by \overleftarrow{F} on objects and the identity on morphisms, and the other functor maps f : s \to t in \mathbb{S} to F_{s,t}(f) : \overrightarrow{F}(s) \to \overrightarrow{F}(t) in \mathbb{D}.

Every functor F : \mathbb{C} \to \mathbb{D} can be trivially turned into an anafunctor \mathbb{C} \stackrel{\mathit{Id}}{\longleftarrow} \mathbb{C} \stackrel{F}{\longrightarrow} \mathbb{D}. Anafunctors also compose. Given compatible anafunctors F : \mathbb{C} \stackrel{\overleftarrow F}{\longleftarrow} S \stackrel{\overrightarrow F}{\longrightarrow} \mathbb{D} and G : \mathbb{D} \stackrel{\overleftarrow G}{\longleftarrow} T \stackrel{\overrightarrow G}{\longrightarrow} \mathbb{E}, consider the action of their composite on objects: each object of \mathbb{C} may map to multiple objects of \mathbb{E}, via objects of \mathbb{D}. Each such mapping corresponds to a zig-zag path C \longleftarrow s \longrightarrow D \longleftarrow t \longrightarrow E. In order to specify such a path it suffices to give the pair (s,t), which determines C, D, and E. Note, however, that not every pair in S \times T corresponds to a valid path, but only those which agree on the middle object D \in \mathbb{D}. Thus, we may take \{ (s,t) \mid s \in S, t \in T, \overrightarrow{F}(s) = \overleftarrow{G}(t) \} as the set of specifications for the composite F ; G, with \overleftarrow{F ; G}(s,t) = \overleftarrow{F}(s) and \overrightarrow{F ; G}(s,t) = \overrightarrow{G}(t). On morphisms, (F ; G)_{(s,t),(u,v)}(f) = G_{t,v}(F_{s,u}(f)). It is not hard to check that this satisfies the anafunctor laws.
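The object part of this composite is easy to sketch in Haskell when there are finitely many specifications. The record type AnaOb and the two example values are hypothetical; the sketch tracks only specifications and their source and target objects, not the action on morphisms or the anafunctor laws.

```haskell
-- Objects-only data of an anafunctor with finitely many specifications.
data AnaOb s c d = AnaOb
  { specsOf :: [s]      -- the specifications
  , leftOb  :: s -> c   -- source object of a specification
  , rightOb :: s -> d   -- target object of a specification
  }

-- Composite on objects: keep only the pairs (s, t) that agree on the
-- middle object, i.e. rightOb f s == leftOb g t.
composeOb :: Eq d => AnaOb s c d -> AnaOb t d e -> AnaOb (s, t) c e
composeOb f g = AnaOb
  { specsOf = [ (s, t) | s <- specsOf f, t <- specsOf g
                       , rightOb f s == leftOb g t ]
  , leftOb  = leftOb f . fst
  , rightOb = rightOb g . snd
  }

-- Hypothetical example data.
exampleF :: AnaOb Int Char Bool
exampleF = AnaOb
  { specsOf = [0, 1, 2]
  , leftOb  = \s -> if s < 2 then 'x' else 'y'
  , rightOb = even
  }

exampleG :: AnaOb String Bool Int
exampleG = AnaOb
  { specsOf = ["t", "ff"]
  , leftOb  = \t -> t == "t"
  , rightOb = length
  }
```

In this example the composite keeps three of the six possible pairs, namely those where the Bool in the middle matches.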

If you know what a pullback is, note that the same thing can also be defined at a higher level in terms of spans. \mathbf{Cat}, the category of all (small) categories, is complete, and in particular has pullbacks, so we may construct a new anafunctor from \mathbb{C} to \mathbb{E} by taking a pullback of \overrightarrow F and \overleftarrow G and then composing appropriately.

One can go on to define ananatural transformations between anafunctors, and show that together these constitute a 2-category \mathbf{AnaCat} which is analogous to the usual 2-category of (small) categories, functors, and natural transformations; in particular, there is a fully faithful embedding of \mathbf{Cat} into \mathbf{AnaCat}, which moreover is an equivalence if AC holds.

To work in category theory based on set theory and classical logic, while avoiding AC, one is therefore justified in “mixing and matching” functors and anafunctors as convenient, but discussing them all as if they were regular functors (except when defining a particular anafunctor). Such usage can be formalized by turning everything into an anafunctor, and translating functor operations and properties into corresponding operations and properties of anafunctors.

However, as I will argue in some future posts, there is a better solution, which is to throw out set theory as a foundation of category theory and start over with homotopy type theory. In that case, thanks to a generalized notion of equality, regular functors act like anafunctors, and in particular AP holds.


Makkai, Michael. 1996. “Avoiding the Axiom of Choice in General Category Theory.” Journal of Pure and Applied Algebra 108 (2). Elsevier: 109–73.

Posted in category theory, math, species

AC and equivalence of categories

This is part three in a series of posts on avoiding the axiom of choice (part one, part two).

In my previous post, I explained one place where the axiom of choice often shows up in category theory, namely, when defining certain functors whose action on objects is specified only up to unique isomorphism. In this post, I’ll explain another place AC shows up, when talking about equivalence of categories. (Actually, as we’ll see, it’s really the same underlying issue of defining a functor whose values are specified only up to unique isomorphism; this is just a particularly important instantiation of that issue.)

When are two categories “the same”? In traditional category theory, founded on set theory, there are quite a few different definitions of “sameness” for categories. Ultimately, this comes down to the fact that set theory does not make a very good foundation for category theory! There are lots of different ideas of equivalence, and they often do not correspond to the underlying equality on sets, so one must carefully pick and choose which notions of equality to use in which situations (and some choices might be better than others!). Every concept, it seems, comes with “strict” and “weak” variants, and often many others besides. Maintaining the principle of equivalence requires hard work and vigilance.

As an example, consider the following definition, our first candidate for the definition of “sameness” of categories:

Two categories \mathbb{C} and \mathbb{D} are isomorphic if there are functors F : \mathbb{C} \to \mathbb{D} and G : \mathbb{D} \to \mathbb{C} such that GF = 1_\mathbb{C} and FG = 1_\mathbb{D}.

Seems pretty straightforward, right? Well, this is the right idea in general, but it is subtly flawed. In fact, it is somewhat “evil”, in that it talks about equality of functors (GF and FG must be equal to the identity). However, two functors H and J can be isomorphic without being equal, if there is a natural isomorphism between them—that is, a pair of natural transformations \phi : H \to J and \psi : J \to H such that \phi \circ \psi and \psi \circ \phi are both equal to the identity natural transformation.[1] For example, consider the Haskell functors given by

data Rose a = Node a [Rose a]
data Fork a = Leaf a | Fork (Fork a) (Fork a)

These are obviously not equal, but they are isomorphic, in the sense that there are natural transformations (i.e. polymorphic functions) rose2fork :: forall a. Rose a -> Fork a and fork2rose :: forall a. Fork a -> Rose a such that rose2fork . fork2rose === id and fork2rose . rose2fork === id (showing this is left as an exercise for the interested reader).
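For readers who would rather check their answer, here is one possible pair of witnesses, offered as a sketch rather than as the only solution. The idea is to peel off the first child of a rose tree as the left branch of a fork and keep the rest of the node as the right branch. The deriving clauses are added only so that results can be compared and printed.

```haskell
data Rose a = Node a [Rose a]
  deriving (Show, Eq)

data Fork a = Leaf a | Fork (Fork a) (Fork a)
  deriving (Show, Eq)

-- A node with no children becomes a leaf; otherwise the first child goes
-- to the left branch and the remainder of the node to the right branch.
rose2fork :: Rose a -> Fork a
rose2fork (Node x [])       = Leaf x
rose2fork (Node x (t : ts)) = Fork (rose2fork t) (rose2fork (Node x ts))

-- Inverse direction: convert the right branch back to a node, then
-- reattach the converted left branch as its first child.
fork2rose :: Fork a -> Rose a
fork2rose (Leaf x)   = Node x []
fork2rose (Fork l r) =
  case fork2rose r of
    Node x ts -> Node x (fork2rose l : ts)
```

Spot-checking round trips on a few values is of course no substitute for the proof, which goes by induction on the structure of the tree in each direction.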

Here, then, is a better definition:

Categories \mathbb{C} and \mathbb{D} are equivalent if there are functors F : \mathbb{C} \to \mathbb{D} and G : \mathbb{D} \to \mathbb{C} which are inverse up to natural isomorphism, that is, there are natural isomorphisms GF \cong 1_\mathbb{C} and FG \cong 1_\mathbb{D}.

So the compositions of the functors F and G do not literally have to be the identity functor, but only (naturally) isomorphic to it. This does turn out to be a well-behaved notion of sameness for categories (although you’ll have to take my word for it).

The story doesn’t end here, however. In set theory, a function is a bijection—that is, an isomorphism of sets—if and only if it is both injective and surjective. By analogy, one might wonder what properties a functor F : \mathbb{C} \to \mathbb{D} must have in order to be one half of an equivalence. This leads to the following definition:

\mathbb{C} is proto-equivalent[2] to \mathbb{D} if there is a functor F : \mathbb{C} \to \mathbb{D} which is full and faithful (i.e., a bijection on each hom-set) as well as essentially surjective, that is, for every object D \in \mathbb{D} there exists some object C \in \mathbb{C} such that F\ C \cong D.

Intuitively, this says that F “embeds” an entire copy of \mathbb{C} into \mathbb{D} (that’s the “full and faithful” part), and that every object of \mathbb{D} which is not directly in the image of F is isomorphic to one that is. So every object of \mathbb{D} is “included” in the image of \mathbb{C}, at least up to isomorphism (which, remember, is supposed to be all that matters).

So, are equivalence and protoequivalence the same thing? In one direction, it is not too hard to show that every equivalence is a protoequivalence: if F and G are inverse-up-to-natural-isomorphism, then they must be fully faithful and essentially surjective. It would be nice if the converse were also true: in that case, in order to prove two categories equivalent, it would suffice to construct a single functor F from one to the other, and show that F has the requisite properties. This often ends up being more convenient than explicitly constructing two functors and showing they are inverse. However, it turns out that the converse is provable only if one accepts the axiom of choice!

To get an intuitive sense for why this is, suppose F : \mathbb{C} \to \mathbb{D} is fully faithful and essentially surjective. To construct an equivalence between \mathbb{C} and \mathbb{D}, we must define a functor G : \mathbb{D} \to \mathbb{C} and show it is inverse to F (up to natural isomorphism). However, to define G we must give its action on each object D \in \mathbb{D}, that is, we must exhibit a function \mathrm{Ob}\ \mathbb{D} \to \mathrm{Ob}\ \mathbb{C}. We know that for each D \in \mathbb{D} there exists some object C \in \mathbb{C} such that F\ C \cong D. That is,

\{ \{ C \in \mathbb{C} \mid F\ C \cong D \} \mid D \in \mathbb{D} \}

is a collection of non-empty sets. However, in a non-constructive logic, knowing these sets are nonempty does not actually give us any objects! Instead, we have to use the axiom of choice, which gives us a choice function \mathrm{Ob}\ \mathbb{D} \to \mathrm{Ob}\ \mathbb{C}, and we can use this function as the object mapping of the functor G.

So AC is required to prove that every protoequivalence is an equivalence. In fact, the association goes deeper yet: it turns out that the statement “every protoequivalence is an equivalence” (let’s call this the Axiom of Protoequivalence, or AP for short) not only requires AC, but is equivalent to it—that is, you can also derive AC given AP as an axiom!

On purely intuitive grounds, however, I would wager that to (almost?) anyone with sufficient category theory experience, it “feels” like AP “ought to be” true. If there is a full, faithful, and essentially surjective functor F : \mathbb{C} \to \mathbb{D}, then \mathbb{C} and \mathbb{D} “ought to be” equivalent. The particular choice of functor G : \mathbb{D} \to \mathbb{C} “doesn’t matter”, since it makes no difference up to isomorphism. On the other hand, we certainly don’t want to accept the axiom of choice. This puts us in the very awkward and inconsistent position of having two logically equivalent statements which we want to respectively affirm and reject. A fine pickle indeed! What to do?

There are four options (that I know of, at least):

  1. If one is feeling particularly rational, one can simply say, “Well, since AC and AP are equivalent, and I reject AC, I must therefore reject AP as well; my feelings about it are irrelevant.”

This is a perfectly sensible and workable approach. It’s important to highlight, therefore, that the “problem” is in some sense more a philosophical problem than a technical one. One can perfectly well adopt the above solution and continue to do category theory; it just may not be the “nicest” (a philosophical rather than technical notion!) way to do it.

We can therefore also consider some more creative solutions!

  2. In a classical setting, one can avoid AC and affirm (an analogue of) AP by generalizing the notion of functor to that of anafunctor (Makkai 1996). Essentially, an anafunctor is a functor “defined only up to unique isomorphism”. It turns out that the appropriate analogue of AP, where “functor” has been replaced by “anafunctor”, is indeed true—and neither requires nor implies AC. Anafunctors “act like” functors in a sufficiently strong sense that one can simply do category theory using anafunctors in place of functors. However, one also has to replace natural transformations with “ananatural transformations”, etc., and it quickly gets rather fiddly.

  3. In a constructive setting, a witness of essential surjectivity is necessarily a function which gives an actual witness C \in \mathbb{C}, along with a proof that F\ C \cong D, for each D \in \mathbb{D}. In other words, a constructive witness of essential surjectivity is already a “choice function”, and an inverse functor G can be defined directly, with no need to invoke AC and no need for anafunctors. So in constructive logic, AP is simply true. However, this version of “essential surjectivity” is rather strong, in that it forces you to make choices you might prefer not to make: for each D \in \mathbb{D} there might be many isomorphic C \in \mathbb{C} to choose from, with no “canonical” choice, and it is annoying (again, a philosophical rather than technical consideration!) to be forced to choose one.

  4. Instead of generalizing functors, a more direct solution is to generalize the notion of equality. After all, what really seems to be at the heart of all these problems is differing notions of equality (i.e. equality of sets vs isomorphism vs equivalence…). This is precisely what is done in homotopy type theory (Univalent Foundations Program 2013).[3] It turns out that if one builds up suitable notions of category theory on top of HoTT instead of set theory, then (a) AP is true, (b) without the need for AC, (c) even with a weaker version of essential surjectivity that corresponds more closely to essential surjectivity in classical logic.[4] This is explained in Chapter 9 of the HoTT book.
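The difference between the strong, constructive version of essential surjectivity and the weaker version available in HoTT can be written out in type-theoretic notation. The rendering below is a sketch under the usual reading of \prod as “for all”, \sum as a dependent pair, and \| \cdot \| as propositional truncation; it is not a quotation from the HoTT book.

```latex
% Strong (constructive) version: a function that picks, for each D,
% a specific object C together with an isomorphism F C \cong D.
\prod_{D \in \mathbb{D}} \; \sum_{C \in \mathbb{C}} \; (F\ C \cong D)

% Weaker version: for each D, such a C merely exists; the truncation
% hides which C was used, so no choice is forced.
\prod_{D \in \mathbb{D}} \; \Big\| \sum_{C \in \mathbb{C}} (F\ C \cong D) \Big\|
```

An inhabitant of the first type is already a choice function, while the second only asserts existence pointwise, much like the classical “there exists”.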

I plan to continue writing about these things in upcoming posts, particularly items (2) and (4) above. (If you haven’t caught on by now, I’m essentially blogging parts of my dissertation; we’ll see how far I get before graduating!) In the meantime, feedback and discussion are very welcome!


Makkai, Michael. 1996. “Avoiding the Axiom of Choice in General Category Theory.” Journal of Pure and Applied Algebra 108 (2). Elsevier: 109–73.

Univalent Foundations Program, The. 2013. Homotopy Type Theory: Univalent Foundations of Mathematics. Institute for Advanced Study: http://homotopytypetheory.org/book.

  1. The astute reader may well ask: but how do we know this is a non-evil definition of isomorphism between functors? Is it turtles all the way down (up)? This is a subtle point, but it turns out that it is not evil to talk about equality of natural transformations, since for the usual notion of category there is no higher structure after natural transformations, i.e. no nontrivial morphisms (and hence no nontrivial isomorphisms) between natural transformations. (However, you can have turtles all the way up if you really want.)

  2. I made this term up, since there is no term in standard use: of course, if you accept AC, there is no need for a separate term at all!

  3. As a historical note, it seems that the original work on anafunctors is part of the same intellectual thread that led to the development of HoTT.

  4. That is, using propositional truncation to encode the classical notion of “there exists”.

Posted in category theory, math, species