Introducing diagrams-haddock

I am quite pleased to announce the release of diagrams-haddock, a tool enabling you to easily include programmatically generated diagrams in your Haddock documentation. Why might you want to do this? “A picture is worth a thousand words”—in many instances a diagram or illustration can dramatically increase the comprehension of users reading your library’s documentation. The diagrams project itself will be using this for documentation, beginning with the diagrams-contrib package (for example, check out the documentation for Diagrams.TwoD.Path.IteratedSubset). But inline images can benefit the documentation of just about any library.

Before jumping into a more detailed example, here are the main selling points of diagrams-haddock:

  1. You get to create arbitrary images to enhance your documentation, using the powerful diagrams framework.

  2. The code for your images goes right into your source files themselves, alongside the documentation—there is no need to maintain a bunch of auxiliary files, or (heaven forbid) multiple versions of your source files.

  3. Images are regenerated when, and only when, their definition changes—so you can include many diagrams in your documentation without having to recompile all of them every time you make a change to just one.

  4. You have to do a little bit of work to integrate the generated images into your Cabal package, but it’s relatively simple and you only have to do it once per package. No one else needs to have diagrams-haddock installed in order to build your documentation with the images (this includes Hackage).

So, how does it work? (For full details, consult the diagrams-haddock documentation.) Suppose we have some Haddock documentation that looks like this:

-- | The foozle function takes a widget and turns it into an
--   infinite list of widgets which alternate between red and
--   yellow.
foozle :: Widget -> [Widget]
foozle = ...

It would be really nice to illustrate this with a picture, don’t you think? First, we insert an image placeholder like so:

-- | The foozle function takes a widget and turns it into an
--   infinite list of widgets which alternate between red and
--   yellow.
--   <<dummy#diagram=foozleDia&width=300>>
foozle :: Widget -> [Widget]
foozle = ...

It doesn’t matter what we put in place of dummy; diagrams-haddock is going to shortly replace it anyway. The stuff following the # is a list of parameters to diagrams-haddock: we tell it to insert here an image built from the diagram called foozleDia, and that it should have a width of 300 pixels.

Now we just have to give a definition for foozleDia, which we do simply by creating a code block (set off with bird tracks) in a comment:

-- | The foozle function takes a widget and turns it into an
--   infinite list of widgets which alternate between red and
--   yellow.
--   <<dummy#diagram=foozleDia&width=300>>
foozle :: Widget -> [Widget]
foozle = ...

-- > widget =
-- >   (  stroke (circle 1.25 <> circle 0.75 # reversePath)
-- >   <> mconcat (iterateN 10 (rotateBy (1/10)) (square 0.5 # translateX 1.3))
-- >   )
-- >   # lw 0
-- >
-- > foozleDia =
-- >   hcat' with {sep = 2}
-- >   [ widget # fc black
-- >   , hrule 4 # alignR <> triangle 1 # rotateBy (-1/4) # fc black
-- >   , hcat' with {sep = 0.5} (zipWith fc (cycle [red, yellow]) (replicate 6 widget))
-- >   ]

Note that this definition for foozleDia isn’t in a Haddock comment, so it won’t be typeset in the Haddock output. (However, if you want users reading your documentation to see the code used to generate the pictures—as, e.g., we often do in the documentation for diagrams itself—it’s as simple as sticking the definitions in a Haddock comment.) It also doesn’t have to go right after the definition of foozle—for example, we could stick it all the way at the end of the source file if we didn’t want it cluttering up the code.

Now we simply run diagrams-haddock on our file (or on the whole Cabal project), and it will generate an appropriate SVG image and replace <<dummy#...>> with something like <<diagrams/foozleDia.svg#...>>. The Haddock documentation now displays something like

after the documentation for foozle. Hooray! Note that diagrams-haddock only replaces the stuff before the # (the clever bit is that browsers will ignore everything after the #). Running diagrams-haddock again at this point will do nothing. If we change the definition of foozleDia and then rerun diagrams-haddock, it will regenerate the image.

Okay, but how will others (or, for that matter, Hackage) be able to see the diagram for foozle when they build the documentation, without needing diagrams-haddock themselves? It’s actually fairly straightforward—we simply include the generated images in the source tarball, and tell cabal to copy the images in alongside the documentation when it is built, using either a custom Setup.hs, or (once it is released and sufficiently ubiquitous) the new extra-html-files: field in the .cabal file. The diagrams-haddock documentation has full details with step-by-step instructions.

I hope this silly example has piqued your interest; again, for full details please consult the diagrams-haddock documentation. Now go forth and illustrate your documentation!

Posted in diagrams, haskell, writing | Tagged , , , | 4 Comments

BlogLiterately 0.6

I’m very proud to announce the release of BlogLiterately version 0.6, a tool for formatting and uploading blog posts, including syntax highlighting, generation of ghci sessions, LaTeX support, automatic image uploading, and more.

tl;dr: Instead of cumbersomely specifying all options on the command-line, you can now specify options using a combination of “profiles” (e.g. for common sets of options such as blog URL and password) and options embedded within the .markdown or .lhs documents themselves (e.g. for post-specific options like title, tags, and categories).

There are a few other changes and improvements as well. For more information, see the documentation or keep reading below!

Specifying options

With previous releases, uploading a post usually went something like this:

BlogLiterately --blog " \
  --user me --password 1234567 --postid 9999 --title "My awesome post" \
  --tag tag1 --tag tag2 --tag tag3 --category Stuff \
  --category OtherStuff --ghci --wplatex

which is incredibly tedious and error-prone. Now we do things the Right Way ™. First, you can create one or more profiles, specifying a common set of options that can be referred to by name. For example, you might have a profile for a particular blog, or a profile for a particular type of post which always needs the same options. Suppose we put this in $HOME/.BlogLiterately/foo.cfg (or in something like C:/Documents And Settings/user/Application Data/BlogLiterately/foo.cfg on Windows):

blog        =
user        = me
password    = 1234567
wplatex     = true

Now the previous command line is reduced to

BlogLiterately -P foo --postid 9999 --title "My awesome post" \
  --tag tag1 --tag tag2 --tag tag3 --category Stuff \
  --category OtherStuff --ghci

which is already a big improvement! But it doesn’t stop there. The title, tags, categories, and other such things are really inherent to the post itself; there’s no reason they should go on the command line. So, we add this indented block somewhere in (probably near the top, though it doesn’t matter):

    profile    = foo
    postid     = 9999
    title      = "My awesome post"
    tags       = tag1, tag2, tag3
    categories = Stuff, OtherStuff
    ghci       = true

And now we only have to write


with no options on the command line at all! Notice how we can even specify which profile to use in the [BLOpts] block. When we’re satisfied with the post we can publish it with

BlogLiterately --publish

Generating HTML only

In the past, to get a “preview” version of the HTML output written to stdout, all you had to do was omit a --blog option. However, if you specify a profile with a blog field as in the above example, this is more problematic. For this reason, a new option --html-only has been added. When this option is specified, nothing will be uploaded, and the HTML output written to stdout.

Changes to Transforms

In order to make the above features possible, the definition of Transform has changed. This only affects those users who have created their own custom transformations. The definition used to be

data Transform
  = Transform
    { getTransform :: BlogLiterately -> Kleisli IO Pandoc Pandoc
    , xfCond       :: BlogLiterately -> Bool

that is, a Transform was a transformation on Pandoc documents, parameterized by an options record and able to have effects in the IO monad. The definition is now

data Transform
  = Transform
    { getTransform :: StateT (BlogLiterately, Pandoc) IO ()
    , xfCond       :: BlogLiterately -> Bool

meaning that a Transform is able to transform both a Pandoc document and the options record. This is crucial for being able to do things like embedding options within the document itself, because we don’t know all the options until we start processing the document! Also, I switched from using Kleisli arrows to using StateT, since I find it simpler to work with, especially now that multiple pieces of state are involved. For more information and help upgrading, see the documentation for Text.BlogLiterately.Transform.

Move to github

The other change is that I have moved the BlogLiterately repository from darcshub to github. In general, for small personal projects and miscellaneous sorts of things I use darcs and; for larger projects where I want to raise the visibility and encourage contributions from other users, I use github. At some point BlogLiterately crossed the line.

Learning more, and contacting me

For more information, see the full documentation. I’m always happy to receive comments, questions, feature requests, bug reports, and so on, via the bug tracker on github, IRC (byorgey on freenode), or email (the same as my IRC nick, at gmail).

Posted in haskell, writing | Tagged , , | Leave a comment

The Dawn of Software Engineering

The Dawn of Software Engineering: From Turing to Dijkstra
Edgar G. Daylight

Edgar sent me a review copy of his book a while back—it made for quite interesting reading and gave me new perspective on the historical origins of my field. I daresay many readers of this blog might be interested in giving it a read.

Alan Turing is widely regarded today as the father of digital computers. But as Daylight argues in this fascinating historical account of the development of computer programming as a discipline in the 1950s and 60s, the real story is much more complicated. Turing’s ideas didn’t actually have much influence on the building of the first computers themselves—but did gradually come to influence the practice of writing computer programs.

It will be interesting to compare and contrast this book with George Dyson’s book Turing’s Cathedral, which I have just begun reading—though I’m not far enough along to make any comparisons yet.

As an aside, I find it interesting that the subfield Dijkstra called “software engineering” —the subfield that was influenced by Turing’s ideas—really seems to comprise what is now known as “programming languages”. “Software engineering” now means something completely different, focusing on the business-oriented, large-scale aspects of building software systems. It’s difficult to imagine “software engineering” these days being influenced by Turing’s ideas (or abstract mathematical ideas, period—though perhaps I am being uncharitable).

Posted in haskell | 3 Comments

The algebra of species: primitives

[This is the fifth in a series of posts about combinatorial species. Previous posts: And now, back to your regularly scheduled combinatorial species; Decomposing data structures; Combinatorial species definition, Species definition clarification and exercises.]

Recall that a species is a functor from \mathbb{B}, the category of finite sets and bijections, to \mathbb{E},1 the category of finite sets and total functions. (Equivalently, species are endofunctors on \mathbb{B}, but in this post I’m going to want to think about them as the former.) That is, a species F is a mapping sending every set of labels U to a set of structures F[U], which also lifts relabelings \sigma : U \leftrightarrow V to functions F[\sigma] : U \to V in a way that respects the compositional structure of bijections.

However, as I hinted in a previous post, it’s inconvenient to work directly with this definition in practice. Instead, we use an algebraic theory that lets us compositionally build up certain species from a collection of primitive species and species operations. (It’s important to note that it does not allow us to build all species, but it does allow us to build many of the ones we care about.)

In this post we’ll begin by examining a few natural species to take as primitive.

  • The zero or empty species, denoted \mathbf{0}, is the unique species with no structures whatsoever; that is,

    \mathbf{0}[U] = \emptyset


    \mathbf{0}[\sigma : U \leftrightarrow V] = id_{\emptyset} : \mathbf{0}[U] \to \mathbf{0}[V].

    Of course, \mathbf{0} will turn out to be the identity element for species sum (which I’ll define in my next post, though it’s not hard to figure out what it should mean).

  • The unit species, denoted \mathbf{1}, is defined by

    \begin{array}{lcl}\mathbf{1}[\emptyset] &=& \{\star\} \\ \mathbf{1}[U] &=& \emptyset \qquad (U \neq \emptyset)\end{array}

    That is, there is a unique \mathbf{1}-structure indexed by the empty set of labels, and no \mathbf{1}-structures with any positive number of labels. \mathbf{1} lifts bijections in the obvious way, sending every bijection to the appropriate identity function.

    Some people initially find this definition surprising, expecting something like \mathbf{1}[U] = \{ \star \} for all U instead. That is indeed a valid species, and we will meet it below; but as I hope you will come to see, it doesn’t deserve the name \mathbf{1}.

    Of course we should also verify that this definition satisfies the requisite functoriality properties, which is not difficult.

    More abstractly, for those who know some category theory, it’s worth mentioning that \mathbf{1} can be defined as \mathbb{B}(\emptyset, -) : \mathbb{B} \to \mathbb{E}, that is, the covariant hom-functor sending each finite set U \in \mathbb{B} to the (finite) set of bijections \emptyset \leftrightarrow U. (This is why I wanted to think of species as functors \mathbb{B} \to \mathbb{E}. I learned this fact from Yeh (1986).) There is, of course, a unique bijection \emptyset \leftrightarrow \emptyset and no bijections \emptyset \leftrightarrow U for nonempty U, thus giving rise to the definition above.

    As you might expect, \mathbf{1} will be the identity element for species product. Like \mathbf{1} itself, species product isn’t defined quite as most people would initially guess. If you haven’t seen it before, you might like to try working out how product can be defined in order to make \mathbf{1} an identity element.

  • The singleton species, denoted \mathbf{X}, is defined by

    \mathbf{X}[U] = \begin{cases} U & |U| = 1 \\ \emptyset & \text{otherwise} \end{cases}

    with lifting of bijections defined in the evident manner. That is, there is a single \mathbf{X}-structure on a label set of size 1 (which we identify with the label itself, though we could have also defined \mathbf{X}[U] = \{\star\} when |U| = 1), and no \mathbf{X}-structures indexed by any other number of labels.

    As with \mathbf{1}, we may equivalently define \mathbf{X} as a hom-functor, namely \mathbf{X} = \mathbb{B}(\{\star\}, -).

    It’s worth noting again that although \mathbf{1} and \mathbf{X} do “case analysis” on the label set U, they actually only depend on the size of U; indeed, as we noted previously, by functoriality this is all they can do.

  • The species of bags2, denoted \mathbf{E}, is defined by

    \mathbf{E}[U] = \{U\},

    that is, there is a single \mathbf{E}-structure on any set of labels U, which we usually identify with the set of labels itself (although we could equivalently define \mathbf{E}[U] = \{\star\}). The idea is that an \mathbf{E}-structure consists solely of a collection of labels, with no imposed ordering whatsoever.

    If you want to abuse types slightly, one can define \mathbf{E} as a hom-functor too, namely \mathbb{E}(-,\{\star\}). (Yeh (1986) actually has \mathbb{B}(-, \{\star\}), but that’s wrong.)

As a summary, here’s a graphic showing \mathbf{0}-, \mathbf{1}-, \mathbf{X}-, and \mathbf{E}-structures arranged by size (i.e., the size of the underlying set of labels U): a dot indicates a single structure, and the size of the label set increases as you move to the right.

Just as a teaser, it turns out that \mathbf{X} and \mathbf{E} are identity elements for certain binary operations on species as well, though you’ll have to wait to find out which ones!

Next up, addition!


Yeh, Yeong-Nan. 1986. “The calculus of virtual species and K-species.” In Combinatoire énumérative, ed. Gilbert Labelle and Pierre Leroux, 1234:351–369. Springer Berlin Heidelberg.

  1. Last time I called this category \mathbf{FinSet}, but \mathbb{E} is more concise and matches the species literarure.

  2. The species literature calls this the species of sets, but that’s misleading to computer scientists, who expect the word “set” to imply that elements cannot be repeated.

Posted in math, species | Tagged , , , , , , | 6 Comments

Diagrams 0.6

I am pleased to announce the release of version 0.6 of diagrams, a full-featured framework and embedded domain-specific language for declarative drawing. Check out the gallery for examples of what it can do!

Highlights of this release include:

  • Diagrams now comes with a native-Haskell SVG backend by default. If you were holding off on trying diagrams because you couldn’t install cairo, you no longer have an excuse!

  • Proper support for subdiagrams: previous versions of diagrams-core had a mechanism for associating names with a pair of a location and an envelope. Now, names are associated with actual subdiagrams (including their location and envelope, along with all the other information stored by a diagram). This enables cool techniques like constructing a diagram in order to position its subelements and then taking it apart again, or constructing animations via keyframing.

  • Traces: in addition to an envelope, each diagram now stores a “trace”, which is like an embedded raytracer: given any ray (represented by a base point and a vector), the trace computes the closest point of intersection with the diagram along the ray. This is useful for determining points on the boundary of a diagram, e.g. when drawing arrows between diagrams.

  • The core data structure underlying diagrams has been completely refactored and split out into its own separate package, dual-tree.

  • Support for GHC 7.6.

  • Many more new features, bug fixes, and improvements! See the release notes for complete details, and the diagrams wiki for help migrating from 0.5 to 0.6.

Try it out

For the truly impatient:

cabal install diagrams

Diagrams is supported under GHC 7.0 through 7.6, with the exception that the cairo and gtk backends do not build under GHC 7.0 (but the SVG backend does), and the gtk backend does not build under GHC 7.6.

To get started with diagrams, read the quick tutorial, which will introduce you to the fundamentals of the framework.

For those who are less impatient and want to really dig in and use the power features, read the user manual.

Get involved

Subscribe to the project mailing list, and/or come hang out in the #diagrams IRC channel on for help and discussion. Make some diagrams. Fix some bugs. Submit your cool examples for inclusion in the gallery or your cool code for inclusion in the diagrams-contrib package!

Happy diagramming!

Brought to you by the diagrams team:

  • Michael Sloan
  • Ryan Yates
  • Brent Yorgey

with contributions from:

  • Sam Griffin
  • Niklas Haas
  • Peter Hall
  • Claude Heiland-Allen
  • Deepak Jois
  • John Lato
  • Felipe Lessa
  • Chris Mears
  • Ian Ross
  • Vilhelm Sjöberg
  • Jim Snavely
  • Luite Stegeman
  • Kanchalai Suveepattananont
  • Michael Thompson
  • Scott Walck
Posted in diagrams, haskell, projects | Tagged , , , | 6 Comments

Species definition clarification and exercises

[This is the fourth in a series of posts about combinatorial species. Previous posts: And now, back to your regularly scheduled combinatorial species; Decomposing data structures; Combinatorial species definition.]

In my previous post I neglected to mention something quite crucial, namely, that (at least for now) we are only talking about finite sets of labels and finite sets of structures. That is, we do not consider structures with infinitely many labels, and given a particular finite set of labels, there must be only a finite number of structures with those labels. The category \mathbb{B} is the category of finite sets and bijections, not of all sets as I said previously.

Of course, in practice we do want to think about infinite sets of structures and labels, especially in relation to a non-strict language like Haskell! But the theory was invented by mathematicians interested in counting things. I do intend to explore ways to extend the theory to encompass infinite structures, but for now we’ll stick to the finite.

Before moving on to talk about the algebraic approach to species, I also want to point out a few simple implications of the formal definition. Instead of spelling them out in detail, I will pose them as exercises: you can either take them on faith, or try working through the exercises to deepen your understanding.

  1. Let [n] denote the set \{0, \dots, n-1\}1, and write F[n] as an abbreviation for F[[n]], i.e. the application of the species F to the label set [n]. Show that given F[n] we can determine F[U] for any U with |U| = n.

(This shows that “size is all that matters”: in some sense species are really indexed not by sets of labels but by size.)

  1. [BLL2 Exercise 1.1.2] Show that we get an equivalent definition if we take species to be functors from \mathbb{B} to \mathbf{FinSet} (the category of finite sets and (total) functions) instead of endofunctors on \mathbb{B}.

(Apparently one can generalize the notion of species by replacing \mathbf{FinSet} with other categories, but at present I am not sure of the implications.)

  1. The species literature uses \{1, \dots, n\}, but (as every good computer scientist knows) counting ought to begin at zero.

  2. I will use “BLL” to refer to Bergeron, Labelle, and Leroux, Combinatorial Species and Tree-Like Structures (see the references at the end of the previous post for the full citation).

Posted in math, species | Tagged , , , | 2 Comments

Teaching abstraction

I’m just beginning to prepare for the third incarnation of CIS 194, Introduction to Haskell in the spring. It’s occasioned some general thoughts on teaching abstraction which seemed worth writing down.

Abstractions, of course, are everywhere in CS. By abstraction I mean anything with multiple “levels”, where upper levels hide details of lower levels. An operating system abstracts over the details of hardware; objects abstract over their internal state; functions abstract over their implementation; and so on. Let me just state up front my main thesis:

Students should be explicitly taught to think on multiple levels of abstraction.

Stated like that, perhaps it seems obvious; allow me to elaborate.

On the face of it, when learning about some abstraction, there are two things one can learn. One can learn how the abstraction is implemented (that is, the details of the lower level); and one can also learn how to use the abstraction (that is, how to work with the upper level while ignoring the details of the lower level).

In many cases, students already know one of the levels and learn about the other. For example, once they get around to learning about how compilers work, students already know how to write programs. Or perhaps they learn both levels as part of the same course, but at different times: for example, they might first learn about logic gates, and then later move on to talking about latches and adders.

It’s when teaching both levels at once that one must be extremely careful. Such a situation might arise, for example, when teaching a concept that is completely new or difficult to motivate—since it’s best to start with the concrete and move to the abstract, one might wish to begin with the lower level, but referencing the upper level is necessary to motivate the lower level in the first place. The problem with such a situation is that it is really easy for students to get the levels confused.

In particular, I have in mind the Applicative functor abstraction (and Monad as well). The last time teaching CIS 194, I didn’t do a very good job with this. I gave my students a homework assignment which had them first implement an Applicative interface to parser combinators, and then use it to write some parsers. Most students missed the point, however, and kept on using low-level implementation details when writing their parsers, instead of working in terms of the Applicative interface.

It’s tempting to just say “don’t do that”, i.e. don’t confuse students by teaching multiple levels at once! But I think the situation actually presents a great opportunity. In the “real world”, one seldom has the luxury of thinking on only one level at a time. Being able to explicitly switch between levels, and keep multiple levels straight, is an important real-world skill. I think the real solution is to explicitly teach students to think on multiple levels of abstraction: that is, be explicit about the fact that there are multiple levels, and teach them to consider carefully at each point which level they are thinking on, and why. I plan to do this in the spring and will report back on how it goes!

Posted in teaching | Tagged , , , | 8 Comments