## Virtual species suffice

Over six years ago, I wrote a post explaining how virtual species are defined. Ever since then (time flies!) I’ve been meaning to write a follow-up post explaining a bit more about virtual species and how they actually suffice to give us not just additive inverses, but also (somewhat surprisingly) multiplicative inverses.

Recall that the intuitive idea of a combinatorial species is a family of labelled structures which are invariant under relabelling. If you’ve never seen the formal definition before, don’t worry: just think “data structures” or “algebraic data types” for now.

The basic idea of virtual species is to work with pairs of species $(P,N)$ where $P$ is considered “positive” and $N$ “negative”. Formally, we consider equivalence classes of such pairs under the equivalence relation defined by $(P_1, N_1) \cong (P_2, N_2)$ iff $P_1 + N_2 = P_2 + N_1$.1 This parallels the way one typically gives a formal definition of the integers starting from the natural numbers (the “Grothendieck construction”); see my previous post for more details.

# Intuition

How can we build intuition for virtual species, and for additive inverses of species in particular? To be honest I have been struggling with this question for many years.

Multiplicative inverses are much simpler to think about: they are like matter and antimatter. Having both an $F$-structure and an $F^{-1}$ structure is the same as having nothing; they annihilate each other. By “having nothing” we mean “having no information”, that is, having a unit value: $F F^{-1} = 1$.

What about additive inverses? Note first that the $0$ species does not correspond to having nothing; the word “nothing” corresponds to the $1$ (i.e. unit) species. Instead the $0$ (i.e. uninhabited) species corresponds to (logical) impossibility. So to interpret $-F$ we have to imagine something where having either $F$ or $-F$ is impossible.

…yeah, me neither. This seems deeply strange. If someone says, “I either have an $F$ or a $-F$”, you can confidently call them a liar, because it is impossible to have either an $F$ or a $-F$; that is, $F - F = 0$. But surely if you actually have an $F$-structure, it should also be true to say “I have either an $F$ or a $G$”? Well, that works for normal, positive species—in which case we can define a canonical injection $F \to F + G$. But once we introduce negative species this completely breaks down. As another example, if someone truthfully says, “I have either a tree or a negative non-empty tree”, you should be able to say, “Aha! I know what you have—it must be an empty tree.” In general, it’s strange that expressing a disjunction can cause some possibilities to be ruled out. Normally, we are used to disjunctions only increasing the number of possibilities.

Inspired by James and Sabry’s really cool paper The Two Dualities of Computation: Negative and Fractional Types, I have thought a bit about whether there is some plausible interpretation involving travelling backwards in time, but I haven’t been able to come up with one. I can’t quite seem to make the formalism of the paper match up with my intuition about species (though this may just be a failure of my imagination or intuition).

# Multiplicative Inverses

In any case, let’s see why the ring of virtual species actually has multiplicative inverses—at least, all the ones we could possibly hope for. This is somewhat surprising, since when we build integers from natural numbers by considering equivalence classes of pairs, we certainly don’t get any multiplicative inverses, only additive ones. To get multiplicative inverses we have to do the same process a second time, building the rational numbers as equivalence classes of pairs of integers. But species already have enough extra structure that throwing in additive inverses is all it takes.

First, a caveat: we don’t get multiplicative inverses for all species, but only those species $G$ such that $G(0) = 1$: that is, species $G$ with only a single structure of size zero, which are of the form $G = 1 + X(\dots)$. With any constant term other than $1$, we clearly have no hope of finding another species $H$ such that $GH = 1$, since the constant term of $GH$ will be a multiple of $G$’s constant term.

So given such a $G$, write $G = 1 + G_+$, where $G_+$ denotes “non-empty $G$-structures”. Then we can define the multiplicative inverse of $G$ as follows:

$\displaystyle G^{-1} = \sum_{k \geq 0} (-1)^k (G_+)^k = 1 - G_+ + G_+^2 - G_+^3 + \dots$

That is, a $G^{-1}$-structure consists of a list of nonempty $G$-structures, except that even-length lists are considered “positive” and odd-length lists considered “negative”.

We can easily check that this indeed defines a multiplicative inverse for $G$:

$\displaystyle \begin{array}{rcl}G G^{-1} &=& (1 + G_+) (1 - G_+ + G_+^2 - G_+^3 + \dots) \\[0.5em] &=& (1 - G_+ + G_+^2 - G_+^3 + \dots) + (G_+ - G_+^2 + G_+^3 - G_+^4 + \dots) \\[0.5em] &=& 1 \end{array}$

The infinite sums telescope down to leave only $1$. Notice this really isn’t about species in particular, but really about infinite power series (of which species are the categorification): any infinite power series with integer coefficients and a constant term of $1$ has a multiplicative inverse which is also such a power series.

As an example, consider $1/(1-X) = (1-X)^{-1}$. We know this is “supposed” to be the species of lists (since it results from solving $L = 1 + XL$ for $L$), but let’s see what happens. In this case $G = 1-X$ and $G_+ = -X$. So the inverse ought to be

$\displaystyle (1-X)^{-1} = \sum_{k \geq 0} (-1)^k (-X)^k = \sum_{k \geq 0} X^k = 1 + X + X^2 + X^3 + \dots$

And hey, look at that! Lists!

# A field of species?

So what would we need to get a true field, i.e. a multiplicative inverse for every nonzero species? Well, for that we would need to throw in rational coefficients. I forget exactly where I read this—some paper by Baez and Dolan, most likely—but I believe the proper way to interpret this would be as groupoid-valued species, since there is a sense in which the “cardinality” of groupoids can be interpreted as rational numbers. But to be honest I have no idea where this leads.

1. Note that species sum is cancellative—that is, if $A + C = B + C$ then $A = B$—so this is a coherent definition. This cancellative property is probably worth another post of its own since the reason for it is not entirely trivial.

Posted in combinatorics | Tagged , | 23 Comments

## Advent of code #16 solution: an algebra of bitstrings

I had fun this past December solving Advent of Code problems in Haskell. I was particularly proud of my solution to one particular problem involving generating and processing large bitstrings, which I’d like to share here. I think it really shows off the power of an algebraic, DSL-based approach to problem solving.

> {-# LANGUAGE GADTs #-}
>
> import Control.Arrow   ((***))
> import Data.Bits       (xor)
> import Data.List       (unfoldr)
> import Data.List.Split (chunksOf)
> import Data.Maybe      (fromJust)


# The problem

You can go read the problem description if you like, but it’s rather verbose—I’ll try to give a more concise description here, illustrated with Haskell code.

The problem is concerned with strings of bits:

> type BitString = [Bool]


We’ll start just by defining a few utility functions to view and input bitstrings conveniently.

> readbits :: String -> BitString
>
> showbits :: BitString -> String
> showbits = map (\b -> if b then '1' else '0')
>
> withbits :: (BitString -> BitString) -> String -> String
> withbits f = showbits . f . readbits


Now on to the problem proper. There is a central operation—which I’ll call the “dragon transform”—which makes a longer bitstring from a shorter one. Given a bitstring $s$, append a 0 to the end, and then append a reversed and inverted version of $s$ (where “invert” means to flip all the bits). Like so:

> invert :: BitString -> BitString
> invert = map not
>
> dragon :: BitString -> BitString
> dragon s = s ++ [False] ++ invert (reverse s)


For example,

ghci> withbits dragon "1"
"100"

ghci> withbits dragon "1101111"
"110111100000100"


(Incidentally, this operation is called dragon since it is related to the classic dragon curve. Hint: interpret 0 as a left turn and 1 as a right turn.)

Given a starting bitstring, and a target length, we are supposed to iterate dragon until we have at least the number of target bits, and then truncate the string to the desired length:

> fill :: Int -> BitString -> BitString
> fill len = take len . head . dropWhile ((< len) . length) . iterate dragon


For example, if we start with 1, we have to iterate dragon three times to end up with at least ten bits.

ghci> map showbits . take 4 $iterate dragon [True] ["1","100","1000110","100011001001110"] ghci> withbits (fill 10) "1" "1000110010"  Finally, after extending an initial bitstring to a given length, we perform a checksum operation: • If there are an odd number of bits, we are done. • Otherwise, take the bits two at a time and compute the negation of their exclusive or: that is, 1 if the bits are the same and 0 if they are different (otherwise known as (==)). This results in a bitstring half as long. Now repeat the process, continuing to halve the length until we have an odd number of bits remaining. In code: > checksum :: BitString -> BitString > checksum a > | odd (length a) = a > | otherwise = checksum . map xnor . chunksOf 2$ a
>   where
>     xnor [x,y] = x == y


So, we now have a simple reference implementation that directly follows the specification. We can use this to solve the first task, which just asks to start with a given short bitstring, extend it to length $272$, and then compute the checksum. I think different logged-in users get different starting strings, but mine was 01000100010010111:

> input = "01000100010010111"

ghci> withbits (checksum . fill 272) input
"10010010110011010"


Notice that $272 = 17 \cdot 2^4$, so after expanding to that length and then repeatedly halving the length, we end up with a checksum of length 17.

That was easy. Bring on the second task! Well… of course, it is much bigger. It asks to use the same starting bitstring, but this time extend it to length $35651584 = 17 \cdot 2^{21}$ before computing the checksum (which will again end up having length 17). Using this naive, unoptimized implementation completely blows up: it turns out that generating a list of 35 million booleans is really not a good idea. Using actual lists with a cons cell for each bit incurs a whole lot of memory and allocation overhead; it just made my computer grind to a halt.

As you may realize, there is a lot of low-hanging fruit here: for example, we can use an unboxed Vector instead of a list, or even do some deforestation to avoid allocation (the former code is by Eric Mertens aka glguy, the latter by Daniel Wagner aka dmwit). Using techniques like that, it’s possible to get the runtime and memory requirements down to something reasonable. But that’s not what I want to talk about. Though more efficient, those solutions are still actually computing every single bit. It seemed to me we shouldn’t have to do that: the computation has a lot of nice structure, and seemingly a lot of opportunity for sharing intermediate results. I went off in search of a way to compute the correct checksum without actually generating the entire intermediate bitstring.

# Interlude: xnor

The first order of business was to work out an algebraic understanding of the xnor operation, which I will denote $\overline{x \oplus y}$ (the circled plus operator $\oplus$ denotes xor, and the overbar denotes logical negation). One fundamental fact is that

$\displaystyle \overline{x \oplus y} = \overline{x} \oplus y = x \oplus \overline{y}$

(checking whether $x$ and $y$ are equal is the same as first negating one and then checking whether they are unequal). From this, and the fact that $\oplus$ is associative, we can prove associativity of xnor:

$\displaystyle \overline{\overline{x \oplus y} \oplus z} = (\overline{x} \oplus y) \oplus \overline{z} = \overline{x} \oplus (y \oplus \overline{z}) = \overline{x \oplus \overline{y \oplus z}}$

Associativity, along with the fact that $1$ is an identity for the operation, means it forms a monoid. When we repeatedly take the xnor of adjacent bits, we are therefore basically doing an mconcat using a strictly balanced combining scheme. But associativity means we can be freer about the order in which we do the combining. If we start with a bitstring of length $n \cdot 2^k$, the checksumming operation iterates $k$ times, and each consecutive sequence of $2^k$ bits gets folded down into a single bit via mconcat. In other words, the checksum operation can be reimplemented like this:

> checksum2 :: BitString -> BitString
> checksum2 a = map combine . chunksOf (powTwo (length a)) $a > where > combine = foldr (==) True > > -- Find the biggest power of two that divides n > powTwo n > | odd n = 1 > | otherwise = 2 * powTwo (n div 2)  Let’s check that this works: ghci> withbits (checksum2 . fill 272) input "10010010110011010" ghci> let bits = fill 272 (readbits input) in checksum bits == checksum2 bits True  Now, this isn’t really any faster yet; but this idea will be important later! There’s one more thing we can observe about xnor: if we fold an odd number of bits with xnor, it’s the same as taking the xor of all the bits; if we fold an even number of bits, it’s the same as taking the xor of all the bits and then negating the result. That is, $\displaystyle \begin{array}{rcl} \overline{x_1 \oplus x_2} &=& \overline{x_1 \oplus x_2} \\[0.5em] \overline{x_1 \oplus \overline{x_2 \oplus x_3}} &=& x_1 \oplus x_2 \oplus x_3 \\[0.5em] \overline{x \oplus \overline{x_2 \oplus \overline{x_3 \oplus x_4}}} &=& \overline{x_1 \oplus x_2 \oplus x_3 \oplus x_4} \\[0.5em] \overline{x \oplus \overline{x_2 \oplus \overline{x_3 \oplus \overline{x_4 \oplus x_5}}}} &=& x_1 \oplus x_2 \oplus x_3 \oplus x_4 \oplus x_5 \end{array}$ and so on. The proof is a simple induction argument, making use of the relation $\overline{x \oplus y} = \overline{x} \oplus y$ we noted before. So when folding xnor, as a simple optimization, we can avoid doing a lot of negations by just computing the xor and then negating appropriately based on the parity of the number of bits. # The algebra of bitstrings With that under our belts, we can move on to the real meat of the solution. The central idea is that instead of representing bitstrings directly as lists (or vectors, or whatever) of bits, we represent them using a deep embedding of a little bitstring algebra (aka DSL). That is, we represent each bitstring operation as a constructor of an algebraic data type, which allows us to directly manipulate bitstring expressions. The point is that this algebra/DSL has a lot of nice structure that allows us to work at an abstract, algebraic level instead of working directly with bits. There’s one more twist to note before actually seeing the data type definition. We know that we will need to talk about the length of bitstrings as well as their xnor/xor. Instead of having to recalculate these every time we need them, we can cache them at each node of a bitstring expression. We’ll see how these cached values come in handy later. > data BitExpr where  So, what does our algebra of bitstrings need? First, it’s useful to have an explicit representation of the empty bitstring, as well as a singleton bit. We don’t need to cache length or xor values here, since they are obvious and can be computed in constant time. > Emp :: BitExpr > Bit :: Bool -> BitExpr  Next, we need to be able to append bitstrings. Notice the Bool, which represents the cached xor of the entire bitstring, as well as the Integer which represents the length. > App :: !Bool -> !Integer -> BitExpr -> BitExpr -> BitExpr  Finally, we need three unary operations on bitstrings: invert, reverse, and dragon. Each also carries a cached length and xor. > Inv :: !Bool -> !Integer -> BitExpr -> BitExpr > Rev :: !Bool -> !Integer -> BitExpr -> BitExpr > Drg :: !Bool -> !Integer -> BitExpr -> BitExpr > > deriving Show  Note that Drg is redundant in some sense, since the dragon transform can be encoded in terms of append, inverse, and reverse. However, it’s critical that we include it explicitly: since the dragon transform uses the input bitstring twice, expanding an iterated application of Drg in terms of the other constructors would result in an exponential blowup in the size of the expression. To be concrete, let’s write a straightforward interpreter which formally connects a bitstring expression with its intended semantics as a bitstring. This comes in handy for testing, but other than testing, the whole point is that we will not use this—we want to solve the problem at the level of bitstring expressions, without ever actually generating their corresponding bitstrings. > toBits :: BitExpr -> BitString > toBits Emp = [] > toBits (Bit b) = [b] > toBits (App _ _ s1 s2) = toBits s1 ++ toBits s2 > toBits (Inv _ _ s) = invert (toBits s) > toBits (Rev _ _ s) = reverse (toBits s) > toBits (Drg _ _ s) = dragon (toBits s)  Next, let’s write some simple utility functions to extract the cached length or xor from the root of a bitstring expression: > bsLen :: BitExpr -> Integer > bsLen Emp = 0 > bsLen (Bit _) = 1 > bsLen (App _ l _ _) = l > bsLen (Inv _ l _) = l > bsLen (Rev _ l _) = l > bsLen (Drg _ l _) = l > > bsXor :: BitExpr -> Bool > bsXor Emp = False > bsXor (Bit b) = b > bsXor (App b _ _ _) = b > bsXor (Inv b _ _) = b > bsXor (Rev b _ _) = b > bsXor (Drg b _ _) = b  Next, we’ll write some smart constructors which automatically take care of properly computing the cached length and xor. > bit :: Bool -> BitExpr > bit = Bit  Appending combines xor values with xor and adds lengths. app also does a bit of optimization when appending with the empty bitstring. For convenience, we can also use app to create a function bits to convert a literal bitstring into a BitExpr. > app :: BitExpr -> BitExpr -> BitExpr > app s1 Emp = s1 > app s1 s2 = App (bsXor s1 xor bsXor s2) (bsLen s1 + bsLen s2) s1 s2 > > bits :: String -> BitExpr > bits = foldr (app . bit . (=='1')) Emp  Inverting a bitstring preserves the xor when it has even length, and inverts the xor when it has odd length. Note how we make use of both the cached xor and length values to compute the new cached xor. > inv :: BitExpr -> BitExpr > inv s = Inv (if even (bsLen s) then bsXor s else not (bsXor s)) > (bsLen s) > s  Reversing preserves xor and length. > rev :: BitExpr -> BitExpr > rev s = Rev (bsXor s) (bsLen s) s  Finally, the dragon operation: the xor of dragon s is the xor of s combined with the xor of inv s; the length is one more than twice the length of s. > drg :: BitExpr -> BitExpr > drg s = Drg (bsXor s xor bsXor (inv s)) (2*(bsLen s) + 1) s  We can test these: ghci> let t = drg (bits "11" app inv (bits "10000")) ghci> showbits . toBits$ t
"110111100000100"

ghci> bsLen t
15


# Splitting

Remember that our high-level goal is to take the expanded version of our bitstring, split it into blocks of length $2^k$, and then separately reduce each block with xnor. It turns out that we have enough information to split a bitstring expression into two bitstring expressions which correspond to splitting off a block of a given size from the beginning of the corresponding bitstring. That is, we will write a function splitBits :: Integer -> BitExpr -> (BitExpr, BitExpr) which works like splitAt, but on bitstring expressions instead of bitstrings. In other words, it will satisfy the property

splitAt n . toBits == (toBits *** toBits) . splitBits n

We’ll go through the implementation case by case. You might like to try implementing splitBits yourself before peeking at mine; it makes for a nice exercise.

> splitBits :: Integer -> BitExpr -> (BitExpr, BitExpr)


In the base cases, to split zero bits off the front of a bitstring, or if we are asked to split off more bits than there are, just generate the empty bitstring expression.

> splitBits 0 s                = (Emp, s)
> splitBits n s | n >= bsLen s = (s, Emp)


To split an App node, compare the number of bits we want to split off with the length of the first bitstring, and recursively split in either the left or right side appropriately, remembering to subtract the length of the first bitstring from the number of bits to split if we recurse on the right side.

> splitBits n (App _ _ s1 s2)
>   | n < bsLen s1
>     = let (s1a, s1b) = splitBits n s1 in (s1a, s1b app s2)
>   | otherwise
>     = let (s2a, s2b) = splitBits (n - bsLen s1) s2 in (s1 app s2a, s2b)


Inverting commutes with splitting, so to split an Inv node, we can just split recursively and then rewrap the results with inv.

> splitBits n (Inv _ _ s) = (inv *** inv) $splitBits n s  To split Rev and Drg nodes, we expand the expressions a bit to get rid of the top-level constructor before re-calling splitBits. > splitBits n (Rev _ _ s) = splitBits n (pushRev s) > splitBits n (Drg _ _ s) = splitBits n (expandDragon s)  In the case of Rev, we can “push the reverse through” one level, transforming it into an equivalent expression which no longer has a Rev node at the top. We make use of some nice algebraic properties governing the interaction of reverse with the other operations: • Reversing an empty or singleton bitstring does nothing. • reverse (s1 ++ s2) == reverse s2 ++ reverse s1 • reverse . invert = invert . reverse • reverse . reverse = id • Finally, reverse . dragon = dragon . invert, which can be easily proved by expanding dragon in terms of the other operations and then applying the above algebraic laws. Using these properties, we can implement pushRev as follows: > pushRev :: BitExpr -> BitExpr > pushRev Emp = Emp > pushRev (Bit b) = Bit b > pushRev (App _ _ s1 s2) = rev s2 app rev s1 > pushRev (Inv _ _ s) = inv (rev s) > pushRev (Rev _ _ s) = s > pushRev (Drg _ _ s) = drg (inv s)  Finally, expandDragon just expands a dragon operation in terms of the other operations. Although this approximately doubles the size of the bitstring expression, we only do this lazily, when we are actually trying to split the result of a dragon transform. It’s only natural that splitting an expression results in somewhat larger expressions. > expandDragon :: BitExpr -> BitExpr > expandDragon s = s app (bit False app inv (rev s))  # Filling and checksumming We’re almost there! We can now implement the fill and checksum operations at the level of bitstring expressions. fill is straightforward: keep applying the drg smart constructor until the cached length is sufficient, then use splitBits to create an expression corresponding to only the first $n$ bits. > fillE :: Integer -> String -> BitExpr > fillE n str = fst . splitBits n$ go (bits str)
>   where
>     go s | bsLen s >= n = s
>          | otherwise    = go (drg s)


Finally, we can implement checksumE using the same pattern as checksum2, where we break up the string into chunks of size $2^k$ and then reduce each chunk. The only difference is that now we use splitBits to split, and the cached xor to compute the reduction. We know each of the blocks has an even length, so the xnor is just the negation of the cached xor.

> checksumE :: BitExpr -> BitString
> checksumE s = map (not . bsXor) . unfoldr doSplit $s > where > doSplit Emp = Nothing > doSplit s = Just (splitBits blockLen s) > blockLen = powTwo (bsLen s) > powTwo n > | odd n = 1 > | otherwise = 2 * powTwo (n div 2)  Let’s check that we get the same answer for the first task: ghci> showbits$ checksumE (fillE 272 input)
"10010010110011010"

ghci> withbits (checksum . fill 272) input
"10010010110011010"


Great! And now for the second task:

ghci> showbits $checksumE (fillE (17 * 2^21) input) "01010100101011100"  On my machine this finishes pretty much instantaneously, taking only 0.02 seconds. In order to generate enough bits, the dragon transform must be applied 21 times, but that just generates a small expression with 21 Drg constructors. Splitting into chunks of length $2^{21}$ certainly expands the size of the expressions a bit, but everything stays nice and logarithmic since many of the Drg constructors can remain unexpanded. In fact, this can easily handle MUCH larger problem instances. For example: ghci> showbits$ checksumE (fillE (17 * 2^80) input)
"10000100010001100"

ghci> showbits $checksumE (fillE (17 * 2^81) input) "01010100101011100"  Semantically, this corresponds to generating yottabytes worth of bits (I had to look up the proper prefix) and then checksumming them; operationally, though, these are still basically instantaneous. (Interestingly, I also tried $17 \cdot 2^{200}$, and it instantaneously printed the first 11 bits of the answer and then segfaulted. Perhaps I have found a bug in GHC 8.0.2.) Notice that the checksum for $17 \cdot 2^{81}$ is actually the same as that for $17 \cdot 2^{21}$. After playing around with it a bit, the checksums for $17 \cdot 2^k$ seem to have a period of 12, but I’m not sure how to prove it! Posted in haskell | Tagged , , , , , , | 2 Comments ## My new programming languages course tl;dr: my new PL course is now finished, and all the course materials are freely available. Working through all the exercises should be a great option for anyone wishing to learn some basics of programming language design and implementation. Last May, I wrote about my ideas for designing a new PL course, and got a lot of great comments and feedback. Well, somehow I survived the semester, and the course is now over. In the end I’m pretty happy with how it went (though of course there are always things that can be improved next time). I decided to use class time in an unconventional way: for each class meeting I created a “module”, consisting of a literate Haskell file with some example code, explanatory text, and lots of holes where students needed to write answers to exercises or fill in code. I split the students into groups, and they spent class time just working through the module. Instead of standing at the front lecturing, I just wandered around watching them work and answering questions. It took a bit of getting used to—for the first few classes I couldn’t shake the feeling that I wasn’t really doing my job—but it quickly became clear that the students were really learning and engaging with the material in a way that they would not have been able to if I had just lectured. A happy byproduct of this approach is that the modules are fairly self-contained and can now be used by anyone to learn the material. Reading through all the modules and working through the exercises should be a great option for anyone wishing to learn some basics of programming language design and implementation. For example, I know I will probably reuse it to get summer research students up to speed. Note that the course assumes no knowledge of Haskell (so those familiar with Haskell can safely skip the first few modules), but introduces just enough to get where I want to go. I don’t plan to release any solutions, so don’t ask. But other than that, questions, comments, bug reports, etc. are welcome! Posted in teaching | Tagged , , , | 9 Comments ## MonadRandom 0.5 released I’m happy to announce the release of MonadRandom-0.5, a package which provides a convenient monadic interface for random number generation in the style of transformers and mtl: a Rand monad (essentially a state monad that threads through a generator), a monad transformer variant called RandT, and a RandomMonad class allowing the use of random generation operations in monad stacks containing RandT. This release has quite a few small additions as well as a big module reorganization. However, thanks to module re-exports, most existing code using the library should continue to work with no changes; the major version bump reflects the large reorganization and my resultant inability to 100% guarantee that existing user code will not break. If your code does break, please let me know—I would be happy to help you fix it, or simply to know about it so I can help other users. Here are a few of the biggest changes that may be of interest to users of the library: • A new MonadInterleave class (see #20), which is a big improvement over MonadSplit. It provides a method interleave :: m a -> m a, which works by splitting the generator, running its argument using one half of the generator, and using the other half as the final state of the resulting action (replacing whatever the final generator state otherwise would have been). This can be used, for example, to allow random computations to run in parallel, or to create lazy infinite structures of random values. In the example below, the infinite tree randTree cannot be evaluated lazily: even though it is cut off at two levels deep by hew 2, the random value in the right subtree still depends on generation of all the random values in the (infinite) left subtree, even though they are ultimately unneeded. Inserting a call to interleave, as in randTreeI, solves the problem: the generator splits at each Node, so random values in the left and right subtrees are generated independently. data Tree = Leaf | Node Int Tree Tree deriving Show hew :: Int -> Tree -> Tree hew 0 _ = Leaf hew _ Leaf = Leaf hew n (Node x l r) = Node x (hew (n-1) l) (hew (n-1) r) randTree :: Rand StdGen Tree randTree = Node <$> getRandom <*> randTree <*> randTree

randTreeI :: Rand StdGen Tree
randTreeI = interleave $Node <$> getRandom <*> randTreeI <*> randTreeI

>>> hew 2 <$> evalRandIO randTree Node 2168685089479838995 (Node (-1040559818952481847) Leaf Leaf) (Node ^CInterrupted. >>> hew 2 <$> evalRandIO randTreeI
Node 8243316398511136358 (Node 4139784028141790719 Leaf Leaf) (Node 4473998613878251948 Leaf Leaf)
• A new PrimMonad instance for RandT (thanks to Koz Ross), allowing it to be used in conjunction with e.g. mutable vectors.

• New and improved random selection functions:
• fromList now raises an error when the total weight of elements is zero.
• The type of uniform is generalized to work over any Foldable.
• New operations weighted, weightedMay, fromListMay, and uniformMay have been added. weighted is like fromList but generalized to work over any Foldable. The May variants, of course, return a Maybe result instead of raising an error.
• New lazy vs strict variants of the Rand monad. If you import Control.Monad.Random or Control.Monad.Trans.Random you get the Lazy variant re-exported by default, but you can explicitly import .Lazy or .Strict if you want. They provide the exact same API, but Lazy is implemented with a lazy state monad and Strict with a strict one. To be honest it’s not clear what difference this might make, but since the distinction is already there with the underlying state monad for free, why not provide it?

Although there was some discussion of generalizing MonadRandom to work for a wider range of underlying generators (see the comments on my previous blog post and the discussion on issue #26), I decided to punt on that for now. It seems rather complicated, and that there are already good alternatives like the very nice random-fu package, so I decided to keep things simple for this release. I’m still open to proposals for generalizing future releases.

For a full rundown of changes in 0.5, see the change log. Comments, questions, and bug reports are always welcome either as a comment on this blog post or on the GitHub issue tracker.

## MonadRandom 0.5 and mwc-random: feedback wanted

Since 2013 or so I have been the maintainer of the MonadRandom package, which provides an mtl-style type class for monads with support for generation of pseudorandom values, along with a concrete random monad transformer RandT. As of this writing it has 89 reverse dependencies on Hackage—a healthy number, and one that makes me think carefully about any breaking changes to the package.

Recently I got a number of pull requests, and have been working on putting together an 0.5 release which adds a few functions, adds lazy- and strict-state variants of RandT, and reorganizes things to be closer to standard practice of the transformers package. Since this release will include some technically breaking changes already, it’s a good time to think about potentially including others.

The one thing I am not sure what to do about is this issue: Allow MonadRandom interface for MWC-random. mwc-random is a very nice package for psuedorandom number generation, but apparently it does not fit into the MonadRandom abstraction. First of all, I would like to understand why—I am not very familiar with mwc-random. Second of all, I’d love to figure out a solution, but ideally one that causes as little breakage to existing code as possible.

Leave a comment (either here or on the github issue) if this is something you know/care about, and let’s see if we can figure out a good solution together!

## The divided difference track

My wife and son made a train track corresponding to the regular expression of divided differences, $b^*ha^*$:

## Adventures in enumerating balanced brackets

Since I’ve been coaching my school’s ACM ICPC programming team, I’ve been spending a bit of time solving programming contest problems, partly to stay sharp and be able to coach them better, but also just for fun.

I recently solved a problem (using Haskell) that ended up being tougher than I thought, but I learned a lot along the way. Rather than just presenting a solution, I’d like to take you through my thought process, crazy detours and all.

Of course, I should preface this with a big spoiler alert: if you want to try solving the problem yourself, you should stop reading now!

> {-# LANGUAGE GADTs #-}
> {-# LANGUAGE DeriveFunctor #-}
>
> module Brackets where
>
> import Data.List (sort, genericLength)
> import Data.MemoTrie (memo, memo2)
> import Prelude hiding ((++))


## The problem

There’s a lot of extra verbiage at the official problem description, but what it boils down to is this:

Find the $m$th element of the lexicographically ordered sequence of all balanced bracketings of length $n$.

There is a longer description at the problem page, but hopefully a few examples will suffice. A balanced bracketing is a string consisting solely of parentheses, in which opening and closing parens can be matched up in a one-to-one, properly nested way. For example, there are five balanced bracketings of length $6$:

((())), (()()), (())(), ()(()), ()()()

By lexicographically ordered we just mean that the bracketings should be in “dictionary order” where ( comes before ), that is, bracketing $x$ comes before bracketing $y$ if and only if in the first position where they differ, $x$ has ( and $y$ has ). As you can verify, the list of length-$6$ bracketings above is, in fact, lexicographically ordered.

## A first try

Oh, this is easy, I thought, especially if we consider the well-known isomorphism between balanced bracketings and binary trees. In particular, the empty string corresponds to a leaf, and (L)R (where L and R are themselves balanced bracketings) corresponds to a node with subtrees L and R. So the five balanced bracketings of length $6$ correspond to the five binary trees with three nodes:

We can easily generate all the binary trees of a given size with a simple recursive algorithm. If $n = 0$, generate a Leaf; otherwise, decide how many nodes to put on the left and how many on the right, and for each such distribution recursively generate all possible trees on the left and right.

> data Tree where
>   Leaf :: Tree
>   Node :: Tree -> Tree -> Tree
>   deriving (Show, Eq, Ord)
>
> allTrees :: Int -> [Tree]
> allTrees 0 = [Leaf]
> allTrees n =
>   [ Node l r
>   | k <- [0 .. n-1]
>   , l <- allTrees ((n-1) - k)
>   , r <- allTrees k
>   ]


We generate the trees in “left-biased” order, where we first choose to put all $n-1$ nodes on the left, then $n-2$ on the left and $1$ on the right, and so on. Since a subtree on the left will result in another opening paren, but a subtree on the right will result in a closing paren followed by an open paren, it makes intuitive sense that this corresponds to generating bracketings in sorted order. You can see that the size-$3$ trees above, generated in left-biased order, indeed have their bracketings sorted.

Writing allTrees is easy enough, but it’s definitely not going to cut it: the problem states that we could have up to $n = 1000$. The number of trees with $1000$ nodes has 598 digits (!!), so we can’t possibly generate the entire list and then index into it. Instead we need a function that can more efficiently generate the tree with a given index, without having to generate all the other trees before it.

So I immediately launched into writing such a function, but it’s tricky to get right. It involves computing Catalan numbers, and cumulative sums of products of Catalan numbers, and divMod, and… I never did get that function working properly.

## The first epiphany

But I never should have written that function in the first place! What I should have done first was to do some simple tests just to confirm my intuition that left-biased tree order corresponds to sorted bracketing order. Because if I had, I would have found this:

> brackets :: Tree -> String
> brackets Leaf       = ""
> brackets (Node l r) = mconcat ["(", brackets l, ")", brackets r]
>
> sorted :: Ord a => [a] -> Bool
> sorted xs = xs == sort xs

ghci> sorted (map brackets (allTrees 3))
True

ghci> sorted (map brackets (allTrees 4))
False


As you can see, my intuition actually led me astray! $n = 3$ is a small enough case that left-biased order just happens to be the same as sorted bracketing order, but for $n = 4$ this breaks down. Let’s see what goes wrong:

In the top row are the size-$4$ trees in “left-biased” order, i.e. the order generated by allTrees. You can see it is nice and symmetric: reflecting the list across a vertical line leaves it unchanged. On the bottom row are the same trees, but sorted lexicographically by their bracketings. You can see that the lists are almost the same except the red tree is in a different place. The issue is the length of the left spine: the red tree has a left spine of three nodes, which means its bracketing will begin with (((, so it should come before any trees with a left spine of length 2, even if they have all their nodes in the left subtree (whereas the red tree has one of its nodes in the right subtree).

My next idea was to try to somehow enumerate trees in order by the length of their left spine. But since I hadn’t even gotten indexing into the original left-biased order to work, it seemed hopeless to get this to work by implementing it directly. I needed some bigger guns.

## Building enumerations

At this point I had the good idea to introduce some abstraction. I defined a type of enumerations (a la FEAT or data/enumerate):

> data Enumeration a = Enumeration
>   { fromNat :: Integer -> a
>   , size    :: Integer
>   }
>   deriving Functor
>
> enumerate :: Enumeration a -> [a]
> enumerate (Enumeration f n) = map f [0..n-1]


An Enumeration consists of a size along with a function Integer -> a, which we think of as being defined on [0 .. size-1]. That is, an Enumeration is isomorphic to a finite list of a given length, where instead of explicitly storing the elements, we have a function which can compute the element at a given index on demand. If the enumeration has some nice combinatorial structure, then we expect that this on-demand indexing can be done much more efficiently than simply listing all the elements. The enumerate function simply turns an Enumeration into the corresponding finite list, by mapping the indexing function over all possible indices.

Note that Enumeration has a natural Functor instance, which GHC can automatically derive for us. Namely, if e is an Enumeration, then fmap f e is the Enumeration which first computes the element of e for a given index, and then applies f to it before returning.

Now, let’s define some combinators for building Enumerations. We expect them to have all the nice algebraic flavor of finite lists, aka free monoids.

First, we can create empty or singleton enumerations, or convert any finite list into an enumeration:

> empty :: Enumeration a
> empty = Enumeration (const undefined) 0
>
> singleton :: a -> Enumeration a
> singleton a = Enumeration (\_ -> a) 1
>
> list :: [a] -> Enumeration a
> list as = Enumeration (\n -> as !! fromIntegral n) (genericLength as)

ghci> enumerate (empty :: Enumeration Int)
[]

ghci> enumerate (singleton 3)
[3]

ghci> enumerate (list [4,6,7])
[4,6,7]


We can form the concatenation of two enumerations. The indexing function compares the given index against the size of the first enumeration, and then indexes into the first or second enumeration appropriately. For convenience we can also define union, which is just an iterated version of (++).

> (++) :: Enumeration a -> Enumeration a -> Enumeration a
> e1 ++ e2 = Enumeration
>   (\n -> if n < size e1 then fromNat e1 n else fromNat e2 (n - size e1))
>   (size e1 + size e2)
>
> union :: [Enumeration a] -> Enumeration a
> union = foldr (++) empty

ghci> enumerate (list [3, 5, 6] ++ empty ++ singleton 8)
[3,5,6,8]


Finally, we can form a Cartesian product: e1 >< e2 is the enumeration of all possible pairs of elements from e1 and e2, ordered so that all the pairs formed from the first element of e1 come first, followed by all the pairs with the second element of e1, and so on. The indexing function divides the given index by the size of e2, and uses the quotient to index into e1, and the remainder to index into e2.

> (><) :: Enumeration a -> Enumeration b -> Enumeration (a,b)
> e1 >< e2 = Enumeration
>   (\n -> let (l,r) = n divMod size e2 in (fromNat e1 l, fromNat e2 r))
>   (size e1 * size e2)

ghci> enumerate (list [1,2,3] >< list [10,20])
[(1,10),(1,20),(2,10),(2,20),(3,10),(3,20)]

ghci> let big = list [0..999] >< list [0..999] >< list [0..999] >< list [0..999]
ghci> fromNat big 2973428654
(((2,973),428),654)


Notice in particular how the fourfold product of list [0..999] has $1000^4 = 10^{12}$ elements, but indexing into it with fromNat is basically instantaneous.

Since Enumerations are isomorphic to finite lists, we expect them to have Applicative and Monad instances, too. First, the Applicative instance is fairly straightforward:

> instance Applicative Enumeration where
>   pure    = singleton
>   f <*> x = uncurry ($) <$> (f >< x)

ghci> enumerate $(*) <$> list [1,2,3] <*> list [10, 100]
[10,100,20,200,30,300]


pure creates a singleton enumeration, and applying an enumeration of functions to an enumeration of arguments works by taking a Cartesian product and then applying each pair.

The Monad instance works by substitution: in e >>= k, the continuation k is applied to each element of the enumeration e, and the resulting enumerations are unioned together in order.

> instance Monad Enumeration where
>   return  = pure
>   e >>= f = union (map f (enumerate e))

ghci> enumerate $list [1,2,3] >>= \i -> list (replicate i i) [1,2,2,3,3,3]  Having to actually enumerate the elements of e is a bit unsatisfying, but there is really no way around it: we otherwise have no way to know how big the resulting enumerations are going to be. Now, that function I tried (and failed) to write before that generates the tree at a particular index in left-biased order? Using these enumeration combinators, it’s a piece of cake. Basically, since we built up combinators that mirror those available for lists, it’s just as easy to write this indexing version as it is to write the original allTrees function (which I’ve copied below for comparison): allTrees :: Int -> [Tree] allTrees 0 = [Leaf] allTrees n = [ Node l r | k <- [0 .. n-1] , l <- allTrees ((n-1) - k) , r <- allTrees k ] > enumTrees :: Int -> Enumeration Tree > enumTrees 0 = singleton Leaf > enumTrees n = union > [ Node <$> enumTrees (n-k-1) <*> enumTrees k
>   | k <- [0 .. n-1]
>   ]


(enumTrees and allTrees look a bit different, but actually allTrees can be rewritten in a very similar style:

allTrees :: Int -> [Tree]
allTrees 0 = [Leaf]
allTrees n = concat
[ Node <$> allTrees ((n-1) - k) <*> r <- allTrees k | k <- [0 .. n-1] ] Doing as much as possible using the Applicative interface gives us added “parallelism”, which in this case means the ability to index directly into a product with divMod, rather than scanning through the results of calling a function on enumerate until we have accumulated the right size. See the paper on the GHC ApplicativeDo extension.) Let’s try it out: ghci> enumerate (enumTrees 3) [Node (Node (Node Leaf Leaf) Leaf) Leaf,Node (Node Leaf (Node Leaf Leaf)) Leaf,Node (Node Leaf Leaf) (Node Leaf Leaf),Node Leaf (Node (Node Leaf Leaf) Leaf),Node Leaf (Node Leaf (Node Leaf Leaf))] ghci> enumerate (enumTrees 3) == allTrees 3 True ghci> enumerate (enumTrees 7) == allTrees 7 True ghci> brackets$ fromNat (enumTrees 7) 43
"((((()())))())"


It seems to work! Though actually, if we try larger values of $n$, enumTrees just seems to hang. The problem is that it ends up making many redundant recursive calls. Well… nothing a bit of memoization can’t fix! (Here I’m using Conal Elliott’s nice MemoTrie package.)

> enumTreesMemo :: Int -> Enumeration Tree
> enumTreesMemo = memo enumTreesMemo'
>   where
>     enumTreesMemo' 0 = singleton Leaf
>     enumTreesMemo' n = union
>       [ Node <$> enumTreesMemo (n-k-1) <*> enumTreesMemo k > | k <- [0 .. n-1] > ]  ghci> size (enumTreesMemo 10) 16796 ghci> size (enumTreesMemo 100) 896519947090131496687170070074100632420837521538745909320 ghci> size (enumTreesMemo 1000) 2046105521468021692642519982997827217179245642339057975844538099572176010191891863964968026156453752449015750569428595097318163634370154637380666882886375203359653243390929717431080443509007504772912973142253209352126946839844796747697638537600100637918819326569730982083021538057087711176285777909275869648636874856805956580057673173655666887003493944650164153396910927037406301799052584663611016897272893305532116292143271037140718751625839812072682464343153792956281748582435751481498598087586998603921577523657477775758899987954012641033870640665444651660246024318184109046864244732001962029120 ghci> brackets$ fromNat (enumTreesMemo 1000) 8234587623904872309875907638475639485792863458726398487590287348957628934765
"((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((()(((()((((()))())(()()()))()(())(())((()((()))(((())()(((((()(((()()))(((()((((()()(())()())(((()))))(((()()()(()()))))(((()((()))(((()())())))())(()()(())(())()(()())))()))((()()))()))()))()(((()))(()))))))())()()()))((())((()))((((())(())))((())))))()))()(())))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))"


That’s better!

## A second try

At this point, I thought that I needed to enumerate trees in order by the length of their left spine. Given a tree with a left spine of length $s$, we enumerate all the ways to partition the remaining $n-s$ elements among the right children of the $s$ spine nodes, preferring to first put elements as far to the left as possible. As you’ll see, this turns out to be wrong, but it’s fun to see how easy it is to write this using the enumeration framework.

First, we need an enumeration of the partitions of a given $n$ into exactly $k$ parts, in lexicographic order.

> kPartitions :: Int -> Int -> Enumeration [Int]


There is exactly one way to partition $0$ into zero parts.

> kPartitions 0 0 = singleton []


We can’t partition anything other than $0$ into zero parts.

> kPartitions _ 0 = empty


Otherwise, pick a number $i$ from $n$ down to $0$ to go in the first spot, and then recursively enumerate partitions of $n-i$ into exactly $k-1$ parts.

> kPartitions n k = do
>   i <- list [n, n-1 .. 0]
>   (i:) <$> kPartitions (n-i) (k-1)  Let’s try it: ghci> let p43 = enumerate$ kPartitions 4 3
ghci> p43
[[4,0,0],[3,1,0],[3,0,1],[2,2,0],[2,1,1],[2,0,2],[1,3,0],[1,2,1],[1,1,2],[1,0,3],[0,4,0],[0,3,1],[0,2,2],[0,1,3],[0,0,4]]

ghci> all ((==3) . length) p43
True

ghci> all ((==4) . sum) p43
True

ghci> sorted (reverse p43)
True


Now we can use kPartitions to build our enumeration of trees:

> spinyTrees :: Int -> Enumeration Tree
> spinyTrees = memo spinyTrees'
>   where
>     spinyTrees' 0 = singleton Leaf
>     spinyTrees' n = do
>
>       -- Pick the length of the left spine
>       spineLen  <- list [n, n-1 .. 1]
>
>       -- Partition the remaining elements among the spine nodes
>       bushSizes <- kPartitions (n - spineLen) spineLen
>       bushes <- traverse spinyTrees bushSizes
>       return $buildSpine (reverse bushes) > > buildSpine :: [Tree] -> Tree > buildSpine [] = Leaf > buildSpine (b:bs) = Node (buildSpine bs) b  This appears to give us something reasonable: ghci> size (spinyTrees 7) == size (enumTreesMemo 7) True  But it’s pretty slow—which is to be expected with all those monadic operations required. And there’s more: ghci> sorted . map brackets . enumerate$ spinyTrees 3
True

ghci> sorted . map brackets . enumerate $spinyTrees 4 True ghci> sorted . map brackets . enumerate$ spinyTrees 5
False


Foiled again! All we did was stave off failure a bit, until $n=5$. I won’t draw all the trees of size $5$ for you, but the failure mode is pretty similar: picking subtrees for the spine based just on how many elements they have doesn’t work, because there are cases where we want to first shift some elements to a later subtree, keeping the left spine of a subtree, before moving the elements back and having a shorter left spine.

It finally occurred to me that there was nothing in the problem statement that said anything about trees. That was just something my overexcited combinatorial brain imposed on it: obviously, since there is a bijection between balanced bracketings and binary trees, we should think about binary trees, right? …well, there is also a bijection between balanced bracketings and permutations avoiding (231), and lattice paths that stay above the main diagonal, and hundreds of other things, so… not necessarily.

In this case, I think trees just end up making things harder. Let’s think instead about enumerating balanced bracket sequences directly. To do it recursively, we need to know how to enumerate possible endings to the start of any balanced bracket sequence. That is, we need to enumerate sequences containing $n$ opening brackets and $c$ extra closing brackets (so $n+c$ closing brackets in total), which can be appended to a sequence of brackets with $c$ more opening brackets than closing brackets.

Given this idea, the code is fairly straightforward:

> enumBrackets :: Int -> Enumeration String
> enumBrackets n = enumBracketsTail n 0
>
> enumBracketsTail :: Int -> Int -> Enumeration String
> enumBracketsTail = memo2 enumBracketsTail'
>   where


To enumerate a sequence with no opening brackets, just generate c closing brackets.

>     enumBracketsTail' 0 c = singleton (replicate c ')')


To enumerate balanced sequences with $n$ opening brackets and an exactly matching number of closing brackets, start by generating an opening bracket and then continue by generating sequences with $n-1$ opening brackets and one extra closing bracket to match the opening bracket we started with.

>     enumBracketsTail' n 0 = ('(':) <$> enumBracketsTail (n-1) 1  In general, a sequence with $n$ opening and $c$ extra closing brackets is either an opening bracket followed by an (n-1, c+1)-sequence, or a closing bracket followed by an (n, c-1)-sequence. > enumBracketsTail' n c = > (('(':) <$> enumBracketsTail (n-1) (c+1))
>         ++
>         ((')':) <\$> enumBracketsTail n (c-1))


This is quite fast, and as a quick check, it does indeed seem to give us the same size enumerations as the other tree enumerations:

ghci> fromNat (enumBrackets 40) 16221270422764920820
"((((((((()((())()(()()()())(()))((()()()()(()((()())))((()())))))))()))()())()))"

ghci> size (enumBrackets 100) == size (enumTreesMemo 100)
True


But, are they sorted? It would seem so!

ghci> all sorted (map (enumerate . enumBrackets) [1..10])
True


At this point, you might notice that this can be easily de-abstracted into a fairly simple dynamic programming solution, using a 2D array to keep track of the size of the enumeration for each (n,c) pair. I’ll leave the details to interested readers.

Posted in combinatorics, haskell | Tagged , , , | 2 Comments