A lot of fun was had by all. Thanks to Andrew Mackey and the rest of the folks at UAFS for hosting a great contest!
(Fun fact: this was actually my first ICFP talk—though not my first ICFP paper. I am glad I have a bunch of experience giving talks at this point; I can remember a time in my life when giving a talk on that stage would have been terrifying.)
Here is the video of the talk:
You can download my slides here. This version of the slides includes speaking notes—it’s not exactly what I ended up saying, but it should give you a decent idea if you just want to read the slides without watching the video.
This is a literate Haskell post; download it and play along in ghci! (Note it requires GHC 8.6 since I couldn’t resist making use of DerivingVia…)
> {-# LANGUAGE DefaultSignatures #-}
> {-# LANGUAGE FlexibleInstances #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE DerivingStrategies #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> {-# LANGUAGE DerivingVia #-}
> {-# LANGUAGE TypeApplications #-}
> {-# LANGUAGE ScopedTypeVariables #-}
>
> import Data.Semigroup
> import Data.Coerce
Consider a sequence of integers $a_1, a_2, \ldots, a_n$. (Actually, integers is too specific; any linearly ordered domain will do.) An inversion is a pair of positions in the sequence which are “out of order”: that is, a pair $i < j$ such that $a_i > a_j$. So, for example, $3, 5, 1, 4, 2$ has six inversions, namely $(3,1), (3,2), (5,1), (5,4), (5,2), (4,2)$. (Here I’ve written the elements that are out of order rather than their positions, which doesn’t matter much when the elements are all distinct; but it’s important to keep in mind that e.g. $2, 2, 1$ has two inversions, not one, because each copy of $2$ makes an inversion with the $1$.) The total number of inversions of $a$ is denoted $\mathrm{inv}(a)$.
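If you want to check this definition for yourself, here is a quick brute-force counter (a standalone sketch, not part of the literate program below; the name `inversions` is mine):

```haskell
import Data.List (tails)

-- Brute-force inversion counter: look at every pair of positions i < j
-- and count the ones whose elements are out of order.
inversions :: Ord a => [a] -> Int
inversions as = length [ () | (x:rest) <- tails as, y <- rest, x > y ]
```

For instance, `inversions [3,5,1,4,2]` yields the six inversions listed above, and `inversions [2,2,1]` yields two.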
One way to think about the inversion count is as a measure of how far away the sequence is from being sorted. In particular, bubble sort will make precisely $\mathrm{inv}(a)$ adjacent swaps while sorting $a$. The highest possible value of $\mathrm{inv}(a)$ is $n(n-1)/2$, when $a$ is sorted in reverse order.
The obvious brute-force algorithm to count inversions is to use two nested loops to enumerate all possible pairs of elements, and increment a counter each time we discover a pair which is out of order. This clearly takes $O(n^2)$ time. Can it be done any faster?
It turns out the (generally well-known) answer is yes, using a variant of mergesort. The trick is to generalize to counting inversions and sorting the sequence at the same time. First split the sequence in half, and recursively sort and count the inversions in each half. Any inversion in the original sequence must either be entirely contained in one of the two halves (these will be counted by the recursive calls), or have one endpoint in the left half and one in the right. One key observation at this point is that any inversion with one endpoint in each half will still be an inversion even after independently sorting the two halves. The other key observation is that we can merge the two sorted subsequences and count inversions between them in linear time. Use the usual two-finger algorithm for merging two sorted sequences; each time we take an element from the right subsequence, it’s because it is less than all the remaining elements in the left subsequence, but it was to the right of all of them, so we can add the length of the remaining left subsequence to the inversion count. Intuitively, it’s this ability to count a bunch of inversions in one step which allows this algorithm to be more efficient, since any algorithm which only ever increments an inversion counter is doomed to take $\Omega(n^2)$ time no matter how cleverly it splits up the counting. In the end, the total number of inversions is the sum of the inversions counted recursively in the two sublists, plus any inversions between the two sublists.
Here’s some Haskell code implementing this sorted-merge-and-inversion-count. We have to be a bit careful: we don’t want to call length on the remaining sublist at every step (that would ruin the asymptotic performance!), so we precompute the length of the left subsequence and pass it along as an extra parameter which we keep up to date as we recurse.
> -- Precondition: the input lists are sorted.
> -- Output the sorted merge of the two lists, and the number of pairs
> -- (a,b) such that a \in xs, b \in ys with a > b.
> mergeAndCount :: Ord a => [a] -> [a] -> ([a], Int)
> mergeAndCount xs ys = go xs (length xs) ys
>   -- precondition/invariant for go xs n ys: n == length xs
>   where
>     go [] _ ys = (ys, 0)
>     go xs _ [] = (xs, 0)
>     go (x:xs) n (y:ys)
>       | x <= y    = let (m, i) = go xs (n-1) (y:ys) in (x:m, i)
>       | otherwise = let (m, i) = go (x:xs) n ys in (y:m, i + n)
>
> merge :: Ord a => [a] -> [a] -> [a]
> merge xs ys = fst (mergeAndCount xs ys)
>
> inversionsBetween :: Ord a => [a] -> [a] -> Int
> inversionsBetween xs ys = snd (mergeAndCount xs ys)
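For comparison, here is the classic top-down divide-and-conquer wrapped directly around mergeAndCount, before we get to the monoidal machinery (a standalone sketch that duplicates the definition above so it runs on its own; the name `sortAndCount` is mine):

```haskell
-- Reproduced from the post: sorted merge plus inversion count.
mergeAndCount :: Ord a => [a] -> [a] -> ([a], Int)
mergeAndCount xs ys = go xs (length xs) ys
  where
    go [] _ ys = (ys, 0)
    go xs _ [] = (xs, 0)
    go (x:xs) n (y:ys)
      | x <= y    = let (m, i) = go xs (n-1) (y:ys) in (x:m, i)
      | otherwise = let (m, i) = go (x:xs) n ys in (y:m, i + n)

-- Classic O(n log n) divide-and-conquer: count inversions within each
-- half recursively, then count inversions between the halves while merging.
sortAndCount :: Ord a => [a] -> ([a], Int)
sortAndCount as
  | length as <= 1 = (as, 0)
  | otherwise      = (merged, l + r + between)
  where
    (as1, as2)        = splitAt (length as `div` 2) as
    (sorted1, l)      = sortAndCount as1
    (sorted2, r)      = sortAndCount as2
    (merged, between) = mergeAndCount sorted1 sorted2
```

For example, `sortAndCount [3,5,1,4,2]` produces the sorted list together with the inversion count 6, and a reversed list of length 5 yields 10, as expected.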
Do you see how this is an instance of the sparky monoid construction in my previous post? $A$ is the set of sorted lists with merge as the monoid operation; $B$ is the natural numbers under addition. The spark operation takes two sorted lists and counts the number of inversions between them. So the monoid operation on pairs merges the lists, and adds the inversion counts together with the number of inversions between the two lists.
We have to verify that this satisfies the laws. Let a be any sorted list; then we need:

- a `inversionsBetween` [] == 0. This is true since there are never any inversions between a and an empty list. Likewise for [] `inversionsBetween` a == 0.
- a `inversionsBetween` (a1 `merge` a2) == (a `inversionsBetween` a1) + (a `inversionsBetween` a2). This is also true, since a1 `merge` a2 contains exactly the same elements as a1 and a2: any inversion between a and a1 `merge` a2 will be an inversion between a and a1, or between a and a2, and vice versa. The same reasoning shows that (a1 `merge` a2) `inversionsBetween` a == (a1 `inversionsBetween` a) + (a2 `inversionsBetween` a).
Note that both monoids here (sorted lists under merge, and natural numbers under addition) are commutative, but the spark operation isn’t commutative; in fact, any given pair of elements is an inversion between a1 and a2 precisely iff they are not an inversion between a2 and a1. Note also that these operations aren’t idempotent; for example, merging a sorted list with itself produces not the same list, but a new list with two copies of each element.
So let’s see some more Haskell code to implement the entire algorithm in a nicely modular way. First, let’s encode sparky monoids in general. The Sparky class is for pairs of types with a spark operation. As we saw in the example above, sometimes it may be more efficient to compute the monoidal product and the spark at the same time, so we bake that possibility into the class.
> class Sparky a b where
The basic spark operation, with a default implementation that projects the result out of the prodSpark method.
>   (<.>) :: a -> a -> b
>   a1 <.> a2 = snd (prodSpark a1 a2)
prodSpark does the monoidal product and spark at the same time, with a default implementation that just does them separately.
>   prodSpark :: a -> a -> (a,b)
>   default prodSpark :: Semigroup a => a -> a -> (a,b)
>   prodSpark a1 a2 = (a1 <> a2, a1 <.> a2)
Finally we can specify that we have to implement one or the other of these methods.
>   {-# MINIMAL (<.>) | prodSpark #-}
Sparked a b is just a pair type, but with Semigroup and Monoid instances that implement the sparky product.
> data Sparked a b = S { getA :: a, getSpark :: b }
>   deriving Show
>
> class Semigroup a => CommutativeSemigroup a
> class (Monoid a, CommutativeSemigroup a) => CommutativeMonoid a
>
> instance (Semigroup a, CommutativeSemigroup b, Sparky a b) => Semigroup (Sparked a b) where
>   S a1 b1 <> S a2 b2 = S a' (b1 <> b2 <> b')
>     where (a', b') = prodSpark a1 a2
>
> instance (Monoid a, CommutativeMonoid b, Sparky a b) => Monoid (Sparked a b) where
>   mempty = S mempty mempty
Now we can make instances for sorted lists under merge…
> newtype Sorted a = Sorted [a]
>   deriving Show
>
> instance Ord a => Semigroup (Sorted a) where
>   Sorted xs <> Sorted ys = Sorted (merge xs ys)
> instance Ord a => Monoid (Sorted a) where
>   mempty = Sorted []
>
> instance Ord a => CommutativeSemigroup (Sorted a)
> instance Ord a => CommutativeMonoid (Sorted a)
…and for inversion counts.
> newtype InvCount = InvCount Int
>   deriving newtype Num
>   deriving (Semigroup, Monoid, Show) via Sum Int
>
> instance CommutativeSemigroup InvCount
> instance CommutativeMonoid InvCount
Finally we make the Sparky (Sorted a) InvCount instance, which is just mergeAndCount (some conversion between newtypes is required, but we can get the compiler to do it automagically via coerce and a bit of explicit type application).
> instance Ord a => Sparky (Sorted a) InvCount where
>   prodSpark = coerce (mergeAndCount @a)
And here’s a function to turn a single a value into a sorted singleton list paired with an inversion count of zero, which will come in handy later.
> single :: a -> Sparked (Sorted a) InvCount
> single a = S (Sorted [a]) 0
Finally, we can make some generic infrastructure for doing monoidal folds. First, Parens a encodes lists of a which have been explicitly associated, i.e. fully parenthesized:
> data Parens a = Leaf a | Node (Parens a) (Parens a)
>   deriving Show
We can make a generic fold for Parens a values, which maps each Leaf into the result type b, and replaces each Node with a binary operation:
> foldParens :: (a -> b) -> (b -> b -> b) -> Parens a -> b
> foldParens lf _ (Leaf a) = lf a
> foldParens lf nd (Node l r) = nd (foldParens lf nd l) (foldParens lf nd r)
Now for a function which splits a list in half recursively to produce a balanced parenthesization.
> balanced :: [a] -> Parens a
> balanced [] = error "List must be nonempty"
> balanced [a] = Leaf a
> balanced as = Node (balanced as1) (balanced as2)
>   where (as1, as2) = splitAt (length as `div` 2) as
Finally, we can make a balanced variant of foldMap: instead of just mapping a function over a list and then reducing with mconcat, as foldMap does, it first creates a balanced parenthesization for the list and then reduces via the given monoid. This will always give the same result as foldMap due to associativity, but in some cases it may be more efficient.
> foldMapB :: Monoid m => (e -> m) -> [e] -> m
> foldMapB leaf = foldParens leaf (<>) . balanced
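As a quick sanity check that the balanced fold agrees with the ordinary one, here is the fold machinery again as a standalone sketch (duplicated from the definitions above so it runs on its own), applied to the Sum monoid:

```haskell
import Data.Monoid (Sum(..))

data Parens a = Leaf a | Node (Parens a) (Parens a)
  deriving Show

foldParens :: (a -> b) -> (b -> b -> b) -> Parens a -> b
foldParens lf _  (Leaf a)   = lf a
foldParens lf nd (Node l r) = nd (foldParens lf nd l) (foldParens lf nd r)

balanced :: [a] -> Parens a
balanced []  = error "List must be nonempty"
balanced [a] = Leaf a
balanced as  = Node (balanced as1) (balanced as2)
  where (as1, as2) = splitAt (length as `div` 2) as

-- By associativity, foldMapB must agree with foldMap for any lawful Monoid.
foldMapB :: Monoid m => (e -> m) -> [e] -> m
foldMapB leaf = foldParens leaf (<>) . balanced
```

For example, `foldMapB Sum [1..100]` and `foldMap Sum [1..100]` both yield `Sum 5050`; only the shape of the reduction tree differs.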
Let’s try it out!
λ> :set +s
λ> getSpark $ foldMap single [3000, 2999 .. 1 :: Int]
Sum {getSum = 4498500}
(34.94 secs, 3,469,354,896 bytes)
λ> getSpark $ foldMapB single [3000, 2999 .. 1 :: Int]
Sum {getSum = 4498500}
(0.09 secs, 20,016,936 bytes)
Empirically, it does seem that we are getting quadratic performance with normal foldMap, but $O(n \log n)$ performance with foldMapB. We can verify that we are getting the correct inversion count in either case, since we know there should be $n(n-1)/2$ inversions when the list is reversed, and sure enough, $3000 \cdot 2999 / 2 = 4498500$.
The bigger context here is thinking about different ways of putting a monoid structure on a product type $A \times B$, assuming $A$ and $B$ are themselves monoids. Previously I knew of two ways to do this. The obvious way is to just have the two sides operate in parallel, without interacting at all: $(a_1, b_1) \otimes (a_2, b_2) = (a_1 \diamond a_2,\; b_1 + b_2)$, and so on. Alternatively, if $A$ acts on $B$, we can form the semidirect product.
But here is a third way. Suppose $A$ is a monoid, and $B$ is a commutative monoid. To connect them, we also suppose there is another binary operation $\cdot : A \times A \to B$, which I will pronounce “spark”. The way I like to think of it is that two values of type $A$, in addition to combining to produce another value of type $A$ via the monoid operation, also produce a “spark” of type $B$. That is, values of type $B$ somehow capture information about the interaction between pairs of values of type $A$.
Sparking needs to interact sensibly with the monoid structures on $A$ and $B$: in particular, fixing either argument must result in a monoid homomorphism from $A$ to $B$. That is, for any choice of $a$, we have $a \cdot \varepsilon = 0$ (i.e. $\varepsilon$ is boring and can never produce a nontrivial spark with anything else), and $a \cdot (a_1 \diamond a_2) = (a \cdot a_1) + (a \cdot a_2)$ (i.e. sparking commutes with the monoid operations). Similar laws should hold if we fix the second argument of $\cdot$ instead of the first.
Given all of this, we can now define a monoid on $A \times B$ as follows:

$(a_1, b_1) \otimes (a_2, b_2) = (a_1 \diamond a_2,\; b_1 + b_2 + a_1 \cdot a_2)$

That is, we combine the $A$ values normally, and then we combine the $B$ values together with the “spark” of type $B$ produced by the two $A$ values.
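To make the construction concrete, here is a standalone Haskell sketch with the three operations passed explicitly; the names `sparkyProd` and `example`, and the toy lists-with-length-product spark, are mine, purely for illustration:

```haskell
-- The sparky product on pairs: combine the A parts, then combine the
-- B parts together with the spark of the two A parts.
sparkyProd :: (a -> a -> a) -> (b -> b -> b) -> (a -> a -> b)
           -> (a, b) -> (a, b) -> (a, b)
sparkyProd comb plus spark (a1, b1) (a2, b2) =
  (comb a1 a2, plus (plus b1 b2) (spark a1 a2))

-- Toy instance: A = lists under (++), B = Int under (+), and the spark
-- of two lists is the product of their lengths.  The laws hold: the
-- empty list sparks to 0 with anything, and
--   length (xs ++ ys) * length zs == length xs * length zs
--                                  + length ys * length zs.
example :: [([Int], Int)] -> ([Int], Int)
example = foldr1 (sparkyProd (++) (+) (\xs ys -> length xs * length ys))
```

Folding $n$ singleton lists (each paired with 0) yields $n(n-1)/2$ in the second component: one spark for every unordered pair.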
Let’s see that this does indeed define a valid monoid:
The identity element is $(\varepsilon, 0)$, since

$(\varepsilon, 0) \otimes (a, b) = (\varepsilon \diamond a,\; 0 + b + \varepsilon \cdot a) = (a,\; 0 + b + 0) = (a, b).$

Notice how we used the identity laws for $A$ (once) and $B$ (twice), as well as the law that says $\varepsilon \cdot a = 0$. The proof that $(\varepsilon, 0)$ is a right identity for $\otimes$ is similar.
To see that the combining operation is associative, we can reason as follows:

$\begin{aligned} ((a_1,b_1) \otimes (a_2,b_2)) \otimes (a_3,b_3) &= (a_1 \diamond a_2,\; b_1 + b_2 + a_1 \cdot a_2) \otimes (a_3,b_3) \\ &= ((a_1 \diamond a_2) \diamond a_3,\; b_1 + b_2 + a_1 \cdot a_2 + b_3 + (a_1 \diamond a_2) \cdot a_3) \\ &= (a_1 \diamond a_2 \diamond a_3,\; b_1 + b_2 + b_3 + a_1 \cdot a_2 + a_1 \cdot a_3 + a_2 \cdot a_3) \end{aligned}$

and

$\begin{aligned} (a_1,b_1) \otimes ((a_2,b_2) \otimes (a_3,b_3)) &= (a_1,b_1) \otimes (a_2 \diamond a_3,\; b_2 + b_3 + a_2 \cdot a_3) \\ &= (a_1 \diamond (a_2 \diamond a_3),\; b_1 + b_2 + b_3 + a_2 \cdot a_3 + a_1 \cdot (a_2 \diamond a_3)) \\ &= (a_1 \diamond a_2 \diamond a_3,\; b_1 + b_2 + b_3 + a_1 \cdot a_2 + a_1 \cdot a_3 + a_2 \cdot a_3). \end{aligned}$

In a formal proof one would have to also explicitly note uses of associativity of $\diamond$ and $+$ (and commutativity of $+$), but I didn’t want to be that pedantic here.
In addition, if $A$ is a commutative monoid and the spark operation is commutative (that is, $a_1 \cdot a_2 = a_2 \cdot a_1$), then the resulting monoid on $A \times B$ will be commutative as well.
The proof of associativity gives us a bit of insight into what is going on here. Notice that when reducing $(a_1,b_1) \otimes (a_2,b_2) \otimes (a_3,b_3)$, we end up with all possible sparks between pairs of $a_i$’s, i.e. $a_1 \cdot a_2 + a_1 \cdot a_3 + a_2 \cdot a_3$, and one can prove that this holds more generally. In particular, if we start with a list of $A$ values

$a_1, a_2, \ldots, a_n,$

then inject them all into $A \times B$ by pairing them with $0$:

$(a_1, 0), (a_2, 0), \ldots, (a_n, 0),$

and finally fold this list with $\otimes$, the second element of the resulting pair is

$\sum_{1 \le i < j \le n} a_i \cdot a_j,$

that is, the combination (via the monoid on $B$) of the sparks between all possible pairings of the $a_i$. Of course there are $n(n-1)/2$ such pairings: the point is that whereas computing this via a straightforward fold over the list may well take $O(n^2)$ time, by using a balanced fold (i.e. splitting the list in half recursively and then combining from the leaves up) it may be possible to compute it in $O(n \log n)$ time instead (at least, this is the case for the example I have in mind!).
Please leave a comment if you can think of any instances of this pattern, or if you have seen this pattern discussed elsewhere. In a future post I will write about the instance I had in mind when coming up with this generalized formulation.
This problem and its solution have been well-studied in the context of combinatorics; we were wondering what additional insight could be gained from a higher-level functional approach. You can get a preprint here.
mupf joined the #haskell-ops channel yesterday to tell us that Ertugrul Söylemez died unexpectedly on May 12. This is a great loss for the Haskell community. Going variously by mm_freak, ertes, ertesx, or ertes-w, Ertugrul had been very active on the #haskell IRC channel for almost 11 years—since June 2007—and he will be remembered as a knowledgeable, generous, and patient teacher. In addition to teaching on IRC, he wrote several tutorials, such as this one on foldr and this one on monoids. The thing Ertugrul was probably most well-known for in the Haskell community was his netwire FRP package, which he had developed since 2011, and more recently its successor package wires. I think it is no exaggeration to say that his work on and insights into FRP helped shape a whole generation of approaches.
He will be greatly missed. I will leave the comments open for anyone to share memories of Ertugrul, his writing, or his code.
The main summer school (July 9–21) has a focus on parallelism and concurrency, but for those with less background, there is also a one-week introduction on the foundations of programming languages (July 3–8). If you want, you can actually choose to attend only the July 3–8 session (or both, or just July 9–21 of course). Attending just the introductory session could be a great option for someone who knows some functional programming but wants a more solid grounding in the underlying theory.
If you’re a member of the Haskell/FP community who is thinking of applying, and I’m familiar with you or your work, I would be happy to write you a recommendation if you need one.
At the time, I was reading part of Janis Voigtländer’s habilitation thesis. Unsure where to even start, I decided to just answer straightforwardly: “I’m reading a very long story about free theorems.”
He persisted. “What are free theorems?”
Never one to shrink from a pedagogical challenge, I thought for a moment, then began: “Do you know what a function is?” He didn’t. “A function is like a machine where you put something in one end and something comes out the other end. For example, maybe you put a number in, and the number that is one bigger comes out. So if you put in three, four comes out, or if you put in six, seven comes out.” This clearly made sense to him, so I continued, “The type of a function machine tells you what kinds of things you put in and what kinds of things come out. So maybe you put a number in and get a number out. Or maybe you put in a list of numbers and get a number out.” He interrupted excitedly, “Or maybe you could put words in??” “Yes, exactly! Maybe you can put words in and get words out. Or maybe there is a function machine where you put other function machines in and get function machines out!” He gasped in astonishment at the idea of putting function machines into function machines.
“So,” I concluded, “a free theorem is when you can say something that is always true about a function machine if you only know its type, but you don’t know anything about what it does on the inside.” This seemed a bit beyond him (and to be fair, free theorems are only interesting when polymorphism is involved which I definitely didn’t want to go into). But the whole conversation had given me a different idea.
“Hey, I have a good idea for a game,” I said. “It’s called the function machine game. I will think of a function machine. You tell me things to put into the function machine, and I will tell you what comes out. Then you have to guess what the function machine does.” He immediately liked this game and it has been a huge hit; he wants to play it all the time. We played it while driving to a party yesterday, and we played it this morning while I was in the shower. So far, he has correctly guessed:
I tried but that was a bit tough for him. I realized that in some cases he may understand intuitively what the function does but have trouble expressing it in words (this was also a problem with ), so we started using the obvious variant where once the guesser thinks they know what the function does, the players switch roles and the person who came up with function specifies some inputs in order to test whether the guesser is able to produce the correct outputs.
was also surprisingly difficult for him to guess (though he did get it right eventually). I think he was just stuck on the idea of the function doing something arithmetical to the input, and was having trouble coming up with some sort of arithmetic procedure which would result in no matter what you put in! It simply hadn’t occurred to him that the machine might not care about the input. (Interestingly, many students in my functional programming class this semester were also confused by constant functions when we were learning about the lambda calculus; they really wanted to substitute the input somewhere and were upset/confused by the fact that the bound variable did not occur in the body at all!)
After a few rounds of guessing my functions, he wanted to come up with his own functions for me to guess (as I knew he would). Sometimes his functions are great and sometimes they don’t make sense (usually because his idea of what the function does changes over time, which of course he, in all sincerity, denies), but it’s fun either way. And after he finally understood , he came up with his own function which was something like
inspired, I think, by his kindergarten class where they were learning about pairs of numbers that added up to .
Definitely one of my better parenting days.
Please consider submitting something! As in previous years, we are accepting a wide range of types of submissions. Paper and demo submissions are due by June 28: you can submit original research, a technology tutorial, a call for collaboration, or propose some sort of live demo. In addition, we also solicit performances to be given during the associated performance evening—performance proposals are due by July 8.
See the website for more info, and let me know if you have any questions!