foldr is made of monoids

> import Data.Monoid

In his recent blog post “What is foldr made of?”, Tom Ellis made the clever observation that foldr is equivalent in power to the combination of map and compose, where

> compose :: [a -> a] -> a -> a
> compose []     = id
> compose (f:fs) = f . compose fs

We can then write

> foldr' :: (a -> b -> b) -> b -> [a] -> b
> foldr' g z xs = compose (map g xs) z

which (as Tom proves) is exactly equivalent to the usual foldr.
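
As a quick sanity check (a small example of my own, not part of Tom’s proof; the name checkFoldr' is just for illustration), the two give the same answer on a concrete list:

> checkFoldr' :: Bool
> checkFoldr' = foldr' (+) 0 [1,2,3 :: Int] == foldr (+) 0 [1,2,3]  -- True: both are 6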

This is a really neat idea which I don’t remember ever seeing before. But now that I have, I realized that I could expand on it a bit—compose doesn’t come out of thin air quite as much as it might first appear.

Consider the standard function mconcat (from Data.Monoid), which combines a list of values from some Monoid:

mconcat :: Monoid m => [m] -> m
mconcat []     = mempty
mconcat (m:ms) = m `mappend` mconcat ms

I claim that in the presence of map, compose and mconcat are equivalent. Why is that? First, it’s easy to see that compose can be implemented in terms of mconcat—we just instantiate it to the monoid of endofunctions (where mempty = id and mappend = (.)), with a little bit of syntactic noise due to the need to convert in and out of the Endo newtype:

> compose' :: [a -> a] -> a -> a
> compose' = appEndo . mconcat . map Endo

Proving that compose' is the same as compose is not hard; I leave it to you as an exercise.
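
Not a proof, of course, but as a quick spot-check (again my own example, with a throwaway name checkCompose) that the two definitions agree on a small input:

> checkCompose :: Bool
> checkCompose = compose' [(+1), (*2)] 3 == compose [(+1), (*2)] (3 :: Int)  -- True: both are 7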

Implementing mconcat in terms of compose is a bit more interesting:

> mconcat' :: Monoid m => [m] -> m
> mconcat' = ($mempty) . compose . map mappend

The key idea is that we can turn any value from some monoidal type m into a function m -> m by partially applying mappend; composing these functions then corresponds to combining the original values, and the final value can be recovered by applying the resulting function to mempty.¹ That is,

m_1 \diamond m_2 \diamond \dots \diamond m_n = ((m_1 \diamond) \circ (m_2 \diamond) \circ \dots \circ (m_n \diamond))\ \varepsilon

(where I have used \diamond and \varepsilon to represent mappend and mempty, respectively). Written out this way, I hope you can see why the equality holds by thinking about what the composition on the right evaluates to (and remembering the right identity law, x \diamond \varepsilon = x).
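
To see the trick on a concrete monoid (a small example of my own, using Sum from Data.Monoid; the name sumExample is just for illustration):

> sumExample :: Sum Int
> sumExample = mconcat' [Sum 1, Sum 2, Sum 3]  -- Sum {getSum = 6}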

So we can also say that foldr = map + mconcat! This gets at the idea that lists are free (or initial) monoids, which intuitively means that of all monoids, they “preserve the most information”—up to the monoid laws, combining lists preserves all the information about the original lists and how you have combined them. This also means that there is a canonical way to “convert” a list into any other Monoid: that is, given a mapping f :: a -> m, there is a canonical way to take a list [a] and turn it into an m, namely, mconcat . map f.

Let’s make the connection to foldr even more explicit. First, let’s swap around the order of arguments and add some parentheses which aren’t strictly necessary:

foldr'' :: (a -> (b -> b)) -> [a] -> (b -> b)
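
The definition is not really needed for the point being made here, but if you want something to load into GHCi, one way to fill it in (mirroring foldr' above) is:

> foldr'' :: (a -> (b -> b)) -> [a] -> (b -> b)
> foldr'' g = compose . map g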

Hmm, b -> b… why does this look so familiar? I claim we can actually pull a similar trick² as with compose/mconcat, and replace b -> b with Monoid m => m to come up with a function equivalent to foldr'' (and hence foldr as well):

fold :: Monoid m => (a -> m) -> [a] -> m

Hmm, so how would this be implemented? Let’s see…

> fold :: Monoid m => (a -> m) -> [a] -> m
> fold f = mconcat . map f

So actually, fold itself (and hence, equivalently, foldr) is the canonical mapping from a list down to any monoid which I mentioned earlier! And here we can see quite directly that fold is indeed equivalent to mconcat + map.
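
For instance (two small examples of my own; the names foldSum and foldString are just for illustration), instantiating fold at the Sum monoid and at the list monoid of Strings:

> foldSum :: Sum Int
> foldSum = fold Sum [1,2,3]             -- Sum {getSum = 6}
> foldString :: String
> foldString = fold show [1,2,3 :: Int]  -- "123"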

In summary: foldr is equivalent to map plus compose/mconcat, because lists are free monoids.


  1. Incidentally, this amounts to saying that mappend itself is a monoid homomorphism (a structure-preserving map between monoids), and this is where difference lists come from (there is a small sketch of that idea just after these notes). For more, see my recent paper, Monoids: Theme and Variations.

  2. I leave this as an exercise too.
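
To make the difference-list connection in note 1 slightly more concrete, here is a minimal sketch (the names DList, toDList, fromDList, and appendDL are mine, not from the post): representing a list xs by the partially applied list mappend, namely (xs ++), turns appending into function composition, which is just the Endo trick specialized to the list monoid.

newtype DList a = DList ([a] -> [a])

toDList :: [a] -> DList a
toDList xs = DList (xs ++)            -- mappend partially applied

fromDList :: DList a -> [a]
fromDList (DList f) = f []            -- apply to mempty ([]) to recover the list

appendDL :: DList a -> DList a -> DList a
appendDL (DList f) (DList g) = DList (f . g)   -- appending is now composition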

15 Responses to foldr is made of monoids

  1. Steven Stewart-Gallus says:

    I am really confused. Does this mean that the difference list optimization can be done to other data structures too?

    I know the following data structure is a monoid, and has a function similar to mconcat (because it is a tree). Does the same idea work with it as well?

    data Tree a = LNil | RNil | Fork a (Tree a) (Tree a)

    Its fold can be massaged to have the type (a -> (b, b) -> b) -> Tree a -> (b, b) -> b. The type (b, b) -> b is a monoid.

    I still can’t get what the type for the optimized version of tree would be.

    • Steven Stewart-Gallus says:

      * I meant to say that (b, b) -> b is the monoid not the Tree structure.

      • Brent says:

        I don’t understand what you mean. In what sense is (b,b) -> b a monoid? How would you implement mappend :: ((b,b) -> b) -> ((b,b) -> b) -> ((b,b) -> b)?

    • Brent says:

      No, the difference list optimization can be done to any monoid. So you can do it with data structures that have a monoid instance, but your Tree is not such a data type (at least not in any obvious way that I can see). You say “the following data structure is a monoid” — what exactly would mappend do (say, when given two Forks)?

    • For a generalization of difference lists, see the “Asymptotic Improvement of Computations over Free Monads” paper.

  2. That signature is the same as Data.Foldable.foldMap. Is that the same function?

    • Brent says:

      Yes, it is the same function, specialized to lists. But note that lists really are special: in some sense you can think of Foldable as the class of types which have a canonical implementation of toList. The generic version of foldMap is equivalent to (but often more efficient than) calling toList followed by fold on the list. From another point of view, note that lists are the only type for which the catamorphism and Data.Foldable.foldMap coincide. For all other types, foldMap is actually a weaker operation than the catamorphism, because it forces you to “forget all the non-list structure”.

    • Brent says:

      Nice observation! Yes, I think it is exactly analogous to Cayley’s theorem, specialized to monoids instead of groups. I suspect there is also a relationship to the Yoneda lemma but I am not sure how to make it precise.

  3. Felipe Lessa says:

    I think you meant “(where mempty = id and mappend = (.))” =).

  4. Ruud Koot says:

    This reminds me of Exercise 2.11 from Burstall’s Computational Category Theory: “Notice that the definition of list processing functions like length, member, sum and maplist all have the same general form. This may be explained by the fact that lists form a free monoid. The monoid structure is that of concatenating lists with the empty list as the identity. There is a function h : A → list(A) which returns the singleton list on an element. The freeness is expressed by the unique existence of a homomorphism as follows: For any monoid (B, ∗, e) and function f : A → B, there is a unique homomorphism f# from the monoid of lists to (B, ∗, e) such that the following commutes: f# ∘ h = f. Given the monoid (B, ∗, e), the map f → f# can be constructed and the construction can be expressed as a program in ML. Representing the target monoid as a pair of a binary function and a constant, write this program. It is an example of an encapsulation of recursion, in that functions (like those mentioned above) normally defined through explicit recursion, can be obtained by passing suitable parameters to this program.”

    E.g. one can define length as:

    type Monoid b = (b -> b -> b, b)

    the_map :: Monoid b -> (a -> b) -> ([a] -> b)
    the_map ((*), e) = \f -> foldr (*) e . map f

    sum_monoid :: Monoid Int
    sum_monoid = ((+), 0)

    f_length = const 1
    length = the_map sum_monoid f_length

  5. Pingback: Okay, this is cool.

  6. Very nice blog post. I have two reflections:

    The function fold very neatly captures the essence of the MapReduce framework. The function argument is the function being mapped and the use of mappend corresponds to reduce.

    The generalization to the scan function is a nice way to think about parallel prefix computations.
    scan :: Monoid m => (a -> m) -> [a] -> [m]
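
    One way to realize this type (just a sketch; the comment gives only the signature) is as a left-to-right prefix computation:

    scan :: Monoid m => (a -> m) -> [a] -> [m]
    scan f = scanl mappend mempty . map f   -- all prefix combinations, starting from mempty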
