In my last post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, which returns two functions: one, level :: v > Maybe Int
, gives the level (i.e. length of a shortest path to) of each vertex, and parent :: v > Maybe v
gives the parent of each vertex in the BFS forest. Before showing an implementation, I wanted to talk a bit more about this API and why I chose it.
In particular, Andrey Mokhov left a comment on my previous post with some alternative APIs:
bfsForest :: Ord a => [a] > AdjacencyMap a > Forest a
bfs :: Ord a => [a] > AdjacencyMap a > [[a]]
Of course, as Andrey notes, AdjacencyMap
is actually a reified graph data structure, which we don’t want here, but that’s not essential; presumably the AdjacencyMap
arguments in Andrey’s functions could easily be replaced by an implicit graph description instead. (Note that an API requiring an implicit representation is strictly more powerful, since if you have an explicit representation you can always just pass in a function which does lookups into your explicit representation.) However, Andrey raises a good point. Both these APIs return information which is not immediately available from my API.

bfsForest
returns an actual forest we can traverse, giving the children of each node. My API only returns aparent
function which gives the parent of each node. These contain equivalent information, however, and we can convert back and forth efficiently (where by “efficiently” in this context I mean “in time or better”) as long as we have a list of all vertices. To convert from aForest
to a parent function, just traverse the forest and remember all the parentchild pairs we see, building e.g. aMap
that can be used for lookup. To convert back, first iterate over the list of all vertices, find the parent of each, and build an inverse mapping from parents to sets of children. If we want to proceed to building an actualForest
data structure, we can unfold one via repeated lookups into our child mapping.However, I would argue that in typical applications, having the
parent
function is more useful than having aForest
. For example, theparent
function allows us to efficiently answer common, classic queries such as “Is vertexv
reachable from vertexs
?” and “What is a shortest path froms
tov
?” Answering these questions with aForest
would require traversing the entireForest
to look for the target vertexv
. 
bfs
returns a list of levels: that is, the first list is the starting vertices, the next list is all vertices one step away from any starting vertex, the next list is all vertices two steps away, and so on. Again, given a list of all vertices, we can recover a list of levels from thelevel
function: just traverse the list of all vertices, looking up the level of each and adding it to an appropriate mapping from levels to sets of vertices. Converting in the other direction is easy as well.A level list lets us efficiently answer a queries such as “how many vertices are exactly 5 steps away from
s
”?, whereas with thelevel
function we can efficiently answer queries such as “What is the length of a shortest path froms
tov
?” In practice, the latter form of query seems more common.
In the final version of this BFS API, I will probably include some functions to recover forests and level sets as described above. Some benchmarking will be needed to see whether it’s more efficient to recover them after the fact or to actually keep track of them along the way.
Agreed that your API is better for common pathfinding queries.
I suspect there is a more abstract API that takes an applicative (or monadic?) “callback” to visit nodes in the BFS order, so one can get all three discussed APIs by picking different functors. Same for DFS. My motivation for having something like this is implementing the Hopcroft–Karp bipartite matching algorithm which uses both BFS and DFS in a compositional manner. The current implementation we use in the library can’t reuse our BFS/DFS algorithms, which is a shame.
Ah, interesting idea. So, something like bfs :: Applicative f => (Int > v > v > f v) > Graph v > f v (where the callback is given a vertex along with its level and parent)?
Yes, although I guess it should be `f ()` instead of `f v`? But then… a good old monoid is probably sufficient, i.e.:
bfs :: Monoid m => (Int > v > v > m) > Graph v > m
I feel like we’re still losing some information here because monoids correspond to sequences, whereas we are really after trees. Perhaps, we need a semiring?
Ah, yes, that’s simpler. Technically I think we’re not losing information because we can recover the tree structure from the extra level and parent arguments to the callback. But yes, it would be better if the algebraic structure on m matched the structure of the traversal. I’m not sure we need a semiring. Maybe just something like
class Treeish t where
node :: v > [t v] > t v
bfs :: Treeish t => Graph v > t v
? I think you could easily make a Treeish instance to keep track of levels and parents if you wanted to…
Yes, that seems better!
Pingback: Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)  blog :: Brent > [String]