Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)

In a previous post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, and we also explored some alternatives. I had a very interesting discussion with Andrey Mokhov in the comments about potential designs for an even more general API; more on that in a future post, perhaps!

For today, though, I want to finally show one way to implement this API efficiently. Spoiler alert: this implementation ultimately won’t be fast enough for us, but it will be a helpful stepping stone on our way to a yet faster implementation (which will of course get its own post in due time).

This post is literate Haskell; you can obtain the source from the darcs repo. We begin with a few LANGUAGE pragmas and imports.

{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE ScopedTypeVariables        #-}
{-# LANGUAGE TupleSections              #-}

module BFS where

import           Control.Arrow               ((>>>))
import           Data.Hashable               (Hashable)
import           Data.HashMap.Strict         (HashMap, (!))
import qualified Data.HashMap.Strict         as HM
import           Data.List                   (foldl')
import           Data.Sequence               (Seq (..), ViewL (..), (|>))
import qualified Data.Sequence               as Seq

Now a couple utility functions: (>$>) is just flipped function application, and exhaust iterates an (a -> Maybe a) function as many times as possible, returning the last non-Nothing value.

infixl 0 >$>
(>$>) :: a -> (a -> b) -> b
(>$>) = flip ($)
{-# INLINE (>$>) #-}

exhaust :: (a -> Maybe a) -> a -> a
exhaust f = go
  where
    go a = maybe a go (f a)

Here is the BFSResult record that we ultimately want to produce; it should be familiar from previous posts.

data BFSResult v =
  BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }

While running our BFS, we’ll keep track of three things: the level of each vertex that has been encountered; a mapping from each encountered vertex to its parent; and a queue of vertices that have been encountered but yet to be processed. We use a Seq from Data.Sequence to represent the queue, since it supports efficient (amortized constant-time) insertion and removal from either end of the sequence. There are certainly other potential ways to represent a queue in Haskell (and this probably deserves its own blog post) but Data.Sequence seems to give good performance for minimal effort (and in any case, as we’ll see, it’s not the performance bottleneck here). We use a pair of HashMaps to represent the level and parent maps.

data BFSState v =
  BS { level :: HashMap v Int, parent :: HashMap v v, queue :: Seq v }

Given a list of starting vertices, we can create an initial state, with a queue containing the starting vertices and all of them set to level 0.

initBFSState :: (Eq v, Hashable v) => [v] -> BFSState v
initBFSState vs = BS (HM.fromList (map (,0) vs)) HM.empty (Seq.fromList vs)

Now, here is our imeplementation of BFS, using the API discussed previously: it takes a list of starting vertices, a function giving the neighbors of each vertex, and a function identifying “target vertices” (so we can stop early), and returns a BFSResult record. We create an initial state, run bfsStep as much as possible, and convert the end state into a result.

bfs :: forall v. (Eq v, Hashable v) => [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs vs next goal = toResult $ exhaust bfsStep (initBFSState vs)
  where

Converting the final BFSState into a BFSResult is easy: just return functions that do a lookup into the relevant map.

    toResult BS{..} = BFSR (`HM.lookup` level) (`HM.lookup` parent)

To do a single step of BFS, try to remove the next vertex v from the queue. If the queue is empty, or the next vertex is a goal vertex, return Nothing to signal that we are done.

    bfsStep st@BS{..} = case Seq.viewl queue of
      EmptyL -> Nothing
      v :< q'
        | goal v    -> Nothing

Otherwise, use the next function to find the neighbors of v, keep only those we haven’t encountered before (i.e. those which are not keys in the level map), and use each one to update the BFS state (being sure to first set the queue to the new one with v removed).

        | otherwise ->
          v >$> next >>> filter (not . (`HM.member` level)) >>>
            foldl' (upd v) (st{queue=q'}) >>> Just

To update the BFS state based on a newly visited vertex, we record its parent, insert it into the level map with a level one greater than its parent, and add it to the end of the queue.

    upd p BS{..} v = BS
      (HM.insert v l level)
      (HM.insert v p parent)
      (queue |> v)
      where
        l = level!p + 1

And that’s it! This is good enough to solve many BFS problems on Open Kattis, such as Breaking Bad, ARMPIT Computations, and Folding a Cube. (I will leave you the pleasure of solving these problems yourself; I am especially fond of my Haskell solution to Folding a Cube.)

Unfortunately, it is not fast enough to solve Modulo Solitaire, which I picked specifically because it seems to be one of the most computationally demanding BFS problems I’ve seen. My solution using this HashMap-based implementation solves a bunch of initial test cases, but exceeds the 2 second time limit on one of the later test cases. Next time, I’ll show how to adapt this into an even faster implementation which is actually fast enough to solve Modulo Solitaire.

About Brent

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.

View all posts by Brent →

6 Responses to Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)

Anonymous says:

November 4, 2021 at 12:21 pm

Why not use &?

> import Data.Function
> :t (&)
(&) :: a -> (a -> b) -> b

- Brent says:
  
  November 4, 2021 at 1:59 pm
  
  Good question, I actually couldn’t remember if there is a reason or if it’s just historical accident. But it turns out there is a reason: & and >>> have the same precedence level but opposite associativities (infixl 1 and infixr 1, respectively) so something like x & f >>> g >>> h is a syntax error. I don’t want to have to put parentheses around the entire function pipeline, so instead I just have >$> defined in my template as an infixl 0 variant of &.
  
  - Abhijit Sarkar says:
    
    February 1, 2023 at 7:44 pm
    
    The link to the complete code is broken. Also, it is probably just a matter of preference, but you seem to be focused on typing the minimum number of characters, why else would there be a weird operator for a flipped function? Anytime someone reads the code, they have to dart around to find where it’s defined and what it does, as the name isn’t exactly descriptive.
    
    - Brent says:
      
      February 1, 2023 at 9:30 pm
      
      Thanks, I’ve fixed the link now.
      
      As you say, it’s a matter of preference.
      
Pingback: Competitive programming in Haskell: Enumeration | blog :: Brent -> [String]
Pingback: Competitive programming in Haskell: BFS, part 4 (implementation via STUArray) | blog :: Brent -> [String]