## Competitive programming in Haskell: BFS, part 4 (implementation via STUArray)

In a previous post, we saw one way to implement our BFS API, but I claimed that it is not fast enough to solve Modulo Solitaire. Today, I want to demonstrate a faster implementation. (It’s almost certainly possible to make it faster still; I welcome suggestions!)

Once again, the idea is to replace the `HashMap`s from last time with mutable arrays, but in such a way that we get to keep the same pure API—almost. In order to allow arbitrary vertex types, while storing the vertices efficiently in a mutable array, we will require one extra argument to our `bfs` function, namely, an `Enumeration` specifying a way to map back and forth between vertices and array indices.

So why not instead just restrict vertices to some type that can be used as keys of a mutable array? That would work, but would unnecessarily restrict the API. For example, it is very common to see competitive programming problems that are “just” a standard graph algorithm, but on a non-obvious graph where the vertices are conceptually some more complex algebraic type, or on a graph where the vertices are specified as strings. Typically, competitive programmers just implement a mapping between vertices to integers on the fly—using either some math or some lookup data structures on the side—but wouldn’t it be nicer to be able to compositionally construct such a mapping and then have the graph search algorithm automatically handle the conversion back and forth? This is exactly what the `Enumeration` abstraction gives us.

This post is literate Haskell; you can obtain the source from the darcs repo. The source code (without accompanying blog post) can also be found in my comprog-hs repo.

``````{-# LANGUAGE FlexibleContexts    #-}
{-# LANGUAGE RankNTypes          #-}
{-# LANGUAGE RecordWildCards     #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Graph where

import Enumeration

import           Control.Arrow       ((>>>))
import qualified Data.Array.IArray   as IA
import           Data.Array.ST
import           Data.Array.Unboxed  (UArray)
import qualified Data.Array.Unboxed  as U
import           Data.Array.Unsafe   (unsafeFreeze)
import           Data.Sequence       (Seq (..), ViewL (..), (<|), (|>))
import qualified Data.Sequence       as Seq

infixl 0 >\$>
(>\$>) :: a -> (a -> b) -> b
(>\$>) = flip (\$)
{-# INLINE (>\$>) #-}``````

`exhaustM` is like `exhaust` from the last post, but in the context of an arbitrary `Monad`. Each step will now be able to have effects (namely, updating mutable arrays) so needs to be monadic.

``````exhaustM :: Monad m => (a -> m (Maybe a)) -> a -> m a
exhaustM f = go
where
go a = do
ma <- f a
maybe (return a) go ma``````

The `BFSResult` type is the same as before.

``````data BFSResult v =
BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }``````

Instead of using `HashMap`s in our `BFSState` as before, we will use `STUArray`s.1 These are unboxed, mutable arrays which we can use in the `ST` monad. Note we also define `V` as a synonym for `Int`, just as a mnemonic way to remind ourselves which `Int` values are supposed to represent vertices.

``````type V = Int
data BFSState s =
BS { level :: STUArray s V Int, parent :: STUArray s V V, queue :: Seq V }``````

To initialize a BFS state, we allocate new mutable level and parent arrays (initializing them to all $-1$ values), and fill in the `level` array and queue with the given start vertices. Notice how we need to be explicitly given the size of the arrays we should allocate; we will get this size from the `Enumeration` passed to `bfs`.

``````initBFSState :: Int -> [V] -> ST s (BFSState s)
initBFSState n vs = do
l <- newArray (0,n-1) (-1)
p <- newArray (0,n-1) (-1)

forM_ vs \$ \v -> writeArray l v 0
return \$ BS l p (Seq.fromList vs)``````

The `bfs'` function implements the BFS algorithm itself. Notice that it is not polymorphic in the vertex type; we will fix that with a wrapper function later. If you squint, the implementation looks very similar to the implementation of `bfs` from my previous post, with the big difference that everything has to be in the `ST` monad now.

``````bfs' :: Int -> [V] -> (V -> [V]) -> (V -> Bool) -> ST s (BFSState s)
bfs' n vs next goal = do
st <- initBFSState n vs
exhaustM bfsStep st
where
bfsStep st@BS{..} = case Seq.viewl queue of
EmptyL -> return Nothing
v :< q'
| goal v -> return Nothing
| otherwise -> v >\$> next >>> filterM (fmap not . visited st)
>=> foldM (upd v) (st{queue=q'}) >>> fmap Just

upd p b@BS{..} v = do
writeArray level v (lp + 1)
writeArray parent v p
return \$ b{queue = queue |> v}

visited :: BFSState s -> V -> ST s Bool
visited BS{..} v = (/= -1) <\$> readArray level v
{-# INLINE visited #-}``````

The `bfs` function is a wrapper around `bfs'`. It presents the same API as before, with the exception that it requires an extra `Enumeration v` argument, and uses it to convert vertices to integers for the inner `bfs'` call, and then back to vertices for the final result. It also handles freezing the mutable arrays returned from `bfs'` and constructing level and parent lookup functions that index into them. Note, the use of `unsafeFreeze` seems unavoidable, since `runSTUArray` only allows us to work with a single mutable array; in any case, it is safe for the same reason the use of `unsafeFreeze` in the implementation of `runSTUArray` itself is safe: we can see from the type of `toResult` that the `s` parameter cannot escape, so the type system will not allow any further mutation to the arrays after it completes.

``````bfs :: forall v. Enumeration v -> [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs Enumeration{..} vs next goal
= toResult \$ bfs' card (map locate vs) (map locate . next . select) (goal . select)
where
toResult :: (forall s. ST s (BFSState s)) -> BFSResult v
toResult m = runST \$ do
st <- m
(level' :: UArray V Int) <- unsafeFreeze (level st)
(parent' :: UArray V V) <- unsafeFreeze (parent st)
return \$
BFSR
((\l -> guard (l /= -1) >> Just l) . (level' IA.!) . locate)
((\p -> guard (p /= -1) >> Just (select p)) . (parent' IA.!) . locate)``````

Incidentally, instead of adding an `Enumeration v` argument, why don’t we just make a type class `Enumerable`, like this?

``````class Enumerable v where
enumeration :: Enumeration v

bfs :: forall v. Enumerable v => [v] -> ...``````

This would allow us to keep the same API for BFS, up to only different type class constraints on `v`. We could do this, but it doesn’t particularly seem worth it. It would typically require us to make a `newtype` for our vertex type (necessitating extra code to map in and out of the `newtype`) and to declare an `Enumerable` instance; in comparison, the current approach with an extra argument to `bfs` requires us to do nothing other than constructing the `Enumeration` itself.

Using this implementation, `bfs` is finally fast enough to solve Modulo Solitaire, like this:

``````main = C.interact \$ runScanner tc >>> solve >>> format

data Move = Move { a :: !Int, b :: !Int } deriving (Eq, Show)
data TC = TC { m :: Int, s0 :: Int, moves :: [Move] } deriving (Eq, Show)

tc :: Scanner TC
tc = do
m <- int
n <- int
TC m <\$> int <*> n >< (Move <\$> int <*> int)

type Output = Maybe Int

format :: Output -> ByteString
format = maybe "-1" showB

solve :: TC -> Output
solve TC{..} = getLevel res 0
where
res = bfs (finiteE m) [s0] (\v -> map (step m v) moves) (==0)

step :: Int -> Int -> Move -> Int
step m v (Move a b) = (a*v + b) `mod` m
{-# INLINE step #-}``````

It’s pretty much unchanged from before, except for the need to pass an `Enumeration` to `bfs` (in this case we just use `finiteE m`, which is the identity on the interval $[0 .. m)$).

## Some remaining questions

This is definitely not the end of the story.

• Submitting all this code (BFS, `Enumeration`, and the above solution itself) as a single file gives a 2x speedup over submitting them as three separate modules. That’s annoying—why is that?

• Can we make this even faster? My solution to Modulo Solitaire runs in 0.57s. There are faster Haskell solutions (for example, Anurudh Peduri has a solution that runs in 0.32s), and there are Java solutions as fast as 0.18s, so it seems to me there ought to be ways to make it much faster. If you have an idea for optimizing this code I’d be very interested to hear it! I am far from an expert in Haskell optimization.

• Can we generalize this nicely to other kinds of graph search algorithms (at a minimum, DFS and Dijkstra)? I definitely plan to explore this question in the future.

## For next time: Breaking Bad

Next time, I want to look at a few other applications of this BFS code (and perhaps see if we can improve it along the way); I challenge you to solve Breaking Bad.

1. Why not use `Vector`, you ask? It’s probably even a bit faster, but the `vector` library is not supported on as many platforms.↩︎

Assistant Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in competitive programming, haskell and tagged , , , , , , . Bookmark the permalink.

### 3 Responses to Competitive programming in Haskell: BFS, part 4 (implementation via STUArray)

1. Anurudh Peduri says:

Hey, so the only difference I see between my solution and yours is that I used an `STUArray` to simulate the queue, with head & tail indices. I feel this is much faster than `Seq`, as we know the total size required (`|V|`) ahead of time.

• Brent says:

Ah, of course! Great idea.

Unfortunately, I tried implementing it in a general way and my solution is now *slower*! I really don’t know why. I guess I should try to profile it.

2. Aaron Allen says:

Just finished solving “Breaking Bad” and found it very enjoyable. As of now, I’ve got the fastest Haskell solution at 0.27s. Here is my code: https://gist.github.com/aaronallen8455/8558594a214786b916a14b82d24f9494

This site uses Akismet to reduce spam. Learn how your comment data is processed.