Competitive programming in Haskell: better binary search

Binary search is a workhorse of competitive programming. There are occasional easy problems where binary search is the solution in and of itself; more often, it’s used as a primitive building block of more complex algorithms. It is often presented as a way to find the index of something in a sorted array in O(\lg n) time, and many languages have such a thing in their standard library (for example, see Arrays.binarySearch in Java, the bisect library in Python, or the binary_search function in C++). However, the idea of binary search is more general than searching in a sorted array; we’re doing binary search any time we repeatedly halve a search interval. For example, we can use it to find the smallest or largest number with a given property, or to find an optimal, “just right” measurement that is neither too small nor too big.

Generic binary search with first-class functions, take 1

A language with easy access to first-class functions provides a great opportunity to generalize binary search properly. For example, here’s a version of binary search that has lived in my competitive programming solution template for a long time:

-- Discrete binary search.  Find the smallest integer in [lo,hi] such
-- that monotone predicate p holds.
binarySearchD :: Int -> Int -> (Int -> Bool) -> Int
binarySearchD lo hi p
  | lo == hi = lo
  | p mid     = binarySearchD lo mid p
  | otherwise = binarySearchD (mid+1) hi p
  where
    mid = (lo + hi) `div` 2

The key generalization is that it takes a predicate of type Int -> Bool as an argument. Note that in order for binary search to work, the predicate p must be monotonic. This means, intuitively, that p starts out False, and once it switches to True it never goes back. (Formally, p being monotonic means that for all x and y, if x <= y then p x <= p y, where False <= True). This is how we can tell if we’re “too low” or “too high”: we’re “too low” when p is False and “too high” when it is True.

This is definitely an improvement over array-specific versions of binary search. We can still use it to search in an array by providing a predicate that does an array lookup, but we can use it for other things as well.

I should note at this point that there is a very nice binary-search package published on Hackage, which I definitely recommend if you need binary search in some kind of project. However, for the purposes of competitive programming, we can’t rely on that package being available, and we’d also like something a bit simpler, so we don’t have to read the documentation every time we want to use it.

…can we do better?

So my binarySearchD function works fine as far as it goes, and I have used it regularly, but there are still several things about it that always annoyed me:

  • What if we want a slight variation, such as the largest integer such that something holds? Or the last integer where the predicate doesn’t hold? etc.? It is possible to use binarySearchD in these situations, but I find it tricky and error-prone to figure out how. And when I’m reaching for some function as a building block of a bigger algorithm I definitely don’t want to waste time and brainpower having to think carefully about small details like this.

  • Getting the implementation right in the first place was kind of tricky. Should we use mid+1? mid-1? Should we think in terms of a closed interval [lo,hi], or a half-open interval [lo,hi), or…? How can we convince ourselves our implementation is completely correct, and won’t get stuck in infinite recursion?

  • What if we want to do binary search over a continuous domain, like Double? We have to make a completely separate function, for example, like this:

-- Continuous binary search.  Given a tolerance eps, an interval
-- [a,b], a continuous, monotonically increasing function f, and a
-- target value tgt, find c ∈ [a,b] such that f(c) = tgt.
binarySearch :: (Fractional t, Ord t, Ord a) => t -> t -> t -> (t -> a) -> a -> t
binarySearch eps a b f tgt = go a b
  where
    go lo hi
      | hi-lo < eps = mid
      | f mid < tgt = go mid hi
      | otherwise   = go lo mid
      where
        mid = (lo + hi)/2

(Actually, I’m not sure why I wrote that version in terms of finding a “target” value. In practice I suppose continuous binary search often comes up that way, but looking at it now it seems less general. In any case, we’re going to throw this function away very shortly so it doesn’t really matter!)

Recently I came across a lovely article, Binary Search a Little Simpler & More Generic by Jules Jacobs. Jules explains a really elegant API for binary search that is so much better than anything I’d seen before, and solves all the above issues! I immediately went to implement it in Haskell, and I want to share it with you. As I’ve reflected on Jules’s presentation, I have identified three key ideas:

  1. Rather than looking for some index with a certain property, we’re really looking for the place where p switches from False to True. That actually happens in between two indices… so let’s return the pair of indices bracketing the change, rather than just a single index! This means we get both the last index that does not have property p and the first one that does, and we can use whichever one we want.

    This is a simple change, but in my experience, it helps a lot to reduce the cognitive load. Previously, if I wanted something like “the last index that does not have property p” I’d have to think hard about what the index I get out of the search represents, and figure out that I needed to subtract one. Now I only have to think “OK, I want the thing right before the predicate changes from False to True, so I can project it out with fst”.

  2. The second important idea is that we’re going to insist that p switches from False to True, not at most once, but exactly once. (If necessary, we can add special “virtual” -∞ and/or +∞ indices such that p (-∞) = False and p (+∞) = True.) Then as we narrow down our current search interval [l, r], we will maintain the invariant that p l = False and p r = True.

    This invariant makes everything so much cleaner, and it also ties in with the first important idea of returning a pair instead of a single index. Previously I always thought of binary search in terms of searching for a specific index, but that makes the semantics of the interval tricky. For example, do we maintain the invariant that the index we’re looking for is somewhere in the closed interval [l,r]? Somewhere in the half-open interval [l,r)? …? But I find it so much more elegant and natural to say “l always stays in the False part, and r always stays in the True part, and we just slide them closer until we find the exact dividing line between False and True.”

    I will note that there are a couple tradeoffs: first of all, our search function of course takes starting values for l and r as inputs, and it will now have as a prerequisite that p l = False and p r = True, so we have to think a little harder when calling it. We also have to work a little harder to figure out when e.g. a value we’re looking for was not found at all. Typically, if we use some sort of initial special +∞ value for r, if the returned r value is still +∞ it means nothing at all was found that made the predicate True.

  3. The final important idea is to abstract out a function mid to compute a potential next index to look at, given the current interval. We’ll insist that when mid l r returns a value, it must be strictly in between l and r (there’s no point in returning l or r because we already know p l = False and p r = True), and we’ll stop when it returns Nothing. This lets us cleanly separate out the logic of the recursion and keeping track of the current search interval from the details of the arithmetic needed for each step. In particular, it will allow us to unify binary search over both integral and floating-point domains.

Here’s the final form of our search function. Unlike, say, binarySearchD, it pretty much writes itself at this point:

search :: (a -> a -> Maybe a) -> (a -> Bool) -> a -> a -> (a,a)
search mid p = go
  where
    go l r = case mid l r of
      Nothing -> (l,r)
      Just m
        | p m       -> go l m
        | otherwise -> go m r

We check our mid function to tell us what to look at next. If it returns Nothing, we stop and return the pair of the current (l,r). If it returns a “midpoint” m then we test the predicate on m and recurse. No tricky +1’s or -1’s to think about; given our invariant regarding l and r, it’s obvious which one we should replace with m depending on the outcome of the predicate, and we can’t get stuck in an infinite loop since m is always strictly between l and r.

(As an aside, I love that this is polymorphic in a with no class constraints! That’s another hint that this is really quite general. The class constraints will come with particular mid functions.)

So what about those mid functions? Here’s one for doing binary search over integers:

binary :: Integral a => a -> a -> Maybe a
binary l r
  | r - l > 1 = Just ((l+r) `div` 2)
  | otherwise = Nothing

Pretty straightforward! We stop when l and r are exactly one apart; otherwise we return their midpoint (you should convince yourself that (l+r) `div` 2 is always strictly in between l and r when r - l > 1).

For example, we can use this to take an integer square root:

λ> search binary (\x -> x^2 >= 150) 0 100
(12,13)

This tells us that 12 is the biggest integer whose square is less than 150, and 13 is the smallest integer whose square is greater.

But we needn’t limit ourselves to integers; as hinted previously, we can also do binary search over Fractional domains:

continuous :: (Fractional a, Ord a) => a -> a -> a -> Maybe a
continuous eps l r
  | r - l > eps = Just ((l+r) / 2)
  | otherwise = Nothing

Given an eps value, we stop when r - l <= eps, and otherwise return their midpoint. For example, we can use this to find the square root of 150 to 6 decimal places:

λ> search (continuous 1e-6) (\x -> x^2 >= 150) 0 100
(12.247448414564133,12.247449159622192)

We can even write some functions to do linear search! Why might we want to do that, you ask? Well, with some care, these can be used even with non-monotonic predicates, to find the first or last place the predicate switches from False to True (though using something like find or findIndex is typically easier than using search fwd).

fwd :: (Num a, Ord a) => a -> a -> Maybe a
fwd l r
  | r - l > 1 = Just (l+1)
  | otherwise = Nothing

bwd :: (Num a, Ord a) => a -> a -> Maybe a
bwd l r
  | r - l > 1 = Just (r-1)
  | otherwise = Nothing

I don’t have any great examples of using these off the top of my head, but we might as well include them.

[WARNING: this section about binary search on bit representations of floating-point numbers is completely wrong, but I’m leaving it here for context. See the discussion in the comments to this post and the follow-up post!]

But there’s more: we can also do exact binary search on the bit representations of floating-point numbers! That is, we do binary search as if the bit representations of l and r were unsigned integers. This is possibly more efficient than “continuous” binary search, and lets us find the two precisely adjacent floating-point numbers where our predicate switches from False to True.

binaryFloat :: Double -> Double -> Maybe Double
binaryFloat l r = decode <$> binary (encode l) (encode r)
  where
    encode :: Double -> Word64
    encode = unsafeCoerce

    decode :: Word64 -> Double
    decode = unsafeCoerce

For example, we can find the closest possible floating-point approximation to the square root of 150:

λ> search binaryFloat (\x -> x^2 >= 150) 0 100
(12.247448713915889,12.24744871391589)
λ> sqrt 150
12.24744871391589

This honestly seems like black magic to me, and I don’t know enough about floating-point representation to have a good idea of how this works and what the caveats might be, but it’s worked for all the examples I’ve tried. It even works when l is negative and r is positive (it seems like in that case the bit representation of l would correspond to a larger unsigned integer than r, but somehow it all works anyway!).

λ> search binaryFloat (\x -> x^2 >= 150) (-100) 100
(12.247448713915889,12.24744871391589)

Code

I’ve added the code from this post to my comprog-hs repository on GitHub. The source for this blog post itself is available on hub.darcs.net.

Challenges

And here are some problems for you to solve! I’ll discuss some of them in an upcoming post.

Advertisement

About Brent

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in competitive programming, haskell and tagged , , . Bookmark the permalink.

22 Responses to Competitive programming in Haskell: better binary search

  1. Globules says:

    … you should convince yourself that (l+r) `div` 2 is always strictly in between l and r when r – l > 1

    Hmm, I don’t seem to be able to convince myself of that… :-)

    (See Joshua Bloch’s Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken.)

  2. Yitz says:

    Woah Brent! unsafeCoerce – really? Why not encodeFloat/decodeFloat? Did you test and find that’s needed as an important optimization?

    • `encodeFloat` and `decodeFloat` work with an `(Integer,Int)` representation of the float; this is not at all the native (IEEE 754 on most modern CPUs) floating point representation.

      • Yitz says:

        You’re right that it’s not the raw bits of the internal representation. But it’s guaranteed to find two consecutive values, whatever the actual internal representation. Well, unless your compiler internally is using CReal with potentially infinite precision floats, in which case the binary search could hang. But I don’t think that’s a concern, and anyway unsafeCoerce probably wouldn’t work either in that case. Whereas even if the floats are IEEE 754 on the metal, there is no way to predict if your unknown compiler will give you something that works for binary search when you use unsafeCoerce. For example – what is the endian?

        • Brent says:

          OK, very good points. I will play around with encodeFloat/decodeFloat and see if I can come up with something better!

        • I think you’re looking at this in the wrong way. I take the example as showing the generality of this binary search function with an interesting and unusual example.

          A function being named ‘unsafeXXX’ does _not_ mean that it’s always unsafe to use. It means you have to be take special care in how you use it, which generally involves having other controls outside the language itself, such as restricting its use to particular environments (and there are many ways to do this). I don’t expect Brent to go into the enormously complex details of how to use this idea safely; that’s going to depend entirely on the details of the system in which you’re using it.

    • Brent says:

      Of course I would never use this in production code or even a personal project. But for competitive programming I don’t care, if it works, it works. =)

      • Yitz says:

        Sure. But even in competitive programming, are you sure it will always work the way you expect? You have some random compiler. Is the internal representation guaranteed forever and ever to have the semantics you need for you binary search to work? What’s wrong with encodeFloat/decodeFloat, which besides not being “unsafe”, is guaranteed present in ever compiler since Haskell 98 with the same semantics? I grant you, it could be what’s wrong is that it’s not efficient. But seems to me it’s a better first thing to try.

  3. babel says:

    The “black magic” of binaryFloat is a known type pun to a sign-magnitude integer, so because you’ve searched for positive values it’s always worked, however for negative it is not correctly handling it and will explode — it also won’t return NaN for sqrt of negative x, though it explodes well before that becomes a problem.

    The spec covers this binary shennanigans under comparison and total ordering, though it relies on some NaN behaviour being followed (754-2008 recommended’s is_quiet flag behaviour). See Herf’s old post for the necessary bit twiddling https://web.archive.org/web/20220326204603/http://stereopsis.com/radix.html (wayback as their webserver is down).

    • babel says:

      Negative l positive r works because the first binary step will halve it, thereby losing the top sign bit (being equivalent to a right shift), and basically restarting the search with an l of 0.0 (or some very small float near it). I recommend printing each step out with binary representation, to see the working.

    • Brent says:

      Thanks, that makes sense!

  4. Waleed Khan says:

    Thanks for posting. I’ve recently thinking about generalizing bisection on DAGs (for source control systems). This isn’t quite that, but I’ll be thinking about the ideas and seeing if they apply.

    • Brent says:

      As Jules mentions in his original article, this technique works for any partially ordered set, to find one particular pair of elements x < y, with nothing in between them, such that p x is False but p y is True. You just might not get the "optimal" such pair (for whatever your definition of "optimal" is). You need a way to implement the "mid" function, i.e. given l, r such that l < r, find another element m such that l < m < r. For example, for the poset of subsets ordered by inclusion, given A \subseteq B you can arbitrary choose half of the elements in the difference B – A and union them with A to get a set "halfway in between" A and B. I'm not sure off the top of my head how you would do this for a general DAG but you could probably figure something out.

      • Anonymous says:

        Thanks, I see section 5.2 “Searching lattices” now. I’ll just expound on the problem while I’m here. When searching the DAG for source control purposes, we want *all* points where p flips, making the monotonicity assumption that if a <= b then p(a) <= p(b) (where False < True). (Actually, this assumption doesn't need to be true in practice, but people still make good use of `git bisect` despite that.)

        Like you suggest, we don't want to search the poset of *all* subsets of nodes in the DAG, since some of those subsets won't be meaningful. Each element of the subset indicates one of the patches to combine together into a resulting commit to analyze, and patches under traditional SCM don't always successfully apply when combined, or are ill-defined in the cases of merge/multi-parent commits. But it should be reasonably straightforward to figure out a valid poset that corresponds to the DAG's ordering relation.

        • Brent says:

          Ah, I see, makes sense. And sorry I didn’t explain the poset connection very well, I did not have in mind looking at subsets of nodes in a DAG. I was just using subsets (in general) as one particular example of a poset. A DAG itself is another example of a poset, where x <= y iff there is a path in the DAG from x to y.

          Finding *all* the points in a DAG where a monotonic predicate flips is an interesting problem. Of course it can be done in O(V+E) just by doing a DFS and noting every edge where the predicate flips. But generalizing binary search to be able to do it faster is an interesting question. `git bisect` or something similar is also an interesting application domain, because in that case we are not necessarily looking to minimize total running time, but rather minimizing the number of evaluations of the predicate (which correspond to prompting the user). So it's OK if we need to spend linear time processing the entire DAG as long as we only evaluate the predicate a logarithmic number of times.

  5. greg7mdp says:

    Super fun, thanks for sharing Brent! I felt inspired by this, so for fun I implemented it in C++ in this commit: https://github.com/greg7mdp/gtl/commit/75cdaedcc3c883b139f3976b86190d21bacc74b8

  6. Pingback: Binary search over floating point representations | blog :: Brent -> [String]

  7. Jules Jacobs says:

    Nice post! Two other fun things:

    You can do interpolation search using a different mid function (but it needs access to the array) and on reals that gives you something close to the secant method (but better, actually, because the secant method can diverge).

    The requirement that the predicate is monotone is not strictly necessary: it will still find a place where the predicate flips, but you don’t get guarantees on which one it finds.

  8. Jason Hooper says:

    I ported the search function to Rust and used it for Bootstrapping Number and Toast. I’m stuck at various stages on the others.

    Toast: https://github.com/jasonincanada/kattis/blob/master/rust/toast/src/main.rs#L126

    I think I need some sort of insight to make further progress on Big Boxes or I’ll come up with a hopelessly slow algorithm. My notes so far are here: https://github.com/jasonincanada/kattis/blob/master/notes/Big%20Boxes.md

    These are interesting challenges, thanks for posting them!

    • Brent says:

      For Big Boxes, the trick is figuring out the right thing to search for. A useful thing to consider (for many problems, not just this one) is searching for the optimal *answer*. Given a potential answer (i.e. the weight of the heaviest box), can you figure out if it is viable?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.