Binary search is a workhorse of competitive programming. There are occasional easy problems where binary search is the solution in and of itself; more often, it’s used as a primitive building block of more complex algorithms. It is often presented as a way to find the index of something in a sorted array in O(log n) time, and many languages have such a thing in their standard library (for example, see Arrays.binarySearch in Java, the bisect library in Python, or the binary_search function in C++). However, the idea of binary search is more general than searching in a sorted array; we’re doing binary search any time we repeatedly halve a search interval. For example, we can use it to find the smallest or largest number with a given property, or to find an optimal, “just right” measurement that is neither too small nor too big.
Generic binary search with first-class functions, take 1
A language with easy access to first-class functions provides a great opportunity to generalize binary search properly. For example, here’s a version of binary search that has lived in my competitive programming solution template for a long time:
-- Discrete binary search. Find the smallest integer in [lo,hi] such
-- that monotone predicate p holds.
binarySearchD :: Int -> Int -> (Int -> Bool) -> Int
binarySearchD lo hi p
  | lo == hi  = lo
  | p mid     = binarySearchD lo mid p
  | otherwise = binarySearchD (mid+1) hi p
  where
    mid = (lo + hi) `div` 2
The key generalization is that it takes a predicate of type Int -> Bool as an argument. Note that in order for binary search to work, the predicate p must be monotonic. This means, intuitively, that p starts out False, and once it switches to True it never goes back. (Formally, p being monotonic means that for all x and y, if x <= y then p x <= p y, where False <= True.) This is how we can tell if we’re “too low” or “too high”: we’re “too low” when p is False and “too high” when it is True.
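For example, here’s a quick usage sketch (my own, not part of the original template): the predicate below is monotone, so we can find the smallest integer whose square is at least 150:

λ> binarySearchD 0 100 (\x -> x^2 >= 150)
13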
This is definitely an improvement over array-specific versions of binary search. We can still use it to search in an array by providing a predicate that does an array lookup, but we can use it for other things as well.
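For instance, here’s a minimal sketch (my own, using Data.Array; not from the original post) of finding the first index of a sorted array holding a value at least some target:

import Data.Array

-- Hypothetical helper: the first index of sorted array a whose
-- element is >= tgt. Assumes a non-empty array; if no element
-- qualifies, the result is simply the upper bound of the array,
-- so check before trusting it.
firstIdxGE :: Array Int Int -> Int -> Int
firstIdxGE a tgt = binarySearchD lo hi (\i -> a ! i >= tgt)
  where (lo, hi) = bounds a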
I should note at this point that there is a very nice binary-search package published on Hackage, which I definitely recommend if you need binary search in some kind of project. However, for the purposes of competitive programming, we can’t rely on that package being available, and we’d also like something a bit simpler, so we don’t have to read the documentation every time we want to use it.
…can we do better?
So my binarySearchD function works fine as far as it goes, and I have used it regularly, but there are still several things about it that always annoyed me:
- What if we want a slight variation, such as the largest integer such that something holds? Or the last integer where the predicate doesn’t hold? etc.? It is possible to use binarySearchD in these situations, but I find it tricky and error-prone to figure out how. And when I’m reaching for some function as a building block of a bigger algorithm I definitely don’t want to waste time and brainpower having to think carefully about small details like this.
- Getting the implementation right in the first place was kind of tricky. Should we use mid+1? mid-1? Should we think in terms of a closed interval [lo,hi], or a half-open interval [lo,hi), or…? How can we convince ourselves our implementation is completely correct, and won’t get stuck in infinite recursion?
- What if we want to do binary search over a continuous domain, like Double? We have to make a completely separate function, for example, like this:
-- Continuous binary search. Given a tolerance eps, an interval
-- [a,b], a continuous, monotonically increasing function f, and a
-- target value tgt, find c ∈ [a,b] such that f(c) = tgt.
binarySearch :: (Fractional t, Ord t, Ord a) => t -> t -> t -> (t -> a) -> a -> t
binarySearch eps a b f tgt = go a b
  where
    go lo hi
      | hi - lo < eps = mid
      | f mid < tgt   = go mid hi
      | otherwise     = go lo mid
      where
        mid = (lo + hi) / 2
(Actually, I’m not sure why I wrote that version in terms of finding a “target” value. In practice I suppose continuous binary search often comes up that way, but looking at it now it seems less general. In any case, we’re going to throw this function away very shortly so it doesn’t really matter!)
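Still, for completeness, here’s a usage sketch (my own example, not from the post): approximating the square root of 2 to within 1e-6 by searching for x with x*x = 2 on [0,2]:

-- A usage sketch: approximate sqrt 2 (value shown approximately).
sqrt2Approx :: Double
sqrt2Approx = binarySearch 1e-6 0 2 (\x -> x*x) 2
-- sqrt2Approx ≈ 1.4142136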
A better binary search
Recently I came across a lovely article, Binary Search a Little Simpler & More Generic by Jules Jacobs. Jules explains a really elegant API for binary search that is so much better than anything I’d seen before, and solves all the above issues! I immediately went to implement it in Haskell, and I want to share it with you. As I’ve reflected on Jules’s presentation, I have identified three key ideas:
- Rather than looking for some index with a certain property, we’re really looking for the place where p switches from False to True. That actually happens in between two indices… so let’s return the pair of indices bracketing the change, rather than just a single index! This means we get both the last index that does not have property p and the first one that does, and we can use whichever one we want.

  This is a simple change, but in my experience, it helps a lot to reduce the cognitive load. Previously, if I wanted something like “the last index that does not have property p” I’d have to think hard about what the index I get out of the search represents, and figure out that I needed to subtract one. Now I only have to think “OK, I want the thing right before the predicate changes from False to True, so I can project it out with fst”.

- The second important idea is that we’re going to insist that p switches from False to True, not at most once, but exactly once. (If necessary, we can add special “virtual” -∞ and/or +∞ indices such that p (-∞) = False and p (+∞) = True.) Then as we narrow down our current search interval [l, r], we will maintain the invariant that p l = False and p r = True.

  This invariant makes everything so much cleaner, and it also ties in with the first important idea of returning a pair instead of a single index. Previously I always thought of binary search in terms of searching for a specific index, but that makes the semantics of the interval tricky. For example, do we maintain the invariant that the index we’re looking for is somewhere in the closed interval [l,r]? Somewhere in the half-open interval [l,r)? …? But I find it so much more elegant and natural to say “l always stays in the False part, and r always stays in the True part, and we just slide them closer until we find the exact dividing line between False and True.”

  I will note that there are a couple of tradeoffs: first of all, our search function of course takes starting values for l and r as inputs, and it will now have as a prerequisite that p l = False and p r = True, so we have to think a little harder when calling it. We also have to work a little harder to figure out when e.g. a value we’re looking for was not found at all. Typically, if we use some sort of initial special +∞ value for r, and the returned r value is still +∞, it means nothing at all was found that made the predicate True.

- The final important idea is to abstract out a function mid to compute a potential next index to look at, given the current interval. We’ll insist that when mid l r returns a value, it must be strictly in between l and r (there’s no point in returning l or r because we already know p l = False and p r = True), and we’ll stop when it returns Nothing. This lets us cleanly separate out the logic of the recursion and keeping track of the current search interval from the details of the arithmetic needed for each step. In particular, it will allow us to unify binary search over both integral and floating-point domains.
Here’s the final form of our search function. Unlike, say, binarySearchD, it pretty much writes itself at this point:
search :: (a -> a -> Maybe a) -> (a -> Bool) -> a -> a -> (a,a)
search mid p = go
  where
    go l r = case mid l r of
      Nothing -> (l,r)
      Just m
        | p m       -> go l m
        | otherwise -> go m r
We check our mid function to tell us what to look at next. If it returns Nothing, we stop and return the pair of the current (l,r). If it returns a “midpoint” m, then we test the predicate on m and recurse. No tricky +1’s or -1’s to think about; given our invariant regarding l and r, it’s obvious which one we should replace with m depending on the outcome of the predicate, and we can’t get stuck in an infinite loop since m is always strictly between l and r.
(As an aside, I love that this is polymorphic in a with no class constraints! That’s another hint that this is really quite general. The class constraints will come with particular mid functions.)
So what about those mid functions? Here’s one for doing binary search over integers:
binary :: Integral a => a -> a -> Maybe a
binary l r
  | r - l > 1 = Just ((l+r) `div` 2)
  | otherwise = Nothing
Pretty straightforward! We stop when l and r are exactly one apart; otherwise we return their midpoint (you should convince yourself that (l+r) `div` 2 is always strictly in between l and r when r - l > 1).
For example, we can use this to take an integer square root:
λ> search binary (\x -> x^2 >= 150) 0 100
(12,13)
This tells us that 12 is the biggest integer whose square is less than 150, and 13 is the smallest integer whose square is greater.
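To see the invariant in action, here’s a worked trace of that example (my own annotation, not from the original post); note that p l = False and p r = True at every step:

-- Trace of search binary (\x -> x^2 >= 150) 0 100:
--   (0,100): mid = Just 50, p 50 = True  -> (0,50)
--   (0,50):  mid = Just 25, p 25 = True  -> (0,25)
--   (0,25):  mid = Just 12, p 12 = False -> (12,25)
--   (12,25): mid = Just 18, p 18 = True  -> (12,18)
--   (12,18): mid = Just 15, p 15 = True  -> (12,15)
--   (12,15): mid = Just 13, p 13 = True  -> (12,13)
--   (12,13): mid = Nothing, stop         -> (12,13)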
But we needn’t limit ourselves to integers; as hinted previously, we can also do binary search over Fractional domains:
continuous :: (Fractional a, Ord a) => a -> a -> a -> Maybe a
continuous eps l r
  | r - l > eps = Just ((l+r) / 2)
  | otherwise = Nothing
Given an eps value, we stop when r - l <= eps, and otherwise return their midpoint. For example, we can use this to find the square root of 150 to 6 decimal places:
λ> search (continuous 1e-6) (\x -> x^2 >= 150) 0 100
(12.247448414564133,12.247449159622192)
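And of course we can recover classic sorted-array search. Here’s a sketch of my own (not from the original post) using “virtual” sentinel indices just outside the array bounds, so that the precondition p l = False and p r = True holds automatically; since mid only ever returns values strictly in between, the sentinels are never used to index the array:

import Data.Array

-- Hypothetical helper: bracket the first index of sorted array a
-- whose element is >= tgt. If the returned snd is hi+1, no element
-- qualifies.
firstGE :: Array Int Int -> Int -> (Int, Int)
firstGE a tgt = search binary (\i -> a ! i >= tgt) (lo - 1) (hi + 1)
  where (lo, hi) = bounds a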
We can even write some functions to do linear search! Why might we want to do that, you ask? Well, with some care, these can be used even with non-monotonic predicates, to find the first or last place the predicate switches from False to True (though using something like find or findIndex is typically easier than using search fwd).
fwd :: (Num a, Ord a) => a -> a -> Maybe a
fwd l r
  | r - l > 1 = Just (l+1)
  | otherwise = Nothing

bwd :: (Num a, Ord a) => a -> a -> Maybe a
bwd l r
  | r - l > 1 = Just (r-1)
  | otherwise = Nothing
I don’t have any great examples of using these off the top of my head, but we might as well include them.
[WARNING: this section about binary search on bit representations of floating-point numbers is completely wrong, but I’m leaving it here for context. See the discussion in the comments to this post and the follow-up post!]
But there’s more: we can also do exact binary search on the bit representations of floating-point numbers! That is, we do binary search as if the bit representations of l and r were unsigned integers. This is possibly more efficient than “continuous” binary search, and lets us find the two precisely adjacent floating-point numbers where our predicate switches from False to True.
import Data.Word (Word64)
import Unsafe.Coerce (unsafeCoerce)

binaryFloat :: Double -> Double -> Maybe Double
binaryFloat l r = decode <$> binary (encode l) (encode r)
  where
    encode :: Double -> Word64
    encode = unsafeCoerce
    decode :: Word64 -> Double
    decode = unsafeCoerce
For example, we can find the closest possible floating-point approximation to the square root of 150:
λ> search binaryFloat (\x -> x^2 >= 150) 0 100
(12.247448713915889,12.24744871391589)
λ> sqrt 150
12.24744871391589
This honestly seems like black magic to me, and I don’t know enough about floating-point representation to have a good idea of how this works and what the caveats might be, but it’s worked for all the examples I’ve tried. It even works when l is negative and r is positive (it seems like in that case the bit representation of l would correspond to a larger unsigned integer than r, but somehow it all works anyway!).
λ> search binaryFloat (\x -> x^2 >= 150) (-100) 100
(12.247448713915889,12.24744871391589)
Code
I’ve added the code from this post to my comprog-hs repository on GitHub. The source for this blog post itself is available on hub.darcs.net.
Challenges
And here are some problems for you to solve! I’ll discuss some of them in an upcoming post.
Comments
Hmm, I don’t seem to be able to convince myself of that… :-) (See Joshua Bloch’s Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken.)
OK, good point, you have to be careful when using a bounded integer type, as always!
Woah Brent! unsafeCoerce – really? Why not encodeFloat/decodeFloat? Did you test and find that’s needed as an important optimization?
`encodeFloat` and `decodeFloat` work with an `(Integer,Int)` representation of the float; this is not at all the native (IEEE 754 on most modern CPUs) floating point representation.
You’re right that it’s not the raw bits of the internal representation. But it’s guaranteed to find two consecutive values, whatever the actual internal representation. Well, unless your compiler internally is using CReal with potentially infinite precision floats, in which case the binary search could hang. But I don’t think that’s a concern, and anyway unsafeCoerce probably wouldn’t work either in that case. Whereas even if the floats are IEEE 754 on the metal, there is no way to predict if your unknown compiler will give you something that works for binary search when you use unsafeCoerce. For example: what is the endianness?
OK, very good points. I will play around with encodeFloat/decodeFloat and see if I can come up with something better!
I think you’re looking at this in the wrong way. I take the example as showing the generality of this binary search function with an interesting and unusual example.
A function being named ‘unsafeXXX’ does _not_ mean that it’s always unsafe to use. It means you have to take special care in how you use it, which generally involves having other controls outside the language itself, such as restricting its use to particular environments (and there are many ways to do this). I don’t expect Brent to go into the enormously complex details of how to use this idea safely; that’s going to depend entirely on the details of the system in which you’re using it.
Of course I would never use this in production code or even a personal project. But for competitive programming I don’t care, if it works, it works. =)
Sure. But even in competitive programming, are you sure it will always work the way you expect? You have some random compiler. Is the internal representation guaranteed forever and ever to have the semantics you need for your binary search to work? What’s wrong with encodeFloat/decodeFloat, which besides not being “unsafe”, is guaranteed present in every compiler since Haskell 98 with the same semantics? I grant you, it could be that what’s wrong is that it’s not efficient. But it seems to me it’s a better first thing to try.
The “black magic” of binaryFloat is a known type pun to a sign-magnitude integer; because you’ve only searched for positive values it has always worked, but negative values are not handled correctly and it will explode. (It also won’t return NaN for sqrt of a negative x, though it explodes well before that becomes a problem.)
The spec covers this binary shenanigans under comparison and total ordering, though it relies on some NaN behaviour being followed (IEEE 754-2008’s recommended is_quiet flag behaviour). See Herf’s old post for the necessary bit twiddling https://web.archive.org/web/20220326204603/http://stereopsis.com/radix.html (wayback as their webserver is down).
Negative l, positive r works because the first binary step will halve it, thereby losing the top sign bit (being equivalent to a right shift), and basically restarting the search with an l of 0.0 (or some very small float near it). I recommend printing each step out in binary representation, to see the working.
Thanks, that makes sense!
Thanks for posting. I’ve recently been thinking about generalizing bisection on DAGs (for source control systems). This isn’t quite that, but I’ll be thinking about the ideas and seeing if they apply.
As Jules mentions in his original article, this technique works for any partially ordered set, to find one particular pair of elements x < y, with nothing in between them, such that p x is False but p y is True. You just might not get the "optimal" such pair (for whatever your definition of "optimal" is). You need a way to implement the "mid" function, i.e. given l, r such that l < r, find another element m such that l < m < r. For example, for the poset of subsets ordered by inclusion, given A \subseteq B you can arbitrarily choose half of the elements in the difference B - A and union them with A to get a set "halfway in between" A and B. I'm not sure off the top of my head how you would do this for a general DAG but you could probably figure something out.
Thanks, I see section 5.2 “Searching lattices” now. I’ll just expound on the problem while I’m here. When searching the DAG for source control purposes, we want *all* points where p flips, making the monotonicity assumption that if a <= b then p(a) <= p(b) (where False < True). (Actually, this assumption doesn't need to be true in practice, but people still make good use of `git bisect` despite that.)
Like you suggest, we don't want to search the poset of *all* subsets of nodes in the DAG, since some of those subsets won't be meaningful. Each element of the subset indicates one of the patches to combine together into a resulting commit to analyze, and patches under traditional SCM don't always successfully apply when combined, or are ill-defined in the cases of merge/multi-parent commits. But it should be reasonably straightforward to figure out a valid poset that corresponds to the DAG's ordering relation.
Ah, I see, makes sense. And sorry I didn’t explain the poset connection very well, I did not have in mind looking at subsets of nodes in a DAG. I was just using subsets (in general) as one particular example of a poset. A DAG itself is another example of a poset, where x <= y iff there is a path in the DAG from x to y.
Finding *all* the points in a DAG where a monotonic predicate flips is an interesting problem. Of course it can be done in O(V+E) just by doing a DFS and noting every edge where the predicate flips. But generalizing binary search to be able to do it faster is an interesting question. `git bisect` or something similar is also an interesting application domain, because in that case we are not necessarily looking to minimize total running time, but rather minimizing the number of evaluations of the predicate (which correspond to prompting the user). So it's OK if we need to spend linear time processing the entire DAG as long as we only evaluate the predicate a logarithmic number of times.
Super fun, thanks for sharing Brent! I felt inspired by this, so for fun I implemented it in C++ in this commit: https://github.com/greg7mdp/gtl/commit/75cdaedcc3c883b139f3976b86190d21bacc74b8
Nice post! Two other fun things:
You can do interpolation search using a different mid function (but it needs access to the array) and on reals that gives you something close to the secant method (but better, actually, because the secant method can diverge).
The requirement that the predicate is monotone is not strictly necessary: it will still find a place where the predicate flips, but you don’t get guarantees on which one it finds.
Cool, thanks! Do you have examples/code for these somewhere? I get the general idea but it would be nice to see how the details work out.
I ported the search function to Rust and used it for Bootstrapping Number and Toast. I’m stuck at various stages on the others.
Toast: https://github.com/jasonincanada/kattis/blob/master/rust/toast/src/main.rs#L126
I think I need some sort of insight to make further progress on Big Boxes or I’ll come up with a hopelessly slow algorithm. My notes so far are here: https://github.com/jasonincanada/kattis/blob/master/notes/Big%20Boxes.md
These are interesting challenges, thanks for posting them!
For Big Boxes, the trick is figuring out the right thing to search for. A useful thing to consider (for many problems, not just this one) is searching for the optimal *answer*. Given a potential answer (i.e. the weight of the heaviest box), can you figure out if it is viable?
This would be the generic binary search in Agda. The tricky bit is that Agda demands termination, so you have to have evidence that the intervals get smaller and smaller w.r.t. some relation on the intervals. This evidence should be produced by the `mid` function.
```agda
open import Data.Bool.Base using (Bool; if_then_else_)
open import Data.Maybe.Base using (Maybe; just; nothing)
open import Data.Product using (∃; _×_; _,_)
open import Function using (case_of_)
open import Induction.WellFounded using (Acc; acc; WellFounded)

module GenericBinarySearch
  (A : Set)                           -- The domain.
  (let Interval = A × A)              -- Type of intervals.
  (_<_ : Interval → Interval → Set)   -- Wellfounded ordering on intervals
  (wf : WellFounded _<_)
  (mid : (i@(l , r) : Interval) → Maybe (∃ λ m → ((l , m) < i) × ((m , r) < i)))
                                      -- Computing the midpoint of the interval
  (p : A → Bool)                      -- What to search for
  where

search' : (i : Interval) → Acc _<_ i → Interval
search' i@(l , r) (acc h) =
  case mid i of λ where
    nothing → i
    (just (m , left , right)) →
      if p m then search' (l , m) (h _ left)
      else search' (m , r) (h _ right)

search : Interval → Interval
search i = search' i (wf i)
```
As usual, the more generic, the easier to write down in Agda. The work starts when you try to use it: You have to prove that the ordering on the intervals is well-founded in the concrete case…
(No preview for wordpress comments? Does this recognize markdown? I can see this going wrong, let's try…)