In a previous post, I challenged you to solve Infinite 2D Array using Haskell. As a reminder, the problem specifies a two-parameter recurrence , given by

- for
- for
- for .

Last time, we derived a formula for that involves only a linear number of terms:

While the number of terms may be linear, it can still be on the order of a million terms, so computing each term is going to have to be pretty quick in order to fit the whole thing within the one second time limit.

## Fibonacci numbers modulo a prime

Computing Fibonacci numbers modulo a prime is not hard, especially since we want *all* the Fibonacci numbers from 1 up to : just compute each one by adding the previous two modulo . We could also precompute a table of Fibonacci numbers mod this way. And any of the fast methods for computing individual Fibonacci numbers (for example, using 2×2 matrix exponentiation) also work just fine if you reduce everything modulo at each step, since they only involve addition, subtraction, and multiplication.

## Binomial coefficients modulo a prime

What about binomial coefficients? Since and are not too large, and in particular since they will both be smaller than , we can use the usual formula:

(If and could be much larger, or if they could be larger than , we would have to use something like Lucas’s Theorem or other techniques; that might make for another interesting blog post sometime.) But how do we handle division in modular arithmtic? Since we’re working modulo a prime, every value other than zero must have a *modular inverse*, that is, a value such that (this is a corollary of Bézout’s Theorem). To compute the modular inverse for a given , we have a couple options. One simple way is to use Fermat’s Little Theorem: if is not divisible by a prime , then , hence is the modular inverse of modulo , and we can compute it efficiently using repeated squaring modulo . Another option is to use the extended Euclidean algorithm to find the and (guaranteed to exist by Bézout’s Theorem) such that ; then is the inverse of modulo .

Both of these methods take time. In my experience, computing the power is easier to code (especially in Haskell where we get exponentiation by repeated squaring for free!), but using the extended Euclidean algorithm can be a bit faster when it’s well-optimized. (Note the extended Euclidean algorithm can be faster when is small, but raising to the power always takes the same number of steps no matter what is.)

## Factorials modulo a prime

Since we’re going to be repeatedly using the same factorials, one thing we absolutely must do is precompute a table of factorials mod , from up to some maximum. In this case, since our formula involves things like , we may need factorials up to , so a table of size will do ( and can be up to ).

We could also precompute a table of modular inverses of factorials; to compute the inverse of , we just find the inverse of each and multiply it by the (previously computed) inverse of . (Or we could just invert the value for stored in the other table.) Making a table of inverse factorials like this turns out not to help too much for this particular problem, but it can be an important optimization in some cases.

## The end?

So we can compute each additional Fibonacci number in ; we can also now compute binomial coefficients modulo in , with a few table lookups for factorials and an inversion operation. (Again, we could achieve if we also stored a table of inverse factorials, but for this problem it seems the additional time needed to construct the table in the first place outweighs the time saved computing binomial coefficients.) In theory, we have everything we need to solve this problem efficiently.

However, for this problem, constant factors matter! There’s still quite a bit of nontrivial work I had to do to get my code fast enough. In my next and final post on this problem, we’ll walk through a few different ideas for implementing this concretely in Haskell.