I got a lot of great comments on my previous post about finding roots of polynomials in Haskell. One particularly promising idea I got from commenter Jake was to give up on the idea of having no external dependencies (which, to be fair, in these days of `stack`

and `nix`

and `cabal-v2`

, seems like much less of a big deal than it used to be), and use the `hmatrix`

package to find the eigenvalues of the companion matrix, which are exactly the roots.

So I tried that, and it seems to work great! The only problem is that I still don’t know how to write a reasonable test suite. I started by making a QuickCheck property expressing the fact that if we evaluate a polynomial at the returned roots, we should get something close to zero. I evaluate the polynomial using Horner’s method, which as far as I understand has good numerical stability in addition to being efficient.

```
polyRoots :: [Double] -> [Double]
polyRoots = ... stuff using hmatrix, see code at end of post ...
horner :: [Double] -> Double -> Double
horner as x = foldl' (\r a -> r*x + a) 0 as
_polyRoots_prop :: [Double] -> Property
_polyRoots_prop as = (length as > 1) ==>
all ((< 1e-10) . abs . horner as) (polyRoots as)
```

This property passes 100 tests for quadratic polynomials, but for cubic I get failures; here’s an example. Consider the polynomial

Finding its roots via `hmatrix`

yields three:

`[-1.077801388041068, 0.5106483227001805, 150.6238979010295]`

Notice that the third root is much bigger in magnitude than the other two, and indeed, that third root is the problematic one. Evaluating the polynomial at these roots via Horner’s method yields

`[1.2434497875801753e-14, 1.7763568394002505e-15, -1.1008971512183052e-10]`

the third of which is bigger than `1e-10`

which I had (very arbitrarily!) chosen as the cutoff for “close enough to zero”. But here’s the thing: after playing around with it a bit, it seems like this is the *most accurate possible value* for the root that can be represented using `Double`

. That is, if I evaluate the polynomial at any value other than the root that was returned—even if I just change the very last digit by 1 in either direction—I get a result which is *farther* from zero.

If I make the magic cutoff value bigger—say, `1e-8`

instead of `1e-10`

—then I still get similar counterexamples, but for larger-degree polynomials. I never liked the arbitrary choice of a tolerance anyway, and now it seems to me that saying “evaluating the polynomial at the computed roots should be within this absolute distance from zero” is fundamentally the wrong thing to say; depending on the polynomial, we might have to take what we can get. Some other things I could imagine saying instead include:

- Evaluating the polynomial at the computed roots should be within some
*relative*epsilon of zero, depending on the size/relative size of the coefficients - The computed roots are as accurate as possible (or close to it) in the sense that evaluating the polynomial at other numbers near the computed roots yields values which are farther from zero

…but, first of all, I don’t know if these are reasonable properties to expect; and even if they were, I’m not sure I know how to express them in Haskell! Any advice is most welcome. Are there any best practices for expressing desirable test properties for floating-point computations?

For completeness, here is the actual code I came up with for finding roots via `hmatrix`

. Notice there is another annoying arbitrary value in there, for deciding when a complex root is close enough to being real that we call it a real root. I’m not really sure what to do about this either.

```
-- Compute the roots of a polynomial as the eigenvalues of its companion matrix,
polyRoots :: [Double] -> [Double]
polyRoots [] = []
polyRoots (0:as) = polyRoots as
polyRoots (a:as) = mapMaybe toReal eigVals
where
n = length as'
as' = map (/a) as
companion = (konst 0 (1,n-1) === ident (n-1)) ||| col (map negate . reverse $ as')
eigVals = toList . fst . eig $ companion
toReal (a :+ b)
| abs b < 1e-10 = Just a -- This arbitrary value is annoying too!
| otherwise = Nothing
```

I don’t know if there any general techniques. If there are, I’d love to know them. I maintain hmatrix and it does have property tests (which I inherited). Occasionally folks run these and report bugs to me which almost always turn out to be e.g. they (or rather the tests) have generated an almost degenerate (I use the term loosely) matrix. I could possibly put guards around the tests to reject such matrices but it would probably be several days work and not much benefit to me.

For your specific case, I would test that the values for the result and one bit away are of different signs.

Taking a step back: why are you testing? Initially you are testing to check your implementation works so you have to investigate what turn out to false positives. After you have checked they really are false positives then you want a way to catch problems if someone changes the code. Here you could use a fixed seed and explicitly filter out the failing false positives. Now when someone changes something, the tests will still have a very good chance of detecting any problems.

Probably something involving the condition number: https://en.wikipedia.org/wiki/Condition_number

The condition number of the companion matrix of your cubic is 424.484 See https://www.wolframalpha.com/input/?i=%5B%5B0,0,-82.9%5D,%5B1,0,85.97718287916894%5D,%5B0,1,150.05674483568866%5D%5D

Using the rule of thumb mentioned on the Wikipedia page, this suggests that you could lose around 9 bits (round up to 3 digits) of accuracy before we even consider round-off errors. The suggestion would then be loosen the cut-off based on the condition number. Except this (probably?) isn’t the condition number you want. You want the one for the process of calculating the roots of the polynomial/eigenvalues of this matrix. This is mentioned in passing on the Wikipedia page. I’ll look in my copy of Higham tomorrow to see if it says anything readily useful.

Good shout! Yes you could use the condition number and have a guard in your QuickCheck proposition. See here for an egregious example where reciprocal condition number = 2.14591e-18: https://github.com/haskell-numerics/hmatrix/issues/288#issuecomment-462049145.

Sounds like it’s worth investigating — is there a way to compute the condition number using hmatrix?

This `rcond :: Field t => Matrix t -> Double` computes something very similar (https://hackage.haskell.org/package/hmatrix-0.19.0.0/docs/Numeric-LinearAlgebra.html#g:18). For your example it gives 2.2537998395228717e-3 and `recip 2.2537998395228717e-3 = 443.6951243246603`. I don’t think the hmatrix approach is the recommended approach as there is a function in LAPACK that will compute the value directly (well after doing LU decomposition) – see here for more details: https://stackoverflow.com/questions/4974349/computing-the-reciprocal-condition-number-with-lapack-i-e-rcondx. It wouldn’t be hard to add.

Interesting questions! It seems like there should be easy answers to your questions but my guess is there is not. Floating point arithmetic is not easy!

Looking at the notes for this course at Cambridge (https://www.cl.cam.ac.uk/teaching/1213/FPComp/fpcomp12slides.pdf) I see:

Slide 88

• Choose your iteration well.

• Determine a termination criterion.

For real-world equations (possibly in multi-dimensional space) neither of

these are easy and it may well be best to consult an expert!

Slide 98

If you need precision, or speed, or just to show

your algorithm is producing a mathematically justifiable answer then you

may (and will probably if the problem is non-trivial) need to consult an

expert, or buy in a package with certified performance (e.g. NAGLIB,

Matlab, Maple, Mathematica, REDUCE . . . ).

FWIW, for your example, Mathematica seems to give the same answer:

NumberForm[

Solve[0.1 x^3 – 15.005674483568866 x^2 – 8.597718287916894 x +

8.29 == 0], 16]

{{x->-1.077801388041069},{x->0.5106483227001806},{x->150.6238979010295}

The C standard library has functions to search adjacent floating point numbers (e.g. nextafter, nexttoward). This is one of those situations where they might come in handy.

If the roots are well-separated, you could check if plugging the two adjacent floating point numbers into the polynomial give you a value whose absolute value is further away from zero.

Oh, I just thought of another possibility. Convert the root to a Rational, do one iteration of your favourite cheap ‘n easy root finder algorithm, and convert that back to a floating-point number. If it’s the same as the original floating-point number or one adjacent to it, then it was probably the best approximation.

By the way, this is also what the machine epsilon is for. Assuming you don’t cross the line between normal and subnormal numbers, two floating point numbers are the same to machine precision if their relative difference is less than the machine epsilon.

Playing around with that big root in python I find that I can a slight improvement in the function value, but not by an incredible amount.

In hex, “150.6238979010295” is 4062d3f6f8bb1edf, and this does evaluate to the function value you got, but 4062d3f6f8bb1ee1 evaluates to 1.1381118270037405e-11.

For what it’s worth, numpy.roots only gets one bit better at 40…ee0.

One thing you might try is when you generate your random test cases to cheat and pick the roots first. Then multiply out (x-r1)(x-r2)…(x-rn) symbolically to get it in the monomial basis, and once you have the solved roots compare them against the originals.

The floating point number that’s closest to the root and the floating point number that returns the minimum absolute value when plugged into the function may, of course, be different. They’re probably adjacent, but they may be different.

It makes sense to use some kind of “as accurate as possible” wording. A fixed absolute tolerance ends up being too strict on some inputs and too lax on others, as you observed. This is why IEEE 754 FP accuracy is usually described in terms of ULPs (see, for example, the documentation for java.lang.Math), which is a measure of the distance between adjacent representable values — a distance which grows with the exponent.

This discussion also brings to mind the design of Julia’s isapprox() function:

https://docs.julialang.org/en/v1/base/math/#Base.isapprox

Determining whether two values are approximately equal involves a tolerance which scales with the magnitude of the inputs (this tolerance is usually looser than 1 ULP). But also note their rationale for why with the default tolerances, nothing is ever approximately equal to zero except exact zero itself.

“The computed roots are as accurate as possible” can be done with (man3)nextafter:

_polyRoots_prop as = (length as > 1) ==>

all (( check x && check prev > check x

check = abs . horner as

next = nextafter x +Infinity

prev = nextafter x -Infinity

This doesn’t solve “if these are reasonable properties to expect”, though.

Pingback: Resumen de lecturas compartidas durante febrero de 2019 | Vestigium