Squarefree numbers in Haskell

A squarefree number is one which is not divisible by any perfect squares. Put another way, the prime factorization of a squarefree number includes at most one copy of any given prime. So, the first few squarefree numbers are

1,2,3,5,6,7,10,11,13,14,15,…

How can we generate these in Haskell? In particular, we want to define the infinite sorted list of squarefree numbers, to be lazily computed on demand. We could implement a brute-force method where we simply factor every positive integer and only keep those which don’t have repeated factors. But that would be awfully slow. This is Haskell we’re talking about, there must be a better way!

There are better ways, of course; here’s one. Suppose that (somehow) we have computed the list L_n, which contains all the squarefree numbers with prime factors up to and including p_n, the nth prime number. For example, L_1 = [1,2]; L_2 = [1,2,3,6]; and so on. Given L_n, we can compute L_{n+1} as follows: multiply every element of L_n by p_{n+1} to produce a new list L_{n+1}'; this list contains all those squarefree numbers with prime factors up to p_{n+1}, which are divisible by p_{n+1}. Then we just merge L_n with L_{n+1}' to produce L_{n+1}.

Let’s try it: we start with L_0 = [1], the list of all squarefree numbers with no prime factors. Then we compute L_1 = [1,2]. Multiplying by 3 gives L_2' = [3,6], and merging yields L_2 = [1,2,3,6]. Multiplying by 5 gives L_3' = [5,10,15,30]; merging again gives us L_3 = [1,2,3,5,6,10,15,30]. And so on.

So, how do we translate this into a definition of the infinite list of squarefree numbers? It’s not quite as straightforward as it seems. First of all, we can’t just say the equivalent of “compute L_{\infty}“; nothing would ever get generated that way, since everything in L_{n+1} depends on everything in L_{n}. There’s nothing inherent in the above method that indicates which part of the list isn’t going to change on the next iteration. And we can’t do something like lazily merge all the lists in L_1, L_2, L_3\dots; the problem is that every squarefree number occurs infinitely often in such a list.

The key is to note that the “intermediate” lists L_n' are more important than we might have thought. The infinite sequence of lists L_1', L_2', L_3',\dots in fact contains every squarefree number exactly once (except 1); moreover, they are ordered by their first elements in addition to being ordered themselves, which gives us just enough information to implement an infinite merge that will actually get around to producing something in a finite amount of time!

The primeStep function below takes the prime p_{n+1} and the list L_n, and produces the pair (L_{n+1}, L_{n+1}'). mapAccumL (one of those higher-order functions which isn’t used much but comes in very handy every once in a while) is used to carry along the current list L_n in an accumulator while simultaneously producing the list {[L_1', L_2', L_3', \dots]} . Finally mergeAll performs a lazy infinite merge, giving us the infinite list of squarefree numbers.

import Data.List (mapAccumL)

sieve (x:xs) = x : sieve (filter (n -> n `mod` x /= 0) xs)
primes = sieve [2..]

-- merge two nondecreasing lists.
(#) :: (Ord a) => [a] -> [a] -> [a]
[] # ys = ys
xs # [] = xs
xs@(x:xt) # ys@(y:yt) | x < y = x : (xt # ys)
                       | x > y = y : (xs # yt)
                       | otherwise = x : (xt # yt)

-- merge an infinite list of lists, assuming that each list
-- is nondecreasing and the lists occur in order of their first
-- element.
mergeAll :: (Ord a) => [[a]] -> [a]
mergeAll ([] : zs) = mergeAll zs
mergeAll (xxs@(x:xs) : yys@(y:ys) : zs)
    | x < y = x : mergeAll (xs : yys : zs)
    | otherwise = mergeAll ((xxs # yys) : zs)

-- given a list of all squarefree numbers with factors up to
-- but not including p, produce (a,b), where a is all squarefree
-- numbers with factors up to p, and b only includes those which
-- are multiples of p.
primeStep xs p = (xs # pxs, pxs)
  where pxs = map (p*) xs

-- the nth element of primeLists is a sorted list of squarefree
-- numbers with factors up to p_n, which are all multiples of p_n.
-- Therefore every squarefree number occurs exactly once in (concat
-- primeLists), and the lists in primeLists are sorted by first element.
primeLists = snd $ mapAccumL primeStep [1] primes

-- to get a sorted list of squarefree numbers, just merge primeLists.
squarefree = 1 : mergeAll primeLists

It seems to work:

*Main> take 20 $ squarefree
[1,2,3,5,6,7,10,11,13,14,15,17,19,21,22,23,26,29,30,31]

Neat!

At the beginning, I said there are better ways; here’s another! It makes use of the beautiful theory of Dirichlet generating functions, and in particular, the fact that \zeta(s)/\zeta(2s) is the DGF for the squarefree numbers. Using David Amos’ fantastic mathematics library, this is a piece of cake:

import DirichletSeries
import Data.Maybe

bitToMaybe :: Int -> a -> Maybe a
bitToMaybe 1 = Just
bitToMaybe _ = const Nothing

squarefree = catMaybes $ zipWith bitToMaybe (coeffsDS squarefreeDGF) [1..]

Can you come up with another way?

(Thanks to chessguy, nasa_, olsner, and oerjan from #haskell for suggestions and ideas!)

About these ads
This entry was posted in haskell, math. Bookmark the permalink.

17 Responses to Squarefree numbers in Haskell

  1. Rasmus Ulvund says:

    I use this:

    notsquare = [x|x<-[1..], False == perfsquare x]

    perfsquare a = notElem (a`mod`10) [2,3,7,8] && (floor b == (ceiling b))
    where b = (sqrt . fromIntegral) a

    The only hackish part(s) of this is that perfect squares never end with 2,3,7,8 :).

    The precision of floats could be a problem for large numbers, but it is enough for my purposes. I haven’t experienced it being a bottleneck.

    Followed the link from project euler, and I also saw you on reddit. Downvoted this post. Sorry ;)

  2. Brent says:

    That isn’t quite the same thing; you’ve defined the list of all integers which aren’t perfect squares, whereas I’m talking about numbers which aren’t divisible by any perfect squares. For example, 12 is in your list but not mine.

  3. Rasmus Ulvund says:

    I see

  4. Pingback: Top Posts « WordPress.com

  5. Adam says:

    If you focus on the L instead of the L’, you can get a pretty simple definition.

    — merge two sorted lists, removing duplicates
    merge :: Ord a => [a] -> [a] -> [a]
    merge a [] = a
    merge [] b = b
    merge (x:xs) (y:ys)
    | x filter ((/= 0) . flip mod (head o)) o) (iterate (+ 1) 2))

    — doesn’t actually merge lists
    — instead, takes the nth element from the nth list, going down the diagonal
    mergeAll :: [[a]] -> [a]
    mergeAll ((h:hs):t) = h:(mergeAll $ map tail t)

    — a list of the L lists, not the L’ lists
    — since they each contain the beginning of the sequence,
    — we can use the simplified mergeAll
    ll :: [[Int]]
    ll = llfun primes [1] where
    llfun (h:t) pre = pre : (llfun t (merge pre (map (* h) pre)))

    squareless :: [Int]
    squareless = mergeAll ll

  6. Adam says:

    merge :: Ord a => [a] -> [a] -> [a]
    merge a [] = a
    merge [] b = b
    merge (x:xs) (y:ys)
    | x>y = y:(merge (x:xs) ys)
    | x==y = x:(merge xs ys)
    | otherwise = x:(merge xs (y:ys))

    primes :: [Int]
    primes = map head (iterate (\o -> filter ((/= 0) . flip mod (head o)) o) (iterate (+ 1) 2))

    mergeAll :: [[a]] -> [a]
    mergeAll ((h:hs):t) = h:(mergeAll $ map tail t)

    ll :: [[Int]]
    ll = llfun primes [1] where
    llfun (h:t) pre = pre : (llfun t (merge pre (map (* h) pre)))

    squareless :: [Int]
    squareless = mergeAll ll

    Really sorry about that. This should work

  7. Brent says:

    Aha! The idea of taking the diagonal is neat. It does mean that producing the first n squarefree numbers is O(n^2), since before producing each one it has to first throw away all the previous ones; but then again, I’m not sure what the running time of my version is, either. (Reasoning about time complexity when laziness is involved isn’t obvious…)

  8. anonymous says:

    Here is a (IMHO nice) way to enumerate all squarefree numbers, although not in sorted order. The key idea is that all sorted numbers can be written as p^a + q^b + r^c + … where p,q,r,… are the prime numbers and a,b,c are drawn from { 0, 1 }. Without further ado, here’s the code:

    import Control.Monad
    all01sequences :: [[Int]]
    all01sequences =
    filter (not . (== 0) . last) $
    concatMap (sequence . flip replicate [0,1]) [1..]

    allSQFreeNats :: [Int]
    allSQFreeNats = 1 : liftM (product . zipWith (^) primes) all01sequences

  9. Brent says:

    Hey, that’s nifty! Might I suggest replacing filter (not . (==0) . last) with ([1]:) . map (++[1]) ?

  10. anonymous says:

    Sure, makes it even shorter :) me idly wonders which magical higher-order cool-monad function one needs to shorten this even more…

    (For all those following along, I typoed “+” for “*” in the description of the “key idea”.)

  11. Brent says:

    Well, not only shorter, but more efficient: it no longer has to throw away any of the generated sequences. Note that ([1]:) . map (++[1]) isn’t doing the same thing as filter (not . (==0) . last), it’s just appending 1 to all the sequences and then manually adding in the sequence [1] again.

    It occurs to me that you could also replace

    concatMap (sequence . flip replicate [0,1]) [1..]

    with

    concatMap sequence . tail . iterate ([0,1]:) $ []

    which isn’t really all that much shorter but is (IMO) clearer. Also, concatMap is just (=<<) in the list monad, so you could also write

    sequence =<< (tail . iterate ([0,1]:) $ [])

    which is shorter but less clear. =)

  12. Brent says:

    Wait, if you take out the call to tail, you get the empty list at the beginning, which then gives [1] when (++[1]) is applied to it, so [1] doesn’t have to be added separately. Here’s the complete (now much shorter) version:

    all01sequences :: [[Int]]
    all01sequences = map (++[1]) $ sequence =<< iterate ([0,1]:) []

    sweet!

  13. Anonymous says:

    Indeed :)

  14. b_jonas says:

    I don’t understand why this is faster than factoring every number: to generate L_inf up to p_n, you have to generate the primes up to p_n, which is probably not much faster than factoring every number up to p_n.

  15. Brent says:

    b_jonas: Well, in order to factor every number up to p_n we also have to generate the primes up to p_n, so let’s take calculating a list of primes as given. (I should also note that there are better ways to calculate the primes than what I have here.) The difference is that with this method we are starting with the primes, and using them to generate only squarefree numbers. So we never even have to check non-squarefree numbers, and for squarefree numbers we are implicitly generating their prime factorizations straight away, with no extra work; in order to start with the numbers and factor them, we’d have to do a lot of wasted work checking for divisors.

    Another way to think about it: if you forget about the merging, and have primeStep involve (map (p:) xs) instead of (map (p*) xs), we get a list of the prime factorizations of all squarefree numbers. But we do no wasted work to get it (checking for divisors which aren’t divisors, factoring non-squarefree numbers) so it’s clearly faster.

    I should also point out that generating the primes up to p_n certainly is faster than factoring every number up to p_n, even if we imagine doing both by trial division: in order to discard a non-prime, we need only find a single factor, whereas to fully factor that number may require much more work.

  16. Leon says:

    Your definition of mergeAll is not that efficient. You’ll find it faster to fold right than to fold left-ish.

    mergeAll [] = []
    mergeAll ([]:xss) = mergeAll xss
    mergeAll ((x:xs):xss) = x : (xs # mergeAll xss)

  17. fysx says:

    This one liner does squareFree numbers too (if only 2 years too late):

    squareFree = 1 : 2 : 
                 filter (\y -> not . any ((== 0) . mod y) . 
                               takeWhile (<= y) . 
                               map (^2) $ 
                               tail squareFree) 
                            [3..]
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s