A squarefree number is one which is not divisible by any perfect squares. Put another way, the prime factorization of a squarefree number includes at most one copy of any given prime. So, the first few squarefree numbers are

1,2,3,5,6,7,10,11,13,14,15,…

How can we generate these in Haskell? In particular, we want to define the infinite sorted list of squarefree numbers, to be lazily computed on demand. We could implement a brute-force method where we simply factor every positive integer and only keep those which don’t have repeated factors. But that would be awfully slow. This is Haskell we’re talking about, there must be a better way!

There are better ways, of course; here’s one. Suppose that (somehow) we have computed the list $L_n$, which contains all the squarefree numbers with prime factors up to and including $p_n$, the nth prime number. For example, $L_1 = [1,2]$; $L_2 = [1,2,3,6]$; and so on. Given $L_n$, we can compute $L_{n+1}$ as follows: multiply every element of $L_n$ by $p_{n+1}$ to produce a new list $L_{n+1}'$; this list contains all those squarefree numbers with prime factors up to $p_{n+1}$, which are divisible by $p_{n+1}$. Then we just merge $L_n$ with $L_{n+1}'$ to produce $L_{n+1}$.

Let’s try it: we start with $L_0 = [1]$, the list of all squarefree numbers with no prime factors. Then we compute $L_1 = [1,2]$. Multiplying by 3 gives $L_2' = [3,6]$, and merging yields $L_2 = [1,2,3,6]$. Multiplying by 5 gives $L_3' = [5,10,15,30]$; merging again gives us $L_3 = [1,2,3,5,6,10,15,30]$. And so on.

So, how do we translate this into a definition of the infinite list of squarefree numbers? It’s not quite as straightforward as it seems. First of all, we can’t just say the equivalent of “compute $L_{\infty}$“; nothing would ever get generated that way, since everything in $L_{n+1}$ depends on everything in $L_{n}$. There’s nothing inherent in the above method that indicates which part of the list isn’t going to change on the next iteration. And we can’t do something like lazily merge all the lists in $L_1, L_2, L_3\dots$; the problem is that every squarefree number occurs infinitely often in such a list.

The key is to note that the “intermediate” lists $L_n'$ are more important than we might have thought. The infinite sequence of lists $L_1', L_2', L_3',\dots$ in fact contains every squarefree number exactly once (except 1); moreover, they are ordered by their first elements in addition to being ordered themselves, which gives us just enough information to implement an infinite merge that will actually get around to producing something in a finite amount of time!

The primeStep function below takes the prime $p_{n+1}$ and the list $L_n$, and produces the pair $(L_{n+1}, L_{n+1}')$. mapAccumL (one of those higher-order functions which isn’t used much but comes in very handy every once in a while) is used to carry along the current list $L_n$ in an accumulator while simultaneously producing the list ${[L_1', L_2', L_3', \dots]}$. Finally mergeAll performs a lazy infinite merge, giving us the infinite list of squarefree numbers.

import Data.List (mapAccumL)

sieve (x:xs) = x : sieve (filter (n -> n mod x /= 0) xs)
primes = sieve [2..]

-- merge two nondecreasing lists.
(#) :: (Ord a) => [a] -> [a] -> [a]
[] # ys = ys
xs # [] = xs
xs@(x:xt) # ys@(y:yt) | x < y = x : (xt # ys)
| x > y = y : (xs # yt)
| otherwise = x : (xt # yt)

-- merge an infinite list of lists, assuming that each list
-- is nondecreasing and the lists occur in order of their first
-- element.
mergeAll :: (Ord a) => [[a]] -> [a]
mergeAll ([] : zs) = mergeAll zs
mergeAll (xxs@(x:xs) : yys@(y:ys) : zs)
| x < y = x : mergeAll (xs : yys : zs)
| otherwise = mergeAll ((xxs # yys) : zs)

-- given a list of all squarefree numbers with factors up to
-- but not including p, produce (a,b), where a is all squarefree
-- numbers with factors up to p, and b only includes those which
-- are multiples of p.
primeStep xs p = (xs # pxs, pxs)
where pxs = map (p*) xs

-- the nth element of primeLists is a sorted list of squarefree
-- numbers with factors up to p_n, which are all multiples of p_n.
-- Therefore every squarefree number occurs exactly once in (concat
-- primeLists), and the lists in primeLists are sorted by first element.
primeLists = snd $mapAccumL primeStep [1] primes -- to get a sorted list of squarefree numbers, just merge primeLists. squarefree = 1 : mergeAll primeLists It seems to work: *Main> take 20$ squarefree
[1,2,3,5,6,7,10,11,13,14,15,17,19,21,22,23,26,29,30,31]

Neat!

At the beginning, I said there are better ways; here’s another! It makes use of the beautiful theory of Dirichlet generating functions, and in particular, the fact that $\zeta(s)/\zeta(2s)$ is the DGF for the squarefree numbers. Using David Amos’ fantastic mathematics library, this is a piece of cake:

import DirichletSeries
import Data.Maybe

bitToMaybe :: Int -> a -> Maybe a
bitToMaybe 1 = Just
bitToMaybe _ = const Nothing

squarefree = catMaybes $zipWith bitToMaybe (coeffsDS squarefreeDGF) [1..] Can you come up with another way? (Thanks to chessguy, nasa_, olsner, and oerjan from #haskell for suggestions and ideas!) About these ads This entry was posted in haskell, math. Bookmark the permalink. ### 17 Responses to Squarefree numbers in Haskell 1. Rasmus Ulvund says: I use this: notsquare = [x|x<-[1..], False == perfsquare x] perfsquare a = notElem (amod10) [2,3,7,8] && (floor b == (ceiling b)) where b = (sqrt . fromIntegral) a The only hackish part(s) of this is that perfect squares never end with 2,3,7,8 :). The precision of floats could be a problem for large numbers, but it is enough for my purposes. I haven’t experienced it being a bottleneck. Followed the link from project euler, and I also saw you on reddit. Downvoted this post. Sorry ;) 2. Brent says: That isn’t quite the same thing; you’ve defined the list of all integers which aren’t perfect squares, whereas I’m talking about numbers which aren’t divisible by any perfect squares. For example, 12 is in your list but not mine. 3. Rasmus Ulvund says: I see 4. Pingback: Top Posts « WordPress.com 5. Adam says: If you focus on the L instead of the L’, you can get a pretty simple definition. — merge two sorted lists, removing duplicates merge :: Ord a => [a] -> [a] -> [a] merge a [] = a merge [] b = b merge (x:xs) (y:ys) | x filter ((/= 0) . flip mod (head o)) o) (iterate (+ 1) 2)) — doesn’t actually merge lists — instead, takes the nth element from the nth list, going down the diagonal mergeAll :: [[a]] -> [a] mergeAll ((h:hs):t) = h:(mergeAll$ map tail t)

— a list of the L lists, not the L’ lists
— since they each contain the beginning of the sequence,
— we can use the simplified mergeAll
ll :: [[Int]]
ll = llfun primes [1] where
llfun (h:t) pre = pre : (llfun t (merge pre (map (* h) pre)))

squareless :: [Int]
squareless = mergeAll ll

merge :: Ord a => [a] -> [a] -> [a]
merge a [] = a
merge [] b = b
merge (x:xs) (y:ys)
| x>y = y:(merge (x:xs) ys)
| x==y = x:(merge xs ys)
| otherwise = x:(merge xs (y:ys))

primes :: [Int]
primes = map head (iterate (\o -> filter ((/= 0) . flip mod (head o)) o) (iterate (+ 1) 2))

mergeAll :: [[a]] -> [a]
mergeAll ((h:hs):t) = h:(mergeAll $map tail t) ll :: [[Int]] ll = llfun primes [1] where llfun (h:t) pre = pre : (llfun t (merge pre (map (* h) pre))) squareless :: [Int] squareless = mergeAll ll Really sorry about that. This should work 7. Brent says: Aha! The idea of taking the diagonal is neat. It does mean that producing the first n squarefree numbers is O(n^2), since before producing each one it has to first throw away all the previous ones; but then again, I’m not sure what the running time of my version is, either. (Reasoning about time complexity when laziness is involved isn’t obvious…) 8. anonymous says: Here is a (IMHO nice) way to enumerate all squarefree numbers, although not in sorted order. The key idea is that all sorted numbers can be written as p^a + q^b + r^c + … where p,q,r,… are the prime numbers and a,b,c are drawn from { 0, 1 }. Without further ado, here’s the code: import Control.Monad all01sequences :: [[Int]] all01sequences = filter (not . (== 0) . last)$
concatMap (sequence . flip replicate [0,1]) [1..]

allSQFreeNats :: [Int]
allSQFreeNats = 1 : liftM (product . zipWith (^) primes) all01sequences

9. Brent says:

Hey, that’s nifty! Might I suggest replacing filter (not . (==0) . last) with ([1]:) . map (++[1]) ?

10. anonymous says:

Sure, makes it even shorter :) me idly wonders which magical higher-order cool-monad function one needs to shorten this even more…

(For all those following along, I typoed “+” for “*” in the description of the “key idea”.)

11. Brent says:

Well, not only shorter, but more efficient: it no longer has to throw away any of the generated sequences. Note that ([1]:) . map (++[1]) isn’t doing the same thing as filter (not . (==0) . last), it’s just appending 1 to all the sequences and then manually adding in the sequence [1] again.

It occurs to me that you could also replace

concatMap (sequence . flip replicate [0,1]) [1..]

with

concatMap sequence . tail . iterate ([0,1]:) $[] which isn’t really all that much shorter but is (IMO) clearer. Also, concatMap is just (=<<) in the list monad, so you could also write sequence =<< (tail . iterate ([0,1]:)$ [])

which is shorter but less clear. =)

12. Brent says:

Wait, if you take out the call to tail, you get the empty list at the beginning, which then gives [1] when (++[1]) is applied to it, so [1] doesn’t have to be added separately. Here’s the complete (now much shorter) version:

all01sequences :: [[Int]]
all01sequences = map (++[1]) $sequence =<< iterate ([0,1]:) [] sweet! 13. Anonymous says: Indeed :) 14. b_jonas says: I don’t understand why this is faster than factoring every number: to generate L_inf up to p_n, you have to generate the primes up to p_n, which is probably not much faster than factoring every number up to p_n. 15. Brent says: b_jonas: Well, in order to factor every number up to p_n we also have to generate the primes up to p_n, so let’s take calculating a list of primes as given. (I should also note that there are better ways to calculate the primes than what I have here.) The difference is that with this method we are starting with the primes, and using them to generate only squarefree numbers. So we never even have to check non-squarefree numbers, and for squarefree numbers we are implicitly generating their prime factorizations straight away, with no extra work; in order to start with the numbers and factor them, we’d have to do a lot of wasted work checking for divisors. Another way to think about it: if you forget about the merging, and have primeStep involve (map (p:) xs) instead of (map (p*) xs), we get a list of the prime factorizations of all squarefree numbers. But we do no wasted work to get it (checking for divisors which aren’t divisors, factoring non-squarefree numbers) so it’s clearly faster. I should also point out that generating the primes up to p_n certainly is faster than factoring every number up to p_n, even if we imagine doing both by trial division: in order to discard a non-prime, we need only find a single factor, whereas to fully factor that number may require much more work. 16. Leon says: Your definition of mergeAll is not that efficient. You’ll find it faster to fold right than to fold left-ish. mergeAll [] = [] mergeAll ([]:xss) = mergeAll xss mergeAll ((x:xs):xss) = x : (xs # mergeAll xss) 17. fysx says: This one liner does squareFree numbers too (if only 2 years too late): squareFree = 1 : 2 : filter (\y -> not . any ((== 0) . mod y) . takeWhile (<= y) . map (^2)$
tail squareFree)
[3..]