Competitive programming in Haskell: folding folds

Now that the semester is over—and I will be on sabbatical in the fall!—you can expect a lot more competitive programming in Haskell posts. In a previous post, I challenged you to solve Origami. j0sejuan took me up on the challenge, as did Aaron Allen and Ryan Yates; if you still want to try it, go do it before reading on!

In the problem, we start with a square sheet of paper and are given a series of folds to perform in sequence; each fold is specified as a line, and we fold whatever is on one side of the line across onto the other side. Given some query points, we have to compute how thick the resulting origami design is at each point.


The first order of business is some computational geometry relating to lines in 2D (this code can all be found in Geom.hs). Here I am following Victor Lecomte’s excellent Handbook of geometry for competitive programmers, which I think I’ve mentioned before. I’ll try to give a bit of explanation, but if you want full explanations and proofs you should consult that document.

The equation of a line ax + by = c can be thought of as the set of all points (x,y) whose dot product with the vector (a,b) is a constant c. This will in fact be a line perpendicular to the vector (a,b), where c determines the distance of the line from the origin. Alternatively, we can think of the vector (b,-a), which is perpendicular to (a,b) and thus parallel to the line; the line now consists of all points (x,y) whose 2D cross product with (b,-a) is the constant c (since (b,-a) \times (x,y) = by - (-a)x = ax + by; note that the order matters). Either representation would work, but I will follow Lecomte in choosing the second: we represent a line by a vector giving its direction, and a scalar offset.

data L2 s = L2 { getDirection :: !(V2 s), getOffset :: !s }
type L2D = L2 Double

There are a few ways to construct a line: from an equation ax + by = c, or from two points which lie on the line. The first is easy, given the above discussion. For the second, given points p and q, we can easily construct the direction of the line as v = q - p. Then to get the constant c, we simply use the fact that c is the cross product of the direction vector with any point on the line, say, p (of course q would also work).

lineFromEquation :: Num s => s -> s -> s -> L2 s
lineFromEquation a b c = L2 (V2 b (-a)) c

lineFromPoints :: Num s => P2 s -> P2 s -> L2 s
lineFromPoints p q = L2 v (v `cross` p)
  where
    v = q ^-^ p

Now we can write some functions to decide where a given point lies with respect to a line. First, the side function computes ax + by - c = (b,-a) \times (x,y) - c for any point p = (x,y).

side :: Num s => L2 s -> P2 s -> s
side (L2 v c) p = cross v p - c

Of course, for points that lie on the line, this quantity will be zero. We can also classify points p as lying to the left or right of the line (looking in the direction of v) depending on whether side l p is positive or negative, respectively.

onLine :: (Num s, Eq s) => L2 s -> P2 s -> Bool
onLine l p = side l p == 0

leftOf :: (Num s, Ord s) => L2 s -> P2 s -> Bool
leftOf l p = side l p > 0

rightOf :: (Num s, Ord s) => L2 s -> P2 s -> Bool
rightOf l p = side l p < 0

The last piece we will need to solve the problem is a way to reflect a point across a line. toProjection l p computes the vector perpendicular to l which points from p to l, and reflectAcross works by adding toProjection l p to p twice. I won’t derive the definition of toProjection, but the basic idea is to start with a vector perpendicular to the direction of the line (perp v) and scale it by a factor related to side l p. (Intuitively, it makes sense that ax + by - c tells us something about the distance from (x,y) to the line; the farther away (x,y) is from the line, the farther ax + by is from c.) See Lecomte for the full details.

toProjection :: Fractional s => L2 s -> P2 s -> V2 s
toProjection l@(L2 v _) p = (-side l p / normSq v) *^ perp v

project :: Fractional s => L2 s -> P2 s -> P2 s
project l p = p ^+^ toProjection l p

reflectAcross :: Fractional s => L2 s -> P2 s -> P2 s
reflectAcross l p = p ^+^ (2 *^ toProjection l p)

Folding origami

Finally we can solve the problem! First, some imports and input parsing.

{-# LANGUAGE RecordWildCards #-}

import           Control.Arrow
import qualified Data.ByteString.Lazy.Char8 as C

import           Geom
import           ScannerBS

main = C.interact $
  runScanner tc >>> solve >>> map (show >>> C.pack) >>> C.unlines

data TC = TC { steps :: [L2D], holes :: [P2D] }

tc = TC <$> numberOf (lineFromPoints <$> p2 double <*> p2 double) <*> numberOf (p2 double)

solve :: TC -> [Int]
solve TC{..} = map countLayers holes

For countLayers, the idea is to work backwards from a given query point to find all its preimages, that is, the points that will eventually map to that point under the folds. Then we can just count how many of those points lie (strictly) inside the original square of paper.

    inSquare (V2 x y) = 0 < x && x < 1000 && 0 < y && y < 1000

For a given point and fold, there are two possibilities, depending on which side of the fold line the point falls on. If the point falls on the fold or to the right of it, then it has no preimages (we always fold from right to left, so after the fold, there will be no paper on the right side of the line, and the problem specifies that points exactly on a folded edge do not count). Hence we can just discard such a point. On the other hand, if the point lies on the left side of the line, then the point has two preimages: the point itself, and its reflection across the fold line.

    preimage :: L2D -> P2D -> [P2D]
    preimage l p
      | leftOf l p = [p, reflectAcross l p]
      | otherwise  = []

So we keep a set of points, starting with the singleton query point, and for each fold (in order from last to first) we find the preimage of every point in the set under the fold. We actually use lists of points instead of sets, because (1) we won’t ever get any collisions (actually, the more I think about this, the less sure I am!) and (2) it lets us use the actual list monad instead of making some ad-hoc Set monad operations.

    countLayers :: P2D -> Int
    countLayers q = length . filter inSquare $ foldr (\l -> (>>= preimage l)) [q] steps

It is very satisfying to use a fold to process a list of folds!

Next time: Please, Go First

For next time, I invite you to solve Please, Go First.


Average and median via optimization

This is certainly not a new observation, and I’m sure it is folklore and/or contained in various textbooks, but it was amusing to derive it independently.

Suppose we have a finite set of real numbers \{x_1, \dots, x_n\}, and we want to pick a value m which is somehow “in the middle” of the x_i. The punchline is that

  • if we want to minimize the sum of the squared distances from m to each x_i, we should pick m to be the average of the x_i;
  • if we want to minimize the sum of the absolute distances from m to each x_i, we should pick m to be the median of the x_i.

The first of these is tricky to understand intuitively but easy to derive; the second is intuitively straightforward but trying to derive it leads to an interesting twist.

Average = minimizing sum of squared distances

Let’s not worry about why we would want to minimize the sum of squared distances; there are good reasons and it’s not the point. I don’t know about you, but I find it difficult to reason intuitively about how and why to pick m to minimize this sum of squared differences. If you know of an intuitive way to explain this, I would love to hear about it! But in any case, it is easy to derive using some straightforward calculus.

Let \displaystyle S(m) = \sum_i(m - x_i)^2 denote the sum of squared distances from a given m to each of the x_i. Taking the derivative of S with respect to m, we find

\displaystyle \frac{d}{dm} S(m) = \sum_i 2(m - x_i).

Setting the derivative equal to zero, we can first divide through by the factor of 2, yielding

\displaystyle 0 = \sum_i (m - x_i)

Since m does not depend on i, this is just n copies of m less the sum of the x_i. Hence, solving for m yields

\displaystyle m = \frac{1}{n} \sum_i x_i

as expected: the value of m which minimizes the sum of squared distances to the x_i is their average, that is, the sum of the x_i divided by the size of the set.
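We can sanity-check this numerically (a quick illustration, not part of the derivation; the sample values are made up):

```haskell
-- Sum of squared distances from m to each x_i.
sqDistSum :: [Double] -> Double -> Double
sqDistSum xs m = sum [ (m - x)^2 | x <- xs ]

sample :: [Double]
sample = [1, 2, 4, 7]

mean :: Double
mean = sum sample / fromIntegral (length sample)   -- 3.5

-- sqDistSum sample mean == 21.0, and nudging m in either direction
-- only increases the sum: sqDistSum sample 3 == 22.0, sqDistSum sample 4 == 22.0.
```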

Median = minimizing sum of absolute distances

Now suppose we want to minimize the sum of absolute distances instead, that is,

S(m) = \sum_i |m - x_i|

In this scenario, it is much easier to reason out the correct answer. Start with some arbitrary m, and imagine nudging it by some small amount \Delta x, say, to the right. m’s distances to any points on its left will each increase by \Delta x, and its distances to any points on its right will each decrease by the same amount. Therefore, if there are more x_i to the left of m, then the overall sum of distances will increase; if there are more x_i to the right, then the overall sum will decrease. So, to find m which minimizes the sum of absolute differences, we want the same number of x_i on the left and the right, that is, we want the median. Note that if n is odd, then we must pick m to be exactly equal to the x_i in the middle; if n is even, then we can pick m to be anywhere inside the interval between the middle two x_i.
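Again we can check this numerically (an illustration with made-up sample values):

```haskell
-- Sum of absolute distances from m to each x_i.
absDistSum :: [Double] -> Double -> Double
absDistSum xs m = sum [ abs (m - x) | x <- xs ]

sample :: [Double]
sample = [1, 2, 4, 7]

-- n is even here, so any m in the interval [2,4] between the middle two
-- values attains the minimum:
--   absDistSum sample 2 == 8.0
--   absDistSum sample 3 == 8.0
--   absDistSum sample 4 == 8.0
-- while values outside the interval do worse, e.g. absDistSum sample 5 == 10.0
```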

Just for fun, can we derive this answer using calculus, like we did for minimizing squared differences? There is a wrinkle, of course, which is that the absolute value function is not differentiable everywhere: it has a sharp corner at zero. But we won’t let that stop us! Clearly the derivative of |x| is -1 when x < 0 and 1 when x > 0. So it seems reasonable to just assign the derivative a value of 0 at x = 0. Algebraically, we can define

\displaystyle \frac{d}{dx} |x| = [x > 0] - [x < 0]

where [P] is equal to 1 when the proposition P is true, and 0 when it is false (this notation is called the Iverson bracket). So when x > 0 we get [x > 0] - [x < 0] = 1 - 0 = 1; when x < 0 we get 0 - 1 = -1; and when x = 0 both propositions are false so we get 0 - 0 = 0.

Armed with this definition, we can differentiate S with respect to m:

\displaystyle \frac{d}{dm} S(m) = \frac{d}{dm} \sum_i |m - x_i| = \sum_i [m > x_i] - \sum_i [m < x_i]

Clearly, this is zero when \displaystyle \sum_i [m > x_i] = \sum_i [m < x_i], that is, when there are the same number of x_i on either side of m.
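This derivative transcribes directly into code (a small illustration; `iverson` and `dS` are just helper names I made up):

```haskell
-- Iverson bracket: 1.0 if the proposition holds, 0.0 otherwise.
iverson :: Bool -> Double
iverson b = if b then 1 else 0

-- dS/dm, using the Iverson-bracket derivative of |x| (which is 0 at x = 0).
dS :: [Double] -> Double -> Double
dS xs m = sum [ iverson (m > x) | x <- xs ] - sum [ iverson (m < x) | x <- xs ]

-- With xs = [1,2,4,7], dS is negative to the left of the middle interval,
-- zero inside it, and positive to the right:
--   dS [1,2,4,7] 1.5 == -2.0
--   dS [1,2,4,7] 3   ==  0.0
--   dS [1,2,4,7] 5   ==  2.0
```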

The curious thing to me is that even though the derivative of |x| is undefined when x = 0, it seems like it “wants” to be 0 here. In general, if we assign the value k to the derivative at x = 0, the derivative of S becomes

\displaystyle \frac{d}{dm} S(m) = \sum_i [m > x_i] + k \sum_i [m = x_i] - \sum_i [m < x_i]

When k is nonzero and n is odd, there are no values of m for which this derivative is zero, making it more difficult to find the minimum.


Competitive programming in Haskell: folding challenge

As of this writing, I am the only person who has solved Origami on Open Kattis (see the problem statistics here). This is a shame since it is a very nice problem! Of course, I solved it in Haskell.

I challenge you to solve it—bonus points for using a fold in your solution!


Competitive programming in Haskell: cycle decomposition with mutable arrays

In my previous post I challenged you to solve Chair Hopping: if a bunch of people permute themselves according to the same rule twice, how many different rules could they be following which would result in the observed final permutation? Or, more formally, given a permutation \rho on [1 \dots n], how many permutations \sigma are there such that \sigma^2 = \rho?
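To make the statement concrete, here is a tiny list-based illustration of squaring a permutation (the helper names are mine, not from the eventual solution):

```haskell
-- A permutation on [1..n] as a list: position i-1 holds the image of i.
type PermList = [Int]

-- Apply sigma after tau.  (For squaring, the order of composition
-- doesn't matter.)
composeP :: PermList -> PermList -> PermList
composeP sigma tau = [ sigma !! (tau !! (i-1) - 1) | i <- [1 .. length sigma] ]

squareP :: PermList -> PermList
squareP p = composeP p p

-- squareP [2,3,1] == [3,1,2], so σ = [2,3,1] is one "square root" of ρ = [3,1,2].
```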

Since this has to do with permutations, it should be unsurprising that cycle decomposition comes into the picture. And we have discussed cycle decomposition of permutations before; using those techniques to decompose the given permutation into cycles should be straightforward, right?

Not so fast!

Here is the code we used previously to compute the size of the cycle containing a given element:

dist :: Perm -> Int -> Int -> Int
dist p i j = length $ takeWhile (/= j) (iterate (p!) i)

cycleLen :: Perm -> Int -> Int
cycleLen p i = succ $ dist p (p!i) i

There’s nothing particularly wrong with this code, and no way to speed it up per se. Computing the distance between i and j in permutation p takes O(n), since we may have to scan through a significant fraction of the entire permutation if i and j are in a large cycle. But this is unavoidable. cycleLen then just uses dist, and if all we want to do is find the length of a single cycle this is unavoidable too.

However, the problem comes when we want to, for example, find the length of the cycle containing each of many elements. cycleLen will take O(n) for each element we call it on. In the worst case, if the entire permutation consists of one giant cycle, calling cycleLen on every element will take O(n^2) overall. And this is particularly silly since the work of following the cycle will be entirely repeated every time, only starting from a different place! When n = 200, as in The Power of Substitution, an O(n^2) algorithm is no big deal; but when n = 10^5 it’s entirely too slow. Using 10^8 operations per second as our rule of thumb, we expect an O(n^2) algorithm on an input with n = 10^5 to take on the order of (10^5)^2 / 10^8 = 100 seconds. An input size of 10^5 is extremely common in competitive programming problems: not so big that I/O is going to be a huge bottleneck, but big enough that you need to come up with an algorithm faster than O(n^2) (for example, O(n) or O(n \lg n) are both fine).

Permutations and fast cycle decomposition

The idea is to do the work of decomposing a permutation into cycles only once, in O(n) time, and store the results in a data structure that allows us to look up the needed information quickly. (This general technique of preprocessing some data into a structure allowing for fast subsequent query/lookup is ubiquitous in competitive programming, and indeed in all of computer science.) The catch? I don’t know of a good way to do this without using mutable arrays! But if we write it generically we can potentially reuse it (I have in fact reused this code several times already on other problems).

Let’s make a library for representing permutations. This code can be found in Perm.hs. First, some imports and the main Perm type itself, which is just an alias for UArray Int Int. UArray represents (immutable) unboxed arrays, that is, arrays whose elements can be stored “unboxed” in a contiguous block of memory. “Boxed” arrays are those where the array actually stores pointers and the elements themselves are allocated somewhere else. Of course we prefer using unboxed arrays whenever possible!

{-# LANGUAGE BangPatterns #-}

module Perm where

import           Control.Arrow
import           Control.Monad.ST
import           Data.Array.Base
import           Data.Array.MArray
import           Data.Array.ST
import           Data.Array.Unboxed

-- | 'Perm' represents a /1-indexed/ permutation.  It can also be
--   thought of as an endofunction on the set @{1 .. n}@.
type Perm = UArray Int Int

Just based on the problems where I used it, I’ve chosen to make Perm values 1-indexed, though of course we could easily have made a different choice. We can now define a few utility functions for working with permutations: fromList constructs a Perm from a list; andThen composes permutations; and inverse computes the inverse of a permutation. We’ll only need fromList to solve Chair Hopping, but the others may come in handy for other problems.

-- | Construct a 'Perm' from a list containing a permutation of the
--   numbers 1..n.  The resulting 'Perm' sends @i@ to whatever number
--   is at index @i-1@ in the list.
fromList :: [Int] -> Perm
fromList xs = listArray (1,length xs) xs

-- | Compose two permutations (corresponds to backwards function
--   composition).  Only defined if the permutations have the same
--   size.
andThen :: Perm -> Perm -> Perm
andThen p1 p2 = listArray (bounds p1) (map ((p1!) >>> (p2!)) (range (bounds p1)))

-- | Compute the inverse of a permutation.
inverse :: Perm -> Perm
inverse p = array (bounds p) [ (p!k, k) | k <- range (bounds p) ]

When decomposing a permutation into cycles, we assign each cycle a unique ID number, and compute a number of mappings:

  • from each element to the ID number of its cycle;
  • from each cycle to its length;
  • from each element to its index in its cycle;
  • from each possible cycle size to the number of cycles of that size.

These mappings are collected in the CycleDecomp data type:

data CycleDecomp = CD
  { cycleID     :: UArray Int Int  -- ^ Each number maps to the ID# of the cycle it is part of
  , cycleLen    :: UArray Int Int  -- ^ Each cycle ID maps to the length of that cycle
  , cycleIndex  :: UArray Int Int  -- ^ Each element maps to its (0-based) index in its cycle
  , cycleCounts :: UArray Int Int  -- ^ Each size maps to the number of cycles of that size
  }
  deriving Show

We can use these to quickly look up information about the cycle decomposition of a permutation. For example, if we want to know the size of the cycle containing element e, we can look it up with cycleLen!(cycleID!e). Or if we know that a and b are in the same cycle and we want to know the distance from a to b, we can compute it as (cycleIndex!b - cycleIndex!a) `mod` (cycleLen!(cycleID!a)).
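For instance, for the permutation [2,3,1,5,4], whose cycles are (1 2 3) and (4 5), the arrays would look like this (built by hand here just to illustrate the query patterns, with plain arrays standing in for the CycleDecomp fields):

```haskell
import Data.Array.Unboxed

-- Hand-built decomposition of [2,3,1,5,4]: cycle 1 = (1 2 3), cycle 2 = (4 5).
cycleID, cycleIndex :: UArray Int Int
cycleID    = listArray (1,5) [1,1,1,2,2]
cycleIndex = listArray (1,5) [0,1,2,0,1]

cycleLen :: UArray Int Int
cycleLen = listArray (1,2) [3,2]

-- Size of the cycle containing element 4:
--   cycleLen!(cycleID!4) == 2
-- Distance from 1 to 3 within their shared cycle (two steps: 1 -> 2 -> 3):
--   (cycleIndex!3 - cycleIndex!1) `mod` (cycleLen!(cycleID!1)) == 2
```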

Finally, here’s my code to actually compute all this information about a cycle decomposition in O(n) time, which works by looking at each element, and when finding an element which is so far unprocessed, it does a DFS in the permutation following the cycle from that element. To be honest, it’s kind of ugly; that’s what we get for working with mutable arrays in Haskell. I am very much interested if anyone has any ideas on how to make this (1) faster or (2) prettier. (I am aware those two criteria may be at odds!) I’m using STUArray which allows mutation inside a monadic ST block; at the end we freeze them into normal immutable UArrays. (Note there are also unsafe variants of reading, writing, and freezing which do fewer checks, but using them didn’t seem to speed things up; I’m very open to suggestions.)

-- | Cycle decomposition of a permutation in O(n), using mutable arrays.
permToCycles :: Perm -> CycleDecomp
permToCycles p = cd where

  (_,n) = bounds p

  cd = runST $ do
    cid <- newArray (1,n) 0
    cix <- newArray (1,n) 0
    ccs <- newArray (1,n) 0

    lens <- findCycles cid cix ccs 1 1
    cid' <- freeze cid
    cix' <- freeze cix
    ccs' <- freeze ccs
    return $ CD cid' (listArray (1,length lens) lens) cix' ccs'

  findCycles :: STUArray s Int Int -> STUArray s Int Int -> STUArray s Int Int
    -> Int -> Int -> ST s [Int]
  findCycles cid cix ccs l !k   -- l = next available cycle ID; k = cur element
    | k > n     = return []
    | otherwise = do
        -- check if k is already marked as part of a cycle
        id <- readArray cid k
        case id of
          0 -> do
            -- k is unvisited.  Explore its cycle and label it as l.
            len <- labelCycle cid cix l k 0

            -- Remember that we have one more cycle of this size.
            count <- readArray ccs len
            writeArray ccs len (count+1)

            -- Continue with the next label and the next element, and
            -- remember the size of this cycle
            (len:) <$> findCycles cid cix ccs (l+1) (k+1)

          -- k is already visited: just go on to the next element
          _ -> findCycles cid cix ccs l (k+1)

  -- Explore a single cycle, label all its elements and return its size.
  labelCycle cid cix l k !i = do

    -- Keep going as long as the next element is unlabelled.
    id <- readArray cid k
    case id of
      0 -> do

        -- Label the current element with l.
        writeArray cid k l
        -- The index of the current element is i.
        writeArray cix k i

        -- Look up the next element in the permutation and continue.
        (1+) <$> labelCycle cid cix l (p!k) (i+1)
      _ -> return 0

This code is overly generic in some sense—we don’t actually need all this information to solve Chair Hopping, for example—but again, I am trying to make it as reusable as possible.

Now, how can we use cycle decomposition to solve Chair Hopping? That will have to wait for another post!


Competitive programming in Haskell: 2D cross product, part 1

Time for some more geometry! In my previous post I challenged you to solve Cookie Cutters, which asks us to scale the vertices of a polygon so that it has a certain prescribed area. It’s possible to solve this just by looking up an algorithm for computing the area of a polygon (see the “shoelace formula”). But the way to get good at solving geometry problems is not by memorizing a bunch of formulas, but rather by understanding a few general primitives and principles which can be assembled to solve a wide range of problems.

Incidentally, if you’re serious about getting good at geometric problems in competitive programming, then you absolutely must read Victor Lecomte’s Handbook of geometry for competitive programmers. (It’s still a great read even if you’re not serious!)

The 2D cross product

In two dimensions, given vectors \mathbf{u} = (u_x, u_y) and \mathbf{v} = (v_x, v_y), we can compute their cross product as

\mathbf{u} \times \mathbf{v} = \begin{vmatrix} u_x & v_x \\ u_y & v_y \end{vmatrix} = u_x v_y - v_x u_y.

One useful way to understand this is as giving the signed area of the parallelogram determined by \mathbf{u} and \mathbf{v}. The area is positive when \mathbf{v} is counterclockwise from \mathbf{u}, negative when it is clockwise, and zero when the two vectors are collinear (i.e. parallel or antiparallel).

I’m not going to prove this here, since to be quite honest I don’t remember off the top of my head how to derive it. (Also, geometric algebra does a much better job of explaining where this comes from and generalizing to any number of dimensions; in particular, \mathbf{u} \times \mathbf{v} is the coefficient of the bivector resulting from the outer product of \mathbf{u} and \mathbf{v}. But that would take us much too far afield for now!)

So let’s write some Haskell code to compute the cross product of 2D vectors. (All this code has of course been added to Geom.hs.)

cross :: Num s => V2 s -> V2 s -> s
cross (V2 ux uy) (V2 vx vy) = ux*vy - vx*uy

crossP :: Num s => P2 s -> P2 s -> P2 s -> s
crossP p1 p2 p3 = cross (p2 ^-^ p1) (p3 ^-^ p1)

type P2 s = V2 s
type P2D  = P2 Double

A few things to note:

  • cross works over any scalar type which is an instance of Num. In solving Cookie Cutters, this is going to be Double, but it could also be, e.g. Integer.
  • For convenience, crossP is a variant of cross that takes three points as arguments, and computes the cross product of the vector from the first to the second with the vector from the first to the third. In many instances where we want to use the cross product, we actually have the coordinates of three points/vertices.
  • We’ve added P2 and P2D as type aliases for V2 and V2D. They are just aliases, not newtypes, to reduce the need for separate operators that work on points vs vectors, but it’s still helpful to have different type aliases to at least alert us to whether our functions morally want to be given vectors or points as arguments.

Now, keeping in mind the fundamental interpretation of the 2D cross product as computing the signed area of a parallelogram, we can derive a few other operations. First, given the three vertices of a triangle, we can compute the signed area of the triangle as half of the cross product (because the triangle is half the parallelogram). Note that the order of the vertices matters: the area will be positive if they are in counterclockwise order, and negative if clockwise. Swapping any two vertices negates the result. If we want the normal nonnegative area of a triangle regardless of the order of the vertices, of course we can just take the absolute value.

signedTriArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
signedTriArea p1 p2 p3 = crossP p1 p2 p3 / 2

triArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
triArea p1 p2 p3 = abs (signedTriArea p1 p2 p3)

(Notice the Fractional constraint since we have to divide by two.) At first glance, you might think the concept of “signed triangle area” is silly and useless. But it turns out to be the key to understanding the “shoelace formula”.

The shoelace formula for polygon area

Imagine first that we have a convex polygon. If we pick a point somewhere in its interior (say, the centroid) and draw lines from the central point to every vertex, we chop up the polygon into triangles. Obviously, adding up the areas of these triangles will give us the area of the polygon.

What’s much less obvious is that if we add up the signed area of each triangle, it still works even if (1) the polygon is not convex, and/or (2) the “central point” is not in the interior of the polygon! That is, we just pick some arbitrary “central point” (the origin works nicely) and compute the signed area of the triangle formed by the origin and each edge of the polygon. A sort of magical inclusion-exclusion thing happens where all the area outside the polygon gets canceled out, and all the area inside ends up getting counted exactly once. Rather than try to prove this to you, I’ll just illustrate some examples.

So, here’s the Haskell code. signedPolyArea yields a positive area if the vertices of the polygon are in “counterclockwise order” (puzzle: what does “counterclockwise order” mean for a non-convex polygon? Hint: look up “winding number”; this is also the key to a formal proof that all of this works), and negative if they are clockwise.

signedPolyArea :: Fractional s => [P2 s] -> s
signedPolyArea pts = sum $ zipWith (signedTriArea zero) pts (tail pts ++ [head pts])

polyArea :: Fractional s => [P2 s] -> s
polyArea = abs . signedPolyArea

The “shoelace formula”, as it is usually presented, falls out if you inline the zero argument to signedTriArea and then simplify the result. It would be possible to do this and code an optimized version of signedPolyArea that uses the shoelace formula more directly, but I much prefer having this version which is built out of meaningful and reusable components!
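For reference, inlining zero and simplifying yields something like the following (a sketch of the optimized variant, not code from Geom.hs; it uses plain coordinate pairs to stay self-contained):

```haskell
-- Direct shoelace formula: sum x1*y2 - x2*y1 over consecutive vertex
-- pairs (wrapping around at the end), then halve.
shoelace :: Fractional s => [(s, s)] -> s
shoelace pts = sum (zipWith term pts (tail pts ++ [head pts])) / 2
  where term (x1, y1) (x2, y2) = x1*y2 - x2*y1

-- A 4x3 rectangle with vertices in counterclockwise order:
--   shoelace [(0,0),(4,0),(4,3),(0,3)] == 12.0
-- Listing the vertices clockwise negates the result.
```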

Incidentally, there is a 3D analogue to the shoelace formula for computing the volume of a 3D polyhedron, but it requires some care to first make sure all the faces are oriented in a compatible way; see section 3.5 of Lecomte.

Other utilities

I added a couple more utilities to Geom.hs which we will need. First, since we need to scale polygons up or down to give a required area, we need the concept of multiplying a vector by a scalar:

(*^) :: Num s => s -> V2 s -> V2 s
k *^ (V2 x y) = V2 (k*x) (k*y)

Also, to help with reading vectors from the input, I added this combinator:

v2 :: Applicative f => f s -> f (V2 s)
v2 s = V2 <$> s <*> s

The idea is to use it with f ~ Scanner. For example, if double :: Scanner Double then we can write v2 double :: Scanner (V2 Double).
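To see it in action without the full Scanner module, here is a toy token-based scanner with the same Applicative shape (a stand-in, not the real ScannerBS):

```haskell
-- A toy scanner: a stateful function over a list of input tokens.
newtype Scanner a = Scanner { runScanner :: [String] -> (a, [String]) }

instance Functor Scanner where
  fmap f (Scanner g) = Scanner $ \ts -> let (a, ts') = g ts in (f a, ts')

instance Applicative Scanner where
  pure a = Scanner $ \ts -> (a, ts)
  Scanner sf <*> Scanner sa = Scanner $ \ts ->
    let (f, ts')  = sf ts
        (a, ts'') = sa ts'
    in (f a, ts'')

str :: Scanner String
str = Scanner $ \(t:ts) -> (t, ts)

double :: Scanner Double
double = read <$> str

data V2 s = V2 !s !s deriving (Eq, Show)

v2 :: Applicative f => f s -> f (V2 s)
v2 s = V2 <$> s <*> s

-- fst (runScanner (v2 double) (words "1.5 2.5")) == V2 1.5 2.5
```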

Last but not least, I also added getX and getY field labels to the V2 type, for when we need to extract the coordinates of a point or vector:

data V2 s = V2 { getX :: !s, getY :: !s } deriving (Eq, Ord, Show)

Finally, here’s my solution to Cookie Cutters. First, some imports and main, which just scans the input, generates the required scaled and translated list of vertices, and then formats the output.

import           Control.Arrow
import qualified Data.Foldable as F
import           Text.Printf

import           Geom
import           Scanner

main = interact $
  runScanner scan >>> solve >>> map (F.toList >>> map (printf "%.5f") >>> unwords) >>> unlines

Here’s the data type for storing the input, along with a Scanner for it. Notice how we use v2 double' to read in 2D vectors (well, actually points!) in the input. The annoying thing is that some floating-point values in the input are formatted like .5, with no leading 0, and read ".5" :: Double crashes. Hence the need for the double' scanner below, which reads a string token and potentially adds a leading zero before conversion to Double.

data TC = TC { polygon :: [P2D], area :: Double }

scan :: Scanner TC
scan = do
  n <- int
  TC <$> n `times` (v2 double') <*> double'

double' :: Scanner Double
double' = (read . fixup) <$> str
  where
    fixup s@('.':_) = '0':s
    fixup s         = s

And finally, putting the pieces together to solve the meat of the problem. We first compute the area of the given polygon using polyArea, then divide the desired area by the original area to find the factor by which the area must increase (or decrease). Area scales as the square of distance, so we must take the square root of this factor to find the factor by which the vertices must scale. We simply scale all the vertices appropriately, then find the minimum x and y coordinates so we can translate by their negation, to make the polygon touch the positive x and y axes as required.

solve :: TC -> [P2D]
solve (TC ps a) = map (^-^ V2 xmin ymin) ps'
  where
    a0 = polyArea ps
    s  = sqrt (a / a0)     -- scaling factor to get the desired area
    ps' = map (s *^) ps
    xmin = minimum (map getX ps')
    ymin = minimum (map getY ps')

Next time: Chair Hopping

For next time I invite you to solve Chair Hopping. Warning, this one is rather difficult! But I had a lot of fun solving it, and the solution touches on several interesting topics (in fact, I’ll probably need more than one blog post).


Competitive programming in Haskell: data representation and optimization, with cake

In my previous post I challenged you to solve Checking Break, which presents us with a cake in the shape of a rectangular prism, with chocolate chips at various locations, and asks us to check whether a proposed division of the cake is valid. A division of the cake is valid if it is a partition (no pieces overlap and every part of the cake is in some piece) and every piece contains one chocolate chip.

No one posted a solution—I don’t know if that’s because people have lost interest, or because no one was able to solve it—but in any case, don’t read this post yet if you still want to try solving it! As a very small hint, part of the reason I chose this problem is that it is an interesting example of a case where just getting the correct asymptotic time complexity is not enough—we actually have to work a bit to optimize our code so it fits within the allotted time limit.

The algorithm

When solving this problem I first just spent some time thinking about the different things I would have to compute and what algorithms and data structures I could use to accomplish them.

  • The first thing that jumped out at me is that we are going to want some kind of abstractions for 3D coordinates, and for 3D rectangular prisms (i.e. boxes, i.e. pieces of cake). Probably we can just represent boxes as a pair of points at opposite corners of the box (in fact this is how boxes are given to us). As we plan out how the rest of the solution is going to work we will come up with a list of operations these will need to support.

    As an aside, when working in Java I rarely make any classes beyond the single main class, because it’s too heavyweight. When working in Haskell, on the other hand, I often define little abstractions (i.e. data types and operations on them) because they are so lightweight, and being able to cleanly separate things into different layers of abstraction helps me write solutions that are more obviously correct.

  • We need to check that the coordinates of each given box are valid.

  • We will need to check that every piece of cake contains exactly one chocolate chip. At first this sounds difficult—given a chip, how do we find out which box(es) it is in? Or given a box, how can we find out which chips are in it? To do this efficiently seems like it will require some kind of advanced 3D space partitioning data structure, like an octree or a BSP tree. BUT this is a situation where reading carefully pays off: the problem statement actually says that “the i-th part must contain the i-th chocolate chip”. So all we have to do is zip the list of pieces together with the list of chips. We just need an operation to test whether a given point is contained in a given box.

  • We have to check that none of the boxes overlap. We can make a primitive to check whether two boxes intersect, but how do we make sure that none of the boxes intersect? Again, complicated space-partitioning data structures come to mind; but since there are at most 10^3 boxes, the number of pairs is on the order of 10^6. There can be multiple test cases, though, and the input specification says the sum of values for m (the number of pieces) over all test cases will be at most 5 \times 10^4. That means that in the worst case, we could get up to 50 test cases with 10^3 pieces of cake (and thus on the order of 10^6 pairs of pieces) per test case. Given 10^8 operations per second as a rough rule of thumb, it should be just barely manageable to do a brute-force check over every possible pair of boxes.

  • Finally, we have to check that the pieces account for every last bit of the cake. If we think about checking this directly, it is quite tricky. One could imagine making a 3D array representing every cubic unit of cake, and simply marking off the cubes covered by each piece, but this is completely out of the question since the cake could be up to 10^6 \times 10^6 \times 10^6 in size! Or we could again imagine some complicated space-partitioning structure to keep track of which parts have and have not been covered so far.

    But there is a much simpler way: just add up the volume of all the pieces and make sure it is the same as the volume of the whole cake! Of course this relies on the fact that we are also checking to make sure none of the pieces overlap: the volumes being equal implies that the whole cake is covered if and only if none of the pieces overlap. In any case, we will need a way to compute the volume of a box.

Implementation and optimization

Let’s start with some preliminaries: LANGUAGE pragmas, imports, main, and the parser.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards   #-}
{-# LANGUAGE TupleSections     #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.Monoid

import           ScannerBS

main = C.interact $
  runScanner (many tc) >>> init >>>
  map (solve >>> bool "NO" "YES") >>> C.unlines

data TC = TC { cake :: Box, chips :: [Pos], parts :: [Box] }

tc :: Scanner TC
tc = do
  a <- int
  case a of
    -1 -> return undefined
    _  -> do
      xs <- three int
      let [b,c,m] = xs
          cake    = Box (Pos 1 1 1) (Pos a b c)
      TC cake <$> m `times` pos <*> m `times` box

The parser is worth remarking upon. The input consists of multiple test cases, with a single value of -1 marking the end of the input. This is annoying: ideally, we would have a many combinator that keeps running a Scanner until it fails, but we don’t. To keep things simple and fast, our Scanner abstraction does not support parse failures and alternatives! The many combinator we made keeps running a given Scanner until the end of input, not until it fails. The quick-and-dirty solution I adopted is to make the test case Scanner return undefined if it sees a -1, and then simply ignore the final element of the list of test cases via init. Not pretty but it gets the job done.
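This trick only works because of laziness: `init` drops the final element of a list without ever evaluating it, so the `undefined` is never forced. A tiny self-contained demonstration:

```haskell
-- Laziness makes the undefined-then-init trick safe: `init` drops the
-- last element of a list without ever evaluating it.
trimmed :: [Int]
trimmed = init [1, 2, undefined]

main :: IO ()
main = print trimmed  -- prints [1,2]; the undefined is never forced
```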

Representing positions and boxes

Next let’s consider building abstractions for 3D coordinates and boxes. It is very tempting to do something like this:

type Pos = [Integer]
type Box = [Pos]

-- Check whether one position is componentwise <= another
posLTE :: Pos -> Pos -> Bool
posLTE p1 p2 = and $ zipWith (<=) p1 p2

-- ... and so on

Using list combinators like zipWith to work with Pos and Box values is quite convenient. And for some problems, using lists is totally fine. Having a small number of large lists—e.g. reading in a list of 10^5 integers and processing them somehow—is rarely a problem. But having a large number of small lists, as we would if we use lists to represent Pos and Box here, slows things down a lot (as I learned the hard way). I won’t go into the details of why—I am no expert on Haskell performance—but suffice to say that lists are a linked structure with a large memory overhead.

So let’s do something more direct. We’ll represent both Pos and Box as data types with strict fields (the strict fields make a big difference, especially in the case of Pos), and make some trivial Scanners for them. The volume function computes the volume of a box; given that the coordinates are coordinates of the cubes that make up the pieces, and are both inclusive, we have to add one to the difference between the coordinates. Note we assume that the first coordinate of a Box should be elementwise less than or equal to the second; otherwise, the call to max 0 ensures we will get a volume of zero.

data Pos = Pos !Int !Int !Int
data Box = Box !Pos !Pos

pos :: Scanner Pos
pos = Pos <$> int <*> int <*> int

box :: Scanner Box
box = Box <$> pos <*> pos

volume :: Box -> Int
volume (Box (Pos x1 y1 z1) (Pos x2 y2 z2)) = (x2 -. x1) * (y2 -. y1) * (z2 -. z1)
  where
    x -. y = max 0 (x - y + 1)

Another very important note is that we are using Int instead of Integer. Using Integer is lovely when we can get away with it, since it means not worrying about overflow at all; but in this case using Int instead of Integer yields a huge speedup (some quick and dirty tests show about a factor of 6 speedup on my local machine, and replacing Int with Integer, without changing anything else, makes my solution no longer accepted on Kattis). Of course, this comes with an obligation to think about potential overflow: the cake can be at most 10^6 units on each side, giving a maximum possible volume of 10^{18}. On a 64-bit machine, that just fits within an Int (maxBound :: Int is approximately 9.2 \times 10^{18}). Since the Kattis test environment is definitely 64-bit, we are good to go. In fact, limits for competitive programming problems are often chosen so that required values will fit within 64-bit signed integers (C++ has no built-in facilities for arbitrary-size integers); I’m quite certain that’s why 10^6 was chosen as the maximum size of one dimension of the cake.
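As a quick sanity check (assuming a 64-bit GHC, as in the Kattis environment), we can verify that the maximum possible volume fits:

```haskell
-- The maximum cake volume is (10^6)^3 = 10^18, which must fit in a
-- 64-bit Int (whose maxBound is about 9.2 * 10^18).
maxVolume :: Int
maxVolume = 10 ^ (18 :: Int)

fits :: Bool
fits = maxVolume < maxBound  -- True on a 64-bit machine

main :: IO ()
main = print fits
```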

Pos and Box utilities

Next, some utilities for checking whether one Pos is elementwise less than or equal to another, and for taking the elementwise max and min of two Pos values. Checking whether a Box contains a Pos simply reduces to doing two calls to posLTE (again assuming a valid Box with the first corner componentwise no greater than the second).

posLTE (Pos x1 y1 z1) (Pos x2 y2 z2) = x1 <= x2 && y1 <= y2 && z1 <= z2
posMax (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (max x1 x2) (max y1 y2) (max z1 z2)
posMin (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (min x1 x2) (min y1 y2) (min z1 z2)

contains :: Box -> Pos -> Bool
contains (Box lo hi) p = posLTE lo p && posLTE p hi

To test whether a box is a valid box within a given cake, we test that its corners are in the correct order and fit within the low and high coordinates of the cake.

valid :: Box -> Box -> Bool
valid (Box lo hi) (Box c1 c2) = posLTE lo c1 && posLTE c1 c2 && posLTE c2 hi

How to test whether two given boxes intersect or not? There are probably many ways to do this, but the nicest way I could come up with is to first find the actual Box which represents their intersection, and check whether it has a positive volume (relying on the fact that volume returns 0 for degenerate boxes with out-of-order coordinates). In turn, to find the intersection of two boxes, we just take the coordinatewise max of their lower corners, and the coordinatewise min of their upper corners.

intersection :: Box -> Box -> Box
intersection (Box c11 c12) (Box c21 c22) = Box (posMax c11 c21) (posMin c12 c22)

disjoint :: Box -> Box -> Bool
disjoint b1 b2 = volume (intersection b1 b2) == 0

The solution

Finally, we can put the pieces together to write the solve function. We simply check that all the given cake parts are valid; that every part contains its corresponding chocolate chip; that every pair of parts is disjoint; and that the sum of the volumes of all parts equals the volume of the entire cake.

solve :: TC -> Bool
solve (TC{..}) = and
  [ all (valid cake) parts
  , and $ zipWith contains parts chips
  , all (uncurry disjoint) (pairs parts)
  , sum (map volume parts) == volume cake
  ]

Computing all pairs

Actually, there’s still one missing piece: how to compute all possible pairs of parts. The simplest possible thing would be to use a list comprehension like

[(x,y) | x <- parts, y <- parts]

but this has problems: first, it includes a pairing of each part with itself, which will definitely have a nonzero intersection. We could exclude such pairs by adding x /= y as a guard, but there is another problem: (p2,p1) is included whenever (p1,p2) is included, but this is redundant since disjoint is commutative. In fact, we don’t really want all pairs; we want all unordered pairs, that is, all sets of size two. We can do that with the below utility function (which I have now added to Util.hs):

pairs :: [a] -> [(a,a)]
pairs []     = []
pairs [_]    = []
pairs (a:as) = map (a,) as ++ pairs as

This is accepted, and runs in about 0.91 seconds (the time limit is 2 seconds). However, I was curious whether we are paying anything here for all the list operations, so I wrote the following version, which takes a binary operation for combining list elements, and a Monoid specifying how to combine the results, and directly returns the monoidal result of combining all the pairs, without ever constructing any intermediate lists or tuples at all. It’s sort of like taking the above pairs function, following it by a call to foldMap, and then manually fusing the two to get rid of the intermediate list.

withPairs :: Monoid r => (a -> a -> r) -> [a] -> r
withPairs _ []     = mempty
withPairs _ [_]    = mempty
withPairs f (a:as) = go as
  where
    go []        = withPairs f as
    go (a2:rest) = f a a2 <> go rest

To use this, we have to change the solve function slightly: instead of

  , all (uncurry disjoint) (pairs parts)

we now have

  , getAll $ withPairs (\p1 p2 -> All $ disjoint p1 p2) parts

This version runs significantly faster on Kattis—0.72 seconds as opposed to 0.91 seconds. (In fact, it’s faster than the currently-fastest Java solution (0.75 seconds), though there is still a big gap to the fastest C++ solution (0.06 seconds).) I don’t completely understand why this version is faster—perhaps one of you will be able to enlighten us!

For next time

For next time, we’ll go back to computational geometry: I invite you to solve Cookie Cutters.

Posted in competitive programming, haskell | 7 Comments

Competitive programming in Haskell: vectors and 2D geometry

In my previous post (apologies it has been so long!) I challenged you to solve Vacuumba, which asks us to figure out where a robot ends up after following a sequence of instructions. Mathematically, this corresponds to adding up a bunch of vectors, but the interesting point is that the instructions are always relative to the robot’s current state, so robot programs are imperative programs.

Vector basics

The first order of business is to code up some primitives for dealing with (2D) vectors. I have accumulated a lot of library code for doing geometric stuff, but it’s kind of a mess; I’m using this as an opportunity to clean it up bit by bit. So there won’t be much code at first, but the library will grow as we do more geometry problems. The code so far (explained below) can be found in the comprog-hs repository.

First, a basic representation for 2D vectors, the zero vector, and addition and subtraction of vectors.

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Geom where

-- 2D points and vectors

data V2 s = V2 !s !s deriving (Eq, Ord, Show)
type V2D  = V2 Double

instance Foldable V2 where
  foldMap f (V2 x y) = f x <> f y

zero :: Num s => V2 s
zero = V2 0 0

-- Adding and subtracting vectors
(^+^), (^-^) :: Num s => V2 s -> V2 s -> V2 s
V2 x1 y1 ^+^ V2 x2 y2 = V2 (x1+x2) (y1+y2)
V2 x1 y1 ^-^ V2 x2 y2 = V2 (x1-x2) (y1-y2)

A few things to point out:

  • The V2 type is parameterized over the type of scalars, but we define V2D as a synonym for V2 Double, which is very common. The reason for making V2 polymorphic in the first place, though, is that some problems require the use of exact integer arithmetic. It’s nice to be able to share code where we can, and have the type system enforce what we can and can’t do with vectors over various scalar types.

  • For a long time I just represented vectors as lists, type V2 s = [s]. This makes implementing addition and subtraction very convenient: for example, (^+^) = zipWith (+). Although this has worked just fine for solving many geometry problems, I have recently been reminded that having lots of small lists can be bad for performance. As long as we’re making a library anyway we might as well use a proper data type for vectors!

  • Elsewhere I have made a big deal out of the fact that vectors and points ought to be represented as separate types. But in a competitive programming context I have always just used a single type for both and it hasn’t bit me (yet!).

  • The Foldable instance for V2 gets us toList. It also gets us things like sum and maximum which could occasionally come in handy.
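To illustrate that last point (reproducing the V2 type from above so the snippet stands alone), here is what the Foldable instance buys us:

```haskell
-- What the Foldable instance gives us for free: toList, sum, maximum, etc.
import Data.Foldable (toList)

data V2 s = V2 !s !s deriving (Eq, Ord, Show)

instance Foldable V2 where
  foldMap f (V2 x y) = f x <> f y

main :: IO ()
main = do
  print (toList (V2 3 4 :: V2 Int))    -- [3,4]
  print (sum (V2 3 4 :: V2 Int))       -- 7
  print (maximum (V2 3 4 :: V2 Int))   -- 4
```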

Angles and rotation

The other thing we are going to need for this problem is angles.

-- Angles

newtype Angle = A Double  -- angle (radians)
  deriving (Show, Eq, Ord, Num, Fractional, Floating)

fromDeg :: Double -> Angle
fromDeg d = A (d * pi / 180)

fromRad :: Double -> Angle
fromRad = A

toDeg :: Angle -> Double
toDeg (A r) = r * 180 / pi

toRad :: Angle -> Double
toRad (A r) = r

-- Construct a vector in polar coordinates.
fromPolar :: Double -> Angle -> V2D
fromPolar r θ = rot θ (V2 r 0)

-- Rotate a vector counterclockwise by a given angle.
rot :: Angle -> V2D -> V2D
rot (A θ) (V2 x y) = V2 (cos θ * x - sin θ * y) (sin θ * x + cos θ * y)

Nothing too complicated going on here: we have a type to represent angles, conversions to and from degrees and radians, and then two uses for angles: a function to construct a vector in polar coordinates, and a function to perform rotation.

Incidentally, one could of course define type Angle = Double, which would be simpler in some ways, but after getting bitten several times by forgetting to convert from degrees to radians, I decided it was much better to use a newtype and entirely prevent that class of error.
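For instance, using the pieces above (reproduced here so the snippet compiles on its own), moving 2 units at a heading of 90° should land us, up to floating-point rounding, at (0, 2):

```haskell
-- A small check of fromPolar and rot: distance 2 at 90 degrees ~ (0, 2).
newtype Angle = A Double deriving (Show, Eq, Ord)

data V2 s = V2 !s !s deriving (Eq, Ord, Show)

fromDeg :: Double -> Angle
fromDeg d = A (d * pi / 180)

rot :: Angle -> V2 Double -> V2 Double
rot (A t) (V2 x y) = V2 (cos t * x - sin t * y) (sin t * x + cos t * y)

fromPolar :: Double -> Angle -> V2 Double
fromPolar r t = rot t (V2 r 0)

closeEnough :: Bool
closeEnough =
  let V2 x y = fromPolar 2 (fromDeg 90)
  in  abs x < 1e-9 && abs (y - 2) < 1e-9  -- cos (pi/2) is not exactly 0

main :: IO ()
main = print closeEnough  -- True
```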

Solving Vacuumba

Now we just put the pieces together to solve the problem. First, some imports:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE RecordWildCards  #-}

import           Control.Arrow
import           Control.Monad.State
import qualified Data.Foldable       as F
import           Text.Printf

import           Geom
import           Scanner

We make a data type for representing robot instructions, and a corresponding Scanner. Notice how we are forced to use fromDeg to convert the raw input into an appropriate type.

data Instr = I { turn :: Angle, dist :: Double }

instr :: Scanner Instr
instr = I <$> (fromDeg <$> double) <*> double

The high-level solution then reads the input via a Scanner, solves each scenario, and formats the output. The output is a V2D, so we just convert it to a list with F.toList and use printf to format each coordinate.

main = interact $
  runScanner (numberOf (numberOf instr)) >>>
  map (solve >>> F.toList >>> map (printf "%.6f") >>> unwords) >>> unlines

Our solve function needs to take a list of instructions, and output the final location of the robot. Since the instructions can be seen as an imperative program for updating the state of the robot, it’s entirely appropriate to use a localized State computation.

First, a data type to represent the robot’s current state, consisting of a 2D vector recording the position, and an angle to record the current heading. initRS records the robot’s initial state (noting that it starts out facing north, corresponding to an angle of 90^\circ as measured counterclockwise from the positive x-axis).

data RobotState = RS { pos :: V2D, heading :: Angle }
initRS = RS zero (fromDeg 90)

Finally, the solve function itself executes each instruction in sequence as a State RobotState computation, uses execState to run the resulting overall computation and extract the final state, and then projects out the robot’s final position. Executing a single instruction is where the geometry happens: we look up the current robot state, calculate its new heading by adding the turn angle to the current heading, construct a movement vector in the direction of the new heading using polar coordinates, and add the movement to the current position.

solve :: [Instr] -> V2D
solve = mapM_ exec >>> flip execState initRS >>> pos
  where
    exec :: Instr -> State RobotState ()
    exec (I θ d) = do
      RS{..} <- get
      let heading' = heading + θ
          move     = fromPolar d heading'
      put $ RS (pos ^+^ move) heading'

For next time

We’ll definitely be doing more geometry, but for the next post I feel like doing something different. I invite you to solve Checking Break.

Posted in competitive programming, haskell | 2 Comments

Anyone willing to help me get set up with something like miso?

For the last few years I’ve been working (off and on) on a teaching language for FP and discrete math. One of the big goals is to build a browser-based IDE which runs completely client-side, so students can code in their browsers without installing anything, and I don’t have to worry about running a server. Thanks to amazing technology like GHCJS, miso, reflex, etc., this seems like it should be entirely doable.

However, every time I sit down to try building such a thing, I end up getting completely bogged down in details of nix and stack and .cabal files and whatnot, and never even get off the ground. There are usually nice examples of building a new site from scratch, but I can never figure out the right way to incorporate my large amount of existing Haskell code. Should I have one package? Two separate packages for the website and the language implementation? How can I set things up to build either/both a command-line REPL and a web IDE?

I’m wondering if there is someone experienced with GHCJS and miso who would be willing to help me get things set up. (I’m also open to being convinced that some other combination of technologies would be better for my use case.) I’m imagining a sort of pair-programming session via videoconference. I am pretty flexible in terms of times so don’t worry about whether your time zone matches mine. And if there’s something I can help you with I’m happy to do an exchange.

If you’re interested and willing, send me an email: byorgey at gmail. Thanks!

Posted in haskell | 7 Comments

Competitive programming in Haskell: permutations

In my previous post I challenged you to solve The Power of Substitution. This problem presents us with a substitution cipher, and asks how many times we would have to iterate the encoding process in order to turn a given message into a given encryption.

A non-solution

Several commenters mentioned that they tried some sort of straightforward brute force approach: just iterate the encoding process and count how many iterations are needed to reach the specified encryption. This certainly works for the provided sample inputs. However, I hinted that this approach likely wouldn’t fit within Kattis’s time limit for the other, secret, test cases.

How did I know this solution would exceed the time limit? It is not just an issue of using efficient data structures! Even if someone told me they coded this straightforward solution in C++, I would have the same reaction. The problem specifies that the answer should be at most 10^9—and, as we will see, it is not hard to come up with test cases where the answer is indeed that large—and the simple fact is that counting to 10^9 and applying the encryption at each step (which requires iterating through a list of length 200) is definitely going to take more than 1 second. A useful rule of thumb that I learned from Competitive Programming 3 is 10^8 operations per second. (Of course your processor can carry out much more than 10^8 instructions per second, but in practice this rule of thumb seems to work remarkably well for predicting run times up to the right order of magnitude.)
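The back-of-the-envelope arithmetic for this problem looks like this:

```haskell
-- Brute force would do up to 10^9 iterations, each scanning a
-- 200-letter message: about 2 * 10^11 operations, which at the
-- rule-of-thumb rate of 10^8 operations per second is ~2000 seconds.
totalOps :: Integer
totalOps = 10 ^ 9 * 200

estimatedSeconds :: Integer
estimatedSeconds = totalOps `div` (10 ^ 8)

main :: IO ()
main = print estimatedSeconds  -- 2000, way over a 1-second limit
```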

Cycle decomposition of permutations

The given encoding, of course, is a permutation (the problem specifies only that it will be one-to-one, but any one-to-one endofunction on a finite set must in fact be a permutation). Let’s call it p. If we start with an arbitrary m and repeatedly apply p—that is, m, p(m), p^2(m), p^3(m), \dots—what happens? Of course, because of the pigeonhole principle, the sequence must eventually repeat. But actually, something stronger is true: because p is a permutation, the first repeated value must be m itself. For suppose that p^i(m) = p^j(m) was the first repeated value in the sequence. But then since p is one-to-one, it must be the case that p^{i-1}(m) = p^{j-1}(m) as well, which means p^{i-2}(m) = p^{j-2}(m), and so on up to m = p^0(m) = p^{j-i}(m).

So in fact if we start at an arbitrary m and iterate p, we will find a cycle that includes m (including the possibility of a trivial length-1 cycle if p(m) = m). If there are other elements not included in m’s cycle, we can pick any one of them and repeat the process to find another cycle (which can’t possibly overlap at all with m’s cycle—do you see why?). In general, any permutation can be decomposed in this way into a collection of disjoint cycles.
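As an illustration (my own sketch, not code from the solution below), here is one way to carry out this decomposition for a permutation on [1..n]: repeatedly pick an unvisited element, iterate p until we come back to it, and remove the resulting cycle from the unvisited set.

```haskell
import qualified Data.Set as S

-- Decompose a permutation on [1..n] into its disjoint cycles.
cycles :: Int -> (Int -> Int) -> [[Int]]
cycles n p = go (S.fromList [1 .. n])
  where
    go unvisited = case S.minView unvisited of
      Nothing     -> []
      Just (m, _) ->
        -- iterate p from m until we return to m: that's m's cycle
        let c = m : takeWhile (/= m) (tail (iterate p m))
        in  c : go (foldr S.delete unvisited c)

-- example permutation: 1 -> 2 -> 3 -> 1 and 4 -> 5 -> 4
examplePerm :: Int -> Int
examplePerm i = [2, 3, 1, 5, 4] !! (i - 1)

main :: IO ()
main = print (cycles 5 examplePerm)  -- [[1,2,3],[4,5]]
```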

CRT to the rescue

This idea of cycle decomposition is the key to unlocking the problem. Think about what happens to a particular letter m_i in the message, which we eventually want to become c_i. This will happen after applying the permutation some small number of times j, such that p^j(m_i) = c_i. (In general, of course, it would be quite possible that m_i and c_i are not in the same cycle at all, and so m_i will never turn into c_i no matter how many times we apply p; but the problem statement guarantees that this will not be the case.)

The problem, of course, is that all the other letters may not be encrypted properly after only j encryption steps, in which case we need to keep going until all the cycles line up. Suppose m_i and c_i are in a cycle of length n. That means applying the encryption n times to c_i will result in c_i again (and not before). Thus, we will get c_i after j steps and then every n steps thereafter (j, j+n, j+2n, \dots); in other words, the number of encryption steps must be equivalent to j \pmod n.

Every position in the message yields a similar modular equivalence, giving us a system of up to 200 simultaneous modular equivalences which we can solve using the generalized Chinese Remainder Theorem. Incidentally, this is why the solution can be so large—if we have cycles of sizes n_1, n_2, \dots, n_q, then it could take up to \mathrm{lcm}(n_1, n_2, \dots, n_q) iterations for them to all line up. In the special case that all the cycle sizes are relatively prime, this is just their product. So for example we could have cycles of sizes 2, 3, 5, 7, 11, 13, 17, 19, 23, which add up to exactly 100; the product of these is the primorial 23\# = 223\,092\,870.
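We can check the arithmetic for that example directly:

```haskell
-- Cycle sizes 2,3,5,7,11,13,17,19,23 sum to exactly 100, and since
-- they are pairwise coprime their lcm is just their product, 223092870.
sizes :: [Int]
sizes = [2, 3, 5, 7, 11, 13, 17, 19, 23]

main :: IO ()
main = do
  print (sum sizes)          -- 100
  print (foldr lcm 1 sizes)  -- 223092870
```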

My solution

Here is my solution. We’re going to use an unboxed array to represent the permutation, and the implementation of GCRT from a previous post.

{-# LANGUAGE RecordWildCards #-}

import           Control.Arrow
import           Data.Array.Unboxed

import           NumberTheory
import           Scanner

Here’s main, along with a data type to represent a single test case and a Scanner for reading one in. I like using record syntax to help me remember which field is which, combined with the RecordWildCards extension to unpack the data structure and get the field names as local variables.

main = interact $
  runScanner (numberOf tc) >>> map (solve >>> show) >>> unlines

data TC = TC { message :: [Int], crypt :: [Int], subst :: [Int] }
  deriving Show

tc = do
  l <- int
  TC <$> l `times` int <*> l `times` int <*> 100 `times` int

We’re going to represent a permutation as an unboxed array, which gives us nice constant-time lookup. I often use Data.Array or Data.Array.Unboxed to represent read-only information (which comes up a lot more than you might think!), giving all the benefits of fast array access with none of the complications of mutability.

type Perm = UArray Int Int

The distance between i and j is simply the number of times we have to apply p to get from i to j. Of course this definition would hang if i and j are not part of the same cycle, but we know they will be. Given dist, we can also find the length of a cycle containing i as one more than the distance from p(i) to i. (We can’t just ask for the distance from i to itself since that would return 0.)

dist :: Perm -> Int -> Int -> Int
dist p i j = length $ takeWhile (/= j) (iterate (p!) i)

cycleLen :: Perm -> Int -> Int
cycleLen p i = succ $ dist p (p!i) i

Finally, we can put these pieces together: create an array for the permutation, zip together the message and desired encryption, generating a modular equivalence for each, and solve the resulting system using gcrt.

solve :: TC -> Int
solve (TC{..}) = pick . gcrt $ zipWith modEqn message crypt
  where
    p :: UArray Int Int
    p = listArray (1,100) subst

    modEqn m c = (fromIntegral (dist p m c), fromIntegral (cycleLen p m))
    pick (Just (z,k)) = fromIntegral (z `mod` k)

Incidentally, this code inspired me to create a Util.hs in my comprog-hs repository containing (for now) fi as an alias for fromIntegral, and both to apply a function to both elements of a tuple (sadly lens is not available in the Kattis environment). Then we can just write modEqn m c = both fi (dist p m c, cycleLen p m).
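Based on that description, the definitions in Util.hs would be something like the following (my reconstruction, not necessarily the exact code in the repository):

```haskell
-- fi: a short alias for fromIntegral
fi :: (Integral a, Num b) => a -> b
fi = fromIntegral

-- both: apply a function to both components of a pair
both :: (a -> b) -> (a, a) -> (b, b)
both f (x, y) = (f x, f y)

main :: IO ()
main = print (both fi (3 :: Int, 4 :: Int) :: (Integer, Integer))  -- (3,4)
```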

Solving bigger instances

The above solution works because the alphabet is quite small (only 100). However, it’s actually quite wasteful. For example, suppose that the given message consists of 200 copies of the number 1; then we will recompute the length of 1’s cycle 200 times. It’s easy to imagine a variant of this problem where both the message length and the alphabet size could be much larger. Then my solution above would be too slow. For example, suppose the permutation consists of one giant cycle of length 10^5, and the message also has length 10^5. We would traverse the entire cycle for every single character in the message, for a total of about 10^{10} operations—much too slow. This post has gotten long enough, but in another post I will show an alternative solution which I believe would work quickly enough even for such large inputs (assuming that the input was restricted such that the answer was still of a reasonable size!). The idea is to precompute the cycle decomposition of the permutation (in time proportional to the size of the alphabet), storing the information in such a way that for each pair of letters in the message and desired encryption, we can find the distance between them and the length of their cycle in constant time.
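As a sketch of what I have in mind (names and details mine, and subject to change in that later post): record, for every element, the id of its cycle, its position within the cycle, and the cycle's length. Then the distance between two letters in the same cycle is just a modular difference of positions.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- element -> (cycle id, position within its cycle, cycle length)
type Info = M.Map Int (Int, Int, Int)

-- One pass over the permutation, proportional to the alphabet size.
precompute :: Int -> (Int -> Int) -> Info
precompute n p = M.fromList
  [ (x, (cid, i, length c))
  | (cid, c) <- zip [0 ..] allCycles
  , (i, x)   <- zip [0 ..] c
  ]
  where
    allCycles = go (S.fromList [1 .. n])
    go s = case S.minView s of
      Nothing     -> []
      Just (m, _) ->
        let c = m : takeWhile (/= m) (tail (iterate p m))
        in  c : go (foldr S.delete s c)

-- Distance from m to c (Nothing if they are in different cycles).
distTo :: Info -> Int -> Int -> Maybe Int
distTo info m c = do
  (cm, im, len) <- M.lookup m info
  (cc, ic, _)   <- M.lookup c info
  if cm == cc then Just ((ic - im) `mod` len) else Nothing

-- example permutation: 1 -> 2 -> 3 -> 1 and 4 -> 5 -> 4
examplePerm :: Int -> Int
examplePerm i = [2, 3, 1, 5, 4] !! (i - 1)

main :: IO ()
main = do
  let info = precompute 5 examplePerm
  print (distTo info 1 3)  -- Just 2 (1 -> 2 -> 3)
  print (distTo info 1 4)  -- Nothing (different cycles)
```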

Next up: geometry

Next, I’d like to spend a few posts on the topic of geometry. I find that geometry problems work particularly well in Haskell (I don’t think I’ve solved a single geometry problem in Java). Let’s kick things off with a problem on the easier side: I invite you to solve Vacuumba.

Have fun!

Posted in competitive programming, haskell | 5 Comments

Competitive programming in Haskell: building unordered trees

In my previous post I challenged you to solve Subway Tree System, which encodes trees by recording sequences of steps taken away from and towards the root while exploring the whole tree, and asks whether two such recordings denote the same tree. There are two main difficulties here: the first is how to do the parsing; the second, how to compare two trees when we don’t care about the order of children at each node. Thanks to all of you who posted your solutions—I learned a lot. I often feel like my solution is obviously the “only” solution, but then when I see how others solve a problem I realize that the solution space is much larger than I thought!

My solution

Here’s my solution, with some commentary interspersed. First, some pragmas and imports and such:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TupleSections     #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8  as C
import           Data.Function
import           Data.List
import           Data.List.Split
import           Data.Map                    (Map)
import qualified Data.Map                    as M

import           Text.Parsec
import           Text.Parsec.ByteString.Lazy
import           Text.Parsec.Char

My main then looks like this:

main = C.interact $
  C.lines >>> drop 1 >>> chunksOf 2 >>>
  map (solve >>> bool "different" "same") >>> C.unlines

The use of ByteString instead of String isn’t really necessary for this problem, just habit. I split the input into lines, group them in twos using Data.List.Split.chunksOf, solve each test case, and turn the output into different or same appropriately. (Data.Bool.bool is the fold/case analysis for the Bool type; I never use it in any other Haskell code but am unreasonably fond of it for this particular use case.) It would also be possible to use the Scanner abstraction instead of lines, drop, and chunksOf, as commenter blaisepascal2014 did. In some ways that would actually be nicer, but I often default to using these more basic tools in simple cases.


Now for parsing the trees. The parsing is not too bad, and several commenters essentially did it manually with a recursive function manipulating a stack and so on; the most creative used a tree zipper to literally walk around the tree being constructed, just like you are supposedly walking around a subway in the problem. However, the parsec package is available in the Kattis environment, so the easiest thing is to actually whip up a proper little parser. (I know of several other Kattis problems which can be nicely solved using parser combinators but would be annoying otherwise, for example, Calculator and Otpor. A rather fiendish but fun parsing puzzle is Learning to Code.)

readTree :: C.ByteString -> Tree
readTree = parse parseTree "" >>> either undefined id
  where
    parseTree    = Node     <$> parseForest
    parseForest  = fromList <$> many parseSubtree
    parseSubtree = char '0' *> parseTree <* char '1'

Of course I haven’t actually shown the definition of Tree, Node, or fromList yet, but hopefully you get the idea. either undefined id is justified here since the input is guaranteed to be well-formed, so the parser will never actually fail with a Left.

Unordered trees

The other difficulty is how to compare trees up to reordering children. Trying all permutations of the children at each node and seeing whether any match is obviously going to be much too slow! The key insight, and what this problem had in common with the one from my previous post, is that we can use an (automatically-derived) Ord instance to sort the children at each node into a canonical order. We don’t really need to know or care what order they end up in, which depends on the precise details of how the derived Ord instance works. The point is that sorting into some consistent order allows us to efficiently test whether two lists are permutations of each other.

I think everyone who posted a solution created some kind of function to “canonicalize” a tree, by first canonicalizing all subtrees and then sorting them. When I first solved this problem, however, I approached it along slightly different lines, hinted at by commenter Globules: can we define the Tree type in such a way that there is only a single representation for each tree-up-to-reordering?

My first idea was to use a Data.Set of children at each node, but this is subtly wrong, since it gets rid of duplicates! We don’t actually want a set of children at each node, but rather a bag (aka multiset). So I made a little Bag abstraction out of a Map. The magical thing is that GHC can still derive an Ord instance for my recursive tree type containing a newtype containing a Map containing trees! (OK, OK, it’s not really magic, but it still feels magic…)
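To see concretely why a Set is the wrong choice, here is a minimal sketch contrasting it with a Map-based bag (using M.fromListWith directly, rather than the Bag newtype below):

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

main :: IO ()
main = do
  -- A Set silently collapses duplicates: "aab" becomes two elements...
  print (S.size (S.fromList "aab"))                             -- 2
  -- ...whereas a Map-based bag records each element's multiplicity:
  print (M.fromListWith (+) (map (\c -> (c, 1 :: Int)) "aab"))
```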

Now, actually, I no longer think this is the best solution, but it’s interesting, so I’ll leave it. Later on I will show what I think is an even better solution.

newtype Tree = Node (Bag Tree)
  deriving (Eq, Ord)

newtype Bag a = Bag (Map a Int)
  deriving (Eq, Ord)

fromList :: Ord a => [a] -> Bag a
fromList = map (,1) >>> M.fromListWith (+) >>> Bag

The final piece is the solve function, which simply calls readTree on the two strings and compares the resulting (canonical!) Tree values for equality.

solve :: [C.ByteString] -> Bool
solve [t1,t2] = ((==) `on` readTree) t1 t2

A better way

I still think it’s a nice idea to have canonical-by-construction trees, rather than building ordered trees and then calling a separate function to canonicalize them afterwards. But inspired by several commenters’ solutions, I realized that rather than my complex Bag type, it’s much nicer to simply use a sorted list as the canonical representation of a Node’s bag of subtrees, and to use a smart constructor to build them:

newtype Tree = Node [Tree]
  deriving (Eq, Ord)

mkNode :: [Tree] -> Tree
mkNode = Node . sort
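Here is a minimal self-contained sketch of why the smart constructor works: two trees built with their children in different orders come out structurally equal (leaf is just a hypothetical helper for the example):

```haskell
import Data.List (sort)

newtype Tree = Node [Tree] deriving (Eq, Ord, Show)

-- Smart constructor: sorting the children means trees that are equal
-- up to reordering are structurally equal by construction.
mkNode :: [Tree] -> Tree
mkNode = Node . sort

leaf :: Tree
leaf = mkNode []

main :: IO ()
main = print (mkNode [mkNode [leaf], leaf] == mkNode [leaf, mkNode [leaf]])  -- True
```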

Then we just use mkNode instead of Node in the parser, and voilà! The canonicalization happens on the fly while parsing the tree. By contrast, if we write a separate canonicalization function, like

canonical :: Tree -> Tree
canonical (Node ts) = Node (map canonical (sort ts))

it is actually possible to get it wrong. In fact, I deliberately introduced a bug into the above function: can you see what it is?

All told, then, here is the (in my opinion) nicest solution that I know of:

{-# LANGUAGE OverloadedStrings #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.Function
import           Data.List

import           Text.Parsec
import           ScannerBS                  hiding (many)

main = C.interact $
  runScanner (numberOf (two str)) >>>
  map (solve >>> bool "different" "same") >>> C.unlines

solve :: [C.ByteString] -> Bool
solve [t1,t2] = ((==) `on` readTree) t1 t2

newtype Tree = Node [Tree] deriving (Eq, Ord)

readTree :: C.ByteString -> Tree
readTree = parse parseTree "" >>> either undefined id
  where
    parseTree    = (Node . sort) <$> many parseSubtree
    parseSubtree = char '0' *> parseTree <* char '1'

Next problem

For Tuesday, I invite you to solve The Power of Substitution. Don’t let the high difficulty rating scare you; in my estimation it should be quite accessible if you know a bit of math and have been following along with some of my previous posts (YMMV). However, it’s not quite as obvious what the nicest way to write it in Haskell is.
