## Competitive programming in Haskell: 2D cross product, part 1

Time for some more geometry! In my previous post I challenged you to solve Cookie Cutters, which asks us to scale the vertices of a polygon so that it has a certain prescribed area. It’s possible to solve this just by looking up an algorithm for computing the area of a polygon (see the “shoelace formula”). But the way to get good at solving geometry problems is not by memorizing a bunch of formulas, but rather by understanding a few general primitives and principles which can be assembled to solve a wide range of problems.

Incidentally, if you’re serious about getting good at geometric problems in competitive programming, then you absolutely must read Victor Lecomte’s Handbook of geometry for competitive programmers. (It’s still a great read even if you’re not serious!)

## The 2D cross product

In two dimensions, given vectors $\mathbf{u} = (u_x, u_y)$ and $\mathbf{v} = (v_x, v_y)$, we can compute their cross product as

$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} u_x & v_x \\ u_y & v_y \end{vmatrix} = u_x v_y - v_x u_y.$

One useful way to understand this is as giving the signed area of the parallelogram determined by $\mathbf{u}$ and $\mathbf{v}$. The area is positive when $\mathbf{v}$ is counterclockwise from $\mathbf{u}$, negative when it is clockwise, and zero when the two vectors are collinear (i.e. parallel or antiparallel).

I’m not going to prove this here, since to be quite honest I don’t remember off the top of my head how to derive it. (Also, geometric algebra does a much better job of explaining where this comes from and generalizing to any number of dimensions; in particular, $\mathbf{u} \times \mathbf{v}$ is the coefficient of the bivector resulting from the outer product of $\mathbf{u}$ and $\mathbf{v}$. But that would take us much too far afield for now!)

So let’s write some Haskell code to compute the cross product of 2D vectors. (All this code has of course been added to Geom.hs.)

cross :: Num s => V2 s -> V2 s -> s
cross (V2 ux uy) (V2 vx vy) = ux*vy - vx*uy

crossP :: Num s => P2 s -> P2 s -> P2 s -> s
crossP p1 p2 p3 = cross (p2 ^-^ p1) (p3 ^-^ p1)

type P2 s = V2 s
type P2D  = P2 Double

A few things to note:

• cross works over any scalar type which is an instance of Num. In solving Cookie Cutters, this is going to be Double, but it could also be, e.g. Integer.
• For convenience, crossP is a variant of cross that takes three points as arguments, and computes the cross product of the vector from the first to the second with the vector from the first to the third. In many instances where we want to use the cross product, we actually have the coordinates of three points/vertices.
• We’ve added P2 and P2D as type aliases for V2 and V2D. They are just aliases, not newtypes, to reduce the need for separate operators that work on points vs vectors, but it’s still helpful to have different type aliases to at least alert us to whether our functions morally want to be given vectors or points as arguments.

Now, keeping in mind the fundamental interpretation of the 2D cross product as computing the signed area of a parallelogram, we can derive a few other operations. First, given the three vertices of a triangle, we can compute the signed area of the triangle as half of the cross product (because the triangle is half the parallelogram). Note that the order of the vertices matters: the area will be positive if they are in counterclockwise order, and negative if clockwise. Swapping any two vertices negates the result. If we want the normal nonnegative area of a triangle regardless of the order of the vertices, of course we can just take the absolute value.

signedTriArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
signedTriArea p1 p2 p3 = crossP p1 p2 p3 / 2

triArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
triArea p1 p2 p3 = abs (signedTriArea p1 p2 p3)

(Notice the Fractional constraint since we have to divide by two.) At first glance, you might think the concept of “signed triangle area” is silly and useless. But it turns out to be the key to understanding the “shoelace formula”.
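For concreteness, here is a small self-contained sketch of the sign behavior (the definitions from Geom.hs are restated inline so the snippet runs on its own):

```haskell
-- Self-contained restatement of cross, crossP, and signedTriArea.
data V2 s = V2 !s !s deriving (Eq, Show)

(^-^) :: Num s => V2 s -> V2 s -> V2 s
V2 x1 y1 ^-^ V2 x2 y2 = V2 (x1-x2) (y1-y2)

cross :: Num s => V2 s -> V2 s -> s
cross (V2 ux uy) (V2 vx vy) = ux*vy - vx*uy

crossP :: Num s => V2 s -> V2 s -> V2 s -> s
crossP p1 p2 p3 = cross (p2 ^-^ p1) (p3 ^-^ p1)

signedTriArea :: Fractional s => V2 s -> V2 s -> V2 s -> s
signedTriArea p1 p2 p3 = crossP p1 p2 p3 / 2

-- The triangle (0,0), (1,0), (0,1) listed counterclockwise has signed
-- area 1/2; swapping two vertices reverses the orientation and the sign.
ccw, cw :: Double
ccw = signedTriArea (V2 0 0) (V2 1 0) (V2 0 1)  -- 0.5
cw  = signedTriArea (V2 0 0) (V2 0 1) (V2 1 0)  -- -0.5
```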

## The shoelace formula for polygon area

Imagine first that we have a convex polygon. If we pick a point somewhere in its interior (say, the centroid) and draw lines from the central point to every vertex, we chop up the polygon into triangles. Obviously, adding up the areas of these triangles will give us the area of the polygon.

What’s much less obvious is that if we add up the signed area of each triangle, it still works even if (1) the polygon is not convex, and/or (2) the “central point” is not in the interior of the polygon! That is, we just pick some arbitrary “central point” (the origin works nicely) and compute the signed area of the triangle formed by the origin and each edge of the polygon. A sort of magical inclusion-exclusion thing happens where all the area outside the polygon gets canceled out, and all the area inside ends up getting counted exactly once. Rather than try to prove this to you, I’ll just illustrate some examples.

So, here’s the Haskell code. signedPolyArea yields a positive area if the vertices of the polygon are in “counterclockwise order” (puzzle: what does “counterclockwise order” mean for a non-convex polygon? Hint: look up “winding number”; this is also the key to a formal proof that all of this works), and negative if they are clockwise.

signedPolyArea :: Fractional s => [P2 s] -> s
signedPolyArea pts = sum $ zipWith (signedTriArea zero) pts (tail pts ++ [head pts])

polyArea :: Fractional s => [P2 s] -> s
polyArea = abs . signedPolyArea

The “shoelace formula”, as it is usually presented, falls out if you inline the zero argument to signedTriArea and then simplify the result. It would be possible to do this and code an optimized version of signedPolyArea that uses the shoelace formula more directly, but I much prefer having this version which is built out of meaningful and reusable components! Incidentally, there is a 3D analogue to the shoelace formula for computing the volume of a 3D polyhedron, but it requires some care to first make sure all the faces are oriented in a compatible way; see section 3.5 of Lecomte.

## Other utilities

I added a couple more utilities to Geom.hs which we will need. First, since we need to scale polygons up or down to give a required area, we need the concept of multiplying a vector by a scalar:

(*^) :: Num s => s -> V2 s -> V2 s
k *^ (V2 x y) = V2 (k*x) (k*y)

Also, to help with reading vectors from the input, I added this combinator:

v2 :: Applicative f => f s -> f (V2 s)
v2 s = V2 <$> s <*> s
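As a quick aside, v2 works with any Applicative, not just Scanner. Here is a hedged sketch using the list Applicative (restating V2 and v2 inline so it runs standalone), which enumerates all combinations of coordinate values:

```haskell
data V2 s = V2 !s !s deriving (Eq, Show)

v2 :: Applicative f => f s -> f (V2 s)
v2 s = V2 <$> s <*> s

-- With the list Applicative, v2 [1,2] enumerates all four vectors
-- whose coordinates are drawn from [1,2].
allV2 :: [V2 Int]
allV2 = v2 [1,2]  -- [V2 1 1, V2 1 2, V2 2 1, V2 2 2]
```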

The idea is to use it with f ~ Scanner. For example, if double :: Scanner Double then we can write v2 double :: Scanner (V2 Double).
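Returning to the area functions for a moment, here is a self-contained sanity check of signedPolyArea (all definitions restated inline): the unit square has signed area 1 when its vertices are counterclockwise, −1 when clockwise, and the formula also works on a nonconvex L-shaped polygon of area 3.

```haskell
data V2 s = V2 !s !s deriving (Eq, Show)

zero :: Num s => V2 s
zero = V2 0 0

(^-^) :: Num s => V2 s -> V2 s -> V2 s
V2 x1 y1 ^-^ V2 x2 y2 = V2 (x1-x2) (y1-y2)

cross :: Num s => V2 s -> V2 s -> s
cross (V2 ux uy) (V2 vx vy) = ux*vy - vx*uy

crossP :: Num s => V2 s -> V2 s -> V2 s -> s
crossP p1 p2 p3 = cross (p2 ^-^ p1) (p3 ^-^ p1)

signedTriArea :: Fractional s => V2 s -> V2 s -> V2 s -> s
signedTriArea p1 p2 p3 = crossP p1 p2 p3 / 2

-- Sum the signed areas of the triangles formed by the origin and each edge.
signedPolyArea :: Fractional s => [V2 s] -> s
signedPolyArea pts = sum $ zipWith (signedTriArea zero) pts (tail pts ++ [head pts])

-- A counterclockwise unit square, and a nonconvex L-shape of area 3
-- (a 2x2 square with one 1x1 corner removed).
square, ell :: [V2 Double]
square = [V2 0 0, V2 1 0, V2 1 1, V2 0 1]
ell    = [V2 0 0, V2 2 0, V2 2 1, V2 1 1, V2 1 2, V2 0 2]
```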

Last but not least, I also added getX and getY field labels to the V2 type, for when we need to extract the coordinates of a point or vector:

data V2 s = V2 { getX :: !s, getY :: !s } deriving (Eq, Ord, Show)

Finally, here’s my solution to Cookie Cutters. First, some imports and main, which just scans the input, generates the required scaled and translated list of vertices, and then formats the output.

import           Control.Arrow
import qualified Data.Foldable as F
import           Text.Printf

import           Geom
import           Scanner

main = interact $
  runScanner scan >>> solve >>>
  map (F.toList >>> map (printf "%.5f") >>> unwords) >>> unlines

Here’s the data type for storing the input, along with a Scanner for it. Notice how we use v2 double' to read in 2D vectors (well, actually points!) in the input. The annoying thing is that some floating-point values in the input are formatted like .5, with no leading 0, and read ".5" :: Double crashes. Hence the need for the double' scanner below, which reads a string token and potentially adds a leading zero before conversion to Double.

data TC = TC { polygon :: [P2D], area :: Double }

scan :: Scanner TC
scan = do
  n <- int
  TC <$> n `times` (v2 double') <*> double'

double' :: Scanner Double
double' = (read . fixup) <$> str
  where
    fixup s@('.':_) = '0':s
    fixup s         = s

And finally, putting the pieces together to solve the meat of the problem. We first compute the area of the given polygon using polyArea, then divide the desired area by the original area to find the factor by which the area must increase (or decrease). Area scales as the square of distance, so we must take the square root of this factor to find the factor by which the vertices must scale. We simply scale all the vertices appropriately, then find the minimum x and y coordinates so we can translate by their negation, to make the polygon touch the positive x and y axes as required.

solve :: TC -> [P2D]
solve (TC ps a) = map (^-^ V2 xmin ymin) ps'
  where
    a0   = polyArea ps
    s    = sqrt (a / a0)  -- scaling factor to get the desired area
    ps'  = map (s *^) ps
    xmin = minimum (map getX ps')
    ymin = minimum (map getY ps')

## Next time: Chair Hopping

For next time I invite you to solve Chair Hopping. Warning, this one is rather difficult! But I had a lot of fun solving it, and the solution touches on several interesting topics (in fact, I’ll probably need more than one blog post).

Posted in competitive programming, haskell | 2 Comments

## Competitive programming in Haskell: data representation and optimization, with cake

In my previous post I challenged you to solve Checking Break, which presents us with a cake in the shape of a rectangular prism, with chocolate chips at various locations, and asks us to check whether a proposed division of the cake is valid. A division of the cake is valid if it is a partition (no pieces overlap and every part of the cake is in some piece) and every piece contains one chocolate chip. No one posted a solution—I don’t know if that’s because people have lost interest, or because no one was able to solve it—but in any case, don’t read this post yet if you still want to try solving it!
As a very small hint, part of the reason I chose this problem is that it is an interesting example of a case where just getting the correct asymptotic time complexity is not enough—we actually have to work a bit to optimize our code so it fits within the allotted time limit.

## The algorithm

When solving this problem I first just spent some time thinking about the different things I would have to compute and what algorithms and data structures I could use to accomplish them.

• The first thing that jumped out at me is that we are going to want some kind of abstraction for 3D coordinates, and for 3D rectangular prisms (i.e. boxes, i.e. pieces of cake). Probably we can just represent boxes as a pair of points at opposite corners of the box (in fact this is how boxes are given to us). As we plan out how the rest of the solution is going to work we will come up with a list of operations these will need to support. As an aside, when working in Java I rarely make any classes beyond the single main class, because it’s too heavyweight. When working in Haskell, on the other hand, I often define little abstractions (i.e. data types and operations on them) because they are so lightweight, and being able to cleanly separate things into different layers of abstraction helps me write solutions that are more obviously correct.
• We need to check that the coordinates of each given box are valid.
• We will need to check that every piece of cake contains exactly one chocolate chip. At first this sounds difficult—given a chip, how do we find out which box(es) it is in? Or given a box, how can we find out which chips are in it? To do this efficiently seems like it will require some kind of advanced 3D space partitioning data structure, like an octree or a BSP tree. BUT this is a situation where reading carefully pays off: the problem statement actually says that “the $i$-th part must contain the $i$-th chocolate chip”.
So all we have to do is zip the list of pieces together with the list of chips. We just need an operation to test whether a given point is contained in a given box.
• We have to check that none of the boxes overlap. We can make a primitive to check whether two boxes intersect, but how do we make sure that none of the boxes intersect? Again, complicated space-partitioning data structures come to mind; but since there are at most $10^3$ boxes, the number of pairs is on the order of $10^6$. There can be multiple test cases, though, and the input specification says the sum of values for $m$ (the number of pieces) over all test cases will be at most $5 \times 10^4$. That means that in the worst case, we could get up to $50$ test cases with $10^3$ pieces of cake (and thus on the order of $10^6$ pairs of pieces) per test case. Given $10^8$ operations per second as a rough rule of thumb, it should be just barely manageable to do a brute-force check over every possible pair of boxes.
• Finally, we have to check that the pieces account for every last bit of the cake. If we think about checking this directly, it is quite tricky. One could imagine making a 3D array representing every cubic unit of cake, and simply marking off the cubes covered by each piece, but this is completely out of the question since the cake could be up to $10^6 \times 10^6 \times 10^6$ in size! Or we could again imagine some complicated space-partitioning structure to keep track of which parts have and have not been covered so far. But there is a much simpler way: just add up the volume of all the pieces and make sure it is the same as the volume of the whole cake! Of course this relies on the fact that we are also checking to make sure none of the pieces overlap: given that the pieces are disjoint, their volumes summing to the volume of the whole cake implies that the whole cake is covered. In any case, we will need a way to compute the volume of a box.
## Implementation and optimization

Let’s start with some preliminaries: LANGUAGE pragmas, imports, main, and the parser.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TupleSections #-}

import Control.Arrow
import Data.Bool
import qualified Data.ByteString.Lazy.Char8 as C
import Data.Monoid

import ScannerBS

main = C.interact $
  runScanner (many tc) >>> init >>>
  map (solve >>> bool "NO" "YES") >>>
  C.unlines

data TC = TC { cake :: Box, chips :: [Pos], parts :: [Box] }

tc :: Scanner TC
tc = do
  a <- int
  case a of
    -1 -> return undefined
    _  -> do
      xs <- three int
      let [b,c,m] = xs
          cake    = Box (Pos 1 1 1) (Pos a b c)
      TC cake <$> m `times` pos <*> m `times` box

The parser is worth remarking upon. The input consists of multiple test cases, with a single value of $-1$ marking the end of the input. This is annoying: ideally, we would have a many combinator that keeps running a Scanner until it fails, but we don’t. To keep things simple and fast, our Scanner abstraction does not support parse failures and alternatives! The many combinator we made keeps running a given Scanner until the end of input, not until it fails. The quick-and-dirty solution I adopted is to make the test case Scanner return undefined if it sees a -1, and then simply ignore the final element of the list of test cases via init. Not pretty but it gets the job done.

## Representing positions and boxes

Next let’s consider building abstractions for 3D coordinates and boxes. It is very tempting to do something like this:

type Pos = [Integer]
type Box = [Pos]

-- Check whether one position is componentwise <= another
posLTE :: Pos -> Pos -> Bool
posLTE p1 p2 = and $ zipWith (<=) p1 p2

-- ... and so on

Using list combinators like zipWith to work with Pos and Box values is quite convenient. And for some problems, using lists is totally fine. Having a small number of large lists—e.g. reading in a list of $10^5$ integers and processing them somehow—is rarely a problem. But having a large number of small lists, as we would if we use lists to represent Pos and Box here, slows things down a lot (as I learned the hard way). I won’t go into the details of why—I am no expert on Haskell performance—but suffice it to say that lists are a linked structure with a large memory overhead.

So let’s do something more direct. We’ll represent both Pos and Box as data types with strict fields (the strict fields make a big difference, especially in the case of Pos), and make some trivial Scanners for them. The volume function computes the volume of a box; given that the coordinates are coordinates of the cubes that make up the pieces, and are both inclusive, we have to add one to the difference between the coordinates. Note we assume that the first coordinate of a Box should be elementwise less than or equal to the second; otherwise, the call to max 0 ensures we will get a volume of zero.

data Pos = Pos !Int !Int !Int
data Box = Box !Pos !Pos

pos :: Scanner Pos
pos = Pos <$> int <*> int <*> int

box :: Scanner Box
box = Box <$> pos <*> pos

volume :: Box -> Int
volume (Box (Pos x1 y1 z1) (Pos x2 y2 z2)) = (x2 -. x1) * (y2 -. y1) * (z2 -. z1)
  where
    x -. y = max 0 (x - y + 1)

Another very important note is that we are using Int instead of Integer. Using Integer is lovely when we can get away with it, since it means not worrying about overflow at all; but in this case using Int instead of Integer yields a huge speedup (some quick and dirty tests show about a factor of 6 speedup on my local machine, and replacing Int with Integer, without changing anything else, makes my solution no longer accepted on Kattis). Of course, this comes with an obligation to think about potential overflow: the cake can be at most $10^6$ units on each side, giving a maximum possible volume of $10^{18}$. On a 64-bit machine, that just fits within an Int (maxBound :: Int is approximately $9.2 \times 10^{18}$). Since the Kattis test environment is definitely 64-bit, we are good to go. In fact, limits for competitive programming problems are often chosen so that required values will fit within 64-bit signed integers (C++ has no built-in facilities for arbitrary-size integers); I’m quite certain that’s why $10^6$ was chosen as the maximum size of one dimension of the cake.
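We can sanity-check this overflow arithmetic directly. The following is just an illustration of the bound discussed above, not part of the solution (and it assumes a 64-bit Int, as on Kattis):

```haskell
-- The largest possible cake volume is (10^6)^3 = 10^18, which fits
-- comfortably within a 64-bit Int (maxBound is about 9.2 * 10^18).
maxVolume :: Int
maxVolume = (10^6) ^ (3 :: Int)

fits :: Bool
fits = toInteger maxVolume < toInteger (maxBound :: Int)
```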

## Pos and Box utilities

Next, some utilities for checking whether one Pos is elementwise less than or equal to another, and for taking the elementwise max and min of two Pos values. Checking whether a Box contains a Pos simply reduces to doing two calls to posLTE (again assuming a valid Box with the first corner componentwise no greater than the second).

posLTE (Pos x1 y1 z1) (Pos x2 y2 z2) = x1 <= x2 && y1 <= y2 && z1 <= z2
posMax (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (max x1 x2) (max y1 y2) (max z1 z2)
posMin (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (min x1 x2) (min y1 y2) (min z1 z2)

contains :: Box -> Pos -> Bool
contains (Box lo hi) p = posLTE lo p && posLTE p hi

To test whether a box is a valid box within a given cake, we test that its corners are in the correct order and fit within the low and high coordinates of the cake.

valid :: Box -> Box -> Bool
valid (Box lo hi) (Box c1 c2) = posLTE lo c1 && posLTE c1 c2 && posLTE c2 hi

How to test whether two given boxes intersect or not? There are probably many ways to do this, but the nicest way I could come up with is to first find the actual Box which represents their intersection, and check whether it has a positive volume (relying on the fact that volume returns 0 for degenerate boxes with out-of-order coordinates). In turn, to find the intersection of two boxes, we just take the coordinatewise max of their lower corners, and the coordinatewise min of their upper corners.

intersection :: Box -> Box -> Box
intersection (Box c11 c12) (Box c21 c22) = Box (posMax c11 c21) (posMin c12 c22)

disjoint :: Box -> Box -> Bool
disjoint b1 b2 = volume (intersection b1 b2) == 0
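Here is a self-contained check of these box primitives (the definitions above are restated inline so the snippet runs on its own): two boxes sharing a single corner cube have an intersection of volume 1, while boxes that do not meet produce a degenerate intersection with out-of-order coordinates and hence volume 0.

```haskell
data Pos = Pos !Int !Int !Int
data Box = Box !Pos !Pos

-- Volume of a box whose corners are inclusive cube coordinates;
-- degenerate (out-of-order) boxes get volume 0 via the max 0.
volume :: Box -> Int
volume (Box (Pos x1 y1 z1) (Pos x2 y2 z2)) = (x2 -. x1) * (y2 -. y1) * (z2 -. z1)
  where x -. y = max 0 (x - y + 1)

posMax, posMin :: Pos -> Pos -> Pos
posMax (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (max x1 x2) (max y1 y2) (max z1 z2)
posMin (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (min x1 x2) (min y1 y2) (min z1 z2)

intersection :: Box -> Box -> Box
intersection (Box c11 c12) (Box c21 c22) = Box (posMax c11 c21) (posMin c12 c22)

disjoint :: Box -> Box -> Bool
disjoint b1 b2 = volume (intersection b1 b2) == 0

-- b1 spans cubes (1..2)^3; b2 spans (2..3)^3, sharing the single cube
-- at (2,2,2) with b1; b3 spans (3..4)^3 and misses b1 entirely.
b1, b2, b3 :: Box
b1 = Box (Pos 1 1 1) (Pos 2 2 2)
b2 = Box (Pos 2 2 2) (Pos 3 3 3)
b3 = Box (Pos 3 3 3) (Pos 4 4 4)
```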

## The solution

Finally, we can put the pieces together to write the solve function. We simply check that all the given cake parts are valid; that every part contains its corresponding chocolate chip; that every pair of parts is disjoint; and that the sum of the volumes of all parts equals the volume of the entire cake.

solve :: TC -> Bool
solve (TC{..}) = and
  [ all (valid cake) parts
  , and $ zipWith contains parts chips
  , all (uncurry disjoint) (pairs parts)
  , sum (map volume parts) == volume cake
  ]

## Computing all pairs

Actually, there’s still one missing piece: how to compute all possible pairs of parts. The simplest possible thing would be to use a list comprehension like

[(x,y) | x <- parts, y <- parts]

but this has problems: first, it includes a pairing of each part with itself, which will definitely have a nonzero intersection. We could exclude such pairs by adding x /= y as a guard, but there is another problem: (p2,p1) is included whenever (p1,p2) is included, but this is redundant since disjoint is commutative. In fact, we don’t really want all pairs; we want all unordered pairs, that is, all sets of size two. We can do that with the below utility function (which I have now added to Util.hs):

pairs :: [a] -> [(a,a)]
pairs []     = []
pairs [_]    = []
pairs (a:as) = map (a,) as ++ pairs as

This is accepted, and runs in about 0.91 seconds (the time limit is 2 seconds). However, I was curious whether we are paying anything here for all the list operations, so I wrote the following version, which takes a binary operation for combining list elements, and a Monoid specifying how to combine the results, and directly returns the monoidal result of combining all the pairs, without ever constructing any intermediate lists or tuples at all. It’s sort of like taking the above pairs function, following it by a call to foldMap, and then manually fusing the two to get rid of the intermediate list.

withPairs :: Monoid r => (a -> a -> r) -> [a] -> r
withPairs _ []     = mempty
withPairs _ [_]    = mempty
withPairs f (a:as) = go as
  where
    go []        = withPairs f as
    go (a2:rest) = f a a2 <> go rest

To use this, we have to change the solve function slightly: instead of

  , all (uncurry disjoint) (pairs parts)

we now have

  , getAll $ withPairs (\p1 p2 -> All $ disjoint p1 p2) parts

This version runs significantly faster on Kattis—0.72 seconds as opposed to 0.91 seconds.
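As a quick sanity check that the fused version agrees with the list-based one (both definitions restated here so the snippet runs standalone):

```haskell
{-# LANGUAGE TupleSections #-}

import Data.Monoid (All(..))

-- All unordered pairs of a list's elements.
pairs :: [a] -> [(a,a)]
pairs []     = []
pairs [_]    = []
pairs (a:as) = map (a,) as ++ pairs as

-- Fused version: combine all unordered pairs monoidally, with no
-- intermediate list of tuples.
withPairs :: Monoid r => (a -> a -> r) -> [a] -> r
withPairs _ []     = mempty
withPairs _ [_]    = mempty
withPairs f (a:as) = go as
  where
    go []        = withPairs f as
    go (a2:rest) = f a a2 <> go rest

-- [1..4] has 4 choose 2 = 6 unordered pairs, and the fused fold gives
-- the same answer as folding over the explicit pair list.
ps :: [(Int,Int)]
ps = pairs [1..4]

agree :: Bool
agree = getAll (withPairs (\x y -> All (x < y)) [1..4 :: Int])
     == all (uncurry (<)) ps
```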
(In fact, it’s faster than the currently-fastest Java solution (0.75 seconds), though there is still a big gap to the fastest C++ solution (0.06 seconds).) I don’t completely understand why this version is faster—perhaps one of you will be able to enlighten us!

## For next time

For next time, we’ll go back to computational geometry: I invite you to solve Cookie Cutters.

Posted in competitive programming, haskell | 7 Comments

## Competitive programming in Haskell: vectors and 2D geometry

In my previous post (apologies it has been so long!) I challenged you to solve Vacuumba, which asks us to figure out where a robot ends up after following a sequence of instructions. Mathematically, this corresponds to adding up a bunch of vectors, but the interesting point is that the instructions are always relative to the robot’s current state, so robot programs are imperative programs.

## Vector basics

The first order of business is to code up some primitives for dealing with (2D) vectors. I have accumulated a lot of library code for doing geometric stuff, but it’s kind of a mess; I’m using this as an opportunity to clean it up bit by bit. So there won’t be much code at first, but the library will grow as we do more geometry problems. The code so far (explained below) can be found in the comprog-hs repository.

First, a basic representation for 2D vectors, the zero vector, and addition and subtraction of vectors.
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Geom where

------------------------------------------------------------
-- 2D points and vectors

data V2 s = V2 !s !s deriving (Eq, Ord, Show)
type V2D = V2 Double

instance Foldable V2 where
  foldMap f (V2 x y) = f x <> f y

zero :: Num s => V2 s
zero = V2 0 0

-- Adding and subtracting vectors
(^+^), (^-^) :: Num s => V2 s -> V2 s -> V2 s
V2 x1 y1 ^+^ V2 x2 y2 = V2 (x1+x2) (y1+y2)
V2 x1 y1 ^-^ V2 x2 y2 = V2 (x1-x2) (y1-y2)

A few things to point out:

• The V2 type is parameterized over the type of scalars, but we define V2D as a synonym for V2 Double, which is very common. The reason for making V2 polymorphic in the first place, though, is that some problems require the use of exact integer arithmetic. It’s nice to be able to share code where we can, and have the type system enforce what we can and can’t do with vectors over various scalar types.
• For a long time I just represented vectors as lists, type V2 s = [s]. This makes implementing addition and subtraction very convenient: for example, (^+^) = zipWith (+). Although this has worked just fine for solving many geometry problems, I have recently been reminded that having lots of small lists can be bad for performance. As long as we’re making a library anyway we might as well use a proper data type for vectors!
• Elsewhere I have made a big deal out of the fact that vectors and points ought to be represented as separate types. But in a competitive programming context I have always just used a single type for both and it hasn’t bit me (yet!).
• The Foldable instance for V2 gets us toList. It also gets us things like sum and maximum which could occasionally come in handy.

## Angles and rotation

The other thing we are going to need for this problem is angles.
------------------------------------------------------------
-- Angles

newtype Angle = A Double  -- angle (radians)
  deriving (Show, Eq, Ord, Num, Fractional, Floating)

fromDeg :: Double -> Angle
fromDeg d = A (d * pi / 180)

fromRad :: Double -> Angle
fromRad = A

toDeg :: Angle -> Double
toDeg (A r) = r * 180 / pi

toRad :: Angle -> Double
toRad (A r) = r

-- Construct a vector in polar coordinates.
fromPolar :: Double -> Angle -> V2D
fromPolar r θ = rot θ (V2 r 0)

-- Rotate a vector counterclockwise by a given angle.
rot :: Angle -> V2D -> V2D
rot (A θ) (V2 x y) = V2 (cos θ * x - sin θ * y) (sin θ * x + cos θ * y)

Nothing too complicated going on here: we have a type to represent angles, conversions to and from degrees and radians, and then two uses for angles: a function to construct a vector in polar coordinates, and a function to perform rotation. Incidentally, one could of course define type Angle = Double, which would be simpler in some ways, but after getting bitten several times by forgetting to convert from degrees to radians, I decided it was much better to use a newtype and entirely prevent that class of error.

## Solving Vacuumba

Now we just put the pieces together to solve the problem. First, some imports:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE RecordWildCards #-}

import Control.Arrow
import Control.Monad.State
import qualified Data.Foldable as F
import Text.Printf

import Geom
import Scanner

We make a data type for representing robot instructions, and a corresponding Scanner. Notice how we are forced to use fromDeg to convert the raw input into an appropriate type.

data Instr = I { turn :: Angle, dist :: Double }

instr :: Scanner Instr
instr = I <$> (fromDeg <$> double) <*> double

The high-level solution then reads the input via a Scanner, solves each scenario, and formats the output. The output is a V2D, so we just convert it to a list with F.toList and use printf to format each coordinate.

main = interact $
  runScanner (numberOf (numberOf instr)) >>>
  map (solve >>> F.toList >>> map (printf "%.6f") >>> unwords) >>> unlines

Our solve function needs to take a list of instructions, and output the final location of the robot. Since the instructions can be seen as an imperative program for updating the state of the robot, it’s entirely appropriate to use a localized State computation.

First, a data type to represent the robot’s current state, consisting of a 2D vector recording the position, and an angle to record the current heading. initRS records the robot’s initial state (noting that it starts out facing north, corresponding to an angle of $90^\circ$ as measured counterclockwise from the positive $x$-axis).

data RobotState = RS { pos :: V2D, heading :: Angle }
initRS = RS zero (fromDeg 90)

Finally, the solve function itself executes each instruction in sequence as a State RobotState computation, uses execState to run the resulting overall computation and extract the final state, and then projects out the robot’s final position. Executing a single instruction is where the geometry happens: we look up the current robot state, calculate its new heading by adding the turn angle to the current heading, construct a movement vector in the direction of the new heading using polar coordinates, and add the movement to the current position.

solve :: [Instr] -> V2D
solve = mapM_ exec >>> flip execState initRS >>> pos
  where
    exec :: Instr -> State RobotState ()
    exec (I θ d) = do
      RS{..} <- get
      let heading' = heading + θ
          move     = fromPolar d heading'
      put $ RS (pos ^+^ move) heading'

## For next time

We’ll definitely be doing more geometry, but for the next post I feel like doing something different. I invite you to solve Checking Break.

Posted in competitive programming, haskell | 1 Comment

## Anyone willing to help me get set up with something like miso?

For the last few years I’ve been working (off and on) on a teaching language for FP and discrete math. One of the big goals is to build a browser-based IDE which runs completely client-side, so students can code in their browsers without installing anything, and I don’t have to worry about running a server. Thanks to amazing technology like GHCJS, miso, reflex, etc., this seems like it should be entirely doable. However, every time I sit down to try building such a thing, I end up getting completely bogged down in details of nix and stack and .cabal files and whatnot, and never even get off the ground. There are usually nice examples of building a new site from scratch, but I can never figure out the right way to incorporate my large amount of existing Haskell code. Should I have one package? Two separate packages for the website and the language implementation? How can I set things up to build either/both a command-line REPL and a web IDE?

I’m wondering if there is someone experienced with GHCJS and miso who would be willing to help me get things set up. (I’m also open to being convinced that some other combination of technologies would be better for my use case.) I’m imagining a sort of pair-programming session via videoconference. I am pretty flexible in terms of times so don’t worry about whether your time zone matches mine. And if there’s something I can help you with I’m happy to do an exchange. If you’re interested and willing, send me an email: byorgey at gmail. Thanks!
Posted in haskell | 6 Comments

## Competitive programming in Haskell: permutations

In my previous post I challenged you to solve The Power of Substitution. This problem presents us with a substitution cipher, and asks how many times we would have to iterate the encoding process in order to turn a given message into a given encryption.

## A non-solution

Several commenters mentioned that they tried some sort of straightforward brute force approach: just iterate the encoding process and count how many iterations are needed to reach the specified encryption. This certainly works for the provided sample inputs. However, I hinted that this approach likely wouldn’t fit within Kattis’s time limit for the other, secret, test cases.

How did I know this solution would exceed the time limit? It is not just an issue of using efficient data structures! Even if someone told me they coded this straightforward solution in C++, I would have the same reaction. The problem specifies that the answer should be at most $10^9$—and, as we will see, it is not hard to come up with test cases where the answer is indeed that large—and the simple fact is that counting to $10^9$ and applying the encryption at each step (which requires iterating through a list of length 200) is definitely going to take more than 1 second. A useful rule of thumb that I learned from Competitive Programming 3 is $10^8$ operations per second. (Of course your processor can carry out much more than $10^8$ instructions per second, but in practice this rule of thumb seems to work remarkably well for predicting run times up to the right order of magnitude.)

## Cycle decomposition of permutations

The given encoding, of course, is a permutation (the problem specifies only that it will be one-to-one, but any one-to-one endofunction on a finite set must in fact be a permutation). Let’s call it $p$.
If we start with an arbitrary $m$ and repeatedly apply $p$—that is, $m, p(m), p^2(m), p^3(m), \dots$—what happens? Of course, because of the pigeonhole principle, the sequence must eventually repeat. But actually, something stronger is true: because $p$ is a permutation, the first repeated value must be $m$ itself. For suppose that $p^j(m)$ is the first repeated value in the sequence, with $p^i(m) = p^j(m)$ for some $i < j$. Since $p$ is one-to-one, it must be the case that $p^{i-1}(m) = p^{j-1}(m)$ as well, which means $p^{i-2}(m) = p^{j-2}(m)$, and so on up to $m = p^0(m) = p^{j-i}(m)$.

So in fact if we start at an arbitrary $m$ and iterate $p$, we will find a cycle that includes $m$ (including the possibility of a trivial length-1 cycle if $p(m) = m$). If there are other elements not included in $m$’s cycle, we can pick any one of them and repeat the process to find another cycle (which can’t possibly overlap at all with $m$’s cycle—do you see why?). In general, any permutation can be decomposed in this way into a collection of disjoint cycles.

## CRT to the rescue

This idea of cycle decomposition is the key to unlocking the problem. Think about what happens to a particular letter $m_i$ in the message, which we eventually want to become $c_i$. This will happen after applying the permutation some small number of times $j$, such that $p^j(m_i) = c_i$. (In general, of course, it would be quite possible that $m_i$ and $c_i$ are not in the same cycle at all, and so $m_i$ will never turn into $c_i$ no matter how many times we apply $p$; but the problem statement guarantees that this will not be the case.) The problem, of course, is that all the other letters may not be encrypted properly after only $j$ encryption steps, in which case we need to keep going until all the cycles line up.

Suppose $m_i$ and $c_i$ are in a cycle of length $n$. That means applying the encryption $n$ times to $c_i$ will result in $c_i$ again (and not before).
Thus, we will get $c_i$ after $j$ steps and then every $n$ steps thereafter ($j, j+n, j+2n, \dots$); in other words, the number of encryption steps must be equivalent to $j \pmod n$. Every position in the message yields a similar modular equivalence, giving us a system of up to 200 simultaneous modular equivalences which we can solve using the generalized Chinese Remainder Theorem.

Incidentally, this is why the solution can be so large—if we have cycles of sizes $n_1, n_2, \dots, n_q$, then it could take up to $\mathrm{lcm}(n_1, n_2, \dots, n_q)$ iterations for them to all line up. In the special case that all the cycle sizes are relatively prime, this is just their product. So for example we could have cycles of sizes $2, 3, 5, 7, 11, 13, 17, 19, 23$, which add up to exactly $100$; the product of these is the primorial $23\# = 223\,092\,870$.

## My solution

Here is my solution. We’re going to use an unboxed array to represent the permutation, and the implementation of GCRT from a previous post.

```haskell
{-# LANGUAGE RecordWildCards #-}

import           Control.Arrow
import           Data.Array.Unboxed

import           NumberTheory
import           Scanner
```

Here’s main, along with a data type to represent a single test case and a Scanner for reading one in. I like using record syntax to help me remember which field is which, combined with the RecordWildCards extension to unpack the data structure and get the field names as local variables.

```haskell
main = interact $
  runScanner (numberOf tc) >>> map (solve >>> show) >>> unlines

data TC = TC { message :: [Int], crypt :: [Int], subst :: [Int] }
  deriving Show

tc = do
  l <- int
  TC <$> l `times` int <*> l `times` int <*> 100 `times` int
```

We’re going to represent a permutation as an unboxed array, which gives us nice constant-time lookup. I often use Data.Array or Data.Array.Unboxed to represent read-only information (which comes up a lot more than you might think!), giving all the benefits of fast array access with none of the complications of mutability.

```haskell
type Perm = UArray Int Int
```

The distance between $i$ and $j$ is simply the number of times we have to apply $p$ to get from $i$ to $j$. Of course this definition would hang if $i$ and $j$ were not part of the same cycle, but we know they will be. Given dist, we can also find the length of a cycle containing $i$ as one more than the distance from $p(i)$ to $i$. (We can’t just ask for the distance from $i$ to itself, since that would return $0$.)

```haskell
dist :: Perm -> Int -> Int -> Int
dist p i j = length $ takeWhile (/= j) (iterate (p!) i)
```

```haskell
cycleLen :: Perm -> Int -> Int
cycleLen p i = succ $ dist p (p!i) i
```

Finally, we can put these pieces together: create an array for the permutation, zip together the message and desired encryption, generating a modular equivalence for each, and solve the resulting system using gcrt.

```haskell
solve :: TC -> Int
solve (TC{..}) = pick . gcrt $ zipWith modEqn message crypt
  where
    p :: UArray Int Int
    p = listArray (1,100) subst

    modEqn m c = (fromIntegral (dist p m c), fromIntegral (cycleLen p m))
    pick (Just (z,k)) = fromIntegral (z `mod` k)
```
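As a quick sanity check, here is a small self-contained sketch with dist and cycleLen inlined from above, applied to a made-up permutation (the example values are mine, purely for illustration):

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))

-- dist and cycleLen, inlined from the solution so this sketch stands alone.
type Perm = UArray Int Int

dist :: Perm -> Int -> Int -> Int
dist p i j = length $ takeWhile (/= j) (iterate (p!) i)

cycleLen :: Perm -> Int -> Int
cycleLen p i = succ $ dist p (p!i) i

-- A made-up permutation on {1..5}: the 3-cycle 1→2→3→1 and the 2-cycle 4→5→4.
example :: Perm
example = listArray (1,5) [2,3,1,5,4]

-- dist example 1 3   == 2   (two applications take 1 to 3)
-- cycleLen example 1 == 3   (1 lives on a cycle of length 3)
-- cycleLen example 4 == 2   (4 lives on a cycle of length 2)
```

So the letter pair $(1,3)$ would contribute the modular equation "number of steps $\equiv 2 \pmod 3$" to the system handed to gcrt.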

Incidentally, this code inspired me to create a Util.hs in my comprog-hs repository containing (for now) fi as an alias for fromIntegral, and both to apply a function to both elements of a tuple (sadly lens is not available in the Kattis environment). Then we can just write modEqn m c = both fi (dist p m c, cycleLen p m).
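Such a Util.hs might look like the following—a sketch of my guess at the definitions, not necessarily the contents of the actual file:

```haskell
-- Hypothetical sketch of Util.hs (my guess at the definitions).

-- fi: a short alias for the ubiquitous fromIntegral.
fi :: (Integral a, Num b) => a -> b
fi = fromIntegral

-- both: apply a function to both components of a homogeneous pair.
both :: (a -> b) -> (a, a) -> (b, b)
both f (x, y) = (f x, f y)
```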

## Solving bigger instances

The above solution works because the alphabet is quite small (only $100$). However, it’s actually quite wasteful. For example, suppose that the given message consists of $200$ copies of the number $1$; then we will recompute the length of $1$’s cycle $200$ times. It’s easy to imagine a variant of this problem where both the message length and the alphabet size could be much larger. Then my solution above would be too slow. For example, suppose the permutation consists of one giant cycle of length $10^5$, and the message also has length $10^5$. We would traverse the entire cycle for every single character in the message, for a total of about $10^{10}$ operations—much too slow.

This post has gotten long enough, but in another post I will show an alternative solution which I believe would work quickly enough even for such large inputs (assuming that the input was restricted such that the answer was still of a reasonable size!). The idea is to precompute the cycle decomposition of the permutation (in time proportional to the size of the alphabet), storing the information in such a way that for each pair of letters in the message and desired encryption, we can find the distance between them and the length of their cycle in constant time.
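As a rough sketch of this precomputation idea (using my own hypothetical helpers, not necessarily the solution I’ll present next time): decompose the permutation once, using a visited set, and record each element’s cycle length and position within its cycle.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Decompose a permutation (here given as a Map) into its disjoint cycles,
-- visiting each element exactly once.
cycles :: M.Map Int Int -> [[Int]]
cycles p = go S.empty (M.keys p)
  where
    go _ [] = []
    go seen (x : xs)
      | x `S.member` seen = go seen xs
      | otherwise         = cyc : go (foldr S.insert seen cyc) xs
      where
        cyc = x : takeWhile (/= x) (tail (iterate (p M.!) x))

-- For each element, record (cycle length, position within its cycle).
-- Then for m and c in the same cycle, the distance from m to c is
-- (pos c - pos m) `mod` len, a single lookup instead of a cycle traversal.
cycleTable :: M.Map Int Int -> M.Map Int (Int, Int)
cycleTable p = M.fromList
  [ (x, (length cyc, i)) | cyc <- cycles p, (x, i) <- zip cyc [0 ..] ]
```

Building the table takes time proportional to the alphabet size (up to log factors), after which each letter pair yields its modular equation without re-walking its cycle.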

## Next up: geometry

Next, I’d like to spend a few posts on the topic of geometry. I find that geometry problems work particularly well in Haskell (I don’t think I’ve solved a single geometry problem in Java). Let’s kick things off with a problem on the easier side:

Have fun!

## Competitive programming in Haskell: building unordered trees

In my previous post I challenged you to solve Subway Tree System, which encodes trees by recording sequences of steps taken away from and towards the root while exploring the whole tree, and asks whether two such recordings denote the same tree. There are two main difficulties here: the first is how to do the parsing; the second is how to compare two trees when we don’t care about the order of children at each node.

Thanks to all of you who posted your solutions—I learned a lot. I often feel like my solution is obviously the “only” solution, but then when I see how others solve a problem I realize that the solution space is much larger than I thought!

## My solution

Here’s my solution, with some commentary interspersed. First, some pragmas and imports and such:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TupleSections     #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8  as C
import           Data.Function
import           Data.List
import           Data.List.Split
import           Data.Map                    (Map)
import qualified Data.Map                    as M

import           Text.Parsec
import           Text.Parsec.ByteString.Lazy
import           Text.Parsec.Char

My main then looks like this:

```haskell
parseForest  = fromList <$> many parseSubtree
parseSubtree = char '0' *> parseTree <* char '1'
```

Of course I haven’t actually shown the definition of Tree, Node, or fromList yet, but hopefully you get the idea. either undefined id is justified here since the input is guaranteed to be well-formed, so the parser will never actually fail with a Left.

## Unordered trees

The other difficulty is how to compare trees up to reordering children. Trying all permutations of the children at each node and seeing whether any match is obviously going to be much too slow! The key insight, and what this problem had in common with the one from my previous post, is that we can use an (automatically-derived) Ord instance to sort the children at each node into a canonical order. We don’t really need to know or care what order they end up in, which depends on the precise details of how the derived Ord instance works. The point is that sorting into some consistent order allows us to efficiently test whether two lists are permutations of each other.

I think everyone who posted a solution created some kind of function to “canonicalize” a tree, by first canonicalizing all subtrees and then sorting them. When I first solved this problem, however, I approached it along slightly different lines, hinted at by commenter Globules: can we define the Tree type in such a way that there is only a single representation for each tree-up-to-reordering?

My first idea was to use a Data.Set of children at each node, but this is subtly wrong, since it gets rid of duplicates! We don’t actually want a set of children at each node, but rather a bag (aka multiset). So I made a little Bag abstraction out of a Map. The magical thing is that GHC can still derive an Ord instance for my recursive tree type containing a newtype containing a Map containing trees! (OK, OK, it’s not really magic, but it still feels magic…) Now, actually, I no longer think this is the best solution, but it’s interesting, so I’ll leave it.
Later on I will show what I think is an even better solution.

```haskell
newtype Tree = Node (Bag Tree)
  deriving (Eq, Ord)

newtype Bag a = Bag (Map a Int)
  deriving (Eq, Ord)

fromList :: Ord a => [a] -> Bag a
fromList = map (,1) >>> M.fromListWith (+) >>> Bag
```

The final piece is the solve function, which simply calls readTree on the two strings and compares the resulting (canonical!) Tree values for equality.

```haskell
solve :: [C.ByteString] -> Bool
solve [t1,t2] = ((==) `on` readTree) t1 t2
```

## A better way

I still think it’s a nice idea to have canonical-by-construction trees, rather than building ordered trees and then calling a separate function to canonicalize them afterwards. But inspired by several commenters’ solutions, I realized that rather than my complex Bag type, it’s much nicer to simply use a sorted list as the canonical representation of a Node’s bag of subtrees, and to use a smart constructor to build them:

```haskell
newtype Tree = Node [Tree]
  deriving (Eq, Ord)

mkNode :: [Tree] -> Tree
mkNode = Node . sort
```

Then we just use mkNode instead of Node in the parser, and voilà! The canonicalization happens on the fly while parsing the tree. By contrast, if we write a separate canonicalization function, like

```haskell
canonical :: Tree -> Tree
canonical (Node ts) = Node (map canonical (sort ts))
```

it is actually possible to get it wrong. In fact, I deliberately introduced a bug into the above function: can you see what it is?

All told, then, here is the (in my opinion) nicest solution that I know of:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.Function
import           Data.List

import           Text.Parsec

import           ScannerBS hiding (many)

main = C.interact $
  runScanner (numberOf (two str)) >>>
  map (solve >>> bool "different" "same") >>> C.unlines
```

```haskell
solve :: [C.ByteString] -> Bool
solve [t1,t2] = ((==) `on` readTree) t1 t2

newtype Tree = Node [Tree] deriving (Eq, Ord)

readTree = parse parseTree "" >>> either undefined id
  where
    parseTree    = (Node . sort) <$> many parseSubtree
    parseSubtree = char '0' *> parseTree <* char '1'
```

## Next problem

For Tuesday, I invite you to solve The Power of Substitution. Don’t let the high difficulty rating scare you; in my estimation it should be quite accessible if you know a bit of math and have been following along with some of my previous posts (YMMV). However, it’s not quite as obvious what the nicest way to write it in Haskell is.

Posted in competitive programming, haskell | 25 Comments

## Competitive programming in Haskell: sorting tree shapes

In my previous post I challenged you to solve this problem, which essentially asks how many distinct binary tree shapes are created when we take lists of numbers and build a tree from each by repeated binary search tree insertion. Incidentally, this problem was from the 2016 ICPC world finals (probably one of the easiest ICPC world finals problems ever!). Several commenters solved it, and all with essentially the same solution.

First we need to build some binary search trees by repeated insertion, which we can do by creating a binary tree type and insertion function and then doing some left folds over the input lists. Next, we need to classify the resulting trees by their shape. One obvious method would be to write a function which compares two binary trees to see if they have the same shape; use nubBy to remove duplicates; then count how many trees remain. This would take $O(n^2)$ time, but since there are only at most $50$ trees, with at most $20$ values in each, this should easily fit within the very generous time limit of 5 seconds. (This is an understatement; my implementation of this approach runs in 0.01s.)

However, there’s a different solution which is both asymptotically faster and less code!
The key idea is that if we make the Tree type polymorphic, and an instance of Functor (by writing our own instance, or, even better, using the DeriveFunctor extension), then after building the trees we can turn them into literal tree shapes by replacing the values they contain with (). Moreover, since GHC can also derive an Ord instance for our Tree type, we can then count the distinct tree shapes in $O(n \lg n)$ time, either using a combination of sort, group, and length, or by throwing them all into a Set and asking for its size.

Here’s my solution:

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Arrow
import Data.List

main = interact $
  lines >>> drop 1 >>> map (words >>> map read) >>> solve >>> show

solve :: [[Int]] -> Int
solve = map (foldl' (flip ins) Empty >>> (() <$)) >>> sort >>> group >>> length
  -- or: >>> S.fromList >>> S.size

data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving (Show, Eq, Ord, Functor)

ins :: Ord a => a -> Tree a -> Tree a
ins a Empty = Node a Empty Empty
ins a (Node x l r)
  | a < x     = Node x (ins a l) r
  | otherwise = Node x l (ins a r)
```
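As a quick sanity check, here is the core of the solution applied to some made-up input lists (with the pieces inlined so the sketch stands alone):

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Arrow ((>>>))
import Data.List (foldl', group, sort)

-- Tree, ins, and solve inlined from the solution above; the input
-- lists below are made up for illustration.
data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving (Show, Eq, Ord, Functor)

ins :: Ord a => a -> Tree a -> Tree a
ins a Empty = Node a Empty Empty
ins a (Node x l r)
  | a < x     = Node x (ins a l) r
  | otherwise = Node x l (ins a r)

solve :: [[Int]] -> Int
solve = map (foldl' (flip ins) Empty >>> (() <$)) >>> sort >>> group >>> length

-- [1,2,3] and [3,2,1] build a right spine and a left spine (two shapes),
-- while [2,1,3] and [2,3,1] both build the same balanced shape, so:
--   solve [[1,2,3],[3,2,1],[2,1,3],[2,3,1]] == 3
```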

Honestly I’m not sure what the nicest way to solve this problem in something like Java or C++ would be. In Java, I suppose we would have to make a class for trees, implement equals and compareTo methods which compare trees by shape, and then put all the trees in a TreeSet; or else we could implement hashCode instead of compareTo and use a HashSet. The thing that makes the Haskell solution so much nicer is that the compiler writes some of the code for us, in the form of derived Functor and Ord instances.

For Friday, I invite you to solve Subway Tree System, a nifty problem which is more difficult but has some similar features!

Posted in competitive programming, haskell | 18 Comments

## Competitive programming in Haskell: summer series

Now that this weird and crazy semester is finally over, I’d like to get back to doing some blogging (among other things). I’m going to return to my series on competitive programming in Haskell (previous posts here: basic setup; code style and moral absolutes; Scanner; ByteString; primes and factoring; modular arithmetic 1; modular arithmetic 2), but following a more hands-on approach: each Tuesday and Friday, I will post a link to a problem (or two), and invite you to try solving it. The following Tuesday or Friday, I will explain my solution (as well as link to another problem for the next post). Sometimes I may develop a topic over several posts; sometimes I may just post random problems that I like.

To kick things off, here’s an easy-ish problem with an elegant Haskell solution:

I invite you to try solving it, and to post your questions, comments, and solutions here! (Fair warning: don’t look at the comments before you’ve solved it if you don’t want spoilers!) On Tuesday I will post my solution with commentary, and another problem for you to try.

Posted in competitive programming, haskell | 6 Comments

## Data structure challenge: application

I forgot to mention this in my previous post, but the thing which got me thinking about the predecessor problem in the first place was this competitive programming problem on Open Kattis:

I challenge you to go and solve it using your favorite technique from the previous post. (The connection between this Kattis problem and the predecessor problem is not immediately obvious, but I have to leave you something to puzzle over!)

## Data structure challenge: solutions

In my previous post I challenged you to find a way to keep track of a sequence of slots in such a way that we can quickly (in $O(\lg n)$ or better) either mark any empty slot as full, or find the rightmost empty slot prior to a given index. When I posted it, I had two solutions in mind; thanks to all the excellent comments I now know of many more!

• There were quite a few answers which in some way or another boiled down to using a balanced tree structure.

In all these cases, the idea is that the tree stores the sorted indices of all empty slots. We can mark a slot full or empty by deleting its index from or inserting it into the tree, and we can find the rightmost empty slot prior to a given index by searching for the index and returning the highest value in the tree which is less than the index being searched. It is well-known that all these operations can be done in $O(\lg n)$ time.

• David Barbour suggested something along similar lines, but somewhat more general: keep a finger tree with cached monoidal annotations representing both the total number of elements of each subtree, as well as the number of elements satisfying some proposition (such as the number of empty slots). This also allows performing the operations in $O(\lg n)$ time. This is in some sense similar to the previous suggestions, but it generalizes much more readily, since we can use this scheme to track any kind of monoidal annotation. I had thought of using a segment tree where slot $i$ stores the value $i$ when it is empty, and $0$ when it is full, and each node caches the max value contained in its subtree, allowing updates and queries to happen in $O(\lg n)$ time. This could also track arbitrary monoidal annotations, but using a finger tree is strictly more expressive since it also supports insertion and deletion (although that is not required for my original formulation of the problem).

• Albert also suggested using a van Emde Boas tree to achieve $O(\lg \lg n)$ performance. Van Emde Boas trees directly support a “predecessor” operation which finds the largest key smaller than a given value.

• Roman Cheplyaka suggested using some sort of dynamic rank/select: if we think of the sequence of slots as a bit vector, and represent empty slots by 1s and full slots by 0s, we can find the rightmost empty slot up to a given index by first finding the rank of the index, then doing a select operation on that rank. (I smell some kind of adjunction here: the composition of rank then select is a sort of idempotent closure operator that “rounds down” to the index of the rightmost preceding 1 bit. Maybe one of my readers can elaborate?) The tricky part, apparently, is doing this in such a way that we can dynamically update bits; apparently it can be done so the operations are still $O(\lg n)$ (Roman linked to this paper) but it seems complicated.

• One of my favorite solutions (which I also independently came up with) was suggested by Julian Beaumont: use a disjoint-set data structure (aka union-find) which stores a set for each contiguous (possibly empty) block of full slots together with its one empty predecessor (we can create a dummy slot on the left end to act as the “empty predecessor” for the first block of full slots). Each set keeps track of the index of its leftmost, empty, slot, which is easy to do: any time two sets are unioned we simply take the minimum of their empty slot indices (more generally, we can annotate the sets of a disjoint-set structure with values from any arbitrary monoid). To mark an empty slot as full, we simply union its set with the set of the slot to the left. To find the rightmost empty slot left of a given index, just look up the stored leftmost index corresponding to the set of the given index. Both these operations can thus be implemented in amortized $O(\alpha(n))$ time (where $\alpha$ is the inverse Ackermann function, hence essentially constant). Intuitively, I doubt it is possible to do any better than this. Curiously, however, unlike the other solutions, this solution depends crucially on the fact that we can never revert a full slot to empty!

• Apparently, this is known as the predecessor problem, and can also be solved with something called fusion trees which I had never heard of before.
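To make the balanced-tree family of solutions concrete: with Data.Set the whole thing is nearly a one-liner, since lookupLT is exactly the predecessor query. (The function names here are mine, for illustration, not from any commenter’s code.)

```haskell
import qualified Data.Set as S

-- The set stores the indices of all currently empty slots.
type Slots = S.Set Int

-- Start with all n slots empty.
emptySlots :: Int -> Slots
emptySlots n = S.fromList [1 .. n]

-- Mark an empty slot as full by deleting its index.  O(lg n).
fill :: Int -> Slots -> Slots
fill = S.delete

-- Rightmost empty slot strictly before the given index.  O(lg n).
prevEmpty :: Int -> Slots -> Maybe Int
prevEmpty = S.lookupLT
```

For example, with five slots where slot 2 has been filled, the rightmost empty slot before index 3 is slot 1.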

Posted in data structures | 3 Comments