## Competitive programming in Haskell: 2D cross product, part 1

Time for some more geometry! In my previous post I challenged you to solve Cookie Cutters, which asks us to scale the vertices of a polygon so that it has a certain prescribed area. It’s possible to solve this just by looking up an algorithm for computing the area of a polygon (see the “shoelace formula”). But the way to get good at solving geometry problems is not by memorizing a bunch of formulas, but rather by understanding a few general primitives and principles which can be assembled to solve a wide range of problems.

Incidentally, if you’re serious about getting good at geometric problems in competitive programming, then you absolutely must read Victor Lecomte’s Handbook of geometry for competitive programmers. (It’s still a great read even if you’re not serious!)

## The 2D cross product

In two dimensions, given vectors $\mathbf{u} = (u_x, u_y)$ and $\mathbf{v} = (v_x, v_y)$, we can compute their cross product as

$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} u_x & v_x \\ u_y & v_y \end{vmatrix} = u_x v_y - v_x u_y.$

One useful way to understand this as giving the signed area of the parallelogram determined by $\mathbf{u}$ and $\mathbf{v}$. The area is positive when $\mathbf{v}$ is counterclockwise from $\mathbf{u}$, negative when it is clockwise, and zero when the two vectors are colinear (i.e. parallel or antiparallel).

I’m not going to prove this here, since to be quite honest I don’t remember off the top of my head how to derive it. (Also, geometric algebra does a much better job of explaining where this comes from and generalizing to any number of dimensions; in particular, $\mathbf{u} \times \mathbf{v}$ is the coefficient of the bivector resulting from the outer product of $\mathbf{u}$ and $\mathbf{v}$. But that would take us much too far afield for now!)

So let’s write some Haskell code to compute the cross product of 2D vectors. (All this code has of course been added to Geom.hs.)

cross :: Num s => V2 s -> V2 s -> s
cross (V2 ux uy) (V2 vx vy) = ux*vy - vx*uy

crossP :: Num s => P2 s -> P2 s -> P2 s -> s
crossP p1 p2 p3 = cross (p2 ^-^ p1) (p3 ^-^ p1)

type P2 s = V2 s
type P2D  = P2 Double

A few things to note:

• cross works over any scalar type which is an instance of Num. In solving Cookie Cutters, this is going to be Double, but it could also be, e.g. Integer.
• For convenience, crossP is a variant of cross that takes three points as arguments, and computes the cross product of the vector from the first to the second with the vector from the first to the third. In many instances where we want to use the cross product, we actually have the coordinates of three points/vertices.
• We’ve added P2 and P2D as type aliases for V2 and V2D. They are just aliases, not newtypes, to reduce the need for separate operators that work on points vs vectors, but it’s still helpful to have different type aliases to at least alert us to whether our functions morally want to be given vectors or points as arguments.

Now, keeping in mind the fundamental interpretation of the 2D cross product as computing the signed area of a parallelogram, we can derive a few other operations. First, given the three vertices of a triangle, we can compute the signed area of the triangle as half of the cross product (because the triangle is half the parallelogram). Note that the order of the vertices matters: the area will be positive if they are in counterclockwise order, and negative if clockwise. Swapping any two vertices negates the result. If we want the normal nonnegative area of a triangle regardless of the order of the vertices, of course we can just take the absolute value.

signedTriArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
signedTriArea p1 p2 p3 = crossP p1 p2 p3 / 2

triArea :: Fractional s => P2 s -> P2 s -> P2 s -> s
triArea p1 p2 p3 = abs (signedTriArea p1 p2 p3)

(Notice the Fractional constraint since we have to divide by two.) At first glance, you might think the concept of “signed triangle area” is silly and useless. But it turns out to be the key to understanding the “shoelace formula”.

## The shoelace formula for polygon area

Imagine first that we have a convex polygon. If we pick a point somewhere in its interior (say, the centroid) and draw lines from the central point to every vertex, we chop up the polygon into triangles. Obviously, adding up the areas of these triangles will give us the area of the polygon.

What’s much less obvious is that if we add up the signed area of each triangle, it still works even if (1) the polygon is not convex, and/or (2) the “central point” is not in the interior of the polygon! That is, we just pick some arbitrary “central point” (the origin works nicely) and compute the signed area of the triangle formed by the origin and each edge of the polygon. A sort of magical inclusion-exclusion thing happens where all the area outside the polygon gets canceled out, and all the area inside ends up getting counted exactly once. Rather than try to prove this to you, I’ll just illustrate some examples.

So, here’s the Haskell code. signedPolyArea yields a positive area if the vertices of the polygon are in “counterclockwise order” (puzzle: what does “counterclockwise order” mean for a non-convex polygon? Hint: look up “winding number”; this is also the key to a formal proof that all of this works), and negative if they are clockwise.

signedPolyArea :: Fractional s => [P2 s] -> s
signedPolyArea pts = sum $zipWith (signedTriArea zero) pts (tail pts ++ [head pts]) polyArea :: Fractional s => [P2 s] -> s polyArea = abs . signedPolyArea The “shoelace formula”, as it is usually presented, falls out if you inline the zero argument to signedTriArea and then simplify the result. It would be possible to do this and code an optimized version of signedPolyArea that uses the shoelace formula more directly, but I much prefer having this version which is built out of meaningful and reusable components! Incidentally, there is a 3D analogue to the shoelace formula for computing the volume of a 3D polyhedron, but it requires some care to first make sure all the faces are oriented in a compatible way; see section 3.5 of Lecomte. ## Other utilities I added a couple more utilities to Geom.hs which we will need. First, since we need to scale polygons up or down to give a required area, we need the concept of multiplying a vector by a scalar: (*^) :: Num s => s -> V2 s -> V2 s k *^ (V2 x y) = V2 (k*x) (k*y) Also, to help with reading vectors from the input, I added this combinator: v2 :: Applicative f => f s -> f (V2 s) v2 s = V2 <$> s <*> s

The idea is to use it with f ~ Scanner. For example, if double :: Scanner Double then we can write v2 double :: Scanner (V2 Double).

Last but not least, I also added getX and getY field labels to the V2 type, for when we need to extract the coordinates of a point or vector:

data V2 s = V2 { getX :: !s, getY :: !s } deriving (Eq, Ord, Show)

Finally, here’s my solution to Cookie Cutters. First, some imports and main, which just scans the input, generates the required scaled and translated list of vertices, and then formats the output.

import           Control.Arrow
import qualified Data.Foldable as F
import           Text.Printf

import           Geom
import           Scanner

main = interact $runScanner scan >>> solve >>> map (F.toList >>> map (printf "%.5f") >>> unwords) >>> unlines Here’s the data type for storing the input, along with a Scanner for it. Notice how we use v2 double' to read in 2D vectors (well, actually points!) in the input. The annoying thing is that some floating-point values in the input are formatted like .5, with no leading 0, and read ".5" :: Double crashes. Hence the need for the double' scanner below, which reads a string token and potentially adds a leading zero before conversion to Double. data TC = TC { polygon :: [P2D], area :: Double } scan :: Scanner TC scan = do n <- int TC <$> n times (v2 double') <*> double'

double' :: Scanner Double
double' = (read . fixup) <$> str where fixup s@('.':_) = '0':s fixup s = s And finally, putting the pieces together to solve the meat of the problem. We first compute the area of the given polygon using polyArea, then divide the desired area by the original area to find the factor by which the area must increase (or decrease). Area scales as the square of distance, so we must take the square root of this factor to find the factor by which the vertices must scale. We simply scale all the vertices appropriately, then find the minimum x and y coordinates so we can translate by their negation, to make the polygon touch the positive x and y axes as required. solve :: TC -> [P2D] solve (TC ps a) = map (^-^ V2 xmin ymin) ps' where a0 = polyArea ps s = sqrt (a / a0) -- scaling factor to get the desired area ps' = map (s *^) ps xmin = minimum (map getX ps') ymin = minimum (map getY ps') ## Next time: Chair Hopping For next time I invite you to solve Chair Hopping. Warning, this one is rather difficult! But I had a lot of fun solving it, and the solution touches on several interesting topics (in fact, I’ll probably need more than one blog post). ## About Brent Assistant Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus. This entry was posted in competitive programming, haskell and tagged , , , . Bookmark the permalink. ### 6 Responses to Competitive programming in Haskell: 2D cross product, part 1 1. Albert says: An excellent article, Brent. Thank you so much! I would like to make a brief comment on how to convince yourself that the shoelace formula works for arbitrary (non-self-intersecting) shapes. This is not a rigorous proof: just an argument. I am pretty sure you heard that one before but it may help your readers’ intuition. Proceed in steps. (1) The formula works most nicely for convex shapes when the co-ordinate origin is at an (inside) “central” point, which makes all vector components equal to vertex co-ordinates. (2) If the origin is elsewhere, move the reference frame to to the “central” point of your choosing. As a translation never changes geometric lengths or angles, it will not affect the area. Not the absolute value of the area but the signed area; hence, always use the signed area and defer taking the absolute value till the very end. (3) If the shape is not convex but has the topology of a ring, its area is the area bounded by the outer border minus that of the inner, which is another reason why the notion of a signed area can be helpful. Given that, the formula still applies. (4) If the original shape is not convex, its area can be computed as the area of its convex hull minus differences. An arbitrary shape can always be partitioned into a set of convex sub-shapes (think triangulation). This is just an argument of why the shoelace formula still applies. 2. shaurya gupta says: I missed the last 2 problems but I’m glad I did this one. It took me more than 3 hours of coding and around an hour of debugging. C++ would have taken not more than 20 minutes. Let us first think about ‘s’ – a person in seat i will move to seat s(i) in the next turn. Thus we can imagine this as a graph where an edge is directed from i to s(i). Since all i’s are unique and all s(i)’s are unique, all nodes in the graph have indegree = outdegree = 1. Hence, we have a graph will simple cycles only. So each person will hop in their own cycle only. Now let us think about t. After two hops, i has come from chair i to chair t_i. In the graph for s, this would look like …i -> _ -> t(i)… . Now if we represent t as the graph, we get a graph that has the same structure as s (although in a slightly different context). Let us observe a cycle from the t-graph. a -> b -> c -> … -> a. The members of this cycle must also be members of the same cycle in the s-graph (although not necessarily just these members). So let us build a s-graph cycle from this cycle by leaving a gap after seating each person. We observe that there are 2 ways to do this: Way 1: we are seating r people in a 2r sized cycle by placing a person and then leaving a gap. Way 2: we are seating r people in a r sized cycle by first placing the first half and then filling the gaps made by the first half. This only works if r is odd. The cycles created by way 1 have gaps that have to be filled by pairing up same sized cycles. So if I have 1 > _ > 2 > _ > 1 and 3 > _ 4 > _ > 3, I can join these cycles in ways. 1 > 3 > 2 > 4 > 1 and 1 > 4 > 2 > 3 > 1. To join two 2k sized cycles, there are k ways (easy to see this.) Thus, overall our task is this: find the different sizes of cycles there are in the t-graph as well their count. Make a list of these pairs (n, m) which represents n cycles of length m. Solve each pair independently. From these n cycles, an even number will be translated by way 1 and the rest by way 2. Let there by 2k way-1 cycles and (n – 2k) way-2 cycles. There are “n choose 2k” ways to do this. For the way-2 cycle, we don’t have to do anything else. Thus only 1 way to do this. For the 2k way-1 cycles, we must pair them. To do this, we use an inductive strategy. Assume each cycle is a person and we want to find the number of ways of making couples. – for k = 1, there is only 1 way – just pair the 2 people that are there. ways(1) = 1 – now assume we know the number of ways to pair 2k people. We bring in 2 more people x and y. all we have to do is find a partner for x and then pair the remaining people. Thus ways (k+1) = ways(k) * (2k+1) Thus ways (k) = 1 * 3 * 5 … (2k – 1) = (2k)! / (k! * 2^k) Remember, these are cycles and not people and once we have made the pairs, we have to arrange as well which can be done in m ways for each cycle and thus m^k ways. (m is the length of the t cycle). Overall there are (2k)! * m^k / (k! * 2^k) ways to pair the way-1 cycles. Our final formula is therefore (nC2k) * (2k)! * m^k / (k! * 2^k) -> simplify this. We have to sum over this function from k = 0 to floor(n/2). One thing to remember is that way way-2 cycles are only possible if the length is odd. Thus, the above formula is when m is odd. When m is even, all cycles must be way-1. Since way-1 cycles must be even in number, if n is odd, then there are 0 ways. If n is even, then evaluate the above summation at only 1 point which is “n/2”. Phew…! Moving on to code. We need to find the answer module 10^9 + 7, so I first created a new type for this and gave it a num instance so that modular operations can be abstracted as much as possible. Next I wrote a modular exponentiation function as well as modular inverse function. The inverse function is based on fermat’s little theorem since 10^9 + 7 is prime. We use the inverse function to have a division function. newtype ModInt = ModInt {toInt :: Int} deriving (Show, Eq, Ord) modulo :: Num a => a modulo = 1000000000 + 7 mkModInt :: Int -> ModInt mkModInt x = ModInt$ x mod modulo

instance Num ModInt where
a + b = mkModInt $toInt a + toInt b a – b = mkModInt$ toInt a – toInt b
a * b = mkModInt . fromIntegral . (mod modulo)
$(fromIntegral (toInt a) * fromIntegral (toInt b) :: Int64) negate a = mkModInt . negate$ toInt a
abs = id
signum = const 1
fromInteger = ModInt . fromIntegral . (mod modulo)

modinv :: ModInt -> ModInt
modinv x = modexp x (modulo – 2)

modexp :: ModInt -> Int -> ModInt
modexp = go 1
where go :: ModInt -> ModInt -> Int -> ModInt
go acc a 0 = acc
go acc a pow = let acc’ = if pow .&. 1 == 1 then acc * a else acc
in go acc’ (a * a) (pow shift (-1))

moddiv :: ModInt -> ModInt -> ModInt
moddiv a b = a * modinv b

Next, we need the “ways” function. We use list comprehension for this as well as vector of factorials.

ways :: Int -> Int -> ModInt
ways n m = sum [(fac n * m’) moddiv (fac k * fac (n – 2 * k) * two) |
let mOdd = mod m 2 == 1,
let nEven = mod n 2 == 0,
(k, m’, two) ModInt
fac = (facs V.!)

_N :: Int
_N = 100000 + 5

facs :: V.Vector ModInt
facs = V.prescanl’ (\a b -> a * ModInt b) 1 . V.fromList . take _N $[1..] powers :: Int -> [ModInt] powers x = scanl (*) 1 (repeat$ ModInt x)

Finally, we want to find the t-cycles. The best way to do this would be to use disjoint-set union. However, I’m not sure how to write that as I haven’t had much experience with ST monad, so I instead just wrote a dfs using State monad. We pass the graph as a Int -> Int mapping and everytime we traverse an edge, we delete the node the edge originates from. The return value is the length of the cycle (0 means we already visited it).

In the solve function, we perform a dfs from each node, convert this into the (n, m) list we talked about before and apply the ways function and find the product of all ways!

solve :: Map.IntMap Int -> Int
solve t =
toInt
. product
. fmap (uncurry $flip ways) . Map.toList . Map.delete 0 . Map.fromListWith (+) . map (, 1)$ evalState (mapM dfs [1..(Map.size t)]) t

dfs :: Int -> State (Map.IntMap Int) Int
dfs u = do
present pure 0
Just nxt -> do
modify \$ Map.delete u
(+1) dfs nxt

Overall, this took 0.33 seconds. It would awesome if you could suggest how I can improve this.
One way would be to use a dsu with mutable vector instead of dfs.
Can my memoization of factorials be improved?
What else can I make more idiomatic?

This site uses Akismet to reduce spam. Learn how your comment data is processed.