Typeclassopedia in Japanese!

October 20, 2009

Satoshi Nakamura has published a Japanese translation of the Typeclassopedia. I don’t read any Japanese, but it sure looks cool, and I hope this will be a great resource for Japanese learners of Haskell. A big thank you to Satoshi for his hard work; the Typeclassopedia is certainly not short!


Hac φ is underway!

July 24, 2009

Hac φ is now in full swing, with almost 20 people hacking away and more coming tomorrow. More news and pictures to come!


Data.List.Split

December 21, 2008

Have you ever had a string like this

"abc;def;ghijk;lm"

and wanted to turn it into a list of strings, like this?

["abc", "def", "ghijk", "lm"]

Of course, you could always use a parsing library, or a regular expression library, but sometimes you just want something a little more lightweight. Perl and Ruby both have library functions called “split” to do just this. Haskell’s standard libraries, on the other hand, have no such function, much to the consternation of many a newbie and experienced Haskeller alike. There have been many proposals to add such a thing to the standard Data.List module in the past, but nothing ever came of it, primarily because there are many slightly different ways to split a list, and no one could ever agree on the One True Splitting Interface.

I decided we’ve been Doing It Wrong. Instead of bickering about the one true interface and going through the stringent library proposals process, let’s just get some useful code together and release it on Hackage. (Of course there are advantages to inclusion in the standard libraries — but that can come later.) So I solicited contributions on a wiki page, took some of the ideas, bits of code, and some ideas of my own, and created Data.List.Split.

Instead of talking about it more, I’ll just show some examples:


*Data.List.Split> splitOn ";" "abc;def;ghijk;lm"
["abc","def","ghijk","lm"]
*Data.List.Split> splitWhen (<0) [1,4,-8,4,-3,-2,9]
[[1,4],[4],[],[9]]
*Data.List.Split> split (startsWith "app") "applyappicativeapplaudapproachapple"
["apply","appicative","applaud","approach","apple"]
*Data.List.Split> split (dropDelims $ oneOf ":;") "::abc;:;;fg:h;;ij;"
["","","abc","","","","fg","h","","ij",""]
*Data.List.Split> split (condense . dropInitBlank $ oneOf ":;") "::abc;:;;fg:h;;ij;"
["::","abc",";:;;","fg",":","h",";;","ij",";",""]

Detailed documentation can be found in the package itself. Install it from Hackage:

cabal install split

You can also check out the darcs repo. Comments, suggestions, and patches welcome!


ZipEdit

June 21, 2008

So I’ve taken over the editorship of the Haskell Weekly News, a job that would be completely insane without help from automated tools. Putting together an issue requires gathering information from a number of different sources, summarizing and compiling the information into a consistent, standard layout, and finally publishing the issue in several different formats.

Thankfully, along with the editorship I inherited some nice tools built by my predecessors, including, in particular, a utility that generates all the different HWN formats (text, html, and pdf) from one text file in a standard format (essentially, the standard Show representation of a value of type HWN). This makes the task of creating and producing each issue much simpler; I have only to worry about the content, and the formatting is taken care of.

However, there wasn’t anything in the way of automation for the process of gathering material in the first place. Of course, this can’t be completely automated—at some level a human still has to look through lists of emails and blog posts and the like, decide what to include, and summarize it appropriately. But there’s still quite a bit of tedium that can potentially be automated away. So, I set about writing myself some tools, hoping that the upfront investment of time will pay off in the long run.

At the heart of several of these tools is a new library I created called zipedit. It’s far from polished and elegant, and may not even be useful to anyone else, but I think it’s kind of cute. The idea is simple: given some sort of list, we want to be able to create a text interface which allows the user to ‘edit’ the list by moving back and forth within it and modifying, deleting, or inserting elements. It’s called ‘zipedit’ because, of course, it uses a list zipper to keep track of the list context.

What does this have to do with collecting material for the HWN? Well, here’s a motivating example: using this library, I wrote a little utility that gets the RSS feed from Planet Haskell, parses out the relevant information, and presents each item to me one-by-one. For each item I can choose whether I would like it to be included in the HWN. But since I am really editing a list of RSS items, I can move backwards as well as forwards through the items, change my mind, and so on. When I am finally done, those items which I chose to be included are output in a format appropriate for pasting into the issue I am editing. The great thing is that the zipedit library deals with all the icky IO bits—the actual utility just specifies an appropriate (pure, functional) configuration. Here are some relevant bits of code from the utility:

data FlaggedItem = FI { item :: RSSItem
                      , use  :: Bool
                      }

mkItem :: RSSItem -> FlaggedItem
mkItem r = FI r False

yes :: FlaggedItem -> FlaggedItem
yes (FI r _) = FI r True

no :: FlaggedItem -> FlaggedItem
no (FI r _) = FI r False

ec :: EditorConf FlaggedItem
ec = EC { display = maybe "" (showItem 30)
        , prompt  = maybe "" (\fi -> (if use fi then "\n(Y/n)" else "\n(y/N)") ++ "? ")
        , actions = [ ('y', Seq [Modify yes, Fwd])
                    , ('n', Seq [Modify no, Fwd])
                    , ('p', Output (showItem 100))
                    ]
                    ++ stdActions
        }

main = do
  RSSFeed rss <- (fromJust . parseFeedString) `fmap` openURL url
  let items = map mkItem . rssItems . rssChannel $ rss
  ml <- edit ec items
  case ml of
    Nothing     -> return ()
    Just items' -> do
      f <- openFile "blogs.wiki" WriteMode
      hPutStr f . intercalate ",\n\n" . map (fmt.item) . filter use $ items'
      hClose f

I’ve left out a few things that aren’t important for this discussion (such as the code that formats the RSS items for output); the full source code can be obtained from the darcs repository. The important part to note is the definition of ec, which gets passed to the edit function. To define a configuration for a list editor, you just need to specify three things: a display function, which tells the library how to display the elements of the list; a prompt function, which tells it how to display an input prompt; and a mapping from user inputs (single character commands) to actions that should be taken in response. In this case, you can see that I have defined ‘y’ and ‘n’ to mark the current item as used/unused and automatically move to the next item, and the ‘p’ command to print more context. I’ve also used a set of standard commands provided by the zipedit library (things like ‘q’ for canceling, ‘d’ for when you’re done, ‘j’ and ‘k’ for moving forwards and backwards, and so on).

So, what does it look like? Here’s an example of the above utility in action:

[05:02:59 brent@xenophon:~/haskell/hwn]$ ./planethaskell 

Andy Gill: Where is the call to $f4?
In a previous blog, we saw that we have options for which version of core we examine.
We saw that we had a significant dictionary creation attributed cost, beneath unvector2D.
(y/N)? n

Bryan O'Sullivan: Disappointed by Thinkpad X60 thermal problems
Ive had a Lenovo X60 for about 18 months. For almost a year, I was well pleased with
its combination of light weight and decent performance, but then it developed
(y/N)? k

Andy Gill: Where is the call to $f4?
In a previous blog, we saw that we have options for which version of core we examine.
We saw that we had a significant dictionary creation attributed cost, beneath unvector2D.
(y/N)? y

Bryan O'Sullivan: Disappointed by Thinkpad X60 thermal problems
Ive had a Lenovo X60 for about 18 months. For almost a year, I was well pleased with
its combination of light weight and decent performance, but then it developed
(y/N)? n

Russell O'Connor: No-Good Ethical Funds
I do not understand the motivation behind buying ethical funds. They seem like a bad idea.
By my understanding they Do not help ethical companies Do not hinder unethical companies
(y/N)? d

As you can see, I rejected the first item, then changed my mind and went back, selected it, rejected the next item, and then decided that I was done. If you’ve used darcs, you may also realize at this point where I got my inspiration for this little project. =)

Another slightly more sophisticated utility I’ve created is for gathering announcements and discussions from email traffic on various lists. Not only can I select or reject individual posts, but for those I select, I can interactively choose a title, and compose some summary text in my favorite editor (which is initially populated with the text of the post, so I can easily pick and choose sentences, phrases, or links to incorporate into my summary). I’ve even added a nifty feature where it only loads a certain number of posts at a time, and seamlessly fetches more over the network when I get to the end of the list. The source code for this utility (gmane.hs) is also available from the HWN darcs repository.

The latest release of zipedit is available on Hackage; there’s also a darcs repository. As always, comments, questions, and patches welcome! I’d be especially interested to hear if you actually find zipedit useful for something, and if so I imagine I would be very responsive to reasonable feature requests. =)


New Haskell diagrams library

April 30, 2008

For the past week or so I’ve been working on an embedded domain-specific language for rendering simple diagrams with Haskell, and I’m excited to actually release version 0.1 today! You can now find it on Hackage. Version 0.1 is still fairly primitive, and there are a bunch more planned features, but you can already use it to create some pretty pictures. Here are a few examples.

We’ll start with a basic ‘hello world’ type diagram: a two-by-five rectangle, no frills:

module Main where
import Graphics.Rendering.Diagrams

main = renderToPng "hello.png" 100 100 (rect 2 5)

OK, not too exciting, but at least it was easy. Here’s another silly example that shows off a few more available features:

module Main where
import Graphics.Rendering.Diagrams

shapes :: Diagram
shapes = hcat [ fc blue $ circle 10
              , (fc goldenrod . lc green . lw 3 $ poly 5 10)
                ## (fc red . rotate (1/10) $ rect 4 4)
              , fc grey . lw 0 . scaleY 3 $ circle 5
              ]

main = renderToPng "shapes.png" 200 200 shapes

Hopefully, this example is fairly self-explanatory. We can alter the appearance of diagrams by applying functions to them like fc (fill color), lc (line color), lw (line width), rotate, and scaleY. We can superimpose two diagrams with ##. And we can lay out a list of diagrams horizontally with hcat. There are many other combinators along similar lines, with various options for distributing and aligning subdiagrams.

Now for a couple cooler examples. How about a Sierpinski triangle?

module Main where

import Graphics.Rendering.Diagrams
import Graphics.Rendering.Diagrams.Types

import qualified Graphics.Rendering.Cairo as C
import Graphics.Rendering.Diagrams.Shapes (draw)

data EqTri = EqTri  deriving Show
instance ShapeClass EqTri where
  shapeSize _   = (2, sqrt 3)
  renderShape _ = do
    c $ C.moveTo 1 s
    c $ C.lineTo 0 (-s)
    c $ C.lineTo (-1) s
    c $ C.closePath
    draw
   where s = sqrt 3 / 2

sierpinski :: Int -> Diagram
sierpinski 0 = fc black $ lw 0 $
               shape EqTri
sierpinski n = vcatA hcenter [         s
                             ,      s <> s]
  where s = sierpinski (n-1)

main = renderToPng "sierpinski.png" 300 300 (sierpinski 6)

This example illustrates a couple key points. One is that the library is easy to extend with new shapes. The built-in poly function is too general to provide a nice equilateral triangle for use in making a sierpinski triangle (its bounding box is too large, which would lead to ugly spaces in the diagram), so we can define our own shape just by making an instance of ShapeClass, and using the Cairo library to draw a path defining the shape. This is probably not the best way to accomplish this particular task — future versions of the diagrams library will include easier ways — but it’s a nice example of how easy it is to extend the basic library functionality.

The other key point is how much power we get for free from the fact that this is an embedded DSL. We can use the full power of Haskell to define a recursive function for computing sierpinski triangle diagrams.

For a final example, here are some nice Ford circles:

module Main where

import Graphics.Rendering.Diagrams

import Data.Ratio
import System.Random

(<+>) :: Rational -> Rational -> Rational
r1 <+> r2 = (numerator r1 + numerator r2) % (denominator r1 + denominator r2)

farey :: Integer -> [Rational]
farey 0 = [0%1, 1%1]
farey n = insertMediants (farey (n-1))

insertMediants :: [Rational] -> [Rational]
insertMediants [] = []
insertMediants [x] = [x]
insertMediants (x:y:zs) = x : (x <+> y) : insertMediants (y:zs)

fordCircles :: Integer -> [Diagram]
fordCircles n = map toCircle (filter ((<= n) . denominator) $ farey n)

toCircle r = translateX r' $
             circle (1 / (2 * d'^2))
  where r' = fromRational r
        d' = fromIntegral (denominator r)

dia :: [Color] -> Diagram
dia colors = view (0,-1/2) (1,0) $
             unionA hcenter bottom $
             zipWith fc colors (fordCircles 20)

randomColors :: [Double] -> [Color]
randomColors (r:g:b:ds) = rgb r g b : randomColors ds

main :: IO ()
main = do
  g <- newStdGen
  let rs = randoms g
  renderToPng "ford.png" 400 205 (dia $ randomColors rs)

Plans for future versions of the library include:

  • text objects
  • settable backgrounds and better support for transparency
  • support for line join style and dashing
  • more primitive shapes: special triangles, ellipses, bezier curves, lines, arrows…
  • more layouts: grid, tree, circle…
  • constraint-based placement of objects, e.g. to connect diagrams with arrows
  • more output modes: ps, svg, pdf
  • and more!

If this looks interesting to you, I hope you’ll download the library and play around with it! (Note that it does require the Cairo bindings, which are packaged as part of gtk2hs, which is unfortunately not yet Cabalized.) I would be happy to receive any and all feedback, including feature suggestions, bug reports, and pretty pictures. If you’re interested in contributing code, the darcs repository can be found at http://code.haskell.org/diagrams/.

Enjoy!


FringeDC talk

March 27, 2008

My FringeDC talk has now been uploaded to Google video! A huge thanks to Conrad Barski, who took the effort to actually record the video and then wrangle it into digital form. Without further ado:

My FringeDC talk

The slides are here if you’d like to follow along.

In general I’m pretty happy with how it turned out. This is one of the first “real” talks I’ve given, though (although I did teach high school for two years), so I definitely learned a lot and have some good ideas for improving the next talk I give.

Comments or questions welcome!

ETA: the Java bug mentioned in the video was just fixed today! w00t!


Gobby, Haskell, and patch theory

February 4, 2008

On a couple of occasions over the past few days, I’ve had the pleasure of editing code with other #haskellers, using the collaborative text editor gobby. It was quite a lot of fun, and (somewhat surprisingly, to me at least) rather effective. It could stand a good chance of becoming one of my preferred methods for collaborative programming.

Except.

Except that gobby doesn’t support undo! It doesn’t even have a way to go “back in time” to a previous version of a document (unless you have it saved locally, of course). This is, quite plainly, a huge show-stopper, and I don’t say that just in theory: I ran into this issue head-on after only a few hours of use. I wanted to delete a bunch of double quotes, so did a search-and-replace. It was taking too long to click “replace” for each double quote character, so I clicked “replace all”, thinking it would only replace double quotes from the current location to the end of the file. Well, as you can probably guess, it deleted all the double quotes everywhere in the file. Oops. Ten minutes of tracking down missing quotes ensued, and I don’t even want to imagine what it would have been like without the help of the typechecker.

This was ten minutes of wasted time. I should have been able to just hit “undo” and get on with life.

So, why doesn’t gobby have undo? I looked at their bug tracker, and sure enough, this has been a requested feature since 2005. But it’s never been done, since undo in a collaborative-editing context is a hard problem. Apparently someone is sorta-kinda working on an implementation, that works in some cases…

A hard problem, eh? What hard problems need is good theory. Well, gee goshums! If only there was some sort of theory of concurrent editing

So, the upshot of all of this is that I am seriously considering writing a gobby clone in Haskell with some sweet, sweet patch theory goodness. (Other killer features under possible consideration: hpaste and darcs integration, a Yi frontend, integrated compilation/error annotations, chat support that doesn’t suck…) I do have a lot more thoughts on the particulars of how this might work, but I won’t write more about the specifics here since that’s not really the point of this post.

The point is that although I said “seriously considering”, in actuality I’m not sure exactly how serious I really am. This would be a pretty big project, and I certainly couldn’t get anywhere close to pulling it off myself. I’m not even sure that I understand all of what it would require; I’d be particularly clueless about the networking and UI stuff. I just think that it would be pretty sweet, and a great application of Haskell’s strengths. Mostly at this point I’m looking for feedback. What do you think? Is this an awesome idea? Is it a stupid idea? Most importantly, would you be interested in working on it?


A word to the wise

December 19, 2007

If you only implement the less than operator in a custom Ord instance (on the theory that “I know I only need to implement one operation to get defaults for the others, and less than makes sense, since you can get anything using less than and equals”), the compiler gives zero warnings (even with -Wall) and trying to compare anything will send your program into nasty infinite recursion. It turns out that you have to implement less than or equal to (or the ‘compare’ function), not less than. Says so in the docs for Ord, of course, but… sigh.

In related news, the new ghci debugger is quite helpful. =)

And in unrelated news, I’m finally done with grad school and fellowship applications!!


a plea for accountability

October 29, 2007

I really need to be finishing up this NSF graduate fellowship application. Chatting in #haskell is not helping. If you see me in #haskell anytime in the next few days, please have lambdabot slap me with a trout.


Higher-dimensional enumeration

October 1, 2007

The other day in the #haskell IRC channel, some of us were playing around with generalizing Haskell list enumeration syntax:

Prelude> [1..5]
[1,2,3,4,5]

First, we defined the (…) operator to perform enumeration, so it can be partially applied. (Unfortunately, we can’t define a (..) operator for this purpose, since it’s reserved syntax.)

Prelude> let x ... y = [x..y]
Prelude> map (...4) [1,3,7]
[[1,2,3,4],[3,4],[]]

So, how to generalize? Someone typed this:

Prelude> let x .... y = map (x...) (x...y)
Prelude> 1 .... 3
[[1],[1,2],[1,2,3]]

Interesting! And what about five dots? Unfortunately, this doesn’t work:

Prelude> let x ..... y = map (x...) (x....y)

<interactive>:1:28:
    Occurs check: cannot construct the infinite type: a = [a]
      Expected type: [a]
      Inferred type: [[a]]
    In the second argument of `map', namely `(x .... y)'
    In the expression: map ((x ...)) (x .... y)

Of course, the problem is that (x …. y) has type [[a]], so we can’t just use map (x…), we have to map twice in order to reach the proper depth:

Prelude> let x ..... y = map (map (x...)) (x .... y)
Prelude> 1 ..... 3
[[[1]],[[1],[1,2]],[[1],[1,2],[1,2,3]]]

We could keep going, although it gets increasingly difficult to tell apart the number of dots.

x ...... y = (map . map . map) (x...) (x ..... y)
x ....... y = (map . map . map . map) (x...) (x ...... y)
...

Let’s take a step back and think about what is going on here. First, (1…3) just means we are counting from 1 up to 3: [1,2,3]. At the next level, (1 .... 3) means we are counting up to (1 … 3) = [1,2,3]: first [1], then [1,2], then [1,2,3]. So (1 .... 3) = [[1],[1,2],[1,2,3]]. We can illustrate it thus:

    1
  1    2
1    2    3

Then (1 ..... 3) means we’re counting up to THAT: first [[1]], then [[1],[1,2]], then [[1],[1,2],[1,2,3]]. Any guesses on a nice visualization? That’s right, it’s a tetrahedron!

In general, continuing this procedure gives us higher- and higher-dimensional simplices. From this perspective, an expression like [1..5] is really giving us discrete points on a 1-simplex, i.e. a line. Now, the big question: can we generalize the operators (…), (….), (…..), etc. into a single higher-order function which can compute this “simplicial enumeration of order n”, for any given n > 0?

At first blush this doesn’t seem possible. For one thing, there’s a problem with assigning a type to this hypothetical higher-order enumeration function: each of the enumeration operators we made returns a different type! (…) returns [a], (....) returns [[a]], and so on. So we’d have to unify the infinite family of types [a], [[a]], [[[a]]], etc. into a single type. Secondly, there’s the problem that each specific enumeration operator uses a different number of calls to map in its implementation. We could try to copy and paste some code from Oleg involving overlapping instances and crazy typeclasses… but (in this instance) there’s a better way.

What we’d like to do is create a type which encompasses [a], [[a]], [[[a]]], and so on. A good first try might be:

data DeepList a = List [a] | Deep [DeepList a]

which says that a DeepList of ‘a’ is either a list of ‘a’, or a list of DeepLists of ‘a’. Sounds good, right? Unfortunately, there is a subtle problem with this definition: lists in different parts of a DeepList value may have different depths. For example, the following is a perfectly legitimate value of type DeepList Int:

Deep [List [1,2,3], Deep [List [1,6], List [4]]]

If we remove the Deep and List constructors, however, this corresponds to the ill-typed list value [[1,2,3], [[1,6], [4]]]. Since we only want values corresponding to nested list types, this ability to have different-depth lists in different parts of the tree won’t do. How can we force different branches to all have the same depth?

To the rescue come non-regular types — that is, recursive parameterized types where the type parameters on the right side are not the same as on the left. In particular, we can use a Bush type defined as follows:

data Bush a = Leaves [a] | Trunk (Bush [a])

This definition is very similar to that of DeepList, but the key difference is that a Bush may consist of a Bush of lists of ‘a’, rather than a list of Bushes of ‘a’. Every occurrence of the Trunk constructor adds one more nested layer to the type parameter, until finally a Leaves constructor is reached, followed by a suitably-nested list. For example, here are some values of type (Bush Int):

Leaves [1,2,3]
Trunk (Leaves [[1,2], [6,7]])
Trunk (Trunk (Leaves [[[5],[6,8,9],[]],[[6,19],[20]]]))

You can check that something like Trunk (Leaves [1,2,3]) gives a type error.

So, our higher-order enumeration function can return a Bush. But how will we deal with the (map . map …)’s? Actually, this is easier than it looks: all we need to do is make Bush a Functor, since that’s really what’s going on: we just want to be able to map (x…) over the result of the enumeration one dimension lower, whatever that might happen to mean. So, here we go:

import Data.List

data Bush a = Trunk (Bush [a]) | Leaves [a]

instance (Show a) => Show (Bush a) where   -- show Bush values without the constructors
  show (Leaves l) = show l
  show (Trunk b)  = show b

instance Functor Bush where
  fmap f (Trunk b)  = Trunk $ fmap (map f) b    -- add another map
  fmap f (Leaves l) = Leaves $ map f l

flatten :: Bush a -> [a]                  -- flatten a Bush into a single list
flatten (Leaves l) = l
flatten (Trunk b)  = concat $ flatten b

x ... y = [x..y]

And now our higher-order enumeration function (which we’ll call simplex, for reasons which are hopefully clear) is easy to write:

simplex :: (Enum t, Integral a) => a -> t -> t -> Bush t
simplex n _ _ | n < 1 = Leaves []
simplex 1 x y = Leaves (x...y)
simplex n x y = Trunk $ fmap (x...) (simplex (n-1) x y)

Let’s try it out:

*Main> simplex 1 1 3
[1,2,3]
*Main> simplex 2 1 3
[[1],[1,2],[1,2,3]]
*Main> simplex 3 1 3
[[[1]],[[1],[1,2]],[[1],[1,2],[1,2,3]]]

It works! Finally, I’ll leave the explanation of the following QuickCheck properties as an exercise for the reader:

simplex' :: (Integral a, Integral t) => a -> t -> Bush t
simplex' n x = simplex n 1 x

fac :: (Integral t) => t -> t
fac n = product [1..n]
binom :: Integer -> Integer -> Integer
binom n k | n < k     = 0
          | k < 0     = 0
          | otherwise = (fac n) `div` (fac k * fac (n-k))

prop_simplex k n = (k > 0) ==>
    (genericLength . flatten . simplex' k $ n) == binom (n + k - 1) k

prop_simplex_components k n = (k > 0) ==>
    (map genericLength . group . sort . flatten . simplex' k $ n) == map (flip binom (k-1)) [k+n-2,k+n-3..(k-1)]