Patch theory, part II: some basics

February 13, 2008

(Previous posts here, and here.)

So, let’s talk about patch theory! I should start by saying that I have obviously drawn a lot of ideas and inspiration from darcs, and especially from the wikibook explanation of darcs patch theory, but I do think there might be some nuggets of original contributions here (in subsequent posts at least), but I’m not yet familiar enough with the literature to really say.

In this post I’d like to start off by giving an overview of the basics of patch theory, to lay a foundation for the things I plan to talk about in future posts.

Patches and documents

A patch is essentially a function which takes a document as input and produces another document. A “document” is just any sort of thing that we might wish to modify. In the context of darcs, a “document” is an entire directory tree; in a collaborative editor it would be just a single file.

Note, however, that patches are partial functions: not every patch can be applied to every document! For example, a patch which (to use a darcs example) makes a modification to file X cannot be applied in a context in which there is no file X. As another example, a patch to delete the character ‘z’ from the first position in a file cannot be applied to a file which begins with the character ‘y’. But thinking of patches as partial functions is not a very useful point of view. The main point is that the context of a patch matters—both the context to which it is applied, and the context which it produces. We will write P : x \to y to denote a patch P which, when applied to document x, produces document y. We say that x is the domain of P, and y is its codomain.

A patch

Of course, this immediately suggests…

The category of patches

Patches can be most usefully viewed as morphisms in a category with documents as objects. (If you don’t know any category theory, don’t worry: the rest of this post doesn’t particularly depend on any background knowledge.) To wit: given two patches P : x \to y and Q : y \to z, we can compose them to form the composite patch PQ : x \to z, which has the same overall effect as applying first P, then Q. (Of course, there are good arguments for writing this composition in the other order, like QP—function composition and all that—but this is the way I’ve been writing it, so get used to it. =)

Patch composition

Since patches can be viewed as functions from one document to another, and function composition is associative, patch composition is obviously associative as well. Finally, for every document d, we will have a null patch id_d which sends d to itself.

The identity patch for document d

Since we’re very interested in the “undo” operation, we also require that every patch must be invertible—that is, for every patch P : x \to y there must be a corresponding patch P^{-1} : y \to x, such that P P^{-1} = id_x and P^{-1} P = id_y. Of course, this also means that (P^{-1})^{-1} = P.

The inverse of a patch

So, the category of patches is actually a groupoid. A groupoid can be viewed as a set with inverses and a partial binary operation—here, the partiality comes from the fact that not all patches can be composed—but I prefer the category-theoretical view of a groupoid as a category with all morphisms invertible. Because really, who likes partial functions?

(A quick note: I’m playing a little fast and loose with patch equality here; when I say that two patches are equal, what I really mean is that they are observationally equivalent. So technically, the morphisms in the category of patches are equivalence classes of patches; in a particular implementation there may be patches with distinguishable representations which nevertheless have the same effect — that is, they produce the same output document given the same input document.)

Commutation

The other central operation on patches is that of commutation. As a motivating example, let’s consider the problem of undo in a text editor. Suppose, starting from a blank document, you sequentially apply the five patches A through E. Therefore, the current document state can be described by the composite patch

ABCDE

Now suppose you want to undo your last change. This is easy: since every patch has an inverse, you can just apply E^{-1} to obtain

ABCDEE^{-1} = ABCD

(In practice, to remember the fact that you performed an undo, and to allow the possibility of redo in the future, an editor would retain the patches E and E^{-1} rather than deleting them. But the overall effect is the same.)

Nice. But what if you are using a collaborative editor, and changes D and E were made by a different user (perhaps you are not even aware of their changes, if they were made in a different part of the document)? You want to undo patch C, which is the last change that you made, but simply applying C^{-1} doesn’t work anymore, since C^{-1} cannot be composed with E (the codomain of E does not match the domain of C^{-1})!

We need a way to “move” C to the end of the patch sequence, like this:

ABCDE \Rightarrow ABD'C'E \Rightarrow ABD'E'C''

Now we can simply apply the inverse patch C''^{-1} to undo.

This process of reordering patches is referred to as commutation. The composite patch PQ commutes to the composite patch Q'P' (written PQ \leftrightarrow Q'P') if PQ = Q'P', and P represents “the same change” as P', and similarly for Q and Q'.

Commuting patches

Of course, “the same change” is quite vague, but it’s a necessary restriction; just requiring that PQ = Q'P' is not enough, since in that case we could, for example, choose Q'=P and P'=Q—obviously not what we want. I’ve wondered whether there is a formal way to pin this down, although I think it might depend on the particular document type being used. However, for now a simple example should suffice.

Suppose Alice and Bob are editing a document together, which contains the word “cap”. First, Alice inserts the letter “m” at position 2 (positions are zero-indexed), to produce the word “camp”; call this patch A. Next, Bob inserts the letter “r” at position 1, producing the word “cramp”; call this patch B. Now Alice decides that she wishes to undo her change, since “crap” is a much better word than “cramp”. In order to do this, the patches A and B must first be commuted: we want to find patches B' and A' such that B' adds the letter “r”, A' adds the letter “m”, and when composed, B'A' still sends the document “cap” to the document “cramp”. In this case, it’s not too hard to see that B' should still insert “r” at position 1, but now A' should insert “m” at position 3 instead of position 2, since the location in the document where A inserted an “m” has been “shifted over” by the patch B'.

Alice and Bob commute their patches.

After commuting A and B, the patch A'^{-1} can now be applied to undo Alice’s change.

Now, the big question: does every pair of patches commute? In the case of a version control system like darcs, the answer is definitely “no”. For example, suppose P creates file X, and Q adds some content to file X. There is no way we can meaningfully reorder the two patches—content cannot be added to file X before it has been created! In the case of a collaborative editor, on the other hand, the answer is… maybe? This is one of the central questions I plan to address in later posts.

If the patches P and Q commute, a nice property that we’d like to have hold in all cases is

PQ \leftrightarrow Q'P' \leftrightarrow PQ,

that is, we want the commute operation to be an involution, so applying it twice is the identity. This also has some interesting implications for the theory of a collaborative editor, and it will come up again in later posts, too!

Merging

The other fundamental operation on patches, of course, is merging: taking two patches made to the same document in parallel, and merging them into a sequence of patches that performs both changes. In a version control system, this happens all the time, when two people have the same source and start making changes to it at the same time, and later want to get the other’s changes as well. It happens in a collaborative editor, too, because of network latency. When you start typing something, someone else may have already typed some things that have not yet propagated to you over the network; when you finally receive the propagated changes, they will have to be merged with the changes you have made.

It turns out, however, that merging is really not a fundamental operation at all! It can be implemented very simply in terms of commutation and inverses. Here’s the situation: suppose we have two patches, A and B, made in parallel to the same document, x. We want to find a patch B' which performs the “same change” as B, but which can be composed with A.

Merging two patches

How can we find B'? The key is to note that if we invert A, this looks just like the diagram for commuting two patches, but on its side!

Implementing merge

In other words, to merge patches A and B, we first commute A^{-1}B to obtain B'(A^{-1})'. Then we can simply discard (A^{-1})'. (Of course, this sounds like wasted computation, but supposing we were to use some sort of lazy language… =) Let’s illustrate this with a simple example. Alice and Bob are at it again; this time, they are editing a document containing the word “hat”. Alice adds the letter “c” to create the word “chat” (patch A). At the same time, Bob adds the letter “s” to create the word “hats” (patch B). Now Alice’s editor gets Bob’s patch, and needs to merge it with Alice’s. Commuting A^{-1}B yields B' (A^{-1})', as shown in the diagram, and Alice’s editor applies patch B', producing the document “chats”.

Alice merges Bob’s change

Of course, at the same time, Bob’s editor receives Alice’s patch, and performs the dual merge shown below.

Bob merges Alice’s change

This illustrates an obvious and important property that must hold: merging must be symmetric! From the above two diagrams, we must have AB' = BA'.

Onward

Alright, enough for today! Next time: some actual Haskell code implementing all of this for a simple document type!


Patch theory thoughts, part I

February 7, 2008

So, I still don’t know whether I’ll actually end up writing a gobby clone in Haskell. But it’s already been a wild ride thinking about the theory behind it and some of the issues involved, and over the next few posts I’d like to share some of my thoughts, complete with illustrative code, in the hopes that others will find them interesting or inspiring. Will this be just a rehashing of stuff that someone else has already written about in much more depth? Will it be an important contribution to the topic? Frankly, at this stage, I don’t really care. =)

I’ll get to the fun stuff soon; today, I’d like to start out simple by just responding to a few of the comments on my previous post.

First, Creighton Hogg wondered about using darcs as the backend of such a concurrent editor, and whether it would run into issues with the known performance problems with darcs. Allow me to first clarify that I am not thinking about creating a concurrent editor backed by darcs, but rather one which is backed by the theory behind darcs. It’s still a fair question, though: mightn’t my editor exhibit the same algorithmic problems that causes darcs performance problems? The answer, as it turns out, is no: darcs exhibits performance problems in the specific case of merging conflicting patches, but in a concurrent editor (the model I’m interested in, at least) there’s no such thing as conflicting patches! This is a very important difference and I’ll talk about this, and its theoretical implications, later. In any event, darcs 2 reportedly fixes the performance problems anyway.

Tim suggested some sort of GUI interface for darcs — I think this is a great idea, but in my opinion doesn’t come anywhere close to filling the same niche as a collaborative editor. A collaborative editor allows for much more freedom, and much more…well… collaboration. For example, you could change some code and put a little comment next to it saying “is this right? Or should it be like XYZ?”, and someone else could respond to the comment, or change the code, and so on. No one would do that via darcs; the social assumption is that each darcs patch you make is correct (to the best of your knowledge), so there’s not as much room for experimenting and getting feedback from others. Anon also questioned the worth of this sort of project, claiming that collaborative editors don’t actually get used all that much, to which I would reply: maybe the reason people don’t use collaborative editors very much is because good (and free) ones don’t exist.

sclv, Eric Kow, and the Edward linked to some interesting research on the topic, which I’ve started reading a bit but hope to look into in more detail soon.

mgsloan is working on a cool Haskell editor project (screenshot) — although as Christophe Poucet points out, it would not work well with a collaborative model due to its use of zippers to hold buffer contents. But I’m glad there are other people out there who know how to make cool-looking GUIs, since I sure don’t. =)

Finally, it turns out that I’m not the first person to have this idea! Christophe (along with Ivan Tarasov) worked on the beginnings of a system called Splatch at last October’s Haskell Hackathon. It’s pretty neat, although (as I have expressed to Christophe, rather incoherently, I’m afraid) I believe in its current state it has a few fundamental theoretical problems. Over my next few posts I hope to be able to expound on this a bit more, although my focus will not be on evaluating Splatch per se, but on describing my explorations in patch theory in general, and as it might apply to a collaborative editor in particular.


Gobby, Haskell, and patch theory

February 4, 2008

On a couple of occasions over the past few days, I’ve had the pleasure of editing code with other #haskellers, using the collaborative text editor gobby. It was quite a lot of fun, and (somewhat surprisingly, to me at least) rather effective. It could stand a good chance of becoming one of my preferred methods for collaborative programming.

Except.

Except that gobby doesn’t support undo! It doesn’t even have a way to go “back in time” to a previous version of a document (unless you have it saved locally, of course). This is, quite plainly, a huge show-stopper, and I don’t say that just in theory: I ran into this issue head-on after only a few hours of use. I wanted to delete a bunch of double quotes, so did a search-and-replace. It was taking too long to click “replace” for each double quote character, so I clicked “replace all”, thinking it would only replace double quotes from the current location to the end of the file. Well, as you can probably guess, it deleted all the double quotes everywhere in the file. Oops. Ten minutes of tracking down missing quotes ensued, and I don’t even want to imagine what it would have been like without the help of the typechecker.

This was ten minutes of wasted time. I should have been able to just hit “undo” and get on with life.

So, why doesn’t gobby have undo? I looked at their bug tracker, and sure enough, this has been a requested feature since 2005. But it’s never been done, since undo in a collaborative-editing context is a hard problem. Apparently someone is sorta-kinda working on an implementation, that works in some cases…

A hard problem, eh? What hard problems need is good theory. Well, gee goshums! If only there was some sort of theory of concurrent editing

So, the upshot of all of this is that I am seriously considering writing a gobby clone in Haskell with some sweet, sweet patch theory goodness. (Other killer features under possible consideration: hpaste and darcs integration, a Yi frontend, integrated compilation/error annotations, chat support that doesn’t suck…) I do have a lot more thoughts on the particulars of how this might work, but I won’t write more about the specifics here since that’s not really the point of this post.

The point is that although I said “seriously considering”, in actuality I’m not sure exactly how serious I really am. This would be a pretty big project, and I certainly couldn’t get anywhere close to pulling it off myself. I’m not even sure that I understand all of what it would require; I’d be particularly clueless about the networking and UI stuff. I just think that it would be pretty sweet, and a great application of Haskell’s strengths. Mostly at this point I’m looking for feedback. What do you think? Is this an awesome idea? Is it a stupid idea? Most importantly, would you be interested in working on it?