I’ve come up with idea for a type system for first-class (global) definitions, which can serve as a very lightweight alternative to a proper module system. I’m posting it here in the hopes of getting some feedback and pointers to related work.
Commands and expressions
The programming language of Swarm (for lack of a better term I will hereafter refer to it as Swarmlang) has a bunch of imperative commands, and standard monadic sequencing constructs. For example,
move commands in sequence, and
thing <- grab; give friend thing
grab, binding the variable
thing to the result, then executes
give friend thing. Of course, there is also a rich language of pure expressions, with things like arithmetic, strings, lambdas and function application, pairs, sums, and so on.
Some languages make a syntactic distinction between statements and expressions, but Swarmlang does not: everything is an expression, and some expressions happen to have a command type. If
t is a type, then
cmd t is the type of an imperative command which, when executed, can have some effects and then returns a result of type
t. (Of course this should feel very familiar to Haskell programmers;
cmd has many similarities to
IO.) This approach makes many things simpler and means that commands are first-class values.
Swarmlang has definitions, which are just expressions with a command type. If
e is an expression, then
def x = e end
cmd (). When executed, it should have the effect of binding the name
x to the expression
e, and bringing
x into scope for all subsequent commands. Thus, it is valid to sequence this first definition with a second definition that mentions
x, like so:
def x = e end; def y = foo bar x end
Of course, this means that while typechecking the definition of
y, we must be able to look up the type of
x. However, the type of the first
def command is simply
cmd (), which does not tell us anything about
x or its type. Normally, the typing rule for sequencing of commands would be something like
but this does not work for
def commands, since it does not take into account the new names brought into scope. Up until now, I have dealt with this in a somewhat ad-hoc manner, with some special typechecking rules for
def and some ad-hoc restrictions to ensure that
def can only syntactically show up at the top level. However, I would really like to put everything on a more solid theoretical basis (which will hopefully simplify the code as well).
Decorating command types
The basic idea is to decorate the type with extra information about names bound by definitions. As usual, let denote a generic context, that is, a finite mapping from variable names to their types. Then we extend the
cmd type by adding a context to it:
is the type of a command which yields a result of type and produces global bindings for some names whose types are recorded in . (Of course, we can continue to use as an abbreviation for .) So, for example,
def x = 3 end no longer has type , but rather something like , representing the fact that although
def x = 3 end does not result in an interesting value, it does bind a name,
x, whose type is
This is slightly unusual in the fact that types and contexts are now mutually recursive, but that doesn’t seem like a big problem. We can now write down a proper typing rule for sequencing that takes definitions into account, something like this:
And of course the typing rule for
def looks like this:
These rules together can now correctly typecheck an expression like
def x = 3 end; def y = 2 + x end
where the second definition refers to the name defined by the first. The whole thing would end up having type .
All this seems straightforward with only first-order types, as in my example typing rules above. But once you add parametric polymorphism my brain starts to hurt. Clearly, the context associated to a command type could bind variables to polytypes. For example,
def id = \x.x end has type . But should the context associated to a command type always contain polytypes, or only when the command type is itself a polytype? In other words, how do we deal with the associated contexts in the monotypes that show up during type inference? And what would it mean to unify two command types with their contexts (and would that ever even be necessary)? I hope it’s actually simple and I just need to think about it some more, but I haven’t wrapped my brain around it yet.
Ideas and pointers welcome!
I’d be very happy to hear anyone’s ideas, or (especially) pointers to published work that seems related or relevant! Feel free to comment either here, or on the relevant github issue.