Typed Type-Level Computation in Amulet

Posted on October 4, 2019

Amulet, as a programming language, has a focus on strong static typing. This has led us to adopt many features inspired by dependently-typed languages, the most prominent of which being typed holes and GADTs, the latter being an imitation of indexed families.

However, Amulet was up until recently sorely lacking in a way to express computational content in types: It was possible to index datatypes by other, regular datatypes (“datatype promotion”, in the Haskell lingo) since the type and kind levels are one and the same, but writing functions on those indices was entirely impossible.

As of this week, the language supports two complementary mechanisms for typed type-level programming: type classes with functional dependencies, a form of logic programming, and type functions, which permit functional programming on the type level.

I’ll introduce them in that order; This post is meant to serve as an introduction to type-level programming using either technique in general, but it’ll also present some concepts formally and with some technical depth.

Type Classes are Relations: Programming with Fundeps

In set theory1 a relation \(R\) over a family of sets \(A, B, C, \dots\) is a subset of the cartesian product \(A \times B \times C \times \dots\). If \((a, b, c, \dots) \in R_{A,B,C,\dots}\) we say that \(a\), \(b\) and \(c\) are related by \(R\).

In this context, a functional dependency is a term \(X \leadsto Y\) where \(X\) and \(Y\) are both subsets of \(\mathbb{N}\). A relation is said to satisfy a functional dependency \(X \leadsto Y\) when, for any tuple in the relation, the values at \(X\) uniquely determine the values at \(Y\).

For instance, the relations \(R_{A,B}\) satisfying \(\{0\} \leadsto \{1\}\) are partial functions \(A \to B\), and if it were additionally to satisfy \(\{1\} \leadsto \{0\}\) it would be a partial one-to-one mapping.

One might wonder what all of this abstract nonsense2 has to do with type classes. The thing is, a type class class foo : A -> B -> constraint is a relation \(\text{Foo}_{A,B}\)! With this in mind, it becomes easy to understand what it might mean for a type class to satisfy a functional relation, and indeed the expressive power that they bring.

To make it concrete:

class r 'a 'b (* an arbitrary relation between a and b *)
class f 'a 'b | 'a -> 'b (* a function from a to b *)
class i 'a 'b | 'a -> 'b, 'b -> 'a (* a one-to-one mapping between a and b *)

The Classic Example: Collections

In Mark P. Jones’ paper introducing functional dependencies, he presents as an example the class collects : type -> type -> constraint, where 'e is the type of elements in the collection type 'ce. This class can be used for all the standard, polymorphic collections (of kind type -> type), but it also admits instances for monomorphic collections, like a bitset.

class collects 'e 'ce begin
  val empty  : 'ce
  val insert : 'e -> 'ce -> 'ce
  val member : 'e -> 'ce -> bool
end

Omitting the standard implementation details, this class admits instances like:

class eq 'a => collects 'a (list 'a)
class eq 'a => collects 'a ('a -> bool)
instance collects char string (* amulet strings are not list char *)

However, Jones points out this class, as written, has a variety of problems. For starters, empty has an ambiguous type, forall 'e 'ce. collects 'e 'ce => 'ce. This type is ambiguous because the type varialbe e is \(\forall\)-bound, and appears in the constraint collects 'e 'ce, but doesn’t appear to the right of the =>; Thus, we can’t solve it using unification, and the program would have undefined semantics.

Moreover, this class leads to poor inferred types. Consider the two functions f and g, below. These have the types (collects 'a 'c * collects 'b 'c) => 'a -> 'b -> 'c -> 'c and (collects bool 'c * collects int 'c) => 'c -> 'c respectively.

let f x y coll = insert x (insert y coll)
let g coll = f true 1 coll

The problem with the type of f is that it is too general, if we wish to model homogeneous collections only; This leads to the type of g, which really ought to be a type error, but isn’t; The programming error in its definition won’t be reported here, but at the use site, which might be in a different module entirely. This problem of poor type inference and bad error locality motivates us to refine the class collects, adding a functional dependency:

(* Read: 'ce determines 'e *)
class collects 'e 'ce | 'ce -> 'e begin
  val empty  : 'ce
  val insert : 'e -> 'ce -> 'ce
  val member : 'e -> 'ce -> bool
end

This class admits all the same instances as before, but now the functional dependency lets Amulet infer an improved type for f and report the type error at g.

val f : collects 'a 'b => 'a -> 'a -> 'b -> 'b
  │ 
2 │ let g coll = f true 1 coll
  │                     ^
  Couldn't match actual type int
    with the type expected by the context, bool

One can see from the type of f that Amulet can simplify the conjunction of constraints collects 'a 'c * collects 'b 'c into collects 'a 'c and substitute 'b for 'a in the rest of the type. This is because the second parameter of collects is enough to determine the first parameter; Since 'c is obviously equal to itself, 'a must be equal to 'b.

We can observe improvement within the language using a pair of data types, (:-) : constraint -> constraint -> type and dict : constraint -> type, which serve as witnesses of implication between constraints and a single constraint respectively.

type dict 'c = Dict : 'c => dict 'c
type 'p :- 'q = Sub of ('p => unit -> dict 'q)

let improve : forall 'a 'b 'c. (collects 'a 'c * collects 'b 'c) :- ('a ~ 'b) =
  Sub (fun _ -> Dict)

Because this program type-checks, we can be sure that collects 'a 'c * collects 'b 'c implies 'a is equal to 'b. Neat!

Computing with Fundeps: Natural Numbers and Vectors

If you saw this coming, pat yourself on the back.

I’m required by law to talk about vectors in every post about types. No, really; It’s true. I’m sure everyone’s seen this by now, but vectors are cons-lists indexed by their type as a Peano natural.

type nat = Z | S of nat
type vect 'n 'a =
  | Nil  :                    vect Z      'a
  | Cons : 'a * vect 'n 'a -> vect (S 'n) 'a

Our running objective for this post will be to write a function to append two vectors, such that the length of the result is the sum of the lengths of the arguments.3 But, how do we even write the type of such a function?

Here we can use a type class with functional dependencies witnessing the fact that \(a + b = c\), for some \(a\), \(b\), \(c\) all in \(\mathbb{N}\). Obviously, knowing \(a\) and \(b\) is enough to know \(c\), and the functional dependency expresses that. Due to the way we’re going to be implementing add, the other two functional dependencies aren’t admissible.

class add 'a 'b 'c | 'a 'b -> 'c begin end

Adding zero to something just results in that something, and if \(a + b = c\) then \((1 + a) + b = 1 + c\).

instance add Z 'a 'a begin end
instance add 'a 'b 'c => add (S 'a) 'b (S 'c) begin end

With this in hands, we can write a function to append vectors.

let append : forall 'n 'k 'm 'a. add 'n 'k 'm
          => vect 'n 'a -> vect 'k 'a -> vect 'm 'a =
  fun xs ys ->
    match xs with
    | Nil -> ys
    | Cons (x, xs) -> Cons (x, append xs ys)

Success!
… or maybe not. Amulet’s complaining about our definition of append even though it’s correct; What gives?

The problem is that while functional dependencies let us conclude equalities from pairs of instances, it doesn’t do us any good if there’s a single instance. So we need a way to reflect the equalities in a way that can be pattern-matched on. If your GADT senses are going off, that’s a good thing.

Computing with Evidence

This is terribly boring to do and what motivated me to add type functions to Amulet in the first place, but the solution here is to have a GADT that mirrors the structure of the class instances, and make the instances compute that. Then, in our append function, we can match on this evidence to reveal equalities to the type checker.

type add_ev 'k 'n 'm =
  | AddZ : add_ev Z 'a 'a
  | AddS : add_ev 'a 'b 'c -> add_ev (S 'a) 'b (S 'c)

class add 'a 'b 'c | 'a 'b -> 'c begin
  val ev : add_ev 'a 'b 'c
end

instance add Z 'a 'a begin
  let ev = AddZ
end

instance add 'a 'b 'c => add (S 'a) 'b (S 'c) begin
  let ev = AddS ev
end

Now we can write vector append using the add_ev type.

let append' (ev : add_ev 'n 'm 'k)
            (xs : vect 'n 'a)
            (ys : vect 'm 'a)
  : vect 'k 'a =
    match ev, xs with
    | AddZ, Nil -> ys
    | AddS p, Cons (x, xs) -> Cons (x, append' p xs ys)
and append xs ys = append' ev xs ys

This type-checks and we’re done.

Functions on Types: Programming with Closed Type Functions

Look, duplicating the structure of a type class at the value level just so the compiler can figure out equalities is stupid. Can’t we make it do that work instead? Enter closed type functions.

type function (+) 'n 'm begin
  Z + 'n = 'n
  (S 'k) + 'n = S ('k + 'n)
end

This declaration introduces the type constructor (+) (usually written infix) and two rules for reducing types involving saturated applications of (+). Type functions, unlike type classes which are defined like Prolog clauses, are defined in a pattern-matching style reminiscent of Haskell.

Each type function has a set of (potentially overlapping) equations, and the compiler will reduce an application using an equation as soon as it’s sure that equation is the only possible equation based on the currently-known arguments.

Using the type function (+) we can use our original implementation of append and have it type-check:

let append (xs : vect 'n 'a) (ys : vect 'k 'a) : vect ('n + 'k) 'a =
  match xs with
  | Nil -> ys
  | Cons (x, xs) -> Cons (x, append xs ys)
let ys = append (Cons (1, Nil)) (Cons (2, Cons (3, Nil)))

Now, a bit of a strange thing is that Amulet reduces type family applications as lazily as possible, so that ys above has type vect (S Z + S (S Z)) int. In practice, this isn’t an issue, as a simple ascription shows that this type is equal to the more orthodox vect (S (S (S Z))) int.

let zs : vect (S (S (S Z))) int = ys

Internally, type functions do pretty much the same thing as the functional dependency + evidence approach we used internally. Each equation gives rise to an equality axiom, represented as a constructor because our intermediate language pretty much lets constructors return whatever they damn want.

type + '(n : nat) '(m : nat) =
  | awp : forall 'n 'm 'r. 'n ~ Z -> 'm ~ 'n -> ('n + 'm) ~ 'n
  | awq : forall 'n 'k 'm 'l. 'n ~ (S 'k) -> 'm ~ 'l
       -> ('n + 'm) ~ (S ('k + 'l))

These symbols have ugly autogenerated names because they’re internal to the compiler and should never appear to users, but you can see that awp and awq correspond to each clause of the (+) type function, with a bit more freedom in renaming type variables.

Custom Type Errors: Typing Better

Sometimes - I mean, pretty often - you have better domain knowledge than Amulet. For instance, you might know that it’s impossible to show a function. The type_error type family lets you tell the type checker this:

instance
     (type_error (String "Can't show functional type:" :<>: ShowType ('a -> 'b))
  => show ('a -> 'b)
begin
  let show _ = ""
end

Now trying to use show on a function value will give you a nice error message:

let _ = show (fun x -> x + 1)
  │ 
1 │ let _ = show (fun x -> x + 1)
  │         ^^^^^^^^^^^^^^^^^^^^^
  Can't show functional type: int -> int

Type Families can Overlap

Type families can tell when two types are equal or not:

type function equal 'a 'b begin
  discrim 'a 'a = True
  discrim 'a 'b = False
end

But overlapping equations need to agree:

type function overlap_not_ok 'a begin
  overlap_not_ok int = string
  overlap_not_ok int = int
end
  Overlapping equations for overlap_not_ok int
  • Note: first defined here,
  │ 
2 │   overlap_not_ok int = string
  │   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  but also defined here
  │ 
3 │   overlap_not_ok int = int
  │   ^^^^^^^^^^^^^^^^^^^^^^^^

Conclusion

Type families and type classes with functional dependencies are both ways to introduce computation in the type system. They both have their strengths and weaknesses: Fundeps allow improvement to inferred types, but type families interact better with GADTs (since they generate more equalities). Both are important in language with a focus on type safety, in my opinion.


  1. This is not actually the definition of a relation with full generality; Set theorists are concerned with arbitrary families of sets indexed by some \(i \in I\), where \(I\) is a set of indices; Here, we’ve set \(I = \mathbb{N}\) and restrict ourselves to the case where relations are tuples.↩︎

  2. At least it’s not category theory.↩︎

  3. In the shower today I actually realised that the append function on vectors is a witness to the algebraic identity \(a^n * a^m = a^{n + m}\). Think about it: the vect 'n functor is representable by fin 'n, i.e. it is isomorphic to functions fin 'n -> 'a. By definition, fin 'n is the type with 'n elements, and arrow types 'a -> 'b have \(\text{size}(b)^{\text{size}(a)}\) elements, which leads us to conclude vect 'n 'a has size \(\text{size}(a)^n\) elements.↩︎