Programming languages with dependent types allow us to specify powerful properties of values using the type system. By employing the Curry–Howard correspondence, we can also use these languages as proof languages for higher-order logics. In this blog post, I want to demonstrate that Haskell as supported by the Glasgow Haskell Compiler (GHC) can give us almost the same features.
The complete article can be seen as a kind of literate program. Extracting all code snippets and glueing them together yields a valid Haskell module. For your convenience, you can download the complete code as an ordinary Haskell source code file.
Prerequisites
Plain Haskell 2010 is not enough for emulating dependent types. We need support for generalized algebraic data types and for type synonym families. Furthermore, we want to use operator syntax on the type level for making our code easier to read. We declare the necessary language extensions:
{-# LANGUAGE GADTs, TypeFamilies, TypeOperators #-}
We want to define better typed versions of certain Prelude functions. For convenience, we want them to have the same names as their Prelude counterparts. Therefore, we hide the corresponding Prelude functions:
import Prelude hiding (head, tail, (++), (+), replicate)
Since Haskell does not come with a type for natural numbers, we make use of the natural-numbers package:
import Data.Natural
We are now ready to turn to the actual development.
Dependently typed programming
Dependent types are types that are parameterized by values. In Haskell, however, parameters of types have to be types again. So we represent values by types. Let us look how this works for natural numbers. The usual ADT declaration of natural numbers uses two data constructors Zero
and Succ
. These data constructors become type constructors now:
data Zero data Succ nat
Note that it does not matter what values the types Zero
and Succ
cover. It is most reasonable to leave these types empty.
Now we define a list type with a length parameter almost like we would do in Agda:
data List el len where Nil :: List el Zero Cons :: el -> List el len -> List el (Succ len)
If we let GHCi compute the type of
Cons 'H' $ Cons 'i' $ Cons '!' $ Nil
,
for example, it will correctly answer with
List Char (Succ (Succ (Succ Zero)))
.
We can drop the static length information from lists by converting them to Haskell’s ordinary list type:
toList :: List el len -> [el] toList Nil = [] toList (Cons el els) = el : toList els
Using toList
, we implement a Show
instance for displaying values of type List
in GHCi:
instance (Show el) => Show (List el len) where show = show . toList
Let us actually make use of the length parameter in list types. We first construct a function head
, which can only be applied to non-empty lists and thus cannot fail:
head :: List el (Succ len) -> el head (Cons el _) = el
The type of the corresponding function tail
also shows that the length of a list’s tail is one less than the length of the original list:
tail :: List el (Succ len) -> List el len tail (Cons _ els) = els
A more advanced example of using static length information is list concatenation. Let us construct a dependently typed version of the well-known ++
-operator. Its type should specify that the length of the result is the sum of the lengths of the arguments. So we have to implement addition of type-level naturals. We introduce a type synonym family for this:
infixl 6 :+ type family nat1 :+ nat2 :: * type instance Zero :+ nat2 = nat2 type instance Succ nat1 :+ nat2 = Succ (nat1 :+ nat2)
We can now define list concatenation as follows:
infixr 5 ++ (++) :: List el len1 -> List el len2 -> List el (len1 :+ len2) Nil ++ list = list Cons el els ++ list = Cons el (els ++ list)
Note that the type checker guarantees that our implementation of concatenation really fulfills the length constraint expressed in the type.
We want to implement a dependently typed version of the Prelude function replicate
, which takes a natural number len
and some value val
and returns a list that contains len
times the value val
. Its desired type would be
(len : Nat) -> el -> List el len
in an Agda-like language. This type gives the first argument the name len
. Since len
is used in the result type, the type of a result depends on the concrete argument.
The problem for our Haskell emulation of dependent types is that the len
argument of the function must be a value, while the len
parameter of List
must be a type. So we have to link value-level naturals and type-level naturals. We do this by defining a type Nat
with one parameter such that for each n ∈ ℕ with a type-level representation nat
, the type Nat nat
contains exactly one value, which represents n:
data Nat nat where Zero :: Nat Zero Succ :: Nat nat -> Nat (Succ nat)
If we ask GHCi for the type of the value
Succ (Succ (Succ Zero))
,
we get the answer
Nat (Succ (Succ (Succ Zero)))
,
which demonstrates that the value is now reflected at the type level.
As in the case of List
, we can define a conversion function for forgetting the type parameter of Nat
, and a Show
instance based on it:
toNatural :: Nat nat -> Natural toNatural Zero = 0 toNatural (Succ nat) = succ (toNatural nat) instance Show (Nat nat) where show = show . toNatural
Analogously to the implementation of (++)
, we can implement addition for values of type Nat
:
infixl 6 + (+) :: Nat nat1 -> Nat nat2 -> Nat (nat1 :+ nat2) Zero + nat2 = nat2 Succ nat1 + nat2 = Succ (nat1 + nat2)
However, we actually wanted to implement a dependently typed replicate
. We can do this as follows:
replicate :: Nat len -> el -> List el len replicate Zero _ = Nil replicate (Succ len) el = Cons el (replicate len el)
The definition of Nat
ensures that the first argument of replicate
and the type variable len
represent the same natural number. So the type of replicate
guarantees that the first argument determines the length of the result list.
Theorem proving
Theorem proving in a dependently typed programming language works by exploiting the Curry–Howard correspondence. Propositions correspond to types and their proofs correspond to expressions of the corresponding types. In this blog post, we will only discuss how to prove equations. Proving other things works analogously.
Let us first define the equality predicate. We can define a predicate by introducing a corresponding GADT. The data constructors of this GADT serve as atomic proofs, their types correspond to axioms. A type constructor (:==)
that implements equality is defined as follows:
infix 4 :== data val1 :== val2 where Refl :: val :== val
We implement a Show
instance for (:==)
that enables us to show the only possible normal form of an equality proof, which is Refl
:
instance Show (val1 :== val2) where show Refl = "Refl"
We will now prove that 0 is a right unit of addition, that is,
∀n ∈ ℕ . n + 0 = n .
It might seem reasonable to give the proof the polymorphic type
nat :+ Zero :== nat
.
However, the proof has to be done by induction, so that we would have to do pattern matching on the natural number parameter. We cannot do pattern matching on a type, so we have to pass the natural number represented by nat
as a value of type Nat nat
. We implement the proof as follows:
rightUnit :: Nat nat -> nat :+ Zero :== nat rightUnit Zero = Refl rightUnit (Succ nat) = case rightUnit nat of Refl -> Refl
As a second and last example, we proof that addition is associative, that is,
∀n₁, n₂, n₃ ∈ ℕ . (n₁ + n₂) + n₃ = n₁ + (n₂ + n₃) .
The Haskell code is as follows:
assoc :: Nat nat1 -> Nat nat2 -> Nat nat3 -> (nat1 :+ nat2) :+ nat3 :== nat1 :+ (nat2 :+ nat3) assoc Zero nat2 nat3 = Refl assoc (Succ nat1) nat2 nat3 = case assoc nat1 nat2 nat3 of Refl -> Refl
Note that we pass all three natural numbers involved as values of Nat
, although we only do pattern matching on the first one. The reason is that we have to specify the second and third number in the recursive call. If we would not do this, they would have to be inferred, which turns out to be impossible.
So have we shown that Haskell can replace Agda if we are willing to write a bit of boilerplate code? Not really. A crucial property of Agda, which is important for proving theorems, is that all values are fully defined. In Haskell, however, values may be partially or completely undefined. In particular, the value ⊥, which denotes undefinedness, is an element of every type. So every proposition has at least one proof, which makes our logic inconsistent.
A solution is to accept a value as a proof only if it does not contain ⊥. We can check if ⊥ is present by evaluating the value completely. This succeeds if and only if there is no ⊥. There are two problems with this approach. First, it causes a runtime overhead, which is not necessary when using a language like Agda. Second, complete evaluation is not possible for arbitrary functions, since it would mean applying the functions in question to all possible arguments. Since function types correspond to implications, and implications occur often in general statements, this is a serious drawback.
So theorem proving is obviously something that should be left to languages that can handle it better than Haskell can. That said, I think that it is nevertheless fun to see how far we can go with present-day Haskell.
As it turns out, your Haskell theorems do not need to use values, thus solving your ⊥-problem. You can use logic programming with type classes to prove theorems. Also, your method of theorem proving also fails to ensure the type safety of the theorem statements themselves.
:+
could still be extended invalidating the truth of the theorems in it’s previous uses. The solution to these two problems is posted here: http://kormacode.blogspot.com/2012/05/almost-lf-in-haskell.htmlI do really like this method however, the syntax seems much cleaner.
LikeLike
I could surely extend
:+
, like this, for example:However, this would not really invalidate my theorems. The fact that 0 is a right unit of addition, for example, is encoded by the Haskell type
Nat nat -> nat :+ Zero :== nat
.If I pass a value of type
Nat ()
to the proof of this theorem, I will receive ⊥ as the result, since the only value of typeNat ()
is ⊥, and thus the pattern matching done by the proof will fail. However, I have to check every alleged proof for bottoms anyhow.The approach you present in Logic Programming with GADTs and Classes seems to be very similar to mine. The
List
type you define there is completely analogous to myNat
type. The difference is that you use classes with functional dependencies where I use type synonym families. As a result, you implement the different cases ofappend
via different instance declarations, while in my definition of(+)
, I use the fact that pattern matching on GADT values restricts the type parameters of the GADT.I cannot see how the definition of the
Iso
class in Almost LF in Haskell really ensures that the onlynat
withIso Nat nat
are the type-level natural numbers. I could very well do this:Note that even the informal requirement that
into
andoutof
form an isomorphism is fulfilled, since bothNat Bogus
andBogus
are empty.That said, it is possible to prevent a Haskell-encoded predicate from being extended. I describe a generic approach to this in Subsection 6.2 of Generic Record Combinators with Static Type Checking.
Instead of using this generic implementation, you can also specialize it manually for each predicate. For example, you can define a natural number predicate as follows:
Note that the type of
specialize_Nat
is isomorphic toAll_Nat item -> forall nat . (Nat nat) => item nat
.The informal requirement for instances of
Nat
is thatclosed_Nat
andspecialize_Nat
are inverses of each other.If you try to make the
Bogus
type from above an instance ofNat
, you will run into problems. There is no implementation ofspecialize_Nat
that preserves the property thatclosed_Nat
andspecialize_Nat
are each other’s inverses. Actually, there is no type-correct implementation ofspecialize_Nat
except ⊥.LikeLike
Unless I am incorrect, in the ⊥ instance, the only difference between the class solution and the data family solution is that in the class solution, because the proofs are entirely on the level of types, they all get automatically evaluated at compile time, rather than having to add a check for ⊥ manually to run at runtime.
For ensuring classes are closed:
If you have
Iso Nat Bogus
,outof :: Nat Bogus -> Bogus
. So you could define it as,outof _ = ⊥
yes. What is (or should have been), what kept these closed was that the module did not exportinto
andoutof
. Your solution is also cool, but I think still suffers the problem of requiring that every type have its own module (unless you want to perform bottom checks).LikeLike
If you are doing proofs at the type level using classes, then you probably do not need to evaluate proof terms at runtime. When I said that your class-based and my type-family-based solution were similar, I only referred to how we implement functions like
append
and(+)
on both the type- and the value-level. I think in your blog posts you did not show how to actually proof things, did you?If your module does not export
into
andoutof
, you can still give the instance declaration forIso Nat Bogus
I showed, since this declaration does not contain any method implementations.Unfortunately, my approach suffers from the need of bottom checks even if I put stuff related to a particular type in a separate module. Someone else can always write something like this:
So let us use Agda for proofs.
LikeLike
No, In my post my focus was not on theorem proving, just that an embedding of “Sort of LF” was possible, considering that we don’t have termination or totality checking in either of our cases. I’m only using the runtime to show that we can use this technique to meta program Haskell, and not just prove properties of Haskell.
I’m working under the assumption of “-Werror” being used, since the compiler can perform that check.
LikeLike