21 July, 2010

bubblesort in agda (part 1)

I've been playing with Agda, a dependently typed programming language (where types of expressions can depend on earlier values).

Dependent types let you make quite rich compile-time assertions about the behaviour of your program. These assertions are written and checked using the type system.

One of the examples for Agda seems to be quicksort. So, I thought I'd experiment with bubblesort - similar but different.

First I want an implementation of bubblesort in Agda. Then I want to use dependent types to express something interesting that I cannot express in another language: that bubblesort really sorts.

Agda looks syntactically quite like Haskell, so I first made a Haskell implementation. This doesn't look like a procedural normal bubblesort implementation, as I wanted to keep a functional style.

bubblesort t = applyLoop t

applyLoop t = if isSorted t then t else applyLoop (onePass t)

onePass [e] = [e]
onePass t =
  let a:b:rest = t
      (as,bs) = if a < b then (a,b) else (b,a)
      newlist = as:bs:rest  -- list with head two elements in order
      bsrest = onePass (bs:rest)
      endlist = as:bsrest
    in endlist

Next I wanted to convert the above Haskell code into Agda. Syntactically, this translates quite easily.

My only major stumbling block comes with termination.

Functions in Agda must be total.

That means they must always terminate with a value. They can't fail with an error, and they can't run forever (which means that Agda is not a turing-complete language, I think!)

Agda contains a termination checker that gives a compile error if it can't prove that your functions terminate.

Bubblesort does always terminate. But the Agda compiler can't see that from the above definition. The proof looks something like: every iteration of applyLoop makes the list more sorted, and once you've called it enough times (as many as there are elements in the list) then the list will be sorted.

So I changed my implementation to run the outer loop a fixed number of times, rather than until the list is sorted. That makes the termination provable. But it makes it less obvious that the final result is actually sorted: now we end after a certain number of runs, rather than when the isSorted test is true...

Here's the Agda code:

onepass : {n : ℕ}→ Vec ℕ n → Vec ℕ n
  onepass {zero} [] = []
  onepass {suc zero} (y :: []) = y :: []
  onepass {suc (suc n)} (y :: (y' :: y0)) = if y ≤? y' then                   
                                        y :: (onepass {suc n} (y' :: y0))     
                                   else y' :: (onepass {suc n} (y :: y0)) 

  innersort : {n : ℕ} -> ℕ -> Vec ℕ n -> Vec ℕ n
  innersort zero list = list
  innersort (suc i) sorted = innersort i (onepass sorted)

  sortlist : {n : ℕ} -> Vec ℕ n -> Vec ℕ n
  sortlist {n} l = innersort n l
(I actually started with a different definition of onepass - I'll come back to that in a different post).

Now, that's enough (given wrappers for main, etc) to successfully sort a list:

$ ./run
2 :: 3 :: 5 :: 8 :: 11 :: 12 :: 14 :: []
So what's different from doing this in Haskell?

Well, on the downside, it is not as obvious that this sort actually returns a sorted list, as mentioned above.

On the upside: we know that this program won't loop forever (although it might loop for a very long time) - because of the termination checker.

And the types of the functions, which I haven't mentioned yet, tell us something that relates the inputs and the outputs. sortlist has a type that tells us that its output will be the same length as its input. This is not a full definition of sortedness but its more than is offered by Haskell lists.

The type signature looks like this:
sortlist : {n : ℕ} -> Vec ℕ n -> Vec ℕ n
which says something like: sortlist is defined for every natural number n, and takes as input a Vector (of ℕumbers) of length n and returns as output a Vector (of ℕumbers) of that same length n.

The type checker checks that this is true at compile time, so the inner functions must have the length similarly encoded in their types.

At compile time, when we call n, we don't know what the n will be for every invocation (this is not Pascal statically-sized arrays) - but we do know that the output will be the same size as the input.

Compare that to type of the Haskell implementation:

bubblesort :: [Int] -> [Int]
which says that bubblesort takes a list of numbers and returns a list of numbers. Typewise, a perfectly valid implementation of quicksort here could return the empty list [] for every input. This can't happen here.

So in subsequent posts, I want to try to make compile-time assertions richer. I want to prove that bubblesort really sorts, by which I mean (I think):the output is a permutation of the input; and the output is in order.

(note 1) actually the Haskell type is: bubblesort :: (Ord t) => [t] -> [t] meaning Lists of anything that can be ordered, rather than lists of integers. That same abstraction is mostly achievable in Agda, but not quite - it doesn't have typeclasses in the same way that Haskell does.

Read part 2...

No comments:

Post a Comment