Filtering Lists with Dependent Types

Filtering a list is a pretty fundamental operation when programming and most modern programming languages provide such a function as part of their standard library. As a simple example, here's JavaScript:

[1,2,3,4].filter(x => x > 2)

So what's the potential issue, and why would we want to introduce the complexity of dependent types?

Let's take the following example:

[1, 2, null, 4].filter(x => Boolean(x)).map(x => x + 1)

It is pretty clear that the addition will not have any trouble because the filter has removed the null. And yet, both Flow.js[1] and TypeScript[2] both complain that x in the map could be null.

[1] Flow.js playground showing error
[2] TypeScript playground showing error

The Haskell standard library has a function called catMaybes[3], which will take all of the Just values from a list of Maybes, so in Haskell you could do this:

[3] catMaybes documentation on Hoogle

map (+1) $ catMaybes [Just 1, Just 2, Nothing, Just 4]

But this pattern relies on transforming the type of the items in the list; in this case from Maybe numbers to just numbers. What if instead of filtering out nulls/Nothings, we wanted to filter out all values greater than some given constant. The filter would transform the list from a list of numbers to a list of numbers, and there would be no evidence that the filter has occured at all. Or say we wanted to select just the records from a list that have a property whose value is of interest, such as an id that is not null. All of this is difficult without lots of tricks or without in-built language support that isn't available for user-defined data structures.

However, with dependent types we can begin to produce such evidence and tie it to the list, allowing all future access of the list to be certain that certain constraints hold. We can paramaterise a list by a predicate, which is to say that just as a generic list is paramatarised by the types of the elements (which is to say that a list of strings is a different type to a list of numbers), we can paramatarise a list by a predicate that all of the elements must adhere to.

PredList

Lets define a data structure called PredList:

data PredList : (ty:Type) -> (pred:(ty -> Type)) -> Type where
  Nil : PredList a p
  Cons : (x:a) -> (p x) -> PredList a p -> PredList a p

The first line says that we're defining a new data structure called PredList that is a paramaterised by two types. The first, called ty, is the type of all of the elements -- just like any other generic data structure. The second type, called pred, is our predicate that all of the elements must hold to be true. It's type is a function that takes as argument an instance of our type ty, which will be some element of the list, and returns a type. This is a pretty standard encoding of predicates in dependent types; simply functions that take as argument a value over which the predicate must hold and return a type; if the predicate does not hold then it will be impossible to create a value of returned type.

The second line defines the end of the list. Just like the base case of how arrays are defined in any other pure functional language, like Haskell or Elm. Here, a and p are just stand-ins for our ty and pred type variables.

It is the third line where things get interesting. The Cons case is the recursive case, again just like how lists are defined elsewhere, but in addition to a value of the generic type, which we call x, we also must provide the output of calling pred with x; which is to say that we must provide proof that the predicate holds for the value that we are attempting to append to the front of the list.

A pointles list

As an aside, with this, we can define a rather pointless list. Using type-level equality, we can define a list where every element must be equal to the number 3. Note that Refl is the proof that 3 equals 3.

aListOfThreeThrees : PredList Nat (\x => 3 = x)
aListOfThreeThrees = Cons 3 Refl $ Cons 3 Refl $ Cons 3 Refl $ Nil

Back to filter

With the definition of PredList, we can now look back at filter. We can define a filter function that takes as input a list and a function that decides whether a given predicate holds for each value in turn.

filterToPredList : { a:Type } -> { pred : (a -> Type) } -> (f : ((x:a) -> Dec (pred x))) -> List a -> PredList a pred
filterToPredList f [] = []
filterToPredList f (x :: xs) = let
  rest = filterToPredList f xs in
  case f x of
       (Yes prf) => Cons x prf rest
       (No _) => rest

On the first line, we have some implicit arguments to be inferred by the type checker specifying the type of the elements of our list and the predicate; we don't need to pass those around by hand. Then we have a function, f, that can take a value from the list and decide whether the predicate holds for that value or not. If it does, then it will return Yes with a proof that it holds, otherwise it will return No. Finally, we also pass in our regular list and return our new PredList.

Again, first case of the definition is the base case, where if the passed list is empty then so too should our PredList.

The following two lines define up front the output of the recursive call for everything except for x, the value that we're currently considering. Now, if f called with x is true, which is to say that f returns Yes, then we append x to the front of rest with the proof, otherwise we don't and we just return rest.

Ultimately, what we end up with is a PredList, which contains all of the elements from the input list for which the predicate holds true.

As an example, this expression would evaluate to a PredList that contains the values 2, 3, and 4 as well as a proof that each is greater than or equal to 2. The way to read the expression in parenthesis is: "2 is less than or equal to what?"

filterToPredList (isLTE 2) [1,2,3,4]

Further code can then use these proofs as evidence that the filtering has taken place and we don't end up in a situation where the compiler doesn't believe our values hold to some constraint that we literally just checked for.

~~~

Last Updated: 2022-02-05