Type Directed Search :search
¶
Idris’ :search
command searches for terms according to their
approximate type signature (much like
Hoogle for Haskell). For
example:
Idris> :search e -> List e -> List e
= Prelude.List.(::) : a -> List a -> List a
Cons cell
= Prelude.List.intersperse : a -> List a -> List a
Insert some separator between the elements of a list.
> Prelude.List.delete : Eq a => a -> List a -> List a
< assert_smaller : a -> b -> b
Assert to the totality checker than y is always structurally
smaller than x (which is typically a pattern argument)
< Prelude.Basics.const : a -> b -> a
Constant function. Ignores its second argument.
The best results are listed first. As we can see, (::)
and
intersperse
are exact matches; the =
symbol to the left of those
results tells us the types of (::)
and intersperse
are
effectively the same as the type that was searched.
The next result is delete
, whose type is more specific than the type
that was searched; that’s indicated by the >
symbol. If we had a
function with the signature e -> List e -> List e
, we could have
given it the type Eq a => a -> List a -> List a
, but not necessarily
the other way around.
The final two results, assert_smaller
and const
, have types more
general than the type that was searched, and so they have <
symbols
to their left. For example, e -> List e -> List e
would be a valid
type for assert_smaller
. The correspondence for const
is more
complicated than any of the four previous results. :search
shows
this result because we could change the order of the arguments! That is,
the following definition would be legal:
f : e -> List e -> List e
f x xs = const xs x
About :search results¶
:search’s functionality is based on the notion of type isomorphism.
Informally, two types are isomorphic if we can identify terms of one
type exactly with terms of the other. For example, we can consider the
types Nat -> a -> List a
and a -> Nat -> List a
to be
isomorphic, because if we have f : Nat -> a -> List a
, then
flip f : a -> Nat -> List a
. Similarly, if
g : a -> Nat -> List a
, then flip g : Nat -> a -> List a
.
With :search, we create a partial order on types; that is, given two
types A
and B
, we may choose to say that A <= B
, A >= B
,
or both (in which case we say A == B
), or neither. For :search, we
say that A >= B
if all of the terms inhabiting A
correspond to
terms of B
, but it need not necessarily be the case that all the
terms of B
correspond to terms of A
. Here’s an example:
a -> a >= Nat -> Nat
The left-hand type has just a single inhabitant, id
, which
corresponds to the term id {a = Nat}
, which has the right-hand type.
However, there are various terms inhabiting the right-hand type (such as
S
) which cannot correspond with terms of type a -> a
.
We can consider the partial order for :search
to be, in some sense,
inductively generated by several classes of “edits” which are described
below.
Possible edits¶
Here is a simple approximate list of the edits that are possible in
:search
. They are not entirely formal, and do not necessarily
reflect the :search
command’s actual behavior. For example, the
argument application rule may be used directly on arguments that are
bound after other arguments, without using several applications of the
argument transposition rule.
Argument transposition
A : Type B : Type a : A, b : B |- M : Type ---------------------------------------------------------------------------- (x : A) -> (y : B) -> [x,y/a,b]M == (y : B) -> (x : A) -> [x,y/a,b]M
Score: 1 point
Example:
a -> Vect n a -> Vect (S n) a == Vect n a -> a -> Vect (S n) a
Note that in order for it to make sense to change the order of arguments, neither of the arguments’ types may depend on the value bound by the other argument!
Symmetry of equality
A = B : Type t : Type |- M : Type ---------------------------------------- [A = B/t]M == [B = A/t]M
Score: 1 point
Example:
(x,y,z : Nat) -> x + (y + z) = (x + y) + z
==
(x,y,z : Nat) -> (x + y) + z = x + (y + z)
Note that this rule means that we can flip equalities anywhere they occur (i.e., not only in the return type).
Argument application
e : A |- M : Type y1 : T1, ..., yn : Tn |- x : A ----------------------------------------------------------------- (z : A) -> [z/e]M >= (y1 : T1) -> ... -> (yn :Tn) -> [x/e]M
Score: <= : 3 points, >= : 9 points
Examples:
a -> a >= Nat -> Nat
a -> a >= List e -> List e
Vect k (Fin k) >= Vect 5 (Fin 5)
Note that the n
shown in the scheme above may be 0; that is, there
are no Pi terms to be added on the right side. For example, that’s the
case for the first example shown above. This is probably the most
important, and most widely used, rule of all.
Type class application
C : Type -> TypeClass , y1 : T1, ..., yn : Tn |- A : Type, instance : C A , t : Type |- M : Type ----------------------------------------------------------------- C a => [a/t]M >= (y1 : T1) -> ... -> (yn : Tn) -> [A/t]M
Score: <= : 4 points, >= : 12 points
Examples
Ord a => a >= Int
Show (List e) => List e -> String >= Show a => List a -> String
This rule is used by looking at the instances for a particular type
class. While the scheme is shown only for single-parameter type classes,
it naturally generalizes to multi-parameter type classes. This rule is
particularly useful in conjunction with argument application. Again,
note that the n
in the scheme above may be 0.
Type class introduction
t : Type |- M : Type C : Type -> TypeClass ----------------------------------------------- (t : Type) -> M >= C t => M
Score: <= : 2 points, >= : 6 points
Example:
a -> a -> Bool >= Eq a => a -> a -> Bool
Scoring and listing search results¶
When a type S
is searched, the type is compared against the types of
all of the terms which are currently in context. When :search compares
two types S
and T
, it essentially tries to find a chain of
inequalities
R1 R2 Rn Rn+1
S <= A1 <= ... <= An <= T
using the edit rules listed above. It also tries to find chains going
the other way (i.e., showing S >= T
) as well. Each rule has an
associated score which indicates how drastic of a change the rule
corresponds to. These scores are listed above. Note that for the rules
which are not symmetric, the score depends on the direction in which the
rule is used. Finding types which are more general that the searched
typed (S <= T
) is preferred to finding types which are less general.
The score for the entire chain is, at minimum, the sum of the scores of
the individual rules (some non-linear interactions may be added). The
:search function tries to find the chain between S
and T
which
results in the lowest score, and this is the score associated to the
search result for T
.
Search results are listed in order of ascending score. The symbol which is shown along with the search result reflects the type of the chain which resulted in the minimum score.
Implementation of :search¶
Practically, naive and undirected application of the rules enumerated above is not possible; not only is this obviously inefficient, but the two application rules (particularly argument application) are really impossible to use without context given by other types. Therefore, we use a heuristic algorithm that is meant to be practical, though it might not find ways to relate two types which may actually be related by the rules listed above.
Suppose we wish to match two types, S
and T
. We think of the
problem as a non-deterministic state machine. There is a State
datatype which keeps track of how well we’ve matched S
and T
so
far. It contains:
- Names of argument variables (Pi-bound variables) in either type which have yet to be matched
- A directed acyclic graph (DAG) of arguments (Pi-bindings) for
S
andT
which have yet to be matched - A list of typeclass constraints for
S
andT
which have yet to be matched - A record of the rules which have been used so far to get to this point
A function nextSteps : State -> [State]
finds the next states which
may follow from a given state. Some states, where everything has been
matched, are considered final. The algorithm can be roughly broken down
into multiple stages; if we start from having two types, S
and
T
, which we wish to match, they are as follows:
- For each of
S
andT
, split the types up into their return types and directed acyclic graphs of the arguments, where there is an edge from argument A to argument B if the term bound in A appears in the type of B. The topological sorts of the DAG represent all the possible ways in which the arguments may be permuted. - For type
T
, recursively find (saturated) uses of the=
type constructor and produce a list of modified versions ofT
containing all possible flips of the=
constructor (this corresponds to the symmetry of equality rule). - For each modified type for
T
, try to unify the return type of the modifiedT
withS
, considering arguments from bothS
andT
to be holes, so that the unifier may match pieces of the two types. For each modified version ofT
where this succeeds, an initialState
can be made. The arguments and typeclasses are updated accordingly with the results of unification. The remainder of the algorithm involves applyingnextSteps
to these states until either no states remain (corresponding to no path fromS
toT
) or a final state is found.nextSteps
also has several stages: - Try to unify arguments of
S
with arguments ofT
, much like is done with the return types. We work “backwards” through the arguments: we try matching all remaining arguments ofS
which lack outgoing edges in the DAG of remaining arguments (that is, the bound value doesn’t appear in the type of any other remaining arguments) with the all of the corresponding remaining arguments ofT
. This is done recursively until no arguments remain for bothS
andT
; otherwise, we give up at this point. This step corresponds to application of the argument application rule, as well as the argument transposition rule. - Now, we try to match the type classes. First, we take all possible
subsets of type class constraints for
S
andT
. So ifS
andT
have a total ofn
type class constraints, this produces2^n
states for every state, and this quickly becomes infeasible asn
grows large. This is probably the biggest bottleneck of the algorithm at the moment. This step corresponds to applications of the type class introduction rule. - Try to match type class constraints for
S
with those forT
. We attempt to unify each type class constraint forS
with each constraint forT
. This may result in applications of the type class application rule. Once we are unable to match any more type class constraints betweenS
andT
, we proceed to the final step. - Try instantiating type classes with their instances (in either
S
orT
). This corresponds to applications of the type class application rule. After instantiating a type class, we hopefully open up more opportunities to match typeclass constraints ofS
with those ofT
, so we return to the previous step.
The code for :search is located in the Idris.TypeSearch module.
Aggregating results¶
The search for chains of rules/edits which relate two types can be
viewed as a shortest path problem where nodes correspond to types and
edges correspond to rules relating two types. The weights or distances
on each edge correspond to the score of each rule. We then may imagine
that we have a single start node, our search type S
, and several
final nodes: all of the types for terms which are currently in context.
The problem, then, is to find the shortest paths (where they exist) to
all of the final nodes. In particular, we wish to find the “closest”
types (those with the minimum score) first, as we’d like to display them
first.
This problem nicely maps to usage of Dijkstra’s algorithm. We search for all types simultaneously so we can find the closest ones with the minimum amount of work. In practice, this results in using a priority queue of priority queues. We first ask “which goal type should we work on next?”, and then ask “which state should we expand upon next?” By using this strategy, the best results can be shown quickly, even if it takes a bit of time to find worse results (or at least rule them out).
Miscellaneous Notes¶
Whether arguments are explicit or implicit does not affect search results.