|
COPYRIGHT 2000 Association for Computing Machinery, Inc.
1. INTRODUCTION
Most statically typed programming languages offer some form of type inference, allowing programmers to omit type annotations that can be recovered from context. Such a facility can eliminate a great deal of needless verbosity, making programs easier both to read and to write. Unfortunately, type inference technology has not kept pace with developments in type systems. In particular, the combination of subtyping and parametric polymorphism has been intensively studied for more than a decade in calculi such as System F [is less than or greater to] [Cardelli and Wegner 1985; Curien and Ghelli 1992; Cardelli et al. 1994], but these features have not yet been satisfactorily integrated with practical type inference methods. Part of the reason for this gap is that most work on type inference for this class of languages has concentrated on the difficult problem of developing complete methods, which are guaranteed to infer types, whenever possible, for entirely unannotated programs. In this article, we pursue a much simpler alternative, refining the idea of partial type inference with the additional simplifying principle that missing annotations should be recovered using only types propagated locally, from adjacent nodes in the syntax tree.
Our goal is to develop simple, well-behaved type inference techniques for new language designs in the style of Quest [Cardelli 1991], Pizza [Odersky and Wadler 1997], GJ [Bracha et al. 1998] or ML2000--designs supporting both object-oriented programming idioms and the characteristic coding styles of languages such as ML and Haskell. In particular, we shall use the shorthand ML-style programming to refer to a style in which (1) the use of higher-order functions and anonymous abstractions is encouraged; (2) polymorphic definitions are used freely and at a fairly fine grain (for individual function definitions rather than whole modules); and (3) "pure" data structures are used instead of mutable state, whenever possible. Our goal might then be restated as "type inference for ML-style programming in the presence of subtyping."
In particular, we are concerned with languages whose type-theoretic core combines subtyping and impredicative polymorphism in the style of System F [Girard 1972; Reynolds 1974]. This combination of features places us in the realm of partial type inference methods, since complete type inference for impredicative polymorphism alone is already known to be undecidable [Wells 1994], and the addition of subtyping does not seem to make the problem any easier. (For the combination of subtyping with Hindley/Milner-style polymorphic type inference, promising results have been reported [Aiken and Wimmers 1993; Eifrig et al. 1995; Jagannathan and Wright 1995; Trifonov and Smith 1996; Sulzmann et al. 1997; Flanagan and Felleisen 1997; Pottier 1997], but practical checkers based on these results have yet to see widespread use.)
1.1 How Much Inference Is Enough?
The job of a partial type inference algorithm should be to eliminate especially those type annotations that are both common and silly--i.e., those that can be neither justified on the basis of their value as checked documentation nor ignored because they are rare.
Unfortunately, each of the characteristic features of ML-style (polymorphic instantiation, anonymous function abstractions, and pure data structures) does give rise to a certain number of silly annotations that would not be required if the same program were expressed in a first-order, imperative style. To get a rough idea of the actual numbers, we made some simple measurements of a sizable body of existing code--about 160,000 lines of ML, written by several different programming teams. The results of these measurements can be summarized as follows (they are reported in detail in Appendix A):
--Polymorphic instantiation (i.e., type application) is ubiquitous, occurring in every third line of code, on average.
--Anonymous function definitions occur anywhere from once per 10 lines to once per 100 lines of code, depending on style.
--The manipulation of pure data structures leads to many local variable bindings (occurring, on average, once every 12 lines). However, in all but one of the programs we measured, local definitions of functions only occur once in 66 lines.
These observations give a fairly clear indication of the properties that a type inference scheme should have in order to support the ML programming style conveniently:
(1) To make fine-grained polymorphism tolerable, type arguments in applications of polymorphic functions must usually be inferred. However, it is acceptable to require annotations on the bound variables of top-level function definitions (since these usually provide useful documentation) and local function definitions (since these are relatively rare).
(2) To make higher-order programming convenient, it is helpful, though not absolutely necessary, to infer the types of parameters to anonymous function definitions.
(3) To support the manipulation of pure data structures, local bindings should not usually require explicit annotations.
Note that, even though we have motivated our design choices by an analysis of ML programming styles, it is not our intention to provide the same degree of type inference as is possible in languages based on Hindley-Milner polymorphism. Rather, we want to exchange complete type inference for simpler methods that work well in the presence of more powerful type-theoretic features such as subtyping and impredicative polymorphism.
1.2 Local Type Inference
In this article, we propose two specific partial type inference techniques that, together, satisfy all three of the requirements listed above.
(1) An algorithm for local synthesis of type arguments that infers the "locally best possible" values for types omitted from polymorphic applications whenever such best values exist. The expected and actual types of the term arguments are compared to yield a set of subtyping constraints on the missing type arguments; their values are then selected so as to satisfy these constraints while making the result type of the whole application as informative (small) as possible.
(2) Bidirectional propagation of type information allows the types of parameters of anonymous functions to be inferred. When an anonymous function appears as an argument to another function, the expected domain type is used as the expected type for the anonymous abstraction, allowing the type annotations on its parameters to be omitted. A similar, but even simpler, technique infers type annotations on local variable bindings.
Both of these methods are local, in the sense that type information is propagated only between adjacent nodes in the syntax tree. Indeed, their simplicity--and, in the case of type argument synthesis, its completeness relative to a simple declarative specification--rests on this property.
The basic idea of bidirectional checking is well known as folklore. Similar ideas have been used, for example, in ML compilers and typecheckers based on attribute grammars. However, this technique has usually been combined with ML-style type inference (see, for example, Aditya and Nikhil [1991]); it is surprisingly powerful when used by itself as a local type inference method. Specific technical contributions of this article are the formalization of bidirectional checking in a setting with both subtyping and impredicative polymorphism and the combination of this idea with the technique for local synthesis of type arguments presented in the previous section.
The remainder of the article is organized as follows. In the next section, we define a fully typed internal language. Sections 3, 4, and 5 develop the techniques of local synthesis of type arguments and bidirectional checking in detail, first for (in Sections 3 and 4) a simplified language with subtyping and unbounded universal polymorphism, then (in Section 5) extending this treatment to bounded quantifiers. Section 6 sketches some possible extensions. Section 7 surveys related work. Section 8 offers evaluation and concluding remarks. Details of our measurements of ML programs appear in an appendix.
Some additional experiments with using local type inference in practice are reported in Hosoya and Pierce [1999].
2. INTERNAL LANGUAGE
When discussing type inference, it is useful to think of a statically typed language in three parts:
(1) Syntax, typing rules, and semantics for a fully typed internal language.
(2) An external language in which some type annotations are made optional or omitted entirely. This is the language that the programmer actually uses. (In some programming languages, the internal and external language may differ in more than just type annotations, and type inference may perform nontrivial transformations on program structure. For example, under certain assumptions ML's generic let-definition mechanism can be viewed in this way.)
(3) Some specification of a type inference relation between the external language and the internal one. (The terms type inference, type reconstruction, and type synthesis have all been used for this relation. We choose "inference" as the most generic.)
In explicitly typed languages, the external and internal forms are essentially the same, and the type reconstruction relation is the identity. In implicitly typed languages, the external language allows all type annotations to be omitted, and type reconstruction promises to fill in all missing type information. On the other hand, we can describe a language as partially typed if the internal and external forms are not the same, but the specification of type inference does not guarantee that omitted annotations can always be inferred.(1)
Our internal language--the target for the type inference methods described in Sections 3 and 4--is based on the language Kernel F [is less than or equal to], Cardelli and Wegner's core calculus of subtyping and impredicative polymorphism. We consider first a simplified fragment of the full system, in which variables are all unbounded (i.e., all quantifiers are of the form All(X)T, not All(X [is less than]: S)T). The treatment here will be extended to deal with bounded quantifiers in Section 5, but the simple language presented first is enough to show all of the essential ideas and the technical development is easier to follow.
2.1 Syntax
Besides the restriction to unbounded quantifiers, we modify the usual definition of System F [is less than or equal to] [Cardelli and Wegner 1985] in two significant ways. First, we add a minimal type Bot. As we shall see in Section 3, our type inference algorithm keeps track of various type constraints by calculating the least upper bound and greatest lower bound of pairs of types. The Bot type plays a crucial role in these calculations, since without it we could not guarantee that least upper bounds and greatest lower bounds always exist. (Bot is also an interesting typing feature in its own right: for example, it can be used as the result type of non-returning expressions such as exception-raising primitives.(2))
Second, we extend abstraction and application so that several arguments (including both types and terms) may be passed at the same time. In other words, we favor a "fully uncurried" style of function definition and application (though currying is, of course, still available). This bias does not change the expressiveness of the language, but will play an important role in our scheme for inferring type arguments in Section 3.
The syntax of types, terms, and typing contexts in the internal language is as follows:
T ::= X type variable Top maximal type Bot minimal type All([bar]X)[bar]T [right arrow] T function type e ::= x variable fun [[bar]X] ([bar]x:[bar]T)e abstraction e[[bar]T] ([bar]e) application [Gamma] ::= [solid circle] empty context [Gamma], x:T variable binding [Gamma], X type variable binding
We use the metavariables R, S, T, U, and V to range over types; e and f range over terms. We use the notation [bar]X to denote the sequence [X.sub.1], ..., [X.sub.n] and similarly [bar]x:[bar]T to denote [x.sub.1]:[T.sub.1], ..., [x.sub.n]:[T.sub.n]. We write [Gamma](x) for the type of x in [Gamma].
We write [bar]S [right arrow] T as an abbreviation for the monomorphic function type All() [bar]S [right arrow] T. Similarly, we write fun([bar]x:[bar]T) e as an abbreviation for the monomorphic function fun[] ([bar]x:[bar]T) e.
Types, terms, and judgments that differ only in the names of bound variables are regarded as identical. Binders in contexts are assumed to have distinct names. The rules for scoping of bound variables are as usual (in All([bar]X)[bar]S [right arrow] T, the variables [bar]X are in scope in [bar]S and T). FV(T), the set of type variables free in T, is defined in the usual way. We write [[bar]T/[bar]X]S for the simultaneous sustitution of [bar]T for [bar]X in S.
2.2 Subtyping
Our subtyping relation is quite simple because of the restriction to unbounded quantification. In particular, the addition of the bottom type Bot in this context is straightforward. We write [bar]S [is less than]: [bar]T to mean "|[bar]S| = |[bar]T| and [S.sub.i] [is less than]: [T.sub.i] for all 1 [is less than or equal to] i [is less than or equal to] |S|."
(S-Refl) X [is less than]: X
(S-Top) T [is less than]: Top
(S-Bot) Bot [is less than]: T
(S-Fun) [bar]T [is less than]: [bar]R S [is less than]: U/ All ([bar]X) [bar]R [right arrow] S [is less than]: All ([bar]X) [bar]T [right arrow] U
For simplicity, we use an algorithmic presentation of subtyping, in which the rules of transitivity and general reflexivity are omitted and recovered as properties of the definition:
LEMMA 2.2.1 (TRANSITIVITY). If S [is less than]: T and T [is less than]: U then S [is less than]: U.
PROOF. A simple induction on the derivations of S [is less than]: T and T [is less than]: U. The cases involving Top and Bot rely on the fact that R [is less than]: Bot implies R = Bot, and Top [is less than]: R implies R = Top.
LEMMA 2.2.2 (REFLEXIVITY). T [is less than]: T, for all T.
PROOF. A simple induction on the structure of T. [Square]
We use the notation S [disconjunction] T to denote the least upper bound of S and T, and S [conjunction] T for the greatest lower bound of S and T.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Note that [conjunction] andd [disconjunction] are total functions: for every [Gamma], S, and T, there are unique types M and J such that S [conjunction] T = M and S [disconjunction] T = J. It is easy to check that these definitions have the appropriate universal properties:
LEMMA 2.2.3.
(1) S [is less than]: (S [disconjunction] T) and T [is less than]: (S [disconjunction] T).
(2) (S [conjunction] T) [is less than]: S and (S [conjunction] T) [is less than]: T.
PROOF. We prove both parts simultaneously, using induction on the structure of S and T. [Square]
LEMMA 2.2.4.
(1) If S [is less than]: U and T [is less than]: U then (S [disconjunction] T) [is less than]: U.
(2) If U [is less than]: S and U [is less than]: T then U [is less than]: (S [conjunction] T).
PROOF. We prove both parts simultaneously, using induction on the structure of U. [Square]
2.3 Explicit Typing Rules
The typing relation [Gamma] [assertion sign] e [element of] T is essentially the standard one, except that, as in the definition of subtyping, we use an algorithmic presentation, omitting the usual rule of subsumption ("if e [element of] S and S [is less than]: T, then e [element] T"); instead, the rules below calculate for each typable term a unique type (sometimes called the manifest type of the term), corresponding to its minimal type in the system with subsumption. Note that this stylistic choice does not change the set of typable terms--just the number of typing derivations showing that a given term is typable.
The typing rule for variables is standard.
(Var) [Gamma] [assertion sign] [element of] [Gamma](x)
The rule for (multi-)abstractions combines the usual rules for term and type abstractions.
(ABs) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Similarly, the rule for applications combines the usual application and polymorphic application rules. We calculate the type of the function and check that the provided term and type arguments are consistent with the function type. The result type of the application is found by substituting the actual type arguments into the function's result type.
(App) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[Gamma] [assertion sign] [is less than]: [[bar]T/[bar]X][bar]S here is an abbreviation for "[Gamma] [assertion sign] [bar]e [is less than]: [bar]U and [bar]U [is less than]: [[bar]T/[bar]X][bar]S."
To finish the definition of the typing relation, another rule is required. To see why, note that Bot [is less than]: All([bar]X) [bar]S [right arrow] T for any [bar]X, [bar]S, and T. This means that any expression of type Bot should be applicable to any set of well-formed type and expression arguments (if we did not allow for this behavior, we would lose the type soundness property):
(App-Bot) [Gamma] [assertion sign] f [element] Bot [Gamma] [assertion sign] [bar]e [element] [bar]S/ [Gamma] [assertion sign] f[[bar]T] [element] Bot
Note that the above rule gives the expression f[[bar]T ([bar]e) the type Bot, the most informative result type for the expression.
THEOREM 2.3.1 (UNIQUENESS OF MANIFEST TYPES). If [Gamma] [assertion sign] [element] S and [Gamma] [assertion sign] e [element] T, then S = T.
The definitions of operational and denotational semantics for the internal language are standard, as are proofs of properties such as subject reduction and absence of runtime errors. Evaluation order may be chosen either call-by-name or call-by-value; function spaces may be interpreted as either total or partial. The only slightly unusual case is the type Bot, which can be interpreted as an empty type (in a total-function semantics) or a type containing only divergent terms (in a partial function semantics).
3. LOCAL TYPE ARGUMENT SYNTHESIS
In the introduction, we identified three categories of type annotations that are worth inferring automatically: type arguments in applications of polymorphic functions, annotations on bound variables in anonymous function abstractions, and annotations on local variable bindings. In this section, we address the first of these, leaving the second and third for Section 4.
Our measurements of ML programs (presented in the appendix) showed that type arguments to polymorphic functions are inferred by the ML typechecker on at least one line in every three, in typical programs. Moreover, explicit type arguments rarely have any useful documentation value. We therefore believe that it is essential to have some form of type argument synthesis in any language intended to support ML-style programming. For example, consider the polymorphic identity function id with type All(X)X [right arrow] X. Our goal is to allow the programmer to apply the id function without explicitly supplying any type arguments: id(3) rather than id[Int] (3).
When considering the general problem of type argument synthesis, the first question we have to answer is: How do we decide where type arguments have been omitted (and therefore need to be synthesized)? In the variant of F [is less than or equal to] we presented in Section 2, the answer is simple: we look for application nodes where the function is polymorphic but there are no explicit type arguments. For example, the fact that id is polymorphic makes it clear that...
Read the full article for free courtesy of your local library.
|