Explaining Concepts in Compositional Type-Based Program Analysis: Principality, Intersection Types, Expansion, etc.

Joe Wells, Heriot-Watt University

A static program analysis predicts program behavior without actually running the program. An analysis is _compositional_ when the analysis result for each program part depends only on the results for the immediate sub-parts, which can therefore be analyzed in isolation (using zero information about their context) and in any order. Doing analysis compositionally is challenging in all approaches to program analysis, and in the case of type-based analysis tends to conflict with the ? ("for all") quantifiers most commonly used to gain _type polymorphism_ which is needed for programming flexibility.

The starting point for making analysis compositional is the concept of _principal typings_ (not to be confused with the much weaker notion of _principal types_ often mentioned in connection with ML-like languages and the Hindley/Milner type system). An analysis result (e.g., a typing in some type system) for a program fragment is _principal_ for that fragment when it is stronger than all other results for the same fragment. Without principal typings, an analysis algorithm must either be incomplete, be non-compositional (for example, Milner's algorithm W used for ML-like languages is non-compositional), or use intermediate results outside of the analysis system.

Obtaining principal typings requires using types that more closely correspond to actual program evaluation. As an example, in the case of the ?-calculus, _intersection types_ correspond to the multiple uses by a function of its parameter and the operation of _expansion_ on typings corresponds to the duplication (or discarding) of the actual arguments of functions. In comparison, recent work developing principal typings for Java-like languages has needed novel kinds of type assumptions that correspond to object-oriented operations like method dispatch.

This talk aims to explain and clarify in as simple a way as possible what the above-mentioned concepts are and how they are related, and also explain other related concepts (e.g., the _rank_ of types, polyvariant data flow analysis, incremental reanalysis, etc.).