My work focuses on how to design systems and reason about their correctness and efficiency. This includes work in programming, circuit design and hardware, and parallel systems. Two themes that run through most (though not all) of my research are formal methods and functional programming. I'm a member of the ENDS (computer systems) group at Glasgow.
This page gives a general introduction to my research, aimed at nonspecialists (although I assume you know about programming and computers in general). It's organized by subject; there's another web page containing a chronological list of some of my publications, and soon I'll make a chronological list of grants. The table of contents contains links down to the individual sections on this page.
New: the Circuit Solvers project.
The starting point of this work was an observation that the same techniques used for digital circuit specification (see below) can also specify parallel algorithms: digital circuits are the most highly parallel systems of all! For example, suppose you want to compute the square root of all the elements of a vector in parallel. One way is to write a sequential iteration in an imperative language:
for i := 0 to n-1 do y[i] := sqrt (x[i])
ys = map sqrt xs
This fragment of Haskell says that the list of elements ys is defined to be the square roots of the corresponding elements of xs. This equation is neither sequential nor parallel: it is abstract, and could be implemented either way. For example, the implementation could store each element of xs in its own processor, so that the map function becomes an instruction for each processor to perform a sqrt in parallel with the others. Another attractive feature is that we managed to express the computation without bringing in element indices. Functional languages offer a higher level, more abstract and more intuitive way to express such computations over aggregate data structures.
What about interprocessor communication? Consider the problem of summing up the elements of an array:
sum := 0; for i := 0 to n-1 do sum := sum + y[i]
This computation is expressed functionally using the foldl combinator. In this case, foldl expresses an iteration that uses the (+) operator to combine the elements of xs, using 0 as the initial value:
ys = foldl (+) 0 xs
If the elements of xs are distributed across separate processors, foldl becomes a collective communication operation requiring interprocessor communication.
On many parallel architectures it is possible to do better than this. It takes linear time to calculate (((0+x0)+x1)+x2)+x3 but only log time to perform a tree-structured sum (x0+x1) + (x2+x3). We can define a tree-structured function tfold which does this, and we can prove a theorem that states exactly when the faster tfold can be used.
Theorem. If f is associative with left identity a, then foldl f a xs = tfold f xs
That was only a quick introduction. The following chapter from Research Directions in Parallel Functional Programming gives a tutorial overview and some examples:
Many commercially available parallel systems offer a pragmatic and flexible programming style called SPMD (Single Program, Multiple Data). The idea is that the program is written from the viewpoint of a single processor, and is completely sequential. The operating system loads a copy of the same program into each processor, with its own local data, and they all start executing concurrently. Interprocessor communication is performed by system calls, typically to the MPI (Message Passing Interface) library.
A high level specification in Haskell uses combinators like map, fold and scan to specify parallel computation and communications. These combinators take a global view of the entire computation, and they describe what is happening in all the processors. By definition, an SPMD program just performs a computation in a single processor, knowing nothing about the rest of the system. SPMD communications are handled by calling library functions (typically the MPI library), so receiving and sending messages is handled exactly the same way as input and output.
There needs to be a way to move from a collective level algorithm that uses combinators to specify the entire system computation to an individual level algorithm to be executed on one processor. The following paper formalizes this transformation, so that it's possible to prove that the low level implementation satisfies the high level algorithm:
The main reason for taking this approach is that there are many separate issues to be decided in parallel programming, and it seems more effective to tackle these in a structured manner rather than mixing all the programming problems together, choosing the first solution that comes to mind, and writing the low-level implementation at the outset. Furthermore, pure functional languages are especially well suited for supporting program transformation and refinement by equational reasoning.
Some programs have an irregular computation structure. For example, they might iterate over sparse arrays or unbalanced trees rather than simple arrays. There are many forms of irregularity, involving the algorithmic structure, the pattern of memory access, the communications, the organization of tasks, and so on. It can be much harder to deal with irregular problems efficiently on a parallel system. One attempt to address this problem is presented in the following paper, which exploits lazy evaluation and tree traversal to control the irregular task structure of the hierarchical radiosity algorithm.
HPVM combines Haskell with the PVM library to support distributed programming. It allows you to write programs at a higher level than the usual C programs that call PVM for parallel communications.
As described above, the central idea behind functional data parallel programming is the use of combinators to express the standard, common patterns of computation. For example, every SIMD-style parallel computation where each processor applies f to its state can be described as map f, and common communication patterns (fold/reduce, scan, scatter, gather and so on) also correspond to higher order functions.
Exactly what combinators belong to this list, and how are they defined? There are several ways to answer this question, and they lead to different outcomes:
Abstract Parallel Machines take the third approach; they allow the programmer to define new parallel combinators and to collect them into libraries. An APM specifies both the semantics and the parallel behavior of a combinator, expressed in a specific style that makes the parallelism and the communications explicit. APMs also allow definitions at a wide range of levels of abstraction. This supports the definition of families of related operations that can be used for incremental derivation of parallel programs. APMs were first introduced in a conference paper:
My colleagues Thomas Rauber (Universität Halle, Germany) and Gudula Rünger (Technische Universität Chemnitz, Germany) have developed a structured approach to deriving parallel programs where the programmer uses cost models and quick but effective analyses to decide how to organize the parallel behavior of the program. This model is called TwoL, because it comprises two levels: the top level is task parallelism, and the lower level is data parallelism.
We are currently collaborating on combining Haskell program transformations with TwoL, and hope to build a working prototype during 2001. EPSRC has recently (June 2001) awarded us a grant to support this work.
Computer hardware is just as interesting as software, and there are many deep connections between them. There's no need to make an arbitrary distinction, restricting software to computer scientists and hardware to electrical engineers.
The centerpiece of my research in computer systems is Hydra, a computer hardware description language based on Haskell. I'm currently (mid 2001) in the midst of a complete rewrite of the Hydra system, and am planning to release it as a software package later in the year. Eventually there will be a Hydra web page, but that page won't have any content until autumn 2001.
One of the most fundamental problems in software is coping with the complexity of large scale designs, and this is equally important in hardware design. It should be no surprise that some of the best solutions that have been found for software apply equally well to hardware.
Hydra applies a variety of standard techniques in functional programming to address well known problems in hardware design. In particular, Hydra makes use of higher order functions, algebraic data types, nonstrict semantics (lazy evaluation), multiparameter type classes, existential types, functional dependencies, and more. It also uses monads for the user interface, although not in the underlying mechanism for defining circuits.
The industry-standard computer hardware description language (CHDL) is called VHDL, and is based on the Ada programming language, although it is a distinct language. This means that a VHDL compiler must be written in order to run it on a system. Hydra takes a different approach: it is an embedded language. This means that a circuit specification written in Hydra is actually written in a well-defined subset of Haskell, and the implementation of Hydra consists of a library of Haskell definitions rather than a full compiler. Thus Hydra can be used with any Haskell compiler, and it benefits automatically from improvements in compiler technology.
For a long time, circuits were designed by drawing schematic diagrams that showed the components and how they were connected. Schematic diagrams are severely limited. A diagram describes only one instance of a circuit: you cannot show an n-bit adder in a diagram, you can only show an example at one particular size (say a 4-bit adder). Diagrams scale up poorly to large and complex circuits. Furthermore, a diagram doesn't actually do anything, it just sits there on a piece of paper.
Hydra solves these problems of schematic diagrams. It allows you to specify circuits precisely and with generality. Thus you can give a precise definition of an n-bit adder, and the definition can be instantiated automatically to any value of n. Hydra can check a circuit specification for errors, simulate its behavior, analyse its structure, and generate a netlist that can be used to fabricate it in hardware.
The fundamental idea in Hydra is to treat each circuit as a function. This is natural, since a circuit takes inputs, performs some computation internally, and produces outputs, just like a mathematical function.
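The circuit-as-function idea can be sketched directly in Haskell, modelling signals as Bool values. (This is an illustration of the idea only; it is not Hydra's actual notation, and the names halfAdd and fullAdd are chosen for this sketch.)

```haskell
-- A half adder: two input bits in, a carry bit and a sum bit out.
halfAdd :: Bool -> Bool -> (Bool, Bool)
halfAdd a b = (a && b, a /= b)   -- (carry, sum): carry = AND, sum = XOR

-- A full adder built from two half adders, just as in a schematic,
-- but expressed as function composition.
fullAdd :: Bool -> (Bool, Bool) -> (Bool, Bool)
fullAdd cin (a, b) = (cout, s)
  where (c1, s1) = halfAdd a b
        (c2, s)  = halfAdd cin s1
        cout     = c1 || c2
```

For example, halfAdd True True gives (True, False): adding 1 and 1 produces sum 0 with carry 1.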
The following paper gives an overview of the style of circuit specification used in Hydra, and also discusses how it is used in a course on computer architecture.
A circuit specification in Hydra doesn't just describe the circuit; it is an executable program that defines the semantics of the circuit. This means, in particular, that you can simulate the circuit simply by executing its specification. The following paper explains how this works (the paper refers to an earlier version of Hydra with an obsolete syntax).
As mentioned above, a Hydra circuit specification has a semantics (or meaning). Actually, it has several alternative semantics, which are kept straight through the Haskell type class system. The circuit designer just writes one specification, which can be executed by applying it to a variety of suitable inputs. The type of the input determines which circuit semantics will be in effect for that execution. Hydra allows several distinct views of the structure of a circuit, and it supports VLSI design as well as ordinary logic design.
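The "one specification, several semantics" idea can be illustrated with a tiny type class. (The class and instances below are a sketch invented for this page, not Hydra's real classes.)

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- A toy signal class: each instance is one semantics for circuits.
class Signal a where
  inv  :: a -> a
  and2 :: a -> a -> a
  or2  :: a -> a -> a

-- Semantics 1: simulation over Booleans.
instance Signal Bool where
  inv  = not
  and2 = (&&)
  or2  = (||)

-- Semantics 2: printing the structure of the circuit.
instance Signal String where
  inv a    = "inv(" ++ a ++ ")"
  and2 a b = "and2(" ++ a ++ "," ++ b ++ ")"
  or2  a b = "or2(" ++ a ++ "," ++ b ++ ")"

-- One circuit specification; the input type selects the semantics.
nand2 :: Signal a => a -> a -> a
nand2 a b = inv (and2 a b)
```

Applying nand2 to Booleans simulates the gate, while applying it to strings shows its structure: nand2 "x" "y" yields "inv(and2(x,y))".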
A particularly interesting circuit semantics is its netlist. This is a data structure that states explicitly what components are inside the circuit, and how they are wired together. A netlist can be used by automated tools to fabricate the physical circuit, and they can also be used by further software tools to analyze the circuit in various ways (for example, to find potential timing problems).
The netlists are produced by making the circuit specification define a graph structure that is isomorphic to the actual circuit. This approach is simple, elegant and effective, but it runs into an interesting technical problem in a pure functional language like Haskell. Ironically, it's easy to produce the netlist graph, but it's difficult to traverse that graph in Haskell! The reason is that a software tool written in Haskell cannot mark the graph to indicate that it has already seen a node. In effect, it's like giving a cave explorer a piece of chalk for marking directions on the cave walls in order to avoid getting lost, but then coating the cave walls with anti-chalk repellant!
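The difficulty can be seen in a few lines. With a toy circuit representation (not Hydra's), lazy evaluation lets a feedback circuit build a cyclic graph, but a pure traversal has no way to mark nodes it has already visited:

```haskell
-- A toy representation of circuit components (not Hydra's data type).
data Comp = Inp String | Inv Comp | And2 Comp Comp

-- A feedback loop: the inverter's output is wired back to its input.
-- Lazy evaluation happily constructs this cyclic structure.
loop :: Comp
loop = Inv loop

-- But a naive pure traversal cannot detect that it has come back to
-- the same node, so a function such as
--   countGates (Inv c) = 1 + countGates c
-- would run forever on 'loop'.
```

Pattern matching one level down works fine; it is only an unbounded traversal of the cycle that diverges.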
Over the years I have experimented with a number of solutions to this problem. The following paper describes the position as it was in 1992:
There has been a lot more progress since then, and I'm working on a new paper giving the current status -- which is much better than it was in 1992!
Another aspect of circuit structure is the geometry of a circuit. This goes beyond specifying what the connections are; it says explicitly where each component is and how the wires are routed.
Hydra allows the designer to specify the circuit geometry, but does not require this. The geometry is specified through a set of geometric combinators, which were originally inspired by the primitive geometric operators of the TeX typesetting system. (The core of TeX is essentially a functional geometry package.) This paper illustrates how a regular but complex circuit layout is specified in Hydra. (This is an earlier version of Hydra with an obsolete syntax.)
Several of my research projects have examined the relationship between data parallel architecture and functional programming. In the course of this work, a real VLSI implementation was carried out by my former student Tim Bridges. I don't think that designing special purpose hardware is the right solution, except in a few special circumstances, but it's very enlightening to carry out occasional implementation projects like this one.
Hardware systems are just as hard to design correctly as software, but it's more difficult to debug hardware. The time it takes to design a chip, fabricate it, test it -- only to find the next bug -- is prohibitive. This is a natural area for the application of formal methods, since every bug that's avoided saves so much time.
Hydra is embedded in Haskell, and it inherits the safe equational reasoning techniques that work so effectively in Haskell. There are many different kinds of formal method in use, and they all have their place, but equational reasoning about hardware in Hydra has two huge advantages:
I'm interested in all sorts of formal reasoning about digital circuits and computer architecture. The following subsections describe some interesting examples that are nontrivial but still not too long or hard to follow.
Many problems can be solved efficiently using a tree-structured computation. In some cases, a simple "upsweep" from the leaves of a tree to its root suffices. For example, a tree of adders can compute the sum of n numbers in log time. The same technique works for any associative operation f; it isn't restricted to addition.
The foldl1 operator performs a linear reduction of a list xs using an arbitrary operator (+) (which need not be addition). For example, foldl1 (+) [a,b,c,d] = ((a+b)+c)+d, and foldl1 f xs takes time O(n) if xs contains n elements. The tree fold operator tfold is similar, but it uses a divide and conquer approach to perform a balanced computation which can be executed in parallel: tfold (+) [a,b,c,d] = (a+b) + (c+d). In general, tfold f xs requires only O(log n) time for a list of length n. The following theorem generalises this technique, and is straightforward to prove with equational reasoning. It can be applied to speed up a whole class of fold computations.
Theorem. If f :: a -> a -> a is associative, and xs :: [a] is a finite list, then foldl1 f xs = tfold f xs.
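A minimal definition of tfold (one of several possible; the split point and base cases can vary) makes the divide and conquer structure explicit:

```haskell
-- Tree fold over a nonempty list: split in half, fold each half,
-- combine the results. The recursion depth is O(log n).
tfold :: (a -> a -> a) -> [a] -> a
tfold _ [x] = x
tfold f xs  = f (tfold f ls) (tfold f rs)
  where (ls, rs) = splitAt (length xs `div` 2) xs
```

With an associative operator such as (+), the theorem holds: foldl1 (+) [1..8] and tfold (+) [1..8] both give 36, even though the former associates to the left and the latter builds a balanced tree.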
However, a tree structure is quite flexible, and it can perform many computations in log time that are not expressed very well as folds. It's useful to find a more general way to specify all tree-structured computations, so we can get the full benefit of the entire class of tree circuits.
The class of scan functions provides an effective tool for exploiting tree circuits: scan is more general than fold, but it is still implementable on a tree. There are several scan functions, and several ways to define them, but the simplest view is the following: when you fold an operator (+) over a list [a,b,c,d] you get a series of partial results 0, 0+a, 0+a+b, 0+a+b+c before obtaining the final result 0+a+b+c+d. The list of these partial results is the scan over the list, while the final result is the fold over the list. (For precise definitions of the functions, see the papers below.)
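The relationship between scan and fold is visible directly in Haskell's standard scanl, which records every partial result (and, unlike the informal description above, includes the final result as its last element):

```haskell
-- scanl records all partial results of a left fold;
-- the last element of the scan is exactly the fold.
partials :: [Int]
partials = scanl (+) 0 [1, 2, 3, 4]   -- [0, 1, 3, 6, 10]
```

So last (scanl f a xs) always equals foldl f a xs.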
Obviously a tree can calculate foldl1 f xs if f is associative, but it's much less obvious that a tree can calculate scanl f a xs. Of course we could do it with n trees, but that requires far too much hardware; the aim is to calculate scanl f a xs on one tree. This algorithm is called parallel scan, and it was discovered by Ladner and Fischer about 25 years ago.
The parallel scan algorithm has been described informally in numerous books and papers, but these descriptions are usually based on an example calculation, which gives little insight into how the algorithm works. Furthermore, it is quite hopeless to prove the correctness of the algorithm when it isn't even specified precisely. The following paper addresses these problems by giving a general but precise formal specification of the algorithm and the associated tree hardware, and it proves the correctness using equational reasoning with Hydra.
Parallel scan has many applications, and a particularly good one is the design of a log time binary addition circuit. This is a vitally important problem, not just a toy example, since an adder usually lies on the critical path of processor circuits. The following paper uses Hydra to specify a linear time ripple carry adder, and then applies equational reasoning and the parallel scan theorem to reduce the time from O(n) to O(log n).
The theme underlying all my research is functional programming. The topics described above involve the application of functional programming to computer systems, but I'm also actively interested in a variety of topics directly related to functional languages.
Pure functional languages like Haskell offer equational reasoning: this means you can do algebra on programs by substituting equals for equals. The reason this works is that Haskell equations really are equations; if your program says foldr f a (x:xs) = f x (foldr f a xs) then whenever you see an expression that matches either side of the equation, you can replace it with the other side of the equation (keeping the variable names straight, of course).
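Such a calculation can be carried out step by step. Each line below follows from the previous one by substituting one side of the foldr equations for the other:

```haskell
-- The defining equations:
--   foldr f a (x:xs) = f x (foldr f a xs)
--   foldr f a []     = a
--
-- Unfolding a small example by substituting equals for equals:
--   foldr (+) 0 (1:2:3:[])
-- = 1 + foldr (+) 0 (2:3:[])
-- = 1 + (2 + foldr (+) 0 (3:[]))
-- = 1 + (2 + (3 + foldr (+) 0 []))
-- = 1 + (2 + (3 + 0))
-- = 6
result :: Int
result = foldr (+) 0 [1, 2, 3]
```

Running the program confirms what the algebra predicts: result is 6.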
Equations are completely different from assignments. In C, the assignment
n = n+1;
means "increment n": fetch the current value of n, add one, and store the result back into n. Read as a mathematical equation, however,
n = n+1
has no solution at all. In Haskell the corresponding definition really is an equation: it defines n recursively in terms of itself, yielding a nonterminating computation rather than an update.
Equational reasoning about functional programs is similar to using ordinary school algebra to solve equations like a*x^2 + b*x + c = 0. Several of my papers above contain examples of equational reasoning applied to practical problems; see in particular the parallel scan and adder derivation.
In large scale problems, the algebra can become long and involved. Fortunately, the process of manipulating the algebra can also be supported by a computer (another advantage of formal, precise reasoning!) and I'm working on a Haskell Equational Reasoning Assistant to do this.
So many beginners in functional programming ask for an interactive debugging tool!
A common response from experienced functional programmers is that the strong type system enables the compiler to catch most errors in programs, so you get an error message from the compiler, and you don't have to run the program and crash it in order to get feedback from the debugger.
My own opinion is that a rich and safe type system, as Haskell provides, is the single most effective tool for preventing errors and catching the ones you still make. The incidence of errors is reduced hugely, perhaps by an order of magnitude.
This is wonderful, but still it isn't enough! The problem is that it can become very hard to find the small residue of bugs that get past the type checker, because the usual debugging tools (shoving print statements into the program, watching the sequence of events in an animated window, etc.) run into trouble with the restrictions imposed by pure functional languages. For example, you can't put a print statement (a side effect) into a pure functional program unless the monadic infrastructure to support it is already in place. Similarly, lazy evaluation makes the actual sequence of events during program execution so unintuitive that an ordinary animation tool would only add to the confusion.
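GHC's standard Debug.Trace library (a deliberate escape hatch, not part of the pure language) makes both problems concrete: the trace messages appear in demand order under lazy evaluation, not in the order the program text might suggest.

```haskell
import Debug.Trace (trace)

-- trace smuggles a message into pure code as a side effect.
squaresT :: [Int]
squaresT = map (\i -> trace ("computing " ++ show i) (i * i)) [1, 2, 3]

-- Forcing only the third element triggers only "computing 3";
-- the first two squares are never evaluated at all.
third :: Int
third = squaresT !! 2
```

A programmer expecting the messages "computing 1", "computing 2", "computing 3" in order will be surprised, which is exactly the difficulty described above.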
A survey of these technical difficulties and an early serious approach to solving them appear in this paper:
A more general discussion of some alternative approaches appeared in this workshop paper:
I'm still actively interested in the debugging problem. For some current state of the art work, I recommend the recent papers from Colin Runciman's group at York.
Many of the standard algorithms and data structures for imperative programming are also useful for pure nonstrict functional languages, but there are fundamental differences between these language paradigms that lend functional algorithms a distinctive character. Some imperative algorithms depend heavily on modifications to existing data structures, which is generally disallowed in pure functional programming (with some exceptions). Countering this, a functional language offers features that don't exist in imperative languages, such as the "knot-tying" techniques of mutual recursion, and the implicit coroutines of lazy evaluation.
Some of my work in functional algorithms and data structures is embedded in the other projects (parallel combinators, functional hardware description, etc.). The topics described below aren't specialised for any particular application, so they are best classified as general functional programming technique.
Some software applications require nondeterministic behavior, especially in situations where the software has to respond to user inputs that could arrive at any time in arbitrary order.
How could a pure functional language express nondeterminism? A natural idea, introduced by McCarthy around 40 years ago, is to use a pseudo-function amb :: a -> a -> a, such that amb x y will return either x or y, but amb gets to decide which value to return. However, amb is obligated to return a well-defined result if either x or y is well defined. This means, for example, that amb cannot be defined simply to return x all the time, since amb bottom 2 would fail to return the well defined result 2 (you can think of bottom as meaning "infinite loop"). The intended implementation of amb is to evaluate both arguments concurrently, perhaps by alternating time slices between them, until one of them produces a value, which is then returned as the value of the amb application.
Impure functional languages can use amb without problem, but no pure functional language can provide amb, since this would render equational reasoning invalid. For example, amb 2 3 could return 2 sometimes and 3 at other times. If we have amb and try to use equational reasoning, we get the result 2 = amb 2 3 = 3, and we have just "proved" 2=3!
Lots of alternative approaches for pure functional nondeterminism have been considered. John Hughes and I have proposed a method called "set nondeterminism".
The idea behind set nondeterminism is that the program doesn't define a final result (say an Integer). Instead, it defines a set of possible Integers. Whenever there is a nondeterministic choice to be made, we define the set of possible results. Therefore we replace amb x y by {x,y}.
The real trick behind the method lies in the particular operations on nondeterministic sets that are allowed. These are designed carefully to achieve two aims: (1) the language preserves referential transparency (i.e. it is "pure" and equational reasoning is ok); (2) there is an efficient way to implement the language. Therefore some of the ordinary operations on sets are disallowed. For example, there is no function choose :: {a} -> a that can choose an arbitrary element from a set, since this would allow you to prove 2=3, breaking equational reasoning.
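The flavour of the approach can be conveyed with a toy model in which lists stand in for sets. (This is an illustration only, not the formulation in the papers; the names ND, amb and ndMap are invented for this sketch.)

```haskell
-- A nondeterministic value is the set of its possible results,
-- modelled here as a list.
type ND a = [a]

-- amb x y becomes the two-element set of possible outcomes.
amb :: a -> a -> ND a
amb x y = [x, y]

-- Allowed: apply a function uniformly to every possible result.
ndMap :: (a -> b) -> ND a -> ND b
ndMap = map

-- Deliberately absent: choose :: ND a -> a, which would extract an
-- arbitrary element and let us "prove" 2 = 3, breaking equational
-- reasoning.
```

Since amb 2 3 always denotes the same set [2,3], substituting equals for equals remains valid.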
Set nondeterminism, as well as a critical review of alternative approaches, is described in detail in the following papers. Unfortunately they exist only on paper, as I don't have machine-readable copies of them! You're welcome to email me for a paper copy.
The scan functions are closely related to several topics I've worked on extensively, including linear circuit structures, data parallel algorithms, and tree structured computations. As a result, I've done a lot of work with the scan functions. This has led to an entire family of related functions, with a rich algebra of properties.
An unusual but useful function is the bidirectional scan (and its cousin, the bidirectional fold). The usual scan functions are unidirectional: scanl works from the left to right, and scanr is the opposite. The bidirectional scan defines a concurrent computation that begins at both ends of a list and works across it in both directions. The original definition of the bidirectional scan is given in this paper:
Since then there has been more progress on these functions, and I plan a future paper giving a systematic coverage of the entire family of scan and fold functions.
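One plausible formulation of a bidirectional scan (a sketch only; see the paper for the original definition) pairs each element with the value accumulated from its left and the value accumulated from its right:

```haskell
-- Each position receives the left-to-right accumulation over the
-- elements strictly before it, and the right-to-left accumulation
-- over the elements strictly after it.
biscan :: (a -> b -> a) -> (b -> c -> c) -> a -> c -> [b] -> [(a, c)]
biscan f g l r xs = zip (init (scanl f l xs)) (tail (scanr g r xs))
```

For example, biscan (+) (+) 0 0 [1,2,3] gives [(0,5),(1,3),(3,0)]: the middle element sees the sum 1 arriving from the left and the sum 3 arriving from the right.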
For a long time, a central problem in functional programming was the implementation of arrays. In a nutshell, the problem is this: imperative languages let you fetch and store an arbitrary element of an array in O(1) time, but there just doesn't seem to be a way to achieve this in a pure functional language. The naive solutions are hopelessly inefficient. With a lot of cleverness, it's possible to provide array access operations that are reasonably efficient most of the time.
In recent years this problem has largely faded from sight because of the advent of monadic programming, where side effects are introduced into the language in a safe way. As a result, you can write an ordinary Fortran-like program with imperative arrays in Haskell and get it to work with reasonable efficiency.
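The monadic style looks like this in standard Haskell, using the ST monad and mutable arrays from Data.Array.ST (the function name squaresArr is invented for this example):

```haskell
import Control.Monad (forM_)
import Data.Array (Array, elems)
import Data.Array.ST (newArray, writeArray, runSTArray)

-- Build an array of squares by in-place, O(1) updates inside the
-- ST monad; runSTArray seals the side effects and returns an
-- ordinary immutable array.
squaresArr :: Int -> Array Int Int
squaresArr n = runSTArray $ do
  arr <- newArray (0, n - 1) 0
  forM_ [0 .. n - 1] $ \i -> writeArray arr i (i * i)
  return arr
```

The result is a perfectly ordinary pure value: elems (squaresArr 5) is [0,1,4,9,16]. But the mutable array itself can never be shared between two versions, which is exactly the limitation discussed below.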
Nonetheless, this approach of embedding imperative arrays into Haskell with monads is evading the question. The result is less general than pure functional arrays, which allow unlimited sharing. This approach is fine if you have an existing imperative algorithm to be transcribed into Haskell, but the question remains: how efficiently is it possible to implement genuine functional arrays without placing any limitations on the allowable patterns of array access?
A partial answer to this question involves a combination of hardware and software support:
There is more to this story, and some more recent results, which I'll write about when time permits.
My teaching page describes some of the courses I've taught recently, and it contains information for students in my current courses.
One of the central themes in my research is using functional programming as a formal method to help design reliable software and hardware. Naturally, such methods are beneficial to courses in all areas of computer science, and I think it's especially important to apply formal methods to a wide range of subjects. If formalism is limited to a single course on formal methods, students will not realise how useful it can be.
Cordelia Hall and I have written an elementary textbook on discrete mathematics that stresses the connections between mathematics and computing. The book uses Haskell to implement a wide range of mathematical ideas; this enables students to compute with logic, sets, relations and so on. This gives a more practical and concrete understanding than a purely theoretical treatment can provide.
I'm working actively on new applications of Haskell and formal methods to education in computer systems, based on the Hydra research project. Old: IFL07 slides