# Overview of Research Projects

### John O'Donnell Computing Science DepartmentUniversity of Glasgow

My work focuses on how to design systems and reason about their correctness and efficiency. This includes work in programming, circuit design and hardware, and parallel systems. Two themes that run through most (though not all) of my research are formal methods and functional programming. I'm a member of the ENDS (computer systems) group at Glasgow.

This page gives a general introduction to my research, aimed at nonspecialists (although I assume you know about programming and computers in general). It's organized by subject; there's another web page containing a chronological list of some of my publications, and soon I'll make a chronological list of grants. The table of contents contains links down to the individual sections on this page.

If you're contemplating doing a PhD or a Masters degree (by research) at Glasgow, and your interests are related to any of the following projects, send me an email (jtod@dcs.gla.ac.uk). See also the department's web page on PhD and Masters Research in Computer Science at Glasgow.

New: the Circuit Solvers project.

### Parallel systems

Writing programs for parallel computers involves all the challenges of ordinary programming, and it introduces some new ones: maintaining correct behavior in the presence of concurrency, utilizing the processors effectively while keeping communication costs reasonable, etc. My research in this area uses the techniques of functional programming, especially higher order functions, to make it easier to develop reliable and efficient parallel programs.

#### Parallel programming models

##### Data parallelism

The starting point of this work was an observation that the same techniques used for digital circuit specification (see below) can also specify parallel algorithms: digital circuits are the most highly parallel systems of all! For example, suppose you want to computer the square root of all the elements of a vector in parallel. One way is to write a sequential iteration in an imperative language:

```for i := 0 to n-1 do
y[i] := sqrt (x[i])
```
A problem with this is that the programmer began with an inherently parallel idea (a computation with no data dependencies) and was forced by an inexpressive programming language to specify an irrelevant sequencing, which must now be removed by yet further work (compiler parallelization, programmer annotations, etc.). A lot of traditional research in parallel compilation has looked at how to undo this pointless sequentialization forced on the programmer by inadequate programming languages. A better approach is to allow the programmer to specify the intended computation more exactly:
```ys = map sqrt xs
```
This fragment of Haskell says that the list of elements ys is defined to be the square roots of the corresponding elements of xs. This equation is neither sequential nor parallel: it is abstract, and could be implemented either way. For example, the implementation could store each element of xs in its own processor, so that the map function becomes an instruction for each processor to perform a sqrt in parallel with the others. Another attractive feature is that we managed to express the computation without bringing in element indices. Functional languages offer a higher level, more abstract and more intuitive way to express such computations over aggregate data structures.

What about interprocessor communication? Consider the problem of summing up the elements of an array:

```sum := 0;
for i := 0 to n-1 do
sum := sum + y[i]
```
This computation is expressed functionally using the foldl combinator. In this case, foldl expresses an iteration that uses the (+) operator to combine the elements of xs, using 0 as the initial value:
```ys = foldl (+) 0 xs
```
If the elements of ys are distributed across separate processors, foldl becomes a collective communication operation requiring interprocessor communication.

On many parallel architectures it is possible to do better than this. It takes linear time to calculate (((0+x0)+x1)+x2)+x3 but only log time to perform a tree-structured sum (x0+x1) + (x2+x3). We can define a tree-structured function tfold which does this, and we can prove a theorem that states exactly when the faster tfold can be used.

Theorem. If f is associative with left identity a, then foldl f a xs = tfold f xs

That was only a quick introduction. The following chapter from Research Directions in Parallel Functional Programming gives a tutorial overview and some examples:

John O'Donnell, Data Parallelism, Research Directions in Parallel Functional Programming, Springer-Verlag, 191-206 (1999)
Many books and papers that mention data parallelism give the impression that it's just a quick way to implement iterations over arrays of numbers (just like the map sqrt example above!). In reality, the possibilities are endless, and the most interesting data parallel algorithms involve much more sophisticated data structures. The following paper gives a taste: it uses data parallelism, with communications restricted to a tree network, to implement two radically different variations on the Quicksort algorithm:
John O'Donnell, Functional microprogramming for a data parallel architecture, Proceedings of the 1988 Glasgow Workshop on Functional Programming , University of Glasgow Computing Science Department (July 1988)
##### Functional SPMD programming

Many commercially available parallel systems offer a a pragmatic and flexible programming style called SPMD (Single program, multiple data). The idea is that the program is written from the viewpoint of a single processor, and is completely sequential. The operating system loads a copy of the same program into each processor, with its own local data, and they all start executing concurrently. Interprocessor communication is performed by system calls, typically to the MPI (Message Passing Interface) library.

A high level specification in Haskell uses combinators like map, fold and scan to specify parallel computation and communications. These combinators take a global view of the entire computation, and they describe what is happening in all the processors. By definition, an SPMD program just performs a computation in a single processor, knowing nothing about the rest of the system. SPMD communications are handled by calling library functions (typically the MPI library), so receiving and sending messages is handled exactly the same way as input and output.

There needs to be a way to move from a collective level algorithm that uses combinators to specify the entire system computation to an individual level algorithm to be executed on one processor. The following paper formalizes this transformation, so that it's possible to prove that the low level implementation satisfies the high level algorithm:

John O'Donnell, The collective semantics in functional SPMD programming, Proc. 12th International Workshop on Implementation of Functional Languages, Springer, LNCS 2011, 249-265 (2001)

#### Transformation and incremental design

A central aim of my research is to help develop an incremental approach to parallel programming, where the user begins with a simple, abstract, high level program (in Haskell, of course) and this is then transformed through a sequence of stages into a concrete, executable implementation.

The main reason for taking this approach is that there are many separate issues to be decided in parallel programming, and it seems more effective to tackle these in a structured manner rather than mixing all the programming problems together, choosing the first solution that comes to mind, and writing the low-level implementation at the outset. Furthermore, pure functional languages are especially well suited for supporting program transformation and refinement by equational reasoning.

Some programs have an irregular computation structure. For example, they might iterate over sparse arrays or unbalanced trees rather than simple arrays. There are many forms of irregularity, involving the algorithmic structure, the pattern of memory access, the communications, the organization of tasks, and so on. It can be much harder to deal with irregular problems efficiently on a parallel system. One attempt to address this problem is presented in the following paper, which uses exploits lazy evaluation and tree traversal to control the irregular task structure of the hierarchical radiosity algorithm.

John O'Donnell, Thomas Rauber, and Gudula Rünger, A side-effect-free hierarchical radiosity algorithm, ACM Symposium on Applied Computing, Association for Computing Machinery, 782-789 (March 2000)

HPVM combines Haskell with the PVM library to support distributed programming. It allows you to write programs at a higher level than the usual C programs that call PVM for parallel communications.

Noel Winstanley and John O'Donnell, Parallel distributed programming with Haskell+PVM Third International Euro-Par Conference, Springer LNCS 1300, 670-677 (August 1997)
John O'Donnell and Gudula Rünger, A case study in parallel program derivation: the Heat Equation Algorithm, Functional Programming, Glasgow 1994, Springer-Verlag Workshops in Computing, 167-183 (1995)
##### Abstract parallel machines

As described above, the central idea behind functional data parallel programming is the use of combinators to express the standard, common patterns of computation. For example, every SIMD-style parallel compuation where each processor applies f to its state can be described as map f, and common communication patterns (fold/reduce, scan, scatter, gather and so on) also correspond to higher order functions.

Exactly what combinators belong to this list, and how are they defined? There are several ways to answer this question, and they lead to different outcomes:

• Circuits. Each combinator is taken to specify exactly how the system hardware performs the operation. This means the parallel program is, in effect, controlling the parallel system directly, right down to the implementation of communications within the interconnection network. Under this interpretation, there is a deep relationship between data parallel programs and digital circuit specification.
• Constructs. The parallel program is translated by a compiler that has built-in knowledge of each of the combinators. Since the same language can be supported on different architectures by different compilers, a combinator can produce quite different computations depending on the system. On the other hand, the meanings of the combinators are defined at just one level of abstraction, and this limits the benefit of the portibility. Even worse, the programmer is limited to the pre-defined combinators supplied by the language.
• Library. Each combinator is just an ordinary higher order function. It's convenient to provide a library of the most common combinators in advance, but a programmer working on a problem with an unusual structure is free to define new ones. There needs to be a mechanism for defining how the combinators are actually realised on a parallel system.

Abstract Parallel Machines take the third approach; they allow the programmer to define new parallel combinators and to collect them into libraries. An APM specifies both the semantics and the parallel behavior of a combinator, expressed in a specific style that makes the parallelism and the communications explicit. APMs also allow definitions at a wide range of levels of abstraction. This supports the definition of families of related operations that can be used for incremental derivation of parallel programs. APMs were first introduced in a conference paper:

John O'Donnell and Gudula Rünger, A methodology for deriving parallel programs with a family of parallel abstract machines, Third International Euro-Par Conference, Springer LNCS 1300, 662-669 (August 1997)
A more complete and definitive (but longer) explanation appears in:
John O'Donnell and Gudula Rünger, Abstract Parallel Machines, Computers and Artificial Intelligence (continued as Computing and Informatics), Vol. 19, No. 2, Slovak Academy of Sciences, Bratislava (2000)
Research using APMs is proceeding in several directions. A crucial problem is how to give suitable cost models to parallel operations defined as APMs, to enable the programmer to reason about the performance as well as the semantics of the program.
John O'Donnell, Thomas Rauber, and Gudula Rünger, Cost Hierarchies for Abstract Parallel Machines, 13th International Workshop on Languages and Compilers for Parallel Computing, Springer, LNCS (2001, to appear)
Another research direction is to try out some examples of program derivation using APMs. The first such case study involved a toy program where the code is short and simple, enabling us to focus on the details of the specification and the transformation.
Joy Goodman, John O'Donnell and Gudula Rünger, Refinement Transformation using Abstract Parallel Machines, Glasgow Workshop on Functional Programming, Computing Science Department, University of Glasgow (1998)
The next paper describes in more detail how one transformation -- load balancing -- is incorporated in the APM-based program derivation.
Joy Goodman and John O'Donnell, Introduction of static load balancing in incremental parallel programming, Seventh International Euro-Par Conference, Springer LNCS (2001, to appear)

My colleagues Thomas Rauber (Universität Halle, Germany) and Gudula Rünger (Technische Universität Chemnitz, Germany) have developed a structured approach to deriving parallel programs where the programmer uses cost models and quick but effective analyses to decide how to organize the parallel behavior of the program. This model is called TwoL, because it comprises two levels: the top level is task parallelism, and the lower level is data parallelism.

We are currently collaborating on combining Haskell program transformations with TwoL, and hope to build a working prototype during 2001. EPSRC has recently (June 2001) awarded us a grant to support this work.

### Digital circuits and computer architecture

Computer hardware is just as interesting as software, and there are many deep connections between them. There's no need to make an arbitrary distinction, restricting software to computer scientists and hardware to electrical engineers.

The centerpiece of my research in computer systems is Hydra, a computer hardware description language based on Haskell. I'm currently (mid 2001) in the midst of a complete rewrite of the Hydra system, and am planning to release it as a software package later in the year. Eventually there will be a Hydra web page, but that page won't have any content until autumn 2001.

#### Circuit design

One of the most fundamental problems in software is coping with the complexity of large scale designs, and this is equally important in hardware design. It should be no surprise that some of the best solutions that have been found for software apply equally well to hardware.

Hydra applies a variety standard techniques in functional programming to address well known problems in hardware design. In particular, Hydra makes use of higher order functions, algebraic data types, nonstrict semantics (lazy evaluation), multiparameter type classes, existential types, functional dependencies, and more. It also uses monads for the user interface, although not in the underlying mechanism for defining circuits.

The industry-standard computer hardware description language (CHDL) is called VHDL, and is based on the Ada programming language, although it is a distinct language. This means that a VHDL compiler must be written in order to run it on a system. Hydra takes a different approach: it is an embedded language. This means that a circuit specification written in Hydra is actually written in a well-defined subset of Haskell, and the implementation of Hydra consists of a library of Haskell definitions rather than a full compiler. Thus Hydra can be used with any Haskell compiler, and it benefits automatically from improvements in compiler technology.

##### Specification

For a long time, circuits were designed by drawing schematic diagrams that showed the components and how they were connected. Schematic diagrams are severely limited. A diagram describes only one instance of a circuit: you cannot show an n-bit adder in a diagram, you can only show an example at one particular size (say a 4-bit adder). Diagrams scale up poorly to large and complex circuits. Furthermore, a diagram doesn't actually do anything, it just sits there on a piece of paper.

Hydra solves these problems of schematic diagrams. It allows you to specify circuits precisely and with generality. Thus you can give a precise definition of an n-bit adder, and the definition can be instantiated automatically to any value of n. Hydra can check a circuit specification for errors, simulate its behavior, analyse its structure, and generate a netlist that can be used to fabricate it in hardware.

The fundamental idea in Hydra is to treat each circuit as a function. This is natural, since a circuit takes inputs, performs some computation internally, and produces outputs--just like a --> --mathematical function.

The following paper give an overview of the style of circuit specification used in Hydra, and also discusses how it is used in a course on computer architecture.

John O'Donnell. From transistors to computer architecture: teaching functional circuit specification in Hydra, FPLE'95: Symposium on Functional Programming Languages in Education, Springer-Verlag LNCS 1022 195--214 (1995)
John O'Donnell and Gudula Rünger, Formal specification of interconnection networks, Functional Programming, Glasgow 1995, Springer-Verlag Workshops in Computing (1995)
##### Simulation

A circuit specification in Hydra doesn't just describe the circuit; it is an executable program that defines the semantics of the circuit. This means, in particular, that you can simulate the circuit simply by executing its specification. The following paper explains how this works (the paper refers to an earlier version of Hydra with an obsolete syntax).

John O'Donnell. Hardware description with recursion equations, Proceedings of the IFIP 8th International Symposium on Computer Hardware Description Languages and their Applications, North-Holland, 363--382 (April 1987)

#### Structure of circuits

As mentioned above, a Hydra circuit specification has a semantics (or meaning). Actually, it has several alternative semantics, which are kept straight through the Haskell type class system. The circuit designer just writes one specification, which can be executed by applying it to a variety of suitable inputs. The type of the input determines which circuit semantics will be in effect for that execution. Hydra allows several distinct views of the structure of a circuit, and it supports VLSI design as well as ordinary logic design.

##### Netlists and Circuit Geometry

A particularly interesting circuit semantics is its netlist. This is a data structure that states explicitly what components are inside the circuit, and how they are wired together. A netlist can be used by automated tools to fabricate the physical circuit, and they can also be used by further software tools to analyze the circuit in various ways (for example, to find potential timing problems).

The netlists are produced by making the circuit specification define a graph structure that is isomorphic to the actual circuit. This approach is simple, elegant and effective, but it runs into an interesting technical problem in a pure functional language like Haskell. Ironically, it's easy to produce the netlist graph, but it's difficult to traverse that graph in Haskell! The reason is that a software tool written in Haskell cannot mark the graph to indicate that it has already seen a node. In effect, it's like giving a cave explorer a piece of chalk for marking directions on the cave walls in order to avoid getting lost, but then coating the cave walls with anti-chalk repellant!

Over the years I have experimented with a number of solutions to this problem. The following paper describes the position as it was in 1992:

John O'Donnell. Generating netlists from executable circuit specifications in a pure functional language, Functional Programming Glasgow 1992, Springer-Verlag Workshops in Computing 178--194 (1993)

There has been a lot more progress since then, and I'm working on a new paper giving the current status -- which is much better than it was in 1992!

Another aspect of circuit structure is the geometry of a circuit. This goes beyond specifying what the connections are; it says explicitly where each component is and how the wires are routed.

Hydra allows the designer to specify the circuit geometry, but does not require this. The geometry is specified through a set of geometric combinators, which were originally inspired by the primitive geometric operators of the TeX typesetting system. (The core of TeX is essentially a functional geometry package.) This paper illustrates how a regular but complex circuit layout is specified in Hydra. (This is an earlier version of Hydra with an obsolete syntax.)

John O'Donnell. Hydra: hardware description in a functional language using recursion equations and high order combining forms, The Fusion of Hardware Design and Verification, G. J. Milne (ed), Amsterdam: North-Holland 309--328 (1988)
##### VLSI design

Several of my research projects have examined the relationship between data parallel architecture and functional programming. In the course of this work, a real VLSI implementation was carried out by my former student Tim Bridges. I don't think that designing special purpose hardware is the right solution, except in a few special circumstances, but it's very enlightening to carry out occasional implementation projects like this one.

John O'Donnell, Timothy Bridges and Sidney Kitchel. A VLSI implementation of an architecture for applicative programming, Future Generation Computer Systems, Vol. 4 No. 3 245--254 (October 1988)

#### Correctness of hardware

Hardware systems are just as hard to design correctly as software, but it's more difficult to debug hardware. The time it takes to design a chip, fabricate it, test it -- only to find the next bug -- is prohibitive. This is a natural area for the application of formal methods, since every bug that's avoided saves so much time.

Hydra is embedded in Haskell, and it inherits the safe equational reasoning techniques that work so effectively in Haskell. There are many different kinds of formal method in use, and they all have their place, but equational reasoning about hardware in Hydra has two huge advantages:

• Equational reasoning is relatively straightforward; it's very similar to high school algebra, where you use a set of mathematical laws and "substituting equals with equals" in order to transform one term into an equivalent one.
• It enables you to reason about the circuit while you're still designing it, in contrast to some other formal methods that can be used only with a completed design. It's better to avoid making a mistake in the first place than to make unnecessary mistakes and then to find them!

I'm interested in all sorts of formal reasoning about digital circuits and computer architecture. The following subsections describe some interesting examples that are nontrivial but still not too long or hard to follow.

##### Parallel scan

Many problems can be solved efficiently using a tree-structured computation. In some cases, a simple "upsweep" from the leaves of a tree to its root suffices. For example, a tree of adders can compute the sum of n numbers in log time. The same technique works for any associative operation f; it isn't restricted to addition.

The foldl1 operator performs a linear reduction of a list xs using an arbitrary operator + (which does not need to be addition). For example, foldl1 (+) [a,b,c,d] = a + (b + (c+d)), and foldl1 f xs takes time O(n) if xs contains n elements. The tree fold operator tfold is similar, but it uses a divide and conquer approach to perform a balanced computation which can be executed in parallel: tfold (+) [a,b,c,d] = (a+b) + (c+d). In general, tfold f xs requires only O(log n) time for a list of length xs. The following theorem generalises this technique, and is straightforward to prove with equational reasoning. It can be applied to speed up a whole class of fold computations.

Theorem. If f :: a -> a -> a is associative, and xs :: [a] is a finite list, then foldl1 f xs = tfold f xs.

However, a tree structure is quite flexible, and it can perform many computations in log time that are not expressed very well as folds. It's useful to find a more general way to specify all tree-structured computations, so we can get the full benefit of the entire class of tree circuits.

The class of scan functions provides an effective tool for exploiting tree circuits: scan is more general than fold, but it is still implementable on a tree. There are several scan functions, and several ways to define them, but the simplest view is the following: when you fold an operator (+) over a list [a,b,c,d] you get a series of partial results 0, 0+a, 0+a+b, 0+a+b+c before obtaining the final result 0+a+b+c+d. The list of these partial results is the scan over the list, while the final result is the fold over the list. (For precise definitions of the functions, see the papers below.)

Obviously a tree can calculate foldl1 f xs if f is associative, but it's much less obvious that a tree can calculate scanl f a xs. Of course we could do it with n trees, but that requires far too much hardware; the aim is to calculate scanl f a xs on one tree. This algorithm is called parallel scan, and it was disconvered by Ladner and Fisher about 25 years ago.

The parallel scan algorithm has been described informally in numerous books and papers, but these descriptions are usually based on an example calculation, which gives little insight into how the algorithm works. Furthermore, it is quite hopeless to prove the correctness of the algorithm when it isn't even specified precisely. The following paper addresses these problems by giving a general but precise formal specification of the algorithm and the associated tree hardware, and it proves the correctness using equational reasoning with Hydra.

John O'Donnell, A correctness proof of parallel scan, Parallel Processing Letters, Vol. 4 No. 3 329-338 (September 1994)

Parallel scan has many applications, and a particuarly good one is the design of a log time binary addition circuit. This is a vitally important problem, not just a toy example, since an adder usually lies on the critical path of processor circuits. The following paper uses Hydra to specify a linear time ripple carry adder, and then applies equational reasoning and the parallel scan theorem to reduce the time from O(n) to O(log n).

John O'Donnell and Gudula Rünger, Derivation of a Carry Lookahead Addition Circuit, Proc. ACM SIGPLAN 2001 Haskell Workshop, Electronic Notes in Theoretical Computer Science, (to appear)

### Functional programming

The theme underlying all my research is functional programming. The topics described above involve the application of functional programming to computer systems, but I'm also actively interested in a variety of topics directly related to functional languages.

#### Tools

##### Equational reasoning

Pure functional languages like Haskell offer equational reasoning: this means you can do algebra on programs by substituting equals for equals. The reason this works is that Haskell equations really are equations; if your program says foldr f a (x:xs) = f x (foldr f a xs) then whenever you see an expression that matches either side of the equation, you can replace it with the other side of the equation (keeping the variable names straight, of course).

Equations are completely different from assignments. In C, the assignment

```n = n+1;
```
is a command telling the computer to modify the value of n. In Haskell, the equation
```n = n+1
```
is a mathematical statement asserting that n and n+1 are the same value. Furthermore, it says that to calculate the value of n, you just have to work out n+1. This mathematical statement happens to be untrue, and if your program needs to compute n it will go into an infinite loop. What it will not to is to modify the value of n and ust that!

Equational reasoning about functional programs is similar to using ordinary school algebra to solve equations like a*x^2 + b*x + c = 0. Several of my papers above contain examples of equational reasoning applied to practical problems; see in particular the parallel scan and adder derivation.

In large scale problems, the algebra can become long and involved. Fortunately, the process of manipulating the algebra can also be supported by a computer (another advantage of formal, precise reasoning!) and I'm working on a Haskell Equational Reaoning Assistant to do this.

##### Debugging

So many beginners in functional programming ask for an interactive debugging tool!

A common response from experienced functional programmers is that the strong type system enables the compiler to catch most errors in programs, so you get an error message from the compiler, and you don't have to run the program and crash it in order to get feedback from the debugger.

My own opinion is that a rich and safe type system, as Haskell provides, is the single most effective tool for preventing errors and catching the ones you still make. The incidence of errors is reduced hugely, perhaps by an order of magnitude.

This is wonderful, but still it isn't enough! The problem is that it can become very hard to find the small residue of bugs that get past the type checker, because the usual debugging tools (shoving print statements into the program, watching the sequence of events in an animated window, etc.) run into trouble with the restrictions imposed by pure functional languages. For example, you can't put a print statement (causing a side effect) in a pure functional program that hasn't already put all the monadic infrastructure in place to support this. Similarly, lazy evaluation makes the actual sequence of events during program execution so confusing, an ordinary animation tool would just make things more confusing.

A survey of these technical difficulties and an early serious approach to solving them appear in this paper:

John O'Donnell and Cordelia Hall. Debugging in applicative languages, Lisp and Symbolic Computation, Vol. 1 No. 2 113--145 (September 1988)

A more general discussion of some alternative approaches appeared in this workshop paper:

Cordelia Hall, Kevin Hammond and John O'Donnell, An algorithmic and semantic approach to debugging, Functional Programming: Glasgow 1990, 44-53, Springer Workshops in Computing (1991)

I'm still actively interested in the debugging problem. For some current state of the art work, I recommend the recent papers from Colin Runciman's group at York.

#### Algorithms and data structures

Many of the standard algorithms and data structures for imperative programming are also useful for pure nonstrict functional languages, but there are fundamental differences between these language paradigms that lend functional algorithms a distinctive character. Some imperative algorithms depend heavily on modifications to existing data structures, which is generally disallowed in pure functional programming (with some exceptions). Countering this, a functional language offers features that don't exist in imperative languages, such as the "knot-tying" techniques of mutual recursion, and the implicit coroutines of lazy evaluation.

Some of my work in functional algorithms and data structures is embedded in the other projects (parallel combinators, functional hardware description, etc.). The topics described below aren't specialised for any particular application, so they are best classified as general functional programming technique.

##### Nondeterminism

Some software applications require nondeterministic behavior, especially in situations where the software has to respond to user inputs that could arrive at any time in arbitrary order.

How could a pure functional language express nondeterminism? A natural idea, introduced by McCarthy around 40 years ago, is to use a pseudo-function amb :: a -> a -> a, such that amb x y will return either x or y, but amb gets to decide which value to return. However, amb is obligated to return a well-defined result if either x or y is well defined. This means, for example, that amb cannot be defined simply to return x all the time, since amb bottom 2 would fail to return the well defined result 2 (you can think of bottom as meaning "infinite loop"). The intended implementation of amb is to evaluate both arguments concurrently, perhaps by alternating time slices between them, until one of them produces a value, which is then returned as the value of the amb application.

Impure functional languages can use amb without problem, but any pure functional language cannot provide amb since this would render equational reasoning invalid. For example, amb 2 3 could return 2 sometimes and 3 at other times. If we have amb and try to use equational reasoning, we get the result 2 = amb 2 3 = 3, and we have just "proved" 2=3!

Lots of alternative approaches for pure functional nondeterminism have been considered. John Hughes and I have proposed a method called "set nondeterminism".

The idea behind set nondeterminism is that the program doesn't define a final result (say an Integer). Instead, it defines a set of possible Integers. Whenever there is a nondeterministic choice to be made, we define the set of possible results. Therefore we replace amb x y by {x,y}.

The real trick behind the method lies in the particular operations on nondeterministic sets that are allowed. These are designed carefully to achieve two aims: (1) the language preserves referential transparency (i.e. it is "pure" and equational reasoning is ok); (2) there is an efficient way to implement the language. Therefore some of the ordinary operations on sets are disallowed. For example, there is no function choose :: {a} -> a that can choose an arbitrary element from a set, since this would allow you to prove 2=3, breaking equational reasoning.

Set nondeterminism, as well as a critical review of alternative approaches, is described in detail in the following papers. Unfortunately they are really paper, as I don't have machine-readable copies of them! You're welcome to email me for a paper copy.

John Hughes and John O'Donnell. Nondeterministic functional programming with sets, Proceedings of the 1990 Banf Conference on Higher Order Reasoning (1990)
John Hughes and John O'Donnell. Expressing and reasoning about nondeterministic functional programs, Functional Programming, Glasgow 1989, Springer-Verlag Workshops in Computing 308--328 (1990)
##### The scan family

The scan functions are closely related to several topics I've worked on extensively, including linear circuit structures, data parallel algorithms, and tree structured computations. As a result, I've done a lot of work with the scan functions. This has led to an entire family of related functions, with a rich algebra of properties.

An unusual but useful function is the bidirectional scan (and its cousin, the bidirectional fold). The usual scan functions are unidirectional: scanl works from the left to right, and scanr is the opposite. The bidirectional scan defines a concurrent computation that begins at both ends of a list and works across it in both directions. The original definition of the bidirectional scan is given in this paper:

John O'Donnell, Bidirectional fold and scan, Functional Programming: Glasgow 1993, Springer Workshops in Computing (1994)

Since then there has been more progress on these functions, and I plan a future paper giving a systematic coverage of the entire family of scan and fold functions.

##### Functional arrays

For a long time, a central problem in functional programming was the implementation of arrays. In a nutshell, the problem is this: imperative languages let you fetch and store an arbitrary element of an array in O(1) time, but there just doesn't seem to be a way to achieve this in a pure functional language. The naive solutions are hopelessly inefficient. With a lot of cleverness, it's possible to provide array access operations that are reasonably efficient most of the time.

In recent years this problem has largely faded from sight because of the advent of monadic programming, where side effects are introduced into the language in a safe way. As a result, you can write an ordinary Fortran-like program with imperative arrays in Haskell and get it to work with reasonable efficiency.

Nonetheless, this approach of embedding imperative arrays into Haskell with monads is evading the question. The result is less general than pure functional arrays, which allow unlimited sharing. This approach is fine if you have an existing imperative algorithm to be transcribed into Haskell, but the question remains: how efficiently is it possible to implement genuine functional arrays without placing any limitations on the allowable patterns of array access?

A partial answer to this question involves a combination of hardware and software support:

John O'Donnell, Data parallel implementation of Extensible Sparse Functional Arrays, Parallel Architectures and Languages Europe, LNCS 694, Springer-Verlag 68--79 (June 1993).

There is more to this story, and some more recent results, which I'll write about when time permits.

### Computers and Music

research on computers and music

### Research and teaching

My teaching page describes some of the courses I've taught recently, and it contains information for students in my current courses.

One of the central themes in my research is using functional programming as a formal method to help design reliable software and hardware. Naturally, such methods are beneficial to courses in all areas of computer science, and I think it's especially important to apply formal methods to a wide range of subjects. If formalism is limited to a single course on formal methods, students will not realise how useful it can be.

Cordelia Hall and I have written an elementary textbook on discrete mathematics that stresses the connections between mathematics and computing. The book uses Haskell to implement a wide range of mathematical ideas; this enables students to compute with logic, sets, relations and so on. This gives a more practical and concrete understanding than a purely theoretical treatment can provide.

Cordelia Hall and John O'Donnell, Discrete Mathematics Using a Computer, Springer, ISBN 1-85233-089-9 (2000)

I'm working actively on new applicatoins of Haskell and formal methods to education in computer systems, based based on the Hydra research project. Old: IFL07 slides

John O'Donnell, jtod@dcs.gla.ac.uk