The esfa package contains a suite of libraries and programs in Haskell and C that implement ESF arrays. This document explains what ESF arrays are, how to use them, and how they work. The examples directory contains several test programs that illustrate how ESF arrays behave.
Further resources, including papers, presentations, and the latest versions of the software and this user guide, are available on the ESFA web page.
Notes. This document isn't finished yet. Also, the software is part of a research project, and is not yet a stable production release. Some changes to the API are planned, and this document will be updated as that happens.
ESFA is an array data structure with pure functional access, sparse traversal, extensibility, associative searching, and memory management, yet with O(1) access time.
The significance of the ESFA system is that it has a lower time and space complexity than appears to be possible. There is good reason to conjecture that it is impossible to obtain O(1) access time for the ESFA operations using a standard computer architecture, such as the RAM theoretical model or the von Neumann architecture.
The conjecture assumes that unrestricted usage of the ESFA operations is allowed; naturally if you restrict the way the arrays are updated, a conventional implementation can give O(1) access --- as long as it's acceptable to give wrong results if you perform an update that falls outside its limitations. In contrast, the ESFA system always performs every operation in unit time, regardless of the past history, and with no restrictions on how you use the arrays.
Here is a quick motivation for the conjecture: if you make several updates to an array, you can't afford to replicate the elements of the original array even though they are now shared among several arrays, and you also can't afford to follow any pointers through a linked data structure that implements the sharing.
So how does this system implement algorithms in less time than the lower bound? The answer is that ESFA is a software/hardware codesign: the algorithms run on a digital circuit which provides a smart memory that exploits the massive parallelism of circuits to achieve a lower time and space complexity than is apparently possible on conventional hardware. Furthermore, this is not achieved by shifting the complexity from the software to the hardware: the ESFA circuit provides powerful primitives without significantly increasing the complexity of standard hardware (in gate delay or component count). Depending on the cost model used, the ESFA circuit either has the same asymptotic cost as a conventional RAM memory, or its cost is increased by a negligible amount. There is, of course, a constant-factor slowdown.
ESFA supports several kinds of operation. The most important feature is pure functional arrays, but the techniques used in the implementation support several additional features (sparsity, extensibility, associativity) as well. Here are the main characteristics of the data structure:
Every ESFA operation takes unit time; that is, it requires a fixed number of clock cycles, regardless of the past history of operations.
ESF arrays are immutable, and are accessed using incremental updates and lookups (i.e. "one element at a time", just as with conventional imperative arrays). ESF arrays are purely functional: when you update an array, the result is a completely new array, and the old one is unchanged and still available. Instead of making a side-effect to an array to change its value, a program simply performs an update to create a new array, and it can't ever change the value of an old array. This is normal in a pure functional language; for example, in Haskell you don't modify a list by zapping its first element in memory, but instead you construct a new list using the (:) function.
There are no restrictions on the use of the ESFA functions. For example, a program can make many alternative updates to the same array, resulting in massive sharing of elements among arrays.
ESF arrays may be sparse, and a program can traverse an array from one defined index to the next, skipping over indices that don't have a defined value.
The arrays are extensible: you don't need to allocate space, or specify the minimum and maximum index values. At any time the minimum index can be decreased, or the maximum index increased, simply by updating the array with a new value at a lower (higher) index.
Several operations that support storage management are provided; these make it possible either for a program to manage space explicitly, or to integrate ESFA with a heap garbage collector. There is an operation to delete an array, which automatically reclaims all elements that are inaccessible while retaining all elements that are shared with a live array -- in O(1) time!
A variety of applications using ESFA are possible. For example, the data structure provides a very flexible "undo/redo" facility, and functional arrays directly implement the environment extension and lookup used in denotational semantics. This document focuses on how to use the operations and how the implementation works, rather than on the range of possible applications.
All of the ESFA operations can be implemented using standard algorithms on a conventional architecture, but this increases the time complexity of array access. That is unfortunate, because conventional imperative arrays offer O(1) time update and lookup.
The ESFA system described here uses a hardware/software codesign. The operations are implemented in three levels; the top two levels are software, but the lowest level is a digital circuit that implements a "smart memory". The crux of the system is that the smart memory has essentially the same complexity as a conventional RAM memory (in terms of gate delay and component count), yet it makes it possible to implement the ESFA operations in a fixed number of clock cycles -- exactly what it is conjectured that the RAM cannot do.
ESFA can be executed on a conventional von Neumann architecture, by but simulating the circuit, but this increases the time complexity because a sequential simulator requires O(n) time to simulate one clock cycle of a memory with N cells. A sequential simulator is still useful for experimenting with the data structure, and it gives accurate timings for a true parallel circuit implementation.
The digital circuit is specified using a functional hardware description language, but isn't yet realised in a physical chip. However, the hardware description language constitutes an executable specification, and executing the specification simulates the circuit.
It isn't necessary to use a special VLSI chip to implement the hardware; it can also be emulated on a parallel computer, such as a multicore processor or a GPU.
There are two APIs to the operations on ESF arrays: a set of pure functions and a set of monadic operations. Therefore each ESF operation (with a few exceptions) has two types; the functions whose names begin with esfm are the monadic versions. Some of the operations, including initialization and some of the diagnostics, are meaningful only in the monadic version.
The "esfm" operations are not the lowest level of operation; they collect tracing data which is stored in the heap, and they provide some extra diagnostic tools. This makes it possible to provide operations such as esfmPrintSpecArrays. The low level operations, which run either on a real or simulated digital circuit, have names beginning with "ptc".
See Section for an example that shows how to set up and run a program.
The esfm family of functions provides some convenient features that make it easier to experiment with ESF arrays, but this is not the lowest level API, and also not the most secure. (That API isn't documented yet.) The esfm API keeps the state of the simulated circuit in a state monad:
StESFA a e c
StESFA is defined using StateT, with IO as the inner monad. The type variable a is the user's state type; e is the element type used for ESF arrays, and c is the return type, which is often just (). The user's state is there just for convenience, but this isn't very secure.
The function simulateESFM k x script generates an instance of the circuit containing 2^k cells, and runs the simulator on the script, using x as the initial user state.
simulateESFM k x script
Before doing any other operations, the system must be initialized.
esfmInitialize
An array has a "handle" or "stable pointer" which is of type AName and is represented as an Int. This is an opaque type; that is, there is no particular significance to the integer value of an AName. An AName is analogous to the machine address used to represent an ordinary imperative array. (However, there is a difference between an AName and an imperative array address. For imperative arrays, the numeric value of an array's address is used to calculate the address of an element at a given index, but, as will be seen later, that is not the case for ESF arrays.)
Note: perhaps the token AName will be replaced by something like ESFA, one of many API changes likely to be made.
An ESFA consists of a set of mappings from an index to a value. An index has type Int, and a value has a polymorphic type; in this document the type variable v will be used for the value type. All elements of all ESF arrays must have the same type, so it's common to define an algebraic data type with tags to allow element values of different types. The examples in this document just use Int for all element types.
The system always has an empty array that contains no elements. This is not created by the user, and cannot be deleted. There are two names, corresponding to the functional (esf) and monadic (esfm) API, but there is no difference between them; they are both defined just for consistency in naming. (An AName is really an Int, and empty has the value -1, although the user doesn't need to know that, and should always use esfmEmpty and should never use -1 directly.)
esfEmpty, esfmEmpty :: AName
Update takes an existing array, an index, a value, and a Bool. It returns a new array which is identical to the old one, except that it has the specified value at the given index. The old array is not modified.
esfUpdate, esfUpdateNfy
:: AName -- old array to be updated
-> (Int,e) -- new element (index, value)
-> Maybe AName -- new array
esfmUpdate, esfmUpdateNfy
:: Show e
=> String -- label to make dumps more readable
-> AName -- old array to be updated
-> (Int,e) -- new element (index, value)
-> StESFA u a (Maybe AName) -- result is new array
The four versions of update are almost the same. The differences among them are:
The monadic ones (esfm) return a result in the StESFA monad, while the nonmonadic ones (esf) just return the result directly.
Since the monadic API has facilities to dump the arrays, the element type needs to be in the Show class.
The monadic operations take an extra String argument, which is a symbolic label that is used to make array dumps more readable (see esfmShowArrays). Since array dumps are not supported in the non-monadic functions, the String argument is omitted.
The versions that end in Nfy indicate that the element should receive special treatment if and when an array deletion makes the element inaccessible. If an Nfy operation was used, then the host system will be notified before the element is reclaimed. This is needed to handle pointers from an ESF array back to a heap object (see the section Notification and Zombies). If an array is created with esfUpdate or esfmUpdate, then the element will be reclaimed immediately and automatically if and when it becomes inaccessible.
The lookup function is used to an element from an array at a specified index.
esfLookup
:: AName -- array to look up
-> Int -- index
-> Maybe (EltVal e) -- value if it exists
esfmLookup
:: Show v -- allows dumping arrays
=> AName -- array to look up
-> Int -- index
-> StESFA u v (Maybe v) -- value if it exists
The behaviour of ESF arrays is defined by two laws.
Law 1 (Empty array.) If you look up any index at all in an empty array, you get nothing.
esfLookup esfEmpty i = Nothing
Law 2 (Nonempty array.) Suppose an array a2 is defined as update a j v. Now if you look up a2 at index j, the result is Just v. If you look up a2 at any index other than j, the result is the same as looking up a at that index.
esfLookup (esfUpdate a j v) i
| i==j = Just v
| i/=j = esfLookup a j
Some important points.
You don't allocate an array with a given size
There is only one way to create an array, and that's using update
When you update an array, you don't modify it - you create a new array
An array is immutable: once created, its value will never change
You can look up an array at any index; that's why the result is a Maybe type
esfmPrintState. Prints out the state of the circuit. Only the cells that contain useful information are printed; empty cells are skipped.
esfmPrintMachineArrays. This examines the state of the (simulated) circuit and works out the contents of the arrays that are represented in the machine. These arrays are printed out. This operation shows what is actually represented in the machine (cf. esfmPrintSpecArrays). This is a meta-operation that relies on the ability of the circuit simulator to gain access to the internal register states. Note: the MachineArrays aren't quite finished: if an array has been updated at an index where it was already defined, both index/value definitions will be printed. That isn't a bug in the ESFA system, it's just incomplete implementation of the meta-operation.
esfmPrintSpecArrays. Whenever an esfmUpdate is performed, the result according to the specification is calculated and stored in the tracing state. The Spec arrays are using an algebraic data type to calculate the correct values of the arrays; this is actually an executable specification. This value is completely independent of the machine state. By using esfmPrintSpecArrays, you can see what the correct arrays should be, according to the specification.
esfmAnalyzeMachineState. You can find out several statistics of the machine state. Also, you can extract these fields for yourself and process them. Here is an example, followed by the definition of the data type.
stats <- esfmAnalyzeMachineState liftIO (putStrLn (show stats))
data MachineStatistics = MachineStats {ncells, narrays, nelts, nzombies, nfree :: Int}
esfmSetOption. You can turn on tracing of the operations:
esfmSetOption EsfmTraceOperation
This section describes several example programs that illustrate how the system works. Each program listing is interleaved with its output and some commentary explaining what is going on, and all the programs appear in the examples directory. You can experiment by modifying any of the programs and running it.
The examples here use the monadic versions of the ESF operations. However, it is not necessary to encapsulate ESFA operations in a monad! Monads are often used for imperative arrays in Haskell, because single-threaded access is required for imperative arrays, but the section below on Representation and sharing shows how ESFA handles multiple threaded arrays easily. The reasons that a monad is used here are:
The monad makes it possible to illustrate how storage management works. The pure function interface requires the ESFA system to be integrated fully with the Haskell garbage collector; otherwise the system will eventually run out of memory. This integration is not yet done.
It's helpful to print out the machine state at well defined points in the program, and the monad facilitates that.
The program examples/InitUpdateLookup.hs shows how to set up a program, run it, initialize the machine, and do some simple array operations, particularly update and lookup. To run it, go into the examples directory, enter ghci InitUpdateLookup, then enter main.
The program begins by importing some required modules.
-- ESFA: InitUpdateLookup
-- Demonstrate initialization, update, lookup, and printing arrays
-- This program is discussed in doc/src/esfa-users-guide.txt
module Main where
import Control.Monad.IO.Class
import Data.Array.ESFA.SimArch
The main program uses sumulateESFM to simulate the circuit, and to run the ESFA algorithm on that simulated circuit. (Parallel implementations, including multicore, GPU, and FPGA, are underway, and will use the same interface but a different run function to use in place of simulateESFM.)
The first argument is the log of the number of memory cells; thus specifying 10 gives a machine with 1024 cells. The second parameter () is the initial value of the user's state, which is not used in this particular program. The script is the user program, running in the StateT monad.
The rest of the program is the body of the script. In it, you can do IO using liftIO. The ESF machine needs to be initialized, setting all the cells in the circuit to empty. We also set a trace option, which will cause the system to print a message each time an update or lookup is performed.
The program uses the StateT monad; StESFA is defined for convenience. The program needs to begin by initializing the system state. It also sets an option, EsfmTraceOperation, that will cause the system to print a message each time an update or other ESFA operation is performed.
main :: IO ()
main = simulateESFM 10 () script
script :: StESFA () Int ()
script =
do liftIO (putStrLn "ESFA Example InitUpdateLookup")
esfmInitialize
esfmSetOption EsfmTraceOperation
Now the program will create some arrays using update. The empty array is called esfmEmpty (for now, at least). It's always present: the user doesn't need to create it, can't create it, and can't destroy it. The only way to get started is by updating esfmEmpty to get a new array. The following definition updates esfmEmpty at index 100 with the value 101. The string parameter "a1" isn't really part of the ESFA system, and isn't present at all in the machine representation. But the esfm API lets you give a string label to an array, which is used by esfmPrintMachineArrays and esfmPrintSpecArrays to make it easier to see which array is which. This is the only purpose of that label. It's possible for esfmUpdate to fail, so it returns a Maybe type. There are several ways to detect and handle errors, but in this demonstration program we just pattern match so that a1 is defined to be the actual array. The update would fail if the memory is completely full, or if the array being updated doesn't exist.
-- Create a couple of arrays
Just a1 <- esfmUpdate "a1" esfmEmpty (100, 101)
Just a2 <- esfmUpdate "a2" a1 (200, 201)
-- Expected contents of the arrays:
-- a1 = 100~>101
-- a2 = 200~>201, 100~>101
Since a trace was requested, each operation causes a line to be printed. The first update defines an array with the label "a1", which is created by updating the empty array (whose name is -1) with value 101 at index 100. The result of the update is a new array whose AName is 0. The second operation updates a1, whose AName is 0, and produces a2 with AName = 1.
ESFA Example InitUpdateLookup
esf update: define a1 (name=0) = update -1 [100]=101
esf update: define a2 (name=1) = update 0 [200]=201
Now do some lookups.
-- Fetch and print some array elements
Just x1 <- esfmLookup a1 100 -- expect 101
Just x2 <- esfmLookup a2 100 -- expect 101
Just x3 <- esfmLookup a2 200 -- expect 201
liftIO (putStrLn (show [x1,x2,x3]))
-- Fetch and print some non-existent array elements
y1 <- esfmLookup a1 200 -- expect Nothing
y2 <- esfmLookup a2 300 -- expect Nothing
liftIO (putStrLn (show [y1,y2]))
Here is the output, which agrees with the expected values.
esf lookup (name=0) [100] = Just 101
esf lookup (name=1) [100] = Just 101
esf lookup (name=1) [200] = Just 201
[101,101,201]
esf lookup (name=0) [200] = Nothing
esf lookup (name=1) [300] = Nothing
[Nothing,Nothing]
Finally, the program dumps out the entire contents of all arrays in its memory. There is a meta-operation called esfmPrintMachineArrays that examines the states of the cell circuits, works out the arrays that are represented, and prints them out. Each time you do an esfmUpdate, the correct result --- correct according to the executable specification --- is automatically saved in the state. This is using a conventional tree data structure in the RAM to represent the ESF arrays. The esfmPrintSpecArrays operation prints those out.
-- Print the expected and then the actual arrays
esfmPrintSpecArrays
esfmPrintMachineArrays
The output is given below; again it agrees with the expected values that are given in comments below the updates.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
By comparing the results of the PrintSpecArrays and PrintMachineArrays, you can check that all is well. The results should always be identical (except that printSpecArrays currently shows esfmEmpty).
needs work...
Now a2 is defined by updating a1.
Just a2 <- esfmUpdate "a2" a1 (200, mkInt 201)
showState
-- a2 = 200-201 100-101
Now there are two arrays in the system. The old array, a1, has not been modified, and the new array, a2, is just like a1 except that it has the new element.
esf update: define a2 (name=1) = update 0 [200]=EsfUserOrdVal 201
0. > 1 1..2 [ 100] = EsfUserOrdVal 101
1. * > 2 2..2 [ 200] = EsfUserOrdVal 201
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
MachineStats {ncells = 16, narrays = 2, nelts = 2, nzombies = 0, nfree = 14}
No we can print the internal machine representation. This shows the fields in the cells. Naturally this representation takes some work to interpret, because it's designed to be used by the algorithm, not to be read by users. A detailed explanation of what the fields mean, and how to interpret the results, is given later. The example programs sometimes show the expected contents of an array with a comment; the comment here says that array a1 contains one element that maps index 100 to value 101.
There are several operations that print information contained in the system state. In this example program, several of those will be used after each update. For convenience, they are collected into one operation called showState. The output from this is described later, after the second update is performed.
showState =
do esfmPrintState
esfmPrintMachineArrays
esfmPrintSpecArrays
stats <- esfmAnalyzeMachineState
liftIO (putStrLn (show stats))
liftIO (putStrLn (take 75 (repeat '-')))
A tree diagram (shown with indented text) illustrates the sharing. Array a1 contains the [100]=101 element. Array a2 contains [200]=201, and it also inherits [100]=101 from a1. Thus the element [100]=101 is shared between a1 and a2.
a1 [100]=101
a2 [200]=201
Now consider how these arrays are represented in the machine. The crucial point is that each element is represented in just one cell, even if it is shared among many arrays. Here's a tree diagram showing the array codes and the inclusion sets.
a1 (0->1) [100]=101 1..2
a2 (1->2) [200]=201 2..2
The notation a1 (0->1) means that array a1 has AName=0, and it currently has ACode=1. The AName is permanent; a1 will always have AName=0. When a1 was created, the update operation returned the int 0, and the value of a1 in the main program is 0. When an array is created, an arbitrary empty cell is chosen to hold its element, and if the CellId of that cell is i, then the name of the array is AName=i. Similarly, a2 has name 1 and code 2.
The notation i..j means an inclusion set consisting of arrays with codes c such that c <= i <= j. Thus the [100] element has inclusion set {ACode 1, ACode 2} which is the same as {AName 0, AName 1}, or {a1, a2}. This element is shared between two arrays. The [200] element belongs only to a2.
If you want to find the elements of an array, say a1, first find its AName (0), then look in cell with CellID=0 to get the ACode (it's 1), then look at all the cells where 1 lies within the low..high interval (in the state right now, the only element with that property is [100]. This identifies the candidates for the elements of the array, but if several of these elements have the same index value, then the most recent update has to be chosen; the lookup algorithm does this by finding the candidate with the specified index that has the narrowest inclusion set interval.
Now we define a new array by updating a1 again. Notice that a1 has been updated twice, in two different ways, resulting in a2 and a3. This is a "multi-threaded" access, where a new update uses some older array, rather than just the most recently created array.
Just a3 <- esfmUpdate "a3" a1 (300, mkInt 301)
showState
-- a3 = 300-301 100-101
esf update: define a3 (name=2) = update 0 [300]=EsfUserOrdVal 301
0. > 1 1..3 [ 100] = EsfUserOrdVal 101
1. > 3 3..3 [ 200] = EsfUserOrdVal 201
2. * > 2 2..2 [ 300] = EsfUserOrdVal 301
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
MachineStats {ncells = 16, narrays = 3, nelts = 3, nzombies = 0, nfree = 13}
Just a4 <- esfmUpdate "a4" a2 (400, mkInt 401)
showState
-- a4 = 400-401 200-201 100-101
esf update: define a4 (name=3) = update 1 [400]=EsfUserOrdVal 401
0. > 1 1..4 [ 100] = EsfUserOrdVal 101
1. > 3 3..4 [ 200] = EsfUserOrdVal 201
2. > 2 2..2 [ 300] = EsfUserOrdVal 301
3. * > 4 4..4 [ 400] = EsfUserOrdVal 401
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
MachineStats {ncells = 16, narrays = 4, nelts = 4, nzombies = 0, nfree = 12}
Just a5 <- esfmUpdate "a5" a2 (200, mkInt 501)
Just temp <- esfmLookup "a2" a2 200
liftIO (putStrLn ("a2[200] = " ++ show temp)) -- expect 201
Just temp <- esfmLookup "a5" a5 200
liftIO (putStrLn ("a5[200] = " ++ show temp)) -- expect 501
showState
-- a5 = 500-501 200-201 100-101
esf update: define a5 (name=4) = update 1 [200]=EsfUserOrdVal 501
esf lookup a2 (name=1) [200] = Just (EsfUserOrdVal 201)
a2[200] = EsfUserOrdVal 201
esf lookup a5 (name=4) [200] = Just (EsfUserOrdVal 501)
a5[200] = EsfUserOrdVal 501
0. > 1 1..5 [ 100] = EsfUserOrdVal 101
1. * > 3 3..5 [ 200] = EsfUserOrdVal 201
2. > 2 2..2 [ 300] = EsfUserOrdVal 301
3. > 5 5..5 [ 400] = EsfUserOrdVal 401
4. * > 4 4..4 [ 200] = EsfUserOrdVal 501
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 201. [200] = 501.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 501. [200] = 201.
MachineStats {ncells = 16, narrays = 5, nelts = 5, nzombies = 0, nfree = 11}
Before doing some lookups, let's create another array.
Just a6 <- esfmUpdate "a6" a3 (600, mkInt 601)
showState
-- a6 = 600-601 300-301 100-101
esf update: define a6 (name=5) = update 2 [600]=EsfUserOrdVal 601
0. > 1 1..6 [ 100] = EsfUserOrdVal 101
1. > 4 4..6 [ 200] = EsfUserOrdVal 201
2. > 2 2..3 [ 300] = EsfUserOrdVal 301
3. > 6 6..6 [ 400] = EsfUserOrdVal 401
4. > 5 5..5 [ 200] = EsfUserOrdVal 501
5. * > 3 3..3 [ 600] = EsfUserOrdVal 601
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 201. [200] = 501.
a6 (5) = [100] = 101. [300] = 301. [600] = 601.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a3 (2) = [100] = 101. [300] = 301.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 501. [200] = 201.
a6 (5) = [100] = 101. [300] = 301. [600] = 601.
MachineStats {ncells = 16, narrays = 6, nelts = 6, nzombies = 0, nfree = 10}
The next part of the program performs some lookups. In a real program, lookup (and the sparse operations described later) are the only way to fetch data from the machine. (Note: the string label parameter shouldn't be there and will be removed.) To do a lookup, you specify an array and index, and the result is the contents of the array at that index. Naturally this is a Maybe type.
Just x1 <- esfmLookup "a4" a4 400 -- expect 401
Just x2 <- esfmLookup "a5" a5 100 -- expect 101
Just x3 <- esfmLookup "a5" a5 200 -- expect 201
Just x4 <- esfmLookup "a6" a6 300 -- expect 301
liftIO (putStrLn (show (map getInt [x1,x2,x3,x4])))
esf lookup a4 (name=3) [400] = Just (EsfUserOrdVal 401)
esf lookup a5 (name=4) [100] = Just (EsfUserOrdVal 101)
esf lookup a5 (name=4) [200] = Just (EsfUserOrdVal 501)
esf lookup a6 (name=5) [300] = Just (EsfUserOrdVal 301)
[401,101,501,301]
esfmDelete a3
showState
esf delete name=2
0. > 1 1..5 [ 100] = EsfUserOrdVal 101
1. > 3 3..5 [ 200] = EsfUserOrdVal 201
2. * 2..2 [ 300] = EsfUserOrdVal 301
3. > 5 5..5 [ 400] = EsfUserOrdVal 401
4. > 4 4..4 [ 200] = EsfUserOrdVal 501
5. > 2 2..2 [ 600] = EsfUserOrdVal 601
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 201. [200] = 501.
a6 (5) = [100] = 101. [300] = 301. [600] = 601.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 501. [200] = 201.
a6 (5) = [100] = 101. [300] = 301. [600] = 601.
MachineStats {ncells = 16, narrays = 5, nelts = 6, nzombies = 0, nfree = 10}
A cascade of elements can be reclaimed by a deletion.
esfmDelete a6
showState
esf delete name=5
0. > 1 1..4 [ 100] = EsfUserOrdVal 101
1. > 2 2..4 [ 200] = EsfUserOrdVal 201
3. > 4 4..4 [ 400] = EsfUserOrdVal 401
4. > 3 3..3 [ 200] = EsfUserOrdVal 501
esfmPrintMachineArrays: arrays represented in machine state:
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 201. [200] = 501.
esfmPrintSpecArrays: arrays according to specification:
esfmEmpty (-1) =
a1 (0) = [100] = 101.
a2 (1) = [100] = 101. [200] = 201.
a4 (3) = [100] = 101. [200] = 201. [400] = 401.
a5 (4) = [100] = 101. [200] = 501. [200] = 201.
MachineStats {ncells = 16, narrays = 4, nelts = 4, nzombies = 0, nfree = 12}
The program also contains a few utility functions that perform tagging and untagging of Int data, converting to and from the EsfEltVal type.
mkInt :: Int -> EsfEltVal Int
mkInt n = EsfUserOrdVal n
getInt :: EsfEltVal Int -> Int
getInt (EsfUserOrdVal n) = n
getInt _ = error "element should be EsfUserOrdVal but isn't"
ESFA is implemented in several layers.
Top layer: macro operations. Each ESFA operation is implemented as a macro-operation defined using the micro operations comprising the middle layer. Every operation is implemented by a small fixed sequence of micro operations; thus the top layer could be implemented by a simple control circuit.
Middle layer: combinators (map, fold, scan). These are operations that act in parallel on every cell in the ESF machine. A parallel map updates each cell by applying a function to the old state, yielding the new state. A parallel fold uses an associative function to combine messages produced from each cell into a single value that is read by the controller. A parallel scan is similar to a fold, but it also updates each cell based on an incoming message.
Bottom layer: parallel implementation of the combinators. The reference implementation is a digital circuit; indeed, the middle layer is chosen because (1) the three combinators are primitive enough that they can be implemented directly in hardware, and (2) they are powerful enough that they can implement all the ESFA operations. However, the combinators can be implemented on other platforms, such as a GPU, multicore CPU, or sequential CPU. Naturally, platforms that provide less parallelism will yield slower implementations of the ESFA operations.
Each cell contains the following fields:
data PtcCellState a = PtcCellState
{ ptcCellId :: !Int,
ptcSelect :: !Bool,
ptcMark :: !Bool,
ptcZombie :: !Bool,
ptcMapDef :: !Bool,
ptcMapCode :: !ACode,
ptcEltDef :: !Bool,
ptcLow, ptcHigh :: !ACode,
ptcInd :: !EltIndex,
ptcValIsArray :: !Bool,
ptcValField :: !(EsfEltVal a) }
During initialization, the cellID fields of the cells are set to 0, 1, 2, ..., and all the other fields are set to False or 0.
An ESF array has a stable name called an AName. This is simply an integer that identifies the array (it is not necessarily the address of the first element of the array, as for conventional imperative arrays). The empty array has an AName = -1.
Each ESF array also has a numeric code, called an ACode. Like an AName, this is an integer that identifies the array. The difference is that the ESF operations frequently need to change the codes for many arrays. The purpose of having an AName is so that the "outside world" can have a stable value to refer to an array. The ACodes are kept within the ESFM and are never sent to the outside world (apart from operations that dump the state, for tracing, instrumentation, and debugging purposes).
If there is an array with AName = n and ACode = c, then cell[n].MapDef = True and cell[n].MapCode = c. Thus we can find out the code for the array with name n using the encode function. The algorithms below are given as pseudo-code; for the full definitions see the src directory.
encode n =
if cell[n].MapDef
then Just (cell[n].mapCode]
else Nothing
The encode function is implemented using a parallel fold.
The update function takes an AName to be updated, and an index and value. It returns the AName of a new array which is just like the argument, except that it has the specified value at the specified index. Thus if b = update a (i,v) then b will satisfy these properties:
Update is a macro operation that takes several steps, each performed by a combinator:
The ACode for a is calculated: acode = encode a. This is done by a parallel fold, and requires one clock cycle. The ACode for the resulting array, called acode', is defined as acode' = acode+1 (the controller may spend a clock cycle doing that addition, or could do that in parallel with the next step).
An empty cell is allocated, the Select flag in that cell is set, and the cellID is returned. This is done in one clock cycle by a parallel scan. The scan (technically this is called a "mapping scan") updates each cell by setting or clearing its Select flag, and it returns a result which is the ID of the selected cell. A simplified way to understand how this works is to imagine an accumulator value scanning across the sequence of cells from left to right, checking whether the cell is empty (EltDef = False), and setting Select in the first such cell. The subtlety in the implementation is that this has to be realised by an associative function, so that the parallel tree circuit can do the entire mapping scan in one clock cycle with a gate delay on the order of log N, where there are N cells.
The existing cells are adjusted. The value of acode is broadcast (using the tree circuit) and each cell adjusts its fields as follows; this is performed by a parallel map and takes one clock cycle:
if EltDef && Low > acode then Low++
if EltDef && High >= acode then High++
if MapDef && MapCode > acode then MapCode++
MapDef = True
MapCode = acode'
EltDef = True
Low = acode'
High = acode'
Index = i
Value = v