My research interests are mainly in the field of programming languages and parallel architectures such as FPGAs, manycore processors and GPGPUs.
Currently, we are working with the following hardware platforms:
- GiDEL PROCStar-IV FPGA board with 4 Altera Stratix-IV FPGAs, 32GB RAM
- Nallatech P385-D5 OpenCL FPGA board
- Maxeler Maia FPGA board
- NVIDIA GeForce GTX 590 dual GPGPU with 1024 cores (16 compute units), 3GB RAM
- NVIDIA GeForce GTX TITAN Black GPGPU with 2880 CUDA cores (15 compute units), 6GB RAM
- Tilera TILEncore card with 64-core TILEPro64 processor, 16GB RAM
- Intel Xeon Phi Coprocessor 5110P (8GB, 1.053 GHz, 60 cores)
- Access to the 3408-core ARCHIE-WeSt supercomputer as well as a local 300-core cluster and 64-core NUMA machine
High-level FPGA programming research
Exploiting Parallelism through Type Transformations for Hybrid Manycore Systems (TyTra)
This collaborative 5-year EPSRC project aims to build compilers for heterogeneous platforms, including FPGAs, using multi-party session types for compiler-based program transformations. See the TyTra project web site for more info.
A high-level presentation about FPGAs for HPC:
TyTra in the News
The MORA Soft Processor Array
The MORA project is a collaboration with Prof. Martin Margala at University of Massachusetts Lowell, aiming to develop a coarse-grained reconfigurable architecture for multimedia applications. My contribution to the project is the programming model and a novel low-level language to program the architecture; the hardware design and implementation are done by our student Sai Rahul Chalamalsetti. Our collaboration has already led to several publications (ERSA09, AHS09, FPL09).
One objective of this research is to create a soft core version of MORA optimised for FPGAs. We have developed a complete tool chain that compiles programs written in C++ into the MORA assembly language and from there generates an FPGA bitstream. Our findings have been published at ASAP2010 and HPCS2011.
Slides from the seminar on MORA (UMass Lowell 15 Nov 2011, Glasgow 23 Nov 2011)
MORA in the News
For our latest paper we implemented a high-throughput DCT algorithm, using over 1000 processors.
Our Parallel Programming Talk on Intel Software Network TV (April 2011):
"Greener Search": Accelerating Information Retrieval using FPGAs
Our current work focuses on the use of high-level programming solutions such as OpenCL to ease the implementation of "Greener Search" applications.
A very informal talk about FPGAs for Greener Search
Greener Search projects
Over the summer of 2011 we worked with HP on an FPGA-accelerated document filtering application which updated our previous work both in terms of the design and the FPGA hardware (GiDEL PROCStar-IV). The work was published in the International Journal of Reconfigurable Computing and at ISPASS-2012, IEEE International Symposium on Performance Analysis of Systems and Software (© 2012 IEEE).
As part of an EPSRC-KTA project, we are working on an FPGA-accelerated text classification application that can be used e.g. for spam filtering at 10Gb/s line rates.
The FPGA4IR project, a collaboration with Dr Leif Azzopardi of the DCS IR group, aims at developing high-performance FPGA-accelerated Information Retrieval solutions. FPGAs can speed up IR algorithms significantly by exploiting the inherent parallelism.
For this project, sponsored by Matrixware, we use the SGI RC100 FPGA board connected to an SGI Altix 4700 NUMA machine (80 Itanium cores, 320GB memory). For programming the FPGAs we use Mitrion-C, a high-level language developed by Mitrionics.
We have demonstrated order-of-magnitude speed-ups for IR algorithms implemented on the RC100 compared to the same algorithms running on the Itanium. Moreover, the FPGAs consume only a fraction of the power of an Itanium processor (4W compared to 80W). Clearly, FPGA implementations of IR algorithms could make search a lot greener.
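To make the combined effect of speed-up and power draw concrete, the following is a minimal back-of-the-envelope sketch. It uses the 4W and 80W figures quoted above and takes a speed-up of 10 purely as an illustrative assumption for "order-of-magnitude"; the function name is hypothetical, and the estimate ignores the power drawn by the host system and memory.

```python
def energy_ratio(speedup, cpu_power_w=80.0, fpga_power_w=4.0):
    """Return how many times more energy the CPU uses than the FPGA for one task.

    Energy is power multiplied by run time. If the FPGA finishes the task
    `speedup` times faster, its run time is cpu_time / speedup, so the CPU
    time cancels out of the ratio.
    """
    return (cpu_power_w / fpga_power_w) * speedup


# Assumed speed-up of 10x, chosen only for illustration.
ratio = energy_ratio(10.0)
print(f"CPU energy / FPGA energy = {ratio:.0f}")
```

Under these assumptions the power advantage (a factor of 20) and the speed-up multiply, which is why even a modest speed-up translates into a large reduction in energy per query.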
The results of this work have been presented at the Information Retrieval Facility Symposium (IRFS 2008) in Vienna, the SIGIR09 conference in Boston and the FPL09 conference in Prague. The papers are here: SIGIR09 (© 2009 ACM), FPL09 (© 2009 IEEE)
To take this work to the next level and demonstrate the capabilities of FPGA-based reconfigurable computing in terms of performance and scalability, we are currently working on implementing IR applications on the Novo-G machine, the world's most powerful reconfigurable supercomputer for research.
Greener Search in the News
GPU and multicore programming research
Weather and Climate Modelling
In collaboration with Prof Saji Hameed of the University of Aizu we are working on GPU and multicore acceleration of weather simulations. Our long-term goal is to port the Weather Research & Forecasting Model (WRF) to OpenCL. Our current efforts focus on the FLEXPART-WRF model, a Lagrangian Particle Dispersion Model which is used to model e.g. dispersion of particles from the Icelandic volcanoes. We have successfully ported FLEXPART to OpenCL and published the result at IWOCL'14.
Our approach is to refactor the FORTRAN source with a dedicated refactoring tool, translate the critical portion to C and then port the C code to OpenCL. The first stage, the FORTRAN refactoring tool, is already usable. For FORTRAN-to-C translation we plan to use NOAA's F2C-ACC with a custom postprocessor. The source code for this project is available on GitHub.
Particle Physics Modelling
The aim of this work (in collaboration with Prof David Ireland and his PhD student Stefanie Lewis) is to accelerate a Nested Sampling Markov Chain Monte Carlo (MCMC) algorithm on GPUs. The MCMC is used to find the optimal parameters for a complex particle physics model based on measured data. The source code for this project is available on GitHub.
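For readers unfamiliar with the technique, here is a minimal sketch of nested sampling with an MCMC replacement step. It is not the project's code: the one-parameter Gaussian likelihood, the uniform prior on [0, 1], the function names and all tuning values are assumptions chosen to keep the example small.

```python
import math
import random


def log_likelihood(theta):
    """Toy Gaussian log-likelihood (hypothetical stand-in for the physics model)."""
    return -0.5 * ((theta - 0.5) / 0.1) ** 2


def nested_sampling(n_live=100, n_iter=800, mcmc_steps=20, seed=0):
    """Estimate the evidence Z for a uniform prior on [0, 1]."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    logl = [log_likelihood(t) for t in live]
    evidence = 0.0
    x_prev = 1.0  # prior volume enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        # The live point with the lowest likelihood defines the new contour.
        worst = min(range(n_live), key=lambda j: logl[j])
        l_min = logl[worst]
        x_cur = math.exp(-i / n_live)  # expected shrinkage of the prior volume
        evidence += math.exp(l_min) * (x_prev - x_cur)
        x_prev = x_cur
        # Replace the worst point: start from another live point and take a
        # short random walk that only accepts moves inside the contour.
        step = 0.5 * (max(live) - min(live))
        start = rng.choice([j for j in range(n_live) if j != worst])
        theta = live[start]
        for _ in range(mcmc_steps):
            proposal = theta + rng.gauss(0.0, step)
            if 0.0 <= proposal <= 1.0 and log_likelihood(proposal) > l_min:
                theta = proposal
        live[worst] = theta
        logl[worst] = log_likelihood(theta)
    # Add the contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in logl) / n_live
    return evidence


print(nested_sampling())
```

For this toy likelihood the evidence can be computed analytically (about 0.25), so the estimate can be checked, allowing for statistical scatter. The inner random walks for different live points are independent of each other, which is the kind of work that maps naturally onto a GPU.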
Accelerating Search Algorithms
Although we have demonstrated the potential of FPGAs for accelerating information retrieval tasks, GPUs are also good candidates for this type of work. This is an internal project to create an OpenCL kernel for document filtering.
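As an illustration of why document filtering parallelises well, here is a minimal sketch in Python rather than OpenCL. The scoring scheme (summing per-term profile weights and comparing against a threshold), the function names and the sample data are all assumptions made for the example, not a description of the actual kernel.

```python
def score_document(doc_terms, profile):
    """Sum the profile weights of all terms in a document (unknown terms score 0)."""
    return sum(profile.get(term, 0.0) for term in doc_terms)


def filter_documents(docs, profile, threshold):
    """Return the ids of documents whose score reaches the threshold.

    Each document is scored independently of the others, so in a GPU
    version every work-item could handle one document.
    """
    return [doc_id for doc_id, terms in docs.items()
            if score_document(terms, profile) >= threshold]


# Hypothetical profile and documents, for illustration only.
profile = {"fpga": 2.0, "search": 1.5, "spam": -1.0}
docs = {
    "d1": ["fpga", "search", "fpga"],
    "d2": ["spam", "search"],
    "d3": ["weather"],
}
print(filter_documents(docs, profile, threshold=2.0))
```

The design point the sketch is meant to show is that the profile is read-only and shared, while the per-document work has no dependencies between documents.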