My research interests are mainly in the field of programming languages and parallel architectures such as FPGAs, manycore processors and GPGPUs.
Currently, we are working with the following hardware platforms:
- GiDEL PROCStar-IV FPGA board with 4 Altera Stratix-IV FPGAs, 32GB RAM
- NVIDIA GeForce GTX 590 dual GPGPU with 1024 cores (16 compute units), 3GB RAM
- Tilera TILEncore card with 64-core TILEPro64 processor, 16GB RAM
- Intel Xeon Phi Coprocessor 5110P (8GB, 1.053 GHz, 60 cores)
High-level FPGA programming research
The MORA Soft Processor Array
The MORA project is a collaboration with Prof. Martin Margala at University of Massachusetts Lowell, aiming to develop a coarse-grained reconfigurable architecture for multimedia applications. My contribution to the project is the programming model and a novel low-level language to program the architecture; the hardware design and implementation are done by our student Sai Rahul Chalamalsetti. Our collaboration has already led to several publications (ERSA09, AHS09, FPL09).
One objective of this research is to create a soft core version of MORA optimised for FPGAs. We have developed a complete tool chain which compiles programs written in C++ into the MORA assembly language and from there generates an FPGA bitstream. Our findings have been published at ASAP2010 and HPCS2011.
Slides from the seminar on MORA (UMass Lowell 15 Nov 2011, Glasgow 23 Nov 2011)
MORA in the News
For our latest paper we implemented a high-throughput DCT algorithm, using over 1000 processors.
Our Parallel Programming Talk on Intel Software Network TV (April 2011):
"Greener Search": Accelerating Information Retrieval using FPGAs
A very informal talk about FPGAs for Greener Search
Greener Search projects
Over the summer of 2011 we worked with HP on an FPGA-accelerated document filtering application which updated our previous work both in terms of the design and the FPGA hardware (GiDEL PROCStar-IV). The work was published in the International Journal of Reconfigurable Computing and at ISPASS-2012, IEEE International Symposium on Performance Analysis of Systems and Software (© 2012 IEEE).
As part of an EPSRC-KTA project, we are working on an FPGA-accelerated text classification application that can be used e.g. for spam filtering at 10Gb/s line rates.
The FPGA4IR project, a collaboration with Dr Leif Azzopardi of the DCS IR group, aims at developing high-performance FPGA-accelerated Information Retrieval solutions. FPGAs can speed up IR algorithms significantly by exploiting the inherent parallelism.
For this project, sponsored by Matrixware, we use the SGI RC100 FPGA board connected to an SGI Altix 4700 NUMA machine (80 Itanium cores, 320GB memory). For programming the FPGAs we use Mitrion-C, a high-level language developed by Mitrionics.
We have demonstrated order-of-magnitude speed-ups for IR algorithms implemented on the RC100 compared to the same algorithms running on the Itanium. Moreover, the FPGAs consume only a fraction of the power of an Itanium processor (4W compared to 80W). Clearly, FPGA implementations of IR algorithms could make search a lot greener.
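To make the "greener" argument concrete, the short sketch below combines the two power figures quoted above with a speed-up factor. The 10x speed-up is only an assumed stand-in for "order-of-magnitude", not a measured result, and the function names are hypothetical.

```python
# Illustrative energy comparison. The power figures come from the text;
# the speed-up is an ASSUMED value standing in for "order-of-magnitude".
FPGA_POWER_W = 4.0
ITANIUM_POWER_W = 80.0
ASSUMED_SPEEDUP = 10.0  # hypothetical, not a measurement


def energy_joules(power_w: float, runtime_s: float) -> float:
    """Energy consumed by a device drawing power_w watts for runtime_s seconds."""
    return power_w * runtime_s


def energy_ratio(cpu_power_w: float, fpga_power_w: float, speedup: float) -> float:
    """Return how many times less energy the FPGA needs for the same job."""
    cpu_runtime_s = 1.0  # normalised CPU runtime
    fpga_runtime_s = cpu_runtime_s / speedup
    cpu_energy = energy_joules(cpu_power_w, cpu_runtime_s)
    fpga_energy = energy_joules(fpga_power_w, fpga_runtime_s)
    return cpu_energy / fpga_energy


if __name__ == "__main__":
    ratio = energy_ratio(ITANIUM_POWER_W, FPGA_POWER_W, ASSUMED_SPEEDUP)
    print(f"Energy saving factor under these assumptions: {ratio:.0f}x")
```

Under these assumptions the power advantage (20x) and the speed-up (10x) multiply, so the energy per query drops by roughly a factor of 200. The calculation ignores host, memory and I/O power, so it should be read as a rough upper bound rather than a system-level figure.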
The results of this work have been presented at the Information Retrieval Facility Symposium (IRFS 2008) in Vienna, the SIGIR09 conference in Boston and the FPL09 conference in Prague. The papers are here: SIGIR09 (© 2009 ACM), FPL09 (© 2009 IEEE)
To take this work to the next level and demonstrate the capabilities of FPGA-based reconfigurable computing in terms of performance and scalability, we are currently working on implementing IR applications on the Novo-G machine, the world's most powerful reconfigurable supercomputer for research.
Greener Search in the News
High-level FPGA Programming using Hume
The Hume language "is a strongly typed, mostly-functional language with an integrated tool set for developing, proving and assessing concurrent, safety-critical systems. Hume aims to extend the frontiers of language design for resource-limited systems, including real-time embedded and safety-critical systems, by introducing new levels of abstraction and provability." In a collaboration with Heriot-Watt University in the framework of the Islay project, we are exploring several routes for high-level FPGA Programming using Hume.
GPU and multicore programming research
Weather and Climate Modelling
In collaboration with Prof Saji Hameed of the University of Aizu we are working on GPU and multicore acceleration of weather simulations. Our long-term goal is to port the Weather Research & Forecasting Model (WRF) to OpenCL. Our current efforts focus on the FLEXPART-WRF model, a Lagrangian Particle Dispersion Model which is used to model e.g. dispersion of particles from the Icelandic volcanoes.
Our research plan is to create a dedicated refactoring tool for the FORTRAN source, translate the critical portion to C and then port the C code to OpenCL. The first stage, the FORTRAN refactoring tool, is already usable. For FORTRAN-to-C translation we plan to use NOAA's F2C-ACC with a custom postprocessor. The source code for this project is available on GitHub.
Particle Physics Modelling
The aim of this work (in collaboration with Prof David Ireland and his PhD student Stefanie Lewis) is to accelerate a Nested Sampling Markov Chain Monte Carlo (MCMC) algorithm on GPUs. The MCMC is used to find the optimal parameters for a complex particle physics model based on measured data. The source code for this project is available on GitHub.
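For readers unfamiliar with nested sampling, the following is a minimal CPU-only sketch of the idea on a toy problem. It is not the project's code: the one-parameter Gaussian likelihood, the uniform prior on [-5, 5] and all function names are assumptions made for illustration, and plain rejection sampling is swapped in where a real implementation would use a constrained MCMC step.

```python
import math
import random


def log_likelihood(theta: float) -> float:
    """Toy log-likelihood (hypothetical): unit-variance Gaussian centred on 0."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling(n_live=100, n_iter=800, lo=-5.0, hi=5.0, seed=0):
    """Estimate the log-evidence and best-fit parameter for a uniform prior on [lo, hi]."""
    rng = random.Random(seed)
    live = [rng.uniform(lo, hi) for _ in range(n_live)]
    live_logl = [log_likelihood(t) for t in live]
    log_z = -math.inf
    for i in range(n_iter):
        # The live point with the lowest likelihood defines the current contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        l_min = live_logl[worst]
        # Expected prior mass shrinks by a factor exp(-1/n_live) per iteration.
        log_width = math.log(math.exp(-i / n_live) - math.exp(-(i + 1) / n_live))
        log_z = log_add(log_z, l_min + log_width)
        # Replace the worst point by a prior draw inside the contour.
        # Rejection sampling stands in for the constrained MCMC step here.
        while True:
            candidate = rng.uniform(lo, hi)
            if log_likelihood(candidate) > l_min:
                break
        live[worst] = candidate
        live_logl[worst] = log_likelihood(candidate)
    # Add the contribution of the points that are still live at the end.
    log_x_final = -n_iter / n_live
    for ll in live_logl:
        log_z = log_add(log_z, ll + log_x_final - math.log(n_live))
    best = live[max(range(n_live), key=live_logl.__getitem__)]
    return log_z, best
```

Calling `nested_sampling()` returns an estimate of the log-evidence (analytically close to log(0.1) for this toy setup) together with the highest-likelihood parameter found. The independent likelihood evaluations are the part that would typically be offloaded to a GPU.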
Accelerating Search Algorithms
Although we have demonstrated the potential of FPGAs for accelerating information retrieval tasks, GPUs are also good candidates for this type of work. This is an internal project to create an OpenCL kernel for document filtering.
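As a rough illustration of what document filtering involves, here is a small sequential sketch. It assumes a simple scheme in which each document is scored as a weighted sum of term counts against a profile and kept if the score reaches a threshold; this scheme, the function names and the sample data are assumptions for illustration and are not taken from the project.

```python
from collections import Counter
from typing import Dict, List


def score(document: str, profile: Dict[str, float]) -> float:
    """Weighted sum of term counts; terms missing from the profile weigh 0."""
    counts = Counter(document.lower().split())
    return sum(profile.get(term, 0.0) * n for term, n in counts.items())


def filter_documents(documents: List[str], profile: Dict[str, float],
                     threshold: float) -> List[str]:
    """Keep the documents whose score reaches the threshold."""
    return [doc for doc in documents if score(doc, profile) >= threshold]


if __name__ == "__main__":
    profile = {"fpga": 2.0, "search": 1.0}  # hypothetical term weights
    docs = ["FPGA search on FPGA boards", "cooking recipes", "search engines"]
    print(filter_documents(docs, profile, threshold=2.0))
```

Each document is scored independently of the others, which is what would make a one-work-item-per-document mapping a natural starting point for a data-parallel kernel.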
One of my main research topics is novel architectures for heterogeneous many-core Systems-on-Chip, with a focus on high-level programmability. This research was supported by an EPSRC Advanced Research Fellowship grant. Check out my research blog for the latest news.
Networks-on-Chip are an integral part of today's Systems-on-Chip and multi/many-core architectures.
- Mahmoud Moadeli worked on analytical performance modelling of NoCs, in particular the novel Quarc architecture, initially inspired by and improving on the Spidergon topology.
- Faiz-ul Hassan (co-supervision with Dr Fernando Rodriguez) works on low-level modeling of NoC links in decanano CMOS technologies.
A number of EngD students work on various aspects of reconfigurable hardware:
- Graham Milligan's thesis on dynamically reconfigurable Finite State Machines;
- Waqar Nabi has developed a coarse-grain reconfigurable architecture for power-efficient implementation of multiple wireless communications protocols. He pursues similar lines of research in his current function at Namal College (an associate college of University of Bradford) in Pakistan.
- Paul Mckechnie worked with Xilinx on validation and verification of the interconnection of hardware intellectual property blocks. His work aimed at reducing design errors in reconfigurable systems, covering aspects such as type checking, formal verification and instrumentation;
- Haitham Fattah works with Codeplay on compiler technologies for C++ that exploit the inherent parallelism in FPGAs.
Previous research (at Strathclyde University)