Scalability and Reliability Limitations of ROS

A Look at the Scalability and Reliability Limitations of the Robot Operating System

Analysing ROS Distribution Capabilities (June 9, 2016)

Over the course of summer 2016, I'm going to be interning at the University of Glasgow, working on an exciting robotics project run by my mentors and supervisors: Dr. Natalia Chechina, Dr. Gerardo Aragón-Camarasa, and Professor Phil Trinder. The main goal is to thoroughly investigate the performance of the Robot Operating System (commonly known as ROS) in large-scale multi-robot environments, and to determine its limitations with regard to scalability and fault-tolerance.

In essence, the project aims to conduct the first in-depth analysis of just how large multi-robot systems running atop ROS can grow before task efficiency degrades (i.e. when inter-robot communication becomes too resource-intensive, message latencies become intolerably large, or the system cannot recover from partial or total failures within an acceptable time frame). To do that, we'll be using a set of 9 Sunfounder video car robots powered by Raspberry Pi 3 Model B boards, which will communicate and assign roles to each other while solving a collaborative task. Note that this research does not target swarm robotics, which usually involves primitive robots with no OS, but rather a set of complex, autonomous, and collaborative robots.

Previous work

The idea for this project was born out of previous work I conducted under the supervision of the same researchers: a comparative analysis of the scalability and reliability of a typical ROS system with one sensor (a pan-tilt-zoom camera) and an equivalent system in which the ROS middleware was replaced by a custom-built Erlang-based communication framework. This work constituted my final year undergraduate project at the university and aimed not only to highlight but also to prove, with concrete metrics, that Erlang has great potential within the field of robotics. The focus was on showcasing how Erlang's excellent fault-tolerance and concurrency make it a very attractive choice for implementing robotic middlewares, in particular for large-scale robotic systems with high availability requirements.

You can find a video demonstration of the project on YouTube, the full text of my dissertation, "Real-time Robot Camera Control in Erlang", and the supporting code on GitHub.

The project was a success and resulted in the research paper "Towards Reliable and Scalable Robot Communication", to be presented at the ACM SIGPLAN Erlang Workshop 2016 in Nara, Japan. However, before going any further with integrating ROS and Erlang as part of the larger ROSIE project, the actual scalability and reliability limitations of ROS need to be established first. This is where my internship fits in.

What are the real-life implications of the scalability and reliability findings?

Apart from its research background, the internship's goals also draw inspiration from real-life scenarios, where groups of robots are sent in to perform tasks cooperatively in environments that are hazardous to humans. Some examples are robots deployed in natural disaster sites, radioactively contaminated zones, or inhospitable planets and moons (e.g. Mars, Venus, Titan), where devices are bound by memory and power consumption limitations.

By electing a single leader to run the more power-hungry sensors and routines and to coordinate the actions of the workers via message-passing, the group of robots can conserve its total energy while still solving tasks efficiently and adopting appropriate formations for different terrain types. The tasks I keep mentioning are not yet concretely determined for the project's upcoming evaluation, but we'll discuss them more as the internship moves along, based on what hardware we have and what interesting software libraries (both ROS and otherwise) we find.
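
To make the leader/worker idea a bit more concrete, here is a minimal Python sketch. It is not the project's election scheme, which has not been decided at this point. It simply assumes that each robot can report a remaining battery level (the `battery` field, the `Robot` class, and both function names are hypothetical) and picks the robot with the most charge as leader, leaving the rest as workers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Robot:
    name: str
    battery: float  # assumed metric: remaining charge between 0.0 and 1.0


def elect_leader(robots):
    """Return the robot with the most remaining charge.

    Ties are broken by name (the alphabetically last name wins), so every
    robot that sees the same list reaches the same decision.
    """
    if not robots:
        raise ValueError("cannot elect a leader from an empty group")
    return max(robots, key=lambda r: (r.battery, r.name))


def assign_roles(robots):
    """Map each robot name to 'leader' or 'worker' (names assumed unique)."""
    leader = elect_leader(robots)
    return {
        r.name: "leader" if r.name == leader.name else "worker"
        for r in robots
    }
```

In a real deployment the battery readings would have to be exchanged over the network first, and the election would need to be re-run whenever the leader fails or drains its battery, which is exactly where the scalability and fault-tolerance questions come in.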

First steps

For now the initial assignment is building the 9 robots and wiring the electronics without frying any of the component boards! :) Truth be told, though, I do have some practice with this part, as I've already finished the assembly and configuration of one of the robots, little Terra:

Terra, the first of the 9 Sunfounder Video Cars I'm going to assemble. The Pi is not screwed in yet, to make the HDMI port access a little easier.

The next post will be an in-depth picture guide of the unboxing and assembly of Terra's first twin, Mercury, with tips on how to make the tricky screwing of some components easier.
