next up previous
Next: Pointers to virtual Up: Data Structures Used Previous: The persistent heap

A garbage collector

The storage organisation desribed above has the features required to allow garbage collection of languages like C or pascal. The key step consists firstly of recognising that, on a computer with N bit addressing, valid addresses for objects comprise a subset of the the N bit integers. If some algorithm can be devised to computationally distinguish between those integers that could and those that could not be valid addresses, then, there is no need to tag or otherwise extraneously distinguish pointers from other machine words. Secondly, it consists of devising an organisation of objects in computer memory, such that a couple of table lookup operations can determine if an arbitrary N bit integer is in fact a valid object address.The key features are:

  1. Allocating one or more ranges of addresses to be used to hold persistent objects. Call these ranges heaps.
  2. Constructing a table indexed on high order address bits indicating if a given address falls within a heap.
  3. Ensuring that all valid objects within a given heap occur at multiples of a heap-specific stride through the address space.
  4. Placing immediately before the start of each object a marker byte that can be used to distinguish allocated from unallocated memory blocks.

Thus the table described in point 2 can be used to determine if the upper bits of a 32 bit integer are a valid address, whilst the information provided in points 3 and 4 can be used to check the lower bits. It should be noted that the use of steps 1 to 3 would itself be sufficient to create a reliable garbage collector - one that will not delete any object that ought to be retained. Step 4 is added for efficiency reasons only, to prevent the garbage collector wasting time scanning blocks of memory that have not currently been allocated.

The garbage collector proceeds as follows. It has access to the stack and data segments of the program, along, possibly, with one or more distinguished pointers that are designated roots of persistence. It scans the stack and data segment of the computer and for each memory location, it tests to see if the word at that location is

  1. Within one of the heaps.
  2. If it is, whether it corresponds to the starting point of an object in the heap.
It can do this because the start and end addresses of all of the heaps are known to it, and, within any one heap all items start at multiples of a known block-size from the start of the heap.

Having found an object it is marked, using a byte reserved for that purpose at an address one less than the starting point of the object, and the above scanning procedure is applied recursively to the internals of the object. The maximum possible size of the object is known to the garbage collector because of the known blocksizes of the heaps.

Once the objects are found, what we have is a conventional mark and sweep garbbage collector.



next up previous
Next: Pointers to virtual Up: Data Structures Used Previous: The persistent heap



W Paul Cockshott
Fri Sep 6 10:29:18 BST 1996