
Difficulties in the use of persistence

In a properly developed persistent programming language, the transfer of data between long-term storage and the computer's volatile memory is fully automated, which relieves the programmer of that burden. Data structures declared in a program automatically persist between program invocations and may be exchanged between different programs, without the application programmer having to write any instructions to store data on non-volatile media. In addition, a garbage collection facility automatically recovers memory locations occupied by data that is no longer wanted.

However, the most popular programming languages of today, such as C++ or Pascal, do not support persistence. Although it is in principle possible to produce specially tailored translation systems for these languages in order to allow them to have full persistence in the sense described above, the development and marketing of new language translation systems is very expensive.

There are two principal difficulties in providing full persistence for such languages.

  1. Such languages provide no support for automatic storage recovery (garbage collection). Instead, the programmer must make explicit calls to a de-allocation procedure to free memory. Whilst the basic technique of garbage collection is well known [Bishop77][Almes80], it relies upon the language run-time system being able to find reliably which memory locations hold pointers to other locations. In the current state of the art, this requires either special-purpose computers in which memory words are tagged to distinguish pointers from non-pointers or, if standard computers are used, a regular and disciplined placement of pointers in the computer's memory. A popular technique is to ensure that all of the pointers in an object are adjacent to one another at the start of the object and are preceded by a word which encodes the number of pointers that the object contains.

    Popular programming languages like C, C++ or Pascal allow the programmer complete freedom in deciding where pointers are stored within a data structure, which precludes the regular organisation that a garbage collector needs.

  2. In object-oriented programming, data objects commonly contain pointers to a data structure called a virtual method table. The virtual method table is a table of the procedures that are permitted to operate on the object. A problem with providing persistence for languages like C++ and Object Pascal is that the machine address at which this virtual method table resides may change between successive versions or runs of a program. Objects created at an earlier time may therefore hold invalid pointers to virtual method tables, which precludes simply copying objects in from disk when they are needed.
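The disciplined layout described in the first difficulty can be illustrated with a minimal C++ sketch. It is not the layout of any particular system: the names `Object`, `slots`, `allocate` and `mark` are hypothetical, and the sketch assumes a conventional platform on which a `std::size_t` header word leaves the following pointer slots correctly aligned. Each object begins with a word holding its pointer count, the pointers follow immediately, and any non-pointer payload comes last, so a collector can trace the object graph without knowing the declared type of any object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <unordered_set>
#include <vector>

// Hypothetical header: one word giving the number of pointer slots that
// follow it. Non-pointer payload bytes come after the pointer slots.
struct Object {
    std::size_t pointer_count;
};

// Assumption: the header size keeps the following pointer slots aligned.
static_assert(sizeof(Object) % alignof(Object*) == 0,
              "pointer slots must be aligned directly after the header");

// The pointer slots start immediately after the header word.
inline Object** slots(Object* obj) {
    return reinterpret_cast<Object**>(obj + 1);
}

// Allocate an object with the given number of pointer slots and payload
// bytes. All pointer slots start out null.
inline Object* allocate(std::size_t pointer_count, std::size_t payload_bytes) {
    std::size_t bytes =
        sizeof(Object) + pointer_count * sizeof(Object*) + payload_bytes;
    void* raw = std::malloc(bytes);
    assert(raw != nullptr);
    Object* obj = new (raw) Object{pointer_count};
    for (std::size_t i = 0; i < pointer_count; ++i) {
        new (slots(obj) + i) Object*(nullptr);
    }
    return obj;
}

// Mark phase of a tracing collector: visit everything reachable from root.
// It needs only the header word, never the object's declared type.
inline void mark(Object* root, std::unordered_set<Object*>& marked) {
    std::vector<Object*> pending{root};
    while (!pending.empty()) {
        Object* obj = pending.back();
        pending.pop_back();
        // Skip null pointers and objects that were already visited.
        if (obj == nullptr || !marked.insert(obj).second) {
            continue;
        }
        Object** s = slots(obj);
        for (std::size_t i = 0; i < obj->pointer_count; ++i) {
            pending.push_back(s[i]);
        }
    }
}
```

An ordinary C++ `struct` or Pascal record carries no such header and may interleave pointers with other fields in any order, so a routine like `mark` has no reliable way to tell which words of the object are pointers.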

Two further well-known issues in persistent programming are:

  1. A large virtual address space may cause disk thrashing to occur when space is being allocated.
  2. A mechanism is required to ensure that the store can be rolled back to a correct state following an error.

The persistence management system that we have implemented has features designed to overcome both of these.






W Paul Cockshott
Fri Sep 6 10:29:18 BST 1996