This paper was translated from a LaTeX source - it still needs to be polished so please contact me if you would like a paper copy.

Literate Specifications

Chris Johnson

Department of Computing Science,
University of Glasgow,


The increasing scale and complexity of software is imposing serious burdens upon many industries. Formal notations, such as Z, VDM and temporal logic, have been developed to address these problems. There are, however, a number of limitations that restrict the use of mathematical specifications for large-scale, software development. Many projects take months or years to complete. This creates difficulties because abstract, mathematical requirements cannot easily be used by new members of a development team to understand past design decisions. Formal specifications describe what software must do, they do not explain why it must do it. In order to avoid these limitations, we propose a literate approach to software engineering. This technique integrates a formal specification language and a semi-formal design rationale. The Z schema calculus is used to represent what a system must do. In contrast, the Questions, Options and Criteria (QOC) notation is used to represent the justifications that lie behind development decisions. We then present empirical evidence which suggests that the integration of these techniques provides significant benefits over previous approaches to mathematical analysis and design techniques. The paper closes by describing a range of tools that have been developed to support our literate approach to the specification of large-scale software systems.


The increasing scale and complexity of software systems imposes serious burdens upon engineering teams in many industries (Craigen:93,Austin:93). Formal methods have been used to ease these burdens (Bowen:93). They can strip out low-level implementation details during the early stages of development (Johnson:92d). This enables designers to focus in upon more `significant' properties of a potential implementation (Dix:91). In consequence, formal methods are increasingly being used in the development of large-scale applications. For instance, NASA are applying mathematical specification techniques to represent and reason about a range of on-board communications systems (NASA:89). Formal analysis is a necessary part of many design projects commissioned by the UK Ministry of Defence (MOD:91). Bowen and Hinchey demonstrate that many international standards now require the use of formal methods during the specification and design stages of software development (Bowen:94). As a result, formal methods are being used to specify test cases for conformance testing both with international standards and with company quality guidelines (Carrington:94).

Unfortunately, a number of limitations restrict the application of formal methods to support software engineering. Austin and Perkin's survey of four hundred and fourty-four commercial and academic software engineers provides detailed evidence about these weaknesses (Austin:93). The following table presents the percentage of respondents who felt that a number of issues were barriers to the commercial development of formal methods:

The specification is not readable by the clients. 		23%

Some aspects of specification are difficult to define 
in a mathematical model: timing constraints; HCI; 
performance;  reliability; maintainability; availability...	21%

A specification will not model all aspects of the real 
world: hardware; environment...					19%

Lack of experienced staff  					18%

To use a formal method you have to do proofs which 
dramatically increases the development time (and hence
cost)  								16%

Development costs increase  					15%

Mistakes can be made in the specification  			14%

From this analysis it is clear that many of the limitations of formal methods relate to the problems of communication. Mathematical specifications cannot easily be understood by clients. It is hard to train staff to develop specifications. It is difficult to avoid mistakes in specifications. Many of these problems stem from the fact that while formal specification techniques provide powerful means of representing the intended behaviour of a software system, they cannot easily be used to describe why that behaviour is appropriate. This prevents regulatory authorities, clients and other designers from understanding the motivation behind particular development decisions.

A number of further problems restrict the utility of formal methods for large-scale software engineering. Abstract, mathematical notations cannot easily be used to coordinate the activities of different development teams. This is a significant issue because team-based management structures are, typically, used to organise software development (Klein:93). These groups can be formed around specific sub-systems. Alternatively, they can be organised around different skill-bases; human factors groups must coordinate with systems engineering teams. In either case, many of the people in these teams will have little knowledge of discrete mathematics. This prevents them from understanding the implications that a formal specification may have for their development tasks. The use of a mathematical notation, therefore, jeopardises communication both within and between design teams (Johnson:95z). The extent of this problem has also been recognised by industry. Craigen, Gerhart and Ralston's international survey on the commercial application of formal methods concluded that there must be ``added emphasis on developing notations more suitable to use by individuals not expert in formal methods or mathematical logic'' (Craigen:93).

Design Rationale

A number of semi-formal notations have been developed to avoid the communications problems that restrict the utility of approaches such as Z, VDM and temporal logic (Johnson:93c). Many of these notations have also been used to explicitly represent the justifications that lie behind behind particular development decisions. For instance, the graphical Issue Based Information System (gIBIS) uses a complex web of issues, positions and arguments to record the rhetorical events in a design meeting (Conklin:89).

Figure 1: The Structure of gIBIS diagrams.

A limitation with this approach is that it can be extremely time consuming to develop models of the many different arguments that take place during software design meetings. gIBIS, therefore, involves considerable expense in terms of the recording process and in terms of the subsequent analysis. In contrast to the gIBIS approach, Figure 1 illustrates the simpler syntax of the Questions, Options and Criteria (QOC) notation (Shum:95).

Figure 2: A Simple QOC diagram.

These diagrams are built by identifying the key questions that must be addressed during the development of an interactive system. The options that answer a particular question can then by linked to it. Finally, the criteria that support those options are linked by either solid or dotted lines. The solid lines indicate supporting criteria. The broken lines indicate criteria that count against a particular option.

Design rationale offers a number of advantages for software development. Shum argues that semi-formal notations provide the flexibility that is required to represent the many different forms of decisions that have to made during the development life-cycle (Shum:91). A further advantage is that designers are not forced to learn complex syntax and composition rules (Johnson:95z). Klein argues that this the flexibility and informality of design rationale supports the communication of design decisions between the members of concurrent development teams (Klein:93). MacLean, Young, Bellotti and Moran regard the arguments that support decisions as the `co-products of design' (MacLean:91). If the justifications that support crucial decisions are explicitly represented then they can be shown to new members of development teams (Conklin:89). Yakemovic and Conklin argue that this helps to clarify the reasons for past development decisions (Yakemovic:90). Design rationale can also be shown to clients and regulatory authorities as part of the documentation for safety-critical software (Davies:85). It can help to demonstrate that sufficient attention has been paid throughout the development of large-scale software projects. This paper argues that these benefits must be recruited to support the use of formal methods in large-scale software development.

Not only is there an argument for supporting formal specifications with design rationale techniques. These is also a need to recruit the precision and rigour of mathematical specifications to support semi-formal notations, such as QOC and gIBIS. Design rationale lacks the precision that is required during the development safety-critical systems. For example, the natural language that is used to annotate QOC diagrams can easily be misinterpreted. Confusion often occurs when different development teams use the same terms to refer to different data items or sections of code. This can easily happen when QOC diagrams are produced without automated support for data dictionaries (Johnson:94). Semi-formal design rationale notations can also be inconsistent. This occurs if the same criterion is used both to support and weaken an option in different contexts (Johnson:95h).

In order for any integration of formal methods and design rationale to be successful it is critically important to minimise the additional burdens that will be placed upon the end-users of the technique. A frequent weakness of formal methods research is that it pays little regard to the software engineers who must eventually learn to apply their techniques (Johnson:95b). With this is mind, we have decided to exploit the QOC notation rather than the gIBIS design rationale. There are a number of reasons for this. The simpler syntax of the QOC notation makes it easier to construct and maintain than gIBIS diagrams. QOC also has a simpler semantics. It may, therefore, be easier to learn.

Literate Specifications

Formal methods and design rationale provide powerful and complementary means of supporting large-scale software development. These two approaches might be integrated by using an appropriate formalism to represent the requirements for a system and the justifications that support those requirements. For instance, deontic logics can be used to represent those elements of a specification that `must' or `may' be satisfied (Sergot:95). Additional argumentation structures can be used to formalise the reasons why one requirement `must' be satisfied while another `may' be accepted. Unfortunately, this approach does not resolve the communications problems that restrict the use of formal methods in large-scale software engineering. Designers would still be forced to understand the non-trivial syntax and semantics of the underlying formalisms. In contrast, this paper introduces a literate approach to software specification. This extends Knuth's literate programming techniques (Knuth:84). The earlier approach explicitly linked natural language documentation to program listings (Thimbleby:90a). This was unstructured and provided little guidance about the sort of information that should be included in the literate part of software documentation. In contrast, our techniques link semi-formal design rationale to the mathematical abstractions of formal specifications. This is a significant development of Knuth's earlier approach because design rationale provides guidance about the sort of information that should accompany a formal specification. The elements of the QOC notation also provide means of structuring the justifications that support critical development decisions.

The argumentation in design rationale provides a means of communicating the findings of formal analyses to the members of concurrent design teams. The use of natural language within a graphical notation also provides the degree of flexibility that is necessary in order to communicate the justifications that lie behind particular design options. Formal specifications provide the degree of precision that is required by those who are responsible for programming an application.

Figure 3: Overview Of The Literate Specification Technique

Figure 3 provides an overview of our approach to the literate specification of large-scale, software systems. Our proposed approach to the integration of design rationale and formal specification requires two additional forms of support if it is to be successfully used by `real-world' engineers on the development of `real-world' software. The first is empirical evidence that these techniques can be successfully used by today's generation of programmers and designers. We are, therefore, anxious to provide empirical evidence that demonstrates the `usability' of literate specifications. The second form of support for our approach is provided by the development of appropriate tools. This is a critical issue because 43\% of the respondents to Austin and Parkin's survey indicated that the lack of available tools were a significant barrier to the use of formal methods (Austin:93). Traditionally, formal methods research has focussed upon the provision of theorem provers and type checkers. In contrast, we are anxious to provide systems that support the communication of mathematical requirements and not simply their formal analysis or generation.

An Example Specification

The argument in this paper is illustrated by requirements for a reactor feedwater control system. This application exhibits many of the problems that complicate large-scale software development. As computer systems gradually replace the hardware interlocks, designers are increasingly faced with the challenge of implementing safe and usable control programs. Our choice of application is also justified by the observation that the development of a feedwater system requires close cooperation between multi-disciplinary design teams. These applications have posed significant challenges for both systems designers (Siu:91) and human factors specialists (Kirkpatrick:81). It is against this context of concurrent engineering that we will assess the utility of our literate specification techniques.

The requirements for the control system can be briefly described as follows. Rupture accidents occur when contaminated primary coolant escapes from steam generator tubes into the containment. A warning system should detect contamination as coolant accumulates in sumps. An alarm should then be raised. The leakage will result in a fall in the level and pressure in the system's pressuriser. A reactor trip should then occur. Following this the operator must: establish a secondary heat sink; identify and isolate the faulty steam generator; cool and depressurize the primary coolant system; restore long term cooling. In order for safety to be preserved, designers must integrate the monitoring and detection systems with the user interface to the control applications. This is clearly a complex task given the many hundreds of variables that are sampled during coolant failures (Johnson:95z).

The rest of this section briefly introduces formal requirements for our example system. Readers who are more interested in the development of literate specifications than the details of our specification should move to the next section. The feedwater application can be described in terms of a number of parameters. The cntmnt_sump_level describes the depth of fluid in a containment sump. The pump_a_press represents the output pressure produced by pump A:

PARAMETER_IDS = {cntmnt_sump_level, pump_a_press,...}

Operations can be performed on the elements of PARAMETER_IDS. For instance, max and min values define a permissible range of values for each parameter. Automated systems and operators may request a new target value for a parameter; set:

| set, value, max, min: PARAMETER_IDS -> Z
| user_input: power TEXT
| system_input: power TEXT
| forall p : PARAMETER_IDS o max(p) >= value(p) >= min(p)

The user interface to the feedwater control system requires the close cooperation of systems engineers and human factors experts. The former group must provide software and hardware to supply all necessary process information. The latter group must devise appropriate presentation techniques so that operators can read and understand the information which is presented to them. Requirements for both these groups can be represented using the abstractions of the Z specification language. For example, a sump_levels display can be used to represent information about the contamination of liquid accumulating within the drainage system. The image presented to an operator can be described in terms of a set of display formats:

DISPLAY_FORMATS = {sump_levels, steam_circuit,...}

| display_text: TEXT <-> DISPLAY_FORMATS
| image: power DISPLAY_FORMATS

Feed_System == Feed_System_Display ^ Feed_System_Application

An important requirement for our example system is that a warning is displayed if the system detects an attempt to set a variable below its authorised limit. The following schema provides a common focus for several different development teams. Software engineers must ensure that the system can detect and respond to a low setting. Human factors specialists must determine whether operators will actually be able to observe and understand the warnings specified in schemas such as Low_Set:

| Xi Feed_System_Application
| Delta Feed_System_Display
| param? : PARAMETER_IDS
| new_setting? : Z
| new_setting? < min(param?)
| exists format : image' o
| (``Setting of " cat name(param?) cat`` low", format) in display_text'
| display_params' = display_params
| display_text' = display_text

It is important to emphasise that mathematical specifications support communication between designers who have a clear understanding of the formal language. Discrete mathematics provides a vocabulary with clear semantics and well defined operators. Unfortunately, this strength can also be a weakness if other designers, clients and regulatory authorities only possess a limited knowledge of the formal language.

The Weaknesses Of Formal Methods

A number of barriers prevent software engineers from successfully applying formal methods within large-scale development projects. Some of these limitations arise from the technical problems of constructing complex, mathematical specifications. These issues are widely discussed in the literature (Barringer:85a,GrayJohnson:95,Moszkowski:93). In contrast, this section concentrates upon the organisational problems that complicate the use of formal methods within multi-disciplinary design teams.

Lack Of Vernacular Descriptions

The previous schemas illustrate many of the problems that limit the application of formal methods to support large-scale software development. It can be hard for software engineers without a detailed knowledge of discrete mathematics to understand the implications that schemas, such as Low_Set_Response, have for their development tasks. Z attempts to address this problem by stipulating that informal descriptions are a necessary part of any specification (Spivey:92). Unfortunately, this requirement is often ignored (Johnson:95z). One means of ensuring that vernacular descriptions are included within formal specifications is to introduce prose into the syntactic structures that are used to organise abstract requirements. For example, schemas might be grouped with an informal description:

|Delta Feed_System
|input_text? : TEXT
|text_to_num : TEXT -> Z
|exists format : image; setting : TEXT o
| (``Setting of " cat name(param?) cat``out of range", format) in display_text =>
| (input_text? = name(param?) cat setting) in user_input ^
| max(param) >= text_to_num(setting) >= min(param)
| system_input' = system_input

This states that if a warning about an out of bounds setting for a process parameter is part of the displayed text then the user inputs a string that identifies the parameter and alters the setting to an admissible value.

Although this syntactic grouping helps to communicate the requirements that are represented in specifications, it does not provide a panacea for the use of formal methods within large development teams. It states what system software must do, it does not explain why it must do it. This problem can be avoided by using formal notations, such as deontic logics, to represent the arguments that support particular requirements. Unfortunately, this introduces further communications problems for designers who do not understand the mathematical notations that are used to represent the justifications behind formal requirements.

Figure 4: Linking Vernacular Argumentation And Formal Requirements

In contrast, our approach to the development of literate specifications avoids this problem by using vernacular and semi-formal argumentation structures to support the abstract requirements of formal methods. For instance, Figure 4 illustrates how a QOC rationale might be used to support the technique described in User_Set_Response. The requirement that users should provide input in response to the low setting warning is supported by the argument that this increases operator involvement. Designers may also be more confident that users will monitor their system and observe the warning if they are expected to actively intervene during control tasks. The formal requirement is not supported by the argument that this will reduce the burdens on the operator. It is important to emphasise that the formal schemas and semi-formal rationale are intended to be the co-products of our literate approach to specification. Although some training in discrete mathematics may still be required to understand the precise requirements in a formal specification, it is now possible for all members of a design team to question the arguments that support these formal assertions.

Lack Of Justification

Formal specifications are typically used to describe the intended behaviour of software systems; they are not used to `justify' that behaviour. For instance, the previous schema specified that operators must enter the name of the parameter and its new setting. Designers might justify this input requirement because it increases confidence that operators will have observed and understood the warning. However, User_Set_Response represents an extremely optimistic assumption. Distractions caused by other users and application processes frequently prevent operators from responding to system warnings. An alternative would be for the control system to automatically reset the parameter to its minimum value. This would reduce operator workload but would have the disadvantage that users might not detect that the system was intervening to resolve the problem. Unfortunately, there is no means of reconstructing this informal argument from the formal statement of the design requirement:

|Delta Feed_System
|input_text? : TEXT
|text_to_num : TEXT -> Z
| exists format : image; setting : TEXT o
| (``Setting of " cat name(param?) cat`` out of range", format) in display_text=>
|(input_text? = name(param?) cat setting) in system_input ^
| max(param) >= text_to_num(setting) >= min(param)
|user_input' = user_input

This states that if a warning about a low setting for a process parameter is part of the displayed text then the system inputs a string that identifies the parameter and alters the setting to an admissible value.

This lack of justification has important consequences for the application of formal methods. For instance, a development team might expect user confirmation of more serious problems, such as low pressure settings for pumping equipment. Automatic responses might be used for less important errors. Other designers within the same group or within different development teams cannot use schemas, such as User_Set_Response or System_Set_Response, to discover why an approach was appropriate in one context but not in another. In our example, it would be impossible for software engineers to reconstruct the implicit risk assessments that were used to select operator confirmation for low pressure settings. This makes it difficult for the same criteria to be applied throughout system development (Shum:94).

Our literate approach to specification is motivated by the need to explicitly represent the justification for particular requirements. Figure 5 illustrates this use of design rationale and formal methods.

Figure 5: Justifying Formal Requirements Using Design Rationale

The requirements represented by System_Set_Response are justified by the argument that this reduces the burdens on system operators. These requirements are not supported by an increase in the designers' confidence that users will observe the warning. Nor are they supported by an increase in operator involvement. These justifications can be compared with those for User_Set_Response shown in Figure 5.

It is intended that these diagrams should be produced as a co-product of the formal development process. As various alternatives are assessed, the justification for each design option should be explicitly represented in a corresponding QOC diagram. It would then be possible to trace the reasons for adopting a particular requirement within a software system. It is important to acknowledge that there will be an additional overhead in developing both a design rationale and a formal specification. Following MacLean, Young, Belotti and Moran we argue that our integrated approach will repay any initial investment by ``(a) providing an explicit representation to aid reasoning about the design and the consequences of changes to it, and (b) serving as a vehicle for communication, for example, among the members of a design team or between the original designers and the later maintainers of a system.'' (MacLean:91).

Lack Of Status Information

A further limitation of existing specification techniques is that they provide no means of determining the status of a requirement from its formal statement. By status information, we mean some indication that further work is necessary before a requirement is at a level of detail that can be used to guide implementation. A prime objective in the use of formal methods is to minimise the modifications that are necessary once a specification has been refined towards implementation. If some requirements are overlooked then there may be considerable delays and additional costs while that area of a design is developed to the necessary level of detail. For example, none of the previous requirements specified what should happen to the low setting warning after the system or the operator intervened. Unfortunately there is no way to detect that this additional work is required from formal statements such as User_Set_Response, even though this information might be captured in subsequent schemas:

| Delta Feed_System
| exists warning_format : image, format : image' o
| ((``Setting of " cat name(param?) cat`` low", warning_format) in display_text ^
| ((input_text? = name(param?) cat setting) in system_input v
| (input_text? = name(param?) cat setting) in user_input) =>
| ((``Setting of " cat name(param?) cat`` low", format) not in display_text' |

This states that if a warning about a low setting for a process parameter is displayed and the user or the system inputs a string that identifies the parameter and alters its setting then the warning is not displayed after that input operation.

It is important to emphasise that even more work is required before this schema can guide an implementation. For instance, it does not describe what should happen if concurrent input from the system results in another low setting at the same time as the previous warning is cleared. A further limitation is that it does not represent the real-time delays that may occur before the user or the system responds to the low setting. Again, there is no way of identifying the need for this further work from the formal specification. The lack of status information does not just cause problems for the refinement of software engineering requirements. Similar comments can be made about the device abstractions in previous schemas. They are not at a level of detail where a development team can proceed towards implementation. For instance, any number of sensors and monitors might be used to obtain process values. The selection of these devices will have a profound impact upon the anticipated frequency and granularity of process updates. Such factors will, in turn, have an impact upon the polling mechanisms that are used to sample the devices. These issues must be considered by both hardware and software engineers during the more detailed development of complex applications.

Many theoretical works on formal methods ignore the pragmatic problems that arise from the refinement of large-scale specifications (Morgan:90). There is an assumption that the level of detail will gradually increase in an almost uniform manner. Recent research into design practice has revealed that such assumptions are unwarranted (Johnson:95z,Klein:93). Less critical areas of a software system may be left at a high level of abstraction. More important areas and, in particular, those aspects that preserve safety must be considered in greater detail. Unless these areas can be agreed upon then development activities may be needlessly duplicated or omitted. This problem is exacerbated by the fact that different groups may be working on different areas within a specification. The lack of status information for abstract schemas, such as User_Set_Response, can prevent designers from coordinating and focusing their efforts upon those areas of a specification that require the most work.

The integration of mathematical requirements and semi-formal, design rationale can be used to explicitly introduce status information into abstract specifications of large-scale systems. For instance, the Remove_Set_Warning schema can be criticised because it does not represent the duration of the delay that can arise before a user or the system responds to the low setting warning. This is unsatisfactory because at some point additional presentation resources may be required in order to direct the operator's attention towards the warning. Similarly, a time-out must be used to describe the maximum delay before system intervention. This need for further refinement can be indicated in design rationale by explicitly representing the status of particular schemas. By this we mean that QOC diagrams can be annotated to show those options that require further work. Figure 6 illustrates this technique.

Figure 6: Representing Status Information Using Design Rationale

In this diagram, the option that the Remove_Set_Warning should be used to cancel the alarm is opposed by the observation that the schema does not consider the additional warnings which must be provided if the user does not observe the initial error display. Nor does it consider the time-outs that must be used in order to trigger system intervention. The use of the box around the criterion is not a standard part of the QOC syntax and is included to emphasise the need for further refinement. A number of benefits can be gained from this introduction of status information into formal specifications. In particular, it supports the use of mathematical requirements within concurrent engineering teams. For instance, schemas such as Remove_Set_Warning have implications both for systems engineers and for interface designers. The former group must implement the automatic time-outs. The latter group must devise appropriate presentation techniques to display the warning. By explicitly representing the status of this option both parties may recognise that this requirement needs further work. In this way, design rationale can provide a focus for the subsequent development of literate specifications.

Lack Of Alternatives

Perhaps the biggest problem in applying formal methods is that they do not capture the range of alternative options that are considered but rejected during software development. This is a significant issue because regulatory and commercial organisations require proof that designers have selected the correct approach from amongst the available alternatives (Davies:85). For example, the problems of guaranteeing operator intervention within a particular time period could persuade a design team to reject the approach described in User_Set_Response. It might, therefore, be specified that operators should be prevented from exacerbating low setting errors. This approach was embodied in the recommendations of the United States' Nuclear Regulatory Commission investigation into the Three Mile Accident (Pew:81). A policy of minimal intervention was adopted so that operators were locked out of the system while it performed critical recovery tasks. This principle could be used to guide the development of software interlocks:

|Xi System_Set_Response |
|------------------------------------------ |
|forall user_text : TEXT; exists format : image' o
| (``Setting of "cat name(param?) cat`` low", format) \in display_text =>
| not (user_text? in user_input) ^
| ((input_text? = name(param?) cat setting) in system_input)

This states that if a warning about a low setting for a process parameter is displayed then the user is locked out of the interface and the system inputs a string that identifies the parameter and alters its setting.

The representation of rejected options is of critical importance during the subsequent development of large-scale software systems. Revisions to a set of initial requirements frequently force designers to re-consider the proposals that were discarded early in the development cycle. For instance, the decision to lock out the user might be rejected by regulatory and quality control groups. This is again illustrated by the events following the Three Mile Island Accident. The policy of minimal intervention was largely abandoned after problems in the North Anna reactor forced operators to intervene in order to preserve the safety of their system (Duncan:87b). Under such circumstances, designers might be forced to re-consider the option described by User_Set_Response rather than Lock_Out_User. As before, the integration of formal specification techniques and design rationale notations can be used to support this re-evaluation. For instance, Figure 7 exploits our extended QOC notation to represent alternative options for the confirmation of low setting warnings.

Figure 7: Representing Alternative Options Using Design Rationale

This diagram shows that a number of schemas might be represented as different design options within a QOC diagram. Software engineers might use this figure to reconstruct the arguments for and against user confirmation, automatic lock-outs and an approach that allows the system or the user to re-set parameters.

`Real-World' Support For Literate Specification

So far, we have not presented evidence that `real-world' designers can use literate specifications to support software development. This is a significant omission because formal methods research often pays little regard to the software engineers who must eventually learn to apply their techniques. In order to address this weakness, we conducted a week-long trial of the approach with a group of five software engineers from companies including an international banking group and a rail-service provider. During this period the subjects were trained from `scratch' to a level where they could both read and write Z specifications. The first four days, therefore, included an intensive course on discrete mathematics and the schema calculus. On the fifth day, we presented the subjects with a detailed specification of a control application from a chlorine recovery system. They were asked a number of qualitative questions about the usability of the formalisation. The results of this are shown as Figure 8. As can be seen, our subjects found the specification to be either impossible or hard to understand. Only one of the software engineers found it easy.

Figure 8: Qualitative Assessments Of Formal Methods

Such results are not particularly surprising. Formal methods are skill based activities. Clearly, one week's training is not sufficient to develop the expertise that is necessary to become confident in the use of mathematical specifications. Perhaps, more surprising are the qualitative responses for QOC diagrams. After being asked to analyse the Z specification, our subjects were shown a design rationale for another area of the chlorine recovery system. The results for the qualitative evaluation of QOC are shown in Figure 9.

Figure 9: Qualitative Assessments Of Design Rationale

In contrast to the formal specification, the subjects found it either trivial, easy or OK to understand the semi-formal notation. They also found it relatively easy to understand the design alternatives that were represented as options in the QOC. This evidence seems to back up the claims that were made for design rationale in the previous section. It is important to emphasise, however, that no direct comparisons can be made between the attitude statements in Figures 8 and 9. Clearly, the information content in the specification and the design rationale is quite different.

Finally, we presented the subjects with a literate specification that included both Z schemas and QOC diagrams. The results are shown in Figure 10. All of the subjects found that the integration of these techniques would be either useful or OK in their everyday work. This is an extremely encouraging finding considering the antipathy to formal methods that one frequently encounters amongst software engineers whose training has included only a minimal amount of discrete mathematics None of our subjects had any previous training in formal specification or discrete mathematics.

Figure 10: Qualitative Assessments Of Literate Specifications

These findings are suggestive rather than conclusive. It is important not to under-estimate the difficulty of performing rigorous evaluations with practising engineers. We had to train our subjects over a significant period; this involved a high degree of commitment from both the individuals concerned and from their companies. We are currently attempting to replicate our results with larger groups of software engineers drawn from a wider range of companies. These investigations are discussed in more detail in a companion paper (Johnson:95lit).

Tool Support

One of the most important observations to be made from our evaluation was that our subjects required tool support. They found it difficult to read QOC diagrams and Z schemas that were spread across various pieces of paper. They also commented that there was an important problem in maintaining consistency between the formal and semi-formal representations.

Constructing And Maintaining Literate Specifications

Figure 6 illustrates the complexity of developing literate specifications for non-trivial applications. QOC diagrams quickly become complex webs of interconnected options and criteria. Together with Chan we have, therefore, built a direct manipulation editor for QOC diagrams (Chan:94). Graphical interaction techniques, using buttons and pull-down menus, can be used to generate the networks that represent the argumentation behind formal specifications. The tool includes routines for tidying-up the resulting diagrams. Questions, options and criteria can be spaced to help users view their inter-connections. Scrolling facilities can also be used to navigate the graphical structures in a way that is not possible using paper-based techniques.

Chan's tool can not only provides means of navigating QOC diagrams. It also provides a novel means of navigating through the schemas of a formal specification. The names of abstract requirements can be used to label the options in a QOC network. These can then be selected to reveal their full formalisation. This enables designers to directly view the different requirements in a literate specification by selecting the text shown in a QOC diagram. This approach is illustrated by Figure 11.

Figure 11: Storing Schemas As Hypertext Links In Chan's Tool

Chan's tool has important advantages over many existing systems for constructing and maintaining formal specifications. In some of these applications, it can be difficult for designers to visualise the ways in which schemas relate to each other. They are forced to go through many different displays in order to inspect the hundreds of different clauses that characterise non-trivial systems. In contrast, the QOCs of literate specification provide a powerful structuring mechanism for formal requirements. The semi-formal links between questions and associated criteria provide the contextual information that is necessary in order to understand large-scale specifications.

It is important also to emphasise that our tool has many weaknesses when compared with other formal specification systems. For instance, there is no type checking. There is no support for theorem proving. Other problems relate to the user interface. For instance, there are only rudimentary facilities for displaying mathematical symbols. It can be difficult to view schemas against a background of questions, options and criteria. Current work is extending the system through the use of colour as a redundant cue to the role of a schema within a specification. A further problem is that the tool provides little or no support for version control. This makes it difficult for designers to trace how literate specifications change during the development lifecycle. New members of existing design teams cannot easily use the tool to understand successive changes in the development of a software system.

Reasoning About The Structure Of Literate Specifications

Clearly, the support for formal specifications in Chan's tool might be improved by the introduction of typechecking facilities. The presentation of the QOC diagrams might be improved by changes to the network editor. A more fundamental problem, however, is the need to ensure that the argumentation which is used to support literate specifications is coherent and sound. For example, Figure 12 shows the potential for inconsistency in design rationale.

Figure 12: The Potential For Contradiction In Design Rationale

The formal requirement that the user explicitly acknowledges a high setting warning is supported by the criteria that this will reduce the burdens on the operator. This argument represents a potential inconsistency because in Figure 7 it was argued that an explicit acknowledgement did not reduce the burdens on the operator. There may be perfectly good reasons why a criteria is used in this apparently contradictory fashion. However, it is important that designers can explicitly identify these inconsistencies if they are to justify them before quality control groups and regulatory authorities. A second strand of research has, therefore, been developing Prolog based tools that can be used to reason about the structure and consistency of literate specifications. The following clauses illustrate this approach by representing the reasons for and against explicit user confirmation of low set warnings:

qoc(low_set_cancel, persistant_set_warning, low_burdens).
qoc(low_set_cancel, persistan_set_warning, confirms_observation).
qoc(low_set_cancel, user_set_response, not(low_burdens)).
qoc(low_set_cancel, user_set_response, confirms_observation).
qoc(low_set_cancel, lock_out_user, low_burdens).
qoc(low_set_cancel, lock_out_user, not(confirms_observation)).

qoc(hig_set_cancel, user_set_response, low_burdens)).
Standard resolution proof techniques can be used on these clauses to determine whether there are any contradictions or inconsistencies in the rationale behind an interface. For example, the justifications for an option are inconsistent if it is both supported and weakened by the same criteria in two different contexts:
inconsistency(Ques1, Ques2, Opt, Crit):-
qoc(Ques1, Opt, Crit), qoc(Ques2, Opt, not(Crit)).
This is only one of a number of structural properties that can be represented using this approach. Brevity prevents a complete analysis of the various completeness criteria that might be imposed upon a literate specification, the interested reader is directed to (Johnson:95h). Given such clauses, designers can use our tool to determine whether there are any such inconsistencies in a literate specification:
|?- inconsistency(low_set_cancel,Ques, Opt, Crit).

Ques = high_set_cancel 
Opt = user_set_response
Crit = low_burdens?
This illustrates how relatively simple Prolog clauses can be used to identify potential inconsistency. In the case of high settings, user intervention is argued to reduce the burdens upon the operator. In the case of low settings, user intervention is argued not to reduce the burdens upon the operator. It is exactly this form of inconsistency that jeopardises the development of large-scale software projects. Clearly, there may be profound consequences for the design of an application if one team chose to exploit user confirmation for one range of error messages while another chose not to use them because of the additional burdens imposed upon the operator. Our tools, therefore, provide important means of identifying the inconsistent criteria that often remain implicit assumptions during the formal development of complex software.

Uniting Graphical Interaction And Structural Reasoning

We are currently using the Prelog system to integrate both of the tools described above. Figure 13 shows the architecture of this application.

Figure 13: The Architecture Of The Prelog Tool

Prelog is built from two existing systems. The Tokio temporal logic interpreter is used to maintain the qoc clauses from the Prolog tool. This interpreter was chosen because elsewhere we have argued for the need to consider temporal properties in the development of safety-critical applications (Johnson:95c,Johnson:95z). The same article also considers ways in which Tokio can be used to rapidly derive partial implementations from abstract specifications. For now it is sufficient to observe that Tokio provides the same resolution techniques as those provided by Prolog and so it involves no additional overheads if designers simply want to use first order logic. Unfortunately, qoc clauses provide an extremely cumbersome means of constructing and viewing literate specifications. Fortunately, Prelog also provides access to the Presenter screen management system. This tool is used to directly translate the textual output of Tokio into the graphical form of the QOC notation. We have implemented our system in such a way that the logic clauses can be automatically generated from the graphical representation. This has the advantage that designers need never directly interact with qoc clauses.

It is important to emphasise that the development of tool support for literate specification is an area of on-going research. For example, we have yet to directly address the problems of maintaining conformance between a formal specification and its design rationale. If changes are made to a schema then surely the rationale ought to be updated to reflect that alteration? This is clearly a critical issue in the development of tool support. Our recent focus upon version control is helping to address this problem. Designers can chart the changes that occur within a specification and a QOC diagram. This is not a complete solution. For instance, the Prelog presentation of a QOC diagram might be altered to warn designers that modifications are required once the underlying specification has been changed. This raises a number of further issues that remain to be explored. For instance, introducing a new schema into a specification might have knock-on effects throughout many different areas of a design rationale. In our example, the introduction of a new option for the cancellation of warnings might affect the argumentation behind both high and low settings etc. At present there are no clear procedures or heuristics for tracking these knock-on effects.

Conclusion And Further Work

In this paper we have argued that a number of limitations frustrate attempts to use formal specification techniques during large-scale software development. It is argued that the lack of vernacular descriptions prevents many designers from accessing the products of formal analyses. Mathematical specification techniques have also been criticised because they do not represent the arguments that lie behind critical design decisions. This makes it difficult for software engineers to justify their development decisions to external, regulatory authorities. A further problem is that formal methods, typically, do not represent the level of refinement for abstract requirements. In large-scale systems, this makes it difficult for teams of designers to coordinate their efforts around those areas of a program that require the most work. Finally, formal specifications are constructed to provide an unambiguous representation of a design. They do not represent the alternative options that were considered but rejected as development progresses. This creates practical problems if these options must be reconsidered during subsequent stages of design. This is a significant limitation because user and customer requirements frequently change during software development. We have shown that all of these limitations can be addressed by representing schemas as options in the QOC notation. This provides the status information, the justification and the alternative options that often remain implicit during the formal development process. Unfortunately, previous research in the field of formal methods has paid little attention to the software engineers who must use mathematical specification techniques. We have, therefore, conducted a preliminary `usability' test to determine whether existing programmers and designers can be trained to use our literate specification techniques. The initial findings are encouraging and we are currently working to replicate our results. This trial also revealed the need for tool support. The paper, therefore, goes on to discuss two different approaches that provide graphical manipulation facilities and structural proof techniques for literate specifications.

There are a number of areas for further work. Many of these have been mentioned in the previous sections of the paper. For instance, our tools must be improved to support version control mechanisms and type checking facilities. There are other areas that also remain to be explored. For example, Fischer, Lemke, McCall and Morch argue that it is possible to develop a large corpus of domain specific design rationale to guide the development of future systems (Fischer:91). This approach might be extended to support our use of QOC within literate specifications. Libraries of formal requirements and informal rationales might reduce the costs of future projects through the re-use of design documents. Such developments remain long term objectives. Further work is required to determine whether it is possible to represent more generic requirements in the formal components of our rationales. The examples that have been used to illustrate our techniques are highly specific to particular classes of safety-critical software.

It is important to stress that the use of mathematical specification techniques does not remove the need for traditional software engineering practices. Requirements elicitation is a necessary precursor to the application of formal methods. Programmers must still implement and test the systems that satisfy formal requirements. In this paper we have argued that literate specification techniques can be used to support these various stages of development. Requirements analysts, programmers and quality assessment teams can not only find out what a system should do but why it should do it. This argument suggests an extremely wide role for the use of formal methods in the software lifecycle. A number of colleagues in Glasgow, including Gilbert Cockton, Phil Gray and Steven Clarke, have begun to identify practical techniques for extending literate specifications in this way. In a recent publication, they have introduced the notion of `literate development'. This expands the links that we have developed between formal and semi-formal representations to include vernacular and informal annotations at a variety of different levels of detail. The intention is to directly support requirements engineering, task analysis and software testing (Cockton:95). This work is very much in its infancy but it promises to bridge some of the divides that currently prevent formal methods from being used in conjunction with existing software engineering practices.


The research reported in this paper has been supported by UK Engineering and Physical Sciences Research Council grant GR/J07686. I would like to thank the members of the Department of Computing Science, University of Glasgow for their support and encouragement in this work. Tom Melham, in particular, helped to proof read the final version of this paper. Thanks are also due to the reviewers of this paper who provided a number of useful comments on the focus and presentation of this research.