Next: Acknowledgements Up: A Probabilistic Terminological Logic Previous: 4.2 Semantics

5 Concluding remarks

In this paper we have presented a logical model for information retrieval based on a probabilistic terminological logic. In this model, IR is seen as the task of 1) computing, for a given information need (represented by a concept) and for each document (represented by an individual constant), the real number (represented by a constant) for which the corresponding assertion is valid in the theory representing the document base and the lexical, "thesaural" knowledge, and 2) ranking documents in terms of their associated real numbers.
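Step 2 of this task can be illustrated with a minimal sketch, assuming the real numbers of step 1 have already been computed. The function name `rank_documents`, the document identifiers and the sample values are hypothetical and serve only as illustration; the reasoning that produces the numbers is not modelled here.

```python
from typing import Dict, List, Tuple


def rank_documents(degrees: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order documents by decreasing degree.

    Ties are broken by document identifier so that the output is
    deterministic (an assumption of this sketch, not of the model).
    """
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical real numbers associated with four documents for one information need.
degrees = {"d1": 0.35, "d2": 0.80, "d3": 0.80, "d4": 0.10}
ranking = rank_documents(degrees)

for document, degree in ranking:
    print(document, degree)
```

The ranking depends only on the ordering of the computed values, so any procedure that produces one real number per document could be plugged in for step 1.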

Besides enjoying the numerous properties that accrue from the adoption of a TL (properties that are more fully described in [8]), this model takes advantage of the considerable expressive power provided by our probabilistic extension to the terminological framework. This extension allows the distinct expression of two radically and conceptually different kinds of probabilistic information that feature in the IR task, i.e. statistical information, and information about the degrees of belief that the IR system being modelled has in other information.

Although statistical information and information about degrees of belief are conceptually different, it is clear that there is a relationship between the two. Our work so far has aimed at providing a framework in which both can be expressed and reasoned about in a principled, semantically clear way. A further step in this direction should be the investigation of mechanisms that allow information about degrees of belief to be derived directly from statistical information. For instance, if the system has no belief at all (i.e. to no degree) as to whether a given assertion is true, but at the same time knows that 80% of all individuals of the domain are instances of a given concept, it might plausibly decide to believe with a 0.8 degree of confidence that a particular individual in the domain is an instance of that concept. This approach to the derivation of degrees of belief, well known in actuarial reasoning, is called direct inference (see e.g. [7]). Other approaches exist, however, yielding different results and based on principles as diverse as the maximum entropy principle (see e.g. [3]), the centre of mass principle or the maximal independence principle (see e.g. [2]). Unfortunately, in all of these approaches degrees of belief are completely determined by statistical information, to the extent that a statistical formula and a degree-of-belief formula would jointly imply an unwanted conclusion; instead, it is clear that we would like to be able to entertain such beliefs without this implication. Investigating mechanisms that allow statistical information to determine degrees of belief only when the latter are not already determined is the next research task that this work opens up.
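The intended behaviour can be sketched as follows, under strong simplifying assumptions: statistical information is reduced to one proportion per concept, degrees of belief to one number per (individual, concept) pair, and direct inference is used only as a fallback. The names `degree_of_belief`, `explicit_beliefs` and `statistics` are hypothetical, and this is an illustration of the idea rather than the mechanism the paper leaves for future investigation.

```python
from typing import Dict, Optional, Tuple


def degree_of_belief(
    individual: str,
    concept: str,
    explicit_beliefs: Dict[Tuple[str, str], float],
    statistics: Dict[str, float],
) -> Optional[float]:
    """Return the degree of belief that `individual` is an instance of `concept`.

    An explicitly stated degree of belief always takes precedence.
    Only when none is stated is the statistical proportion for the
    concept adopted (direct inference). None means that neither kind
    of information is available.
    """
    key = (individual, concept)
    if key in explicit_beliefs:
        return explicit_beliefs[key]
    return statistics.get(concept)


# Hypothetical knowledge: 80% of all individuals are instances of C,
# but the system already believes to degree 0.3 that b is one.
statistics = {"C": 0.8}
explicit_beliefs = {("b", "C"): 0.3}

belief_a = degree_of_belief("a", "C", explicit_beliefs, statistics)  # falls back to the statistic
belief_b = degree_of_belief("b", "C", explicit_beliefs, statistics)  # keeps the explicit belief
belief_unknown = degree_of_belief("a", "D", explicit_beliefs, statistics)  # nothing known about D
```

In this sketch the explicit belief about `b` coexists with the statistic about `C` without either overriding or contradicting the other, which is the property the approaches cited above lack.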





Fabrizio_Sebastiani