The objective of the project is to develop a formal theory of multimedia information retrieval (MIR) and to evaluate it by running experiments on a realistic application. This theory will provide the theoretical basis for information systems supporting the storage and retrieval of multimedia documents, that is, complex objects whose components may be facts, text, graphics and images.
The need for such a theory has become apparent over the last few years, as there has been increasing awareness that the existing range of retrieval models has built-in theoretical limits to its effectiveness. These limits arise because the models are based on a limited view of the retrieval process and of the types of text representation. In addition, the representation of multimedia information requires richer formalisms than those traditionally used.
We propose to develop a logic, the MIR logic, together with a matching theory of uncertainty based on probability theory, whose expressive power allows an adequate representation of multimedia information, and whose inference relation models a dynamic and effective retrieval. This logic will play for multimedia document bases the same role that classical first-order logic plays for traditional databases, that is, it will provide the theoretical foundations of the representation and retrieval of multimedia information.
The central idea of the project is that the retrieval of a multimedia document in response to a user request can, and indeed should, be seen as the establishment of the validity of the sentence $d \rightarrow q$ of the MIR logic, where $d$, $q$ and $\rightarrow$ denote, respectively, the representation of the document, the representation of the user request, and the implication relation of the logic.
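To make the idea of retrieval as implication concrete, the following is a minimal sketch under strong simplifying assumptions. The MIR logic itself is not defined in this text, so the sketch substitutes a toy propositional fragment in which a document and a request are each a conjunction of atoms. In that fragment, the document implies the request exactly when every request atom also occurs in the document. The names `entails`, `retrieve` and the sample document base are hypothetical and serve only as illustration.

```python
# Toy stand-in for the MIR logic (an assumption, not the project's logic):
# a document and a request are both conjunctions of propositional atoms,
# represented here as Python sets of strings.

def entails(document_atoms, request_atoms):
    """Return True when the conjunction of document atoms implies the request.

    For conjunctions of atoms, the implication is valid exactly when
    every atom of the request also appears in the document.
    """
    return set(request_atoms) <= set(document_atoms)


def retrieve(document_base, request_atoms):
    """Return the identifiers of all documents whose representation implies the request."""
    return [
        doc_id
        for doc_id, atoms in document_base.items()
        if entails(atoms, request_atoms)
    ]


# Hypothetical document base used only for illustration.
document_base = {
    "doc1": {"image", "sunset", "beach"},
    "doc2": {"text", "sunset"},
}

matching = retrieve(document_base, {"sunset", "image"})
```

In this sketch only `doc1` is retrieved, because `doc2` lacks the atom `image`. A logic meeting the requirements discussed in this text would need a far richer language and an uncertain, rather than strictly Boolean, implication.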
In order to adequately model the retrieval of information on multimedia documents, the sought MIR logic must satisfy a number of requirements.
First, the logic must be able to represent, and support reasoning about, the three different views of multimedia documents, which relate to the structure, the layout and the content of the document.
Second, the inference relation of the MIR logic must capture the notion of uncertain relevance, that is, $d \rightarrow q$ should hold if the document represented by $d$ is likely to be relevant to the request represented by $q$. The uncertainty in the retrieval process is introduced by two factors:
These factors affect the relevance of documents to queries, and make the retrieval process a probabilistic inference, in which the theory of probability is to be used to estimate the extent to which a document is relevant to a query.
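As an illustration of retrieval as probabilistic inference, here is a minimal sketch, again under assumptions that are not part of the source. Each document is represented by atoms carrying a probability that the atom truly describes the document, the atoms are assumed independent, and the degree of relevance is read as the probability that all request atoms hold for the document. The function names and the sample weights are hypothetical.

```python
from math import prod

# Assumptions (not from the source): atoms are independent, and the extent
# to which a document is relevant to a request is estimated as the product
# of the probabilities of the request atoms in the document representation.

def relevance_probability(document_weights, request_atoms):
    """Estimate the probability that the document satisfies every request atom.

    Atoms missing from the document representation count as probability 0.
    """
    return prod(document_weights.get(atom, 0.0) for atom in request_atoms)


def rank(document_base, request_atoms):
    """Return (probability, doc_id) pairs sorted by decreasing probability."""
    scored = [
        (relevance_probability(weights, request_atoms), doc_id)
        for doc_id, weights in document_base.items()
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical weighted document base used only for illustration.
weighted_base = {
    "doc1": {"sunset": 0.9, "image": 0.8},
    "doc2": {"sunset": 0.5},
}

ranking = rank(weighted_base, ["sunset", "image"])
```

Under these assumptions the documents are ranked rather than simply accepted or rejected, which is the practical consequence of treating relevance as a matter of degree. The independence assumption is chosen here only for brevity; the theory of uncertainty proposed in this text is not committed to it.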
Third, the inferential process of the MIR logic must be adaptive, in that it must take into account the dynamic nature of the notion of relevance. It is a well-known fact that relevance is (a) user dependent, and (b) subject to revision during the retrieval process, as the information already acquired may affect future relevance assessments.
Finally, the decision problem must be tractable, that is, solvable with a limited amount of resources. This is a necessary condition for the efficient implementation of a multimedia information retrieval service based on the MIR logic, and requires trading the expressive power of the logic against good computational properties of its entailment relation.
A logic fulfilling all these requirements will be one main final result of the project. The other final result will be the evaluation of the logic on an experimental basis, which will assess the retrieval effectiveness allowed by the logic.
Indeed, the delivery of a retrieval model, even of one based on logic and probability theory, does not contribute per se to the advance of the state of the art in MIR. In order to establish the adequacy of a model for the retrieval of information, empirical evidence is needed which relates specific aspects of the model (such as the chosen treatment of uncertainty or notion of relevance) to their impact on the performance of the system.
The FERMI consortium will develop prototypes implementing significant aspects of the underlying theoretical work and will use these prototypes, either singly or jointly, in experiments against a realistic document base. The design of these experiments and the identification of evaluation criteria will be intermediate results of the project, along with any logic satisfying any significant subset of the above requirements.
It is expected that the MIR theory will have a significant impact both on the industrial world and the scientific community.
As far as the former is concerned, it will open the way to industrial research aiming at the development of the technology for the construction and maintenance of a new generation of multimedia document retrieval systems, providing effective and efficient content-based information retrieval. Indeed, there is an ever-increasing market demand for applications which create and maintain large repositories of multimedia documents for on-line access. At the same time, there is widespread awareness that current technologies are not able to respond adequately to this demand. These technologies rely on traditional retrieval models as far as text is concerned, and provide almost no facility for expressing the content of multimedia data objects. While the MIR logic alone will not solve all the problems posed by the realization of the envisaged kind of multimedia document retrieval systems, it is nevertheless a necessary step, perhaps the most important, towards the creation of the technology needed to support these systems.
As far as the impact of the project on the scientific community is concerned, it is believed that the MIR logic will contribute significantly to the definition of a unified, well-founded framework for research in multimedia information retrieval. Within this framework,