
2.1 Introduction to the Workplan

The Workplan is structured around the following five main Research Themes:

  1. A logic for information retrieval;

  2. A theory of uncertainty for information retrieval;

  3. A model of the semantic content of multimedia data;

  4. Integration;

  5. Evaluation.

All these Research Themes contribute to the sought theory and interact strongly with each other, so that a single task of the project will in general address more than one of them. However, they reflect the basic classes of problems, or domains of investigation, that the FERMI consortium will deal with; the subdivision of the project into these Research Themes therefore constitutes a convenient perspective from which to view the research activity of the project.

The Research Theme ``A logic for information retrieval'' aims at developing a logic for representing and reasoning about the structure and the content of documents in a way that captures the notion of relevance of documents to users' requests and is consistent with the theories to be developed within Research Themes 2 and 3. Contributions to the development of this logic are expected from investigations into different classes of logics (viz. Terminological Logics, Modal Logics, Fuzzy Logics, Relevance Logics), each addressing a single aspect of the problem.

The Research Theme ``A theory of uncertainty for information retrieval'' aims at developing a theory of uncertainty that is appropriate for modelling the imprecision inherent in the information retrieval process and is consistent with the theories to be developed within Research Themes 1 and 3. Such a theory will be based on Probability Theory and the Dempster-Shafer theory of evidence.
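As a purely illustrative sketch of the kind of machinery the Dempster-Shafer theory of evidence provides, the following Python fragment implements Dempster's rule of combination for two mass functions. The frame of discernment (`relevant`, `not_relevant`), the two sources of evidence and their mass values are hypothetical and are not taken from the Workplan; the sketch does not represent the theory that the consortium intends to develop.

```python
from itertools import product

def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset (a focal element) to its mass.
    Mass assigned to pairs with an empty intersection is treated as
    conflict and removed by renormalisation.
    """
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame of discernment and evidence (illustration only).
R = frozenset({"relevant"})
N = frozenset({"not_relevant"})
THETA = R | N  # the whole frame, i.e. "don't know"

text_evidence = {R: 0.6, THETA: 0.4}
image_evidence = {R: 0.5, N: 0.2, THETA: 0.3}

belief = combine(text_evidence, image_evidence)
```

In this hypothetical case the combined mass on `relevant` is 0.68/0.88, slightly above 0.77, with the rest split between `not_relevant` and the whole frame. The mass left on the whole frame is what distinguishes this representation from a single probability value: it expresses ignorance rather than evidence against relevance.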

The Research Theme ``A model of the semantic content of multimedia data'' aims at defining a model for expressing the semantic content of multimedia information that is consistent with the theories to be developed within Research Themes 1 and 2. Such a model will be based on the theory of Conceptual Graphs.

The Research Theme ``Integration'' aims at integrating the results of Work Parts 1-3 into a logic that: 1) is adequate for representing complex multimedia documents and queries, and for reasoning about the relation of relevance between the two; 2) takes into account the imprecision inherent in the information retrieval task, hence modelling a notion of partial relevance of documents to queries.

The Research Theme ``Evaluation'' aims at defining and experimenting with an appropriate evaluation methodology for multimedia information retrieval systems based on the theoretical principles identified in Research Themes 1-4. A twofold evaluation is planned. The first type of evaluation aims at assessing the computational properties of the formal theories developed within Research Themes 1-4; this work will be based on formal tools, such as computability theory and complexity theory, and is to be understood as a prerequisite for any further consideration of the theories under analysis. The second type of evaluation, of a more experimental nature, aims at assessing the performance of prototypical multimedia information retrieval systems based on the chosen theories. Evaluation criteria will be defined, and some significant experiments based on these criteria will be designed and run on the prototypical systems.

The objective of defining a logic that supports reasoning about the relevance of multimedia documents to queries, and that integrates features accounting for such a variety of aspects of the multimedia information retrieval endeavour, is ambitious, and therefore carries some risk of failure.

A key methodological aspect of our approach, which we deem fundamental to ensuring that the various proposed solutions can be integrated, is that all the logics being considered for integration have a well-known, Tarski-style denotational semantics. This is a key factor in ensuring that integration will indeed be possible; it allows the research to concentrate confidently on the rationale for integrating the selected logics rather than on whether they can be integrated at all.

The main risk concerns the very possibility of defining a logic which has the expressive power needed to model a multimedia document and which is, at the same time, computationally tractable. Some of the partners have already undertaken preliminary studies on a logical approach to the modelling of multimedia information retrieval, and the first results are encouraging. Nevertheless, in order to limit this risk, the project will follow a two-part approach, addressing expressive power and tractability in turn.

First, concerning the expressive power of the sought logic, different families of logics will be considered. The features of these logics that are relevant to the problem of modelling multimedia information retrieval will be studied, and only the most promising ones will be considered for inclusion in the MIRLOG logic; MIRLOG will then contain the minimal set of features that allows the goals to be met.

Second, concerning the tractability problem, a new, ``probabilistic'' approach will be followed. This approach is made possible by the fact that probabilistic complexity classes have recently been proposed, in order to make the notion of a ``tractable problem'' correspond more realistically to the intuitive notion of a problem solvable by a ``good algorithm''. These proposals are centered around the notion of a ``probabilistic solution'' to a problem (i.e. a solution which is not guaranteed to be correct, but is correct with probability close to 1); accordingly, there could be problems that are intractable in a deterministic sense but nonetheless admit an efficient probabilistic solution. Probabilistic complexity classes may turn out to be very important for information retrieval, because they might include the decision problem for logics that are deterministically intractable and are nonetheless suitable to model multimedia information retrieval. These logics could then be adopted for modelling multimedia information retrieval, because the imprecision introduced by their probabilistic inference algorithms would be a negligible addition to the imprecision already inherent in the information retrieval process.
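The notion of a ``probabilistic solution'' can be illustrated with a small, well-known example that is unrelated to the logics under consideration: Freivalds' randomised check of a matrix product. The Python sketch below is an assumption-laden illustration only; the function names and the test matrices are hypothetical, and the problem it solves is not intractable. It shows the general pattern of a one-sided error that shrinks exponentially with the number of independent rounds.

```python
import random

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def probably_equal_product(a, b, c, rounds=20, rng=None):
    """Freivalds' check of whether a * b == c for square integer matrices.

    A False answer is always correct. A True answer is wrong with
    probability at most 2 ** -rounds, since each round with a random
    0/1 vector detects a wrong c with probability at least 1/2.
    """
    rng = rng or random.Random()
    n = len(a)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]
        # Compare a*(b*r) with c*r using matrix-vector products only.
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False
    return True

# Hypothetical test data (illustration only).
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_RIGHT = [[19, 22], [43, 50]]
C_WRONG = [[19, 22], [43, 51]]

ok = probably_equal_product(A, B, C_RIGHT)
bad = probably_equal_product(A, B, C_WRONG, rounds=64)
```

With 64 rounds the chance of wrongly accepting `C_WRONG` is at most 2^-64, which is the sense in which the added imprecision can be made negligible compared with the imprecision already inherent in the retrieval process.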


