15th European Conference on Artificial Intelligence

ECAI 2002

Seeking Information: Methods from Information Retrieval and Artificial Intelligence

This tutorial will address the problem of how to support users in finding information relevant to their needs from large sources of stored data, such as the World Wide Web, Intranets, and digital libraries. It will describe how current techniques from information retrieval and artificial intelligence can be used for this purpose. The tutorial will provide the attendees with:
  • Knowledge of the underlying empirical and theoretical research issues relating to the indexing and retrieving of information.
  • An understanding of the criteria that make the seeking process efficient and effective.
  • Skills necessary to design, implement and experiment with systems to enable intelligent information access.

The tutorial aims to encourage synthesis of ideas between the two fields.


    The amount of available information is currently growing at an incredible rate; a particular example of this is the Internet. This information appears in many forms (images, text, video, and speech), and its increase leads to information overload because there are no adequate means of separating relevant from irrelevant information. To utilise this information, whether for business or leisure purposes, we need techniques and tools to allow for fast, effective and efficient access to large amounts of stored information.

    The fields of information retrieval (IR) and artificial intelligence (AI) have been looking at this problem. The IR field has developed successful methods to deal effectively with huge amounts of information, whereas the AI field has developed methods to learn users' information needs, extract information from text, and represent the semantics of information. They converge on the goal of describing and building large-scale systems that store, manipulate, retrieve and display electronic information of any kind.

    The aim of this tutorial is to give a survey of the state of the art in methods from IR and AI for searching and retrieving relevant information. The focus will be on indexing and retrieval models and methods. The attendees of this tutorial will obtain a basic understanding of the major models upon which modern retrieval software is based. This will include a summary of the important research problems, a short historical perspective, and a more detailed description of basic techniques and approaches to enable intelligent access to information. Important new directions in the field will be described, such as: metadata and the semantic web; multimedia retrieval; cross-lingual retrieval; information categorisation, filtering, and personalisation. We will also consider how current XML-based technologies support more flexible information access.

    The tutorial will present results of both theoretical and applied experiments in using IR and AI techniques to seek information. The tutorial should provide each participant with a starting point for further self-education. Participants will benefit from learning about the latest developments across a broad range of activities.


    Introduction: Historical perspective, research issues, and aims related to the task of accessing vast amounts of stored information will be discussed.

    IR Methods: We will describe standard indexing and retrieval techniques used in IR. These will include Boolean, vector space and probabilistic models. We will show how a search can be refined using relevance feedback and contextual information. Particular attention will be paid to link-based approaches, which are used by current web search engines such as Google. More advanced models based on a combination of logic and uncertainty theories will be introduced. Finally, the methods used to evaluate the effectiveness and the efficiency of a retrieval approach will be described.
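
    As a rough illustration of one of the models named above, the vector space model can be sketched as follows. This is a minimal sketch and not the tutorial's own material: the whitespace tokeniser, the TF-IDF weighting with a "+1" added to the IDF, and the function names (tokenize, build_index, cosine, rank) are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lower-casing and whitespace splitting stand in for real indexing.
    return text.lower().split()


def build_index(docs):
    """Return one TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: log(N / df) + 1 is one common IDF variant among several.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = build_index(docs)
    # Query terms that never occur in the collection are ignored.
    query_vec = {
        term: tf * idf[term]
        for term, tf in Counter(tokenize(query)).items()
        if term in idf
    }
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    collection = [
        "information retrieval models",
        "artificial intelligence methods",
        "probabilistic retrieval of information",
    ]
    for score, index in rank("artificial intelligence", collection):
        print(f"{score:.3f}  {collection[index]}")
```

    Documents and the query are mapped into the same term-weight space and compared by the angle between their vectors, so a document sharing no terms with the query scores zero. Relevance feedback, link analysis and the probabilistic models mentioned above are not covered by this sketch.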

    AI Methods: We will discuss the use of AI techniques that enable intelligent information access. We show how speech and language technology is used in summarising and extracting information from texts, and in multimedia retrieval. We will discuss the use of metadata, providing rich categorisation of web resources, and the use of ontologies, making the semantics of the information explicit. Personalisation and filtering techniques will be presented, which seek to select, combine and present information according to the user's requirements. We will show the role that new web standards like RDF and XML may play in providing more structure and semantics, and how this opens up new opportunities in intelligent access and presentation.

    The Future: We will discuss challenges faced by both AI and IR to provide better access to information and to deal with the heterogeneous nature of the seeking process; for instance, to allow users, whether expert or not, using any language, to access information stored in any media, and across distributed sites. The pros and cons of AI and IR will be discussed, as well as how the two approaches can be combined to provide for more efficient, effective and intelligent access to information.


    Level: Introductory to intermediate.

    This course is designed to provide a fast-paced yet rigorous introduction to the basic searching and retrieval methods. The tutorial will be of interest to academics and post-graduate students working in the field, and those involved in industrial and commercial research. The tutorial will also be of relevance to people who are the end-users of search systems, and organisations faced with publishing and accessing information on the Internet and Intranets.


    Dr Alison Cawsey is a Lecturer in Computer Science at Heriot-Watt University in Edinburgh. She was previously a lecturer at Glasgow University, and has been a research fellow at Edinburgh and Cambridge Universities. Her research interests centre on the personalised presentation of information, with applications in health promotion and education. Her PhD work in this area was published as a book, "Explanation and Interaction" (MIT Press, 1993), and she has since written a textbook on Artificial Intelligence ("Essence of Artificial Intelligence", Prentice Hall, 1997). She is currently involved in a number of projects concerned with intelligent information access and presentation: MIRADOR is concerned with personalised presentation of resource descriptions given user profile and resource metadata; DIP is concerned with a general framework for the dynamic presentation of personalised information (including filtering and integration). In other current funded projects she is looking at evaluating the benefits of personalised presentation of information in the healthcare domain. Current XML standards (RDF and XSLT) are used in these projects.

    Dr Mounia Lalmas is a Reader in Information Retrieval at Queen Mary University of London, which she joined as a lecturer in 1999. Prior to this, she was a Research Scientist at the University of Dortmund in 1998, a Lecturer from 1995 to 1997 and a Research Fellow from 1997 to 1998 at the University of Glasgow, where she received her PhD in 1996. Her research interests centre around the development, implementation and validation of approaches for representing and retrieving multimedia, structured and heterogeneous repositories of information in a highly interactive environment. She is involved in a number of projects: SAMBITS, concerned with the development of systems for advanced multimedia broadcast and IT services, where she designed access methods for the navigation and retrieval of complex digital multimedia data annotated with MPEG-7; GRIS, which aims to investigate the information seeking behaviours of users querying, browsing and retrieving structured documents; FOCUS, which aims at investigating effective methods for representing and retrieving structured documents; "Relevance through explanations", which aims at developing intelligent approaches to relevance feedback; and OntoWeb, concerned with strengthening the European influence on Semantic Web standardisation efforts such as those based on RDF and XML.

    Dr Thomas Roelleke obtained his PhD on probabilistic object-oriented logical representation and retrieval of complex objects in 1999. Along with his thesis, he published his research results in international journals (TOIS), conferences (SIGIR), and related workshops. In 1999, he founded the HySpirit company, transferring his research results and knowledge into product development. In 2000, he became a strategic consultant for a leading direct broker, and was appointed at the same time a research fellow at Queen Mary. His current tasks and interests include: information management, strategic IT consultancy, data warehousing, data modelling and software engineering, (semi-)structured document retrieval, retrieval models and their evaluation, probabilistic object-oriented knowledge representations, integration of retrieval and database technology, heterogeneous information sources, and the transfer and usage of knowledge.


    Dr Alison Cawsey
    Department of Computing and Electrical Engineering
    Heriot-Watt University
    Edinburgh EH14 4AS, Scotland
    Tel: +44-131-451-3413
    Fax: +44-131-451-3431

    Dr Mounia Lalmas
    Department of Computer Science
    Queen Mary University of London
    Mile End Road
    London E1 4NS, England
    Tel: +44-20-7882-5200
    Fax: +44-20-8882-6533

    Dr Thomas Roelleke
    Department of Computer Science
    Queen Mary University of London
    Mile End Road
    London E1 4NS, England
    Tel: +44 (0)20 7882 5245
    Fax: +44 (0)20 8980 6533