Seeking Information: Methods from Information Retrieval and Artificial Intelligence
The amount of available information is growing at an incredible rate; the Internet is a particular example of this. This information appears in many forms (images, text, video, and speech), and its growth leads to information overload because there are no means of separating relevant from irrelevant information. To utilise this information, whether for business or leisure purposes, we need techniques and tools that allow fast, effective and efficient access to large amounts of stored information.
The fields of information retrieval (IR) and artificial intelligence (AI) have both been addressing this problem. The IR field has developed successful methods for dealing effectively with huge amounts of information, whereas the AI field has developed methods to learn users' information needs, extract information from text, and represent the semantics of information. The two fields converge on the goal of describing and building large-scale systems that store, manipulate, retrieve and display electronic information of any kind.
The aim of this tutorial is to give a survey of the state of the art in methods from IR and AI for searching and retrieving relevant information. The focus will be on indexing and retrieval models and methods. Attendees will obtain a basic understanding of the major models upon which modern retrieval software is based. This will include a summary of the important research problems, a short historical perspective, and a more detailed description of basic techniques and approaches that enable intelligent access to information. Important new directions in the field, such as multimedia retrieval, cross-lingual retrieval, automatic categorisation, metadata, ontologies, information integration, profiling and filtering, user-centred aspects, and applications in electronic commerce, will also be described.
The tutorial will present results of both theoretical and applied experiments in using IR and AI techniques to seek information. The tutorial should provide each participant with a starting point for further self-education. Participants will benefit from learning about the latest developments across a broad range of activities.
Introduction: The historical perspective, research issues, and aims related to the task of accessing vast amounts of stored information will be discussed.
IR Methods: We will describe standard indexing and retrieval techniques used in IR. These will include Boolean, vector space and probabilistic models. We will show how a search can be refined using relevance feedback and contextual information. More advanced models based on a combination of logic and uncertainty theories will be introduced. These have been proposed as uniform models for manipulating multimedia structured data. The methods used to evaluate the effectiveness and the efficiency of a retrieval approach will be described. Finally, we will discuss the application of IR methods to the web and digital libraries.
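As an illustration of the vector space model mentioned above, the following is a minimal sketch rather than the tutorial's own formulation. It assumes whitespace tokenisation, a raw term frequency multiplied by log(N/df) as the term weight, and cosine similarity for ranking; the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy collection, used only for illustration.
DOCS = [
    "information retrieval models",
    "ontologies for electronic commerce",
    "probabilistic retrieval of multimedia information",
]


def tokenize(text):
    """Lower-case the text and split it on whitespace (a simplifying assumption)."""
    return text.lower().split()


def build_index(docs):
    """Return one sparse TF-IDF vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs):
    """Rank documents by cosine similarity to the query, best match first.

    Returns a list of (score, document_index) pairs.
    """
    vectors, idf = build_index(docs)
    # Query terms that do not occur in the collection get weight 0.
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[0], reverse=True)
```

Calling `search("retrieval models", DOCS)` returns the documents ordered by decreasing similarity to the query. A Boolean model would instead return the unranked set of documents satisfying the query expression, and a probabilistic model would replace the cosine score with an estimate of the probability of relevance.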
AI Methods: We will discuss the use of AI techniques that enable intelligent information access. We will discuss state-of-the-art wrapper and mediator techniques supporting information extraction and integration. We will show the role of ontologies in intelligent information access. Ontologies provide explicit domain theories that can be used to make the semantics of information explicit and machine-processable. Based on these techniques, information retrieval can be extended to direct information access and automated task fulfilment based on automatically extracted information. Finally, we will sketch applications in the areas of knowledge management and electronic commerce, and show the role that new web standards such as RDF and XML may play.
The Future: We will discuss the challenges faced by both AI and IR in providing better access to information and in dealing with the heterogeneous nature of the seeking process; for instance, allowing users, whether expert or not and using any language, to access information stored in any medium and across distributed sites. The pros and cons of AI and IR will be discussed, as well as how the two approaches can be combined to provide more efficient, effective and intelligent access to information.
Level: Introductory to intermediate
This course is designed to provide a fast-paced yet rigorous introduction to basic searching and retrieval methods. The tutorial will be of interest to academics and post-graduate students working in the field, and to those involved in industrial and commercial research. The tutorial will also be relevant to end-users of search systems, and to organisations faced with publishing and accessing information on the Internet and on intranets.
In 1989 Dieter Fensel received a Diploma in Sociology from the FU Berlin and a Diploma in Computer Science from the TU Berlin. In 1993 he completed his PhD in Applied AI at the Faculty of Economic Science, University of Karlsruhe, and in 1998 he received his Habilitation in Applied Computer Science at the University of Karlsruhe. From 1989 to 1994 he worked as a Junior Researcher at the Institute AIFB of the University of Karlsruhe. From 1994 to 1996 he was a guest scientist at the Department SWI of the University of Amsterdam. From 1996 to 1999 he was a Senior Researcher at the Institute AIFB of the University of Karlsruhe, and since September 1999 he has been an Associate Professor at the Vrije Universiteit Amsterdam, Faculty of Science. His research interests include: Formal Specification Languages; Software Engineering; Data Warehouse; World Wide Web; Electronic Commerce; and Ontology-based Information Access.
Mounia Lalmas is a lecturer in the Department of Computer Science, Queen Mary & Westfield College, at the University of London. She has been an active researcher in information retrieval since 1990. She obtained her PhD in 1996 from the University of Glasgow. Her research interests centre around the development of effective formalisms able to model information in the places and in the forms that it appears in an interactive multimedia information retrieval system. In particular, she has expertise in: Knowledge Representations; Logical Models; Modelling Uncertainty; Structured and Hypermedia Document Indexing and Retrieval; and the Combination of Evidence in Information Retrieval.
Dr Dieter Fensel
Division of Mathematics & Computer Science
Vrije Universiteit Amsterdam
De Boelelaan 1081a
1081 HV Amsterdam, NL
Tel.: +31-20-444 7739
Fax: +31-20-444 7653
Dr Mounia Lalmas
Department of Computer Science
Queen Mary & Westfield College
University of London
Mile End Road
London E1 4NS, England
Tel: +44 (0) 20 7882 5200
Fax: +44 (0) 20 8882 6533
The tutorial material can be found here.