Term Paper Topic Suggestions

The list of suggested term paper topics is given below. Each topic comes with two or more papers. You are required to choose one topic. The aim of the term paper is to analyse and compare the contents of at least two papers on your chosen topic. You are required to write a paper/report of approximately 3000-3500 words.

Guidelines for Term Paper preparation (Please Read!)

Submission

Two stage submission:

You should read one paper and write an essay on it by 20th of February 2004.

Final Submission
You are required to submit two hard copies of your term paper on or before the 21st of March 2004. You can submit to either Iadh or Joemon.

Topics:

Each topic comes with a short description and has an associated reference that you should read if you want to know more about the topic. Most of the references are available on the web or in the library. If you cannot find a reference, contact Joemon (jj) or Iadh (ounis), who can help you find it.

This is not a definitive list of topics - if there is a topic you are interested in writing a term paper on that is not listed here then discuss it with us.

The ACM Digital Library is a valuable online resource. You can access it from ACM Digital Library

Other useful sources are:

The Computer Journal

Information Processing and Management (IPandM)

Journal of the American Society of Information Science (JASIS)

Journal of Documentation (JDoc) (electronic version is not available)

Journal of Information Retrieval (JIR) (electronic version is not available)

ACM Transactions on Information Systems (TOIS) (available from ACM Digital Library)

Please note that these URLs can only be accessed through University servers. (They won't work from home computers.)

Other sources

CiteSeer Scientific Literature Digital Library

Operational/Modern Information Retrieval Models

Advanced IR applications, such as multimedia, hypertext, and Web retrieval, require more effective and powerful indexing languages that allow a faithful representation of the complexity of document content. New information retrieval models, often called Operational or Modern Information Retrieval models, have been built on top of such powerful indexing languages. Unlike the classical/traditional IR models, which use simple keywords as indexing terms, operational IR models treat an indexing term as a more complex concept, in which semantic connectors act as operators or relations for building new expressions that represent the real complexity of today's IR applications.

Papers:

1. Carlo Meghini, Fabrizio Sebastiani, Umberto Straccia and Costantino Thanos. "A Model of Information Retrieval Based on a Terminological Logic". In Korfhage, R.; Rasmussen, E.; Willett, P. (eds.): Proceedings of the Sixteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pages 298-308. ACM, New York.

2. Iadh Ounis and Marius Pasca. "RELIEF: Combining expressiveness and rapidity into a single system". In 21st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98)

Cross-language information retrieval (CLIR)

Cross-Language Information Retrieval (CLIR) is motivated by the availability of electronic documents in various languages. It is concerned with retrieving documents written in a language different from that of the query. A range of approaches has been proposed by researchers to overcome this language barrier. This term paper should study and compare these approaches.
Reference: Oard, D. A Survey of Multilingual Text Retrieval.

Papers:

1. Doug Oard. "A Comparative Study of Query and Document Translation for Cross-Language Information Retrieval". Paper presented at the Third Conference of the Association for Machine Translation in the Americas (AMTA), Philadelphia, PA, October, 1998.

2. Ari Pirkola. "The Effect of Query Structure and Dictionary Setups in Dictionary Based Cross Language Information Retrieval". In 21st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98)

3. M. Littman, S. Dumais and T. Landauer. Automatic cross-language information retrieval using latent semantic indexing. In Grefenstette, G., editor, Cross Language Information Retrieval. Kluwer, 1998.

4. J.Y. Nie, P. Isabelle, M. Simard, R. Durand. Cross-language information retrieval based on parallel texts and automatic mining of parallel texts from the Web. ACM-SIGIR conference, Berkeley, CA, pp. 74-81 (1999)

Frameworks for Information Retrieval

Building information retrieval systems is a tedious task. Object-oriented software development techniques can be employed for building extensible and flexible software systems. ECLAIR and SketchTrieve are two such extensible architectures for information retrieval.

Papers:

Rao, R., Card, S. K., Jellinek, H. D., Mackinlay, J. D., & Robertson, G. G. (1992). The Information GRID: A framework for information retrieval and retrieval-centered applications. Proceedings of the ACM Symposium on User Interface Software and Technology, pages 23-32, New York, ACM Press

David G. Hendry and David J. Harper. An informal information-seeking environment. Journal of the American Society for Information Science, 48(11):1036-1048, 1997

David G. Hendry and David J. Harper. An architecture for implementing extensible information-seeking environments. Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 94-100, 1996, ACM Press

J. M. Jose, David G. Hendry and David J. Harper. An Object Oriented Framework for IR Applications. In the Proceedings of OOIS 2001, Calgary, Canada, August 26-29, pages 259-268, Springer-Verlag

G. Sonnenberger and H. Frei. Design of a reusable IR framework. In Proceedings of the 18th ACM SIGIR, pages 48-57, 1995. (You can get this from the ACM Digital Library, www.acm.org/dl)

Evaluation of Web search Engines

Evaluating IR on the Web raises different issues from evaluating classical IR, because the Web, and hence the indexing and retrieval of Web pages, differs greatly from the setting of classical information retrieval systems. The Web Track of the TREC initiative aims to provide experimental results about the performance of IR on the Web.

Papers:

1. D. Hawking, N. Craswell and P. Thistlewaite. "Overview of TREC-7 Very Large Collection Track." In Proceedings of the TREC Conference, 1999

2. D. Hawking, N. Craswell, P. Thistlewaite and D. Harman. "Results and Challenges in Web Search Evaluation". In Proceedings of the World Wide Web Conference, Toronto, Canada, April, 1999

3. D. Hawking, E. Voorhees, N. Craswell and P. Bailey. "Overview of TREC-8 Very Large Collection Track." In Proceedings of the TREC Conference, 2000

4. M. Agosti and M. Melucci. "Information Retrieval on the Web". In Proceedings of the ESSIR 2000 Summer School, Lecture Notes in Computer Science N. 1980, Springer, 2000.

Application of rough sets to information retrieval

Rough set theory, introduced by Zdzislaw Pawlak in the 1980s, is a theory of vagueness and uncertainty. It is of fundamental importance to artificial intelligence and cognitive sciences, especially in the areas of machine learning, knowledge discovery from databases and pattern recognition. The rough set concept overlaps, to some extent, with many other mathematical tools developed to deal with vagueness and uncertainty, in particular with the Dempster-Shafer theory of evidence and fuzzy set theory. Naturally, rough set techniques have also been used for information retrieval. The term paper will address the use of rough sets in the context of IR and their strengths and weaknesses, especially in comparison to other theories of uncertainty.
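
As a rough illustration of the core idea, the sketch below computes the lower and upper approximations of a set of terms with respect to a partition into equivalence classes. The synonym classes, the toy query and the function name `approximations` are assumptions made for this example only; they are not taken from the papers listed for this topic.

```python
def approximations(target, classes):
    """Return (lower, upper) rough approximations of `target`.

    `classes` is a partition of the universe into equivalence classes.
    """
    target = set(target)
    lower, upper = set(), set()
    for cls in classes:
        cls = set(cls)
        if cls <= target:   # class lies entirely inside the target
            lower |= cls
        if cls & target:    # class overlaps the target at all
            upper |= cls
    return lower, upper


# Hypothetical synonym classes over a toy vocabulary.
classes = [{"car", "automobile"}, {"fast", "quick"}, {"engine"}]
query = {"car", "engine"}
lower, upper = approximations(query, classes)
print(sorted(lower))
print(sorted(upper))
```

The lower approximation keeps only the classes that the query covers completely, while the upper approximation adds every class the query touches, which is one way such a model could bring related terms into a query or document description.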

Papers:

1. Sadaaki Miyamoto. "Application of Rough Sets to Information Retrieval", Journal of the American Society for Information Science (JASIS), Volume 49, Number 3, March 1998, pages 195-205

2. Padmini Srinivasan. "Intelligent Information Retrieval using Rough Set Approximations" Information Processing and Management, Volume 25, Number 4, 1989, pages 347-361 (Contact us for hardcopy)

Singular value decomposition in IR

Singular value decomposition techniques convert highly complex data into simpler forms that can be used in visualisation interfaces or to detect hidden patterns in data. A term paper on this topic will investigate the use of singular value decomposition in IR.

Papers:

1. George W. Furnas, Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, Richard A. Harshman, Lynn A. Streeter, Karen E. Lochbaum. "Information Retrieval using a Singular Value Decomposition Model of Latent Semantic Structure". Proceedings of the ACM SIGIR Conference, 1988: 465-480.

2. Alexander Thomasian, Vittorio Castelli, Chung-Sheng Li. "Clustering and Singular Value Decomposition for Approximate Indexing in High Dimensional Spaces". In Proceedings of the ACM CIKM Conference 1998: 201-207

3. Robert M. Corless, Patrizia M. Gianni, Barry M. Trager, Stephen M. Watt. "The Singular Value Decomposition for Polynomial Systems". In the Proceedings of the ISSAC Conference, 1995: 195-207

Search Strategies and Tactics

A searcher's information need is rarely satisfied in the first iteration of a search, and s/he is therefore faced with the task of improving the search in order to find more relevant documents. The overall effectiveness of the session can be improved by the use of a systematic search strategy, and the employment of appropriate tactics at each iteration of the search process. The main aims of this term paper are to clearly differentiate between search strategies and tactics, to examine the different strategies and tactics which can be employed, and to identify the criteria which affect the choice of strategies and tactics in different situations.

Reference:

1. Bates, M. (1979b). Information search tactics. Journal of the American Society for Information Science, 30, 205-214.

2. Peter Pirolli, Stuart K. Card. Information Foraging (1999), UIR Technical Report UIR-R97-01, Palo Alto Research Center

Query Modification Techniques

The following three term paper topics are related, but each emphasises a different aspect of query modification. The first looks at different techniques for changing a user's query based on which documents the user finds relevant; the second looks at automatically adding words to a query, whether or not the user has seen any relevant documents; and the third looks at techniques designed to help the user select new words to add to queries.

Automatic relevance feedback

Relevance feedback is a standard component of most IR systems. Once users have marked some documents as relevant, the system can use this information to improve the query supplied by the user and retrieve a better set of documents. The majority of IR systems perform this query modification automatically, so it is important to be able to identify good techniques for modifying a query. There is a range of issues in relevance feedback that would form the basis of a good term paper.
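
One well-known way to modify a query from relevance judgements is Rocchio-style reweighting. The sketch below is a minimal, positive-feedback-only variant, given here as an illustration rather than as the method of any particular system. The `alpha` and `beta` values, the term weights and the toy documents are all assumptions chosen for the example.

```python
from collections import Counter


def rocchio(query, relevant_docs, alpha=1.0, beta=0.75):
    """Return alpha * query + beta * centroid of the relevant documents.

    Queries and documents are dicts mapping terms to weights.
    """
    new_query = Counter({term: alpha * w for term, w in query.items()})
    for doc in relevant_docs:
        for term, weight in doc.items():
            # Each relevant document contributes an equal share of the centroid.
            new_query[term] += beta * weight / len(relevant_docs)
    return dict(new_query)


# Toy data: the user marked two documents as relevant.
query = {"jaguar": 1.0}
relevant = [
    {"jaguar": 1.0, "cat": 2.0},
    {"jaguar": 1.0, "habitat": 2.0},
]
modified = rocchio(query, relevant)
print(modified)
```

Terms that occur in the relevant documents but not in the original query ("cat", "habitat") enter the modified query with a non-zero weight, which is how feedback of this kind can retrieve a different set of documents on the next iteration.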

Reference:

1. D. Harman. Relevance feedback. Information Retrieval: Data Structures & Algorithms (W. B. Frakes and R. Baeza-Yates, eds.). Ch. 11.

2. W.B. Croft and D.J. Harper. Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35:285-295, 1979. (hard copy available)

Query expansion

Formulating a good query that retrieves relevant documents is not easy for many users. The query formulation can be improved by adding certain terms to the original query using a query expansion technique. This term paper should discuss various query expansion techniques.
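
To make the idea concrete, the sketch below shows one very simple expansion strategy: add the most frequent non-query terms found in the top-ranked documents of an initial search. This is only an illustration under stated assumptions. The function name `expand_query`, the cut-off `k` and the toy documents are hypothetical, and the techniques in the references below are considerably more refined (for example, this sketch does no stop-word removal or term weighting).

```python
from collections import Counter


def expand_query(query_terms, top_docs, k=2):
    """Add the k most frequent non-query terms from the top-ranked documents."""
    query = [t.lower() for t in query_terms]
    counts = Counter()
    for doc in top_docs:
        counts.update(w for w in doc.lower().split() if w not in query)
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    candidates = sorted(counts, key=lambda t: (-counts[t], t))
    return query + candidates[:k]


# Toy "top-ranked" documents returned for the initial query.
top_docs = [
    "jaguar habitat rainforest",
    "jaguar prey habitat",
    "jaguar rainforest habitat",
]
expanded = expand_query(["jaguar"], top_docs)
print(expanded)
```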

Reference:

1. Xu, J. and W.B. Croft. Query Expansion Using Local and Global Document Analysis. In Proceedings ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 4-11, Zurich, Switzerland, 1996.

2. Mandar Mitra, Amit Singhal, Chris Buckley. Improving Automatic Query Expansion. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 206-214, 1998.

Interactive query expansion

An alternative to automatic relevance feedback, in which queries are modified automatically, is interactive query expansion, in which the user is asked to select, from a list of terms, those that should be added to the query. This process gives the user more control over how a query is modified but can be less effective than the automatic approach. How terms are selected and how they are presented to the user are important factors in determining the success of interactive query expansion and would form the basis of a term paper.

Reference:

1. D Harman. Relevance feedback revisited. Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. 1992

2. Harman D.: Towards Interactive Query Expansion. In: Chiaramella Y. (editor): 11th International Conference on Research and Development in Information Retrieval, pp. 321-331, Grenoble, France, 1988

3. Van Rijsbergen, C.J. and Magennis, M. The potential and actual effectiveness of interactive query expansion. Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Seattle, USA)

Filtering and IR

In contrast to Information Retrieval (IR) systems, Information Filtering (IF) systems operate on streams of documents and serve a large number of users. The term paper should study similarities and differences between IF and IR and highlight those aspects (if any) for which filtering requires a different approach from retrieval.
References:

[1] Belkin, N. J. and W. B. Croft. Information filtering and information retrieval: two sides of the same coin? Communications of the ACM, Vol. 35, No. 12 (Dec. 1992), pages 29-38
[2] The TREC-7 Filtering Track: Description and Analysis, NIST Special Publication 500-242: The Seventh Text REtrieval Conference (TREC-7), 1998, page 33, http://trec.nist.gov/pubs/trec7/papers/tr7filter/paper.ps

Passage retrieval

Documents often have short, relevant sections contained within long sections of irrelevant material. Passage retrieval techniques retrieve documents based on the most relevant passage or section, rather than on the text of the whole document. The basis of this term paper is to examine different approaches to selecting the most relevant passage of a document.
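
The idea of ranking a document by its best passage can be sketched as follows. This is a minimal illustration that assumes fixed-size, overlapping word windows and a plain count of query-term occurrences as the passage score. The window size, step, function name `best_passage` and example text are assumptions for this sketch; the papers below consider other ways of defining and scoring passages.

```python
def best_passage(text, query_terms, size=8, step=4):
    """Return (score, passage) for the highest-scoring word window.

    Windows are `size` words long and start every `step` words; the score is
    the number of query-term occurrences in the window. Punctuation is not
    stripped, so this is only suitable for toy input.
    """
    words = text.lower().split()
    terms = {t.lower() for t in query_terms}
    best_score, best_start = 0, 0
    for start in range(0, len(words), step):
        window = words[start:start + size]
        score = sum(1 for w in window if w in terms)
        if score > best_score:
            best_score, best_start = score, start
    return best_score, " ".join(words[best_start:best_start + size])


doc = ("the weather report was long and dull but near the end it "
       "mentioned rough set theory for information retrieval")
score, passage = best_passage(doc, ["rough", "set", "retrieval"])
print(score, "|", passage)
```

A document whose relevant material is confined to one short section can then be ranked by `score` alone, instead of being penalised for the long irrelevant text around that section.
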
Reference:

1. Callan, J. Passage-level evidence in document retrieval. Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 1994. 302-310.

2. G. Salton, J. Allan and C. Buckley. Approaches to Passage Retrieval in Full Text Information Systems. ACM SIGIR 93, pp.49-56