The list of topics suggested for the term paper is given below. Each topic comes with two or more papers. You are required to choose one topic. The aim of this term paper is to analyse and compare the contents of at least two papers on your chosen topic. You are required to write a paper/report of approximately 3000-3500 words.
You should read one paper and write an essay on it by 20th of February 2004.
Final Submission
You are required to submit two hard copies of your term paper on or before 21st of March 2004. You can submit them to either Iadh or Joemon.
Each topic comes with a short description and has an associated reference that you should read if you want to know more about the topic. Most of the references are available on the web or in the library. If you cannot find a reference, contact Joemon (jj) or Iadh (ounis), who can help you find it.
This is not a definitive list of topics - if there is a topic you are
interested in writing a term paper on that is not listed here then discuss it
with us.
The ACM Digital Library is a valuable online resource. You can access it at www.acm.org/dl.
Other useful sources are:
Information Processing and Management (IPandM)
Journal of the American Society for Information Science (JASIS)
Journal of Documentation (JDoc) (electronic version is not available)
Journal of Information Retrieval (JIR) (electronic version is not available)
ACM Transactions on Information Systems (TOIS) (available from the ACM Digital Library)
Please note that all these URLs can be accessed only through University servers. (They will not work from home computers.)
Advanced IR applications, such as multimedia hypertext and Web retrieval, require more effective and powerful indexing languages that allow a faithful representation of the complexity of document content. New information retrieval models, often called Operational or Modern Information Retrieval models, were built on top of such powerful indexing languages. Unlike the classical/traditional IR models, which use simple keywords as indexing terms, the operational IR models treat an indexing term as a more complex concept in which semantic connectors act as operators or relations. These make it possible to build new expressions that represent the real complexity of today's IR applications.
Papers:
1. Carlo Meghini, Fabrizio Sebastiani, Umberto Straccia, Constantino Thanos. "A Model of Information Retrieval Based on a Terminological Logic". In Korfhage, R.; Rasmussen, E.; Willett, P. (eds.): Proceedings of the Sixteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pages 298-308. ACM, New York.
2. Iadh Ounis and Marius Pasca. "RELIEF: Combining expressiveness and rapidity into a single system". In 21st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98).
Cross-Language Information Retrieval (CLIR)
CLIR is motivated by the availability of electronic documents in various languages. It is concerned with retrieving documents written in languages different from the language of the queries. Researchers have proposed a range of approaches to overcome this language barrier. This term paper should study and compare these approaches.
Reference: Oard, D. A Survey of Multilingual Text Retrieval.
Papers:
1. Doug Oard "A Comparative Study of Query and Document Translation for Cross-Language Information Retrieval" Paper presented at the Third Conference of the Association for Machine Translation in the Americas (AMTA), Philadelphia, PA, October, 1998.
2. Ari Pirkola "The Effect of Query Structure and Dictionary Setups in Dictionary-Based Cross-Language Information Retrieval", in 21st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98)
3. M. Littman, S. Dumais and T. Landauer. Automatic cross-language information retrieval using latent semantic indexing. In Grefenstette, G., editor, Cross Language Information Retrieval. Kluwer, 1998.
4. J.Y. Nie, P. Isabelle, M. Simard, R. Durand, Cross-language information retrieval based on parallel texts and automatic mining of parallel texts from the Web, ACM-SIGIR conference, Berkeley, CA, pp. 74-81 (1999)
Building information retrieval systems is a tedious task. Object-oriented software development techniques can be employed for building extensible and flexible software systems. ECLAIR and SketchTrieve are two such extensible architectures for information retrieval.
Papers:
Rao, R., Card, S. K., Jellinek, H. D., Mackinlay, J. D., & Robertson, G. G. (1992). The Information GRID: A framework for information retrieval and retrieval-centered applications. Proceedings of the ACM Symposium on User Interface Software and Technology, pages 23-32, New York, ACM Press
David G. Hendry and David J. Harper. An informal information-seeking environment. Journal of the American Society for Information Science, 48(11):1036-1048, 1997
David G. Hendry, David J. Harper, An architecture for implementing extensible information-seeking environments, Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pages 94-100, 1996, ACM Press
J M Jose, David G. Hendry and David J. Harper. An Object Oriented Framework for IR Applications, In the proceedings of OOIS 2001, Calgary, Canada, August 26-29th. pages: 259-268, Springer-Verlag
G. Sonnenberger, H. Frei. Design of a reusable IR framework. In Proceedings of the 19th ACM SIGIR, pages 48-57, 1995. (You can get this from the ACM Digital Library, www.acm.org/dl)
Evaluating IR on the Web raises different issues from evaluating classical IR, because the Web, and hence the processes of indexing and retrieving Web pages, are very different from those of classical information retrieval systems. The Web Track of the TREC initiative aims to provide experimental results about the performance of IR on the Web.
Papers:
1. D. Hawking, N. Craswell and P. Thistlewaite. "Overview of TREC-7 Very Large Collection Track." In Proceedings of the TREC Conference, 1999
2. D. Hawking, N. Craswell, P. Thistlewaite and D. Harman. "Results and Challenges in Web Search Evaluation". In Proceedings of the World Wide Web Conference, Toronto, Canada, April, 1999
3. D. Hawking, E. Voorhees, N. Craswell and P. Bailey. "Overview of TREC-8 Very Large Collection Track." In Proceedings of the TREC Conference, 2000
4. M. Agosti and M. Melucci. "Information Retrieval on the Web". In proceedings of the ESSIR 2000 Summer School, Lecture Notes in Computer Science N. 1980, Springer, 2000.
Rough set theory, introduced by Zdzislaw Pawlak in the 1980s, is a theory of vagueness and uncertainty. It is of fundamental importance to artificial intelligence and the cognitive sciences, especially in the areas of machine learning, knowledge discovery from databases and pattern recognition. The rough set concept overlaps, to some extent, with many other mathematical tools developed to deal with vagueness and uncertainty, in particular with the Dempster-Shafer theory of evidence and fuzzy set theory. Naturally, rough set techniques have been used for the purpose of information retrieval. The term paper will address the use of rough sets in the context of IR, and their strengths and weaknesses, especially in comparison with other theories of uncertainty.
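As a starting point, the lower and upper approximations at the heart of rough set theory can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of either paper listed below: the function name `approximations` and the toy vocabulary partition are hypothetical, and the equivalence classes are simply assumed to group terms that cannot be told apart (for example, synonyms).

```python
from typing import FrozenSet, Iterable, Set, Tuple


def approximations(
    target: Set[str], partition: Iterable[FrozenSet[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return the (lower, upper) rough approximations of `target`.

    lower: union of the equivalence classes fully contained in `target`.
    upper: union of the equivalence classes sharing any element with `target`.
    """
    lower: Set[str] = set()
    upper: Set[str] = set()
    for block in partition:
        if block <= target:   # the whole class lies inside the target set
            lower |= block
        if block & target:    # the class overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical vocabulary partition; the data are illustrative only.
partition = [
    frozenset({"car", "automobile"}),
    frozenset({"engine", "motor"}),
    frozenset({"road"}),
]
query = {"car", "automobile", "engine"}
lower, upper = approximations(query, partition)
```

Here the lower approximation holds the terms that certainly belong to the query concept, while the upper approximation also includes "motor", which is indiscernible from "engine" under the assumed partition.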
Papers:
1. Sadaaki Miyamoto. "Application of Rough Sets to Information Retrieval", Journal of the American Society for Information Science (JASIS), Volume 49, Number 3, March 1998, pages 195-205
2. Padmini Srinivasan. "Intelligent Information Retrieval using Rough
Set Approximations" Information Processing and Management, Volume 25,
Number 4, 1989, pages 347-361 (Contact us for hardcopy)
Techniques such as singular value decomposition (SVD) convert highly complex data into simpler forms that can be used in visualisation interfaces or to detect hidden patterns in data. A term paper on this topic will investigate the use of singular value decomposition in IR.
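To make the idea of converting data into a simpler form concrete, here is a minimal sketch of a truncated singular value decomposition using NumPy. The term-document matrix, the helper name `truncated_svd` and the choice of k = 2 are assumptions made for illustration; the papers below should be consulted for how SVD is actually applied in each system.

```python
import numpy as np

# Hypothetical term-document count matrix: rows are terms, columns are documents.
A = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])


def truncated_svd(matrix: np.ndarray, k: int):
    """Return the factors (U_k, s_k, Vt_k) keeping the k largest singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


U_k, s_k, Vt_k = truncated_svd(A, k=2)

# Each column is one document represented in the reduced 2-dimensional space.
doc_vectors = np.diag(s_k) @ Vt_k

# Rank-2 approximation of the original matrix.
A_k = U_k @ doc_vectors
```

Documents can then be compared in the reduced space (for example, by cosine similarity between columns of `doc_vectors`) instead of in the full term space.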
Papers:
1. George W. Furnas, Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, Richard A. Harshman, Lynn A. Streeter, Karen E. Lochbaum: "Information Retrieval using a Singular Value Decomposition Model of Latent Semantic Structure". Proceedings of the ACM SIGIR Conference, 1988: 465-480.
2. Alexander Thomasian, Vittorio Castelli, Chung-Sheng Li: "Clustering and Singular Value Decomposition for Approximate Indexing in High Dimensional Spaces". In Proceedings of the ACM CIKM Conference, 1998: 201-207.
3. Robert M. Corless, Patrizia M. Gianni, Barry M. Trager, Stephen M. Watt: "The Singular Value Decomposition for Polynomial Systems". In Proceedings of the ISSAC Conference, 1995: 195-207.
A searcher's information need is rarely satisfied in the first iteration of a search, and s/he is therefore faced with the task of improving the search in order to find more relevant documents. The overall effectiveness of the session can be improved by the use of a systematic search strategy, and the employment of appropriate tactics at each iteration of the search process. The main aims of this term paper are to clearly differentiate between search strategies and tactics, to examine the different strategies and tactics which can be employed, and to identify the criteria which affect the choice of strategies and tactics in different situations.
Reference:
1. Bates, M. (1979b). Information search tactics. Journal of the American Society for Information Science, 30, 205-214.
2. Peter Pirolli, Stuart K. Card, Information Foraging (1999), UIR Technical Report UIR-R97-01, Palo Alto Research Center
Query Modification Techniques
The following three term papers are on related topics, but each emphasises a different aspect of query modification. The first looks at different techniques for changing a user's query based on which documents the user finds relevant; the second looks at automatically adding words to a query, whether or not the user has seen any relevant documents; and the third looks at techniques designed to help the user select new words to add to queries.
Relevance feedback is a standard component of most IR systems. Once users have marked some documents as relevant, the system can use this information to improve the query supplied by the user and retrieve a better set of documents. The majority of IR systems perform this query modification automatically, so it is important to be able to identify good techniques for modifying a query. There is a range of issues in relevance feedback that would make a good basis for a term paper.
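One classical way of modifying a query from relevance judgements is the Rocchio formula. It is not named in the description above and is used here only as an illustration; the weights alpha, beta and gamma are commonly quoted defaults taken as assumptions, and the sparse dictionary representation and function names are hypothetical.

```python
from typing import Dict, List

Vector = Dict[str, float]  # sparse term -> weight mapping


def centroid(vectors: List[Vector]) -> Vector:
    """Average a list of sparse vectors; an empty list gives an empty vector."""
    if not vectors:
        return {}
    total: Vector = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rocchio(
    query: Vector,
    relevant: List[Vector],
    non_relevant: List[Vector],
    alpha: float = 1.0,   # assumed weight of the original query
    beta: float = 0.75,   # assumed weight of the relevant centroid
    gamma: float = 0.15,  # assumed weight of the non-relevant centroid
) -> Vector:
    """Move the query towards relevant documents and away from non-relevant ones."""
    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    new_query: Vector = {}
    for term in set(query) | set(rel_c) | set(non_c):
        weight = (
            alpha * query.get(term, 0.0)
            + beta * rel_c.get(term, 0.0)
            - gamma * non_c.get(term, 0.0)
        )
        if weight > 0:  # terms with non-positive weight are dropped
            new_query[term] = weight
    return new_query


# Illustrative data only.
modified = rocchio(
    {"jaguar": 1.0},
    relevant=[{"jaguar": 1.0, "cat": 1.0}, {"cat": 1.0}],
    non_relevant=[{"jaguar": 1.0, "car": 1.0}],
)
```

In this toy run the term "cat" enters the query because it occurs in the documents marked relevant, while "car" is excluded because it occurs only in the non-relevant document.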
Reference:
1. D. Harman. Relevance feedback. Information Retrieval: Data Structures & Algorithms (W. B. Frakes and R. Baeza-Yates, eds.). Ch. 11.
2. W.B. Croft and D.J. Harper, Using probabilistic models of document retrieval without relevance information, Journal of Documentation, 37:285--295, 1979. (hard copy available)
Formulating a good query that retrieves relevant documents is often difficult for users. Query formulation can be improved by adding certain terms to the original query using a query expansion technique. This term paper should discuss various query expansion techniques.
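The simplest form of automatic expansion can be sketched as follows: take the top-ranked documents for the original query and add the terms that occur most often in them. This is a deliberately naive sketch under stated assumptions (raw frequency counts, pre-tokenised documents, a hypothetical function name), not the technique of either reference below, which use more careful term selection and weighting.

```python
from collections import Counter
from typing import List


def expand_query(
    query_terms: List[str],
    top_documents: List[List[str]],
    num_new_terms: int = 2,
) -> List[str]:
    """Append the most frequent non-query terms from the top-ranked documents."""
    counts: Counter = Counter()
    for doc in top_documents:
        counts.update(term for term in doc if term not in query_terms)
    new_terms = [term for term, _ in counts.most_common(num_new_terms)]
    return list(query_terms) + new_terms


# Illustrative data: tokenised documents assumed to be the top-ranked results.
top_docs = [
    ["solar", "panel", "energy"],
    ["solar", "energy", "cell"],
    ["energy", "panel"],
]
expanded = expand_query(["solar"], top_docs)
```

A real system would normally remove stop words and weight candidate terms (for example, by how rare they are in the collection) before choosing which ones to add.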
Reference:
1. Xu, J. and W.B. Croft. Query Expansion Using Local and Global Document Analysis. In Proceedings ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 4-11, Zurich, Switzerland, 1996.
2. Mandar Mitra, Amit Singhal, Chris Buckley. Improving Automatic Query Expansion. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 206-214, 1998.
An alternative to automatic relevance feedback, in which queries are modified automatically, is interactive query expansion, in which the user is asked to select, from a list of terms, those that should be added to the query. This process gives the user more control over how a query is modified but can be less effective than the automatic approach. How terms are selected and how they are presented to the user are important factors in determining the success of interactive query expansion and would form the basis of a term paper.
Reference:
1. D Harman. Relevance feedback revisited. Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. 1992
2. Harman D.: Towards Interactive Query Expansion. In: Chiaramella Y. (editor): 11th International Conference on Research and Development in Information Retrieval, pp. 321 -- 331, Grenoble, France, 1988
3. Van Rijsbergen, C.J. and Magennis, M. The potential and actual effectiveness of interactive query expansion. Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Seattle, USA)
In contrast to Information Retrieval (IR) systems, Information Filtering
(IF) systems operate on streams of documents and serve a large number of users.
The term paper should study similarities and differences between IF and IR and
highlight those aspects (if any) for which filtering requires a different
approach from retrieval.
References:
[1] Belkin, N. J. and W. B. Croft, Information filtering and information retrieval: two sides of the same coin? Communications of the ACM, Vol. 35, No. 12 (Dec. 1992), Pages 29-38
[2] The TREC-7 Filtering Track: Description and Analysis, NIST Special
Publication 500-242: The Seventh Text REtrieval Conference (TREC-7), 1998, page
33, http://trec.nist.gov/pubs/trec7/papers/tr7filter/paper.ps
Documents often have short, relevant sections contained within long sections of irrelevant material. Passage retrieval techniques retrieve documents based on the most relevant passage or section, rather than on the text of the whole document. The aim of this term paper is to examine different approaches to selecting the most relevant passage of a document.
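One simple way to select a passage is to slide a fixed-length window over the document and keep the window that matches the query best. The sketch below assumes this windowing scheme, a pre-tokenised document and a plain count of matching tokens as the score; these are illustrative choices and the function name is hypothetical. The references below compare this kind of approach with passages based on document structure.

```python
from typing import List, Tuple


def best_passage(
    doc_tokens: List[str], query_terms: List[str], window: int = 5
) -> Tuple[int, int]:
    """Return (score, start) for the best fixed-length window.

    The score is the number of tokens in the window that are query terms.
    When several windows tie, the earliest one is returned.
    """
    query = set(query_terms)
    best_score, best_start = 0, 0
    last_start = max(len(doc_tokens) - window, 0)
    for start in range(last_start + 1):
        passage = doc_tokens[start:start + window]
        score = sum(1 for token in passage if token in query)
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


# Illustrative document and query.
doc = (
    "the weather was fine and then rough set theory in retrieval "
    "was discussed at length"
).split()
score, start = best_passage(doc, ["rough", "set", "retrieval"], window=5)
passage = doc[start:start + 5]
```

A document can then be ranked by the score of its best passage rather than by a score computed over its whole text.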
Reference:
1. Callan, J. Passage-level evidence in document retrieval. Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 1994. 302-310.
2. G. Salton, J. Allan and C. Buckley. Approaches to Passage Retrieval in Full Text Information Systems. ACM SIGIR 93, pp.49-56