DDR-2011: Diversity in Document Retrieval

Tetsuya Sakai - Challenges in Diversity Evaluation

Abstract: In this talk, I very briefly survey existing approaches to evaluating diversified search results. I then cover some open problems in this area. Finally, I report on the ongoing efforts at NTCIR that are related to diversity evaluation.

Bio: Tetsuya Sakai received a Master's degree from Waseda University in 1993 and joined the Toshiba Corporate R&D Center in the same year. He received a Ph.D. from Waseda University in 2000 for his work on information retrieval and filtering systems. From 2000 to 2001, he was a visiting researcher at the University of Cambridge Computer Laboratory. In 2007, he became Director of the Natural Language Processing Laboratory at NewsWatch, Inc. In 2009, he joined Microsoft Research Asia. He is Chair of IPSJ SIG-IFAT, Evaluation Co-chair of NTCIR, and Regional Representative to the ACM SIGIR Executive Committee (Asia/Pacific). He has served as a Senior PC member for ACM SIGIR, CIKM and AIRS. He is on the editorial boards of Information Processing and Management and of the journal Information Retrieval. He has received several awards in Japan, mostly from IPSJ. He is currently co-organising the NTCIR-9 1CLICK and INTENT tasks.

Alessandro Moschitti - Analysis of Document Diversity through Sentence-Level Opinion and Relation Extraction

Abstract: Diversity in document retrieval has mainly been approached as a classical statistical problem, where the typical optimization function aims at diversifying the retrieved items, which are represented by means of language models. Although this is an essential step towards effective approaches for capturing diversity, it is clearly not sufficient. Work on Novelty Detection has shown that sentence-level analysis is a promising research direction. However, models and theory are needed for understanding how the content of the target sentences differs. In this talk, an argument is made for using the current state of the art in Relation and Opinion Extraction at the sentence level. After some ideas for applying this technology to document retrieval are presented, advanced extraction models are briefly described.

Bio: Alessandro Moschitti is a professor in the Computer Science and Information Engineering Department of the University of Trento. He received his PhD in Computer Science from the University of Rome "Tor Vergata" in 2003. He has worked as an associate researcher at the University of Texas at Dallas (for two years), as a visiting professor in the CCLS department of Columbia University and, more recently, as a visiting researcher at the IBM Watson Research Center in New York for the Jeopardy! project and as a visiting professor in the cognitive science and natural language processing (NLP) department of the University of Colorado at Boulder. His expertise concerns theoretical and applied machine learning (ML) in the areas of NLP, IR and Data Mining. He has devised innovative kernels within support vector and other kernel-based machines for advanced syntactic/semantic processing. These have been documented in more than 110 scientific articles, published in the major conferences of several research communities, e.g., ACL, ICML, ECML-PKDD, CIKM, ECIR and ICDM. He is also an active PC member for the conferences and journals of the areas above. He is currently a guest editor of the Journal of Natural Language Engineering, an ML area co-chair for ACL-2011 and a co-chair for TextGraphs 6. He has participated in six projects of the European Community (EC) and in three US projects: MTBF with Con-Edison, IQAS for the ARDA AQUAINT program and Deep QA (the Jeopardy! challenge) with IBM. Currently, he is the project consortium coordinator of the EC Coordination Action EternalS, project coordinator of two Italian projects and responsible for the ML/NLP research in the LivingKnowledge project. He has received the IBM Faculty Award and other prestigious awards.