IEEE Transactions on Multimedia paper online

“Content-Based Analysis Improves Audiovisual Archive Retrieval” by Bouke Huurnink, Cees Snoek, Maarten de Rijke and Arnold Smeulders (IEEE Transactions on Multimedia, published online: 5 April 2012) is available now.

Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. When combined, we find that content-based video retrieval incorporated into the archive’s practice results in significant performance increases for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.

IRJ paper online

“Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval” by Katja Hofmann, Shimon Whiteson and Maarten de Rijke (Information Retrieval Journal, published online: April 7, 2012) is available now.

As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank, retrieval systems can learn directly from implicit feedback inferred from user interactions. In such an online setting, algorithms must obtain feedback for effective learning while simultaneously utilizing what has already been learned to produce high quality results. We formulate this challenge as an exploration–exploitation dilemma and propose two methods for addressing it. By adding mechanisms for balancing exploration and exploitation during learning, each method extends a state-of-the-art learning to rank method, one based on listwise learning and the other on pairwise learning. Using a recently developed simulation framework that allows assessment of online performance, we empirically evaluate both methods. Our results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches. In addition, the results demonstrate that such a balance affects the two approaches in different ways, especially when user feedback is noisy, yielding new insights relevant to making online learning to rank effective in practice.
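To make the exploration–exploitation trade-off concrete, here is a minimal Python sketch of one simple balancing mechanism, an epsilon-greedy result list. This is an illustrative assumption, not necessarily either of the two methods proposed in the paper; the function name `epsilon_greedy_list` and its parameters are hypothetical.

```python
import random


def epsilon_greedy_list(exploit_ranking, candidates, epsilon, length, rng=None):
    """Build a result list that mixes exploitation and exploration.

    At each rank, with probability `epsilon` a random not-yet-shown candidate
    is inserted (exploration); otherwise the next not-yet-shown document from
    the current best ranking is used (exploitation).
    Illustrative assumption only, not the paper's algorithm.
    """
    rng = rng or random.Random()
    shown, seen = [], set()
    while len(shown) < length:
        remaining = [d for d in candidates if d not in seen]
        if not remaining:
            break  # nothing left to show
        if rng.random() < epsilon:
            doc = rng.choice(remaining)  # explore
        else:
            # exploit: take the highest-ranked document not shown yet
            doc = next((d for d in exploit_ranking if d not in seen), None)
            if doc is None:
                doc = rng.choice(remaining)
        shown.append(doc)
        seen.add(doc)
    return shown


ranking = ["d1", "d2", "d3", "d4"]
# epsilon = 0 is pure exploitation, epsilon = 1 is pure exploration.
print(epsilon_greedy_list(ranking, ranking, 0.0, 3))
print(epsilon_greedy_list(ranking, ranking, 0.5, 3, random.Random(42)))
```

Setting `epsilon` between 0 and 1 trades result quality now against feedback that may improve the ranker later.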

More ECIR 2012 papers online

Two more ECIR 2012 papers are online now.

“Adaptive Temporal Query Modeling” by Maria-Hendrike Peetz, Edgar Meij, Maarten de Rijke and Wouter Weerkamp is available here. We present an approach to query modeling that uses the temporal distribution of documents in an initially retrieved set of documents. Such distributions tend to exhibit bursts, especially in news-related document collections. We hypothesize that documents in those bursts are more likely to be relevant and update the query model with the most distinguishing terms in high-quality documents sampled from bursts. We evaluate the effectiveness of our models on a test collection of blog posts.
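The idea of bursts in the temporal distribution of retrieved documents can be sketched as follows. The threshold rule (mean plus `k` standard deviations of the per-date counts) and the names `find_bursts` and `docs_in_bursts` are assumptions made for illustration; the paper may define bursts differently.

```python
from collections import Counter
from statistics import mean, pstdev


def find_bursts(doc_dates, k=1.0):
    """Return the dates whose document count exceeds mean + k * std.

    Only dates with at least one retrieved document are counted.
    The threshold rule is an assumption for illustration.
    """
    counts = Counter(doc_dates)
    if not counts:
        return set()
    values = list(counts.values())
    threshold = mean(values) + k * pstdev(values)
    return {date for date, n in counts.items() if n > threshold}


def docs_in_bursts(docs, k=1.0):
    """docs: list of (doc_id, date) pairs from an initial retrieval run."""
    bursts = find_bursts([date for _, date in docs], k)
    return [doc_id for doc_id, date in docs if date in bursts]


# Hypothetical initial result set: five documents on one day, one on each of three others.
docs = [("a", "2012-01-01"), ("b", "2012-01-02"), ("c", "2012-01-02"),
        ("d", "2012-01-02"), ("e", "2012-01-02"), ("f", "2012-01-02"),
        ("g", "2012-01-03"), ("h", "2012-01-04")]
print(docs_in_bursts(docs))
```

Documents returned by `docs_in_bursts` would be the pool from which expansion terms are drawn in a burst-based query model.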

“Result Disambiguation in Web People Search” by Richard Berendsen, Bogomil Kovachev, Evi Nastou, Maarten de Rijke and Wouter Weerkamp is available here. In the paper we study the problem of disambiguating the results of a web people search engine: given a query consisting of a person name plus the result pages for this query, find correct referents for all mentions by clustering the pages according to the different people sharing the name. While the problem has been studied extensively, we discover that the increasing availability of results retrieved from social media platforms causes state-of-the-art methods to break down. We analyze the problem and propose a dual strategy where we distinguish between results obtained from social media platforms and those obtained from other sources. In our dual strategy, the two types of documents are disambiguated separately, using different strategies, and their results are then merged. We study several instantiations for the different stages in our proposed strategy and manage to achieve state-of-the-art performance.

CLEF 2011 conference report online

“CLEF 2011: Conference on Multilingual and Multimodal Information Access Evaluation” by Paul Clough, Nicola Ferro, Pamela Forner, Julio Gonzalo, Bouke Huurnink, Jaana Kekäläinen, Mounia Lalmas, Vivien Petras and Maarten de Rijke is online now. In the paper we report on CLEF 2011.

TREC 2011 papers online

Two TREC 2011 reports are online now.

“The University of Amsterdam at the TREC 2011 Session Track” by Bouke Huurnink, Richard Berendsen, Katja Hofmann, Edgar Meij and Maarten de Rijke is online now. In the paper we describe the participation of the University of Amsterdam’s ILPS group in the Session track at TREC 2011.

“Team COMMIT at TREC 2011” by Marc Bron, Edgar Meij, Maria-Hendrike Peetz, Manos Tsagkias and Maarten de Rijke is also online. In this paper we describe the participation of Team COMMIT in the TREC 2011 Microblog and Entity tracks.

ECIR 2012 paper online

“Predicting IMDB Movie Ratings Using Social Media” by Andrei Oghina, Mathias Breuss, Manos Tsagkias and Maarten de Rijke is available online now at this location.

In the paper, we consider the problem of predicting IMDb movie ratings. We examine two sets of features: surface and textual features. For the latter, we assume that no social media signal is isolated and use data from multiple channels that are linked to a particular movie, such as tweets from Twitter and comments from YouTube. We extract textual features from each channel to use in our prediction model and we explore whether data from either of these channels can help to extract a better set of textual features for prediction. Our best performing model is able to rate movies very close to the observed values.

ACM TOIS paper online

“Query Modeling for Entity Search Based on Terms, Categories, and Examples” by Krisztian Balog, Marc Bron and Maarten de Rijke is available online now.

Users often search for entities instead of documents, and in this setting, are willing to provide extra input, in addition to a series of query terms, such as category information and example entities. We propose a general probabilistic framework for entity search to evaluate and provide insights in the many ways of using these types of input for query modeling. We focus on the use of category information and show the advantage of a category-based representation over a term-based representation, and also demonstrate the effectiveness of category-based expansion using example entities. Our best performing model shows very competitive performance on the INEX-XER entity ranking and list completion tasks.

NIPS Workshop paper online

“Contextual Bandits for Information Retrieval,” by Katja Hofmann, Shimon Whiteson, Maarten de Rijke, is our contribution to the NIPS workshop on Bayesian Optimization, Experimental Design and Bandits: Theory and Applications. You can find it here.

In this paper we give an overview of and outlook on research at the intersection of information retrieval and contextual bandit problems. A critical problem in information retrieval is online learning to rank, where a search engine strives to improve the quality of the ranked result lists it presents to users on the basis of those users’ interactions with those result lists. Recently, researchers have started to model interactions between users and search engines as contextual bandit problems, and initial methods for learning in this setting have been devised. Our research focuses on two aspects: balancing exploration and exploitation and inferring preferences from implicit user interactions. This paper summarizes our recent work on online learning to rank for information retrieval and points out challenges that are characteristic of this application area.

WSDM 2012 paper online

Our WSDM 2012 paper “Adding semantics to microblog posts” (Meij, Weerkamp, de Rijke) is online now.

Microblogs have become an important source of information for the purpose of marketing, intelligence and reputation management. Streams of microblogs are of great value because of their direct and real-time nature. Determining what an individual microblog post is about, however, can be non-trivial because of creative language usage, the highly contextualized and informal nature of microblog posts, and the limited length of this form of communication.

We propose a solution to the problem of determining what a microblog post is about through semantic linking: we add semantics to posts by automatically identifying concepts that are semantically related to them and generating links to the corresponding Wikipedia articles. The identified concepts can subsequently be used for, e.g., social media mining, thereby reducing the need for manual inspection and selection. Using a purpose-built test collection of tweets, we show that recently proposed approaches for semantic linking do not perform well, mainly due to the idiosyncratic nature of microblog posts. We propose a novel method based on machine learning with a set of innovative features and show that it is able to achieve significant improvements over all other methods, especially in terms of precision.

The paper is available here.

Five ECIR 2012 papers

Five ILPS papers were accepted for ECIR 2012:

  • R. Berendsen, B. Kovachev, E. Nastou, M. de Rijke, W. Weerkamp, “Result Disambiguation in Web People Search”

  • M. Bosma, E. Meij, W. Weerkamp, “A Framework for Unsupervised Spam Detection in Social Networking Sites”

  • P. Lubell-Doughtie, K. Hofmann, “Learning to Rank from Relevance Feedback for e-Discovery”

  • A. Oghina, M. Breuss, M. Tsagkias, M. de Rijke, “Predicting IMDB Movie Ratings Using Social Media”

  • M.-H. Peetz, E. Meij, M. de Rijke, W. Weerkamp, “Adaptive Temporal Query Modeling”

One more CIKM 2011 paper online

A second CIKM 2011 paper, Automatic Link Generation with Wikipedia: A Case Study in Annotating Radiology Reports by Jiyin He, Maarten de Rijke, Merlijn Sevenster, Rob van Ommering and Yuechen Qian is now also available online.

Automatically annotating texts with background information has recently received much attention. We conduct a case study in automatically generating links from narrative radiology reports to Wikipedia. Such links help users understand the medical terminology and thereby increase the value of the reports.

Direct applications of existing automatic link generation systems trained on Wikipedia to our radiology data do not yield satisfactory results. Our analysis reveals that medical phrases are often syntactically regular but semantically complicated, e.g., containing multiple concepts or concepts with multiple modifiers. The latter property is the main reason for the failure of existing systems. Based on this observation, we propose an automatic link generation approach that takes into account these properties. We use a sequential labeling approach with syntactic features for anchor text identification in order to exploit syntactic regularities in medical terminology. We combine this with a sub-anchor based approach to target finding, which is aimed at coping with the complex semantic structure of medical phrases. Empirical results show that the proposed system effectively improves the performance over existing systems.

CIKM 2011 paper online

One of our papers for this year’s CIKM is now online: A Probabilistic Method for Inferring Preferences from Clicks by Katja Hofmann, Shimon Whiteson and Maarten de Rijke.

Evaluating rankers using implicit feedback, such as clicks on documents in a result list, is an increasingly popular alternative to traditional evaluation methods based on explicit relevance judgments. Previous work has shown that so-called interleaved comparison methods can utilize click data to detect small differences between rankers and can be applied to learn ranking functions online.

In this paper, we analyze three existing interleaved comparison methods and find that they are all either biased or insensitive to some differences between rankers. To address these problems, we present a new method based on a probabilistic interleaving process. We derive an unbiased estimator of comparison outcomes and show how marginalizing over possible comparison outcomes given the observed click data can make this estimator even more effective.

We validate our approach using a recently developed simulation framework based on a learning to rank dataset and a model of click behavior. Our experiments confirm the results of our analysis and show that our method is both more accurate and more robust to noise than existing methods.

Information Retrieval Journal paper on blog feed search online

Blog feed search with a post index by Wouter Weerkamp, Krisztian Balog and Maarten de Rijke has been made available online by the Information Retrieval Journal. User generated content forms an important domain for mining knowledge. In this paper, we address the task of blog feed search: to find blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to mention the topic in passing. The large number of blogs makes the blogosphere a challenging domain, both in terms of effectiveness and of storage and retrieval efficiency. We examine the effectiveness of an approach to blog feed search that is based on individual posts as indexing units (instead of full blogs). Working in the setting of a probabilistic language modeling approach to information retrieval, we model the blog feed search task by aggregating over a blogger’s posts to collect evidence of relevance to the topic and persistence of interest in the topic. This approach achieves state-of-the-art performance in terms of effectiveness. We then introduce a two-stage model where a pre-selection of candidate blogs is followed by a ranking step. The model integrates aggressive pruning techniques as well as very lean representations of the contents of blog posts, resulting in substantial gains in efficiency while maintaining effectiveness at a very competitive level.

ECIR 2011 papers online

Two of our papers for this year’s ECIR are online now.

One paper, Balancing Exploration and Exploitation in Learning to Rank Online, is by Katja Hofmann, Shimon Whiteson and Maarten de Rijke. As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank approaches, retrieval systems can learn directly from implicit feedback, while they are running. In such an online setting, algorithms need to both explore new solutions to obtain feedback for effective learning, and exploit what has already been learned to produce results that are acceptable to users. We formulate this challenge as an exploration-exploitation dilemma and present the first online learning to rank algorithm that works with implicit feedback and balances exploration and exploitation. We leverage existing learning to rank data sets and recently developed click models to evaluate the proposed algorithm. Our results show that finding a balance between exploration and exploitation can substantially improve online retrieval performance, bringing us one step closer to making online learning to rank work in practice.

The other paper, Incorporating Query Expansion and Quality Indicators in Searching Microblog Posts, is by Kamran Massoudi, Manos Tsagkias, Maarten de Rijke and Wouter Weerkamp. In the paper we propose a retrieval model for searching microblog posts for a given topic of interest. We develop a language modeling approach tailored to microblogging characteristics, where redundancy-based IR methods cannot be used in a straightforward manner. We enhance this model with two groups of quality indicators: textual and microblog specific. Additionally, we propose a dynamic query expansion model for microblog post retrieval. Experimental results on Twitter data reveal the usefulness of boolean search, and demonstrate the utility of quality indicators and query expansion in microblog search.

WSDM 2011 paper online

Our WSDM 2011 paper, Linking Online News and Social Media by Manos Tsagkias, Maarten de Rijke and Wouter Weerkamp, is online at this location.

Much of what is discussed in social media is inspired by events in the news and, vice versa, social media provide us with a handle on the impact of news events. We address the following linking social media utterances task: given a news article, find social media utterances that implicitly reference it.

We follow a three-step approach: we derive multiple query models from a given source news article, which are then used to retrieve utterances from a target social media index, resulting in multiple ranked lists that we then merge into a single result list using data fusion techniques.

Query models are created by exploiting the structure of the source news article and by using explicitly linked social media utterances that are known to discuss the source article.

To combat query drift resulting from the large volume of text, either in the source news article itself or in social media utterances explicitly linked to it, we introduce a graph-based method for selecting discriminative terms.

For our experimental evaluation, we use data from Twitter, Digg, Delicious, the New York Times Community, Wikipedia, and the blogosphere to generate query models. We show that different query models, based on different data sources, provide complementary information and manage to retrieve different social media utterances from our target index. As a consequence, (article dependent) data fusion methods manage to significantly boost retrieval performance over individual approaches. Our graph-based term selection method is shown to help improve both effectiveness and efficiency.
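The final step of the three-step approach, merging several ranked lists into one, can be illustrated with CombSUM, a standard data fusion technique. Whether the paper uses CombSUM or another fusion method is not stated above, so treat the choice, the min-max normalization, and the function names as assumptions.

```python
def min_max_normalize(scores):
    """Rescale one run's retrieval scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several runs (dicts of doc id -> score) by summing normalized scores.

    Returns doc ids sorted by fused score, ties broken by doc id.
    CombSUM is used here as an assumed, generic fusion method.
    """
    fused = {}
    for run in runs:
        if not run:
            continue
        for doc, score in min_max_normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + score
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


# Hypothetical runs, each produced by a different query model for one article.
run_from_article_title = {"u1": 3.0, "u2": 1.0, "u3": 2.0}
run_from_linked_tweets = {"u2": 10.0, "u3": 8.0, "u4": 0.0}
print(comb_sum([run_from_article_title, run_from_linked_tweets]))
```

Utterances retrieved by several query models accumulate score, which is how complementary query models can lift items that no single model ranks first.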

CIKM 2010 paper online

Our CIKM 2010 paper Ranking Related Entities: Components and Analyses by Marc Bron, Krisztian Balog and Maarten de Rijke, is available online.

Related entity finding is the task of returning a ranked list of homepages of relevant entities of a specified type that need to engage in a given relationship with a given source entity. We propose a framework for addressing this task and perform a detailed analysis of four core components: co-occurrence models, type filtering, context modeling and homepage finding. Our initial focus is on recall. We analyze the performance of a model that only uses co-occurrence statistics. While this method identifies the potential set of related entities, it fails to rank them effectively. Two types of error emerge: (1) entities of the wrong type pollute the ranking and (2) while somehow associated with the source entity, some retrieved entities do not engage in the right relation with it. To address (1), we add type filtering based on category information available in Wikipedia. To correct for (2), we complement our related entity finding method with contextual information, represented as language models derived from documents in which source and target entities co-occur. To complete the pipeline, we find homepages of top ranked entities by combining a language modeling approach with heuristics based on Wikipedia's external links. Our method achieves very high recall scores on the end-to-end task, providing a solid starting point for expanding our focus to improve precision. Our framework can effectively incorporate additional heuristics and these extensions lead to state-of-the-art performance.

Another CLEF 2010 Conference paper online

Another CLEF 2010 Conference paper is also online now: On the Evaluation of Entity Profiles by Maarten de Rijke, Krisztian Balog, Toine Bogers and Antal van den Bosch.

Entity profiling is the task of identifying and ranking descriptions of a given entity. The task may be viewed as one where the descriptions being sought are terms that need to be selected from a knowledge source (such as an ontology or thesaurus). In this case, entity profiling systems can be assessed by means of precision and recall values of the descriptive terms produced. However, recent evidence suggests that more sophisticated metrics are needed that go beyond mere lexical matching of system-produced descriptors against a ground truth, allowing for graded relevance and rewarding diversity in the list of descriptors returned. In this note, we motivate and propose such a metric.

CLEF 2010 Conference paper online

One of our CLEF 2010 conference papers, Validating Query Simulators: An Experiment Using Commercial Searches and Purchases by Bouke Huurnink, Katja Hofmann, Maarten de Rijke and Marc Bron, is available online now.

In the paper we design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained using logged commercial searches and associated purchase decisions. While no simulator reproduces an ideal ranking, there is a large variation in simulator performance that allows us to distinguish those that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
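The validation idea, checking whether a simulator ranks retrieval systems in the same order as real-world data does, can be sketched with a rank correlation. Kendall's tau is used below as an assumed measure (the text above does not name one), and the system names and scores are made up for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts over the same systems.

    +1 means both evaluations rank the systems identically, -1 means the
    order is fully reversed. Tied pairs count as neither agreeing nor
    disagreeing. Requires at least two systems.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical effectiveness scores for three retrieval systems.
real_world = {"system_a": 0.30, "system_b": 0.20, "system_c": 0.10}
simulator_1 = {"system_a": 0.50, "system_b": 0.40, "system_c": 0.10}
simulator_2 = {"system_a": 0.20, "system_b": 0.30, "system_c": 0.10}
print(kendall_tau(real_world, simulator_1))
print(kendall_tau(real_world, simulator_2))
```

Under this sketch, a simulator with a higher correlation to the real-world ranking would be the better candidate for building artificial testbeds.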

ACL 2010 paper online

Our ACL 2010 paper Generating Focused Topic-specific Sentiment Lexicons by Valentin Jijkoun, Maarten de Rijke and Wouter Weerkamp is available online now.

In the paper we present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

Semantic Search workshop paper online

Our Semantic Search Workshop at WWW 2010 paper Entity Search: Building Bridges Between Two Worlds, by Krisztian Balog, Edgar Meij and Maarten de Rijke, is available online now.

We consider the task of entity search and examine to what extent state-of-the-art information retrieval (IR) and semantic web (SW) technologies are capable of answering information needs that focus on entities. We also explore the potential of combining IR with SW technologies to improve the end-to-end performance on a specific entity search task. We arrive at and motivate a proposal to combine text-based entity models with semantic information from the Linked Open Data cloud.

Another INEX 2009 paper online

A second INEX 2009 paper, Combining term-based and category-based representations for entity search by Krisztian Balog, Marc Bron, Maarten de Rijke and Wouter Weerkamp, is also online now.

In the paper we describe our participation in the INEX 2009 Entity Ranking track. We employ a probabilistic retrieval model for entity search in which term-based and category-based representations of queries and entities are effectively integrated. We demonstrate that our approach achieves state-of-the-art performance on both the entity ranking and list completion tasks.

INEX 2009 paper online

One of our INEX 2009 papers, An exploration of learning to link with Wikipedia: Features, methods and training collection, by Jiyin He and Maarten de Rijke is online now.

We describe our participation in the Link-the-Wiki track at INEX 2009. We apply machine learning methods to the anchor-to-best-entry-point task and explore the impact of the following aspects of our approaches: features, learning methods as well as the collection used for training the models. We find that a learning to rank-based approach and a binary classification approach do not differ a lot. The new Wikipedia collection, which is larger and has more links than the collection previously used, provides better training material for learning our models. In addition, a heuristic run which combines the two intuitively most useful features outperforms machine learning based runs, which suggests that a further analysis and selection of features is necessary.

CIVR 2010 paper online

Our CIVR 2010 paper Today's and Tomorrow's Retrieval Practice in the Audiovisual Archive by Bouke Huurnink, Cees Snoek, Maarten de Rijke and Arnold Smeulders is online now.

Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. We investigate to what extent content-based video retrieval methods can improve search in the audiovisual archive. In particular, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches and content purchases from an existing audiovisual archive to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. We find that incorporating content-based video retrieval into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. Our experiments also indicate that individual content-based retrieval methods yield approximately equal performance gains. We conclude that the time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.

NAACL Social Media workshop paper online

Our NAACL 2010 Social Media workshop paper Mining User Experiences from Online Forums: An Exploration by Valentin Jijkoun, Maarten de Rijke, Wouter Weerkamp, Paul Ackermans and Gijs Geleijnse is available online now.

We introduce the task of experience mining. Here, the goal is to gain insights into criteria that people formulate to judge or rate a product or its usage. These criteria can be formulated as the expectations that people have of the product in advance (i.e., the reasons to buy), but can also be expressed as reports of experiences while using the product and comparisons with other products. We focus on the latter: reports of experiences with products. In this paper, we define the task, describe guidelines for manual annotation and analyze linguistic features that can be used in an automatic experience mining system.

And one more JASIST paper online

Search Behavior of Media Professionals at an Audiovisual Archive: A Transaction Log Analysis by Bouke Huurnink, Laura Hollink, Wietske van den Heuvel and Maarten de Rijke is now available online on the Journal of the American Society for Information Science and Technology site at http://doi.wiley.com/10.1002/asi.21327.

Finding audiovisual material for reuse in new programs is an important activity for news producers, documentary makers, and other media professionals. Such professionals are typically served by an audiovisual broadcast archive. We report on a study of the transaction logs of one such archive. The analysis includes an investigation of commercial orders made by the media professionals and a characterization of sessions, queries, and the content of terms recorded in the logs. One of our key findings is that there is a strong demand for short pieces of audiovisual material in the archive. In addition, while searchers are generally able to quickly navigate to a usable audiovisual broadcast, it takes them longer to place an order when purchasing a subsection of a broadcast than when purchasing an entire broadcast. Another key finding is that queries predominantly consist of (parts of) broadcast titles and of proper names. Our observations imply that it may be beneficial to increase support for fine-grained access to audiovisual material, for example, through manual segmentation or content-based analysis.

Another JASIST paper online

Contextual Factors for Finding Similar Experts by Katja Hofmann, Krisztian Balog, Toine Bogers and Maarten de Rijke was accepted by the Journal of the American Society for Information Science and Technology late last year. It is available online now at http://dx.doi.org/10.1002/asi.21292.

Expertise-seeking research studies how people search for expertise and choose whom to contact in the context of a specific task. Important outcomes are models that identify factors that influence expert finding. Expertise retrieval addresses the same problem, expert finding, but from a system-centered perspective. The main focus has been on developing content-based algorithms similar to document search. These algorithms identify matching experts primarily on the basis of the textual content of documents with which experts are associated. Other factors, such as the ones identified by expertise-seeking models, are rarely taken into account. In this article, we extend content-based expert-finding approaches with contextual factors that have been found to influence human expert finding. We focus on a task of science communicators in a knowledge-intensive environment, the task of finding similar experts, given an example expert. Our approach combines expertise-seeking and retrieval research. First, we conduct a user study to identify contextual factors that may play a role in the studied task and environment. Then, we design expert retrieval models to capture these factors. We combine these with content-based retrieval models and evaluate them in a retrieval experiment. Our main finding is that while content-based features are the most important, human participants also take contextual factors into account, such as media experience and organizational structure. We develop two principled ways of modeling the identified factors and integrate them with content-based retrieval models. Our experiments show that models combining content-based and contextual factors can significantly outperform existing content-based models.

Listening to ''Stuki'', by The Durutti Column (Play Count: 73)

ECIR 2010 papers online

Our two ECIR 2010 papers are online now. One is entitled Category-based Query Modeling for Entity Search and authored by Krisztian Balog, Marc Bron and Maarten de Rijke. Users often search for entities instead of documents and in this setting are willing to provide extra input, in addition to a query, such as category information and example entities. We propose a general probabilistic framework for entity search to evaluate and provide insight into the many ways of using these types of input for query modeling. We focus on the use of category information and show the advantage of a category-based representation over a term-based representation, and also demonstrate the effectiveness of category-based expansion using example entities. Our best performing model shows very competitive performance on the INEX-XER entity ranking and list completion tasks. The paper is available here.

Our other ECIR 2010 paper is called News Comments: Exploring, Modeling, and Online Prediction, with Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke as authors. Online news agents provide commenting facilities for their readers to express their opinions or sentiments with regard to news stories. The number of user-supplied comments on a news article may be indicative of its importance, interestingness, or impact. We explore the news comments space, and compare the log-normal and the negative binomial distributions for modeling comments from various news agents. These estimated models can be used to normalize raw comment counts and enable comparison across different news sites. We also examine the feasibility of online prediction of the number of comments, based on the volume observed shortly after publication. We report on solid performance for predicting news comment volume in the long run, after short observation. This prediction can be useful for identifying news stories with the potential to "take off," and can be used to support front page optimization for news sites. The paper is available here.
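
To make the normalization idea concrete, here is a minimal sketch, assuming a log-normal model fitted per news site from the mean and standard deviation of log comment counts. A raw count is then expressed as a z-score in log space, which makes articles from sites of very different size comparable. The function names, the sample counts and the use of a z-score are assumptions for illustration; they are not taken from the paper.

```python
import math
from statistics import mean, pstdev

def fit_lognormal(counts):
    """Estimate (mu, sigma) of a log-normal from positive comment counts.

    Articles with zero comments are skipped, since log(0) is undefined
    (an assumption of this sketch).
    """
    logs = [math.log(c) for c in counts if c > 0]
    return mean(logs), pstdev(logs)

def normalized_score(count, mu, sigma):
    """Z-score of a comment count in log space under the fitted model."""
    return (math.log(count) - mu) / sigma

# Hypothetical comment counts for two news sites of different size.
big_site = [120, 300, 80, 950, 210]
small_site = [3, 8, 2, 25, 5]

mu_big, sigma_big = fit_lognormal(big_site)
mu_small, sigma_small = fit_lognormal(small_site)

# 950 comments on the big site and 25 on the small site can now be
# compared on the same scale.
score_big = normalized_score(950, mu_big, sigma_big)
score_small = normalized_score(25, mu_small, sigma_small)
```

A negative binomial fit could be swapped in the same way; only `fit_lognormal` and `normalized_score` would change.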

Listening to ''BWV 0730 Liebster Jesu, wir sind hier'', by Hans Fagius (Organ) (Play Count: 3)

JASIST paper online

Predicting Podcast Preference: An Analysis Framework and its Application by Manos Tsagkias, Martha Larson and Maarten de Rijke was accepted by the Journal of the American Society for Information Science and Technology a while back. It is available online now at http://dx.doi.org/10.1002/asi.21259.

In the paper we start from the observation that finding worthwhile podcasts can be difficult for listeners since podcasts are published in large numbers and vary widely with respect to quality and repute. Independently of their informational content, certain podcasts provide satisfying listening material while other podcasts have little or no appeal. In this paper we present PodCred, a framework for analyzing listener appeal, and we demonstrate its application to the task of automatically predicting the listening preferences of users. First, we describe the PodCred framework, which consists of an inventory of factors contributing to user perceptions of the credibility and quality of podcasts. The framework is designed to support automatic prediction of whether or not a particular podcast will enjoy listener preference. It consists of four categories of indicators related to the Podcast Content, the Podcaster, the Podcast Context, and the Technical Execution of the podcast. Three studies contributed to the development of the PodCred framework: a review of the literature on credibility for other media, a survey of prescriptive guidelines for podcasting, and a detailed data analysis. Next, we report on a validation exercise in which the PodCred framework is applied to a real-world podcast preference prediction task. Our validation focuses on select framework indicators that show promise of being both discriminative and readily accessible. We translate these indicators into a set of easily extractable surface features and use them to implement a basic classification system. The experiments carried out to evaluate the system use popularity levels in iTunes as ground truth and demonstrate that simple surface features derived from the PodCred framework are indeed useful for classifying podcasts.

Listening to ''Eleusian Lullaby'', by Alio Die & Martina Galvagni (Play Count: 2)

IPM paper online

Conceptual Languages for Domain-Specific Retrieval by Edgar Meij, Dolf Trieschnigg, Maarten de Rijke and Wessel Kraaij was accepted for publication in Information Processing and Management a while back; it is available now. Over the years, various meta-languages have been used to manually enrich documents with conceptual knowledge of some kind. Examples include keyword assignment to citations or, more recently, tags to websites. In this paper we propose generative concept models as an extension to query modeling within the language modeling framework, which leverages these conceptual annotations to improve retrieval. By means of relevance feedback the original query is translated into a conceptual representation, which is subsequently used to update the query model.

Extensive experimental work on five test collections in two domains shows that our approach gives significant improvements in terms of recall, initial precision and mean average precision with respect to a baseline without relevance feedback. On one test collection, it is also able to outperform a text-based pseudo-relevance feedback approach based on relevance models. On the other test collections it performs similarly to relevance models. Overall, conceptual language models have the added advantage of offering query and browsing suggestions in the form of conceptual annotations. In addition, the internal structure of the meta-language can be exploited to add related terms.

Our contributions are threefold. First, an extensive study is conducted on how to effectively translate a textual query into a conceptual representation. Second, we propose a method for updating a textual query model using the concepts in the conceptual representation. Finally, we provide an extensive analysis of when and how this conceptual feedback improves retrieval.

IJDAR paper online

An Efficient Coherence Measure to Determine Topical Consistency in User Generated Content by Jiyin He, Wouter Weerkamp, Martha Larson and Maarten de Rijke is available online. When searching for blogs on a specific topic, information seekers prefer blogs that place a central focus on that topic over blogs whose mention of the topic is diffuse or incidental. In order to present users with better blog feed search results, we develop a measure of topical consistency that is able to capture whether or not a blog is topically focused. The measure, called the coherence score, is inspired by the genetics literature and captures the tightness of the clustering structure of a data set relative to a background collection. In a set of experiments on synthetic data, the coherence score is shown to provide a faithful reflection of topic clustering structure. The properties that make the coherence score more appropriate than lexical cohesion, a common measure of topical structure, are discussed. Retrieval experiments show that integrating the coherence score as a prior in a language modeling-based approach to blog feed search improves retrieval effectiveness. The coherence score must, however, be used judiciously in order to avoid boosting the ranking of irrelevant but topically focused blogs. To this end, we experiment with a series of weighting schemes that adjust the contribution of the coherence score according to the relevance of a blog to the user query. An appropriate weighting scheme is able to improve retrieval performance. Finally, we show that the coherence score can be reliably estimated with a sample exceeding 20 posts in size. Consistent with this finding, experiments show that the best retrieval performance is achieved if coherence scores are used when a blog contains more than 20 posts.
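
The intuition behind a coherence score can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes the score is the fraction of post pairs whose cosine similarity reaches a threshold, and it assumes that threshold has already been derived from a background collection. The function names, the bag-of-words representation and the threshold value are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def coherence(posts, threshold):
    """Fraction of post pairs that are at least `threshold` similar.

    `threshold` is assumed to come from a background collection, e.g. a
    similarity value that only a small share of random pairs exceed.
    """
    pairs = list(combinations(posts, 2))
    if not pairs:
        return 0.0
    similar = sum(1 for a, b in pairs if cosine(a, b) >= threshold)
    return similar / len(pairs)

# Hypothetical blogs: one topically focused, one diffuse.
focused = [["jazz", "sax", "solo"], ["jazz", "sax", "album"], ["jazz", "solo", "album"]]
diffuse = [["jazz", "sax"], ["pasta", "recipe"], ["holiday", "beach"]]

focused_score = coherence(focused, threshold=0.3)
diffuse_score = coherence(diffuse, threshold=0.3)
```

With these toy posts the focused blog gets a higher score than the diffuse one, which is the behavior a prior for blog feed search would rely on.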

Listening to ''Credo - Chorus Crucifixus'', by Catherine Denley, Etc., Harry Christophers; The Sixteen Choir & Orchestra Catherine Dubosc (Play Count: 5)

CIKM 2009 papers online

Three CIKM 2009 papers are online now. The first, The Impact of Document Structure on Keyphrase Extraction by Katja Hofmann, Manos Tsagkias, Edgar Meij and Maarten de Rijke, can be downloaded here. Keyphrases are short phrases that reflect the main topic of a document. Because manually annotating documents with keyphrases is a time-consuming process, several automatic approaches have been developed. Typically, candidate phrases are extracted using features such as position or frequency in the document text. Document structure may contain useful information about which parts or phrases of a document are important, but has rarely been considered as a source of information for keyphrase extraction. We address this issue in the context of keyphrase extraction from scientific literature. We introduce a new, large corpus that consists of full-text journal articles, where the rich collection and document structure available at the publishing stage is explicitly annotated. We explore features based on the XML tags contained in the documents, and based on generic section types derived using position and cue words in section titles. For XML tags we find sections, abstract, and title to perform best, but many smaller elements may be beneficial in combination with other features. Of the generic section types, the discussion section is found to be the most useful for keyphrase extraction.

The second paper, A Query Model Based on Normalized Log-Likelihood, by Edgar Meij, Wouter Weerkamp and Maarten de Rijke, is available here. Leveraging information from relevance assessments has been proposed as an effective means for improving retrieval. We introduce a novel language modeling method which uses information from each assessed document and their aggregate. While most previous approaches focus either on features of the entire set or on features of the individual relevant documents, our model exploits features of both the documents and the set as a whole. In our evaluation, the model significantly improves over state-of-the-art feedback methods.

The third paper, Predicting the Volume of Comments on Online News Stories by Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke, is available here. Online news agents provide commenting facilities for readers to express their views with regard to news stories. The number of user-supplied comments on a news article may be indicative of its importance or impact. We report on exploratory work that predicts the comment volume of news articles prior to publication using five feature sets. We address the prediction task as a two-stage classification task: a binary classification identifies articles with the potential to receive comments, and a second binary classification receives the output from the first step to label articles "low" or "high" comment volume. The results show solid performance for the former task, while performance degrades for the latter.

ISWC 2009 paper online

Learning Semantic Query Suggestions by Edgar Meij, Marc Bron, Laura Hollink, Bouke Huurnink and Maarten de Rijke is available online now. An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.

ACL-IJCNLP 2009 paper online

A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections by Wouter Weerkamp, Krisztian Balog and Maarten de Rijke is available online now. User generated content is characterized by short, noisy documents, with many spelling errors and unexpected language usage. To bridge the vocabulary gap between the user's information need and documents in a specific user generated content environment, the blogosphere, we apply a form of query expansion, i.e., adding and reweighting query terms. Since the blogosphere is noisy, query expansion on the collection itself is rarely effective, but external, edited collections are more suitable. In the paper we propose a generative model for expanding queries using external collections in which dependencies between queries, documents, and expansion documents are explicitly modeled. Different instantiations of our model are discussed and make different (in)dependence assumptions. Results using two external collections (news and Wikipedia) show that external expansion for retrieval of user generated content is effective; besides, conditioning the external collection on the query is very beneficial, and making candidate expansion terms dependent on just the document seems sufficient.

Listening to ''Tears for Affairs'', by Camera Obscura (Play Count: 34)

WePS2 paper online

The University of Amsterdam at WePS2 by Krisztian Balog, Jiyin He, Katja Hofmann, Valentin Jijkoun, Christof Monz, Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke is online now. In this paper we describe our participation in the Second Web People Search workshop (WePS2) and detail our approaches. For the clustering task, our focus was on replicating the lessons learned at WePS1 on the data set made available as part of WePS2 and on experimenting with a voting-based combination of clustering methods. We found that clustering methods display the same overall behavior on the WePS1 and WePS2 data sets and that a hierarchical clustering approach delivers the best performance, even outperforming voting-based combinations.

For attribute extraction, we explore approaches using pattern matching with manually and automatically constructed patterns. Manual patterns were constructed using expert knowledge and following analysis of sample data. Automatic pattern construction extracts textual and syntactic context around training samples and selects patterns which are expected to perform well based on leave-one-out evaluation. Experimental results show that manually constructed patterns are very effective for obtaining high recall. For automatically extracted patterns performance varied widely depending on the attribute type. Larger amounts of training data may help improve these approaches in the future.

Listening to ''Autour de l'arbre'', by Keren Ann (Play Count: 17)

WebCLEF 2008 overview online

Overview of WebCLEF 2008 by Valentin Jijkoun and Maarten de Rijke is online now. The paper describes the WebCLEF 2008 task. Similarly to the 2007 edition of WebCLEF, the 2008 edition implements a multilingual "information synthesis" task, where, for a given topic, participating systems have to extract important snippets from web pages. We detail the task, the assessment procedure, the evaluation measures and results.

Listening to ''IV. Allegro'', by J.C. Schickhardt (Play Count: 0)

CLEF 2008 paper on domain-specific search online

Concept Models for Domain-specific Search by Edgar Meij and Maarten de Rijke is online now. In the paper we describe our participation in the 2008 CLEF Domain-specific track. We evaluate blind relevance feedback models and concept models on the CLEF domain-specific test collection. Applying relevance modeling techniques is found to have a positive effect on the 2008 topic set, in terms of mean average precision and precision@10. Applying concept models for blind relevance feedback results in even bigger improvements over a query-likelihood baseline, in terms of mean average precision and early precision.

Listening to ''Struggle For Pleasure'', by Wim Mertens (Play Count: 5)

Another ECIR 2009 paper online

Using Contextual Information to Improve Search in Email Archives, an ECIR 2009 paper by Wouter Weerkamp, Krisztian Balog, and Maarten de Rijke, is available online now. In the paper, we address the task of finding topically relevant email messages in public discussion lists. We make two important observations. First, email messages are not isolated, but are part of a larger online environment. This context, existing on different levels, can be incorporated into the retrieval model. We explore the use of thread, mailing list, and community content levels, by expanding our original query with terms from these sources. We find that query models based on contextual information improve retrieval effectiveness. Second, email is a relatively informal genre, and therefore offers scope for incorporating techniques previously shown useful in searching user-generated content. Indeed, our experiments show that using query-independent features (email length, thread size, and text quality), implemented as priors, results in further improvements.

Listening to ''Meu Mundo Hojo (Eu Sou Assim)'', by Teresa Cristina (Play Count: 2)

Some ECIR 2009 papers online

Two ECIR 2009 papers are online now. The first is Exploiting Surface Features for the Prediction of Podcast Preference by Manos Tsagkias, Martha Larson and Maarten de Rijke. Podcasts display an unevenness characteristic of domains dominated by user generated content, resulting in potentially radical variation of the user preference they enjoy. In the paper we report on work that uses easily extractable surface features of podcasts in order to achieve solid performance on two podcast preference prediction tasks: classification of preferred vs. non-preferred podcasts and ranking podcasts by level of preference. We identify features with good discriminative potential by carrying out manual data analysis, resulting in a refinement of the indicators of an existent podcast preference framework. Our preference prediction is useful for topic-independent ranking of podcasts, and can be used to support download suggestion or collection browsing.

The second paper is Investigating the Global Semantic Impact of Speech Recognition Error on Spoken Content Collections by Martha Larson, Manos Tsagkias, Jiyin He and Maarten de Rijke. Errors in speech recognition transcripts have a negative impact on the effectiveness of content-based speech retrieval and present a particular challenge for collections containing conversational spoken content. We propose a Global Semantic Distortion (GSD) metric that measures the collection-wide impact of speech recognition error on spoken content retrieval in a query-independent manner. We deploy our metric to examine the effects of speech recognition substitution errors. First, we investigate frequent substitutions, cases in which the recognizer habitually mis-transcribes one word as another. Although habitual mistakes have a large global impact, the long tail of rare substitutions has a more damaging effect. Second, we investigate semantically similar substitutions, cases in which the word spoken and the word recognized do not diverge radically in meaning. Similar substitutions are shown to have slightly less global impact than semantically dissimilar substitutions.

See the Publications page.

Listening to ''Polly'', by Keren Ann (Play Count: 15)

Catching up

A number of new papers have become available online since the last update:

  • The University of Amsterdam at TREC 2008: Blog, Enterprise, and Relevance Feedback, K. Balog, E. Meij, W. Weerkamp, J. He, and M. de Rijke. In: TREC 2008 Working Notes, November 2008.

  • The MediaMill TRECVID 2008 Semantic Video Search Engine, C.G.M. Snoek, K.E.A. van de Sande, O. de Rooij, B. Huurnink, J.C. van Gemert, J.R.R. Uijlings, J. He, X. Li, I. Everts, V. Nedovic, M. van Liempt, R. van Balen, F. Yan, M.A. Tahir, K. Mikolajczyk, J. Kittler, M. de Rijke, J.M. Geusebroek, Th. Gevers, M. Worring, A.W.M. Smeulders, and D.C. Koelma. In: TRECVID Working Notes, November 2008.

  • The University of Amsterdam at the TAC 2008 Question Answering Track, V. Jijkoun and M. de Rijke. In: TAC 2008 Working Notes, November 2008.

  • PodCred: A Framework for Analyzing Podcast Preference, M. Tsagkias, M. Larson, W. Weerkamp and M. de Rijke. In: Second Workshop on Information Credibility on the Web (WICOW 2008), October 2008.

  • Non-Local Evidence for Expert Finding, K. Balog and M. de Rijke. In: ACM 17th Conference on Information and Knowledge Management (CIKM 2008), October 2008.

  • Assessing Concept Selection for Video Retrieval, B. Huurnink, K. Hofmann, and M. de Rijke. In: ACM International Conference on Multimedia Information Retrieval (MIR 2008), October 2008.

  • On the Topical Structure of the Relevance Feedback Set, J. He, M. Larson and M. de Rijke. In: FGIR Workshop on Information Retrieval 2008, October 2008.

  • Overview of WebCLEF 2008 (draft), V. Jijkoun and M. de Rijke. In: CLEF 2008 Working Notes, September 2008.

  • A Language Modeling Framework for Expertise Search, K. Balog, L. Azzopardi, and M. de Rijke. Information Processing and Management, doi:10.1016/j.ipm.2008.06.003


CLEF 2008 Domain Specific Track working notes paper online

The University of Amsterdam at the CLEF 2008 Domain Specific Track by Edgar Meij and Maarten de Rijke is available online now. In the paper we describe our participation in the CLEF 2008 Domain Specific track. The research questions we address are threefold: (i) what are the effects of estimating and applying relevance models to the domain specific collection used at CLEF 2008, (ii) what are the results of parsimonizing these relevance models, and (iii) what are the results of applying concept models for blind relevance feedback? Parsimonization is a technique by which the term probabilities in a language model may be re-estimated based on a comparison with a reference model, making the resulting model more sparse and to the point. Concept models are term distributions over vocabulary terms, based on the language associated with concepts in a thesaurus or ontology and are estimated using the documents which are annotated with concepts. Concept models may be used for blind relevance feedback, by first translating a query to concepts and then back to query terms. We find that applying relevance models helps significantly for the current test collection, in terms of both mean average precision and early precision. Moreover, parsimonizing the relevance models helps mean average precision on title-only queries and early precision on title+narrative queries. Our concept models are able to significantly outperform a baseline query-likelihood run, both in terms of mean average precision and early precision on both title-only and title+narrative queries.
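
Parsimonization as described above can be sketched with the commonly used EM re-estimation for parsimonious language models. Whether the paper uses exactly this formulation is an assumption; the mixing weight `lam`, the pruning `threshold`, the function name and the toy counts are hypothetical. In the E-step each term's count is discounted by how well the reference (background) model already explains it; in the M-step the discounted counts are renormalized.

```python
def parsimonize(term_counts, background, lam=0.1, iterations=20, threshold=1e-4):
    """Re-estimate P(t|D) against a reference model with EM.

    term_counts: raw term frequencies in the document (or feedback set).
    background:  reference model P(t|C); unseen terms get a tiny floor.
    """
    total = sum(term_counts.values())
    model = {t: c / total for t, c in term_counts.items()}  # start from the MLE
    for _ in range(iterations):
        # E-step: expected count of each term attributed to the document model.
        expected = {}
        for t, p in model.items():
            p_bg = background.get(t, 1e-9)
            expected[t] = term_counts[t] * (lam * p) / ((1 - lam) * p_bg + lam * p)
        # M-step: renormalize the expected counts.
        norm = sum(expected.values())
        model = {t: e / norm for t, e in expected.items()}
        # Prune terms with negligible probability, then renormalize again.
        model = {t: p for t, p in model.items() if p >= threshold}
        norm = sum(model.values())
        model = {t: p / norm for t, p in model.items()}
    return model

# Hypothetical document counts and background probabilities.
term_counts = {"the": 10, "of": 8, "retrieval": 4, "parsimonious": 2}
background = {"the": 0.06, "of": 0.03, "retrieval": 0.0005, "parsimonious": 0.00001}

model = parsimonize(term_counts, background)
```

In this toy example probability mass moves away from "the", which the background model already explains well, and toward the topical term "retrieval".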

Listening to ''Come Back Margaret'', by Camera Obscura (Play Count: 4)

CIKM 2008 paper online

Non-Local Evidence for Expert Finding by Krisztian Balog and Maarten de Rijke is available online now. The task addressed in this paper, finding experts in an enterprise setting, has gained in importance and interest over the past few years. Commonly, this task is approached as an association finding exercise between people and topics. Existing techniques use either documents (as a whole) or proximity-based techniques to represent candidate experts. Proximity-based techniques have shown clear precision-enhancing benefits. We complement both document and proximity-based approaches to expert finding by importing global evidence of expertise, i.e., evidence obtained using information that is not available in the immediate proximity of a candidate expert's name occurrence or even on the same page on which the name occurs. Examples include candidate priors, query models, as well as other documents a candidate expert is associated with.

Using the CSIRO data set created for the TREC 2007 Enterprise track we identify examples of non-local evidence of expertise. We then propose modified expert retrieval models that are capable of incorporating both local (either document or snippet-based) evidence and non-local evidence of expertise. Results show that our refined models significantly outperform existing state-of-the-art approaches.

WICOW 2008 paper online

PodCred: A Framework for Analyzing Podcast Preference by Manos Tsagkias, Martha Larson, Wouter Weerkamp and Maarten de Rijke is available online now. The PodCred framework is a framework for assessing the credibility and quality of podcasts published on the internet. It consists of a series of indicators designed to support prediction of listener preference of one podcast over another, given that both carry comparable informational content. The indicators are grouped into four categories pertaining to the Podcast Content, the Podcaster, the Podcast Context or the Technical Execution of the podcast. We adopt the term "cred" as a designation encompassing both credibility (comprising trustworthiness and expertise) and qualitative acceptability to listeners. Our podcast analysis framework is inspired by work on credibility in blogs, another medium dominated by user generated content. The PodCred framework is derived from a review of the literature on credibility for other media, a survey of prescriptive standards for podcasting, and a detailed data analysis of award winning podcasts. The paper concludes with a discussion of future work in which the framework will be applied.

ACM MIR 2008 paper online

Assessing Concept Selection for Video Retrieval by Bouke Huurnink, Katja Hofmann and Maarten de Rijke is available online now. In the paper we explore the use of benchmarks to address the problem of assessing concept selection in video retrieval systems. Two benchmarks are presented, one created by human association of queries to concepts, the other generated from an extensively tagged collection. They are compared in terms of reliability, captured semantics, and retrieval performance. Recommendations are given for using the benchmarks to assess concept selection algorithms; the assessment is demonstrated on two existing algorithms. The benchmarks are released to the research community.

SIGIR 2008 workshop paper online (4)

Using Term Clouds to Represent Segment-Level Semantic Content of Podcasts by Marguerite Fuller, Manos Tsagkias, Eamonn Newman, Jana Besser, Martha Larson, Gareth Jones and Maarten de Rijke is available online now. Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without support of an interface providing semantically annotated jump points to signal the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. Quality of segment-level term clouds is measured quantitatively and their utility is investigated using a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to be able to generate segments as part of a completely automated indexing and structuring system for browsing of spoken audio. Results demonstrate that the segments generated are comparable with human-selected segment boundaries.

Listening to ''Gathering Dust'', by The Durutti Column (Play Count: 70)

SIGIR 2008 workshop paper online (3)

Integrating Contextual Factors into Topic-centric Retrieval Models for Finding Similar Experts by Katja Hofmann, Krisztian Balog, Toine Bogers, and Maarten de Rijke is available online now. Expert finding has been addressed from multiple viewpoints, including expertise seeking and expert retrieval. The focus of expertise seeking has mostly been on descriptive or predictive models, for example to identify what factors affect human decisions on locating and selecting experts. In expert retrieval the focus has been on algorithms similar to document search, which identify topical matches based on the content of documents associated with experts.

We report on a pilot study on an expert finding task in which we explore how contextual factors identified by expertise seeking models can be integrated with topic-centric retrieval algorithms and examine whether they can improve retrieval performance for this task. We focus on the task of similar expert finding: given a small number of example experts, find similar experts. Our main finding is that, while topical knowledge is the most important factor, human subjects also consider other factors, such as reliability, up-to-dateness, and organizational structure. We find that integrating these factors into topical retrieval models can significantly improve retrieval performance.

Listening to ''BWV 0826 Partita #2 in c-moll - 5. Rondeaux'', by Pieter-Jan Belder, harpsichord (Play Count: 6)

SIGIR 2008 Workshop paper online (2)

Blogger, Stick to your Story: Modeling Topical Noise in Blogs with Coherence Measures by Jiyin He, Wouter Weerkamp, Martha Larson, and Maarten de Rijke is available now. Topical noise in blogs arises when bloggers digress from the central topical thrust of their blogs. We introduce a method to explicitly incorporate a model of topical noise into a language modeling approach to the task of blog distillation. Topical noise is integrated into the model using a coherence score, which reflects the tightness of the topical structure of a blog. Tests performed on the TRECBlog06 corpus show that a naive integration of the coherence score as blog prior fails to achieve performance improvements. Instead, we develop a set of more sophisticated models in which the coherence score is weighted by a function of the blog retrieval score. The proposed models help improve effectiveness of our language modeling approach to the blog distillation task.

Listening to ''Run to Yuki'', by Yuji Nomi (Play Count: 3)

SIGIR 2008 Workshop paper online

Named Entity Normalization in User Generated Content by Valentin Jijkoun, Mahboob Khalid, Maarten Marx and Maarten de Rijke is available online now. Named entity recognition is important for semantically oriented retrieval tasks, such as question answering, entity retrieval, biomedical retrieval, trend detection, and event and entity tracking. In many of these tasks it is important to be able to accurately normalize the recognized entities, i.e., to map surface forms to unambiguous references to real world entities. Within the context of structured databases, this task (known as record linkage and data de-duplication) has been a topic of active research for more than five decades. For edited content, such as news articles, the named entity normalization (NEN) task is one that has recently attracted considerable attention. We consider the task in the challenging context of user generated content (UGC), where it forms a key ingredient of tracking and media-analysis systems.

A baseline NEN system from the literature (that normalizes surface forms to Wikipedia pages) performs considerably worse on UGC than on edited news: accuracy drops from 80% to 65% for a Dutch language data set and from 94% to 77% for English. We identify several sources of errors: entity recognition errors, multiple ways of referring to the same entity and ambiguous references.

To address these issues we propose five improvements to the baseline NEN algorithm, to arrive at a language independent NEN system that achieves overall accuracy scores of 90% on the English data set and 89% on the Dutch data set. We show that each of the improvements contributes to the overall score of our improved NEN algorithm, and conclude with an error analysis on both Dutch and English language UGC. The NEN system is computationally efficient and runs with very modest computational requirements.

ECAI 2008 paper online

Finding Key Bloggers, One Post At A Time by Wouter Weerkamp, Krisztian Balog and Maarten de Rijke is available online now. User generated content in general, and blogs in particular, form an interesting and relatively little explored domain for mining knowledge. We address the task of blog distillation: to find blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to discuss the topic in passing. Working in the setting of statistical language modeling, we model the task by aggregating a blogger's blog posts to collect evidence of relevance to the topic and persistence of interest in the topic. This approach achieves state-of-the-art performance. On top of this baseline, we extend our model by incorporating a number of blog-specific features, concerning document structure, social structure, and temporal structure. These blog-specific features yield further improvements.
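
The idea of aggregating a blogger's posts can be illustrated with a small sketch. This is not the model from the paper: the smoothing scheme, the uniform weighting of posts, and the names `post_likelihood`, `score_blog` and `rank_blogs` are all assumptions made for illustration. Each post is scored by a smoothed query likelihood, and a blog's score is the average over its posts, so blogs that return to a topic repeatedly are rewarded.

```python
import math
from collections import Counter


def post_likelihood(query, post, coll_counts, coll_len, lam=0.5):
    """Query likelihood of one post, smoothed with a collection model.

    Assumption: Jelinek-Mercer smoothing with weight `lam`; the collection
    model is add-one smoothed so unseen query terms never give log(0).
    """
    counts = Counter(post)
    vocab_size = len(coll_counts) + 1
    log_p = 0.0
    for term in query:
        p_post = counts[term] / len(post) if post else 0.0
        p_coll = (coll_counts[term] + 1) / (coll_len + vocab_size)
        log_p += math.log(lam * p_post + (1 - lam) * p_coll)
    return math.exp(log_p)


def score_blog(query, posts, coll_counts, coll_len):
    """Average the post likelihoods (assumes every post counts equally)."""
    if not posts:
        return 0.0
    total = sum(post_likelihood(query, p, coll_counts, coll_len) for p in posts)
    return total / len(posts)


def rank_blogs(query, blogs):
    """Return blog names ordered from highest to lowest score."""
    coll_counts = Counter(t for posts in blogs.values() for p in posts for t in p)
    coll_len = sum(coll_counts.values())
    scores = {
        name: score_blog(query, posts, coll_counts, coll_len)
        for name, posts in blogs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): one blog devoted to jazz, one that mentions it once.
blogs = {
    "jazz_blog": [["jazz", "piano", "trio"], ["jazz", "festival", "review"], ["jazz", "records"]],
    "mixed_blog": [["jazz", "concert"], ["holiday", "photos", "beach"], ["recipe", "pasta"]],
}
ranking = rank_blogs(["jazz"], blogs)
```

On this toy data the blog that mentions the topic in every post is ranked above the one that mentions it in passing, which is the distinction the task description draws.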

SIGIR 2008 poster online (6)

Measuring Concept Relatedness Using Language Models by Dolf Trieschnigg, Edgar Meij, Maarten de Rijke and Wessel Kraaij is available online now. Over the years, the notion of concept relatedness has attracted considerable attention. A variety of approaches, based on ontology structure, information content, association, or context have been proposed to indicate the relatedness of abstract ideas. We propose a method based on the cross entropy reduction between language models of concepts which are estimated based on document-concept assignments. The approach shows improved or competitive results compared to state-of-the-art methods on two test sets in the biomedical domain.
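
As a rough illustration of what a cross entropy reduction between concept language models might look like, here is a minimal sketch. The exact definition used in the poster is not given in this summary, so the formula below is an assumption: the reduction is taken as the cross entropy of one concept model against a background model minus its cross entropy against the other concept model. The smoothing, the function names and the toy documents are likewise hypothetical.

```python
import math
from collections import Counter


def language_model(docs, vocab, mu=1.0):
    """Additively smoothed unigram model over the documents assigned to a concept."""
    counts = Counter(t for d in docs for t in d)
    total = sum(counts.values())
    return {t: (counts[t] + mu) / (total + mu * len(vocab)) for t in vocab}


def cross_entropy(p, q):
    """H(p, q) = -sum_t p(t) log q(t); both models share the same vocabulary."""
    return -sum(p[t] * math.log(q[t]) for t in p)


def cross_entropy_reduction(p_concept, q_other, background):
    """Assumed definition: how much better q_other explains p_concept than the background does."""
    return cross_entropy(p_concept, background) - cross_entropy(p_concept, q_other)


# Hypothetical document-concept assignments.
assignments = {
    "asthma": [["airway", "inflammation", "inhaler"], ["airway", "wheezing"]],
    "bronchitis": [["airway", "inflammation", "cough"]],
    "fracture": [["bone", "cast", "xray"]],
}
all_docs = [d for docs in assignments.values() for d in docs]
vocab = sorted({t for d in all_docs for t in d})

models = {c: language_model(docs, vocab) for c, docs in assignments.items()}
background = language_model(all_docs, vocab)

cer_related = cross_entropy_reduction(models["asthma"], models["bronchitis"], background)
cer_unrelated = cross_entropy_reduction(models["asthma"], models["fracture"], background)
```

Under these assumptions, concepts whose assigned documents share vocabulary (asthma and bronchitis) obtain a larger reduction than concepts that do not (asthma and fracture).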

SIGIR 2008 paper online

A Few Examples Go A Long Way: Constructing Query Models from Elaborate Query Formulations by Krisztian Balog, Wouter Weerkamp and Maarten de Rijke is available online now. In the paper we address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task. Our approach is based on a language modeling framework, where the query model is modified to resemble the example pages. We compare several methods for sampling expansion terms from the example pages to support query-dependent and query-independent query expansion; the latter is motivated by the wish to increase "aspect recall," and attempts to uncover aspects of the information need not captured by the query.

For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.

SIGIR 2008 poster online (5)

Term Clouds as Surrogates for User Generated Speech by Manos Tsagkias, Martha Larson and Maarten de Rijke is available online. User generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology and content-based audio surrogates derived from ASR-transcripts must be error robust. An investigation of the use of term clouds as surrogates for podcasts demonstrates that ASR term clouds closely approximate term clouds derived from human-generated transcripts across a range of cloud sizes. A user study confirms the conclusion that ASR-clouds are viable surrogates for depicting the content of podcasts.

SIGIR 2008 poster online (4)

Parsimonious Concept Modeling by Edgar Meij, Dolf Trieschnigg, Maarten de Rijke, and Wessel Kraaij is available online now. We introduce a parsimonious conceptual query model whose retrieval performance matches that of relevance models, while it is also able to generate high quality navigation suggestions in the form of concepts.

SIGIR 2008 poster online (3)

Parsimonious Relevance Models by Edgar Meij, Wouter Weerkamp, Krisztian Balog and Maarten de Rijke is available online. We describe a method for applying parsimonious language models to re-estimate the term probabilities assigned by relevance models. We apply our method to six topic sets from test collections in five different genres. Our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most measures, and (iii) have a precision enhancing effect, unlike other blind relevance feedback methods.
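
To give a feel for what re-estimating term probabilities parsimoniously involves, the sketch below runs the EM updates usually associated with parsimonious language models on toy counts. It is an illustration under stated assumptions, not the procedure from the poster: the input here is a plain table of term counts rather than relevance model estimates, and the mixing weight `lam`, the pruning `threshold`, the iteration count and the function name are all hypothetical choices.

```python
def parsimonious_model(term_counts, background, lam=0.1, iterations=50, threshold=1e-4):
    """Re-estimate P(t|D) so that mass is moved away from terms the background explains.

    E-step: e_t = tf(t) * lam * P(t|D) / (lam * P(t|D) + (1 - lam) * P(t|C))
    M-step: P(t|D) = e_t / sum_t' e_t'
    Terms whose probability falls below `threshold` are dropped (an assumption).
    """
    total = sum(term_counts.values())
    model = {t: c / total for t, c in term_counts.items()}  # maximum likelihood start
    for _ in range(iterations):
        expected = {}
        for term, p in model.items():
            mix = lam * p + (1 - lam) * background.get(term, 0.0)
            expected[term] = term_counts[term] * lam * p / mix
        norm = sum(expected.values())
        model = {t: e / norm for t, e in expected.items() if e / norm >= threshold}
    # Renormalize so the surviving terms form a proper distribution.
    mass = sum(model.values())
    return {t: p / mass for t, p in model.items()}


# Toy counts (hypothetical): two function words and two topical terms.
term_counts = {"the": 50, "of": 30, "retrieval": 10, "blog": 10}
# Background probabilities for these terms only; the rest of the mass is
# assumed to lie on terms that do not occur in the toy counts.
background = {"the": 0.5, "of": 0.3, "retrieval": 0.001, "blog": 0.001}

model = parsimonious_model(term_counts, background)
```

In this toy setting the topical terms end up with more probability than their maximum likelihood estimate of 0.1, while the function words lose mass, which is the kind of precision-oriented sharpening the summary above alludes to.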

SIGIR 2008 poster online (2)

Personal vs Non-Personal Blogs: Initial Classification Experiments by Erik Elgersma and Maarten de Rijke is available online now. In the poster we address the task of separating personal from non-personal blogs, and report on a set of baseline experiments where we compare the performance on a small set of features across a set of five classifiers. We show that with a limited set of features a performance of up to 90% can be obtained.

SIGIR 2008 poster online

Bloggers as Experts, by Krisztian Balog, Maarten de Rijke and Wouter Weerkamp is available online now. We address the task of (blog) feed distillation: to find blogs that are principally devoted to a given topic. The task may be viewed as an association finding task, between topics and bloggers; it resembles the expert finding task, for which a range of models have been proposed. We adopt two language modeling-based approaches to expert finding, and determine their effectiveness as feed distillation strategies. The two models capture the idea that a human will often search for key blogs by spotting highly relevant posts (the Posting model) or by taking global aspects of the blog into account (the Blogger model). The Blogger model outperforms the Posting model and delivers state-of-the-art performance, out-of-the-box.

ACL 2008 paper online

Credibility Improves Topical Blog Post Retrieval by Wouter Weerkamp and Maarten de Rijke is available online now. Topical blog post retrieval is the task of ranking blog posts with respect to their relevance for a given topic. To improve topical blog post retrieval we incorporate textual credibility indicators in the retrieval process. We consider two groups of indicators: post level (determined using information about individual blog posts only) and blog level (determined using information from the underlying blogs). We describe how to estimate these indicators and how to integrate them into a retrieval approach based on language models. Experiments on the TREC Blog track test set show that both groups of credibility indicators significantly improve retrieval effectiveness; the best performance is achieved when combining them.

DIR 2008 paper online

Looking at Things Differently: Exploring Perspective Recall for Informal Text Retrieval by Wouter Weerkamp and Maarten de Rijke is available now. The paper will be presented at DIR 2008 this April; it reports on ongoing work where we examine the use of query expansion against multiple external corpora so as to uncover multiple perspectives on a given topic. Our working assumption is that uncovering multiple perspectives is especially helpful when searching informal text (blogs, discussion forums, comments, etc.).

CLEF 2007 and NLPIX 2008 papers online

The proceedings versions of two CLEF 2007 papers are online now: Overview of WebCLEF 2007, by Valentin Jijkoun and Maarten de Rijke, and Using Centrality to Rank Web Snippets by the same authors. Also available now is Personal Name Resolution of Web People Search by Leif Azzopardi, Krisztian Balog and Maarten de Rijke; this paper will appear in the WWW 2008 workshop on NLP Challenges in the Information Explosion Era (NLPIX 2008).

TREC 2007 proceedings papers online

Two contributions to the TREC 2007 proceedings, Query and Document Models for Enterprise Search by Krisztian Balog, Katja Hofmann, Wouter Weerkamp and Maarten de Rijke, and Language Modeling Approaches to Blog Post and Feed Finding by Breyten Ernsting, Wouter Weerkamp and Maarten de Rijke, are online now.

ECIR 2008 paper online (3)

Associating People and Documents, by Krisztian Balog and Maarten de Rijke is available now. Since the introduction of the Enterprise Track at TREC in 2005, the task of finding experts has generated a lot of interest within the research community. Numerous models have been proposed that rank candidates by their level of expertise with respect to some topic. Common to all approaches is a component that estimates the strength of the association between a document and a person. Forming such associations, then, is a key ingredient in expertise search models. In this paper we introduce and compare a number of methods for building document-people associations. Moreover, we make underlying assumptions explicit, and examine two in detail: (i) independence of candidates, and (ii) frequency is an indication of strength. We show that our refined ways of estimating the strength of associations between people and documents lead to significant improvements over the state-of-the-art in the end-to-end expert finding task.

ECIR 2008 paper online (2)

Using Coherence-based Measures to Predict Query Difficulty by Jiyin He, Martha Larson and Maarten de Rijke is online now. In the paper we investigate the potential of coherence-based scores to predict query difficulty. The coherence of a document set associated with each query word is used to capture the quality of a query topic aspect. A simple query coherence score, QC-1, is proposed that requires the average coherence contribution of individual query terms to be high. Two further query scores, QC-2 and QC-3, are developed by constraining QC-1 in order to capture the semantic similarity among query topic aspects. All three query coherence scores show the correlation with average precision necessary to make them good predictors of query difficulty. Simple and efficient, the measures require no training data and are competitive with language model-based clarity scores.

ECIR 2008 paper online

The Impact of Named Entity Normalization on Information Retrieval for Question Answering by Mahboob Alam Khalid, Valentin Jijkoun and Maarten de Rijke is available now. In the named entity normalization task, a system identifies a canonical unambiguous referent for names like Bush or Alabama. Resolving synonymy and ambiguity of such names can benefit end-to-end information access tasks. We evaluate two entity normalization methods based on Wikipedia in the context of both passage and document retrieval for question answering. We find that even a simple normalization method leads to improvements of early precision, both for document and passage retrieval. Moreover, better normalization results in better retrieval performance.

WIDM 2007 paper published

Extracting the Discussion Structure in Comments on News-Articles by Anne Schuth, Maarten Marx and Maarten de Rijke has now been published. Several on-line daily newspapers offer readers the opportunity to directly comment on articles. In the Netherlands this feature is used quite often and the quality (grammatically and content-wise) is surprisingly high. We develop techniques to collect, store, enrich and analyze these comments. After giving a high-level overview of the Dutch "commentosphere" we zoom in on extracting the discussion structure found in flat comment threads; people not only comment on the news article, they also heavily comment on other comments, resembling discussion fora. We show how techniques from information retrieval, natural language processing and machine learning can be used to extract the "reacts-on" relation between comments with high precision and recall.

TREC working notes papers online

Some of the papers describing our participation in TREC this year are available now. For our work on the blog track, see this one, and go here for our work on the enterprise track.
