IEEE Transactions on Multimedia paper online
May 20, 2012 00:02
“Content-Based
Analysis Improves Audiovisual Archive
Retrieval” by Bouke Huurnink, Cees Snoek,
Maarten de Rijke and Arnold Smeulders (IEEE
Transactions on Multimedia, published online: 5
April 2012) is available now.
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. When combined, we find that content-based video retrieval incorporated into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
IRJ paper online
April 09, 2012 08:33
“Balancing
exploration and exploitation in listwise and pairwise
online learning to rank for information
retrieval” by Katja Hofmann, Shimon Whiteson
and Maarten de Rijke (Information Retrieval
Journal, published online: April 7, 2012)
is available now.
As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank, retrieval systems can learn directly from implicit feedback inferred from user interactions. In such an online setting, algorithms must obtain feedback for effective learning while simultaneously utilizing what has already been learned to produce high quality results. We formulate this challenge as an exploration–exploitation dilemma and propose two methods for addressing it. By adding mechanisms for balancing exploration and exploitation during learning, each method extends a state-of-the-art learning to rank method, one based on listwise learning and the other on pairwise learning. Using a recently developed simulation framework that allows assessment of online performance, we empirically evaluate both methods. Our results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches. In addition, the results demonstrate that such a balance affects the two approaches in different ways, especially when user feedback is noisy, yielding new insights relevant to making online learning to rank effective in practice.
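The exploration–exploitation trade-off described in this abstract can be pictured with a small sketch. It is not the method of the paper: the function name `mix_lists`, the mixing parameter `k`, and the slot-by-slot mixing scheme are assumptions made only for illustration. Each slot of the result list is filled from an exploratory ranking with probability `k` and from the current best (exploitative) ranking otherwise.

```python
import random


def mix_lists(exploit, explore, k, length, rng):
    """Fill each slot from `explore` with probability k, else from `exploit`.

    Hypothetical helper: documents already shown are skipped, and if the
    chosen list is exhausted the other list is used instead.
    """
    result = []
    seen = set()
    sources = {"exploit": list(exploit), "explore": list(explore)}
    while len(result) < length:
        first = "explore" if rng.random() < k else "exploit"
        second = "exploit" if first == "explore" else "explore"
        for name in (first, second):
            doc = next((d for d in sources[name] if d not in seen), None)
            if doc is not None:
                result.append(doc)
                seen.add(doc)
                break
        else:
            break  # both lists are exhausted
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    current_best = ["a", "b", "c", "d"]
    candidate = ["d", "c", "b", "a"]
    print(mix_lists(current_best, candidate, k=0.25, length=4, rng=rng))
```

With `k=0` the list is purely exploitative and with `k=1` purely exploratory; values in between trade result quality now against feedback that may improve the ranker later.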
More ECIR 2012 papers online
March 26, 2012 10:35
Two
more ECIR 2012 papers are online now.
“Adaptive Temporal Query Modeling” by Maria-Hendrike Peetz, Edgar Meij, Maarten de Rijke and Wouter Weerkamp is available here. We present an approach to query modeling that uses the temporal distribution of documents in an initially retrieved set of documents. Such distributions tend to exhibit bursts, especially in news-related document collections. We hypothesize that documents in those bursts are more likely to be relevant and update the query model with the most distinguishing terms in high-quality documents sampled from bursts. We evaluate the effectiveness of our models on a test collection of blog posts.
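As a rough illustration of what a burst in the temporal distribution of retrieved documents might look like in code, the sketch below flags days whose document count lies well above the average. The thresholding rule (mean plus a multiple of the standard deviation), the function name `burst_days`, and the parameter `threshold` are assumptions of this sketch and are not taken from the paper.

```python
from collections import Counter
from statistics import mean, pstdev


def burst_days(doc_dates, threshold=1.0):
    """Return the days whose document count exceeds mean + threshold * std.

    `doc_dates` holds one date string per initially retrieved document.
    Only days that contain at least one document are considered.
    """
    counts = Counter(doc_dates)
    if not counts:
        return []
    values = list(counts.values())
    cutoff = mean(values) + threshold * pstdev(values)
    return sorted(day for day, n in counts.items() if n > cutoff)


if __name__ == "__main__":
    dates = (["2012-01-01"] + ["2012-01-02"]
             + ["2012-01-03"] * 8 + ["2012-01-04"] * 2)
    print(burst_days(dates))
```

Documents dated within the returned days would then be the pool from which terms for updating the query model are drawn.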
“Result Disambiguation in Web People Search” by Richard Berendsen, Bogomil Kovachev, Evi Nastou, Maarten de Rijke and Wouter Weerkamp is available here. In the paper we study the problem of disambiguating the results of a web people search engine: given a query consisting of a person name plus the result pages for this query, find correct referents for all mentions by clustering the pages according to the different people sharing the name. While the problem has been studied extensively, we discover that the increasing availability of results retrieved from social media platforms causes state-of-the-art methods to break down. We analyze the problem and propose a dual strategy where we distinguish between results obtained from social media platforms and those obtained from other sources. In our dual strategy, the two types of documents are disambiguated separately, using different strategies, and their results are then merged. We study several instantiations for the different stages in our proposed strategy and manage to achieve state-of-the-art performance.
CLEF 2011 conference report online
March 26, 2012 10:30
“CLEF
2011: Conference on Multilingual and Multimodal
Information Access Evaluation” by Paul Clough,
Nicola Ferro, Pamela Forner, Julio Gonzalo, Bouke
Huurnink, Jaana Kekäläinen, Mounia Lalmas, Vivien
Petras and Maarten de Rijke is online now.
In the paper we report on CLEF 2011.
TREC 2011 papers online
March 26, 2012 10:25
Two
TREC 2011 reports are online now.
“The University of Amsterdam at the TREC 2011 Session Track” by Bouke Huurnink, Richard Berendsen, Katja Hofmann, Edgar Meij and Maarten de Rijke is online now. In the paper we describe the participation of the University of Amsterdam’s ILPS group in the Session track at TREC 2011.
“Team COMMIT at TREC 2011” by Marc Bron, Edgar Meij, Maria-Hendrike Peetz, Manos Tsagkias and Maarten de Rijke is also online. In this paper we describe the participation of Team COMMIT in the TREC 2011 Microblog and Entity tracks.
ECIR 2012 paper online
December 28, 2011 18:46
“Predicting
IMDB Movie Ratings Using Social Media” by
Andrei Oghina, Mathias Breuss, Manos Tsagkias and
Maarten de Rijke is available online now at
this location.
In the paper, we consider the problem of predicting IMDb movie ratings. We examine two sets of features: surface and textual features. For the latter, we assume that no social media signal is isolated and use data from multiple channels that are linked to a particular movie, such as tweets from Twitter and comments from YouTube. We extract textual features from each channel to use in our prediction model and we explore whether data from either of these channels can help to extract a better set of textual features for prediction. Our best performing model is able to rate movies very close to the observed values.
ACM TOIS paper online
December 13, 2011 00:32
“Query
Modeling for Entity Search Based on Terms,
Categories, and Examples” by Krisztian Balog,
Marc Bron and Maarten de Rijke is available
online now.
Users often search for entities instead of documents, and in this setting, are willing to provide extra input, in addition to a series of query terms, such as category information and example entities. We propose a general probabilistic framework for entity search to evaluate and provide insights in the many ways of using these types of input for query modeling. We focus on the use of category information and show the advantage of a category-based representation over a term-based representation, and also demonstrate the effectiveness of category-based expansion using example entities. Our best performing model shows very competitive performance on the INEX-XER entity ranking and list completion tasks.
NIPS Workshop paper online
December 12, 2011 13:56
“Contextual
Bandits for Information Retrieval,” by Katja
Hofmann, Shimon Whiteson, Maarten de Rijke, is our
contribution to the NIPS workshop on Bayesian
Optimization, Experimental Design and Bandits: Theory
and Applications. You can find it here.
In this paper we give an overview of and outlook on research at the intersection of information retrieval and contextual bandit problems. A critical problem in information retrieval is online learning to rank, where a search engine strives to improve the quality of the ranked result lists it presents to users on the basis of those users’ interactions with those result lists. Recently, researchers have started to model interactions between users and search engines as contextual bandit problems, and initial methods for learning in this setting have been devised. Our research focuses on two aspects: balancing exploration and exploitation and inferring preferences from implicit user interactions. This paper summarizes our recent work on online learning to rank for information retrieval and points out challenges that are characteristic of this application area.
WSDM 2012 paper online
December 03, 2011 16:41
Our
WSDM 2012 paper “Adding semantics to microblog
posts” (Meij, Weerkamp, de Rijke) is online
now.
Microblogs have become an important source of information for the purpose of marketing, intelligence and reputation management. Streams of microblogs are of great value because of their direct and real-time nature. Determining what an individual microblog post is about, however, can be non-trivial because of creative language usage, the highly contextualized and informal nature of microblog posts, and the limited length of this form of communication.
We propose a solution to the problem of determining what a microblog post is about through semantic linking: we add semantics to posts by automatically identifying concepts that are semantically related to them and generating links to the corresponding Wikipedia articles. The identified concepts can subsequently be used for, e.g., social media mining, thereby reducing the need for manual inspection and selection. Using a purpose-built test collection of tweets, we show that recently proposed approaches for semantic linking do not perform well, mainly due to the idiosyncratic nature of microblog posts. We propose a novel method based on machine learning with a set of innovative features and show that it is able to achieve significant improvements over all other methods, especially in terms of precision.
The paper is available here.
Five ECIR 2012 papers
November 26, 2011 09:59
Five
ILPS papers
were accepted for ECIR 2012:
- R. Berendsen, B. Kovachev, E. Nastou, M. de Rijke, W. Weerkamp, “Result Disambiguation in Web People Search”
- M. Bosma, E. Meij, W. Weerkamp, “A Framework for Unsupervised Spam Detection in Social Networking Sites”
- P. Lubell-Doughtie, K. Hofmann, “Learning to Rank from Relevance Feedback for e-Discovery”
- A. Oghina, M. Breuss, M. Tsagkias, M. de Rijke, “Predicting IMDB Movie Ratings Using Social Media”
- M.-H. Peetz, E. Meij, M. de Rijke, W. Weerkamp, “Adaptive Temporal Query Modeling”
One more CIKM 2011 paper online
August 26, 2011 06:55
A second CIKM 2011 paper, Automatic Link
Generation with Wikipedia: A Case Study in Annotating
Radiology Reports by Jiyin He, Maarten de Rijke,
Merlijn Sevenster, Rob van Ommering and Yuechen Qian
is now also available online.
Automatically annotating texts with background information has recently received much attention. We conduct a case study in automatically generating links from narrative radiology reports to Wikipedia. Such links help users understand the medical terminology and thereby increase the value of the reports.
Direct applications of existing automatic link generation systems trained on Wikipedia to our radiology data do not yield satisfactory results. Our analysis reveals that medical phrases are often syntactically regular but semantically complicated, e.g., containing multiple concepts or concepts with multiple modifiers. The latter property is the main reason for the failure of existing systems. Based on this observation, we propose an automatic link generation approach that takes into account these properties. We use a sequential labeling approach with syntactic features for anchor text identification in order to exploit syntactic regularities in medical terminology. We combine this with a sub-anchor based approach to target finding, which is aimed at coping with the complex semantic structure of medical phrases. Empirical results show that the proposed system effectively improves the performance over existing systems.
CIKM 2011 paper online
August 01, 2011 09:44
One of our papers for this year’s CIKM is now
online: A Probabilistic Method for
Inferring Preferences from Clicks by
Katja Hofmann, Shimon Whiteson and Maarten de
Rijke.
Evaluating rankers using implicit feedback, such as clicks on documents in a result list, is an increasingly popular alternative to traditional evaluation methods based on explicit relevance judgments. Previous work has shown that so-called interleaved comparison methods can utilize click data to detect small differences between rankers and can be applied to learn ranking functions online.
In this paper, we analyze three existing interleaved comparison methods and find that they are all either biased or insensitive to some differences between rankers. To address these problems, we present a new method based on a probabilistic interleaving process. We derive an unbiased estimator of comparison outcomes and show how marginalizing over possible comparison outcomes given the observed click data can make this estimator even more effective.
We validate our approach using a recently developed simulation framework based on a learning to rank dataset and a model of click behavior. Our experiments confirm the results of our analysis and show that our method is both more accurate and more robust to noise than existing methods.
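To give a feel for what a probabilistic interleaving process could look like, here is a minimal sketch. It assumes that each ranker is turned into a probability distribution over its documents that favours top ranks, and that every slot of the interleaved list is filled by picking one ranker at random and sampling a document from its distribution. The rank-based weights `1 / rank**tau`, the default `tau=3.0`, and all function names are assumptions of this sketch; the abstract does not specify them. The unbiased estimator and the marginalization step are not shown.

```python
import random


def rank_distribution(ranking, tau=3.0):
    """Map each document to a probability proportional to 1 / rank**tau."""
    weights = [1.0 / (rank ** tau) for rank in range(1, len(ranking) + 1)]
    total = sum(weights)
    return {doc: w / total for doc, w in zip(ranking, weights)}


def probabilistic_interleave(ranking_a, ranking_b, length, rng):
    """Build an interleaved list and record which ranker filled each slot.

    Rankings are assumed to contain no repeated documents.
    """
    remaining_a, remaining_b = list(ranking_a), list(ranking_b)
    interleaved, assignments = [], []
    while len(interleaved) < length and (remaining_a or remaining_b):
        use_a = rng.random() < 0.5
        source = remaining_a if use_a else remaining_b
        if not source:  # chosen ranker has nothing left, use the other one
            use_a = not use_a
            source = remaining_a if use_a else remaining_b
        dist = rank_distribution(source)
        docs = list(dist)
        doc = rng.choices(docs, weights=[dist[d] for d in docs], k=1)[0]
        interleaved.append(doc)
        assignments.append("a" if use_a else "b")
        # Sample without replacement: drop the document from both rankers.
        if doc in remaining_a:
            remaining_a.remove(doc)
        if doc in remaining_b:
            remaining_b.remove(doc)
    return interleaved, assignments


if __name__ == "__main__":
    rng = random.Random(42)
    print(probabilistic_interleave(["d1", "d2", "d3"], ["d3", "d2", "d1"], 3, rng))
```

Clicks on the interleaved list can then be credited to the ranker recorded in `assignments`, which is the raw material a comparison method works with.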
Information Retrieval Journal paper on blog feed search online
April 18, 2011 09:14
Blog feed search with a post index by Wouter
Weerkamp, Krisztian Balog and Maarten de Rijke has
been made available online by the
Information Retrieval Journal. User
generated content forms an important domain for
mining knowledge. In this paper, we address the
task of blog feed search: to find blogs that are
principally devoted to a given topic, as opposed
to blogs that merely happen to mention the topic
in passing. The large number of blogs makes the
blogosphere a challenging domain, both in terms
of effectiveness and of storage and retrieval
efficiency. We examine the effectiveness of an
approach to blog feed search that is based on
individual posts as indexing units (instead of
full blogs). Working in the setting of a
probabilistic language modeling approach to
information retrieval, we model the blog feed
search task by aggregating over a
blogger’s posts to collect evidence of
relevance to the topic and persistence of
interest in the topic. This approach achieves
state-of-the-art performance in terms of
effectiveness. We then introduce a two-stage
model where a pre-selection of candidate blogs
is followed by a ranking step. The model
integrates aggressive pruning techniques as well
as very lean representations of the contents of
blog posts, resulting in substantial gains in
efficiency while maintaining effectiveness at a
very competitive level.
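The idea of scoring a blog by aggregating evidence over its individual posts can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of the paper: each post is scored with a smoothed unigram query likelihood (the linear smoothing with weight `lam` is an assumption), posts within a blog are weighted uniformly, and the names `query_likelihood` and `rank_blogs` are hypothetical.

```python
from collections import Counter


def query_likelihood(query_terms, post_terms, coll_counts, coll_size, lam=0.5):
    """Probability of the query under a post model smoothed with the collection."""
    counts = Counter(post_terms)
    length = len(post_terms)
    score = 1.0
    for term in query_terms:
        p_post = counts[term] / length if length else 0.0
        p_coll = coll_counts.get(term, 0) / coll_size if coll_size else 0.0
        score *= (1 - lam) * p_post + lam * p_coll
    return score


def rank_blogs(query_terms, blogs, lam=0.5):
    """Rank blogs by the average query likelihood of their posts.

    `blogs` maps a blog id to a list of posts; each post is a list of terms.
    """
    coll_counts = Counter(t for posts in blogs.values() for post in posts for t in post)
    coll_size = sum(coll_counts.values())
    scores = {}
    for blog_id, posts in blogs.items():
        if not posts:
            scores[blog_id] = 0.0
            continue
        total = sum(
            query_likelihood(query_terms, post, coll_counts, coll_size, lam)
            for post in posts
        )
        scores[blog_id] = total / len(posts)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    blogs = {
        "jazz_blog": [["jazz", "sax"], ["jazz", "club"]],
        "misc_blog": [["jazz", "news"], ["cats", "dogs"], ["food", "wine"]],
    }
    print(rank_blogs(["jazz"], blogs))
```

Averaging over posts rewards blogs that return to the topic repeatedly, while a blog that mentions it in a single post among many scores lower.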
ECIR 2011 papers online
January 19, 2011 09:00
Two of our papers for this year’s ECIR are
online now.
One paper, Balancing Exploration and Exploitation in Learning to Rank Online, is by Katja Hofmann, Shimon Whiteson and Maarten de Rijke. As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank approaches, retrieval systems can learn directly from implicit feedback, while they are running. In such an online setting, algorithms need to both explore new solutions to obtain feedback for effective learning, and exploit what has already been learned to produce results that are acceptable to users. We formulate this challenge as an exploration-exploitation dilemma and present the first online learning to rank algorithm that works with implicit feedback and balances exploration and exploitation. We leverage existing learning to rank data sets and recently developed click models to evaluate the proposed algorithm. Our results show that finding a balance between exploration and exploitation can substantially improve online retrieval performance, bringing us one step closer to making online learning to rank work in practice.
The other paper, Incorporating Query Expansion and Quality Indicators in Searching Microblog Posts, is by Kamran Massoudi, Manos Tsagkias, Maarten de Rijke and Wouter Weerkamp. In the paper we propose a retrieval model for searching microblog posts for a given topic of interest. We develop a language modeling approach tailored to microblogging characteristics, where redundancy-based IR methods cannot be used in a straightforward manner. We enhance this model with two groups of quality indicators: textual and microblog-specific. Additionally, we propose a dynamic query expansion model for microblog post retrieval. Experimental results on Twitter data reveal the usefulness of Boolean search, and demonstrate the utility of quality indicators and query expansion in microblog search.
WSDM 2011 paper online
November 25, 2010 10:07
Our WSDM 2011 paper, Linking Online News and
Social Media by Manos Tsagkias, Maarten de Rijke
and Wouter Weerkamp, is online at this location.
Much of what is discussed in social media is inspired by events in the news and, vice versa, social media provide us with a handle on the impact of news events. We address the following linking social media utterances task: given a news article, find social media utterances that implicitly reference it.
We follow a three-step approach: we derive multiple query models from a given source news article, which are then used to retrieve utterances from a target social media index, resulting in multiple ranked lists that we then merge into a single result list using data fusion techniques.
Query models are created by exploiting the structure of the source news article and by using explicitly linked social media utterances that are known to discuss the source article.
To combat query drift resulting from the large volume of text, either in the source news article itself or in social media utterances explicitly linked to it, we introduce a graph-based method for selecting discriminative terms.
For our experimental evaluation, we use data from Twitter, Digg, Delicious, the New York Times Community, Wikipedia, and the blogosphere to generate query models. We show that different query models, based on different data sources, provide complementary information and manage to retrieve different social media utterances from our target index. As a consequence, (article dependent) data fusion methods manage to significantly boost retrieval performance over individual approaches. Our graph-based term selection method is shown to help improve both effectiveness and efficiency.
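The final step of the approach, merging several ranked lists into one, can be illustrated with a small data fusion sketch. The abstract does not name the fusion techniques used, so the choice of CombSUM with min-max score normalization, as well as the function names and utterance ids, are assumptions made here for illustration.

```python
def min_max_normalize(scores):
    """Rescale the scores of one ranked list to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(ranked_lists):
    """Fuse ranked lists by summing each document's normalized scores.

    `ranked_lists` is a list of dicts mapping utterance id to retrieval score,
    one dict per query model.
    """
    fused = {}
    for scores in ranked_lists:
        if not scores:
            continue
        for doc, s in min_max_normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + s
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    lists = [
        {"u1": 3.0, "u2": 1.0},             # e.g. query model from the article title
        {"u2": 0.9, "u3": 0.1},             # e.g. query model from linked tweets
        {"u2": 5.0, "u1": 4.0, "u3": 1.0},  # e.g. query model from the full text
    ]
    print(comb_sum(lists))
```

An utterance that several query models retrieve with high scores rises to the top of the fused list, which is how complementary query models can improve on any single one.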
CIKM 2010 paper online
August 30, 2010 23:25
Our CIKM 2010 paper Ranking Related Entities:
Components and Analyses by Marc Bron, Krisztian
Balog and Maarten de Rijke, is available online.
Related entity finding is the task of returning a ranked list of homepages of relevant entities of a specified type that need to engage in a given relationship with a given source entity. We propose a framework for addressing this task and perform a detailed analysis of four core components: co-occurrence models, type filtering, context modeling and homepage finding. Our initial focus is on recall. We analyze the performance of a model that only uses co-occurrence statistics. While this method identifies the potential set of related entities, it fails to rank them effectively. Two types of error emerge: (1) entities of the wrong type pollute the ranking and (2) while somehow associated with the source entity, some retrieved entities do not engage in the right relation with it. To address (1), we add type filtering based on category information available in Wikipedia. To correct for (2), we complement our related entity finding method with contextual information, represented as language models derived from documents in which source and target entities co-occur. To complete the pipeline, we find homepages of top-ranked entities by combining a language modeling approach with heuristics based on Wikipedia's external links. Our method achieves very high recall scores on the end-to-end task, providing a solid starting point for expanding our focus to improve precision. Our framework can effectively incorporate additional heuristics, and these extensions lead to state-of-the-art performance.
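The first two components can be sketched in a few lines. This is a minimal illustration that assumes raw document co-occurrence counts as the co-occurrence statistic and a simple category lookup as the type filter; the paper's actual estimators are not given in the abstract, and all entity names, categories and function names here are made up.

```python
from collections import Counter


def rank_by_cooccurrence(source, documents):
    """Rank candidate entities by the number of documents they share with the source."""
    counts = Counter()
    for entities in documents:
        if source in entities:
            counts.update(e for e in entities if e != source)
    # Most frequent co-occurring entities first; ties are broken by name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def filter_by_type(ranking, entity_categories, target_category):
    """Keep only the candidates that carry the requested category label."""
    return [
        (entity, count)
        for entity, count in ranking
        if target_category in entity_categories.get(entity, set())
    ]


# Hypothetical documents, each reduced to the set of entities it mentions.
documents = [
    {"airline_x", "maker_a", "city_a"},
    {"airline_x", "maker_a", "maker_b"},
    {"airline_x", "city_a"},
    {"maker_b", "city_b"},
]
# Hypothetical category assignments, standing in for Wikipedia categories.
entity_categories = {
    "maker_a": {"manufacturers"},
    "maker_b": {"manufacturers"},
    "city_a": {"cities"},
    "city_b": {"cities"},
}

candidates = rank_by_cooccurrence("airline_x", documents)
related = filter_by_type(candidates, entity_categories, "manufacturers")
```

In this toy data, co-occurrence alone ranks `city_a` as high as `maker_a`, which is the first error type described above; the category filter removes it.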
Another CLEF 2010 Conference paper online
June 19, 2010 14:07
Another CLEF 2010 Conference paper is also online now: On the Evaluation
of Entity Profiles by Maarten de Rijke,
Krisztian Balog, Toine Bogers and Antal van den
Bosch.
Entity profiling is the task of identifying and ranking descriptions of a given entity. The task may be viewed as one where the descriptions being sought are terms that need to be selected from a knowledge source (such as an ontology or thesaurus). In this case, entity profiling systems can be assessed by means of precision and recall values of the descriptive terms produced. However, recent evidence suggests that more sophisticated metrics are needed that go beyond mere lexical matching of system-produced descriptors against a ground truth, allowing for graded relevance and rewarding diversity in the list of descriptors returned. In this note, we motivate and propose such a metric.
CLEF 2010 Conference paper online
June 19, 2010 14:04
One of our CLEF 2010 conference papers, Validating
Query Simulators: An Experiment Using Commercial
Searches and Purchases by Bouke Huurnink, Katja
Hofmann, Maarten de Rijke and Marc Bron, is available
online now.
In the paper we design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained using logged commercial searches and associated purchase decisions. While no simulator reproduces an ideal ranking, there is a large variation in simulator performance that allows us to distinguish those that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
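The validation idea, checking whether simulated data ranks retrieval systems the same way as real data, can be sketched with a rank correlation. The sketch below assumes Kendall's tau (the tau-a variant) as the agreement measure, which the abstract does not name; the system names and effectiveness scores are invented for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} mappings over the same systems."""
    systems = sorted(scores_a)
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare rankings")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # A pair is concordant when both evaluations order it the same way.
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical effectiveness scores for three retrieval systems.
real_world = {"A": 0.30, "B": 0.25, "C": 0.10}   # logged searches and purchases
simulated = {"A": 0.40, "B": 0.20, "C": 0.25}    # output of one simulator

tau = kendall_tau(real_world, simulated)
```

A value of 1 means the simulator reproduces the real-world system ranking exactly, while values near 0 mean the two rankings are essentially unrelated.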
ACL 2010 paper online
May 13, 2010 00:29
Our ACL 2010 paper Generating Focused
Topic-specific Sentiment Lexicons by Valentin
Jijkoun, Maarten de Rijke and Wouter Weerkamp is
available online now.
In the paper we present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general-purpose polarity lexicon that allow users to pinpoint subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general-purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general-purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.
Semantic Search workshop paper online
April 24, 2010 06:28
Our Semantic Search Workshop at WWW 2010 paper
Entity Search: Building Bridges Between Two
Worlds, by Krisztian Balog, Edgar Meij and
Maarten de Rijke, is available online now.
We consider the task of entity search and examine to what extent state-of-the-art information retrieval (IR) and semantic web (SW) technologies are capable of answering information needs that focus on entities. We also explore the potential of combining IR with SW technologies to improve the end-to-end performance on a specific entity search task. We arrive at and motivate a proposal to combine text-based entity models with semantic information from the Linked Open Data cloud.
Another INEX 2009 paper online
April 20, 2010 22:45
A second INEX 2009 paper, Combining term-based and
category-based representations for entity search
by Krisztian Balog, Marc Bron, Maarten de Rijke and
Wouter Weerkamp is also online now.
In the paper we describe our participation in the INEX 2009 Entity Ranking track. We employ a probabilistic retrieval model for entity search in which term-based and category-based representations of queries and entities are effectively integrated. We demonstrate that our approach achieves state-of-the-art performance on both the entity ranking and list completion tasks.
INEX 2009 paper online
April 20, 2010 22:43
One of our INEX 2009 papers, An exploration of
learning to link with Wikipedia: Features, methods
and training collection, by Jiyin He and Maarten
de Rijke is online now.
We describe our participation in the Link-the-Wiki track at INEX 2009. We apply machine learning methods to the anchor-to-best-entry-point task and explore the impact of the following aspects of our approaches: features, learning methods, and the collection used for training the models. We find that a learning-to-rank-based approach and a binary classification approach do not differ much. The new Wikipedia collection, which is larger and has more links than the collection previously used, provides better training material for learning our models. In addition, a heuristic run that combines the two intuitively most useful features outperforms machine learning-based runs, which suggests that further analysis and selection of features is necessary.
CIVR 2010 paper online
April 16, 2010 07:13
Our CIVR 2010 paper Today's and Tomorrow's
Retrieval Practice in the Audiovisual Archive by
Bouke Huurnink, Cees Snoek, Maarten de Rijke and
Arnold Smeulders is online now.
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. We investigate to what extent content-based video retrieval methods can improve search in the audiovisual archive. In particular, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches and content purchases from an existing audiovisual archive to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. We find that incorporating content-based video retrieval into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. Our experiments also indicate that individual content-based retrieval methods yield approximately equal performance gains. We conclude that the time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
NAACL Social Media workshop paper online
April 13, 2010 09:04
Our NAACL 2010 Social Media workshop paper Mining
User Experiences from Online Forums: An
Exploration by Valentin Jijkoun, Maarten de
Rijke, Wouter Weerkamp, Paul Ackermans and Gijs
Geleijnse is available online now.
We introduce the task of experience mining. Here, the goal is to gain insights into criteria that people formulate to judge or rate a product or its usage. These criteria can be formulated as the expectations that people have of the product in advance (i.e., the reasons to buy), but can also be expressed as reports of experiences while using the product and comparisons with other products. We focus on the latter: reports of experiences with products. In this paper, we define the task, describe guidelines for manual annotation and analyze linguistic features that can be used in an automatic experience mining system.
And one more JASIST paper online
March 21, 2010 20:35
Search Behavior of Media Professionals at an
Audiovisual Archive: A Transaction Log Analysis
by Bouke Huurnink, Laura Hollink, Wietske van den
Heuvel and Maarten de Rijke is now available online
on the Journal of the American Society for
Information Science and Technology site at
http://doi.wiley.com/10.1002/asi.21327.
Finding audiovisual material for reuse in new programs is an important activity for news producers, documentary makers, and other media professionals. Such professionals are typically served by an audiovisual broadcast archive. We report on a study of the transaction logs of one such archive. The analysis includes an investigation of commercial orders made by the media professionals and a characterization of sessions, queries, and the content of terms recorded in the logs. One of our key findings is that there is a strong demand for short pieces of audiovisual material in the archive. In addition, while searchers are generally able to quickly navigate to a usable audiovisual broadcast, it takes them longer to place an order when purchasing a subsection of a broadcast than when purchasing an entire broadcast. Another key finding is that queries predominantly consist of (parts of) broadcast titles and of proper names. Our observations imply that it may be beneficial to increase support for fine-grained access to audiovisual material, for example, through manual segmentation or content-based analysis.
Another JASIST paper online
February 17, 2010 05:58
Contextual Factors for Finding Similar Experts
by Katja Hofmann, Krisztian Balog, Toine Bogers and
Maarten de Rijke was accepted by the Journal of
the American Society for Information Science and
Technology late last year. It is available online
now at http://dx.doi.org/10.1002/asi.21292.
Expertise-seeking research studies how people search for expertise and choose whom to contact in the context of a specific task. Important outcomes are models that identify factors that influence expert finding. Expertise retrieval addresses the same problem, expert finding, but from a system-centered perspective. The main focus has been on developing content-based algorithms similar to document search. These algorithms identify matching experts primarily on the basis of the textual content of documents with which experts are associated. Other factors, such as the ones identified by expertise-seeking models, are rarely taken into account. In this article, we extend content-based expert-finding approaches with contextual factors that have been found to influence human expert finding. We focus on a task of science communicators in a knowledge-intensive environment, the task of finding similar experts, given an example expert. Our approach combines expertise-seeking and retrieval research. First, we conduct a user study to identify contextual factors that may play a role in the studied task and environment. Then, we design expert retrieval models to capture these factors. We combine these with content-based retrieval models and evaluate them in a retrieval experiment. Our main finding is that while content-based features are the most important, human participants also take contextual factors into account, such as media experience and organizational structure. We develop two principled ways of modeling the identified factors and integrate them with content-based retrieval models. Our experiments show that models combining content-based and contextual factors can significantly outperform existing content-based models.
ECIR 2010 papers online
December 23, 2009 11:29
Our two ECIR 2010 papers are online now. One is
entitled Category-based Query Modeling for Entity
Search and authored by Krisztian Balog, Marc Bron
and Maarten de Rijke. Users often search for entities
instead of documents and in this setting are willing
to provide extra input, in addition to a query, such
as category information and example entities. We
propose a general probabilistic framework for entity
search to evaluate and provide insight in the many
ways of using these types of input for query
modeling. We focus on the use of category information
and show the advantage of a category-based
representation over a term-based representation, and
also demonstrate the effectiveness of category-based
expansion using example entities. Our best performing
model shows very competitive performance on the
INEX-XER entity ranking and list completion tasks.
The paper is available here.
Our other ECIR 2010 paper is called News Comments: Exploring, Modeling, and Online Prediction with Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke as authors. Online news agents provide commenting facilities for their readers to express their opinions or sentiments with regard to news stories. The number of user-supplied comments on a news article may be indicative of its importance, interestingness, or impact. We explore the news comments space, and compare the log-normal and the negative binomial distributions for modeling comments from various news agents. These estimated models can be used to normalize raw comment counts and enable comparison across different news sites. We also examine the feasibility of online prediction of the number of comments, based on the volume observed shortly after publication. We report on solid performance for predicting news comment volume in the long run, after short observation. This prediction can be useful for identifying news stories with the potential to "take off," and can be used to support front page optimization for news sites. The paper is available here.
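To illustrate how a fitted model can make raw comment counts comparable across sites, here is a minimal sketch for the log-normal case. It assumes a maximum-likelihood fit (mean and standard deviation of the log counts) and a z-score in log space as the normalization; the abstract does not spell out the paper's procedure, and the function names and counts below are hypothetical.

```python
import math
from statistics import mean, pstdev


def fit_log_normal(counts):
    """Estimate log-normal parameters (mu, sigma) from positive comment counts."""
    logs = [math.log(c) for c in counts if c > 0]
    return mean(logs), pstdev(logs)


def normalized_score(count, mu, sigma):
    """Express a raw count as a z-score under the site's fitted log-normal model."""
    return (math.log(count) - mu) / sigma


# Hypothetical comment counts per article on two sites of very different size.
site_a = [1, 10, 100]
site_b = [10, 100, 1000]

mu_a, sigma_a = fit_log_normal(site_a)
mu_b, sigma_b = fit_log_normal(site_b)

z_a = normalized_score(100, mu_a, sigma_a)    # most-commented article on site A
z_b = normalized_score(1000, mu_b, sigma_b)   # most-commented article on site B
```

Although the raw counts differ by a factor of ten, both articles sit at the same position relative to their own site's distribution, so their normalized scores coincide.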
JASIST paper online
November 22, 2009 14:05
Predicting Podcast Preference: An Analysis
Framework and its Application by Manos Tsagkias,
Martha Larson and Maarten de Rijke was accepted by
Journal of the American Society for Information
Science and Technology a while back. It is
available online now at http://dx.doi.org/10.1002/asi.21259.
In the paper we start from the observation that finding worthwhile podcasts can be difficult for listeners since podcasts are published in large numbers and vary widely with respect to quality and repute. Independently of their informational content, certain podcasts provide satisfying listening material while other podcasts have little or no appeal. In this paper we present PodCred, a framework for analyzing listener appeal, and we demonstrate its application to the task of automatically predicting the listening preferences of users. First, we describe the PodCred framework, which consists of an inventory of factors contributing to user perceptions of the credibility and quality of podcasts. The framework is designed to support automatic prediction of whether or not a particular podcast will enjoy listener preference. It consists of four categories of indicators related to the Podcast Content, the Podcaster, the Podcast Context, and the Technical Execution of the podcast. Three studies contributed to the development of the PodCred framework: a review of the literature on credibility for other media, a survey of prescriptive guidelines for podcasting, and a detailed data analysis. Next, we report on a validation exercise in which the PodCred framework is applied to a real-world podcast preference prediction task. Our validation focuses on select framework indicators that show promise of being both discriminative and readily accessible. We translate these indicators into a set of easily extractable surface features and use them to implement a basic classification system. The experiments carried out to evaluate the system use popularity levels in iTunes as ground truth and demonstrate that simple surface features derived from the PodCred framework are indeed useful for classifying podcasts.
IPM paper online
October 15, 2009 17:46
Conceptual Languages for Domain-Specific
Retrieval by Edgar Meij, Dolf Trieschnigg,
Maarten de Rijke and Wessel Kraaij was accepted for
publication in Information Processing and
Management a while back; it is available
now. Over the years, various meta-languages have been
used to manually enrich documents with conceptual
knowledge of some kind. Examples include keyword
assignment to citations or, more recently, tags to
websites. In this paper we propose generative concept
models as an extension to query modeling within the
language modeling framework, which leverages these
conceptual annotations to improve retrieval. By means
of relevance feedback the original query is
translated into a conceptual representation, which is
subsequently used to update the query model.
Extensive experimental work on five test collections in two domains shows that our approach gives significant improvements in terms of recall, initial precision and mean average precision with respect to a baseline without relevance feedback. On one test collection, it is also able to outperform a text-based pseudo-relevance feedback approach based on relevance models. On the other test collections it performs similarly to relevance models. Overall, conceptual language models have the added advantage of offering query and browsing suggestions in the form of conceptual annotations. In addition, the internal structure of the meta-language can be exploited to add related terms.
Our contributions are threefold. First, an extensive study is conducted on how to effectively translate a textual query into a conceptual representation. Second, we propose a method for updating a textual query model using the concepts in the conceptual representation. Finally, we provide an extensive analysis of when and how this conceptual feedback improves retrieval.
IJDAR paper online
August 27, 2009 23:24
An Efficient Coherence Measure to Determine
Topical Consistency in User Generated Content by
Jiyin He, Wouter Weerkamp, Martha Larson and Maarten
de Rijke is available online. When searching for
blogs on a specific topic, information seekers
prefer blogs that place a central focus on that
topic over blogs whose mention of the topic is
diffuse or incidental. In order to present users
with better blog feed search results, we develop
a measure of topical consistency that is able to
capture whether or not a blog is topically
focused. The measure, called the coherence
score, is inspired by the genetics
literature and captures the tightness of the
clustering structure of a data set relative to a
background collection. In a set of experiments
on synthetic data, the coherence score is shown
to provide a faithful reflection of topic
clustering structure. The properties that make
the coherence score more appropriate than
lexical cohesion, a common measure of topical
structure, are discussed. Retrieval experiments
show that integrating the coherence score as a
prior in a language modeling-based approach to
blog feed search improves retrieval
effectiveness. The coherence score must,
however, be used judiciously in order to avoid
boosting the ranking of irrelevant but topically
focused blogs. To this end, we experiment with a
series of weighting schemes that adjust the
contribution of the coherence score according to
the relevance of a blog to the user query. An
appropriate weighting scheme is able to improve
retrieval performance. Finally, we show that the
coherence score can be reliably estimated with a
sample exceeding 20 posts in size. Consistent
with this finding, experiments show that the
best retrieval performance is achieved if
coherence scores are used when a blog contains
more than 20 posts.
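The gating on blog size described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The pairwise definition of coherence (the share of post pairs whose cosine similarity exceeds a threshold taken from a background collection), the function names, the smoothing constant, and the log-linear combination with the retrieval score are all assumptions made for the example. Only the 20-post cutoff comes from the abstract.

```python
import math
from itertools import combinations

MIN_POSTS = 20  # from the abstract: estimates are reliable above 20 posts


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence(posts, threshold):
    """Assumed definition: fraction of post pairs more similar than `threshold`."""
    pairs = list(combinations(posts, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) > threshold for a, b in pairs) / len(pairs)


def blog_score(log_query_likelihood, posts, threshold, smoothing=0.01):
    """Add a log coherence prior only for blogs with more than MIN_POSTS posts."""
    if len(posts) <= MIN_POSTS:
        return log_query_likelihood
    prior = coherence(posts, threshold) + smoothing  # smoothing avoids log(0)
    return log_query_likelihood + math.log(prior)
```

In this sketch a small blog keeps its plain retrieval score, so an unreliable coherence estimate cannot affect its ranking.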
CIKM 2009 papers online
August 20, 2009 16:48
Three CIKM 2009 papers are online now. The first,
The Impact of Document Structure on Keyphrase
Extraction by Katja Hofmann, Manos Tsagkias,
Edgar Meij and Maarten de Rijke, can be downloaded
here. Keyphrases are short
phrases that reflect the main topic of a
document. Because manually annotating documents
with keyphrases is a time-consuming process,
several automatic approaches have been
developed. Typically, candidate phrases are
extracted using features such as position or
frequency in the document text. Document
structure may contain useful information about
which parts or phrases of a document are
important, but has rarely been considered as a
source of information for keyphrase extraction.
We address this issue in the context of
keyphrase extraction from scientific literature.
We introduce a new, large corpus that consists
of full-text journal articles, where the rich
collection and document structure available at
the publishing stage is explicitly annotated. We
explore features based on the XML tags contained
in the documents, and based on generic section
types derived using position and cue words in
section titles. For XML tags we find sections,
abstract, and title to perform best, but many
smaller elements may be beneficial in
combination with other features. Of the generic
section types, the discussion section is found
to be the most useful for keyphrase extraction.
The second paper, A Query Model Based on Normalized Log-Likelihood, by Edgar Meij, Wouter Weerkamp and Maarten de Rijke, is available here. Leveraging information from relevance assessments has been proposed as an effective means for improving retrieval. We introduce a novel language modeling method which uses information from each assessed document and their aggregate. While most previous approaches focus either on features of the entire set or on features of the individual relevant documents, our model exploits features of both the documents and the set as a whole. Our evaluation shows that the model significantly improves over state-of-the-art feedback methods.
The third paper, Predicting the Volume of Comments on Online News Stories by Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke is available here. On-line news agents provide commenting facilities for readers to express their views with regard to news stories. The number of user-supplied comments on a news article may be indicative of its importance or impact. We report on exploratory work that predicts the comment volume of news articles prior to publication using five feature sets. We address the prediction task as a two-stage classification task: a binary classification identifies articles with the potential to receive comments, and a second binary classification receives the output from the first step to label articles "low" or "high" comment volume. The results show solid performance for the former task, while performance degrades for the latter.
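The two-stage set-up described for the third paper can be sketched as a simple cascade. This is an illustration only: the function names, the feature dictionary, and the toy decision rules are hypothetical stand-ins for trained classifiers and do not come from the paper.

```python
from typing import Callable, Dict

Article = Dict[str, float]                    # hypothetical feature dictionary
BinaryClassifier = Callable[[Article], bool]


def label_comment_volume(article: Article,
                         receives_comments: BinaryClassifier,
                         is_high_volume: BinaryClassifier) -> str:
    """Cascade two binary decisions into 'none', 'low' or 'high'."""
    # Stage 1: does the article have the potential to receive comments at all?
    if not receives_comments(article):
        return "none"
    # Stage 2: only articles that pass stage 1 are labelled low or high.
    return "high" if is_high_volume(article) else "low"


# Toy stand-ins for trained classifiers; the features and cutoffs are invented.
def toy_stage_one(article: Article) -> bool:
    return article.get("front_page", 0.0) > 0.5


def toy_stage_two(article: Article) -> bool:
    return article.get("named_entities", 0.0) >= 10
```

Because the second classifier only sees what the first lets through, errors made in stage one carry over into stage two.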
ISWC 2009 paper online
August 20, 2009 16:36
Learning Semantic Query Suggestions by Edgar
Meij, Marc Bron, Laura Hollink, Bouke Huurnink and
Maarten de Rijke is available online now. An important
application of semantic web technology is
recognizing human-defined concepts in text.
Query transformation is a strategy often used in
search engines to derive queries that are able
to return more useful search results than the
original query and most popular search engines
provide facilities that let users complete,
specify, or reformulate their queries. We study
the problem of semantic query suggestion,
a special type of query transformation based on
identifying semantic concepts contained in user
queries. We use a feature-based approach in
conjunction with supervised machine learning,
augmenting term-based features with search
history-based and concept-specific features. We
apply our method to the task of linking queries
from real-world query logs (the transaction logs
of the Netherlands Institute for Sound and
Vision) to the DBpedia knowledge base. We
evaluate the utility of different machine
learning algorithms, features, and feature types
in identifying semantic concepts using a
manually developed test bed and show significant
improvements over an already high baseline. The
resources developed for this paper, i.e.,
queries, human assessments, and extracted
features, are available for download.
ACL-IJCNLP 2009 paper online
May 12, 2009 22:37
A Generative Blog Post Retrieval Model that Uses
Query Expansion based on External Collections by
Wouter Weerkamp, Krisztian Balog and Maarten de Rijke
is available online now. User generated content is
characterized by short, noisy documents, with many
spelling errors and unexpected language usage. To
bridge the vocabulary gap between the user's
information need and documents in a specific user
generated content environment, the blogosphere, we
apply a form of query expansion, i.e., adding and
reweighting query terms. Since the blogosphere is
noisy, query expansion on the collection itself is
rarely effective but external, edited collections are
more suitable. In the paper we propose a generative
model for expanding queries using external
collections in which dependencies between queries,
documents, and expansion documents are explicitly
modeled. Different instantiations of our model are
discussed and make different (in)dependence
assumptions. Results using two external collections
(news and Wikipedia) show that external expansion for
retrieval of user generated content is effective;
besides, conditioning the external collection on the
query is very beneficial, and making candidate
expansion terms dependent on just the document seems
sufficient.
WePS2 paper online
February 21, 2009 09:45
The University of Amsterdam at WePS2 by
Krisztian Balog, Jiyin He, Katja Hofmann, Valentin
Jijkoun, Christof Monz, Manos Tsagkias, Wouter
Weerkamp and Maarten de Rijke is online now. In this paper we
describe our participation in the Second Web
People Search workshop (WePS2) and detail our
approaches. For the clustering task, our focus
was on replicating the lessons learned at WePS1
on the data set made available as part of WePS2
and on experimenting with a voting-based
combination of clustering methods. We found that
clustering methods display the same overall
behavior on the WePS1 and WePS2 data sets and
that a hierarchical clustering approach delivers
the best performance, even outperforming
voting-based combinations.
For attribute extraction, we explore approaches using pattern matching with manually and automatically constructed patterns. Manual patterns were constructed using expert knowledge and following analysis of sample data. Automatic pattern construction extracts textual and syntactic context around training samples and selects patterns which are expected to perform well based on leave-one-out evaluation. Experimental results show that manually constructed patterns are very effective for obtaining high recall. For automatically extracted patterns performance varied widely depending on the attribute type. Larger amounts of training data may help improve these approaches in the future.
WebCLEF 2008 overview online
February 21, 2009 09:42
Overview of WebCLEF 2008 by Valentin Jijkoun
and Maarten de Rijke is online now. The paper describes
the WebCLEF 2008 task. Similarly to the 2007
edition of WebCLEF, the 2008 edition implements
a multilingual "information synthesis" task,
where, for a given topic, participating systems
have to extract important snippets from web
pages. We detail the task, the assessment
procedure, the evaluation measures and results.
CLEF 2008 paper on domain-specific search online
February 09, 2009 21:46
Concept Models for Domain-specific Search by
Edgar Meij and Maarten de Rijke is online now. In the paper we
describe our participation in the 2008 CLEF
Domain-specific track. We evaluate blind
relevance feedback models and concept models on
the CLEF domain-specific test collection.
Applying relevance modeling techniques is found
to have a positive effect on the 2008 topic set,
in terms of mean average precision and
precision@10. Applying concept models for blind
relevance feedback results in even bigger
improvements over a query-likelihood baseline,
in terms of mean average precision and early
precision.
Another ECIR 2009 paper online
January 05, 2009 20:14
Using Contextual Information to Improve Search in
Email Archives, an ECIR 2009 paper by Wouter
Weerkamp, Krisztian Balog, and Maarten de Rijke is
available online now. In the
paper, we address the task of finding topically
relevant email messages in public discussion
lists. We make two important observations.
First, email messages are not isolated, but are
part of a larger online environment. This
context, existing on different levels, can be
incorporated into the retrieval model. We
explore the use of thread, mailing list, and
community content levels, by expanding our
original query with terms from these sources. We
find that query models based on contextual
information improve retrieval effectiveness.
Second, email is a relatively informal genre,
and therefore offers scope for incorporating
techniques previously shown useful in searching
user-generated content. Indeed, our experiments
show that using query-independent features
(email length, thread size, and text quality),
implemented as priors, results in further
improvements.
Some ECIR 2009 papers online
December 24, 2008 14:42
Two ECIR 2009 papers are online now. The first is
Exploiting Surface Features for the Prediction of
Podcast Preference by Manos Tsagkias, Martha
Larson and Maarten de Rijke. Podcasts display an
unevenness characteristic of domains dominated by
user generated content, resulting in potentially
radical variation of the user preference they enjoy.
In the paper we report on work that uses easily
extractable surface features of podcasts in order to
achieve solid performance on two podcast preference
prediction tasks: classification of preferred vs.
non-preferred podcasts and ranking podcasts by level
of preference. We identify features with good
discriminative potential by carrying out manual data
analysis, resulting in a refinement of the indicators
of an existent podcast preference framework. Our
preference prediction is useful for topic-independent
ranking of podcasts, and can be used to support
download suggestion or collection browsing.
The second paper is Investigating the Global Semantic Impact of Speech Recognition Error on Spoken Content Collections by Martha Larson, Manos Tsagkias, Jiyin He and Maarten de Rijke. Errors in speech recognition transcripts have a negative impact on the effectiveness of content-based speech retrieval and present a particular challenge for collections containing conversational spoken content. We propose a Global Semantic Distortion (GSD) metric that measures the collection-wide impact of speech recognition error on spoken content retrieval in a query-independent manner. We deploy our metric to examine the effects of speech recognition substitution errors. First, we investigate frequent substitutions, cases in which the recognizer habitually mis-transcribes one word as another. Although habitual mistakes have a large global impact, the long tail of rare substitutions has a more damaging effect. Second, we investigate semantically similar substitutions, cases in which the word spoken and the word recognized do not diverge radically in meaning. Similar substitutions are shown to have slightly less global impact than semantically dissimilar substitutions.
See the Publications page.
Catching up
November 21, 2008 22:50
A number of new papers have become available online
since the last update:
- The University of Amsterdam at TREC 2008:
Blog, Enterprise, and Relevance Feedback, K.
Balog, E. Meij, W. Weerkamp, J. He, and M. de
Rijke. In: TREC 2008 Working Notes, November
2008.
- The MediaMill TRECVID 2008 Semantic Video
Search Engine, C.G.M. Snoek, K.E.A. van de
Sande, O. de Rooij, B. Huurnink, J.C. van Gemert,
J.R.R. Uijlings, J. He, X. Li, I. Everts, V.
Nedovic, M. van Liempt, R. van Balen, F. Yan, M.A.
Tahir, K. Mikolajczyk, J. Kittler, M. de Rijke,
J.M. Geusebroek, Th. Gevers, M. Worring, A.W.M.
Smeulders, and D.C. Koelma. In: TRECVID Working
Notes, November 2008.
- The University of Amsterdam at the TAC 2008
Question Answering Track, V. Jijkoun and M. de
Rijke. In: TAC 2008 Working Notes, November
2008.
- PodCred: A Framework for Analyzing Podcast
Preference, M. Tsagkias, M. Larson, W. Weerkamp
and M. de Rijke. In: Second Workshop on
Information Credibility on the Web (WICOW
2008), October 2008.
- Non-Local Evidence for Expert Finding,
K. Balog and M. de Rijke. In: ACM 17th Conference
on Information and Knowledge Management (CIKM
2008), October 2008.
- Assessing Concept Selection for Video
Retrieval, B. Huurnink, K. Hofmann, and M. de
Rijke. In: ACM International Conference on
Multimedia Information Retrieval (MIR 2008),
October 2008.
- On the Topical Structure of the Relevance
Feedback Set, J. He, M. Larson and M. de Rijke.
In: FGIR Workshop on Information Retrieval
2008, October 2008.
- Overview of WebCLEF 2008 (draft), V.
Jijkoun and M. de Rijke. In: CLEF 2008 Working
Notes, September 2008.
- A Language Modeling Framework for Expertise
Search, K. Balog, L. Azzopardi, and M. de
Rijke. Information Processing and
Management, doi:10.1016/j.ipm.2008.06.003
CLEF 2008 Domain Specific Track working notes paper online
August 23, 2008 14:46
The University of Amsterdam at the CLEF 2008
Domain Specific Track by Edgar Meij and Maarten
de Rijke is
available online now. In the paper we describe
our participation in the CLEF 2008 Domain Specific
track. The research questions we address are
threefold: (i) what are the effects of estimating and
applying relevance models to the domain specific
collection used at CLEF 2008, (ii) what are the
results of parsimonizing these relevance models, and
(iii) what are the results of applying concept models
for blind relevance feedback? Parsimonization is a
technique by which the term probabilities in a
language model may be re-estimated based on a
comparison with a reference model, making the
resulting model more sparse and to the point. Concept
models are term distributions over vocabulary terms,
based on the language associated with concepts in a
thesaurus or ontology and are estimated using the
documents which are annotated with concepts. Concept
models may be used for blind relevance feedback, by
first translating a query to concepts and then back
to query terms. We find that applying relevance
models helps significantly for the current test
collection, in terms of both mean average precision
and early precision. Moreover, parsimonizing the
relevance models helps mean average precision on
title-only queries and early precision on
title+narrative queries. Our concept models are able
to significantly outperform a baseline
query-likelihood run, both in terms of mean average
precision and early precision on both title-only and
title+narrative queries.
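The parsimonization step described above can be sketched with the usual EM re-estimation for parsimonious language models. This is a hedged illustration rather than the procedure used in the paper: the mixing weight `lam`, the number of iterations, the pruning `cutoff`, and the floor probability for terms missing from the reference model are all assumed values, and the function name is hypothetical.

```python
def parsimonize(term_counts, reference, lam=0.1, iterations=20, cutoff=1e-4):
    """Re-estimate P(t|D) against a reference model; returns a sparser model.

    term_counts: raw term frequencies of the document (or feedback set)
    reference:   P(t|C), the reference (background) model
    """
    total = sum(term_counts.values())
    model = {t: c / total for t, c in term_counts.items()}  # maximum likelihood start
    for _ in range(iterations):
        # E-step: share of each term's count explained by the document model
        expected = {}
        for t, p in model.items():
            mix = lam * p + (1.0 - lam) * reference.get(t, 1e-9)  # floor is assumed
            expected[t] = term_counts[t] * lam * p / mix
        # M-step: normalize, drop terms below the cutoff, renormalize
        norm = sum(expected.values())
        model = {t: e / norm for t, e in expected.items() if e / norm >= cutoff}
        z = sum(model.values())
        model = {t: p / z for t, p in model.items()}
    return model
```

Terms that the reference model already explains well lose probability mass across iterations, which is what makes the resulting model sparser and more specific.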
CIKM 2008 paper online
August 16, 2008 00:33
Non-Local Evidence for Expert Finding by
Krisztian Balog and Maarten de Rijke is available
online now. The task addressed in this paper,
finding experts in an enterprise setting, has gained
in importance and interest over the past few years.
Commonly, this task is approached as an association
finding exercise between people and topics. Existing
techniques use either documents (as a whole) or
proximity-based techniques to represent candidate
experts. Proximity-based techniques have shown clear
precision-enhancing benefits. We complement both
document and proximity-based approaches to expert
finding by importing global evidence of expertise,
i.e., evidence obtained using information that is not
available in the immediate proximity of a candidate
expert's name occurrence or even on the same page on
which the name occurs. Examples include candidate
priors, query models, as well as other documents a
candidate expert is associated with.
Using the CSIRO data set created for the TREC 2007 Enterprise track we identify examples of non-local evidence of expertise. We then propose modified expert retrieval models that are capable of incorporating both local (either document or snippet-based) evidence and non-local evidence of expertise. Results show that our refined models significantly outperform existing state-of-the-art approaches.
WICOW 2008 paper online
August 15, 2008 09:04
PodCred: A Framework for Analyzing Podcast
Preference by Manos Tsagkias, Martha Larson,
Wouter Weerkamp and Maarten de Rijke is available
online now. The PodCred framework is a framework
for assessing the credibility and quality of podcasts
published on the internet. It consists of a series of
indicators designed to support prediction of listener
preference of one podcast over another, given that
both carry comparable informational content. The
indicators are grouped into four categories
pertaining to the Podcast Content, the
Podcaster, the Podcast Context or the
Technical Execution of the podcast. We adopt
the term "cred" as a designation encompassing both
credibility (comprising trustworthiness and
expertise) and qualitative acceptability to
listeners. Our podcast analysis framework is inspired
by work on credibility in blogs, another medium
dominated by user generated content. The PodCred
framework is derived from a review of the literature
on credibility for other media, a survey of
prescriptive standards for podcasting, and a detailed
data analysis of award winning podcasts. The paper
concludes with a discussion of future work in which
the framework will be applied.
ACM MIR 2008 paper online
August 15, 2008 09:01
Assessing Concept Selection for Video
Retrieval by Bouke Huurnink, Katja Hofmann and
Maarten de Rijke is available
online now. In the paper we explore the use of
benchmarks to address the problem of assessing
concept selection in video retrieval systems. Two
benchmarks are presented, one created by human
association of queries to concepts, the other
generated from an extensively tagged collection. They
are compared in terms of reliability, captured
semantics, and retrieval performance. Recommendations
are given for using the benchmarks to assess concept
selection algorithms; the assessment is demonstrated
on two existing algorithms. The benchmarks are
released to the research community.
SIGIR 2008 workshop paper online (4)
June 28, 2008 17:03
Using Term Clouds to Represent Segment-Level
Semantic Content of Podcasts by Marguerite
Fuller, Manos Tsagkias, Eamonn Newman, Jana Besser,
Martha Larson, Gareth Jones and Maarten de Rijke is
available online now. Spoken
audio, like any time-continuous medium, is
notoriously difficult to browse or skim without
support of an interface providing semantically
annotated jump points to signal the user where
to listen in. Creation of time-aligned metadata
by human annotators is prohibitively expensive,
motivating the investigation of representations
of segment-level semantic content based on
transcripts generated by automatic speech
recognition (ASR). This paper examines the
feasibility of using term clouds to provide
users with a structured representation of the
semantic content of podcast episodes. Podcast
episodes are visualized as a series of
sub-episode segments, each represented by a term
cloud derived from a transcript generated by
automatic speech recognition (ASR). Quality of
segment-level term clouds is measured
quantitatively and their utility is investigated
using a small-scale user study based on human
labeled segment boundaries. Since the
segment-level clouds generated from
ASR-transcripts prove useful, we examine an
adaptation of text tiling techniques to speech
in order to be able to generate segments as part
of a completely automated indexing and
structuring system for browsing of spoken audio.
Results demonstrate that the segments generated
are comparable with human selected segment
boundaries.
SIGIR 2008 workshop paper online (3)
June 27, 2008 06:34
Integrating Contextual Factors into Topic-centric
Retrieval Models for Finding Similar Experts by
Katja Hofmann, Krisztian Balog, Toine Bogers, and
Maarten de Rijke is available online now. Expert
finding has been addressed from multiple
viewpoints, including expertise seeking and
expert retrieval. The focus of expertise seeking
has mostly been on descriptive or predictive
models, for example to identify what factors
affect human decisions on locating and selecting
experts. In expert retrieval the focus has been
on algorithms similar to document search, which
identify topical matches based on the content of
documents associated with experts.
We report on a pilot study on an expert finding task in which we explore how contextual factors identified by expertise seeking models can be integrated with topic-centric retrieval algorithms and examine whether they can improve retrieval performance for this task. We focus on the task of similar expert finding: given a small number of example experts, find similar experts. Our main finding is that, while topical knowledge is the most important factor, human subjects also consider other factors, such as reliability, up-to-dateness, and organizational structure. We find that integrating these factors into topical retrieval models can significantly improve retrieval performance.
SIGIR 2008 Workshop paper online (2)
June 21, 2008 08:11
Blogger, Stick to your Story: Modeling Topical
Noise in Blogs with Coherence Measures by Jiyin
He, Wouter Weerkamp, Martha Larson, and Maarten de
Rijke is available now. Topical noise in
blogs arises when bloggers digress from the
central topical thrust of their blogs. We
introduce a method to explicitly incorporate a
model of topical noise into a language modeling
approach to the task of blog distillation.
Topical noise is integrated into the model using
a coherence score, which reflects the tightness
of the topical structure of a blog. Tests
performed on the TRECBlog06 corpus show that a
naive integration of the coherence score as blog
prior fails to achieve performance improvements.
Instead, we develop a set of more sophisticated
models in which the coherence score is weighted
by a function of the blog retrieval score. The
proposed models help improve effectiveness of
our language modeling approach to the blog
distillation task.
Listening to ''Run to Yuki'', by Yuji Nomi (Play Count: 3)
SIGIR 2008 Workshop paper online
June 20, 2008 15:52
Named Entity Normalization in User Generated
Content by Valentin Jijkoun, Mahboob Khalid,
Maarten Marx and Maarten de Rijke is available online now. Named
entity recognition is important for semantically
oriented retrieval tasks, such as question
answering, entity retrieval, biomedical
retrieval, trend detection, and event and entity
tracking. In many of these tasks it is important
to be able to accurately normalize the
recognized entities, i.e., to map surface forms
to unambiguous references to real world
entities. Within the context of structured
databases, this task (known as record linkage
and data de-duplication) has been a topic of
active research for more than five decades. For
edited content, such as news articles, the named
entity normalization (NEN) task is one that has
recently attracted considerable attention. We
consider the task in the challenging context of
user generated content (UGC), where it forms a
key ingredient of tracking and media-analysis
systems.
A baseline NEN system from the literature (that normalizes surface forms to Wikipedia pages) performs considerably worse on UGC than on edited news: accuracy drops from 80% to 65% for a Dutch language data set and from 94% to 77% for English. We identify several sources of errors: entity recognition errors, multiple ways of referring to the same entity and ambiguous references.
To address these issues we propose five improvements to the baseline NEN algorithm, to arrive at a language independent NEN system that achieves overall accuracy scores of 90% on the English data set and 89% on the Dutch data set. We show that each of the improvements contributes to the overall score of our improved NEN algorithm, and conclude with an error analysis on both Dutch and English language UGC. The NEN system is computationally efficient and runs with very modest computational requirements.
iTunes is not playing.
ECAI 2008 paper online
May 20, 2008 15:32
Finding Key Bloggers, One Post At A Time by
Wouter Weerkamp, Krisztian Balog and Maarten de Rijke
is available online now. User
generated content in general, and blogs in
particular, form an interesting and relatively
little explored domain for mining knowledge. We
address the task of blog distillation: to find
blogs that are principally devoted to a given
topic, as opposed to blogs that merely happen to
discuss the topic in passing. Working in the
setting of statistical language modeling, we
model the task by aggregating a blogger's blog
posts to collect evidence of relevance to the
topic and persistence of interest in the topic.
This approach achieves state-of-the-art
performance. On top of this baseline, we extend
our model by incorporating a number of
blog-specific features, concerning document
structure, social structure, and temporal
structure. These blog-specific features yield
further improvements.
iTunes is not playing.
SIGIR 2008 poster online (6)
April 24, 2008 06:03
Measuring Concept Relatedness Using Language
Models by Dolf Trieschnigg, Edgar Meij, Maarten de
Rijke and Wessel Kraaij is available online now. Over
the years, the notion of concept relatedness has
attracted considerable attention. A variety of
approaches, based on ontology structure,
information content, association, or context
have been proposed to indicate the relatedness
of abstract ideas. We propose a method based on
the cross entropy reduction between language
models of concepts which are estimated based on
document-concept assignments. The approach shows
improved or competitive results compared to
state-of-the-art methods on two test sets in the
biomedical domain.
Listening to ''Nolita'', by Keren Ann (Play Count: 19)
SIGIR 2008 paper online
April 23, 2008 22:02
A Few Examples Go A Long Way: Constructing Query
Models from Elaborate Query Formulations by
Krisztian Balog, Wouter Weerkamp and Maarten de Rijke
is available online now. In the
paper we address a specific enterprise document
search scenario, where the information need is
expressed in an elaborate manner. In our
scenario, information needs are expressed using
a short query (of a few keywords) together with
examples of key reference pages. Given this
setup, we investigate how the examples can be
utilized to improve the end-to-end performance
on the document retrieval task. Our approach is
based on a language modeling framework, where
the query model is modified to resemble the
example pages. We compare several methods for
sampling expansion terms from the example pages
to support query-dependent and query-independent
query expansion; the latter is motivated by the
wish to increase "aspect recall," and attempts
to uncover aspects of the information need not
captured by the query.
For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.
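As a rough illustration of modifying a query model to resemble example pages, here is a minimal sketch. It is not the method from the paper: the function names, the whitespace tokenizer, the interpolation weight `lam`, and the choice of the `k` most frequent example-page terms (selected without looking at the query) are all assumptions made for illustration.

```python
from collections import Counter


def term_distribution(texts):
    """Maximum-likelihood term distribution over whitespace tokens."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: c / total for t, c in counts.items()}


def expanded_query_model(query, example_pages, lam=0.5, k=5):
    """Interpolate the original query model with expansion terms.

    Hypothetical sketch: expansion terms are the k most frequent terms
    in the example pages, chosen independently of the query.
    """
    p_query = term_distribution([query])
    p_examples = term_distribution(example_pages)

    # Keep the k most probable example-page terms (ties broken alphabetically).
    top = dict(sorted(p_examples.items(), key=lambda kv: (-kv[1], kv[0]))[:k])
    norm = sum(top.values())
    p_expansion = {t: p / norm for t, p in top.items()} if norm else {}

    # Linear interpolation of the two distributions.
    terms = set(p_query) | set(p_expansion)
    return {
        t: (1 - lam) * p_query.get(t, 0.0) + lam * p_expansion.get(t, 0.0)
        for t in terms
    }


model = expanded_query_model(
    "expert search",
    ["expert finding models", "expert profiles"],
    lam=0.5,
    k=2,
)
```

Terms that occur both in the query and in the example pages receive mass from both components, while terms found only in the example pages enter the model through the expansion component.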
Listening to ''A Serious Version'', by King Tubby & The Aggrovators (Play Count: 6)
SIGIR 2008 poster online (5)
April 23, 2008 12:34
Term Clouds as Surrogates for User Generated
Speech by Manos Tsagkias, Martha Larson and
Maarten de Rijke is available online. User
generated spoken audio remains a challenge for
Automatic Speech Recognition (ASR) technology
and content-based audio surrogates derived from
ASR-transcripts must be error robust. An
investigation of the use of term clouds as
surrogates for podcasts demonstrates that ASR
term clouds closely approximate term clouds
derived from human-generated transcripts across
a range of cloud sizes. A user study confirms
the conclusion that ASR-clouds are viable
surrogates for depicting the content of
podcasts.
Listening to ''Allegro Blues'', by Dave Brubeck (Play Count: 2)
SIGIR 2008 poster online (4)
April 23, 2008 09:14
Parsimonious Concept Modeling by Edgar Meij,
Dolf Trieschnigg, Maarten de Rijke, and Wessel Kraaij
is available online now. We
introduce a parsimonious conceptual query model
whose retrieval performance matches that of
relevance models, while it is also able to
generate high quality navigation suggestions in
the form of concepts.
Listening to ''The Paris Match'', by The Style Council (Play Count: 12)
SIGIR 2008 poster online (3)
April 23, 2008 09:07
Parsimonious Relevance Models by Edgar Meij,
Wouter Weerkamp, Krisztian Balog and Maarten de Rijke
is available online. We describe
a method for applying parsimonious language
models to re-estimate the term probabilities
assigned by relevance models. We apply our
method to six topic sets from test collections
in five different genres. Our parsimonious
relevance models (i) improve retrieval
effectiveness in terms of MAP on all
collections, (ii) significantly outperform their
non-parsimonious counterparts on most measures,
and (iii) have a precision enhancing effect,
unlike other blind relevance feedback methods.
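To make the idea of a parsimonious language model concrete, the sketch below uses a common EM formulation in which a term distribution is re-estimated against a background (collection) model, so that terms already well explained by the background lose probability mass. This is an assumption-laden illustration, not the exact procedure of the poster: the function name, the parameters `lam`, `iterations` and `threshold`, and the toy background probabilities are all hypothetical.

```python
from collections import Counter


def parsimonious_model(tokens, background, lam=0.1, iterations=20, threshold=1e-4):
    """Re-estimate term probabilities with EM against a background model.

    tokens:     list of terms (e.g. from feedback documents)
    background: dict term -> P(term | collection); unseen terms get a tiny floor
    lam:        weight of the specific model in the mixture (assumed value)
    """
    tf = Counter(tokens)
    total = sum(tf.values())
    p_doc = {t: c / total for t, c in tf.items()}

    for _ in range(iterations):
        # E-step: expected count of each term attributed to the specific model.
        expected = {}
        for t, p in p_doc.items():
            mix = (1 - lam) * background.get(t, 1e-9) + lam * p
            expected[t] = tf[t] * lam * p / mix

        # M-step: renormalize the expected counts.
        norm = sum(expected.values())
        p_doc = {t: e / norm for t, e in expected.items()}

        # Prune terms whose probability falls below the threshold, then renormalize.
        kept = {t: p for t, p in p_doc.items() if p >= threshold}
        if not kept:
            break
        norm = sum(kept.values())
        p_doc = {t: p / norm for t, p in kept.items()}

    return p_doc


doc = ["the", "the", "the", "retrieval", "retrieval", "parsimonious"]
background = {"the": 0.9, "retrieval": 0.05, "parsimonious": 0.05}
model = parsimonious_model(doc, background)
```

In this toy example the frequent but uninformative term "the" ends up with far less mass than its maximum-likelihood estimate of 0.5, while the more specific terms gain mass.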
Listening to ''The Paris Match'', by The Style Council (Play Count: 12)
SIGIR 2008 poster online (2)
April 19, 2008 08:24
Personal vs Non-Personal Blogs: Initial
Classification Experiments by Erik Elgersma and
Maarten de Rijke is available online now. In the
poster we address the task of separating
personal from non-personal blogs, and report on
a set of baseline experiments where we compare
the performance on a small set of features
across a set of five classifiers. We show that
with a limited set of features a performance of
up to 90% can be obtained.
Listening to ''Barrio Vejo'', by Ry Cooder (Play Count: 4)
SIGIR 2008 poster online
April 19, 2008 08:17
Bloggers as Experts, by Krisztian Balog,
Maarten de Rijke and Wouter Weerkamp is available online now. We
address the task of (blog) feed distillation: to
find blogs that are principally devoted to a
given topic. The task may be viewed as an
association finding task, between topics and
bloggers; it resembles the expert finding task,
for which a range of models have been proposed.
We adopt two language modeling-based approaches
to expert finding, and determine their
effectiveness as feed distillation strategies.
The two models capture the idea that a human
will often search for key blogs by spotting
highly relevant posts (the Posting model) or by
taking global aspects of the blog into account
(the Blogger model). The Blogger model
outperforms the Posting model and delivers
state-of-the-art performance, out-of-the-box.
Listening to ''Barrio Vejo'', by Ry Cooder (Play Count: 4)
ACL 2008 paper online
April 19, 2008 06:54
Credibility Improves Topical Blog Post
Retrieval by Wouter Weerkamp and Maarten de
Rijke is available online now. Topical
blog post retrieval is the task of ranking blog
posts with respect to their relevance for a
given topic. To improve topical blog post
retrieval we incorporate textual credibility
indicators in the retrieval process. We consider
two groups of indicators: post level
(determined using information about individual
blog posts only) and blog level
(determined using information from the
underlying blogs). We describe how to estimate
these indicators and how to integrate them into
a retrieval approach based on language models.
Experiments on the TREC Blog track test set show
that both groups of credibility indicators
significantly improve retrieval effectiveness;
the best performance is achieved when combining
them.
Listening to ''Lullaby 4 Nina'', by The Durutti Column (Play Count: 7)
DIR 2008 paper online
March 16, 2008 06:42
Looking at Things Differently:
Exploring Perspective Recall for Informal Text
Retrieval by Wouter Weerkamp and Maarten
de Rijke is available now. The paper will be
presented at DIR 2008 this April; it reports on
ongoing work where we examine the use of query
expansion against multiple external corpora so as
to uncover multiple perspectives on a given topic.
Our working assumption is that uncovering multiple
perspectives is especially helpful when searching
informal text (blogs, discussion forums, comments,
etc).
Listening to ''Thin Blue Flame'', by Josh Ritter (Play Count: 0)
CLEF 2007 and NLPIX 2008 papers online
March 15, 2008 13:43
The proceedings versions of two CLEF 2007 papers are
online now: Overview of WebCLEF 2007,
by Valentin Jijkoun and Maarten de Rijke, and
Using Centrality to Rank Web
Snippets by the same authors. Also
available now is Personal Name Resolution of Web
People Search by Leif Azzopardi, Krisztian
Balog and Maarten de Rijke; this paper will appear
in the WWW 2008 workshop on NLP Challenges in the
Information Explosion Era (NLPIX 2008).
Listening to ''Fate (Aka For Soph)'', by The Durutti Column (Play Count: 31)
TREC 2007 proceedings papers online
February 06, 2008 13:08
Two contributions to the TREC 2007 proceedings,
Query and Document Models for
Enterprise Search by Krisztian Balog,
Katja Hofmann, Wouter Weerkamp and Maarten de
Rijke, and Language Modeling Approaches to
Blog Post and Feed Finding by Breyten
Ernsting, Wouter Weerkamp and Maarten de Rijke,
are online now.
iTunes is not playing.
ECIR 2008 paper online (3)
January 12, 2008 08:30
Associating People and Documents, by Krisztian
Balog and Maarten de Rijke is available now. Since
the introduction of the Enterprise Track at TREC in
2005, the task of finding experts has generated a lot
of interest within the research community. Numerous
models have been proposed that rank candidates by
their level of expertise with respect to some topic.
Common to all approaches is a component that
estimates the strength of the association between a
document and a person. Forming such associations,
then, is a key ingredient in expertise search models.
In this paper we introduce and compare a number of
methods for building document-people associations.
Moreover, we make underlying assumptions explicit,
and examine two in detail: (i) independence of
candidates, and (ii) frequency is an indication of
strength. We show that our refined ways of estimating
the strength of associations between people and
documents lead to significant improvements over the
state-of-the-art in the end-to-end expert finding task.
Listening to ''Lock Jaw'', by Dave Barker With Tommy Mccook & The Upsetters (Play Count: 1)
ECIR 2008 paper online (2)
January 12, 2008 08:27
Using Coherence-based Measures to Predict Query
Difficulty by Jiyin He, Martha Larson and Maarten
de Rijke is online now. In the paper we investigate
the potential of coherence-based scores to predict
query difficulty. The coherence of a document set
associated with each query word is used to capture
the quality of a query topic aspect. A simple query
coherence score, QC-1, is proposed that requires the
average coherence contribution of individual query
terms to be high. Two further query scores, QC-2 and
QC-3, are developed by constraining QC-1 in order to
capture the semantic similarity among query topic
aspects. All three query coherence scores show the
correlation with average precision necessary to make
them good predictors of query difficulty. Simple and
efficient, the measures require no training data and
are competitive with language model-based clarity
scores.
Listening to ''Clampdown'', by The Clash (Play Count: 1)
ECIR 2008 paper online
January 10, 2008 21:19
The Impact of Named Entity Normalization on
Information Retrieval for Question Answering by
Mahboob Alam Khalid, Valentin Jijkoun and Maarten de
Rijke is available now. In the named entity
normalization task, a system identifies a canonical
unambiguous referent for names like Bush or
Alabama. Resolving synonymy and ambiguity of
such names can benefit end-to-end information access
tasks. We evaluate two entity normalization methods
based on Wikipedia in the context of both passage and
document retrieval for question answering. We find
that even a simple normalization method leads to
improvements of early precision, both for document
and passage retrieval. Moreover, better normalization
results in better retrieval performance.
Listening to ''Everybody Knows This Is Nowhere'', by Neil Young (Play Count: 11)
WIDM 2007 paper published
November 12, 2007 10:27
Extracting the Discussion Structure in Comments on
News-Articles by Anne Schuth, Maarten Marx and
Maarten de Rijke has now been published. Several
on-line daily newspapers offer readers the
opportunity to directly comment on articles. In the
Netherlands this feature is used quite often and the
quality (grammatically and content-wise) is
surprisingly high. We develop techniques to collect,
store, enrich and analyze these comments. After
giving a high-level overview of the Dutch
'commentosphere' we zoom in on extracting the
discussion structure found in flat comment threads;
people not only comment on the news article, they
also heavily comment on other comments, resembling
discussion fora. We show how techniques from
information retrieval, natural language processing
and machine learning can be used to extract the
'reacts-on' relation between comments with high
precision and recall.
TREC working notes papers online
October 15, 2007 22:05



