CLEF 2008 Domain Specific Track working notes paper online

The University of Amsterdam at the CLEF 2008 Domain Specific Track by Edgar Meij and Maarten de Rijke is available online now. In the paper we describe our participation in the CLEF 2008 Domain Specific track. The research questions we address are threefold: (i) what are the effects of estimating and applying relevance models to the domain-specific collection used at CLEF 2008, (ii) what are the results of parsimonizing these relevance models, and (iii) what are the results of applying concept models for blind relevance feedback?

Parsimonization is a technique by which the term probabilities in a language model are re-estimated based on a comparison with a reference model, making the resulting model sparser and more focused. Concept models are distributions over vocabulary terms, based on the language associated with concepts in a thesaurus or ontology, and are estimated from the documents that are annotated with those concepts. Concept models may be used for blind relevance feedback by first translating a query to concepts and then back to query terms.

We find that applying relevance models helps significantly for the current test collection, in terms of both mean average precision and early precision. Moreover, parsimonizing the relevance models helps mean average precision on title-only queries and early precision on title+narrative queries. Our concept models significantly outperform a baseline query-likelihood run, in terms of both mean average precision and early precision, on both title-only and title+narrative queries.
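To give a feel for what parsimonization does, here is a minimal sketch in Python. It assumes the common EM formulation of parsimonious language models, in which a model is re-estimated against a background (reference) model; it is not taken from the paper, and the function name `parsimonize`, the mixing weight `lam`, the pruning `threshold`, and the toy counts are all illustrative assumptions.

```python
def parsimonize(term_counts, background, lam=0.1, iterations=20, threshold=1e-4):
    """Re-estimate a term distribution against a background model (sketch).

    term_counts: dict term -> raw count in the text being modelled
    background:  dict term -> probability in the reference model
    lam:         assumed weight of the specific model in the mixture
    threshold:   assumed cut-off below which terms are dropped
    """
    total = sum(term_counts.values())
    # Start from the maximum-likelihood estimate.
    model = {t: c / total for t, c in term_counts.items()}

    for _ in range(iterations):
        # E-step: expected count of each term explained by the specific
        # model rather than by the background model.
        expected = {}
        for t, p in model.items():
            bg = background.get(t, 1e-9)  # tiny floor for unseen terms
            expected[t] = term_counts[t] * (lam * p) / ((1 - lam) * bg + lam * p)

        # M-step: renormalise the expected counts into probabilities.
        norm = sum(expected.values())
        updated = {t: e / norm for t, e in expected.items()}

        # Prune terms that fall below the threshold, then renormalise.
        pruned = {t: p for t, p in updated.items() if p >= threshold}
        if not pruned:
            break  # keep the previous model rather than return nothing
        norm = sum(pruned.values())
        model = {t: p / norm for t, p in pruned.items()}

    return model


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    counts = {"the": 10, "of": 8, "retrieval": 5, "thesaurus": 3}
    reference = {"the": 0.5, "of": 0.3, "retrieval": 0.01, "thesaurus": 0.001}
    for term, prob in sorted(parsimonize(counts, reference).items()):
        print(f"{term}\t{prob:.4f}")
```

Terms that the reference model already explains well (such as "the" and "of" in the toy data) lose probability mass and may be pruned, while terms that are distinctive for the text gain mass.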

CIKM 2008 paper online

Non-Local Evidence for Expert Finding by Krisztian Balog and Maarten de Rijke is available online now. The task addressed in this paper, finding experts in an enterprise setting, has gained in importance and interest over the past few years. Commonly, this task is approached as an association finding exercise between people and topics. Existing techniques use either documents (as a whole) or proximity-based techniques to represent candidate experts. Proximity-based techniques have shown clear precision-enhancing benefits. We complement both document and proximity-based approaches to expert finding by importing global evidence of expertise, i.e., evidence obtained using information that is not available in the immediate proximity of a candidate expert's name occurrence or even on the same page on which the name occurs. Examples include candidate priors, query models, as well as other documents a candidate expert is associated with.

Using the CSIRO data set created for the TREC 2007 Enterprise track, we identify examples of non-local evidence of expertise. We then propose modified expert retrieval models that are capable of incorporating both local (either document-based or snippet-based) evidence and non-local evidence of expertise. Results show that our refined models significantly outperform existing state-of-the-art approaches.

WICOW 2008 paper online

PodCred: A Framework for Analyzing Podcast Preference by Manos Tsagkias, Martha Larson, Wouter Weerkamp and Maarten de Rijke is available online now. The PodCred framework is designed for assessing the credibility and quality of podcasts published on the internet. It consists of a series of indicators designed to support prediction of listener preference for one podcast over another, given that both carry comparable informational content. The indicators are grouped into four categories pertaining to the Podcast Content, the Podcaster, the Podcast Context, or the Technical Execution of the podcast. We adopt the term "cred" as a designation encompassing both credibility (comprising trustworthiness and expertise) and qualitative acceptability to listeners. Our podcast analysis framework is inspired by work on credibility in blogs, another medium dominated by user-generated content. The PodCred framework is derived from a review of the literature on credibility for other media, a survey of prescriptive standards for podcasting, and a detailed data analysis of award-winning podcasts. The paper concludes with a discussion of future work in which the framework will be applied.

ACM MIR 2008 paper online

Assessing Concept Selection for Video Retrieval by Bouke Huurnink, Katja Hofmann and Maarten de Rijke is available online now. In the paper we explore the use of benchmarks to address the problem of assessing concept selection in video retrieval systems. Two benchmarks are presented: one created by human association of queries to concepts, the other generated from an extensively tagged collection. They are compared in terms of reliability, captured semantics, and retrieval performance. Recommendations are given for using the benchmarks to assess concept selection algorithms, and the assessment is demonstrated on two existing algorithms. The benchmarks are released to the research community.