ECIR 2014 proceedings online

With 10 days to go to ECIR 2014, the proceedings are now online. You can find information about them on the Springer website or access the online version directly at this page.

Now playing: Portico Quartet -- Rubidium

HPC grant

My colleague Lars Buitinck and I received a grant from the HPC fund to support the development of a more scalable version of xTAS, our extensible text analysis service. It’s the pipeline that we (and others) use for our text mining work. This is great news as it allows us to port modules to the new architecture that Lars has been developing for the past six months.

Now playing: Bowerbirds -- Silver Clouds

The autonomous search engine

After Tuesday’s talk on personal data mining, I gave another talk to non-experts on Thursday. This time the topic was “The Autonomous Search Engine”. The backbone of the story is the move from supervised to weakly supervised development of one of the core components of search engines: rankers. Weak supervision in this context means that we’re not using explicit, manually created labels provided by experts, but that we’re making inferences about our ranker technology from naturally occurring interactions with the search engine.

Weakly supervised ways of evaluating rankers and weakly supervised ways of learning to combine the outputs of multiple rankers are being studied intensively, and solutions are already being used outside the lab. The next holy grail is to create new rankers in a weakly supervised way, again based on the implicit signals that we get from user interactions. The audience, consisting of information professionals for whom expert judgments are the norm, was understandably critical but interested, with a number of great questions and a range of suggestions. Thank you!
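To make weakly supervised evaluation a bit more concrete: a standard family of interleaved comparison methods merges the lists of two candidate rankers and credits user clicks to whichever ranker contributed the clicked document. The sketch below implements team-draft interleaving as an illustration only; the talk does not commit to this specific method.

```python
import random

def team_draft_interleave(list_a, list_b, seed=None):
    """Merge two ranked lists via team-draft interleaving.

    In each round a coin flip decides which ranker picks first; each
    ranker then contributes its highest-ranked document not yet placed.
    Returns the merged list and, per position, the contributing team.
    """
    rng = random.Random(seed)
    merged, teams = [], []
    placed = set()

    def next_doc(lst):
        for d in lst:
            if d not in placed:
                return d
        return None

    while True:
        order = ["A", "B"] if rng.random() < 0.5 else ["B", "A"]
        progressed = False
        for team in order:
            d = next_doc(list_a if team == "A" else list_b)
            if d is not None:
                merged.append(d)
                teams.append(team)
                placed.add(d)
                progressed = True
        if not progressed:
            break
    return merged, teams

def credit_clicks(teams, clicked_positions):
    """Credit each click to the team that contributed that position."""
    wins = {"A": 0, "B": 0}
    for pos in clicked_positions:
        wins[teams[pos]] += 1
    return wins
```

A click on a position contributed by ranker A counts as weak evidence that A is the better ranker; aggregated over many queries, this replaces expert labels.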

Now playing: The Durutti Column -- Stuki

Life mining talk

I gave a talk aimed at the general public on personal data mining last night, in Maastricht. The talk explains what type of information can be mined from the content of open sources (news, social media, etc.) using state-of-the-art search and text mining technology. The focus is on extracting personal information, whether location, music listening behavior, health, or personality traits. It’s not the first time I’ve given this talk, and follow-ups are planned for later this spring. The message at the end of the talk is very simple: the content of open sources can be incredibly revealing and therefore incredibly sensitive.

Because of the ongoing NSA revelations, I decided to add some material on what can be mined from metadata. The message is the same: like the content of open sources, metadata can be incredibly revealing and therefore sensitive too. There were good and interesting questions, both on the reach of search and text mining technology and on the balance between sharing digital trails (from which many people stand to benefit) on the one hand and ownership of such trails on the other. Thanks to everyone who attended!

Now playing: Dakota Suite & Quentin Sirjacq -- The Side of Her Inexhaustible Heart (Part III)

Microsoft PhD Fellowship

My colleague Shimon Whiteson and I received funding for a proposal entitled “Leveraging Data Reuse for Efficient Ranker Evaluation in Information Retrieval”, submitted to the Microsoft Research PhD Scholarship Programme. The project is a collaboration with Filip Radlinski and will run for three years, with a start planned in the fall. We’ll be recruiting soon.

ECIR 2014 paper on blending vertical and web results online

“Blending Vertical and Web results: A Case Study using Video Intent” by Damien Lefortier, Pavel Serdyukov, Fedor Romanenko and Maarten de Rijke is available online now.

Modern search engines aggregate results from specialized verticals into the Web search results. We study a setting where vertical and Web results are blended into a single result list, a setting that has not been studied before. We focus on video intent and present a detailed observational study of Yandex's two video content sources (i.e., the specialized vertical and a subset of the general web index) thus providing insights into their complementary character. By investigating how to blend results from these sources, we contrast traditional federated search and fusion-based approaches with newly proposed approaches that significantly outperform the baseline methods.

ECIR 2014 paper on query-dependent contextualization of streaming data online

“Query-dependent contextualization of streaming data” by Nikos Voskarides, Daan Odijk, Manos Tsagkias, Wouter Weerkamp and Maarten de Rijke is available online.

We propose a method for linking entities in a stream of short textual documents that takes into account context both inside a document and inside the history of documents seen so far. Our method uses a generic optimization framework for combining several entity ranking functions, and we introduce a global control function to control optimization. Our results demonstrate the effectiveness of combining entity ranking functions that take into account context, which is further boosted by 6% when we use an informed global control function.
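The paper's optimization framework is not spelled out in the abstract, but the core idea of combining several entity ranking functions, with a global control parameter trading off in-document context against stream history, can be sketched as follows. The ranking functions, weights, and the interpolation parameter `lam` below are all hypothetical illustrations, not the paper's actual components.

```python
def combined_score(entity, doc_ctx, hist_ctx, rankers, weights, lam=0.5):
    """Score an entity by a weighted combination of ranking functions,
    applied both to the current document's context and to the history
    of documents seen so far; `lam` acts as a global control parameter
    (illustrative) interpolating between the two context sources."""
    local = sum(w * f(entity, doc_ctx) for f, w in zip(rankers, weights))
    hist = sum(w * f(entity, hist_ctx) for f, w in zip(rankers, weights))
    return lam * local + (1.0 - lam) * hist

# Toy ranking functions over a bag-of-words context (assumptions).
def term_freq(entity, ctx):
    return ctx.count(entity) / max(len(ctx), 1)

def presence(entity, ctx):
    return 1.0 if entity in ctx else 0.0

doc = ["obama", "visits", "amsterdam"]
history = ["obama", "obama", "speech", "amsterdam", "canal"]
score = combined_score("obama", doc, history,
                       [term_freq, presence], [0.7, 0.3], lam=0.6)
```

An "informed" control function, in this toy setting, would set `lam` per document rather than globally, e.g. trusting the history more when the current document is very short.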

ECIR 2014 paper on predicting new concepts in social streams online

“Generating Pseudo-ground Truth for Predicting New Concepts in Social Streams” by David Graus, Manos Tsagkias, Lars Buitinck and Maarten de Rijke is available online now.

The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identification of nascent concepts is important for improving early entity linking, content interpretation, and recommendation of new content in real-time applications. We present an unsupervised method for generating pseudo-ground truth for training a named entity recognizer to specifically identify entities that will become concepts in a knowledge base in the setting of social streams. We show that our method is able to deal with missing labels, justifying the use of pseudo-ground truth generation in this task. Finally, we show how our method significantly outperforms a lexical-matching baseline, by leveraging strategies for sampling pseudo-ground truth based on entity confidence scores and textual quality of input documents.

ECIR 2014 paper on cluster-based fusion for microblog search online

“The impact of semantic document expansion on cluster-based fusion for microblog search” by Shangsong Liang, Zhaochun Ren and Maarten de Rijke is available online now.

Searching microblog posts, with their limited length and creative language usage, is challenging. We frame the microblog search problem as a data fusion problem. We examine the effectiveness of a recent cluster-based fusion method on the task of retrieving microblog posts. We find that in the optimal setting the contribution of the clustering information is very limited, which we hypothesize to be due to the limited length of microblog posts. To increase the contribution of the clustering information in cluster-based fusion, we integrate semantic document expansion as a preprocessing step. We enrich the content of microblog posts appearing in the lists to be fused with Wikipedia articles, based on which clusters are created. We verify the effectiveness of our combined document expansion plus fusion method by making comparisons with microblog search algorithms and other fusion methods.
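As a point of reference, a classic data fusion baseline such as CombSUM simply sums normalized scores across the lists to be fused. The sketch below shows that baseline only; the cluster-based method studied in the paper additionally weights documents by the clusters built over the (Wikipedia-expanded) posts, which this illustration omits.

```python
from collections import defaultdict

def combsum(runs):
    """Fuse ranked result lists by summing min-max-normalized scores.

    Each run is a dict mapping document id -> retrieval score.
    Returns (doc, fused_score) pairs, best first. CombSUM is a classic
    fusion baseline; cluster-based fusion would add per-cluster weights.
    """
    fused = defaultdict(float)
    for run in runs:
        lo, hi = min(run.values()), max(run.values())
        span = (hi - lo) or 1.0  # avoid division by zero for flat runs
        for doc, score in run.items():
            fused[doc] += (score - lo) / span
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Documents retrieved by several runs accumulate score from each, which is why fusion tends to reward cross-run agreement.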

ECIR 2014 paper on click-based recommender evaluation online

“Effects of Position Bias on Click-Based Recommender Evaluation” by Katja Hofmann, Anne Schuth, Alejandro Bellogin and Maarten de Rijke is available online now.

Measuring the quality of recommendations produced by a recommender system (RS) is challenging. Labels used for evaluation are typically obtained from users of an RS, by asking for explicit feedback, or by inferring labels from implicit feedback. Both approaches can introduce significant biases in the evaluation process. We investigate biases that may affect labels inferred from implicit feedback. Implicit feedback is easy to collect but can be prone to biases, such as position bias. We examine this bias using click models, and show how bias following these models would affect the outcomes of RS evaluation. We find that evaluation based on implicit and explicit feedback can agree well, but only when the evaluation metrics are designed to take user behavior and preferences into account, stressing the importance of understanding user behavior in deployed RSs.
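A quick way to see how position bias distorts click-based labels is to simulate a position-based click model: the user examines rank r with some decaying probability and clicks an examined item with probability equal to its attractiveness. The 1/r examination curve below is an assumption for illustration, not a claim from the paper.

```python
import random

def simulate_clicks(ranking, relevance, n_sessions=10000, seed=0):
    """Simulate clicks under a position-based click model.

    ranking:   list of item ids, best position first
    relevance: dict item id -> click probability once examined
    Rank r is examined with probability 1/(r+1) (illustrative choice).
    Returns click counts per position.
    """
    rng = random.Random(seed)
    examine = [1.0 / (r + 1) for r in range(len(ranking))]
    clicks = [0] * len(ranking)
    for _ in range(n_sessions):
        for r, item in enumerate(ranking):
            if rng.random() < examine[r] and rng.random() < relevance[item]:
                clicks[r] += 1
    return clicks
```

With equally attractive items, the top position collects roughly twice the clicks of position two and three times those of position three, so raw click counts are a biased proxy for item quality unless the metric corrects for examination probability.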

Going out with Streamwatchr

A few weeks ago I visited a local high school as part of a series of efforts to get more high school kids to maintain an interest in computer science and possibly study the subject in university. I gave a sneak preview of a new version of Streamwatchr, a demo that Manos Tsagkias, Wouter Weerkamp and I have been working on.

Streamwatchr offers a new way to discover and enjoy music through an innovative interface. We show, in real time, what music people around the world are listening to. Each time Streamwatchr identifies a tweet in which someone reports the song they are listening to, it shows a tile with a photo of the artist and a play button (on mouse-over) that does, indeed, play the song (from YouTube). Streamwatchr collects about 500,000 music tweets per day, which is about 6 tweets per second.

[Screenshot of the Streamwatchr interface]

I visited a class of 12 and 13 year olds at a local high school here in Amsterdam. As my laptop and the projector refused to talk to each other, I walked around the classroom with my laptop to demo Streamwatchr, with Streamwatchr running in full screen. While walking around I talked about some of the technology behind it (entity linking, data integration, open data, etc.). Occasionally, I put the laptop down on a table so that the students could interact with Streamwatchr.

I was amazed to see how addictive the interface was … a screen full of tiles, 6 of which flip and change at random every second, kept every kid glued to the laptop. Later (private) demos of Streamwatchr to friends and colleagues led to similar scenes. As I observed the high school kids interact with Streamwatchr, some interesting questions came up. What is the appeal of random changes to visual elements at a pace that seems to be slightly higher than one can actively track? Is it that there is always something on the screen that you have not seen yet? But that you think you don’t want to miss? At which speed should those random changes occur to be optimally captivating? Should the changes really be random or should they provide maximal coverage of the screen (in the obvious spatial sense) to be optimally captivating? There’s a great set of experiments to be run there --- an unexpected side product of an outreach activity.

ECIR 2014 paper on optimizing base rankers using clicks online

“Optimizing Base Rankers Using Clicks: A Case Study Using BM25” by Anne Schuth, Floor Sietsma, Shimon Whiteson and Maarten de Rijke is available online now.

We study the problem of optimizing an individual base ranker using clicks. Surprisingly, while there has been considerable attention for using clicks to optimize linear combinations of base rankers, the problem of optimizing an individual base ranker using clicks has been ignored. The problem is different from the problem of optimizing linear combinations of base rankers as the scoring function of a base ranker may be highly non-linear. For the sake of concreteness, we focus on the optimization of a specific base ranker, viz. BM25. We start by showing that significant improvements in performance can be obtained when optimizing the parameters of BM25 for individual datasets. We also show that it is possible to optimize these parameters from clicks, i.e., without the use of manually annotated data, reaching or even beating manually tuned parameters.
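For reference, the BM25 scoring function with its two free parameters, k1 (term-frequency saturation) and b (document-length normalization), can be sketched as follows; these are the parameters the paper tunes, and the score is clearly non-linear in both, which is why optimizing an individual base ranker differs from optimizing a linear combination. The defaults below are the common textbook values, not necessarily those used in the paper.

```python
import math

def bm25(query_terms, doc, collection, k1=1.2, b=0.75):
    """BM25 score of a (tokenized) document for a query.

    collection is a list of tokenized documents, used for document
    frequencies and average document length. k1 controls term-frequency
    saturation; b controls length normalization.
    """
    n_docs = len(collection)
    avgdl = sum(len(d) for d in collection) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in collection if term in d)
        if df == 0:
            continue
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc.count(term)
        denom = tf + k1 * (1.0 - b + b * len(doc) / avgdl)
        score += idf * tf * (k1 + 1.0) / denom
    return score
```

Because k1 and b enter the score inside a ratio, small changes in either reshape the ranking in a non-linear way, which is exactly what makes per-dataset tuning (from editorial labels or from clicks) worthwhile.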

WSDM 2014 paper on efficient on-line ranker evaluation online

“Relative Confidence Sampling for Efficient On-Line Ranker Evaluation” by Masrour Zoghi, Shimon Whiteson, Maarten de Rijke and Remi Munos is available online now.

A key challenge in information retrieval is that of on-line ranker evaluation: determining which one of a finite set of rankers performs the best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, which interleave lists proposed by two different candidate rankers, then the problem of minimizing the total regret accumulated while evaluating the rankers can be formalized as a K-armed dueling bandits problem. In this paper, we propose a new method called relative confidence sampling (RCS) that aims to reduce cumulative regret by being less conservative than existing methods in eliminating rankers from contention. In addition, we present an empirical comparison between RCS and two state-of-the-art methods, relative upper confidence bound and SAVAGE. The results demonstrate that RCS can substantially outperform these alternatives on several large learning to rank datasets.
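In the spirit of (but not identical to) relative confidence sampling, the sketch below maintains Beta posteriors over the pairwise preference probabilities estimated from past interleaved comparisons, samples a preference matrix to pick a champion, and then picks a challenger optimistically. All details here, including the exploration bonus, are a simplified illustration rather than the paper's exact algorithm.

```python
import math
import random

def rcs_pair(wins, t, rng):
    """Choose (champion, challenger) for the next interleaved comparison.

    wins[i][j] = number of past comparisons ranker i has won against j.
    Samples pairwise preferences from Beta posteriors to pick a likely
    best ranker, then challenges it with the ranker whose win rate
    against the champion has the highest upper confidence bound.
    """
    k = len(wins)
    # Sample a preference matrix from Beta(wins_ij + 1, wins_ji + 1).
    p = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            s = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            p[i][j], p[j][i] = s, 1.0 - s
    # Champion: wins the most sampled pairwise duels.
    champ = max(range(k),
                key=lambda i: sum(p[i][j] > 0.5 for j in range(k) if j != i))
    # Challenger: optimistic estimate of its win rate vs. the champion.
    def ucb(j):
        n = wins[j][champ] + wins[champ][j]
        mean = wins[j][champ] / n if n else 0.5
        return mean + math.sqrt(math.log(max(t, 2)) / max(n, 1))
    challenger = max((j for j in range(k) if j != champ), key=ucb)
    return champ, challenger
```

The posterior sampling keeps apparently-beaten rankers in contention (they can still be sampled as winners), which is the "less conservative elimination" idea: regret stays low because the champion is usually strong, while exploration continues where uncertainty remains.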