SIKS seminar: Something on Search and Searchers

Date: 15 April 2011
Time: 09.30-13.30
Location: Bijzondere Collecties - Nina van Leerzaal
Oude Turfmarkt 129, 1012 GC Amsterdam
Registration: Registration for this seminar is free. To register, please send an email to f.w.adriaans@uva.nl

The event is part of the Advanced Components Stage of the Educational Program for SIKS-PhD-students. Therefore, SIKS-PhD-students working on the research foci "Web based Systems" and "Data management, Storage and Retrieval" are strongly encouraged to participate.

This seminar is generously sponsored by WGI (Vereniging Werkgemeenschap Informatiewetenschap).

Following the seminar is the Ph.D. defence of Marijn Koolen at 14.00 in the Agnietenkapel of the University of Amsterdam (address: Oudezijds Voorburgwal 231).

Programme

09.30-10.00 Coffee

10.00-10.30 Mounia Lalmas (Yahoo! Research Barcelona) - Towards a Science of User Engagement (abstract)
10.30-11.00 Claudia Hauff (Delft University of Technology) - Enhancing Access To Classic Children’s Literature (abstract)
11.00-11.30 Arjen de Vries (Delft University of Technology and CWI Amsterdam) - Image Search Logs Potpourri (abstract)

11.30-11.45 Coffee break

11.45-12.15 Edgar Meij (University of Amsterdam) - Search Engines for the Humanities and Social Sciences (abstract)
12.15-12.45 Nick Craswell (Bing, Microsoft Research Cambridge) - Power and Fidelity Tradeoffs in IR Evaluation (abstract)

12.45-13.30 Lunch in the Museumcafé of the Bijzondere Collecties

(end of seminar)

14.00-15.00 Ph.D. defence of Marijn Koolen. The Meaning of Structure: the Value of Link Evidence for Information Retrieval. Location: Agnietenkapel, Oudezijds Voorburgwal 231, Amsterdam (a mere 300 metres from the Bijzondere Collecties).


Abstracts


Towards a Science of User Engagement

Mounia Lalmas, Yahoo! Research Barcelona

I will present some research ideas on how to measure user engagement. User engagement is a quality of user experience that emphasises the positive aspects of interaction, and in particular the phenomena associated with being captivated by technology. Successful technologies are not just used; they are engaged with. Engagement is measured in many ways, by subjective (e.g., questionnaires) and objective (e.g., number of clicks) measures. However, not much has been done to validate and relate these measures and so provide a firm basis for assessing the quality of the user experience. My proposal looks at combining techniques from web analytics, information retrieval evaluation, and existing work on user engagement from the information science community. I will also discuss how the related areas of game immersion and non-intrusive technologies may provide novel insights into developing effective measures and models of user engagement.

-------------------------------------------------------------

Enhancing Access To Classic Children’s Literature

Claudia Hauff, Delft University of Technology

A number of digital libraries (such as Project Gutenberg) contain mostly public domain books, including a significant number of works that belong to children's literature. Many of these classic books are offered in a text-only format, which does not make them appealing for children to read. Moreover, stories that were written for children one hundred or more years ago might not be readily understandable by children today due to diverging vocabularies and experiences. In this talk, I will describe ongoing work to enhance access to children's literature repositories, including automatic story illustration and the linking of texts to background information.

-------------------------------------------------------------

Image Search Logs Potpourri

Arjen de Vries, Delft University of Technology and CWI Amsterdam

In the context of the FP6 Vitalas project, we gained access to the usage logs of the Belga (commercial) online picture portal, capturing the real-life use of the news agency's photo site. We have studied the potential of these logs to improve information retrieval tasks, and in this talk I will summarize what I consider the most interesting findings of our research to date.

-------------------------------------------------------------

Search Engines for the Humanities and Social Sciences

Edgar Meij, University of Amsterdam

Data transitions have revolutionized many scientific disciplines, starting with the exact sciences and continuing with the life sciences. The social sciences and humanities are now in the process of becoming data-intensive sciences, with descriptions through quantitative measurements. Publicly accessible utterances, opinions, transactions, and interactions resulting from widespread internet and social media usage facilitate new, data-intensive research methods in disciplines that have so far relied on traditional methods such as small-scale literature or panel studies. To illustrate the new possibilities, I will report on two pilot projects carried out by cross-disciplinary teams consisting of computer scientists and researchers from the humanities and social sciences, including anthropology and religious studies. In my presentation I will focus on lessons learned, on methodological innovations, and on the technical innovations required from computer scientists building the enabling technology, mainly having to do with ranking principles and text selection facilities.

-------------------------------------------------------------

Power and Fidelity Tradeoffs in IR Evaluation

Nick Craswell, Microsoft

Some organizations make a big ongoing investment in improving their information retrieval system. After a long series of improvements to indexing, query analysis and ranking, you may begin to make improvements that are small and rely on subtle user preferences between different types of document. These improvements are hard to measure in a user study or by observing clicks, because such experiments are not very sensitive. Smaller improvements are easier to measure with a Cranfield-style experiment, but do you trust that your relevance judgments and metrics reflect the opinions of real users? I’ll present some of these tradeoffs in detail, and describe some interesting options like interleaving.

-------------------------------------------------------------