Blendle breekt door: 1 miljoen gebruikers ("Blendle breaks through: 1 million users")
I’m interviewed about data at Blendle in the Volkskrant in a piece about 1 million users:
Interview about my PhD research on the faculty website:
My PhD thesis is now available for download. Read the abstract, download a copy, or tell me if you would like to receive a printed copy. The public defense will be on Friday, June 10th 2016 at 13:00 in the Aula of the University of Amsterdam. I cordially invite you to attend the defense and reception afterwards.
Our SIGIR short paper has been accepted. In this joint work, with MSc student Joost van Doorn, Diederik Roijers and Maarten de Rijke, we propose a multi-objective optimization approach for balancing relevance criteria.
Offline evaluation of information retrieval systems typically focuses on a single effectiveness measure that models the utility for a typical user. Such a measure usually combines a behavior-based rank discount with a notion of document utility that captures the single relevance criterion of topicality. However, for individual users relevance criteria such as credibility, reputability or readability can strongly impact the utility. Furthermore, for different information needs the utility can be a different mixture of these criteria. Because of the focus on single metrics, offline optimization of IR systems does not account for different preferences in balancing relevance criteria.
In this paper, we propose to mitigate this by viewing multiple relevance criteria as objectives and learning a set of rankers that provide different trade-offs w.r.t. these objectives. We model document utility within a gain-based evaluation framework as a weighted combination of relevance criteria. Using the learned set, we are able to make an informed decision based on the values of the rankers and a preference w.r.t. the relevance criteria. We demonstrate on a dataset annotated for readability and a web search dataset annotated for sub-topic relevance, how trade-offs between, e.g., topicality and readability, can be made explicit. We show that there are different available trade-offs between relevance criteria.
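As a rough illustration of the idea above, the sketch below scores two rankings with a gain-based measure in which document utility is a weighted combination of relevance criteria. It is not the method from the paper: DCG with a log2 rank discount is assumed as the gain-based measure, the two fixed rankings stand in for a learned set of rankers, and all names and scores (`ranker_a`, `ranker_b`, the criterion values) are hypothetical.

```python
import math


def weighted_gain(criteria, weights):
    """Document utility as a weighted combination of relevance-criterion scores."""
    return sum(weights[name] * criteria[name] for name in weights)


def dcg(ranking, weights):
    """Discounted cumulative gain with a log2 rank discount (an assumed choice)."""
    return sum(
        weighted_gain(doc, weights) / math.log2(rank + 1)
        for rank, doc in enumerate(ranking, start=1)
    )


def best_ranker(rankings, weights):
    """Pick the ranker whose ranking scores highest under the given preference."""
    return max(rankings, key=lambda name: dcg(rankings[name], weights))


# Hypothetical per-document criterion scores in [0, 1], listed in ranked order.
rankings = {
    "ranker_a": [  # favors topicality
        {"topicality": 1.0, "readability": 0.2},
        {"topicality": 0.8, "readability": 0.3},
    ],
    "ranker_b": [  # favors readability
        {"topicality": 0.6, "readability": 0.9},
        {"topicality": 0.5, "readability": 0.8},
    ],
}

# Two different preferences over the relevance criteria.
topical_pref = {"topicality": 1.0, "readability": 0.0}
readable_pref = {"topicality": 0.2, "readability": 0.8}

print(best_ranker(rankings, topical_pref))
print(best_ranker(rankings, readable_pref))
```

Changing the weights changes which ranker is preferred, which is the kind of trade-off the learned set is meant to make explicit.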
Joost van Doorn, Daan Odijk, Diederik Roijers and Maarten de Rijke. Balancing Relevance Criteria through Multi-Objective Optimization. In SIGIR’16, 2016. [BibTeX] [PDF]
@inproceedings{vandoorn2016balancing, Author = {van Doorn, Joost and Odijk, Daan and Roijers, Diederik and de Rijke, Maarten}, Booktitle = {SIGIR 2016: 39th international ACM SIGIR conference on Research and development in information retrieval}, Month = {July}, Publisher = {ACM}, Title = {Balancing relevance criteria through multi-objective optimization}, Year = {2016}}
Our CIKM2015 paper received the best student paper award! This full paper is the result of my internship at Microsoft Research in Redmond. In this joint work with Ryen White, Ahmed Hassan Awadallah and Susan Dumais, we investigate why some web searchers succeed where others struggle.
Web searchers sometimes struggle to find relevant information. Struggling leads to frustrating and dissatisfying search experiences, even if searchers ultimately meet their search objectives. Better understanding of search tasks where people struggle is important in improving search systems. We address this important issue using a mixed methods study using large-scale logs, crowd-sourced labeling, and predictive modeling. We analyze anonymized search logs from the Microsoft Bing Web search engine to characterize aspects of struggling searches and better explain the relationship between struggling and search success. To broaden our understanding of the struggling process beyond the behavioral signals in log data, we develop and utilize a crowd-sourced labeling methodology. We collect third-party judgments about why searchers appear to struggle and, if appropriate, where in the search task it became clear to the judges that searches would succeed (i.e., the pivotal query). We use our findings to propose ways in which systems can help searchers reduce struggling. Key components of such support are algorithms that accurately predict the nature of future actions and their anticipated impact on search outcomes. Our findings have implications for the design of search systems that help searchers struggle less and succeed more.
Daan Odijk, Ryen W. White, Ahmed Hassan Awadallah and Susan T. Dumais. Struggling and Success in Web Search. In CIKM ’15, 2015. [BibTeX] [PDF] [Slides]
@inproceedings{odijk2015struggling, Author = {Odijk, Daan and White, Ryen W. and Hassan Awadallah, Ahmed and Dumais, Susan T.}, Booktitle = {CIKM 2015: 24th ACM International Conference on Information and Knowledge Management}, Month = {October}, Publisher = {ACM}, Title = {Struggling and Success in Web Search}, Year = {2015}}
Our TPDL2015 full paper was accepted. In this joint work in an Amsterdam Data Science seed project with Cristina Gârbacea (UvA), Thomas Schoegje (VU), Laura Hollink (VU/CWI), Victor de Boer (VU), Kees Ribbens (NIOD) and Jacco van Ossenbruggen (VU/CWI), we present an exploratory search application that highlights different perspectives across collections.
The ever-growing number of textual historical collections calls for methods that can meaningfully connect and explore these. Different collections offer different perspectives, expressing the views at the time of writing or even a subjective view of the author. We propose to connect heterogeneous digital collections through the temporal references found in the documents as well as their textual content. We evaluate our approach and find that it works very well on digital-native collections. Digitized collections pose interesting challenges and with improved preprocessing our approach performs sufficiently well. We introduce a novel search interface to explore and analyze the connected collections that highlights different perspectives. In our approach, perspectives are expressed as complex queries. The results of these methods are presented in an open-source interface that requires little domain knowledge. Our approach supports humanities scholars in exploring collections in a novel way and allows for digital collections to be more accessible by adding new connections and new means to access the collections.
Daan Odijk, Cristina Gârbacea, Thomas Schoegje, Laura Hollink, Victor de Boer, Kees Ribbens and Jacco van Ossenbruggen. Supporting Exploration of Historical Perspectives across Collections. In TPDL’15, 2015. [BibTeX] [PDF]
@incollection{odijk2015perspectives, year={2015}, isbn={978-3-319-24591-1}, booktitle={Research and Advanced Technology for Digital Libraries}, volume={9316}, series={Lecture Notes in Computer Science}, editor={Kapidakis, Sarantos and Mazurek, Cezary and Werla, Marcin}, doi={10.1007/978-3-319-24592-8_18}, title={Supporting Exploration of Historical Perspectives Across Collections}, url={http://dx.doi.org/10.1007/978-3-319-24592-8_18}, publisher={Springer International Publishing}, author={Odijk, Daan and Gârbacea, Cristina and Schoegje, Thomas and Hollink, Laura and de Boer, Victor and Ribbens, Kees and van Ossenbruggen, Jacco}, pages={238--251}, language={English}}
Co-author Rens Vliegenthart wrote a blogpost (in Dutch) on our earlier work on finding frames in news.
Source: stukroodvlees.nl
Our SIGIR2015 full paper was accepted. In this work, we present a query modeling approach to find related content in the setting of the NOS SmartTV app.
While watching television, people increasingly consume additional content related to what they are watching. We consider the task of finding video content related to a live television broadcast for which we leverage the textual stream of subtitles associated with the broadcast. We model this task as a Markov decision process and propose a method that uses reinforcement learning to directly optimize the retrieval effectiveness of queries generated from the stream of subtitles. Our dynamic query modeling approach significantly outperforms state-of-the-art baselines for stationary query modeling and for text-based retrieval in a television setting. In particular we find that carefully weighting terms and decaying these weights based on recency significantly improves effectiveness. Moreover, our method is highly efficient and can be used in a live television setting, i.e., in near real time.
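To give a feel for weighting terms and decaying those weights by recency, here is a minimal sketch that builds a query from a stream of subtitle chunks. It is an assumption-laden illustration rather than the approach from the paper: the decay rate is a fixed constant instead of being learned through reinforcement learning, term weights are plain counts, and the function names and subtitle terms are hypothetical.

```python
from collections import defaultdict


def update_weights(weights, terms, decay=0.9):
    """Decay all existing term weights, then add one unit per new term occurrence."""
    updated = defaultdict(float)
    for term, weight in weights.items():
        updated[term] = weight * decay
    for term in terms:
        updated[term] += 1.0
    return updated


def top_terms(weights, k=3):
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical subtitle chunks arriving in broadcast order (already tokenized).
subtitle_chunks = [
    ["election", "parliament", "vote"],
    ["vote", "coalition"],
    ["coalition", "minister", "coalition"],
]

weights = {}
for chunk in subtitle_chunks:
    weights = update_weights(weights, chunk, decay=0.5)

# The current query consists of the terms that are both frequent and recent.
query = top_terms(weights, k=2)
print(query)
```

Each update touches every tracked term once, so the cost per subtitle chunk grows only with the vocabulary seen so far, which keeps a sketch like this cheap enough for a streaming setting.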
D. Odijk, E. Meij, I. Sijaranamual and M. de Rijke. Dynamic Query Modeling for Related Content Finding. In SIGIR ’15, 2015. [BibTeX] [PDF]
@inproceedings{odijk2015dynamic, Author = {Odijk, Daan and Meij, Edgar and Sijaranamual, Isaac and de Rijke, Maarten}, Booktitle = {SIGIR 2015: 38th international ACM SIGIR conference on Research and development in information retrieval}, Month = {August}, Publisher = {ACM}, Title = {Dynamic query modeling for related content finding}, Year = {2015}}
Broadcaster NOS launched a SmartTV app that uses our semantic linking technology. Read more on their blog (in Dutch).
I am very excited to be spending the summer as an intern in the CLUES group at Microsoft Research in Redmond, USA. Dr. Susan Dumais, head of the group on Context, Learning, and User Experience for Search, will be my mentor.