Fully funded PhD position in search engine evaluation methodology
June 30, 2010 22:47
The Intelligent Systems Lab Amsterdam has a vacancy
for a fully funded four-year PhD student to work on
evaluation methodology for search engines.
This is a position in the PROMISE project, a new European FP7 project starting in September 2010. PROMISE is a Network of Excellence involving 10 prominent partner institutions in seven European countries. The PROMISE project addresses the challenges being faced by today's evaluation methodology for information retrieval systems and theories. The successful candidate will work on novel evaluation tasks and scenarios, approaches to automatic evaluation, as well as living labs in which evaluation is being carried out using live systems.
Applicants must have an MSc in Computer Science or Artificial Intelligence. We are looking for very strong candidates who: have a good knowledge of programming in at least one of the following languages: C, C++, Java, Python, or Perl; have a solid background in statistics and/or machine learning; enjoy working with large, real-world data sets; are interested in working on research problems related to next generation search engines; have excellent communication skills, both oral and written.
Details on the application procedure can be found at http://bit.ly/dgnlI4
For informal inquiries, please contact Maarten de Rijke at derijke@uva.nl
Another CLEF 2010 Conference paper online
June 19, 2010 14:07 Filed in: Papers
Another of our CLEF 2010 conference papers is online now: On the Evaluation
of Entity Profiles by Maarten de Rijke,
Krisztian Balog, Toine Bogers and Antal van den
Bosch.
Entity profiling is the task of identifying and ranking descriptions of a given entity. The task may be viewed as one where the descriptions being sought are terms that need to be selected from a knowledge source (such as an ontology or thesaurus). In this case, entity profiling systems can be assessed by means of precision and recall values of the descriptive terms produced. However, recent evidence suggests that more sophisticated metrics are needed that go beyond mere lexical matching of system-produced descriptors against a ground truth, allowing for graded relevance and rewarding diversity in the list of descriptors returned. In this note, we motivate and propose such a metric.
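To make the baseline concrete, here is a minimal sketch of the set-based precision and recall assessment that the abstract describes as the starting point. It is not the metric proposed in the paper; the function name and the example descriptors are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, ground_truth):
    """Set-based precision and recall of descriptive terms.

    Matching is purely lexical (exact string match), which is the
    limitation the abstract points out.
    """
    predicted_set = set(predicted)
    truth_set = set(ground_truth)
    hits = len(predicted_set & truth_set)
    precision = hits / len(predicted_set) if predicted_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


# Hypothetical descriptors for one entity (illustrative values only).
system_terms = ["information retrieval", "machine learning",
                "databases", "expert finding"]
truth_terms = ["information retrieval", "expert finding",
               "entity retrieval"]

p, r = precision_recall(system_terms, truth_terms)
print(f"precision={p:.2f} recall={r:.2f}")
```

Note that this treats every descriptor as equally relevant and ignores both ranking and diversity, which is why the abstract argues for a more sophisticated metric.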
CLEF 2010 Conference paper online
June 19, 2010 14:04 Filed in: Papers
One of our CLEF 2010 conference papers, Validating
Query Simulators: An Experiment Using Commercial
Searches and Purchases by Bouke Huurnink, Katja
Hofmann, Maarten de Rijke and Marc Bron, is available
online now.
In the paper we design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained using logged commercial searches and associated purchase decisions. While no simulator reproduces an ideal ranking, there is a large variation in simulator performance that allows us to distinguish those that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
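The validation idea, checking whether simulated data ranks retrieval systems the same way real data does, can be sketched with a rank correlation. The abstract does not name the measure used, so Kendall's tau here is an assumption, and the system names and scores below are made-up toy values rather than results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two system rankings.

    Both arguments map the same system names to an effectiveness
    score. Tied pairs count as neither concordant nor discordant.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


# Toy scores (hypothetical): the same four systems evaluated on
# real logged data and on the output of one simulator.
real = {"bm25": 0.31, "lm": 0.28, "tfidf": 0.22, "random": 0.05}
simulated = {"bm25": 0.40, "lm": 0.42, "tfidf": 0.30, "random": 0.10}

print(f"tau={kendall_tau(real, simulated):.3f}")
```

A value of 1 would mean the simulator orders the systems exactly as the real data does; in this toy case one swapped pair (bm25 and lm) out of six lowers the correlation.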



