abstract
Shallow Morphological Analysis in Monolingual Information Retrieval for Dutch, German and Italian

Christof Monz and Maarten de Rijke

In: C. Peters, M. Braschler, J. Gonzalo and M. Kluck (eds.) Post-Conference Proceedings of the Cross Language Evaluation Forum Workshop (CLEF 2001), Springer Lecture Notes in Computer Science, 2002, pages 262-277.

This paper describes our team's experiments for CLEF 2001, including both official and post-submission runs. We took part in the monolingual task for Dutch, German, and Italian. The focus of our experiments was on the effects of morphological analyses, such as stemming and compound splitting, on retrieval effectiveness. Confirming earlier reports on retrieval in compound-rich languages such as Dutch and German, we found improvements of around 25% for German and as much as 69% for Dutch. For Italian, lexicon-based stemming resulted in gains of up to 25%.


links
co-author(s): Maarten de Rijke
research project(s): Document Retrieval
conference site: CLEF 2001
publisher information: Copyright © by Springer


BibTeX entry
@inProceedings{monz:shal02,
   author =    {Monz, C. and de Rijke, M.},
   title =     {Shallow Morphological Analysis in Monolingual
                Information Retrieval for {D}utch, {G}erman and {I}talian},
   booktitle = {Proceedings CLEF 2001},
   year =      2002,
   editor =    {Peters, C. and Braschler, M. and Gonzalo, J. and Kluck, M.},
   publisher = {Springer Verlag},
   series =    {LNCS 2406},
   pages =     {262--277},
}