TPDL 2012 paper online
July 16, 2012 07:34 Filed in: Papers
The following TPDL 2012 paper is online now.
In “Semantic document selection: Historical Research on Collections that Span Multiple Centuries” (Daan Odijk, Ork de Rooij, Maria-Hendrike Peetz, Toine Pieters, Maarten de Rijke, Stephen Snelders) we start from the observation that the availability of digitized collections of historical data, such as newspapers, increases every day. With that, the wish of historians to explore these collections increases as well. Methods that are traditionally used to examine a collection do not scale up to today's collection sizes. We propose a method that combines text mining with exploratory search to provide historians with a means of interactively selecting and inspecting relevant documents from very large collections. We assess our proposal with a case study on a prototype system. PDF



