WIDM 2007 paper published
November 12, 2007 10:27 Filed in: Papers
"Extracting the Discussion Structure in Comments on
News-Articles" by Anne Schuth, Maarten Marx and
Maarten de Rijke has now been published. Several
on-line daily newspapers offer readers the
opportunity to directly comment on articles. In the
Netherlands this feature is used quite often, and the
quality of the comments (both grammar and content) is
surprisingly high. We develop techniques to collect,
store, enrich and analyze these comments. After
giving a high-level overview of the Dutch
'commentosphere', we zoom in on extracting the
discussion structure found in flat comment threads;
people not only comment on the news article, they
also heavily comment on other comments, resembling
discussion fora. We show how techniques from
information retrieval, natural language processing
and machine learning can be used to extract the
'reacts-on' relation between comments with high
precision and recall.
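
To make the 'reacts-on' task concrete, here is a minimal sketch of how such
links might be guessed in a flat thread. It is not the method from the paper:
the function name reacts_on, the two cues (an earlier commenter's name being
mentioned, and word overlap between comments) and the min_overlap threshold
are all assumptions chosen for illustration, and author names are assumed to
be single tokens.

```python
import re


def tokens(text):
    """Lowercased set of word tokens in a piece of text."""
    return set(re.findall(r"\w+", text.lower()))


def reacts_on(comments, min_overlap=0.2):
    """Guess, for each comment, which earlier comment it reacts on.

    comments: list of (author, text) tuples in posting order.
    Returns a dict mapping each comment index to the index of an earlier
    comment, or to None when the comment seems to address the article.
    Hypothetical heuristic, not the approach described in the paper.
    """
    links = {}
    for i, (author, text) in enumerate(comments):
        words = tokens(text)
        target = None

        # Cue 1: the text names the author of an earlier comment.
        # Prefer the most recent such comment.
        for j in range(i - 1, -1, -1):
            earlier_author = comments[j][0]
            if earlier_author != author and earlier_author.lower() in words:
                target = j
                break

        # Cue 2: fall back to the earlier comment with the highest
        # Jaccard word overlap, if it exceeds the threshold.
        if target is None:
            best = min_overlap
            for j in range(i):
                other = tokens(comments[j][1])
                union = words | other
                score = len(words & other) / len(union) if union else 0.0
                if score > best:
                    best, target = score, j

        links[i] = target
    return links


thread = [
    ("Jan", "The new train schedule is a disaster for commuters"),
    ("Piet", "@Jan I disagree, my commute got shorter"),
    ("Klaas", "Nice weather today"),
    ("Els", "The train schedule is a disaster"),
]
links = reacts_on(thread)
```

In this toy thread, Piet's comment is linked to Jan's through the name
mention, Els's through word overlap, and Klaas's is left as a reaction to the
article itself. A learned classifier over such cues would be the natural next
step, but that goes beyond this sketch.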



