Computational Pragmatics :: Raquel Fernandez

========================================================
Evaluation criteria for final papers
based on reviewing criteria at computational linguistics conferences
========================================================

APPROPRIATENESS (1-5)

Does the paper fit the contents of the course?

5: Certainly.
4: Probably.
3: Unsure.
2: Probably not.
1: Certainly not.

FORMATTING / STYLE (1-5)

Is the paper properly formatted and free of typos and grammatical errors? Are the references in the bibliography accurate and adequately formatted?

5 = Nicely formatted with no typos.
4 = Well formatted and only a couple of very minor typos.
3 = Some typos and style errors and/or format mishaps.
2 = Quite a few typos and/or formatting problems that make reading the paper a bit unpleasant.
1 = The paper has not been proofread: full of typos and/or formatting problems. The reader cannot concentrate on the content.

CLARITY (1-5)

Is it clear what was done and why? Is the paper well-written and well-structured?

5 = Very clear.
4 = Understandable by most readers.
3 = Mostly understandable to me with some effort.
2 = Important questions were hard to resolve even with effort.
1 = Much of the paper is confusing.

SOUNDNESS / CORRECTNESS (1-5)

First, is the technical approach sound and well-chosen? Second, can one trust the claims of the paper -- are they supported by the analysis or by proper experiments and are the results correctly interpreted?

5 = The approach is very apt, and the claims are convincingly supported.
4 = Generally solid work, although there are some aspects of the approach or evaluation I am not sure about.
3 = Fairly reasonable work. The approach is not bad, and at least the main claims are probably correct, but I am not entirely ready to accept them (based on the material in the paper).
2 = Troublesome. There are some ideas worth salvaging here, but the work should really have been done or evaluated differently.
1 = Fatally flawed.

MEANINGFUL COMPARISON (1-5)

Does the author make clear where the analysis and methods sit with respect to existing literature? Are the references adequate? Are the experimental results meaningfully compared with the best prior approaches?

5 = Precise and complete comparison with related work. Good job given the space constraints.
4 = Mostly solid bibliography and comparison, but there are some references missing.
3 = Bibliography and comparison are somewhat helpful, but it could be hard for a reader to determine exactly how this work relates to previous work.
2 = Only partial awareness and understanding of related work, or a flawed empirical comparison.
1 = Little awareness of related work, or lacks necessary empirical comparison.

SUBSTANCE (1-5)

Does the paper have enough substance, or would it benefit from more ideas or results? Note that this question mainly concerns the amount of work; its quality is evaluated in other categories.

5 = Contains more ideas or results than most publications in this conference; goes the extra mile.
4 = Represents an appropriate amount of work for a publication in this conference. (most submissions)
3 = Leaves open one or two natural questions that should have been pursued within the paper.
2 = Work in progress. There are enough good ideas, but perhaps not enough results yet.
1 = Seems thin. Not enough ideas here for a full-length paper.

REPLICABILITY (1-5)

Will readers with the appropriate background be able to reproduce or verify the results in this paper?

5 = could easily reproduce the results.
4 = could mostly reproduce the results, but there may be some variation because of sample variance or minor variations in their interpretation of the protocol or method.
3 = could reproduce the results with some difficulty. The settings of parameters are underspecified or subjectively determined; the training/evaluation data are not widely available.
2 = would be hard pressed to reproduce the results.
    The contribution depends on data that are simply not available outside the author's institution or consortium; not enough details are provided.
1 = could not reproduce the results here no matter how hard they tried.

The following two criteria have marginal importance in a course paper.

ORIGINALITY / INNOVATIVENESS (1-5)

How original is the approach? Does this paper break new ground in topic, methodology, or content? How exciting and innovative is the research it describes? Note that a paper could score high for originality even if the results do not show a convincing benefit.

5 = Seminal: Significant new problem, technique, methodology, or insight -- no prior research has attempted something similar.
4 = Creative: An intriguing problem, technique, or approach that is substantially different from previous research.
3 = Respectable: A nice research contribution that represents a significant extension of prior approaches or methodologies.
2 = Pedestrian: Obvious, or a minor improvement on familiar techniques.
1 = Significant portions have actually been done before or done better.

IMPACT OF IDEAS OR RESULTS (1-5)

How significant is the work described? If the ideas are novel, will they also be useful or inspirational? If the results are sound, are they also important? Does the paper bring new insights into the nature of the problem?

5 = Will affect the field by altering other people's choice of research topics or basic approach.
4 = Some of the ideas or results will substantially help other people's ongoing research.
3 = Interesting but not too influential. The work will be cited, but mainly for comparison or as a source of minor contributions.
2 = Marginally interesting. May or may not be cited.
1 = Will have no impact on the field.