Tracking Web Opinions: Robust Extraction of Subjective Meaning
A. The Measurement of Meaning Revisited
The classic work on measuring emotive or affective meaning in texts is Charles Osgood's Theory of Semantic Differentiation. Osgood and his collaborators identify the aspect of meaning in which they are interested as "a strictly psychological one: those cognitive states of human language users which are necessary antecedent conditions for selective encoding of lexical signs and necessary subsequent conditions in selective decoding of signs in messages" (Osgood et al. 1957, p. 318).
Their semantic differential technique uses several pairs of bipolar adjectives to scale the responses of subjects to words, short phrases, or texts. That is, subjects are asked to rate the meaning of an item on scales like active-passive; good-bad; optimistic-pessimistic; positive-negative; strong-weak; serious-humorous; or ugly-beautiful. Each pair of bipolar adjectives is a factor in the semantic differential technique. As a result, the technique can cope with quite a large number of aspects of affective meaning.
A natural question to ask is whether each of these factors is equally important. Osgood et al. use factor analysis of extensive empirical tests to investigate this question. The surprising answer is that most of the variance in judgment can be explained by only three major factors. These three factors of affective or emotive meaning are the evaluative factor (e.g., good-bad); the potency factor (e.g., strong-weak); and the activity factor (e.g., active-passive). Among these three factors, the evaluative factor has the strongest relative weight.
B. Words with Attitude
We investigate measures for the evaluative factor of meaning based on the WordNet lexical database. WordNet is a database of semantic knowledge inspired by psycho-linguistic and computational theories of human lexical memory (developed by the Princeton-based group of George Miller).
The evaluative dimension of Osgood is typically determined using the adjectives `good' and `bad' (other operationalizations are possible depending on the subject under investigation). In WordNet, we look at all the words that can be reached from the adjectives `good' and `bad'. This turns out to be roughly 25% of all the adjectives (i.e., 5410 adjective words, or 5464 adjective synsets).
WordNet neighborhood of adjective good
Depth is the maximal length of a synonymy path in WordNet; a higher familiarity value filters out uncommon words.
WordNet neighborhood of adjective bad
WordNet geodesic distances to both `good' and `bad'
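The geodesic distance used here is the length of the shortest synonymy path between two words. The sketch below illustrates the idea with a breadth-first search over a small, hypothetical synonymy graph (`TOY_SYNONYMS` is made up for illustration and is not WordNet data). The rule in `evaluative_score`, which subtracts the distance to `good' from the distance to `bad', is an assumption about how the two distances might be combined, not a formula taken from the text.

```python
from collections import deque

# Hypothetical toy synonymy graph; the links are illustrative, not WordNet data.
TOY_SYNONYMS = {
    "good": {"sound", "right"},
    "sound": {"good", "heavy"},
    "right": {"good", "proper"},
    "proper": {"right"},
    "heavy": {"sound", "hard"},
    "hard": {"heavy", "bad"},
    "bad": {"hard", "awful"},
    "awful": {"bad"},
}


def geodesic_distance(graph, source, target):
    """Return the length of the shortest synonymy path, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbour in graph.get(word, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def evaluative_score(graph, word, positive="good", negative="bad"):
    """Assumed scoring rule: distance to the negative pole minus distance
    to the positive pole, so words closer to 'good' score higher."""
    d_pos = geodesic_distance(graph, word, positive)
    d_neg = geodesic_distance(graph, word, negative)
    if d_pos is None or d_neg is None:
        return None  # word is not connected to both poles
    return d_neg - d_pos


if __name__ == "__main__":
    for adjective in ("proper", "heavy", "awful"):
        print(adjective, evaluative_score(TOY_SYNONYMS, adjective))
```

With real data, the graph would be built from WordNet's synonymy relation; words that cannot be reached from both `good' and `bad' get no score in this sketch.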
C. Analysing Internet Discussion Sites
We can score a text on the evaluative, potency, and activity dimensions of subjective meaning by simply adding up the scores of the individual adjectives it contains. We apply this to postings on Internet discussion sites (this data is updated on a daily basis).
UK political parties
This is some data from Usenet activity in newsgroups about English politics, on the three main political parties in the UK, i.e., New Labour, the Conservative Party, and the Liberal Democrats.
Raw data:
Female Teenstars
This is some data from Usenet activity in newsgroups about music, on four currently popular teen stars, i.e., female singers Britney Spears, Christina Aguilera, Jessica Simpson, and Mandy Moore.
Raw data:
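Returning to the scoring procedure described at the start of this section: a text is scored by adding up the scores of the adjectives it contains, on each of the three dimensions. A minimal sketch of that summation follows. The lexicon `TOY_LEXICON` and its values are hypothetical, and the tokenizer simply looks up every lowercase word in the lexicon rather than tagging adjectives, which is a simplification.

```python
import re

# Hypothetical per-adjective scores as (evaluative, potency, activity);
# the values are made up for illustration.
TOY_LEXICON = {
    "good": (1.0, 0.0, 0.0),
    "bad": (-1.0, 0.0, 0.0),
    "strong": (0.0, 1.0, 0.0),
    "weak": (0.0, -1.0, 0.0),
    "active": (0.0, 0.0, 1.0),
    "passive": (0.0, 0.0, -1.0),
}

DIMENSIONS = ("evaluative", "potency", "activity")


def score_text(text, lexicon=TOY_LEXICON):
    """Sum the per-dimension scores of every lexicon word found in the text."""
    totals = [0.0, 0.0, 0.0]
    for token in re.findall(r"[a-z]+", text.lower()):
        scores = lexicon.get(token)
        if scores is None:
            continue  # word has no score; it does not contribute
        for i, value in enumerate(scores):
            totals[i] += value
    return dict(zip(DIMENSIONS, totals))


if __name__ == "__main__":
    posting = "A good, strong and active party; not a bad one."
    print(score_text(posting))
```

Note that plain summation ignores negation ("not a bad one" still counts `bad' as negative) and grows with text length, so comparing postings of different sizes would need some form of normalisation.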