Certainty categorization model

Victoria L. Rubin, Noriko Kando, Elizabeth D. Liddy

Research output: Contribution to conference (paper, peer-reviewed)


We present a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. Explicit certainty markers were identified and categorized along four hypothesized dimensions: perspective, focus, timeline, and level of certainty. One hundred twenty-one sentences from sample news stories contained significantly fewer markers per sentence (M = 0.46, SD = 0.04) than 564 sentences from sample editorials (M = 0.6, SD = 0.23), p = 0.0056, two-tailed heteroscedastic t-test. Within each dimension, editorials had the most markers per sentence for high level of certainty, the writer's point of view, and the future and present timelines (0.33, 0.43, 0.24, and 0.22, respectively); news stories had the most for high and moderate levels of certainty, a directly involved third party's point of view, and the past timeline (0.19, 0.20, 0.24, and 0.20, respectively). These patterns have practical implications for automation. Further analysis of editorials showed that, of the 72 combinations possible under the hypothesized model, the combinations of a high level of certainty from the writer's perspective, expressed abstractly in the present and future or factually in the future, were very common. Twenty-two combinations never occurred, and 35 had ≤ 8 occurrences. This narrows the focus for future linguistic analysis of explicit certainty markers.
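The figure of 72 possible combinations can be illustrated with a short sketch. The abstract names the four dimensions but does not list every category, so the labels below are assumptions: "absolute" and "low" levels and the "indirectly involved third party" perspective are filled in only so that the sizes multiply to 72 (3 × 2 × 3 × 4). The names `DIMENSIONS` and `all_combinations` are hypothetical and not taken from the paper.

```python
from itertools import product

# Assumed category labels for each hypothesized dimension. Only some of
# these labels appear in the abstract; the rest are illustrative guesses
# chosen so the total matches the 72 combinations it reports.
DIMENSIONS = {
    "perspective": [
        "writer",
        "directly involved third party",
        "indirectly involved third party",  # assumption
    ],
    "focus": ["abstract", "factual"],
    "timeline": ["past", "present", "future"],
    "level": ["absolute", "high", "moderate", "low"],  # "absolute", "low" assumed
}


def all_combinations(dimensions):
    """Return every combination of one category per dimension as a dict."""
    names = list(dimensions)
    return [dict(zip(names, values)) for values in product(*dimensions.values())]


combos = all_combinations(DIMENSIONS)
print(len(combos))  # 3 * 2 * 3 * 4 = 72
```

Counting how many annotated markers fall into each of these combinations would be one way to find which cells are frequent, rare, or empty, as the abstract reports for editorials.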

Original language: English (US)
Number of pages: 6
State: Published - 2005
Event: 2004 AAAI Spring Symposium - Stanford, CA, United States
Duration: Mar 22 2004 - Mar 24 2004


Other: 2004 AAAI Spring Symposium
Country/Territory: United States
City: Stanford, CA

ASJC Scopus subject areas

  • General Engineering

