Anaphora in natural language processing and information retrieval

Elizabeth Du Ross Liddy

Research output: Contribution to journal (Article)

24 Scopus citations

Abstract

Anaphora is the discourse-level linguistic phenomenon of abbreviated subsequent reference, pronouns being the most commonly used anaphors. In that anaphora plays an essential role in human processors' production and understanding of texts, its appropriate recognition and resolution is essential to information retrieval systems that manipulate natural language texts. The approaches to anaphora undertaken in theoretical linguistics and NLP are surveyed and the results of research on anaphora as it impacts on information processes are presented with specific attention to the detailed studies conducted at Syracuse University. These studies have provided essential baseline data on the extent to which anaphora occur, their likelihood of referring to concepts integral to the topic, their effect on a variety of term-weighting schemes, and their impact on retrieval results. Although the most effective means of processing anaphora may not have yet been determined, it is suggested that improved retrieval systems will need to represent the full meaning of natural language documents, including anaphoric references as well as all other discourse linguistic phenomena.

Original language: English (US)
Pages (from-to): 39-52
Number of pages: 14
Journal: Information Processing and Management
Volume: 26
Issue number: 1
State: Published - 1990

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences
