Anaphora is the discourse-level linguistic phenomenon of abbreviated subsequent reference, pronouns being the most commonly used anaphors. Because anaphora plays an essential role in the human production and understanding of texts, its appropriate recognition and resolution are essential to information retrieval systems that manipulate natural language texts. The approaches to anaphora undertaken in theoretical linguistics and NLP are surveyed, and the results of research on anaphora as it affects information processes are presented, with specific attention to the detailed studies conducted at Syracuse University. These studies have provided essential baseline data on the extent to which anaphors occur, their likelihood of referring to concepts integral to the topic, their effect on a variety of term-weighting schemes, and their impact on retrieval results. Although the most effective means of processing anaphora may not yet have been determined, it is suggested that improved retrieval systems will need to represent the full meaning of natural language documents, including anaphoric references as well as all other discourse linguistic phenomena.
ASJC Scopus subject areas
- Information Systems
- Media Technology
- Computer Science Applications
- Management Science and Operations Research
- Library and Information Sciences