This paper is an exploratory study of one approach to incorporating situational information into information retrieval systems, drawing on principles and methods of discourse linguistics. A tenet of discourse linguistics is that texts of a specific type possess a structure above the syntactic level, which follows conventions known to the people using such texts to communicate. In some cases, such as literature describing work done, the structure is closely related to situations, and may therefore be a useful representational vehicle for the present purpose. Abstracts of empirical research papers exhibit a well-defined discourse-level structure, which is revealed by lexical clues. Two methods of detecting the structure automatically are presented: (i) a Bayesian probabilistic analysis; and (ii) a neural network model. Both methods show promise in preliminary implementations. A study of users’ oral problem statements indicates that they are not amenable to the same kind of processing. However, from in-depth interviews with users and search intermediaries, the following conclusions are drawn: (i) the notion of a generic research script is meaningful to both users and intermediaries as a high-level description of situation; (ii) a researcher's position in the script is a predictor of the relevance of documents; and (iii) currently, intermediaries can make very little use of situational information. The implications of these findings for system design are discussed, and a system structure presented to serve as a framework for future experimental work on the factors identified in this paper. The design calls for a dialogue with the user on his or her position in a research script and incorporates features permitting discourse-level components of abstracts to be specified in search strategies.
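To illustrate how lexical clues might drive a Bayesian analysis of discourse-level structure, the following is a minimal sketch of a naive Bayes sentence classifier with add-one smoothing. It is not the method used in the paper, whose details are not given in this abstract. The component labels (`purpose`, `method`, `results`), the toy training sentences, and the function names `tokenize`, `train`, and `classify` are all assumptions introduced purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: sentences labelled with an assumed discourse component.
TOY_TRAINING = [
    ("The purpose of this study was to examine reading habits", "purpose"),
    ("This study aims to investigate search behaviour", "purpose"),
    ("Subjects were sampled from two universities", "method"),
    ("Data were collected using questionnaires and interviews", "method"),
    ("Results indicate a significant difference between groups", "results"),
    ("The findings showed that scores increased", "results"),
]


def tokenize(sentence):
    """Lowercase the sentence and strip surrounding punctuation from each word."""
    words = (w.strip(".,;:()").lower() for w in sentence.split())
    return [w for w in words if w]


def train(labelled_sentences):
    """Count word occurrences per label and label frequencies."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in labelled_sentences:
        label_counts[label] += 1
        for word in tokenize(sentence):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(sentence, model):
    """Return the label with the highest posterior log-probability."""
    word_counts, label_counts, vocab = model
    total_sentences = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: relative frequency of the label in the training data.
        score = math.log(label_counts[label] / total_sentences)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(sentence):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_TRAINING)
print(classify("Data were collected from subjects", model))
```

In this sketch, words such as "results" or "collected" act as the lexical clues: each one shifts the posterior toward the component in which it was most often observed. A realistic system would need a much larger labelled corpus and a principled inventory of discourse components.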
ASJC Scopus subject areas
- Information Systems
- Library and Information Sciences