Named Entity Disambiguation for Archival Collections: Metadata, Wikidata, and Linked Data

Katherine Louise Polley, Vivian Teresa Tompkins, Brendan John Honick, Jian Qin

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Representing archival metadata as linked data can increase the findability and usability of items, and linked data sources such as Wikidata can be used to further enrich existing collection metadata. However, a central challenge to this process is the named entity disambiguation or entity linking that is required to ensure that the named entities in a collection are being properly matched to Wikidata entities so that any additional metadata is applied correctly. This paper details our experimentation with one entity linking system called OpenTapioca, which was chosen for its use of Wikidata and its accessibility to librarians and archivists with minimal technical intervention. We discuss the results of using OpenTapioca for named entity disambiguation on the Belfer Cylinders Collection from the Special Collections Research Center at Syracuse University, highlighting the successes and limitations of the system and of using Wikidata as a knowledge base.

Original language: English (US)
Pages (from-to): 520-524
Number of pages: 5
Journal: Proceedings of the Association for Information Science and Technology
Volume: 58
Issue number: 1
State: Published - 2021

Keywords

  • Archival item-level metadata
  • entity management
  • linked data
  • named entity disambiguation
  • Wikidata

ASJC Scopus subject areas

  • General Computer Science
  • Library and Information Sciences
