TY - JOUR
T1 - Named Entity Disambiguation for Archival Collections
T2 - Metadata, Wikidata, and Linked Data
AU - Polley, Katherine Louise
AU - Tompkins, Vivian Teresa
AU - Honick, Brendan John
AU - Qin, Jian
N1 - Publisher Copyright: 84th Annual Meeting of the Association for Information Science & Technology | Oct. 29 – Nov. 3, 2021 | Salt Lake City, UT. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.
PY - 2021
Y1 - 2021
AB - Representing archival metadata as linked data can increase the findability and usability of items, and linked data sources such as Wikidata can be used to further enrich existing collection metadata. However, a central challenge to this process is the named entity disambiguation or entity linking that is required to ensure that the named entities in a collection are being properly matched to Wikidata entities so that any additional metadata is applied correctly. This paper details our experimentation with one entity linking system called OpenTapioca, which was chosen for its use of Wikidata and its accessibility to librarians and archivists with minimal technical intervention. We discuss the results of using OpenTapioca for named entity disambiguation on the Belfer Cylinders Collection from the Special Collections Research Center at Syracuse University, highlighting the successes and limitations of the system and of using Wikidata as a knowledge base.
KW - Archival item-level metadata
KW - entity management
KW - linked data
KW - named entity disambiguation
KW - Wikidata
UR - http://www.scopus.com/inward/record.url?scp=85148463826&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148463826&partnerID=8YFLogxK
U2 - 10.1002/pra2.490
DO - 10.1002/pra2.490
M3 - Article
AN - SCOPUS:85148463826
SN - 2373-9231
VL - 58
SP - 520
EP - 524
JO - Proceedings of the Association for Information Science and Technology
JF - Proceedings of the Association for Information Science and Technology
IS - 1
ER -