An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines

Shi Yuan, Bei Yu

Research output: Chapter in Book/Entry/Poem › Conference contribution

4 Scopus citations

Abstract

This study evaluates the performance of four information extraction tools (extractors) on identifying health claims in health news headlines. A health claim is defined as a triplet: IV (what is being manipulated), DV (what is being measured), and the relation between them. Tools that can identify health claims provide the foundation for evaluating the accuracy of these claims against authoritative resources. The evaluation shows that 26% of the headlines do not contain health claims, and all extractors have difficulty separating these from the rest. For headlines with health claims, OPENIE-5.0 performed best, with an F-measure at the 0.6 level for extracting “IV-relation-DV”. However, the characteristic linguistic structures of health news headlines, such as incomplete sentences and non-verb relations, pose a particular challenge to existing tools.
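
The triplet definition and the F-measure mentioned above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the `HealthClaim` class, the `f_measure` function, the exact-match criterion, and the example headlines' claims are all assumptions made for illustration, since the abstract does not state how extracted triplets were matched against the gold standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HealthClaim:
    """Hypothetical container for an IV-relation-DV triplet."""
    iv: str        # what is being manipulated
    relation: str  # how the IV relates to the DV
    dv: str        # what is being measured


def f_measure(predicted: List[Optional[HealthClaim]],
              gold: List[Optional[HealthClaim]]) -> float:
    """F-measure over per-headline triplets; None means "no health claim".

    Assumption: a prediction counts as correct only on an exact match
    of all three fields with the gold triplet for the same headline.
    """
    true_pos = sum(1 for p, g in zip(predicted, gold)
                   if p is not None and p == g)
    n_pred = sum(1 for p in predicted if p is not None)
    n_gold = sum(1 for g in gold if g is not None)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: three headlines, the second has no health claim.
gold = [
    HealthClaim("coffee", "reduces", "heart disease risk"),
    None,
    HealthClaim("exercise", "improves", "memory"),
]
predicted = [
    HealthClaim("coffee", "reduces", "heart disease risk"),  # correct
    HealthClaim("study", "finds", "results"),  # spurious extraction
    None,                                      # missed claim
]
score = f_measure(predicted, gold)
```

In this invented example one of two predicted triplets is correct and one of two gold triplets is recovered, so precision and recall are both 0.5 and `score` is 0.5. The spurious extraction on the second headline shows how headlines without health claims lower precision when an extractor cannot separate them from the rest.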

Original language: English (US)
Title of host publication: EventStory 2018 - Events and Stories in the News, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 34-43
Number of pages: 10
ISBN (Electronic): 9781948087599
State: Published - 2018
Event: 2018 Events and Stories in the News Workshop, EventStory 2018, colocated with the 27th International Conference on Computational Linguistics, COLING 2018 - Santa Fe, United States
Duration: Aug 20 2018 → …

Publication series

Name: EventStory 2018 - Events and Stories in the News, Proceedings of the Workshop

Conference

Conference: 2018 Events and Stories in the News Workshop, EventStory 2018, colocated with the 27th International Conference on Computational Linguistics, COLING 2018
Country/Territory: United States
City: Santa Fe
Period: 8/20/18 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language
