Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction

John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin M. Gyori, Andrew Zupon, Zheng Tang, Egoitz Laparra, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walt Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Mihai Surdeanu, Steve Bethard, Rebecca Sharp

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Approaches to information extraction often assume an existing domain taxonomy for normalizing content, yet in real-world scenarios there is frequently none. When one does exist, it must be continually extended as information needs shift. This is a slow and tedious task, and one that does not scale well. Here we propose an interactive tool that allows a taxonomy to be built or extended rapidly, with a human in the loop to control precision. We apply insights from text summarization and information extraction to reduce the search space dramatically, then leverage modern pretrained language models to perform contextualized clustering of the remaining concepts, yielding candidate nodes for the user to review. We show that this allows a user to consider as many as 200 taxonomy concept candidates an hour, and so to quickly build or extend a taxonomy to better fit information needs.

Original language: English (US)
Title of host publication: HCI+NLP 2022 - 2nd Workshop on Bridging Human-Computer Interaction and Natural Language Processing, Proceedings of the Workshop
Editors: Su Lin Blodgett, Hal Daume, Michael Madaio, Ani Nenkova, Brendan O'Connor, Hanna Wallach, Qian Yang
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-10
Number of pages: 10
ISBN (Electronic): 9781955917902
State: Published - 2022
Event: 2nd Workshop on Bridging Human-Computer Interaction and Natural Language Processing, HCI+NLP 2022 - Seattle, United States
Duration: Jul 15 2022 → …

Publication series

Name: HCI+NLP 2022 - 2nd Workshop on Bridging Human-Computer Interaction and Natural Language Processing, Proceedings of the Workshop

Conference

Conference: 2nd Workshop on Bridging Human-Computer Interaction and Natural Language Processing, HCI+NLP 2022
Country/Territory: United States
City: Seattle
Period: 7/15/22 → …

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Human-Computer Interaction
  • Information Systems
