Synthetic Dataset for Evaluating Complex Compositional Knowledge for Natural Language Inference

Sushma Anand Akoju, Robert Vacareanu, Haris Riaz, Mihai Surdeanu, Eduardo Blanco

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

We introduce a synthetic dataset called Sentences Involving Complex Compositional Knowledge (SICCK) and a novel analysis that investigates how well Natural Language Inference (NLI) models understand compositionality in logic. We produce 1,304 sentence pairs by modifying 15 examples from the SICK dataset (Marelli et al., 2014). To this end, we modify the original texts using a set of phrases: modifiers that correspond to universal quantifiers, existential quantifiers, negation, and other concept modifiers in Natural Logic (NL) (MacCartney, 2009). We use these phrases to modify the subject, verb, and object parts of the premise and hypothesis. Lastly, we annotate these modified texts with the corresponding entailment labels following NL rules. We conduct a preliminary verification of how well neural NLI models capture the change in structural and semantic composition, in both zero-shot and fine-tuned scenarios. We find that the performance of NLI models under the zero-shot setting is poor, especially for modified sentences with negation and existential quantifiers. After fine-tuning on this dataset, we observe that models continue to perform poorly on negation, existential, and universal modifiers.
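The modification step described in the abstract can be pictured with a minimal sketch. This is not the authors' code: the modifier phrases, the example sentences, and the names `MODIFIERS`, `modify`, `render`, and `generate_pairs` are all hypothetical, chosen only for illustration. The sketch inserts one modifier into the subject, verb, or object of either the premise or the hypothesis, and leaves the entailment label empty, since the abstract states that labels are annotated afterwards following NL rules.

```python
# Hypothetical modifier phrases per sentence part (not the dataset's actual list).
MODIFIERS = {
    "subject": ["all", "some", "no"],
    "verb": ["do not", "never"],
    "object": ["all", "some", "no"],
}

PART_ORDER = ("subject", "verb", "object")


def modify(sentence, part, modifier):
    """Return a copy of `sentence` with `modifier` prepended to one part."""
    modified = dict(sentence)
    modified[part] = f"{modifier} {modified[part]}"
    return modified


def render(sentence):
    """Join the subject, verb, and object into a plain string."""
    return " ".join(sentence[part] for part in PART_ORDER)


def generate_pairs(premise, hypothesis):
    """Apply each modifier to each part of the premise, then of the hypothesis."""
    pairs = []
    for side in ("premise", "hypothesis"):
        for part, modifiers in MODIFIERS.items():
            for modifier in modifiers:
                p = modify(premise, part, modifier) if side == "premise" else premise
                h = modify(hypothesis, part, modifier) if side == "hypothesis" else hypothesis
                pairs.append({
                    "premise": render(p),
                    "hypothesis": render(h),
                    "modified_side": side,
                    "modified_part": part,
                    "modifier": modifier,
                    "label": None,  # to be annotated separately following NL rules
                })
    return pairs


# Hypothetical seed pair, not taken from SICK.
premise = {"subject": "dogs", "verb": "chase", "object": "cats"}
hypothesis = {"subject": "animals", "verb": "chase", "object": "cats"}

for pair in generate_pairs(premise, hypothesis):
    print(pair["premise"], "|", pair["hypothesis"])
```

With these assumed modifiers, one seed pair yields 16 modified pairs (8 per side); the real dataset's counts depend on its own modifier set and annotation choices.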

Original language: English (US)
Title of host publication: 1st Workshop on Natural Language Reasoning and Structured Explanations, NLRSE 2023 @ACL 2023 - Proceedings of the Workshop
Editors: Bhavana Dalvi Mishra, Greg Durrett, Peter Jansen, Danilo Neves Ribeiro, Jason Wei
Publisher: Association for Computational Linguistics (ACL)
Pages: 157-168
Number of pages: 12
ISBN (Electronic): 9781959429944
State: Published - 2023
Event: 1st Workshop on Natural Language Reasoning and Structured Explanations, NLRSE 2023, co-located with ACL 2023 - Toronto, Canada
Duration: Jun 13 2023 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 1st Workshop on Natural Language Reasoning and Structured Explanations, NLRSE 2023, co-located with ACL 2023
Country/Territory: Canada
City: Toronto
Period: 6/13/23 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
