SemEval-2021 Task 10: Source-Free Domain Adaptation for Semantic Processing

  • Egoitz Laparra (Contributor)
  • Xin Su (Contributor)
  • Yiyun Zhao (Contributor)
  • Özlem Uzuner (Contributor)
  • Timothy A. Miller (Contributor)
  • Steven John Bethard (Contributor)

Dataset

Description

Data sharing restrictions are common in NLP datasets. For example, Twitter policies do not allow sharing of tweet text, though tweet IDs may be shared. The situation is even more common in clinical NLP, where patient health information must be protected, and annotations over health text, when released at all, often require the signing of complex data use agreements. The SemEval-2021 Task 10 framework asks participants to develop semantic annotation systems in the face of data sharing constraints. A participant's goal is to develop an accurate system for a target domain when annotations exist for a related domain but cannot be distributed. Instead of annotated training data, participants are given a model trained on the annotations. Then, given unlabeled target domain data, they are asked to make predictions.

Website: https://machine-learning-for-medical-language.github.io/source-free-domain-adaptation/
CodaLab site: https://competitions.codalab.org/competitions/26152
GitHub repository: https://github.com/Machine-Learning-for-Medical-Language/source-free-domain-adaptation
Date made available: Jul 24 2021
Publisher: ZENODO
