TY - GEN
T1 - A multi-pass sieve for coreference resolution
AU - Raghunathan, Karthik
AU - Lee, Heeyoung
AU - Rangarajan, Sudarshan
AU - Chambers, Nathanael
AU - Surdeanu, Mihai
AU - Jurafsky, Dan
AU - Manning, Christopher
PY - 2010
Y1 - 2010
N2 - Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier's entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
UR - http://www.scopus.com/inward/record.url?scp=80053285632&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80053285632&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:80053285632
SN - 1932432868
SN - 9781932432862
T3 - EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 492
EP - 501
BT - EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2010
Y2 - 9 October 2010 through 11 October 2010
ER -